Re: [PATCH 1/2] mm/page_alloc: add tracepoints for zone->lock acquisitions

From: Jesper Dangaard Brouer

Date: Wed May 13 2026 - 12:16:58 EST




On 08/05/2026 20.07, Dmitry Ilvokhin wrote:
On Fri, May 08, 2026 at 07:40:51PM +0200, Vlastimil Babka (SUSE) wrote:
On 5/8/26 7:38 PM, Vlastimil Babka (SUSE) wrote:
On 5/8/26 7:29 PM, Andrew Morton wrote:
On Fri, 8 May 2026 18:22:06 +0200 hawk@xxxxxxxxxx wrote:

Add tracepoints to the page allocator fast paths that acquire
zone->lock, allowing diagnosis of lock contention in production.

Thanks, I'm surprised we haven't done this yet.

There was a recent attempt [1]. Not being a generic solution wasn't welcome.

[1] https://lore.kernel.org/all/cover.1772206930.git.d@xxxxxxxxxxxx/

And this is the generic solution I think?

https://lore.kernel.org/all/cover.1777999826.git.d@xxxxxxxxxxxx/

Thanks for cc'ing me, Vlastimil.

Yes, this is an attempt at a generic solution for tracing contended
locks, including spinlocks, so it should also cover the use case
proposed in this patchset.


I'm aware of the generic solution and often use `perf lock contention`
and the libbpf-tools/klockstat tool. Unfortunately, my experience is that
enabling these tracepoints is prohibitively expensive on production
servers, and production suffers when I run these tools.

I'm very happy to see a patchset adding a contended case. But I worry
that tracing all contended locks in the system is also too much to have
enabled continuously in production.

This patch is carefully constructed to minimize overhead, such that I
can enable it continuously in production to catch issues. If I
identify an issue, I will use the generic tracepoints for further
debugging.


In fact, zone->lock contention was one of the primary motivations for
this work.

In the generic solution I'm losing the "zone" and the page "count". I
need this information to get the answers I'm looking for. Specifically,
I'm looking at reducing CONFIG_PCP_BATCH_SCALE_MAX, but I want this
to be a data-driven decision (my first principle is: if you cannot
measure it, you cannot improve it).
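
A minimal sketch of the kind of per-zone, per-count aggregation this
information allows. The record layout (zone name, page count, lock hold
time in ns) and all values are hypothetical and chosen for illustration
only; they are not the actual tracepoint format of this patch.

```python
from collections import defaultdict

# Hypothetical records: (zone name, pages handled under zone->lock, hold time in ns).
# Field names and values are illustrative only, not the patch's tracepoint format.
events = [
    ("Normal", 63, 4200),
    ("Normal", 504, 31000),
    ("DMA32", 63, 3900),
    ("Normal", 504, 29500),
]


def summarize(events):
    """Group events by (zone, count); report event total and mean hold time."""
    buckets = defaultdict(list)
    for zone, count, hold_ns in events:
        buckets[(zone, count)].append(hold_ns)
    return {
        key: {"events": len(vals), "mean_hold_ns": sum(vals) / len(vals)}
        for key, vals in buckets.items()
    }


for (zone, count), stats in sorted(summarize(events).items()):
    print(f"{zone:8s} count={count:4d} events={stats['events']} "
          f"mean_hold_ns={stats['mean_hold_ns']:.0f}")
```

Grouping by both keys is what would show whether large batch counts in
a given zone correlate with long lock hold times, which a lock-only
view cannot show.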

I'm likely going to apply this patch to our production system, such that
I can make this data-driven decision. I need to deploy it widely enough to
get enough servers experiencing direct reclaim. I'll report back if
people are interested in these learnings.

--Jesper