Re: [RFC PATCH] mm: Add missing release barrier on PGDAT_RECLAIM_LOCKED unlock
From: Andrea Parri
Date: Wed Mar 12 2025 - 06:56:27 EST

On Fri, Mar 07, 2025 at 02:30:47PM -0500, Mathieu Desnoyers wrote:
> The PGDAT_RECLAIM_LOCKED bit is used to provide mutual exclusion of
> node reclaim for struct pglist_data using a single bit.
>
> It is "locked" with a test_and_set_bit (similarly to a try lock) which
> provides full ordering with respect to loads and stores done within
> __node_reclaim().
>
> It is "unlocked" with clear_bit(), which does not provide any ordering
> with respect to loads and stores done before clearing the bit.
>
> The lack of clear_bit() memory ordering with respect to stores within
> __node_reclaim() can cause a subsequent CPU to fail to observe stores
> from a prior node reclaim. This is not an issue in practice on TSO (e.g.
> x86), but it is an issue on weakly-ordered architectures (e.g. arm64).
>
> Fix this with the following changes:
>
> A) Use clear_bit_unlock rather than clear_bit to clear PGDAT_RECLAIM_LOCKED
> with a release memory ordering semantic.
>
> This provides stronger memory ordering (release rather than relaxed).
>
> B) Use test_and_set_bit_lock rather than test_and_set_bit to test-and-set
> PGDAT_RECLAIM_LOCKED with an acquire memory ordering semantic.
>
> This changes the "lock" acquisition from a full barrier to an acquire
> memory ordering, which is weaker. The acquire semi-permeable barrier
> paired with the release on unlock is sufficient for this mutual
> exclusion use-case.

FWIW, this aligns with my understanding.
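
For illustration only, here is how I read the pairing of (A) and (B), as a
userspace C11 sketch rather than kernel code. The sketch_* names, the
SKETCH_RECLAIM_LOCKED mask and the reclaim_state field are all made up for
the example, and C11 acquire/release is only an approximation of the LKMM
semantics of test_and_set_bit_lock()/clear_bit_unlock().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
#define SKETCH_RECLAIM_LOCKED (1UL << 0)

struct sketch_pgdat {
	atomic_ulong flags;
	unsigned long reclaim_state;	/* plain data protected by the bit */
};

/*
 * Models test_and_set_bit_lock(): an acquire read-modify-write.
 * Returns true if the bit was already set, i.e. the "trylock" failed.
 */
static bool sketch_test_and_set_bit_lock(struct sketch_pgdat *p)
{
	unsigned long old = atomic_fetch_or_explicit(&p->flags,
						     SKETCH_RECLAIM_LOCKED,
						     memory_order_acquire);
	return (old & SKETCH_RECLAIM_LOCKED) != 0;
}

/*
 * Models clear_bit_unlock(): a release read-modify-write, so stores made
 * while holding the bit are visible to the next CPU that acquires it.
 */
static void sketch_clear_bit_unlock(struct sketch_pgdat *p)
{
	unsigned long old = atomic_fetch_and_explicit(&p->flags,
						      ~SKETCH_RECLAIM_LOCKED,
						      memory_order_release);
	assert(old & SKETCH_RECLAIM_LOCKED);	/* clearing an unheld bit is a bug */
}

/* Same shape as node_reclaim(): back off if another CPU holds the bit. */
static bool sketch_node_reclaim(struct sketch_pgdat *p)
{
	if (sketch_test_and_set_bit_lock(p))
		return false;

	p->reclaim_state++;		/* stands in for __node_reclaim() */
	sketch_clear_bit_unlock(p);
	return true;
}
```

With a relaxed clear in place of the release one, nothing in this model would
order the reclaim_state store before the bit is cleared, which is the problem
the changelog describes for weakly-ordered architectures.
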
Is (A) intended to be (submitted separately and) backported?

Andrea

> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
> Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> Cc: Alan Stern <stern@xxxxxxxxxxxxxxxxxxx>
> Cc: Andrea Parri <parri.andrea@xxxxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
> Cc: Nicholas Piggin <npiggin@xxxxxxxxx>
> Cc: David Howells <dhowells@xxxxxxxxxx>
> Cc: Jade Alglave <j.alglave@xxxxxxxxx>
> Cc: Luc Maranget <luc.maranget@xxxxxxxx>
> Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> Cc: linux-mm@xxxxxxxxx
> ---
> mm/vmscan.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c22175120f5d..021b25bdba91 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -7567,11 +7567,11 @@ int node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned int order)
>  	if (node_state(pgdat->node_id, N_CPU) && pgdat->node_id != numa_node_id())
>  		return NODE_RECLAIM_NOSCAN;
>
> -	if (test_and_set_bit(PGDAT_RECLAIM_LOCKED, &pgdat->flags))
> +	if (test_and_set_bit_lock(PGDAT_RECLAIM_LOCKED, &pgdat->flags))
>  		return NODE_RECLAIM_NOSCAN;
>
>  	ret = __node_reclaim(pgdat, gfp_mask, order);
> -	clear_bit(PGDAT_RECLAIM_LOCKED, &pgdat->flags);
> +	clear_bit_unlock(PGDAT_RECLAIM_LOCKED, &pgdat->flags);
>
>  	if (ret)
>  		count_vm_event(PGSCAN_ZONE_RECLAIM_SUCCESS);
> --
> 2.25.1
>