Re: [RFC] mm: stress-ng --mremap triggers severe lruvec lock contention in populate/unmap paths

From: Barry Song

Date: Thu Apr 09 2026 - 18:00:17 EST


On Wed, Apr 8, 2026 at 4:09 PM David Hildenbrand (Arm) <david@xxxxxxxxxx> wrote:
>
> >>
> >> It was also found that adding '--mremap-numa' changes the behavior
> >> substantially:
> >
> > "assign memory mapped pages to randomly selected NUMA nodes. This is
> > disabled for systems that do not support NUMA."
> >
> > so this is just sharding your lock contention across your NUMA nodes (you
> > have an lruvec per node).
> >
> >>
> >> stress-ng --mremap 8192 --mremap-bytes 4K --timeout 30 --mremap-numa
> >> --metrics-brief
> >>
> >> mremap 2570798 29.39 8.06 106.23 87466.50 22494.74
> >>
> >> So it's possible that either actual swapping, or the mbind(...,
> >> MPOL_MF_MOVE) path used by '--mremap-numa', removes most of the excessive
> >> system time.
> >>
> >> Does this look like a known MM scalability issue around short-lived
> >> MAP_POPULATE / munmap churn?
> >
> > Yes. Is this an actual issue on some workload?
>
> Same thought, it's unclear to me why we should care here. In particular,
> when talking about excessive use of zero-filled pages.

About 2–3 years ago, I had the impression that we might need
separate LRU locks for file and anon. This could reduce
contention in real-world scenarios, especially when memcg is
not enabled, but I never built a prototype for it.

Of course, this is not related to the problem reported in
this thread: here, the contention appears to be primarily
between anon pages themselves.
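
For reference, the anon-only churn being discussed can be
approximated from userspace roughly as below. This is only a
minimal sketch, not the stress-ng code: the helper name
churn_populate_unmap() and its parameters are made up for
illustration, and it assumes Linux with MAP_POPULATE available.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not taken from stress-ng): map `bytes` of
 * private anonymous memory with MAP_POPULATE, then unmap it again,
 * repeating up to `iterations` times.
 *
 * Returns the number of completed map/unmap cycles.
 */
static unsigned long churn_populate_unmap(size_t bytes,
					  unsigned long iterations)
{
	unsigned long done = 0;

	for (unsigned long i = 0; i < iterations; i++) {
		/*
		 * MAP_POPULATE prefaults the range, so anon pages are
		 * allocated at mmap() time rather than on first touch.
		 */
		void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE,
			       -1, 0);
		if (p == MAP_FAILED)
			break;

		/* Tear the mapping down right away; the pages are freed. */
		if (munmap(p, bytes) != 0)
			break;

		done++;
	}

	return done;
}
```

Calling this with a 4K size from many tasks at once would be the
closest analogue to the '--mremap-bytes 4K' runs quoted above; only
anon pages are involved, so no file LRU activity is expected from
it.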

Thanks
Barry