Re: [PATCH 0/15] mm: introduce ANON_VMA_LAZY for deferred anon_vma creation

From: Lorenzo Stoakes

Date: Tue Jun 02 2026 - 16:54:33 EST

On Fri, May 29, 2026 at 01:04:08PM +0100, Lorenzo Stoakes wrote:
> On Fri, May 29, 2026 at 09:41:20AM +0000, wangtao wrote:
> > > Hi Tao,
> > >
> > > Lorenzo had a discussion about rmap in Zagreb here:
> > > https://lore.kernel.org/linux-mm/aec533b2-37a7-4f44-a279-c4aa604206ac@lucifer.local/
> > >
> > > He also shared the PoC code here:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/ljs/linux.git/log/?h=project/cow-context
> > >
> > > and the slides were shared as well. In case you can't find them on linux-mm (I
> > > actually couldn't find them myself), I am attaching them again here -
> > > "scalable-cow-lsf-longer-version.pdf"
> > >
> > > After coming back from Zagreb, I kept trying to find one or two full days to
> > > read Lorenzo's code and slides carefully and write a blog about them.
> > > Unfortunately, I have been completely busy with other work. Sigh... we
> > > always seem to have too many non-upstream tasks.
> > >
> > > If possible, I'd really appreciate it if you could take a deep dive into it and
> > > write a detailed blog post. I'd be very eager to read it and better understand
> > > the overall design.
> > > Otherwise, I'll try to find some time next week or later to go through it
> > > myself.
> > >
> > Hi Barry,
> >
> > Thank you very much for your reply.
> >
> > I took an initial look at the cow-context code, and a few points
> > might be worth noting:
> >
> > 1. cow_context_walk currently assumes that the rmap walk runs
> > under RCU protection. This may need to be adjusted early,
> > since paths such as try_to_unmap_one, page_vma_mkclean_one,
> > and try_to_migrate_one may involve task switching.
> >
> > 2. In cow_context_walk, traverse_contexts appears to involve
> > multiple nested loops. When there are many child processes
> > across several fork layers, it may not be as simple or
> > efficient as the current anon_vma approach.
> >
> > It needs to traverse all child cow_ctx, and within each
> > cow_ctx, remaps_for_each() has two levels of iteration:
> > remaps_for_each_entry and remaps_for_each_entry_offset.
> >
> > In other words, it first iterates over cow_ctx and then
> > traverses rmap_mt inside each one. The rough complexity
> > seems to be O(#proc * log(#rmap_entries_in_cow)), which
> > may be somewhat higher than anon_vma's
> > O(#vmas_in_anon_vma). However, in most cases the number
> > of processes is not large, so the impact may be limited.
> >
> > Previously, I also considered converting anon_vma's rb_tree
> > to a mapletree. If one entry records a single VMA, the
> > average overhead could be less than two longs per VMA.
> >
> > However, unlike rb_tree, mapletree does not support storing
> > multiple elements under a single key. The key would need to
> > look like (vma_id/mm_id + pgoff). On 32-bit platforms, since
> > 64-bit mapletree keys are not supported yet, the remaining
> > 12 bits are not enough for vma_id/mm_id.
> >
> > Because of this limitation, I later started thinking about
> > ways to reduce anon_vma allocations instead.
> >
> > I will try to find some time next week to analyze the
> > cow-context design and code more thoroughly, and then
> > write up a summary.
>
> Tao,
>
> This response is so full of misunderstandings it's not really worth me
> responding to any of it. You've even hallucinated an imaginary field which
> is REALLY suspicious.
>
> You've no mm expertise or history and came up with this in a few hours. I
> asked Claude to analyse it and it puts it at 75-80% chance of being solely
> LLM-generated from cow_context.c.
>
> I simply don't have the time to deal with this, so unfortunately I'm going
> to have to withdraw the suggestion of further discussion with you on this
> topic.
>
> I am working on the scalable CoW project and will solicit opinions of those
> with relevant expertise.
>
> We are not interested in your approach or analysis.
>
> Thanks, Lorenzo

Apparently there's some misunderstanding about this situation here, sigh.

So for avoidance of doubt - I've now spent many hours on this, and unfortunately
(as I've already said in multiple places) this series has serious architectural
and code flaws.

And unfortunately, the anon_vma approach is not something we wish to extend, for
reasons I've gone into elsewhere - but broadly because it's a broken
abstraction that uses lots of memory and causes lock contention.

The approach here has multiple technical issues, so many that getting into each
one would require hours more of my time to analyse, maybe all week?

And then if there were further replies and replies to the replies and respins...

However, I also feel there's substantive, overlapping, evidence of the _logic_
(not the text, we are FINE with using AI to assist text for non-native speakers)
being LLM-generated.

However, you can never prove this with 100% certainty. But you can certainly be more
or less sure. I would never suggest this unless I was really pretty certain.

I am very keen to avoid 'witch hunts', or rash accusations. This is not
that. It's a _carefully considered_ opinion, based on evidence.

But of course - I do not know for SURE. You can never know.

The big problem here is asymmetry of maintainer resource. I simply _cannot_
respond to every single issue here. And when the architecture is something we
don't want, then it's not really necessary to.

And my big deep underlying concern with all this is - people can generate a very
significant amount of this kind of work, and we have limited reviewer time.

I've already dealt with burnout recently that I'm thankfully recovering
from. I'm not really keen to go back to that.

I really truly worry that if we don't have a means by which we can quickly
dismiss/deprioritise things when we have _significant_ evidence of wholesale
AI generation, then maintainer overload will increase exponentially.

And that's really a serious problem.

If we treat it like simply a technically incorrect solution, then it means we
open it up to further discussion on and on, as we're actually observing here. If
the responses are also LLM-generated then it's even more problematic.

This is why I bring it up, and proactively say it's led to a real loss in trust
in this case, and why, after there was a response that included a hallucinated
field in it, I went further and said that I really don't want to have a
discussion either.

It's because of this asymmetry.

And even this reply, written at 9.45pm, after several hours of
discussion about this off-list, is evidence of the problem we have with this
kind of asymmetry.

It's nothing personal, it's about managing time and resources.

Thanks, Lorenzo