Re: [Lsf-pc] [LSF/MM/BPF TOPIC] Towards Unified and Extensible Memory Reclaim (reclaim_ext)

From: Lorenzo Stoakes (Oracle)

Date: Fri Mar 27 2026 - 05:37:01 EST

On Fri, Mar 27, 2026 at 09:07:55AM +0100, Vlastimil Babka wrote:
> On 3/26/26 21:02, Axel Rasmussen via Lsf-pc wrote:
> > On Thu, Mar 26, 2026 at 1:02 AM Lorenzo Stoakes (Oracle) <ljs@xxxxxxxxxx> wrote:
> >>
> >> On Thu, Mar 26, 2026 at 08:03:34AM +0100, Michal Hocko wrote:
> >> > On Wed 25-03-26 19:05:47, Andrew Morton wrote:
> >> > > On Wed, 25 Mar 2026 14:06:37 -0700 Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
> >> > >
> >> > > > We should unify both algorithms into a single code path.
> >> > >
> >> > > I'm here to ask the questions which others fear will sound dumb.
> >> >
> >> > Not dumb at all and recently discussed here https://lore.kernel.org/all/CAMgjq7AkYOtUL2HuZjBu5dJw=RTL7W2L1+zVv=SCOyHKYwc3AA@xxxxxxxxxxxxxx/T/#u
> >> >
> >> > > Is it indeed the plan to maintain both implementations? I thought the
> >> > > long-term ambition was to knock MGLRU into shape and to drop the legacy LRU?
> >
> > I think one thing we all agree on at least is, long term, there isn't
> > really a good argument for having > 1 LRU implementation. E.g., we
> > don't believe there are just irreconcilable differences, where one
> > impl is better for some workloads, and another is better for others,
> > and there is no way the two can be converged.
> >
> > On that basis, I would be hesitant to add some complex abstraction
> > layer / reclaim_ops to facilitate having two. It seems like it may
> > make things a bit cleaner in the short term, but long term might make
> > that end goal harder (because we'd add the task of cleaning up this
> > abstraction at some point).
> >
> > My preferred way would be more like:
> >
> > - Look for opportunities where we can deduplicate code, but without
> > adding abstraction (e.g., factor out common operations into common
> > functions both impls can call).
> > - Identify gaps where MGLRU performs worse than classic LRU, and close them.
>
> I'm afraid to identify these gaps we'd have to indeed split the MGLRU
> differences (as listed in Shakeel's proposal) in a way that they can be
> tried separately. I recall when MGLRU was proposed, we did argue that it's a
> combination of several things done differently and they should be introduced
> to the existing reclaim and validated separately. But the author refused to
> go that way.

It's unfortunate we were ok with going ahead with this anyway; I hope in the
current mm culture we'd simply say 'ok series doesn't land then'.

It feels like the 'merge by default' stuff has been both a process _and_ a
culture fail for a while honestly.

We are managing arguably _the_ most core part of the kernel (I know people will
argue on that but it's at least up there :) and almost certainly _the_
subsystem with the most direct route to exploitable security bugs, but we've
been running it with some of the loosest merge criteria in the entire kernel.

It's patently insane.

Things are changing however!

Cheers, Lorenzo