Re: [LSF/MM/BPF TOPIC] Towards Unified and Extensible Memory Reclaim (reclaim_ext)
From: Tal Zussman
Date: Tue Apr 14 2026 - 17:38:11 EST
On 4/9/26 4:22 AM, Lorenzo Stoakes wrote:
> Yes, thanks for that, it's interesting!
>
> But I would say for now we need to defer any consideration of bpf being a
> thing until we actually get things into shape in terms of improving and
> modularising the existing reclaim mechanisms.
>
> mm has been far too keen to take features without paying down technical
> debt first and it's been very costly, so before anything else, we must
> ensure that reclaim is both long-term maintainable and maintained.
>
Completely agreed that cleanup is necessary and modularization is a big step
towards that. I do think it makes sense to think about what eBPF would look
like as part of that future though. It would be a shame to do all of that
modularization work, then decide to integrate eBPF later on and realize that
we need major changes to make that happen (but, given a well-designed
interface, I think that's unlikely to be a significant issue).
> In terms of reclaim bpf as a concept in general - reclaim is so very
> sensitive to even minor changes, and I fear that people might find
> something that appears to dramatically improve matters in one scenario, but
> end up with an unusable system in another.
That concern is precisely why we implemented per-cgroup policies. We found
that running each application with a policy that works best for it yielded
greater overall performance. System-wide policies may need something more
generic, but if you can make them more granular, you can specialize more
without running into such issues. This was simple enough to do given that
the LRU lists are already per-memcg, and eBPF is about to support
per-cgroup struct_ops programs.
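To make the per-cgroup idea concrete, here is a rough userspace C sketch.
It is only an illustration: struct reclaim_policy, struct memcg_sketch,
pick_lru/pick_mru and memcg_pick_victim are all hypothetical names, not
kernel or eBPF APIs, and the choice of MRU for a scan-heavy workload is
just an assumed example of a specialized policy.

```c
#include <stddef.h>

/*
 * Hypothetical policy interface (not a kernel API): given the last-access
 * timestamp of each candidate page, return the index of the page to evict.
 * Callers must pass n >= 1.
 */
struct reclaim_policy {
    const char *name;
    size_t (*pick_victim)(const unsigned long *last_access, size_t n);
};

/* Evict the least recently used page (smallest timestamp). */
static size_t pick_lru(const unsigned long *last_access, size_t n)
{
    size_t victim = 0;
    for (size_t i = 1; i < n; i++)
        if (last_access[i] < last_access[victim])
            victim = i;
    return victim;
}

/* Evict the most recently used page (largest timestamp). */
static size_t pick_mru(const unsigned long *last_access, size_t n)
{
    size_t victim = 0;
    for (size_t i = 1; i < n; i++)
        if (last_access[i] > last_access[victim])
            victim = i;
    return victim;
}

static const struct reclaim_policy lru_policy = { "lru", pick_lru };
static const struct reclaim_policy mru_policy = { "mru", pick_mru };

/* Hypothetical stand-in for a memcg: policy == NULL means "use the default". */
struct memcg_sketch {
    const char *name;
    const struct reclaim_policy *policy;
};

/* Dispatch to the cgroup's own policy, falling back to the generic one. */
static size_t memcg_pick_victim(const struct memcg_sketch *cg,
                                const unsigned long *last_access, size_t n)
{
    const struct reclaim_policy *p = cg->policy ? cg->policy : &lru_policy;
    return p->pick_victim(last_access, n);
}
```

The point of the sketch is only that a policy attached to one cgroup
changes victim selection for that cgroup's pages, while cgroups without a
policy keep the generic behaviour.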
> A bad sched_ext implementation might result in poor responsiveness, but a
> bad reclaim_ext implementation might result in a soft-locked system, and I
> fear that it might be quite easy to do that.
Anecdotally, having implemented about a dozen policies across dozens of
workloads, we never ran into a soft-lockup. That's not a guarantee that it
can't happen, but with a properly implemented fallback/watchdog to ensure
that pages actually get reclaimed as necessary, this should be manageable.
sched_ext actually implements such a watchdog to kick out misbehaving eBPF
schedulers.
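As a rough illustration of the fallback/watchdog idea (a userspace C
sketch under my own assumptions, not how sched_ext or any reclaim code
actually does it): count consecutive calls in which the custom policy makes
no progress, always fall back to the default policy for that call, and
eject the custom policy once a threshold is hit. struct reclaim_watchdog,
watchdog_reclaim, max_stalls and the two example policies are all
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy callback: returns the number of pages it reclaimed. */
typedef size_t (*reclaim_fn)(size_t nr_to_reclaim);

struct reclaim_watchdog {
    reclaim_fn ext_policy;      /* custom policy, may misbehave */
    reclaim_fn default_policy;  /* built-in fallback, assumed to make progress */
    unsigned int stalls;        /* consecutive ext_policy calls with zero progress */
    unsigned int max_stalls;    /* assumed threshold before ejecting ext_policy */
    bool ejected;               /* once set, ext_policy is never called again */
};

static size_t watchdog_reclaim(struct reclaim_watchdog *wd, size_t nr)
{
    if (!wd->ejected && wd->ext_policy) {
        size_t got = wd->ext_policy(nr);

        if (got > 0) {
            wd->stalls = 0;     /* progress resets the stall counter */
            return got;
        }
        if (++wd->stalls >= wd->max_stalls)
            wd->ejected = true; /* kick out the misbehaving policy */
    }
    /* No progress (or already ejected): make sure pages still get reclaimed. */
    return wd->default_policy(nr);
}

/* Example policies for exercising the sketch. */
static size_t stuck_policy(size_t nr)
{
    (void)nr;
    return 0;                   /* never reclaims anything */
}

static size_t default_reclaim(size_t nr)
{
    return nr;                  /* pretend the default always succeeds */
}
```

With max_stalls set to 3, a policy that never reclaims anything is ejected
on its third consecutive stall, and every call still returns pages via the
default path in the meantime.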
> In any case, we can look at all that once we are in a better place with
> reclaim, which Shakeel's proposal focuses on and I'm very much in favour
> of! :)
>
> Cheers, Lorenzo
>