Re: [PATCH v3 2/4] mm/zswap: Implement proactive writeback

From: Yosry Ahmed

Date: Wed Jun 03 2026 - 14:43:55 EST

On Wed, Jun 3, 2026 at 11:34 AM Nhat Pham <nphamcs@xxxxxxxxx> wrote:
>
> On Wed, Jun 3, 2026 at 11:26 AM Yosry Ahmed <yosry@xxxxxxxxxx> wrote:
> >
> > > > > Is the main difference that we are scanning in batches here? I think we
> > > > > can have shrink_memcg() do that too. If anything, it might make the
> > > > > shrinker more efficient. Over-reclaim is ofc a concern, and especially
> > > > > in the zswap_store() path as the overhead can be noticeable. Maybe we
> > > > > can parameterize the batch size based on the code path.
> > > > >
> > > > > Nhat, what do you think?
> > > >
> > > > Nhat, since we now have the referenced-based second chance algorithm,
> > > > should we consider doing batch writeback for shrink_memcg() as well?
> > >
> > > I just take a look at shrink_memcg() and realized it's reclaiming one
> > > page at a time. Thanks for the reminder - I hated it.
> > >
> > > Please batchify it if it makes your life easier :) We don't reclaim
> > > "just one page/object" anywhere else in the reclaim path, Sure, it
> > > adds a bit of latency to zswap_store() if we reached cgroup limit, but
> > > IMHO if we hit zswap.max limit at zswap_store() time, that is already
> > > the slowest of slow path that you should have avoided with proactive
> > > reclaim/zswap shrinker in the first place. And, setting zswap.max does
> > > not make sense to me in the first place (I can write a whole essay
> > > about it).
> >
> > Should we batchify shrink_memcg() from the shrinker and background
> > writeback, but leave the synchronous zswap_store() path to reclaim one
> > page for this series at least to avoid potential regressions?
> >
> > I think this change specifically needs more intensive testing (vs the
> > other code paths).
>
> I'm fine with having shrink_memcg() takes a batch_size argument for now :)
>
> I suspect not a lot of people invokes the shrink_memcg() synchronous
> path in zswap store though. Setting zswap.max is hard (as it involves
> guessing compression ratio ahead of time) and induces quite a bit of
> overhead (obj_cgroup_may_zswap() does a force flush for every store if
> you set zswap.max to a value other than 0 and max).

Yeah I agree, but I just wanna make sure we don't completely kill
performance in case anyone is actually doing that.

Regarding the force flush, it's unfortunate that we need to rely on a
stat for internal limit checking. It might make sense to use page
counters here, which already support hierarchical charging and
whatnot. We'll need something similar to the objcg approach to charge
sub-pages (or maybe just reuse that somehow?).