Re: [PATCH] mm: remove '!root_reclaim' checking in should_abort_scan()
From: Zhaoyang Huang
Date: Tue Mar 17 2026 - 08:32:44 EST
On Tue, Mar 17, 2026 at 3:52 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Mon 16-03-26 14:09:52, T.J. Mercier wrote:
> > On Mon, Mar 16, 2026 at 1:02 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > >
> > > On Thu 12-02-26 11:21:11, zhaoyang.huang wrote:
> > > > From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> > > >
> > > > Nowadays, Android replaces madvise with memory.reclaim to implement
> > > > user space memory management, which wants to reclaim a certain amount of
> > > > a memcg's memory. However, oversized reclaim and high latency are observed,
> > > > as there is no limitation over nr_reclaimed inside try_to_shrink_lruvec
> > > > when MGLRU is enabled. This could also affect all non-root reclaim,
> > > > such as reclaim_high.
> > > > Commit 'b82b530740b9' ("mm: vmscan: restore incremental cgroup
> > > > iteration") introduced sc->memcg_full_walk to limit the walk range of
> > > > mem_cgroup_iter and keep fairness among the descendants of one memcg.
> > > > This commit makes a single memcg's scanning more precise by
> > > > removing the 'if (!root_reclaim)' check inside
> > > > should_abort_scan().
> > >
> > > This changelog, similar to its previous version, is lacking details on
> > > what exactly is going on. How much over-reclaim are we talking about
> > > here? Is this MGLRU specific?
> >
> > Hi Michal,
> >
> > This came from https://lore.kernel.org/all/20260210054312.303129-1-zhaoyang.huang@xxxxxxxxxx/
> >
> > Zhaoyang would have to provide numbers, but yes this is MGLRU specific.
> >
> > > Why doesn't our standard over-reclaim
> > > protection work?
> >
> > "there is no limitation over nr_reclaimed inside try_to_shrink_lruvec"
> > This means that for proactive reclaim the check for sc->nr_reclaimed
> > >= sc->nr_to_reclaim is skipped, because the !root_reclaim(sc)
> > condition is hit first. So we never abort based on the value of
> > sc->nr_reclaimed, which can lead to overreclaim.
> >
> > For try_to_free_mem_cgroup_pages -> shrink_node_memcgs ->
> > shrink_lruvec -> lru_gen_shrink_lruvec -> try_to_shrink_lruvec, the
> > !root_reclaim(sc) check was there for reclaim fairness, which was
> > necessary before commit 'b82b530740b9' ("mm: vmscan: restore
> > incremental cgroup iteration") because the fairness depended on
> > attempted proportional reclaim from every memcg under the target
> > memcg. However after commit 'b82b530740b9' there is no longer a need
> > to visit every memcg to ensure fairness, hooray. The problem is that for
> > large lruvecs, the lack of a check against sc->nr_to_reclaim inside
> > try_to_shrink_lruvec (caused by the continued presence of the
> > !root_reclaim(sc) check) can cause overreclaim. The non-MGLRU
> > implementation in shrink_lruvec already checks nr_reclaimed against
> > nr_to_reclaim.
>
> OK, this describes the underlying problem much better. It should go into
> the changelog, including an explanation of why MGLRU cannot follow the
> traditional reclaim bail-out logic.
Patch v2 sent with an updated commit message. Thanks.
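
For illustration only, the behaviour T.J. describes above can be modelled
in user space. This is a minimal sketch, not the kernel code: struct
model_sc, model_should_abort_scan() and model_shrink_lruvec() are
hypothetical names, and the model assumes every scanned page is reclaimed
and that the loop is bounded only by the lruvec size. The
keep_memcg_exemption flag stands for the '!root_reclaim' early return that
the patch removes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct scan_control. */
struct model_sc {
	unsigned long nr_to_reclaim;	/* reclaim target */
	unsigned long nr_reclaimed;	/* pages reclaimed so far */
	bool root_reclaim;		/* false for memcg-targeted reclaim */
};

/*
 * Model of the abort check. With keep_memcg_exemption set, non-root
 * reclaim never aborts, so nr_reclaimed is never compared against the
 * target. With it cleared, the target check applies to every caller.
 */
static bool model_should_abort_scan(const struct model_sc *sc,
				    bool keep_memcg_exemption)
{
	if (keep_memcg_exemption && !sc->root_reclaim)
		return false;

	return sc->nr_reclaimed >= sc->nr_to_reclaim;
}

/*
 * Shrink one lruvec of lruvec_pages pages in fixed-size batches.
 * Assumption: every page in a batch is reclaimed.
 * Returns the total number of pages reclaimed.
 */
static unsigned long model_shrink_lruvec(struct model_sc *sc,
					 unsigned long lruvec_pages,
					 unsigned long batch,
					 bool keep_memcg_exemption)
{
	assert(batch > 0);

	while (lruvec_pages > 0) {
		unsigned long n = lruvec_pages < batch ? lruvec_pages : batch;

		lruvec_pages -= n;
		sc->nr_reclaimed += n;

		if (model_should_abort_scan(sc, keep_memcg_exemption))
			break;
	}

	return sc->nr_reclaimed;
}
```

In this model, a memcg-targeted request (root_reclaim == false) for 64
pages against a 4096-page lruvec drains the whole lruvec when the exemption
is kept, and stops at the target once it is removed.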
>
> Thanks!
> --
> Michal Hocko
> SUSE Labs