[PATCH 0/3] mm/memcontrol: control demotion in memcg reclaim
From: Bing Jiao
Date: Tue Mar 17 2026 - 19:17:43 EST
In tiered-memory systems, NUMA demotion counts towards reclaim targets
in shrink_folio_list(), but it does not reduce the total memory usage of
a memcg.
In memcg direct reclaim paths (charge-triggered or manual limit writes),
this leads to "fake progress" where the reclaim loop concludes it has
satisfied the memory request without actually reducing the cgroup's charge.
This results in inefficient reclaim loops, wasted CPU time, migration of
all pages to far-tier nodes, and potentially premature OOM kills when the
cgroup is under memory pressure but demotion is still possible.
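To illustrate the "fake progress" effect, here is a minimal userspace toy
model. It is not kernel code: struct toy_memcg, toy_usage() and
toy_reclaim() are hypothetical names made up for this sketch, and the model
assumes a pass either demotes pages or frees them, never both.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_memcg {
	size_t near_pages;	/* charged pages on the top tier */
	size_t far_pages;	/* charged pages on the far tier */
};

/* Total charge of the toy cgroup, regardless of tier. */
static size_t toy_usage(const struct toy_memcg *cg)
{
	return cg->near_pages + cg->far_pages;
}

/*
 * One reclaim pass over the near tier. Returns the number of pages the
 * pass reports as "reclaimed". With demotion allowed, pages only move to
 * the far tier, so the report grows while the charge stays the same.
 */
static size_t toy_reclaim(struct toy_memcg *cg, size_t nr, bool allow_demotion)
{
	size_t done;

	assert(cg != NULL);
	done = nr < cg->near_pages ? nr : cg->near_pages;
	cg->near_pages -= done;
	if (allow_demotion)
		cg->far_pages += done;	/* moved, still charged */
	return done;
}
```

In this model, asking for 32 pages from a cgroup with 100 near-tier pages
reports 32 pages of progress either way, but toy_usage() only drops when
demotion is disallowed.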
This series fixes this issue by disabling demotion in memcg-specific
direct reclaim paths and provides user control for proactive reclaim.
Patch 1: Fixes a state leak in try_charge_memcg() where reclaim_options
were modified and carried over to retries improperly.
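The general shape of such a leak can be sketched in userspace as follows.
This is an illustration of the pattern only and does not reproduce the
actual patch: TOY_MAY_SWAP, toy_options_leaky() and toy_options_fixed() are
hypothetical names, and the per-attempt condition is an assumed stand-in for
whatever causes the real code to modify reclaim_options.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAY_SWAP (1u << 0)	/* hypothetical option bit */

/*
 * Leaky variant: options are initialised once before the retry loop, so a
 * bit cleared during one attempt stays cleared for all later attempts.
 * Returns the options seen by the last attempt.
 */
static unsigned int toy_options_leaky(const bool *clear_swap, int attempts)
{
	unsigned int options = TOY_MAY_SWAP;
	int i;

	assert(attempts > 0);
	for (i = 0; i < attempts; i++) {
		if (clear_swap[i])
			options &= ~TOY_MAY_SWAP;
	}
	return options;
}

/*
 * Fixed variant: options are recomputed at the start of every attempt, so
 * each attempt only reflects its own conditions.
 */
static unsigned int toy_options_fixed(const bool *clear_swap, int attempts)
{
	unsigned int options = 0;
	int i;

	assert(attempts > 0);
	for (i = 0; i < attempts; i++) {
		options = TOY_MAY_SWAP;	/* reset per attempt */
		if (clear_swap[i])
			options &= ~TOY_MAY_SWAP;
	}
	return options;
}
```

With clear_swap = { true, false }, the leaky variant still has the bit
cleared on the second attempt, while the fixed variant restores it.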
Patch 2: Introduces MEMCG_RECLAIM_NO_DEMOTION and disables demotion in
memcg direct reclaim paths.
Patch 3: Adds a 'demote=' option to the proactive reclaim interface
(memory.reclaim), allowing users to explicitly enable demotion if
desired, while defaulting it to disabled for consistency.
Bing Jiao (3):
  mm/memcontrol: fix reclaim_options leak in try_charge_memcg()
  mm/memcontrol: disable demotion in memcg direct reclaim
  mm/vmscan: add demote= option to proactive reclaim

 include/linux/swap.h |  1 +
 mm/memcontrol-v1.c   | 10 ++++++++--
 mm/memcontrol.c      | 17 ++++++++++++-----
 mm/vmscan.c          | 11 +++++++++++
 4 files changed, 32 insertions(+), 7 deletions(-)
--
2.53.0.851.ga537e3e6e9-goog