Re: [PATCH v3 2/2] mm/mglru: maintain workingset refault context across state transitions

From: Leno Hou

Date: Tue Mar 17 2026 - 23:41:55 EST

On 3/18/26 11:30 AM, Kairui Song wrote:

On Mon, Mar 16, 2026 at 1:56 PM Leno Hou via B4 Relay
<devnull+lenohou.gmail.com@xxxxxxxxxx> wrote:

From: Leno Hou <lenohou@xxxxxxxxx>

When MGLRU state is toggled dynamically, existing shadow entries (eviction
tokens) lose their context. Traditional LRU and MGLRU handle workingset
refaults using different logic. Without context, shadow entries
re-activated by the "wrong" reclaim logic trigger excessive page
activations (pgactivate) and system thrashing, as the kernel cannot
correctly distinguish if a refaulted page was originally managed by
MGLRU or the traditional LRU.

This patch introduces shadow entry context tracking:

- Encode MGLRU origin: Introduce WORKINGSET_MGLRU_SHIFT into the shadow
entry (eviction token) encoding. This adds an 'is_mglru' bit to shadow
entries, allowing the kernel to correctly identify the originating
reclaim logic for a page even after the global MGLRU state has been
toggled.
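
For illustration only (this is not code from the patch): a minimal sketch of how an origin bit could be packed into a shadow entry, assuming a simplified 64-bit token where bit 0 holds the flag. The names SKETCH_MGLRU_SHIFT, sketch_pack_shadow() and sketch_unpack_shadow() are hypothetical; the real encoding in mm/workingset.c also packs other fields that are omitted here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: bit 0 is the 'is_mglru' flag, the remaining
 * 63 bits carry the eviction payload. Not the kernel's real layout. */
#define SKETCH_MGLRU_SHIFT 1u
#define SKETCH_MGLRU_MASK  ((UINT64_C(1) << SKETCH_MGLRU_SHIFT) - 1)

/* Pack the payload and the origin flag into one token. The payload
 * is shifted left, so its top bit is lost to make room for the flag. */
static uint64_t sketch_pack_shadow(uint64_t eviction, bool is_mglru)
{
    return (eviction << SKETCH_MGLRU_SHIFT) | (is_mglru ? 1u : 0u);
}

/* Recover the payload and the origin flag from a token. */
static void sketch_unpack_shadow(uint64_t shadow, uint64_t *eviction,
                                 bool *is_mglru)
{
    *is_mglru = (shadow & SKETCH_MGLRU_MASK) != 0;
    *eviction = shadow >> SKETCH_MGLRU_SHIFT;
}
```

The flag survives a round trip regardless of the global MGLRU state, but the payload has one bit less to work with.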

Hi Leno,

I really don't think it's a good idea to waste one bit there just for
the transition state, which is rarely used. And if you switch between
MGLRU and non-MGLRU, the refault distance check is already more or less
meaningless unless we unify their reactivation logic.

BTW I tried that some time ago: https://lwn.net/Articles/945266/

- Refault logic dispatch: Use this 'is_mglru' bit in workingset_refault()
and workingset_test_recent() to dispatch refault events to the correct
handler (lru_gen_refault vs. traditional workingset refault).
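
For illustration only (not code from the patch): a minimal sketch of dispatching on the bit stored in the entry itself rather than on the current global state, assuming the flag sits in bit 0 of a simplified 64-bit token. The enum and sketch_pick_refault_handler() are hypothetical names standing in for the choice between lru_gen_refault and the traditional path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two refault paths. */
enum sketch_handler {
    SKETCH_HANDLER_CLASSIC, /* traditional workingset refault */
    SKETCH_HANDLER_MGLRU    /* lru_gen_refault */
};

/* Choose the handler from the token's own origin bit (assumed to be
 * bit 0), so the result does not depend on whether MGLRU is enabled
 * at the time of the refault. */
static enum sketch_handler sketch_pick_refault_handler(uint64_t shadow)
{
    bool is_mglru = (shadow & 1u) != 0;

    return is_mglru ? SKETCH_HANDLER_MGLRU : SKETCH_HANDLER_CLASSIC;
}
```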

Hmm, restoring the folio ref count in MGLRU is not the same thing as
reactivation or restoring the workingset flag in the non-MGLRU case, and
the two are not really comparable. Not sure this will be helpful.

Maybe for now we just ignore this part. The shadow is just a hint after
all, and switching the LRU at runtime already has a huge performance
impact and is not recommended, so the shadow part is trivial compared to
that.

Hi Kairui,

Thank you for the insightful feedback. I completely agree with your assessment: the workingset refault context is indeed just a hint, and trying to align or convert these tokens between MGLRU and non-MGLRU states is overly complex and likely unnecessary, especially given that runtime switching is an extreme and infrequent operation.

I have decided to take your advice and completely remove the patches related to workingset refault context tracking and folio_lru_gen state checking.

My revised patch will focus solely on the lru_drain_core state machine, which is the minimal and robust approach to address the primary issue: preventing cgroup OOMs caused by the race condition during state transitions. This should significantly reduce the complexity and risk of the patch series.

I've sent a simplified v4 patch series that focuses strictly on the lru_drain_core logic, removing all the disputed context-tracking code.
This patch was tested on the latest 7.0.0-rc1 with 1000 on/off toggle
iterations, and no OOM occurred.

Thank you for helping me sharpen the focus of this fix.

Best regards,
Leno Hou