Re: [PATCH v3 2/2] mm/mglru: maintain workingset refault context across state transitions
From: Leno Hou
Date: Tue Mar 17 2026 - 23:41:55 EST
On 3/18/26 11:30 AM, Kairui Song wrote:
On Mon, Mar 16, 2026 at 1:56 PM Leno Hou via B4 Relay
<devnull+lenohou.gmail.com@xxxxxxxxxx> wrote:
From: Leno Hou <lenohou@xxxxxxxxx>
When MGLRU state is toggled dynamically, existing shadow entries (eviction
tokens) lose their context. Traditional LRU and MGLRU handle workingset
refaults using different logic. Without context, shadow entries
re-activated by the "wrong" reclaim logic trigger excessive page
activations (pgactivate) and system thrashing, as the kernel cannot
correctly distinguish if a refaulted page was originally managed by
MGLRU or the traditional LRU.
This patch introduces shadow entry context tracking:
- Encode MGLRU origin: Introduce WORKINGSET_MGLRU_SHIFT into the shadow
entry (eviction token) encoding. This adds an 'is_mglru' bit to shadow
entries, allowing the kernel to correctly identify the originating
reclaim logic for a page even after the global MGLRU state has been
toggled.
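For illustration, here is a minimal user-space sketch of what carrying an origin bit in the eviction token could look like. The layout, the SKETCH_* names, and both helpers are assumptions made for this example. The real encoding in mm/workingset.c also packs memcg and node IDs, and the patch's actual bit placement is not shown in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout, lowest bits first:
 *   bit 0     : is_mglru   (token was written while MGLRU was active)
 *   bit 1     : workingset (folio was part of the workingset)
 *   bits 2..63: eviction counter
 */
#define SKETCH_MGLRU_SHIFT      1
#define SKETCH_WORKINGSET_SHIFT 1
#define SKETCH_FLAG_BITS (SKETCH_MGLRU_SHIFT + SKETCH_WORKINGSET_SHIFT)

/* Pack the eviction counter and both flags into one 64-bit token. */
static inline uint64_t sketch_pack_shadow(uint64_t eviction, bool workingset,
                                          bool is_mglru)
{
    /* Every flag bit is taken from the space left for the counter. */
    assert((eviction >> (64 - SKETCH_FLAG_BITS)) == 0);

    uint64_t token = eviction;
    token = (token << SKETCH_WORKINGSET_SHIFT) | (workingset ? 1u : 0u);
    token = (token << SKETCH_MGLRU_SHIFT) | (is_mglru ? 1u : 0u);
    return token;
}

/* Reverse of sketch_pack_shadow(): peel the fields off in opposite order. */
static inline void sketch_unpack_shadow(uint64_t token, uint64_t *eviction,
                                        bool *workingset, bool *is_mglru)
{
    *is_mglru = (token & ((UINT64_C(1) << SKETCH_MGLRU_SHIFT) - 1)) != 0;
    token >>= SKETCH_MGLRU_SHIFT;
    *workingset = (token & ((UINT64_C(1) << SKETCH_WORKINGSET_SHIFT) - 1)) != 0;
    token >>= SKETCH_WORKINGSET_SHIFT;
    *eviction = token;
}
```

The assert in the pack helper makes the trade-off visible: each extra flag bit leaves one bit fewer for the eviction counter.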
Hi Leno,
I really don't think it's a good idea to waste one bit there just for
the transition state, which is rarely used. And if you switch between
MGLRU and non-MGLRU, the refault distance check is already more or less
meaningless unless we unify their reactivation logic.
BTW I tried that some time ago: https://lwn.net/Articles/945266/
- Refault logic dispatch: Use this 'is_mglru' bit in workingset_refault()
and workingset_test_recent() to dispatch refault events to the correct
handler (lru_gen_refault vs. traditional workingset refault).
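As a rough sketch of the dispatch idea described above: the handler is chosen from the bit stored in the token, not from whatever the global MGLRU state happens to be at refault time. The enum, the bit position, and sketch_pick_handler() are hypothetical names for this example and do not reflect the patch's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Which refault path would handle the token (illustrative only). */
enum sketch_handler {
    SKETCH_HANDLER_CLASSIC, /* traditional workingset refault path */
    SKETCH_HANDLER_MGLRU    /* lru_gen_refault-style path */
};

/* Assumed position of the origin bit inside the token. */
#define SKETCH_IS_MGLRU_BIT UINT64_C(1)

static inline enum sketch_handler sketch_pick_handler(uint64_t token,
                                                      bool mglru_enabled_now)
{
    /*
     * The current global state is deliberately ignored: the token
     * records which reclaim logic evicted the page, and that is what
     * selects the handler.
     */
    (void)mglru_enabled_now;

    return (token & SKETCH_IS_MGLRU_BIT) ? SKETCH_HANDLER_MGLRU
                                         : SKETCH_HANDLER_CLASSIC;
}
```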
Hmm, restoring the folio ref count in MGLRU is not the same thing as
reactivation or restoring the workingset flag in the non-MGLRU case, and
the two are not really comparable. Not sure this will be helpful.
Maybe for now we just ignore this part. The shadow is just a hint after
all, and switching the LRU at runtime already has a huge performance
impact and is not recommended, so the shadow part is trivial compared
to that.
Hi Kairui,
Thank you for the insightful feedback. I completely agree with your assessment: the workingset refault context is indeed just a hint, and trying to align or convert these tokens between MGLRU and non-MGLRU states is overly complex and likely unnecessary, especially given that runtime switching is an extreme and infrequent operation.
I have decided to take your advice and completely remove the patches related to workingset refault context tracking and folio_lru_gen state checking.
My revised patch will focus solely on the lru_drain_core state machine, which is the minimal and robust approach to address the primary issue: preventing cgroup OOMs caused by the race condition during state transitions. This should significantly reduce the complexity and risk of the patch series.
I've sent a simplified v4 patch series that focuses strictly on the lru_drain_core logic, removing all the disputed context-tracking code.
The patch was tested on the latest 7.0.0-rc1 with 1000 iterations of
toggling MGLRU on and off, and no OOM occurred.
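A toggle loop of this kind could be sketched as follows. This is an assumption about the shape of such a test, not the harness actually used: the helper names are made up for the example, the control file is passed in as a parameter, and the sketch neither generates memory pressure nor checks for OOM events.

```c
#include <stdio.h>

/* Write one short string to a control file; returns 0 on success. */
static int sketch_write_str(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    int rc = (fputs(value, f) == EOF) ? -1 : 0;
    if (fclose(f) == EOF) /* buffered write errors surface here */
        rc = -1;
    return rc;
}

/*
 * Switch the feature on and off `iterations` times.
 * Returns the number of complete on/off cycles that succeeded.
 */
static int sketch_toggle_loop(const char *path, int iterations)
{
    int done = 0;

    for (int i = 0; i < iterations; i++) {
        if (sketch_write_str(path, "y") != 0)
            break;
        if (sketch_write_str(path, "n") != 0)
            break;
        done++;
    }
    return done;
}
```

On a kernel built with MGLRU, calling sketch_toggle_loop("/sys/kernel/mm/lru_gen/enabled", 1000) as root would cycle the state 1000 times. A meaningful reproduction would also need a memory-hungry workload running inside a cgroup while the loop runs.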
Thank you for helping me sharpen the focus of this fix.
Best regards,
Leno Hou