Re: [PATCH v1 2/3] mm: process_mrelease: skip LRU movement for exclusive file folios
From: David Hildenbrand (Arm)
Date: Thu Apr 30 2026 - 02:08:16 EST
On 4/29/26 16:44, Michal Hocko wrote:
> On Wed 29-04-26 15:07:04, David Hildenbrand wrote:
>> On 4/29/26 12:33, Michal Hocko wrote:
>>>
>>> While the oom is the only current kernel user of MMF_UNSTABLE (in a
>>> sense it sets the flag) the flag should denote that any page faults are
>>> unreliable because it might fault in a fresh memory and user would lose
>>> the previous content without knowing that. Not sure MMF_OOM_REAPING
>>> would reflect that reality better.
>>
>> We use it for failed fork() as well, but that's slightly different semantics (no
>> real page faults ever made sense).

Well, there is a difference: a failed-fork process was never scheduled and will
never get scheduled.

In fact, we added the MMF_UNSTABLE to the fork path in

  commit 64c37e134b120fb462fb4a80694bfb8e7be77b14
  Author: Liam R. Howlett <liam@xxxxxxxxxxxxx>
  Date:   Mon Jan 27 12:02:21 2025 -0500

      kernel: be more careful about dup_mmap() failures and uprobe registering

      If a memory allocation fails during dup_mmap(), the maple tree can be left
      in an unsafe state for other iterators besides the exit path. All the
      locks are dropped before the exit_mmap() call (in mm/mmap.c), but the
      incomplete mm_struct can be reached through (at least) the rmap finding
      the vmas which have a pointer back to the mm_struct.

      Up to this point, there have been no issues with being able to find an
      mm_struct that was only partially initialised. Syzbot was able to make
      the incomplete mm_struct fail with recent forking changes, so it has been
      proven unsafe to use the mm_struct that hasn't been initialised, as
      referenced in the link below.

      Although 8ac662f5da19f ("fork: avoid inappropriate uprobe access to
      invalid mm") fixed the uprobe access, it does not completely remove the
      race.

      This patch sets the MMF_OOM_SKIP to avoid the iteration of the vmas on the
      oom side (even though this is extremely unlikely to be selected as an oom
      victim in the race window), and sets MMF_UNSTABLE to avoid other potential
      users from using a partially initialised mm_struct.

Which was later changed in

  commit 43873af772f8138c5cb4b76dde9c26339e89be3b
  Author: Liam R. Howlett <liam@xxxxxxxxxxxxx>
  Date:   Wed Jan 21 11:49:42 2026 -0500

      mm: change dup_mmap() recovery

      When the dup_mmap() fails during the vma duplication or setup, don't write
      the XA_ZERO entry in the vma tree. Instead, destroy the tree and free the
      new resources, leaving an empty vma tree.

      Using XA_ZERO introduced races where the vma could be found between
      dup_mmap() dropping all locks and exit_mmap() taking the locks. The race
      can occur because the mm can be reached through the other trees via
      successfully copied vmas and other methods such as the swapoff code.

      ...

and I am not sure if MMF_UNSTABLE is still required, as we no longer leave these
stale VMA copies in the maple tree.

The process might now simply look like any other process that is being torn down.
But we'd have to hear from Liam :)

>
> The bottom line is the same. Make sure PF fails rather than silently
> provide potentially corrupted data.
>
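
To spell out that bottom line, here is a minimal userspace sketch of the
"fail the fault instead of handing out fresh memory" rule. It is not kernel
code: struct model_mm, model_reap() and model_handle_fault() are hypothetical
names made up for illustration, and the single "unstable" bool only stands in
for MMF_UNSTABLE.

```c
#include <stdbool.h>

/* Userspace model only; none of these types or helpers exist in the kernel. */
enum model_fault {
	MODEL_FAULT_OK = 0,
	MODEL_FAULT_SIGBUS = 1,
};

struct model_mm {
	bool unstable;     /* stands in for MMF_UNSTABLE */
	bool page_present; /* hypothetical: state of one user page */
};

/* Something (e.g. a reaper) drops memory behind the owner's back. */
static void model_reap(struct model_mm *mm)
{
	mm->unstable = true;       /* mark first, so later faults are refused */
	mm->page_present = false;  /* the old content is gone */
}

/* Fault path: refuse to repopulate an address space marked unstable. */
static enum model_fault model_handle_fault(struct model_mm *mm)
{
	if (mm->unstable)
		return MODEL_FAULT_SIGBUS;

	/* Normal case: a fresh (zeroed) page would be installed here. */
	mm->page_present = true;
	return MODEL_FAULT_OK;
}
```

Without the flag check, the fault after model_reap() would succeed and the
user would see fresh memory where the old content used to be, without any
indication that it was lost.
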
>> Looking at the original patch here, using MMF_OOM_REAPING to modify zapping
>> behavior would be clearer than MMF_UNSTABLE, I guess.
>
> Ohh, you mean to add a new flag, right?

We could do that as well, if it's of any help.

--
Cheers,
David