Re: [PATCH] mm/memory: fix PMD/PUD checks in follow_pfnmap_start()

From: David Hildenbrand (Arm)

Date: Tue Mar 24 2026 - 08:52:34 EST


On 3/24/26 12:04, Lorenzo Stoakes (Oracle) wrote:
> On Mon, Mar 23, 2026 at 09:20:18PM +0100, David Hildenbrand (Arm) wrote:
>> follow_pfnmap_start() suffers from two problems:
>>
>> (1) We are not re-fetching the pmd/pud after taking the PTL
>>
>> Therefore, we are not properly stabilizing what the lock actually
>> protects. If there is concurrent zapping, we would indicate to the
>> caller that we found an entry, however, that entry might already have
>> been invalidated, or contain a different PFN after taking the lock.
>>
>> Properly use pmdp_get() / pudp_get() after taking the lock.
>>
>> (2) pmd_leaf() / pud_leaf() are not well defined on non-present entries
>>
>> pmd_leaf()/pud_leaf() could wrongly trigger on non-present entries.
>>
>> There is no real guarantee that pmd_leaf()/pud_leaf() returns something
>> reasonable on non-present entries. Most architectures indeed either
>> perform a present check or make it work by smart use of flags.
>
> It seems huge page split is the main user via pmd_invalidate() ->
> pmd_mkinvalid().
>
> And I guess this is the kind of thing you mean by smart use of flags, for
> x86-64:

Exactly.

[...]

>
>>
>> However, for example loongarch checks the _PAGE_HUGE flag in pmd_leaf(),
>> and always sets the _PAGE_HUGE flag in __swp_entry_to_pmd(). Whereas
>> pmd_trans_huge() explicitly checks pmd_present(), pmd_leaf() does not
>> do that.
>
> But pmd_present() checks for _PAGE_HUGE, and if set checks
> whether one of _PAGE_PRESENT, _PAGE_PROTNONE, _PAGE_PRESENT_INVALID is set,
> and pmd_mkinvalid() sets _PAGE_PRESENT_INVALID (clearing _PAGE_PRESENT,
> _VALID, _DIRTY, _PROTNONE) so it'd return true.

pmd_present() will correctly indicate "not present" for, say, a softleaf
migration entry.

However, pmd_leaf() will indicate "leaf" for a softleaf migration entry.

So not checking pmd_present() will actually treat non-present migration
entries as present leafs in this function, which is wrong in the context
of this function.

We're only walking present entries here, where things like pmd_pfn(pmd)
etc. make sense.
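
To make the difference concrete, here is a toy userspace model. The bit
layout and all toy_* names are made up for illustration; this is not
loongarch's (or any architecture's) actual definition, just a sketch of a
leaf check that only looks at a "huge" flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Made-up bit layout, for illustration only. */
#define TOY_PRESENT	(UINT64_C(1) << 0)
#define TOY_HUGE	(UINT64_C(1) << 1)
#define TOY_SHIFT	12

typedef struct { uint64_t val; } toy_pmd_t;

/* Present huge mapping of @pfn. */
static inline toy_pmd_t toy_mk_huge(uint64_t pfn)
{
	return (toy_pmd_t){ (pfn << TOY_SHIFT) | TOY_HUGE | TOY_PRESENT };
}

/* Non-present entry (think: migration entry) that still carries the huge bit. */
static inline toy_pmd_t toy_mk_nonpresent(uint64_t payload)
{
	return (toy_pmd_t){ (payload << TOY_SHIFT) | TOY_HUGE };
}

static inline bool toy_pmd_present(toy_pmd_t pmd)
{
	return (pmd.val & TOY_PRESENT) != 0;
}

/* Flag-only leaf check: does not look at the present bit at all. */
static inline bool toy_pmd_leaf(toy_pmd_t pmd)
{
	return (pmd.val & TOY_HUGE) != 0;
}

/* What a walker that wants to extract a PFN actually needs to ask. */
static inline bool toy_pmd_present_leaf(toy_pmd_t pmd)
{
	return toy_pmd_present(pmd) && toy_pmd_leaf(pmd);
}
```

In this model, toy_pmd_leaf() is true for the non-present entry, so a
walker relying on it alone would go on to interpret the payload bits as a
PFN. Only the combined check rejects it.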

>
> pmd_leaf() simply checks to see if _PAGE_HUGE is set which should be
> retained on split so should all still have worked?
>
> But anyway this is still worthwhile I think.
>
>>
>> Let's check pmd_present()/pud_present() before assuming "this is a
>> present PMD leaf" when spotting pmd_leaf()/pud_leaf(), like other page
>> table handling code that traverses user page tables does.
>>
>> Given that non-present PMD entries are likely rare in VM_IO|VM_PFNMAP,
>> (1) is likely more relevant than (2). It is questionable how often (1)
>> would actually trigger, but let's CC stable to be sure.
>>
>> This was found by code inspection.
>>
>> Fixes: 6da8e9634bb7 ("mm: new follow_pfnmap API")
>> Cc: stable@xxxxxxxxxxxxxxx
>> Signed-off-by: David Hildenbrand (Arm) <david@xxxxxxxxxx>
>
> This looks correct to me, so:
>
> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@xxxxxxxxxx>

Thanks!

>
>> ---
>> Gave it a quick test in a VM with MM selftests etc, but I am not sure if
>> I actually trigger the follow_pfnmap machinery.
>> ---
>> mm/memory.c | 18 +++++++++++++++---
>> 1 file changed, 15 insertions(+), 3 deletions(-)
>>
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 219b9bf6cae0..2921d35c50ae 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -6868,11 +6868,16 @@ int follow_pfnmap_start(struct follow_pfnmap_args *args)
>>
>>  	pudp = pud_offset(p4dp, address);
>>  	pud = pudp_get(pudp);
>> -	if (pud_none(pud))
>> +	if (!pud_present(pud))
>>  		goto out;
>>  	if (pud_leaf(pud)) {
>>  		lock = pud_lock(mm, pudp);
>> -		if (!unlikely(pud_leaf(pud))) {
>> +		pud = pudp_get(pudp);
>> +
>> +		if (unlikely(!pud_present(pud))) {
>> +			spin_unlock(lock);
>> +			goto out;
>> +		} else if (unlikely(!pud_leaf(pud))) {
>
> Tiny nit, but no need for else here. Sometimes compilers complain about
> this but not sure if such pedantry is enabled in default kernel compiler
> flags :)

You mean

	if (unlikely(!pud_present(pud))) {
		spin_unlock(lock);
		goto out;
	}
	if (...) {

?

That just adds a line of code without any benefit IMHO. And we use the
"else if" form after such an early exit all over the place :)

In fact, I will beat any C compiler with the C standard that complains
about that ;)
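
For reference, the overall shape (lockless peek, take the lock, re-fetch,
re-check, with the "else if" chain) can be sketched in plain userspace C.
Everything here is hypothetical: the toy_* names, the flag bits, the
return codes, the spinlock built on atomic_flag, and the racer hook that
simulates a concurrent modification. It is not the mm/memory.c code.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PRESENT	(UINT64_C(1) << 0)
#define TOY_LEAF	(UINT64_C(1) << 1)

#define TOY_GONE	(-1)	/* entry is no longer present */
#define TOY_NOT_LEAF	(-2)	/* entry is present but no longer a leaf */

struct toy_slot {
	atomic_flag lock;	/* stand-in for the page table lock */
	_Atomic uint64_t val;	/* stand-in for the pud/pmd entry */
};

static void toy_lock(struct toy_slot *s)
{
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		;	/* spin */
}

static void toy_unlock(struct toy_slot *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Simulated concurrent zap: clears the entry under the lock. */
static void toy_zap(struct toy_slot *s)
{
	toy_lock(s);
	atomic_store(&s->val, 0);
	toy_unlock(s);
}

/*
 * Returns 0 with the lock held and *out set to the value seen under the
 * lock, or a negative toy code with the lock released.
 * @racer is a test hook that runs between the lockless peek and the lock.
 */
static int toy_follow_start(struct toy_slot *s,
			    void (*racer)(struct toy_slot *), uint64_t *out)
{
	uint64_t v = atomic_load(&s->val);	/* lockless peek */

	if (!(v & TOY_PRESENT))
		return TOY_GONE;
	if (!(v & TOY_LEAF))
		return TOY_NOT_LEAF;

	if (racer)
		racer(s);

	toy_lock(s);
	v = atomic_load(&s->val);		/* re-fetch under the lock */

	if (!(v & TOY_PRESENT)) {
		toy_unlock(s);
		return TOY_GONE;
	} else if (!(v & TOY_LEAF)) {
		toy_unlock(s);
		return TOY_NOT_LEAF;
	}
	*out = v;
	return 0;
}

static void toy_follow_end(struct toy_slot *s)
{
	toy_unlock(s);
}
```

Without the re-fetch under the lock, a call racing with toy_zap() would
hand the stale value from the lockless peek back to the caller.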

--
Cheers,

David