Re: [PATCH mm-unstable v3 1/5] mm: consolidate anonymous folio PTE mapping into helpers
From: Lorenzo Stoakes (Oracle)
Date: Mon Mar 16 2026 - 14:17:53 EST
On Wed, Mar 11, 2026 at 03:13:11PM -0600, Nico Pache wrote:
> The anonymous page fault handler in do_anonymous_page() open-codes the
> sequence to map a newly allocated anonymous folio at the PTE level:
> - construct the PTE entry
> - add rmap
> - add to LRU
> - set the PTEs
> - update the MMU cache.
Yikes yeah this all needs work. Thanks for looking at this!
>
> Introduce a two helpers to consolidate this duplicated logic, mirroring the
NIT: 'Introduce a two helpers' -> 'Introduce two helpers'
> existing map_anon_folio_pmd_nopf() pattern for PMD-level mappings:
>
> map_anon_folio_pte_nopf(): constructs the PTE entry, takes folio
> references, adds anon rmap and LRU. This function also handles the
> uffd_wp that can occur in the pf variant.
>
> map_anon_folio_pte_pf(): extends the nopf variant to handle MM_ANONPAGES
> counter updates, and mTHP fault allocation statistics for the page fault
> path.
MEGA nit: not sure why you're not just putting this in a bullet list, it's
just weird to see non-code indented here :)
>
> The zero-page read path in do_anonymous_page() is also untangled from the
> shared setpte label, since it does not allocate a folio and should not
> share the same mapping sequence as the write path. Make nr_pages = 1
> rather than relying on the variable. This makes it more clear that we
> are operating on the zero page only.
>
> This refactoring will also help reduce code duplication between mm/memory.c
> and mm/khugepaged.c, and provides a clean API for PTE-level anonymous folio
> mapping that can be reused by future callers.
Maybe worth mentioning the subsequent patches that will use what you set up
here?
Also, you split things out into _nopf() and _pf() variants; it might be
worth saying exactly why you're doing that, or what you are preparing for?
>
> Reviewed-by: Dev Jain <dev.jain@xxxxxxx>
> Reviewed-by: Lance Yang <lance.yang@xxxxxxxxx>
> Acked-by: David Hildenbrand (Arm) <david@xxxxxxxxxx>
> Signed-off-by: Nico Pache <npache@xxxxxxxxxx>
There are nits above and below, but overall the logic looks good, so with
the nits addressed/reasonably responded to:
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@xxxxxxxxxx>
> ---
> include/linux/mm.h | 4 ++++
> mm/memory.c | 60 +++++++++++++++++++++++++++++++---------------
> 2 files changed, 45 insertions(+), 19 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 4c4fd55fc823..9fea354bd17f 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -4903,4 +4903,8 @@ static inline bool snapshot_page_is_faithful(const struct page_snapshot *ps)
>
> void snapshot_page(struct page_snapshot *ps, const struct page *page);
>
> +void map_anon_folio_pte_nopf(struct folio *folio, pte_t *pte,
> + struct vm_area_struct *vma, unsigned long addr,
> + bool uffd_wp);
How I hate how uffd infiltrates all our code like this.
Not your fault :)
> +
> #endif /* _LINUX_MM_H */
> diff --git a/mm/memory.c b/mm/memory.c
> index 6aa0ea4af1fc..5c8bf1eb55f5 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5197,6 +5197,37 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
> return folio_prealloc(vma->vm_mm, vma, vmf->address, true);
> }
>
> +void map_anon_folio_pte_nopf(struct folio *folio, pte_t *pte,
> + struct vm_area_struct *vma, unsigned long addr,
> + bool uffd_wp)
> +{
> + unsigned int nr_pages = folio_nr_pages(folio);
const would be good
> + pte_t entry = folio_mk_pte(folio, vma->vm_page_prot);
> +
> + entry = pte_sw_mkyoung(entry);
> +
> + if (vma->vm_flags & VM_WRITE)
> + entry = pte_mkwrite(pte_mkdirty(entry), vma);
> + if (uffd_wp)
> + entry = pte_mkuffd_wp(entry);
> +
> + folio_ref_add(folio, nr_pages - 1);
> + folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
> + folio_add_lru_vma(folio, vma);
> + set_ptes(vma->vm_mm, addr, pte, entry, nr_pages);
> + update_mmu_cache_range(NULL, vma, addr, pte, nr_pages);
> +}
> +
> +static void map_anon_folio_pte_pf(struct folio *folio, pte_t *pte,
> + struct vm_area_struct *vma, unsigned long addr, bool uffd_wp)
> +{
> + unsigned int order = folio_order(folio);
const would be good here also!
> +
> + map_anon_folio_pte_nopf(folio, pte, vma, addr, uffd_wp);
> + add_mm_counter(vma->vm_mm, MM_ANONPAGES, 1 << order);
Is 1 << order strictly right here? This field is a long value, so 1L <<
order maybe? I get nervous about these shifts...
Note that folio_large_nr_pages() uses 1L << order so that does seem
preferable.
> + count_mthp_stat(order, MTHP_STAT_ANON_FAULT_ALLOC);
> +}
> +
> /*
> * We enter with non-exclusive mmap_lock (to exclude vma changes,
> * but allow concurrent faults), and pte mapped but not yet locked.
> @@ -5243,7 +5274,14 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
> pte_unmap_unlock(vmf->pte, vmf->ptl);
> return handle_userfault(vmf, VM_UFFD_MISSING);
> }
> - goto setpte;
> + if (vmf_orig_pte_uffd_wp(vmf))
> + entry = pte_mkuffd_wp(entry);
> + set_pte_at(vma->vm_mm, addr, vmf->pte, entry);
How I _despise_ how uffd is implemented in mm. Feels like open coded
nonsense spills out everywhere.
Not your fault obviously :)
> +
> + /* No need to invalidate - it was non-present before */
> + update_mmu_cache_range(vmf, vma, addr, vmf->pte,
> + /*nr_pages=*/ 1);
Is there any point in passing vmf here given you pass NULL above, and it
appears that nobody actually uses this? I guess it doesn't matter, but
seeing this immediately made me question why you set it in one and not the
other?
Maybe I'm mistaken and some arch uses it? Don't think so though.
Also can't we then just use update_mmu_cache() which is the single-page
wrapper of this AFAICT? That'd make it even simpler.
Having done this, is there any reason to keep the annoying and confusing
initial assignment of nr_pages = 1 at declaration time?
It seems that nr_pages is unconditionally assigned before it's used
anywhere now at line 5298:
nr_pages = folio_nr_pages(folio);
addr = ALIGN_DOWN(vmf->address, nr_pages * PAGE_SIZE);
...
It's kinda weird to use nr_pages again here after you go out of your way to
avoid it, preferring folio_nr_pages() in map_anon_folio_pte_nopf() and
folio_order() in map_anon_folio_pte_pf().
But yeah, ok, we do need it to align the address, and it's yucky, so maybe
leave that for now (but we can definitely stop defaulting nr_pages to 1 :)
> + goto unlock;
> }
>
> /* Allocate our own private page. */
> @@ -5267,11 +5305,6 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
> */
> __folio_mark_uptodate(folio);
>
> - entry = folio_mk_pte(folio, vma->vm_page_prot);
> - entry = pte_sw_mkyoung(entry);
> - if (vma->vm_flags & VM_WRITE)
> - entry = pte_mkwrite(pte_mkdirty(entry), vma);
> -
> vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl);
> if (!vmf->pte)
> goto release;
> @@ -5293,19 +5326,8 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
> folio_put(folio);
> return handle_userfault(vmf, VM_UFFD_MISSING);
> }
> -
> - folio_ref_add(folio, nr_pages - 1);
> - add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
> - count_mthp_stat(folio_order(folio), MTHP_STAT_ANON_FAULT_ALLOC);
> - folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
> - folio_add_lru_vma(folio, vma);
> -setpte:
> - if (vmf_orig_pte_uffd_wp(vmf))
> - entry = pte_mkuffd_wp(entry);
> - set_ptes(vma->vm_mm, addr, vmf->pte, entry, nr_pages);
> -
> - /* No need to invalidate - it was non-present before */
> - update_mmu_cache_range(vmf, vma, addr, vmf->pte, nr_pages);
> + map_anon_folio_pte_pf(folio, vmf->pte, vma, addr,
> + vmf_orig_pte_uffd_wp(vmf));
So we're going from:
entry = folio_mk_pte(...)
entry = pte_sw_mkyoung(...)
if (write)
entry = pte_mkwrite(also dirty...)
folio_ref_add(nr_pages - 1)
add_mm_counter(... MM_ANON_PAGES, nr_pages)
count_mthp_stat(folio_order(folio), MTHP_STAT_ANON_FAULT_ALLOC)
folio_add_new_anon_rmap(.., RMAP_EXCLUSIVE)
folio_add_lru_vma(folio, vma)
if (vmf_orig_pte_uffd_wp(vmf))
entry = pte_mkuffd_wp(entry)
set_ptes(mm, addr, vmf->pte, entry, nr_pages)
update_mmu_cache_range(vmf, vma, addr, vmf->pte, nr_pages)
To:
<map_anon_folio_pte_nopf>
entry = folio_mk_pte(...)
entry = pte_sw_mkyoung(...)
if (write)
entry = pte_mkwrite(also dirty...)
if (vmf_orig_pte_uffd_wp(vmf) <passed in via uffd_wp>) <-- reordered
entry = pte_mkuffd_wp(entry)
folio_ref_add(nr_pages - 1)
folio_add_new_anon_rmap(.., RMAP_EXCLUSIVE)
folio_add_lru_vma(folio, vma)
set_ptes(mm, addr, pte, entry, nr_pages)
update_mmu_cache_range(NULL, vma, addr, pte, nr_pages)
<map_anon_folio_pte_pf>
add_mm_counter(... MM_ANON_PAGES, nr_pages) <-- reordered
count_mthp_stat(folio_order(folio), MTHP_STAT_ANON_FAULT_ALLOC) <-- reordered
But the reorderings seem fine, and it is achieving the same thing.
All the parameters being passed seem correct too.
> unlock:
> if (vmf->pte)
> pte_unmap_unlock(vmf->pte, vmf->ptl);
> --
> 2.53.0
>
Cheers, Lorenzo