Re: [PATCH 2/4] mm: add a template-based fast path for zone-device page init

From: Mike Rapoport

Date: Mon May 18 2026 - 02:51:49 EST


Hi,

On Fri, May 15, 2026 at 04:20:43PM +0800, Li Zhe wrote:
> On 64-bit builds, memmap_init_zone_device() spends most of its time
> repeating the same struct page initialization for every PFN. Prepare a
> template page through the existing slow path once, then copy that
> template into each destination page and fix up the PFN-dependent state
> afterwards.
>
> Keep the optimized path disabled when the page_ref_set tracepoint is
> active, because the template-copy path bypasses set_page_count() and
> would otherwise hide the corresponding trace event.
>
> Non-64-bit builds continue to use the existing slow path.

ZONE_DEVICE depends on MEMORY_HOTPLUG and MEMORY_HOTPLUG is only supported
for 64 bits, so there can't be 32-bit builds for ZONE_DEVICE functionality.

> Tested in a VM with a 100 GB fsdax namespace device configured with
> map=dev on Intel Ice Lake server. This test exercises the nd_pmem rebind
> path (pfns_per_compound == 1).
>
> Test procedure:
> Rebind the nd_pmem driver 30 times and collect the memmap initialization
> time from the pr_debug() output of memmap_init_zone_device().
>
> Base(v7.1-rc3):
> First binding: 1486 ms
> Average of subsequent rebinds: 273.52 ms
>
> With this patch:
> First binding: 1421 ms
> Average of subsequent rebinds: 246.14 ms
>
> This reduces the average rebind time from 273.52 ms to 246.14 ms, or
> about 10%.
>
> Signed-off-by: Li Zhe <lizhe.67@xxxxxxxxxxxxx>
> ---
> mm/mm_init.c | 103 +++++++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 96 insertions(+), 7 deletions(-)
>
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 5244acb96dbb..4c475c71a9d6 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -1013,7 +1013,7 @@ static inline int zone_device_page_init_refcount(
> }
> }
>
> -static void __ref generic_init_zone_device_page(struct page *page,
> +static void __ref generic_init_zone_device_page_slow(struct page *page,
> unsigned long pfn, unsigned long zone_idx, int nid,
> struct dev_pagemap *pgmap)
> {
> @@ -1040,12 +1040,9 @@ static void __ref generic_init_zone_device_page(struct page *page,
> set_page_count(page, 0);
> }
>
> -static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
> - unsigned long zone_idx, int nid,
> - struct dev_pagemap *pgmap)
> +static void __ref zone_device_page_init_pageblock(struct page *page,
> + unsigned long pfn)

Please move splitting _pageblock helper into the first patch, so that the
first patch would contain all code movement.

> {
> - generic_init_zone_device_page(page, pfn, zone_idx, nid, pgmap);
> -
> /*
> * Mark the block movable so that blocks are reserved for
> * movable at startup. This will force kernel allocations
> @@ -1062,6 +1059,88 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
> }
> }
>
> +static inline void __init_zone_device_page(struct page *page, unsigned long pfn,
> + unsigned long zone_idx, int nid,
> + struct dev_pagemap *pgmap)
> +{
> + generic_init_zone_device_page_slow(page, pfn, zone_idx, nid, pgmap);
> + zone_device_page_init_pageblock(page, pfn);
> +}
> +
> +#if BITS_PER_LONG == 64
> +static inline bool zone_device_page_init_optimization_enabled(void)
> +{
> + /*
> + * We use template pages and assign page->_refcount via memory copy.
> + * This means the optimized path bypasses set_page_count(), so the
> + * page_ref_set tracepoint cannot observe this initialization.
> + * Skip the optimized path when the tracepoint is enabled.
> + */
> + return !page_ref_tracepoint_active(page_ref_set);
> +}
> +
> +static inline void struct_page_layout_check(void)
> +{
> + BUILD_BUG_ON(sizeof(struct page) & (sizeof(u64) - 1));

Does it have to be a BUILD_BUG()? Can't we fallback to slow path if struct
page has a weird size?
Just do the check in zone_device_page_init_optimization_enabled().

> +}
> +
> +static inline void init_template_page(struct page *template,
> + unsigned long pfn,
> + unsigned long zone_idx,
> + int nid,
> + struct dev_pagemap *pgmap)

The name should include zone_device to avoid confusion with regular pages.
> +{
> + generic_init_zone_device_page_slow(template, pfn, zone_idx, nid, pgmap);
> +}
> +
> +/*
> + * Initialize parts that differ from the template
> + */
> +static inline void generic_init_zone_device_page_finish(struct page *page,
> + unsigned long pfn)
> +{
> +#ifdef SECTION_IN_PAGE_FLAGS
> + set_page_section(page, pfn_to_section_nr(pfn));

Can we add a stub for set_page_address() for !SECTION_IN_PAGE_FLAGS case
and drop the #ifdef here and in set_page_links()?

> +#endif
> +#ifdef WANT_PAGE_VIRTUAL
> + if (!is_highmem_idx(ZONE_DEVICE))
> + set_page_address(page, __va(pfn << PAGE_SHIFT));

set_page_address() is a not when WANT_PAGE_VIRTUAL, you can drop the ifdef.

> +#endif
> +}
> +
> +static void init_zone_device_page_from_template(struct page *page,
> + unsigned long pfn, const struct page *template)

zone_device_page_init_from_template() please.

> +{
> + const u64 *src = (const u64 *)template;
> + u64 *dst = (u64 *)page;
> + unsigned int i;
> +
> + for (i = 0; i < sizeof(struct page) / sizeof(u64); i++)
> + dst[i] = src[i];
> + generic_init_zone_device_page_finish(page, pfn);
> + zone_device_page_init_pageblock(page, pfn);
> +}

--
Sincerely yours,
Mike.