Re: [PATCH] sched/numa, mm: Skip page promotion if cpu pid is valid
From: David Hildenbrand (Arm)
Date: Thu Mar 26 2026 - 06:45:04 EST
On 3/26/26 08:12, Donet Tom wrote:
> If memory tiering is disabled, cpupid of slow memory pages may
> contain a valid CPU and PID. If tiering is enabled at runtime,
> there is a chance that in should_numa_migrate_memory(), this
> valid CPU/PID is treated as a last access timestamp, leading
> to unnecessary promotion.
Is that measurable? Should we at least have a Fixes: tag?
>
> Prevent this by skipping promotion when cpupid is valid.
>
> Signed-off-by: Donet Tom <donettom@xxxxxxxxxxxxx>
> ---
> kernel/sched/fair.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 4b43809a3fb1..f5830a5a94d5 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2001,6 +2001,13 @@ bool should_numa_migrate_memory(struct task_struct *p, struct folio *folio,
> unsigned int latency, th, def_th;
> long nr = folio_nr_pages(folio);
>
/*
* When ...
> + /* When tiering is enabled at runtime, last_cpupid may
> + * hold a valid cpupid instead of an access timestamp.
> + * If so, skip page promotion.
> + */
> + if (cpupid_valid(folio_last_cpupid(folio)))
> + return false;
> +
IIUC, as timestamp we use jiffies_to_msecs(). So, soon after bootup,
we would no longer get false positives for cpupid_valid().
I suppose overflows are not a problem, correct?
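To make the aliasing concern concrete, here is a minimal userspace sketch of one field holding either a CPU/PID pair or a truncated millisecond timestamp. Everything in it is an assumption for illustration: the 8-bit PID / 4-bit CPU layout, TOY_NR_CPUS, and toy_cpupid_valid() as a stand-in for cpupid_valid() are hypothetical and not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy layout, NOT the kernel's: 8 PID bits below 4 CPU bits. */
#define TOY_PID_BITS   8
#define TOY_CPU_BITS   4
#define TOY_PID_MASK   ((1u << TOY_PID_BITS) - 1)
#define TOY_CPU_MASK   ((1u << TOY_CPU_BITS) - 1)
#define TOY_FIELD_MASK ((1u << (TOY_PID_BITS + TOY_CPU_BITS)) - 1)
#define TOY_NR_CPUS    4u	/* hypothetical number of CPUs */

/* Pack a CPU/PID pair into the toy field. */
static uint32_t toy_make_cpupid(uint32_t cpu, uint32_t pid)
{
	assert(cpu <= TOY_CPU_MASK);
	return (cpu << TOY_PID_BITS) | (pid & TOY_PID_MASK);
}

/* Store a millisecond timestamp in the same field, truncated to its width. */
static uint32_t toy_store_time(uint32_t msecs)
{
	return msecs & TOY_FIELD_MASK;
}

/* Assumed stand-in for cpupid_valid(): the CPU field names an existing CPU. */
static bool toy_cpupid_valid(uint32_t field)
{
	uint32_t cpu = (field >> TOY_PID_BITS) & TOY_CPU_MASK;

	return cpu < TOY_NR_CPUS;
}
```

In this toy model a small timestamp decodes as a "valid" pair, a larger one does not, and a value that wraps past the field width can decode as valid again. Whether any of that carries over to the kernel depends on the real field width and on how cpupid_valid() is actually defined, which this sketch does not reproduce.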
So what we're saying is that folio_use_access_time()==true does not
imply that there is actually a valid time in there.
In numa_migrate_check() we could still use the valid cpupid, I guess, and
make that code a bit clearer?
diff --git a/mm/memory.c b/mm/memory.c
index 631205a384e1..ba68933a9e4a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6119,10 +6119,9 @@ int numa_migrate_check(struct folio *folio, struct vm_fault *vmf,
* For memory tiering mode, cpupid of slow memory page is used
* to record page access time. So use default value.
*/
- if (folio_use_access_time(folio))
+ *last_cpupid = folio_last_cpupid(folio);
+ if (!cpupid_valid(*last_cpupid))
*last_cpupid = (-1 & LAST_CPUPID_MASK);
- else
- *last_cpupid = folio_last_cpupid(folio);
/* Record the current PID accessing VMA */
vma_set_access_pid_bit(vma);
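
As a standalone illustration of the fallback in the hunk above, the following userspace sketch keeps the stored value when it decodes as valid and otherwise substitutes the "unset" default. The field width, SK_NR_CPUS, and sk_cpupid_valid() are hypothetical stand-ins; only the read-then-check-then-default shape is taken from the hunk.

```c
#include <stdbool.h>

/* Hypothetical widths; the kernel derives LAST_CPUPID_MASK from its config. */
#define SK_PID_BITS     8
#define SK_CPUPID_BITS  12
#define SK_CPUPID_MASK  ((1 << SK_CPUPID_BITS) - 1)
#define SK_CPUPID_UNSET (-1 & SK_CPUPID_MASK)
#define SK_NR_CPUS      4	/* hypothetical number of CPUs */

/* Assumed stand-in for cpupid_valid(): the CPU field names an existing CPU. */
static bool sk_cpupid_valid(int cpupid)
{
	int cpu = (cpupid >> SK_PID_BITS) & (SK_CPUPID_MASK >> SK_PID_BITS);

	return cpu < SK_NR_CPUS;
}

/* Trust the stored value only if it decodes as valid, else use the default. */
static int sk_pick_last_cpupid(int stored)
{
	int last_cpupid = stored;

	if (!sk_cpupid_valid(last_cpupid))
		last_cpupid = SK_CPUPID_UNSET;
	return last_cpupid;
}
```

With this shape the caller no longer needs to know whether the folio is in access-time mode; it only needs the validity check to reject whatever else may be stored in the field.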
The change itself here looks reasonable to me.
Acked-by: David Hildenbrand (Arm) <david@xxxxxxxxxx>
--
Cheers,
David