Re: [Patch v4 02/22] sched/cache: Limit the scan number of CPUs when calculating task occupancy
From: Peter Zijlstra
Date: Thu Apr 09 2026 - 09:48:50 EST
On Thu, Apr 09, 2026 at 09:17:10PM +0800, Luo Gengkun wrote:
>
>
> On 2026/4/2 5:52, Tim Chen wrote:
> > From: Chen Yu <yu.c.chen@xxxxxxxxx>
> >
> > When NUMA balancing is enabled, the kernel currently iterates over all
> > online CPUs to aggregate process-wide occupancy data. On large systems,
> > this global scan introduces significant overhead.
> >
> > To reduce scan latency, limit the search to a subset of relevant CPUs:
> > 1. The task's preferred NUMA node.
> > 2. The node where the task is currently running.
> > 3. The node that contains the task's current preferred LLC.
> >
> > While focusing solely on the preferred NUMA node is ideal, a
> > process-wide scan must remain flexible because the "preferred node"
> > is a per-task attribute. Different threads within the same process may
> > have different preferred nodes, causing the process-wide preference to
> > migrate. Maintaining a mask that covers both the preferred and active
> > running nodes ensures accuracy while significantly reducing the number of
> > CPUs inspected.
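
For illustration only, the restricted scan described in the changelog above
could be sketched in userspace C roughly as follows. This is not the code
from the patch: struct task_hint, node_bit(), scan_node_mask(),
scan_occupancy(), and the flat cpu_to_node/cpu_occ arrays are all
hypothetical names, and a 64-bit word stands in for a real nodemask.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 64
#define NO_NODE   (-1)

/* Hypothetical stand-in for per-task state; not the kernel's task_struct. */
struct task_hint {
    int preferred_node;  /* NUMA-balancing preferred node, or NO_NODE */
    int running_node;    /* node of the CPU the task is running on */
    int llc_node;        /* node containing the preferred LLC, or NO_NODE */
};

/* Bit for one node; an unset node contributes nothing to the mask. */
static uint64_t node_bit(int node)
{
    if (node == NO_NODE)
        return 0;
    assert(node >= 0 && node < MAX_NODES);
    return UINT64_C(1) << node;
}

/* Union of the (up to) three candidate nodes listed in the changelog. */
uint64_t scan_node_mask(const struct task_hint *t)
{
    return node_bit(t->preferred_node) |
           node_bit(t->running_node) |
           node_bit(t->llc_node);
}

/* Sum occupancy only over CPUs whose node is in the mask. */
uint64_t scan_occupancy(const struct task_hint *t, const int *cpu_to_node,
                        const uint64_t *cpu_occ, int nr_cpus)
{
    uint64_t mask = scan_node_mask(t);
    uint64_t total = 0;

    for (int cpu = 0; cpu < nr_cpus; cpu++) {
        if (mask & node_bit(cpu_to_node[cpu]))
            total += cpu_occ[cpu];
    }
    return total;
}
```

With CPUs mapped to nodes {0, 0, 1, 2}, a task that prefers node 1 and runs
on node 0 would aggregate the first three CPUs and skip the fourth.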
>
> To address the issue of scanning overhead, there is a more targeted
> approach: only scanning the CPUs actually accessed by the process, and
> evicting these CPUs when they remain unaccessed for a specific period of
> time.
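
The attached patch is not reproduced here, so the following is only a guess
at the general shape of such a scheme: a per-process set of visited CPUs
with a last-access timestamp, aged out after a threshold. All names
(struct visited_cpus, visited_mark(), visited_evict(), max_age) are
hypothetical, and a 64-bit word again stands in for a cpumask.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 64

/* Hypothetical per-process tracking state. */
struct visited_cpus {
    uint64_t mask;                /* bit set = CPU is included in the scan */
    uint64_t last_seen[NR_CPUS];  /* tick of the most recent access */
};

/* Record that the process ran on (or touched) this CPU at time 'now'. */
void visited_mark(struct visited_cpus *v, int cpu, uint64_t now)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    v->mask |= UINT64_C(1) << cpu;
    v->last_seen[cpu] = now;
}

/* Drop CPUs not accessed for more than max_age ticks; return what is left. */
uint64_t visited_evict(struct visited_cpus *v, uint64_t now, uint64_t max_age)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        uint64_t bit = UINT64_C(1) << cpu;

        if ((v->mask & bit) && now - v->last_seen[cpu] > max_age)
            v->mask &= ~bit;
    }
    return v->mask;
}
```

The scan would then walk only the bits left in the mask, so its cost tracks
the number of CPUs the process recently used rather than the machine size.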
>
> This significantly reduces unnecessary scanning in most scenarios. I have
> attached the patch below for review. Please feel free to integrate or modify
> these changes.
>
> Thanks!
> Luo Gengkun
Please fix your MUA; whatever you tried to send is horribly
whitespace-mangled.