Re: [PATCH v2 4/4] sched/rt: Split cpupri_vec->cpumask to per NUMA node to reduce contention

In reply to: Tim Chen: "Re: [PATCH v2 4/4] sched/rt: Split cpupri_vec->cpumask to per NUMA node to reduce contention"
Next in thread: K Prateek Nayak: "Re: [PATCH v2 4/4] sched/rt: Split cpupri_vec->cpumask to per NUMA node to reduce contention"

From: Chen, Yu C

Date: Wed Apr 08 2026 - 07:37:03 EST

Hello Prateek,

On 4/8/2026 11:06 AM, K Prateek Nayak wrote:

Hello Tim,

On 4/8/2026 2:05 AM, Tim Chen wrote:

Regarding your other question about the calculation of arch_sbm_shift,
I'm trying to understand why there is a subtraction of 1. Should it be:
- arch_sbm_shift = x86_topo_system.dom_shifts[TOPO_DIE_DOMAIN] - 1;
+ arch_sbm_shift = x86_topo_system.dom_shifts[TOPO_DIE_DOMAIN - 1];
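
To make the difference between the two forms concrete, here is a small sketch. The enum, the table, and every value in it are made up for illustration and do not describe any real machine; only the indexing pattern mirrors x86_topo_system.dom_shifts[].

```c
/* Hypothetical stand-in for the x86 topology domain enum; order is illustrative. */
enum topo_domain {
	DOM_SMT,
	DOM_CORE,
	DOM_MODULE,
	DOM_TILE,
	DOM_DIE,
	DOM_MAX,
};

/* Made-up cumulative ID shifts, one entry per domain. */
static const unsigned int dom_shifts[DOM_MAX] = { 1, 5, 5, 5, 8 };

/* "Shift of the DIE domain, minus one": half the span of a die. */
static unsigned int shift_minus_one(void)
{
	return dom_shifts[DOM_DIE] - 1;
}

/* "Shift of the domain just below DIE": the span of one tile. */
static unsigned int shift_of_lower_domain(void)
{
	return dom_shifts[DOM_DIE - 1];
}
```

With these made-up values the first form yields 7 and the second yields 5, so the two expressions agree only when adjacent domains happen to differ by exactly one bit.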

Perhaps something like

	arch_sbm_shift = min(sizeof(unsigned long),
			     topology_get_domain_shift(TOPO_TILE_DOMAIN));

to take care of both AMD system and the 64 bit leaf bitmask limit?
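
A note on units for the clamp: sizeof(unsigned long) is a byte count (8 on 64-bit), whereas a 64-bit leaf corresponds to a shift of 6. Assuming the intent is to cap each leaf at one unsigned long worth of CPUs, the bound would be log2 of the bit width. Below is a userspace sketch of that reading; the helper names are hypothetical, and in-kernel code would presumably use ilog2(BITS_PER_LONG) instead of the loop.

```c
#include <limits.h>

/* Bits in one unsigned long: 64 on LP64 targets. */
#define LEAF_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Integer log2 of a power of two; userspace stand-in for the kernel's ilog2(). */
static unsigned int log2_pow2(unsigned long v)
{
	unsigned int shift = 0;

	while (v > 1) {
		v >>= 1;
		shift++;
	}
	return shift;
}

/* Clamp a topology shift so that (1 << shift) never exceeds LEAF_BITS. */
static unsigned int clamp_leaf_shift(unsigned int domain_shift)
{
	unsigned int max_shift = log2_pow2(LEAF_BITS);

	return domain_shift < max_shift ? domain_shift : max_shift;
}
```

A domain shift below the limit passes through unchanged, while a larger one is capped so that a leaf never spans more CPUs than one unsigned long has bits.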

Ack! But do we want to separate CPUs on the same LLC domain across
different cachelines in 64-CPU chunks, or should we use the rest
of the padding to represent them?
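
The two layouts in question can be compared with a small sketch. The struct names, the 64-byte cacheline, and the 256-CPU LLC are assumptions for illustration and are not taken from the patch.

```c
#include <stddef.h>

#define CACHELINE_BYTES	64	/* assumed cacheline size */
#define LLC_CPUS	256	/* assumed number of CPUs sharing one LLC */
#define BITS_PER_WORD	64	/* one 64-bit leaf word */
#define LLC_WORDS	(LLC_CPUS / BITS_PER_WORD)

/* Layout A: every 64-CPU chunk gets a cacheline of its own. */
struct leaf_padded {
	_Alignas(CACHELINE_BYTES) unsigned long long bits;
};

struct llc_mask_split {
	struct leaf_padded leaf[LLC_WORDS];
};

/* Layout B: all chunks of the LLC share one cacheline, using up the padding. */
struct llc_mask_packed {
	_Alignas(CACHELINE_BYTES) unsigned long long bits[LLC_WORDS];
};

/* Cacheline index holding the bit for an LLC-local CPU number, layout A. */
static size_t split_line_of(unsigned int cpu)
{
	return (cpu / BITS_PER_WORD) * sizeof(struct leaf_padded) / CACHELINE_BYTES;
}

/* Cacheline index holding the bit for an LLC-local CPU number, layout B. */
static size_t packed_line_of(unsigned int cpu)
{
	return (cpu / BITS_PER_WORD) * sizeof(unsigned long long) / CACHELINE_BYTES;
}
```

Under these assumptions layout A spends four cachelines per LLC and CPUs in different 64-CPU chunks never write the same line, while layout B fits in one line that all 256 CPUs update. Whether the extra lines pay off under contention is what the measurements would have to show.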

I just saw your email and I had the same question.

I'm collecting some performance numbers to see if it makes any
difference under high contention, but have you seen benefits of
sharding the mask further when there are hundreds of CPUs on the
same LLC?

We haven't tried breaking it down further. One possible approach
is to partition it at L2 scope, the benefit of which may depend on
the workload.

thanks,
Chenyu