Re: [PATCH] sched/fair: Prefer fully-idle SMT cores in asym-capacity idle selection
From: Christian Loehle
Date: Thu Mar 19 2026 - 08:08:53 EST
On 3/18/26 17:09, Andrea Righi wrote:
> Hi Christian,
>
> On Wed, Mar 18, 2026 at 03:43:26PM +0000, Christian Loehle wrote:
>> On 3/18/26 10:31, Andrea Righi wrote:
>>> Hi Vincent,
>>>
>>> On Wed, Mar 18, 2026 at 10:41:15AM +0100, Vincent Guittot wrote:
>>>> On Wed, 18 Mar 2026 at 10:22, Andrea Righi <arighi@xxxxxxxxxx> wrote:
>>>>>
>>>>> On systems with asymmetric CPU capacity (e.g., ACPI/CPPC reporting
>>>>> different per-core frequencies), the wakeup path uses
>>>>> select_idle_capacity() and prioritizes idle CPUs with higher capacity
>>>>> for better task placement. However, when those CPUs belong to SMT cores,
>>>>
>>>> Interesting, which kind of system has both SMT and SD_ASYM_CPUCAPACITY?
>>>> I thought the two were never set simultaneously, and that SD_ASYM_PACKING
>>>> was used for systems involving SMT, like x86.
>>>
>>> It's an NVIDIA platform (not publicly available yet), where the firmware
>>> exposes different CPU capacities and has SMT enabled, so both
>>> SD_ASYM_CPUCAPACITY and SMT are present. I'm not sure whether the final
>>> firmware release will keep this exact configuration (there's a good chance
>>> it will), so I'm targeting it to be prepared.
>>
>>
>> Andrea,
>> that makes me think: I've recently played with an NVIDIA Grace available to me,
>> which sets slightly different CPPC highest_perf values (~2%). That automatically
>> sets SD_ASYM_CPUCAPACITY and runs the entire capacity-aware scheduling
>> machinery for almost negligible capacity differences, where it's
>> questionable how sensible that is.
>
> That looks like the same system that I've been working with. I agree that
> treating small CPPC differences as full asymmetry can be a bit overkill.
>
> I've been experimenting with flattening the capacities (to force the
> "regular" idle CPU selection policy), which performs better than the
> current asym-capacity CPU selection. However, adding SMT awareness to the
> asym-capacity selection seems to give a consistent +2-3% (same set of
> CPU-intensive benchmarks) compared to flattening alone, which is not bad.
>
>> I have an arm64 + CPPC implementation for asym-packing for this machine, maybe
>> we can reuse that for here too?
>
> Sure, that sounds interesting, if it's available somewhere I'd be happy to
> do some testing.
>
Hi Andrea,
I will clean up the asym-packing code a bit and share it with you for testing.
Interestingly, when we looked at DCPerf MediaWiki, we found the exact opposite.
On NVIDIA Grace, enabling CAS (capacity-aware scheduling) due to the small CPPC
highest_perf differences was actually beneficial for the workload. More
interestingly, we saw a similar uplift
on a different arm64 server without ASYM_CPUCAPACITY when we force-enabled
sched_asym_cpucap_active() even though the system was highest_perf-symmetric.
That suggests the uplift on Grace may have come from CAS-specific behavior rather
than from better selection of the highest_perf CPUs.
I'd be very curious whether the inverse is happening in your case as well,
i.e. whether flattening the capacities but still forcing
select_idle_sibling() / sched_asym_cpucap_active() despite equal capacities
shows a similar uplift. Of course, that will also depend on the workloads
(what are you testing?).
Just to illustrate, below is one example where CAS improved both score and CPU utilization:
+--------------------------+----------------------+-------------------------+-----------------------------------------+
| Platform | default (v6.8) | force all CPUs = 1024 | force sched_asym_cpucap_active() = TRUE |
+--------------------------+----------------------+-------------------------+-----------------------------------------+
| arm64 symmetric (72 CPUs)| 100% (90% CPU util) | ------------- | 104.26% (99%) |
| Grace (72 CPUs) | 100% (99%) | 99.49% (90%) | ------------- |
+--------------------------+----------------------+-------------------------+-----------------------------------------+