Re: [PATCH] sched/topology: Avoid spurious asymmetry from CPU capacity noise
From: Dietmar Eggemann
Date: Wed Mar 25 2026 - 11:46:27 EST
On 25.03.26 13:25, Andrea Righi wrote:
> On Wed, Mar 25, 2026 at 12:16:59PM +0100, Dietmar Eggemann wrote:
>> On 25.03.26 10:32, Andrea Righi wrote:
>>> On Wed, Mar 25, 2026 at 10:23:09AM +0100, Dietmar Eggemann wrote:
>>>> On 24.03.26 12:01, Andrea Righi wrote:
>>>>> Hi Dietmar,
>>>>>
>>>>> On Tue, Mar 24, 2026 at 11:29:24AM +0100, Dietmar Eggemann wrote:
>>>>>> On 24.03.26 10:46, Andrea Righi wrote:
>>>>>>> Hi Christian,
>>>>>>>
>>>>>>> On Tue, Mar 24, 2026 at 08:08:22AM +0000, Christian Loehle wrote:
>>>>>>>> On 3/24/26 07:55, Christian Loehle wrote:
>>>>>>>>> On 3/24/26 07:39, Vincent Guittot wrote:
>>>>>>>>>> On Tue, 24 Mar 2026 at 01:55, Andrea Righi <arighi@xxxxxxxxxx> wrote:
[...]
> Exactly, we already prefer fully-idle cores over partially-idle cores with
> asym-capacity disabled, but in that case the idle selection logic stays in
> a world of idle bits, without cap/util math, so it's a bit easier. And it's
> probably fine also when we have both asym-capacity + SMT (at least it seems
> better than what we have now, ignoring the SMT part).
>
> Essentially having something like the following (which already gives better
> performance on Vera):
>
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d57c02e82f3a1..534634f813fca 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8086,7 +8086,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> * For asymmetric CPU capacity systems, our domain of interest is
> * sd_asym_cpucapacity rather than sd_llc.
> */
> - if (sched_asym_cpucap_active()) {
> + if (sched_asym_cpucap_active() && !sched_smt_active()) {
> sd = rcu_dereference_all(per_cpu(sd_asym_cpucapacity, target));
> /*
> * On an asymmetric CPU capacity system where an exclusive
Ah, I thought we were talking about the !sched_asym_cpucap_active() case,
either by letting CPPC return the same value for all CPUs or by introducing
this 20%/5% threshold into asym_cpu_capacity_scan().
ASYM_CPUCAP + SHARE_CPUCAP vs plain SHARE_CPUCAP would still behave
slightly differently because of the asym_fits_cpu() checks in all those
early bailout conditions in sis(), e.g. (1):
select_idle_sibling()

    if (choose_idle_cpu(target, p) &&
        asym_fits_cpu(task_util, util_min, util_max, target))  <-- (1)
            return target;
    ...
And you would still have misfit_task load balance enabled.
Those subtle differences may influence behavior compared to a simpler
homogeneous CPU capacity model, but it’s unclear whether they justify
introducing yet another variant alongside the existing homogeneous and
fully heterogeneous (non-SMT) approaches.
IMHO, we should only consider allowing this if there is clear evidence
of significant benefits across a representative range of benchmarks and
workloads.
[...]