Re: [PATCH] sched/fair: Prefer fully-idle SMT cores in asym-capacity idle selection

From: Vincent Guittot

Date: Wed Mar 18 2026 - 05:46:06 EST


On Wed, 18 Mar 2026 at 10:22, Andrea Righi <arighi@xxxxxxxxxx> wrote:
>
> On systems with asymmetric CPU capacity (e.g., ACPI/CPPC reporting
> different per-core frequencies), the wakeup path uses
> select_idle_capacity() and prioritizes idle CPUs with higher capacity
> for better task placement. However, when those CPUs belong to SMT cores,

Interesting, which kind of system has both SMT and SD_ASYM_CPUCAPACITY?
I thought the two were never set simultaneously, and that SD_ASYM_PACKING
was used for systems involving SMT, like x86.

> their effective capacity can be much lower than the nominal capacity
> when the sibling thread is busy: SMT siblings compete for shared
> resources, so a "high capacity" CPU that is idle but whose sibling is
> busy does not deliver its full capacity. This effective capacity
> reduction cannot be modeled by the static capacity value alone.
>
> Introduce SMT awareness in the asym-capacity idle selection policy: when
> SMT is active, prefer fully-idle SMT cores over partially-idle ones. A
> two-phase selection first tries only CPUs on fully idle cores, then
> falls back to any idle CPU if none fit.
>
> Prioritizing fully-idle SMT cores yields better task placement: the
> effective capacity of a partially-idle SMT core is reduced, so always
> preferring fully-idle cores when available leads to more accurate
> capacity usage on task wakeup.
>
> On an SMT system with asymmetric CPU capacities, SMT-aware idle
> selection has been shown to improve throughput by around 15-18% for
> CPU-bound workloads running a number of tasks equal to the number of
> SMT cores.
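
The two-phase policy described in the changelog can be modeled as a small
userspace C sketch; the struct and helper names below are illustrative, not
the kernel's data structures:

```c
#include <stdbool.h>

/*
 * Userspace model of the two-phase selection. One entry per hardware
 * thread; "core_idle" stands in for is_core_idle() in the real patch.
 */
struct cpu_model {
	bool idle;	/* this hardware thread is idle */
	bool core_idle;	/* all SMT siblings of this CPU are idle */
	bool fits;	/* the waking task fits this CPU's capacity */
};

/* One scan: first idle CPU that fits, optionally only on fully idle cores. */
static int scan_idle(const struct cpu_model *cpus, int nr, bool smt_idle_only)
{
	for (int cpu = 0; cpu < nr; cpu++) {
		if (!cpus[cpu].idle)
			continue;
		if (smt_idle_only && !cpus[cpu].core_idle)
			continue;
		if (cpus[cpu].fits)
			return cpu;
	}
	return -1;
}

/* Phase 1: only fully idle cores; phase 2: fall back to any idle CPU. */
int pick_idle_cpu(const struct cpu_model *cpus, int nr)
{
	int cpu = scan_idle(cpus, nr, true);

	if (cpu < 0)
		cpu = scan_idle(cpus, nr, false);
	return cpu;
}
```

The sketch keeps only the idle/fits filtering to show the two-phase ordering;
the real select_idle_capacity() additionally tracks a best-capacity fallback
when no CPU is big enough for the task.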
>
> Signed-off-by: Andrea Righi <arighi@xxxxxxxxxx>
> ---
> kernel/sched/fair.c | 24 +++++++++++++++++++++---
> 1 file changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0a35a82e47920..0f97c44d4606b 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7945,9 +7945,13 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
> * Scan the asym_capacity domain for idle CPUs; pick the first idle one on which
> * the task fits. If no CPU is big enough, but there are idle ones, try to
> * maximize capacity.
> + *
> + * When @smt_idle_only is true (asym + SMT), only consider CPUs on cores whose
> + * SMT siblings are all idle, to avoid stacking and sharing SMT resources.
> */
> static int
> -select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
> +select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target,
> + bool smt_idle_only)
> {
> unsigned long task_util, util_min, util_max, best_cap = 0;
> int fits, best_fits = 0;
> @@ -7967,6 +7971,9 @@ select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int target)
> if (!choose_idle_cpu(cpu, p))
> continue;
>
> + if (smt_idle_only && !is_core_idle(cpu))
> + continue;
> +
> fits = util_fits_cpu(task_util, util_min, util_max, cpu);
>
> /* This CPU fits with all requirements */
> @@ -8102,8 +8109,19 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> * capacity path.
> */
> if (sd) {
> - i = select_idle_capacity(p, sd, target);
> - return ((unsigned)i < nr_cpumask_bits) ? i : target;
> + /*
> + * When asym + SMT and the hint says idle cores exist,
> + * try idle cores first to avoid stacking on SMT; else
> + * scan all idle CPUs.
> + */
> + if (sched_smt_active() && test_idle_cores(target)) {
> + i = select_idle_capacity(p, sd, target, true);
> + if ((unsigned int)i >= nr_cpumask_bits)
> + i = select_idle_capacity(p, sd, target, false);

Can't you make it one pass in select_idle_capacity()?
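
For instance, a single pass could return a fitting CPU on a fully idle core
as soon as one is found, while remembering the first fitting CPU on a
partially idle core as a fallback. A userspace sketch of that idea (names
are illustrative, not the kernel's):

```c
#include <stdbool.h>

/* Illustrative model of a hardware thread; not the kernel's structures. */
struct thread_model {
	bool idle;	/* this hardware thread is idle */
	bool core_idle;	/* all SMT siblings are idle too */
	bool fits;	/* the waking task fits this CPU's capacity */
};

/*
 * One pass: take the first fitting CPU on a fully idle core immediately;
 * otherwise fall back to the first fitting CPU on a partially idle core.
 */
int pick_idle_cpu_single_pass(const struct thread_model *cpus, int nr)
{
	int fallback = -1;

	for (int cpu = 0; cpu < nr; cpu++) {
		if (!cpus[cpu].idle || !cpus[cpu].fits)
			continue;
		if (cpus[cpu].core_idle)
			return cpu;	/* fully idle core: done */
		if (fallback < 0)
			fallback = cpu;	/* remember a partially idle core */
	}
	return fallback;
}
```

In this simplified model the result matches the two-phase version while
walking the CPU mask only once; folding the real best_cap/best_fits
bookkeeping into the same loop is the harder part the question points at.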

> + } else {
> + i = select_idle_capacity(p, sd, target, false);
> + }
> + return ((unsigned int)i < nr_cpumask_bits) ? i : target;
> }
> }
>
> --
> 2.53.0
>