Re: [PATCH 4/4] sched/fair: Prefer fully-idle SMT core for NOHZ idle load balancer

From: K Prateek Nayak

Date: Fri Mar 27 2026 - 07:39:14 EST


Hello Andrea,

On 3/27/2026 3:14 PM, Andrea Righi wrote:
> Hi Vincent,
>
> On Fri, Mar 27, 2026 at 09:45:56AM +0100, Vincent Guittot wrote:
>> On Thu, 26 Mar 2026 at 16:12, Andrea Righi <arighi@xxxxxxxxxx> wrote:
>>>
>>> When choosing which idle housekeeping CPU runs the idle load balancer,
>>> prefer one on a fully idle core if SMT is active, so balance can migrate
>>> work onto a CPU that still offers full effective capacity. Fall back to
>>> any idle candidate if none qualify.
>>
>> This one isn't straightforward for me. The ilb cpu will check all
>> other idle CPUs first and finish with itself, so unless the next CPU
>> in the idle_cpus_mask is a sibling, this should not make a difference.
>>
>> Did you see any perf diff ?
>
> I actually see a benefit, in particular, with the first patch applied I see
> a ~1.76x speedup, if I add this on top I get ~1.9x speedup vs baseline,
> which seems pretty consistent across runs (definitely not in error range).
>
> The intention with this change was to minimize SMT noise by running the
> ILB code on a fully-idle core when possible, but I also didn't expect to
> see such a big difference.
>
> I'll investigate more to better understand what's happening.

Interesting! Either this "CPU-intensive workload" is sensitive to an
SMT sibling turning busy (but to an extent where performance drops
visibly?), or the ILB keeps running on an SMT sibling that is burdened
by interrupts, leading to slower balancing (or the IRQs driving the
workload are being delayed by rq_lock disabling them).

Would it be possible to share the total SCHED_SOFTIRQ time, load
balancing attempts, and utilization with and without the patch? I too
will queue up some runs to see if this makes a difference.

--
Thanks and Regards,
Prateek