Re: [PATCH] cpuidle: Deny idle entry when CPU already has an IPI pending
From: Maulik Shah (mkshah)
Date: Mon Mar 23 2026 - 08:13:31 EST
On 3/20/2026 11:59 PM, Rafael J. Wysocki wrote:
> On Mon, Mar 16, 2026 at 8:38 AM Maulik Shah
> <maulik.shah@xxxxxxxxxxxxxxxx> wrote:
>>
>> A CPU can receive an IPI from another CPU while it is executing
>> cpuidle_select(), or just before it does so. The selection does not
>> account for pending interrupts, so the CPU may enter the selected idle
>> state only to exit immediately.
>>
>> Example trace collected when there is a cross-CPU IPI:
>>
>> [000] 154.892148: sched_waking: comm=sugov:4 pid=491 prio=-1 target_cpu=007
>> [000] 154.892148: ipi_raise: target_mask=00000000,00000080 (Function call interrupts)
>> [007] 154.892162: cpu_idle: state=2 cpu_id=7
>> [007] 154.892208: cpu_idle: state=4294967295 cpu_id=7
>> [007] 154.892211: irq_handler_entry: irq=2 name=IPI
>> [007] 154.892211: ipi_entry: (Function call interrupts)
>> [007] 154.892213: sched_wakeup: comm=sugov:4 pid=491 prio=-1 target_cpu=007
>> [007] 154.892214: ipi_exit: (Function call interrupts)
>>
>> This impacts performance, and such aborted idle entries keep
>> incrementing the count shown above.
>>
>> commit ccde6525183c ("smp: Introduce a helper function to check for pending
>> IPIs") already introduced a helper to check for pending IPIs; it is used
>> in the pmdomain governor to deny a cluster-level idle state when there is
>> a pending IPI on any of the cluster's CPUs.
>
> You seem to be overlooking the fact that resched wakeups need not be
> signaled via IPIs, but they may be updates of a monitored cache line.
>
>> This, however, does not stop a CPU from entering a CPU-level idle state.
>> Use the same helper in cpuidle to deny idle entry when an IPI is already
>> pending.
>>
>> With this change, glmark2 [1] offscreen scores improve by 25% to 30% on
>> the Qualcomm lemans-evk board, an arm64 platform with two clusters of
>> 4 CPUs each.
>>
>> [1] https://github.com/glmark2/glmark2
>>
>> Signed-off-by: Maulik Shah <maulik.shah@xxxxxxxxxxxxxxxx>
>> ---
>> drivers/cpuidle/cpuidle.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
>> index c7876e9e024f9076663063ad21cfc69343fdbbe7..c88c0cbf910d6c2c09697e6a3ac78c081868c2ad 100644
>> --- a/drivers/cpuidle/cpuidle.c
>> +++ b/drivers/cpuidle/cpuidle.c
>> @@ -224,6 +224,9 @@ noinstr int cpuidle_enter_state(struct cpuidle_device *dev,
>> bool broadcast = !!(target_state->flags & CPUIDLE_FLAG_TIMER_STOP);
>> ktime_t time_start, time_end;
>>
>> + if (cpus_peek_for_pending_ipi(drv->cpumask))
>> + return -EBUSY;
>> +
>
> So what if the driver handles all CPUs in the system and there are
> many of them (say ~500) and if IPIs occur rarely (because resched
> events are not IPIs)?
I missed the case of a driver handling multiple CPUs. In v2 I would fix
this as below, checking for a pending IPI only on the single CPU that is
trying to enter idle:
if (cpus_peek_for_pending_ipi(cpumask_of(dev->cpu)))
I see that IPIs do occur often: in the glmark2 offscreen case mentioned
in the commit text, out of ~12.2k total IPIs across all 8 CPUs, ~9.6k are
function call IPIs, ~2k are IRQ work IPIs, and ~560 are timer broadcast
IPIs, while rescheduling IPIs number only 82.
Thanks,
Maulik
>
>> instrumentation_begin();
>>
>> /*
>>
>> ---