Re: [PATCH] cpuidle: Deny idle entry when CPU already has IPI interrupt pending
From: Rafael J. Wysocki
Date: Tue Mar 24 2026 - 12:16:54 EST
On Mon, Mar 23, 2026 at 1:13 PM Maulik Shah (mkshah)
<maulik.shah@xxxxxxxxxxxxxxxx> wrote:
>
>
>
> On 3/20/2026 11:59 PM, Rafael J. Wysocki wrote:
> > On Mon, Mar 16, 2026 at 8:38 AM Maulik Shah
> > <maulik.shah@xxxxxxxxxxxxxxxx> wrote:
> >>
> >> A CPU can receive an IPI from another CPU while it is executing
> >> cpuidle_select(), or just before executing it. The selection does not
> >> account for pending interrupts, so the CPU may enter the selected idle
> >> state only to exit immediately.
> >>
> >> Example trace collected when there is a cross-CPU IPI:
> >>
> >> [000] 154.892148: sched_waking: comm=sugov:4 pid=491 prio=-1 target_cpu=007
> >> [000] 154.892148: ipi_raise: target_mask=00000000,00000080 (Function call interrupts)
> >> [007] 154.892162: cpu_idle: state=2 cpu_id=7
> >> [007] 154.892208: cpu_idle: state=4294967295 cpu_id=7
> >> [007] 154.892211: irq_handler_entry: irq=2 name=IPI
> >> [007] 154.892211: ipi_entry: (Function call interrupts)
> >> [007] 154.892213: sched_wakeup: comm=sugov:4 pid=491 prio=-1 target_cpu=007
> >> [007] 154.892214: ipi_exit: (Function call interrupts)
> >>
> >> This hurts performance: the CPU pays the entry and exit cost of the idle
> >> state without actually staying idle.
> >>
> >> Commit ccde6525183c ("smp: Introduce a helper function to check for pending
> >> IPIs") already introduced a helper to check for pending IPIs; it is used in
> >> the pmdomain governor to deny a cluster-level idle state when there is a
> >> pending IPI on any of the cluster's CPUs.
> >
> > You seem to be overlooking the fact that resched wakeups need not be
> > signaled via IPIs, but they may be updates of a monitored cache line.
> >
> >> This however does not stop a CPU from entering a CPU-level idle state. Use
> >> the same helper in cpuidle to deny idle entry when an IPI is already pending.
> >>
> >> With this change, glmark2 [1] offscreen scores improve by 25% to 30% on the
> >> Qualcomm lemans-evk board, an arm64 platform with two clusters of 4 CPUs
> >> each.
> >>
> >> [1] https://github.com/glmark2/glmark2
> >>
> >> Signed-off-by: Maulik Shah <maulik.shah@xxxxxxxxxxxxxxxx>
> >> ---
> >> drivers/cpuidle/cpuidle.c | 3 +++
> >> 1 file changed, 3 insertions(+)
> >>
> >> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> >> index c7876e9e024f9076663063ad21cfc69343fdbbe7..c88c0cbf910d6c2c09697e6a3ac78c081868c2ad 100644
> >> --- a/drivers/cpuidle/cpuidle.c
> >> +++ b/drivers/cpuidle/cpuidle.c
> >> @@ -224,6 +224,9 @@ noinstr int cpuidle_enter_state(struct cpuidle_device *dev,
> >> bool broadcast = !!(target_state->flags & CPUIDLE_FLAG_TIMER_STOP);
> >> ktime_t time_start, time_end;
> >>
> >> + if (cpus_peek_for_pending_ipi(drv->cpumask))
> >> + return -EBUSY;
> >> +
> >
> > So what if the driver handles all CPUs in the system and there are
> > many of them (say ~500) and if IPIs occur rarely (because resched
> > events are not IPIs)?
>
> I missed the case of a driver handling multiple CPUs. In v2 I would fix
> this as below, checking for a pending IPI only on the CPU that is about
> to enter idle:
>
> if (cpus_peek_for_pending_ipi(cpumask_of(dev->cpu)))
And the for_each_cpu() loop in cpus_peek_for_pending_ipi() would then
become useless overhead, wouldn't it?
> I see IPIs do occur often. In the glmark2 offscreen case mentioned in
> the commit text, out of ~12.2k total IPIs across all 8 CPUs, ~9.6k are
> function call IPIs, ~2k are IRQ work IPIs, and ~560 are timer broadcast
> IPIs, while rescheduling IPIs number only 82.
So how many of those IPIs actually wake up CPUs from idle prematurely?
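For what it's worth, that number can be mined from a captured trace. A
sketch (field positions are assumed from the excerpt quoted above; the
embedded sample is that excerpt, and on a real system you would feed the
full ftrace log instead):

```shell
# Count idle exits whose next event on the same CPU is an ipi_entry,
# i.e. wakeups attributable to an IPI.
cat > trace.txt <<'EOF'
[000] 154.892148: sched_waking: comm=sugov:4 pid=491 prio=-1 target_cpu=007
[000] 154.892148: ipi_raise: target_mask=00000000,00000080 (Function call interrupts)
[007] 154.892162: cpu_idle: state=2 cpu_id=7
[007] 154.892208: cpu_idle: state=4294967295 cpu_id=7
[007] 154.892211: irq_handler_entry: irq=2 name=IPI
[007] 154.892211: ipi_entry: (Function call interrupts)
[007] 154.892213: sched_wakeup: comm=sugov:4 pid=491 prio=-1 target_cpu=007
[007] 154.892214: ipi_exit: (Function call interrupts)
EOF

awk '
    # state=4294967295 marks an idle exit: flag that CPU ($1 is "[007]")
    $3 == "cpu_idle:" && $4 == "state=4294967295" { exiting[$1] = 1; next }
    # generic IRQ entry precedes ipi_entry; skip it without clearing the flag
    $3 == "irq_handler_entry:"                    { next }
    $3 == "ipi_entry:" && exiting[$1]             { n++; delete exiting[$1]; next }
    # any other event on that CPU breaks the idle-exit/IPI pairing
                                                  { delete exiting[$1] }
    END { printf "ipi wakeups: %d\n", n + 0 }
' trace.txt
```

On the sample above this reports one IPI-caused wakeup (the cpu_idle exit
on CPU 7 immediately followed by the function call IPI).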