Re: [PATCH 2/3] powerpc/powernv: fix preempt count leak in pnv_kexec_wait_secondaries_down
From: Aboorva Devarajan
Date: Wed Jun 03 2026 - 02:12:25 EST
On Mon, 2026-05-18 at 13:26 +0530, Shrikanth Hegde wrote:
>
> Hi Aboorva.
>
> On 5/18/26 10:38 AM, Aboorva Devarajan wrote:
> > pnv_kexec_wait_secondaries_down() calls get_cpu() to obtain the current
> > CPU id but never calls the matching put_cpu(), leaking one
> > preempt_disable() nesting level on every invocation.
> >
> > In practice the imbalance does not trigger a visible splat because the
> > kexec teardown path is a one-way trip: IRQs are already disabled, no
> > schedule() occurs after the leak, and default_machine_kexec() overwrites
> > preempt_count with HARDIRQ_OFFSET before jumping into kexec_sequence()
> > which never returns. However the bookkeeping is still wrong.
> >
> > In the kexec teardown path IRQs are already disabled and the CPU is
> > pinned, so get_cpu()'s preempt_disable() side-effect is unnecessary.
> > Replace get_cpu() with raw_smp_processor_id() which returns the CPU id
> > without touching preempt_count.
> >
> > Fixes: 298b34d7d578 ("powerpc/powernv: Fix kexec races going back to OPAL")
> > Signed-off-by: Aboorva Devarajan <aboorvad@xxxxxxxxxxxxx>
> > ---
> > arch/powerpc/platforms/powernv/setup.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
> > index 4dbb47ddbdcc4..177da0defcb36 100644
> > --- a/arch/powerpc/platforms/powernv/setup.c
> > +++ b/arch/powerpc/platforms/powernv/setup.c
> > @@ -396,7 +396,7 @@ static void pnv_kexec_wait_secondaries_down(void)
> > {
> > int my_cpu, i, notified = -1;
> >
> > - my_cpu = get_cpu();
> > + my_cpu = raw_smp_processor_id();
> >
>
> Is it always with irq-disabled?
> How about !CONFIG_SMP and in kexec_prepare_cpus. I see it disables interrupt later.
> (though it is a less common config)
>
IIUC, PPC_POWERNV does 'select FORCE_SMP' (which in turn selects SMP), so
there is no !CONFIG_SMP powernv build. The !SMP kexec_prepare_cpus() variant
in arch/powerpc/kexec/core_64.c (the one you spotted, which calls
ppc_md.kexec_cpu_down() before local_irq_disable()) is therefore never
compiled for powernv, so pnv_kexec_cpu_down() ->
pnv_kexec_wait_secondaries_down() can't be reached through it.
So IRQs are disabled in every case that reaches this function.
> So use smp_processor_id()?? One could compile with CONFIG_DEBUG_PREEMPT=y and
> see any reports.
>
> > for_each_online_cpu(i) {
> > uint8_t status;
Sure, I'll switch to smp_processor_id() in v2 rather than raw_smp_processor_id().
It returns the CPU id without touching preempt_count (so the leak is
gone), and unlike the raw variant it keeps the CONFIG_DEBUG_PREEMPT
check. That check passes silently here since IRQs are off, but it will
flag any future caller that reaches this path while preemptible instead
of silently hiding it.
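To spell out the accounting, here is a toy userspace model of the three
variants. It is not kernel code: every model_* name is a hypothetical
stand-in I made up for illustration, and the debug check is heavily
simplified compared to what CONFIG_DEBUG_PREEMPT really does (it only
looks at the preempt count and the IRQ state).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy userspace model, NOT kernel code. Every model_* name is a
 * hypothetical stand-in for the kernel helper it is named after.
 */
static int  model_preempt_count;   /* stands in for preempt_count() */
static bool model_irqs_off;        /* stands in for irqs_disabled() */
static int  model_debug_reports;   /* simulated DEBUG_PREEMPT reports */

#define MODEL_THIS_CPU 0

/* get_cpu(): bumps the preempt count, then returns the CPU id. */
static int model_get_cpu(void)
{
	model_preempt_count++;
	return MODEL_THIS_CPU;
}

/* raw_smp_processor_id(): returns the CPU id with no checks at all. */
static int model_raw_smp_processor_id(void)
{
	return MODEL_THIS_CPU;
}

/*
 * smp_processor_id() with the debug check enabled, simplified:
 * report when the caller is preemptible (count 0 and IRQs enabled).
 */
static int model_smp_processor_id(void)
{
	if (model_preempt_count == 0 && !model_irqs_off)
		model_debug_reports++;
	return MODEL_THIS_CPU;
}

/* Pre-patch shape: get_cpu() with no matching put_cpu(). */
static int model_wait_old(void)
{
	return model_get_cpu();
}

/* v1 shape: raw variant, no count change and no check. */
static int model_wait_v1(void)
{
	return model_raw_smp_processor_id();
}

/* v2 shape: checked variant, no count change. */
static int model_wait_v2(void)
{
	return model_smp_processor_id();
}
```

In this model, model_wait_old() leaves model_preempt_count one higher on
every call, while the v1 and v2 shapes leave it untouched. With
model_irqs_off set, neither v1 nor v2 reports anything; with it cleared
and the count at zero, only the v2 shape bumps model_debug_reports.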
Thanks,
Aboorva