Re: [PATCH] arm64: suspend: Remove forcing error from suspend finisher

From: Will Deacon

Date: Thu Mar 19 2026 - 10:11:52 EST


[+Mark, Lorenzo and Sudeep]

On Mon, Mar 16, 2026 at 02:18:18PM +0530, Maulik Shah wrote:
> Successful cpu_suspend() may not always want to return to cpu_resume() to
> save the work and latency involved.
>
> consider a scenario,
>
> when single physical CPU (pCPU) is used on different virtual machines (VMs)
> as virtual CPUs (vCPUs). VM-x's vCPU can request a powerdown state after
> saving the context by invoking __cpu_suspend_enter() whereas VM-y's vCPU is
> requesting a shallower than powerdown state. The hypervisor aggregates to a
> non powerdown state for pCPU. A wakeup event for VM-x's vCPU may want to
> resume the execution at the same place instead of jumping to cpu_resume()
> as the HW never reached till powerdown state which would have lost the
> context.
>
> While the vCPU of VM-x had latency impact of saving the context in suspend
> entry path but having the return to same place saves the latency to restore
> the context in resume path.
>
> consider another scenario,
>
> Newer CPUs include a feature called “powerdown abandon”. The feature is
> based on the observation that events like GIC wakeups have a high
> likelihood of happening while the CPU is in the middle of its powerdown
> sequence (at wfi). Older CPUs will powerdown and immediately power back
> up when this happens. The newer CPUs will “give up” mid way through if
> no context has been lost yet. This is possible as the powerdown operation
> is lengthy and a large part of it does not lose context [1].
>
> As the wakeup arrived after SW powerdown is done but before HW is fully
> powered down. From SW view this is still a successful entry to suspend
> and since the HW did not loose the context there is no reason to return at
> entry address cpu_resume() to restore the context.
>
> Remove forcing the failure at kernel if the execution does not resume at
> cpu_resume() as kernel has no reason to treat such returns as failures
> when the firmware has already filled in return as success.
>
> [1] https://trustedfirmware-a.readthedocs.io/en/v2.14.0/design/firmware-design.html#cpu-specific-operations-framework
>
> Signed-off-by: Maulik Shah <maulik.shah@xxxxxxxxxxxxxxxx>
> ---
> arch/arm64/kernel/suspend.c | 15 +++++++--------
> 1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
> index eaaff94329cddb8d1fb8d1523395453f3501c9a5..b54e578f0f8b03c1dba38157c6012bb064adaa12 100644
> --- a/arch/arm64/kernel/suspend.c
> +++ b/arch/arm64/kernel/suspend.c
> @@ -144,15 +144,14 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
> ret = fn(arg);
>
> /*
> - * Never gets here, unless the suspend finisher fails.
> - * Successful cpu_suspend() should return from cpu_resume(),
> - * returning through this code path is considered an error
> - * If the return value is set to 0 force ret = -EOPNOTSUPP
> - * to make sure a proper error condition is propagated
> + * Successful HW power down should return at cpu_resume()
> + * however successful SW power down may still want to
> + * return here to save the work and latency involved in
> + * restoring the context when the HW never lost it.
> + *
> + * If the return value is set to 0 do not force failure
> + * from here.
> */
> - if (!ret)
> - ret = -EOPNOTSUPP;
> -

This doesn't look right to me.

afaict, the only suspend finisher we get here on arm64 is for PSCI. The
PSCI spec returns SUCCESS if a shallower state is entered than the one
requested, in which case we should return an error back to cpuidle rather
than pretend to have entered a deeper state than we actually did.
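
To make that concrete, here is a minimal user-space sketch of the check the
patch removes. It is not the kernel code: model_cpu_suspend() and the two
finisher_*() functions are hypothetical stand-ins, and the sketch assumes a
finisher that returns 0 when firmware reported SUCCESS but execution came
back to the caller instead of resuming at cpu_resume().

```c
#include <errno.h>

/*
 * Hypothetical finisher: firmware reported SUCCESS, but execution
 * returned here, so the requested powerdown state was not entered.
 */
static int finisher_success_without_powerdown(unsigned long arg)
{
	(void)arg;
	return 0;
}

/* Hypothetical finisher: firmware rejected the request outright. */
static int finisher_rejected(unsigned long arg)
{
	(void)arg;
	return -EINVAL;
}

/*
 * Models only the tail of cpu_suspend() as it stands before the patch.
 * A real powerdown would resume via cpu_resume() and never reach the
 * code after fn(), so getting here with ret == 0 is turned into an error.
 */
static int model_cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
{
	int ret = fn(arg);

	if (!ret)
		ret = -EOPNOTSUPP;

	return ret;
}
```

With the forced error in place, a caller such as cpuidle sees a non-zero
return for the "SUCCESS but shallower" case and can avoid accounting it as a
deep-state entry; an error the finisher already reported is passed through
unchanged.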

I wonder if we could remove the 'fn' parameter from cpu_suspend()
altogether for arm64 and hardwire PSCI directly, given that it's the
only one we seem to support?
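
As a rough illustration of that idea only, the shape could look something
like the sketch below. All names here are hypothetical and chosen for a
self-contained user-space model; model_psci_cpu_suspend() stands in for the
PSCI call and simply returns an injected value, and nothing here reflects
how the real context save/restore would be wired up.

```c
#include <errno.h>

/* Injected result for the modelled PSCI call (demo only). */
static int model_psci_result;

/* Hypothetical stand-in for the PSCI CPU_SUSPEND invocation. */
static int model_psci_cpu_suspend(unsigned long state)
{
	(void)state;
	return model_psci_result;
}

/*
 * Modelled cpu_suspend() without the 'fn' callback: the PSCI call is
 * made directly. Returning here with 0 still means the requested
 * powerdown did not happen, so it is reported as an error.
 */
static int model_cpu_suspend_hardwired(unsigned long state)
{
	int ret = model_psci_cpu_suspend(state);

	if (!ret)
		ret = -EOPNOTSUPP;

	return ret;
}
```

The caller-visible behaviour is the same as with the callback; the only
difference in this model is that the indirection through 'fn' is gone.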

Will