Re: [PATCH] kernel/smp: Improve smp_call_function_single() CSD-lock diagnostics

From: Paul E. McKenney

Date: Fri Mar 27 2026 - 10:21:58 EST


On Wed, Mar 25, 2026 at 01:59:50PM -0700, Paul E. McKenney wrote:
> On Wed, Mar 25, 2026 at 12:32:16PM +0100, Ulf Hansson wrote:
> > On Fri, 20 Mar 2026 at 11:45, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > >
> > > Both smp_call_function() and smp_call_function_single() use per-CPU
> > > call_single_data_t variables to hold the infamous CSD lock. However,
> > > while smp_call_function() acquires the destination CPU's CSD lock,
> > > smp_call_function_single() instead uses the source CPU's CSD lock.
> > > (These are two separate sets of CSD locks, cfd_data and csd_data,
> > > respectively.)
> > >
> > > This otherwise inexplicable pair of choices is explained by their
> > > respective queueing properties. If smp_call_function() were to
> > > use the sending CPU's CSD lock, that would serialize the destination
> > > CPUs' IPI handlers and result in long smp_call_function() latencies,
> > > especially on systems with large numbers of CPUs. For its part, if
> > > smp_call_function_single() were to use the (single) destination CPU's
> > > CSD lock, this would similarly serialize in the case where many CPUs
> > > are sending IPIs to a single "victim" CPU. Plus it would result in
> > > higher levels of memory contention.
> > >
> > > Except that if you don't have NMI-based stack tracing and you are working
> > > with a weakly ordered system where remote unsynchronized stack traces are
> > > especially unreliable, the improved debugging beats the improved queueing.
> > > Keep in mind that this improved queueing only matters if a bunch of
> > > CPUs are calling smp_call_function_single() concurrently for a single
> > > "victim" CPU, which is not the common case.
> > >
> > > Therefore, make smp_call_function_single() use the destination CPU's
> > > csd_data instance in kernels built with CONFIG_CSD_LOCK_WAIT_DEBUG=y
> > > where csdlock_debug_enabled is also set. Otherwise, continue to use
> > > the source CPU's csd_data.
> > >
> > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> >
> > FWIW, feel free to add:
> >
> > Reviewed-by: Ulf Hansson <ulf.hansson@xxxxxxxxxx>
>
> Thank you! I will apply this on my next rebase.

Except that -tip beat me to it, which is even better! ;-)

https://lore.kernel.org/all/177452999316.1647592.118093660207781017.tip-bot2@tip-bot2/

Thanx, Paul
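
For readers following the thread, the selection rule described in the quoted
commit message can be sketched as a small userspace model. This is only an
illustration under stated assumptions, not the kernel code: struct csd_model,
NR_CPUS, csd_debug_active, csd_owner_cpu() and pick_csd() are hypothetical
names invented here, and csd_debug_active merely stands in for "built with
CONFIG_CSD_LOCK_WAIT_DEBUG=y and csdlock_debug_enabled set".

```c
#include <stdbool.h>

#define NR_CPUS 4	/* hypothetical CPU count for this model */

/* Stand-in for call_single_data_t; only a lock flag is modeled. */
struct csd_model {
	bool locked;
};

/* Models the per-CPU csd_data instances named in the commit message. */
static struct csd_model csd_data[NR_CPUS];

/* Assumption: true models CONFIG_CSD_LOCK_WAIT_DEBUG=y plus csdlock_debug_enabled. */
static bool csd_debug_active;

/*
 * Return the CPU whose csd_data instance a single-target call would use:
 * the destination CPU when debugging is active, otherwise the source CPU.
 */
static int csd_owner_cpu(int src_cpu, int dst_cpu)
{
	return csd_debug_active ? dst_cpu : src_cpu;
}

/* Return a pointer to the chosen per-CPU instance. */
static struct csd_model *pick_csd(int src_cpu, int dst_cpu)
{
	return &csd_data[csd_owner_cpu(src_cpu, dst_cpu)];
}
```

With csd_debug_active false, pick_csd(1, 3) returns CPU 1's instance, so many
senders targeting one CPU each hold their own lock and do not serialize. With
it true, the same call returns CPU 3's instance, which trades that queueing
benefit for diagnostics tied to the destination CPU.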