Re: BUG: workqueue lockup - SRCU schedules work on not-online CPUs during size transition
From: Paul E. McKenney
Date: Fri Apr 10 2026 - 00:04:00 EST
On Thu, Apr 09, 2026 at 01:10:14PM -0700, Paul E. McKenney wrote:
> On Thu, Apr 09, 2026 at 09:15:50PM +0200, Vasily Gorbik wrote:
[ . . . ]
> > Yes, tested on s390 LPAR (76 online, 400 possible) as well as
> > on x86 KVM with --smp 16,maxcpus=255 and CONFIG_NR_CPUS=256
> > no more workqueue lockup in both cases.
> >
> > Thank you!
> >
> > Tested-by: Vasily Gorbik <gor@xxxxxxxxxxxxx>
>
> Thank you for testing this!
>
> Please see below for an updated patch. Tejun's patch might obsolete
> this one, but just in case he balks at SRCU queueing handlers for CPUs
> that are not even in the cpu_possible_mask. ;-)

And because we don't invoke SRCU callbacks on CPUs that are not yet fully
online, code running on such a CPU had better not invoke call_srcu(),
synchronize_srcu(), or synchronize_srcu_expedited().
I am therefore adding the warning shown below.
Better paranoid late than paranoid not at all. ;-)
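
For anyone who wants to see the shape of the check outside the kernel, here
is a rough userspace sketch. It is not kernel code: every sketch_* name, the
cpu_been_fully_online[] array, and NR_CPUS_SKETCH are hypothetical stand-ins,
and the single static flag only approximates WARN_ON_ONCE(), which in the
kernel is once per call site. Like the patch, it warns and then carries on
rather than refusing the request.

```c
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS_SKETCH 4	/* Hypothetical CPU count, small for illustration. */

/* Hypothetical stand-in for the kernel's "CPU has been fully online" state. */
static bool cpu_been_fully_online[NR_CPUS_SKETCH];

/* Count of warnings actually printed, to make the once-only behavior visible. */
static int warnings_emitted;

/* Hypothetical stand-in for rcu_cpu_beenfullyonline(). */
static bool sketch_cpu_beenfullyonline(int cpu)
{
	return cpu >= 0 && cpu < NR_CPUS_SKETCH && cpu_been_fully_online[cpu];
}

/*
 * Approximation of WARN_ON_ONCE(): print at most one warning, but always
 * return the condition so the caller can still test it.
 */
static bool sketch_warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warnings_emitted++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Stand-in for the entry of __call_srcu(): complain if the calling CPU has
 * not yet been fully online, then proceed as usual.  Returns true if the
 * caller was on a not-yet-fully-online CPU.
 */
static bool sketch_call_srcu(int this_cpu)
{
	return sketch_warn_on_once(!sketch_cpu_beenfullyonline(this_cpu),
				   "call_srcu() from a not-yet-fully-online CPU");
}
```

With only CPU 0 marked fully online, sketch_call_srcu(0) is silent, the first
call from CPU 1 prints the warning, and later offending calls are still
detected but print nothing further.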

Thanx, Paul

------------------------------------------------------------------------
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index a67af44fc0745..d62509efb52f5 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1431,6 +1431,7 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 static void __call_srcu(struct srcu_struct *ssp, struct rcu_head *rhp,
 			rcu_callback_t func, bool do_norm)
 {
+	WARN_ON_ONCE(!rcu_cpu_beenfullyonline(raw_smp_processor_id()));
 	if (debug_rcu_head_queue(rhp)) {
 		/* Probable double call_srcu(), so leak the callback. */
 		WRITE_ONCE(rhp->func, srcu_leak_callback);