Re: [PATCH 1/5] sched/fair: Drop redundant RCU read lock in NOHZ kick path

From: Andrea Righi

Date: Thu May 21 2026 - 16:14:58 EST


Hi Marek,

On Thu, May 21, 2026 at 09:47:03PM +0200, Marek Szyprowski wrote:
> On 09.05.2026 20:07, Andrea Righi wrote:
> > nohz_balancer_kick() is reached from sched_balance_trigger(), which is
> > called from sched_tick(). sched_tick() runs with IRQs disabled, so the
> > additional rcu_read_lock/unlock() used around sched_domain accesses in
> > this path is redundant. Rely on the existing IRQ-disabled context (and
> > the rcu_dereference_all() checking) instead.
> >
> > The same applies to set_cpu_sd_state_idle(), called from the idle entry
> > path with IRQs disabled, and to set_cpu_sd_state_busy(), reachable via
> > nohz_balance_exit_idle() from two contexts: nohz_balancer_kick() (IRQs
> > disabled, as above) and sched_cpu_deactivate() (the CPUHP_AP_ACTIVE
> > teardown, which runs under cpus_write_lock(), so it cannot race with
> > sched-domain rebuilds). In both cases the rcu_dereference_all()
> > validation is sufficient.
> >
> > No functional change intended.
> >
> > Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> > Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> > Suggested-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
> > Reviewed-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
> > Signed-off-by: Andrea Righi <arighi@xxxxxxxxxx>
> This patch landed in today's linux-next as commit c9d93a73ce87 ("sched/fair: Drop
> redundant RCU read lock in NOHZ kick path"). In my tests I found that it introduced
> the following warning during the CPU hot-plug tests:
>
>
> root@target:~# for i in /sys/devices/system/cpu/cpu[1-9]; do echo 0 >$i/online; done
>
> =============================
> WARNING: suspicious RCU usage
> 7.1.0-rc2+ #12775 Not tainted
> -----------------------------
> kernel/sched/fair.c:12793 suspicious rcu_dereference_check() usage!
>
> other info that might help us debug this:
>
>
> rcu_scheduler_active = 2, debug_locks = 1
> 2 locks held by cpuhp/1/20:
>  #0: ffffffff81a16220 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x42/0x1ae
>  #1: ffffffff81a16270 (cpuhp_state-down){+.+.}-{0:0}, at: cpuhp_thread_fun+0x72/0x1ae
>
> stack backtrace:
> CPU: 1 UID: 0 PID: 20 Comm: cpuhp/1 Not tainted 7.1.0-rc2+ #12775 PREEMPTLAZY
> Hardware name: StarFive VisionFive 2 v1.2A (DT)
> Call Trace:
> [<ffffffff8001827c>] dump_backtrace+0x1c/0x24
> [<ffffffff800014c0>] show_stack+0x28/0x34
> [<ffffffff80010d42>] dump_stack_lvl+0x5e/0x86
> [<ffffffff80010d7e>] dump_stack+0x14/0x1c
> [<ffffffff800987ec>] lockdep_rcu_suspicious+0x14c/0x1b8
> [<ffffffff80079992>] nohz_balance_exit_idle+0xf4/0xf6
> [<ffffffff800664e6>] sched_cpu_deactivate+0x6c/0x1c8
> [<ffffffff8002a5d0>] cpuhp_invoke_callback+0xf8/0x1ce
> [<ffffffff8002a944>] cpuhp_thread_fun+0x150/0x1ae
> [<ffffffff8005dc64>] smpboot_thread_fn+0x138/0x2a4
> [<ffffffff800554ae>] kthread+0xea/0x10c
> [<ffffffff800134c4>] ret_from_fork_kernel+0x22/0x386
> [<ffffffff80c278ee>] ret_from_fork_kernel_asm+0x16/0x18
> CPU1: off
> CPU2: off
> CPU3: off
>
> This issue is observed on most of my ARM 32bit, ARM 64bit and RiscV64 based boards.
>

Ah, yes, makes sense. We missed the CPU hotplug case. When CPUs are taken
offline, set_cpu_sd_state_busy() is invoked via:

  cpuhp/N kthread
    cpuhp_thread_fun()
      cpuhp_invoke_callback()
        sched_cpu_deactivate()
          nohz_balance_exit_idle()
            set_cpu_sd_state_busy()
              rcu_dereference_all(per_cpu(sd_llc, cpu))

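The cpuhp kthread holds cpu_hotplug_lock, but it runs with preemption and IRQs
enabled, so nothing in that path counts as an RCU read-side critical section
and the rcu_dereference_all() check fires. I think we should just restore the
RCU read lock in set_cpu_sd_state_{busy,idle}() to fix this. I'll send a patch
soon.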
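
To make the difference between the two callers concrete, here is a rough
userspace model of the situation. It is not kernel code and not the actual
fix: every model_* name is made up for illustration, and it assumes the check
passes whenever the caller is inside rcu_read_lock() or has IRQs or preemption
disabled.

```c
#include <stdbool.h>

/* Toy userspace model, NOT kernel code: every name here is a stand-in. */
struct ctx {
	int  rcu_depth;    /* nesting of the modelled rcu_read_lock() */
	bool irqs_off;     /* modelled local IRQ state */
	bool preempt_off;  /* modelled preempt_disable() state */
	int  splats;       /* "suspicious RCU usage" reports so far */
};

static void model_rcu_read_lock(struct ctx *c)   { c->rcu_depth++; }
static void model_rcu_read_unlock(struct ctx *c) { c->rcu_depth--; }

/* Assumed rule: any kind of read-side section satisfies the check. */
static bool model_reader_held(const struct ctx *c)
{
	return c->rcu_depth > 0 || c->irqs_off || c->preempt_off;
}

/* Stand-in for the checked sd_llc dereference: count a splat if unprotected. */
static void model_deref_sd_llc(struct ctx *c)
{
	if (!model_reader_held(c))
		c->splats++;
}

/* Variant without the read lock: relies purely on the caller's context. */
static void model_set_busy_nolock(struct ctx *c)
{
	model_deref_sd_llc(c);
}

/* Variant with the read lock restored: fine from any caller. */
static void model_set_busy_locked(struct ctx *c)
{
	model_rcu_read_lock(c);
	model_deref_sd_llc(c);
	model_rcu_read_unlock(c);
}

/* sched_tick()-like caller: IRQs are disabled. */
static int model_tick_splats(bool restore_lock)
{
	struct ctx c = { .irqs_off = true };

	if (restore_lock)
		model_set_busy_locked(&c);
	else
		model_set_busy_nolock(&c);
	return c.splats;
}

/* cpuhp kthread-like caller: preemptible, IRQs enabled. */
static int model_hotplug_splats(bool restore_lock)
{
	struct ctx c = { 0 };

	if (restore_lock)
		model_set_busy_locked(&c);
	else
		model_set_busy_nolock(&c);
	return c.splats;
}
```

In this model only model_hotplug_splats(false) reports a splat. The tick-like
caller is fine either way, and taking the read lock inside the helper covers
the hotplug-like caller too.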

Thanks,
-Andrea