Re: [PATCH sched/core] sched/rt: Fix RT_PUSH_IPI soft lockup loop
From: Steven Rostedt
Date: Thu May 14 2026 - 10:09:14 EST
On Wed, 13 May 2026 18:48:31 -1000
Tejun Heo <tj@xxxxxxxxxx> wrote:
> Hello,
>
> On Wed, May 13, 2026 at 10:01:36PM -0400, Steven Rostedt wrote:
> > I could try, but there are still some things that I don't understand.
> > One is that to send more IPIs due to the RT pull request, there needs
> > to be RT tasks constantly sleeping. Is that happening in this use case?
> > Are the softirqs waking up RT tasks that run for a short time and go
> > back to sleep, causing the pull IPI to trigger again?
>
> Ah, yes, that makes sense. That's why the repro is using FIFO threads too.
> In prod, there's mpi3mr threaded irq handlers that are FIFO. These are
> storage machines so they're also constantly active.
I was thinking about this more. Does disabling RT_PUSH_IPI cause any
problems for you?
# echo NO_RT_PUSH_IPI > /sys/kernel/debug/sched/features
The reason I ask is that I'm not sure the RT_PUSH_IPI even makes sense to
have enabled when CONFIG_IRQ_FORCED_THREADING is not enabled. The reason
the RT_PUSH_IPI was created in the first place was due to a kind of
"thundering herd" of taking the rq lock of the CPU that has an overloaded
set of RT tasks on it.
When RT_PUSH_IPI is disabled, instead of sending an IPI to the CPU to do a
push, the CPU that is scheduling a lower priority task takes the overloaded
CPU's rq lock and will try to pull tasks from it.
The issue that RT_PUSH_IPI solved was that if you had 100 CPUs all
scheduling a lower priority task at the same time, they would all try to
take the lock of the overloaded CPU. Only the first one would succeed in
pulling a task. The other 99 would each wait their turn for that lock, only
to see that there are no tasks left to pull. I found that this could cause
500us of latency or more.
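As a rough illustration of that herd, here is a toy userspace model (not
kernel code). The names herd_result and model_pull are made up for this
sketch, and it only counts lock acquisitions; it does not model timing or
the actual pull logic in kernel/sched/rt.c.

```c
#include <assert.h>

/* Toy model only. Every name in this block is hypothetical. */
struct herd_result {
	int lock_acquisitions;	/* times the overloaded CPU's rq lock was taken */
	int wasted;		/* acquisitions that found nothing left to pull */
};

/*
 * Pull model (NO_RT_PUSH_IPI): each CPU that switches to a lower priority
 * task takes the overloaded CPU's rq lock in turn and looks for a task to
 * pull. The loop stands in for the CPUs serializing on that one lock.
 */
static struct herd_result model_pull(int nr_pullers, int nr_pullable)
{
	struct herd_result res = { 0, 0 };
	int i;

	assert(nr_pullers >= 0 && nr_pullable >= 0);

	for (i = 0; i < nr_pullers; i++) {
		res.lock_acquisitions++;
		if (nr_pullable > 0)
			nr_pullable--;	/* this CPU pulled a task */
		else
			res.wasted++;	/* waited on the lock for nothing */
	}
	return res;
}
```

With model_pull(100, 1), the model reports 100 acquisitions of the one rq
lock, 99 of them wasted, which is the case described above.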
That 500us mattered a lot for PREEMPT_RT, but doesn't really matter if you
have softirqs running uninterruptible for 500us themselves. I'm thinking
that we could just have the following instead:
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 9f63b15d309d..0a4f4a212cd6 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -829,8 +829,14 @@ static inline int rt_bandwidth_enabled(void)
 	return sysctl_sched_rt_runtime >= 0;
 }
-/* RT IPI pull logic requires IRQ_WORK */
-#if defined(CONFIG_IRQ_WORK) && defined(CONFIG_SMP)
+/*
+ * RT IPI pull logic requires IRQ_WORK and doesn't make sense for uniprocessors.
+ * If CONFIG_IRQ_FORCED_THREADING isn't set, then softirqs do not run as threads
+ * and can cause latency larger than what RT_PUSH_IPI can save, killing the
+ * effect of it.
+ */
+#if defined(CONFIG_IRQ_WORK) && defined(CONFIG_SMP) && \
+ defined(CONFIG_IRQ_FORCED_THREADING)
 # define HAVE_RT_PUSH_IPI
 #endif
-- Steve