Re: [PATCH v2] workqueue: Fix false positive stall reports

From: Song Liu

Date: Mon Mar 23 2026 - 15:08:51 EST

On Mon, Mar 23, 2026 at 7:09 AM Petr Mladek <pmladek@xxxxxxxx> wrote:
>
> On Sat 2026-03-21 20:30:45, Song Liu wrote:
> > On weakly ordered architectures (e.g., arm64), the lockless check in
> > wq_watchdog_timer_fn() can observe a reordering between the worklist
> > insertion and the last_progress_ts update. Specifically, the watchdog
> > can see a non-empty worklist (from a list_add) while reading a stale
> > last_progress_ts value, causing a false positive stall report.
> >
> > This was confirmed by reading pool->last_progress_ts again after holding
> > pool->lock in wq_watchdog_timer_fn():
> >
> > workqueue watchdog: pool 7 false positive detected!
> > lockless_ts=4784580465 locked_ts=4785033728
> > diff=453263ms worklist_empty=0
> >
> > To avoid slowing down the hot path (queue_work, etc.), recheck
> > last_progress_ts with pool->lock held. This will eliminate the false
> > positive with minimal overhead.
> >
> > Remove two extra empty lines in wq_watchdog_timer_fn() as we are on it.
> >
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> > @@ -7699,8 +7699,28 @@ static void wq_watchdog_timer_fn(struct timer_list *unused)
> > else
> > ts = touched;
> >
> > - /* did we stall? */
> > + /*
> > + * Did we stall?
> > + *
> > + * Do a lockless check first. On weakly ordered
> > + * architectures, the lockless check can observe a
> > + * reordering between worklist insert_work() and
> > + * last_progress_ts update from __queue_work(). Since
> > + * __queue_work() is a much hotter path than the timer
> > + * function, we handle false positive here by reading
> > + * last_progress_ts again with pool->lock held.
> > + */
> > if (time_after(now, ts + thresh)) {
> > + scoped_guard(raw_spinlock_irqsave, &pool->lock) {
> > + pool_ts = pool->last_progress_ts;
> > + if (time_after(pool_ts, touched))
> > + ts = pool_ts;
> > + else
> > + ts = touched;
> > + }
> > + if (!time_after(now, ts + thresh))
> > + continue;
>
> The new code is pretty hairy. It might make sense to take the lock
> around the original check and keep it as is.
>
> IMHO, if a contention on a pool->lock has ever been a problem than maybe
> the non-trivial workqueue API was not a good choice for the affected
> use-case. Or am I wrong?

Given the false positive is really rare (a handful of them per day among a
fleet of thousands of hosts), I think taking hundreds of pool->lock per 30
second seems not necessary.

Thanks,
Song