Re: [PATCH] workqueue: Fix false positive stall reports
From: Breno Leitao
Date: Mon Mar 23 2026 - 05:41:29 EST
On Fri, Mar 20, 2026 at 12:23:32PM -0700, Song Liu wrote:
> On weakly ordered architectures (e.g., arm64), the lockless check in
> wq_watchdog_timer_fn() can observe a reordering between the worklist
> insertion and the last_progress_ts update. Specifically, the watchdog
> can see a non-empty worklist (from a list_add) while reading a stale
> last_progress_ts value, causing a false positive stall report.
>
> This was confirmed by re-reading pool->last_progress_ts with pool->lock
> held in wq_watchdog_timer_fn():
>
> workqueue watchdog: pool 7 false positive detected!
> lockless_ts=4784580465 locked_ts=4785033728
> diff=453263ms worklist_empty=0
>
> To avoid slowing down the hot path (queue_work, etc.), recheck
> last_progress_ts with pool->lock held. This will eliminate the false
> positive with minimal overhead.
>
> Remove two extra empty lines in wq_watchdog_timer_fn() while we are at it.
>
> Assisted-by: claude-code:claude-opus-4-6
> Signed-off-by: Song Liu <song@xxxxxxxxxx>
Acked-by: Breno Leitao <leitao@xxxxxxxxxx>
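
For anyone following along, the recheck described in the commit message
can be sketched in userspace roughly as below. This is only an
illustration, not the patch itself: struct demo_pool, demo_lock(),
ts_after() and the other helper names are hypothetical, a C11
atomic_flag spinlock stands in for pool->lock, and the sketch assumes
the writer updates last_progress_ts with that lock held.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's worker pool; not the real struct. */
struct demo_pool {
	atomic_flag lock;			/* stand-in for pool->lock */
	_Atomic uint64_t last_progress_ts;	/* assumed written with lock held */
};

static void demo_pool_init(struct demo_pool *pool, uint64_t ts)
{
	atomic_flag_clear(&pool->lock);
	atomic_init(&pool->last_progress_ts, ts);
}

static void demo_lock(struct demo_pool *pool)
{
	/* Spin until the flag was previously clear; acquire orders later reads. */
	while (atomic_flag_test_and_set_explicit(&pool->lock, memory_order_acquire))
		;
}

static void demo_unlock(struct demo_pool *pool)
{
	atomic_flag_clear_explicit(&pool->lock, memory_order_release);
}

/* Wrap-safe "a is later than b", in the spirit of the kernel's time_after(). */
static bool ts_after(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/* Fast path: judge a timestamp that was read without the lock (may be stale). */
static bool stall_suspected(uint64_t lockless_ts, uint64_t now, uint64_t thresh)
{
	return ts_after(now, lockless_ts + thresh);
}

/* Slow path: re-read the timestamp under the lock and judge again. */
static bool stall_confirmed(struct demo_pool *pool, uint64_t now, uint64_t thresh)
{
	uint64_t ts;

	demo_lock(pool);
	ts = atomic_load_explicit(&pool->last_progress_ts, memory_order_relaxed);
	demo_unlock(pool);

	return ts_after(now, ts + thresh);
}

/* Report only when both checks agree; the lock is taken only on suspicion. */
static bool should_report_stall(struct demo_pool *pool, uint64_t lockless_ts,
				uint64_t now, uint64_t thresh)
{
	if (!stall_suspected(lockless_ts, now, thresh))
		return false;
	return stall_confirmed(pool, now, thresh);
}
```

Feeding in the two timestamps from the log above, the stale lockless
value trips the fast check while the locked re-read does not, so nothing
would be reported. The common no-stall case never takes the lock, which
is why the queueing hot path is left alone.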