Re: [PATCH] io_uring/io-wq: re-check IO_WQ_BIT_EXIT for each linked work item

From: Jens Axboe

Date: Wed May 27 2026 - 12:11:06 EST

On 5/27/26 8:37 AM, Runyu Xiao wrote:
> Commit bdf0bf73006e ("io_uring/io-wq: check IO_WQ_BIT_EXIT inside work
> run loop") fixed the obvious case where io_worker_handle_work() took one
> exit-bit snapshot before draining pending work, but the fix stops one
> level too early.
>
> io_worker_handle_work() now re-checks IO_WQ_BIT_EXIT in its outer work
> run loop, yet it still snapshots that bit once before processing a
> whole dependent linked-work chain. If io_wq_exit_start() sets
> IO_WQ_BIT_EXIT after the first linked item has started, the remaining
> linked items can still reuse stale do_kill = false, skip
> IO_WQ_WORK_CANCEL, and continue running after exit has begun.
>
> That means the previous fix did not fully eliminate the exit-latency
> problem; it only narrowed it to linked chains. A long or slow linked
> chain can still keep io-wq exit waiting for work that should already
> have been canceled.
>
> The issue was found on Linux v6.18.21 by our static-analysis tool,
> which flagged linked-work loops that snapshot shared exit state
> outside per-item cancel decisions, and was then confirmed by manual
> auditing of io_worker_handle_work(). It was later reproduced with a
> QEMU no-device validation selftest that preserved the same contract:
> a three-node unbound linked chain, an exit actor setting
> IO_WQ_BIT_EXIT after work1, and slow post-exit linked work. With a
> 3000 ms delay injected into each post-exit item, the buggy path
> spends about 6066 ms after exit running work2/work3, while the fixed
> path cancels both and finishes in about 2 ms.
>
> Re-check test_bit(IO_WQ_BIT_EXIT, &wq->state) for each iteration of the
> dependent-link loop, right before deciding whether to cancel the
> current work item. That closes the remaining stale-snapshot window and
> prevents linked post-exit work from stretching shutdown latency.
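For illustration, the stale-snapshot window described in the quoted text can be modelled in user space roughly as follows. This is a minimal sketch, not the io-wq code: struct model_wq, struct model_work, run_item(), and both run_chain_*() helpers are hypothetical stand-ins, the atomic flag stands in for IO_WQ_BIT_EXIT, and the "canceled" field stands in for a work item being marked IO_WQ_WORK_CANCEL.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the wq state; "exiting" models IO_WQ_BIT_EXIT. */
struct model_wq {
	atomic_bool exiting;
};

/* Hypothetical stand-in for one item in a dependent linked-work chain. */
struct model_work {
	struct model_work *next;	/* next linked item, or NULL */
	bool sets_exit;			/* exit begins while this item runs */
	bool ran;			/* item was executed */
	bool canceled;			/* item was canceled instead of run */
};

/* Either cancel the item or run it, depending on do_kill. */
static void run_item(struct model_wq *wq, struct model_work *w, bool do_kill)
{
	assert(w != NULL);
	if (do_kill) {
		w->canceled = true;
		return;
	}
	w->ran = true;
	if (w->sets_exit)
		atomic_store(&wq->exiting, true);
}

/* Stale variant: the exit flag is sampled once for the whole chain. */
static void run_chain_snapshot(struct model_wq *wq, struct model_work *w)
{
	bool do_kill = atomic_load(&wq->exiting);

	for (; w; w = w->next)
		run_item(wq, w, do_kill);
}

/* Re-check variant: the exit flag is sampled again for every linked item. */
static void run_chain_recheck(struct model_wq *wq, struct model_work *w)
{
	for (; w; w = w->next) {
		bool do_kill = atomic_load(&wq->exiting);

		run_item(wq, w, do_kill);
	}
}

/*
 * Build a three-node chain in which exit begins during the first item,
 * then return how many of the two remaining items still ran.
 */
static int items_run_after_exit(bool recheck)
{
	struct model_wq wq;
	struct model_work w3 = { .next = NULL };
	struct model_work w2 = { .next = &w3 };
	struct model_work w1 = { .next = &w2, .sets_exit = true };

	atomic_init(&wq.exiting, false);
	if (recheck)
		run_chain_recheck(&wq, &w1);
	else
		run_chain_snapshot(&wq, &w1);
	return (int)w2.ran + (int)w3.ran;
}
```

In this model, the snapshot variant still runs both remaining items after exit has begun, while the re-check variant cancels them. The only difference is where the flag is read relative to the loop, which is the point of the proposed change.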

I think this change makes sense to further cut down on the exit time, but
you need to send it against the _upstream_ kernel; stable only takes
backports of upstream commits. E.g. if you send this one against current
-git, mark it as fixing the correct upstream commit (not the stable one),
and add CC stable, then it'll wind up in stable as well.

--
Jens Axboe