Re: [PATCH v2] blk-mq: add tracepoint block_rq_tag_wait

In reply to: Aaron Tomlin: "[PATCH v2] blk-mq: add tracepoint block_rq_tag_wait"

From: Chaitanya Kulkarni

Date: Wed Mar 18 2026 - 23:18:37 EST

On 3/18/26 18:53, Aaron Tomlin wrote:
> In high-performance storage environments, particularly when utilising
> RAID controllers with shared tag sets (BLK_MQ_F_TAG_HCTX_SHARED), severe
> latency spikes can occur when fast devices (SSDs) are starved of hardware
> tags when sharing the same blk_mq_tag_set.
>
> Currently, diagnosing this specific hardware queue contention is
> difficult. When a CPU thread exhausts the tag pool, blk_mq_get_tag()
> forces the current thread to block uninterruptibly via io_schedule().
> While this can be inferred via sched:sched_switch or dynamically
> traced by attaching a kprobe to blk_mq_mark_tag_wait(), there is no
> dedicated, out-of-the-box observability for this event.
>
> This patch introduces the block_rq_tag_wait static trace point in the
> tag allocation slow-path. It triggers immediately before the thread
> yields the CPU, exposing the exact hardware context (hctx) that is
> starved, the specific pool experiencing starvation (hardware or software
> scheduler), and the total pool depth.
>
> This provides storage engineers and performance monitoring agents
> with a zero-configuration, low-overhead mechanism to definitively
> identify shared-tag bottlenecks and tune I/O schedulers or cgroup
> throttling accordingly.
>
> Signed-off-by: Aaron Tomlin <atomlin@xxxxxxxxxxx>
> ---
> Changes since v1 [1]:
> - Improved the description of the trace point (Damien Le Moal)
> - Removed the redundant "active requests" (Laurence Oberman)
> - Introduced pool-specific starvation tracking
>
> [1]: https://lore.kernel.org/lkml/20260317182835.258183-1-atomlin@xxxxxxxxxxx/
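
For anyone following along who has not hit this failure mode, the
starvation described in the commit message can be sketched with a toy
user-space model. This is only an illustration and is not taken from the
patch: the class SharedTagPool, its methods, and the (hctx, depth) record
are hypothetical names, and the FIFO hand-off is a simplification of what
the kernel's sbitmap wait queues actually do.

```python
from collections import Counter, deque


class SharedTagPool:
    """Toy model of one tag pool shared by several hardware contexts."""

    def __init__(self, depth):
        self.depth = depth          # total number of tags in the pool
        self.free = depth           # tags currently available
        self.waiters = deque()      # hctx ids blocked waiting for a tag
        self.wait_events = []       # what a tag-wait tracepoint might record

    def get_tag(self, hctx):
        """Return True if a tag was granted, False if the caller must wait."""
        if self.free > 0:
            self.free -= 1
            return True
        # Slow path: the pool is exhausted. Record which hctx is starved
        # and the pool depth, then queue the caller as a waiter.
        self.wait_events.append((hctx, self.depth))
        self.waiters.append(hctx)
        return False

    def put_tag(self):
        """Release one tag, handing it to the oldest waiter if there is one."""
        if self.waiters:
            self.waiters.popleft()
        else:
            self.free += 1


pool = SharedTagPool(depth=4)

# A slow device behind hctx 0 takes every tag in the shared pool.
for _ in range(4):
    pool.get_tag(0)

# A fast device behind hctx 1 now has to wait on each submission.
for _ in range(3):
    pool.get_tag(1)

waits = Counter(hctx for hctx, _depth in pool.wait_events)
print(dict(waits))
```

In this model every wait is charged to hctx 1 even though hctx 0 is the
one holding the tags, which is why a per-event record of the starved
hctx and the pool depth is useful for diagnosis.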

LGTM.

Reviewed-by: Chaitanya Kulkarni <kch@xxxxxxxxxx>

-ck