Re: [PATCH] dma-debug: skip cacheline overlap tracking on cache-coherent architectures

From: Leon Romanovsky

Date: Mon May 18 2026 - 08:55:24 EST


On Mon, May 18, 2026 at 05:23:15PM +0500, Mikhail Gavrilov wrote:
> On Mon, May 18, 2026 at 5:10 PM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> >
> > I would say this reproducer is incorrect. From what I recall, the only two
> > legitimate use cases for cacheline overlap are virtio and RDMA.
>
> The wild trace in the commit message is NVMe block I/O -- neither virtio
> nor RDMA:
>
> add_dma_entry -> debug_dma_map_phys -> dma_map_phys ->
> blk_dma_map_iter_start -> nvme_map_data
>
> The block layer submits many concurrent in-flight requests; small
> kmalloc'd buffers naturally land in the same cacheline under high IOPS,
> which is incidental rather than intentional overlap. Ming Lei's report
> linked in the commit message [1] enumerates additional non-virtio /
> non-RDMA cases hitting the same WARN: liburing iopoll tests, raid1,
> dm-thin and other storage utilities.

Actually, later in that thread, people agreed that this debug message
correctly pointed out the underlying issue in the code.
https://lore.kernel.org/all/20241015075418.GA25487@xxxxxx/

>
> > The first intentionally relies on it for small allocations, and the second
> > exports the cachelines to user space and cannot operate on non-coherent
> > architectures.
>
> The reproducer isn't claiming to be either of those. It deterministically
> reaches the same state-based gate the wild NVMe trace hits
> (!is_cache_clean && overlap > 7, with direction != DMA_TO_DEVICE, after
> the v2 coherent-arch / SWIOTLB-bounce suppressions are evaluated). Since
> that gate has no subsystem-specific term, any caller -- synthetic or real
> -- reaching it with those state values triggers the same WARN.
>
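
For illustration, the state-based gate described in the quoted paragraph
above can be modelled in user-space C roughly as follows. This is a
minimal sketch built only from the three terms quoted there:
overlap_warn_would_fire() and MAX_OVERLAP are hypothetical names, the
enum merely mirrors the kernel's dma_data_direction values, and the v2
coherent-arch / SWIOTLB-bounce suppressions are left out because the
thread does not spell out how they are evaluated. It is not the code in
kernel/dma/debug.c.

```c
#include <stdbool.h>

/* Mirrors the values of the kernel's enum dma_data_direction. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

/* Threshold taken from the "overlap > 7" term quoted in the thread. */
#define MAX_OVERLAP 7

/*
 * Hypothetical model of the gate: returns true when the overlap WARN
 * would be reached. Note that no parameter identifies the calling
 * subsystem; only the three state values matter.
 */
static bool overlap_warn_would_fire(bool is_cache_clean, int overlap,
				    enum dma_data_direction dir)
{
	return !is_cache_clean &&
	       overlap > MAX_OVERLAP &&
	       dir != DMA_TO_DEVICE;
}
```

Since the function takes no caller argument, an NVMe request and a
synthetic reproducer that arrive with the same three values get the same
answer, e.g. overlap_warn_would_fire(false, 8, DMA_FROM_DEVICE) is true
for both.
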
> If the broader concern is that the block layer should opt into your
> coherency-attribute work rather than relying on debug-side suppression,
> that's a reasonable longer-term direction. But it's additive: even with
> opt-in adoption, the WARN remains a false positive on coherent arches
> for callers that don't annotate -- which is exactly what v2 (3d48c9fd78dd)
> already established for the sibling "cacheline tracking EEXIST" err_printk.

How difficult is it to annotate call sites?

Thanks

>
> [1] https://lore.kernel.org/all/ZwxzdWmYcBK27mUs@fedora/
>
> --
> Thanks,
> Mikhail
>