Re: [PATCH] dma-debug: skip cacheline overlap tracking on cache-coherent architectures

From: Mikhail Gavrilov

Date: Mon May 18 2026 - 08:30:30 EST


On Mon, May 18, 2026 at 5:10 PM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
>
> I would say this reproducer is incorrect. From what I recall, the only two
> legitimate use cases for cacheline overlap are virtio and RDMA.

The in-the-wild trace in the commit message is NVMe block I/O -- neither
virtio nor RDMA:

add_dma_entry -> debug_dma_map_phys -> dma_map_phys ->
blk_dma_map_iter_start -> nvme_map_data

The block layer submits many concurrent in-flight requests; small
kmalloc'd buffers naturally land in the same cacheline under high IOPS,
which is incidental rather than intentional overlap. Ming Lei's report
linked in the commit message [1] enumerates additional non-virtio /
non-RDMA cases hitting the same WARN: liburing iopoll tests, raid1,
dm-thin and other storage utilities.
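
To illustrate why this overlap is incidental, here is a minimal userspace
sketch. It is not the dma-debug code: it assumes 64-byte cachelines, and
the helper names and buffer addresses are made up for illustration. It
only shows that independently allocated small buffers can resolve to the
same cacheline index:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64-byte cachelines (shift of 6); the real value is arch-specific. */
#define SKETCH_CACHELINE_SHIFT 6

/* Hypothetical helper: cacheline index that a physical address falls into. */
static uint64_t sketch_cacheline_of(uint64_t phys)
{
	return phys >> SKETCH_CACHELINE_SHIFT;
}

/* True if the first bytes of two buffers land in the same cacheline. */
static bool sketch_same_cacheline(uint64_t a, uint64_t b)
{
	return sketch_cacheline_of(a) == sketch_cacheline_of(b);
}
```

With those assumptions, four 16-byte objects at the hypothetical addresses
0x1000, 0x1010, 0x1020 and 0x1030 all share one cacheline, while 0x1040
starts the next one. No caller has to intend the overlap for it to occur.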

> The first intentionally relies on it for small allocations, and the second exports the
> cachelines to the user space and cannot operate on non‑coherent architectures.

The reproducer isn't claiming to be either of those. It deterministically
reaches the same state-based gate the wild NVMe trace hits
(!is_cache_clean && overlap > 7, with direction != DMA_TO_DEVICE, after
the v2 coherent-arch / SWIOTLB-bounce suppressions are evaluated). Since
that gate has no subsystem-specific term, any caller -- synthetic or real
-- reaching it with those state values triggers the same WARN.
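
As a rough model of that gate, assuming only the conditions listed above:
the sketch below is standalone userspace C, not the kernel source, and
every type, field and function name in it is a hypothetical stand-in. The
point it is meant to show is that the predicate depends only on mapping
state, never on which subsystem created the mapping:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the DMA direction; not the kernel enum. */
enum sketch_dir { SKETCH_BIDIRECTIONAL, SKETCH_TO_DEVICE, SKETCH_FROM_DEVICE };

/* Threshold taken from the "overlap > 7" condition described above. */
#define SKETCH_MAX_OVERLAP 7

/* Hypothetical snapshot of the state the gate looks at. */
struct sketch_state {
	bool suppressed;      /* coherent-arch or SWIOTLB-bounce suppression applies */
	bool is_cache_clean;  /* mapping was flagged as cache clean */
	enum sketch_dir dir;  /* direction of the new mapping */
	int overlap;          /* overlap count for the shared cacheline */
};

/* Returns true when the modelled WARN would fire. */
static bool sketch_warn_fires(const struct sketch_state *s)
{
	/* Suppressions are evaluated first. */
	if (s->suppressed)
		return false;

	/* No subsystem-specific term appears anywhere in the condition. */
	return !s->is_cache_clean &&
	       s->dir != SKETCH_TO_DEVICE &&
	       s->overlap > SKETCH_MAX_OVERLAP;
}
```

In this model, a synthetic reproducer and an NVMe request that present the
same four state values are indistinguishable to the gate.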

If the broader concern is that the block layer should opt into your
coherency-attribute work rather than relying on debug-side suppression,
that's a reasonable longer-term direction. But it's additive: even with
opt-in adoption, the WARN remains a false positive on coherent arches
for callers that don't annotate -- which is exactly what v2 (3d48c9fd78dd)
already established for the sibling "cacheline tracking EEXIST" err_printk.

[1] https://lore.kernel.org/all/ZwxzdWmYcBK27mUs@fedora/

--
Thanks,
Mikhail