Re: [PATCH] dma-debug: skip cacheline overlap tracking on cache-coherent architectures
From: Mikhail Gavrilov
Date: Mon May 18 2026 - 09:32:03 EST
On Mon, May 18, 2026 at 5:53 PM Leon Romanovsky <leon@xxxxxxxxxx> wrote:
>
> Actually, later in that thread, people agreed that this debug message
> correctly pointed out the underlying issue in the code.
> https://lore.kernel.org/all/20241015075418.GA25487@xxxxxx/
The full thread is more split than that. Christoph in the message
you linked says the warnings are "perfectly valid because the I/O
patterns will create data corruption on non-coherent architectures.
For direct I/O from userspace the kernel can't prevent it".
Dan Williams (original author of the cacheline tracking) wrote
earlier in the same thread:
> I don't see an easy way out of this without instrumenting archs
> that can not support overlapping mappings to opt-in to bounce
> buffering for these cases.
>
> Archs that can support this can skip the opt-in and quiet this
> test, but some of the value is being able to catch boundary
> conditions on more widely available systems.
So Christoph scopes validity to non-coherent arches, and Dan
explicitly recognizes the "coherent arches skip the tracking" path
-- with the trade-off of losing boundary-condition catching on
widely available systems. That's the same trade-off 3d48c9fd78dd
already accepted for the sibling err_printk, which this patch
extends to (2). In both cases the production cost (spurious splats
on real workloads, e.g. NVMe block I/O) outweighs the diagnostic
value on coherent arches where bus snooping prevents the corruption
the warning is about.
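To make the overlap condition concrete, here is a minimal userspace
sketch. It is not the dma-debug code: the sketch_* names, the 64-byte
line size, and the arch_is_coherent flag are assumptions for
illustration only. It shows when two mappings touch the same
cacheline, and models a coherent arch as simply not warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cachelines; real line sizes vary by arch. */
#define SKETCH_CACHELINE_SHIFT 6

/* Hypothetical stand-in for an active DMA mapping. */
struct sketch_mapping {
	uint64_t addr;	/* start address of the mapped buffer */
	size_t len;	/* length in bytes, must be non-zero */
};

/* Index of the first cacheline the mapping touches. */
static uint64_t sketch_first_line(const struct sketch_mapping *m)
{
	return m->addr >> SKETCH_CACHELINE_SHIFT;
}

/* Index of the last cacheline the mapping touches. */
static uint64_t sketch_last_line(const struct sketch_mapping *m)
{
	assert(m->len > 0);
	return (m->addr + m->len - 1) >> SKETCH_CACHELINE_SHIFT;
}

/* True if the two mappings have at least one cacheline in common. */
static bool sketch_share_cacheline(const struct sketch_mapping *a,
				   const struct sketch_mapping *b)
{
	return sketch_first_line(a) <= sketch_last_line(b) &&
	       sketch_first_line(b) <= sketch_last_line(a);
}

/*
 * Models the trade-off discussed above: with snooping, shared lines
 * are not a corruption risk, so the sketch stays quiet.
 */
static bool sketch_should_warn(const struct sketch_mapping *a,
			       const struct sketch_mapping *b,
			       bool arch_is_coherent)
{
	if (arch_is_coherent)
		return false;
	return sketch_share_cacheline(a, b);
}
```

Two 32-byte buffers at 0x1000 and 0x1020 share one 64-byte line, so
the sketch warns only when arch_is_coherent is false; a buffer at
0x1040 sits in the next line and never triggers it.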
> How difficult is it to annotate call sites?
For some callers it's tractable -- virtio via DMA_ATTR_CPU_CACHE_CLEAN,
RDMA via DMA_ATTR_REQUIRE_COHERENT. For others, Christoph himself in
the same thread:
> For direct I/O from userspace the kernel can't prevent it, but
> for raid1 we should be able to do something better. As
> raid1_sync_request is a convoluted and undocumented mess I don't have
> a straight shot answer to what it is doing (wrong) and how to
> fix it.
Per that message, DIO from userspace is unfixable from the kernel
side, and raid1 was acknowledged as needing a fix that Christoph
didn't have. Two years on, those cases (plus dm-thin and io_uring
polled tests from Ming Lei's report) still don't have an annotation
path. This patch covers what annotation can't reach, and it doesn't
prevent future annotation work.
--
Thanks,
Mikhail