[PATCH] dma-debug: skip cacheline overlap tracking on cache-coherent architectures

From: Mikhail Gavrilov

Date: Mon May 18 2026 - 07:37:53 EST


The dma-debug cacheline overlap tracking emits two distinct warnings
when multiple DMA mappings share a cacheline:

1. add_dma_entry() calls err_printk("cacheline tracking EEXIST,
overlapping mappings aren't supported\n") on every -EEXIST from
active_cacheline_insert().

2. active_cacheline_inc_overlap() calls WARN_ONCE("exceeded %d
overlapping mappings of cacheline %pa\n", ...) when the 3-bit
per-cacheline overlap counter in the dma_active_cacheline radix
tree would exceed ACTIVE_CACHELINE_MAX_OVERLAP (= 7).
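
As a rough illustration of why the ninth mapping of a cacheline is the
one that trips warning (2), the following userspace C sketch models the
overlap counter of a single cacheline. It is not kernel code:
struct model_cacheline, model_map() and model_first_warning() are
hypothetical names, and the only detail taken from the text above is
the 3-bit limit of 7.

```c
#include <assert.h>
#include <stdbool.h>

/* Model only: mirrors the limit named in the text (3 tag bits -> max 7). */
#define MODEL_TAG_BITS		3
#define MODEL_MAX_OVERLAP	((1 << MODEL_TAG_BITS) - 1)

struct model_cacheline {
	bool present;	/* the first mapping has been inserted */
	int overlap;	/* extra mappings beyond the first */
	bool warned;	/* WARN_ONCE-style latch */
};

/*
 * Map the same cacheline once more. Returns true only for the call that
 * would emit the "exceeded %d overlapping mappings" warning.
 */
static bool model_map(struct model_cacheline *c)
{
	if (!c->present) {
		c->present = true;	/* plain insert, no overlap yet */
		return false;
	}
	if (c->overlap + 1 > MODEL_MAX_OVERLAP) {
		bool first = !c->warned;

		c->warned = true;	/* counter cannot grow; warn once */
		return first;
	}
	c->overlap++;
	return false;
}

/* Returns the 1-based index of the mapping that trips the warning, or 0. */
static int model_first_warning(int n_maps)
{
	struct model_cacheline c = { 0 };

	assert(n_maps >= 0);
	for (int i = 1; i <= n_maps; i++)
		if (model_map(&c))
			return i;
	return 0;
}
```

Under this model the first mapping inserts the entry, mappings two
through eight raise the counter to 7, and the ninth is the first one to
exceed it.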

Commit 3d48c9fd78dd ("dma-debug: suppress cacheline overlap warning
when arch has no DMA alignment requirement") suppressed (1) on
architectures where hardware bus snooping makes cacheline-overlapping
DMA mappings safe. The same reasoning applies to (2): the tracking is
pure overhead on those architectures, and (2) still fires under real
workloads, e.g. heavy NVMe block I/O on x86_64:

DMA-API: exceeded 7 overlapping mappings of cacheline 0x...
WARNING: kernel/dma/debug.c:465 at add_dma_entry+0x394/0x410
Call Trace:
add_dma_entry+...
debug_dma_map_phys+...
dma_map_phys+...
blk_dma_map_iter_start+...
nvme_map_data+...

The block layer routinely produces nine or more concurrent in-flight
mappings whose buffers share a single cacheline. On hardware-coherent
systems this is harmless, but it saturates the tag-based overlap
counter and produces a splat indistinguishable from a real driver bug.

Extend that suppression to skip the cacheline overlap tracking entirely
on cache-coherent architectures, mirroring the DMA_TO_DEVICE early
return that already exists because tracking is unnecessary in that case
too. A new helper, dma_debug_cacheline_tracking_needed(), captures the
condition and matches the existing check in add_dma_entry().

The DMA_BOUNCE_UNALIGNED_KMALLOC + SWIOTLB suppression that commit
03521c892bb8 ("dma-debug: don't report false positives with
DMA_BOUNCE_UNALIGNED_KMALLOC") added to (1) applies here as well:
unaligned kmalloc buffers are bounced through aligned swiotlb buffers,
so the original cacheline overlap never reaches DMA. The helper
preserves both suppression conditions.
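
The two suppression conditions can be read as a small truth table. The
sketch below restates them in userspace C with the kernel queries
replaced by plain fields. struct model_env and model_tracking_needed()
are hypothetical stand-ins, and any values a caller passes in are
illustrative rather than measured on real hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs standing in for the kernel-side queries. */
struct model_env {
	unsigned int dma_cache_alignment; /* for dma_get_cache_alignment() */
	unsigned int l1_cache_bytes;      /* for L1_CACHE_BYTES */
	bool bounce_unaligned_kmalloc;    /* CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC */
	bool swiotlb_active;              /* for is_swiotlb_active(dev) */
};

/* Same decision order as the helper added by this patch. */
static bool model_tracking_needed(const struct model_env *env)
{
	assert(env->l1_cache_bytes > 0);

	/* No DMA alignment requirement: overlapping mappings are safe. */
	if (env->dma_cache_alignment < env->l1_cache_bytes)
		return false;
	/* Unaligned buffers are bounced, so the overlap never reaches DMA. */
	if (env->bounce_unaligned_kmalloc && env->swiotlb_active)
		return false;
	return true;
}
```

Tracking stays enabled only when the architecture has a DMA alignment
requirement and bouncing is not in effect, including the case where the
config option is set but no swiotlb is active for the device.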

Reproducer (out-of-tree module): map a single 8-byte buffer with
dma_map_single(..., DMA_BIDIRECTIONAL) nine times in a row, keeping
every mapping active. The 9th call deterministically fires the
WARN_ONCE on an unfixed kernel; with this patch applied no warning is
emitted regardless of the number of overlapping mappings.

Without this patch (n_maps=9):
DMA-API: exceeded 7 overlapping mappings of cacheline 0x00000000071d7dbe
WARNING: kernel/dma/debug.c:465 at add_dma_entry+0x39e/0x410
[...]

With this patch (n_maps=1000):
dma_debug_overlap_repro: 1000/1000 mappings active
[no warning]

Link: https://lore.kernel.org/all/ZwxzdWmYcBK27mUs@fedora/
Fixes: 3b7a6418c749 ("dma debug: account for cachelines and read-only mappings in overlap tracking")
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx>
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@xxxxxxxxx>
---
kernel/dma/debug.c | 35 +++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)

diff --git a/kernel/dma/debug.c b/kernel/dma/debug.c
index 1a725edbbbf6..2d1609b9d362 100644
--- a/kernel/dma/debug.c
+++ b/kernel/dma/debug.c
@@ -474,6 +474,35 @@ static int active_cacheline_dec_overlap(phys_addr_t cln)
 	return active_cacheline_set_overlap(cln, --overlap);
 }

+/*
+ * Whether cacheline-overlap tracking is meaningful for @dev.
+ *
+ * Mirrors the suppression conditions add_dma_entry() already applies to
+ * the sibling "cacheline tracking EEXIST" err_printk:
+ *
+ *  - On architectures with hardware DMA cache coherence
+ *    (dma_get_cache_alignment() < L1_CACHE_BYTES, e.g. x86_64) bus
+ *    snooping makes overlapping cacheline mappings safe.
+ *
+ *  - With CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC and an active SWIOTLB,
+ *    unaligned kmalloc buffers are bounced through aligned swiotlb
+ *    buffers, so the original cacheline overlap never reaches DMA.
+ *    See commit 03521c892bb8 ("dma-debug: don't report false positives
+ *    with DMA_BOUNCE_UNALIGNED_KMALLOC").
+ *
+ * In both cases tracking is pure overhead and produces false-positive
+ * WARN_ONCEs.
+ */
+static bool dma_debug_cacheline_tracking_needed(struct device *dev)
+{
+	if (dma_get_cache_alignment() < L1_CACHE_BYTES)
+		return false;
+	if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
+	    is_swiotlb_active(dev))
+		return false;
+	return true;
+}
+
 static int active_cacheline_insert(struct dma_debug_entry *entry,
 				   bool *overlap_cache_clean)
 {
@@ -490,6 +519,9 @@ static int active_cacheline_insert(struct dma_debug_entry *entry,
 	if (entry->direction == DMA_TO_DEVICE)
 		return 0;

+	if (!dma_debug_cacheline_tracking_needed(entry->dev))
+		return 0;
+
 	spin_lock_irqsave(&radix_lock, flags);
 	rc = radix_tree_insert(&dma_active_cacheline, cln, entry);
 	if (rc == -EEXIST) {
@@ -516,6 +548,9 @@ static void active_cacheline_remove(struct dma_debug_entry *entry)
 	if (entry->direction == DMA_TO_DEVICE)
 		return;

+	if (!dma_debug_cacheline_tracking_needed(entry->dev))
+		return;
+
 	spin_lock_irqsave(&radix_lock, flags);
 	/* since we are counting overlaps the final put of the
 	 * cacheline will occur when the overlap count is 0.
--
2.54.0