Re: [REGRESSION][BISECTED] Performance Regression in IOMMU/VT-d Since Kernel 6.10

From: Baolu Lu
Date: Wed Jul 02 2025 - 01:17:05 EST


On 7/2/25 01:11, Ioanna Alifieraki wrote:
> #regzbot introduced: 129dab6e1286

> Hello everyone,

> We've identified a performance regression that starts with Linux
> kernel 6.10 and persists through 6.16 (tested at commit e540341508ce).
> Bisection pointed to commit:
> 129dab6e1286 ("iommu/vt-d: Use cache_tag_flush_range_np() in iotlb_sync_map").

> The issue occurs when running fio against two NVMe devices located
> under the same PCIe bridge (dual-port NVMe configuration). Performance
> drops compared to configurations where the devices are on different
> bridges.

> Observed performance:
> - Before the commit: ~6150 MiB/s, regardless of NVMe device placement.
> - After the commit:
> -- Same PCIe bridge: ~4985 MiB/s
> -- Different PCIe bridges: ~6150 MiB/s


> Currently we can only reproduce the issue on a Z3 metal instance on
> GCP, but I suspect it is reproducible on any machine with a dual-port
> NVMe device.
> A more detailed description of the issue and of the reproducer can be
> found at [1].

This test was running on bare-metal hardware rather than in a
virtualization guest, right? If that's the case,
cache_tag_flush_range_np() is almost a no-op.

Can you please show me the capability register of the IOMMU by running:

# cat /sys/bus/pci/devices/[pci_dev_name]/iommu/intel-iommu/cap
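
For reference, below is a minimal user-space sketch (not part of the
original report) that decodes the two capability bits most relevant to
this question, Caching Mode (CM, bit 7) and Required Write-Buffer
Flushing (RWBF, bit 4), from the value printed by the command above.
The bit positions follow the Intel VT-d specification; the sysfs path
is assumed to be the one shown above. On bare metal CM is normally 0,
which is what makes the map-time flush nearly a no-op.

/*
 * vtd_cap.c - decode CM and RWBF from the Intel VT-d capability
 * register value exposed in sysfs (raw hex, no "0x" prefix).
 */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>

int main(int argc, char **argv)
{
	char buf[32];
	uint64_t cap;
	FILE *f;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <path-to-intel-iommu/cap>\n", argv[0]);
		return 1;
	}

	f = fopen(argv[1], "r");
	if (!f) {
		perror("fopen");
		return 1;
	}
	if (!fgets(buf, sizeof(buf), f)) {
		perror("fgets");
		fclose(f);
		return 1;
	}
	fclose(f);

	/* sysfs prints the raw 64-bit register value in hex */
	cap = strtoull(buf, NULL, 16);

	printf("cap  = 0x%" PRIx64 "\n", cap);
	printf("CM   (caching mode, bit 7)           = %u\n",
	       (unsigned int)((cap >> 7) & 1));
	printf("RWBF (required wbuf flushing, bit 4) = %u\n",
	       (unsigned int)((cap >> 4) & 1));

	return 0;
}

Compile with a plain C compiler and point it at the sysfs file, e.g.:
cc -o vtd_cap vtd_cap.c && ./vtd_cap /sys/bus/pci/devices/[pci_dev_name]/iommu/intel-iommu/cap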


> Could you please advise on the appropriate path forward to mitigate or
> address this regression?
>
> Thanks,
> Jo

> [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2115738

Thanks,
baolu