#regzbot introduced: 129dab6e1286
Hello everyone,
We've identified a performance regression that starts with linux
kernel 6.10 and persists through 6.16(tested at commit e540341508ce).
Bisection pointed to commit:
129dab6e1286 ("iommu/vt-d: Use cache_tag_flush_range_np() in iotlb_sync_map").
The issue occurs when running fio against two NVMe devices located
under the same PCIe bridge (dual-port NVMe configuration). Performance
drops compared to configurations where the devices are on different
bridges.
Observed Performance:
- Before the commit: ~6150 MiB/s, regardless of NVMe device placement.
- After the commit:
-- Same PCIe bridge: ~4985 MiB/s
-- Different PCIe bridges: ~6150 MiB/s
Currently we can only reproduce the issue on a Z3 metal instance on
gcp. I suspect the issue can be reproducible if you have a dual port
nvme on any machine.
At [1] there's a more detailed description of the issue and details
on the reproducer.
Could you please advise on the appropriate path forward to mitigate or
address this regression?
Thanks,
Jo
[1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2115738