Re: [RFC PATCH RESEND] iommu: intel: apply quirk_iommu_igfx for 8086:0044 (QM57/QS57)

From: Baolu Lu
Date: Thu Apr 17 2025 - 23:14:39 EST


On 4/18/25 11:07, Baolu Lu wrote:
On 1/20/25 17:35, Mingcong Bai wrote:
(I'm not very confident about the approach of this patch but I failed to
find a better way to address the issue I have on hand, so please consider
this patch as an RFC...)

On the Lenovo ThinkPad X201, when Intel VT-d is enabled in the BIOS, the
kernel boots with DMAR-related errors, the graphical interface is quite
choppy, and the system resets erratically within a minute of booting:

DMAR: DRHD: handling fault status reg 3
DMAR: [DMA Write NO_PASID] Request device [00:02.0] fault addr 0xb97ff000
[fault reason 0x05] PTE Write access is not set

Upon comparing boot logs with VT-d on/off, I found that, with VT-d off,
the Intel Calpella quirk (`quirk_calpella_no_shadow_gtt()') applied the
igfx IOMMU disable quirk correctly:

pci 0000:00:00.0: DMAR: BIOS has allocated no shadow GTT; disabling IOMMU
for graphics

Whereas with VT-d on, it went into the "else" branch, which then
triggered the DMAR handling fault above:

... else if (!disable_igfx_iommu) {
    /* we have to ensure the gfx device is idle before we flush */
    pci_info(dev, "Disabling batched IOTLB flush on Ironlake\n");
    iommu_set_dma_strict();
}
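
To make the two branches easier to compare, here is a rough user-space
model of the decision described above. It is only a sketch: the function
and enum names are made up for illustration, and the GGC bit test and its
mask value are a recollection of the surrounding kernel code, not
something quoted in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mask for the "VT-d shadow GTT allocated" bit in the GGC
 * register, as recalled from the kernel source; not taken from this thread. */
#define GGC_MEMORY_VT_ENABLED (0x8 << 8)

/* Hypothetical names, used only for this illustration. */
enum calpella_action {
    CALPELLA_DISABLE_IGFX_IOMMU, /* "BIOS has allocated no shadow GTT" */
    CALPELLA_STRICT_DMA,         /* "Disabling batched IOTLB flush on Ironlake" */
    CALPELLA_NO_CHANGE,          /* igfx IOMMU already disabled elsewhere */
};

/* Models which branch the quirk would take for a given GGC value. */
static enum calpella_action calpella_decide(uint16_t ggc, bool igfx_iommu_disabled)
{
    if (!(ggc & GGC_MEMORY_VT_ENABLED))
        return CALPELLA_DISABLE_IGFX_IOMMU;
    if (!igfx_iommu_disabled)
        return CALPELLA_STRICT_DMA;
    return CALPELLA_NO_CHANGE;
}
```

Under these assumptions, a GGC value with the bit clear always ends in the
"disabling IOMMU for graphics" path, while a value with the bit set keeps
the IOMMU active for graphics and only switches to strict DMA mode, which
is the path that precedes the faults shown earlier.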

Now, this is not exactly scientific, but moving 0x0044 to quirk_iommu_igfx
seems to have fixed the aforementioned issue. After running `git blame'
on the function a few times, I found that the quirk was originally
introduced as a fix specific to the ThinkPad X201:

commit 9eecabcb9a92 ("intel-iommu: Abort IOMMU setup for igfx if BIOS gave
no shadow GTT space")

It was later revised twice, resulting in the "else" branch we saw above:

- 2011: commit 6fbcfb3e467a ("intel-iommu: Workaround IOTLB hang on
   Ironlake GPU")
- 2024: commit ba00196ca41c ("iommu/vt-d: Decouple igfx_off from graphic
   identity mapping")

I'm uncertain whether further testing on this particular laptop was
done in 2011 and (honestly I'm not sure) 2024, but I would be happy to do
some distro-specific testing if that's what would be required to verify
this patch.

P.S., I also see IDs 0x0040, 0x0062, and 0x006a listed under the same
`quirk_calpella_no_shadow_gtt()' quirk, but I'm not sure how similar these
chipsets are (whether they share the same issue with VT-d or, indeed,
whether this issue is specific to a bug in the Lenovo BIOS). With regard
to 0x0062, it seems to be a Centrino wireless card rather than a chipset?

I have also listed a few (distro and kernel) bug reports below as
references (some of them are from 7-8 years ago!), as they seem to
describe similar issues found on different Westmere/Ironlake, Haswell,
and Broadwell hardware setups.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=197029
Link: https://groups.google.com/g/qubes-users/c/4NP4goUds2c?pli=1
Link: https://bugs.archlinux.org/task/65362
Link: https://bbs.archlinux.org/viewtopic.php?id=230323
Reported-by: Wenhao Sun <weiguangtwk@xxxxxxxxxxx>
Signed-off-by: Mingcong Bai <jeffbai@xxxxxxx>
---
  drivers/iommu/intel/iommu.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

Queued for v6.15-rc. Thank you!

Please ignore this. I picked the latest one instead.

https://lore.kernel.org/r/20250415133330.12528-1-jeffbai@xxxxxxx

Sorry for the inconvenience.