Re: VMX Preemption Timer appears to be buggy on SKX, CLX, and ICX

From: Chao Gao

Date: Fri Jun 05 2026 - 01:57:05 EST


On Thu, Jun 04, 2026 at 10:34:40PM -0700, Jim Mattson wrote:
>> >
>> >I think vmx_set_hv_timer() should return -EINVAL for values impacted
>> >by this erratum. However, the only documented issue is for EMR, and we
>> >have not observed the problem on EMR. That's unsettling.
>>
>> Could you clarify what tests you ran?
>
>Just tools/testing/selftests/kvm/x86/apic_bus_clock_test.
>
>It fails on SKX, CLX, and ICX. It passes on SPR, EMR, and GNR.

Thanks. In that case, that test likely does not trigger the issue.

>> >2) Is there any compelling reason not to simplify the limit to 2^25?
>>
>> We can use 2^25 as a conservative bound, but it is much lower than necessary.
>> The current bound comes from theoretical analysis and was validated on multiple
>> platforms.
>
>Yes, but how often do guests program their local APIC timer to fire
>more than 2^(25 + IA32_VMX_MISC[4:0]) cycles in the future?

I had interpreted your earlier question as referring to the erratum write-up
itself (i.e., why Intel did not publish 2^25 directly as the limit).

If we are talking about the VMM implementation, this should indeed be rare. I
do not see a strong reason KVM could not use 2^25 as a conservative limit.
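
For illustration only, here is a minimal sketch of what such a conservative
check could look like. It assumes the 2^25 bound applies to the value written
to the preemption timer, i.e. the TSC delta shifted right by
IA32_VMX_MISC[4:0]. The helper name and the constant are hypothetical and are
not existing KVM code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed conservative bound on the preemption-timer value (hypothetical name). */
#define PREEMPT_TIMER_SAFE_LIMIT (1ULL << 25)

/*
 * Hypothetical helper, not part of KVM.
 *
 * The VMX preemption timer counts down once every 2^rate_shift TSC cycles,
 * where rate_shift is IA32_VMX_MISC[4:0]. Returns true if the resulting
 * timer value stays strictly below the assumed 2^25 bound.
 */
static bool hv_timer_delta_is_safe(uint64_t delta_tsc, unsigned int rate_shift)
{
	return (delta_tsc >> rate_shift) < PREEMPT_TIMER_SAFE_LIMIT;
}
```

With rate_shift == 5, this accepts TSC deltas up to 2^30 - 1 cycles. A caller
such as vmx_set_hv_timer() could return -EINVAL when the check fails, so that
KVM falls back to the software timer.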

>
>> >
>> >3) Is it just coincidence that 25 + IA32_VMX_MISC[4:0] (on EMR) == 32,
>> >or should the limit be calculated as 32 - IA32_VMX_MISC[4:0]?
>>
>> My understanding is that hardware scales the preemption-timer value and
>> converts it to a 32-bit core crystal clock counter, rather than directly
>> using a 32-bit TSC delta. IA32_VMX_MISC[4:0] likely participates in that
>> calculation.
>
>That doesn't definitively answer my question. Let me try to rephrase it.
>
>With respect to EMR, you wrote previously, "A mitigation for this
>erratum is for software to program the VMX preemption timer for values
>below 2^25 * CPUID.15H:EBX[31:0] / CPUID.15H:EAX[31:0]."
>
>My question is whether the exponent, 25, is a fixed value for all
>CPUs, regardless of their IA32_VMX_MISC[4:0]. It sounds like you are
>saying that the exponent may depend on IA32_VMX_MISC[4:0].

Let me double-check this with the internal team and get back to you.