Re: [PATCH net-next v8 3/3] gve: implement PTP gettimex64

From: Thomas Gleixner

Date: Fri May 22 2026 - 10:56:12 EST


On Fri, May 22 2026 at 11:34, David Woodhouse wrote:
> On Fri, 2026-05-22 at 00:43 +0200, Thomas Gleixner wrote:
>>      1) Guest TSC value at freeze
>>      2) Guest nominal TSC frequency
>>      3) Old host REALTIME at freeze - Ideally you use TAI
>>      4) New host TSC frequency
>>      5) New host TSC/REALTIME/TAI snapshot
>>
>>   #1 is a KVM problem, but see #3
>>  
>>   #2 ideally communicated from the guest to the host after early
>>      initialization at boot.
>>
>>      You really want this information because the guest won't change the
>>      mult/shift pair for it ever.
>
> If we *tell* the guest the frequency in CPUID, then it shouldn't be trying
> to manually calibrate it against an emulated PIT while suffering steal
> time, and its mult/shift should have a little bit less entropy.

They are identical on every boot evaluation.

> Even a system which *has* to do that crappy calibration still does it
> with a lot more *precision* than accuracy, so I suspect we ought to be
> rounding the result to the nearest 1MHz as long as that's within 10PPM
> or something like that. But that really *is* a digression :)

:)
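
For illustration only, the rounding idea quoted above could be sketched
roughly as follows. This is not kernel code: round_tsc_hz() is a
hypothetical helper, and the 1MHz step and 10PPM bound are simply the
figures from the quoted paragraph.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not an existing kernel function): snap a calibrated
 * TSC frequency to the nearest 1 MHz, but only if doing so moves the value
 * by at most 10 PPM. Otherwise keep the measured value unchanged.
 */
static uint64_t round_tsc_hz(uint64_t measured_hz)
{
	const uint64_t step_hz = 1000000ULL;	/* 1 MHz granularity */
	const uint64_t max_ppm = 10ULL;		/* allowed deviation */
	uint64_t rounded, diff;

	/* Round to the nearest multiple of step_hz. */
	rounded = ((measured_hz + step_hz / 2) / step_hz) * step_hz;
	diff = rounded > measured_hz ? rounded - measured_hz
				     : measured_hz - rounded;

	/*
	 * diff / measured_hz <= max_ppm / 1e6, rearranged to avoid division.
	 * diff is at most 500000, so diff * 1000000 cannot overflow 64 bits.
	 */
	if (rounded != 0 && diff * 1000000ULL <= max_ppm * measured_hz)
		return rounded;

	return measured_hz;
}
```

With these numbers a measurement of 2399995000 Hz is about 2 PPM away
from 2.4 GHz and gets snapped, while 2400030000 Hz is about 12.5 PPM
away and is left alone.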

> The model I'm enabling and documenting for KVM migration is basically
> within the noise of what you describe above, yes.
>
> But if we want to give the illusion of the TSC just ticking away while
> the guest happens to experience a little steal time, when in fact it's
> been completely migrated to a new host, we actually want to work with
> the *true* running frequency of the TSC at the moment of migration.
>
> So...
>
> 1) Use clock_get_time_reference() to get a { host tsc, time, rate }
> from the source host at 'freeze' time.
>
> 2) Use clock_get_time_reference() to get a { host tsc, time, rate }
> from the destination host, when resuming.
>
> 3) (Optionally) scale the guest's TSC frequency, not by the *nominal* 
> rates, but by the *actual* ratio of the rates from (1) and (2)
> above (plus any original nominal scaling of the guest's TSC from
> the original host).
>
> 4) Calculate the guest TSC *offset* in order to convey the effect
> that the guest's TSC continued to tick at the rate from (1),
> during the time period between (1) and (2).
>
> 5) (Optionally) Once the guest is running, slowly undo the scaling
> in (3) in order to get the guest back to a nice simple unscaled
> TSC (or scaled only by nominal frequencies as it was when launched)
>
>
> Obviously, a dedicated environment which disciplines its TSC directly
> can do all of that right now already because it *has* all the
> information it would get from clock_get_time_reference().
>
> But as you know perfectly well, Thomas, I'm never happy to keep the
> blinkers on and focus only on my specific use case at hand; I want this
> to work for the *general* case, including people running QEMU in a
> fairly standard environment. And I think clock_get_time_reference()
> might be a reasonable way of doing that, and a fairly clean counterpart
> to the clock_set_time_reference() you suggested?

Agreed.
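
As a rough sketch of the arithmetic in the quoted steps (2) to (4), under
stated assumptions: struct time_ref is a hypothetical layout for the
{ host tsc, time, rate } tuple, since no signature for
clock_get_time_reference() is given in this thread. The sketch also
assumes that the guest TSC ran unscaled on the source host and that the
guest TSC value at freeze was sampled at the same instant as the source
snapshot. Step (5), slowly undoing the scaling, is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical snapshot: { host tsc, time, rate } taken at one instant. */
struct time_ref {
	uint64_t host_tsc;	/* host TSC reading */
	uint64_t time_ns;	/* host time in ns, ideally TAI */
	uint64_t rate_hz;	/* actual host TSC rate at that moment */
};

/*
 * a * b / c with a 128-bit intermediate (unsigned __int128 is a GCC/Clang
 * extension). The result is truncated if it does not fit in 64 bits.
 */
static uint64_t mul_div_u64(uint64_t a, uint64_t b, uint64_t c)
{
	assert(c != 0);
	return (uint64_t)(((unsigned __int128)a * b) / c);
}

/*
 * Step (3): scale a destination host TSC value by the ratio of the
 * *actual* rates, so that it appears to tick at the source rate.
 */
static uint64_t scale_host_tsc(uint64_t host_tsc,
			       const struct time_ref *src,
			       const struct time_ref *dst)
{
	return mul_div_u64(host_tsc, src->rate_hz, dst->rate_hz);
}

/*
 * Step (4): choose the offset so that the guest TSC looks as if it kept
 * ticking at the source rate between the two snapshots.
 */
static uint64_t guest_tsc_offset(uint64_t guest_tsc_at_freeze,
				 const struct time_ref *src,
				 const struct time_ref *dst)
{
	uint64_t elapsed_ns, missed_ticks, target;

	assert(dst->time_ns >= src->time_ns);
	elapsed_ns = dst->time_ns - src->time_ns;
	missed_ticks = mul_div_u64(elapsed_ns, src->rate_hz, NSEC_PER_SEC);
	target = guest_tsc_at_freeze + missed_ticks;

	/* Unsigned wraparound covers the "negative offset" case. */
	return target - scale_host_tsc(dst->host_tsc, src, dst);
}

/* What the guest would read on the destination for a given host TSC. */
static uint64_t guest_tsc_read(uint64_t host_tsc, uint64_t offset,
			       const struct time_ref *src,
			       const struct time_ref *dst)
{
	return scale_host_tsc(host_tsc, src, dst) + offset;
}
```

The point of scaling by the measured rates rather than the nominal ones
is that the guest's mult/shift pair stays valid across the move: the
guest keeps converting TSC deltas to time with the same factors, and
only the host-side scaling and offset change.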