Re: [RFC PATCH v2 4/8] timekeeping: Add absolute reference for feed-forward clock discipline
From: David Woodhouse
Date: Tue May 19 2026 - 07:16:09 EST
On Mon, 2026-05-18 at 19:09 -0700, John Stultz wrote:
> On Sun, May 17, 2026 at 3:03 PM David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:
> >
> > From: David Woodhouse <dwmw@xxxxxxxxxxxx>
> >
> > Add timekeeping_set_reference() which allows an external clock source
> > (such as a hypervisor vmclock) to provide an absolute time reference.
> > The reference defines a linear counter-to-time mapping that the kernel
> > uses to set both the frequency and phase of the system clock.
> >
> > When timekeeping_set_reference() is called:
> > - tick_length is computed from the reference period and set via
> > ntp_set_tick_length(), keeping all NTP state consistent
> > - A pending flag is set so that on the next tick (under the
> > timekeeping lock), the phase error is set via ntp_set_time_offset()
> >
> > The existing time_offset slew mechanism then converges the clock to
> > the reference, with the clawback fix ensuring ntp_error stays accurate
> > across mult changes.
> >
> > Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
> > ---
> > include/linux/timekeeping_reference.h | 19 ++++++++++++
> > kernel/time/ntp.c | 27 ++++++++++++++++
> > kernel/time/ntp_internal.h | 3 ++
> > kernel/time/timekeeping.c | 44 +++++++++++++++++++++++++++
> > 4 files changed, 93 insertions(+)
> > create mode 100644 include/linux/timekeeping_reference.h
> >
> > diff --git a/include/linux/timekeeping_reference.h b/include/linux/timekeeping_reference.h
> > new file mode 100644
> > index 000000000000..4c1d8a6c02f1
> > --- /dev/null
> > +++ b/include/linux/timekeeping_reference.h
> > @@ -0,0 +1,19 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _LINUX_TIMEKEEPING_REFERENCE_H
> > +#define _LINUX_TIMEKEEPING_REFERENCE_H
> > +
> > +#include <linux/clocksource_ids.h>
> > +#include <linux/types.h>
> > +
> > +struct tk_reference {
> > + enum clocksource_ids cs_id;
> > + u64 counter_value;
> > + u64 time_sec;
> > + u64 time_frac_sec;
> > + u64 period_frac_sec;
> > + u8 period_shift;
> > +};
>
> Can you add comments documenting each of these values?
Ack.
> All the math here could use some comments, just to be explicit about
> what is intended.
Ack.
>
> > @@ -2339,6 +2366,23 @@ static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
> > tk->ntp_tick = ntp_tl;
> > mult = div64_u64(tk->ntp_tick >> tk->ntp_error_shift,
> > tk->cycle_interval);
> > + if (tk_ref_pending && tk->cs_id == tk_ref.cs_id) {
> > + u64 d = tk->tkr_mono.cycle_last - tk_ref.counter_value;
> > + __uint128_t p = (__uint128_t)d * tk_ref.period_frac_sec;
> > + u64 rf;
> > + s64 ref_err;
> > +
> > + p >>= tk_ref.period_shift;
> > + p += tk_ref.time_frac_sec;
> > + rf = (u64)p;
> > + ref_err = (s64)mul_u64_u64_shr(rf,
> > + (u64)NSEC_PER_SEC << tk->tkr_mono.shift, 64) -
> > + (s64)tk->tkr_mono.xtime_nsec;
> > + ntp_set_time_offset(tk->id,
> > + ref_err >> tk->tkr_mono.shift);
> > + tk->ntp_error = 0;
> > + tk_ref_pending = false;
> > + }
>
> Just a quick skim here, but I don't see anything using tk_ref_valid.
> Am I missing it? Or can that value be dropped?
Yeah, I think it can be dropped; it's a hangover from the time when we
needed to consult the reference at *every* tick to drive the mult±1
dithering.
But now the core timekeeping can actually follow a line for itself,
that external oracle is no longer needed — because time definition (C)
*is* the reference and doesn't keep accumulating errors.
I'm also going to take another look at whether I need this hunk in
timekeeping_adjust() at all, or whether the tracking is now
sufficiently predictable that timekeeper_set_reference() could just set
time_offset directly.
I did it this way because timekeeper_set_reference() runs mid-tick,
while timekeeping_adjust() can adjust time_offset to clamp (C) to the
reference at the start of the new tick.
But if the accounting is all fixed now, there's no reason why
timekeeper_set_reference() couldn't get just apply the appropriate
correction for the moment at the existing cycle_last, effectively
setting time (C) for *that* moment, and then trust that it tracks
correctly from there. I'll play with it...
Attachment:
smime.p7s
Description: S/MIME cryptographic signature