Re: [RFC PATCH v2 4/8] timekeeping: Add absolute reference for feed-forward clock discipline

From: David Woodhouse

Date: Tue May 19 2026 - 07:16:09 EST


On Mon, 2026-05-18 at 19:09 -0700, John Stultz wrote:
> On Sun, May 17, 2026 at 3:03 PM David Woodhouse <dwmw2@xxxxxxxxxxxxx> wrote:
> >
> > From: David Woodhouse <dwmw@xxxxxxxxxxxx>
> >
> > Add timekeeping_set_reference() which allows an external clock source
> > (such as a hypervisor vmclock) to provide an absolute time reference.
> > The reference defines a linear counter-to-time mapping that the kernel
> > uses to set both the frequency and phase of the system clock.
> >
> > When timekeeping_set_reference() is called:
> >  - tick_length is computed from the reference period and set via
> >    ntp_set_tick_length(), keeping all NTP state consistent
> >  - A pending flag is set so that on the next tick (under the
> >    timekeeping lock), the phase error is set via ntp_set_time_offset()
> >
> > The existing time_offset slew mechanism then converges the clock to
> > the reference, with the clawback fix ensuring ntp_error stays accurate
> > across mult changes.
> >
> > Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
> > ---
> >  include/linux/timekeeping_reference.h | 19 ++++++++++++
> >  kernel/time/ntp.c                     | 27 ++++++++++++++++
> >  kernel/time/ntp_internal.h            |  3 ++
> >  kernel/time/timekeeping.c             | 44 +++++++++++++++++++++++++++
> >  4 files changed, 93 insertions(+)
> >  create mode 100644 include/linux/timekeeping_reference.h
> >
> > diff --git a/include/linux/timekeeping_reference.h b/include/linux/timekeeping_reference.h
> > new file mode 100644
> > index 000000000000..4c1d8a6c02f1
> > --- /dev/null
> > +++ b/include/linux/timekeeping_reference.h
> > @@ -0,0 +1,19 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _LINUX_TIMEKEEPING_REFERENCE_H
> > +#define _LINUX_TIMEKEEPING_REFERENCE_H
> > +
> > +#include <linux/clocksource_ids.h>
> > +#include <linux/types.h>
> > +
> > +struct tk_reference {
> > +       enum clocksource_ids    cs_id;
> > +       u64                     counter_value;
> > +       u64                     time_sec;
> > +       u64                     time_frac_sec;
> > +       u64                     period_frac_sec;
> > +       u8                      period_shift;
> > +};
>
> Can you add comments documenting each of these values?

Ack.

> All the math here could use some comments, just to be explicit about
> what is intended.

Ack.

>
> > @@ -2339,6 +2366,23 @@ static void timekeeping_adjust(struct timekeeper *tk, s64 offset)
> >                 tk->ntp_tick = ntp_tl;
> >                 mult = div64_u64(tk->ntp_tick >> tk->ntp_error_shift,
> >                                  tk->cycle_interval);
> > +               if (tk_ref_pending && tk->cs_id == tk_ref.cs_id) {
> > +                       u64 d = tk->tkr_mono.cycle_last - tk_ref.counter_value;
> > +                       __uint128_t p = (__uint128_t)d * tk_ref.period_frac_sec;
> > +                       u64 rf;
> > +                       s64 ref_err;
> > +
> > +                       p >>= tk_ref.period_shift;
> > +                       p += tk_ref.time_frac_sec;
> > +                       rf = (u64)p;
> > +                       ref_err = (s64)mul_u64_u64_shr(rf,
> > +                               (u64)NSEC_PER_SEC << tk->tkr_mono.shift, 64) -
> > +                               (s64)tk->tkr_mono.xtime_nsec;
> > +                       ntp_set_time_offset(tk->id,
> > +                               ref_err >> tk->tkr_mono.shift);
> > +                       tk->ntp_error = 0;
> > +                       tk_ref_pending = false;
> > +               }
>
> Just a quick skim here, but I don't see anything using tk_ref_valid.
> Am I missing it? Or can that value be dropped?

Yeah, I think it can be dropped; it's a hangover from the time when we
needed to consult the reference at *every* tick to drive the mult±1
dithering.

But now the core timekeeping can actually follow a line for itself,
that external oracle is no longer needed — because time definition (C)
*is* the reference and doesn't keep accumulating errors.

I'm also going to take another look at whether I need this hunk in
timekeeping_adjust() at all, or whether the tracking is now
sufficiently predictable that timekeeper_set_reference() could just set
time_offset directly.

I did it this way because timekeeper_set_reference() runs mid-tick,
while timekeeping_adjust() can adjust time_offset to clamp (C) to the
reference at the start of the new tick.

But if the accounting is all fixed now, there's no reason why
timekeeper_set_reference() couldn't get just apply the appropriate
correction for the moment at the existing cycle_last, effectively
setting time (C) for *that* moment, and then trust that it tracks
correctly from there. I'll play with it...

Attachment: smime.p7s
Description: S/MIME cryptographic signature