Re: [PATCH net-next v8 3/3] gve: implement PTP gettimex64

From: David Woodhouse

Date: Fri May 22 2026 - 04:56:44 EST


On Fri, 2026-05-22 at 09:49 +0200, Thomas Gleixner wrote:
> On Fri, May 22 2026 at 01:12, David Woodhouse wrote:
> > On Fri, 2026-05-22 at 00:43 +0200, Thomas Gleixner wrote:
> > > > So surely a device can only support this if it *does* have a precise
> > > > hardware timestamp, e.g. from ART? And the corresponding sys_realtime
> > > > and sys_monoraw fields also have to be from the *same* value, surely?
> > >
> > > That is correct, but you still fail to explain how this new made up
> > > att.counter_id/att.counter_value is filled in coherently without the
> > > driver doing something abysmal?
> > >
> > > You can't explain it, because the only way to do that correctly is by
> > > extending the cross time stamp struct and get_device_system_crosststamp().
> >
> > Yes. Isn't that what I asked Arthur to do?
>
> You maybe asked him, but I can't see any evidence of it anywhere. All I
> can see is a magic struct attr, which conveys CS related data that is
> supposed to be produced magically at some undefined place.

https://lore.kernel.org/all/8b8a748a2ac6e8e3d970f5e74a2774133e7e7d8c.camel@xxxxxxxxxxxxx/

> > > > Is that not how it works? If not.... WTF *is* it for?
> > >
> > > Yes, that's the way it works since get_device_system_crosststamp() got
> > > introduced, but that does not give a random driver access to the actual
> > > converted (TSC) counter value. The driver can only observe the PTM value
> > > which is ART on x86 and whatever on other architectures. So it does not
> > > know anything about it.
> >
> > Sure, the driver can only return what it *knows*.
> >
> > We're certainly missing some infrastructure here... maybe the pci_bus
> > should have a CSID which corresponds to its PTM domain, which on x86
> > would be CSID_X86_ART?
>
> Yes, that's correct. But that does not solve the underlying problem:
>
> ...
>
> Q: How is a driver supposed to fill in attr->cs_cycles coherently?
>
> A: Not at all.
>

Let me rephrase that to check my full understanding of it in context:

Q: There are drivers today which operate on the TSC or arch counter,
and they're just fine. But how is a future driver which uses PCIe
PTM supposed to full in attr->cs_cycles coherently?

A: Yeah, our PCI and platform code lack the proper arch-agnostic
support for interpreting PTM time values; a driver wanting to do
that is going to need to do some infrastructure work first to
enable a proper conversion. But that's literally what PTM/ART is
*for* so that work has to happen at some point anyway.


> The consequence of this is that the next driver writer will just go and
> do:
>
>        attr->cs_id = IS_ENABLED(X86) ? CSID_X86_TSC : CSID_ARM_COUNTER;
>        attr->cs_cycles = get_cycles();
>    or
>        attr->cs_cycles = ptm_cycles * pulledout_of_thin_air_factor +
>                          pulledout_of_thin_air_offset;

Driver authors are crap, sure, but that would just be egregiously wrong
and hopefully it would be caught in review. It has to be a literally
*synchronous* capture or it's pointless.


> The same applies to the pre/post timestamp mechanism for
> gettimexattrs64().

There's literally a ptp helper to generate those, which can use
ktime_get_snapshot().

> These new attr IOCTLs are not restricted to your favorite vmclock PTP
> toy. They are available for every PTP driver which implements them, but
> there is no infrastructure to actually provide the required data. That
> ENA driver implementing gettimexattrs64() is demonstrating it. It at
> least does not try to make them up, but that's just a matter of time to
> happen.

The ENA driver demonstrates nothing of the sort. It's *not* filling in
those fields because it *doesn't* have PTM and it has nothing to fill
in.

Not *all* driver authors will fill in an optional field with random
bytes just because they *can*, Thomas.

> I know you don't care because that's outside of your sandpit, 

Thomas, I'd like you to stop saying that. It's insulting, and it's wrong.

Of *course* I'm trying to solve things for the general case.

But that doesn't extend to implementing random parts of PCI and
platform infrastructure to allow for hypothetical future drivers which
want to use PTM to implement the optional fields we're adding today.

This yak is deep enough already :)

And although we're having this discussion *in* the GVE thread, I've
barely even *looked* at those patches and I think I'm still calling it
a 'hypothetical future driver', because obviously if it's trying to use
PTM then it's going to need that generic support.

> Are you going to monitor drivers/net/ for those instances and make
> sure they are burned before seeing the light of day?
>
> Surely not and neither is netdev going to care as demonstrated with
> this GEV mess.

I think that message has been received loud and clear, Thomas :)

> > Never get_cycles(). Let's perhaps make it *not* need unholy hacks to
> > interpret a PTM counter value. Otherwise what's the point in ART and
> > PTM at all?
>
> Perhaps? There is no perhaps and aside of that I gave you the solution
> for the problem already in the mail you just replied to, no?

Hm, did you? I'm sorry, I must have missed it. Even rereading it this
morning, I see a lot of 'is bonkers' and 'unholy hacks' and '_cannot_
fill them in coherently' but I can't see a concrete constructive
suggestion. You meant <87bje8v0xl.ffs@tglx>, yes?

Maybe it's lost in the noise. Let's try to simplify.

• Userspace would like to be able to use PTP clocks to discipline the
actual counter which is directly accessible in userspace and by KVM
guests. Do do this, we only really need a system_counterval_t in
the pre/post timestamps, and the helpers can add that.

• *Some* drivers can actually provide a system_counterval_t in the
device reading itself, either because they're naturally based on
the CPU counter or because they have PTM. Drivers should report
what they *do* know, in the correct time domain, and not make
crap up.

• Any driver which actually wants to do this based on PTM in the
future will need to do the infrastructure work to make that work
cleanly, and avoid having to make crap up.

And if I'm understanding correctly:

• The GVE driver made crap up? Although they do so orthogonally to the
patch set we're *actually* talking about in this thread, because
they did that entirely on their own *without* reference to Arthur's
userspace API changes?


> I'm glad we agree on "Never get_cycles()" at least. May I ask you then
> to get a stiff drink ready and run:
>
>   # git blame drivers/ptp/ptp_vmclock.c | grep get_cycles

That's different because it's actually using it for the *device* timestamp.
But if you see an actual problem there, I'd love to know:

/*
* When invoked for gettimex64(), fill in the pre/post system
* times. The simple case is when system time is based on the
* same counter as st->cs_id, in which case all three times
* will be derived from the *same* counter value.
*
* If the system isn't using the same counter, then the value
* from ktime_get_snapshot() will still be used as pre_ts, and
* ptp_read_system_postts() is called to populate postts after
* calling get_cycles().
*
* The conversion to timespec64 happens further down, outside
* the seq_count loop.
*/
if (sts) {
ktime_get_snapshot(&systime_snapshot);
if (systime_snapshot.cs_id == st->cs_id) {
cycle = systime_snapshot.cycles;
} else {
cycle = get_cycles();
ptp_read_system_postts(sts);
}
} else {
cycle = get_cycles();
}

delta = cycle - le64_to_cpu(st->clk->counter_value);
...

Attachment: smime.p7s
Description: S/MIME cryptographic signature