Re: [QUESTION] printk: time namespaces and printk timestamps

From: Thomas Weißschuh

Date: Mon May 25 2026 - 11:44:01 EST


On 2026-05-25 15:29:01+0200, Petr Mladek wrote:
> Adding timekeeping maintainers into Cc.
>
> On Mon 2026-05-25 12:26:47, Thomas Weißschuh wrote:
> > On 2026-05-25 11:30:08+0200, Petr Mladek wrote:
> > > On Mon 2026-05-25 10:47:46, Thomas Weißschuh wrote:
> > > > On 2026-05-25 10:37:40+0200, Petr Mladek wrote:
> > > > > On Sat 2026-05-23 17:20:45, Thomas Weißschuh wrote:
> > > > > > It looks like the monotonic timestamps of printk records do not take the
> > > > > > callers time namespace into account when emitted through /dev/kmsg,
> > > > > > /proc/kmsg and syslog(2).
> > > > > >
> > > > > > To me this looks like a bug. What do you think? Should it be changed?
> > > > >
> > > > My idea was rather to do the conversion when *reading* the message from
> > > > a time namespaced user process.
> > >
> > > I see. But it might produce messages with negative timestamps. I am
> > > not sure if it is a good idea. I guess that many tools might get
> > > confused.
> >
> > Yeah. While the time namespace core ensures that the observed "now" can
> > never be negative, messages that were recorded before the creation of
> > the namespace could result in negative timestamps.
> > These could be filtered out during formatting, but that is ugly, too.
>
> Maybe, I was wrong. It seems that the offset might only be positive,
> see below.

With your patch below (with an added NULL-check as explained below)
I was able to trigger such a negative number, or rather an unsigned
underflow:

#!/bin/bash

set -e

echo test > /dev/kmsg

sleep 3

# lsclocks from recent util-linux. I don't know another way to
# get CLOCK_MONOTONIC from shell.
mono=$(lsclocks --time monotonic | cut -d. -f1)

unshare -T --monotonic -$mono lsclocks --time monotonic
unshare -T --monotonic -$mono dmesg -r | tail -1

Result:
[ 2.998937] test (host console, non-namespaced)
0.873620599 (lsclocks --time monotonic in namespace)
<12>[18446744071.708489] test (/dev/kmsg in namespace)

> > > > > I see two potential problems:
> > > > >
> > > > > 1. printk() historically used a monotonic clock. I am not sure if any
> > > > > usespace application depends on it but there is a risk of breaking
> > > > > something.
> > > >
> > > > >From the perspective of the namespaced process currently the timestamps
> > > > from printk may in fact not be CLOCK_MONOTONIC.
> > >
> > > The messages stored in the kernel ring buffer currently always have
> > > monotonic timestamps.
> >
> > "monotonic" as in "never decreases" (obious) or "CLOCK_MONOTONIC"?
> > That is not actually well-documented I think.
> > dmesg(1) assumes CLOCK_MONOTONIC.
>
> Honestly, I do not know the details. printk() is calling local_clock()
> which is a wrapper over sched_clock(). I think that it is based on
> "CLOCK_MONOTONIC" but it is not precise because it is lockless.

We have ktime_get_mono_fast_ns() which is also supposed to do this.
The name would be clearer and it should be guaranteed to be
CLOCK_MONOTONIC. But I don't know the details or history so this is just
guesswork right now.

> > > I guess that you talk about matching kernel messages with other
> > > userspace logs. But they might be inconsistent also because
> > > of using different clocks...
> >
> > Or matching it with clock_gettime().
> >
> > > There were several discussions in the past about storing more
> > > timestamps from different clocks in the printk ring buffer.
> > > But it never went anywhere.
> >
> > >From the perspective of a namespaced process, this *is* the actual
> > CLOCK_MONOTONIC. It doesn't really know about any other one, and
> > the time APIs do not expose the un-namespaced clock value to a
> > namespaced process. So I am not sure this is the same usecase.
> >
> > As a real-world example, dmesg(1) uses clock_gettime(CLOCK_MONOTONIC) in
> > its logic to format kmsg timestamps as wallcock/CLOCK_REALTIME ones:
> >
> > # echo test > /dev/kmsg
> > # unshare -T --monotonic 5000000 dmesg --time-format iso | tail -1
> > 2026-03-28T14:20:35,366875+01:00 test
> > # dmesg --time-format iso | tail -1
> > 2026-05-25T12:13:55,366876+02:00 test
> > # date
> > Mon May 25 12:13:55 CEST 2026
> >
> > The timestamp from the namespaced dmesg(1) is wrong.
> > (CLOCK_REALTIME itself is not namespace-aware)
>
> Note that kernel always gives the same output:

(...)

> All the transformations into other time formats are done by dmesg
> internally. See dmesg(1)"

(...)

> It means that the "iso" format is broken by definition.

ctime and reltime, too.

> But we probably could do better here.
>
> The big difference of the timestamp in your example is because
> the namespace uses so big offset for the monotonic clock.

I don't see how the size of the offset makes a difference.
Could you explain?

> It should get fixed if we added the name space offset, similar
> to commit 3ae700ecfae91331 ("fs/proc: apply the time namespace
> offset to /proc/stat btime"). But we should use
> timens_add_boottime_ns() which does not exist at the moment.
>
> Something like:
>
> diff --git a/include/linux/time_namespace.h b/include/linux/time_namespace.h
> index 58bd9728df58..c34713b657b5 100644
> --- a/include/linux/time_namespace.h
> +++ b/include/linux/time_namespace.h
> @@ -77,6 +77,13 @@ static inline void timens_add_monotonic(struct timespec64 *ts)
> *ts = timespec64_add(*ts, ns_offsets->monotonic);
> }
>
> +static inline u64 timens_add_monotonic_ns(u64 nsec)
> +{
> + struct timens_offsets *ns_offsets = &current->nsproxy->time_ns->offsets;
> +
> + return nsec + timespec64_to_ns(&ns_offsets->monotonic);
> +}
> +
> static inline void timens_add_boottime(struct timespec64 *ts)
> {
> struct timens_offsets *ns_offsets = &current->nsproxy->time_ns->offsets;
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 0323149548f6..c5dcf544e267 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -48,6 +48,7 @@
> #include <linux/sched/clock.h>
> #include <linux/sched/debug.h>
> #include <linux/sched/task_stack.h>
> +#include <linux/time_namespace.h>
> #include <linux/panic.h>
>
> #include <linux/uaccess.h>
> @@ -641,7 +642,7 @@ static void append_char(char **pp, char *e, char c)
> static ssize_t info_print_ext_header(char *buf, size_t size,
> struct printk_info *info)
> {
> - u64 ts_usec = info->ts_nsec;
> + u64 ts_usec = timens_add_monotonic_ns(info->ts_nsec);
> char caller[20];
> #ifdef CONFIG_PRINTK_CALLER
> u32 id = info->caller_id;
> @@ -1382,7 +1383,7 @@ static size_t info_print_prefix(const struct printk_info *info, bool syslog,
> len = print_syslog((info->facility << 3) | info->level, buf);
>
> if (time)
> - len += print_time(info->ts_nsec, buf + len);
> + len += print_time(timens_add_monotonic_ns(info->ts_nsec), buf + len);
>
> len += print_caller(info->caller_id, buf + len);

I tried something like this before. There are some notes:

In some contexts where this is called from, current->nsproxy can be NULL
(maybe when printing to the console) and the kernel will crash.
So either we add a NULL check or explicitly pass through a parameter
which enables timens handling when called from the right entrypoints.

This would require to make it clear that the timestamp is
CLOCK_MONOTONIC. Not that outlandish given that userspace already relies
on it.

Given that nobody has complained so far, maybe it is not a real issue.


Thomas