Re: [PATCH] cgroup: add cpu.stat.percpu for per-CPU cgroup stats
From: Barro Raffel, Willy
Date: Wed Apr 08 2026 - 14:39:14 EST
On Wed, Apr 08, 2026 at 02:30:11PM +0200, Michal Koutný wrote:
> ...
>The argument "to complete the interface" explains the actual need for
>such a new attribute not convincingly.
>
>Willy, what is the expected use of these per-cgroup per-cpu stats?
>(Given there's: global per-cpu stat, per-cgroup total stat, cpusets for
>binding and the mentioned bpf/drgn availability for precise
>control/debugging.)
We run systems where services in separate cgroups are pinned to
specific CPUs via sched_setaffinity (not cgroup cpusets). We need to
know how much of each core's time each cgroup consumes, particularly
on shared cores where multiple services compete. I believe this use
case is not unique to us.
/proc/stat gives per-CPU totals without per-cgroup breakdown.
cpu.stat gives per-cgroup totals without per-CPU breakdown.
Neither answers "how much of core N is cgroup X using?"
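
To make the gap concrete, here is a minimal userspace sketch that parses
both existing sources. The sample strings and function names are made up
for illustration; only the general layout of /proc/stat (per-CPU rows in
USER_HZ ticks) and of cgroup v2 cpu.stat (flat "key value" lines) is
assumed.

```python
# Illustrative samples only; the numbers are invented.
PROC_STAT_SAMPLE = """cpu  300 0 100 1600 0 0 0 0 0 0
cpu0 200 0 50 750 0 0 0 0 0 0
cpu1 100 0 50 850 0 0 0 0 0 0
ctxt 12345
"""

CPU_STAT_SAMPLE = """usage_usec 1500000
user_usec 1000000
system_usec 500000
"""


def parse_proc_stat_per_cpu(text):
    """Return {cpu_index: busy_ticks} from /proc/stat content."""
    busy = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-CPU rows are "cpu0", "cpu1", ...; the bare "cpu" row is the sum.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        cpu = int(fields[0][3:])
        values = [int(v) for v in fields[1:]]
        # Columns: user nice system idle iowait irq softirq steal ...
        # Busy time is the first eight columns minus idle and iowait.
        busy[cpu] = sum(values[:8]) - (values[3] + values[4])
    return busy


def parse_cpu_stat(text):
    """Return {key: value} from a cgroup v2 cpu.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


per_cpu_busy = parse_proc_stat_per_cpu(PROC_STAT_SAMPLE)  # all cgroups mixed
cgroup_total = parse_cpu_stat(CPU_STAT_SAMPLE)            # all CPUs mixed
print(per_cpu_busy, cgroup_total["usage_usec"])
```

The first result is keyed by CPU only and the second by cgroup only, so
no combination of the two recovers a (cgroup, CPU) pair.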
The data already exists in subtree_bstat per CPU. BPF can access
per-cgroup totals, but reading the per-CPU subtree_bstat requires either
Clang-compiled kernels (for percpu type tags) or custom kfuncs IIRC,
which are nontrivial dependencies for simple monitoring.
>Thanks,
>Michal
Regarding output format: I'm open to a more compact format if preferred,
for example skipping CPUs with zero stats, skipping offline CPUs, using a
simpler positional format without keys, or a mix of these ideas.
I personally prefer clear key-value pairs, which don't require the
developer or operator to consult the manual just to find out
what a number in a certain position means.
Happy to adjust based on what you all think fits best though.
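
As a rough illustration of the trade-off, the sketch below parses two
hypothetical layouts. Neither is the format from the patch: the line
shapes, key names, and column order are assumptions chosen only to
compare a keyed layout with a positional one, with some CPUs skipped.

```python
# Hypothetical layouts, not the patch's actual output.
KEYED_SAMPLE = """cpu0 usage_usec 100 user_usec 60 system_usec 40
cpu3 usage_usec 7 user_usec 5 system_usec 2
"""

POSITIONAL_SAMPLE = """cpu0 100 60 40
cpu3 7 5 2
"""

# A positional reader must know this order from documentation.
ASSUMED_COLUMNS = ("usage_usec", "user_usec", "system_usec")


def parse_keyed(text):
    """Parse 'cpuN key value key value ...' lines into nested dicts."""
    result = {}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        rest = tokens[1:]
        # Pair each key with the value that follows it.
        result[tokens[0]] = {
            rest[i]: int(rest[i + 1]) for i in range(0, len(rest) - 1, 2)
        }
    return result


def parse_positional(text, columns):
    """Parse 'cpuN v1 v2 ...' lines, naming values by an external column list."""
    result = {}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        result[tokens[0]] = dict(zip(columns, (int(v) for v in tokens[1:])))
    return result


keyed = parse_keyed(KEYED_SAMPLE)
positional = parse_positional(POSITIONAL_SAMPLE, ASSUMED_COLUMNS)
print(keyed == positional)
```

Both parsers cope with skipped CPUs (cpu1 and cpu2 are absent here), but
only the keyed one is self-describing: the positional reader depends on
a column list that lives outside the file.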
Thanks! Willy