Re: [RFC PATCH v5 20/29] sched/deadline: Allow deeper hierarchies of RT cgroups

From: Tejun Heo

Date: Mon May 18 2026 - 14:47:54 EST


Hello, Yuri.

On Mon, May 18, 2026 at 05:27:17PM +0200, Yuri Andriaccio wrote:
> Interface:
> - File: cpu.rt.max
> - Format: <runtime>|"max" <period>
> - Default value:
>     "max" <parent period> - if the parent schedules on the root runqueue.
>     0 <parent period> - if the parent is instead using HCBS.
> - Meaning (incomplete/dubious):
>     The bandwidth allocated to the specific cgroup and all of its children.
>     Since sum(children bw) <= own bw, a cgroup's servers will be configured
>     with (own bw - sum(children bw)) bandwidth.
>     A cgroup set to "max" whose parents are all set to "max" (root cgroup
>     excluded) will run their tasks in the root runqueue.
>     A cgroup set to "max" whose parent has a non-zero reservation will
>     inherit the parent's configuration.
>     The root cgroup's cpu.rt.max file reserves the maximum HCBS bandwidth
>     for the whole hierarchy. Root set to "max" disable HCBS (as if set with
>     a zero runtime).

I wonder whether it can be generalized more. Would something like the
following work? I'm going to ignore period for the sake of simplicity as it
doesn't seem to affect admission decisions.

- There is no cpu.rt.max in the root cgroup, in line with other control knobs.

- max means running in the nearest ancestor that has budget configuration.
  Obviously, if no one has budget configured, run in root.

- Setting a budget is subject to admission control in both directions - the
  budget source (the nearest budgeted ancestor, or the root pool if none)
  should have enough to give out and the target budget should be big enough
  to contain the actual usages and !max descendants in the subtree. Going
  to max is always fine - the source previously gave the budget out, so it
  has room to take everything back.
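
To make the two-directional check concrete, here is a rough userspace model
of the rules above. It is only a sketch, not kernel code: the Group class, the
function names, the integer bandwidth units and the root pool size of 90 are
all made up for illustration, period is ignored as above, and the root node
simply stands in for the root pool.

```python
class Group:
    """Hypothetical model of a cgroup; budget=None stands for "max"."""

    def __init__(self, name, parent=None, budget=None, usage=0):
        self.name = name
        self.parent = parent
        self.budget = budget      # None models "max"
        self.usage = usage        # bandwidth of tasks attached directly
        self.children = []
        if parent is not None:
            parent.children.append(self)


def source_of(g):
    """Nearest budgeted ancestor; the root node models the root pool."""
    a = g.parent
    while a.budget is None:
        a = a.parent
    return a


def demand(g):
    """What g charges to its source: its budget, or its load if "max"."""
    return g.budget if g.budget is not None else load(g)


def load(g):
    """Usage running inside g plus what its children charge to it."""
    return g.usage + sum(demand(c) for c in g.children)


def set_budget(g, new):
    """Return True if the write is admitted, False if it is rejected."""
    if g.parent is None:
        raise ValueError("no knob on the root in this model")
    if new is None:
        # Going to "max": load(g) <= old budget, so the source has room.
        g.budget = None
        return True
    src = source_of(g)
    others = load(src) - demand(g)   # charged to src by everyone but g
    if others + new > src.budget:    # source cannot give that much out
        return False
    if new < load(g):                # target too small for its subtree
        return False
    g.budget = new
    return True


# Assumed example tree: root pool of 90, a and c idle, b using 10.
root = Group("root", budget=90)
a = Group("a", root)
b = Group("b", a, usage=10)
c = Group("c", a)

print(set_budget(a, 40))    # fits the root pool and covers b's usage
print(set_budget(c, 35))    # a has only 30 left after b's usage
print(set_budget(c, 30))
print(set_budget(a, 35))    # would no longer contain b plus c
print(set_budget(a, None))  # back to "max", c now draws from root
```

In this model a "max" group contributes its whole subtree load to the nearest
budgeted ancestor, which is why switching to "max" never needs a check: the
load it hands back is bounded by the budget it held before.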

It seems like the above would give fairly generic behavior without abrupt
system-wide switches while staying relatively close to the behaviors of
other resource knobs. I could be missing something tho. Would something like
this work?

Thanks.

--
tejun