Re: [PATCH v2 2/2] cgroup/dmem: add dmem.memcg control file for double-charging to memcg

From: Maarten Lankhorst

Date: Fri Jun 05 2026 - 07:35:54 EST


Hey,

On 5/26/26 18:59, Eric Chanudet wrote:
> On Fri, May 22, 2026 at 05:26:16PM +0200, Michal Koutný wrote:
>> Hello Eric.
>>
>> On Tue, May 19, 2026 at 11:59:02AM -0400, Eric Chanudet <echanude@xxxxxxxxxx> wrote:
>>> Add a root-only cgroupfs file "dmem.memcg" that lets an administrator
>>> configure whether allocations in a dmem region should also be charged to
>>> the memory controller.
>>
>> This kinda makes sense as it is not unlike io.cost.* device
>> configurators.
>>
>> Just for my better understanding -- will there be a space for userspace
>> to switch this? (No charged dmem allocations happen before responsible
>> userspace runs, so that the attribute remains unlocked.)
>>
>> (I'm rather indifferent about the actual double charging/non-charging
>> matter.)
>
> Yes, this is intended to be configured before the user space stack that
> would start allocating things is started. Once it has started (and tried
> to charge something), the configuration is locked
>
>>
>>>
>>> To handle inheritance, dmem adds a depends_on the memory controller,
>>> unless MEMCG isn't configured in.
>>>
>>> Double-charging is disabled by default. Once a charge is attempted, the
>>> setting is locked to prevent inconsistent accounting by a small 4-state
>>> machine (off, on, locked off, locked on).
>>>
>>> The memcg to charge is derived from the pool's cgroup, since the pool
>>> holds a reference to the dmem cgroup state that keeps the cgroup alive
>>> until it gets uncharged.
>>>
>>> Signed-off-by: Eric Chanudet <echanude@xxxxxxxxxx>
>>> ---
>>> Documentation/admin-guide/cgroup-v2.rst | 23 +++++
>>> kernel/cgroup/dmem.c | 158 +++++++++++++++++++++++++++++++-
>>> 2 files changed, 178 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
>>> index 6efd0095ed995b1550317662bc1b56c7a7f3db23..1d2fa55ddf0faa17baa916a8914d3033e8e42359 100644
>>> --- a/Documentation/admin-guide/cgroup-v2.rst
>>> +++ b/Documentation/admin-guide/cgroup-v2.rst
>>> @@ -2828,6 +2828,29 @@ DMEM Interface Files
>>> drm/0000:03:00.0/vram0 12550144
>>> drm/0000:03:00.0/stolen 8650752
>>>
>>> + dmem.memcg
>>> + A readwrite nested-keyed file that exists only on the root
>>> + cgroup.
>>
>> Strictly speaking this is not nested-keyed but flat keyed [1],
>
> Indeed,
>
>> which leads me to realization that this is the first instance of a boolean.
>> All in call, such a composition comes to my mind (latter is RO):
>>
>> drm/0000:03:00.0/vram0 enable=0|1 locked=0|1
>>
>
> So per[1] 1 key, 2 sub-keys (enable RW, locked RO), that looks better
> and match the documentation, thanks!
>
>>
>>
>>> +static ssize_t dmem_cgroup_memcg_write(struct kernfs_open_file *of, char *buf,
>>> + size_t nbytes, loff_t off)
>>> +{
>>> + while (buf) {
>>> + struct dmem_cgroup_region *region;
>>> + char *options, *name;
>>> + bool flag;
>>> +
>>> + options = buf;
>>> + buf = strchr(buf, '\n');
>>> + if (buf)
>>> + *buf++ = '\0';
>>
>> I recall there was a discussion about accepting only a single device per
>> write(2) (at the same time I see this idiom is still present in other
>> dmem.* files, so this is nothing to change in _this_ patch).
>
> I would second that. When setting say dmem.max for 2 regions, with a
> typo on the second, the first one is set, but write still get EINVAL.
>
> Also, I just notice dmemcg_limit_write() returns EINVAL if the region is
> not found (this patch returns ENODEV).
>
>>
>> Thanks,
>> Michal
>>
>> [1] https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#format
>
>
>

Perhaps a bit late, but before we start adding this UAPI we should enforce a
single region per write?

Kind regards,
~Maarten Lankhorst