Re: [PATCH v14 27/32] x86,fs/resctrl: Auto assign/unassign counters on mkdir

From: Moger, Babu
Date: Tue Jul 01 2025 - 15:07:11 EST


Hi Reinette,

On 6/25/25 18:25, Reinette Chatre wrote:
> Hi Babu,
>
> On 6/13/25 2:05 PM, Babu Moger wrote:
>> Resctrl provides a user-configurable option mbm_assign_on_mkdir that
>> determines if a counter will automatically be assigned to an RMID, event
>> pair when its associated monitor group is created via mkdir.
>>
>> Enable mbm_assign_on_mkdir by default and automatically assign or unassignq
>> counters when a resctrl group is created or deleted.
>
> This is a bit confusing since I do not think mbm_assign_on_mkdir has *anything*
> to do with unassign of counters. Counters are always (irrespective of mbm_assign_on_mkdir)
> unassigned when a resctrl group is deleted, no?

Yes. That is correct. Changed the text now.

>
> The subject also does not seem accurate since there is no unassign on
> mkdir.

Changed the subject to:

x86,fs/resctrl: Auto assign counters on mkdir and clean up on group removal

>
>>
>> By default, each group requires two counters: one for the MBM total event
>> and one for the MBM local event.
>>
>> If the counters are exhausted, the kernel will log the error message
>> "Unable to allocate counter in domain" in
>> /sys/fs/resctrl/info/last_cmd_status when a new group is created and the
>> counter assignment will fail. However, the creation of a group should not
>> fail due to assignment failures. Users have the flexibility to modify the
>> assignments at a later time.
>>
>> Signed-off-by: Babu Moger <babu.moger@xxxxxxx>
>> ---
>
> ...
>
>> ---
>> arch/x86/kernel/cpu/resctrl/monitor.c | 1 +
>> fs/resctrl/rdtgroup.c | 71 ++++++++++++++++++++++++++-
>> 2 files changed, 70 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
>> index ee0aa741cf6c..053f516a8e67 100644
>> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
>> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
>> @@ -429,6 +429,7 @@ int __init rdt_get_mon_l3_config(struct rdt_resource *r)
>> r->mon.mbm_cntr_assignable = true;
>> cpuid_count(0x80000020, 5, &eax, &ebx, &ecx, &edx);
>> r->mon.num_mbm_cntrs = (ebx & GENMASK(15, 0)) + 1;
>> + r->mon.mbm_assign_on_mkdir = true;
>> }
>>
>> r->mon_capable = true;
>> diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
>> index bf5fd46bd455..128a9db339f3 100644
>> --- a/fs/resctrl/rdtgroup.c
>> +++ b/fs/resctrl/rdtgroup.c
>> @@ -2945,6 +2945,55 @@ static void schemata_list_destroy(void)
>> }
>> }
>>
>> +/**
>> + * rdtgroup_assign_cntrs() - Assign counters to MBM events. Called when
>> + * a new group is created.
>> + * If "mbm_event" mode is enabled, counters are automatically assigned.
>
> "counters are automatically assigned" -> "counters should be automatically assigned
> if the "mbm_assign_on_mkdir" is set"?

Sure.

>
>> + * Each group can accommodate two counters: one for the total event and
>> + * one for the local event. Assignments may fail due to the limited number
>> + * of counters. However, it is not necessary to fail the group creation
>> + * and thus no failure is returned. Users have the option to modify the
>> + * counter assignments after the group has been created.
>> + */
>> +static void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
>> +{
>> + struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
>> +
>> + if (!r->mon_capable)
>> + return;
>> +
>> + if (resctrl_arch_mbm_cntr_assign_enabled(r) && !r->mon.mbm_assign_on_mkdir)
>> + return;
>
> This check is not clear to me. It looks to me as though counter assignment
> will be attempted if !resctrl_arch_mbm_cntr_assign_enabled(r)? Perhaps
> something like:
> if (!r->mon_capable || !resctrl_arch_mbm_cntr_assign_enabled(r) ||
> !r->mon.mbm_assign_on_mkdir)
> return;
>

Yes. Good catch.


>> +
>> + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
>> + resctrl_assign_cntr_event(r, NULL, rdtgrp,
>> + &mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID]);
>
> Switching the namespace like this is confusing to me. rdtgroup_assign_cntrs()
> has prefix rdtgroup_ to indicate it operates on a resource group. It is confusing
> when it switches namespace to call resctrl_assign_cntr_event() that actually assigns
> a specific event to a resource group. I think this will be easier to follow if:
> resctrl_assign_cntr_event() -> rdtgroup_assign_cntr_event()

Sure.

>
>> +
>> + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
>> + resctrl_assign_cntr_event(r, NULL, rdtgrp,
>> + &mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID]);
>> +}
>> +
>> +/*
>> + * rdtgroup_unassign_cntrs() - Unassign the counters associated with MBM events.
>> + * Called when a group is deleted.
>> + */
>> +static void rdtgroup_unassign_cntrs(struct rdtgroup *rdtgrp)
>> +{
>> + struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
>> +
>> + if (!r->mon_capable || !resctrl_arch_mbm_cntr_assign_enabled(r))
>> + return;
>> +
>> + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_TOTAL_EVENT_ID))
>> + resctrl_unassign_cntr_event(r, NULL, rdtgrp,
>> + &mon_event_all[QOS_L3_MBM_TOTAL_EVENT_ID]);
>
> same here, I think this will be easier to follow when namespace is
> consistent:
> resctrl_unassign_cntr_event() -> rdtgroup_unassign_cntr_event()
>

Sure.

>
> Also, the struct rdt_resource parameter should not be needed when
> struct mon_evt is provided and resource can be obtained from mon_evt::rid.
>
>> +
>> + if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
>> + resctrl_unassign_cntr_event(r, NULL, rdtgrp,
>> + &mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID]);
>> +}
>> +
>> static int rdt_get_tree(struct fs_context *fc)
>> {
>> struct rdt_fs_context *ctx = rdt_fc2context(fc);
>
>
> Reinette
>

--
Thanks
Babu Moger