Re: [PATCH 00/11] x86,fs/resctrl: Improve resctrl quality and consistency

From: Ben Horgan

Date: Mon Mar 16 2026 - 13:49:30 EST


Hi Reinette,

On 3/2/26 18:46, Reinette Chatre wrote:
> Hi Everybody,
>
> This is a collection of resctrl cleanups assembled together for convenience
> and simpler tracking. I'd be happy to split them up if it makes review and/or
> handling easier.
>
> Summary of changes:
>
> - Let resctrl pass stricter checks from various tools to provide a cleaner
>   baseline with the goal to promote healthier contributions:
>   - ./tools/docs/kernel-doc -Wall -v <files>
>   - Build with W=12
>   - ./scripts/coccicheck
>   - Static checkers
>
> - Use accurate and consistent type for all uses of resource ID.
>
> - In the unlikely scenario that resctrl picked a wrong CPU to read an event
>   from, pass the error through to user space instead of claiming to succeed
>   and returning a (wrong) result.
>
> - Since inception of last_cmd_status feature there have been mismatches
>   between resctrl file operation failures and the contents of
>   info/last_cmd_status. This pattern keeps propagating with each new resctrl
>   feature. Establish a new baseline with a new pattern that ensures
>   info/last_cmd_status contains an accurate failure description that matches
>   the most recent resctrl file operation failure.
>
One related issue I've just noticed is that when ABMC and mbm_assign_on_mkdir are
enabled, the creation of MON/CTRL_MON directories may succeed even though an error
message is written to last_cmd_status. E.g.

/sys/fs/resctrl# mkdir mon_groups/new5
/sys/fs/resctrl# cat info/last_cmd_status
Failed to allocate counter for mbm_total_bytes in domain 2

The failure is ignored, as expected, in rdtgroup_assign_cntrs(), but last_cmd_status
is never cleared afterwards. I think this could be fixed by:

diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 62edb464410a..396f17ed72c6 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -1260,6 +1260,8 @@ void rdtgroup_assign_cntrs(struct rdtgroup *rdtgrp)
 	if (resctrl_is_mon_event_enabled(QOS_L3_MBM_LOCAL_EVENT_ID))
 		rdtgroup_assign_cntr_event(NULL, rdtgrp,
 					   &mon_event_all[QOS_L3_MBM_LOCAL_EVENT_ID]);
+
+	rdt_last_cmd_clear();
 }

Is this the right thing to do? Let me know if you want a proper patch.
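
In case it helps the discussion, here is a rough user-space model of the
behaviour. It is only a sketch and not kernel code: every model_* name is
made up for illustration, the counter allocation is reduced to a single
"free counters" number, and I'm assuming an empty status buffer reads back
as "ok". It shows the group creation returning success while the status
still carries the ignored failure, and the same sequence with a clear after
the assignment step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the buffer behind info/last_cmd_status. */
static char last_cmd_status[128];

static void model_last_cmd_clear(void)
{
	last_cmd_status[0] = '\0';
}

/* Pretend counter assignment: fails whenever no free counter is left. */
static int model_assign_cntr(int free_cntrs, const char *evt, int dom_id)
{
	if (free_cntrs > 0)
		return 0;
	snprintf(last_cmd_status, sizeof(last_cmd_status),
		 "Failed to allocate counter for %s in domain %d\n", evt, dom_id);
	return -1;
}

/*
 * Model of group creation with assign-on-mkdir: the assignment result is
 * deliberately ignored, so the function always reports success (0).
 * clear_after_assign selects whether the status is cleared afterwards.
 */
static int model_mkdir(int free_cntrs, bool clear_after_assign)
{
	assert(free_cntrs >= 0);
	model_last_cmd_clear();	/* each file operation starts from a clean status */
	(void)model_assign_cntr(free_cntrs, "mbm_total_bytes", 2);
	if (clear_after_assign)
		model_last_cmd_clear();
	return 0;
}

/* Returns the current status text, or "ok" when the buffer is empty. */
static const char *model_last_cmd_status(void)
{
	return last_cmd_status[0] ? last_cmd_status : "ok\n";
}
```

With no free counters, model_mkdir(0, false) returns 0 but leaves the
"Failed to allocate counter" text behind, whereas model_mkdir(0, true)
returns 0 and leaves the status empty.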


Thanks,

Ben