Re: [RFC PATCH] fs/resctrl: Fix use-after-free during unmount
From: Reinette Chatre
Date: Thu May 14 2026 - 18:45:26 EST
Hi Tony,
On 5/14/26 3:23 PM, Luck, Tony wrote:
> On Thu, May 14, 2026 at 02:45:10PM -0700, Reinette Chatre wrote:
>> On 5/13/26 3:40 PM, Tony Luck wrote:
...
>> Another alternative to consider is to not call mon_put_kn_priv() on unmount but
>> instead on resctrl_exit()? Thus treating it similar to the RMID LRU list.
>> This may be more complicated in the long term since it needs more care to ensure
>> needed state is still available to a resctrl file reader that was blocked because of
>> unmount or failure (via resctrl_exit()).
>
> Pushing to resctrl_exit() is currently saying we don't care about the
> leaked allocation (since resctrl_exit() is never called - actually
> discarded). Cleaning up on unmount now means one less thing to do if we
> ever make resctrl a loadable module.
Ack. One correction: resctrl_exit() is now called by MPAM as part of its
error handling on receipt of a special error IRQ.
>
>>> }
>>>
>>> static void _update_task_closid_rmid(void *task)
>>> @@ -2965,6 +2966,8 @@ static void resctrl_fs_teardown(void)
>>> mon_put_kn_priv();
>>> rdt_pseudo_lock_release();
>>> rdtgroup_default.mode = RDT_MODE_SHAREABLE;
>>> + if (atomic_read(&rdtgroup_default.waitcount) != 0)
>>> + rdtgroup_default.flags = RDT_DELETED;
>>
>> sashiko found a race here ... looks like setting RDT_DELETED unconditionally would
>> help.
>
> Yes - as long as you are OK with the asymmetry between the default group
> and regular groups. I think it is OK because there are already many
> special cases for the default group.
I assume you mean that, in an equivalent scenario, a dynamically allocated regular group
is freed when it has no waiters. Since the default group is not dynamically allocated,
it cannot be freed, so it cannot be fully symmetrical here. I think an unconditional
RDT_DELETED is appropriate and matches how the default group is never freed.
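To illustrate why the unconditional form looks safer, here is a rough userspace
sketch. It is not kernel code: struct model_group, the teardown_*() helpers, and
late_reader_sees_live() are hypothetical stand-ins, and the interleaving it models
(a reader that bumps waitcount only after teardown has already sampled it) is my
assumption about the kind of race that was reported, not a reproduction of it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RDT_DELETED 1

/* Simplified stand-in for struct rdtgroup; only the two fields discussed. */
struct model_group {
	atomic_int waitcount;
	int flags;
};

/* Teardown as in the RFC hunk: mark deleted only if a waiter is visible. */
static void teardown_conditional(struct model_group *g)
{
	if (atomic_load(&g->waitcount) != 0)
		g->flags = RDT_DELETED;
}

/* Teardown as suggested in this thread: always mark deleted. */
static void teardown_unconditional(struct model_group *g)
{
	g->flags = RDT_DELETED;
}

/*
 * Hypothetical late reader: it registers as a waiter only after teardown
 * has sampled waitcount, then checks whether the group still looks live.
 * Locking is deliberately not modeled.
 */
static bool late_reader_sees_live(struct model_group *g)
{
	bool live;

	atomic_fetch_add(&g->waitcount, 1);
	live = !(g->flags & RDT_DELETED);
	atomic_fetch_sub(&g->waitcount, 1);
	return live;
}

/* Run teardown first, then the late reader; report if stale access is possible. */
static bool stale_access_possible(void (*teardown)(struct model_group *))
{
	struct model_group g = { .flags = 0 };

	atomic_init(&g.waitcount, 0);
	teardown(&g);
	return late_reader_sees_live(&g);
}
```

In this model, stale_access_possible(teardown_conditional) is true because the
flag is never set when no waiter was visible at the time of the check, while
stale_access_possible(teardown_unconditional) is false.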
>
>>> closid_exit();
>>> schemata_list_destroy();
>>> rdtgroup_destroy_root();
>>> @@ -4291,6 +4294,7 @@ static int rdtgroup_setup_root(struct rdt_fs_context *ctx)
>>>
>>> ctx->kfc.root = rdt_root;
>>> rdtgroup_default.kn = kernfs_root_to_node(rdt_root);
>>> + rdtgroup_default.flags = 0;
>>>
>>> return 0;
>>> }
>>
>> The "permanent lock leak" issue reported by sashiko is not clear to me. It claims:
>>
>> ---8<---
>> In rdtgroup_mondata_show(), if rdtgroup_kn_lock_live() returns NULL, the
>> error path jumps to the out label:
>> out:
>> if (rdtgrp)
>> rdtgroup_kn_unlock(of->kn);
>> Because rdtgrp is NULL, the unlock is skipped, leaving the locks permanently
>> held.
>> ---8<---
>>
>> Comparing the claim to actual code the snippet looks like a mismatch since
>> rdtgroup_mondata_show() actually looks like:
>> out:
>> rdtgroup_kn_unlock(of->kn);
>
> Yes. Looks like a problem in hallucinated code.
Thank you very much for the sanity check.
Reinette