Re: [PATCH] prctl: require checkpoint_restore_ns_capable for PR_SET_MM_MAP

From: David Hildenbrand (Arm)

Date: Wed Apr 08 2026 - 08:36:03 EST


On 4/3/26 05:54, Qi Tang wrote:
> On Thu, Apr 2, 2026, Andrei Vagin wrote:
>> An approach is to eliminate the CAP_SYS_RESOURCE check but pass all
>> new values in one bundle, which would allow the kernel to make
>> more intensive tests for sanity of values and at the same time allow
>> us to support checkpoint/restore of user namespaces.
>>
>> The initial implementation of PR_SET_MM_MAP didn't have the
>> capability check.
>
> This clears up the history. The two paths have different
> permission models by design, not by accident.
>
> On Thu, Apr 2, 2026, Lorenzo Stoakes wrote:
>> But if it's your process does it really matter? You can
>> manipulate memory all over the place in your process...
>
> I went back and checked each impact I claimed. The SELinux
> execheap bypass does not work because file_map_prot_check()
> still enforces PROCESS__EXECMEM on anonymous mappings
> regardless of start_brk. The procfs paths use
> access_remote_vm() which safely returns zero for unmapped
> addresses. auxv only affects the process itself. So yes,
> it doesn't really matter.
>
> I should have verified these claims more carefully before
> sending the patch. Lesson learned.
>
> Please drop this patch.
>
> That said, the man page still documents PR_SET_MM as requiring
> CAP_SYS_RESOURCE, and the individual field path enforces it
> while the MAP path does not. Might be worth a man-pages fix
> or a code comment to make the intent explicit, but that's a
> separate cleanup.

Would you have time to improve the man page? :)

Also, I think it might be helpful to add some comments in this code
explaining *why* it is okay to not do any capability checks. Can you look
into that as well?
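
To make the asymmetry concrete, here is a rough userspace model of which
PR_SET_MM sub-options take the CAP_SYS_RESOURCE-checked path. It is only a
sketch, not the kernel code: the helper name is hypothetical, and treating
PR_SET_MM_MAP_SIZE the same as PR_SET_MM_MAP is my assumption about the
bundle path rather than something stated in this thread.

```c
#include <stdbool.h>
#include <linux/prctl.h>	/* PR_SET_MM_* sub-option constants */

/*
 * Hypothetical userspace model, not kernel code.
 *
 * Returns true when the given PR_SET_MM sub-option goes through the
 * individual-field path, which enforces CAP_SYS_RESOURCE, and false
 * when it goes through the bundle (MAP) path, which does not.
 */
static bool mm_opt_needs_cap_sys_resource(int opt)
{
	switch (opt) {
	case PR_SET_MM_MAP:
	case PR_SET_MM_MAP_SIZE:	/* assumption: handled like PR_SET_MM_MAP */
		/*
		 * All values arrive in one bundle, so they can be
		 * sanity-checked together instead of gating on a capability.
		 */
		return false;
	default:
		/* e.g. PR_SET_MM_START_BRK, PR_SET_MM_AUXV, ... */
		return true;
	}
}
```

A comment along these lines next to the real dispatch would document that
the difference is intentional.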

--
Cheers,

David