Re: [PATCH] mm/slub: preserve original size in _kmalloc_nolock_noprof retry path
From: hu.shengming
Date: Thu Jun 04 2026 - 06:47:19 EST
VlastimilBabka(SUSE)<vbabka@xxxxxxxxxx> wrote:
> On 6/4/26 07:46, Harry Yoo wrote:
> >
> >
> > On 6/3/26 10:10 PM, hu.shengming@xxxxxxxxxx wrote:
> >> From: Shengming Hu <hu.shengming@xxxxxxxxxx>
> >>
> >> _kmalloc_nolock_noprof() retries from the next kmalloc bucket when the
> >> initial allocation fails. The retry currently reuses `size` as the
> >> bucket selector and overwrites it with s->object_size + 1.
> >>
> >> That value is later passed as the original allocation size to
> >> __slab_alloc_node(), slab_post_alloc_hook() and kasan_kmalloc(). On a
> >> successful retry this makes KASAN/slub-debug observe the retry bucket
> >> selector rather than the caller requested size, potentially widening the
> >> valid kmalloc range and hiding overflows.
> >
> > Good catch!
> >
> >> Keep a separate `bucket_size` for choosing the retry cache and preserve
> >> `size`.
> >>
> >> Fixes: <af92793e52c3> ("slab: Introduce kmalloc_nolock() and kfree_nolock()")
> >> Signed-off-by: Shengming Hu <hu.shengming@xxxxxxxxxx>
> >> ---
> >
> > I want to note that this conflicts with Vlastimil's work-in-progress
> > feature [1] where it separately stores orig_size in struct
> > slab_alloc_context which naturally solves the problem.
>
> Thanks, didn't realize it was solving this problem as a side-effect, hehe.
>
> > [1]
> > https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/linux.git/log/?h=b4/slab_alloc_flags
> >
> > No strong opinion on whether to queue this patch and then rebase
> > Vlastimil's work on top, or wait for Vlastimil's work to solve the
> > problem instead...
>
> It's better to queue fixes separately first, in case someone wants to
> backport them, even though stable shouldn't be really necessary here.
> This fix could still go to 7.2 but my series only to 7.3.
>
Hi Vlastimil,
I'll send a v2, it should be easy to rebase your
slab_alloc_context work on top.
> >> mm/slub.c | 7 ++++---
> >> 1 file changed, 4 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/mm/slub.c b/mm/slub.c
> >> index 67abbbf68fc1..6a2b3ade3611 100644
> >> --- a/mm/slub.c
> >> +++ b/mm/slub.c
> >> @@ -5350,6 +5350,7 @@ EXPORT_SYMBOL(__kmalloc_noprof);
> >> void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, int node)
> >> {
> >> gfp_t alloc_gfp = __GFP_NOWARN | __GFP_NOMEMALLOC | gfp_flags;
> >> + size_t bucket_size = size;
> >> struct kmem_cache *s;
> >> bool can_retry = true;
> >> void *ret;
> >
> > But in either way, I think it's more straightforward to introduce
> > orig_size as a variable to keep the original size and pass it to
> > __slab_alloc_node().
>
> Agree. Because above, we wouldn't initialize bucket_size to a real bucket
> size, but a <=bucket size so it's misleading.
>
Agreed, it makes sense.
> >> @@ -5372,9 +5373,9 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
> >> return NULL;
> >>
> >> retry:
> >> - if (unlikely(size > KMALLOC_MAX_CACHE_SIZE))
> >> + if (unlikely(bucket_size > KMALLOC_MAX_CACHE_SIZE))
> >> return NULL;
> >> - s = kmalloc_slab(size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
> >> + s = kmalloc_slab(bucket_size, NULL, alloc_gfp, PASS_TOKEN_PARAM(token));
> >>
> >> if (!(s->flags & __CMPXCHG_DOUBLE) && !kmem_cache_debug(s))
> >> /*
> >> @@ -5408,7 +5409,7 @@ void *_kmalloc_nolock_noprof(DECL_TOKEN_PARAMS(size, token), gfp_t gfp_flags, in
> >> */
> >> if (!ret && can_retry) {
> >> /* pick the next kmalloc bucket */
> >> - size = s->object_size + 1;
> >> + bucket_size = s->object_size + 1;
> >> /*
> >> * Another alternative is to
> >> * if (memcg) alloc_gfp &= ~__GFP_ACCOUNT;
> >