Re: [syzbot] [kernel?] KASAN: slab-use-after-free Write in flush_tlb_func

From: Jann Horn
Date: Wed Jul 02 2025 - 12:54:00 EST

Next message: Donald Dutile: "Re: [PATCH v9 0/6] KVM: arm64: Map GPU device memory as cacheable"
Previous message: Simon Horman: "Re: [PATCH] net: ipv6: Fix spelling mistake"
In reply to: Rik van Riel: "Re: [syzbot] [kernel?] KASAN: slab-use-after-free Write in flush_tlb_func"
Next in thread: Jann Horn: "Re: [syzbot] [kernel?] KASAN: slab-use-after-free Write in flush_tlb_func"

On Wed, Jul 2, 2025 at 5:24 PM Rik van Riel <riel@xxxxxxxxxxx> wrote:
>
> On Wed, 2025-07-02 at 06:50 -0700, syzbot wrote:
> >
> > The issue was bisected to:
> >
> > commit a12a498a9738db65152203467820bb15b6102bd2
> > Author: Yury Norov [NVIDIA] <yury.norov@xxxxxxxxx>
> > Date: Mon Jun 23 00:00:08 2025 +0000
> >
> > smp: Don't wait for remote work done if not needed in
> > smp_call_function_many_cond()
>
> While that change looks like it would increase the
> likelihood of hitting this issue, it does not look
> like the root cause.
>
> Instead, the stack traces below show that the
> TLB flush code is being asked to flush the TLB
> for an mm that is exiting.
>
> One CPU is running the TLB flush handler, while
> another CPU is freeing the mm_struct.
>
> The CPU that sent the simultaneous TLB flush
> is not visible in the stack traces below,
> but we seem to have various places around the
> MM where we flush the TLB for another mm,
> without taking any measures to protect against
> that mm being freed while the flush is ongoing.

TLB flushes via IPIs on x86 are always synchronous, right?
flush_tlb_func is only referenced from native_flush_tlb_multi() in
calls to on_each_cpu_mask() (with wait=true) or
on_each_cpu_cond_mask() (with wait=1).
So I think this is not an issue, unless you're claiming that we call
native_flush_tlb_multi() with an already-freed info->mm?

And I think the bisected commit really is the buggy one: It looks at
"nr_cpus", which tracks *how many CPUs we have to IPI*, but assumes
that "nr_cpus" tracks *how many CPUs we posted work to*. Those numbers
are not the same: If we post work to a CPU that already had IPI work
pending, we just add a list entry without sending another IPI.
