Re: [PATCH v1] irq_work: Fix use-after-free in irq_work_single on PREEMPT_RT
From: Steven Rostedt
Date: Wed Mar 25 2026 - 11:57:35 EST
On Wed, 25 Mar 2026 11:05:04 +0800
Jiayuan Chen <jiayuan.chen@xxxxxxxxx> wrote:
Hi Jiayuan,
Adding "v1" to your first patch doesn't show much confidence in the code,
as it implies there will definitely be a v2. It's kind of like introducing
your spouse as, "Hi, this is my first wife". ;-)
> On PREEMPT_RT, non-HARD irq_work runs in a per-CPU kthread, so
> irq_work_sync() uses rcuwait (sleeping) to wait for BUSY==0.
>
> After irq_work_single() clears BUSY via atomic_cmpxchg(), an
> irq_work_sync() caller on another CPU that enters *after* BUSY is
> cleared can observe BUSY==0 immediately (without sleeping), return,
> and free the work. Meanwhile irq_work_single() still dereferences
> @work for irq_work_is_hard() and rcuwait_wake_up(), causing a
> use-after-free.
>
> Note: if a sync waiter is actually sleeping, @work is still alive
> (it can't be freed until the waiter returns), so there is no UAF in
> that case. The UAF only occurs when sync checks BUSY==0 without
> going through schedule().
>
> Fix this by extracting and pinning the irq_work_sync waiter's
> task_struct (if any) while BUSY is still set and @work is guaranteed
> alive. After clearing BUSY, wake the pinned task directly without
> touching @work.
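For readers following along, the pin-then-clear ordering described above can
be modeled in user space like this. All names (fake_work, fake_task, etc.)
are invented for illustration; plain C11 atomics stand in for the kernel
primitives, the refcount stands in for get_task_struct()/put_task_struct(),
and the wakeup is reduced to a flag:

```c
/* User-space sketch of the fix pattern, NOT kernel code: pin the
 * waiter's task reference while BUSY still guarantees the work item
 * is alive, clear BUSY, then wake through the pinned reference only. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define FAKE_IRQ_WORK_BUSY 2

struct fake_task {
	atomic_int refcount;	/* stands in for task_struct usage count */
	atomic_bool woken;
};

struct fake_work {
	atomic_int flags;		/* FAKE_IRQ_WORK_BUSY lives here */
	struct fake_task *waiter;	/* stands in for irqwait.task */
};

static void get_task(struct fake_task *t) { atomic_fetch_add(&t->refcount, 1); }
static void put_task(struct fake_task *t) { atomic_fetch_sub(&t->refcount, 1); }

/* Mirrors the patched tail of irq_work_single(): pin, clear BUSY, wake. */
static void work_finish(struct fake_work *w)
{
	struct fake_task *waiter = w->waiter;	/* @w still alive: BUSY set */

	if (waiter)
		get_task(waiter);		/* pin before dropping BUSY */

	/* After this, @w may be freed by a syncing thread at any time. */
	atomic_fetch_and(&w->flags, ~FAKE_IRQ_WORK_BUSY);

	if (waiter) {				/* no touch of @w past here */
		atomic_store(&waiter->woken, true);
		put_task(waiter);
	}
}

int demo(void)
{
	struct fake_task task = { .refcount = 1, .woken = false };
	struct fake_work *w = malloc(sizeof(*w));

	atomic_init(&w->flags, FAKE_IRQ_WORK_BUSY);
	w->waiter = &task;
	work_finish(w);
	free(w);	/* safe: work_finish() is done with *w by now */

	/* waiter was woken via the pinned reference, refcount balanced */
	return (atomic_load(&task.woken) &&
		atomic_load(&task.refcount) == 1) ? 0 : 1;
}
```

The essential point the model captures is that every dereference of the
work item happens before the flag-clearing store, and the wakeup afterward
goes only through the reference taken while the item was pinned.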
>
> Signed-off-by: Jiayuan Chen <jiayuan.chen@xxxxxxxxx>
> ---
> kernel/irq_work.c | 29 ++++++++++++++++++++++++++---
> 1 file changed, 26 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/irq_work.c b/kernel/irq_work.c
> index 73f7e1fd4ab4..b10b75d1cc09 100644
> --- a/kernel/irq_work.c
> +++ b/kernel/irq_work.c
> @@ -200,6 +200,7 @@ bool irq_work_needs_cpu(void)
>
> void irq_work_single(void *arg)
> {
> + struct task_struct *waiter = NULL;
> struct irq_work *work = arg;
> int flags;
>
> @@ -221,15 +222,37 @@ void irq_work_single(void *arg)
> work->func(work);
> lockdep_irq_work_exit(flags);
>
> + /*
> + * Extract and pin the irq_work_sync() waiter before clearing
> + * BUSY. Once BUSY is cleared, @work may be freed immediately
> + * by a sync caller that observes BUSY==0 without sleeping, so
> + * @work must not be dereferenced after the cmpxchg below.
> + */
> + if ((IS_ENABLED(CONFIG_PREEMPT_RT) && !irq_work_is_hard(work)) ||
> + !arch_irq_work_has_interrupt()) {
Unrelated to this patch, but the above if conditional should be defined as
a macro as it is used in more than one place, and they do need to match.
Maybe something like:
#define is_irq_work_threaded(work) \
	((IS_ENABLED(CONFIG_PREEMPT_RT) && !irq_work_is_hard(work)) || \
	 !arch_irq_work_has_interrupt())
But again, that isn't related to this patch.
> + rcu_read_lock();
> + waiter = rcu_dereference(work->irqwait.task);
Hmm, is it proper to access the internals of an rcuwait type?
Paul, is there a more proper way to do this?
> + if (waiter)
> + get_task_struct(waiter);
> + rcu_read_unlock();
> + }
> +
> /*
> * Clear the BUSY bit, if set, and return to the free state if no-one
> * else claimed it meanwhile.
> */
> (void)atomic_cmpxchg(&work->node.a_flags, flags, flags & ~IRQ_WORK_BUSY);
>
> - if ((IS_ENABLED(CONFIG_PREEMPT_RT) && !irq_work_is_hard(work)) ||
> - !arch_irq_work_has_interrupt())
> - rcuwait_wake_up(&work->irqwait);
> + /*
> + * @work must not be dereferenced past this point. Wake the
> + * pinned waiter if one was sleeping; if none was sleeping,
> + * either irq_work_sync() has not been called or it will
> + * observe BUSY==0 on its own.
> + */
> + if (waiter) {
> + wake_up_process(waiter);
Yeah, this is open coding the rcuwait_wake_up().
I'm not sure we want to do this.
Perhaps RCU could provide a way to save the rcuwait?
struct rcuwait rcuwait = __RCUWAIT_INITIALIZER(rcuwait);
[..]
rcuwait_copy(&rcuwait, &work->irqwait);
[..]
rcuwait_wake_up(&rcuwait);
?
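To make the shape of that idea concrete, here's a rough user-space model.
rcuwait_copy() does not exist in the kernel today; every name here is
invented, and plain atomic loads/stores stand in for rcu_dereference()
and rcu_assign_pointer():

```c
/* User-space sketch of a hypothetical rcuwait_copy(): snapshot the
 * waiter pointer out of the work's rcuwait into a local copy while the
 * work is still alive, then wake through the copy after the work's
 * memory may already be gone. */
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_task { atomic_bool woken; };

struct mock_rcuwait { _Atomic(struct mock_task *) task; };

static void mock_rcuwait_copy(struct mock_rcuwait *dst,
			      struct mock_rcuwait *src)
{
	/* Snapshot the waiter; caller must guarantee @src is alive here. */
	atomic_store(&dst->task, atomic_load(&src->task));
}

static void mock_rcuwait_wake_up(struct mock_rcuwait *w)
{
	struct mock_task *t = atomic_load(&w->task);

	if (t)
		atomic_store(&t->woken, true);	/* wake_up_process() stand-in */
}

int demo2(void)
{
	struct mock_task task = { .woken = false };
	struct mock_rcuwait work_wait;	/* lives inside the work item */
	struct mock_rcuwait local;	/* stack copy in irq_work_single() */

	atomic_store(&work_wait.task, &task);
	mock_rcuwait_copy(&local, &work_wait);

	/* ... BUSY cleared here; the work (and work_wait) may be freed ... */
	atomic_store(&work_wait.task, NULL);	/* simulate the work going away */

	mock_rcuwait_wake_up(&local);	/* safe: only touches the copy */
	return atomic_load(&task.woken) ? 0 : 1;
}
```

Note the open question the model glosses over: a bare copy only keeps the
rcuwait itself reachable, not the task it points to (which could exit in
the window), so whether this is sufficient without a grab on the task is
exactly the kind of thing Paul would need to weigh in on.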
> + put_task_struct(waiter);
> + }
> }
>
> static void irq_work_run_list(struct llist_head *list)

-- Steve