Re: [PATCH 1/6] sched/proxy: Remove superfluous clear_task_blocked_in()
From: Peter Zijlstra
Date: Fri May 29 2026 - 06:09:05 EST
On Fri, May 29, 2026 at 01:28:45PM +0530, K Prateek Nayak wrote:
> Hello John,
>
> On 5/29/2026 12:18 PM, John Stultz wrote:
> > Bascially we can get in a situation where (sorry this gets a bit convoluted):
> >
> > 1) On CPU1, __mutex_lock_common, we set task A
> > blocked_on/TASK_UNINTERRUPTABLE, and call into __schedule().
> >
> > 2) On CPU2, task B who holds the mutex calls __mutex_unlock_slowpath()
> > and sets task A as PROXY_WAKING and starts to call into
> > wake_up_task().
> >
> > 3) On CPU1, in __schedule() we pick_next_task(), which returns task A,
> > which is_blocked. We call find_proxy_task() and note task A is
> > PROXY_WAKING. Since its also current, we take the short-cut and clear
> > is_blocked and blocked_on and return task A to run.
> >
> > 4) On CPU2, try_to_wake_up() hits ttwu_runnable(), and
> > proxy_needs_return() returns false as A->blocked_on is zero.
> >
> > 5) On CPU 1, task A is running, it grabs the lock it was waiting for
> > and exits __mutex_lock_common. It then enters __mutex_lock_common to
> > grab a different mutex that is already locked. It sets itself
> > blocked_on/TASK_INTERRUPTABLE and calls into __schedule()
> >
> > 6) On CPU2, ttwu_runnable() continues, and calls ttwu_do_wakeup(),
> > which clears A->is_blocked and sets the A->__state TASK_RUNNING
> >
> > 7) On CPU3, task C that holds the mutex A is waiting on, calls
> > __mutex_unlock_slowpath, setting A as PROXY_WAKING and calls into
> > wake_up_task()
> >
> > 8) On CPU1, in __schedule() pick_next_task() again returns task A. But
> > is_blocked is now zero, so we just return task A, even though
> > blocked_on is PROXY_WAKING.
> >
> > 9) On CPU1, task A gets back to the __mutex_lock_common() loop, calls
> > set_task_blocked_on() and trips warnings as A->blocked_on is still
> > PROXY_WAKING.
>
> Oh geez! Me tries to visualize:
Thanks!, I too need pictures, prose will forever confuse me :/
>
> CPU1 CPU2 CPU3
> ==== ==== ====
>
> __mutex_lock_common(MutexA)
> set_task_blocked_on(TaskA, MutexA)
> set_current_state(TASK_UNINTERRUPTABLE) __mutex_unlock_slowpath(MutexA)
> ... set_task_blocked_on_waking(TaskA)
> schedule_preempt_disabled() wake_up_process(TaskA)
> __schedule() /* (1) */ ... /* (2) */
>
> if (prev_state &..)
> TaskA->is_blocked = 0;
Should this be: TaskA->is_blocked = 1? Otherwise I'm not following.
> next = TaskA
> find_proxy_task(TaskA)
> /* TaskA-> blocked_on == TASK_WAKING */
> clear_task_blocked_on(TaskA, NULL);
> TaskA->is_blocked = 0;
> ...
> next = TaskA /* (3) */
> rq_unlock(CPU1) try_to_wake_up(TaskA)
> rq_lock(CPU1)
> ttwu_runnable()
> /* TaskA->blocked_on == 0 (4) */
> ...
> set_curent_state(TASK_RUNNING)
> ...
>
> mutex_lock(MutexB)
> __mutex_lock_common(MutexB)
> ...
> set_task_blocked_on(TaskA, MutexB)
> set_current_state(TASK_UNINTERRUPTABLE)
> schedule_preempt_disabled()
> __schedule() /* (5) */ ... __mutex_unlock_slowpath(MutexB)
> ttwu_do_wakeup(TaskA) /* (6) */ set_task_blocked_on_waking(TaskA) /* (7) */
> rq_unlock(CPU1)
>
> rq_lock(CPU1)
> /* TaskA->__state == TASK_RUNNING */
> next = TasKA;
>
> if (TaskA->is_blocked /* False */ && TaskA->blocked_on /* PROXY_WAKING */)
> /* Skip */
> next = TaskA; /* (8) */
>
> set_task_blocked_on(p, MutexB)
>
> !!! p->blocked_on != MutexB !!!
>
>
> Yup! That is a concern too then!
>
> I think we can just squash the PROXY_WAKING removal with "p->is_blocked"
> introduction and a part of this problem should go away since unlocks
> always clear task->blocked_on then.
While staring at this, I noted that the PROXY_WAKING removal patch
should also remove the clear_task_blocked_on() line in the very last
hunk.
That said, I do have a note to double check the lockless access to
p->blocked_on there.
Anyway, yes, the hunk removed in patch 1 cures this by clearing
TaskA->blocked_on (because prev == current == TaskA). I don't think we
need to squash everything into one giant patch over this if we just
reorder things.
The Changelog of patch 1 needs an extra few links to this discussion,
but that should be it, no?