[PATCH v26 03/10] sched: Fix potentially missing balancing with Proxy Exec
From: John Stultz
Date: Tue Mar 24 2026 - 15:21:22 EST
K Prateek pointed out that with Proxy Exec, we may have cases
where we context switch in __schedule(), while the donor remains
the same. This could cause balancing issues, since the
put_prev_set_next() logic short-cuts if (prev == next). With
proxy-exec prev is the previous donor, and next is the next
donor. Should the donor remain the same, but different tasks are
picked to actually run, the shortcut will have avoided enqueuing
the sched class balance callback.
So, if we are context switching, add logic to catch the
same-donor case, and trigger the put_prev/set_next calls to
ensure the balance callbacks get enqueued.
Reported-by: K Prateek Nayak <kprateek.nayak@xxxxxxx>
Closes: https://lore.kernel.org/lkml/20ea3670-c30a-433b-a07f-c4ff98ae2379@xxxxxxx/
Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Signed-off-by: John Stultz <jstultz@xxxxxxxxxx>
---
Cc: Joel Fernandes <joelagnelf@xxxxxxxxxx>
Cc: Qais Yousef <qyousef@xxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: Valentin Schneider <vschneid@xxxxxxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
Cc: Ben Segall <bsegall@xxxxxxxxxx>
Cc: Zimuzo Ezeozue <zezeozue@xxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>
Cc: Will Deacon <will@xxxxxxxxxx>
Cc: Waiman Long <longman@xxxxxxxxxx>
Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
Cc: Metin Kaya <Metin.Kaya@xxxxxxx>
Cc: Xuewen Yan <xuewen.yan94@xxxxxxxxx>
Cc: K Prateek Nayak <kprateek.nayak@xxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>
Cc: Suleiman Souhlal <suleiman@xxxxxxxxxx>
Cc: kuyo chang <kuyo.chang@xxxxxxxxxxxx>
Cc: hupu <hupu.gm@xxxxxxxxx>
Cc: kernel-team@xxxxxxxxxxx
---
kernel/sched/core.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index dc044a405f83b..610e48cdb66a9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6829,9 +6829,11 @@ static void __sched notrace __schedule(int sched_mode)
pick_again:
next = pick_next_task(rq, rq->donor, &rf);
- rq_set_donor(rq, next);
rq->next_class = next->sched_class;
if (sched_proxy_exec()) {
+ struct task_struct *prev_donor = rq->donor;
+
+ rq_set_donor(rq, next);
if (unlikely(next->blocked_on)) {
next = find_proxy_task(rq, next, &rf);
if (!next)
@@ -6839,7 +6841,27 @@ static void __sched notrace __schedule(int sched_mode)
if (next == rq->idle)
goto keep_resched;
}
+ if (rq->donor == prev_donor && prev != next) {
+ struct task_struct *donor = rq->donor;
+ /*
+ * When transitioning like:
+ *
+ * prev next
+ * donor: B B
+ * curr: A B or C
+ *
+ * then put_prev_set_next_task() will not have done
+ * anything, since B == B. However, A might have
+ * missed a RT/DL balance opportunity due to being
+ * on_cpu.
+ */
+ donor->sched_class->put_prev_task(rq, donor, donor);
+ donor->sched_class->set_next_task(rq, donor, true);
+ }
+ } else {
+ rq_set_donor(rq, next);
}
+
picked:
clear_tsk_need_resched(prev);
clear_preempt_need_resched();
--
2.53.0.1018.g2bb0e51243-goog