[PATCH v2] smp: Wait for enqueued work regardless of IPI sent

From: Rik van Riel
Date: Thu Jul 03 2025 - 20:31:44 EST


On Thu, 03 Jul 2025 18:56:11 +0200
Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:

> On Wed, Jul 02 2025 at 13:59, Rik van Riel wrote:
> > Thomas, please let me know if you already reverted Yury's patch,
> > and want me to re-send this without the last hunk.
>
> I did so immediately after saying so in my previous reply. It's gone in
> tip and next.

Here is v2 of the patch, with the last hunk removed, and
the changelog adjusted to match the new context.

---8<---
From 2ae6417fa7ce16f1bfa574cbabba572436adbed9 Mon Sep 17 00:00:00 2001
From: Rik van Riel <riel@xxxxxxxxxxx>
Date: Wed, 2 Jul 2025 13:52:54 -0400
Subject: [PATCH] smp: Wait only if work was enqueued

Whenever work is enqueued on a remote CPU, smp_call_function_many_cond()
may need to wait for that work to complete. However, if no work is
enqueued on any remote CPU, because "func" told us to skip all CPUs,
there is no need to wait.

Set run_remote only if work was enqueued on remote CPUs.

Document the difference between "work enqueued" and "CPU needs to be
woken up".

Signed-off-by: Rik van Riel <riel@xxxxxxxxxxx>
Suggested-by: Jann Horn <jannh@xxxxxxxxxx>
Reviewed-by: Yury Norov (NVIDIA) <yury.norov@xxxxxxxxx>
---
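For readers following along, the distinction in the changelog can be
modeled in user space. This is only a rough sketch, not kernel code: the
model_* names, the fixed CPU count, and the pending counter standing in
for the per-CPU call_single_queue are all made up for illustration. The
one property borrowed from the kernel is that llist_add() returns true
only when the list was empty before the add.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4	/* hypothetical, fixed for the sketch */

/* Hypothetical stand-in for a per-CPU call_single_queue. */
struct model_queue {
	int pending;
};

struct model_result {
	bool run_remote;		/* work was enqueued: caller may wait */
	int nr_ipis;			/* CPUs that need to be woken up */
	bool ipi[MODEL_NR_CPUS];
};

typedef bool (*model_cond_fn)(int cpu);

/* Like llist_add(): true only if the queue was empty before the add. */
static bool model_enqueue(struct model_queue *q)
{
	return q->pending++ == 0;
}

static struct model_result model_call_many_cond(struct model_queue *queues,
						int this_cpu,
						model_cond_fn cond)
{
	struct model_result res = { 0 };

	assert(this_cpu >= 0 && this_cpu < MODEL_NR_CPUS);

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (cpu == this_cpu)
			continue;
		/* The condition function may skip this CPU entirely. */
		if (cond && !cond(cpu))
			continue;

		/* Work is enqueued on a remote CPU. */
		res.run_remote = true;

		/* Kick the remote CPU only for the first queued item. */
		if (model_enqueue(&queues[cpu])) {
			res.ipi[cpu] = true;
			res.nr_ipis++;
		}
	}
	return res;
}

/* Two sample condition functions for experimenting with the model. */
static bool model_skip_all(int cpu)
{
	(void)cpu;
	return false;
}

static bool model_only_cpu2(int cpu)
{
	return cpu == 2;
}
```

With model_skip_all, nothing is enqueued and run_remote stays false, so
there is nothing to wait for. With model_only_cpu2 and a queue on CPU 2
that already holds an entry, run_remote is true while nr_ipis stays 0:
work was enqueued, but no wakeup is needed.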
 kernel/smp.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 84561258cd22..c5e1da7a88da 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -802,7 +802,6 @@ static void smp_call_function_many_cond(const struct cpumask *mask,

 	/* Check if we need remote execution, i.e., any CPU excluding this one. */
 	if (cpumask_any_and_but(mask, cpu_online_mask, this_cpu) < nr_cpu_ids) {
-		run_remote = true;
 		cfd = this_cpu_ptr(&cfd_data);
 		cpumask_and(cfd->cpumask, mask, cpu_online_mask);
 		__cpumask_clear_cpu(this_cpu, cfd->cpumask);
@@ -816,6 +815,9 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 				continue;
 			}
 
+			/* Work is enqueued on a remote CPU. */
+			run_remote = true;
+
 			csd_lock(csd);
 			if (wait)
 				csd->node.u_flags |= CSD_TYPE_SYNC;
@@ -827,6 +829,10 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 #endif
 			trace_csd_queue_cpu(cpu, _RET_IP_, func, csd);
 
+			/*
+			 * Kick the remote CPU if this is the first work
+			 * item enqueued.
+			 */
 			if (llist_add(&csd->node.llist, &per_cpu(call_single_queue, cpu))) {
 				__cpumask_set_cpu(cpu, cfd->cpumask_ipi);
 				nr_cpus++;
--
2.49.0