Re: [PATCH net-next 2/2] net/smc: reduce TX slot contention with exclusive wait
From: D. Wythe
Date: Tue May 19 2026 - 02:09:29 EST
On Tue, May 12, 2026 at 10:26:43AM +0200, Paolo Abeni wrote:
> From: AI Reviewer <ai@xxxxxxxxxxx>
>
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://netdev-ai.bots.linux.dev/sashiko/
> ---
> net/smc: reduce TX slot contention with exclusive wait
>
> This patch replaces the non-exclusive wait_event_interruptible_timeout()
> in smc_wr_tx_get_free_slot() with an open-coded exclusive wait loop via
> prepare_to_wait_exclusive(), so wake_up() on a freed TX slot wakes only
> a single waiter instead of the entire wait queue.
>
> > diff --git a/net/smc/smc_wr.c b/net/smc/smc_wr.c
> > index 48037a3d97a3..0a6f2befb0e2 100644
> > --- a/net/smc/smc_wr.c
> > +++ b/net/smc/smc_wr.c
> > @@ -171,17 +173,27 @@ int smc_wr_tx_get_free_slot(struct smc_link *link,
> > if (rc)
> > return rc;
> > } else {
> > - rc = wait_event_interruptible_timeout(
> > - link->wr_tx_wait,
> > - !smc_link_sendable(link) ||
> > - lgr->terminating ||
> > - (smc_wr_tx_get_free_slot_index(link, &idx) != -EBUSY),
> > - SMC_WR_TX_WAIT_FREE_SLOT_TIME);
> > - if (!rc) {
> > - /* timeout - terminate link */
> > - smcr_link_down_cond_sched(link);
> > - return -EPIPE;
> > + rc = 0;
> > + for (;;) {
> > + prepare_to_wait_exclusive(&link->wr_tx_wait, &wait,
> > + TASK_INTERRUPTIBLE);
> > + if (!smc_link_sendable(link) || lgr->terminating ||
> > + smc_wr_tx_get_free_slot_index(link, &idx) != -EBUSY)
> > + break;
> > + timeout = schedule_timeout(timeout);
> > + if (!timeout) {
> > + /* timeout - terminate link */
> > + smcr_link_down_cond_sched(link);
> > + break;
> > + }
> > + if (signal_pending(current)) {
> > + rc = -ERESTARTSYS;
> > + break;
> > + }
>
> [High]
> With exclusive waiters, wake_up() on link->wr_tx_wait wakes only one
> task per freed slot. If that waiter is about to exit via the
> signal_pending() branch, can the freed slot be lost to the remaining
> sleepers?
>
> Consider two exclusive waiters A and B asleep in schedule_timeout():
>
> 1. smc_wr_tx_put_slot() -> wake_up(&link->wr_tx_wait) is effectively
> __wake_up(..., nr_exclusive=1). default_wake_function makes A
> runnable, autoremove_wake_function pulls A off the queue, and
> __wake_up_common stops -- B is not woken.
>
> 2. A signal is delivered to A before it actually runs. Because A is
> already TASK_RUNNING, signal_wake_up only sets TIF_SIGPENDING.
>
> 3. A returns from schedule_timeout() with timeout > 0, so the
> !timeout branch is skipped. signal_pending(current) is true, so
> the loop sets rc = -ERESTARTSYS and breaks without re-testing
> smc_wr_tx_get_free_slot_index() and without issuing any wake_up()
> on the way out.
>
> 4. B stays asleep on the wait queue even though the wr_tx_mask bit
> for the freed slot is still available. With no further
> completions, B sleeps until the full SMC_WR_TX_WAIT_FREE_SLOT_TIME
> expires, at which point smcr_link_down_cond_sched() tears down an
> otherwise healthy link.
>
> The previous wait_event_interruptible_timeout() was non-exclusive, so
> every waiter was woken on each wake_up() and re-evaluated the
> condition, which avoided this class of lost-wakeup.
>
> Should the signal path re-check the slot condition before returning,
> or issue a wake_up() on exit so another exclusive waiter gets a chance
> at the slot?
>
Thanks for the review. All three points are valid.

[High] & [Low]: Both are resolved by a single re-check after finish_wait():
if the loop exited without acquiring a slot (because of a signal or a
timeout), re-evaluate the condition one final time and, if a slot is now
available, claim it and proceed. No wake_up() is needed on exit: the
teardown path already calls wake_up_all() when the link goes down, and
waking the next waiter when no slot exists is pointless.
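
The re-check described above can be modeled in userspace roughly as
follows. This is only a sketch, not the v2 diff: struct model_link,
enum wait_exit, model_get_free_slot_index() and model_after_wait() are
hypothetical stand-ins for the link state, the reason the wait loop
exited, smc_wr_tx_get_free_slot_index() and the code following
finish_wait(). It also assumes that the link-down request moves after
the final re-check and is issued only on a genuine timeout.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the TX slot state of an SMC link.
 * Field and function names are illustrative, not the kernel's.
 */
struct model_link {
	uint32_t tx_mask;	/* bit i set => slot i is in use */
	uint32_t tx_cnt;	/* number of TX slots (at most 32 here) */
	bool sendable;		/* models smc_link_sendable() */
};

/* Why the wait loop was left. */
enum wait_exit { EXIT_COND, EXIT_SIGNAL, EXIT_TIMEOUT };

/* Models smc_wr_tx_get_free_slot_index(): claim the first free slot,
 * or leave *idx == tx_cnt and return an error.
 */
static int model_get_free_slot_index(struct model_link *link, uint32_t *idx)
{
	uint32_t i;

	*idx = link->tx_cnt;
	if (!link->sendable)
		return -ENOLINK;
	for (i = 0; i < link->tx_cnt; i++) {
		if (!(link->tx_mask & (UINT32_C(1) << i))) {
			link->tx_mask |= UINT32_C(1) << i;
			*idx = i;
			return 0;
		}
	}
	return -EBUSY;
}

/* Models the code that would follow finish_wait(). On entry *idx holds
 * the result of the wait loop (tx_cnt means "no slot claimed").
 */
static int model_after_wait(struct model_link *link, enum wait_exit why,
			    uint32_t *idx, bool *link_down)
{
	*link_down = false;
	if (*idx != link->tx_cnt)
		return 0;	/* slot already claimed inside the loop */

	/* Final re-check: a slot may have been freed while we were
	 * leaving the loop because of a signal or a timeout.
	 */
	if (model_get_free_slot_index(link, idx) == 0)
		return 0;

	/* Assumption: teardown is requested only on a real timeout;
	 * this flag stands in for smcr_link_down_cond_sched().
	 */
	if (why == EXIT_TIMEOUT)
		*link_down = true;

	return -EPIPE;		/* same errno callers saw before the patch */
}
```

In this model a waiter that exits on a signal or timeout still picks up
a slot freed in the meantime, so a wake-up that landed on it is not
wasted, and every failure path returns -EPIPE.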
> [Medium]
> The commit message says this is a contention optimization and that
> "teardown behavior is unchanged", but does it also change the errno
> returned to callers on signal delivery?
>
> Before this patch, wait_event_interruptible_timeout() returned
> -ERESTARTSYS on signal; that value is non-zero, so the old if (!rc)
> branch was skipped, control fell through to the trailing
> if (idx == link->wr_tx_cnt) return -EPIPE, and callers saw -EPIPE.
>
> After the patch, the signal path does:
>
> if (signal_pending(current)) {
> rc = -ERESTARTSYS;
> break;
> }
> ...
> finish_wait(&link->wr_tx_wait, &wait);
> if (rc)
> return rc;
>
> so -ERESTARTSYS propagates up through smc_cdc_get_free_slot() (which
> only remaps rc to -EPIPE when conn->killed is set),
> smcr_cdc_get_slot_and_msg_send(), smc_cdc_get_slot_and_msg_send() and
> smc_tx_sendmsg(), meaning userspace sees syscall restart / -EINTR
> instead of -EPIPE.
>
> Note that the sibling smc_wr_reg_send() in the same file explicitly
> converts -ERESTARTSYS to -EINTR. Should this return-code change be
> called out in the commit message (and possibly split out), given that
> the stated intent of the patch is purely to reduce contention?
Agreed. I'll keep the return code as -EPIPE to match the original behavior, so
this patch remains a pure contention optimization with no semantic
change.
> [Low]
> On timeout expiry the new loop takes the teardown path without
> re-testing whether a slot became available:
>
> > + timeout = schedule_timeout(timeout);
> > + if (!timeout) {
> > + /* timeout - terminate link */
> > + smcr_link_down_cond_sched(link);
> > + break;
> > + }
>
> The old wait_event_interruptible_timeout() expansion re-evaluated the
> condition via ___wait_cond_timeout after schedule_timeout() returned
> 0, so a wake_up() racing with timer expiry could still report success
> and let smc_wr_tx_get_free_slot_index() claim the slot.
>
> In the narrow race where a slot is freed at the same instant the
> timer fires, does this now tear the link down even though a slot was
> actually available?
>
> > }
> > + finish_wait(&link->wr_tx_wait, &wait);
> > + if (rc)
> > + return rc;
> > if (idx == link->wr_tx_cnt)
> > return -EPIPE;
> > }
D. Wythe
> --
> This is an AI-generated review.