[PATCH net-next v3 2/3] tcp: keep scaled no-shrink window representable
From: Wesley Atwell
Date: Tue Mar 24 2026 - 16:53:36 EST
In the scaled no-shrink path, __tcp_select_window() currently rounds the
raw free-space value up to the receive-window scale quantum.
When the raw, backed free_space sits just below the next quantum, that
round-up can expose fresh sender-visible credit beyond the currently
backed receive space.
Fix this by keeping tp->rcv_wnd representable in scaled units: round
larger windows down to the scale quantum and preserve only the small
non-zero case that would otherwise scale away to zero.
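As a rough illustration of the difference, the two rounding rules can be
sketched in plain C outside the kernel. The helper names
old_scaled_window() and new_scaled_window() are hypothetical and exist
only for this example; the bit masks stand in for the kernel's ALIGN()
and round_down() macros, assuming a power-of-two granularity.

```c
/* Hypothetical userspace helpers, not kernel functions. */

/* Old rule: round free_space up to the next multiple of 1 << wscale. */
static int old_scaled_window(int free_space, unsigned int wscale)
{
	int granularity = 1 << wscale;

	return (free_space + granularity - 1) & ~(granularity - 1);
}

/* New rule: round down once at least one quantum is available, and keep
 * a single quantum only for the small non-zero case.
 */
static int new_scaled_window(int free_space, unsigned int wscale)
{
	int granularity = 1 << wscale;

	if (free_space >= granularity)
		return free_space & ~(granularity - 1);
	if (free_space > 0)
		return granularity;
	return 0;
}
```

With wscale = 7 (a quantum of 128 bytes) and free_space = 1000, the old
rule yields 1024, which is 24 bytes more than is backed, while the new
rule yields 896. For free_space = 100 both rules yield 128.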
This series intentionally leaves that smaller longstanding non-zero case
unchanged. The proven bug and the new reproducer are both in the
larger-window path where free_space is at least one scale quantum, so
changing 0 < free_space < granularity into zero would be a separate
behavior change.
That representability matters across ACK transitions too, not only on
the immediate raw-free_space-limited ACK. tcp_select_window() preserves
the currently offered window when shrinking is disallowed, so if an
earlier ACK stores a rounded-up value in tp->rcv_wnd, a later
raw-free_space-limited ACK can keep inheriting that extra unit.
Keeping tp->rcv_wnd representable throughout the scaled no-shrink path
prevents that carry-forward and makes later no-shrink decisions reason
from a right edge the peer could actually have seen on the wire.
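The carry-forward can be sketched with a deliberately simplified model.
The function next_offer() and its round_up flag are hypothetical, and
the model assumes that no data arrives between the two ACKs, so the
currently offered window is simply the previously stored value. It is
not the real tcp_select_window() logic, only the "keep the current
offer when shrinking is disallowed" step applied to each rounding rule.

```c
/* Hypothetical simplified model, not kernel code.
 * offered:  window stored by the previous ACK
 * round_up: non-zero selects the old round-up rule, zero the new rule
 */
static int next_offer(int offered, int free_space, unsigned int wscale,
		      int round_up)
{
	int granularity = 1 << wscale;
	int selected;

	if (round_up)
		selected = (free_space + granularity - 1) & ~(granularity - 1);
	else if (free_space >= granularity)
		selected = free_space & ~(granularity - 1);
	else
		selected = free_space > 0 ? granularity : 0;

	/* No-shrink: never offer less than what is already offered. */
	return selected < offered ? offered : selected;
}
```

Under these assumptions, with wscale = 7, a first ACK at free_space =
1000 stores 1024 under the old rule and 896 under the new one. A second
ACK at free_space = 890 then keeps 1024 under the old rule, still above
the 1000 bytes that were ever backed, and keeps 896 under the new rule.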
This removes the larger-window quantization slack while preserving the
small non-zero case needed to avoid scaling away to zero.
Signed-off-by: Wesley Atwell <atwellwea@xxxxxxxxx>
---
v3:
- keep granularity in signed int space so the free_space comparison
  stays type-safe
v2:
- rename gran to granularity
- clarify why representable tp->rcv_wnd state is required across later
  no-shrink transitions
- clarify that this series still intentionally leaves the smaller
  longstanding non-zero case unchanged
net/ipv4/tcp_output.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 35c3b0ab5a0cb714155d5720fe56888f71aecced..5fc0e0d22f10bf56ece1be536b75013768112acf 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3375,13 +3375,19 @@ u32 __tcp_select_window(struct sock *sk)
 	 * scaled window will not line up with the MSS boundary anyway.
 	 */
 	if (tp->rx_opt.rcv_wscale) {
-		window = free_space;
+		int granularity = 1 << tp->rx_opt.rcv_wscale;
 
-		/* Advertise enough space so that it won't get scaled away.
-		 * Import case: prevent zero window announcement if
-		 * 1<<rcv_wscale > mss.
+		/* Keep tp->rcv_wnd representable in scaled units so later
+		 * no-shrink decisions reason about the same right edge we
+		 * can advertise on the wire. Preserve only a small non-zero
+		 * offer that would otherwise get scaled away to zero.
 		 */
-		window = ALIGN(window, (1 << tp->rx_opt.rcv_wscale));
+		if (free_space >= granularity)
+			window = round_down(free_space, granularity);
+		else if (free_space > 0)
+			window = granularity;
+		else
+			window = 0;
 	} else {
 		window = tp->rcv_wnd;
 		/* Get the largest window that is a nice multiple of mss.
--
2.43.0