Re: [PATCH net] net: ovpn: refuse TCP socket layering with an active ULP

From: Sabrina Dubroca

Date: Tue May 19 2026 - 10:37:14 EST


2026-05-17, 11:56:45 -0400, Michael Bommarito wrote:
> ovpn's TCP transport replaces sk_prot, sk_data_ready, sk_write_space
> and sk_socket->ops by direct field writes when a peer is attached, and
> restores them by direct field writes when the peer is detached. That
> layering scheme is not compatible with a TCP ULP (e.g. kTLS) being
> installed on the same socket: ULP setup also captures the current
> sk_prot as its lower layer and replaces sk_prot with its own. The two
> schemes are not aware of each other and can be combined in either
> order on the same fd.

It looks like we should make ovpn-tcp a ULP (without requesting
userspace to do the setsockopt, since we messed that up a year ago. I
thought I had suggested that at some point, but maybe not). That's
what it is anyway, and layering ovpn on top of (or underneath) ktls or
espintcp makes no sense. It would save us the messy restore hacks in
this patch.

mptcp calls tcp_set_ulp() from the setup, maybe ovpn can also do
that. detach() may be an issue.


[...]
> @@ -583,6 +610,26 @@ static void ovpn_tcp_close(struct sock *sk, long timeout)
> sock = rcu_dereference_sk_user_data(sk);
> if (!sock || !sock->peer || !ovpn_peer_hold(sock->peer)) {
> rcu_read_unlock();
> + /* No (or no-longer-attached) ovpn state on this socket.
> + * That happens when ovpn_tcp_socket_detach() already ran
> + * and cleared sk_user_data, but the caller chain still
> + * dispatches ->close() on ovpn_tcp_prot. The two cases
> + * that exhibit this are:
> + * - a TCP ULP layered on top of ovpn whose close hook
> + * chains via the captured ctx->sk_proto (which is
> + * &ovpn_tcp_prot) after detach has run;
> + * - the close-vs-detach race window where sk_user_data
> + * has been cleared but sk_prot has not yet been
> + * restored.
> + * In both cases the TCP layer still owes the peer a
> + * graceful close, so chain to the base TCP close.
> + */

This is a whole commit message, not a code comment.

> +#if IS_ENABLED(CONFIG_IPV6)
> + if (sk->sk_family == AF_INET6)
> + tcpv6_prot.close(sk, timeout);
> + else
> +#endif
> + tcp_prot.close(sk, timeout);

Isn't that a separate problem? (detach racing with close)
Or it only appears after the other changes because you removed
clearing the CBs?

--
Sabrina