Re: [PATCH net-next v3 0/4] net: move .getsockopt away from __user buffers
From: Stefan Metzmacher
Date: Wed Apr 08 2026 - 09:56:55 EST
On 08.04.26 at 13:26, David Laight wrote:
> On Wed, 08 Apr 2026 03:30:28 -0700
> Breno Leitao <leitao@xxxxxxxxxx> wrote:
>
>> Currently, the .getsockopt callback requires __user pointers:
>>
>>   int (*getsockopt)(struct socket *sock, int level,
>>                     int optname, char __user *optval, int __user *optlen);
>>
>> This prevents kernel callers (io_uring, BPF) from using getsockopt on
>> levels other than SOL_SOCKET, since they pass kernel pointers.
>>
>> Following Linus' suggestion [0], this series introduces sockopt_t, a
>> type-safe wrapper around iov_iter, and a getsockopt_iter callback that
>> works with both user and kernel buffers. AF_PACKET and CAN raw are
>> converted as initial users, with selftests covering the trickiest
>> conversion patterns.
>
> What are you doing about the cases where 'optlen' is a complete lie?
> IIRC there is one related to some form of async io where it is just
> the length of the header, the actual buffer length depends on
> data in the header.
> This doesn't matter with the existing code for applications, when they
> get it wrong they just crash.
> But kernel users will need to pass the actual buffer length separately
> from optlen.
>
> It also affects any code that tries to cache the actual data and copy
> it back to userspace in the syscall wrapper - which makes sense for
> most short getsockopt.
>
> (This is different from historic code where the length might be
> assumed to be 4 regardless of what was passed.)

Since the insane legacy cases only exist to keep compatibility
with existing userspace applications, we could get the original
optval and optlen __user pointers back out of sockopt_t via
something like:

  char __user * __must_check sockopt_get_insane_legacy_optval(sockopt_t *sopt);
  int __user * __must_check sockopt_get_insane_legacy_optlen(sockopt_t *sopt);

For kernel callers these would return NULL, and the calling code
should turn that into -EINVAL or something similar.

That way the legacy code can do what it needs, while most handlers
stay sane and can be called via io_uring and by in-kernel users.
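
A minimal userspace sketch of that idea, to make the control flow
concrete. Everything here is an assumption for illustration: the struct
layout is a stand-in (the sockopt_t in the series wraps an iov_iter),
the kernel annotations __user and __must_check are left out so it
compiles outside the kernel, and legacy_getsockopt() is a hypothetical
handler.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for sockopt_t; the real type wraps an iov_iter. */
typedef struct {
	char *user_optval;	/* original optval, NULL for kernel callers */
	int *user_optlen;	/* original optlen, NULL for kernel callers */
} sockopt_t;

/* Hand back the original optval pointer, or NULL for a kernel caller. */
char *sockopt_get_insane_legacy_optval(sockopt_t *sopt)
{
	assert(sopt != NULL);
	return sopt->user_optval;
}

/* Hand back the original optlen pointer, or NULL for a kernel caller. */
int *sockopt_get_insane_legacy_optlen(sockopt_t *sopt)
{
	assert(sopt != NULL);
	return sopt->user_optlen;
}

/* Hypothetical legacy handler that insists on the raw pointers. */
int legacy_getsockopt(sockopt_t *sopt)
{
	char *optval = sockopt_get_insane_legacy_optval(sopt);
	int *optlen = sockopt_get_insane_legacy_optlen(sopt);

	/* Kernel callers have no legacy pointers: refuse the option. */
	if (!optval || !optlen)
		return -EINVAL;

	/* From here on the handler applies its own legacy length rules. */
	if (*optlen < 1)
		return -EINVAL;
	optval[0] = 42;
	*optlen = 1;
	return 0;
}
```

With a sockopt_t built from a userspace-style call the handler succeeds;
with both pointers NULL, as for an io_uring or BPF caller, it returns
-EINVAL without touching any buffer.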

Unrelated to the legacy stuff, I think the writeback of optlen
should be opt-in (or at least opt-out).
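
One possible reading of the opt-in variant, again as a userspace sketch
and not the API from the series: the handler sets a flag when it wants
the generic wrapper to write optlen back, and the wrapper leaves the
caller's optlen alone otherwise. struct sockopt_sketch, the
writeback_optlen flag and all function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical per-call state; not the sockopt_t from the series. */
struct sockopt_sketch {
	char buf[16];		/* kernel-side staging buffer */
	int optlen;		/* in: caller's length, out: bytes produced */
	bool writeback_optlen;	/* set by the handler to opt in */
};

/* Handler that wants the produced length reported back: opts in. */
int handler_with_writeback(struct sockopt_sketch *sopt)
{
	int val = 7;

	if (sopt->optlen < (int)sizeof(val))
		return -EINVAL;
	memcpy(sopt->buf, &val, sizeof(val));
	sopt->optlen = (int)sizeof(val);
	sopt->writeback_optlen = true;
	return 0;
}

/* Handler that does not opt in: the caller's optlen stays untouched. */
int handler_without_writeback(struct sockopt_sketch *sopt)
{
	int val = 9;

	if (sopt->optlen < (int)sizeof(val))
		return -EINVAL;
	memcpy(sopt->buf, &val, sizeof(val));
	sopt->optlen = (int)sizeof(val);
	return 0;
}

/* Generic wrapper: always copies the data, optlen only on request. */
int getsockopt_sketch(int (*handler)(struct sockopt_sketch *),
		      char *optval, int *optlen)
{
	struct sockopt_sketch sopt = { .optlen = *optlen };
	int err;

	assert(handler && optval);
	if (*optlen < 0 || *optlen > (int)sizeof(sopt.buf))
		return -EINVAL;

	err = handler(&sopt);
	if (err)
		return err;

	memcpy(optval, sopt.buf, (size_t)sopt.optlen);
	if (sopt.writeback_optlen)
		*optlen = sopt.optlen;
	return 0;
}
```

An opt-out variant would flip the default: the flag starts out true and
a handler clears it when the caller's optlen must not be modified.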
metze