[BUG io_uring] Failed RECVSEND_BUNDLE can persistently shrink non-INC pbuf ring len and affect later READ operations
From: Federico Brasili
Date: Sun Jun 07 2026 - 07:44:31 EST
Hi,
I found a reproducible io_uring provided-buffer ring issue on Ubuntu
kernel 7.0.0-22-generic.
A failed IORING_RECVSEND_BUNDLE receive on a non-INC provided-buffer
ring can persistently shrink the user-visible buffer descriptor
length. The modified length is not rolled back when the receive fails
with -EAGAIN/no data, and a later unrelated io_uring operation, such
as IORING_OP_READ from a pipe, consumes the corrupted length.
I have not demonstrated privilege escalation. The demonstrated impact
is deterministic corruption of provided-buffer ring metadata by an
unprivileged user, carried across unrelated io_uring operations.
Tested kernel:
  Linux ubuntu 7.0.0-22-generic #22-Ubuntu SMP PREEMPT_DYNAMIC Mon May 25 15:54:34 UTC 2026 x86_64 GNU/Linux
Summary:
  1. Create an io_uring instance as an unprivileged user.
  2. Register a non-INC provided-buffer ring with two buffers:
       entry0.len = 4096
       entry1.len = 4096
  3. Submit IORING_OP_RECV with:
       IOSQE_BUFFER_SELECT
       IORING_RECVSEND_BUNDLE
       req_len = 1
       MSG_DONTWAIT
       empty AF_UNIX SOCK_DGRAM socket
     The receive fails with -EAGAIN, but entry0.len is changed from
     4096 to 1.
  4. Submit a later unrelated IORING_OP_READ from a pipe using the same
     provided-buffer group with req_len = 4096.
     The READ returns only 1 byte, because it uses the previously
     corrupted entry0.len.
  5. Submit a second READ. It consumes entry1 normally and returns 4096
     bytes, showing that head/bid accounting remains coherent and the
     corruption is localized to the poisoned descriptor.
Observed output from clean unprivileged reproduction:
  [INIT] uid=1002 entry0.len=4096 entry1.len=4096 tail=2
  [STEP1] RECV BUNDLE on empty socket, req_len=1, expected CQE=-EAGAIN
  [CQE_RECV_BUNDLE] res=-11 flags=0x0 user=0x1111
  [AFTER_RECV_BUNDLE] entry0.len=1 entry1.len=4096 changed_buf0=0 changed_buf1=0 guard_before=0 guard_after=0
  [STEP2] write pipe bytes=4096, then IORING_OP_READ req_len=4096 using same pbuf group
  [CQE_READ1] res=1 flags=0x1 user=0x6666
  [AFTER_READ1] entry0.len=1 entry1.len=4096 changed_buf0=1 changed_buf1=0 guard_before=0 guard_after=0
  [STEP3] write second pipe bytes=4096, then second IORING_OP_READ req_len=4096 without republish
  [CQE_READ2] res=4096 flags=0x10001 user=0x7777
  [AFTER_READ2] entry0.len=1 entry1.len=4096 changed_buf0=1 changed_buf1=4096 guard_before=0 guard_after=0
  [RESULT] PASS: unprivileged RECV_BUNDLE -EAGAIN poisoned pbuf len and later IORING_OP_READ consumed the corrupted len.
Why this looks like a bug:
The failed receive should not persistently alter the provided-buffer
descriptor in a way that affects future unrelated operations. In this
case, a no-data/-EAGAIN RECV_BUNDLE changes entry0.len from 4096 to 1,
and that corrupted length is later consumed by IORING_OP_READ from a
pipe.
The suspected root cause is in the non-INC provided-buffer ring BUNDLE
selection path:
    io_ring_buffers_peek()
        if (len > arg->max_len) {
            len = arg->max_len;
            if (!(bl->flags & IOBL_INC)) {
                arg->partial_map = 1;
                if (iov != arg->iovs)
                    break;
                WRITE_ONCE(buf->len, len);
            }
        }
The descriptor length is modified during buffer selection/peek before
the receive operation has completed successfully. If the receive later
fails with -EAGAIN/no data, the buffer is recycled but the modified
buf->len is not restored.
Additional observations:
  - The issue reproduces as an unprivileged user.
  - The effect crosses io_uring operations: RECV affects a later READ.
  - The effect crosses subsystems: socket receive affects pipe read.
  - The second READ correctly uses entry1 and returns 4096 bytes, so this
    does not appear to be a head/bid desync in the tested case.
  - No kernel crash, OOB write, UAF, or privilege escalation has been
    demonstrated.
Expected behavior:
If IORING_RECVSEND_BUNDLE fails with -EAGAIN/no data, the
provided-buffer ring descriptor should not be persistently modified,
or the original len should be restored during recycle/rollback.
Actual behavior:
The failed BUNDLE receive leaves entry0.len shortened to the requested
length, and later unrelated operations using the same provided-buffer
group consume that corrupted length.
I can provide the minimal C reproducer and full output if useful.
Thanks,
Federico