Re: [BUG io_uring] Failed RECVSEND_BUNDLE can persistently shrink non-INC pbuf ring len and affect later READ operations
From: Federico Brasili
Date: Sun Jun 07 2026 - 16:09:11 EST
Hi Jens,
Sure, attaching the minimal reproducer and the output from my Ubuntu
7.0.0-22-generic test system.
The reproducer runs unprivileged and demonstrates:
1. non-INC provided-buffer ring with entry0.len = 4096 and entry1.len = 4096
2. IORING_OP_RECV + IOSQE_BUFFER_SELECT + IORING_RECVSEND_BUNDLE on an
empty SOCK_DGRAM socket
3. CQE returns -EAGAIN, but entry0.len is changed from 4096 to 1
4. a later unrelated IORING_OP_READ from a pipe using the same buffer
group returns 1 byte instead of 4096
5. a second READ uses entry1 and returns 4096, so head/bid accounting
appears coherent in this repro
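
For reference, the head/bid reading in item 5 comes from the CQE flags in
the output quoted below (flags=0x1 for the first READ, flags=0x10001 for
the second). A minimal sketch of the decoding, assuming the UAPI values
IORING_CQE_F_BUFFER = 1 << 0 and IORING_CQE_BUFFER_SHIFT = 16; the macro
and function names are local to this sketch and are not taken from the
attached reproducer:

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the UAPI constants from <linux/io_uring.h>. */
#define CQE_F_BUFFER     (1U << 0)  /* IORING_CQE_F_BUFFER */
#define CQE_BUFFER_SHIFT 16         /* IORING_CQE_BUFFER_SHIFT */

/* Return the buffer ID carried in cqe->flags, or -1 if no buffer was selected. */
static int cqe_buffer_id(uint32_t flags)
{
    if (!(flags & CQE_F_BUFFER))
        return -1;
    return (int)(flags >> CQE_BUFFER_SHIFT);
}

/* Check the three flag values from the reproducer output. */
void check_observed_flags(void)
{
    assert(cqe_buffer_id(0x0) == -1);    /* failed RECV: no buffer reported */
    assert(cqe_buffer_id(0x1) == 0);     /* READ1 consumed bid 0 */
    assert(cqe_buffer_id(0x10001) == 1); /* READ2 consumed bid 1 */
}
```

Calling check_observed_flags() from a small main() is enough to confirm
that the two READs consumed bid 0 and bid 1 in order.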
I am not claiming privilege escalation from this. The demonstrated
issue is persistent provided-buffer descriptor length corruption after
a failed/no-data RECV_BUNDLE, affecting a later READ operation.
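
To make the suspected sequence concrete, here is a small userspace model
of it. This is only a sketch under my own assumptions, not kernel code:
struct model_buf and the model_*() helpers are hypothetical names, and
the model covers only the truncation branch of the io_ring_buffers_peek()
excerpt quoted below, for the first selected buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for one provided-buffer ring entry. */
struct model_buf {
    uint32_t len;
};

/*
 * Model of the non-INC truncation branch: if the entry is longer than the
 * request's max_len, the shared descriptor itself is shortened.
 * Returns the length that would be mapped for the request.
 */
static uint32_t model_peek_first(struct model_buf *buf, uint32_t max_len)
{
    uint32_t len = buf->len;

    if (len > max_len) {
        len = max_len;
        buf->len = len; /* side effect on state shared with later requests */
    }
    return len;
}

/* Model of a receive that finds no data: nothing restores buf->len. */
static int model_recv_no_data(struct model_buf *buf, uint32_t req_len)
{
    (void)model_peek_first(buf, req_len);
    return -EAGAIN;
}

/* Model of a later READ on the same entry: capped by the entry's len. */
static uint32_t model_read(const struct model_buf *buf, uint32_t req_len,
                           uint32_t avail)
{
    uint32_t n = req_len < buf->len ? req_len : buf->len;

    return n < avail ? n : avail;
}

/* Replay the reported sequence against the model. */
void model_demo(void)
{
    struct model_buf entry0 = { .len = 4096 };

    assert(model_recv_no_data(&entry0, 1) == -EAGAIN);
    assert(entry0.len == 1);                      /* length stays shortened */
    assert(model_read(&entry0, 4096, 4096) == 1); /* READ sees 1, not 4096 */
}
```

In this model the fix would be either to defer the write to buf->len
until the receive succeeds, or to restore the original length on the
recycle path; which of the two fits the real code is for you to judge.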
Thanks,
Federico
On Sun, Jun 7, 2026 at 21:07 Jens Axboe <axboe@xxxxxxxxx> wrote:
>
> On 6/7/26 5:41 AM, Federico Brasili wrote:
> > Hi,
> >
> > I found a reproducible io_uring provided-buffer ring issue on Ubuntu
> > kernel 7.0.0-22-generic.
> >
> > A failed IORING_RECVSEND_BUNDLE receive on a non-INC provided-buffer
> > ring can persistently shrink the user-visible buffer descriptor
> > length. The modified length is not rolled back when the receive fails
> > with -EAGAIN/no data, and a later unrelated io_uring operation, such
> > as IORING_OP_READ from a pipe, consumes the corrupted length.
> >
> > This is not a demonstrated privilege escalation. The demonstrated
> > impact is deterministic unprivileged provided-buffer ring metadata
> > corruption across unrelated io_uring operations.
> >
> > Tested kernel:
> >
> > Linux ubuntu 7.0.0-22-generic #22-Ubuntu SMP PREEMPT_DYNAMIC Mon May
> > 25 15:54:34 UTC 2026 x86_64 GNU/Linux
> >
> > Summary:
> >
> > Create an io_uring instance as an unprivileged user.
> >
> > Register a non-INC provided-buffer ring with two buffers:
> >
> > entry0.len = 4096
> >
> > entry1.len = 4096
> >
> > Submit IORING_OP_RECV with:
> >
> >   - IOSQE_BUFFER_SELECT
> >   - IORING_RECVSEND_BUNDLE
> >   - req_len = 1
> >   - MSG_DONTWAIT
> >   - empty AF_UNIX SOCK_DGRAM socket
> >
> > The receive fails with -EAGAIN, but entry0.len is changed from 4096 to 1.
> >
> > Submit a later unrelated IORING_OP_READ from a pipe using the same
> > provided-buffer group with req_len = 4096.
> >
> > The READ returns only 1 byte, because it uses the previously corrupted
> > entry0.len.
> >
> > A second READ then consumes entry1 normally and returns 4096 bytes,
> > showing that head/bid accounting remains coherent and the corruption
> > is localized to the poisoned descriptor.
> >
> > Observed output from clean unprivileged reproduction:
> >
> > [INIT] uid=1002 entry0.len=4096 entry1.len=4096 tail=2
> > [STEP1] RECV BUNDLE on empty socket, req_len=1, expected CQE=-EAGAIN
> > [CQE_RECV_BUNDLE] res=-11 flags=0x0 user=0x1111
> > [AFTER_RECV_BUNDLE] entry0.len=1 entry1.len=4096 changed_buf0=0
> > changed_buf1=0 guard_before=0 guard_after=0
> > [STEP2] write pipe bytes=4096, then IORING_OP_READ req_len=4096 using
> > same pbuf group
> > [CQE_READ1] res=1 flags=0x1 user=0x6666
> > [AFTER_READ1] entry0.len=1 entry1.len=4096 changed_buf0=1
> > changed_buf1=0 guard_before=0 guard_after=0
> > [STEP3] write second pipe bytes=4096, then second IORING_OP_READ
> > req_len=4096 without republish
> > [CQE_READ2] res=4096 flags=0x10001 user=0x7777
> > [AFTER_READ2] entry0.len=1 entry1.len=4096 changed_buf0=1
> > changed_buf1=4096 guard_before=0 guard_after=0
> > [RESULT] PASS: unprivileged RECV_BUNDLE -EAGAIN poisoned pbuf len and
> > later IORING_OP_READ consumed the corrupted len.
> >
> > Why this looks like a bug:
> >
> > The failed receive should not persistently alter the provided-buffer
> > descriptor in a way that affects future unrelated operations. In this
> > case, a no-data/-EAGAIN RECV_BUNDLE changes entry0.len from 4096 to 1,
> > and that corrupted length is later consumed by IORING_OP_READ from a
> > pipe.
> >
> > The suspected root cause is in the non-INC provided-buffer ring BUNDLE
> > selection path:
> >
> > io_ring_buffers_peek()
> >     if (len > arg->max_len) {
> >         len = arg->max_len;
> >         if (!(bl->flags & IOBL_INC)) {
> >             arg->partial_map = 1;
> >             if (iov != arg->iovs)
> >                 break;
> >             WRITE_ONCE(buf->len, len);
> >         }
> >     }
> >
> > The descriptor length is modified during buffer selection/peek before
> > the receive operation has completed successfully. If the receive later
> > fails with -EAGAIN/no data, the buffer is recycled but the modified
> > buf->len is not restored.
> >
> > Additional observations:
> >
> > The issue reproduces as an unprivileged user.
> >
> > The effect crosses io_uring operations: RECV affects a later READ.
> >
> > The effect crosses subsystems: socket receive affects pipe read.
> >
> > The second READ correctly uses entry1 and returns 4096 bytes, so this
> > does not appear to be a head/bid desync in the tested case.
> >
> > No kernel crash, OOB write, UAF, or privilege escalation has been demonstrated.
> >
> > Expected behavior:
> >
> > If IORING_RECVSEND_BUNDLE fails with -EAGAIN/no data, the
> > provided-buffer ring descriptor should not be persistently modified,
> > or the original len should be restored during recycle/rollback.
> >
> > Actual behavior:
> >
> > The failed BUNDLE receive leaves entry0.len shortened to the requested
> > length, and later unrelated operations using the same provided-buffer
> > group consume that corrupted length.
> >
> > I can provide the minimal C reproducer and full output if useful.
>
> Please do, no point in me recreating one for it. Then it can also get
> turned into a regression test for liburing. Reproducers also mean more
> than a thousand words in an email, it tells us exactly what is being run
> and what is going wrong. Or in some cases, what the wrong expectations
> are.
>
> --
> Jens Axboe
Attachment:
iouring_pbuf_reproducer_for_jens.tar.gz
Description: GNU Zip compressed data