Re: [PATCH 2/2 RESEND] greybus: raw: fix use-after-free if write is called after disconnect

From: Dan Carpenter

Date: Mon Mar 16 2026 - 10:31:23 EST


On Mon, Mar 16, 2026 at 09:16:31AM -0400, Damien Riégel wrote:
> On Mon Mar 16, 2026 at 3:36 AM EDT, Dan Carpenter wrote:
> > On Wed, Mar 11, 2026 at 05:25:11PM -0400, Damien Riégel wrote:
> >> If a user writes to the chardev after disconnect has been called, the
> >> kernel panics with the following trace (with
> >> CONFIG_INIT_ON_FREE_DEFAULT_ON=y):
> >>
> >> [ 83.828726] BUG: kernel NULL pointer dereference, address: 0000000000000218
> >> [ 83.829288] #PF: supervisor read access in kernel mode
> >> [ 83.829528] #PF: error_code(0x0000) - not-present page
> >> [ 83.829828] PGD 0 P4D 0
> >> [ 83.830126] Oops: Oops: 0000 [#1] SMP NOPTI
> >> [ 83.830753] CPU: 0 UID: 0 PID: 140 Comm: raw_chardev_tes Tainted: G C 6.18.0-rc4 #212 PREEMPT(voluntary)
> >> [ 83.831260] Tainted: [C]=CRAP
> >> [ 83.831426] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
> >> [ 83.831912] RIP: 0010:gb_operation_message_alloc+0x14/0xc0
> >> [ 83.832366] Code: 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 56 4c 8d 72 08 41 55 41 89 cd1
> >> [ 83.832979] RSP: 0018:ffffb73f0027bd58 EFLAGS: 00010286
> >> [ 83.833247] RAX: ffffa44741f72300 RBX: ffffa44741f72300 RCX: 0000000000000cc0
> >> [ 83.833513] RDX: 000000000000000a RSI: 0000000000000002 RDI: 0000000000000000
> >> [ 83.833732] RBP: 0000000000000cc0 R08: 0000000000000000 R09: 0000000000000000
> >> [ 83.834044] R10: ffffa44741f72300 R11: 0000000000000000 R12: 0000000000000002
> >> [ 83.834267] R13: 0000000000000cc0 R14: 0000000000000012 R15: 0000000000000000
> >> [ 83.834533] FS: 00007fead7859740(0000) GS:ffffa447a31bc000(0000) knlGS:0000000000000000
> >> [ 83.834776] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> [ 83.834974] CR2: 0000000000000218 CR3: 000000000216b000 CR4: 00000000000006f0
> >> [ 83.835259] Call Trace:
> >> [ 83.835983] <TASK>
> >> [ 83.836362] gb_operation_create_common+0x61/0x180
> >> [ 83.836653] gb_operation_create_flags+0x28/0xa0
> >> [ 83.836912] gb_operation_sync_timeout+0x6f/0x100
> >> [ 83.837162] raw_write+0x7b/0xc7 [gb_raw]
> >> [ 83.837460] vfs_write+0xcf/0x420
> >> [ 83.837615] ? task_mm_cid_work+0x136/0x220
> >> [ 83.837784] ksys_write+0x63/0xe0
> >> [ 83.837946] do_syscall_64+0xa4/0x290
> >> [ 83.838097] entry_SYSCALL_64_after_hwframe+0x77/0x7f
> >> [ 83.838359] RIP: 0033:0x7fead78e9cc7
> >> [ 83.838712] Code: 48 89 fa 4c 89 df e8 08 ae 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44f
> >> [ 83.839190] RSP: 002b:00007ffece5c3de0 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
> >> [ 83.839489] RAX: ffffffffffffffda RBX: 00007fead7859740 RCX: 00007fead78e9cc7
> >> [ 83.839675] RDX: 0000000000000006 RSI: 0000563d13f96326 RDI: 0000000000000003
> >> [ 83.839892] RBP: 00007ffece5c3e38 R08: 0000000000000000 R09: 0000000000000000
> >> [ 83.840112] R10: 0000000000000000 R11: 0000000000000202 R12: 0000563cf8925128
> >> [ 83.840350] R13: 00007fead78596d0 R14: 0000563d13f96320 R15: 0000563d13f96326
> >> [ 83.840635] </TASK>
> >> [ 83.840824] Modules linked in: gb_raw(C)
> >> [ 83.841311] CR2: 0000000000000218
> >> [ 83.842009] ---[ end trace 0000000000000000 ]---
> >>
> >> Disconnect calls gb_connection_destroy, which ends up freeing the
> >> connection object. When gb_operation_sync is called in the write file
> >> operations, its gets a freed connection as parameter and the kernel
> >> panics.
> >>
> >> The gb_connection_destroy cannot be moved out of the disconnect
> >> function, as the Greybus subsystem expect all connections belonging to a
> >> bundle to be destroyed when disconnect returns.
> >>
> >> To prevent this bug, use a lock to synchronize access between write and
> >> disconnect. This guarantees that in the write function raw->connection
> >> is either a valid object or a NULL pointer.
> >>
> >> Fixes: e806c7fb8e9b ("greybus: raw: add raw greybus kernel driver")
> >> Signed-off-by: Damien Riégel <damien.riegel@xxxxxxxxxx>
> >> ---
> >> resend: added linux-staging as Cc, this list was not part of the first
> >> submission.
> >>
> >> drivers/staging/greybus/raw.c | 26 ++++++++++++++++++++------
> >> 1 file changed, 20 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/staging/greybus/raw.c b/drivers/staging/greybus/raw.c
> >> index b92214f97e3..aa4086ff397 100644
> >> --- a/drivers/staging/greybus/raw.c
> >> +++ b/drivers/staging/greybus/raw.c
> >> @@ -21,6 +21,7 @@ struct gb_raw {
> >> struct list_head list;
> >> int list_data;
> >> struct mutex list_lock;
> >> + struct mutex write_lock; /* Synchronize access to connection */
> >> struct cdev cdev;
> >> struct device dev;
> >> };
> >> @@ -124,8 +125,8 @@ static int gb_raw_request_handler(struct gb_operation *op)
> >>
> >> static int gb_raw_send(struct gb_raw *raw, u32 len, const char __user *data)
> >> {
> >> - struct gb_connection *connection = raw->connection;
> >> struct gb_raw_send_request *request;
> >> + struct gb_connection *connection;
> >> int retval;
> >>
> >> request = kmalloc(len + sizeof(*request), GFP_KERNEL);
> >> @@ -139,9 +140,15 @@ static int gb_raw_send(struct gb_raw *raw, u32 len, const char __user *data)
> >>
> >> request->len = cpu_to_le32(len);
> >>
> >> - retval = gb_operation_sync(connection, GB_RAW_TYPE_SEND,
> >> - request, len + sizeof(*request),
> >> - NULL, 0);
> >> + mutex_lock(&raw->write_lock);
> >> + retval = -ENODEV;
> >> +
> >> + connection = raw->connection;
> >> + if (connection)
> >> + retval = gb_operation_sync(connection, GB_RAW_TYPE_SEND,
> >> + request, len + sizeof(*request),
> >> + NULL, 0);
> >> + mutex_unlock(&raw->write_lock);
> > ^^^^^^^^^^^^^^^^
> >
> > I feel like we need to do a get_device() here as well otherwise the
> > put_device(&raw->dev) in gb_raw_disconnect() could delete the last
> > reference and free raw. I have looked at this and I feel like what
> > I'm saying is reasonable but I don't necessarily know how the reference
> > couting works for cdev. Please feel free to correct me. :)
>
> This is not my understanding, nor what I could see when I tested this.
> With cdev_device_add(cdev, dev), dev becomes the parent of the chardev.
> So as long as the cdev is opened, dev cannot go away because its child
> holds a reference to it (it's done for us by device core logic, we don't
> have to take care of that or manually get_device()).
>
> If gb_raw_disconnect() is called while the device is opened, raw->dev
> won't be freed until the cdev is closed. When that happens, cdev's
> refcount drops to 0, which drops the reference to its parent, which can
> finally be freed.
>
> So I think the part you highlighted is fine as is. If you're fine with
> it, I'll just send a new version of the patchset with the first patch
> fixed (error path mishandled in probe function).

Yeah. Thanks for the explanation.

regards,
dan carpenter