Re: [PATCH 2/2 RESEND] greybus: raw: fix use-after-free if write is called after disconnect
From: Damien Riégel
Date: Mon Mar 16 2026 - 09:39:10 EST
On Mon Mar 16, 2026 at 3:36 AM EDT, Dan Carpenter wrote:
> On Wed, Mar 11, 2026 at 05:25:11PM -0400, Damien Riégel wrote:
>> If a user writes to the chardev after disconnect has been called, the
>> kernel panics with the following trace (with
>> CONFIG_INIT_ON_FREE_DEFAULT_ON=y):
>>
>> [ 83.828726] BUG: kernel NULL pointer dereference, address: 0000000000000218
>> [ 83.829288] #PF: supervisor read access in kernel mode
>> [ 83.829528] #PF: error_code(0x0000) - not-present page
>> [ 83.829828] PGD 0 P4D 0
>> [ 83.830126] Oops: Oops: 0000 [#1] SMP NOPTI
>> [ 83.830753] CPU: 0 UID: 0 PID: 140 Comm: raw_chardev_tes Tainted: G C 6.18.0-rc4 #212 PREEMPT(voluntary)
>> [ 83.831260] Tainted: [C]=CRAP
>> [ 83.831426] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
>> [ 83.831912] RIP: 0010:gb_operation_message_alloc+0x14/0xc0
>> [ 83.832366] Code: 00 00 00 00 66 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 56 4c 8d 72 08 41 55 41 89 cd1
>> [ 83.832979] RSP: 0018:ffffb73f0027bd58 EFLAGS: 00010286
>> [ 83.833247] RAX: ffffa44741f72300 RBX: ffffa44741f72300 RCX: 0000000000000cc0
>> [ 83.833513] RDX: 000000000000000a RSI: 0000000000000002 RDI: 0000000000000000
>> [ 83.833732] RBP: 0000000000000cc0 R08: 0000000000000000 R09: 0000000000000000
>> [ 83.834044] R10: ffffa44741f72300 R11: 0000000000000000 R12: 0000000000000002
>> [ 83.834267] R13: 0000000000000cc0 R14: 0000000000000012 R15: 0000000000000000
>> [ 83.834533] FS: 00007fead7859740(0000) GS:ffffa447a31bc000(0000) knlGS:0000000000000000
>> [ 83.834776] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 83.834974] CR2: 0000000000000218 CR3: 000000000216b000 CR4: 00000000000006f0
>> [ 83.835259] Call Trace:
>> [ 83.835983] <TASK>
>> [ 83.836362] gb_operation_create_common+0x61/0x180
>> [ 83.836653] gb_operation_create_flags+0x28/0xa0
>> [ 83.836912] gb_operation_sync_timeout+0x6f/0x100
>> [ 83.837162] raw_write+0x7b/0xc7 [gb_raw]
>> [ 83.837460] vfs_write+0xcf/0x420
>> [ 83.837615] ? task_mm_cid_work+0x136/0x220
>> [ 83.837784] ksys_write+0x63/0xe0
>> [ 83.837946] do_syscall_64+0xa4/0x290
>> [ 83.838097] entry_SYSCALL_64_after_hwframe+0x77/0x7f
>> [ 83.838359] RIP: 0033:0x7fead78e9cc7
>> [ 83.838712] Code: 48 89 fa 4c 89 df e8 08 ae 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44f
>> [ 83.839190] RSP: 002b:00007ffece5c3de0 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
>> [ 83.839489] RAX: ffffffffffffffda RBX: 00007fead7859740 RCX: 00007fead78e9cc7
>> [ 83.839675] RDX: 0000000000000006 RSI: 0000563d13f96326 RDI: 0000000000000003
>> [ 83.839892] RBP: 00007ffece5c3e38 R08: 0000000000000000 R09: 0000000000000000
>> [ 83.840112] R10: 0000000000000000 R11: 0000000000000202 R12: 0000563cf8925128
>> [ 83.840350] R13: 00007fead78596d0 R14: 0000563d13f96320 R15: 0000563d13f96326
>> [ 83.840635] </TASK>
>> [ 83.840824] Modules linked in: gb_raw(C)
>> [ 83.841311] CR2: 0000000000000218
>> [ 83.842009] ---[ end trace 0000000000000000 ]---
>>
>> Disconnect calls gb_connection_destroy, which ends up freeing the
>> connection object. When gb_operation_sync is called in the write file
>> operations, its gets a freed connection as parameter and the kernel
>> panics.
>>
>> The gb_connection_destroy cannot be moved out of the disconnect
>> function, as the Greybus subsystem expect all connections belonging to a
>> bundle to be destroyed when disconnect returns.
>>
>> To prevent this bug, use a lock to synchronize access between write and
>> disconnect. This guarantees that in the write function raw->connection
>> is either a valid object or a NULL pointer.
>>
>> Fixes: e806c7fb8e9b ("greybus: raw: add raw greybus kernel driver")
>> Signed-off-by: Damien Riégel <damien.riegel@xxxxxxxxxx>
>> ---
>> resend: added linux-staging as Cc, this list was not part of the first
>> submission.
>>
>> drivers/staging/greybus/raw.c | 26 ++++++++++++++++++++------
>> 1 file changed, 20 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/staging/greybus/raw.c b/drivers/staging/greybus/raw.c
>> index b92214f97e3..aa4086ff397 100644
>> --- a/drivers/staging/greybus/raw.c
>> +++ b/drivers/staging/greybus/raw.c
>> @@ -21,6 +21,7 @@ struct gb_raw {
>> struct list_head list;
>> int list_data;
>> struct mutex list_lock;
>> + struct mutex write_lock; /* Synchronize access to connection */
>> struct cdev cdev;
>> struct device dev;
>> };
>> @@ -124,8 +125,8 @@ static int gb_raw_request_handler(struct gb_operation *op)
>>
>> static int gb_raw_send(struct gb_raw *raw, u32 len, const char __user *data)
>> {
>> - struct gb_connection *connection = raw->connection;
>> struct gb_raw_send_request *request;
>> + struct gb_connection *connection;
>> int retval;
>>
>> request = kmalloc(len + sizeof(*request), GFP_KERNEL);
>> @@ -139,9 +140,15 @@ static int gb_raw_send(struct gb_raw *raw, u32 len, const char __user *data)
>>
>> request->len = cpu_to_le32(len);
>>
>> - retval = gb_operation_sync(connection, GB_RAW_TYPE_SEND,
>> - request, len + sizeof(*request),
>> - NULL, 0);
>> + mutex_lock(&raw->write_lock);
>> + retval = -ENODEV;
>> +
>> + connection = raw->connection;
>> + if (connection)
>> + retval = gb_operation_sync(connection, GB_RAW_TYPE_SEND,
>> + request, len + sizeof(*request),
>> + NULL, 0);
>> + mutex_unlock(&raw->write_lock);
> ^^^^^^^^^^^^^^^^
>
> I feel like we need to do a get_device() here as well otherwise the
> put_device(&raw->dev) in gb_raw_disconnect() could delete the last
> reference and free raw. I have looked at this and I feel like what
> I'm saying is reasonable but I don't necessarily know how the reference
> couting works for cdev. Please feel free to correct me. :)
This is not my understanding, nor what I could see when I tested this.
With cdev_device_add(cdev, dev), dev becomes the parent of the chardev.
So as long as the cdev is opened, dev cannot go away because its child
holds a reference to it (it's done for us by device core logic, we don't
have to take care of that or manually get_device()).
If gb_raw_disconnect() is called while the device is opened, raw->dev
won't be freed until the cdev is closed. When that happens, cdev's
refcount drops to 0, which drops the reference to its parent, which can
finally be freed.
So I think the part you highlighted is fine as is. If you're fine with
it, I'll just send a new version of the patchset with the first patch
fixed (error path mishandled in probe function).
Regards,
damien