Re: [PATCH] media: amphion: Fix race between m2m job_abort and device_run

From: Nicolas Dufresne

Date: Thu Mar 19 2026 - 16:53:42 EST


On Friday, 6 March 2026 at 14:59 +0800, ming.qian@xxxxxxxxxxx wrote:
> From: Ming Qian <ming.qian@xxxxxxxxxxx>
>
> Fix kernel panic caused by race condition where v4l2_m2m_ctx_release()
> frees m2m_ctx while v4l2_m2m_try_run() is about to call device_run
> with the same context.
>
> Race sequence:
>   v4l2_m2m_try_run():           v4l2_m2m_ctx_release():
>     lock/unlock                   v4l2_m2m_cancel_job()
>                                     job_abort()
>                                       v4l2_m2m_job_finish()
>                                   kfree(m2m_ctx)  <- frees ctx
>     device_run()  <- use-after-free crash at 0x538
>
> Crash trace:
>   Unable to handle kernel read from unreadable memory at virtual address
>   0000000000000538
>   v4l2_m2m_try_run+0x78/0x138
>   v4l2_m2m_device_run_work+0x14/0x20
>
> The amphion vpu driver does not rely on the m2m framework's device_run
> callback to perform encode/decode operations.
>
> Fix the race by preventing m2m framework job scheduling entirely:
> - Add job_ready callback returning 0 (no jobs ready for m2m framework)
> - Remove job_abort callback to avoid the race condition
>
> Fixes: 3cd084519c6f ("media: amphion: add vpu v4l2 m2m support")
> Signed-off-by: Ming Qian <ming.qian@xxxxxxxxxxx>

OK, I guess that also reduces the overhead of scheduling jobs.
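
For anyone following along, here is a rough userspace sketch of why a
job_ready callback that returns 0 keeps the framework from ever reaching
device_run. All toy_* names are hypothetical and only model the
"skip queuing when job_ready says no" check; this is not the actual
v4l2-mem2mem code, and it ignores locking, buffers and streaming state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct v4l2_m2m_ops. */
struct toy_m2m_ops {
	void (*device_run)(void *priv);
	int (*job_ready)(void *priv);	/* optional, may be NULL */
};

/* Hypothetical per-context state: one pending-job flag, one run counter. */
struct toy_ctx {
	const struct toy_m2m_ops *ops;
	void *priv;
	bool queued;
	unsigned int runs;
};

/* Queue a job unless the driver's job_ready callback reports "not ready". */
static bool toy_try_queue(struct toy_ctx *ctx)
{
	assert(ctx->ops && ctx->ops->device_run);

	if (ctx->ops->job_ready && !ctx->ops->job_ready(ctx->priv))
		return false;

	ctx->queued = true;
	return true;
}

/* Run the pending job, if any; device_run is only reached from here. */
static void toy_try_run(struct toy_ctx *ctx)
{
	if (!ctx->queued)
		return;

	ctx->queued = false;
	ctx->runs++;
	ctx->ops->device_run(ctx->priv);
}

static void toy_noop_run(void *priv)
{
	(void)priv;
}

/* Mirrors the shape of the patch: never report a job as ready. */
static int toy_never_ready(void *priv)
{
	(void)priv;
	return 0;
}

static const struct toy_m2m_ops toy_gated_ops = {
	.device_run = toy_noop_run,
	.job_ready = toy_never_ready,
};

static const struct toy_m2m_ops toy_ungated_ops = {
	.device_run = toy_noop_run,
};

/* One queue attempt followed by one run attempt; returns device_run calls. */
static unsigned int toy_schedule_once(const struct toy_m2m_ops *ops)
{
	struct toy_ctx ctx = { .ops = ops };

	toy_try_queue(&ctx);
	toy_try_run(&ctx);
	return ctx.runs;
}
```

In this model the gated ops never get a job queued, so there is nothing
left for a release path to cancel or abort.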

Reviewed-by: Nicolas Dufresne <nicolas.dufresne@xxxxxxxxxxxxx>

> ---
>  drivers/media/platform/amphion/vpu_v4l2.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/media/platform/amphion/vpu_v4l2.c b/drivers/media/platform/amphion/vpu_v4l2.c
> index 64fc88d89ccc..7cccc994fc50 100644
> --- a/drivers/media/platform/amphion/vpu_v4l2.c
> +++ b/drivers/media/platform/amphion/vpu_v4l2.c
> @@ -447,17 +447,14 @@ static void vpu_m2m_device_run(void *priv)
>  {
>  }
>  
> -static void vpu_m2m_job_abort(void *priv)
> +static int vpu_m2m_job_ready(void *priv)
>  {
> - struct vpu_inst *inst = priv;
> - struct v4l2_m2m_ctx *m2m_ctx = inst->fh.m2m_ctx;
> -
> - v4l2_m2m_job_finish(m2m_ctx->m2m_dev, m2m_ctx);
> + return 0;
>  }
>  
>  static const struct v4l2_m2m_ops vpu_m2m_ops = {
>   .device_run = vpu_m2m_device_run,
> - .job_abort = vpu_m2m_job_abort
> + .job_ready = vpu_m2m_job_ready,
>  };
>  
>  static int vpu_vb2_queue_setup(struct vb2_queue *vq,
>
> base-commit: f505e978d1a0442adbbde48aed38c084ddea6d6e
> prerequisite-patch-id: 0000000000000000000000000000000000000000

Not sure why this line ...

Nicolas
