RE: [PATCH] media: chips-media: wave5: Move src_buf Removal to finish_encode

From: jackson . lee

Date: Wed Mar 18 2026 - 21:39:23 EST


Hi Brandon


> -----Original Message-----
> From: Brandon Brnich <b-brnich@xxxxxx>
> Sent: Saturday, March 14, 2026 1:55 AM
> To: linux-media@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Cc: detheridge@xxxxxx; mchehab@xxxxxxxxxx; Nas Chung
> <nas.chung@xxxxxxxxxxxxxxx>; jackson.lee <jackson.lee@xxxxxxxxxxxxxxx>;
> nicolas.dufresne@xxxxxxxxxxxxx; Brandon Brnich <b-brnich@xxxxxx>
> Subject: [PATCH] media: chips-media: wave5: Move src_buf Removal to
> finish_encode
>
> During encoder processing, there is a case where the IRQ response could
> return the buffer back to userspace via v4l2_m2m_buf_done call. In this
> time, userspace could queue up this same buffer before start_encode
> removes the index from the ready queue. This would then lead to a case
> where the buffer in the ready queue could be a self loop due to the
> WRITE_ONCE(prev->next, new) call in __list_add.
>
> When __list_del is finally called, the loop is already made so nothing
> points back to ready queue list head and pointers are poisoned.
>
> A buffer should not be marked as DONE before the buffer is removed from
> m2m ready queue. Move removal entirely to finish_encode.
>
> Signed-off-by: Brandon Brnich <b-brnich@xxxxxx>
> ---
>
> This bug is very hard to reproduce in simple encode environments. It
> primarily occurs during long run cases where CPU is strained doing other
> forms of computation. Crash log shared below.
>
> I see other drivers removing buffer in their device_run process, but they
> do it before any chance of DONE state transition. I can move this removal
> to there as well if that is the correct location, but I can't find
> anywhere saying this is required.
>
> Kernel tested on: 7.0.0-rc3-00167-g66dbdc5b5d2d Gstreamer version: 1.26.9
>
> [ 609.879961] pc : v4l2_m2m_buf_remove_by_idx+0x84/0xe8 [v4l2_mem2mem]
> [ 609.886313] lr : v4l2_m2m_buf_remove_by_idx+0x28/0xe8 [v4l2_mem2mem]
> [ 609.892663] sp : ffff800081a4bbd0 [ 609.895968] x29: ffff800081a4bbd0
> x28: 0000000000000000 x27: 000000000007f800 [ 609.903096] x26:
> 00000000aefd2800 x25: 0000000000000780 x24: ffff0000014efdc8 [ 609.910224]
> x23: ffff0000014efc28 x22: ffff0000014efc00 x21: ffff0000014eff60
> [ 609.917351] x20: ffff0000014efdc8 x19: ffff000001daf800 x18:
> 0000000000000000 [ 609.924478] x17: 0000000000000000 x16: 0000000000000000
> x15: 0000000000000002 [ 609.931605] x14: 0000000000000800 x13:
> 00000000000001e6 x12: 0000000000000000 [ 609.938732] x11: 0000000000000000
> x10: 00000000000009d0 x9 : ffff800081a4bcc0 [ 609.945859] x8 :
> ffff800081a4bc88 x7 : 0000000000000000 x6 : ffff0000014eff50 [ 609.952986]
> x5 : dead000000000100 x4 : dead000000000100 x3 : dead000000000122
> [ 609.960113] x2 : 0000000000000100 x1 : 0000000000000000 x0 :
> ffff000001dafc00 [ 609.967242] Call trace:
> [ 609.969678] v4l2_m2m_buf_remove_by_idx+0x84/0xe8 [v4l2_mem2mem]
> [ 609.975685] start_encode+0x28c/0x554 [wave5] [ 609.980063]
> wave5_vpu_enc_device_run+0x10c/0x230 [wave5] [ 609.985466]
> v4l2_m2m_try_run+0x84/0x140 [v4l2_mem2mem] [ 609.990692]
> v4l2_m2m_device_run_work+0x14/0x20 [v4l2_mem2mem] [ 609.996522]
> process_one_work+0x148/0x28c [ 610.000532] worker_thread+0x2d0/0x3d8
> [ 610.004277] kthread+0x110/0x114 [ 610.007500] ret_from_fork+0x10/0x20
>
> .../chips-media/wave5/wave5-vpu-enc.c | 32 +++----------------
> 1 file changed, 5 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
> b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
> index 7613fcdbafed..3e198a7cefb1 100644
> --- a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
> @@ -223,17 +223,9 @@ static int start_encode(struct vpu_instance *inst,
> u32 *fail_res)
> dst_buf->vb2_buf.timestamp = src_buf->vb2_buf.timestamp;
> v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_ERROR);
> v4l2_m2m_buf_done(dst_buf, VB2_BUF_STATE_ERROR);
> - } else {
> + } else

The else without braces violates kernel coding style when the if branch uses braces:

> dev_dbg(inst->dev->dev, "%s: wave5_vpu_enc_start_one_frame
> success\n",
> __func__);
> - /*
> - * Remove the source buffer from the ready-queue now and
> finish
> - * it in the videobuf2 framework once the index is returned
> by the
> - * firmware in finish_encode
> - */
> - if (src_buf)
> - v4l2_m2m_src_buf_remove_by_idx(m2m_ctx, src_buf-
> >vb2_buf.index);
> - }
>
> return 0;
> }
> @@ -259,27 +251,13 @@ static void wave5_vpu_enc_finish_encode(struct
> vpu_instance *inst)
> __func__, enc_output_info.pic_type,
> enc_output_info.recon_frame_index,
> enc_output_info.enc_src_idx, enc_output_info.enc_pic_byte,
> enc_output_info.pts);
>
> - /*
> - * The source buffer will not be found in the ready-queue as it has
> been
> - * dropped after sending of the encode firmware command, locate it
> in
> - * the videobuf2 queue directly
> - */
> if (enc_output_info.enc_src_idx >= 0) {
> - struct vb2_buffer *vb =
> vb2_get_buffer(v4l2_m2m_get_src_vq(m2m_ctx),
> - enc_output_info.enc_src_idx);
> - if (vb->state != VB2_BUF_STATE_ACTIVE)
> - dev_warn(inst->dev->dev,
> - "%s: encoded buffer (%d) was not in ready
> queue %i.",
> - __func__, enc_output_info.enc_src_idx, vb-
> >state);
> - else
> - src_buf = to_vb2_v4l2_buffer(vb);
> -
> - if (src_buf) {
> + src_buf = v4l2_m2m_src_buf_remove(m2m_ctx);


v4l2_m2m_src_buf_remove() is too weak

The new finish_encode uses v4l2_m2m_src_buf_remove() which blindly removes the head of the ready queue.
It ignores enc_output_info.enc_src_idx entirely.
This works under the assumption that the m2m framework is strictly FIFO (which it is for single-job-at-a-time), but a safer approach would be:

src_buf = v4l2_m2m_src_buf_remove_by_idx(m2m_ctx, enc_output_info.enc_src_idx);


thanks
Jackson

> + if (!src_buf)
> + dev_warn(inst->dev->dev, "%s: no source buffer
> found\n", __func__);
> + else {
> inst->timestamp = src_buf->vb2_buf.timestamp;
> v4l2_m2m_buf_done(src_buf, VB2_BUF_STATE_DONE);
> - } else {
> - dev_warn(inst->dev->dev, "%s: no source buffer with
> index: %d found\n",
> - __func__, enc_output_info.enc_src_idx);
> }
> }
>
> --
> 2.43.0