Re: [PATCH net-next v5 1/2] net: pse-pd: add devm_pse_poll_helper()

From: Oleksij Rempel

Date: Sun May 17 2026 - 05:57:47 EST


On Sat, May 16, 2026 at 12:17:20PM +0200, Carlo Szelinsky wrote:
> Hi Jakub,
>
> Thanks for sending the AI review. I checked the code again... both
> points are real, but one of them looks to me like more theory than practice.
> Out of curiosity, what prompt do you use for these reviews, so I can use it
> before sending the next version?

Sashiko is using https://github.com/masoncl/review-prompts

I personally run pre-review with opencode + review-prompts +
gemini-3.1-pro-preview-customtools / or claude-sonnet-4-5

After installing review-prompts to opencode config the review can be
done with "/kreview commit_id"

But sashiko still finds things, i do not know what is the difference.

> Replying inline:
>
> On Mon, 4 May 2026 18:57:57 -0700 Jakub Kicinski wrote:
> > > @@ -1118,6 +1125,8 @@ void pse_controller_unregister(struct pse_controller_dev *pcdev)
> > > pse_release_pis(pcdev);
> > > if (pcdev->irq)
> > > disable_irq(pcdev->irq);
> > > + if (pcdev->polling)
> > > + cancel_delayed_work_sync(&pcdev->poll_work);
> >
> > Because pse_release_pis() executes kfree(pcdev->pi) before the delayed work
> > is cancelled, can pse_poll_worker() run concurrently and access the freed
> > pcdev->pi array inside pse_handle_events() before the cancellation completes?
>
> Yes, seems like a real bug. The IRQ thread and the poll worker can still run
> after pse_release_pis() frees pcdev->pi, and both end up in
> pse_handle_events() touching pcdev->pi[i]. We need to stop them
> first. v6 will reorder pse_controller_unregister() to:
>
> if (pcdev->irq)
> disable_irq(pcdev->irq);
> if (pcdev->polling)
> cancel_delayed_work_sync(&pcdev->poll_work);
> cancel_work_sync(&pcdev->ntf_work);
> pse_release_pis(pcdev);
>
> The IRQ side seem to be already broken before this patch. I went
> through the git log:
>
> * fc0e6db30941 ("net: pse-pd: Add support for reporting events")
> added pse_isr but never called disable_irq() in unregister at all.
> * ffef61d6d273 ("net: pse-pd: Add support for budget evaluation
> strategies") added the disable_irq() call, but put it in the
> wrong place (after pse_release_pis()).
>
> What do you suggest to do, I could imagine: ...
>
> 1) Send the fix as a standalone "Fixes:" patch to net first,
> then rebase v6 on top once it lands? Or fold it into v6 as
> patch 1/3? My vote is standalone since the fix stands on its
> own, but up to you.

Yes, please send the fix to stable. In this case you will need to wait
until net-next includes stable fixes.

> 2) Which commit for the Fixes: tag? I'd pick ffef61d6d273
> (that's where the broken order came from), but fc0e6db30941
> also works (no protection at all back then). wdyt?

If it is introduce by ffef61d6d273, then use this for Fixes.

> > > +int devm_pse_poll_helper(struct pse_controller_dev *pcdev,
> > > + const struct pse_irq_desc *d)
> > > +{
> > [...]
> > > + queue_delayed_work(system_freezable_wq, &pcdev->poll_work,
> > > + msecs_to_jiffies(pcdev->poll_interval_ms));
> >
> > Since memory is allocated with devm_ helpers but no devm cleanup action is
> > registered to cancel the delayed work, what happens if a driver's probe
> > function encounters an error after calling this helper? [...]
> >
> > The commit message mentions that drivers typically set up the helper before
> > registering the controller. If queue_delayed_work() schedules the worker
> > to run before devm_pse_controller_register() completes, will
> > pse_poll_worker() attempt to acquire pcdev->lock and push events into
> > pcdev->ntf_fifo before they are initialized?
>
> The probe-failure case is a real bug. If probe fails between this
> helper and devm_pse_controller_register(), the work stays queued
> and fires later on a pcdev whose devres memory is gone. UAF.
>
> The "worker fires before mutex/kfifo are ready" case is real in
> theory, but I don't think you can actually hit it?! Default
> poll_interval_ms is 1000 ms, so pse_controller_register() would
> need to take over a second to lose the race. Worth fixing to keep
> the code clean, but not urgent or?

The probe() is a process and can be outscheduled by any high prio process
at any time. May be on a single core SoC with pse-pd core build
as module and squashfs root it will make the boot load heavy enough
to reproduce it?

> Same fix for both: move queue_delayed_work() out of the helper.
> The helper just allocates the state and does INIT_DELAYED_WORK.
> pse_controller_register() then arms the work as the very last
> step on success, gated on pcdev->polling. So the work only ever
> runs when everything is set up AND register has succeeded.

Sounds good.

> Should I fix both in v6, or just the probe-failure leak? One
> change covers both, so I would suggest to do both. wdyt?

Fixes for pre-existing bugs to net, new bugs to net-next :)

Best Regards,
Oleksij
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |