Re: [PATCH v2 1/2] software node: fix refcount leak in software_node_get_next_child()

From: Xu Yang

Date: Sun Jun 07 2026 - 22:39:43 EST


On Fri, Jun 05, 2026 at 06:20:52PM +0300, Andy Shevchenko wrote:
> On Fri, Jun 05, 2026 at 05:16:32PM +0800, Xu Yang wrote:
> > On Thu, Jun 04, 2026 at 08:50:16PM +0300, Andy Shevchenko wrote:
> > > On Thu, Jun 04, 2026 at 07:15:26PM +0800, Xu Yang wrote:
> > > > On Wed, Jun 03, 2026 at 12:40:27PM +0300, Andy Shevchenko wrote:
> > > > > On Wed, Jun 03, 2026 at 04:44:31PM +0800, Xu Yang wrote:
>
> ...
>
> > > > > > struct swnode *p = to_swnode(fwnode);
> > > > > > struct swnode *c = to_swnode(child);
> > > > > >
> > > > > > - if (!p || list_empty(&p->children) ||
> > > > > > - (c && list_is_last(&c->entry, &p->children))) {
> > > > > > - fwnode_handle_put(child);
> > > > >
> > > > > Wouldn't be better to use swnode_get() / swnode_put() instead?
> > > > > *Yes, we might need to add some NULL checks there.
> > > >
> > > > It's not newly added by me. The software_node_get_next_child() has been using
> > > > fwnode_handle_get() / fwnode_handle_put() before. In my opinion, this should
> > > > be fine since they do the same thing here for a swnode.
> > >
> > > It doesn't matter who added that. But according to the point of this patch
> > > (correct me if I am wrong) is to avoid bumping or dropping reference count for
> > > the nodes that are *not* of swnode type. Moving away from fwnode_handle_*()
> > > loop we make the point clear.
> >
> > Yes.
> >
> > > See the of_get_next_status_child() implementation, it does *not* use
> > > fwnode_handle_*() at all. So, making it here to use same approach should
> > > fix your issue, no?
> >
> > You are right. I had also noticed this before. Actually, the difference between
> > OF node and swnode is that OF node uses to_of_node() to filter out non-OF type
> > fwnodes. Similarly, swnode uses to_swnode() to filter out non-swnode type fwnodes.
> > So replace fwnode_handle_get() / fwnode_handle_put() with software_node_get() /
> > software_node_put() does fix the issue.
> >
> > When I reviewed patch #1 again, I found it already fixes the refcount leak issue
> > because when it switches to the secondary fwnode, it no longer passes the primary
> > child to secondary fwnode. So the patch #1 is not needed anymore. I will remove
> > it in v3.
>
> I'm lost in here. My expectation that patch 1 should fix the issue as it won't
> let the fwnode_handle_*() be called against wrong type of fwnode. What did I
> miss?

Sorry, I meant "When I reviewed patch #2 again, ..."

Let me clarify the issues here, patch #1 fixes refcount leak issue and patch #2 fixes
the infinite loop issue. Although replacing fwnode_handle_*() with software_node_*()
in patch #1 can fix refcount issue in another way, it can not fix the infinite loop
issue. So patch #2 is still required. When patch #2 changes:

return fwnode_call_ptr_op(fwnode->secondary, get_next_child_node, child);
|
to |
v
return fwnode_call_ptr_op(parent->secondary, get_next_child_node, NULL);

the secondary fwnode will no longer deal with primary fwnode's child. So patch #2 has
already fixed the refcount leak issue. Therefore, patch #1 can be removed.

Thanks,
Xu Yang