Re: [PATCH] drm/bridge: Fix refcount shown via debugfs for encoder_bridges_show()
From: Liu Ying
Date: Mon Mar 16 2026 - 05:57:26 EST
On Fri, Mar 13, 2026 at 06:32:46PM +0100, Luca Ceresoli wrote:
> On Fri Mar 13, 2026 at 11:22 AM CET, Liu Ying wrote:
>> On Fri, Mar 13, 2026 at 10:57:04AM +0100, Luca Ceresoli wrote:
>>> On Fri Mar 13, 2026 at 9:33 AM CET, Liu Ying wrote:
>>>> On Thu, Mar 12, 2026 at 06:30:22PM +0100, Luca Ceresoli wrote:
>>>>> Hello Liu, Maxime,
>>>>>
>>>>> On Thu Mar 12, 2026 at 7:05 AM CET, Liu Ying wrote:
>>>>>> A typical bridge refcount value is 3 after a bridge chain is formed:
>>>>>> - devm_drm_bridge_alloc() initializes the refcount value to be 1.
>>>>>> - drm_bridge_add() gets an additional reference hence 2.
>>>>>> - drm_bridge_attach() gets the third reference hence 3.
>>>>>>
>>>>>> This typical refcount value aligns with allbridges_show()'s behaviour.
>>>>>> However, since encoder_bridges_show() uses
>>>>>> drm_for_each_bridge_in_chain_scoped() to automatically get/put the
>>>>>> bridge reference while iterating, a bogus reference is accidentally
>>>>>> got when showing the wrong typical refcount value as 4 to users via
>>>>>> debugfs. Fix this by caching the refcount value returned from
>>>>>> kref_read() while iterating and explicitly decreasing the cached
>>>>>> refcount value by 1 before showing it to users.
>>>>>
>>>>> Good point, indeed the refcount shown by
>>>>> <debugfs>/dri/<card>/encoder-0/bridges is by one unit higher than the one
>>>>> shown in <debugfs>/dri/bridges. I understand it's puzzling from a debugfs
>>>>> user point of view.
>>>>>
>>>>> As you noticed, this is because the _scoped loop holds an extra ref on the
>>>>> current bridge.
>>>>>
>>>>> For other reasons I proposed a mutex for stronger protection around the
>>>>> bridge chain [v2]. With the mutex the extra ref is redundant, so in [v2]
>>>>> the extra ref is removed, thus making your patch unneeded. However Maxime
>>>>> asked to keep the extra ref, and so my latest iteration [v4] still has the
>>>>> extra ref.
>>>>>
>>>>> That series is still on the mailing list, we are still in time to rediscuss
>>>>> it.
>>>>>
>>>>> @Maxime: based on the issue Liu is trying to work around, do you think it
>>>>> would make sense to go back to the initial approach for that series?
>>>>> I.e. drm_for_each_bridge_in_chain_scoped() grabs the chain lock, which is a
>>>>> superset of the per-bridge refcount, and thus the refcount can be dropped?
>>>>> This would remove the debugfs issue, slightly simplify
>>>>> drm_for_each_bridge_in_chain_scoped(), and introduce no new issues AFAIK.
>>>>
>>>> Just my take on the chain lock approach - I agree Maxime's comment on [v2]
>>>> that keeping the get/put is a better than using the chain lock to ensure
>>>> the refcount is correct. The chain lock could be added later on if needed.
>>>
>>> Well, no, adding the chain mutex is necessary(*), otherwise Thread A could
>>> iterate over the chain while thread B is adding/removing bridges to/from
>>> the chain.
>>>
>>> And the chain mutex is a superset of the per-bridge refcount, so when
>>> adding the mutex the refcount inside drm_for_each_bridge_in_chain_scoped()
>>> becomes useless (and slightly hurting as it makes the refcount shown in
>>> debugfs inconsistent, as you noticed).
>>
>> For better code readability, I think keeping the get/put is fine even if
>> you add a lock
>
> The [v4] code with the removal of the extra refcount would not be more
> complex. It would be a bit less code (no need for the DEFINE_FREE and
> __free()). Maybe it'd need an extra comment to clarify when the
> drm_bridge_put() is called.
>
> [v4] https://lore.kernel.org/all/20260113-drm-bridge-alloc-encoder-chain-mutex-v4-4-60f3135adc45@xxxxxxxxxxx/
>
>> (maybe RCU list is better than mutex, since the chain is
>> read often). That follows the idea that you mentioned in [1]: "every
>> pointer to a drm_bridge stored somewhere is a reference to a bridge".
>
> That's true. However while it's an important pointer hygiene rule for
> device drivers, for core code it's OK to deviate when there is a reason.
>
>> Plus, seems no performance issue with the get/put, as discussed in [v2].
>
> I confirm performance is surely not an issue here.
>
> All that said, I'm OK with either option:
>
> * no ref taken when the mutex is added
> * ref taken when the mutex is added (as v4) + your patch to fix debugfs
Maybe consider to take this patch first, since it doesn't hurt. Even if
we end up with the first option, the refcount is supposed to be correct
anyway.
Luca, do you think this patch deserves at least an A-b tag from you if not
a R-b tag?
>
> Luca
>
> --
> Luca Ceresoli, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com/
--
Regards,
Liu Ying