Re: [PATCH v18 net-next 8/8] octeontx2-af: npc: cn20k: Allocate npc_priv and dstats dynamically.

From: Ratheesh Kannoth

Date: Wed Jun 03 2026 - 03:24:52 EST


On 2026-06-02 at 11:33:59, Ratheesh Kannoth (rkannoth@xxxxxxxxxxx) wrote:
> Replace the file-scope static npc_priv with a kcalloc'd struct filled
> from hardware bank/subbank geometry at init (num_banks is no longer a
> const compile-time constant; drop init_done and use a non-NULL
> npc_priv pointer for liveness). Thread npc_priv_get() / pointer access
> through the CN20K NPC code paths, extend teardown to kfree the root
> struct on failure and in npc_cn20k_deinit, and adjust MCAM section
> setup to use the discovered subbank count.
>
> Allocate MCAM debugfs dstats via devm_kzalloc instead of a static matrix,
> and use the allocated backing store consistently when computing deltas
> (including the counter rollover compare).
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@xxxxxxxxxxx>

https://sashiko.dev/#/patchset/20260602060359.1894952-1-rkannoth%40marvell.com says

>>
>> + dstats = devm_kzalloc(rvu->dev, sizeof(*dstats), GFP_KERNEL);
>> + if (!dstats)
>> + return -ENOMEM;
>> +
>Since dstats remains a global static pointer, won't multiple RVU devices
>probing cause this to be unconditionally overwritten, leaking the previously
>allocated memory?
>
There is only one admin-function PCI device per system. Will post a patch to net-next (after this series is merged)to reject any additional AF probe
with -EBUSY so the driver model matches hardware and automated reviewers can rely on a single bound instance.

>Additionally, because this uses devm_kzalloc, if a device is unbound, won't
>devres free the memory while leaving the global pointer dangling? Could this
>lead to a use-after-free for other active devices that attempt to read from
>the debugfs file?

There is only one AF device per system, there is no support unbind AF device, so it is already attached.

>> - bw = bitmap_weight(bmap, npc_priv.subbank_depth);
>> - bfree = npc_priv.subbank_depth - bw;
>> + bw = bitmap_weight(bmap, npc_priv->subbank_depth);
>> + bfree = npc_priv->subbank_depth - bw;
>>
>> if (!bfree) {
>> dev_dbg(rvu->dev, "%s: subbank is full\n", __func__);
>This is a pre-existing issue, but does __npc_subbank_alloc() properly verify
>the bounds of the mcam_idx buffer before writing to it?
Pre-existing issue, will add bound check as part of hardening series, after the patch series merged.

> - npc_priv.subbank_depth = subbank_depth;
>> + npc_priv = kcalloc(1, sizeof(*npc_priv), GFP_KERNEL);
>> + if (!npc_priv)
>> + return -ENOMEM;
>> +
>Since npc_priv remains a file-scope global pointer, if multiple PCI physical
>functions probe, won't the second device initialization overwrite this pointer
>and permanently leak the memory allocated by the first device? Should this
>state be encapsulated within the per-device private data structure (such as
>struct rvu) instead?

THere is only one AF device, so only one npc_priv_t instance.

> + kfree(npc_priv->sb);
>> kfree(subbank_srch_order);
>> - bitmap_clear(npc_priv.en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
>> + bitmap_clear(npc_priv->en_map, 0, MAX_NUM_BANKS * MAX_NUM_SUB_BANKS *
>> MAX_SUBBANK_DEPTH);
>> + npc_defrag_list_clear();
>> + kfree(npc_priv);
>> + npc_priv = NULL;
>> }
>If one device is torn down and calls npc_cn20k_deinit(), it explicitly frees
>the global npc_priv pointer and sets it to NULL. Won't this immediately break
>any other active RVU devices that are concurrently using this shared global
>pointer, leading to use-after-free or NULL pointer crashes?
>
THere is only one AF device, so only one npc_priv_t instance.