Re: [PATCH v18 net-next 4/8] octeontx2-af: npc: cn20k: add subbank search order control

From: Ratheesh Kannoth

Date: Wed Jun 03 2026 - 22:35:18 EST

On 2026-06-02 at 11:33:55, Ratheesh Kannoth (rkannoth@xxxxxxxxxxx) wrote:
> CN20K NPC MCAM is split into 32 subbanks that are searched in a
> predefined order during allocation. Lower-numbered subbanks have
> higher priority than higher-numbered ones.
>
> Add a runtime "srch_order" to control the order in which
> subbanks are searched during MCAM allocation.
>
> Signed-off-by: Ratheesh Kannoth <rkannoth@xxxxxxxxxxx>

https://netdev-ai.bots.linux.dev/sashiko/#/patchset/20260602060359.1894952-1-rkannoth%40marvell.com says

> @@ -3906,6 +3906,122 @@ static void npc_unlock_all_subbank(void)
>> mutex_unlock(&npc_priv.sb[i].lock);
>> }
>>
>> +int npc_cn20k_search_order_set(struct rvu *rvu,
>> + u64 narr[MAX_NUM_SUB_BANKS], int cnt)
>> +{
>> + struct npc_mcam *mcam = &rvu->hw->mcam;
>> + int rsrc[2][MAX_NUM_SUB_BANKS] = { };
>> + u8 save[MAX_NUM_SUB_BANKS] = { };
>> + struct npc_subbank *sb;
>> + struct xarray *xa;
>> + int prio, rc, err;
>> + int sb_idx;
>The rvu pointer is taken as an argument, but the state being mutated
>(subbank_srch_order, restrict_valid, npc_priv) lives in file-scope statics
>in cn20k/npc.c. The rvu parameter is used only for dev_err() and to grab
>mcam->lock. Devlink parameters are per-device, so what happens on systems
>with more than one CN20K AF instance? It looks like setting npc_srch_order
>on one instance would silently apply to all of them, and a get on any
>instance would return whatever the last writer set globally. Was this
>intended, or should the state be moved under struct rvu (or its hw/npc
>container)?
There is only one AF device per system.

>The save[] array stores sb->arr_idx, but arr_idx is u16 in struct
>npc_subbank while save[] is declared u8. With MAX_NUM_SUB_BANKS == 32
>today the values fit, but is a BUILD_BUG_ON or a u16 declaration
>appropriate so that a future bump of MAX_NUM_SUB_BANKS does not silently
>truncate restored values during rollback? The neighbouring rsrc[][] array
>is correctly typed int.
Maximum subbank available on highend SoC is 32.

>> + for (int i = 0; i < cnt; i++)
>> + subbank_srch_order[i] = (u32)narr[i];
>> +
>> + restrict_valid = false;
>restrict_valid is unconditionally set to false here on every successful
>set, and there is no path elsewhere that sets it back to true. The flag
>gates several other behaviours in cn20k/npc.c:
> - npc_subbank_restrict_usage() returns false once restrict_valid is
> false (disabling the minimization of allocations from restricted
> subbanks).
> - The "Allocate from restricted subbanks" fallback loop is skipped.
> - npc_defrag_skip_restricted_sb() always returns false.
>Is it intended that the first userspace set of npc_srch_order silently
>disables the restricted-subbank usage-minimization feature and the
>restricted-subbank allocation fallback for the remaining lifetime of the
>driver? If so, could that be documented and exposed through the same
>uAPI so userspace can observe and revert it?
This is intentional. Once user change srch orer, "restrict_valid" is invalid.

>> +fail:
>> + for (prio = 0; prio < cnt; prio++) {
>> + if (rsrc[FREE][prio] == 1)
>> + xa_erase(&npc_priv.xa_sb_free, prio);
>> +
>> + if (rsrc[USED][prio] == 1)
>> + xa_erase(&npc_priv.xa_sb_used, prio);
>> + }
>> +
>> + for (sb_idx = 0; sb_idx < cnt; sb_idx++) {
>> + sb = &npc_priv.sb[sb_idx];
>> + sb->arr_idx = save[sb_idx];
>> +
>> + if (sb->flags & NPC_SUBBANK_FLAG_USED)
>> + xa = &npc_priv.xa_sb_used;
>> + else
>> + xa = &npc_priv.xa_sb_free;
>> +
>> + /* Since the entry already exists, xa_store() replaces
>> + * the value without a kmalloc(), making failure highly unlikely.
>> + */
>> + err = xa_err(xa_store(xa, sb->arr_idx,
>> + xa_mk_value(sb->idx), GFP_KERNEL));
>> + WARN(!!err, "Failed to rollback sb=%u idx=%u\n",
>> + sb->idx, sb->arr_idx);
>> + }
>Is the comment above accurate on the failure path?
>The first loop in fail: erases keys where rsrc[FLAG][prio] == 1, which
>are exactly the prio slots written by completed iterations. When the
>rollback loop then writes the saved subbanks back to those keys via
>xa_store(xa, sb->arr_idx, ...), those keys are no longer present, so the
>xa_store has to allocate a new node and is not a simple replace.
>Under memory pressure that allocation can return -ENOMEM, and the only
>response here is WARN().
rsrc[FLAG][prio] == 1 indicate entries which are newly allocated. So rolling back
will delete those. Other slots do exist, so xa_store() wont fail due to memory pressure.

> +static int rvu_af_dl_npc_srch_order_get(struct devlink *devlink, u32 id,
>> + struct devlink_param_gset_ctx *ctx,
>> + struct netlink_ext_ack *extack)
>> +{
>> + bool restricted_order;
>> + const u32 *order;
>> + u32 sz;
>> +
>> + order = npc_cn20k_search_order_get(&restricted_order, &sz);
>> + ctx->val.u64arr.size = sz;
>> + for (int i = 0; i < sz; i++)
>> + ctx->val.u64arr.val[i] = order[i];
>> +
>> + return 0;
>> +}
>The restricted_order out parameter is written here but never read. The
>same is true in rvu_af_dl_npc_srch_order_validate() below. Is the
>intention to expose whether the default (restricted) order or a
>user-provided order is currently active to userspace, perhaps via a
>separate read-only param? As written, the value never leaves the local
>stack, so userspace has no way to learn it.
if default srch_order is modified, restrict_order will be false. No need
to read this value from userspace.