Re: [PATCH] RDMA/rxe: Replace use of system_unbound_wq with system_dfl_wq
From: Leon Romanovsky
Date: Tue Mar 17 2026 - 15:04:50 EST
On Tue, Mar 17, 2026 at 10:24:11AM -0700, Yanjun.Zhu wrote:
>
> On 3/17/26 7:38 AM, Zhu Yanjun wrote:
> > On 2026/3/16 13:13, Leon Romanovsky wrote:
> > > On Fri, Mar 13, 2026 at 04:40:23PM +0100, Marco Crivellari wrote:
> > > > This patch continues the effort to refactor workqueue APIs,
> > > > which has begun
> > > > with the changes introducing new workqueues and a new
> > > > alloc_workqueue flag:
> > > >
> > > > commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and
> > > > system_dfl_wq")
> > > > commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
> > > >
> > > > The point of the refactoring is to eventually alter the default
> > > > behavior of
> > > > workqueues to become unbound by default so that their workload
> > > > placement is
> > > > optimized by the scheduler.
> > > >
> > > > Before that can happen, workqueue users must be converted to the
> > > > better named
> > > > new workqueues with no intended behaviour changes:
> > > >
> > > > system_wq -> system_percpu_wq
> > > > system_unbound_wq -> system_dfl_wq
> > > >
> > > > This way the old obsolete workqueues (system_wq,
> > > > system_unbound_wq) can be
> > > > removed in the future.
> > >
> > > I recall earlier efforts to replace system workqueues with
> > > per‑driver queues,
> > > because unloading a driver forces a flush of the entire system
> > > workqueue,
> > > which is undesirable for overall system behavior.
> > >
> > > Wouldn't it be better to introduce a local workqueue here and use
> > > that instead?
> >
> > Thanks.
> >
> > 1. The initialization should be:
> >
> > my_wq = alloc_workqueue("my_driver_queue", WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
> > if (!my_wq)
> > 	return -ENOMEM;
> >
> > 2. The submission should be:
> >
> > queue_work(my_wq, &my_work);
> >
> > 3. Destroy should be:
> >
> > destroy_workqueue(my_wq);
> >
> > Thanks,
> > Zhu Yanjun
>
> Hi, Leon
>
> The diff for a new work queue in rxe is as below. Please review it.

I'm not sure that you need a second workqueue, and destroy_workqueue()
already does flush_workqueue(). There is no need to call it explicitly.

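
As an illustration only, here is a small userspace model of that shape, not
kernel code. The struct layouts, the WQ_* values and the bodies of
alloc_workqueue(), queue_work(), drain_workqueue() and destroy_workqueue()
are simplified stand-ins so that the sketch builds on its own, and
rxe_queue_work() and prefetch_work_fn() are hypothetical names. It also
assumes the comment means "reuse rxe_wq". The only kernel behaviour it
relies on is that destroy_workqueue() drains pending work before freeing
the queue:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for <linux/workqueue.h>; not the kernel types. */
#define WQ_UNBOUND	0x2	/* placeholder value, ignored by this model */
#define WQ_MAX_ACTIVE	512	/* placeholder value, ignored by this model */

struct work_struct {
	void (*func)(struct work_struct *work);
	struct work_struct *next;
};

struct workqueue_struct {
	struct work_struct *head;
	struct work_struct **tail;
};

static struct workqueue_struct *alloc_workqueue(const char *name,
						unsigned int flags,
						int max_active)
{
	struct workqueue_struct *wq = calloc(1, sizeof(*wq));

	(void)name; (void)flags; (void)max_active;
	if (wq)
		wq->tail = &wq->head;
	return wq;
}

static void queue_work(struct workqueue_struct *wq, struct work_struct *work)
{
	work->next = NULL;
	*wq->tail = work;
	wq->tail = &work->next;
}

/* Model of the drain step: run every pending item in FIFO order. */
static void drain_workqueue(struct workqueue_struct *wq)
{
	while (wq->head) {
		struct work_struct *work = wq->head;

		wq->head = work->next;
		if (!wq->head)
			wq->tail = &wq->head;
		work->func(work);
	}
}

/* Drains first, then frees, so callers need no separate flush. */
static void destroy_workqueue(struct workqueue_struct *wq)
{
	drain_workqueue(wq);
	free(wq);
}

/* Driver side: one queue shared by rxe tasks and the ODP prefetch work. */
static struct workqueue_struct *rxe_wq;

int rxe_alloc_wq(void)
{
	rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
	if (!rxe_wq)
		return -ENOMEM;
	return 0;
}

/* Hypothetical helper: queues non-task work on the same rxe_wq. */
void rxe_queue_work(struct work_struct *work)
{
	assert(rxe_wq);
	queue_work(rxe_wq, work);
}

void rxe_destroy_wq(void)
{
	destroy_workqueue(rxe_wq);	/* no explicit flush_workqueue() */
	rxe_wq = NULL;
}

/* Demo work item: counts how many times it ran. */
static int prefetch_runs;

static void prefetch_work_fn(struct work_struct *work)
{
	(void)work;
	prefetch_runs++;
}
```

In this model, queueing two items with rxe_queue_work() and then calling
rxe_destroy_wq() runs both before the queue is freed.
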
Thanks
>
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c b/drivers/infiniband/sw/rxe/rxe_odp.c
> index bc11b1ec59ac..03199fef47fb 100644
> --- a/drivers/infiniband/sw/rxe/rxe_odp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_odp.c
> @@ -545,7 +545,7 @@ static int rxe_ib_advise_mr_prefetch(struct ib_pd *ibpd,
>  		work->frags[i].mr = mr;
>  	}
>
> -	queue_work(system_unbound_wq, &work->work);
> +	rxe_queue_aux_work(&work->work);
>
>  	return 0;
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
> index f522820b950c..a2da699b969e 100644
> --- a/drivers/infiniband/sw/rxe/rxe_task.c
> +++ b/drivers/infiniband/sw/rxe/rxe_task.c
> @@ -6,19 +6,36 @@
>
>  #include "rxe.h"
>
> +/* work for rxe_task */
>  static struct workqueue_struct *rxe_wq;
>
> +/* work for other rxe jobs */
> +static struct workqueue_struct *rxe_aux_wq;
> +
>  int rxe_alloc_wq(void)
>  {
> -	rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
> +	rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND | WQ_MEM_RECLAIM,
> +				 WQ_MAX_ACTIVE);
>  	if (!rxe_wq)
>  		return -ENOMEM;
>
> +	rxe_aux_wq = alloc_workqueue("rxe_aux_wq",
> +				     WQ_UNBOUND | WQ_MEM_RECLAIM, WQ_MAX_ACTIVE);
> +	if (!rxe_aux_wq) {
> +		destroy_workqueue(rxe_wq);
> +		return -ENOMEM;
> +
> +	}
> +
>  	return 0;
>  }
>
>  void rxe_destroy_wq(void)
>  {
> +	flush_workqueue(rxe_aux_wq);
> +	destroy_workqueue(rxe_aux_wq);
> +
> +	flush_workqueue(rxe_wq);
>  	destroy_workqueue(rxe_wq);
>  }
>
> @@ -254,6 +271,14 @@ void rxe_sched_task(struct rxe_task *task)
>  	spin_unlock_irqrestore(&task->lock, flags);
>  }
>
> +/* rxe_wq for rxe tasks. rxe_aux_wq for other rxe jobs.
> + */
> +void rxe_queue_aux_work(struct work_struct *work)
> +{
> +	WARN_ON_ONCE(!rxe_aux_wq);
> +	queue_work(rxe_aux_wq, work);
> +}
> +
>  /* rxe_disable/enable_task are only called from
>   * rxe_modify_qp in process context. Task is moved
>   * to the drained state by do_task.
> diff --git a/drivers/infiniband/sw/rxe/rxe_task.h b/drivers/infiniband/sw/rxe/rxe_task.h
> index a8c9a77b6027..e1c0a34808b4 100644
> --- a/drivers/infiniband/sw/rxe/rxe_task.h
> +++ b/drivers/infiniband/sw/rxe/rxe_task.h
> @@ -36,6 +36,7 @@ int rxe_alloc_wq(void);
>
>  void rxe_destroy_wq(void);
>
> +void rxe_queue_aux_work(struct work_struct *work);
>  /*
>   * init rxe_task structure
>   * qp => parameter to pass to func
>
> Zhu Yanjun
>
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Link: https://lore.kernel.org/all/20250221112003.1dSuoGyc@xxxxxxxxxxxxx/
> > > > Suggested-by: Tejun Heo <tj@xxxxxxxxxx>
> > > > Signed-off-by: Marco Crivellari <marco.crivellari@xxxxxxxx>
> > > > ---
> > > > drivers/infiniband/sw/rxe/rxe_odp.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c b/drivers/infiniband/sw/rxe/rxe_odp.c
> > > > index bc11b1ec59ac..d440c8cbaea5 100644
> > > > --- a/drivers/infiniband/sw/rxe/rxe_odp.c
> > > > +++ b/drivers/infiniband/sw/rxe/rxe_odp.c
> > > > @@ -545,7 +545,7 @@ static int rxe_ib_advise_mr_prefetch(struct ib_pd *ibpd,
> > > >  		work->frags[i].mr = mr;
> > > >  	}
> > > >
> > > > -	queue_work(system_unbound_wq, &work->work);
> > > > +	queue_work(system_dfl_wq, &work->work);
> > > >
> > > >  	return 0;
> > > > --
> > > > 2.53.0
> > > >
> >
>