Re: [PATCH] RDMA/rxe: Replace use of system_unbound_wq with system_dfl_wq

From: Leon Romanovsky

Date: Tue Mar 17 2026 - 15:04:50 EST


On Tue, Mar 17, 2026 at 10:24:11AM -0700, Yanjun.Zhu wrote:
>
> On 3/17/26 7:38 AM, Zhu Yanjun wrote:
> > On 2026/3/16 13:13, Leon Romanovsky wrote:
> > > On Fri, Mar 13, 2026 at 04:40:23PM +0100, Marco Crivellari wrote:
> > > > This patch continues the effort to refactor workqueue APIs,
> > > > which has begun
> > > > with the changes introducing new workqueues and a new
> > > > alloc_workqueue flag:
> > > >
> > > >     commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and
> > > > system_dfl_wq")
> > > >     commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
> > > >
> > > > The point of the refactoring is to eventually alter the default
> > > > behavior of
> > > > workqueues to become unbound by default so that their workload
> > > > placement is
> > > > optimized by the scheduler.
> > > >
> > > > Before that can happen, workqueue users must be converted to the
> > > > better-named new workqueues with no intended behaviour changes:
> > > >
> > > >     system_wq -> system_percpu_wq
> > > >     system_unbound_wq -> system_dfl_wq
> > > >
> > > > This way the old obsolete workqueues (system_wq,
> > > > system_unbound_wq) can be
> > > > removed in the future.
> > >
> > > I recall earlier efforts to replace system workqueues with
> > > per‑driver queues,
> > > because unloading a driver forces a flush of the entire system
> > > workqueue,
> > > which is undesirable for overall system behavior.
> > >
> > > Wouldn't it be better to introduce a local workqueue here and use
> > > that instead?
> >
> > Thanks.
> >
> > 1. The initialization should be:
> >
> > my_wq = alloc_workqueue("my_driver_queue", WQ_UNBOUND | WQ_MEM_RECLAIM,
> > 0);
> > if (!my_wq)
> >     return -ENOMEM;
> >
> > 2. The submission should be:
> >
> > queue_work(my_wq, &my_work);
> >
> > 3. Destroy should be:
> >
> > destroy_workqueue(my_wq);
> >
> > Thanks,
> > Zhu Yanjun
>
> Hi, Leon
>
> The diff for a new work queue in rxe is as below. Please review it.

I'm not sure that you need a second workqueue. Also, destroy_workqueue()
already flushes the queue, so there is no need to call flush_workqueue()
explicitly.
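
To illustrate the destroy-implies-drain point, here is a minimal userspace
sketch. It is not kernel code: toy_wq, toy_queue_work(), toy_drain(),
toy_destroy_wq() and toy_bump() are hypothetical names made up for this
example, and the model is single-threaded, so it only shows the ordering
(pending work runs before the queue is freed), not the real workqueue
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_WQ_CAPACITY 16

typedef void (*toy_work_fn)(void *arg);

/* One queued item: a callback plus its argument. */
struct toy_work {
    toy_work_fn fn;
    void *arg;
};

/* Hypothetical stand-in for a workqueue: a fixed array of pending work. */
struct toy_wq {
    struct toy_work pending[TOY_WQ_CAPACITY];
    size_t count;
};

struct toy_wq *toy_alloc_wq(void)
{
    return calloc(1, sizeof(struct toy_wq));
}

/* Returns 1 if the work was queued, 0 if the toy queue is full. */
int toy_queue_work(struct toy_wq *wq, toy_work_fn fn, void *arg)
{
    if (wq->count >= TOY_WQ_CAPACITY)
        return 0;
    wq->pending[wq->count].fn = fn;
    wq->pending[wq->count].arg = arg;
    wq->count++;
    return 1;
}

/* Run everything that is still pending, in submission order. */
void toy_drain(struct toy_wq *wq)
{
    for (size_t i = 0; i < wq->count; i++)
        wq->pending[i].fn(wq->pending[i].arg);
    wq->count = 0;
}

/*
 * Destroy drains first, then frees. A caller that drains or flushes
 * immediately before calling this gains nothing.
 */
void toy_destroy_wq(struct toy_wq *wq)
{
    toy_drain(wq);
    assert(wq->count == 0);
    free(wq);
}

/* Example work item: increments the int that arg points to. */
void toy_bump(void *arg)
{
    (*(int *)arg)++;
}
```

In this model, queueing toy_bump twice and then calling only
toy_destroy_wq() still runs both items, which is the reason the explicit
flush in rxe_destroy_wq() looks redundant.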

Thanks

>
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c
> b/drivers/infiniband/sw/rxe/rxe_odp.c
> index bc11b1ec59ac..03199fef47fb 100644
> --- a/drivers/infiniband/sw/rxe/rxe_odp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_odp.c
> @@ -545,7 +545,7 @@ static int rxe_ib_advise_mr_prefetch(struct ib_pd *ibpd,
>          work->frags[i].mr = mr;
>      }
>
> -    queue_work(system_unbound_wq, &work->work);
> +    rxe_queue_aux_work(&work->work);
>
>      return 0;
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_task.c
> b/drivers/infiniband/sw/rxe/rxe_task.c
> index f522820b950c..a2da699b969e 100644
> --- a/drivers/infiniband/sw/rxe/rxe_task.c
> +++ b/drivers/infiniband/sw/rxe/rxe_task.c
> @@ -6,19 +6,36 @@
>
>  #include "rxe.h"
>
> +/* work for rxe_task */
>  static struct workqueue_struct *rxe_wq;
>
> +/* work for other rxe jobs */
> +static struct workqueue_struct *rxe_aux_wq;
> +
>  int rxe_alloc_wq(void)
>  {
> -    rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND, WQ_MAX_ACTIVE);
> +    rxe_wq = alloc_workqueue("rxe_wq", WQ_UNBOUND | WQ_MEM_RECLAIM,
> +                WQ_MAX_ACTIVE);
>      if (!rxe_wq)
>          return -ENOMEM;
>
> +    rxe_aux_wq = alloc_workqueue("rxe_aux_wq",
> +                WQ_UNBOUND | WQ_MEM_RECLAIM, WQ_MAX_ACTIVE);
> +    if (!rxe_aux_wq) {
> +        destroy_workqueue(rxe_wq);
> +        return -ENOMEM;
> +
> +    }
> +
>      return 0;
>  }
>
>  void rxe_destroy_wq(void)
>  {
> +    flush_workqueue(rxe_aux_wq);
> +    destroy_workqueue(rxe_aux_wq);
> +
> +    flush_workqueue(rxe_wq);
>      destroy_workqueue(rxe_wq);
>  }
>
> @@ -254,6 +271,14 @@ void rxe_sched_task(struct rxe_task *task)
>      spin_unlock_irqrestore(&task->lock, flags);
>  }
>
> +/* rxe_wq for rxe tasks. rxe_aux_wq for other rxe jobs.
> + */
> +void rxe_queue_aux_work(struct work_struct *work)
> +{
> +    WARN_ON_ONCE(!rxe_aux_wq);
> +    queue_work(rxe_aux_wq, work);
> +}
> +
>  /* rxe_disable/enable_task are only called from
>   * rxe_modify_qp in process context. Task is moved
>   * to the drained state by do_task.
> diff --git a/drivers/infiniband/sw/rxe/rxe_task.h
> b/drivers/infiniband/sw/rxe/rxe_task.h
> index a8c9a77b6027..e1c0a34808b4 100644
> --- a/drivers/infiniband/sw/rxe/rxe_task.h
> +++ b/drivers/infiniband/sw/rxe/rxe_task.h
> @@ -36,6 +36,7 @@ int rxe_alloc_wq(void);
>
>  void rxe_destroy_wq(void);
>
> +void rxe_queue_aux_work(struct work_struct *work);
>  /*
>   * init rxe_task structure
>   *    qp  => parameter to pass to func
>
> Zhu Yanjun
>
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Link:
> > > > https://lore.kernel.org/all/20250221112003.1dSuoGyc@xxxxxxxxxxxxx/
> > > > Suggested-by: Tejun Heo <tj@xxxxxxxxxx>
> > > > Signed-off-by: Marco Crivellari <marco.crivellari@xxxxxxxx>
> > > > ---
> > > >   drivers/infiniband/sw/rxe/rxe_odp.c | 2 +-
> > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/infiniband/sw/rxe/rxe_odp.c
> > > > b/drivers/infiniband/sw/rxe/rxe_odp.c
> > > > index bc11b1ec59ac..d440c8cbaea5 100644
> > > > --- a/drivers/infiniband/sw/rxe/rxe_odp.c
> > > > +++ b/drivers/infiniband/sw/rxe/rxe_odp.c
> > > > @@ -545,7 +545,7 @@ static int rxe_ib_advise_mr_prefetch(struct
> > > > ib_pd *ibpd,
> > > >           work->frags[i].mr = mr;
> > > >       }
> > > > > -    queue_work(system_unbound_wq, &work->work);
> > > > > +    queue_work(system_dfl_wq, &work->work);
> > > > >
> > > > >      return 0;
> > > > > --
> > > > 2.53.0
> > > >
> >
>