Re: [RFC PATCH] driver core: Don't link the device to the bus until we're ready to probe

From: Doug Anderson

Date: Thu Mar 26 2026 - 17:50:12 EST


Hi,

On Tue, Mar 24, 2026 at 8:21 AM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>
> > I'll do that if that's what everyone wants, but the more I think about
> > it the more worried I am that we'll end up with a hidden / harder to
> > debug problem where some driver gets unhappy when its probe is called
> > before dpm_sysfs_add(), device_pm_add(), device_create_file(),
> > device_create_sys_dev_entry(), BUS_NOTIFY_ADD_DEVICE, ...
>
> It's hard to know for all of them. However, it seems pretty clear that
> device_pm_add() should come before probing, since a probe routine will
> generally want to affect the device's runtime PM state.

Yup, that seems right to me, too. It's why I was trying to avoid just
moving the fwdevlink assignment. I didn't want to run into more
hard-to-debug issues later.


> > > There should not be any difference between probing caused by the device
> > > being added to the bus, vs. caused by a new driver being registered, vs.
> > > caused by anything else (such as sysfs). None of these should be
> > > allowed until all of them can be handled properly.
> >
> > Right. ...and I think that's what my proposed "ready_to_probe" does.
> > It really does seem like quite a safe change. It _just_ prevents the
> > driver load path from initiating a probe too early.
>
> Any such consideration should apply to all the probe paths, not just
> driver loading. (Also, if it's too early to probe the device, perhaps
> the return code should be -EAGAIN instead of 0.)

In my proposed solution, I was returning 0 from __driver_attach(). The
only place that's called from is driver_attach(), which calls it via
bus_for_each_dev(). I don't think returning -EAGAIN is a good idea
there, since any non-zero return stops the bus_for_each_dev() walk and
the remaining devices on the bus would be skipped. As far as I can
tell, __driver_attach() always returns 0 today.
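
To make the bus_for_each_dev() point concrete, here's a rough userspace
sketch (not kernel code). The toy_* names and the "ready_to_probe"
field are stand-ins made up for illustration; the only thing taken
from the real API is the rule that the walk stops at the first
non-zero return from the callback.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for struct device; only what the sketch needs. */
struct toy_device {
	int ready_to_probe;	/* hypothetical flag, as in the proposal */
	int visited;		/* set once the callback has seen the device */
};

/*
 * Mimics the bus_for_each_dev() contract: call fn on each device and
 * stop at the first non-zero return value, which is passed back.
 */
static int toy_for_each_dev(struct toy_device *devs, size_t n,
			    int (*fn)(struct toy_device *))
{
	int error = 0;
	size_t i;

	for (i = 0; i < n && !error; i++)
		error = fn(&devs[i]);

	return error;
}

/* Callback that silently skips a not-yet-ready device. */
static int toy_attach_skip(struct toy_device *dev)
{
	dev->visited = 1;
	return 0;
}

/* Callback that reports -EAGAIN for a not-yet-ready device. */
static int toy_attach_eagain(struct toy_device *dev)
{
	dev->visited = 1;
	return dev->ready_to_probe ? 0 : -EAGAIN;
}
```

With three devices where only the middle one isn't ready, the
toy_attach_skip() walk reaches the third device, while the
toy_attach_eagain() walk stops at the second and never looks at the
third.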

The goal of my new proposed patch is to add the device to the
subsystem's "klist_devices" exactly where we do it today, for maximum
compatibility. This means that any code relying on being able to find
the device can still find it. The _only_ exception is that I don't
want driver_attach() to be able to find the device before it's ready
to probe. So my proposed solution just hides the device in that one
case.

I believe this should be fine. Specifically, driver_attach() could
have been called (in another thread) immediately before
bus_add_device() and everything would have been fine. driver_attach()
wouldn't have found the device (because it wasn't linked in) but the
probe would still happen.
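
Here's a toy userspace model of that argument (not kernel code, and
not the actual patch). All sketch_* names are hypothetical, and where
exactly "ready_to_probe" gets set is an assumption of the sketch. It
only aims to show that a driver-side attach arriving before the device
is linked and one arriving while the device is linked but hidden look
the same to the driver, and that the device-side probe at the end of
the add path still binds the device.

```c
#include <stdbool.h>

/* Toy stand-in for struct device. */
struct sketch_dev {
	bool on_bus_list;	/* linked into the bus's device list */
	bool ready_to_probe;	/* hypothetical flag from the proposal */
	bool bound;		/* a driver has been bound */
};

/*
 * Driver-side path, loosely modeled on __driver_attach(): it never
 * fails, it just does nothing if it can't (or shouldn't) see the device.
 */
static int sketch_driver_attach(struct sketch_dev *dev)
{
	if (!dev->on_bus_list || !dev->ready_to_probe)
		return 0;	/* device not found, or hidden for now */

	dev->bound = true;
	return 0;
}

/* Early part of the add path: link the device into the bus list. */
static void sketch_add_begin(struct sketch_dev *dev)
{
	dev->on_bus_list = true;
}

/* End of the add path: mark the device ready, then probe it. */
static void sketch_add_finish(struct sketch_dev *dev)
{
	dev->ready_to_probe = true;
	dev->bound = true;	/* stands in for the device-side probe */
}
```

Calling sketch_driver_attach() before sketch_add_begin(), or between
sketch_add_begin() and sketch_add_finish(), leaves the device unbound
in both cases, and sketch_add_finish() binds it either way.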


> I'm not at all sure whether the constraints we've got will need to force
> some events to happen after adding the device to the bus list and before
> allowing probing to start.
>
> > > And linking the device into the bus's list of devices should be the
> > > event that makes probing possible.
> >
> > Sure, but moving the linking into the bus's list of devices all the
> > way to the end is definitely a bigger change. If nothing else,
> > "bus_for_each_dev()" starts to be able to find the device once it's
> > linked into the list. If any of the ~50 drivers who register for
> > BUS_NOTIFY_ADD_DEVICE are relying on the device to show up in
> > "bus_for_each_dev()", it would be bad...
>
> I don't know the answer to this. That is, I don't know if there are any
> notification handlers depending on the device showing up in the bus's
> list. The safest thing to do is issue the notification after adding the
> device to the list -- which may mean after probing has potentially
> started. Is there any reason why that would be a problem? I'm not
> aware of any.

I'm not completely sure I follow what you're suggesting here...


> The order constraints should be commented explicitly in device_add(),
> not just implicitly implied by the code. Otherwise people won't know
> what changes are allowed and what changes are forbidden.

Yup! I added comments about ordering constraints in this RFC patch,
and will continue to do so as it evolves.

I still believe adding a flag that just hides the device from
driver_attach() is a safe and correct approach. In general I don't
want to fragment the discussion, but I think it might be useful to
send a v2 that shows what that looks like. Any objections?

-Doug