Re: [RFC PATCH] driver core: Don't link the device to the bus until we're ready to probe

From: Doug Anderson

Date: Fri Mar 27 2026 - 15:36:43 EST

Hi,

On Fri, Mar 27, 2026 at 11:46 AM Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Mar 26, 2026 at 02:49:33PM -0700, Doug Anderson wrote:
> > > > Right. ...and I think that's what my proposed "ready_to_probe" does.
> > > > It really does seem like quite a safe change. It _just_ prevents the
> > > > driver load path from initiating a probe too early.
> > >
> > > Any such consideration should apply to all the probe paths, not just
> > > driver loading. (Also, if it's too early to probe the device, perhaps
> > > the return code should be -EAGAIN instead of 0.)
> >
> > In my proposed solution, I was returning 0 from __driver_attach(). The
> > only place that's called from is driver_attach(), which calls it with
> > bus_for_each_dev(). I don't think returning -EAGAIN is a good idea
> > there since it stops bus_for_each_dev(). In general __driver_attach()
> > always returns 0.
> >
> > In general, the goal of my new proposed patch is to add the device to
> > the subsystem's "klist_devices" exactly where we do it today for
> > maximum compatibility. This means that if any code was relying on
> > being able to find the device, they can still find it. The _only_
> > exception is that I don't want to be able to find the device in
> > driver_attach(). So my proposed solution just hides the device in that
> > one case.
>
> But why just in that one case? That's what I don't understand. If it's
> not okay to bind at this time on the driver-load path, why is it okay to
> bind on other pathways (such as bus.c:bind_store())?

Ah, I see!

Yeah, OK. I spent more time on this, and I think I have a patch that
will address things. I still like adding the "ready_to_probe" flag and
setting it in device_add() right before bus_probe_device(). ...but
I've changed where I'm testing this flag. Now I've got the test in
__driver_probe_device(), where I simply do:

	/*
	 * In device_add(), the "struct device" gets linked into the subsystem's
	 * list of devices and broadcast to userspace (via uevent) before we're
	 * quite ready to probe. Those open pathways to driver probe before
	 * we've finished enough of device_add() to reliably support probe.
	 * Detect this and tell other pathways to try again later. device_add()
	 * itself will also try to probe immediately after setting
	 * "ready_to_probe".
	 */
	if (!dev->ready_to_probe)
		return dev_err_probe(dev, -EPROBE_DEFER, "Device not ready_to_probe");

I think that is more in line with your intuition that we should return
some sort of "try again" code when we end up in this situation. This
should also block _all_ probe paths safely, either by adding the device
to the deferred probe list (just in case) or by returning -EAGAIN (in
the case of device_driver_attach()).
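
To make the intended behavior concrete, here is a rough userspace model
of the gating, not kernel code. Every mock_* name is hypothetical and
made up for illustration, the "deferred" flag merely stands in for the
real deferred probe list, and the way each path reacts to the deferral
is my reading of the current code rather than anything the patch
guarantees:

```c
#include <errno.h>
#include <stdbool.h>

/* EPROBE_DEFER is kernel-internal (517), so the sketch defines a stand-in. */
#define MOCK_EPROBE_DEFER 517

/* Hypothetical stand-in for the few bits of "struct device" that matter here. */
struct mock_device {
	bool ready_to_probe;	/* set late in device_add() */
	bool deferred;		/* models membership in the deferred probe list */
	bool bound;		/* models a successful probe */
};

/* Models the single check in __driver_probe_device(): every path funnels here. */
static int mock_driver_probe_device(struct mock_device *dev)
{
	if (!dev->ready_to_probe)
		return -MOCK_EPROBE_DEFER;

	dev->bound = true;
	return 0;
}

/*
 * Models the driver-load path (__driver_attach() under bus_for_each_dev()).
 * Assumption: the deferral is recorded and 0 is returned, so the iteration
 * over the remaining devices is not cut short.
 */
static int mock_driver_attach_one(struct mock_device *dev)
{
	if (mock_driver_probe_device(dev) == -MOCK_EPROBE_DEFER)
		dev->deferred = true;

	return 0;
}

/*
 * Models the manual bind path (device_driver_attach()).
 * Assumption: a deferral is reported to the caller as -EAGAIN.
 */
static int mock_device_driver_attach(struct mock_device *dev)
{
	int ret = mock_driver_probe_device(dev);

	return ret == -MOCK_EPROBE_DEFER ? -EAGAIN : ret;
}

/* Models the tail of device_add(): set the flag, then probe right away. */
static int mock_device_add_finish(struct mock_device *dev)
{
	dev->ready_to_probe = true;
	dev->deferred = false;

	return mock_driver_probe_device(dev);
}
```

In this model, a bind attempt that races with device_add() gets -EAGAIN,
a driver load that races with it leaves the device marked as deferred
and carries on, and the device only binds once mock_device_add_finish()
has set the flag.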

Does that sound like what you're looking for?

-Doug