Re: [PATCH v2] driver core: Don't let a device probe until it's ready
From: Rafael J. Wysocki
Date: Mon Mar 30 2026 - 11:46:02 EST
On Mon, Mar 30, 2026 at 4:29 PM Douglas Anderson <dianders@xxxxxxxxxxxx> wrote:
>
> The moment we link a "struct device" into the list of devices for the
> bus, it's possible probe can happen. This is because another thread
> can load the driver at any time and that can cause the device to
> probe. This has been seen in practice with a stack crawl that looks
> like this [1]:
>
> really_probe()
> __driver_probe_device()
> driver_probe_device()
> __driver_attach()
> bus_for_each_dev()
> driver_attach()
> bus_add_driver()
> driver_register()
> __platform_driver_register()
> init_module() [some module]
> do_one_initcall()
> do_init_module()
> load_module()
> __arm64_sys_finit_module()
> invoke_syscall()
>
> As a result of the above, it was seen that device_links_driver_bound()
> could be called for the device before "dev->fwnode->dev" was
> assigned. This prevented __fw_devlink_pickup_dangling_consumers() from
> being called which meant that other devices waiting on our driver's
> sub-nodes were stuck deferring forever.
>
> It's believed that this problem is showing up suddenly for two
> reasons:
> 1. Android has recently (last ~1 year) implemented an optimization to
> the order it loads modules [2]. When devices opt-in to this faster
> loading, modules are loaded one-after-the-other very quickly. This
> is unlike how other distributions do it. The reproduction of this
> problem has only been seen on devices that opt-in to Android's
> "parallel module loading".
> 2. Android devices typically opt-in to fw_devlink, and the most
> noticeable issue is the NULL "dev->fwnode->dev" in
> device_links_driver_bound(). fw_devlink is somewhat new code and
> also not in use by all Linux devices.
>
> Even though the specific symptom where "dev->fwnode->dev" wasn't
> assigned could be fixed by moving that assignment higher in
> device_add(), other parts of device_add() (like the call to
> device_pm_add()) are also important to run before probe. Only moving
> the "dev->fwnode->dev" assignment would likely fix the current
> symptoms but lead to difficult-to-debug problems in the future.
>
> Fix the problem by preventing probe until device_add() has run far
> enough that the device is ready to probe. If somehow we end up trying
> to probe before we're allowed, __driver_probe_device() will return
> -EPROBE_DEFER which will make certain the device is noticed.
>
> In the race condition that was seen with Android's faster module
> loading, we will temporarily add the device to the deferred list and
> then take it off immediately when device_add() probes the device.
>
> [1] Captured on a machine running a downstream 6.6 kernel
> [2] https://cs.android.com/android/platform/superproject/main/+/main:system/core/libmodprobe/libmodprobe.cpp?q=LoadModulesParallel
>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: 2023c610dc54 ("Driver core: add new device to bus's list before probing")
> Signed-off-by: Douglas Anderson <dianders@xxxxxxxxxxxx>
Reviewed-by: Rafael J. Wysocki (Intel) <rafael@xxxxxxxxxx>
> ---
> v1: https://lore.kernel.org/r/20260320200656.RFC.1.Id750b0fbcc94f23ed04b7aecabcead688d0d8c17@changeid
>
> This v2 feels like a very safe change. It doesn't change the ordering
> of any steps of probe and it _just_ prevents the early probe from
> happening.
>
> I ran tests where I turned the printout "Device not ready_to_probe" on
> and I could see the printout happening, evidence of the race occurring
> from other printouts, and things successfully being resolved.
>
> Changes in v2:
> - Instead of adjusting the ordering, use "ready_to_probe" flag
>
> drivers/base/core.c | 15 +++++++++++++++
> drivers/base/dd.c | 12 ++++++++++++
> include/linux/device.h | 3 +++
> 3 files changed, 30 insertions(+)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 09b98f02f559..4caa3fd1ecdb 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -3688,6 +3688,21 @@ int device_add(struct device *dev)
> fw_devlink_link_device(dev);
> }
>
> + /*
> + * The moment the device was linked into the bus's "klist_devices" in
> + * bus_add_device() then it's possible that probe could have been
> + * attempted in a different thread via userspace loading a driver
> + * matching the device. "ready_to_probe" being false would have blocked
> + * those attempts. Now that all of the above initialization has
> + * happened, unblock probe. If probe happens through another thread
> + * after this point but before bus_probe_device() runs then it's fine.
> + * bus_probe_device() -> device_initial_probe() -> __device_attach()
> + * will notice (under device_lock) that the device is already bound.
> + */
> + device_lock(dev);
> + dev->ready_to_probe = true;
> + device_unlock(dev);
> +
> bus_probe_device(dev);
>
> /*
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 37c7e54e0e4c..a1762254828f 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -848,6 +848,18 @@ static int __driver_probe_device(const struct device_driver *drv, struct device
> if (dev->driver)
> return -EBUSY;
>
> + /*
> + * In device_add(), the "struct device" gets linked into the subsystem's
> + * list of devices and broadcast to userspace (via uevent) before we're
> + * quite ready to probe. Those open pathways to driver probe before
> + * we've finished enough of device_add() to reliably support probe.
> + * Detect this and tell other pathways to try again later. device_add()
> + * itself will also try to probe immediately after setting
> + * "ready_to_probe".
> + */
> + if (!dev->ready_to_probe)
> + return dev_err_probe(dev, -EPROBE_DEFER, "Device not ready_to_probe");
> +
> dev->can_match = true;
> dev_dbg(dev, "bus: '%s': %s: matched device with driver %s\n",
> drv->bus->name, __func__, drv->name);
> diff --git a/include/linux/device.h b/include/linux/device.h
> index e65d564f01cd..e2f83384b627 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -553,6 +553,8 @@ struct device_physical_location {
> * @dma_skip_sync: DMA sync operations can be skipped for coherent buffers.
> * @dma_iommu: Device is using default IOMMU implementation for DMA and
> * doesn't rely on dma_ops structure.
> + * @ready_to_probe: If set to %true then device_add() has finished enough
> + * initialization that probe could be called.
> *
> * At the lowest level, every device in a Linux system is represented by an
> * instance of struct device. The device structure contains the information
> @@ -675,6 +677,7 @@ struct device {
> #ifdef CONFIG_IOMMU_DMA
> bool dma_iommu:1;
> #endif
> + bool ready_to_probe:1;
> };
>
> /**
> --
> 2.53.0.1018.g2bb0e51243-goog
>