Re: [PATCH] thermal: core: fix use-after-free due to init/cancel delayed_work race
From: Mauricio Faria de Oliveira
Date: Wed Mar 25 2026 - 15:23:04 EST
On 2026-03-25 13:24, Rafael J. Wysocki wrote:
> On Wed, Mar 25, 2026 at 4:13 PM Mauricio Faria de Oliveira
> <mfo@xxxxxxxxxx> wrote:
>>
>> On 2026-03-25 11:28, Mauricio Faria de Oliveira wrote:
>> > On 2026-03-25 11:17, Mauricio Faria de Oliveira wrote:
>> >> Thanks for looking into this.
>> >>
>> >> On 2026-03-25 09:47, Rafael J. Wysocki wrote:
>> >>> I can see the one between thermal_zone_device_unregister() and
>> >>> thermal_zone_device_resume(), but that can be addressed by adding a
>> >>> TZ_STATE_FLAG_EXIT check to the latter AFAICS.
>> >>
>> >
>> > Please disregard this paragraph; I incorrectly read/wrote _resume()
>> > as thermal_zone_pm_complete() discussed above. The rest should be
>> > right. I'll review this and get back shortly.
>> >
>> >> In the example described above and detailed below, apparently that
>> >> is not sufficient, if I'm not missing anything. See, if _resume()
>> >> is reached with thermal_list_lock held, thermal_zone_device_exit()
>> >> is waiting for thermal_list_lock before setting TZ_STATE_FLAG_EXIT,
>> >> thus a check for it in _resume() would still find it clear.
>>
>> Ok, similarly:
>>
>> Say, thermal_pm_notify() -> thermal_pm_notify_complete() ->
>> thermal_zone_pm_complete() run before
>> thermal_zone_device_unregister() is called;
>> thermal_zone_device_resume() starts, and by now
>> thermal_zone_device_unregister() is called.
>>
>> If thermal_zone_device_resume() wins the race over thermal_zone_exit()
>> for guard(thermal_zone)(tz) (tz->lock), it sees TZ_STATE_FLAG_EXIT clear;
>> note its callees (e.g., thermal_zone_device_init()) run with tz->lock
>> held, so they see it clear as well.
>>
>> So, thermal_zone_device_init() calls INIT_DELAYED_WORK(), everything
>> returns, tz->lock is released and the thermal_zone_device_unregister()
>> -> thermal_zone_exit() path can continue to run.
>>
>> Only now thermal_zone_exit() sets TZ_STATE_FLAG_EXIT (too late) and
>> returns. cancel_delayed_work_sync() does not wait for
>> thermal_zone_device_resume(), due to INIT_DELAYED_WORK() in
>> thermal_zone_device_init(); and then kfree(tz) runs.
>>
>> Then, thermal_zone_device_resume() accesses tz and hits use-after-free.
>>
>> Hope this clarifies. Please let me know your thoughts. Thanks!
>
> Thanks for the analysis, it sounds accurate.
>
> I'd say that thermal_zone_device_unregister() needs to flush the
> workqueue before calling cancel_delayed_work_sync() to get rid of the
> stuff that may be running out of it that hasn't seen the changes made
> by thermal_zone_exit().

IIUC, cancel_delayed_work_sync() has that effect: it waits for the
(specific) work that might be running and hasn't seen the changes made
by thermal_zone_exit().

> This should take care of all of the existing races because if anything
> is running out of the workqueue when thermal_zone_device_unregister()
> runs, it will be waited for after calling thermal_zone_exit() and any
> leftover stuff will be caught by cancel_delayed_work_sync().

Likewise, the wait-for part is an effect of cancel_delayed_work_sync(),
and AFAIK, there is no leftover after cancel_delayed_work_sync(), as
it waits for the running work function to finish.

And no further work is queued in the 2 code paths that can queue work:

1) thermal_zone_device_check(): even if it misses the tz->state check,
   mod_delayed_work() does not requeue the current work item if it is
   canceled/waited for by cancel_delayed_work_sync() (tested locally).

2) thermal_zone_pm_complete(): this function will no longer be reached
   because tz is no longer in thermal_tz_list.

> Of course, it's better to switch over to using a dedicated workqueue
> in the thermal core for that.

Considering the points above, AFAICT, it should be sufficient to call
cancel_delayed_work_sync() for the 2 code paths in unregister()
(which thus requires distinct work items for each code path).
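
To illustrate the point about distinct work items, the interleaving
from the quoted analysis can be replayed in a small userspace model.
This is only a sketch, not kernel code: every model_* name and FN_*
constant is hypothetical, and the model assumes that a cancel only
waits for a running handler when that handler still matches the
function the work item is currently initialized with.

```c
#include <assert.h>   /* only needed by callers that assert on the result */
#include <stdbool.h>

/* Hypothetical userspace model; nothing here is a kernel API. */
enum model_fn { FN_NONE, FN_RESUME, FN_CHECK };

struct model_work {
	enum model_fn func;         /* handler set by the last (re)init */
	enum model_fn running_func; /* handler being executed, FN_NONE if idle */
};

struct model_tz {
	bool exit_flag;             /* stands in for TZ_STATE_FLAG_EXIT */
	bool freed;                 /* stands in for kfree(tz) */
	struct model_work poll;     /* polling work item */
	struct model_work resume;   /* used only with distinct work items */
};

/* Stands in for INIT_DELAYED_WORK(): overwrites the handler. */
static void model_init_work(struct model_work *w, enum model_fn fn)
{
	w->func = fn;
}

/*
 * Stands in for cancel_delayed_work_sync(). Model assumption: the
 * running handler is only waited for if it still matches w->func.
 * Returns true if it waited for the handler to finish.
 */
static bool model_cancel_sync(struct model_work *w)
{
	if (w->running_func == FN_NONE || w->running_func != w->func)
		return false;
	w->running_func = FN_NONE;  /* handler ran to completion */
	return true;
}

/*
 * Replays the interleaving. Returns true if tz is freed while the
 * resume handler is still running (the use-after-free case).
 */
bool model_replay_uaf(bool distinct_items)
{
	struct model_tz tz = {0};
	struct model_work *rw = distinct_items ? &tz.resume : &tz.poll;

	/* pm_complete path: resume handler is queued and starts running */
	model_init_work(rw, FN_RESUME);
	rw->running_func = FN_RESUME;

	/* resume wins tz->lock, sees the flag clear, re-inits the poll item */
	if (!tz.exit_flag)
		model_init_work(&tz.poll, FN_CHECK);

	/* exit path gets the lock only now and sets the flag (too late) */
	tz.exit_flag = true;

	/* unregister: cancel every work item, then free */
	model_cancel_sync(&tz.poll);
	if (distinct_items)
		model_cancel_sync(&tz.resume);
	tz.freed = true;

	return tz.freed && rw->running_func != FN_NONE;
}
```

Under these assumptions, model_replay_uaf(false) (one shared work item)
reports the use-after-free, while model_replay_uaf(true) (one work item
per code path, each canceled in unregister) does not.
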
Thanks,
--
Mauricio