Re: [PATCH v3 0/2] governor: Fix races and stale baseline on prev_cpu_nice

From: Rafael J. Wysocki

Date: Fri May 22 2026 - 11:15:49 EST


On Sun, Apr 19, 2026 at 3:27 PM Zhongqiu Han
<zhongqiu.han@xxxxxxxxxxxxxxxx> wrote:
>
> Patch 1 fixes two baseline-consistency issues in the DBS governor
> framework.
>
> gov_update_cpu_data() writes prev_cpu_idle and prev_cpu_nice while
> holding only attr_set->update_lock, whereas dbs_update() reads and
> writes the same fields while holding only policy_dbs->update_mutex.
> Because these are independent locks, the two paths are not mutually
> exclusive. The fix acquires policy_dbs->update_mutex inside
> gov_update_cpu_data() for each policy.
>
> Additionally, cpufreq_dbs_governor_start() reads io_is_busy before
> acquiring policy_dbs->update_mutex. A concurrent io_is_busy_store()
> can update io_is_busy and call gov_update_cpu_data(), which writes
> prev_cpu_idle with the new value under the mutex.
> cpufreq_dbs_governor_start() then overwrites prev_cpu_idle with the
> stale value. The fix reads io_is_busy inside the mutex.
>
> Patch 2 fixes a stale-baseline race on prev_cpu_nice when
> ignore_nice_load is enabled via sysfs. There is a race window
> between setting ignore_nice_load and gov_update_cpu_data() acquiring
> update_mutex: a concurrent dbs_update() can see the new ignore_nice_load
> value but read a stale prev_cpu_nice, producing an inflated idle_time
> for that sample.
>
> The fix unconditionally advances prev_cpu_nice on every dbs_update()
> call. With prev_cpu_nice always reflecting the most recent sample,
> enabling ignore_nice_load can never produce a stale-baseline spike.
> To keep prev_cpu_nice handling consistent, gov_update_cpu_data()
> resets prev_cpu_nice unconditionally alongside prev_cpu_idle, and
> cpufreq_dbs_governor_start() initializes prev_cpu_nice unconditionally.
>
> v3:
> - Update linux-next base
> - Fix two additional issues found during sashiko.dev review:
> - Patch 1: move io_busy read inside policy_dbs->update_mutex in
> cpufreq_dbs_governor_start() to close a TOCTOU window against
> concurrent io_is_busy_store().
> - Patch 2: restore unconditional prev_cpu_nice reset in
> gov_update_cpu_data() to keep both baselines consistent when
> io_is_busy changes with ignore_nice_load enabled; update comment.
> - Patch 2: keep gov_update_cpu_data() in ignore_nice_load_store() to
> prevent a stale-baseline spike in the governor-start window (when
> ignore_nice_load is set before the first dbs_update() call); only
> cpufreq_governor.c is modified. Update commit message and add one
> more fixes tag '326c86deaed54a ("[CPUFREQ] Remove unneeded locks").'
> - Update the commit messages.
> - Link to v2: https://lore.kernel.org/all/20260409111407.9775-1-zhongqiu.han@xxxxxxxxxxxxxxxx/
> - Sashiko v2 comments: https://sashiko.dev/#/patchset/20260409111407.9775-1-zhongqiu.han%40oss.qualcomm.com

So sashiko.dev still has some comments on the first patch in v3:

https://sashiko.dev/#/patchset/20260419132655.3800673-1-zhongqiu.han%40oss.qualcomm.com

Can you please have a look at that and let me know what you think?

> v2:
> - Update linux-next base
> - Based on v1 review, patch 1 is updated to add the missing
> protection around cpufreq_dbs_governor_start(), and patch 2/2 is added.
> - Link to v1: https://lore.kernel.org/all/20260406110113.3475920-1-zhongqiu.han@xxxxxxxxxxxxxxxx/
>
> Zhongqiu Han (2):
> cpufreq: governor: Fix data races on per-CPU idle/nice baselines
> cpufreq: governor: Fix stale prev_cpu_nice spike when enabling
> ignore_nice_load
>
> drivers/cpufreq/cpufreq_governor.c | 42 ++++++++++++++++++++----------
> 1 file changed, 28 insertions(+), 14 deletions(-)
>
>
> base-commit: c7275b05bc428c7373d97aa2da02d3a7fa6b9f66
> --
> 2.43.0
>
>