Re: [PATCH] cpufreq: conservative: Fix incorrect frequency decrease due to stale target

From: Zhongqiu Han

Date: Mon May 11 2026 - 00:04:05 EST


On 5/8/2026 6:14 PM, Viresh Kumar wrote:
On 08-05-26, 14:46, Zhongqiu Han wrote:
Nice suggestion! — I think it aligns well with the general direction of
addressing this issue and relying more on the cpufreq core.

That said, I have a concern if we drop the decrease-path early-exit
check while keeping the existing condition:

if (requested_freq > freq_step)

unchanged. In that case, when the system is already at policy->min, this
appears to introduce a governor-level oscillation, even though the
effective hardware frequency remains unchanged:

Consider the following scenario
(policy->min = 200, freq_step = 100, hardware already at min, low load):

Iteration N:
requested_freq = 200
decrease path: 200 > 100 -> requested_freq = 100
__cpufreq_driver_target(100, LE):
clamp -> 200, target == cur
-> without CPUFREQ_NEED_UPDATE_LIMITS: driver not invoked
-> with CPUFREQ_NEED_UPDATE_LIMITS: driver is still invoked
dbs_info->requested_freq = 100 <- below min

Iteration N+1:
requested_freq = 100
out-of-range check: reset to policy->cur = 200
decrease path: 200 > 100 -> requested_freq = 100
dbs_info->requested_freq = 100 <- below min again

This results in a repeated oscillation pattern. For drivers with
CPUFREQ_NEED_UPDATE_LIMITS (amd-pstate, intel_cpufreq, cppc_cpufreq),
the driver may be invoked every sampling period even though the
effective frequency does not change. I'm happy to defer to your judgment
on whether this is significant enough to warrant further changes.

Given this, may i know would it make sense to adjust only the decrease
path early-exit condition to use dbs_info->requested_freq (i.e. the last
target actually requested from hardware), rather than the idle-adjusted
local requested_freq?

- if (requested_freq == policy->min)
+ if (dbs_info->requested_freq == policy->min)
goto out;

With this change, the scenario raised by Lifeng
(dbs_info->requested_freq = 400, idle_periods = 2, low load) behaves as
follows:

Iteration 1:
requested_freq = 400
idle decay: requested_freq = 200
if (dbs_info->requested_freq(400) == min(200)) ? NO -> continue
200 > 100 -> requested_freq = 100
__cpufreq_driver_target(100, LE):
clamp -> 200, target(200) != cur(400)
-> driver invoked, hardware: 400 -> 200 MHz [bug fixed]
dbs_info->requested_freq = 100 <- one-time transient value

Iteration 2:
requested_freq = 100
out-of-range check: reset to policy->cur = 200
if (dbs_info->requested_freq(200) == min(200)) ? YES -> goto out

-> Stable from iteration 2 onward, with no extra driver calls.

There is a one-time transient where dbs_info->requested_freq briefly
drops below policy->min, but this is harmless: the existing out-of-range
check corrects it in the next iteration. A similar situation was
discussed before [1], where the conclusion at the time was that
__cpufreq_driver_target() already performs the necessary clamping. If it
would be clearer or more robust to add an explicit guard
here, this can be revisited as well at your discretion.😊

[1]https://lore.kernel.org/linux-pm/20231005105746.ikezg2buza2qwvig@vireshk-i7/

The increase-path early-exit is intentionally left unchanged, to avoid
altering the behavior in the scenario where conservative idle decay
would otherwise be ignored when the previous requested frequency was
already at policy->max.

What about this ?

diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index df01d33993d8..c7ec11de9a43 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -103,10 +103,6 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)
if (load > dbs_data->up_threshold) {
dbs_info->down_skip = 0;

- /* if we are already at full speed then break out early */
- if (requested_freq == policy->max)
- goto out;
-
requested_freq += freq_step;
if (requested_freq > policy->max)
requested_freq = policy->max;
@@ -124,15 +120,8 @@ static unsigned int cs_dbs_update(struct cpufreq_policy *policy)

/* Check for frequency decrease */
if (load < cs_tuners->down_threshold) {
- /*
- * if we cannot reduce the frequency anymore, break out early
- */
- if (requested_freq == policy->min)
- goto out;
-
- if (requested_freq > freq_step)
- requested_freq -= freq_step;
- else
+ requested_freq -= freq_step;
+ if (requested_freq < policy->min)
requested_freq = policy->min;

__cpufreq_driver_target(policy, requested_freq,


Just one small concern is that requested_freq may underflow as an
unsigned value; perhaps this could be improved by:

- if (requested_freq > freq_step)
+ if (requested_freq > policy->min + freq_step)
requested_freq -= freq_step;
else
requested_freq = policy->min;

or use min() func?

Additionally, it seems that dropping the early‑exit checks also appears
to be a nice side effect fix for CPUFREQ_NEED_UPDATE_LIMITS drivers when
updating internal upper and lower frequency boundaries.

As designed in commit 1c534352f47f ("cpufreq: Introduce
CPUFREQ_NEED_UPDATE_LIMITS driver flag"), a cpufreq driver may need to
update its internal frequency limits when policy min or max changes for
drivers setting CPUFREQ_NEED_UPDATE_LIMITS.

However, the early‑exit in cs_dbs_update() can prevent
__cpufreq_driver_target() from ever being called.

For example, when policy->min rises from 200 to 400 kHz while policy
->cur is already at 400 kHz, under low load:

cpufreq_policy_apply_limits():
policy->min(400) > policy->cur(400) ? NO -> driver not called

cs_limits():
dbs_info->requested_freq = policy->cur = 400

[next sampling period, low load]
cs_dbs_update():
requested_freq = 400
if (requested_freq == policy->min) /* 400 == 400 -> true */
goto out; /* __cpufreq_driver_target() never called */

With the early‑exit removed, __cpufreq_driver_target() is called with
target == policy->cur, and CPUFREQ_NEED_UPDATE_LIMITS ensures the driver
is invoked to update its internal performance boundaries.


--
Thx and BRs,
Zhongqiu Han