Re: [PATCH] sched/fair: Revert boost in cpu_util()

From: Dietmar Eggemann

Date: Fri May 22 2026 - 03:55:17 EST

On 18.05.26 04:40, hongyan.xia(夏弘彦) wrote:
> From: Hongyan Xia <hongyan.xia@xxxxxxxxxxxxx>

I'm on vacation this week, so I'll have a closer look at the beginning of
next week.

> We have seen a massive power consumption regression (20% SoC power
> increase in many apps) after updating our kernel. After bisection we

Which kernel version did you update to? Which one had you been using
before?

Are you using Android on your devices? I remember that some
functionality was added to avoid jank in the display pipeline.

> pinpointed the regression to the cpu_util(boost) feature. After
> reverting the boost feature the massive energy regression is gone.
> Detailed trace analysis down below. The regression is found across quite
> many apps but Youtube is one of the worst offenders, shown in the
> 1080p60fps video benchmark:
>
> Setup        FPS     SoC Power (mW)   diff
> w/ boost     59.94   913.6
> w/o boost    59.93   720.4            -21.15%
>
> Signed-off-by: Hongyan Xia <hongyan.xia@xxxxxxxxxxxxx>
>
> ---
> Analysis:

[...]

> 2. Using the absolute value of runnable_avg to drive frequency is
> too high to be reasonable:
>
> We use runnable in a _relative_ way to util to know whether there is

Is this part of the value-adds you carry on top of the mainline kernel?
Are you able to share it here?

> contention in several places. However, the _absolute_ value should not
> be used like util. Runnable_avg tends to be significantly higher,
> making it much easier to saturate frequency.
>
> For example, if three tasks each with a util of 100 contend on the same
> rq, the rq util is 300 but runnable_avg shoots up to 900. 900 drives the
> CPU at the max frequency, and it's highly questionable whether this
> boost is the right decision.

Shouldn't this be at most 600, in the case where the tasks' runtimes
overlap perfectly? If they don't overlap at all, runnable_avg should
equal util_avg. Is this a theoretical example or taken from your traces?
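
The three figures in this exchange (300, 600, 900) can be reproduced with
a back-of-the-envelope model. The sketch below is an illustration only,
not kernel code. It assumes all tasks wake at the same instant once per
period, treats each task's util as its CPU time within a 1024-unit
period, and uses a plain time average instead of PELT's geometrically
decayed sum. The function names and the two scheduling models
(run-to-completion and ideal equal sharing) are assumptions made for the
example; they are not taken from the patch or from the traces.

```python
# Simplified, hypothetical model: NOT the kernel's PELT implementation.
# Each util value is read as "CPU time needed per 1024-unit period".
SCHED_CAPACITY_SCALE = 1024


def runnable_no_overlap(utils):
    """Tasks never wait for each other: runnable time equals running time."""
    return sum(utils)


def runnable_back_to_back(utils):
    """Tasks wake together and run to completion one after another.

    A task stays runnable from the common wakeup until it finishes.
    """
    total = 0
    elapsed = 0
    for u in utils:
        elapsed += u          # this task finishes after all earlier ones
        total += elapsed      # it was runnable for the whole time so far
    return total


def runnable_equal_sharing(utils):
    """Tasks wake together and share the CPU in infinitely fine slices."""
    total = 0
    elapsed = 0
    done = 0                  # work already completed by every remaining task
    ordered = sorted(utils)
    for i, u in enumerate(ordered):
        active = len(ordered) - i
        elapsed += (u - done) * active   # wall time until this task finishes
        done = u
        total += elapsed
    return total


utils = [100, 100, 100]
assert sum(utils) <= SCHED_CAPACITY_SCALE  # model only valid if CPU not saturated

print(runnable_no_overlap(utils))     # 300
print(runnable_back_to_back(utils))   # 600
print(runnable_equal_sharing(utils))  # 900
```

Under these assumptions, 600 corresponds to perfectly overlapping wakeups
with the tasks running back to back, and 900 only appears when the tasks
are time-sliced so finely that all of them stay runnable until the last
one finishes. Values from a real scheduler would presumably fall between
these bounds and would also be shaped by PELT decay, which this sketch
ignores.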

> 3. Runnable_avg may not even reflect true contention:
>
> When tasks are dependent, the bottleneck is often the data flow between
> tasks, not the contention seen by runnable_avg. Boosting frequency with
> runnable in such scenarios wastes power without performance benefits.

That's probably true. But any global feature (one that doesn't need
per-task setup) won't be able to give perfect results here; only
per-task setup can fix this.