Re: [PATCH] sched/fair: Revert boost in cpu_util()

From: hongyan.xia(夏弘彦)

Date: Mon May 18 2026 - 07:39:11 EST

On 5/18/2026 6:04 PM, Christian Loehle wrote:
> On 5/18/26 03:40, hongyan.xia(夏弘彦) wrote:
>> From: Hongyan Xia <hongyan.xia@xxxxxxxxxxxxx>
>>
>> We have seen a massive power consumption regression (20% SoC power
>> increase in many apps) after updating our kernel. After bisection we
>> pinpointed the regression to the cpu_util(boost) feature. After
>> reverting the boost feature the massive energy regression is gone.
>> Detailed trace analysis is below. The regression shows up across
>> many apps, but YouTube is one of the worst offenders, as shown in the
>> 1080p60fps video benchmark:
>>
>> Setup       FPS     SoC Power (mW)   diff
>> w/ boost    59.94   913.6
>> w/o boost   59.93   720.4            -21.15%
>>
>> Signed-off-by: Hongyan Xia <hongyan.xia@xxxxxxxxxxxxx>
>>
>> ---
>> Analysis:
>>
>> We found several problems that result in the power spike:
>>
>> 1. Arithmetic should not happen between util_avg and runnable_avg:
>>
>> After util = max(util, runnable) in cpu_util(), which may pick the
>> runnable value, we then add or subtract task util values from the
>> result. This produces a half-runnable-half-util value, which is
>> ill-defined. This alone should be a warning sign. It breaks EAS
>> calculations in many cases, leading to sub-optimal task placements.
>>
>> 2. The absolute value of runnable_avg is too high to drive frequency
>> reasonably:
>>
>> We use runnable in a _relative_ way to util to know whether there is
>> contention in several places. However, the _absolute_ value should not
>> be used like util. Runnable_avg tends to be significantly higher,
>> making it much easier to saturate frequency.
>>
>> For example, if three tasks each with a util of 100 contend on the same
>> rq, the rq util is 300 but runnable_avg shoots up to 900. 900 drives the
>> CPU at the max frequency, and it's highly questionable whether this
>> boost is the right decision.
>>
>> 3. Runnable_avg may not even reflect true contention:
>>
>> When tasks are dependent, the bottleneck is often the data flow between
>> tasks, not the contention seen by runnable_avg. Boosting frequency with
>> runnable in such scenarios wastes power without performance benefits.
>>
>> We found that 1 causes a minor power regression, but 2 and 3 regress
>> power significantly. We have seen multiple applications that use the
>> producer-consumer model with many worker threads suffer. When there is
>> IPC between producer and consumer, boosting frequency blindly does not
>> help performance at all if the consumer is limited by how much data
>> flows through. YouTube suffers from 1, 2 and 3 at the same time, leading
>> to a total SoC power regression of 20%, as shown in the results above.
>
> We did discuss removing runnable boost internally as well, but I’d love to see
> more data too.
> The original issue it was trying to solve was avoiding jank frames during load
> spikes, which YouTube does not really exercise. Some gaming workload data would
> therefore be a useful addition here.

I would be glad to provide more data (after more benchmarks and pending
our internal approval), but I wonder: what level of performance gain do
we expect from this feature to justify such a big energy regression?

> Runnable boost was considered as an alternative to approaches like reducing the
> PELT half-life and similar changes. Qais’ current ideas also try to tackle this
> problem, of course, so +CC.
>
> If you have run many workloads, do you also have data on where this feature actually
> helped, especially in reducing jank frames?

We ran our Day of Use (DoU, including Facebook, YouTube and other
popular apps) test model and we did see a 6.6% increase in jank frames
after the revert. Dropped frames went up from 106 to 113 out of a total
of 70210 frames. However, in our test model there is no way that
avoiding 7 dropped frames out of 70210 justifies an energy regression of
10% to 20% in a lot of apps, so for us the trade-off decision is very
clear here.
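
To put the two sides of the trade-off on the same scale, here is a small
Python sketch that only recomputes the figures quoted in this thread. The
helper name pct_change() is made up for illustration. Note that the power
numbers come from the YouTube 1080p60fps benchmark while the jank numbers
come from the DoU run, so this is a rough comparison rather than a
like-for-like measurement.

```python
def pct_change(before, after):
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


# Figures quoted in this thread.
power_with_boost_mw = 913.6      # YouTube 1080p60fps, w/ boost
power_without_boost_mw = 720.4   # YouTube 1080p60fps, w/o boost
jank_with_boost = 106            # DoU dropped frames, w/ boost
jank_without_boost = 113         # DoU dropped frames, after the revert
total_frames = 70210             # DoU total frames

power_change = pct_change(power_with_boost_mw, power_without_boost_mw)
jank_change = pct_change(jank_with_boost, jank_without_boost)
# Extra dropped frames as a share of all rendered frames.
extra_jank_share = (jank_without_boost - jank_with_boost) / total_frames * 100.0

print(f"SoC power change after revert: {power_change:.2f}%")
print(f"Jank frame change after revert: {jank_change:+.1f}%")
print(f"Extra jank as share of all frames: {extra_jank_share:.3f}%")
```

The relative jank increase looks large only because the baseline count is
small; as a share of all frames the difference is about one hundredth of a
percent.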

Another question from me: if this feature has potentially buggy corner
cases or is mathematically unsound (mostly the half-util-half-runnable
value inside cpu_util()), should we rely on its performance gain?
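
To make the half-util-half-runnable concern concrete, below is a toy
Python model of the arithmetic described in point 1 of the analysis. It is
not kernel code: the function name cpu_util_model(), its parameters and the
capacity of 1024 are assumptions for illustration, and it leaves out
everything else the real cpu_util() does. The inputs are the numbers from
the three-task example (rq util 300, runnable_avg 900, task util 100).

```python
def cpu_util_model(util_avg, runnable_avg, boost,
                   task_util=0, op=None, capacity=1024):
    """Toy model of the boost arithmetic discussed above (hypothetical)."""
    util = util_avg
    if boost:
        # May silently switch the value from a util to a runnable signal.
        util = max(util, runnable_avg)
    if op == "sub":
        # Simulate the task leaving this CPU: remove its util, floor at 0.
        util = max(util - task_util, 0)
    elif op == "add":
        # Simulate the task arriving on this CPU: add its util.
        util += task_util
    # Clamp to the (assumed) CPU capacity.
    return min(util, capacity)


# Three tasks with util 100 each contending on one rq.
rq_util, rq_runnable, task_util = 300, 900, 100

no_boost = cpu_util_model(rq_util, rq_runnable, False, task_util, "sub")
boosted = cpu_util_model(rq_util, rq_runnable, True, task_util, "sub")

print("estimate after one task leaves, no boost:", no_boost)
print("estimate after one task leaves, boost:   ", boosted)
```

Without boost the estimate is a pure util value. With boost, a task util is
subtracted from a runnable-derived number, so the result is neither the
util nor the runnable_avg the rq would have without that task.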

>
> Some discussion from back then:
> https://lore.kernel.org/lkml/20230406155030.1989554-1-dietmar.eggemann@xxxxxxx/
> https://lore.kernel.org/lkml/20220829055450.1703092-1-dietmar.eggemann@xxxxxxx/
>
>> [snip]