Re: [PATCH] cpufreq: cppc: Reduce cppc delivered perf sampling jitter

From: Jeremy Linton

Date: Wed Jun 03 2026 - 13:12:05 EST

Hi,

Thanks for looking at this.

On 6/3/26 5:54 AM, Breno Leitao wrote:

On Tue, Jun 02, 2026 at 04:20:52PM -0500, Jeremy Linton wrote:

CPPC uses a pair of registers cycling at different frequencies to
determine an accumulated performance level. For userspace reporting we
want to convert this to an instantaneous CPU frequency, but over short
time periods small errors caused by CPPC counter reads can cause
fairly significant reported frequency variations even when the core
CPU clock isn't changing.

Reduce this by keeping a start sample fixed and retrying the end
sample until the counter deltas are large enough to reduce short
window error, or until adjacent delivered performance estimates are
within the CPU's observed CPPC read noise floor.

To begin, resample the initial pair a small fixed number of times
looking for matching delivered performance deltas. This reduces the
chance that a disturbed start sample anchors the rest of the
calculation.

Then look for an end sample while updating the noise floor from the
best error seen between samples. The floor remains zero on systems
with stable feedback reads, but lets noisy systems stop early once
another retry is unlikely to improve the result. The retry loop is
capped at 200 iterations, giving an ~20 usec explicit delay budget
derived from ndelay(100).

Signed-off-by: Jeremy Linton <jeremy.linton@xxxxxxx>
---
drivers/cpufreq/cppc_cpufreq.c | 68 ++++++++++++++++++++++++++++++----
1 file changed, 61 insertions(+), 7 deletions(-)

diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
index 7e7f9dfb7a24..362c08def420 100644
--- a/drivers/cpufreq/cppc_cpufreq.c
+++ b/drivers/cpufreq/cppc_cpufreq.c
@@ -50,7 +50,7 @@ struct cppc_freq_invariance {
static DEFINE_PER_CPU(struct cppc_freq_invariance, cppc_freq_inv);
static struct kthread_worker *kworker_fie;
-static int cppc_perf_from_fbctrs(u64 reference_perf,
+static u64 cppc_perf_from_fbctrs(u64 reference_perf,
struct cppc_perf_fb_ctrs *fb_ctrs_t0,
struct cppc_perf_fb_ctrs *fb_ctrs_t1);
@@ -750,7 +750,7 @@ static inline u64 get_delta(u64 t1, u64 t0)
return (u32)t1 - (u32)t0;
}
-static int cppc_perf_from_fbctrs(u64 reference_perf,
+static u64 cppc_perf_from_fbctrs(u64 reference_perf,
struct cppc_perf_fb_ctrs *fb_ctrs_t0,
struct cppc_perf_fb_ctrs *fb_ctrs_t1)
{
@@ -771,19 +771,71 @@ static int cppc_perf_from_fbctrs(u64 reference_perf,
return (reference_perf * delta_delivered) / delta_reference;
}
-static int cppc_get_perf_ctrs_sample(int cpu,
+/* CPPC read noise floor for early retry exit. */
+static DEFINE_PER_CPU(u64, err_floor);
+
+#define CPPC_SAMPLE_MAX_RETRIES 200

Could the remaining tuning literals get the same treatment?
Specifically:

- the 10 initial-resample iteration count
- the 2000 multiplier in ref * 2000
- the 100 ns in ndelay(100)

Sure. A few of these were judgment calls based on the platforms I tried it on. I had some instrumentation at the bottom printing loop counts and error values, and I largely picked the values based on how they behaved, or from back-of-the-envelope estimates. For example, that 200 is AFAIK overkill; it usually settles down around 20 or less, which makes this faster than the old method on at least one platform I tried it on. They are all intended to be "upper bound" values that exit the loop because something isn't working right.
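As a rough sketch of what naming the literals could look like (only CPPC_SAMPLE_MAX_RETRIES exists in the posted patch; the other three names and the helper are hypothetical placeholders, and this is plain userspace C rather than the driver code):

```c
/*
 * Hypothetical names for the remaining tuning literals. Only
 * CPPC_SAMPLE_MAX_RETRIES comes from the posted patch; the rest
 * are placeholders for illustration.
 */
#define CPPC_SAMPLE_MAX_RETRIES		200	/* cap on end-sample retries */
#define CPPC_SAMPLE_INIT_RETRIES	10	/* cap on start-sample resamples */
#define CPPC_SAMPLE_REF_MULT		2000	/* the 2000 in "ref * 2000" */
#define CPPC_SAMPLE_DELAY_NS		100	/* per-retry ndelay() argument */

/* Worst-case explicit delay spent in the end-sample retry loop, in ns. */
static unsigned long cppc_sample_delay_budget_ns(void)
{
	return (unsigned long)CPPC_SAMPLE_MAX_RETRIES * CPPC_SAMPLE_DELAY_NS;
}
```

With the values above the helper returns 20000 ns, which matches the ~20 usec budget quoted in the commit message.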

I'm interested in whether this patch stabilizes the frequency reporting in some of the cases I've heard people talking about.
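
For anyone who wants to see why short windows are so sensitive, here is a small userspace sketch of the arithmetic. It reuses the 32-bit wrapping delta and the reference_perf * delta_delivered / delta_reference calculation visible in the quoted context; perf_from_ctrs() is a hypothetical stand-in for cppc_perf_from_fbctrs(), and the counter values are made up for illustration, not measured on any platform.

```c
#include <stdint.h>

/* 32-bit wrapping delta, mirroring the last line of get_delta(). */
static uint64_t get_delta(uint64_t t1, uint64_t t0)
{
	return (uint32_t)t1 - (uint32_t)t0;
}

/*
 * Hypothetical stand-in for cppc_perf_from_fbctrs(): takes raw
 * counter values instead of struct cppc_perf_fb_ctrs pointers.
 */
static uint64_t perf_from_ctrs(uint64_t reference_perf,
			       uint64_t ref0, uint64_t del0,
			       uint64_t ref1, uint64_t del1)
{
	uint64_t delta_reference = get_delta(ref1, ref0);
	uint64_t delta_delivered = get_delta(del1, del0);

	/* Avoid dividing by zero if the reference counter did not move. */
	if (!delta_reference)
		return 0;

	return (reference_perf * delta_delivered) / delta_reference;
}
```

With reference_perf = 100 and a delivered counter running at 3x the reference counter, a 100-tick window gives 300. A 10-tick skew on the delivered read turns that into 310, while the same skew over a 100000-tick window still rounds to 300. That is the effect the larger counter deltas are meant to buy.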