Re: [PATCH v2] blk-iocost: fix busy_level reset when no IOs complete

From: Tejun Heo

Date: Mon Mar 30 2026 - 15:23:45 EST


Hello,

On Sun, Mar 29, 2026 at 03:41:12PM +0000, Jialin Wang wrote:
...
> Before:
> CGROUP IOPS MB/s Avg(ms) Max(ms) P90(ms) P99 P99.9 P99.99
>
> cgA-1m 167 167.02 748.65 1641.43 960.50 1551.89 1635.78 1635.78
> cgB-4k 5 0.02 190.57 806.84 742.39 809.50 809.50 809.50
>
> cgA-1m 166 166.36 751.38 1744.31 994.05 1451.23 1736.44 1736.44
> cgB-32k 4 0.14 225.71 1057.25 759.17 1061.16 1061.16 1061.16
>
> cgA-1m 166 165.91 751.48 1610.94 1010.83 1417.67 1602.22 1619.00
> cgB-256k 5 1.26 198.50 1046.30 742.39 1044.38 1044.38 1044.38
>
> After:
> CGROUP IOPS MB/s Avg(ms) Max(ms) P90(ms) P99 P99.9 P99.99
>
> cgA-1m 159 158.59 769.06 828.52 809.50 817.89 826.28 826.28
> cgB-4k 200 0.78 2.01 26.11 2.87 6.26 12.39 26.08
>
> cgA-1m 147 146.84 832.05 985.80 943.72 960.50 985.66 985.66
> cgB-32k 200 6.25 2.82 71.05 3.42 15.40 50.07 70.78
>
> cgA-1m 114 114.47 1044.98 1294.48 1199.57 1283.46 1300.23 1300.23
> cgB-256k 200 50.00 4.01 34.49 5.08 15.66 30.54 34.34

Are the latency numbers end-to-end or on-device? If former, can you provide
on-device numbers? What period duration are you using?

> @@ -2397,9 +2400,29 @@ static void ioc_timer_fn(struct timer_list *timer)
> * and should increase vtime rate.
> */
> prev_busy_level = ioc->busy_level;
> - if (rq_wait_pct > RQ_WAIT_BUSY_PCT ||
> - missed_ppm[READ] > ppm_rthr ||
> - missed_ppm[WRITE] > ppm_wthr) {
> + if (!nr_done) {
> + if (nr_lagging)

Please use {} even when it's just comments that makes the bodies multi-line.

> + /*
> + * When there are lagging IOs but no completions, we
> + * don't know if the IO latency will meet the QoS
> + * targets. The disk might be saturated or not. We
> + * should not reset busy_level to 0 (which would
> + * prevent vrate from scaling up or down), but rather
> + * try to keep it unchanged. To avoid drastic vrate
> + * oscillations, we clamp it between -4 and 4.
> + */
> + ioc->busy_level = clamp(ioc->busy_level, -4, 4);

Is this from some observed behavior or just out of intuition? The
justification seems a bit flimsy. Why -4 and 4?

> + else if (nr_shortages)
> + /*
> + * The vrate might be too low to issue any IOs. We
> + * should allow vrate to increase but not decrease.
> + */
> + ioc->busy_level = min(ioc->busy_level, 0);

So, this is no completion, no lagging and shortages case. In the existing
code, this would alos get busy_level-- to get things moving. Wouldn't this
path need that too? Or rather, would it make more sense to handle !nr_done
&& nr_lagging case and leave the other cases as-are?

Thanks.

--
tejun