Re: EEVDF regression still exists

From: K Prateek Nayak
Date: Fri May 02 2025 - 23:34:48 EST


Hello Linus,

On 5/2/2025 11:22 PM, Linus Torvalds wrote:
> On Fri, 2 May 2025 at 10:25, Prundeanu, Cristian <cpru@xxxxxxxxxx> wrote:
>
>> Another, more recent observation is that 6.15-rc4 has worse performance than
>> rc3 and earlier kernels. Maybe that can help narrow down the cause?
>> I've added the perf reports for rc3 and rc2 in the same location as before.

> The only _scheduler_ change that looks relevant is commit bbce3de72be5
> ("sched/eevdf: Fix se->slice being set to U64_MAX and resulting
> crash"). Which does affect the slice calculation, although supposedly
> only under special circumstances. Of course, it could be something else.

Since it is the only !SCHED_EXT change in kernel/sched, Cristian can
perhaps try reverting it on top of v6.15-rc4 and checking whether the
benchmark results jump back to the v6.15-rc3 level, to rule that single
change out. Very likely it is something else, though.


> For example, we have an AMD performance regression in general due to
> _another_ CPU leak mitigation issue, but that predates rc3 (happened
> during the merge window), so that one isn't relevant, but maybe
> something else is..
>
> Although honestly, that slice calculation still looks just plain odd.
> It defaults the slice to zero, so if none of the 'break' conditions in
> the first loop happens, it will reset the slice to that zero value and

I believe setting slice to U64_MAX was the actual problem. Previously,
when the slice was initialized as:

cfs_rq = group_cfs_rq(se);
slice = cfs_rq_min_slice(cfs_rq);

If the "se" was delayed, it basically means that the group_cfs_rq() had
no tasks on it, and cfs_rq_min_slice() would return "~0ULL", which would
then be propagated up the hierarchy and lead to bad math.
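
To illustrate the failure shape, here is a standalone userspace sketch
(not the kernel code; toy_min_slice() is a made-up stand-in for
cfs_rq_min_slice()) of what happens once that "no tasks" sentinel
escapes into later arithmetic:

#include <stdint.h>
#include <stdio.h>

/*
 * Made-up stand-in for cfs_rq_min_slice(): with nothing queued there
 * is no minimum, so ~0ULL comes back as a sentinel, as described above.
 */
static uint64_t toy_min_slice(unsigned int nr_queued, uint64_t min_slice)
{
        return nr_queued ? min_slice : ~0ULL;
}

int main(void)
{
        /* A delayed se whose group cfs_rq has no tasks on it. */
        uint64_t slice = toy_min_slice(0, 0);

        /*
         * Any later arithmetic on the sentinel wraps around, e.g. when
         * the slice feeds a deadline computation. This is the "bad
         * math" once U64_MAX leaks into the hierarchy.
         */
        uint64_t deadline = 100 + slice;        /* wraps to 99 */

        printf("slice    = %llu\n", (unsigned long long)slice);
        printf("deadline = %llu\n", (unsigned long long)deadline);
        return 0;
}

With the fix, the default is 0 instead: at worst a stale value, never a
wraparound.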

> then the
>
>     slice = cfs_rq_min_slice(cfs_rq);
>
> in that second loop looks like it might just pick up that zero value again.

If the first loop does not break, even for "if (cfs_rq->load.weight)",
it basically means that there are no tasks / delayed entities queued
all the way up to the root cfs_rq, so the slices shouldn't matter.

Enqueue of the next task will correct the slices for the whole queued
hierarchy.
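
To make the two-loop shape concrete, here is a rough userspace model
(my own simplification with made-up types; dequeue_walk() only mirrors
the structure, it is not the kernel's dequeue path):

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Toy model of one hierarchy level; not the kernel's cfs_rq. */
struct level {
        unsigned int nr_queued;  /* entities still queued at this level */
        uint64_t min_slice;      /* min slice among what is queued */
        uint64_t se_slice;       /* slice of the group se at this level */
};

static void dequeue_walk(struct level *lv, size_t depth)
{
        uint64_t slice = 0;     /* the zero default discussed above */
        size_t i;

        /* First loop: dequeue upwards, stopping at a still-busy level
         * (standing in for the "if (cfs_rq->load.weight)" break). */
        for (i = 0; i < depth; i++) {
                lv[i].nr_queued--;
                if (lv[i].nr_queued) {
                        slice = lv[i].min_slice;
                        break;
                }
        }

        /* Second loop: continue from the parent and propagate the
         * slice. If the first loop never broke, every level is now
         * empty and this loop does nothing, so the zero default is
         * harmless; the next enqueue recomputes the slices anyway. */
        for (i++; i < depth; i++) {
                lv[i].se_slice = slice;
                slice = lv[i].min_slice;
        }
}

int main(void)
{
        /* Each level holds only the task being dequeued. */
        struct level hier[3] = {
                { .nr_queued = 1 }, { .nr_queued = 1 }, { .nr_queued = 1 },
        };

        dequeue_walk(hier, 3);  /* no level stays busy: walks to the root */
        printf("root se_slice = %llu (stale but unreachable)\n",
               (unsigned long long)hier[2].se_slice);
        return 0;
}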


> I clearly don't understand the code.
>
>            Linus

--
Thanks and Regards,
Prateek