Re: [PATCH 2/2] sched/fair: get this cpu once in find_new_ilb

From: Shrikanth Hegde

Date: Thu Mar 19 2026 - 09:26:19 EST


Hi Peter.

On 3/19/26 2:50 PM, Peter Zijlstra wrote:
> On Thu, Mar 19, 2026 at 12:23:14PM +0530, Shrikanth Hegde wrote:
>> smp_processor_id() is pointless to fetch in the loop. Move it out.
>> No functional change.
>
> Isn't this what compilers are for?

This is what I see on pseries:

c00000000028cc7c: 1d 70 80 48 bl c000000000a93c98 <_find_next_and_bit>
c00000000028cc80: 00 00 00 60 nop
c00000000028cc84: 00 00 bd 80 lwz r5,0(r29)
c00000000028cc88: b4 07 7e 7c extsw r30,r3
c00000000028cc8c: 78 1b 7f 7c mr r31,r3 << This is ilb_cpu
c00000000028cc90: 78 1b 7a 7c mr r26,r3
c00000000028cc94: 40 18 05 7c cmplw r5,r3
c00000000028cc98: 78 f3 c3 7f mr r3,r30
c00000000028cc9c: 5c 00 81 40 ble c00000000028ccf8 <kick_ilb+0x10c>
c00000000028cca0: 08 00 2d a1 lhz r9,8(r13) << cpu comes from paca_index
c00000000028cca4: 00 f8 09 7c cmpw r9,r31
c00000000028cca8: 18 00 82 41 beq c00000000028ccc0 <kick_ilb+0xd4>
c00000000028ccac: 8d 0c 04 48 bl c0000000002cd938 <idle_cpu+0x8>
c00000000028ccb0: 00 00 00 60 nop
c00000000028ccb4: 00 00 03 2c cmpwi r3,0
c00000000028ccb8: 78 00 82 40 bne c00000000028cd30 <kick_ilb+0x144>
c00000000028ccbc: 00 00 bd 80 lwz r5,0(r29)
c00000000028ccc0: 01 00 df 38 addi r6,r31,1
c00000000028ccc4: 20 00 a5 78 clrldi r5,r5,32
c00000000028ccc8: 78 e3 84 7f mr r4,r28
c00000000028cccc: 78 db 63 7f mr r3,r27
c00000000028ccd0: b4 07 c6 7c extsw r6,r6
c00000000028ccd4: c5 6f 80 48 bl c000000000a93c98 <_find_next_and_bit>



So yes, the fetch is done inside the loop. But it should be cheap here, since it is
a single load relative to r13, which holds the paca pointer.

Different archs implement smp_processor_id() in their own ways (cheap on many,
but possibly costly on some), and with CONFIG_DEBUG_PREEMPT it also carries
extra debug checks on each call. So it would be good to hoist it out of the
loop IMHO.
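
To illustrate the point, here is a rough userspace sketch of the two loop shapes. It is not the kernel code: fake_smp_processor_id(), fake_idle_cpu(), lookup_count and the array-based CPU list are hypothetical stand-ins made up for this example, and the lookup is modelled as a call the compiler is assumed unable to hoist on its own.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel helpers; these are NOT the real APIs. */
static int lookup_count;        /* number of simulated CPU-id lookups */
static int current_cpu = 2;     /* pretend we are running on CPU 2 */

static int fake_smp_processor_id(void)
{
	lookup_count++;
	return current_cpu;
}

static bool fake_idle_cpu(int cpu)
{
	/* Pretend only CPUs 2 and 5 are idle. */
	return cpu == 2 || cpu == 5;
}

/* Lookup inside the loop: one lookup per candidate CPU visited. */
static int find_ilb_in_loop(const int *cpus, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (cpus[i] == fake_smp_processor_id())
			continue;
		if (fake_idle_cpu(cpus[i]))
			return cpus[i];
	}
	return -1;
}

/* Lookup hoisted: exactly one lookup, same result. */
static int find_ilb_hoisted(const int *cpus, size_t n)
{
	int this_cpu = fake_smp_processor_id();

	for (size_t i = 0; i < n; i++) {
		if (cpus[i] == this_cpu)
			continue;
		if (fake_idle_cpu(cpus[i]))
			return cpus[i];
	}
	return -1;
}
```

Both variants skip the current CPU and return the first other idle candidate, so the result is unchanged. Only the number of lookups differs: one per visited candidate in the first shape, one in total in the second.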