On 05.05.25 11:36, Oscar Salvador wrote:
On Mon, May 05, 2025 at 10:12:48AM +0200, David Hildenbrand wrote:
Assume you hotplug the second CPU. The node is already registered/online, so
who does the register_cpu_under_node() call?
It's register_cpu() I guess? But no idea in which order that is called with
node onlining.
The code has to be cleaned up such that onlining a node does not traverse
any cpus / memory.
Whoever adds a CPU / memory *after onlining the node* must register the
device manually under the *now online* node.
So, I think this is the sequence of events:
- hotplug cpu:
acpi_processor_hotadd_init
register_cpu
register_cpu_under_node
online_store
device_online()->dev_bus_online()
cpu_subsys->online()
cpu_subsys_online
cpu_device_up
cpu_up
try_online_node <- brings node online
...
register_one_node <- registers cpu under node
_cpu_up
My thinking was, whether we can simply move the register_cpu_under_node() after the try_online_node(). See below regarding early.
And then, remove the !node_online check from register_cpu_under_node().
But it's all complicated, because for memory, we link a memory block to the node (+set the node online) when it gets added, not when it gets onlined.
For CPUs, we seem to be creating the link + set the node online when the CPU gets onlined.
> Later, online_store()->...->cpu_subsys_online()->..->cpu_up() will take> care of 1) onlining the node and 2) register the cpu to the node (so,
The first time we hotplug a cpu to the node, note that
register_cpu()->register_cpu_under_node() will bail out as node is still
offline, so only cpu's sysfs will be created but they will not be linked
to the node.
link the sysfs).
And only if it actually gets onlined I assume.
The second time we hotplug a cpu,
register_cpu()->register_cpu_under_node() will do its job as the node is
already onlined.
And we will not be calling register_one_node() from __try_online_node()
because of the same reason.
The thing that bothers me is having register_cpu_under_node() spread
around.
Right.
I think that ideally, we should only be calling register_cpu_under_node()
from register_cpu(), but we have this kinda of (sort of weird?) relation
that even if we hotplug the cpu, but we do not online it, the numa node
will remain online, and so we cannot do the linking part (cpu <-> node),
so we could not really only have register_cpu_under_node() in
register_cpu(), which is the hot-add part, but we also need it in the
cpu_up()->try_online_node() which is the online part.
Maybe one could handle CPUs similar to how we handle it with memory: node gets onlined + link created as soon as we add the CPU, not when we online it.
But likely there is a reason why we do it like that today ...
And we cannot also remove the register_cpu_under_node() from
register_cpu() because it is used in other paths (e.g: at boot time ).
Ah, so in that case we don't call cpu_up ... hm.
Of course, we can always detect the context (early vs. hotplug). Maybe, we should split the early vs. hotplug case up much earlier.
register_cpu_early() / register_cpu_hotplug() ... maybe