Re: [PATCH 0/8] dax/kmem: atomic whole-device hotplug via sysfs
From: David Hildenbrand (Arm)
Date: Mon Apr 13 2026 - 11:00:02 EST
On 4/13/26 16:49, David Hildenbrand (Arm) wrote:
> On 3/21/26 16:03, Gregory Price wrote:
>> The dax kmem driver currently onlines memory during probe using the
>> system default policy, with no way to control or query the region state
>> at runtime - other than by inspecting the state of individual blocks.
>>
>> Offlining and removing an entire region requires operating on individual
>> memory blocks, creating race conditions where external entities can
>> interfere between the offline and remove steps.
>>
>> The problem was discussed in the LPC2025 device memory sessions
>> (https://lpc.events/event/19/contributions/2016/), which covered how
>> the non-atomic interface for dax hotplug causes issues in some
>> distributions whose competing userland controllers interfere with
>> each other.
>>
>> This series adds a sysfs "hotplug" attribute for atomic whole-device
>> hotplug control, along with the mm and dax plumbing to support it.
>>
>> The first five patches prepare the mm and dax layers:
>>
>> 1. Consolidate memory-tier type deduplication into mt_get_memory_type(),
>> removing redundant per-driver infrastructure.
>> 2. Add a memory_block_align_range() helper for hotplug range alignment.
>> 3-5. Thread an explicit online_type through the memory hotplug and dax
>> paths, allowing drivers to specify a preferred auto-online policy
>> (ZONE_NORMAL vs ZONE_MOVABLE) instead of being forced to the
>> system default.
>>
>> The last three patches build the dax/kmem feature:
>>
>> 6. Plumb online_type through the dax device creation path.
>> 7. Extract hotplug/hotremove into helper functions to separate resource
>> lifecycle from memory onlining.
>> 8. Add the "hotplug" sysfs attribute supporting three states:
>> - "unplug": memory blocks removed
>> - "online": online as normal system RAM
>> - "online_movable": online in ZONE_MOVABLE
>>
>> Transitions are atomic across all ranges in the device. Backward
>> compatibility is preserved: probe still auto-onlines when the configured
>> policy matches the system default.
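
To make the "atomic across all ranges" semantics concrete, here is a minimal
userspace sketch of an all-or-nothing transition with rollback. It is not the
code from the series: every name (hp_dev, hp_dev_set, fail_at, MAX_RANGES) is
hypothetical, the fail_at field only exists to simulate a failing range, and
rollback is assumed to always succeed. Only the three state strings come from
the cover letter. Which transitions the real attribute permits is not stated
there, so the sketch simply allows any of them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the three "hotplug" states; not kernel code. */
enum hp_state { HP_UNPLUG, HP_ONLINE, HP_ONLINE_MOVABLE };

#define MAX_RANGES 8

struct hp_dev {
	size_t nr_ranges;
	enum hp_state range_state[MAX_RANGES];
	enum hp_state state;	/* device-wide state the attribute would report */
	int fail_at;		/* simulation hook: range index that fails, -1 for none */
};

/* Map the attribute string to a state; returns 0 on success, -1 if unknown. */
static int hp_parse(const char *s, enum hp_state *out)
{
	if (strcmp(s, "unplug") == 0)
		*out = HP_UNPLUG;
	else if (strcmp(s, "online") == 0)
		*out = HP_ONLINE;
	else if (strcmp(s, "online_movable") == 0)
		*out = HP_ONLINE_MOVABLE;
	else
		return -1;
	return 0;
}

/* Move one range to the target state, honouring the simulated failure. */
static int hp_range_apply(struct hp_dev *dev, size_t i, enum hp_state target)
{
	if (dev->fail_at >= 0 && (size_t)dev->fail_at == i)
		return -1;
	dev->range_state[i] = target;
	return 0;
}

/*
 * Move every range to the requested state, or none of them: if any range
 * fails, the ranges already changed are restored to the old state and the
 * device-wide state is left untouched.
 */
static int hp_dev_set(struct hp_dev *dev, const char *buf)
{
	enum hp_state target, old = dev->state;
	size_t i;

	if (hp_parse(buf, &target))
		return -1;
	if (target == old)
		return 0;

	for (i = 0; i < dev->nr_ranges; i++) {
		if (hp_range_apply(dev, i, target)) {
			while (i-- > 0)		/* undo ranges 0..i-1 */
				dev->range_state[i] = old;
			return -1;
		}
	}
	dev->state = target;
	return 0;
}
```

With a three-range device, hp_dev_set(&dev, "online_movable") either leaves
all three ranges in ZONE_MOVABLE or leaves them all as they were. A real sysfs
store handler would also need to tolerate the trailing newline that `echo`
appends, which this sketch ignores.
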
>>
>> Specific notes for maintainers:
>>
>> I downgraded a BUG() to a WARN() when unbind is called while the dax
>> device is not in an UNPLUGGED state. This is because the old pattern of
>> toggling individual memory blocks is still used by userland tools, and
>> will disconnect the `hotplug` value from the actual state of the overall
>> memory region.
>>
>> Unless we move to deprecate per-block controls, we should just WARN()
>> instead of BUG() as an indicator that userland tools need to be updated
>> to use the new pattern (the old pattern is subject to race conditions).
>>
>> The first two commits (memory-tier dedup and align_range helper) are
>> semi-unrelated cleanups that conflict with the changes made in the
>> refactoring commits. These are intended to be used for future cxl region extensions,
>> but if you prefer them to be dropped or submitted separately let me
>> know.
>>
>> This is technically v3, but the patch line has diverged considerably and
>> I've reworked the cover letter. Apologies for the prior obtuseness.
>> Link: https://lore.kernel.org/all/20260114235022.3437787-1-gourry@xxxxxxxxxx/
>
>
> Hi Gregory,
>
> against which branch / base commit is this series?
>
b4 am --guess-base
Was helpful :)
--
Cheers,
David