Re: [PATCH v4 09/21] mm: swap: allocate a virtual swap slot for each swapped out page

From: Peter Zijlstra

Date: Thu Mar 19 2026 - 04:01:43 EST


On Wed, Mar 18, 2026 at 03:29:40PM -0700, Nhat Pham wrote:
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index 62cd7b35a29c9..85cb45022e796 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -86,6 +86,7 @@ enum cpuhp_state {
> CPUHP_FS_BUFF_DEAD,
> CPUHP_PRINTK_DEAD,
> CPUHP_MM_MEMCQ_DEAD,
> + CPUHP_MM_VSWAP_DEAD,
> CPUHP_PERCPU_CNT_DEAD,
> CPUHP_RADIX_DEAD,
> CPUHP_PAGE_ALLOC,

> +static int vswap_cpu_dead(unsigned int cpu)
> +{
> + struct vswap_cluster *cluster;
> + int order;
> +
> + rcu_read_lock();

nit:
guard(rcu)();
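
A sketch of what that nit amounts to, using the scope-based `guard()` helpers from `<linux/cleanup.h>`: the RCU read lock is dropped automatically on every return path, so the trailing `rcu_read_unlock()` in the hunk below disappears (the rest of the loop body is left exactly as in the patch, including the issues flagged further down):

```c
static int vswap_cpu_dead(unsigned int cpu)
{
	struct vswap_cluster *cluster;
	int order;

	guard(rcu)();	/* rcu_read_unlock() runs automatically at return */
	for (order = 0; order < SWAP_NR_ORDERS; order++) {
		cluster = per_cpu(percpu_vswap_cluster.clusters[order], cpu);
		if (!cluster)
			continue;
		per_cpu(percpu_vswap_cluster.clusters[order], cpu) = NULL;
		spin_lock(&cluster->lock);
		cluster->cached = false;
		if (refcount_dec_and_test(&cluster->refcnt))
			vswap_cluster_free(cluster);
		spin_unlock(&cluster->lock);
	}
	return 0;
}
```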

> + for (order = 0; order < SWAP_NR_ORDERS; order++) {
> + cluster = per_cpu(percpu_vswap_cluster.clusters[order], cpu);
> + if (cluster) {
> + per_cpu(percpu_vswap_cluster.clusters[order], cpu) = NULL;
> + spin_lock(&cluster->lock);

This breaks on PREEMPT_RT: this callback runs with IRQs disabled, and
spinlock_t is a sleeping lock there. This must be a raw_spinlock_t.
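
A minimal sketch of that conversion (field names taken from the patch; the location of the init site is assumed):

```c
struct vswap_cluster {
	raw_spinlock_t lock;	/* was spinlock_t: raw locks keep spinning on RT */
	/* other members as in the patch */
};

/* wherever the cluster is initialized (site assumed): */
	raw_spin_lock_init(&cluster->lock);

/* and in vswap_cpu_dead(): */
	raw_spin_lock(&cluster->lock);
	cluster->cached = false;
	if (refcount_dec_and_test(&cluster->refcnt))
		vswap_cluster_free(cluster);
	raw_spin_unlock(&cluster->lock);
```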

> + cluster->cached = false;
> + if (refcount_dec_and_test(&cluster->refcnt))
> + vswap_cluster_free(cluster);

And this call to vswap_cluster_free() has the same problem -- see below.

> + spin_unlock(&cluster->lock);
> + }
> + }
> + rcu_read_unlock();
> +
> + return 0;
> +}

> +static void vswap_cluster_free(struct vswap_cluster *cluster)
> +{
> + VM_WARN_ON(cluster->count || cluster->cached);
> + VM_WARN_ON(!spin_is_locked(&cluster->lock));

This is terrible -- spin_is_locked() only tells you that *somebody*
holds the lock, not that the caller does. Please use:

lockdep_assert_held(&cluster->lock);

which asserts the current context holds it and compiles away without
lockdep.

> + xa_lock(&vswap_cluster_map);

This is again broken; xa_lock() is a regular spinlock, so this cannot
be called from a DEAD callback running with IRQs disabled.

> + list_del_init(&cluster->list);
> + __xa_erase(&vswap_cluster_map, cluster->id);

Strictly speaking this can end up in xas_alloc(), which is again, not
allowed in a DEAD callback.

> + xa_unlock(&vswap_cluster_map);
> + rcu_head_init(&cluster->rcu);
> + kvfree_rcu(cluster, rcu);
> +}
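
One possible direction -- purely a hypothetical sketch, not something the review above prescribes -- is to push the xarray bookkeeping out of the DEAD callback into process context via a work item, so neither xa_lock() nor a potential xas_alloc() ever happens with IRQs disabled. This assumes a new `free_work` member is added to struct vswap_cluster:

```c
/* Hypothetical: defer map teardown to process context. */
static void vswap_cluster_free_workfn(struct work_struct *work)
{
	struct vswap_cluster *cluster =
		container_of(work, struct vswap_cluster, free_work);

	xa_lock(&vswap_cluster_map);
	list_del_init(&cluster->list);
	__xa_erase(&vswap_cluster_map, cluster->id);
	xa_unlock(&vswap_cluster_map);
	kvfree_rcu(cluster, rcu);
}

static void vswap_cluster_free(struct vswap_cluster *cluster)
{
	lockdep_assert_held(&cluster->lock);
	VM_WARN_ON(cluster->count || cluster->cached);
	INIT_WORK(&cluster->free_work, vswap_cluster_free_workfn);
	schedule_work(&cluster->free_work);
}
```

Whether the cluster may legitimately outlive the DEAD callback until the work runs is a lifetime question the patch author would have to answer.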