Re: [PATCH v2 06/16] iommupt: Implement preserve/unpreserve/restore callbacks

From: Samiullah Khawaja

Date: Tue May 19 2026 - 13:17:03 EST


On Tue, May 19, 2026 at 01:15:08PM +0000, Pranjal Shrivastava wrote:
On Mon, Apr 27, 2026 at 05:56:23PM +0000, Samiullah Khawaja wrote:
Implement the iommu domain ops for preservation, unpreservation and
restoration of iommu domains for liveupdate. Use the existing page
walker to preserve the ioptdesc of the top_table and the lower tables.

Preserve top_level, VASZ and FEAT Sign Extended to restore the domain in
the next kernel. On restore the domain has only the preserved features
enabled and all the other features are zeroed. This is ok since the
restored domain is made immutable and can only be freed. A kunit test is
added to verify that the IOMMU domain free can be done with trimmed
features.

Signed-off-by: Samiullah Khawaja <skhawaja@xxxxxxxxxx>
---
drivers/iommu/generic_pt/iommu_pt.h | 131 ++++++++++++++++++++++
drivers/iommu/generic_pt/kunit_iommu_pt.h | 28 +++++
include/linux/generic_pt/iommu.h | 19 +++-
3 files changed, 177 insertions(+), 1 deletion(-)


[...]

+static const struct pt_iommu_ops NS(ops_immutable);
+
+/**
+ * restore() - Restore page tables and other state of a domain.
+ * @domain: Domain to restore
+ * @ser: Serialized domain state preserved by the previous kernel
+ *
+ * Returns: -ERRNO on failure, 0 on success.
+ */
+int DOMAIN_NS(restore)(struct iommu_domain *domain, struct iommu_domain_ser *ser)
+{
+ struct pt_iommu *iommu_table =
+ container_of(domain, struct pt_iommu, domain);
+ struct pt_common *common = common_from_iommu(iommu_table);
+ struct pt_range range;
+
+ common->max_vasz_lg2 = ser->vasz;
+
+ /* Make this domain immutable.*/
+ iommu_table->ops = &NS(ops_immutable);
+

Nit: Let's consider adding a comment since we just have .deinit in ops:

/*
* Restored page tables are strictly transient and only permitted to be
* destroyed via .deinit. Because we only preserve user-assigned devices
* utilizing pass-through frameworks (VFIO / IOMMUFD), any concurrent or
* subsequent map/unmap operations on a restored domain are explicitly
* blocked at the subsystem boundary (e.g., via IOAS immutability seals).
*/

This makes it explicitly clear that any NULL pointer dereferences
occurring due to immutable_ops->map_range etc. (in the future) are
considered bugs.

Sounds great, will add a comment here.
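
To illustrate the point about an ops table that only sets .deinit, here
is a minimal userspace sketch. It is not the generic_pt code: struct
demo_ops, struct demo_table, demo_map() and the -EPERM return are all
hypothetical names and choices made up for this example. It only shows
that a callback left out of a designated initializer is NULL, so a
caller that does not check it would dereference a NULL pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's pt_iommu types. */
struct demo_table;

struct demo_ops {
	int (*map_range)(struct demo_table *table, unsigned long iova,
			 size_t len);
	void (*deinit)(struct demo_table *table);
};

struct demo_table {
	const struct demo_ops *ops;
	int live;
};

static int demo_map_range_impl(struct demo_table *table, unsigned long iova,
			       size_t len)
{
	(void)table;
	(void)iova;
	(void)len;
	return 0;
}

static void demo_deinit_impl(struct demo_table *table)
{
	assert(table != NULL);
	table->live = 0;
}

/* Regular table: every callback is populated. */
static const struct demo_ops demo_ops_full = {
	.map_range = demo_map_range_impl,
	.deinit = demo_deinit_impl,
};

/* Mirrors the shape of ops_immutable: .map_range is implicitly NULL. */
static const struct demo_ops demo_ops_immutable = {
	.deinit = demo_deinit_impl,
};

/*
 * Hypothetical boundary check: refuse the operation when the callback is
 * absent instead of calling through a NULL function pointer.
 */
static int demo_map(struct demo_table *table, unsigned long iova, size_t len)
{
	if (!table->ops->map_range)
		return -EPERM;
	return table->ops->map_range(table, iova, len);
}
```

With a table whose ops point at demo_ops_immutable, demo_map() returns
-EPERM while ops->deinit() still works. In the patch itself the blocking
is expected to happen higher up, which is what the proposed comment
documents.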

+ /*
+ * It is safe to override this here since this domain is immutable and
+ * can only be freed.
+ */
+ common->features = 0;
+ if (ser->sign_extend)
+ common->features |= BIT(PT_FEAT_SIGN_EXTEND);
+
+ range = pt_all_range(common);
+ iommu_restore_page(ser->top_table_phys);
+
+ /* Free new table */
+ iommu_free_pages(range.top_table);
+
+ /* Set the restored top table */
+ pt_top_set(common, phys_to_virt(ser->top_table_phys), ser->top_level);
+
+ /* Restore all pages*/
+ range = pt_all_range(common);
+ return pt_walk_range(&range, __restore_tables, NULL);
+}
+EXPORT_SYMBOL_NS_GPL(DOMAIN_NS(restore), "GENERIC_PT_IOMMU");
+#endif
+
struct pt_unmap_args {
struct iommu_pages_list free_list;
pt_vaddr_t unmapped;
@@ -1138,6 +1265,10 @@ static const struct pt_iommu_ops NS(ops) = {
.deinit = NS(deinit),
};

+static const struct pt_iommu_ops NS(ops_immutable) = {
+ .deinit = NS(deinit),
+};
+

[...]

Thanks,
Praan

Sami