Re: [PATCH 3/4] x86/virt/tdx: Add SEAMCALL wrapper for TDH.SYS.DISABLE

From: Kiryl Shutsemau

Date: Tue Mar 17 2026 - 05:56:32 EST


On Mon, Mar 16, 2026 at 09:15:13PM +0000, Edgecombe, Rick P wrote:
> On Mon, 2026-03-16 at 11:51 +0000, Kiryl Shutsemau wrote:
> > On Fri, Mar 06, 2026 at 05:03:57PM -0800, Rick Edgecombe wrote:
> > > From: Vishal Verma <vishal.l.verma@xxxxxxxxx>
> > >
> > > Some early TDX-capable platforms have an erratum where a partial write
> > > to TDX private memory can cause a machine check on a subsequent read.
> > > On these platforms, kexec and kdump have been disabled in these cases,
> > > because the old kernel cannot safely hand off TDX state to the new
> > > kernel. Later TDX modules support the TDH.SYS.DISABLE SEAMCALL, which
> > > provides a way to cleanly disable TDX and allow kexec to proceed.
> >
> > Does it need to be enumerated?
> >
> > I don't see this SEAMCALL be covered in the public documentation.
> > </me looking around>
> > Ah! Found it the the draft. So the feature is not yet finalized.
> >
> > "Support of TDH.SYS.DISABLE is enumerated by TDX_FEATURES0. SYS_DISABLE
> > (bit 53)"
> >
> > I am seeing the next patch calling it unconditionally. Is it okay?
>
> We debated checking the feature bit before allowing kexec, but decided it was
> simpler to just blindly call and ignore the errors. The reasoning was that this
> is already a somewhat exotic scenario being addressed, and future modules will
> have the feature. So maintaining a check for the feature bit only helps a little
> bit, for a short time. And then only if the user would rather have kexec blocked
> than attempt it. Do you think it is worth it?

No, I see very limited reason to support stale TDX modules. Users are
expected to keep the module up-to-date, so skipping enumeration should
be okay. But it deserves explanation in the commit message or a comment.

> > > This can be a long running operation, and the time needed largely
> > > depends on the amount of memory that has been allocated to TDs. If all
> > > TDs have been destroyed prior to the sys_disable call, then it is fast,
> > > with only needing to override the TDX module memory.
> > >
> > > After the SEAMCALL completes, the TDX module is disabled and all memory
> > > resources allocated to TDX are freed and reset. The next kernel can then
> > > re-initialize the TDX module from scratch via the normal TDX bring-up
> > > sequence.
> > >
> > > The SEAMCALL may be interrupted by an interrupt. In this case, it
> > > returns TDX_INTERRUPTED_RESUMABLE, and it must be retried in a loop
> > > until the operation completes successfully.
> > >
> > > Add a tdx_sys_disable() helper, which implements the retry loop around
> > > the SEAMCALL to provide this functionality.
> > >
> > > Signed-off-by: Vishal Verma <vishal.l.verma@xxxxxxxxx>
> > > Signed-off-by: Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
> > > ---
> > >   arch/x86/include/asm/tdx.h  |  3 +++
> > >   arch/x86/virt/vmx/tdx/tdx.c | 18 ++++++++++++++++++
> > >   arch/x86/virt/vmx/tdx/tdx.h |  1 +
> > >   3 files changed, 22 insertions(+)
> > >
> > > diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> > > index f0826b0a512a..baaf43a09e99 100644
> > > --- a/arch/x86/include/asm/tdx.h
> > > +++ b/arch/x86/include/asm/tdx.h
> > > @@ -173,6 +173,8 @@ static inline int pg_level_to_tdx_sept_level(enum pg_level level)
> > >           return level - 1;
> > >   }
> > >  
> > > +void tdx_sys_disable(void);
> > > +
> > >   u64 tdh_vp_enter(struct tdx_vp *vp, struct tdx_module_args *args);
> > >   u64 tdh_mng_addcx(struct tdx_td *td, struct page *tdcs_page);
> > >   u64 tdh_mem_page_add(struct tdx_td *td, u64 gpa, struct page *page, struct page *source, u64 *ext_err1, u64 *ext_err2);
> > > @@ -204,6 +206,7 @@ static inline void tdx_init(void) { }
> > >   static inline u32 tdx_get_nr_guest_keyids(void) { return 0; }
> > >   static inline const char *tdx_dump_mce_info(struct mce *m) { return NULL; }
> > >   static inline const struct tdx_sys_info *tdx_get_sysinfo(void) { return NULL; }
> > > +static inline void tdx_sys_disable(void) { }
> > >   #endif /* CONFIG_INTEL_TDX_HOST */
> > >  
> > >   #endif /* !__ASSEMBLER__ */
> > > diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> > > index 0802d0fd18a4..68bd2618dde4 100644
> > > --- a/arch/x86/virt/vmx/tdx/tdx.c
> > > +++ b/arch/x86/virt/vmx/tdx/tdx.c
> > > @@ -37,6 +37,7 @@
> > >   #include <asm/msr.h>
> > >   #include <asm/cpufeature.h>
> > >   #include <asm/tdx.h>
> > > +#include <asm/shared/tdx_errno.h>
> > >   #include <asm/cpu_device_id.h>
> > >   #include <asm/processor.h>
> > >   #include <asm/mce.h>
> > > @@ -1940,3 +1941,20 @@ u64 tdh_phymem_page_wbinvd_hkid(u64 hkid, struct page *page)
> > >    return seamcall(TDH_PHYMEM_PAGE_WBINVD, &args);
> > >   }
> > >   EXPORT_SYMBOL_FOR_KVM(tdh_phymem_page_wbinvd_hkid);
> > > +
> > > +void tdx_sys_disable(void)
> > > +{
> > > + struct tdx_module_args args = {};
> > > +
> > > + /*
> > > + * SEAMCALLs that can return TDX_INTERRUPTED_RESUMABLE are guaranteed
> > > + * to make forward progress between interrupts, so it is safe to loop
> > > + * unconditionally here.
> > > + *
> > > + * This is a 'destructive' SEAMCALL, in that no other SEAMCALL can be
> > > + * run after this until a full reinitialization is done.
> > > + */
> > > + while (seamcall(TDH_SYS_DISABLE, &args) == TDX_INTERRUPTED_RESUMABLE)
> > > + ;
> >
> > Silently ignore any other errors?
>
> Do you think it's worth a warn? There are a couple other considerations.
> - Kai brought up offline that we should handle TDX_SYS_BUSY here too.
> - Previous kexec patches had trouble solving races around tdx enabling. So we
> have to handle the seamcall failures.
>
> So we have to exclude a few different errors in different ways. And then the
> warn worthy error codes either don't impact anything, or the new kernel will
> fail to initialize the TDX module and give notice there.

The delayed error is harder to debug. It can be useful to leave a
breadcrumbs.

Also, do we want to make try_init_module_global() return failure after
tdx_sys_disable()? I guess, TDH_SYS_LP_INIT will fail anyway, so it
shouldn't matter.

--
Kiryl Shutsemau / Kirill A. Shutemov