Re: [RFC PATCH 02/21] x86/virt/tdx: Enhance tdh_mem_page_aug() to support huge pages

From: Yan Zhao
Date: Fri May 16 2025 - 05:08:11 EST


On Wed, May 14, 2025 at 02:52:49AM +0800, Edgecombe, Rick P wrote:
> On Thu, 2025-04-24 at 11:04 +0800, Yan Zhao wrote:
> > Enhance the SEAMCALL wrapper tdh_mem_page_aug() to support huge pages.
> >
> > Verify the validity of the level and ensure that the mapping range is fully
> > contained within the page folio.
> >
> > As a conservative solution, perform CLFLUSH on all pages to be mapped into
> > the TD before invoking the SEAMCALL TDH_MEM_PAGE_AUG. This ensures that any
> > dirty cache lines do not write back later and clobber TD memory.
>
> This should have a brief background on why it doesn't use the arg - what is
> deficient today. Also, an explanation of how it will be used (i.e. what types of
> pages will be passed)
Will do.

> >
> > Signed-off-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
> > Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> > Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
> > ---
> >  arch/x86/virt/vmx/tdx/tdx.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> > index f5e2a937c1e7..a66d501b5677 100644
> > --- a/arch/x86/virt/vmx/tdx/tdx.c
> > +++ b/arch/x86/virt/vmx/tdx/tdx.c
> > @@ -1595,9 +1595,18 @@ u64 tdh_mem_page_aug(struct tdx_td *td, u64 gpa, int level, struct page *page, u
> >  		.rdx = tdx_tdr_pa(td),
> >  		.r8 = page_to_phys(page),
> >  	};
> > +	unsigned long nr_pages = 1 << (level * 9);
> > +	struct folio *folio = page_folio(page);
> > +	unsigned long idx = 0;
> >  	u64 ret;
> >  
> > -	tdx_clflush_page(page);
> > +	if (!(level >= TDX_PS_4K && level < TDX_PS_NR) ||
> > +	    (folio_page_idx(folio, page) + nr_pages > folio_nr_pages(folio)))
> > +		return -EINVAL;
>
> Shouldn't KVM not try to map a huge page in this situation? Doesn't seem like a
> job for the SEAMCALL wrapper.
Ok. If the decision is to trust KVM and all potential callers, it's reasonable
to drop those checks.
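
For reference, if the checks move to the caller, they could look roughly like
the userspace sketch below. The PS_* constants, level_to_nr_pages() and
mapping_fits_folio() are made-up names for illustration only; kernel code would
use TDX_PS_*, folio_page_idx() and folio_nr_pages() instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the TDX page-size levels (4K, 2M, 1G). */
enum { PS_4K = 0, PS_2M = 1, PS_1G = 2, PS_NR = 3 };

/* Number of 4K pages covered by one mapping at @level (9 bits per level). */
static unsigned long level_to_nr_pages(int level)
{
	return 1UL << (level * 9);
}

/*
 * Return true if a mapping at @level, starting at page index @idx inside a
 * folio of @folio_pages pages, is fully contained within that folio.
 */
static bool mapping_fits_folio(int level, unsigned long idx,
			       unsigned long folio_pages)
{
	if (level < PS_4K || level >= PS_NR)
		return false;

	return idx + level_to_nr_pages(level) <= folio_pages;
}
```

With this, a 2M mapping at index 0 of a 512-page folio passes, while the same
mapping starting at index 1 is rejected because it would run past the folio.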

> > +
> > +	while (nr_pages--)
> > +		tdx_clflush_page(nth_page(page, idx++));
>
> tdx_clflush_page() is:
> static void tdx_clflush_page(struct page *page)
> {
> 	clflush_cache_range(page_to_virt(page), PAGE_SIZE);
> }
>
> So we have loops within loops... Better to add an arg to tdx_clflush_page() or
> add a variant that takes one.
Ok.

One thing to note: even with an extra arg, tdx_clflush_page() still has to call
clflush_cache_range() page by page, because when CONFIG_SPARSEMEM is enabled
without CONFIG_SPARSEMEM_VMEMMAP, i.e. under
"#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)",
page virtual addresses are not necessarily contiguous.

What about Binbin's proposal [1]? i.e.,

	while (nr_pages)
		tdx_clflush_page(nth_page(page, --nr_pages));

[1] https://lore.kernel.org/all/a7d0988d-037c-454f-bc6b-57e71b357488@xxxxxxxxxxxxxxx/
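
A variant that takes a page count and still flushes page by page could look
roughly like the sketch below. This is a userspace model, not kernel code:
tdx_clflush_pages() is a hypothetical name, struct page is a dummy, and
clflush_cache_range(), page_to_virt() and nth_page() are stubs that only mimic
the shape of the kernel helpers (nth_page() is modelled as plain array
indexing, which is exactly what cannot be assumed in the kernel).

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL

/* Dummy stand-in for the kernel's struct page. */
struct page {
	void *vaddr;
};

/* Counters so the stubbed flush can be observed. */
static unsigned long flush_calls;
static unsigned long flushed_bytes;

/* Stub: the real helper issues CLFLUSH over the range; this only counts. */
static void clflush_cache_range(void *vaddr, unsigned long size)
{
	(void)vaddr;
	flush_calls++;
	flushed_bytes += size;
}

/* Stub: modelled as array indexing for this userspace sketch only. */
static struct page *nth_page(struct page *page, unsigned long n)
{
	return page + n;
}

/* Stub: returns the address recorded in the dummy struct page. */
static void *page_to_virt(struct page *page)
{
	return page->vaddr;
}

/*
 * Hypothetical multi-page variant: one clflush_cache_range() call per page,
 * walking from the last page down to the first so no separate index
 * variable is needed.
 */
static void tdx_clflush_pages(struct page *page, unsigned long nr_pages)
{
	while (nr_pages)
		clflush_cache_range(page_to_virt(nth_page(page, --nr_pages)),
				    PAGE_SIZE);
}
```

The caller would then pass the page count derived from the level, e.g.
tdx_clflush_pages(page, 1UL << (level * 9)), and the loop stays inside the
helper.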

> > +
> >  	ret = seamcall_ret(TDH_MEM_PAGE_AUG, &args);
> >  
> >  	*ext_err1 = args.rcx;
>