Re: [PATCH v9 2/2] selftests/vfio: Add NVIDIA Falcon driver for DMA testing

From: Alex Williamson

Date: Thu Mar 19 2026 - 15:29:51 EST


On Thu, 19 Mar 2026 19:04:37 +0000
David Matlack <dmatlack@xxxxxxxxxx> wrote:

> On 2026-03-17 02:42 PM, Rubin Du wrote:
> > Add a new VFIO PCI driver for NVIDIA GPUs that enables DMA testing
> > via the Falcon (Fast Logic Controller) microcontrollers. This driver
> > extracts and adapts the DMA test functionality from the NVIDIA
> > gpu-admin-tools project and integrates it into the existing VFIO
> > selftest framework.
> >
> > The Falcon is a general-purpose microcontroller present on NVIDIA GPUs
> > that can perform DMA operations between system memory and device memory.
> > By leveraging Falcon DMA, this driver allows NVIDIA GPUs to be tested
> > alongside Intel IOAT and DSA devices using the same selftest infrastructure.
> >
> > Supported GPUs:
> > - Kepler: K520, GTX660, K4000, K80, GT635
> > - Maxwell Gen1: GTX750, GTX745
> > - Maxwell Gen2: M60
> > - Pascal: P100, P4, P40
> > - Volta: V100
> > - Turing: T4
> > - Ampere: A16, A100, A10
> > - Ada: L4, L40S
> > - Hopper: H100
> >
> > The PMU falcon on Kepler and Maxwell Gen1 GPUs uses legacy FBIF register
> > offsets and requires enabling via PMC_ENABLE with the HUB bit set.
> >
> > Limitations and tradeoffs:
> >
> > 1. Architecture support:
> > Blackwell and newer architectures may require additional work
> > due to firmware.
> >
> > 2. Synchronous DMA operations:
> > Each transfer blocks until completion because the reference
> > implementation does not expose command queuing - only one
> > DMA operation can be in flight at a time.
>
> Asynchronous DMA will be important for testing Live Update:
>
> https://lore.kernel.org/kvm/20260129212510.967611-23-dmatlack@xxxxxxxxxx/
>
> That is why I split memcpy_start() and memcpy_wait() from the beginning.
>
> Would it be possible to add support for it here even though it is not in
> the reference implementation?

I'll leave the can-we questions to Rubin, but do you see either the MSI
or asynchronous issues as blockers? Currently our driver tests are
limited to a very narrow range of Intel server platforms, whereas this
is a pluggable endpoint we can install anywhere. I'd think that's
sufficiently valuable in expanding the test base to make some
compromises. Thanks,

Alex
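
Illustrative sketch, not part of the patch or of the messages above: one
way a device that can only do blocking transfers could still sit behind
a split memcpy_start()/memcpy_wait() interface is to do the whole copy
in start and have wait return the recorded status. Every name below
(fake_dma_dev, fake_hw_copy_blocking, fake_memcpy_start,
fake_memcpy_wait) is hypothetical. The "hardware" copy is a plain
memmove() standing in for programming the Falcon DMA registers and
polling for completion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical device state; not the selftest's real device struct. */
struct fake_dma_dev {
	int pending;     /* 1 if a start() has not yet been waited on */
	int last_status; /* 0 on success, -1 on failure */
};

/*
 * Stand-in for a blocking hardware transfer. A real driver would
 * program DMA registers and poll for completion here (assumption).
 */
static int fake_hw_copy_blocking(void *dst, const void *src, size_t size)
{
	if (!dst || !src)
		return -1;
	memmove(dst, src, size);
	return 0;
}

/* "Start" runs all 'count' copies to completion and records the result. */
static void fake_memcpy_start(struct fake_dma_dev *dev, void *dst,
			      const void *src, size_t size, uint64_t count)
{
	uint64_t i;

	dev->last_status = 0;
	for (i = 0; i < count && dev->last_status == 0; i++)
		dev->last_status = fake_hw_copy_blocking(dst, src, size);
	dev->pending = 1;
}

/* "Wait" has nothing left to wait for; it only reports the status. */
static int fake_memcpy_wait(struct fake_dma_dev *dev)
{
	assert(dev->pending);
	dev->pending = 0;
	return dev->last_status;
}
```

This keeps the two-call shape but gives no real overlap: the transfer
has already finished by the time start returns, so it would not keep
DMA in flight across another operation the way the Live Update test
wants.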