Re: [PATCH v11 12/22] gpu: nova-core: Hopper/Blackwell: add FSP secure boot completion waiting

From: Eliot Courtney

Date: Mon Jun 01 2026 - 04:33:17 EST


On Mon Jun 1, 2026 at 4:48 PM JST, Alexandre Courbot wrote:
> On Sat May 30, 2026 at 12:09 PM JST, John Hubbard wrote:
>> Hopper and Blackwell use FSP instead of SEC2 for secure boot. The
>> driver must wait for FSP secure boot to complete before continuing
>> with GSP bring-up. Poll for boot success with a 5-second timeout.
>>
>> Co-developed-by: Alexandre Courbot <acourbot@xxxxxxxxxx>
>> Signed-off-by: Alexandre Courbot <acourbot@xxxxxxxxxx>
>> Signed-off-by: John Hubbard <jhubbard@xxxxxxxxxx>
>> ---
>> drivers/gpu/nova-core/fsp.rs | 51 ++++++++++++++++++++++++++
>> drivers/gpu/nova-core/fsp/hal.rs | 27 ++++++++++++++
>> drivers/gpu/nova-core/fsp/hal/gb202.rs | 23 ++++++++++++
>> drivers/gpu/nova-core/fsp/hal/gh100.rs | 23 ++++++++++++
>> drivers/gpu/nova-core/gsp/hal/gh100.rs | 5 ++-
>> drivers/gpu/nova-core/nova_core.rs | 1 +
>> drivers/gpu/nova-core/regs.rs | 36 ++++++++++++++++++
>> 7 files changed, 165 insertions(+), 1 deletion(-)
>> create mode 100644 drivers/gpu/nova-core/fsp.rs
>> create mode 100644 drivers/gpu/nova-core/fsp/hal.rs
>> create mode 100644 drivers/gpu/nova-core/fsp/hal/gb202.rs
>> create mode 100644 drivers/gpu/nova-core/fsp/hal/gh100.rs
>>
>> diff --git a/drivers/gpu/nova-core/fsp.rs b/drivers/gpu/nova-core/fsp.rs
>> new file mode 100644
>> index 000000000000..ee8fc384fe38
>> --- /dev/null
>> +++ b/drivers/gpu/nova-core/fsp.rs
>> @@ -0,0 +1,51 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
>> +
>> +//! FSP (Firmware System Processor) interface for Hopper/Blackwell GPUs.
>> +//!
>> +//! Hopper/Blackwell use a simplified firmware boot sequence: FMC, then FSP, then GSP.
>> +//! Unlike Turing/Ampere/Ada, there is no SEC2 (Security Engine 2) usage.
>> +//! FSP handles secure boot directly using FMC firmware and Chain of Trust.
>> +
>> +use kernel::{
>> +    device,
>> +    io::poll::read_poll_timeout,
>> +    prelude::*,
>> +    time::Delta, //
>> +};
>> +
>> +use crate::{
>> +    driver::Bar0,
>> +    gpu::Chipset,
>> +    regs, //
>> +};
>> +
>> +mod hal;
>> +
>> +/// FSP interface for Hopper/Blackwell GPUs.
>> +pub(crate) struct Fsp;
>
> Throughout the patchset, this type is never instantiated and is only
> used as a namespace for static methods - something that the `fsp` module
> itself could also do.
>
> But I think it could be useful to create it and pass it as the `&mut
> self` parameter of the other methods in the module, as doing so would
> make the module more resilient: we could request `&mut self` for its two
> other methods added later in the patchset and guarantee that there won't
> be any concurrency issue.
>
> This should also probably create and own the `Falcon<Fsp>`, as it is the
> only user through this patchset (and this makes sense from an
> architectural point of view). The `FSP falcon engine stub` patch could
> then refrain from creating the `Falcon<Fsp>`, which would be created by
> this patch.
>
>> +
>> +impl Fsp {
>> +    /// Wait for FSP secure boot completion.
>> +    ///
>> +    /// Polls the thermal scratch register until FSP signals boot completion
>> +    /// or timeout occurs.
>> +    pub(crate) fn wait_secure_boot(dev: &device::Device, bar: &Bar0, chipset: Chipset) -> Result {
>
> ... and with the design proposed above, this method can return
> `Result<Fsp>`: we are not supposed to use `Fsp` until secure boot has
> successfully completed, so making it return the instance that enables
> the other methods guarantees that this has happened at the API level.

I think this is a good direction. We could also make FspFirmware
owned by this fsp::Fsp object, i.e. move more control of FSP from
Gh100::boot to the Fsp object. You might need to rework FmcBootArgs
and, e.g., return something from `boot_fmc` to keep the DMA allocation
alive.
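
To make the ownership idea a bit more concrete, here is a rough userspace
sketch of the shape being discussed. It is not nova-core code: `MockBar`,
`FspFalcon`, `FmcBootHandle`, `BootError`, the `max_polls` parameter and the
field layouts of `FspFirmware` and `FmcBootArgs` are all hypothetical
stand-ins, and the polling loop only imitates what `read_poll_timeout` would do
in the real driver. The point is just that `wait_secure_boot` is the only way
to obtain an `Fsp`, later methods take `&mut self`, and `boot_fmc` hands back
an object that keeps the buffer alive.

```rust
// Userspace sketch only. Every type here is a hypothetical stand-in and not
// the actual nova-core definition.

#[derive(Debug, PartialEq)]
pub enum BootError {
    Timeout,
}

/// Stand-in for BAR0. `polls_until_ready` models how many failed polls occur
/// before FSP reports that secure boot succeeded.
pub struct MockBar {
    polls_until_ready: u32,
}

impl MockBar {
    pub fn new(polls_until_ready: u32) -> Self {
        Self { polls_until_ready }
    }

    fn secure_boot_done(&mut self) -> bool {
        if self.polls_until_ready == 0 {
            true
        } else {
            self.polls_until_ready -= 1;
            false
        }
    }
}

/// Stand-in for `Falcon<Fsp>`.
pub struct FspFalcon;

/// Stand-in for the FSP firmware image.
pub struct FspFirmware {
    pub image: Vec<u8>,
}

/// Stand-in for the FMC boot arguments; `dma_buf` models the DMA allocation.
pub struct FmcBootArgs {
    pub dma_buf: Vec<u8>,
}

/// Returned by `boot_fmc`; owning the arguments keeps the buffer alive for as
/// long as the caller holds on to the handle.
#[must_use]
pub struct FmcBootHandle {
    args: FmcBootArgs,
}

impl FmcBootHandle {
    pub fn dma_len(&self) -> usize {
        self.args.dma_buf.len()
    }
}

/// Owns the falcon and the firmware. It can only be obtained from
/// `wait_secure_boot`, so holding one implies secure boot has completed.
pub struct Fsp {
    #[allow(dead_code)]
    falcon: FspFalcon,
    firmware: FspFirmware,
    messages_sent: u32,
}

impl Fsp {
    /// Polls up to `max_polls` times and returns the `Fsp` on success.
    pub fn wait_secure_boot(
        bar: &mut MockBar,
        firmware: FspFirmware,
        max_polls: u32,
    ) -> Result<Fsp, BootError> {
        for _ in 0..max_polls {
            if bar.secure_boot_done() {
                return Ok(Fsp { falcon: FspFalcon, firmware, messages_sent: 0 });
            }
        }
        Err(BootError::Timeout)
    }

    /// `&mut self` rules out two callers driving FSP at the same time.
    /// Returns the number of messages sent so far.
    pub fn send_message(&mut self, _bar: &mut MockBar) -> u32 {
        self.messages_sent += 1;
        self.messages_sent
    }

    /// Takes ownership of the boot arguments and returns them wrapped in a
    /// handle so the allocation is not dropped while FMC may still use it.
    pub fn boot_fmc(&mut self, args: FmcBootArgs) -> FmcBootHandle {
        FmcBootHandle { args }
    }

    pub fn firmware_len(&self) -> usize {
        self.firmware.image.len()
    }
}
```

With this shape, a caller that never got `Ok(fsp)` back has nothing to call
`send_message` or `boot_fmc` on, and dropping the `FmcBootHandle` early shows
up as a `#[must_use]` warning rather than a silent use-after-free of the
buffer.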