Re: [PATCH 6/9] gpu: nova-core: generalize `flush_into_kvec` to `flush_into_vec`
From: Alexandre Courbot
Date: Mon Mar 16 2026 - 21:55:48 EST
On Mon Mar 16, 2026 at 9:21 PM JST, Danilo Krummrich wrote:
> (Cc: Gary)
>
> On Mon Mar 16, 2026 at 12:44 PM CET, Eliot Courtney wrote:
>> On Tue Mar 10, 2026 at 7:01 AM JST, Danilo Krummrich wrote:
>>> On Mon Mar 9, 2026 at 10:57 PM CET, Danilo Krummrich wrote:
>>>> On 2/27/2026 1:32 PM, Eliot Courtney wrote:
>>>>> Add general `flush_into_vec` function. Add `flush_into_kvvec`
>>>>> convenience wrapper alongside the existing `flush_into_kvec` function.
>>>>> This is generally useful but immediately used for e.g. holding RM
>>>>> control payloads, which can be large (~>=20 KiB).
>>>>
>>>> Why not just always use KVVec? It also seems that the KVec variant is not used?
>>>
>>> (Besides its single usage in GspSequence, which wouldn't hurt to be a KVVec.)
>>>
>>>> If there's no reason for having both, I'd also just call this into_vec().
>>
>> I think always using KVVec should be fine, thanks!
>>
>> For the naming, I think `read_to_vec` may be more conventional for this
>> -- `into_vec` implies consuming the object, but if we want to keep the
>> warning in `Cmdq::receive_msg` if not all the data is consumed we need
>> to take &mut self.
>
> I had another look at this and especially how the SBuffer you refer to is used.
> Unfortunately, the underlying code is broken.
>
> driver_read_area() creates a reference to the whole DMA object, including the
> area the GSP might concurrently write to. This is undefined behavior. See also
> commit 0073a17b4666 ("gpu: nova-core: gsp: fix UB in DmaGspMem pointer
> accessors"), where I fixed something similar.
We shouldn't be doing that - I think we are limited by the current
CoherentAllocation API though. But IIUC this is something that I/O
projections will allow us to handle properly?
>
> Additionally, even if it would only create a reference to the part of the buffer
> that can be considered untouched by the GSP, and is hence suitable for taking a
> reference to, driver_read_area() and all subsequent callers would still need to be
> unsafe as they would need to promise not to keep the reference alive beyond the
> GSP accessing that memory region again.
This is guaranteed by the inability to update the CPU read pointer for
as long as the slices exist.
To expand a bit: `driver_read_area` returns a slice to the area of the
DMA object that the GSP is guaranteed *not* to write into until the
driver updates the CPU read pointer.
This area is between the CPU read pointer (which signals the next bytes
the CPU has to read, and which the GSP won't cross) and the GSP write
pointer (i.e. the next page to be written by the GSP).
Everything in this zone is data that the GSP has already written but the
driver hasn't read yet at the time of the call.
The CPU read pointer cannot be updated for as long as the returned
slices exist - the slices hold a reference to the `DmaGspMem`, and
updating the read pointer requires a mutable reference to the same
`DmaGspMem`.
Meanwhile, the GSP can keep writing data while the slice exists but that
data will be past the area of the slice, and the GSP will never write
past the CPU read pointer.
So the data in the returned slices is guaranteed to be there at the time
of the call, and immutable for as long as the slices exist. Thus, they
can be provided by a safe method.
Unless we decide to not trust the GSP, but that would be opening a whole
new can of worms.
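To make the borrow-checker argument concrete, here is a minimal userspace sketch of the pattern described above. The names (`DmaGspMem`, `driver_read_area`) follow the discussion, but the struct layout and method bodies are purely illustrative stand-ins for the real DMA-backed implementation:

```rust
/// Illustrative stand-in for the GSP message queue memory. In the real
/// driver this wraps a DMA-coherent allocation; here a Vec suffices to
/// show the borrow relationships.
struct DmaGspMem {
    buf: Vec<u8>,     // stands in for the shared DMA buffer
    read_ptr: usize,  // CPU read pointer (GSP never writes past this)
    write_ptr: usize, // GSP write pointer, as last observed by the driver
}

impl DmaGspMem {
    /// Returns the region the GSP is guaranteed not to touch: the data
    /// between the CPU read pointer and the GSP write pointer. Takes
    /// `&self`, so the returned slice holds a shared borrow of `self`.
    fn driver_read_area(&self) -> &[u8] {
        &self.buf[self.read_ptr..self.write_ptr]
    }

    /// Advances the CPU read pointer, allowing the GSP to reuse the
    /// consumed region. Takes `&mut self`, so the borrow checker
    /// rejects any call while a slice from `driver_read_area()` is
    /// still live.
    fn advance_read_ptr(&mut self, consumed: usize) {
        self.read_ptr += consumed;
    }
}

fn main() {
    let mut mem = DmaGspMem {
        buf: vec![1, 2, 3, 4, 5],
        read_ptr: 1,
        write_ptr: 4,
    };

    let area = mem.driver_read_area();
    assert_eq!(area, &[2, 3, 4]);
    // mem.advance_read_ptr(3); // would not compile: `area` still borrows `mem`
    let consumed = area.len(); // last use of `area`

    mem.advance_read_ptr(consumed); // fine once the slice is dead
    assert_eq!(mem.driver_read_area(), &[] as &[u8]);
}
```

The point of the sketch is only the signatures: because the slice carries a shared borrow and the pointer update needs an exclusive one, "the read pointer cannot move while the slice exists" is enforced at compile time rather than promised in an `unsafe` contract.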
> I don't want to merge any code that builds on top of this before we have sorted
> this out.
If what I have written above is correct, then the fix should simply be
to use I/O projections to create properly-bounded references. Any more
immediate fix would need to be much more intrusive and require a
refactoring that is imho more risky than carrying on for a bit with the
current behavior.