Re: [PATCH v12 2/3] rust: add dma coherent allocator abstraction.
From: Alice Ryhl
Date: Mon Mar 03 2025 - 10:46:22 EST
On Mon, Mar 3, 2025 at 4:21 PM Andreas Hindborg <a.hindborg@xxxxxxxxxx> wrote:
>
> "Alice Ryhl" <aliceryhl@xxxxxxxxxx> writes:
>
> > On Mon, Mar 3, 2025 at 2:00 PM Andreas Hindborg <a.hindborg@xxxxxxxxxx> wrote:
> >>
> >> "Daniel Almeida" <daniel.almeida@xxxxxxxxxxxxx> writes:
> >>
> >> > Hi Benno,
> >> >
> >>
> >> [...]
> >>
> >> >>> + /// Writes data to the region starting from `offset`. `offset` is in units of `T`, not the
> >> >>> + /// number of bytes.
> >> >>> + ///
> >> >>> + /// # Examples
> >> >>> + ///
> >> >>> + /// ```
> >> >>> + /// # fn test(alloc: &mut kernel::dma::CoherentAllocation<u8>) -> Result {
> >> >>> + /// let somedata: [u8; 4] = [0xf; 4];
> >> >>> + /// let buf: &[u8] = &somedata;
> >> >>> + /// alloc.write(buf, 0)?;
> >> >>> + /// # Ok::<(), Error>(()) }
> >> >>> + /// ```
> >> >>> + pub fn write(&self, src: &[T], offset: usize) -> Result {
> >> >>> + let end = offset.checked_add(src.len()).ok_or(EOVERFLOW)?;
> >> >>> + if end >= self.count {
> >> >>> + return Err(EINVAL);
> >> >>> + }
> >> >>> + // SAFETY:
> >> >>> + // - The pointer is valid due to type invariant on `CoherentAllocation`
> >> >>> + // and we've just checked that the range and index is within bounds.
> >> >>> + // - `offset` can't overflow since it is smaller than `selfcount` and we've checked
> >> >>> + // that `self.count` won't overflow early in the constructor.
> >> >>> + unsafe {
> >> >>> + core::ptr::copy_nonoverlapping(src.as_ptr(), self.cpu_addr.add(offset), src.len())
> >> >>
> >> >> Why are there no concurrent write or read operations on `cpu_addr`?
> >> >
> >> > Sorry, can you rephrase this question?
> >>
> >> This write is suffering the same complications as discussed here [1].
> >> There are multiple issues with this implementation.
> >>
> >> 1) `write` takes a shared reference and thus may be called concurrently.
> >> There is no synchronization, so `copy_nonoverlapping` could be called
> >> concurrently on the same address. The safety requirements for
> >> `copy_nonoverlapping` state that the destination must be valid for
> >> write. Alice claims in [1] that any memory area that experience data
> >> races are not valid for writes. So the safety requirement of
> >> `copy_nonoverlapping` is violated and this call is potential UB.
> >>
> >> 2) The destination of this write is DMA memory. It could be concurrently
> >> modified by hardware, leading to the same issues as 1). Thus the
> >> function cannot be safe if we cannot guarantee hardware will not write
> >> to the region while this function is executing.
> >>
> >> Now, I don't think that these _should_ be issues, but according to our
> >> Rust language experts they _are_.
> >>
> >> I really think that copying data through a raw pointer to or from a
> >> place that experiences data races, should _not_ be UB if the data is not
> >> interpreted in any way, other than moving it.
> >>
> >>
> >> Best regards,
> >> Andreas Hindborg
> >
> > We need to make progress on this series, and it's starting to get late
> > in the cycle. I suggest we:
>
> There is always another cycle.
>
> >
> > 1. Delete as_slice, as_slice_mut, write, and skip_drop.
> > 2. Change field_read/field_write to use a volatile read/write.
>
> Volatile reads/writes that race are OK?
I will not give a blanket yes to that. If you read their docs, you
will find that they claim to not allow it. But they are the correct
choice for DMA memory, and there's no way in practice to get
miscompilations on memory locations that are only accessed with
volatile operations, and never have references to them created.
In general, this will fall into the exception that we've been given
from the Rust people. In cases such as this where the Rust language
does not give us the operation we want, do it like you do in C. Since
Rust uses LLVM which does not miscompile the C part of the kernel, it
should not miscompile the Rust part either.
> > This will let us make progress now and sidestep this discussion. The
> > deleted methods can happen in a follow-up.
>
> `item_from_index`, the `dma_read` and `dma_write` macros as well, I would think?
Those are necessary to use field_read/field_write, so I think it's
fine to keep those.
Alice