Re: [PATCH v1 RESEND 4/4] drm/tyr: add GPU reset handling
From: Boris Brezillon
Date: Fri Apr 10 2026 - 09:27:25 EST
On Fri, 10 Apr 2026 10:00:56 -0300
Daniel Almeida <daniel.almeida@xxxxxxxxxxxxx> wrote:
> >
> > When you begin using the hardware, you start an srcu critical region and
> > read the counter. If the counter has the sentinel value, you know the
> > hardware is resetting and you fail. Otherwise you record the counter and
> > proceed.
> >
> > If at any point you release the srcu critical region and want to
> > re-acquire it to continue the same ongoing work, then you must ensure
> > that the counter still has the same value. This ensures that if the GPU
> > is reset, then even if the reset has finished by the time you come back,
> > you still fail because the counter has changed.
>
> We don't want to "come back", anything that is in-flight must complete, i.e.:
> the reset logic must wait for in-flight jobs, because the work has already been
> dispatched to the hardware.
I assume you meant s/in-flight jobs/in-flight works/, because the whole
point of a reset is to recover from in-flight GPU jobs that hung the
GPU, so if you have to wait for them to land, you're screwed :P.
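
For reference, the counter scheme quoted above could be sketched roughly
as below in plain userspace Rust. This is only an illustration under
stated assumptions: ResetEpoch, RESETTING, begin(), revalidate(),
begin_reset() and end_reset() are hypothetical names, not anything from
the tyr driver, and the srcu read-side critical section is not modelled
at all (std atomics stand in for it). It also assumes a single resetter
at a time.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical sentinel value meaning "a reset is in progress".
const RESETTING: u64 = u64::MAX;

/// Hypothetical reset-generation counter (illustration only).
pub struct ResetEpoch {
    counter: AtomicU64,
}

impl ResetEpoch {
    pub fn new() -> Self {
        Self { counter: AtomicU64::new(0) }
    }

    /// Called when starting to use the hardware: fail if the counter holds
    /// the sentinel, otherwise return the generation to record.
    pub fn begin(&self) -> Option<u64> {
        let seen = self.counter.load(Ordering::Acquire);
        if seen == RESETTING { None } else { Some(seen) }
    }

    /// Called when re-acquiring to continue the same work: succeeds only if
    /// the counter still has the recorded value, so a reset that started
    /// (or already finished) in between is detected.
    pub fn revalidate(&self, recorded: u64) -> bool {
        self.counter.load(Ordering::Acquire) == recorded
    }

    /// Reset path: publish the sentinel and return the previous generation.
    pub fn begin_reset(&self) -> u64 {
        self.counter.swap(RESETTING, Ordering::AcqRel)
    }

    /// Reset path: publish the next generation once the reset is done.
    /// (Generation wrap-around onto the sentinel is ignored in this sketch.)
    pub fn end_reset(&self, previous: u64) {
        self.counter.store(previous.wrapping_add(1), Ordering::Release);
    }
}
```

A user would call begin() once, keep the returned value, and call
revalidate() with it every time it comes back to the same work. After
end_reset() the old value no longer matches, so stale work fails even
though the hardware is usable again for new callers.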