Re: [PATCH] md/raid5: skip 2-failure compute when other disk is R5_LOCKED
From: Yu Kuai
Date: Fri Mar 20 2026 - 00:04:50 EST
On 2026/3/19 13:33, FengWei Shih wrote:
> When skip_copy is enabled on a doubly-degraded RAID6, a device that is
> being written to will be in R5_LOCKED state with R5_UPTODATE cleared.
> If a new read triggers fetch_block() while the write is still in
> flight, the 2-failure compute path may select this locked device as a
> compute target because it is not R5_UPTODATE.
>
> Because skip_copy makes the device page point directly to the bio page,
> reconstructing data into it might be risky. Also, since the compute
> marks the device R5_UPTODATE, it triggers WARN_ON in ops_run_io()
> which checks that R5_SkipCopy and R5_UPTODATE are not both set.
>
> This can be reproduced by running small-range concurrent read/write on
> a doubly-degraded RAID6 with skip_copy enabled, for example:
>
> mdadm -C /dev/md0 -l6 -n6 -R -f /dev/loop[0-3] missing missing
> echo 1 > /sys/block/md0/md/skip_copy
> fio --filename=/dev/md0 --rw=randrw --bs=4k --numjobs=8 \
> --iodepth=32 --size=4M --runtime=30 --time_based --direct=1
>
> Fix by checking R5_LOCKED before proceeding with the compute. The
> compute will be retried once the lock is cleared on IO completion.
>
> Signed-off-by: FengWei Shih <dannyshih@xxxxxxxxxxxx>
> ---
> drivers/md/raid5.c | 2 ++
> 1 file changed, 2 insertions(+)
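
For anyone following the thread, the race and the guard described in the
commit message can be modelled with a small userspace sketch. This is not
the patch and not raid5.c code: the toy_* names and flag values are
hypothetical stand-ins for R5_UPTODATE, R5_LOCKED and R5_SkipCopy, and the
helpers only mirror the behaviour the commit message describes.

```c
#include <stdbool.h>

/* Toy flag bits. Names echo the kernel's R5_* flags; values are made up. */
enum toy_flag {
    TOY_UPTODATE = 1u << 0,  /* page holds valid data */
    TOY_LOCKED   = 1u << 1,  /* I/O is still in flight on this device */
    TOY_SKIPCOPY = 1u << 2,  /* device page aliases the bio page */
};

struct toy_dev {
    unsigned int flags;
};

/* Without the guard, any device lacking valid data looks like a target. */
static bool toy_needs_compute(const struct toy_dev *dev)
{
    return !(dev->flags & TOY_UPTODATE);
}

/* With the guard: refuse a target whose I/O has not completed yet. */
static bool toy_can_compute_into(const struct toy_dev *dev)
{
    if (dev->flags & TOY_LOCKED)
        return false;  /* caller retries once the lock is cleared */
    return toy_needs_compute(dev);
}

/* A finished compute marks its target as holding valid data. */
static void toy_compute_into(struct toy_dev *dev)
{
    dev->flags |= TOY_UPTODATE;
}

/* Models the invariant checked at I/O submission: never both flags set. */
static bool toy_io_would_warn(const struct toy_dev *dev)
{
    return (dev->flags & TOY_SKIPCOPY) && (dev->flags & TOY_UPTODATE);
}
```

A device with TOY_LOCKED | TOY_SKIPCOPY set passes toy_needs_compute() but
is rejected by toy_can_compute_into(). Calling toy_compute_into() on it
anyway leaves both TOY_SKIPCOPY and TOY_UPTODATE set, which is the
combination the WARN_ON in ops_run_io() is described as catching.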
Reviewed-by: Yu Kuai <yukuai@xxxxxxxxx>
--
Thanks,
Kuai