Re: [PATCH v3 2/2] exfat: EXFAT_IOC_GET_VALID_DATA ioctl

From: David Timber

Date: Mon Mar 16 2026 - 18:38:56 EST


On 3/13/26 22:59, Namjae Jeon wrote:
> I'm working on adding iomap support to exFAT, and I think SEEK_HOLE
> will be able to address the requirements we discussed. I will bring
> this up again once the iomap work is complete.
Good! exFAT not being built on iomap, and the catastrophic tech debt
that would eventually follow, has been one of my concerns. Since you're
working on it, you'd naturally implement FIEMAP as well, right? I was
cooking up some quick-and-dirty functions for FIEMAP... Not sure if
you're interested in them, but I would be happy to submit them in the
future. It's only a kernel-specific interface that almost no userspace
uses, but it's a really insightful tool for analyzing the fragmentation
issue I'm trying to solve. It might come in handy in the meantime, while
the new iomap implementation settles down.
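
To show what I mean by FIEMAP being handy for fragmentation analysis,
here is a minimal userspace sketch. It assumes the filesystem implements
FIEMAP at all (as far as I know exFAT does not yet, so the ioctl would
be expected to fail there). count_extents() is a made-up helper name for
illustration only.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>		/* FS_IOC_FIEMAP */
#include <linux/fiemap.h>	/* struct fiemap, FIEMAP_* */

/*
 * Hypothetical helper: return the number of extents mapped for fd,
 * or -1 with errno set (e.g. when the filesystem has no FIEMAP).
 */
static long count_extents(int fd)
{
	struct fiemap fm;

	memset(&fm, 0, sizeof(fm));
	fm.fm_start = 0;
	fm.fm_length = FIEMAP_MAX_OFFSET;	/* map the whole file */
	fm.fm_flags = FIEMAP_FLAG_SYNC;		/* flush dirty data first */
	fm.fm_extent_count = 0;			/* 0: only count the extents */

	if (ioctl(fd, FS_IOC_FIEMAP, &fm) < 0)
		return -1;

	return (long)fm.fm_mapped_extents;
}
```

A file that reports many extents relative to its size is a likely
candidate for the fragmentation problem.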

On 3/14/26 01:24, Darrick J. Wong wrote:
> Ah, ok. Does it do that zeroing at write() time, or only when you're
> initiating writeback from the pagecache? I'm guessing write() time,
> since otherwise you're signing the kernel up for initiating a lot of IO
> at a time when memory could be scarce.
NO! That'd be insane! I'm pretty sure faults are handled "sparsely".
> In contrast, lot of people call readable file regions not backed by any
> space "sparse holes". Unfortunately the fallocate manpage muddies
> things up by saying:
Yes, that to which I myself have fallen victim. Thank you for such a
great insight from a filesystem guru, btw.

> OTOH I guess they could confirm that by calling the VDL ioctl and
> getting a non-error response. But if we've solved finding the VDL by
> making SEEK_HOLE return values below EOF, then why do we need the ioctl?
> What if we added a statx flag to advertise sparse hole support on a
> file? And then didn't set it for exfat?
> /me notes that you can implement iomap only for lseek.
Yes, I'm aware. I was just expressing my concerns about the exFAT tech
debt mentioned earlier. It just got swapped back in from my head in a
different form.

Anyway, admitting my recent defeat, I've been toying with the idea of
using SEEK_DATA and SEEK_HOLE to detect the [VDL, isize) discrepancy
(patch attached at the end of the email). I'm not making an official
submission yet because I'm still testing whether it breaks any critical
userland utils. So far, so good. A bunch of xfstests fail, though: the
ones written with the assumption that filesystems that support SEEK_DATA
and SEEK_HOLE can always have data or a hole in the middle (which I'd
like to call "hole sandwich/burger" and "data sandwich/burger",
respectively). This is what I was worried about, and it seems it has
actually manifested.
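
A rough sketch of how userspace could probe for the unwritten tail with
this approach, assuming the llseek handler attached at the end of this
email is applied. probe_unwritten_tail() is a hypothetical name, not
something from the patch. On filesystems that do support sparse files,
SEEK_HOLE from offset 0 returns the first hole, which need not reach
EOF, so the result only means "the tail" on exFAT.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* glibc exposes SEEK_HOLE under this macro */
#endif
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_HOLE
#define SEEK_HOLE 4		/* Linux value, in case libc headers hide it */
#endif

/*
 * Hypothetical helper: find where the first hole starts and how many
 * bytes lie between that point and EOF. The file position is restored.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int probe_unwritten_tail(int fd, off_t *data_end, off_t *tail)
{
	off_t saved, size, hole;
	int ret = -1, err;

	saved = lseek(fd, 0, SEEK_CUR);
	if (saved < 0)
		return -1;

	size = lseek(fd, 0, SEEK_END);
	if (size < 0)
		goto restore;

	if (size == 0) {
		/* SEEK_HOLE at or past EOF fails with ENXIO; nothing to probe */
		hole = 0;
	} else {
		hole = lseek(fd, 0, SEEK_HOLE);
		if (hole < 0)
			goto restore;
	}

	*data_end = hole;
	*tail = size - hole;
	ret = 0;

restore:
	err = errno;			/* keep the first error, if any */
	if (lseek(fd, saved, SEEK_SET) < 0)
		return -1;
	errno = err;
	return ret;
}
```

With the patch, data_end would be the VDL rounded up to the block size
(clamped to isize), so a non-zero tail is the hint that the file has an
unwritten range.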

generic/285:
> 05.15 SEEK_HOLE expected -1 with errno -6, got -6.               
> succ  ERROR 28: Failed to write 524288 bytes
generic/490:
> File system does not support punch hole.
>   ERROR 28: Failed to write 32768 bytes
Not sure why all the units report "succ" while the program returns a
non-zero exit code before saying:
> seek sanity check failed!
Might have to fix the test program to catch misbehaving filesystems
that return other errnos.

Davo

diff --git a/fs/exfat/file.c b/fs/exfat/file.c
index 2daf0dbabb24..99ba1f5f9a57 100644
--- a/fs/exfat/file.c
+++ b/fs/exfat/file.c
@@ -799,8 +799,80 @@ static ssize_t exfat_splice_read(struct file *in, loff_t *ppos,
     return filemap_splice_read(in, ppos, pipe, len, flags);
 }
 
+/*
+ * A special SEEK_DATA and SEEK_HOLE handler that treats the unwritten range
+ * between the VDL(valid data length) and EOF as a hole. Since the VDL in exFAT
+ * is not required to be aligned to any block boundary and holes in extent-based
+ * filesystems are typically aligned to a certain block size, we try our best to
+ * align the VDL to the device block size so as not to confuse any userland
+ * programs that may depend on that assumption.
+ *
+ * The function will treat the last block containing data as the last data block
+ * and the block that follows immediately after as the start of the hole leading
+ * up to EOF. The last data block may have some unwritten bytes, but that's only
+ * O(1) write amplification.
+ */
+static loff_t exfat_vdl_llseek(struct file *file, loff_t offset, int whence)
+{
+    struct inode *inode = file->f_mapping->host;
+    struct super_block *sb = inode->i_sb;
+    struct exfat_inode_info *ei = EXFAT_I(inode);
+    loff_t maxbytes = sb->s_maxbytes;
+    loff_t datasize;
+    loff_t size;
+
+    inode_lock(inode);
+
+    size = i_size_read(inode);
+
+    datasize = EXFAT_B_TO_BLK_ROUND_UP(ei->valid_size, sb);
+    datasize = EXFAT_BLK_TO_B(datasize, sb);
+    if (datasize > size)
+        datasize = size;
+
+    /* Same check found in iomap_seek_*() */
+    if (offset < 0 || offset >= size) {
+        offset = -ENXIO;
+        goto out;
+    }
+
+    if (whence == SEEK_DATA) {
+        /*
+         * As exFAT does not support sparse files, SEEK_DATA is pretty
+         * much useless. But still, to be compliant, SEEK_DATA must fail
+         * with ENXIO when the offset is in the hole, as no data follows.
+         */
+        if (offset >= datasize)
+            offset = -ENXIO;
+    }
+    else if (whence == SEEK_HOLE) {
+        if (offset < datasize)
+            offset = datasize;
+    }
+    else
+        BUG();
+
+out:
+    inode_unlock(inode);
+
+    if (offset < 0) {
+        return offset;
+    }
+
+    return vfs_setpos(file, offset, maxbytes);
+}
+
+static loff_t exfat_file_llseek(struct file *file, loff_t offset, int whence)
+{
+    if (whence == SEEK_DATA || whence == SEEK_HOLE) {
+        return exfat_vdl_llseek(file, offset, whence);
+    }
+
+    return generic_file_llseek(file, offset, whence);
+}
+
 const struct file_operations exfat_file_operations = {
-    .llseek        = generic_file_llseek,
+    .llseek        = exfat_file_llseek,
     .read_iter    = exfat_file_read_iter,
     .write_iter    = exfat_file_write_iter,
     .unlocked_ioctl = exfat_ioctl,