Re: [PATCH v2] hfs: prevent MDB and bitmap buffer_head aliasing
From: Viacheslav Dubeyko
Date: Fri Jun 05 2026 - 14:57:12 EST
On Fri, 2026-06-05 at 12:00 +0800, Sam Sun wrote:
> On Fri, Jun 5, 2026 at 7:52 AM Viacheslav Dubeyko <slava@xxxxxxxxxxx>
> wrote:
> >
> > On Thu, 2026-06-04 at 14:25 -0700, Viacheslav Dubeyko wrote:
> > > On Fri, 2026-06-05 at 02:02 +0800, Yue Sun wrote:
> > > > On Thu, Jun 4, 2026 at 12:25 AM <slava@xxxxxxxxxxx> wrote:
> > > > >
> > > > > Could you please take a deeper look into my diff that I've
> > > > > shared
> > > > > with
> > > > > you before? I don't see the point to review your suggestion
> > > > > if
> > > > > you
> > > > > completely ignored my diff. From my point of view you
> > > > > suggested
> > > > > completely the same approach but in more complicated manner.
> > > >
> > > > You are right. I am sorry for the confusion.
> > > >
> > > > I should have looked more carefully at your diff and based my
> > > > reply
> > > > on
> > > > it. Adding an HFS-level MDB mutex and avoiding holding mdb_bh
> > > > across
> > > > the
> > > > whole hfs_mdb_commit() path is the right direction. My previous
> > > > reply
> > > > mixed that with extra cleanups and a separate corrupted-layout
> > > > check,
> > > > which made it look like I was proposing a different approach.
> > > > That
> > > > was
> > > > my mistake.
> > > >
> > > > I am not very familiar with HFS internals, so please treat the
> > > > following
> > > > only as a small suggestion. After re-reading your diff, the
> > > > only
> > > > detail
> > > > I am still unsure about is the HFS_FLG_ALT_MDB_DIRTY path. In
> > > > that
> > > > path,
> > > > hfs_inode_write_fork() gets MDB fields as output buffers:
> > > >
> > > > hfs_inode_write_fork(..., mdb->drXTExtRec,
> > > > &mdb->drXTFlSize, NULL);
> > > > hfs_inode_write_fork(..., mdb->drCTExtRec,
> > > > &mdb->drCTFlSize, NULL);
> > >
> > >
> > > I've started to think that, maybe, the location of these
> > > hfs_inode_write_fork() calls is not correct. We are trying to
> > > save
> > > the
> > > current size of Catalog File and Extents Overflow File. OK, we
> > > need
> > > to
> > > update this if size is changed. But we can save the sizes during
> > > every
> > > call of hfs_mdb_commit(). It's logically incorrect to call these
> > > methods in the alternative MDB section because it is the content
> > > of
> > > primary MDB. I suggest to move these two hfs_inode_write_fork()
> > > to
> > > primary MDB writing section:
> > >
> > > lock_buffer(HFS_SB(sb)->mdb_bh);
> > > /* These parameters may have been modified, so
> > > write
> > > them back */
> > > mdb->drLsMod = hfs_mtime();
> > > mdb->drFreeBks = cpu_to_be16(HFS_SB(sb)-
> > > > free_ablocks);
> > > mdb->drNxtCNID =
> > > cpu_to_be32((u32)atomic64_read(&HFS_SB(sb)-
> > > > next_id));
> > > mdb->drNmFls = cpu_to_be16(HFS_SB(sb)->root_files);
> > > mdb->drNmRtDirs = cpu_to_be16(HFS_SB(sb)-
> > > > root_dirs);
> > > mdb->drFilCnt =
> > > cpu_to_be32((u32)atomic64_read(&HFS_SB(sb)-
> > > > file_count));
> > > mdb->drDirCnt =
> > > cpu_to_be32((u32)atomic64_read(&HFS_SB(sb)-
> > > > folder_count));
> > >
> > > hfs_inode_write_fork(HFS_SB(sb)->ext_tree->inode,
> > > mdb-
> > > > drXTExtRec,
> > > &mdb->drXTFlSize, NULL);
> > > hfs_inode_write_fork(HFS_SB(sb)->cat_tree->inode,
> > > mdb-
> > > > drCTExtRec,
> > > &mdb->drCTFlSize, NULL);
> > >
> > > /* write MDB to disk */
> > > mark_buffer_dirty(HFS_SB(sb)->mdb_bh);
> > > unlock_buffer(HFS_SB(sb)->mdb_bh);
> > >
> > > In this case, we don;'t need to overlap the locks. What do you
> > > think?
> > > Sounds like reasonable solution?
> > >
> > >
> >
> > Frankly speaking, I don't like this approach of assigning the
> > pointer
> > on bh->b_data to HFS_SB(sb)->mdb:
> >
> > data = (void *)(__bh->b_data + __offset);
> >
> > We manipulate by various fields of HFS_SB(sb)->mdb throughout of
> > the
> > hfs_mdb_commit(). I think that we need to allocate buffer for
> > struct
> > hfs_mdb *mdb during hfs_mdb_get() (and destroy in hfs_mdb_put())
> > and
> > synchronize its content with struct buffer_head *mdb_bh. Otherwise,
> > we
> > never can resolve the problem of proper locking in
> > hfs_mdb_commit().
> > What do you think?
> >
> > Thanks,
> > Slava.
> >
>
> Yes, I agree. Keeping HFS_SB(sb)->mdb as a separate in-memory copy
> sounds
> cleaner than pointing it directly into mdb_bh->b_data.
>
> For the immediate deadlock fix, I think moving hfs_inode_write_fork()
> into the
> primary MDB update section as you suggested sounds like a good
> minimal step. The
> separate in-memory MDB copy sounds like a larger cleanup that could
> follow if
> you think that is the right direction.
>
I don't expect that in-memory MDB copy will require significant
cleanup. We already have mdb and alt_mdb pointer that are used in the
HFS code. We simply need to change the initialization phase, freeing
phase and hfs_mdb_commit() logic. It doesn't sound like a big cleanup.
But it will be great to get rid of using the buffer heads for primary
and alternative MDB records. This is the more complicated change. But
it should be not the part of this patch.
Thanks,
Slava.