Eliminate snaplk / bufwait LOR when creating UFS snapshots

Each vnode has an embedded lock that controls access to its contents.
However, vnodes describing a UFS snapshot all share a single snapshot
lock to coordinate their access and update. As part of creating a
new UFS snapshot, the snapshot file's individual vnode lock has to be
replaced with the filesystem's snapshot lock.

For regular vnodes, the lock order with respect to buffer locks is
vnode lock first, then buffer lock. The order for the snapshot lock is
reversed: a buffer lock must be acquired before the snapshot lock.
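
The two orders can be modeled as a small table of permitted
acquisitions. This is only an illustrative sketch: the lock classes and
the acquire_allowed() helper are hypothetical names, not kernel
identifiers, and the table encodes nothing beyond the two rules stated
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock classes for illustration; not kernel identifiers. */
enum lock_class { LK_VNODE, LK_BUF, LK_SNAP, LK_NCLASSES };

/*
 * order_ok[held][next] is true when a lock of class `next` may be
 * acquired while a lock of class `held` is already held.
 */
static const bool order_ok[LK_NCLASSES][LK_NCLASSES] = {
	[LK_VNODE] = { [LK_BUF] = true },	/* vnode lock, then buffer lock */
	[LK_BUF]   = { [LK_SNAP] = true },	/* buffer lock, then snapshot lock */
};

/* Return true if acquiring `next` while holding `held` follows the rules. */
bool
acquire_allowed(enum lock_class held, enum lock_class next)
{
	assert(held < LK_NCLASSES && next < LK_NCLASSES);
	return (order_ok[held][next]);
}
```

In this model, taking a buffer lock while holding the snapshot lock is
rejected, which corresponds to the reversal named in the subject line.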

When creating a new snapshot, the snapshot file must retain its vnode
lock until it has allocated all the blocks that it needs, and only then
switch to the snapshot lock. This update moves one final piece of
the initial snapshot block allocation so that it is done before the
newly created snapshot is switched to use the snapshot lock.
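
The required sequencing can be sketched as a toy model. Everything here
is an assumption made for illustration: struct snap_model,
model_initial_balloc() and model_create() are hypothetical, NDADDR is a
stand-in for UFS_NDADDR with an assumed value of 12, and the rule that
initial allocation fails after the switch simply restates the paragraph
above rather than reproducing kernel behavior.

```c
#include <stdbool.h>

#define	NDADDR	12	/* stand-in for UFS_NDADDR; value assumed here */

/* Hypothetical model of a snapshot file during creation. */
struct snap_model {
	bool	uses_snaplk;	/* false while still on its own vnode lock */
	int	direct[NDADDR];	/* nonzero once the direct block exists */
};

/*
 * Initial allocation takes a buffer lock while the file's own lock is
 * held, so the model permits it only before the switch to the
 * snapshot lock.
 */
int
model_initial_balloc(struct snap_model *sm, int blockno)
{
	if (sm->uses_snaplk)
		return (-1);
	sm->direct[blockno] = 1;
	return (0);
}

/* Preallocate every direct block first, then switch locks. */
int
model_create(struct snap_model *sm)
{
	int blockno;

	for (blockno = 0; blockno < NDADDR; blockno++) {
		if (sm->direct[blockno] != 0)
			continue;
		if (model_initial_balloc(sm, blockno) != 0)
			return (-1);
	}
	sm->uses_snaplk = true;
	return (0);
}
```

Calling model_initial_balloc() after model_create() has returned fails
in this model, which is the situation the reordering avoids.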

Reported by:  Witness code
MFC after:    1 week
Sponsored by: Netflix
Kirk McKusick 2021-09-18 16:51:07 -07:00
parent ad6dc36520
commit d7770a5495
1 changed file with 21 additions and 21 deletions

@@ -650,6 +650,27 @@ loop:
 			    BLK_NOCOPY, 0);
 		vput(xvp);
 	}
+	/*
+	 * Preallocate all the direct blocks in the snapshot inode so
+	 * that we never have to write the inode itself to commit an
+	 * update to the contents of the snapshot. Note that once
+	 * created, the size of the snapshot will never change, so
+	 * there will never be a need to write the inode except to
+	 * update the non-integrity-critical time fields and
+	 * allocated-block count.
+	 */
+	for (blockno = 0; blockno < UFS_NDADDR; blockno++) {
+		if (DIP(ip, i_db[blockno]) != 0)
+			continue;
+		error = UFS_BALLOC(vp, lblktosize(fs, blockno),
+		    fs->fs_bsize, KERNCRED, BA_CLRBUF, &bp);
+		if (error)
+			goto resumefs;
+		error = readblock(vp, bp, blockno);
+		bawrite(bp);
+		if (error != 0)
+			goto resumefs;
+	}
 	/*
 	 * Acquire a lock on the snapdata structure, creating it if necessary.
 	 */
@@ -691,27 +712,6 @@ loop:
 		sn->sn_listsize = blkp - snapblklist;
 		VI_UNLOCK(devvp);
 	}
-	/*
-	 * Preallocate all the direct blocks in the snapshot inode so
-	 * that we never have to write the inode itself to commit an
-	 * update to the contents of the snapshot. Note that once
-	 * created, the size of the snapshot will never change, so
-	 * there will never be a need to write the inode except to
-	 * update the non-integrity-critical time fields and
-	 * allocated-block count.
-	 */
-	for (blockno = 0; blockno < UFS_NDADDR; blockno++) {
-		if (DIP(ip, i_db[blockno]) != 0)
-			continue;
-		error = UFS_BALLOC(vp, lblktosize(fs, blockno),
-		    fs->fs_bsize, KERNCRED, BA_CLRBUF, &bp);
-		if (error)
-			goto resumefs;
-		error = readblock(vp, bp, blockno);
-		bawrite(bp);
-		if (error != 0)
-			goto resumefs;
-	}
 	/*
 	 * Record snapshot inode. Since this is the newest snapshot,
 	 * it must be placed at the end of the list.