linux/fs/btrfs
Filipe Manana 723df2bcc9 btrfs: join running log transaction when logging new name
When logging a new name, in case of a rename, we pin the log before
changing it. We then either delete a directory entry from the log or
insert a key range item to mark the old name for deletion on log replay.

However when doing one of those log changes we may have another task that
started writing out the log (at btrfs_sync_log()) and it started before
we pinned the log root. So we may end up changing a log tree while its
writeback is being started by another task syncing the log. This can lead
to inconsistencies in a log tree and other unexpected results during log
replay, because we can get some committed node pointing to a node/leaf
that ends up not getting written to disk before the next log commit.

The problem, conceptually, started to happen in commit 88d2beec7e
("btrfs: avoid logging all directory changes during renames"), because
there we started to update the log without joining its current transaction
first.

However the problem only became visible with commit 259c4b96d7
("btrfs: stop doing unnecessary log updates during a rename"), and that is
because we used to pin the log at btrfs_rename() and then before entering
btrfs_log_new_name(), when unlinking the old dentry, we ended up at
btrfs_del_inode_ref_in_log() and btrfs_del_dir_entries_in_log(). Both
of them join the current log transaction, effectively waiting for any log
transaction writeout (due to acquiring the root's log_mutex). This made it
safe even after leaving the current log transaction, because we remained
with the log pinned when we called btrfs_log_new_name().

Then in commit 259c4b96d7 ("btrfs: stop doing unnecessary log updates
during a rename"), we removed the log pinning from btrfs_rename() and
stopped calling btrfs_del_inode_ref_in_log() and
btrfs_del_dir_entries_in_log() during the rename, and started to do all
the needed work at btrfs_log_new_name(), but without joining the current
log transaction, only pinning the log, which is racy because another task
may have started writeout of the log tree right before we pinned the log.

Both commits landed in kernel 5.18, so it doesn't make any practical
difference which should be blamed, but I'm blaming the second commit only
because with the first one, by chance, the problem did not happen due to
the fact we joined the log transaction after pinning the log and unpinned
it only after calling btrfs_log_new_name().

So make btrfs_log_new_name() join the current log transaction instead of
pinning it, so that we never do log updates if it's writeout is starting.

Fixes: 259c4b96d7 ("btrfs: stop doing unnecessary log updates during a rename")
CC: stable@vger.kernel.org # 5.18+
Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Tested-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-07-25 17:45:42 +02:00
..
tests btrfs: add optimized btrfs_ino() version for 64 bits systems 2022-07-25 17:45:41 +02:00
acl.c btrfs: reserve correct number of items for inode creation 2022-05-16 17:03:08 +02:00
async-thread.c btrfs: simplify WQ_HIGHPRI handling in struct btrfs_workqueue 2022-05-16 17:03:15 +02:00
async-thread.h btrfs: remove unused typedefs get_extent_t and btrfs_work_func_t 2022-07-25 17:45:36 +02:00
backref.c btrfs: sink iterator parameter to btrfs_ioctl_logical_to_ino 2022-07-25 17:45:36 +02:00
backref.h btrfs: sink iterator parameter to btrfs_ioctl_logical_to_ino 2022-07-25 17:45:36 +02:00
block-group.c btrfs: zoned: activate necessary block group 2022-07-25 17:45:42 +02:00
block-group.h btrfs: zoned: prevent allocation from previous data relocation BG 2022-06-21 14:43:48 +02:00
block-rsv.c btrfs: use enum for btrfs_block_rsv::type 2022-07-25 17:45:40 +02:00
block-rsv.h btrfs: use enum for btrfs_block_rsv::type 2022-07-25 17:45:40 +02:00
btrfs_inode.h btrfs: add optimized btrfs_ino() version for 64 bits systems 2022-07-25 17:45:41 +02:00
check-integrity.c btrfs: check-integrity: simplify bio allocation in btrfsic_read_block 2022-05-16 17:03:12 +02:00
check-integrity.h btrfs: check-integrity: split submit_bio from btrfsic checking 2022-05-16 17:03:12 +02:00
compression.c btrfs: do not return errors from btrfs_map_bio 2022-07-25 17:45:39 +02:00
compression.h btrfs: don't use btrfs_bio_wq_end_io for compressed writes 2022-07-25 17:45:33 +02:00
ctree.c btrfs: sink parameter is_data to btrfs_set_disk_extent_flags 2022-05-16 17:17:31 +02:00
ctree.h btrfs: zoned: wait until zone is finished when allocation didn't progress 2022-07-25 17:45:42 +02:00
delalloc-space.c btrfs: convert count_max_extents() to use fs_info->max_extent_size 2022-07-25 17:45:41 +02:00
delalloc-space.h
delayed-inode.c btrfs: batch up release of reserved metadata for delayed items used for deletion 2022-07-25 17:45:37 +02:00
delayed-inode.h btrfs: reduce amount of reserved metadata for delayed item insertion 2022-07-25 17:44:36 +02:00
delayed-ref.c btrfs: switch btrfs_block_rsv::full to bool 2022-07-25 17:45:40 +02:00
delayed-ref.h btrfs: remove btrfs_delayed_extent_op::is_data 2022-05-16 17:17:31 +02:00
dev-replace.c btrfs: clean up chained assignments 2022-07-25 17:45:39 +02:00
dev-replace.h
dir-item.c btrfs: use btrfs_for_each_slot in btrfs_search_dir_index_item 2022-05-16 17:03:07 +02:00
discard.c
discard.h
disk-io.c btrfs: zoned: wait until zone is finished when allocation didn't progress 2022-07-25 17:45:42 +02:00
disk-io.h btrfs: handle allocation failure in btrfs_wq_submit_bio gracefully 2022-07-25 17:45:40 +02:00
export.c
export.h
extent-io-tree.h btrfs: Convert from invalidatepage to invalidate_folio 2022-03-15 08:23:29 -04:00
extent-tree.c btrfs: zoned: write out partially allocated region 2022-07-25 17:45:42 +02:00
extent_io.c btrfs: replace BTRFS_MAX_EXTENT_SIZE with fs_info->max_extent_size 2022-07-25 17:45:41 +02:00
extent_io.h btrfs: remove extent writepage address space operation 2022-07-25 17:45:37 +02:00
extent_map.c btrfs: assert we have a write lock when removing and replacing extent maps 2022-03-14 13:13:50 +01:00
extent_map.h btrfs: defrag: don't use merged extent map for their generation check 2022-02-23 17:43:13 +01:00
file-item.c btrfs: handle csum lookup errors properly on reads 2022-03-14 13:13:51 +01:00
file.c btrfs: don't fallback to buffered IO for NOWAIT direct IO writes 2022-07-25 17:45:40 +02:00
free-space-cache.c btrfs: clean up chained assignments 2022-07-25 17:45:39 +02:00
free-space-cache.h
free-space-tree.c btrfs: use rbtree with leftmost node cached for tracking lowest block group 2022-05-16 17:03:13 +02:00
free-space-tree.h
inode-item.c
inode-item.h
inode.c btrfs: simplify error handling in btrfs_lookup_dentry 2022-07-25 17:45:42 +02:00
ioctl.c btrfs: use fs_info->max_extent_size in get_extent_max_capacity() 2022-07-25 17:45:41 +02:00
Kconfig btrfs: use generic Kconfig option for 256kB page size limit 2022-01-20 08:52:55 +02:00
locking.c btrfs: don't set lock_owner when locking extent buffer for reading 2022-06-21 14:46:56 +02:00
locking.h
lzo.c btrfs: replace kmap() with kmap_local_page() in lzo.c 2022-07-25 17:45:33 +02:00
Makefile Kbuild: add -Wno-shift-negative-value where -Wextra is used 2022-03-13 17:30:31 +09:00
misc.h
ordered-data.c btrfs: remove the finish_func argument to btrfs_mark_ordered_io_finished 2022-07-25 17:45:37 +02:00
ordered-data.h btrfs: remove the finish_func argument to btrfs_mark_ordered_io_finished 2022-07-25 17:45:37 +02:00
orphan.c
print-tree.c btrfs: unify the error handling pattern for read_tree_block() 2022-03-14 13:13:53 +01:00
print-tree.h
props.c btrfs: move common inode creation code into btrfs_create_new_inode() 2022-05-16 17:03:08 +02:00
props.h btrfs: move common inode creation code into btrfs_create_new_inode() 2022-05-16 17:03:08 +02:00
qgroup.c btrfs: avoid blocking on space revervation when doing nowait dio writes 2022-05-16 17:03:10 +02:00
qgroup.h btrfs: avoid blocking on space revervation when doing nowait dio writes 2022-05-16 17:03:10 +02:00
raid56.c btrfs: raid56: transfer the bio counter reference to the raid submission helpers 2022-07-25 17:45:40 +02:00
raid56.h btrfs: do not return errors from raid56_parity_recover 2022-07-25 17:45:39 +02:00
rcu-string.h
ref-verify.c
ref-verify.h
reflink.c btrfs: clean up chained assignments 2022-07-25 17:45:39 +02:00
reflink.h
relocation.c Page cache changes for 5.19 2022-05-24 19:55:07 -07:00
root-tree.c btrfs: avoid blocking on space revervation when doing nowait dio writes 2022-05-16 17:03:10 +02:00
scrub.c btrfs: do not return errors from raid56_parity_recover 2022-07-25 17:45:39 +02:00
send.c btrfs: send: always use the rbtree based inode ref management infrastructure 2022-07-25 17:45:42 +02:00
send.h btrfs: send: add new command FILEATTR for file attributes 2022-07-25 17:45:38 +02:00
space-info.c btrfs: zoned: activate metadata block group on flush_space 2022-07-25 17:45:42 +02:00
space-info.h btrfs: zoned: introduce space_info->active_total_bytes 2022-07-25 17:45:42 +02:00
struct-funcs.c btrfs: remove redundant check in up check_setget_bounds 2022-07-25 17:45:33 +02:00
subpage.c btrfs: remove extent writepage address space operation 2022-07-25 17:45:37 +02:00
subpage.h btrfs: make nodesize >= PAGE_SIZE case to reuse the non-subpage routine 2022-05-16 17:03:11 +02:00
super.c btrfs: use mask for all RAID1* profiles in btrfs_calc_avail_data_space 2022-07-25 17:45:38 +02:00
sysfs.c btrfs: sysfs: remove BIG_METADATA feature files 2022-07-25 17:45:39 +02:00
sysfs.h
transaction.c btrfs: clean up chained assignments 2022-07-25 17:45:39 +02:00
transaction.h btrfs: pass btrfs_fs_info for deleting snapshots and cleaner 2022-03-14 13:13:52 +01:00
tree-checker.c btrfs: tree-checker: check extent buffer owner against owner rootid 2022-05-16 17:03:09 +02:00
tree-checker.h btrfs: tree-checker: check extent buffer owner against owner rootid 2022-05-16 17:03:09 +02:00
tree-defrag.c
tree-log.c btrfs: join running log transaction when logging new name 2022-07-25 17:45:42 +02:00
tree-log.h btrfs: tree-log: make the return value for log syncing consistent 2022-07-25 17:45:34 +02:00
tree-mod-log.c
tree-mod-log.h
ulist.c
ulist.h
uuid-tree.c
verity.c
volumes.c btrfs: raid56: transfer the bio counter reference to the raid submission helpers 2022-07-25 17:45:40 +02:00
volumes.h btrfs: do not return errors from btrfs_map_bio 2022-07-25 17:45:39 +02:00
xattr.c btrfs: use btrfs_for_each_slot in btrfs_listxattr 2022-05-16 17:03:08 +02:00
xattr.h
zlib.c btrfs: zlib: replace kmap() with kmap_local_page() in zlib_decompress_bio() 2022-07-25 17:45:41 +02:00
zoned.c btrfs: zoned: wait until zone is finished when allocation didn't progress 2022-07-25 17:45:42 +02:00
zoned.h btrfs: zoned: activate metadata block group on flush_space 2022-07-25 17:45:42 +02:00
zstd.c btrfs: zstd: replace kmap() with kmap_local_page() 2022-07-25 17:45:40 +02:00