linux/fs/btrfs
Tejun Heo f7bddf1e27 btrfs: Avoid getting stuck during cyclic writebacks
During a cyclic writeback, extent_write_cache_pages() uses done_index
to update the writeback_index after the current run is over.  However,
instead of current index + 1, it gets set to the current index itself.

Unfortunately, this, combined with returning on EOF instead of looping
back, can lead to the following pathological behavior.

1. There is a single file which has accumulated enough dirty pages to
   trigger balance_dirty_pages() and a writer is appending to the file
   with a series of short writes.

2. balance_dirty_pages() kicks in, wakes up background writeback and
   sleeps.

3. Writeback kicks in and the cursor is on the last page of the dirty
   file.  Writeback is started or skipped if already in progress.  As
   it's EOF, extent_write_cache_pages() returns and the cursor is set
   to done_index which is pointing to the last page.

4. Writeback is done.  Nothing happens till balance_dirty_pages
   finishes, at which point we go back to #1.

This can almost completely stall writeback of the file and keep the
system over the dirty threshold for a long time, which can degrade the
whole system.  We encountered this issue in production with a package
handling application which can reliably reproduce the issue when
running under tight memory limits.

Reading the comment in the error handling section, the current
behavior seems intended to avoid accidentally skipping a page in case
the write attempt on the page doesn't succeed.  However, this concern
seems bogus.

On each page, the code either:

* Skips and moves onto the next page.

* Fails to issue and sets done_index to index + 1.

* Successfully issues and continues to the next page if budget allows
  and not EOF.

IOW, as long as it's not EOF and there's budget, the code never
retries writing back the same page.  Only when a page happens to be
the last page of a particular run do we end up retrying the page, which
can't possibly guarantee anything data integrity related.  Besides,
cyclic writes are only used for non-syncing writebacks, meaning that
there's no data integrity implication to begin with.

Fix it by always setting done_index past the current page being
processed.

Note that this problem exists in other writepages too.

CC: stable@vger.kernel.org # 4.19+
Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-18 12:46:54 +01:00
tests Btrfs: fix selftests failure due to uninitialized i_mode in test inodes 2019-09-24 14:45:02 +02:00
acl.c
async-thread.c btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
async-thread.h btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
backref.c
backref.h
block-group.c btrfs: block-group: Rework documentation of check_system_chunk function 2019-11-18 12:46:54 +01:00
block-group.h btrfs: move struct io_ctl to free-space-cache.h 2019-09-09 14:59:15 +02:00
block-rsv.c btrfs: use btrfs_try_granting_tickets in update_global_rsv 2019-09-09 14:59:19 +02:00
block-rsv.h
btrfs_inode.h
check-integrity.c
check-integrity.h
compression.c Btrfs: use REQ_CGROUP_PUNT for worker thread submitted bios 2019-11-18 12:46:53 +01:00
compression.h Btrfs: use REQ_CGROUP_PUNT for worker thread submitted bios 2019-11-18 12:46:53 +01:00
ctree.c btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
ctree.h Btrfs: delete the entire async bio submission framework 2019-11-18 12:46:53 +01:00
delalloc-space.c Btrfs: fix qgroup double free after failure to reserve metadata for delalloc 2019-10-17 20:13:44 +02:00
delalloc-space.h
delayed-inode.c btrfs: use refcount_inc_not_zero in kill_all_nodes 2019-11-18 12:46:51 +01:00
delayed-inode.h
delayed-ref.c btrfs: rename btrfs_space_info_add_old_bytes 2019-09-09 14:59:18 +02:00
delayed-ref.h
dev-replace.c btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
dev-replace.h btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
dir-item.c
disk-io.c btrfs: Enhance error output for write time tree checker 2019-11-18 12:46:54 +01:00
disk-io.h btrfs: add __cold attribute to more functions 2019-11-18 12:46:52 +01:00
export.c btrfs: drop unused parameter is_new from btrfs_iget 2019-11-18 12:46:52 +01:00
export.h
extent-io-tree.h btrfs: move the failrec tree stuff into extent-io-tree.h 2019-11-18 12:46:47 +01:00
extent-tree.c btrfs: refactor the ticket wakeup code 2019-09-09 14:59:18 +02:00
extent_io.c btrfs: Avoid getting stuck during cyclic writebacks 2019-11-18 12:46:54 +01:00
extent_io.h btrfs: move the failrec tree stuff into extent-io-tree.h 2019-11-18 12:46:47 +01:00
extent_map.c
extent_map.h
file-item.c
file.c btrfs: drop unused parameter is_new from btrfs_iget 2019-11-18 12:46:52 +01:00
free-space-cache.c btrfs: drop unused parameter is_new from btrfs_iget 2019-11-18 12:46:52 +01:00
free-space-cache.h btrfs: move struct io_ctl to free-space-cache.h 2019-09-09 14:59:15 +02:00
free-space-tree.c
free-space-tree.h
inode-item.c btrfs: Make btrfs_find_name_in_ext_backref return struct btrfs_inode_extref 2019-09-09 14:59:16 +02:00
inode-map.c btrfs: qgroup: Always free PREALLOC META reserve in btrfs_delalloc_release_extents() 2019-10-15 18:50:07 +02:00
inode-map.h
inode.c Btrfs: use REQ_CGROUP_PUNT for worker thread submitted bios 2019-11-18 12:46:53 +01:00
ioctl.c btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
Kconfig
locking.c btrfs: move btrfs_unlock_up_safe to other locking functions 2019-11-18 12:46:49 +01:00
locking.h btrfs: move btrfs_unlock_up_safe to other locking functions 2019-11-18 12:46:49 +01:00
lzo.c btrfs: compression: replace set_level callbacks by a common helper 2019-09-09 14:59:11 +02:00
Makefile
misc.h btrfs: add 64bit safe helper for power of two checks 2019-11-18 12:46:50 +01:00
ordered-data.c btrfs: get rid of unique workqueue helper functions 2019-11-18 12:46:48 +01:00
ordered-data.h
orphan.c
print-tree.c
print-tree.h
props.c btrfs: drop unused parameter is_new from btrfs_iget 2019-11-18 12:46:52 +01:00
props.h
qgroup.c btrfs: get rid of unique workqueue helper functions 2019-11-18 12:46:48 +01:00
qgroup.h
raid56.c btrfs: get rid of unique workqueue helper functions 2019-11-18 12:46:48 +01:00
raid56.h
rcu-string.h
reada.c btrfs: get rid of unique workqueue helper functions 2019-11-18 12:46:48 +01:00
ref-verify.c btrfs: fix uninitialized ret in ref-verify 2019-10-03 15:00:56 +02:00
ref-verify.h
relocation.c btrfs: drop unused parameter is_new from btrfs_iget 2019-11-18 12:46:52 +01:00
root-tree.c btrfs: rename the btrfs_calc_*_metadata_size helpers 2019-09-09 14:59:13 +02:00
scrub.c btrfs: get rid of unique workqueue helper functions 2019-11-18 12:46:48 +01:00
send.c btrfs: drop unused parameter is_new from btrfs_iget 2019-11-18 12:46:52 +01:00
send.h
space-info.c btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
space-info.h btrfs: add __pure attribute to functions 2019-11-18 12:46:52 +01:00
struct-funcs.c btrfs: tie extent buffer and it's token together 2019-09-09 14:59:16 +02:00
super.c Btrfs: delete the entire async bio submission framework 2019-11-18 12:46:53 +01:00
sysfs.c btrfs: sysfs: move helper macros to sysfs.c 2019-09-09 14:59:08 +02:00
sysfs.h btrfs: sysfs: move helper macros to sysfs.c 2019-09-09 14:59:08 +02:00
transaction.c btrfs: transaction: Cleanup unused TRANS_STATE_BLOCKED 2019-11-18 12:46:50 +01:00
transaction.h btrfs: transaction: Cleanup unused TRANS_STATE_BLOCKED 2019-11-18 12:46:50 +01:00
tree-checker.c btrfs: tree-checker: Refactor prev_key check for ino into a function 2019-11-18 12:46:53 +01:00
tree-checker.h
tree-defrag.c
tree-log.c btrfs: drop unused parameter is_new from btrfs_iget 2019-11-18 12:46:52 +01:00
tree-log.h
ulist.c
ulist.h
uuid-tree.c
volumes.c Btrfs: delete the entire async bio submission framework 2019-11-18 12:46:53 +01:00
volumes.h Btrfs: delete the entire async bio submission framework 2019-11-18 12:46:53 +01:00
xattr.c
xattr.h
zlib.c btrfs: compression: replace set_level callbacks by a common helper 2019-09-09 14:59:11 +02:00
zstd.c btrfs: move cond_wake_up functions out of ctree 2019-09-09 14:59:15 +02:00