linux/fs
Naohiro Aota 82187d2ecd btrfs: zoned: fix chunk allocation condition for zoned allocator
The ZNS specification defines a limit on the number of "active"
zones. That limit forces us to limit the number of block groups which
can be used for an allocation at the same time. To stay within the
limit, commit a85f05e59b ("btrfs: zoned: avoid chunk allocation if
active block group has enough space") reuses the existing active block
groups as much as possible when we can't activate any other zone
without sacrificing an already activated block group.

However, the check is wrong in two ways. First, it checks the
condition for every raid index (ffe_ctl->index). Even if it reaches
the condition and "ffe_ctl->max_extent_size >=
ffe_ctl->min_alloc_size" is met, there can be other block groups
having enough space to hold ffe_ctl->num_bytes. (Actually, this won't
happen in the current zoned code as it only supports the SINGLE
profile. But it can happen once other RAID types are enabled.)

Second, it checks the active zone availability depending on the
raid index. The raid index is just an index for
space_info->block_groups, so it has nothing to do with chunk allocation.

These mistakes cause a faulty allocation in a certain situation.
Consider running zoned btrfs on a device whose max_active_zone == 0
(no limit), and suppose no block group has room to fit
ffe_ctl->num_bytes but some have room to meet ffe_ctl->min_alloc_size
(i.e. num_bytes > max_extent_size >= min_alloc_size).

In this situation, the following occurs:

- With SINGLE raid_index, it reaches the chunk allocation checking
  code
- The check returns true because we can activate a new zone (no limit)
- But, before allocating the chunk, it iterates to the next raid index
  (RAID5)
- Since there are no RAID5 block groups on zoned mode, it again
  reaches the check code
- The check returns false because of btrfs_can_activate_zone()'s "if
  (raid_index != BTRFS_RAID_SINGLE)" part
- That results in returning -ENOSPC without allocating a new chunk

As a result, we end up hitting -ENOSPC too early.
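
The sequence above can be reproduced with a small user-space model.
This is only a sketch of the flow as described in this message, not
the kernel code: struct ffe_model, can_activate_zone_by_index() and
buggy_search() are hypothetical names, only two raid indexes are
modelled, and the active zone accounting is reduced to two counters.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced set of raid indexes (the kernel has more). */
enum raid_index { RAID_SINGLE, RAID5, NR_RAID_TYPES };

/* Hypothetical stand-in for the fields of ffe_ctl used by the check. */
struct ffe_model {
	uint64_t num_bytes;
	uint64_t min_alloc_size;
	uint64_t max_extent_size;
};

/*
 * Model of the old activation check, keyed by raid index.
 * max_active_zones == 0 means "no limit".
 */
static bool can_activate_zone_by_index(unsigned int max_active_zones,
				       unsigned int active_zones,
				       enum raid_index index)
{
	if (index != RAID_SINGLE)
		return false;
	if (max_active_zones == 0)
		return true;
	return active_zones < max_active_zones;
}

/*
 * Model of the faulty flow: no block group fits num_bytes, so every
 * raid index falls through to the check. Returns 0 when a new chunk
 * would be allocated, -ENOSPC otherwise.
 */
static int buggy_search(const struct ffe_model *ffe,
			unsigned int max_active_zones,
			unsigned int active_zones)
{
	for (int index = RAID_SINGLE; index < NR_RAID_TYPES; index++) {
		if (ffe->max_extent_size >= ffe->min_alloc_size &&
		    !can_activate_zone_by_index(max_active_zones,
						active_zones, index))
			return -ENOSPC;
		/* Check passed, but the loop visits the next index first. */
	}
	return 0;
}
```

With max_active_zones == 0 and min_alloc_size <= max_extent_size <
num_bytes, buggy_search() passes the check for RAID_SINGLE and then
fails it for RAID5, so it returns -ENOSPC although a zone could have
been activated.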

Move the check to the right place in the can_allocate_chunk() hook,
and do the active zone check depending on the allocation flag, not on
the raid index.
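
A minimal sketch of what such a check could look like when it is done
once per allocation and keyed by the allocation flags. The names and
flag bits below (BG_PROFILE_*, can_activate_zone_by_flags(),
can_allocate_chunk_zoned()) are hypothetical, and the treatment of
non-SINGLE profiles and of the active zone counters is a simplifying
assumption, not the in-tree implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical profile bits; no bit set stands for SINGLE. */
#define BG_PROFILE_RAID1	(1ULL << 0)
#define BG_PROFILE_RAID5	(1ULL << 1)
#define BG_PROFILE_MASK		(BG_PROFILE_RAID1 | BG_PROFILE_RAID5)

/*
 * Activation check keyed by the allocation flags instead of the raid
 * index. max_active_zones == 0 means "no limit".
 */
static bool can_activate_zone_by_flags(unsigned int max_active_zones,
				       unsigned int active_zones,
				       uint64_t flags)
{
	/* Assumption: only the SINGLE profile is handled. */
	if (flags & BG_PROFILE_MASK)
		return false;
	if (max_active_zones == 0)
		return true;
	return active_zones < max_active_zones;
}

/*
 * Evaluated once, right before chunk allocation: refuse a new chunk
 * only when an active block group can still serve min_alloc_size and
 * no further zone can be activated.
 */
static bool can_allocate_chunk_zoned(uint64_t max_extent_size,
				     uint64_t min_alloc_size,
				     unsigned int max_active_zones,
				     unsigned int active_zones,
				     uint64_t flags)
{
	if (max_extent_size >= min_alloc_size &&
	    !can_activate_zone_by_flags(max_active_zones, active_zones,
					flags))
		return false;
	return true;
}
```

Because the raid index no longer takes part in the decision, a device
without an active zone limit always gets a new chunk for a SINGLE
allocation in this model, while a device that has used up its active
zones is told to retry with a smaller size.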

CC: stable@vger.kernel.org # 5.16
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-01-07 14:18:26 +01:00
..
9p netfs, 9p, afs, ceph: Use folios 2021-11-10 21:16:56 +00:00
adfs
affs affs: use bdev_nr_sectors instead of open coding it 2021-10-18 14:43:22 -06:00
afs afs: Fix mmap 2021-12-16 09:10:13 -08:00
autofs autofs: fix wait name hash calculation in autofs_wait() 2021-10-20 21:09:02 -04:00
befs
bfs
btrfs btrfs: zoned: fix chunk allocation condition for zoned allocator 2022-01-07 14:18:26 +01:00
cachefiles for-5.16/ki_complete-2021-10-29 2021-11-01 10:17:11 -07:00
ceph ceph: fix up non-directory creation in SGID directories 2021-12-01 17:08:27 +01:00
cifs cifs: sanitize multiple delimiters in prepath 2021-12-17 19:16:49 -06:00
coda coda: bump module version to 7.2 2021-11-09 10:02:51 -08:00
configfs
cramfs cramfs: use bdev_nr_bytes instead of open coding it 2021-10-18 14:43:22 -06:00
crypto fscrypt: improve a few comments 2021-10-25 19:11:50 -07:00
debugfs
devpts
dlm
ecryptfs
efivarfs
efs
erofs erofs: fix deadlock when shrink erofs slab 2021-11-23 14:58:16 +08:00
exfat exfat: fix incorrect loading of i_blocks for large files 2021-11-01 07:49:21 +09:00
exportfs
ext2 ext2: fix sleeping in atomic bugs on error 2021-09-22 13:05:23 +02:00
ext4 Only bug fixes and cleanups for ext4 this merge window. Of note are 2021-11-10 17:05:37 -08:00
f2fs Update to zstd-1.4.10 2021-11-13 15:32:30 -08:00
fat for-5.16/inode-sync-2021-10-29 2021-11-01 10:25:27 -07:00
freevxfs
fscache fscache: Remove an unused static variable 2021-10-04 22:13:12 +01:00
fuse fuse: release pipe buf after last use 2021-11-25 14:05:18 +01:00
gfs2 gfs2: gfs2_create_inode rework 2021-12-02 12:41:10 +01:00
hfs Merge branch 'akpm' (patches from Andrew) 2021-11-09 10:11:53 -08:00
hfsplus Merge branch 'akpm' (patches from Andrew) 2021-11-09 10:11:53 -08:00
hostfs
hpfs treewide: Replace open-coded flex arrays in unions 2021-10-18 12:28:53 -07:00
hugetlbfs mm,hugetlb: remove mlock ulimit for SHM_HUGETLB 2021-11-09 10:02:48 -08:00
iomap iomap: iomap_read_inline_data cleanup 2021-11-24 10:15:47 -08:00
isofs isofs: Fix out of bound access for corrupted isofs image 2021-10-19 12:51:02 +02:00
jbd2
jffs2
jfs Just one JFS patch 2021-11-03 09:23:25 -07:00
kernfs Merge 5.15-rc6 into driver-core-next 2021-10-18 09:43:37 +02:00
ksmbd ksmbd: disable SMB2_GLOBAL_CAP_ENCRYPTION for SMB 3.1.1 2021-12-17 19:19:45 -06:00
lockd A slow cycle for nfsd: mainly cleanup, including Neil's patch dropping 2021-11-10 16:45:54 -08:00
minix
netfs netfs: fix parameter of cleanup() 2021-12-07 15:47:09 +00:00
nfs NFS client bugfixes for Linux 5.16 2021-11-27 10:33:55 -08:00
nfs_common nfs: Fix kerneldoc warning shown up by W=1 2021-10-04 22:02:17 +01:00
nfsd NFSD: Fix READDIR buffer overflow 2021-12-18 17:11:06 -05:00
nilfs2 Merge branch 'akpm' (patches from Andrew) 2021-11-09 10:11:53 -08:00
nls
notify fanotify: Allow users to request FAN_FS_ERROR events 2021-10-27 12:53:45 +02:00
ntfs fs: ntfs: Limit NTFS_RW to page sizes smaller than 64k 2021-11-27 14:34:41 -08:00
ntfs3 gfs2: Fix mmap + page fault deadlocks 2021-11-02 12:25:03 -07:00
ocfs2 Merge branch 'exit-cleanups-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-11-10 16:15:54 -08:00
omfs
openpromfs
orangefs orangefs: three fixes from other folks... 2021-11-09 10:34:06 -08:00
overlayfs overlayfs update for 5.16 2021-11-09 10:51:12 -08:00
proc proc/vmcore: fix clearing user buffer by properly using clear_user() 2021-11-20 10:35:55 -08:00
pstore pstore/blk: Use "%lu" to format unsigned long 2021-11-21 09:44:19 -08:00
qnx4
qnx6
quota \n 2021-11-06 16:40:48 -07:00
ramfs Merge branch 'akpm' (patches from Andrew) 2021-11-09 10:11:53 -08:00
reiserfs \n 2021-11-06 16:40:48 -07:00
romfs
smbfs_common cifs: Fix crash on unload of cifs_arc4.ko 2021-12-07 22:38:03 -06:00
squashfs lib: zstd: Add kernel-specific API 2021-11-08 16:55:21 -08:00
sysfs fs/sysfs/dir.c: replace S_IRWXU|S_IRUGO|S_IXUGO with 0755 sysfs_create_dir_ns() 2021-10-05 16:35:05 +02:00
sysv sysv: use BUILD_BUG_ON instead of runtime check 2021-11-09 10:02:52 -08:00
tracefs tracefs: Set all files to the same group ownership as the mount option 2021-12-08 08:06:40 -05:00
ubifs
udf udf: Fix crash after seekdir 2021-11-09 12:53:58 +01:00
ufs
unicode
vboxsf vboxfs: fix broken legacy mount signature checking 2021-09-27 11:26:21 -07:00
verity fs-verity: fix signed integer overflow with i_size near S64_MAX 2021-09-22 10:56:34 -07:00
xfs xfs: remove all COW fork extents when remounting readonly 2021-12-07 10:17:29 -08:00
zonefs zonefs: add MODULE_ALIAS_FS 2021-12-17 16:56:35 +09:00
aio.c aio: Fix incorrect usage of eventfd_signal_allowed() 2021-12-09 10:52:55 -08:00
anon_inodes.c
attr.c fs: handle circular mappings correctly 2021-11-17 09:26:09 +01:00
bad_inode.c
binfmt_aout.c
binfmt_elf.c Merge branch 'akpm' (patches from Andrew) 2021-11-09 10:11:53 -08:00
binfmt_elf_fdpic.c coredump: Limit coredumps to a single thread group 2021-10-08 12:06:02 -05:00
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
buffer.c fs: simplify init_page_buffers 2021-10-18 14:43:22 -06:00
char_dev.c
compat_binfmt_elf.c
coredump.c coredump: Limit coredumps to a single thread group 2021-10-08 12:06:02 -05:00
d_path.c d_path: fix Kernel doc validator complaining 2021-11-06 13:30:32 -07:00
dax.c
dcache.c
direct-io.c fs: get rid of the res2 iocb->ki_complete argument 2021-10-25 10:36:24 -06:00
drop_caches.c
eventfd.c
eventpoll.c
exec.c Merge branch 'exit-cleanups-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-11-10 16:15:54 -08:00
fcntl.c
fhandle.c
file.c fget: clarify and improve __fget_files() implementation 2021-12-13 10:55:30 -08:00
file_table.c
filesystems.c
fs-writeback.c Various hardening fixes and cleanups for 5.16-rc1 2021-11-01 17:29:10 -07:00
fs_context.c
fs_parser.c
fs_pin.c
fs_struct.c
fs_types.c
fsopen.c
init.c
inode.c fs: Remove FS_THP_SUPPORT 2021-11-17 10:36:35 -05:00
internal.h Merge branch 'akpm' (patches from Andrew) 2021-11-09 10:11:53 -08:00
io-wq.c io-wq: drop wqe lock before creating new worker 2021-12-13 09:04:01 -07:00
io-wq.h io_uring: optimise INIT_WQ_LIST 2021-10-19 05:49:54 -06:00
io_uring.c io_uring: zero iocb->ki_pos for stream file types 2021-12-22 20:34:32 -07:00
ioctl.c
Kconfig
Kconfig.binfmt
kernel_read_file.c vfs: check fd has read access in kernel_read_file_from_fd() 2021-10-18 20:22:03 -10:00
libfs.c libfs: Support RENAME_EXCHANGE in simple_rename() 2021-11-03 15:43:08 +01:00
locks.c locks: remove changelog comments 2021-10-19 14:11:39 -04:00
Makefile
mbcache.c
mount.h
mpage.c
namei.c File locking changes for v5.16 2021-11-01 09:06:53 -07:00
namespace.c fs/mount_setattr: always cleanup mount_kattr 2021-12-30 15:12:13 -08:00
no-block.c
nsfs.c
open.c Merge branch 'akpm' (patches from Andrew) 2021-11-06 14:08:17 -07:00
pipe.c
pnode.c
pnode.h
posix_acl.c fs/posix_acl.c: avoid -Wempty-body warning 2021-11-06 13:30:32 -07:00
proc_namespace.c
read_write.c fs: remove leftover comments from mandatory locking removal 2021-10-26 12:20:50 -04:00
readdir.c
remap_range.c
select.c
seq_file.c seq_file: move seq_escape() to a header 2021-11-09 10:02:52 -08:00
signalfd.c signalfd: use wake_up_pollfree() 2021-12-09 10:49:56 -08:00
splice.c
stack.c
stat.c
statfs.c
super.c fs: explicitly unregister per-superblock BDIs 2021-11-06 13:30:34 -07:00
sync.c block: simplify the block device syncing code 2021-10-22 08:36:55 -06:00
timerfd.c
userfaultfd.c userfaultfd: fix a race between writeprotect and exit_mmap() 2021-10-18 20:22:02 -10:00
utimes.c
xattr.c