Commit graph

875 commits

Author SHA1 Message Date
Jaegeuk Kim
078cad8212 f2fs: drop inode from the donation list when the last file is closed
Let's drop the inode from the donation list when there is no other
open file.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-30 17:13:12 +00:00
Hongbo Li
94b3ce7f15 f2fs: switch to the new mount api
The new mount api will execute .parse_param, .init_fs_context, .get_tree
and will call .remount if remount happened. So we add the necessary
functions for the fs_context_operations. If .init_fs_context is added,
the old .mount should remove.

See Documentation/filesystems/mount_api.rst for more information.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
[sandeen: forward port]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
[hongbo: context modified]
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-22 15:58:15 +00:00
Hongbo Li
bb463a75ab f2fs: introduce fs_context_operation structure
The handle_mount_opt() helper is used to parse mount parameters,
and so we can rename this function to f2fs_parse_param() and set
it as .param_param in fs_context_operations.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
[sandeen: forward port]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-22 15:58:14 +00:00
Hongbo Li
d185351325 f2fs: separate the options parsing and options checking
The new mount api separates option parsing and super block setup
into two distinct steps and so we need to separate the options
parsing out of the parse_options().

In order to achieve this, here we handle the mount options with
three steps:
  - Firstly, we move sb/sbi out of handle_mount_opt.
    As the former patch introduced f2fs_fs_context, so we record
    the changed mount options in this context. In handle_mount_opt,
    sb/sbi is null, so we should move all relative code out of
    handle_mount_opt (thus, some check case which use sb/sbi should
    move out).
  - Secondly, we introduce the some check helpers to keep the option
    consistent.
    During filling superblock period, sb/sbi are ready. So we check
    the f2fs_fs_context which holds the mount options base on sb/sbi.
  - Thirdly, we apply the new mount options to sb/sbi.
    After checking the f2fs_fs_context, all changed on mount options
    are valid. So we can apply them to sb/sbi directly.

After do these, option parsing and super block setting have been
decoupled. Also it should have retained the original execution
flow.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
[sandeen: forward port, minor fixes and updates]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
[hongbo: minor fixes]
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-22 15:58:14 +00:00
Hongbo Li
1a9094b10c f2fs: Add f2fs_fs_context to record the mount options
At the parsing phase of mouont in the new mount api, options
value will be recorded with the context, and then it will be
used in fill_super and other helpers.

Note that, this is a temporary status, we want remove the sb
and sbi usages in handle_mount_opt. So here the f2fs_fs_context
only records the mount options, it will be copied in sb/sbi in
later process. (At this point in the series, mount options are
temporarily not set during mount.)

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
[sandeen: forward port, minor fixes and updates]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
[hongbo: minor cleanup]
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-22 15:58:14 +00:00
Hongbo Li
19c4b380f2 f2fs: Allow sbi to be NULL in f2fs_printk
At the parsing phase of the new mount api, sbi will not be
available. So here allows sbi to be NULL in f2fs log helpers
and use that in handle_mount_opt().

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
[sandeen: forward port]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-22 15:58:14 +00:00
Hongbo Li
02eb5fe42a f2fs: move the option parser into handle_mount_opt
In handle_mount_opt, we use fs_parameter to parse each option.
However we're still using the old API to get the options string.
Using fsparams parse_options allows us to remove many of the Opt_
enums, so remove them.

The checkpoint disable cap (or percent) involves rather complex
parsing; we retain the old match_table mechanism for this, which
handles it well.

There are some changes about parsing options:
  1. For `active_logs`, `inline_xattr_size` and `fault_injection`,
     we use s32 type according the internal structure to record the
     option's value.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
[sandeen: forward port, minor fixes and updates]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
[hongbo: minor cleanup]
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-22 15:58:14 +00:00
Hongbo Li
f2091cc188 f2fs: Add fs parameter specifications for mount options
Use an array of `fs_parameter_spec` called f2fs_param_specs to
hold the mount option specifications for the new mount api.

Add constant_table structures for several options to facilitate
parsing.

Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
[sandeen: forward port, minor fixes and updates, more fsparam_enum]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-22 15:58:13 +00:00
Jiazi Li
e9705c61b1 f2fs: use kfree() instead of kvfree() to free some memory
options in f2fs_fill_super is alloc by kstrdup:
	options = kstrdup((const char *)data, GFP_KERNEL)
sit_bitmap[_mir], nat_bitmap[_mir] are alloc by kmemdup:
	sit_i->sit_bitmap = kmemdup(src_bitmap, sit_bitmap_size, GFP_KERNEL);
	sit_i->sit_bitmap_mir = kmemdup(src_bitmap,
					sit_bitmap_size, GFP_KERNEL);
	nm_i->nat_bitmap = kmemdup(version_bitmap, nm_i->bitmap_size,
					GFP_KERNEL);
	nm_i->nat_bitmap_mir = kmemdup(version_bitmap, nm_i->bitmap_size,
					GFP_KERNEL);
write_io is alloc by f2fs_kmalloc:
	sbi->write_io[i] = f2fs_kmalloc(sbi,
			array_size(n, sizeof(struct f2fs_bio_info))

Use kfree is more efficient.

Signed-off-by: Jiazi Li <jqqlijiazi@gmail.com>
Signed-off-by: peixuan.qiu <peixuan.qiu@transsion.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-09 17:59:39 +00:00
Swarna Prabhu
1f13689026 f2fs: Fix the typos in comments
This patch fixes minor typos in comments in f2fs.

Signed-off-by: Swarna Prabhu <s.prabhu@samsung.com>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-06-24 21:34:37 +00:00
Chao Yu
59c1c89e9b f2fs: introduce reserved_pin_section sysfs entry
This patch introduces /sys/fs/f2fs/<dev>/reserved_pin_section for tuning
@needed parameter of has_not_enough_free_secs(), if we configure it w/
zero, it can avoid f2fs_gc() as much as possible while fallocating on
pinned file.

Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: wangzijie <wangzijie1@honor.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-06-23 22:13:02 +00:00
Sheng Yong
554d9b7242 f2fs: fix bio memleak when committing super block
When committing new super block, bio is allocated but not freed, and
kmemleak complains:

  unreferenced object 0xffff88801d185600 (size 192):
    comm "kworker/3:2", pid 128, jiffies 4298624992
    hex dump (first 32 bytes):
      00 00 00 00 00 00 00 00 80 67 c3 00 81 88 ff ff  .........g......
      01 08 06 00 00 00 00 00 00 00 00 00 01 00 00 00  ................
    backtrace (crc 650ecdb1):
      kmem_cache_alloc_noprof+0x3a9/0x460
      mempool_alloc_noprof+0x12f/0x310
      bio_alloc_bioset+0x1e2/0x7e0
      __f2fs_commit_super+0xe0/0x370
      f2fs_commit_super+0x4ed/0x8c0
      f2fs_record_error_work+0xc7/0x190
      process_one_work+0x7db/0x1970
      worker_thread+0x518/0xea0
      kthread+0x359/0x690
      ret_from_fork+0x34/0x70
      ret_from_fork_asm+0x1a/0x30

The issue can be reproduced by:

  mount /dev/vda /mnt
  i=0
  while :; do
      echo '[h]abc' > /sys/fs/f2fs/vda/extension_list
      echo '[h]!abc' > /sys/fs/f2fs/vda/extension_list
      echo scan > /sys/kernel/debug/kmemleak
      dmesg | grep "new suspected memory leaks"
      [ $? -eq 0 ] && break
      i=$((i + 1))
      echo "$i"
  done
  umount /mnt

Fixes: 5bcde45578 ("f2fs: get rid of buffer_head use")
Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-06-23 22:13:01 +00:00
Zhiguo Niu
a6c397a31f f2fs: use d_inode(dentry) cleanup dentry->d_inode
no logic changes.

Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-28 16:05:22 +00:00
Chao Yu
54ca9be0bc f2fs: introduce FAULT_VMALLOC
Introduce a new fault type FAULT_VMALLOC to simulate no memory error in
f2fs_vmalloc().

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-27 23:52:36 +00:00
Chao Yu
5827e3c720 f2fs: add f2fs_bug_on() in f2fs_quota_read()
mapping_read_folio_gfp() will return a folio, it should always be
uptodate, let's check folio uptodate status to detect any potenial
bug.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-27 23:52:35 +00:00
Eric Biggers
d005af3b67 f2fs: remove unused sbi argument from checksum functions
Since __f2fs_crc32() now calls crc32() directly, it no longer uses its
sbi argument.  Remove that, and simplify its callers accordingly.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-27 23:52:35 +00:00
Chao Yu
a920196062 f2fs: don't over-report free space or inodes in statvfs
This fixes an analogus bug that was fixed in modern filesystems:
a) xfs in commit 4b8d867ca6 ("xfs: don't over-report free space or
inodes in statvfs")
b) ext4 in commit f87d3af741 ("ext4: don't over-report free space
or inodes in statvfs")
where statfs can report misleading / incorrect information where
project quota is enabled, and the free space is less than the
remaining quota.

This commit will resolve a test failure in generic/762 which tests
for this bug.

generic/762       - output mismatch (see /share/git/fstests/results//generic/762.out.bad)
    --- tests/generic/762.out   2025-04-15 10:21:53.371067071 +0800
    +++ /share/git/fstests/results//generic/762.out.bad 2025-05-13 16:13:37.000000000 +0800
    @@ -6,8 +6,10 @@
     root blocks2 is in range
     dir blocks2 is in range
     root bavail2 is in range
    -dir bavail2 is in range
    +dir bavail2 has value of 1539066
    +dir bavail2 is NOT in range 304734.87 .. 310891.13
     root blocks3 is in range
    ...
    (Run 'diff -u /share/git/fstests/tests/generic/762.out /share/git/fstests/results//generic/762.out.bad'  to see the entire diff)

HINT: You _MAY_ be missing kernel fix:
      XXXXXXXXXXXXXX xfs: don't over-report free space or inodes in statvfs

Cc: stable@kernel.org
Fixes: ddc34e328d ("f2fs: introduce f2fs_statfs_project")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-13 15:32:41 +00:00
Kairui Song
0427e811c9 f2fs: drop usage of folio_index
folio_index is only needed for mixed usage of page cache and swap
cache, for pure page cache usage, the caller can just use
folio->index instead.

It can't be a swap cache folio here.  Swap mapping may only call into fs
through `swap_rw` but f2fs does not use that method for swap.

Signed-off-by: Kairui Song <kasong@tencent.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org> (maintainer:F2FS FILE SYSTEM)
Cc: Chao Yu <chao@kernel.org> (maintainer:F2FS FILE SYSTEM)
Cc: linux-f2fs-devel@lists.sourceforge.net (open list:F2FS FILE SYSTEM)
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-06 15:46:55 +00:00
Chao Yu
0244c77fed f2fs: support FAULT_TIMEOUT
Support to inject a timeout fault into function, currently it only
support to inject timeout to commit_atomic_write flow to reproduce
inconsistent bug, like the bug fixed by commit f098aeba04 ("f2fs:
fix to avoid atomicity corruption of atomic file").

By default, the new type fault will inject 1000ms timeout, and the
timeout process can be interrupted by SIGKILL.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-06 15:46:55 +00:00
Chao Yu
dc6d9ef57f f2fs: zone: fix to calculate first_zoned_segno correctly
A zoned device can has both conventional zones and sequential zones,
so we should not treat first segment of zoned device as first_zoned_segno,
instead, we need to check zone type for each zone during traversing zoned
device to find first_zoned_segno.

Otherwise, for below case, first_zoned_segno will be 0, which could be
wrong.

create_null_blk 512 2 1024 1024
mkfs.f2fs -m /dev/nullb0

Testcase:

export SCRIPTS_PATH=/share/git/scripts

test multiple devices w/ zoned device
for ((i=0;i<8;i++)) do {
	zonesize=$((2<<$i))
	conzone=$((4096/$zonesize))
	seqzone=$((4096/$zonesize))
	$SCRIPTS_PATH/nullblk_create.sh 512 $zonesize $conzone $seqzone
	mkfs.f2fs -f -m /dev/vdb -c /dev/nullb0
	mount /dev/vdb /mnt/f2fs
	touch /mnt/f2fs/file
	f2fs_io pinfile set /mnt/f2fs/file $((8589934592*2))
	stat /mnt/f2fs/file
	df
	cat /proc/fs/f2fs/vdb/segment_info
	umount /mnt/f2fs
	$SCRIPTS_PATH/nullblk_remove.sh 0
} done

test single zoned device
for ((i=0;i<8;i++)) do {
	zonesize=$((2<<$i))
	conzone=$((4096/$zonesize))
	seqzone=$((4096/$zonesize))
	$SCRIPTS_PATH/nullblk_create.sh 512 $zonesize $conzone $seqzone
	mkfs.f2fs -f -m /dev/nullb0
	mount /dev/nullb0 /mnt/f2fs
	touch /mnt/f2fs/file
	f2fs_io pinfile set /mnt/f2fs/file $((8589934592*2))
	stat /mnt/f2fs/file
	df
	cat /proc/fs/f2fs/nullb0/segment_info
	umount /mnt/f2fs
	$SCRIPTS_PATH/nullblk_remove.sh 0
} done

Fixes: 9703d69d9d ("f2fs: support file pinning for zoned devices")
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-28 15:26:48 +00:00
Chao Yu
5db0d252c6 f2fs: fix to do sanity check on sit_bitmap_size
w/ below testcase, resize will generate a corrupted image which
contains inconsistent metadata, so when mounting such image, it
will trigger kernel panic:

touch img
truncate -s $((512*1024*1024*1024)) img
mkfs.f2fs -f img $((256*1024*1024))
resize.f2fs -s -i img -t $((1024*1024*1024))
mount img /mnt/f2fs

------------[ cut here ]------------
kernel BUG at fs/f2fs/segment.h:863!
Oops: invalid opcode: 0000 [#1] SMP PTI
CPU: 11 UID: 0 PID: 3922 Comm: mount Not tainted 6.15.0-rc1+ #191 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:f2fs_ra_meta_pages+0x47c/0x490

Call Trace:
 f2fs_build_segment_manager+0x11c3/0x2600
 f2fs_fill_super+0xe97/0x2840
 mount_bdev+0xf4/0x140
 legacy_get_tree+0x2b/0x50
 vfs_get_tree+0x29/0xd0
 path_mount+0x487/0xaf0
 __x64_sys_mount+0x116/0x150
 do_syscall_64+0x82/0x190
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fdbfde1bcfe

The reaseon is:

sit_i->bitmap_size is 192, so size of sit bitmap is 192*8=1536, at maximum
there are 1536 sit blocks, however MAIN_SEGS is 261893, so that sit_blk_cnt
is 4762, build_sit_entries() -> current_sit_addr() tries to access
out-of-boundary in sit_bitmap at offset from [1536, 4762), once sit_bitmap
and sit_bitmap_mirror is not the same, it will trigger f2fs_bug_on().

Let's add sanity check in f2fs_sanity_check_ckpt() to avoid panic.

Cc: stable@vger.kernel.org
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-28 15:26:48 +00:00
Matthew Wilcox (Oracle)
0d1e687e43 f2fs: Use a folio in f2fs_quota_read()
Support arbitrary size folios and remove a few hidden calls to
compound_head().  Also remove an unnecessary test of the uptodaate flag;
if mapping_read_folio_gfp() cannot bring the folio uptodate, it will
return an error.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-28 15:21:35 +00:00
Chao Yu
db03c20c08 f2fs: fix to set atomic write status more clear
1. After we start atomic write in a database file, before committing
all data, we'd better not set inode w/ vfs dirty status to avoid
redundant updates, instead, we only set inode w/ atomic dirty status.

2. After we commit all data, before committing metadata, we need to
clear atomic dirty status, and set vfs dirty status to allow vfs flush
dirty inode.

Cc: Daeho Jeong <daehojeong@google.com>
Reported-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-10 03:59:58 +00:00
Chao Yu
2be96c2147 f2fs: fix to update injection attrs according to fault_option
When we update inject type via sysfs, it shows wrong rate value as
below, there is a same problem when we update inject rate, fix it.

Before:
F2FS-fs (vdd): build fault injection attr: rate: 0, type: 0xffff
F2FS-fs (vdd): build fault injection attr: rate: 1, type: 0x0

After:
F2FS-fs (vdd): build fault injection type: 0x1
F2FS-fs (vdd): build fault injection rate: 1

Meanwhile, let's avoid turning on all fault types when we enable fault
injection via fault_injection mount option, it will lead to shutdown
filesystem or fail the mount() easily.

mount -o fault_injection=4 /dev/vdd /mnt/f2fs
F2FS-fs (vdd): build fault injection attr: rate: 4, type: 0x7fffff
F2FS-fs (vdd): inject kmalloc in f2fs_kmalloc of f2fs_fill_super+0xbdf/0x27c0

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-10 03:59:58 +00:00
Chao Yu
e073e92789 f2fs: add a proc entry show inject stats
This patch adds a proc entry named inject_stats to show total injected
count for each fault type.

cat /proc/fs/f2fs/<dev>/inject_stats
fault_type              injected_count
kmalloc                 0
kvmalloc                0
page alloc              0
page get                0
alloc bio(obsolete)     0
alloc nid               0
orphan                  0
no more block           0
too big dir depth       0
evict_inode fail        0
truncate fail           0
read IO error           0
checkpoint error        0
discard error           0
write IO error          0
slab alloc              0
dquot initialize        0
lock_op                 0
invalid blkaddr         0
inconsistent blkaddr    0
no free segment         0
inconsistent footer     0

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-10 03:59:57 +00:00
Yeongjin Gil
f098aeba04 f2fs: fix to avoid atomicity corruption of atomic file
In the case of the following call stack for an atomic file,
FI_DIRTY_INODE is set, but FI_ATOMIC_DIRTIED is not subsequently set.

f2fs_file_write_iter
  f2fs_map_blocks
    f2fs_reserve_new_blocks
      inc_valid_block_count
        __mark_inode_dirty(dquot)
          f2fs_dirty_inode

If FI_ATOMIC_DIRTIED is not set, atomic file can encounter corruption
due to a mismatch between old file size and new data.

To resolve this issue, I changed to set FI_ATOMIC_DIRTIED when
FI_DIRTY_INODE is set. This ensures that FI_DIRTY_INODE, which was
previously cleared by the Writeback thread during the commit atomic, is
set and i_size is updated.

Cc: <stable@vger.kernel.org>
Fixes: fccaa81de8 ("f2fs: prevent atomic file from being dirtied before commit")
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Sunmin Jeong <s_min.jeong@samsung.com>
Signed-off-by: Yeongjin Gil <youngjin.gil@samsung.com>
Reviewed-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-17 17:38:33 +00:00
Eric Sandeen
71e9bd3d5c f2fs: pass sbi rather than sb to parse_options()
With the new mount API the sb will not be available during initial option
parsing, which will happen before fill_super reads sb from disk.

Now that the sb is no longer directly referenced in parse_options, switch
it to use sbi.

(Note that all calls to f2fs_sb_has_* originating from parse_options will
need to be deferred to later before we can use the new mount API.)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-13 18:16:07 +00:00
Eric Sandeen
b7de231b9d f2fs: pass sbi rather than sb to quota qf_name helpers
With the new mount api we will not have the superblock available during
option parsing. Prepare for this by passing *sbi rather than *sb.

For now, we are parsing after fill_super has been done, so sbi->sb will
exist. Under the new mount API this will require more care, but do the
simple change for now.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-13 18:16:06 +00:00
Eric Sandeen
9cca498759 f2fs: defer readonly check vs norecovery
Defer the readonly-vs-norecovery check until after option parsing is done
so that option parsing does not require an active superblock for the test.
Add a helpful message, while we're at it.

(I think could be moved back into parsing after we switch to the new mount
API if desired, as the fs context will have RO state available.)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-13 18:16:06 +00:00
Eric Sandeen
0edcb2197e f2fs: Pass sbi rather than sb to f2fs_set_test_dummy_encryption
This removes another sb instance from parse_options()

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-13 18:16:06 +00:00
Eric Sandeen
9100adf326 f2fs: make LAZYTIME a mount option flag
Set LAZYTIME into sbi during parsing, and transfer it to the sb in
fill_super, so that an sb is not required during option parsing.

(Note: While lazytime is normally handled via mount flag in the vfs,
some f2fs users do expect to be able to use it as an explicit mount
option string via the mount syscall, so this option must remain.)

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-13 18:16:06 +00:00
Eric Sandeen
7d6ee50330 f2fs: make INLINECRYPT a mount option flag
Set INLINECRYPT into sbi during parsing, and transfer it to the sb in
fill_super, so that an sb is not required during option parsing.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-13 18:16:06 +00:00
Eric Sandeen
abd0e040e9 f2fs: factor out an f2fs_default_check function
The current options parsing function both parses options and validates
them - factor the validation out to reduce the size of the function and
make transition to the new mount API possible, because under the new mount
API, options are parsed one at a time, and cannot all be tested at the end
of the parsing function.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-13 18:16:06 +00:00
Eric Sandeen
277352b6cb f2fs: consolidate unsupported option handling errors
When certain build-time options are disabled, some mount options are not
accepted. For quota and compression, all related options are dismissed
with a single error message. For xattr, acl, and fault injection, each
option is handled individually. In addition, inline_xattr_size was missed
when CONFIG_F2FS_FS_XATTR was disabled.

Collapse xattr, acl, and fault injection errors into a single string, for
simplicity, and handle the missing inline_xattr_size case.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-13 18:16:06 +00:00
Eric Sandeen
64ee7503cb f2fs: use f2fs_sb_has_device_alias during option parsing
Rather than using F2FS_HAS_FEATURE directly, use f2fs_sb_has_device_alias
macro during option parsing for consistency.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-13 18:16:06 +00:00
Daeho Jeong
d7b549def0 f2fs: add carve_out sysfs node
For several zoned storage devices, vendors will provide extra space
which was used for device level GC than specs and F2FS can use this
space for filesystem level GC. To do that, we can reserve the space
using reserved_blocks. However, it is not enough, since this extra
space should not be shown to users. So, with this new sysfs node,
we can hide the space by substracting reserved_blocks from total
bytes.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-13 18:15:59 +00:00
Chao Yu
1788971e0b f2fs: introduce FAULT_INCONSISTENT_FOOTER
To simulate inconsistent node footer error.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-11 03:25:53 +00:00
Chao Yu
c2ecba0265 f2fs: control nat_bits feature via mount option
Introduce a new mount option "nat_bits" to control nat_bits feature,
by default nat_bits feature is disabled.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-08 16:04:10 +00:00
Chao Yu
d8f5b91d77 f2fs: fix to call f2fs_recover_quota_end() correctly
f2fs_recover_quota_begin() and f2fs_recover_quota_end() should be called
in pair, there is some cases we may skip calling f2fs_recover_quota_end(),
fix it.

Fixes: e1bb7d3d9c ("f2fs: fix to recover quota data correctly")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-04 00:47:13 +00:00
Jaegeuk Kim
201e07aec6 f2fs: fix the missing write pointer correction
If checkpoint was disabled, we missed to fix the write pointers.

Cc: <stable@vger.kernel.org>
Fixes: 1015035609 ("f2fs: fix changing cursegs if recovery fails on zoned device")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-02-28 16:23:01 +00:00
Jaegeuk Kim
ef0c333cad f2fs: keep POSIX_FADV_NOREUSE ranges
This patch records POSIX_FADV_NOREUSE ranges for users to reclaim the caches
instantly off from LRU.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-02-13 17:58:31 +00:00
Chao Yu
4f91f07470 f2fs: add dump_stack() in f2fs_handle_critical_error()
To show call stack, so that we can see who causes critical error, note
that it won't call dump_stack() for shutdown path.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-02-12 02:28:22 +00:00
Chao Yu
eb85c2410d f2fs: quota: fix to avoid warning in dquot_writeback_dquots()
F2FS-fs (dm-59): checkpoint=enable has some unwritten data.

------------[ cut here ]------------
WARNING: CPU: 6 PID: 8013 at fs/quota/dquot.c:691 dquot_writeback_dquots+0x2fc/0x308
pc : dquot_writeback_dquots+0x2fc/0x308
lr : f2fs_quota_sync+0xcc/0x1c4
Call trace:
dquot_writeback_dquots+0x2fc/0x308
f2fs_quota_sync+0xcc/0x1c4
f2fs_write_checkpoint+0x3d4/0x9b0
f2fs_issue_checkpoint+0x1bc/0x2c0
f2fs_sync_fs+0x54/0x150
f2fs_do_sync_file+0x2f8/0x814
__f2fs_ioctl+0x1960/0x3244
f2fs_ioctl+0x54/0xe0
__arm64_sys_ioctl+0xa8/0xe4
invoke_syscall+0x58/0x114

checkpoint and f2fs_remount may race as below, resulting triggering warning
in dquot_writeback_dquots().

atomic write                                    remount
                                                - do_remount
                                                 - down_write(&sb->s_umount);
                                                  - f2fs_remount
- ioctl
 - f2fs_do_sync_file
  - f2fs_sync_fs
   - f2fs_write_checkpoint
    - block_operations
     - locked = down_read_trylock(&sbi->sb->s_umount)
       : fail to lock due to the write lock was held by remount
                                                 - up_write(&sb->s_umount);
     - f2fs_quota_sync
      - dquot_writeback_dquots
       - WARN_ON_ONCE(!rwsem_is_locked(&sb->s_umount))
       : trigger warning because s_umount lock was unlocked by remount

If checkpoint comes from mount/umount/remount/freeze/quotactl, caller of
checkpoint has already held s_umount lock, calling dquot_writeback_dquots()
in the context should be safe.

So let's record task to sbi->umount_lock_holder, so that checkpoint can
know whether the lock has held in the context or not by checking current
w/ it.

In addition, in order to not misrepresent caller of checkpoint, we should
not allow to trigger async checkpoint for those callers: mount/umount/remount/
freeze/quotactl.

Fixes: af033b2aa8 ("f2fs: guarantee journalled quota data by checkpoint")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-02-10 16:58:42 +00:00
Eric Biggers
3ca4bec40e f2fs: switch to using the crc32 library
Now that the crc32() library function takes advantage of
architecture-specific optimizations, it is unnecessary to go through the
crypto API.  Just use crc32().  This is much simpler, and it improves
performance due to eliminating the crypto API overhead.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/r/20241202010844.144356-19-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-12-01 17:23:02 -08:00
Chao Yu
bc8aeb04fd f2fs: fix to drop all discards after creating snapshot on lvm device
Piergiorgio reported a bug in bugzilla as below:

------------[ cut here ]------------
WARNING: CPU: 2 PID: 969 at fs/f2fs/segment.c:1330
RIP: 0010:__submit_discard_cmd+0x27d/0x400 [f2fs]
Call Trace:
 __issue_discard_cmd+0x1ca/0x350 [f2fs]
 issue_discard_thread+0x191/0x480 [f2fs]
 kthread+0xcf/0x100
 ret_from_fork+0x31/0x50
 ret_from_fork_asm+0x1a/0x30

w/ below testcase, it can reproduce this bug quickly:
- pvcreate /dev/vdb
- vgcreate myvg1 /dev/vdb
- lvcreate -L 1024m -n mylv1 myvg1
- mount /dev/myvg1/mylv1 /mnt/f2fs
- dd if=/dev/zero of=/mnt/f2fs/file bs=1M count=20
- sync
- rm /mnt/f2fs/file
- sync
- lvcreate -L 1024m -s -n mylv1-snapshot /dev/myvg1/mylv1
- umount /mnt/f2fs

The root cause is: it will update discard_max_bytes of mounted lvm
device to zero after creating snapshot on this lvm device, then,
__submit_discard_cmd() will pass parameter @nr_sects w/ zero value
to __blkdev_issue_discard(), it returns a NULL bio pointer, result
in panic.

This patch changes as below for fixing:
1. Let's drop all remained discards in f2fs_unfreeze() if snapshot
of lvm device is created.
2. Checking discard_max_bytes before submitting discard during
__submit_discard_cmd().

Cc: stable@vger.kernel.org
Fixes: 35ec7d5748 ("f2fs: split discard command in prior to block layer")
Reported-by: Piergiorgio Sartor <piergiorgio.sartor@nexgo.de>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219484
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-11-23 15:48:15 +00:00
Sheng Yong
f88c7904b5 f2fs: clear SBI_POR_DOING before initing inmem curseg
SBI_POR_DOING can be cleared after recovery is completed, so that
changes made before recovery can be persistent, and subsequent
errors can be recorded into cp/sb.

Signed-off-by: Song Feng <songfeng@oppo.com>
Signed-off-by: Yongpeng Yang <yangyongpeng1@oppo.com>
Signed-off-by: Sheng Yong <shengyong@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-11-21 16:14:07 +00:00
Sheng Yong
1015035609 f2fs: fix changing cursegs if recovery fails on zoned device
Fsync data recovery attempts to check and fix write pointer consistency
of cursegs and all other zones. If the write pointers of cursegs are
unaligned, cursegs are changed to new sections.

If recovery fails, zone write pointers are still checked and fixed,
but the latest checkpoint cannot be written back. Additionally, retry-
mount skips recovery and rolls back to reuse the old cursegs whose
zones are already finished. This can lead to unaligned write later.

This patch addresses the issue by leaving writer pointers untouched if
recovery fails. When retry-mount is performed, cursegs and other zones
are checked and fixed after skipping recovery.

Signed-off-by: Song Feng <songfeng@oppo.com>
Signed-off-by: Yongpeng Yang <yangyongpeng1@oppo.com>
Signed-off-by: Sheng Yong <shengyong@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-11-21 16:13:53 +00:00
Daeho Jeong
a35749b1ed f2fs: adjust unusable cap before checkpoint=disable mode
The unusable cap value must be adjusted before checking whether
checkpoint=disable is feasible.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-11-21 16:12:30 +00:00
Chao Yu
3273d8ad94 f2fs: fix to do cast in F2FS_{BLK_TO_BYTES, BTYES_TO_BLK} to avoid overflow
It missed to cast variable to unsigned long long type before
bit shift, which will cause overflow, fix it.

Fixes: f7ef9b83b5 ("f2fs: introduce macros to convert bytes and blocks in f2fs")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-11-21 16:12:29 +00:00
Daniel Yang
789ca0eb47 f2fs: replace deprecated strcpy with strscpy
strcpy is deprecated. Kernel docs recommend replacing strcpy with
strscpy. The function strcpy() return value isn't used so there
shouldn't be an issue replacing with the safer alternative strscpy.

Signed-off-by: Daniel Yang <danielyangkang@gmail.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-11-21 16:12:29 +00:00