Commit graph

107 commits

Author SHA1 Message Date
Daeho Jeong
c8705cefce f2fs: add gc_boost_gc_greedy sysfs node
Add this to control GC algorithm for boost GC.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-29 15:02:36 +00:00
Daeho Jeong
1d4c5dbba1 f2fs: add gc_boost_gc_multiple sysfs node
Add a sysfs knob to set a multiplier for the background GC migration
window when F2FS Garbage Collection is boosted.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-29 15:02:33 +00:00
Chao Yu
59c1c89e9b f2fs: introduce reserved_pin_section sysfs entry
This patch introduces /sys/fs/f2fs/<dev>/reserved_pin_section for tuning
@needed parameter of has_not_enough_free_secs(), if we configure it w/
zero, it can avoid f2fs_gc() as much as possible while fallocating on
pinned file.

Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: wangzijie <wangzijie1@honor.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-06-23 22:13:02 +00:00
Chao Yu
54ca9be0bc f2fs: introduce FAULT_VMALLOC
Introduce a new fault type FAULT_VMALLOC to simulate no memory error in
f2fs_vmalloc().

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-27 23:52:36 +00:00
Chao Yu
13be879576 f2fs: fix 32-bits hexademical number in fault injection doc
FAULT_KMALLOC                    0x000000001

There is one redundant '0' in 32-bits hexademical number of fault type,
remove it.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-13 15:37:00 +00:00
Chao Yu
0244c77fed f2fs: support FAULT_TIMEOUT
Support to inject a timeout fault into function, currently it only
support to inject timeout to commit_atomic_write flow to reproduce
inconsistent bug, like the bug fixed by commit f098aeba04 ("f2fs:
fix to avoid atomicity corruption of atomic file").

By default, the new type fault will inject 1000ms timeout, and the
timeout process can be interrupted by SIGKILL.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-06 15:46:55 +00:00
Chao Yu
617e0491ab f2fs: sysfs: export linear_lookup in features directory
cat /sys/fs/f2fs/features/linear_lookup
supported

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-06 15:46:54 +00:00
Chao Yu
3fea0641b0 f2fs: sysfs: add encoding_flags entry
This patch adds a new sysfs entry /sys/fs/f2fs/<disk>/encoding_flags,
it is a read-only entry to show the value of sb.s_encoding_flags, the
value is hexadecimal.

============================     ==========
Flag_Name                        Flag_Value
============================     ==========
SB_ENC_STRICT_MODE_FL            0x00000001
SB_ENC_NO_COMPAT_FALLBACK_FL     0x00000002
============================     ==========

case#1
mkfs.f2fs -f -O casefold -C utf8:strict /dev/vda
mount /dev/vda /mnt/f2fs
cat /sys/fs/f2fs/vda/encoding_flags
1

case#2
mkfs.f2fs -f -O casefold -C utf8 /dev/vda
fsck.f2fs --nolinear-lookup=1 /dev/vda
mount /dev/vda /mnt/f2fs
cat /sys/fs/f2fs/vda/encoding_flags
2

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-06 15:46:52 +00:00
Linus Torvalds
81d8e5e213 f2fs-for-6.15-rc1
In this round, there are three major updates: 1) folio conversion, 2) refactor
 for mount API conversion, 3) some performance improvement such as direct IO,
 checkpoint speed, and IO priority hints. For stability, there are patches
 which add more sanity checks and fixes some major issues like i_size in
 atomic write operations and write pointer recovery in zoned devices.
 
 Enhancement:
  - huge folio converion work by Matthew Wilcox
  - clean up for mount API conversion by Eric Sandeen
  - improve direct IO speed in the overwrite case
  - add some sanity check on node consistency
  - set highest IO priority for checkpoint thread
  - keep POSIX_FADV_NOREUSE ranges and add sysfs entry to reclaim pages
  - add ioctl to get IO priority hint
  - add carve_out sysfs node for fsstat
 
 Bug fix:
  - disable nat_bits during umount to avoid potential nat entry corruption
  - fix missing i_size update on atomic writes
  - fix missing discard for active segments
  - fix running out of free segments
  - fix out-of-bounds access in f2fs_truncate_inode_blocks()
  - call f2fs_recover_quota_end() correctly
  - fix potential deadloop in prepare_compress_overwrite()
  - fix the missing write pointer correction for zoned device
  - fix to avoid panic once fallocation fails for pinfile
  - don't retry IO for corrupted data scenario
 
 There are many other clean up patches and minor bug fixes as usual.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmfhjuYACgkQQBSofoJI
 UNJF+hAAip0Sf6bXwt43KR3lmbXg/YBTtPDACOs375Pn8pfRVsAPIAf5e1AnlBET
 rA0KpqUrEZHXenQFF9n4LIoHYqfuUw3EMempuvp4qgQcx15Kaajw+EGWVLYstNy2
 9dELc9DA55f/i1uvHezGfQFy6hMfXasf+tkaYk0z0ZYDStYboLgkVAY+F869q1Dl
 D16Y7Lna12h4eCxDdssIPDsLjFH/2LDn7SmhXsnpZxwK0Zx8JSo83WxXHnzXHxz4
 vUkukInKgqcBDvf6ufW/YF3/tqSs20XEXNK3cI1vyHx1dwij6Us+G0n0WJHL30QE
 zsliecR0X28JP1rJ3ldCjr+4Kxm6/u/Uwpinm2Fm1jL67UY4TZcaARBE8k20I6ND
 j/L1+sXrIdZ1aILM9/bwCjgXiVFdZbvlfGfpTj1duAkQgRd+/s9cjPlo5j5HTad5
 XJmzJz6YOaUNarhP/E31Z9SV9M2kEcmoDOTxKBg6ZcMWXZ27B6Z0ag92prd0GWk6
 rWDuVj0eP/LDY1QedbHRPbg1D84jVgcnldfPaf9ptln993skJGdgS0dkegTqMR0L
 H8RgpWOzWZI53gnQdePdej8diHmD8uRTrLf/oABm//GzTHC5BdwVpArOl1DCuLFF
 YkMtVEicgnWRmL3PgODAdTJXaDi3uGvT116i+lfMlUn9p+t93r0=
 =E+TE
 -----END PGP SIGNATURE-----

Merge tag 'f2fs-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, there are three major updates: (1) folio conversion,
  (2) refactoring for mount API conversion, (3) some performance
  improvement such as direct IO, checkpoint speed, and IO priority
  hints.

  For stability, there are patches which add more sanity checks and
  fixes some major issues like i_size in atomic write operations and
  write pointer recovery in zoned devices.

  Enhancements:
   - huge folio converion work by Matthew Wilcox
   - clean up for mount API conversion by Eric Sandeen
   - improve direct IO speed in the overwrite case
   - add some sanity check on node consistency
   - set highest IO priority for checkpoint thread
   - keep POSIX_FADV_NOREUSE ranges and add sysfs entry to reclaim pages
   - add ioctl to get IO priority hint
   - add carve_out sysfs node for fsstat

  Bug fixes:
   - disable nat_bits during umount to avoid potential nat entry corruption
   - fix missing i_size update on atomic writes
   - fix missing discard for active segments
   - fix running out of free segments
   - fix out-of-bounds access in f2fs_truncate_inode_blocks()
   - call f2fs_recover_quota_end() correctly
   - fix potential deadloop in prepare_compress_overwrite()
   - fix the missing write pointer correction for zoned device
   - fix to avoid panic once fallocation fails for pinfile
   - don't retry IO for corrupted data scenario

  There are many other clean up patches and minor bug fixes as usual"

* tag 'f2fs-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (68 commits)
  f2fs: fix missing discard for active segments
  f2fs: optimize f2fs DIO overwrites
  f2fs: fix to avoid atomicity corruption of atomic file
  f2fs: pass sbi rather than sb to parse_options()
  f2fs: pass sbi rather than sb to quota qf_name helpers
  f2fs: defer readonly check vs norecovery
  f2fs: Pass sbi rather than sb to f2fs_set_test_dummy_encryption
  f2fs: make LAZYTIME a mount option flag
  f2fs: make INLINECRYPT a mount option flag
  f2fs: factor out an f2fs_default_check function
  f2fs: consolidate unsupported option handling errors
  f2fs: use f2fs_sb_has_device_alias during option parsing
  f2fs: add carve_out sysfs node
  f2fs: fix to avoid running out of free segments
  f2fs: Remove f2fs_write_node_page()
  f2fs: Remove f2fs_write_meta_page()
  f2fs: Remove f2fs_write_data_page()
  f2fs: Remove check for ->writepage
  Revert "f2fs: rebuild nat_bits during umount"
  f2fs: fix to avoid accessing uninitialized curseg
  ...
2025-03-27 12:55:54 -07:00
Daeho Jeong
d7b549def0 f2fs: add carve_out sysfs node
For several zoned storage devices, vendors will provide extra space
which was used for device level GC than specs and F2FS can use this
space for filesystem level GC. To do that, we can reserve the space
using reserved_blocks. However, it is not enough, since this extra
space should not be shown to users. So, with this new sysfs node,
we can hide the space by substracting reserved_blocks from total
bytes.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-13 18:15:59 +00:00
Chao Yu
1788971e0b f2fs: introduce FAULT_INCONSISTENT_FOOTER
To simulate inconsistent node footer error.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-11 03:25:53 +00:00
Jaegeuk Kim
a907f3a68e f2fs: add a sysfs entry to reclaim POSIX_FADV_NOREUSE pages
1. fadvise(fd1, POSIX_FADV_NOREUSE, {0,3});
2. fadvise(fd2, POSIX_FADV_NOREUSE, {1,2});
3. fadvise(fd3, POSIX_FADV_NOREUSE, {3,1});
4. echo 1024 > /sys/fs/f2fs/tuning/reclaim_caches_kb

This gives a way to reclaim file-backed pages by iterating all f2fs mounts until
reclaiming 1MB page cache ranges, registered by #1, #2, and #3.

5. cat /sys/fs/f2fs/tuning/reclaim_caches_kb
-> gives total number of registered file ranges.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-02-13 17:58:36 +00:00
Mauro Carvalho Chehab
90800df0da ABI: sysfs-fs-f2fs: fix date tags
Some date tags are missing colons. Add them to comply with
ABI description and produce right results when converted to html/pdf.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/336ab631c0636e419282a38e7dd5b5cfb52fcd2d.1739182025.git.mchehab+huawei@kernel.org
2025-02-10 11:19:56 -07:00
Chao Yu
009a8241a8 f2fs: add a sysfs node to limit max read extent count per-inode
Quoted:
"at this time, there are still 1086911 extent nodes in this zombie
extent tree that need to be cleaned up.

crash_arm64_sprd_v8.0.3++> extent_tree.node_cnt ffffff80896cc500
  node_cnt = {
    counter = 1086911
  },
"

As reported by Xiuhong, there will be a huge number of extent nodes
in extent tree, it may potentially cause:
- slab memory fragments
- extreme long time shrink on extent tree
- low mapping efficiency

Let's add a sysfs node to limit max read extent count for each inode,
by default, value of this threshold is 10240, it can be updated
according to user's requirement.

Reported-by: Xiuhong Wang <xiuhong.wang@unisoc.com>
Closes: https://lore.kernel.org/linux-f2fs-devel/20241112110627.1314632-1-xiuhong.wang@unisoc.com/
Signed-off-by: Xiuhong Wang <xiuhong.wang@unisoc.com>
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-11-23 15:48:13 +00:00
Zhiguo Niu
296b8cb34e f2fs: fix to avoid use GC_AT when setting gc_mode as GC_URGENT_LOW or GC_URGENT_MID
If gc_mode is set to GC_URGENT_LOW or GC_URGENT_MID, cost benefit GC
approach should be used, but if ATGC is enabled at the same time,
Age-threshold approach will be selected, which can only do amount of
GC and it is much less than the numbers of CB approach.

some traces:
  f2fs_gc-254:48-396     [007] ..... 2311600.684028: f2fs_gc_begin: dev = (254,48), gc_type = Background GC, no_background_GC = 0, nr_free_secs = 0, nodes = 1053, dents = 2, imeta = 18, free_sec:44898, free_seg:44898, rsv_seg:239, prefree_seg:0
  f2fs_gc-254:48-396     [007] ..... 2311600.684527: f2fs_get_victim: dev = (254,48), type = No TYPE, policy = (Background GC, LFS-mode, Age-threshold), victim = 10, cost = 4294364975, ofs_unit = 1, pre_victim_secno = -1, prefree = 0, free = 44898
  f2fs_gc-254:48-396     [007] ..... 2311600.714835: f2fs_gc_end: dev = (254,48), ret = 0, seg_freed = 0, sec_freed = 0, nodes = 1562, dents = 2, imeta = 18, free_sec:44898, free_seg:44898, rsv_seg:239, prefree_seg:0
  f2fs_gc-254:48-396     [007] ..... 2311600.714843: f2fs_background_gc: dev = (254,48), wait_ms = 50, prefree = 0, free = 44898
  f2fs_gc-254:48-396     [007] ..... 2311600.771785: f2fs_gc_begin: dev = (254,48), gc_type = Background GC, no_background_GC = 0, nr_free_secs = 0, nodes = 1562, dents = 2, imeta = 18, free_sec:44898, free_seg:44898, rsv_seg:239, prefree_seg:
  f2fs_gc-254:48-396     [007] ..... 2311600.772275: f2fs_gc_end: dev = (254,48), ret = -61, seg_freed = 0, sec_freed = 0, nodes = 1562, dents = 2, imeta = 18, free_sec:44898, free_seg:44898, rsv_seg:239, prefree_seg:0

Fixes: 0e5e81114d ("f2fs: add GC_URGENT_LOW mode in gc_urgent")
Fixes: d98af5f455 ("f2fs: introduce gc_urgent_mid mode")
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-11-01 01:24:41 +00:00
Daeho Jeong
e791d00bd0 f2fs: add valid block ratio not to do excessive GC for one time GC
We need to introduce a valid block ratio threshold not to trigger
excessive GC for zoned deivces. The initial value of it is 95%. So, F2FS
will stop the thread from intiating GC for sections having valid blocks
exceeding the ratio.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11 03:33:08 +00:00
Daeho Jeong
9a481a1c16 f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent
Added control knobs for gc_no_zoned_gc_percent and
gc_boost_zoned_gc_percent.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11 03:33:05 +00:00
Daeho Jeong
4cdca5a904 f2fs: add reserved_segments sysfs node
For the fine tuning of GC behavior, add reserved_segments sysfs node.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11 03:32:57 +00:00
Daeho Jeong
8c890c4c60 f2fs: introduce migration_window_granularity
We can control the scanning window granularity for GC migration. For
more frequent scanning and GC on zoned devices, we need a fine grained
control knob for it.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11 03:32:54 +00:00
liujinbao1
6f092b55e1 f2fs: sysfs: support atgc_enabled
When we add "atgc" to the fstab table, ATGC is not immediately enabled.
There is a 7-day time threshold, and we can use "atgc_enabled" to
show whether ATGC is enabled.

Signed-off-by: liujinbao1 <liujinbao1@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-15 15:26:40 +00:00
Liao Yuanhong
8444ce5249 f2fs: add write priority option based on zone UFS
Currently, we are using a mix of traditional UFS and zone UFS to support
some functionalities that cannot be achieved on zone UFS alone. However,
there are some issues with this approach. There exists a significant
performance difference between traditional UFS and zone UFS. Under normal
usage, we prioritize writes to zone UFS. However, in critical conditions
(such as when the entire UFS is almost full), we cannot determine whether
data will be written to traditional UFS or zone UFS. This can lead to
significant performance fluctuations, which is not conducive to
development and testing. To address this, we have added an option
zlu_io_enable under sys with the following three modes:
1) zlu_io_enable == 0:Normal mode, prioritize writing to zone UFS;
2) zlu_io_enable == 1:Zone UFS only mode, only allow writing to zone UFS;
3) zlu_io_enable == 2:Traditional UFS priority mode, prioritize writing to
traditional UFS.

Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com>
Signed-off-by: Wu Bo <bo.wu@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-08-05 20:18:35 +00:00
Chao Yu
c521a6ab4a f2fs: fix to limit gc_pin_file_threshold
type of f2fs_inode.i_gc_failures, f2fs_inode_info.i_gc_failures, and
f2fs_sb_info.gc_pin_file_threshold is __le16, unsigned int, and u64,
so it will cause truncation during comparison and persistence.

Unifying variable of these three variables to unsigned short, and
add an upper boundary limitation for gc_pin_file_threshold.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-05-09 01:03:44 +00:00
Chao Yu
8b10d36537 f2fs: introduce FAULT_NO_SEGMENT
Use it to simulate no free segment case during block allocation.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-29 08:34:34 -08:00
Jeffrey Hugo
3ae768a132 f2fs: doc: Fix bouncing email address for Sahitya Tummala
The servers for the @codeaurora domain are long retired and any messages
addressed there will bounce.  Sahitya Tummala has a .mailmap entry to an
updated address, but the documentation files still list @codeaurora
which might be a problem for anyone reading the documentation directly.
Update the documentation files to match the .mailmap update.

Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-27 09:41:15 -08:00
Chao Yu
c7115e094c f2fs: introduce FAULT_BLKADDR_CONSISTENCE
We will encounter below inconsistent status when FAULT_BLKADDR type
fault injection is on.

Info: checkpoint state = d6 :  nat_bits crc fsck compacted_summary orphan_inodes sudden-power-off
[ASSERT] (fsck_chk_inode_blk:1254)  --> ino: 0x1c100 has i_blocks: 000000c0, but has 191 blocks
[FIX] (fsck_chk_inode_blk:1260)  --> [0x1c100] i_blocks=0x000000c0 -> 0xbf
[FIX] (fsck_chk_inode_blk:1269)  --> [0x1c100] i_compr_blocks=0x00000026 -> 0x27
[ASSERT] (fsck_chk_inode_blk:1254)  --> ino: 0x1cadb has i_blocks: 0000002f, but has 46 blocks
[FIX] (fsck_chk_inode_blk:1260)  --> [0x1cadb] i_blocks=0x0000002f -> 0x2e
[FIX] (fsck_chk_inode_blk:1269)  --> [0x1cadb] i_compr_blocks=0x00000011 -> 0x12
[ASSERT] (fsck_chk_inode_blk:1254)  --> ino: 0x1c62c has i_blocks: 00000002, but has 1 blocks
[FIX] (fsck_chk_inode_blk:1260)  --> [0x1c62c] i_blocks=0x00000002 -> 0x1

After we inject fault into f2fs_is_valid_blkaddr() during truncation,
a) it missed to increase @nr_free or @valid_blocks
b) it can cause in blkaddr leak in truncated dnode
Which may cause inconsistent status.

This patch separates FAULT_BLKADDR_CONSISTENCE from FAULT_BLKADDR,
and rename FAULT_BLKADDR to FAULT_BLKADDR_VALIDITY
so that we can:
a) use FAULT_BLKADDR_CONSISTENCE in f2fs_truncate_data_blocks_range()
to simulate inconsistent issue independently, then it can verify fsck
repair flow.
b) FAULT_BLKADDR_VALIDITY fault will not cause any inconsistent status,
we can just use it to check error path handling in kernel side.

Reviewed-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-05 18:58:39 -08:00
Zhiguo Niu
c3c2d45b90 f2fs: show more discard status by sysfs
The current pending_discard attr just only shows the discard_cmd_cnt
information. More discard status can be shown so that we can check
them through sysfs when needed.

Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-12-26 13:07:26 -08:00
Chao Yu
d346fa09ab f2fs: sysfs: support discard_io_aware
It gives a way to enable/disable IO aware feature for background
discard, so that we can tune background discard more precisely
based on undiscard condition. e.g. force to disable IO aware if
there are large number of discard extents, and discard IO may
always be interrupted by frequent common IO.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-11-28 10:59:51 -08:00
Chao Yu
726865e69a f2fs: doc: fix description of max_small_discards
The description of max_small_discards is out-of-update in below two
aspects, fix it.
- it is disabled by default
- small discards will be issued during checkpoint

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-18 14:28:26 -07:00
Randy Dunlap
c709d099a0 f2fs: fix spelling in ABI documentation
Correct spelling problems as identified by codespell.

Fixes: 9e615dbba4 ("f2fs: add missing description for ipu_policy node")
Fixes: b2e4a2b300 ("f2fs: expose discard related parameters in sysfs")
Fixes: 846ae671ad ("f2fs: expose extension_list sysfs entry")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Chao Yu <chao@kernel.org>
Cc: linux-f2fs-devel@lists.sourceforge.net
Cc: Yangtao Li <frank.li@vivo.com>
Cc: Konstantin Vyshetsky <vkon@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-08-14 13:41:07 -07:00
Yangtao Li
abae448626 f2fs: remove batched_trim_sections node description
It's deprecated since commit 377224c471 ("f2fs: don't split checkpoint in fstrim").

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-04-13 16:37:57 -07:00
Yangtao Li
960fa2c828 f2fs: export compress_percent and compress_watermark entries
This patch export below sysfs entries for better control cached
compress page count.

/sys/fs/f2fs/<disk>/compress_watermark
/sys/fs/f2fs/<disk>/compress_percent

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-03-29 15:17:37 -07:00
Yangtao Li
9e615dbba4 f2fs: add missing description for ipu_policy node
IPU policy can be disabled, let's add description for it and other policy.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-02-07 10:46:47 -08:00
qixiaoyu1
d23be468ea f2fs: add sysfs nodes to set last_age_weight
Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com>
Signed-off-by: xiongping1 <xiongping1@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-02-07 10:39:19 -08:00
Yangtao Li
120e0ea12d f2fs: introduce discard_io_aware_gran sysfs node
The current discard_io_aware_gran is a fixed value, change it to be
configurable through the sys node.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-30 13:23:51 -08:00
Chao Yu
cec32b00fa f2fs: add missing doc for fault injection sysfs
We supported configuring fault injection parameter via sysfs w/
below commits, however, we forgot to add doc entry, fix it.

commit 087968974f ("f2fs: add fault injection to sysfs")
/sys/fs/f2fs/fault_injection/fault_*

commit 1ecc0c5c50 ("f2fs: support configuring fault injection per superblock")
/sys/fs/f2fs/<device>/fault_*

Cc: Sheng Yong <shengyong@oppo.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-01-06 15:13:41 -08:00
Jaegeuk Kim
71644dff48 f2fs: add block_age-based extent cache
This patch introduces a runtime hot/cold data separation method
for f2fs, in order to improve the accuracy for data temperature
classification, reduce the garbage collection overhead after
long-term data updates.

Enhanced hot/cold data separation can record data block update
frequency as "age" of the extent per inode, and take use of the age
info to indicate better temperature type for data block allocation:
 - It records total data blocks allocated since mount;
 - When file extent has been updated, it calculate the count of data
blocks allocated since last update as the age of the extent;
 - Before the data block allocated, it searches for the age info and
chooses the suitable segment for allocation.

Test and result:
 - Prepare: create about 30000 files
  * 3% for cold files (with cold file extension like .apk, from 3M to 10M)
  * 50% for warm files (with random file extension like .FcDxq, from 1K
to 4M)
  * 47% for hot files (with hot file extension like .db, from 1K to 256K)
 - create(5%)/random update(90%)/delete(5%) the files
  * total write amount is about 70G
  * fsync will be called for .db files, and buffered write will be used
for other files

The storage of test device is large enough(128G) so that it will not
switch to SSR mode during the test.

Benefit: dirty segment count increment reduce about 14%
 - before: Dirty +21110
 - after:  Dirty +18286

Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com>
Signed-off-by: xiongping1 <xiongping1@xiaomi.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:53:56 -08:00
Yangtao Li
8a47d228de f2fs: introduce discard_urgent_util sysfs node
Through this node, you can control the background discard
to run more aggressively or not aggressively when reach the
utilization rate of the space.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-28 12:51:01 -08:00
Yangtao Li
fc031877b8 f2fs: fix description about discard_granularity node
Let's fix the inconsistency in the text description.
Default discard granularity is 16. For small devices,
default value is 1.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-28 12:48:59 -08:00
Yangtao Li
e5a0db6a9e f2fs: replace gc_urgent_high_remaining with gc_remaining_trials
The user can set the trial count limit for GC urgent and
idle mode with replaced gc_remaining_trials.. If GC thread gets
to the limit, the mode will turn back to GC normal mode finally.

It was applied only to GC_URGENT, while this patch expands it for
GC_IDLE.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-01 17:56:04 -07:00
Jaegeuk Kim
eebd36a408 f2fs: add missing bracket in doc
Let's add missing <>.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-01 17:56:04 -07:00
Yangtao Li
a3951cd199 f2fs: introduce gc_mode sysfs node
Revert "f2fs: make gc_urgent and gc_segment_mode sysfs node readable".

Add a gc_mode sysfs node to show the current gc_mode as a string.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-01 17:56:04 -07:00
Yangtao Li
c46867e9b9 f2fs: introduce max_ordered_discard sysfs node
The current max_ordered_discard is a fixed value, change it to be
configurable through the sys node.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-11-01 17:56:03 -07:00
Chao Yu
718693c84d f2fs: introduce cp_status sysfs entry
This patch adds a new sysfs entry named cp_status, it can output
checkpoint flags in real time.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-10-04 13:31:44 -07:00
Daeho Jeong
f8e2f32bcd f2fs: introduce sysfs atomic write statistics
introduce the below 4 new sysfs node for atomic write statistics.
- current_atomic_write: the total current atomic write block count,
                        which is not committed yet.
- peak_atomic_write: the peak value of total current atomic write block
                     count after boot.
- committed_atomic_block: the accumulated total committed atomic write
                          block count after boot.
- revoked_atomic_block: the accumulated total revoked atomic write block
                        count after boot.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30 20:17:07 -07:00
Jaegeuk Kim
8e0f54a70e f2fs: add a sysfs entry to show zone capacity
This patch adds a sysfs entry showing the unusable space in a section
made by zone capacity.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-07-30 20:16:20 -07:00
Daeho Jeong
d98af5f455 f2fs: introduce gc_urgent_mid mode
We need a mid level of gc urgent mode to do GC forcibly in a period
of given gc_urgent_sleep_time, but not like using greedy GC approach
and switching to SSR mode such as gc urgent high mode. This can be
used for more aggressive periodic storage clean up.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-03-17 09:16:22 -07:00
Jaegeuk Kim
ba900534f8 f2fs: don't get FREEZE lock in f2fs_evict_inode in frozen fs
Let's purge inode cache in order to avoid the below deadlock.

[freeze test]                         shrinkder
freeze_super
 - pwercpu_down_write(SB_FREEZE_FS)
                                       - super_cache_scan
                                         - down_read(&sb->s_umount)
                                           - prune_icache_sb
                                            - dispose_list
                                             - evict
                                              - f2fs_evict_inode
thaw_super
 - down_write(&sb->s_umount);
                                              - __percpu_down_read(SB_FREEZE_FS)

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-03-11 07:36:17 -08:00
Jaegeuk Kim
47c8ebcce8 f2fs: add a way to limit roll forward recovery time
This adds a sysfs entry to call checkpoint during fsync() in order to avoid
long elapsed time to run roll-forward recovery when booting the device.
Default value doesn't enforce the limitation which is same as before.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-02-12 05:58:18 -08:00
Chao Yu
1018a5463a f2fs: introduce F2FS_IPU_HONOR_OPU_WRITE ipu policy
Once F2FS_IPU_FORCE policy is enabled in some cases:
a) f2fs forces to use F2FS_IPU_FORCE in a small-sized volume
b) user sets F2FS_IPU_FORCE policy via sysfs

Then we may fail to defragment file due to IPU policy check, it doesn't
make sense, let's introduce a new IPU policy to allow OPU during file
defragmentation.

In small-sized volume, let's enable F2FS_IPU_HONOR_OPU_WRITE policy
by default.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-02-07 11:28:35 -08:00
Konstantin Vyshetsky
b2e4a2b300 f2fs: expose discard related parameters in sysfs
This patch exposes max_discard_request, min_discard_issue_time,
mid_discard_issue_time, and max_discard_issue_time in sysfs. This will
allow the user to fine tune discard operations.

Signed-off-by: Konstantin Vyshetsky <vkon@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-02-02 16:34:55 -08:00