Commit graph

401 commits

Author SHA1 Message Date
Daeho Jeong
c8705cefce f2fs: add gc_boost_gc_greedy sysfs node
Add this to control GC algorithm for boost GC.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-29 15:02:36 +00:00
Daeho Jeong
1d4c5dbba1 f2fs: add gc_boost_gc_multiple sysfs node
Add a sysfs knob to set a multiplier for the background GC migration
window when F2FS Garbage Collection is boosted.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-29 15:02:33 +00:00
Daeho Jeong
e6d5e789c3 f2fs: ignore valid ratio when free section count is low
Otherwise F2FS will not do GC in background in low free section.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-24 20:19:13 +00:00
mason.zhang
b93bf64e34 f2fs: merge the two conditions to avoid code duplication
No functional changes.

Signed-off-by: mason.zhang <masonzhang.linuxer@gmail.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-24 20:16:01 +00:00
Matthew Wilcox (Oracle)
4ecaf580ee f2fs: Add folio counterparts to page_private_flags functions
Name these new functions folio_test_f2fs_*(), folio_set_f2fs_*() and
folio_clear_f2fs_*().  Convert all callers which currently have a folio
and cast back to a page.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-22 15:57:05 +00:00
Matthew Wilcox (Oracle)
a5f3be6e65 f2fs: Pass a folio to IS_INODE()
All callers now have a folio so pass it in.  Also make it const to help
the compiler.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-22 15:57:02 +00:00
Matthew Wilcox (Oracle)
6d3a7f6589 f2fs: Pass a folio to ofs_of_node()
All callers now have a folio so pass it in.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-22 15:56:53 +00:00
Matthew Wilcox (Oracle)
d342b7adad f2fs: Add fio->folio
Put fio->page insto a union with fio->folio.  This lets us remove a
lot of folio->page and page->folio conversions.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-22 15:56:39 +00:00
Matthew Wilcox (Oracle)
9d71780716 f2fs: Pass a folio to F2FS_INODE()
All callers now have a folio, so pass it in.  Also make it const as
F2FS_INODE() does not modify the struct folio passed in (the data it
describes is mutable, but it does not change the contents of the struct).
This may improve code generation.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-22 15:55:43 +00:00
Chao Yu
c1cfc87e49 f2fs: introduce is_cur{seg,sec}()
There are redundant codes in IS_CUR{SEG,SEC}() macros, let's introduce
inline is_cur{seg,sec}() functions, and use a loop in it for cleanup.

Meanwhile, it enhances expansibility, as it doesn't need to change
is_cur{seg,sec}() when we add a new log header.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-07-09 18:02:20 +00:00
Daeho Jeong
8142daf8a5 f2fs: turn off one_time when forcibly set to foreground GC
one_time mode is only for background GC. So, we need to set it back to
false when foreground GC is enforced.

Fixes: 9748c2ddea ("f2fs: do FG_GC when GC boosting is required for zoned devices")
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-06-23 22:13:01 +00:00
Daeho Jeong
24bf3ee37f f2fs: make sure zoned device GC to use FG_GC in shortage of free section
We already use FG_GC when we have free sections under
gc_boost_zoned_gc_percent. So, let's make it consistent.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-06-23 22:13:01 +00:00
Linus Torvalds
d8441523f2 f2fs-for-6.16-rc1
In this round, Matthew converted most of page operations to using folio. Beyond
 the work, we've applied some performance tunings such as GC and linear lookup,
 in addition to enhancing fault injection and sanity checks.
 
 Enhancement:
  - large number of folio conversions
  - add a control to turn on/off the linear lookup for performance
  - tune GC logics for zoned block device
  - improve fault injection and sanity checks
 
 Bug fix:
  - handle error cases of memory donation
  - fix to correct check conditions in f2fs_cross_rename
  - fix to skip f2fs_balance_fs() if checkpoint is disabled
  - don't over-report free space or inodes in statvfs
  - prevent the current section from being selected as a victim during GC
  - fix to calculate first_zoned_segno correctly
  - fix to avoid inconsistence in between SIT and SSA for zoned block device
 
 As usual, there are several debugging patches and clean-ups as well.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmg3PdcACgkQQBSofoJI
 UNL/mQ/9Hkru4XSCokhxt8+/HoFRnTliAlzfD45Vzkkhz1YP7J8VdvWOzJV/WEai
 D3Ib50Q6/y2ptxu7cwOpmToR3fI3RzAlgQsYooFAiZOBnyUkBOLA1oaVuT4s/EYg
 u85xxLx0SW/IMX5CKKbYzhbXnocGAvRUkp/k30kjKJxpCeQ7pw/mLhw/2XeNIb9h
 FxJbECWPpf4PA6ot22YUNvQn0plF/s9873PPhv50vpGyXTHIlTbDCSMeEC1r1E5v
 xWsPcWmTkyPIyBhNFEONWJw1l3wcVIVKNBfBqwMEDr+Tgqi5UDEREeTDV9q5C6y+
 vw3KnsOqX7RTdLExGfefTOnBsTqqMwSZQSH2HL5/Poayg5obXf3D/fUqAQajJpt/
 FbAtfKaXElJcC7l3DJQU3Trh+WpdEPbuMiJo43OzX0YGvMfkA/sYrAHTYm5Q4nsC
 wrRLaWiBgG6nQDKNXz+amD9kL1SMxp+Vsf6ybtChH3gvMqDAJsR7DY1F/Cxe3ry8
 8JoJiGRYq70lw5xNACfJNQwWwRbtySy63nIwMA7FGR9zaXBQJx+cSPhEeLsS+0hI
 zgijgtgRjbfuojlh7qvfFArHEIL4A67Um3RhjHbLWSFhREPaTB0665ElUNTGPe+y
 hVdYtkb0X2ngsYdV/Xdmp/OThpSxI8x1ZCXVsrElawVIMpjP+nA=
 =G8sl
 -----END PGP SIGNATURE-----

Merge tag 'f2fs-for-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, Matthew converted most of page operations to using
  folio. Beyond the work, we've applied some performance tunings such as
  GC and linear lookup, in addition to enhancing fault injection and
  sanity checks.

  Enhancements:
   - large number of folio conversions
   - add a control to turn on/off the linear lookup for performance
   - tune GC logics for zoned block device
   - improve fault injection and sanity checks

  Bug fixes:
   - handle error cases of memory donation
   - fix to correct check conditions in f2fs_cross_rename
   - fix to skip f2fs_balance_fs() if checkpoint is disabled
   - don't over-report free space or inodes in statvfs
   - prevent the current section from being selected as a victim during GC
   - fix to calculate first_zoned_segno correctly
   - fix to avoid inconsistence between SIT and SSA for zoned block device

  As usual, there are several debugging patches and clean-ups as well"

* tag 'f2fs-for-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (195 commits)
  f2fs: fix to correct check conditions in f2fs_cross_rename
  f2fs: use d_inode(dentry) cleanup dentry->d_inode
  f2fs: fix to skip f2fs_balance_fs() if checkpoint is disabled
  f2fs: clean up to check bi_status w/ BLK_STS_OK
  f2fs: introduce is_{meta,node}_folio
  f2fs: add ckpt_valid_blocks to the section entry
  f2fs: add a method for calculating the remaining blocks in the current segment in LFS mode.
  f2fs: introduce FAULT_VMALLOC
  f2fs: use vmalloc instead of kvmalloc in .init_{,de}compress_ctx
  f2fs: add f2fs_bug_on() in f2fs_quota_read()
  f2fs: add f2fs_bug_on() to detect potential bug
  f2fs: remove unused sbi argument from checksum functions
  f2fs: fix 32-bits hexademical number in fault injection doc
  f2fs: don't over-report free space or inodes in statvfs
  f2fs: return bool from __write_node_folio
  f2fs: simplify return value handling in f2fs_fsync_node_pages
  f2fs: always unlock the page in f2fs_write_single_data_page
  f2fs: remove wbc->for_reclaim handling
  f2fs: return bool from __f2fs_write_meta_folio
  f2fs: fix to return correct error number in f2fs_sync_node_pages()
  ...
2025-05-30 08:40:25 -07:00
Chao Yu
019a891242 f2fs: introduce is_{meta,node}_folio
Just cleanup, no changes.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-05-28 16:03:26 +00:00
Christian Brauner
1afe9e7da8
f2fs: fix freezing filesystem during resize
Using FREEZE_HOLDER_USERSPACE has two consequences:

(1) If userspace freezes the filesystem after mnt_drop_write_file() but
    before freeze_super() was called filesystem resizing will fail
    because the freeze isn't marked as nestable.

(2) If the kernel has successfully frozen the filesystem via
    FREEZE_HOLDER_USERSPACE userspace can simply undo it by using the
    FITHAW ioctl.

Fix both issues by using FREEZE_HOLDER_KERNEL. It will nest with
FREEZE_HOLDER_USERSPACE and cannot be undone by userspace.

And it is the correct thing to do because the kernel temporarily freezes
the filesystem.

Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-09 12:41:24 +02:00
Christian Brauner
1af3331764
super: add filesystem freezing helpers for suspend and hibernate
Allow the power subsystem to support filesystem freeze for
suspend and hibernate.

For some kernel subsystems it is paramount that they are guaranteed that
they are the owner of the freeze to avoid any risk of deadlocks. This is
the case for the power subsystem. Enable it to recognize whether it did
actually freeze the filesystem.

If userspace has 10 filesystems and suspend/hibernate manges to freeze 5
and then fails on the 6th for whatever odd reason (current or future)
then power needs to undo the freeze of the first 5 filesystems. It can't
just walk the list again because while it's unlikely that a new
filesystem got added in the meantime it still cannot tell which
filesystems the power subsystem actually managed to get a freeze
reference count on that needs to be dropped during thaw.

There's various ways out of this ugliness. For example, record the
filesystems the power subsystem managed to freeze on a temporary list in
the callbacks and then walk that list backwards during thaw to undo the
freezing or make sure that the power subsystem just actually exclusively
freezes things it can freeze and marking such filesystems as being owned
by power for the duration of the suspend or resume cycle. I opted for
the latter as that seemed the clean thing to do even if it means more
code changes.

If hibernation races with filesystem freezing (e.g. DM reconfiguration),
then hibernation need not freeze a filesystem because it's already
frozen but userspace may thaw the filesystem before hibernation actually
happens.

If the race happens the other way around, DM reconfiguration may
unexpectedly fail with EBUSY.

So allow FREEZE_EXCL to nest with other holders. An exclusive freezer
cannot be undone by any of the other concurrent freezers.

Link: https://lore.kernel.org/r/20250329-work-freeze-v2-6-a47af37ecc3d@kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-05-09 12:41:02 +02:00
Matthew Wilcox (Oracle)
6f7ec66180 f2fs: Convert dnode_of_data->node_page to node_folio
All assignments to this struct member are conversions from a folio
so convert it to be a folio and convert all users.  At the same time,
convert data_blkaddr() to take a folio as all callers now have a folio.
Remove eight calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-28 15:26:47 +00:00
Matthew Wilcox (Oracle)
1a116e876a f2fs: Use a folio in is_alive()
Remove four calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-28 15:26:37 +00:00
Matthew Wilcox (Oracle)
c795d9dbe0 f2fs: Convert f2fs_move_node_page() to f2fs_move_node_folio()
Pass the folio in from the one caller and use it throughout.
Removes eight hidden calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-28 15:26:37 +00:00
Matthew Wilcox (Oracle)
c528defa64 f2fs: Use a folio in gc_node_segment()
Remove three calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-28 15:26:37 +00:00
Matthew Wilcox (Oracle)
2a96ddcb4a f2fs: Use a folio in move_data_block()
Remove 11 hidden calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-28 15:26:34 +00:00
Matthew Wilcox (Oracle)
0d53be2323 f2fs: Use a folio in ra_data_block()
Remove three hidden calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-28 15:26:34 +00:00
Matthew Wilcox (Oracle)
5d895f7bea f2fs: Use folios in do_garbage_collect()
Get a folio instead of a page and operate on folios throughout.
Remove six calls to compound_head() and use folio_put_refs() to put
both references we hold at the same time, reducing the number of atomic
operations we do.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-28 15:26:31 +00:00
Matthew Wilcox (Oracle)
c14b4562bc f2fs: Use a folio in move_data_block()
Fetch a folio from the pagecache instead of a page and operate on it
throughout.  Removes eight calls to compound_head() and an access to
page->mapping.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-28 15:21:33 +00:00
Chao Yu
773704c1ef f2fs: zone: fix to avoid inconsistence in between SIT and SSA
w/ below testcase, it will cause inconsistence in between SIT and SSA.

create_null_blk 512 2 1024 1024
mkfs.f2fs -m /dev/nullb0
mount /dev/nullb0 /mnt/f2fs/
touch /mnt/f2fs/file
f2fs_io pinfile set /mnt/f2fs/file
fallocate -l 4GiB /mnt/f2fs/file

F2FS-fs (nullb0): Inconsistent segment (0) type [1, 0] in SSA and SIT
CPU: 5 UID: 0 PID: 2398 Comm: fallocate Tainted: G           O       6.13.0-rc1 #84
Tainted: [O]=OOT_MODULE
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Call Trace:
 <TASK>
 dump_stack_lvl+0xb3/0xd0
 dump_stack+0x14/0x20
 f2fs_handle_critical_error+0x18c/0x220 [f2fs]
 f2fs_stop_checkpoint+0x38/0x50 [f2fs]
 do_garbage_collect+0x674/0x6e0 [f2fs]
 f2fs_gc_range+0x12b/0x230 [f2fs]
 f2fs_allocate_pinning_section+0x5c/0x150 [f2fs]
 f2fs_expand_inode_data+0x1cc/0x3c0 [f2fs]
 f2fs_fallocate+0x3c3/0x410 [f2fs]
 vfs_fallocate+0x15f/0x4b0
 __x64_sys_fallocate+0x4a/0x80
 x64_sys_call+0x15e8/0x1b80
 do_syscall_64+0x68/0x130
 entry_SYSCALL_64_after_hwframe+0x67/0x6f
RIP: 0033:0x7f9dba5197ca
F2FS-fs (nullb0): Stopped filesystem due to reason: 4

The reason is f2fs_gc_range() may try to migrate block in curseg, however,
its SSA block is not uptodate due to the last summary block data is still
in cache of curseg.

In this patch, we add a condition in f2fs_gc_range() to check whether
section is opened or not, and skip block migration for opened section.

Fixes: 9703d69d9d ("f2fs: support file pinning for zoned devices")
Reviewed-by: Daeho Jeong <daehojeong@google.com>
Cc: Daeho Jeong <daehojeong@google.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-04-10 03:59:58 +00:00
Matthew Wilcox (Oracle)
a86e109ee2 f2fs: Convert gc_data_segment() to use a folio
Use f2fs_get_read_data_folio() instead of f2fs_get_read_data_page().
Saves a hidden call to compound_head() in f2fs_put_page().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-04 17:02:26 +00:00
Matthew Wilcox (Oracle)
6d1ba45c8d f2fs: Convert move_data_page() to use a folio
Fetch a folio from the page cache and use it throughout, saving
eight hidden calls to compound_head().

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-04 17:02:26 +00:00
Yi Sun
d217b5cea4 f2fs: add parameter @len to f2fs_invalidate_internal_cache()
New function can process some consecutive blocks at a time.

Signed-off-by: Yi Sun <yi.sun@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-01-08 18:31:45 +00:00
Yongpeng Yang
e9a844f6e4 f2fs: The GC triggered by ioctl also needs to mark the segno as victim
In SSR mode, the segment selected for allocation might be the same as
the target segment of the GC triggered by ioctl, resulting in the GC
moving the CURSEG_I(sbi, type)->segno.
Thread A				Thread B or Thread A
- f2fs_ioc_gc_range
 - __f2fs_ioc_gc_range(.victim_segno=segno#N)
  - f2fs_gc
   - __get_victim
    - f2fs_get_victim
    : segno#N is valid, return segno#N as source segment of GC
					- f2fs_allocate_data_block
						- need_new_seg
						- get_ssr_segment
						- f2fs_get_victim
						: get segno #N as destination segment
						- change_curseg

Fixes: e066b83c9b ("f2fs: add ioctl to flush data from faster device to cold area")
Signed-off-by: Yongpeng Yang <yangyongpeng1@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-12-16 16:12:29 +00:00
Zhiguo Niu
296b8cb34e f2fs: fix to avoid use GC_AT when setting gc_mode as GC_URGENT_LOW or GC_URGENT_MID
If gc_mode is set to GC_URGENT_LOW or GC_URGENT_MID, cost benefit GC
approach should be used, but if ATGC is enabled at the same time,
Age-threshold approach will be selected, which can only do amount of
GC and it is much less than the numbers of CB approach.

some traces:
  f2fs_gc-254:48-396     [007] ..... 2311600.684028: f2fs_gc_begin: dev = (254,48), gc_type = Background GC, no_background_GC = 0, nr_free_secs = 0, nodes = 1053, dents = 2, imeta = 18, free_sec:44898, free_seg:44898, rsv_seg:239, prefree_seg:0
  f2fs_gc-254:48-396     [007] ..... 2311600.684527: f2fs_get_victim: dev = (254,48), type = No TYPE, policy = (Background GC, LFS-mode, Age-threshold), victim = 10, cost = 4294364975, ofs_unit = 1, pre_victim_secno = -1, prefree = 0, free = 44898
  f2fs_gc-254:48-396     [007] ..... 2311600.714835: f2fs_gc_end: dev = (254,48), ret = 0, seg_freed = 0, sec_freed = 0, nodes = 1562, dents = 2, imeta = 18, free_sec:44898, free_seg:44898, rsv_seg:239, prefree_seg:0
  f2fs_gc-254:48-396     [007] ..... 2311600.714843: f2fs_background_gc: dev = (254,48), wait_ms = 50, prefree = 0, free = 44898
  f2fs_gc-254:48-396     [007] ..... 2311600.771785: f2fs_gc_begin: dev = (254,48), gc_type = Background GC, no_background_GC = 0, nr_free_secs = 0, nodes = 1562, dents = 2, imeta = 18, free_sec:44898, free_seg:44898, rsv_seg:239, prefree_seg:
  f2fs_gc-254:48-396     [007] ..... 2311600.772275: f2fs_gc_end: dev = (254,48), ret = -61, seg_freed = 0, sec_freed = 0, nodes = 1562, dents = 2, imeta = 18, free_sec:44898, free_seg:44898, rsv_seg:239, prefree_seg:0

Fixes: 0e5e81114d ("f2fs: add GC_URGENT_LOW mode in gc_urgent")
Fixes: d98af5f455 ("f2fs: introduce gc_urgent_mid mode")
Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-11-01 01:24:41 +00:00
liuderong
b19ee72722 f2fs: introduce f2fs_get_section_mtime
When segs_per_sec is larger than 1, section may contain invalid segments,
mtime should be the average value of each valid blocks,
so introduce f2fs_get_section_mtime to
record the average mtime of all valid blocks in a section.

Signed-off-by: liuderong <liuderong@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-10-14 20:04:57 +00:00
Daeho Jeong
5cc69a27ab f2fs: forcibly migrate to secure space for zoned device file pinning
We need to migrate data blocks even though it is full to secure space
for zoned device file pinning.

Fixes: 9703d69d9d ("f2fs: support file pinning for zoned devices")
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-13 16:37:00 +00:00
liuderong
2af583afcf f2fs: remove unused parameters
Remove unused parameter segno from f2fs_usable_segs_in_sec.

Signed-off-by: liuderong <liuderong@oppo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-12 14:42:03 +00:00
Daeho Jeong
e791d00bd0 f2fs: add valid block ratio not to do excessive GC for one time GC
We need to introduce a valid block ratio threshold not to trigger
excessive GC for zoned deivces. The initial value of it is 95%. So, F2FS
will stop the thread from intiating GC for sections having valid blocks
exceeding the ratio.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11 03:33:08 +00:00
Daeho Jeong
9a481a1c16 f2fs: create gc_no_zoned_gc_percent and gc_boost_zoned_gc_percent
Added control knobs for gc_no_zoned_gc_percent and
gc_boost_zoned_gc_percent.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11 03:33:05 +00:00
Daeho Jeong
9748c2ddea f2fs: do FG_GC when GC boosting is required for zoned devices
Under low free section count, we need to use FG_GC instead of BG_GC to
recover free sections.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11 03:33:02 +00:00
Daeho Jeong
2223fe652f f2fs: increase BG GC migration window granularity when boosted for zoned devices
Need bigger BG GC migration window granularity when free section is
running low.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11 03:33:00 +00:00
Daeho Jeong
8c890c4c60 f2fs: introduce migration_window_granularity
We can control the scanning window granularity for GC migration. For
more frequent scanning and GC on zoned devices, we need a fine grained
control knob for it.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11 03:32:54 +00:00
Daeho Jeong
5062b5bed4 f2fs: make BG GC more aggressive for zoned devices
Since we don't have any GC on device side for zoned devices, need more
aggressive BG GC. So, tune the parameters for that.

Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-09-11 03:32:44 +00:00
Sunmin Jeong
f18d007693 f2fs: use meta inode for GC of COW file
In case of the COW file, new updates and GC writes are already
separated to page caches of the atomic file and COW file. As some cases
that use the meta inode for GC, there are some race issues between a
foreground thread and GC thread.

To handle them, we need to take care when to invalidate and wait
writeback of GC pages in COW files as the case of using the meta inode.
Also, a pointer from the COW inode to the original inode is required to
check the state of original pages.

For the former, we can solve the problem by using the meta inode for GC
of COW files. Then let's get a page from the original inode in
move_data_block when GCing the COW file to avoid race condition.

Fixes: 3db1de0e58 ("f2fs: change the current atomic write way")
Cc: stable@vger.kernel.org #v5.19+
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yeongjin Gil <youngjin.gil@samsung.com>
Signed-off-by: Sunmin Jeong <s_min.jeong@samsung.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-07-10 22:48:20 +00:00
Sunmin Jeong
b40a2b0037 f2fs: use meta inode for GC of atomic file
The page cache of the atomic file keeps new data pages which will be
stored in the COW file. It can also keep old data pages when GCing the
atomic file. In this case, new data can be overwritten by old data if a
GC thread sets the old data page as dirty after new data page was
evicted.

Also, since all writes to the atomic file are redirected to COW inodes,
GC for the atomic file is not working well as below.

f2fs_gc(gc_type=FG_GC)
  - select A as a victim segment
  do_garbage_collect
    - iget atomic file's inode for block B
    move_data_page
      f2fs_do_write_data_page
        - use dn of cow inode
        - set fio->old_blkaddr from cow inode
    - seg_freed is 0 since block B is still valid
  - goto gc_more and A is selected as victim again

To solve the problem, let's separate GC writes and updates in the atomic
file by using the meta inode for GC writes.

Fixes: 3db1de0e58 ("f2fs: change the current atomic write way")
Cc: stable@vger.kernel.org #v5.19+
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yeongjin Gil <youngjin.gil@samsung.com>
Signed-off-by: Sunmin Jeong <s_min.jeong@samsung.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-07-10 22:48:04 +00:00
Zhiguo Niu
6924c8b6fd f2fs: fix to remove redundant SBI_NEED_FSCK flag set
Subsequent f2fs_stop_checkpoint will set cp_err, so this
SBI_NEED_FSCK flag set action is invalid.

Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12 15:46:02 +00:00
Chao Yu
fc01008c92 f2fs: fix to do sanity check on F2FS_INLINE_DATA flag in inode during GC
syzbot reports a f2fs bug as below:

------------[ cut here ]------------
kernel BUG at fs/f2fs/inline.c:258!
CPU: 1 PID: 34 Comm: kworker/u8:2 Not tainted 6.9.0-rc6-syzkaller-00012-g9e4bc4bcae01 #0
RIP: 0010:f2fs_write_inline_data+0x781/0x790 fs/f2fs/inline.c:258
Call Trace:
 f2fs_write_single_data_page+0xb65/0x1d60 fs/f2fs/data.c:2834
 f2fs_write_cache_pages fs/f2fs/data.c:3133 [inline]
 __f2fs_write_data_pages fs/f2fs/data.c:3288 [inline]
 f2fs_write_data_pages+0x1efe/0x3a90 fs/f2fs/data.c:3315
 do_writepages+0x35b/0x870 mm/page-writeback.c:2612
 __writeback_single_inode+0x165/0x10b0 fs/fs-writeback.c:1650
 writeback_sb_inodes+0x905/0x1260 fs/fs-writeback.c:1941
 wb_writeback+0x457/0xce0 fs/fs-writeback.c:2117
 wb_do_writeback fs/fs-writeback.c:2264 [inline]
 wb_workfn+0x410/0x1090 fs/fs-writeback.c:2304
 process_one_work kernel/workqueue.c:3254 [inline]
 process_scheduled_works+0xa12/0x17c0 kernel/workqueue.c:3335
 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
 kthread+0x2f2/0x390 kernel/kthread.c:388
 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

The root cause is: inline_data inode can be fuzzed, so that there may
be valid blkaddr in its direct node, once f2fs triggers background GC
to migrate the block, it will hit f2fs_bug_on() during dirty page
writeback.

Let's add sanity check on F2FS_INLINE_DATA flag in inode during GC,
so that, it can forbid migrating inline_data inode's data block for
fixing.

Reported-by: syzbot+848062ba19c8782ca5c8@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-f2fs-devel/000000000000d103ce06174d7ec3@google.com
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-06-12 15:46:01 +00:00
Chao Yu
a798ff17cd f2fs: fix to add missing iput() in gc_data_segment()
During gc_data_segment(), if inode state is abnormal, it missed to call
iput(), fix it.

Fixes: b73e52824c ("f2fs: reposition unlock_new_inode to prevent accessing invalid inode")
Fixes: 9056d6489f ("f2fs: fix to do sanity check on inode type during garbage collection")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-05-11 00:40:07 +00:00
Jaegeuk Kim
16778aea91 f2fs: use folio_test_writeback
Let's convert PageWriteback to folio_test_writeback.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-04-12 20:58:35 +00:00
Zhiguo Niu
245930617c f2fs: fix to handle error paths of {new,change}_curseg()
{new,change}_curseg() may return error in some special cases,
error handling should be did in their callers, and this will also
facilitate subsequent error path expansion in {new,change}_curseg().

Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-12 18:25:17 -07:00
Zhiguo Niu
31f85ccc84 f2fs: unify the error handling of f2fs_is_valid_blkaddr
There are some cases of f2fs_is_valid_blkaddr not handled as
ERROR_INVALID_BLKADDR,so unify the error handling about all of
f2fs_is_valid_blkaddr.
Do f2fs_handle_error in __f2fs_is_valid_blkaddr for cleanup.

Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Chao Yu <chao@kernel.org>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-12 18:25:17 -07:00
Chao Yu
45809cd3bd f2fs: introduce SEGS_TO_BLKS/BLKS_TO_SEGS for cleanup
Just cleanup, no functional change.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04 10:18:26 -08:00
Zhiguo Niu
22af1b8c31 f2fs: fix to check return value of f2fs_gc_range
f2fs_gc_range may return error, so its caller
f2fs_allocate_pinning_section should determine whether
to do retry based on ist return value.

Also just do f2fs_gc_range when f2fs_allocate_new_section
return -EAGAIN, and check cp error case in f2fs_gc_range.

Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-03-04 09:52:58 -08:00
Chao Yu
7d009e048d f2fs: fix to handle segment allocation failure correctly
If CONFIG_F2FS_CHECK_FS is off, and for very rare corner case that
we run out of free segment, we should not panic kernel, instead,
let's handle such error correctly in its caller.

Signed-off-by: Chao Yu <chao@kernel.org>
Tested-by: Zhiguo Niu <zhiguo.niu@unisoc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2024-02-29 08:34:34 -08:00