Commit graph

394 commits

Author SHA1 Message Date
Kent Overstreet
94426e4201 bcachefs: opts.casefold_disabled
Add an option for completely disabling casefolding on a filesystem, as a
workaround for overlayfs.

This should only be needed as a temporary workaround, until the
overlayfs fix arrives.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-01 19:33:46 -04:00
Kent Overstreet
1cddad0fcb bcachefs: Call bch2_fs_init_rw() early if we'll be going rw
kthread creation checks for pending signals, which is _very_ annoying if
we have to do a long recovery and don't go rw until we've done
significant work.

Check if we'll be going rw and pre-allocate kthreads/workqueues.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16 19:04:48 -04:00
Kent Overstreet
b68baf9a87 bcachefs: Print devices we're mounting on multi device filesystems
Previously, we only ever logged the filesystem UUID.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11 23:24:21 -04:00
Alan Huang
0acb385ec1 bcachefs: Fix possible console lock involved deadlock
Link: https://lore.kernel.org/all/6822ab02.050a0220.f2294.00cb.GAE@google.com/T/
Reported-by: syzbot+2c3ef91c9523c3d1a25c@syzkaller.appspotmail.com
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11 23:21:30 -04:00
Kent Overstreet
af5b88618a bcachefs: Update /dev/disk/by-uuid on device add
Invalidate pagecache after we write the new superblock and send a
uevent.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11 23:21:30 -04:00
Kent Overstreet
b938d3c970 bcachefs: Fix bch2_fsck_rename_dirent() for casefold
bch2_fsck_renamed_dirent was creating bch_dirent keys open-coded - but
we need to use the appropriate helper, if the directory is casefolded.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-04 16:45:41 -04:00
Kent Overstreet
09b9c72bd4 bcachefs: bch_err_throw()
Add a tracepoint for any time we return an error and unwind.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02 12:16:35 -04:00
Kent Overstreet
18dad454cd bcachefs: Replace rcu_read_lock() with guards
The new guard(), scoped_guard() allow for more natural code.

Some of the uses with creative flow control have been left.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-01 00:03:12 -04:00
Kent Overstreet
686db67a8e bcachefs: Move unicode message to after the startup message
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27 00:03:45 -04:00
Kent Overstreet
9caea9208f bcachefs: Don't mount bs > ps without TRANSPARENT_HUGEPAGE
Large folios aren't supported without TRANSPARENT_HUGEPAGE

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23 22:00:07 -04:00
Kent Overstreet
15f969326e bcachefs: Improve bucket_bitmap code
Add some more helpers, and mismatches is now a superset of the empty
bitmap - simplifies most checks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:04 -04:00
Kent Overstreet
68708efcac bcachefs: struct bch_fs_recovery
bch_fs has gotten obnoxiously big, let's start organizing thins a bit
better.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:03 -04:00
Kent Overstreet
b42fac043f bcachefs: bch2_fs_emergency_read_only2()
More error message cleanup: instead of multiple printk()s per error, we
want to be building up a single error message in a printbuf, so that it
can be printed with indenting that shows grouping and avoid errors
getting interspersed or lost in the log.

This gets rid of most calls to bch2_fs_emergency_read_only(). We still
have calls to
 - bch2_fatal_error()
 - bch2_fs_fatal_error()
 - bch2_fs_fatal_err_on()

that need work.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:56 -04:00
Kent Overstreet
2842515575 bcachefs: Debug params are now static_keys
We'd like users to be able to debug without building custom kernels, so
this will help us get rid of CONFIG_BCACHEFS_DEBUG, at least for most
things.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:54 -04:00
Kent Overstreet
001c1d146f bcachefs: online_fsck_mutex -> run_recovery_passes_lock
Prep work for automatically running recovery passes asynchronously.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:53 -04:00
Kent Overstreet
13ffcbae86 bcachefs: "buckets with backpointer mismatches" now allocated on demand
More self healing work: we're going to be calling
check_bucket_backpointer_mismatch() at runtime, outside of fsck.

Then when we need to we'll kick off the full
check_extents_to_backpointers recovery pass.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:52 -04:00
Kent Overstreet
a8539ad8fa bcachefs: bcachefs_metadata_version_fast_device_removal
Fast device removal, that uses backpointers to find pointers to the
device being removed instead of a full metadata scan.

This requires BCH_SB_MEMBER_DELETED_UUID, which is an incompatible
change - hence the version number bump. We don't fully trust
backpointers, so we don't want to reuse device indexes until after a
fsck has verified that there aren't any pointers to removed devices.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:49 -04:00
Kent Overstreet
66e9a7f139 bcachefs: bch2_dev_remove_stripes() respects degraded flags
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:48 -04:00
Kent Overstreet
96fc7d8adb bcachefs: opts.rebalance_on_ac_only
Add an option for setting rebalance to only run when connected to mains
power.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:48 -04:00
Kent Overstreet
502222041c bcachefs: __bch2_fs_free() cleanup
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:47 -04:00
Kent Overstreet
15dbd0d814 bcachefs: snapshot delete progress indicator
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:40 -04:00
Kent Overstreet
8a6b883e78 bcachefs: Fix setting ca->name in device add
Device add doesn't get the devide index and attach to the filesystem
until after attaching the block device, and setting the device name from
the block device name - these needs some minor tweaks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:39 -04:00
Kent Overstreet
98e5e36d8c bcachefs: bch2_dev_add() can run on a non-started fs
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:37 -04:00
Kent Overstreet
a349868b5e bcachefs: bch2_fs_open() now takes a darray
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:37 -04:00
Kent Overstreet
0499a82b18 bcachefs: Async object debugging
Debugging infrastructure for async objs: this lets us easily create
fast_lists for various object types so they'll be visible in debugfs.

Add new object types to the BCH_ASYNC_OBJS_TYPES() enum, and drop a
pretty-printer wrapper in async_objs.c.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:29 -04:00
Kent Overstreet
cca2c0d224 bcachefs: bch_dev.io_ref -> enumerated_ref
Convert device IO refs to enumerated_refs, for easier debugging of
refcount issues.

Simple conversion: enumerate all users and convert to the new helpers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:28 -04:00
Kent Overstreet
c9b1d94a21 bcachefs: bch_fs.writes -> enumerated_refs
Drop the single-purpose write ref code in bcachefs.h, and convert to
enumarated refs.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:27 -04:00
Kent Overstreet
e14e06e91d bcachefs: __bch2_fs_read_write() no longer depends on io_ref
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:26 -04:00
Kent Overstreet
9fa4a8a3bd bcachefs: for_each_online_member_rcu()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:26 -04:00
Kent Overstreet
0ca375b177 bcachefs: BCH_MEMBER_RESIZE_ON_MOUNT
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:21 -04:00
Kent Overstreet
530112d88e bcachefs: BCH_FEATURE_small_image
We can't go RW if it's an image file that hasn't been resized.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:20 -04:00
Kent Overstreet
203852d9db bcachefs: BCH_FEATURE_no_alloc_info
If a filesystem is going to only be used read-only, and will be a
deployable image, we can strip out alloc info for a substantial
reduction in metadata size - around half, due to backpointers.

Alloc info will be regenerated on first read-write mount.

Remounting RW is disallowed for now, since we don't yet have
check_allocations running in RW mode.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:20 -04:00
Kent Overstreet
576493133f bcachefs: Print features on startup with -o verbose
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:20 -04:00
Kent Overstreet
ebf561b208 bcachefs: print_str_as_lines() -> print_str()
bch2_print_string_as_lines() is a low level helper that allows messages
longer than 1k to be printed without truncation.

But we should always be printing with the helpers that take a filesystem
object, if we're in fsck they direct output to the userspace process
controlling fsck instead of the dmesg log.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:18 -04:00
Kent Overstreet
c79eb06da4 bcachefs: Clean up option pre/post hooks, small fixes
The helpers are now:
- bch2_opt_hook_pre_set()
- bch2_opts_hooks_pre_set()
- bch2_opt_hook_post_set

Fix a bug where the filesystem discard option would incorrectly be
changed when setting the device option, and don't trigger rebalance
scans unnecessarily (when options aren't changing).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:16 -04:00
Kent Overstreet
c02e5b5728 bcachefs: Single device mode
Single device filesystems are now identified by the block device name,
not the UUID - and single device filesystems with the same UUID can be
mounted simultaneously, without any special options.

This allocates a new bit in the superblock, BCH_SB_MULTI_DEVICE, which
indicates whether a filesystem has ever been multi device.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:15 -04:00
Kent Overstreet
58c36e6710 bcachefs: Initialize c->name earlier on single dev filesystems
On single device filesystems, c->name contains the block device name,
not the UUID.

Initialize this earlier, so that single device mode can use it for
initializing sysfs/debugfs.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:15 -04:00
Kent Overstreet
ef8dd631f7 bcachefs: Improve opts.degraded
Kill 'opts.very_degraded', and make 'opts.degraded' a persistent option,
stored in the superblock.

It's now an enum, with available choices ask/yes/very/no.

"ask" mode will be handled by the mount helper, for prompting the user
(on a machine used interactively) for whether to do a degraded mount.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:12 -04:00
Kent Overstreet
d4d71b58e5 bcachefs: RO mounts now use less memory
Defer memory allocations only needed in RW mode until we actually go RW.

This is part of improved support for RO images.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:04 -04:00
Kent Overstreet
a17e985be9 bcachefs: Move various init code to _init_early()
_init_early() is for initialization that cannot fail, and often must
happen for teardown partway through initialization to work.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:02 -04:00
Kent Overstreet
31813dcf37 bcachefs: alphabetize init function calls
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:00 -04:00
Kent Overstreet
2767f4f258 bcachefs: btree_io_complete_wq -> btree_write_complete_wq
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:13:56 -04:00
Kent Overstreet
da18dabc37 bcachefs: Ensure superblock gets written when we go ERO
When we go emergency read-only, make sure we do a final write_super() to
persist counters and error counts - this can be critical for piecing
together what fsck was doing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-07 17:09:59 -04:00
Kent Overstreet
bdc32a10a2 bcachefs: Add missing utf8_unload()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-28 16:46:12 -04:00
Kent Overstreet
70c3d89f49 bcachefs: Emit unicode version message on startup
fstests expects this

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-28 16:46:12 -04:00
Kent Overstreet
caab547686 bcachefs: Print mount opts earlier
If we aren't mounting with the correct degraded option, it's helpful to
know that before we fail to mount degraded.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-24 19:09:52 -04:00
Kent Overstreet
387df33129 bcachefs: Start copygc, rebalance threads earlier
Previously, copygc and rebalance weren't started until the very end of
mounting, after all recvoery passes have finished.

But copygc really should be started earlier, since it may be needed for
allocations to make forward progress. Additionally, we've been seeing
occasional bug reports where starting the kthread fails due to a pending
signal - i.e. we're getting timed out by systemd (during a version
upgrade), but we're not seeing the signal until mount is about to
complete.

Additionally, we now have copygc/rebalance explicitly wait for
check_snapshots to complete (if being run); they require that for
snapshot_is_ancestor() in the data move path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-21 11:57:24 -04:00
Kent Overstreet
10e42b6f25 bcachefs: bch2_copygc_wakeup()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-20 20:01:48 -04:00
Kent Overstreet
417f01e726 bcachefs: Error ratelimiting is no longer only during fsck
We now more often do repair automatically, without the user invoking
fsck - and sometimes that can involve fixing lots of errors, so let's
avoid flooding the dmesg log.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-20 19:41:38 -04:00
Kent Overstreet
4c0d2c67ac bcachefs: Fix early startup error path
Don't set JOURNAL_running until we're also calling
journal_space_available() for the first time.

If JOURNAL_running is set, shutdown will write an empty journal entry -
but this will hit an assert in journal_entry_open() if we've never
called journal_space_available().

Reported-by: syzbot+53bb24d476ef8368a7f0@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-20 19:41:38 -04:00