Commit graph

363 commits

Author SHA1 Message Date
Kent Overstreet
c37495fe35 bcachefs: Add missing snapshots_seen_add_inorder()
This fixes an infinite loop when repairing "extent past end of inode",
when the extent is an older snapshot than the inode that needs repair.

Without the snaphsots_seen_add_inorder() we keep trying to delete the
same extent, even though it's no longer visible in the inode's snapshot.

Fixes: 63d6e93119 ("bcachefs: bch2_fpunch_snapshot()")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-24 22:56:37 -04:00
Kent Overstreet
63d6e93119 bcachefs: bch2_fpunch_snapshot()
Add a new version of fpunch for operating on a snapshot ID, not a
subvolume - and use it for "extent past end of inode" repair.

Previously, repair would try to delete everything at once, but deleting
too many extents at once can overflow the btree_trans bump allocator, as
well as causing other problems - the new helper properly uses
bch2_extent_trim_atomic().

Reported-and-tested-by: Edoardo Codeglia <bcachefs@404.blue>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-04 15:45:22 -04:00
Kent Overstreet
94426e4201 bcachefs: opts.casefold_disabled
Add an option for completely disabling casefolding on a filesystem, as a
workaround for overlayfs.

This should only be needed as a temporary workaround, until the
overlayfs fix arrives.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-07-01 19:33:46 -04:00
Alan Huang
fbf913cb72 bcachefs: Fix incorrect transaction restart handling
Reported-by: syzbot+cc7567f096079cb4146f@syzkaller.appspotmail.com
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-30 17:28:55 -04:00
Kent Overstreet
1df310860a bcachefs: fsck: Fix oops in key_visible_in_snapshot()
The normal fsck code doesn't call key_visible_in_snapshot() with an
empty list of snapshot IDs seen (the current snapshot ID will always be
on the list), but str_hash_repair_key() ->
bch2_get_snapshot_overwrites() can, and that's totally fine as long as
we check for it.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-17 20:45:26 -04:00
Kent Overstreet
bbc3a0b17a bcachefs: fsck: Fix check_directory_structure when no check_dirents
check_directory_structure runs after check_dirents, so it expects that
it won't see any inodes with missing backpointers - normally.

But online fsck can't run check_dirents yet, or the user might only be
running a specific pass, so we need to be careful that this isn't an
error. If an inode is unreachable, that's handled by a separate pass.

Also, add a new 'bch2_inode_has_backpointer()' helper, since we were
doing this inconsistently.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-17 13:35:19 -04:00
Kent Overstreet
495ba899d5 bcachefs: fsck: Fix check_path_loop() + snapshots
A path exists in a particular snapshot: we should do the pathwalk in the
snapshot ID of the inode we started from, _not_ change snapshot ID as we
walk inodes and dirents.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16 19:05:02 -04:00
Kent Overstreet
583ba52a40 bcachefs: fsck: check_subdir_count logs path
We can easily go from inode number -> path now, which makes for more
useful log messages.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16 19:05:02 -04:00
Kent Overstreet
8d6ac82361 bcachefs: fsck: additional diagnostics for reattach_inode()
Log the inode's new path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16 19:05:02 -04:00
Kent Overstreet
3e5ceaa5bf bcachefs: fsck: check_directory_structure runs in reverse order
When we find a directory connectivity problem, we should do the repair
in the oldest snapshot that has the issue - so that we don't end up
duplicating work or making a real mess of things.

Oldest snapshot IDs have the highest integer value, so - just walk
inodes in reverse order.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16 19:05:02 -04:00
Kent Overstreet
9fb09ace59 bcachefs: fsck: Fix reattach_inode() for subvol roots
bch_subvolume.fs_path_parent needs to be updated as well, it should
match inode.bi_parent_subvol.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16 19:04:59 -04:00
Kent Overstreet
c1ca07a4dd bcachefs: fsck: Fix remove_backpointer() for subvol roots
The dirent will be in a different snapshot if the inode is a subvolume
root.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16 19:04:54 -04:00
Kent Overstreet
7029cc4d13 bcachefs: fsck: Print path when we find a subvol loop
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16 19:04:48 -04:00
Kent Overstreet
f2a701fd94 bcachefs: fsck: Improve check_key_has_inode()
Print out more info when we find a key (extent, dirent, xattr) for a
missing inode - was there a good inode in an older snapshot, full(ish)
list of keys for that missing inode, so we can make better decisions on
how to repair.

If it looks like it should've been deleted, autofix it. If we ever hit
the non-autofix cases, we'll want to write more repair code (possibly
reconstituting the inode).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16 19:04:44 -04:00
Kent Overstreet
191334400d bcachefs: fsck: fix extent past end of inode repair
Fix the case where we're deleting in a different snapshot and need to
emit a whiteout - that requires a regular BTREE_ITER_filter_snapshots
iterator.

Also, only delete the part of the extent that extents past i_size.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-15 22:11:56 -04:00
Kent Overstreet
b17d7bdb12 bcachefs: fsck: fix add_inode()
the inode btree uses the offset field for the inum, not the inode field.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-15 22:11:56 -04:00
Kent Overstreet
c27e5782d9 bcachefs: Fix snapshot_key_missing_inode_snapshot repair
When the inode was a whiteout, we were inserting a new whiteout at the
wrong (old) snapshot.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-15 22:11:56 -04:00
Kent Overstreet
2bf380c005 bcachefs: Fix dirent_casefold_mismatch repair
Instead of simply recreating a mis-casefolded dirent, use the str_hash
repair code, which will rename it if necessary - the dirent might have
been created again with the correct casefolding.

Factor out out bch2_str_hash_repair key() from
__bch2_str_hash_check_key() for the new path to use, and export
bch2_dirent_create_key() as well.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-04 16:45:41 -04:00
Kent Overstreet
b938d3c970 bcachefs: Fix bch2_fsck_rename_dirent() for casefold
bch2_fsck_renamed_dirent was creating bch_dirent keys open-coded - but
we need to use the appropriate helper, if the directory is casefolded.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-04 16:45:41 -04:00
Nathan Chancellor
01d925f7e1 bcachefs: Fix -Wc23-extensions in bch2_check_dirents()
Clang warns (or errors with CONFIG_WERROR=y):

  fs/bcachefs/fsck.c:2325:2: error: label followed by a declaration is a C23 extension [-Werror,-Wc23-extensions]
   2325 |         int ret = bch2_trans_run(c,
        |         ^

On clang-17 and older, this is an unconditional error:

  fs/bcachefs/fsck.c:2325:2: error: expected expression
   2325 |         int ret = bch2_trans_run(c,
        |         ^

Move the declaration of ret to the top of the function to resolve both
ways this issue manifests.

Fixes: c72def5237 ("bcachefs: Run check_dirents second time if required")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-04 16:45:38 -04:00
Kent Overstreet
c72def5237 bcachefs: Run check_dirents second time if required
If we move a key backwards, we'll need a second pass to run the rest of
the fsck checks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02 12:16:36 -04:00
Kent Overstreet
09b9c72bd4 bcachefs: bch_err_throw()
Add a tracepoint for any time we return an error and unwind.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02 12:16:35 -04:00
Kent Overstreet
36a2fdf7c5 bcachefs: Repair code for directory i_size
We had a bug due due to an incomplete revert of the patch implementing
directory i_size (summing up the size of the dirents), leading to
completely screwy i_size values that underflow.

Most userspace programs don't seem to care (e.g. du ignores it), but it
turns out this broke sshfs, so needs to be repaired.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02 12:16:35 -04:00
Kent Overstreet
165815c296 bcachefs: Convert BUG() to error
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02 12:16:35 -04:00
Kent Overstreet
5802caf74f bcachefs: darray_find(), darray_find_p()
New helpers to avoid open coded loops.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-31 22:03:17 -04:00
Kent Overstreet
a592268260 bcachefs: bch2_str_hash_check_key() may now be called without snapshots_seen
We don't track snapshot overwrites outside of fsck, so for this to be
called at runtime outside of fsck we need to create it on demand, when
we have repair to do.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-31 22:03:17 -04:00
Kent Overstreet
1cda5b88e6 bcachefs: Fix missing commit in check_dirents
Other repair code seems to be doing commits themselves, but
check_key_has_snapshot() does not.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27 00:02:44 -04:00
Kent Overstreet
016c4b48b8 bcachefs: Fix endianness in casefold check/repair
Fixes: 010c894681 ("bcachefs: Check for casefolded dirents in non casefolded dirs")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23 19:52:31 -04:00
Kent Overstreet
4ba99dde33 bcachefs: BCH_INODE_has_case_insensitive
Add a flag for tracking whether a directory has case-insensitive
descendents - so that overlayfs can disallow mounting, even though the
filesystem supports case insensitivity.

This is a new on disk format version, with a (cheap) upgrade to ensure
the flag is correctly set on existing inodes.

Create, rename and fssetxattr are all plumbed to ensure the new flag is
set, and we've got new fsck code that hooks into check_inode(0.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:10 -04:00
Kent Overstreet
77eac89c79 bcachefs: bch2_inode_find_by_inum_snapshot()
Move a fsck.c helper into inode.c, eliminate some duplicate and organize
the inode lookup helpers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:09 -04:00
Kent Overstreet
6b86da9282 bcachefs: fsck: Include loops in error messages
This fixes the subvol loop checking and directory loop checking to print
the loop.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:05 -04:00
Kent Overstreet
7ed4c14e20 bcachefs: Reduce usage of recovery.curr_pass
We want recovery.curr_pass to be private to the recovery passes code,
for better showing recovery pass status; also, it may rewind and is
generally not the correct member to use.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:03 -04:00
Kent Overstreet
ab35552030 bcachefs: __bch2_run_recovery_passes()
Consolidate bch2_run_recovery_passes() and
bch2_run_online_recovery_passes(), prep work for automatically
scheduling and running recovery passes in the background.

- Now takes a mask of which passes to run, automatic background repair
  will pass in sb.recovery_passes_required.

- Skips passes that are failing: a pass that failed may be reattempted
  after another pass succeeds (some passes depend on repair done by
  other passes for successful completion).

- bch2_recovery_passes_match() helper to skip alloc passes on a
  filesystem without alloc info.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:03 -04:00
Kent Overstreet
68708efcac bcachefs: struct bch_fs_recovery
bch_fs has gotten obnoxiously big, let's start organizing thins a bit
better.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:03 -04:00
Kent Overstreet
bde41d9a58 bcachefs: better error message for subvol_fs_path_parent_wrong
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:00 -04:00
Kent Overstreet
fdd0807f81 bcachefs: Improve bch2_repair_inode_hash_info()
Improve this so it can be used by fsck.c check_inode(); it provides a
much better error message than the check_inode() version.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:00 -04:00
Kent Overstreet
123d2d09ff bcachefs: bch2_inode_find_snapshot_root()
Factor out a small common helper.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:59 -04:00
Kent Overstreet
367cad0966 bcachefs: Rename fsck_running, recovery_running flags
Slightly more readable.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:56 -04:00
Kent Overstreet
001c1d146f bcachefs: online_fsck_mutex -> run_recovery_passes_lock
Prep work for automatically running recovery passes asynchronously.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:53 -04:00
Kent Overstreet
0afdf4969e bcachefs: BCH_FSCK_ERR_snapshot_key_missing_inode_snapshot
We're going to be doing some snapshot deletion performance improvements,
and those will strictly require that if an extent/dirent/xattr is
present, an inode is present in that snapshot ID.

We already check for this, but we don't repair it on disk: this patch
adds that repair and turns it into a real fsck_err().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:43 -04:00
Kent Overstreet
855070dc0b bcachefs: get_inodes_all_snapshots() now includes whiteouts
The next patch is going to change lookup_inode_for_snapshot to
rigorously require that a extent/dirent/xattr keys have a corresponding
inode key present - whiteouts included, so this simplifies the checks
lookup_inode_for_snapshot() will have to do.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:43 -04:00
Kent Overstreet
6f2bbd5747 bcachefs: kill inode_walker_entry.snapshot
redundant

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:41 -04:00
Kent Overstreet
a349868b5e bcachefs: bch2_fs_open() now takes a darray
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:37 -04:00
Kent Overstreet
ea27e8ca5d bcachefs: darray: provide typedefs for primitive types
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:13:47 -04:00
Kent Overstreet
010c894681 bcachefs: Check for casefolded dirents in non casefolded dirs
Check for mismatches between casefold dirents and casefold directories.

A mismatch will cause lookups to fail, as we'll be doing the lookup with
the casefolded name, which won't match the non-casefolded dirent, and
vice versa.

Reported-by: Christopher Snowhill <chris@kode54.net>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:13:14 -04:00
Kent Overstreet
ecd76c5f10 bcachefs: Fix bch2_dirent_create_snapshot() for casefolding
bch2_dirent_create_snapshot(), used in fsck, neglected to create a
casefolded dirent.

Just move this into dirent_create_key().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:13:13 -04:00
Kent Overstreet
9c09e59cc5 bcachefs: fix wrong arg to fsck_err()
fsck_err() needs the btree transaction passed to it if there is one - so
that it can unlock/relock around prompting userspace for fixing the
error.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-14 18:59:15 -04:00
Kent Overstreet
261592ba06 bcachefs: Fix snapshotting a subvolume, then renaming it
Subvolume roots and the dirents that point to them are special; they
don't obey the normal snapshot versioning rules because they cross
snapshot boundaries.

We don't keep around older versions of subvolume dirents on rename - we
don't need to, because subvolume dirents are only visible in the parent
subvolume, and we wouldn't be able to match up the different dirent and
inode versions due to crossing the snapshot ID boundary.

That means that when we rename a subvolume, that's been snapshotted, the
older version of the subvolume root will become dangling - it won't have
a dirent that points to it.

That's expected, we just need to tell fsck that this is ok.

Fixes: https://github.com/koverstreet/bcachefs/issues/856
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-17 14:17:16 -04:00
Kent Overstreet
9180ad2e16 bcachefs: Kill btree_iter.trans
This was planned to be done ages ago, now finally completed; there are
places where we have quite a few btree_trans objects on the stack, so
this reduces stack usage somewhat.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-02 10:24:34 -04:00
Kent Overstreet
7337f9f14e bcachefs: bch2_count_fsck_err()
Factor out a helper from __bch2_fsck_err(), for counting the error in
the superblock and deciding whether to print or ratelimit - will be used
to replace some log_fsck_err() calls, where we want to lift out printing
the error message.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-29 13:26:13 -04:00