linux

mirror of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-08-05 16:54:27 +00:00

Author	SHA1	Message	Date
Kent Overstreet	acb3b26e76	bcachefs: Move btree lock debugging to slowpath fn Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:58 -04:00
Kent Overstreet	e751c01a8e	bcachefs: Start using bpos.snapshot field This patch starts treating the bpos.snapshot field like part of the key in the btree code: * bpos_successor() and bpos_predecessor() now include the snapshot field * Keys in btrees that will be using snapshots (extents, inodes, dirents and xattrs) now always have their snapshot field set to U32_MAX The btree iterator code gets a new flag, BTREE_ITER_ALL_SNAPSHOTS, that determines whether we're iterating over keys in all snapshots or not - internally, this controlls whether bkey_(successor\|predecessor) increment/decrement the snapshot field, or only the higher bits of the key. We add a new member to struct btree_iter, iter->snapshot: when BTREE_ITER_ALL_SNAPSHOTS is not set, iter->pos.snapshot should always equal iter->snapshot, which will be 0 for btrees that don't use snapshots, and alsways U32_MAX for btrees that will use snapshots (until we enable snapshot creation). This patch also introduces a new metadata version number, and compat code for reading from/writing to older versions - this isn't a forced upgrade (yet). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:57 -04:00
Kent Overstreet	4cf91b0270	bcachefs: Split out bpos_cmp() and bkey_cmp() With snapshots, we're going to need to differentiate between comparisons that should and shouldn't include the snapshot field. bpos_cmp is now the comparison function that does include the snapshot field, used by core btree code. Upper level filesystem code generally does _not_ want to compare against the snapshot field - that code wants keys to compare as equal even when one of them is in an ancestor snapshot. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:57 -04:00
Kent Overstreet	43d002432d	bcachefs: Add a mechanism for running callbacks at trans commit time Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:57 -04:00
Kent Overstreet	f793fd85dc	bcachefs: Fix for bch2_trans_commit() unlocking when it's not supposed to When we pass BTREE_INSERT_NOUNLOCK bch2_trans_commit isn't supposed to unlock after a successful commit, but it was calling bch2_trans_cond_resched() - oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:57 -04:00
Kent Overstreet	a9d79c6e8b	bcachefs: Use pcpu mode of six locks for interior nodes Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:57 -04:00
Kent Overstreet	08070cba4a	bcachefs: Split btree_iter_traverse and bch2_btree_iter_traverse() External (to the btree iterator code) users of bch2_btree_iter_traverse expect that on success the iterator will be pointed at iter->pos and have that position locked - but since we split iter->pos and iter->real_pos, that means it has to update iter->real_pos if necessary. Internal users don't expect it to modify iter->real_pos, so we need two separate functions. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:57 -04:00
Kent Overstreet	bcad562259	bcachefs: Update iter->real_pos lazily peek() has to update iter->real_pos - there's no need for bch2_btree_iter_set_pos() to update it as well. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:57 -04:00
Kent Overstreet	818664f505	bcachefs: Consolidate bch2_btree_iter_peek() and peek_with_updates() Ideally we'll be getting rid of peek_with_updates(), but the callers will need to be checked. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:57 -04:00
Kent Overstreet	ca58cbd471	bcachefs: Improve iter->real_pos handling Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:57 -04:00
Kent Overstreet	3b0baf6f29	bcachefs: Internal btree iterator renaming This just gives some internal helpers some better names. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:57 -04:00
Kent Overstreet	07fc72e103	bcachefs: Kill btree_iter_peek_uptodate() Since we're no longer doing next() immediately followed by peek(), this optimization isn't doing anything anymore. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:57 -04:00
Kent Overstreet	5cde51cd48	bcachefs: Iterators are now always consistent with iter->real_pos This means bch2_btree_iter_traverse_one() can be made more efficient. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:57 -04:00
Kent Overstreet	345ca825e7	bcachefs: Have btree_iter_next_node() use btree_iter_set_search_pos() btree node iterators need to obey the regular btree node invarionts w.r.t. iter->real_pos; once they do, bch2_btree_iter_traverse will have less that it needs to check. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:56 -04:00
Kent Overstreet	e0ba3b6429	bcachefs: Replace bch2_btree_iter_next() calls with bch2_btree_iter_advance The way btree iterators work internally has been changing, particularly with the iter->real_pos changes, and bch2_btree_iter_next() is no longer hyper optimized - it's just advance followed by peek, so it's more efficient to just call advance where we're not using the return value of bch2_btree_iter_next(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:56 -04:00
Kent Overstreet	4ce41957a7	bcachefs: Optimize bch2_btree_iter_verify_level() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:56 -04:00
Kent Overstreet	5c1ec980f9	bcachefs: Fix iterator picking comparison was wrong Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:56 -04:00
Kent Overstreet	e9895f0ab9	bcachefs: Assert that iterators aren't being double freed Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:56 -04:00
Kent Overstreet	50dc0f692a	bcachefs: Require all btree iterators to be freed We keep running into occasional bugs with btree transaction iterators overflowing - this will make those bugs more visible. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:56 -04:00
Kent Overstreet	8d956c2fb8	bcachefs: btree_iter_set_dontneed() This is a bit clearer than using bch2_btree_iter_free(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:56 -04:00
Kent Overstreet	abcecb49f5	bcachefs: Fsck code refactoring Change fsck code to always put btree iterators - also, make some flow control improvements to deal with lock restarts better, and refactor check_extents() to not walk extents twice for counting/checking i_sectors. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:56 -04:00
Kent Overstreet	f2eaea2fc1	bcachefs: Kill btree_iter_pos_changed() this is used in only one place now, so just inline it into the caller. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:56 -04:00
Kent Overstreet	57447b7acc	bcachefs: Fix a btree iterator leak Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:56 -04:00
Kent Overstreet	c7bb769c81	bcachefs: __bch2_trans_get_iter() refactoring, BTREE_ITER_NOT_EXTENTS Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:55 -04:00
Kent Overstreet	a045be5a0e	bcachefs: Simplify bch2_btree_iter_peek_prev() Since we added iter->real_pos, btree_iter_set_pos_to_(next\|prev)_leaf no longer modify iter->pos, so we don't have to save it at the start anymore. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:55 -04:00
Kent Overstreet	61a19ce425	bcachefs: Fix bpos_diff() Previously, bpos_diff() did not handle borrows correctly. Minor thing considering how it was used, but worth fixing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:55 -04:00
Kent Overstreet	f020bfcdb0	bcachefs: Use bch2_bpos_to_text() more consistently Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:55 -04:00
Kent Overstreet	18fc6ae503	bcachefs: btree_iter_prev_slot() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:55 -04:00
Kent Overstreet	1f7fdc0abd	bcachefs: btree_iter_live() New helper to clean things up a bit - also, improve iter->flags handling. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:55 -04:00
Kent Overstreet	c052cf82f3	bcachefs: KEY_TYPE_discard is no longer used KEY_TYPE_discard used to be used for extent whiteouts, but when handling over overlapping extents was lifted above the core btree code it became unused. This patch updates various code to reflect that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:55 -04:00
Kent Overstreet	9620c3ec2f	bcachefs: Add a mempool for the replicas delta list Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:54 -04:00
Kent Overstreet	e131b6aa0a	bcachefs: Add a mempool for btree_trans bump allocator This allocation is required for filesystem operations to make forward progress, thus needs a mempool. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:54 -04:00
Kent Overstreet	8042b5b715	bcachefs: Extents may now cross btree node boundaries When snapshots arrive, we won't necessarily be able to arbitrarily split existis - when we need to split an existing extent, we'll have to check if the extent was overwritten in child snapshots and if so emit a whiteout for the split in the child snapshot. Because extents couldn't span btree nodes previously, journal replay would sometimes have to split existing extents. That's no good anymore, but fortunately since extent handling has already been lifted above most of the btree code there's no real need for that rule anymore. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:53 -04:00
Kent Overstreet	7e1a3aa9df	bcachefs: iter->real_pos We need to differentiate between the search position of a btree iterator, vs. what it actually points at (what we found). This matters for extents, where iter->pos will typically be the start of the key we found and iter->real_pos will be the end of the key we found (which soon won't necessarily be in the same btree node!) and it will also matter for snapshots. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:53 -04:00
Kent Overstreet	3d4955952f	bcachefs: Fix bch2_btree_iter_peek_prev() This makes bch2_btree_iter_peek_prev() and bch2_btree_iter_prev() consistent with peek() and next(), w.r.t. iter->pos. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:53 -04:00
Kent Overstreet	434094bec0	bcachefs: bch2_btree_iter_advance_pos() This adds a new common helper for advancing past the last key returned by peek(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:53 -04:00
Kent Overstreet	792e2c4c85	bcachefs: Kill bch2_btree_iter_set_pos_same_leaf() The only reason we were keeping this around was for BTREE_INSERT_NOUNLOCK semantics - if bch2_btree_iter_set_pos() advances to the next leaf node, it'll drop the lock on the node that we just inserted to. But we don't rely on BTREE_INSERT_NOUNLOCK semantics for the extents btree, just the inodes btree, and if we do need it for the extents btree in the future we can do it more cleanly by cloning the iterator - this lets us delete some special cases in the btree iterator code, which is complicated enough as it is. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:53 -04:00
Kent Overstreet	2b2c1a89ce	bcachefs: Simplify btree_iter_(next\|prev)_leaf() There's no good reason for these functions to not be using bch2_btree_iter_set_pos(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:53 -04:00
Kent Overstreet	edfbba58e3	bcachefs: Add btree node prefetching to bch2_btree_and_journal_walk() bch2_btree_and_journal_walk() walks the btree overlaying keys from the journal; it was introduced so that we could read in the alloc btree prior to journal replay being done, when journalling of updates to interior btree nodes was introduced. But it didn't have btree node prefetching, which introduced a severe regression with mount times, particularly on spinning rust. This patch implements btree node prefetching for the btree + journal walk, hopefully fixing that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:51 -04:00
Kent Overstreet	07a1006ae8	bcachefs: Reduce/kill BKEY_PADDED use With various newer key types - stripe keys, inline data extents - the old approach of calculating the maximum size of the value is becoming more and more error prone. Better to switch to bkey_on_stack, which can dynamically allocate if necessary to handle any size bkey. In particular we also want to get rid of BKEY_EXTENT_VAL_U64s_MAX. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:50 -04:00
Kent Overstreet	07bd4c285b	bcachefs: Fix btree lock being incorrectly dropped __btree_trans_get_iter() was using bch2_btree_iter_upgrade, but it shouldn't have been because on failure bch2_btree_iter_upgrade may drop locks in other iterators, expecting the transaction to be restarted. But __btree_trans_get_iter can't return an error to indicate that we need to restart thet transaction - oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:50 -04:00
Kent Overstreet	537c49d6af	bcachefs: Fix btree node merge -> split operations If a btree node merger is followed by a split or compact of the parent node, we could end up with the parent btree node iterator pointing to the whiteout inserted by the btree node merge operation - the fix is to ensure that interior btree node iterators always point to the first non whiteout. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:49 -04:00
Kent Overstreet	cc578a36f9	bcachefs: Fix __btree_iter_next() when all iters are in use_next() when all iters are in use Also, print out more information on btree transaction iterator overflow. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:49 -04:00
Kent Overstreet	a2bfc8412a	bcachefs: Try to print full btree error message Metadata corruption bugs are hard to debug if we can't see exactly what went wrong - try to allocate a bigger buffer so we can print out everything we have. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:49 -04:00
Kent Overstreet	3eb26d0157	bcachefs: bch2_trans_get_iter() no longer returns errors Since we now always preallocate the maximum number of iterators when we initialize a btree transaction, getting an iterator never fails - we can delete a fair amount of error path code. This patch also simplifies the iterator allocation code a bit. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:48 -04:00
Kent Overstreet	dbd1e8259a	bcachefs: Dont' use percpu btree_iter buf in userspace bcachefs-tools doesn't have a real percpu (per thread) implementation yet Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:47 -04:00
Kent Overstreet	0b5c9f5940	bcachefs: Set preallocated transaction mem to avoid restarts this will reduce transaction restarts, from observation of tracepoints. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:47 -04:00
Kent Overstreet	876c7af3a6	bcachefs: Take a SRCU lock in btree transactions Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:47 -04:00
Kent Overstreet	b3d1e6cab2	bcachefs: Fix build warning when CONFIG_BCACHEFS_DEBUG=n this function is only used by debug code, but we'd like to always build it so we know that it does build. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:46 -04:00
Kent Overstreet	7e7ae6ca57	bcachefs: Fix spurious transaction restarts The checks for lock ordering violations weren't quite right. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:46 -04:00

... 7 8 9 10 11 ...

559 commits