mirror of
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2025-04-13 09:59:31 +00:00

-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZ90r2wAKCRCRxhvAZXjc
ouC6AQCk3MoqskN0WeNcaZT23dB7dHbEhf/7YXOFC9MFRMKXqQD9Fbn95+GuIe3U
nBVPbVyQfDtfXE08ml6gbDJrCsbkkQI=
=Xm1C
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.15-rc1.mount.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount namespace updates from Christian Brauner:
"This expands the ability of anonymous mount namespaces:
- Creating detached mounts from detached mounts
Currently, detached mounts can only be created from attached
mounts. This limitation prevents various use-cases. For example, the
ability to mount a subdirectory without ever having to make the
whole filesystem visible first.
The current permission model is:
(1) Check that the caller is privileged over the owning user
namespace of its current mount namespace.
(2) Check that the caller is located in the mount namespace of the
mount it wants to create a detached copy of.
While it is not strictly necessary to do it this way, it is
consistently applied in the new mount API. This model will also be
used when allowing the creation of a detached mount from another
detached mount.
The (1) requirement can simply be met by performing the same check
as for the non-detached case, i.e., verify that the caller is
privileged over its current mount namespace.
To meet the (2) requirement it must be possible to infer the origin
mount namespace that the anonymous mount namespace of the detached
mount was created from.
The origin mount namespace of an anonymous mount is the mount
namespace that the mounts that were copied into the anonymous mount
namespace originate from.
In order to check the origin mount namespace of an anonymous mount
namespace the sequence number of the original mount namespace is
recorded in the anonymous mount namespace.
With this in place it is possible to perform an equivalent check
(2') to (2). The origin mount namespace of the anonymous mount
namespace must be the same as the caller's mount namespace. To
establish this the sequence number of the caller's mount namespace
and the origin sequence number of the anonymous mount namespace are
compared.
The caller is always located in a non-anonymous mount namespace
since anonymous mount namespaces cannot be setns()ed into. The
caller's mount namespace will thus always have a valid sequence
number.
The owning namespace of any mount namespace, anonymous or
non-anonymous, can never change. A mount attached to a
non-anonymous mount namespace can never change mount namespace.
If the sequence number of the non-anonymous mount namespace and the
origin sequence number of the anonymous mount namespace match, the
owning namespaces must match as well.
Hence, the capability check on the owning namespace of the caller's
mount namespace ensures that the caller has the ability to copy the
mount tree.
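The checks (1), (2) and (2') described above can be sketched as a small user-space model. This is an illustration only, not the kernel's implementation: `struct mntns_model`, its fields, and `may_copy_tree_model()` are hypothetical names, and treating a sequence number of 0 as the marker for an anonymous namespace is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a mount namespace; not the kernel structure. */
struct mntns_model {
	uint64_t seq;		/* assumption: 0 marks an anonymous namespace */
	uint64_t seq_origin;	/* anonymous only: seq of the namespace copied from */
};

bool mntns_is_anon(const struct mntns_model *ns)
{
	return ns->seq == 0;
}

/*
 * Model of the permission check for creating a detached copy of a mount
 * that lives in @mnt_ns, requested by a caller located in @caller_ns.
 */
bool may_copy_tree_model(const struct mntns_model *caller_ns,
			 const struct mntns_model *mnt_ns,
			 bool caller_privileged)
{
	/* (1) the caller must be privileged over its current mount namespace */
	if (!caller_privileged)
		return false;

	/* callers are never located in an anonymous mount namespace */
	if (mntns_is_anon(caller_ns))
		return false;

	/* (2) attached mount: it must live in the caller's mount namespace */
	if (!mntns_is_anon(mnt_ns))
		return mnt_ns->seq == caller_ns->seq;

	/* (2') detached mount: its origin must be the caller's mount namespace */
	return mnt_ns->seq_origin == caller_ns->seq;
}
```

In this model a detached tree copied out of namespace 7 can only be copied again by a privileged caller that is itself located in namespace 7.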
- Allow mounting detached mounts onto detached mounts
Currently, detached mounts can only be mounted onto attached
mounts. This limitation makes it impossible to assemble a new
private rootfs and move it into place. Instead, a detached tree
must be created, attached, then mounted upon and then either moved
or detached again. Lift this restriction.
In order to allow mounting detached mounts onto other detached
mounts the same permission model used for creating detached mounts
from detached mounts can be used (cf. above).
Allowing to mount detached mounts onto detached mounts leaves three
cases to consider:
(1) The source mount is an attached mount and the target mount is
a detached mount. This would be equivalent to moving a mount
between different mount namespaces. A caller could move an
attached mount to a detached mount. The detached mount can now
be freely attached to any mount namespace. This changes the
current delegation model significantly for no good reason. So
this will fail.
(2) Anonymous mount namespaces are always attached fully, i.e., it
is not possible to only attach a subtree of an anonymous mount
namespace. This simplifies the implementation and reasoning.
Consequently, if the anonymous mount namespace of the source
detached mount and the target detached mount are identical
the mount request will fail.
(3) The source mount's anonymous mount namespace is different from
the target mount's anonymous mount namespace.
In this case the source anonymous mount namespace of the
source mount tree must be freed after its mounts have been
moved to the target anonymous mount namespace. The source
anonymous mount namespace must be empty afterwards.
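The three cases can be summarized in a small decision model. This is a sketch under stated assumptions, not kernel code: `struct ns_model`, `enum move_verdict`, and `move_onto_detached_model()` are hypothetical names, the target is assumed to already be a detached mount, and a namespace is reduced to a flag plus a mount count.

```c
#include <assert.h>
#include <stdbool.h>

enum move_verdict {
	MOVE_ALLOWED,			/* case (3) */
	MOVE_REJECT_ATTACHED_SOURCE,	/* case (1) */
	MOVE_REJECT_SAME_NAMESPACE,	/* case (2) */
};

/* Hypothetical, heavily simplified mount namespace. */
struct ns_model {
	bool anonymous;
	unsigned int nr_mounts;
};

/* Model of moving the whole tree of @src onto a detached mount in @dst. */
enum move_verdict move_onto_detached_model(struct ns_model *src,
					   struct ns_model *dst)
{
	/* (1) an attached source would cross mount namespaces: refuse */
	if (!src->anonymous)
		return MOVE_REJECT_ATTACHED_SOURCE;

	/* (2) anonymous namespaces are attached as a whole, never onto themselves */
	if (src == dst)
		return MOVE_REJECT_SAME_NAMESPACE;

	/* (3) all mounts move over; the source namespace is empty afterwards */
	dst->nr_mounts += src->nr_mounts;
	src->nr_mounts = 0;
	return MOVE_ALLOWED;
}
```

After a successful move the source namespace holds no mounts, which corresponds to the point in case (3) where it must be freed.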
By allowing to mount detached mounts onto detached mounts a caller
may do the following:
fd_tree1 = open_tree(-EBADF, "/mnt", OPEN_TREE_CLONE)
fd_tree2 = open_tree(-EBADF, "/tmp", OPEN_TREE_CLONE)
fd_tree1 and fd_tree2 refer to two different detached mount trees
that belong to two different anonymous mount namespaces.
It is important to note that fd_tree1 and fd_tree2 both refer to
the root of their respective anonymous mount namespaces.
By allowing to mount detached mounts onto detached mounts the
caller may now do:
move_mount(fd_tree1, "", fd_tree2, "",
MOVE_MOUNT_F_EMPTY_PATH | MOVE_MOUNT_T_EMPTY_PATH)
This will cause the detached mount referred to by fd_tree1 to be
mounted on top of the detached mount referred to by fd_tree2.
Thus, the detached mount fd_tree1 is moved from its separate
anonymous mount namespace into fd_tree2's anonymous mount
namespace.
It also means that while fd_tree2 continues to refer to the root of
its respective anonymous mount namespace fd_tree1 doesn't anymore.
This has the consequence that only fd_tree2 can be moved to another
anonymous or non-anonymous mount namespace. Moving fd_tree1 will
now fail as fd_tree1 doesn't refer to the root of an anonymous mount
namespace anymore.
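For reference, the open_tree()/move_mount() sequence above might look as follows in a user-space program. This is a minimal sketch, not taken from the series: it assumes a kernel that contains these changes, sufficient privileges over the caller's mount namespace, and kernel headers that define `SYS_open_tree` and `SYS_move_mount`. The wrapper names and `stack_detached_trees()` are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/mount.h>	/* OPEN_TREE_CLONE, MOVE_MOUNT_*_EMPTY_PATH */
#include <sys/syscall.h>	/* SYS_open_tree, SYS_move_mount */
#include <unistd.h>

/* Thin wrappers; a C library may not export these system calls. */
static int sys_open_tree(int dfd, const char *path, unsigned int flags)
{
	return (int)syscall(SYS_open_tree, dfd, path, flags);
}

static int sys_move_mount(int from_dfd, const char *from_path,
			  int to_dfd, const char *to_path, unsigned int flags)
{
	return (int)syscall(SYS_move_mount, from_dfd, from_path,
			    to_dfd, to_path, flags);
}

/*
 * Hypothetical helper: clone @top and @bottom as detached trees and mount
 * the clone of @top onto the clone of @bottom.
 * Returns 0 on success or a negative errno value on failure.
 */
int stack_detached_trees(const char *top, const char *bottom)
{
	int fd_tree1, fd_tree2, ret = 0;

	fd_tree1 = sys_open_tree(-EBADF, top, OPEN_TREE_CLONE);
	if (fd_tree1 < 0)
		return -errno;

	fd_tree2 = sys_open_tree(-EBADF, bottom, OPEN_TREE_CLONE);
	if (fd_tree2 < 0) {
		ret = -errno;
		close(fd_tree1);
		return ret;
	}

	if (sys_move_mount(fd_tree1, "", fd_tree2, "",
			   MOVE_MOUNT_F_EMPTY_PATH | MOVE_MOUNT_T_EMPTY_PATH) < 0)
		ret = -errno;

	/* Closing the descriptors releases the detached trees again. */
	close(fd_tree1);
	close(fd_tree2);
	return ret;
}
```

On kernels without this series, or without the required privileges, the helper is expected to fail and return a negative errno value.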
Now fd_tree1 and fd_tree2 refer to separate detached mount trees
referring to the same anonymous mount namespace.
This is conceptually fine. The new mount api does allow for this to
happen already via:
mount -t tmpfs tmpfs /mnt
mkdir -p /mnt/A
mount -t tmpfs tmpfs /mnt/A
fd_tree3 = open_tree(-EBADF, "/mnt", OPEN_TREE_CLONE | AT_RECURSIVE)
fd_tree4 = open_tree(-EBADF, "/mnt/A", 0)
Both fd_tree3 and fd_tree4 refer to two different detached mount
trees but both detached mount trees refer to the same anonymous
mount namespace. And as with fd_tree1 and fd_tree2, only fd_tree3
may be moved to another mount namespace as fd_tree3 refers to the root
of the anonymous mount namespace while fd_tree4 doesn't.
However, there's an important difference between the
fd_tree3/fd_tree4 and the fd_tree1/fd_tree2 example.
Closing fd_tree4 and releasing the respective struct file will have
no further effect on fd_tree3's detached mount tree.
However, closing fd_tree3 will cause the mount tree and the
respective anonymous mount namespace to be destroyed causing the
detached mount tree of fd_tree4 to be invalid for further mounting.
By allowing to mount detached mounts on detached mounts as in the
fd_tree1/fd_tree2 example both struct files will affect each other.
Both fd_tree1 and fd_tree2 refer to struct files that have
FMODE_NEED_UNMOUNT set.
To handle this we use the fact that @fd_tree1 will have a parent
mount once it has been attached to @fd_tree2.
When dissolve_on_fput() is called the mount that has been passed in
will refer to the root of the anonymous mount namespace. If it
doesn't it would mean that mounts are leaked. So before allowing to
mount detached mounts onto detached mounts this would be a bug.
Now that detached mounts can be mounted onto detached mounts it
just means that the mount has been attached to another anonymous
mount namespace and thus dissolve_on_fput() must not unmount the
mount tree or free the anonymous mount namespace as the file
referring to the root of the namespace hasn't been closed yet.
If it had already been closed it would be obvious because the mount
namespace would be NULL, i.e., the @fd_tree1 would have already
been unmounted. If @fd_tree1 hasn't been unmounted yet and has a
parent mount it is safe to skip any cleanup as closing @fd_tree2
will take care of all cleanup operations.
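The cleanup decision described above can be modelled in a few lines. This is a sketch of the reasoning only, not the kernel's dissolve_on_fput(): `struct mount_model`, `enum fput_action`, and `dissolve_on_fput_model()` are hypothetical names, and locking and reference counting are ignored.

```c
#include <assert.h>
#include <stdbool.h>

enum fput_action {
	FPUT_ALREADY_UNMOUNTED,	/* mount namespace is NULL: nothing left to do */
	FPUT_SKIP_HAS_PARENT,	/* attached to another detached tree: skip cleanup */
	FPUT_DISSOLVE,		/* root of its namespace: unmount tree, free namespace */
};

/* Hypothetical view of the mount behind a FMODE_NEED_UNMOUNT file. */
struct mount_model {
	bool has_ns;		/* false models a NULL mount namespace */
	bool has_parent;	/* true once mounted onto another mount */
};

enum fput_action dissolve_on_fput_model(const struct mount_model *mnt)
{
	if (!mnt->has_ns)
		return FPUT_ALREADY_UNMOUNTED;

	/* Closing the file that refers to the namespace root cleans up later. */
	if (mnt->has_parent)
		return FPUT_SKIP_HAS_PARENT;

	return FPUT_DISSOLVE;
}
```

In the fd_tree1/fd_tree2 example, closing fd_tree1 after the move corresponds to `FPUT_SKIP_HAS_PARENT`, and closing fd_tree2 corresponds to `FPUT_DISSOLVE`.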
- Allow mount propagation for detached mount trees
In commit ee2e3f5062 ("mount: fix mounting of detached mounts onto
targets that reside on shared mounts") I fixed a bug where
propagating the source mount tree of an anonymous mount namespace
into a target mount tree of a non-anonymous mount namespace could
be used to trigger an integer overflow in the non-anonymous mount
namespace causing any new mounts to fail.
The cause of this was that the propagation algorithm was unable to
recognize mounts from the source mount tree that were already
propagated into the target mount tree and then reappeared as
propagation targets when walking the destination propagation mount
tree.
When fixing this I disabled mount propagation into anonymous mount
namespaces. Make it possible for anonymous mount namespaces to
receive mount propagation events correctly. This is also a
correctness issue now that we allow mounting detached mount trees
onto detached mount trees.
Mark the source anonymous mount namespace with MNTNS_PROPAGATING
indicating that all mounts belonging to this mount namespace are
currently in the process of being propagated and make the
propagation algorithm discard those if they appear as propagation
targets"
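The marking idea in the last paragraph can be illustrated with a small model. It is a sketch only: the `_model` types, the flag value, and `count_propagation_targets()` are hypothetical and merely mirror the described behaviour, namely that mounts belonging to a namespace marked as propagating are discarded when they show up as propagation targets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MNTNS_PROPAGATING_MODEL 0x1u	/* hypothetical flag value */

struct mntns_flags_model {
	unsigned int flags;
};

struct prop_mount_model {
	const struct mntns_flags_model *ns;	/* namespace the mount belongs to */
};

/* True if the mount belongs to the namespace currently being propagated. */
bool is_mnt_propagated_model(const struct prop_mount_model *m)
{
	return (m->ns->flags & MNTNS_PROPAGATING_MODEL) != 0;
}

/* Count the candidates that remain valid propagation targets. */
size_t count_propagation_targets(const struct prop_mount_model *candidates,
				 size_t n)
{
	size_t i, targets = 0;

	for (i = 0; i < n; i++) {
		/* discard mounts that are themselves being propagated */
		if (is_mnt_propagated_model(&candidates[i]))
			continue;
		targets++;
	}
	return targets;
}
```

Because source mounts are skipped rather than revisited, a walk over the destination propagation tree cannot pick up already propagated copies as new targets.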
* tag 'vfs-6.15-rc1.mount.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (21 commits)
selftests: test subdirectory mounting
selftests: add test for detached mount tree propagation
fs: namespace: fix uninitialized variable use
mount: handle mount propagation for detached mount trees
fs: allow creating detached mounts from fsmount() file descriptors
selftests: seventh test for mounting detached mounts onto detached mounts
selftests: sixth test for mounting detached mounts onto detached mounts
selftests: fifth test for mounting detached mounts onto detached mounts
selftests: fourth test for mounting detached mounts onto detached mounts
selftests: third test for mounting detached mounts onto detached mounts
selftests: second test for mounting detached mounts onto detached mounts
selftests: first test for mounting detached mounts onto detached mounts
fs: mount detached mounts onto detached mounts
fs: support getname_maybe_null() in move_mount()
selftests: create detached mounts from detached mounts
fs: create detached mounts from detached mounts
fs: add may_copy_tree()
fs: add fastpath for dissolve_on_fput()
fs: add assert for move_mount()
fs: add mnt_ns_empty() helper
...
642 lines
16 KiB
C
// SPDX-License-Identifier: GPL-2.0-only
/*
 * linux/fs/pnode.c
 *
 * (C) Copyright IBM Corporation 2005.
 * Author : Ram Pai (linuxram@us.ibm.com)
 */
#include <linux/mnt_namespace.h>
#include <linux/mount.h>
#include <linux/fs.h>
#include <linux/nsproxy.h>
#include <uapi/linux/mount.h>
#include "internal.h"
#include "pnode.h"

/* return the next shared peer mount of @p */
static inline struct mount *next_peer(struct mount *p)
{
	return list_entry(p->mnt_share.next, struct mount, mnt_share);
}

static inline struct mount *first_slave(struct mount *p)
{
	return list_entry(p->mnt_slave_list.next, struct mount, mnt_slave);
}

static inline struct mount *last_slave(struct mount *p)
{
	return list_entry(p->mnt_slave_list.prev, struct mount, mnt_slave);
}

static inline struct mount *next_slave(struct mount *p)
{
	return list_entry(p->mnt_slave.next, struct mount, mnt_slave);
}

static struct mount *get_peer_under_root(struct mount *mnt,
					 struct mnt_namespace *ns,
					 const struct path *root)
{
	struct mount *m = mnt;

	do {
		/* Check the namespace first for optimization */
		if (m->mnt_ns == ns && is_path_reachable(m, m->mnt.mnt_root, root))
			return m;

		m = next_peer(m);
	} while (m != mnt);

	return NULL;
}

/*
 * Get ID of closest dominating peer group having a representative
 * under the given root.
 *
 * Caller must hold namespace_sem
 */
int get_dominating_id(struct mount *mnt, const struct path *root)
{
	struct mount *m;

	for (m = mnt->mnt_master; m != NULL; m = m->mnt_master) {
		struct mount *d = get_peer_under_root(m, mnt->mnt_ns, root);
		if (d)
			return d->mnt_group_id;
	}

	return 0;
}

static int do_make_slave(struct mount *mnt)
{
	struct mount *master, *slave_mnt;

	if (list_empty(&mnt->mnt_share)) {
		if (IS_MNT_SHARED(mnt)) {
			mnt_release_group_id(mnt);
			CLEAR_MNT_SHARED(mnt);
		}
		master = mnt->mnt_master;
		if (!master) {
			struct list_head *p = &mnt->mnt_slave_list;
			while (!list_empty(p)) {
				slave_mnt = list_first_entry(p,
						struct mount, mnt_slave);
				list_del_init(&slave_mnt->mnt_slave);
				slave_mnt->mnt_master = NULL;
			}
			return 0;
		}
	} else {
		struct mount *m;
		/*
		 * slave 'mnt' to a peer mount that has the
		 * same root dentry. If none is available then
		 * slave it to anything that is available.
		 */
		for (m = master = next_peer(mnt); m != mnt; m = next_peer(m)) {
			if (m->mnt.mnt_root == mnt->mnt.mnt_root) {
				master = m;
				break;
			}
		}
		list_del_init(&mnt->mnt_share);
		mnt->mnt_group_id = 0;
		CLEAR_MNT_SHARED(mnt);
	}
	list_for_each_entry(slave_mnt, &mnt->mnt_slave_list, mnt_slave)
		slave_mnt->mnt_master = master;
	list_move(&mnt->mnt_slave, &master->mnt_slave_list);
	list_splice(&mnt->mnt_slave_list, master->mnt_slave_list.prev);
	INIT_LIST_HEAD(&mnt->mnt_slave_list);
	mnt->mnt_master = master;
	return 0;
}

/*
 * vfsmount lock must be held for write
 */
void change_mnt_propagation(struct mount *mnt, int type)
{
	if (type == MS_SHARED) {
		set_mnt_shared(mnt);
		return;
	}
	do_make_slave(mnt);
	if (type != MS_SLAVE) {
		list_del_init(&mnt->mnt_slave);
		mnt->mnt_master = NULL;
		if (type == MS_UNBINDABLE)
			mnt->mnt.mnt_flags |= MNT_UNBINDABLE;
		else
			mnt->mnt.mnt_flags &= ~MNT_UNBINDABLE;
	}
}

/*
 * get the next mount in the propagation tree.
 * @m: the mount seen last
 * @origin: the original mount from where the tree walk initiated
 *
 * Note that peer groups form contiguous segments of slave lists.
 * We rely on that in get_source() to be able to find out if
 * vfsmount found while iterating with propagation_next() is
 * a peer of one we'd found earlier.
 */
static struct mount *propagation_next(struct mount *m,
				      struct mount *origin)
{
	/* are there any slaves of this mount? */
	if (!IS_MNT_PROPAGATED(m) && !list_empty(&m->mnt_slave_list))
		return first_slave(m);

	while (1) {
		struct mount *master = m->mnt_master;

		if (master == origin->mnt_master) {
			struct mount *next = next_peer(m);
			return (next == origin) ? NULL : next;
		} else if (m->mnt_slave.next != &master->mnt_slave_list)
			return next_slave(m);

		/* back at master */
		m = master;
	}
}

static struct mount *skip_propagation_subtree(struct mount *m,
					      struct mount *origin)
{
	/*
	 * Advance m such that propagation_next will not return
	 * the slaves of m.
	 */
	if (!IS_MNT_PROPAGATED(m) && !list_empty(&m->mnt_slave_list))
		m = last_slave(m);

	return m;
}

static struct mount *next_group(struct mount *m, struct mount *origin)
{
	while (1) {
		while (1) {
			struct mount *next;
			if (!IS_MNT_PROPAGATED(m) && !list_empty(&m->mnt_slave_list))
				return first_slave(m);
			next = next_peer(m);
			if (m->mnt_group_id == origin->mnt_group_id) {
				if (next == origin)
					return NULL;
			} else if (m->mnt_slave.next != &next->mnt_slave)
				break;
			m = next;
		}
		/* m is the last peer */
		while (1) {
			struct mount *master = m->mnt_master;
			if (m->mnt_slave.next != &master->mnt_slave_list)
				return next_slave(m);
			m = next_peer(master);
			if (master->mnt_group_id == origin->mnt_group_id)
				break;
			if (master->mnt_slave.next == &m->mnt_slave)
				break;
			m = master;
		}
		if (m == origin)
			return NULL;
	}
}

/* all accesses are serialized by namespace_sem */
static struct mount *last_dest, *first_source, *last_source, *dest_master;
static struct hlist_head *list;

static inline bool peers(const struct mount *m1, const struct mount *m2)
{
	return m1->mnt_group_id == m2->mnt_group_id && m1->mnt_group_id;
}

static int propagate_one(struct mount *m, struct mountpoint *dest_mp)
{
	struct mount *child;
	int type;
	/* skip ones added by this propagate_mnt() */
	if (IS_MNT_PROPAGATED(m))
		return 0;
	/* skip if mountpoint isn't covered by it */
	if (!is_subdir(dest_mp->m_dentry, m->mnt.mnt_root))
		return 0;
	if (peers(m, last_dest)) {
		type = CL_MAKE_SHARED;
	} else {
		struct mount *n, *p;
		bool done;
		for (n = m; ; n = p) {
			p = n->mnt_master;
			if (p == dest_master || IS_MNT_MARKED(p))
				break;
		}
		do {
			struct mount *parent = last_source->mnt_parent;
			if (peers(last_source, first_source))
				break;
			done = parent->mnt_master == p;
			if (done && peers(n, parent))
				break;
			last_source = last_source->mnt_master;
		} while (!done);

		type = CL_SLAVE;
		/* beginning of peer group among the slaves? */
		if (IS_MNT_SHARED(m))
			type |= CL_MAKE_SHARED;
	}

	child = copy_tree(last_source, last_source->mnt.mnt_root, type);
	if (IS_ERR(child))
		return PTR_ERR(child);
	read_seqlock_excl(&mount_lock);
	mnt_set_mountpoint(m, dest_mp, child);
	if (m->mnt_master != dest_master)
		SET_MNT_MARK(m->mnt_master);
	read_sequnlock_excl(&mount_lock);
	last_dest = m;
	last_source = child;
	hlist_add_head(&child->mnt_hash, list);
	return count_mounts(m->mnt_ns, child);
}

/*
 * mount 'source_mnt' under the destination 'dest_mnt' at
 * dentry 'dest_dentry'. And propagate that mount to
 * all the peer and slave mounts of 'dest_mnt'.
 * Link all the new mounts into a propagation tree headed at
 * source_mnt. Also link all the new mounts using ->mnt_list
 * headed at source_mnt's ->mnt_list
 *
 * @dest_mnt: destination mount.
 * @dest_dentry: destination dentry.
 * @source_mnt: source mount.
 * @tree_list : list of heads of trees to be attached.
 */
int propagate_mnt(struct mount *dest_mnt, struct mountpoint *dest_mp,
		  struct mount *source_mnt, struct hlist_head *tree_list)
{
	struct mount *m, *n;
	int ret = 0;

	/*
	 * we don't want to bother passing tons of arguments to
	 * propagate_one(); everything is serialized by namespace_sem,
	 * so globals will do just fine.
	 */
	last_dest = dest_mnt;
	first_source = source_mnt;
	last_source = source_mnt;
	list = tree_list;
	dest_master = dest_mnt->mnt_master;

	/* all peers of dest_mnt, except dest_mnt itself */
	for (n = next_peer(dest_mnt); n != dest_mnt; n = next_peer(n)) {
		ret = propagate_one(n, dest_mp);
		if (ret)
			goto out;
	}

	/* all slave groups */
	for (m = next_group(dest_mnt, dest_mnt); m;
	     m = next_group(m, dest_mnt)) {
		/* everything in that slave group */
		n = m;
		do {
			ret = propagate_one(n, dest_mp);
			if (ret)
				goto out;
			n = next_peer(n);
		} while (n != m);
	}
out:
	read_seqlock_excl(&mount_lock);
	hlist_for_each_entry(n, tree_list, mnt_hash) {
		m = n->mnt_parent;
		if (m->mnt_master != dest_mnt->mnt_master)
			CLEAR_MNT_MARK(m->mnt_master);
	}
	read_sequnlock_excl(&mount_lock);
	return ret;
}

static struct mount *find_topper(struct mount *mnt)
{
	/* If there is exactly one mount covering mnt completely return it. */
	struct mount *child;

	if (!list_is_singular(&mnt->mnt_mounts))
		return NULL;

	child = list_first_entry(&mnt->mnt_mounts, struct mount, mnt_child);
	if (child->mnt_mountpoint != mnt->mnt.mnt_root)
		return NULL;

	return child;
}

/*
 * return true if the refcount is greater than count
 */
static inline int do_refcount_check(struct mount *mnt, int count)
{
	return mnt_get_count(mnt) > count;
}

/**
 * propagation_would_overmount - check whether propagation from @from
 *                               would overmount @to
 * @from: shared mount
 * @to:   mount to check
 * @mp:   future mountpoint of @to on @from
 *
 * If @from propagates mounts to @to, @from and @to must either be peers
 * or one of the masters in the hierarchy of masters of @to must be a
 * peer of @from.
 *
 * If the root of the @to mount is equal to the future mountpoint @mp of
 * the @to mount on @from then @to will be overmounted by whatever is
 * propagated to it.
 *
 * Context: This function expects namespace_lock() to be held and that
 *          @mp is stable.
 * Return: If @from overmounts @to, true is returned, false if not.
 */
bool propagation_would_overmount(const struct mount *from,
				 const struct mount *to,
				 const struct mountpoint *mp)
{
	if (!IS_MNT_SHARED(from))
		return false;

	if (IS_MNT_PROPAGATED(to))
		return false;

	if (to->mnt.mnt_root != mp->m_dentry)
		return false;

	for (const struct mount *m = to; m; m = m->mnt_master) {
		if (peers(from, m))
			return true;
	}

	return false;
}

/*
 * check if the mount 'mnt' can be unmounted successfully.
 * @mnt: the mount to be checked for unmount
 * NOTE: unmounting 'mnt' would naturally propagate to all
 * other mounts its parent propagates to.
 * Check if any of these mounts that **do not have submounts**
 * have more references than 'refcnt'. If so return busy.
 *
 * vfsmount lock must be held for write
 */
int propagate_mount_busy(struct mount *mnt, int refcnt)
{
	struct mount *m, *child, *topper;
	struct mount *parent = mnt->mnt_parent;

	if (mnt == parent)
		return do_refcount_check(mnt, refcnt);

	/*
	 * quickly check if the current mount can be unmounted.
	 * If not, we don't have to go checking for all other
	 * mounts
	 */
	if (!list_empty(&mnt->mnt_mounts) || do_refcount_check(mnt, refcnt))
		return 1;

	for (m = propagation_next(parent, parent); m;
	     m = propagation_next(m, parent)) {
		int count = 1;
		child = __lookup_mnt(&m->mnt, mnt->mnt_mountpoint);
		if (!child)
			continue;

		/* Is there exactly one mount on the child that covers
		 * it completely whose reference should be ignored?
		 */
		topper = find_topper(child);
		if (topper)
			count += 1;
		else if (!list_empty(&child->mnt_mounts))
			continue;

		if (do_refcount_check(child, count))
			return 1;
	}
	return 0;
}

/*
 * Clear MNT_LOCKED when it can be shown to be safe.
 *
 * mount_lock lock must be held for write
 */
void propagate_mount_unlock(struct mount *mnt)
{
	struct mount *parent = mnt->mnt_parent;
	struct mount *m, *child;

	BUG_ON(parent == mnt);

	for (m = propagation_next(parent, parent); m;
	     m = propagation_next(m, parent)) {
		child = __lookup_mnt(&m->mnt, mnt->mnt_mountpoint);
		if (child)
			child->mnt.mnt_flags &= ~MNT_LOCKED;
	}
}

static void umount_one(struct mount *mnt, struct list_head *to_umount)
{
	CLEAR_MNT_MARK(mnt);
	mnt->mnt.mnt_flags |= MNT_UMOUNT;
	list_del_init(&mnt->mnt_child);
	list_del_init(&mnt->mnt_umounting);
	move_from_ns(mnt, to_umount);
}

/*
 * NOTE: unmounting 'mnt' naturally propagates to all other mounts its
 * parent propagates to.
 */
static bool __propagate_umount(struct mount *mnt,
			       struct list_head *to_umount,
			       struct list_head *to_restore)
{
	bool progress = false;
	struct mount *child;

	/*
	 * The state of the parent won't change if this mount is
	 * already unmounted or marked as without children.
	 */
	if (mnt->mnt.mnt_flags & (MNT_UMOUNT | MNT_MARKED))
		goto out;

	/* Verify topper is the only grandchild that has not been
	 * speculatively unmounted.
	 */
	list_for_each_entry(child, &mnt->mnt_mounts, mnt_child) {
		if (child->mnt_mountpoint == mnt->mnt.mnt_root)
			continue;
		if (!list_empty(&child->mnt_umounting) && IS_MNT_MARKED(child))
			continue;
		/* Found a mounted child */
		goto children;
	}

	/* Mark mounts that can be unmounted if not locked */
	SET_MNT_MARK(mnt);
	progress = true;

	/* If a mount is without children and not locked umount it. */
	if (!IS_MNT_LOCKED(mnt)) {
		umount_one(mnt, to_umount);
	} else {
children:
		list_move_tail(&mnt->mnt_umounting, to_restore);
	}
out:
	return progress;
}

static void umount_list(struct list_head *to_umount,
			struct list_head *to_restore)
{
	struct mount *mnt, *child, *tmp;
	list_for_each_entry(mnt, to_umount, mnt_list) {
		list_for_each_entry_safe(child, tmp, &mnt->mnt_mounts, mnt_child) {
			/* topper? */
			if (child->mnt_mountpoint == mnt->mnt.mnt_root)
				list_move_tail(&child->mnt_umounting, to_restore);
			else
				umount_one(child, to_umount);
		}
	}
}

static void restore_mounts(struct list_head *to_restore)
{
	/* Restore mounts to a clean working state */
	while (!list_empty(to_restore)) {
		struct mount *mnt, *parent;
		struct mountpoint *mp;

		mnt = list_first_entry(to_restore, struct mount, mnt_umounting);
		CLEAR_MNT_MARK(mnt);
		list_del_init(&mnt->mnt_umounting);

		/* Should this mount be reparented? */
		mp = mnt->mnt_mp;
		parent = mnt->mnt_parent;
		while (parent->mnt.mnt_flags & MNT_UMOUNT) {
			mp = parent->mnt_mp;
			parent = parent->mnt_parent;
		}
		if (parent != mnt->mnt_parent) {
			mnt_change_mountpoint(parent, mp, mnt);
			mnt_notify_add(mnt);
		}
	}
}

static void cleanup_umount_visitations(struct list_head *visited)
{
	while (!list_empty(visited)) {
		struct mount *mnt =
			list_first_entry(visited, struct mount, mnt_umounting);
		list_del_init(&mnt->mnt_umounting);
	}
}

/*
 * collect all mounts that receive propagation from the mount in @list,
 * and return these additional mounts in the same list.
 * @list: the list of mounts to be unmounted.
 *
 * vfsmount lock must be held for write
 */
int propagate_umount(struct list_head *list)
{
	struct mount *mnt;
	LIST_HEAD(to_restore);
	LIST_HEAD(to_umount);
	LIST_HEAD(visited);

	/* Find candidates for unmounting */
	list_for_each_entry_reverse(mnt, list, mnt_list) {
		struct mount *parent = mnt->mnt_parent;
		struct mount *m;

		/*
		 * If this mount has already been visited it is known that its
		 * entire peer group and all of their slaves in the propagation
		 * tree for the mountpoint has already been visited and there is
		 * no need to visit them again.
		 */
		if (!list_empty(&mnt->mnt_umounting))
			continue;

		list_add_tail(&mnt->mnt_umounting, &visited);
		for (m = propagation_next(parent, parent); m;
		     m = propagation_next(m, parent)) {
			struct mount *child = __lookup_mnt(&m->mnt,
							   mnt->mnt_mountpoint);
			if (!child)
				continue;

			if (!list_empty(&child->mnt_umounting)) {
				/*
				 * If the child has already been visited it is
				 * known that its entire peer group and all of
				 * their slaves in the propagation tree for the
				 * mountpoint has already been visited and there
				 * is no need to visit this subtree again.
				 */
				m = skip_propagation_subtree(m, parent);
				continue;
			} else if (child->mnt.mnt_flags & MNT_UMOUNT) {
				/*
				 * We have come across a partially unmounted
				 * mount in a list that has not been visited
				 * yet. Remember it has been visited and
				 * continue about our merry way.
				 */
				list_add_tail(&child->mnt_umounting, &visited);
				continue;
			}

			/* Check the child and parents while progress is made */
			while (__propagate_umount(child,
						  &to_umount, &to_restore)) {
				/* Is the parent a umount candidate? */
				child = child->mnt_parent;
				if (list_empty(&child->mnt_umounting))
					break;
			}
		}
	}

	umount_list(&to_umount, &to_restore);
	restore_mounts(&to_restore);
	cleanup_umount_visitations(&visited);
	list_splice_tail(&to_umount, list);

	return 0;
}