2005-04-16 15:20:36 -07:00
|
|
|
/*
|
2007-04-26 15:55:03 -07:00
|
|
|
* Copyright (c) 2002, 2007 Red Hat, Inc. All rights reserved.
|
2005-04-16 15:20:36 -07:00
|
|
|
*
|
|
|
|
* This software may be freely redistributed under the terms of the
|
|
|
|
* GNU General Public License.
|
|
|
|
*
|
|
|
|
* You should have received a copy of the GNU General Public License
|
|
|
|
* along with this program; if not, write to the Free Software
|
|
|
|
* Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
|
|
|
|
*
|
2008-06-05 22:46:18 -07:00
|
|
|
* Authors: David Woodhouse <dwmw2@infradead.org>
|
2005-04-16 15:20:36 -07:00
|
|
|
* David Howells <dhowells@redhat.com>
|
|
|
|
*
|
|
|
|
*/
|
|
|
|
|
|
|
|
#include <linux/kernel.h>
|
|
|
|
#include <linux/module.h>
|
|
|
|
#include <linux/init.h>
|
2007-04-26 15:55:03 -07:00
|
|
|
#include <linux/circ_buf.h>
|
Detach sched.h from mm.h
First thing mm.h does is including sched.h solely for can_do_mlock() inline
function which has "current" dereference inside. By dealing with can_do_mlock()
mm.h can be detached from sched.h which is good. See below, why.
This patch
a) removes unconditional inclusion of sched.h from mm.h
b) makes can_do_mlock() normal function in mm/mlock.c
c) exports can_do_mlock() to not break compilation
d) adds sched.h inclusions back to files that were getting it indirectly.
e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were
getting them indirectly
Net result is:
a) mm.h users would get less code to open, read, preprocess, parse, ... if
they don't need sched.h
b) sched.h stops being dependency for significant number of files:
on x86_64 allmodconfig touching sched.h results in recompile of 4083 files,
after patch it's only 3744 (-8.3%).
Cross-compile tested on
all arm defconfigs, all mips defconfigs, all powerpc defconfigs,
alpha alpha-up
arm
i386 i386-up i386-defconfig i386-allnoconfig
ia64 ia64-up
m68k
mips
parisc parisc-up
powerpc powerpc-up
s390 s390-up
sparc sparc-up
sparc64 sparc64-up
um-x86_64
x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig
as well as my two usual configs.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-05-21 01:22:52 +04:00
|
|
|
#include <linux/sched.h>
|
2005-04-16 15:20:36 -07:00
|
|
|
#include "internal.h"
|
2007-04-26 15:55:03 -07:00
|
|
|
|
afs: Fix mmap coherency vs 3rd-party changes
Fix the coherency management of mmap'd data such that 3rd-party changes
become visible as soon as possible after the callback notification is
delivered by the fileserver. This is done by the following means:
(1) When we break a callback on a vnode specified by the CB.CallBack call
from the server, we queue a work item (vnode->cb_work) to go and
clobber all the PTEs mapping to that inode.
This causes the CPU to trip through the ->map_pages() and
->page_mkwrite() handlers if userspace attempts to access the page(s)
again.
(Ideally, this would be done in the service handler for CB.CallBack,
but the server is waiting for our reply before considering, and we
have a list of vnodes, all of which need breaking - and the process of
getting the mmap_lock and stripping the PTEs on all CPUs could be
quite slow.)
(2) Call afs_validate() from the ->map_pages() handler to check to see if
the file has changed and to get a new callback promise from the
server.
Also handle the fileserver telling us that it's dropping all callbacks,
possibly after it's been restarted by sending us a CB.InitCallBackState*
call by the following means:
(3) Maintain a per-cell list of afs files that are currently mmap'd
(cell->fs_open_mmaps).
(4) Add a work item to each server that is invoked if there are any open
mmaps when CB.InitCallBackState happens. This work item goes through
the aforementioned list and invokes the vnode->cb_work work item for
each one that is currently using this server.
This causes the PTEs to be cleared, causing ->map_pages() or
->page_mkwrite() to be called again, thereby calling afs_validate()
again.
I've chosen to simply strip the PTEs at the point of notification reception
rather than invalidate all the pages as well because (a) it's faster, (b)
we may get a notification for other reasons than the data being altered (in
which case we don't want to clobber the pagecache) and (c) we need to ask
the server to find out - and I don't want to wait for the reply before
holding up userspace.
This was tested using the attached test program:
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
int main(int argc, char *argv[])
{
size_t size = getpagesize();
unsigned char *p;
bool mod = (argc == 3);
int fd;
if (argc != 2 && argc != 3) {
fprintf(stderr, "Format: %s <file> [mod]\n", argv[0]);
exit(2);
}
fd = open(argv[1], mod ? O_RDWR : O_RDONLY);
if (fd < 0) {
perror(argv[1]);
exit(1);
}
p = mmap(NULL, size, mod ? PROT_READ|PROT_WRITE : PROT_READ,
MAP_SHARED, fd, 0);
if (p == MAP_FAILED) {
perror("mmap");
exit(1);
}
for (;;) {
if (mod) {
p[0]++;
msync(p, size, MS_ASYNC);
fsync(fd);
}
printf("%02x", p[0]);
fflush(stdout);
sleep(1);
}
}
It runs in two modes: in one mode, it mmaps a file, then sits in a loop
reading the first byte, printing it and sleeping for a second; in the
second mode it mmaps a file, then sits in a loop incrementing the first
byte and flushing, then printing and sleeping.
Two instances of this program can be run on different machines, one doing
the reading and one doing the writing. The reader should see the changes
made by the writer, but without this patch, they aren't because validity
checking is being done lazily - only on entry to the filesystem.
Testing the InitCallBackState change is more complicated. The server has
to be taken offline, the saved callback state file removed and then the
server restarted whilst the reading-mode program continues to run. The
client machine then has to poke the server to trigger the InitCallBackState
call.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/163111668833.283156.382633263709075739.stgit@warthog.procyon.org.uk/
2021-09-02 16:43:10 +01:00
|
|
|
/*
|
|
|
|
* Handle invalidation of an mmap'd file. We invalidate all the PTEs referring
|
|
|
|
* to the pages in this file's pagecache, forcing the kernel to go through
|
|
|
|
* ->fault() or ->page_mkwrite() - at which point we can handle invalidation
|
|
|
|
* more fully.
|
|
|
|
*/
|
|
|
|
void afs_invalidate_mmap_work(struct work_struct *work)
|
|
|
|
{
|
|
|
|
struct afs_vnode *vnode = container_of(work, struct afs_vnode, cb_work);
|
|
|
|
|
netfs: Fix gcc-12 warning by embedding vfs inode in netfs_i_context
While randstruct was satisfied with using an open-coded "void *" offset
cast for the netfs_i_context <-> inode casting, __builtin_object_size() as
used by FORTIFY_SOURCE was not as easily fooled. This was causing the
following complaint[1] from gcc v12:
In file included from include/linux/string.h:253,
from include/linux/ceph/ceph_debug.h:7,
from fs/ceph/inode.c:2:
In function 'fortify_memset_chk',
inlined from 'netfs_i_context_init' at include/linux/netfs.h:326:2,
inlined from 'ceph_alloc_inode' at fs/ceph/inode.c:463:2:
include/linux/fortify-string.h:242:25: warning: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Wattribute-warning]
242 | __write_overflow_field(p_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix this by embedding a struct inode into struct netfs_i_context (which
should perhaps be renamed to struct netfs_inode). The struct inode
vfs_inode fields are then removed from the 9p, afs, ceph and cifs inode
structs and vfs_inode is then simply changed to "netfs.inode" in those
filesystems.
Further, rename netfs_i_context to netfs_inode, get rid of the
netfs_inode() function that converted a netfs_i_context pointer to an
inode pointer (that can now be done with &ctx->inode) and rename the
netfs_i_context() function to netfs_inode() (which is now a wrapper
around container_of()).
Most of the changes were done with:
perl -p -i -e 's/vfs_inode/netfs.inode/'g \
`git grep -l 'vfs_inode' -- fs/{9p,afs,ceph,cifs}/*.[ch]`
Kees suggested doing it with a pair structure[2] and a special
declarator to insert that into the network filesystem's inode
wrapper[3], but I think it's cleaner to embed it - and then it doesn't
matter if struct randomisation reorders things.
Dave Chinner suggested using a filesystem-specific VFS_I() function in
each filesystem to convert that filesystem's own inode wrapper struct
into the VFS inode struct[4].
Version #2:
- Fix a couple of missed name changes due to a disabled cifs option.
- Rename nfs_i_context to nfs_inode
- Use "netfs" instead of "nic" as the member name in per-fs inode wrapper
structs.
[ This also undoes commit 507160f46c55 ("netfs: gcc-12: temporarily
disable '-Wattribute-warning' for now") that is no longer needed ]
Fixes: bc899ee1c898 ("netfs: Add a netfs inode context")
Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
cc: Jonathan Corbet <corbet@lwn.net>
cc: Eric Van Hensbergen <ericvh@gmail.com>
cc: Latchesar Ionkov <lucho@ionkov.net>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Christian Schoenebeck <linux_oss@crudebyte.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Steve French <smfrench@gmail.com>
cc: William Kucharski <william.kucharski@oracle.com>
cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
cc: Dave Chinner <david@fromorbit.com>
cc: linux-doc@vger.kernel.org
cc: v9fs-developer@lists.sourceforge.net
cc: linux-afs@lists.infradead.org
cc: ceph-devel@vger.kernel.org
cc: linux-cifs@vger.kernel.org
cc: samba-technical@lists.samba.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-hardening@vger.kernel.org
Link: https://lore.kernel.org/r/d2ad3a3d7bdd794c6efb562d2f2b655fb67756b9.camel@kernel.org/ [1]
Link: https://lore.kernel.org/r/20220517210230.864239-1-keescook@chromium.org/ [2]
Link: https://lore.kernel.org/r/20220518202212.2322058-1-keescook@chromium.org/ [3]
Link: https://lore.kernel.org/r/20220524101205.GI2306852@dread.disaster.area/ [4]
Link: https://lore.kernel.org/r/165296786831.3591209.12111293034669289733.stgit@warthog.procyon.org.uk/ # v1
Link: https://lore.kernel.org/r/165305805651.4094995.7763502506786714216.stgit@warthog.procyon.org.uk # v2
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-09 21:46:04 +01:00
|
|
|
unmap_mapping_pages(vnode->netfs.inode.i_mapping, 0, 0, false);
|
afs: Fix mmap coherency vs 3rd-party changes
Fix the coherency management of mmap'd data such that 3rd-party changes
become visible as soon as possible after the callback notification is
delivered by the fileserver. This is done by the following means:
(1) When we break a callback on a vnode specified by the CB.CallBack call
from the server, we queue a work item (vnode->cb_work) to go and
clobber all the PTEs mapping to that inode.
This causes the CPU to trip through the ->map_pages() and
->page_mkwrite() handlers if userspace attempts to access the page(s)
again.
(Ideally, this would be done in the service handler for CB.CallBack,
but the server is waiting for our reply before considering, and we
have a list of vnodes, all of which need breaking - and the process of
getting the mmap_lock and stripping the PTEs on all CPUs could be
quite slow.)
(2) Call afs_validate() from the ->map_pages() handler to check to see if
the file has changed and to get a new callback promise from the
server.
Also handle the fileserver telling us that it's dropping all callbacks,
possibly after it's been restarted by sending us a CB.InitCallBackState*
call by the following means:
(3) Maintain a per-cell list of afs files that are currently mmap'd
(cell->fs_open_mmaps).
(4) Add a work item to each server that is invoked if there are any open
mmaps when CB.InitCallBackState happens. This work item goes through
the aforementioned list and invokes the vnode->cb_work work item for
each one that is currently using this server.
This causes the PTEs to be cleared, causing ->map_pages() or
->page_mkwrite() to be called again, thereby calling afs_validate()
again.
I've chosen to simply strip the PTEs at the point of notification reception
rather than invalidate all the pages as well because (a) it's faster, (b)
we may get a notification for other reasons than the data being altered (in
which case we don't want to clobber the pagecache) and (c) we need to ask
the server to find out - and I don't want to wait for the reply before
holding up userspace.
This was tested using the attached test program:
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
int main(int argc, char *argv[])
{
size_t size = getpagesize();
unsigned char *p;
bool mod = (argc == 3);
int fd;
if (argc != 2 && argc != 3) {
fprintf(stderr, "Format: %s <file> [mod]\n", argv[0]);
exit(2);
}
fd = open(argv[1], mod ? O_RDWR : O_RDONLY);
if (fd < 0) {
perror(argv[1]);
exit(1);
}
p = mmap(NULL, size, mod ? PROT_READ|PROT_WRITE : PROT_READ,
MAP_SHARED, fd, 0);
if (p == MAP_FAILED) {
perror("mmap");
exit(1);
}
for (;;) {
if (mod) {
p[0]++;
msync(p, size, MS_ASYNC);
fsync(fd);
}
printf("%02x", p[0]);
fflush(stdout);
sleep(1);
}
}
It runs in two modes: in one mode, it mmaps a file, then sits in a loop
reading the first byte, printing it and sleeping for a second; in the
second mode it mmaps a file, then sits in a loop incrementing the first
byte and flushing, then printing and sleeping.
Two instances of this program can be run on different machines, one doing
the reading and one doing the writing. The reader should see the changes
made by the writer, but without this patch, they aren't because validity
checking is being done lazily - only on entry to the filesystem.
Testing the InitCallBackState change is more complicated. The server has
to be taken offline, the saved callback state file removed and then the
server restarted whilst the reading-mode program continues to run. The
client machine then has to poke the server to trigger the InitCallBackState
call.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/163111668833.283156.382633263709075739.stgit@warthog.procyon.org.uk/
2021-09-02 16:43:10 +01:00
|
|
|
}
|
|
|
|
|
2023-11-07 17:59:33 +00:00
|
|
|
static void afs_server_init_callback(struct afs_server *server)
|
afs: Fix mmap coherency vs 3rd-party changes
Fix the coherency management of mmap'd data such that 3rd-party changes
become visible as soon as possible after the callback notification is
delivered by the fileserver. This is done by the following means:
(1) When we break a callback on a vnode specified by the CB.CallBack call
from the server, we queue a work item (vnode->cb_work) to go and
clobber all the PTEs mapping to that inode.
This causes the CPU to trip through the ->map_pages() and
->page_mkwrite() handlers if userspace attempts to access the page(s)
again.
(Ideally, this would be done in the service handler for CB.CallBack,
but the server is waiting for our reply before considering, and we
have a list of vnodes, all of which need breaking - and the process of
getting the mmap_lock and stripping the PTEs on all CPUs could be
quite slow.)
(2) Call afs_validate() from the ->map_pages() handler to check to see if
the file has changed and to get a new callback promise from the
server.
Also handle the fileserver telling us that it's dropping all callbacks,
possibly after it's been restarted by sending us a CB.InitCallBackState*
call by the following means:
(3) Maintain a per-cell list of afs files that are currently mmap'd
(cell->fs_open_mmaps).
(4) Add a work item to each server that is invoked if there are any open
mmaps when CB.InitCallBackState happens. This work item goes through
the aforementioned list and invokes the vnode->cb_work work item for
each one that is currently using this server.
This causes the PTEs to be cleared, causing ->map_pages() or
->page_mkwrite() to be called again, thereby calling afs_validate()
again.
I've chosen to simply strip the PTEs at the point of notification reception
rather than invalidate all the pages as well because (a) it's faster, (b)
we may get a notification for other reasons than the data being altered (in
which case we don't want to clobber the pagecache) and (c) we need to ask
the server to find out - and I don't want to wait for the reply before
holding up userspace.
This was tested using the attached test program:
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
int main(int argc, char *argv[])
{
size_t size = getpagesize();
unsigned char *p;
bool mod = (argc == 3);
int fd;
if (argc != 2 && argc != 3) {
fprintf(stderr, "Format: %s <file> [mod]\n", argv[0]);
exit(2);
}
fd = open(argv[1], mod ? O_RDWR : O_RDONLY);
if (fd < 0) {
perror(argv[1]);
exit(1);
}
p = mmap(NULL, size, mod ? PROT_READ|PROT_WRITE : PROT_READ,
MAP_SHARED, fd, 0);
if (p == MAP_FAILED) {
perror("mmap");
exit(1);
}
for (;;) {
if (mod) {
p[0]++;
msync(p, size, MS_ASYNC);
fsync(fd);
}
printf("%02x", p[0]);
fflush(stdout);
sleep(1);
}
}
It runs in two modes: in one mode, it mmaps a file, then sits in a loop
reading the first byte, printing it and sleeping for a second; in the
second mode it mmaps a file, then sits in a loop incrementing the first
byte and flushing, then printing and sleeping.
Two instances of this program can be run on different machines, one doing
the reading and one doing the writing. The reader should see the changes
made by the writer, but without this patch, they aren't because validity
checking is being done lazily - only on entry to the filesystem.
Testing the InitCallBackState change is more complicated. The server has
to be taken offline, the saved callback state file removed and then the
server restarted whilst the reading-mode program continues to run. The
client machine then has to poke the server to trigger the InitCallBackState
call.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/163111668833.283156.382633263709075739.stgit@warthog.procyon.org.uk/
2021-09-02 16:43:10 +01:00
|
|
|
{
|
|
|
|
struct afs_vnode *vnode;
|
|
|
|
struct afs_cell *cell = server->cell;
|
|
|
|
|
|
|
|
down_read(&cell->fs_open_mmaps_lock);
|
|
|
|
|
|
|
|
list_for_each_entry(vnode, &cell->fs_open_mmaps, cb_mmap_link) {
|
|
|
|
if (vnode->cb_server == server) {
|
|
|
|
clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
|
|
|
|
queue_work(system_unbound_wq, &vnode->cb_work);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
up_read(&cell->fs_open_mmaps_lock);
|
|
|
|
}
|
|
|
|
|
2017-11-02 15:27:49 +00:00
|
|
|
/*
|
2020-05-27 15:51:30 +01:00
|
|
|
* Allow the fileserver to request callback state (re-)initialisation.
|
|
|
|
* Unfortunately, UUIDs are not guaranteed unique.
|
2017-11-02 15:27:49 +00:00
|
|
|
*/
|
|
|
|
void afs_init_callback_state(struct afs_server *server)
|
|
|
|
{
|
2023-11-07 17:59:33 +00:00
|
|
|
struct afs_cell *cell = server->cell;
|
|
|
|
|
|
|
|
down_read(&cell->vs_lock);
|
|
|
|
|
2020-05-27 15:51:30 +01:00
|
|
|
do {
|
|
|
|
server->cb_s_break++;
|
2021-09-02 21:51:01 +01:00
|
|
|
atomic_inc(&server->cell->fs_s_break);
|
afs: Fix mmap coherency vs 3rd-party changes
Fix the coherency management of mmap'd data such that 3rd-party changes
become visible as soon as possible after the callback notification is
delivered by the fileserver. This is done by the following means:
(1) When we break a callback on a vnode specified by the CB.CallBack call
from the server, we queue a work item (vnode->cb_work) to go and
clobber all the PTEs mapping to that inode.
This causes the CPU to trip through the ->map_pages() and
->page_mkwrite() handlers if userspace attempts to access the page(s)
again.
(Ideally, this would be done in the service handler for CB.CallBack,
but the server is waiting for our reply before considering, and we
have a list of vnodes, all of which need breaking - and the process of
getting the mmap_lock and stripping the PTEs on all CPUs could be
quite slow.)
(2) Call afs_validate() from the ->map_pages() handler to check to see if
the file has changed and to get a new callback promise from the
server.
Also handle the fileserver telling us that it's dropping all callbacks,
possibly after it's been restarted by sending us a CB.InitCallBackState*
call by the following means:
(3) Maintain a per-cell list of afs files that are currently mmap'd
(cell->fs_open_mmaps).
(4) Add a work item to each server that is invoked if there are any open
mmaps when CB.InitCallBackState happens. This work item goes through
the aforementioned list and invokes the vnode->cb_work work item for
each one that is currently using this server.
This causes the PTEs to be cleared, causing ->map_pages() or
->page_mkwrite() to be called again, thereby calling afs_validate()
again.
I've chosen to simply strip the PTEs at the point of notification reception
rather than invalidate all the pages as well because (a) it's faster, (b)
we may get a notification for other reasons than the data being altered (in
which case we don't want to clobber the pagecache) and (c) we need to ask
the server to find out - and I don't want to wait for the reply before
holding up userspace.
This was tested using the attached test program:
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
int main(int argc, char *argv[])
{
size_t size = getpagesize();
unsigned char *p;
bool mod = (argc == 3);
int fd;
if (argc != 2 && argc != 3) {
fprintf(stderr, "Format: %s <file> [mod]\n", argv[0]);
exit(2);
}
fd = open(argv[1], mod ? O_RDWR : O_RDONLY);
if (fd < 0) {
perror(argv[1]);
exit(1);
}
p = mmap(NULL, size, mod ? PROT_READ|PROT_WRITE : PROT_READ,
MAP_SHARED, fd, 0);
if (p == MAP_FAILED) {
perror("mmap");
exit(1);
}
for (;;) {
if (mod) {
p[0]++;
msync(p, size, MS_ASYNC);
fsync(fd);
}
printf("%02x", p[0]);
fflush(stdout);
sleep(1);
}
}
It runs in two modes: in one mode, it mmaps a file, then sits in a loop
reading the first byte, printing it and sleeping for a second; in the
second mode it mmaps a file, then sits in a loop incrementing the first
byte and flushing, then printing and sleeping.
Two instances of this program can be run on different machines, one doing
the reading and one doing the writing. The reader should see the changes
made by the writer, but without this patch, they aren't because validity
checking is being done lazily - only on entry to the filesystem.
Testing the InitCallBackState change is more complicated. The server has
to be taken offline, the saved callback state file removed and then the
server restarted whilst the reading-mode program continues to run. The
client machine then has to poke the server to trigger the InitCallBackState
call.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/163111668833.283156.382633263709075739.stgit@warthog.procyon.org.uk/
2021-09-02 16:43:10 +01:00
|
|
|
if (!list_empty(&server->cell->fs_open_mmaps))
|
2023-11-07 17:59:33 +00:00
|
|
|
afs_server_init_callback(server);
|
afs: Fix mmap coherency vs 3rd-party changes
Fix the coherency management of mmap'd data such that 3rd-party changes
become visible as soon as possible after the callback notification is
delivered by the fileserver. This is done by the following means:
(1) When we break a callback on a vnode specified by the CB.CallBack call
from the server, we queue a work item (vnode->cb_work) to go and
clobber all the PTEs mapping to that inode.
This causes the CPU to trip through the ->map_pages() and
->page_mkwrite() handlers if userspace attempts to access the page(s)
again.
(Ideally, this would be done in the service handler for CB.CallBack,
but the server is waiting for our reply before considering, and we
have a list of vnodes, all of which need breaking - and the process of
getting the mmap_lock and stripping the PTEs on all CPUs could be
quite slow.)
(2) Call afs_validate() from the ->map_pages() handler to check to see if
the file has changed and to get a new callback promise from the
server.
Also handle the fileserver telling us that it's dropping all callbacks,
possibly after it's been restarted by sending us a CB.InitCallBackState*
call by the following means:
(3) Maintain a per-cell list of afs files that are currently mmap'd
(cell->fs_open_mmaps).
(4) Add a work item to each server that is invoked if there are any open
mmaps when CB.InitCallBackState happens. This work item goes through
the aforementioned list and invokes the vnode->cb_work work item for
each one that is currently using this server.
This causes the PTEs to be cleared, causing ->map_pages() or
->page_mkwrite() to be called again, thereby calling afs_validate()
again.
I've chosen to simply strip the PTEs at the point of notification reception
rather than invalidate all the pages as well because (a) it's faster, (b)
we may get a notification for other reasons than the data being altered (in
which case we don't want to clobber the pagecache) and (c) we need to ask
the server to find out - and I don't want to wait for the reply before
holding up userspace.
This was tested using the attached test program:
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
int main(int argc, char *argv[])
{
size_t size = getpagesize();
unsigned char *p;
bool mod = (argc == 3);
int fd;
if (argc != 2 && argc != 3) {
fprintf(stderr, "Format: %s <file> [mod]\n", argv[0]);
exit(2);
}
fd = open(argv[1], mod ? O_RDWR : O_RDONLY);
if (fd < 0) {
perror(argv[1]);
exit(1);
}
p = mmap(NULL, size, mod ? PROT_READ|PROT_WRITE : PROT_READ,
MAP_SHARED, fd, 0);
if (p == MAP_FAILED) {
perror("mmap");
exit(1);
}
for (;;) {
if (mod) {
p[0]++;
msync(p, size, MS_ASYNC);
fsync(fd);
}
printf("%02x", p[0]);
fflush(stdout);
sleep(1);
}
}
It runs in two modes: in one mode, it mmaps a file, then sits in a loop
reading the first byte, printing it and sleeping for a second; in the
second mode it mmaps a file, then sits in a loop incrementing the first
byte and flushing, then printing and sleeping.
Two instances of this program can be run on different machines, one doing
the reading and one doing the writing. The reader should see the changes
made by the writer, but without this patch, they aren't because validity
checking is being done lazily - only on entry to the filesystem.
Testing the InitCallBackState change is more complicated. The server has
to be taken offline, the saved callback state file removed and then the
server restarted whilst the reading-mode program continues to run. The
client machine then has to poke the server to trigger the InitCallBackState
call.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/163111668833.283156.382633263709075739.stgit@warthog.procyon.org.uk/
2021-09-02 16:43:10 +01:00
|
|
|
|
|
|
|
} while ((server = rcu_dereference(server->uuid_next)));
|
2023-11-07 17:59:33 +00:00
|
|
|
|
|
|
|
up_read(&cell->vs_lock);
|
2007-04-26 15:55:03 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* actually break a callback
|
|
|
|
*/
|
2019-06-20 18:12:16 +01:00
|
|
|
void __afs_break_callback(struct afs_vnode *vnode, enum afs_cb_break_reason reason)
|
2007-04-26 15:55:03 -07:00
|
|
|
{
|
|
|
|
_enter("");
|
|
|
|
|
2018-04-06 14:17:26 +01:00
|
|
|
clear_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags);
|
2017-11-02 15:27:49 +00:00
|
|
|
if (test_and_clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags)) {
|
|
|
|
vnode->cb_break++;
|
afs: Parse the VolSync record in the reply of a number of RPC ops
A number of fileserver RPC operations return a VolSync record as part of
their reply that gives some information about the state of the volume being
accessed, including:
(1) A volume Creation timestamp. For an RW volume, this is the time at
which the volume was created; if it changes, the RW volume was
presumably restored from a backup and all cached data should be
scrubbed as Data Version numbers could regress on the files in the
volume.
For an RO volume, this is the time it was last snapshotted from the RW
volume. It is expected to advance each time this happens; if it
regresses, cached data should be scrubbed.
(2) A volume Update timestamp (Auristor only). For an RW volume, this is
updated any time any change is made to a volume or its contents. If
it regresses, all cached data must be scrubbed.
For an RO volume, this is a copy of the RW volume's Update timestamp
at the point of snapshotting. It can be used as a version number when
checking to see if a callback on a RO volume was due to a snapshot.
If it regresses, all cached data must be scrubbed.
but this is currently not made use of by the in-kernel afs filesystem.
Make the afs filesystem use this by:
(1) Add an update time field to the afs_volsync struct and use a value of
TIME64_MIN in both that and the creation time to indicate that they
are unset.
(2) Add creation and update time fields to the afs_volume struct and use
this to track the two timestamps.
(3) Add a volsync_lock mutex to the afs_volume struct to control
modification access for when we detect a change in these values.
(3) Add a 'pre-op volsync' struct to the afs_operation struct to record
the state of the volume tracking before the op.
(4) Add a new counter, cb_scrub, to the afs_volume struct to count events
that require all data to be scrubbed. A copy is placed in the
afs_vnode struct (inode) and if they no longer match, a scrub takes
place.
(5) When the result of an operation is being parsed, parse the VolSync
data too, if it is provided. Note that the two timestamps are handled
separately, since they don't work in quite the same way.
- If the afs_volume tracking is unset, just set it and do nothing
else.
- If the result timestamps are the same as the ones in afs_volume, do
nothing.
- If the timestamps regress, increment cb_scrub if not already done
so.
- If the creation timestamp on a RW volume changes, increment cb_scrub
if not already done so.
- If the creation timestamp on a RO volume advances, update the server
list and see if the current server has been excluded, if so reissue
the op. Once over half of the replication sites have been updated,
increment cb_ro_snapshot to indicate updates may be required and
switch over to excluding unupdated replication sites.
- If the creation timestamp on a Backup volume advances, just
increment cb_ro_snapshot to trigger updates.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-11-05 16:11:07 +00:00
|
|
|
vnode->cb_v_break = atomic_read(&vnode->volume->cb_v_break);
|
2017-11-02 15:27:49 +00:00
|
|
|
afs_clear_permits(vnode);
|
2007-04-26 15:55:03 -07:00
|
|
|
|
2019-05-10 23:03:31 +01:00
|
|
|
if (vnode->lock_state == AFS_VNODE_LOCK_WAITING_FOR_CB)
|
2007-07-15 23:40:12 -07:00
|
|
|
afs_lock_may_be_available(vnode);
|
2019-06-20 18:12:16 +01:00
|
|
|
|
afs: Fix mmap coherency vs 3rd-party changes
Fix the coherency management of mmap'd data such that 3rd-party changes
become visible as soon as possible after the callback notification is
delivered by the fileserver. This is done by the following means:
(1) When we break a callback on a vnode specified by the CB.CallBack call
from the server, we queue a work item (vnode->cb_work) to go and
clobber all the PTEs mapping to that inode.
This causes the CPU to trip through the ->map_pages() and
->page_mkwrite() handlers if userspace attempts to access the page(s)
again.
(Ideally, this would be done in the service handler for CB.CallBack,
but the server is waiting for our reply before considering, and we
have a list of vnodes, all of which need breaking - and the process of
getting the mmap_lock and stripping the PTEs on all CPUs could be
quite slow.)
(2) Call afs_validate() from the ->map_pages() handler to check to see if
the file has changed and to get a new callback promise from the
server.
Also handle the fileserver telling us that it's dropping all callbacks,
possibly after it's been restarted by sending us a CB.InitCallBackState*
call by the following means:
(3) Maintain a per-cell list of afs files that are currently mmap'd
(cell->fs_open_mmaps).
(4) Add a work item to each server that is invoked if there are any open
mmaps when CB.InitCallBackState happens. This work item goes through
the aforementioned list and invokes the vnode->cb_work work item for
each one that is currently using this server.
This causes the PTEs to be cleared, causing ->map_pages() or
->page_mkwrite() to be called again, thereby calling afs_validate()
again.
I've chosen to simply strip the PTEs at the point of notification reception
rather than invalidate all the pages as well because (a) it's faster, (b)
we may get a notification for other reasons than the data being altered (in
which case we don't want to clobber the pagecache) and (c) we need to ask
the server to find out - and I don't want to wait for the reply before
holding up userspace.
This was tested using the attached test program:
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
int main(int argc, char *argv[])
{
size_t size = getpagesize();
unsigned char *p;
bool mod = (argc == 3);
int fd;
if (argc != 2 && argc != 3) {
fprintf(stderr, "Format: %s <file> [mod]\n", argv[0]);
exit(2);
}
fd = open(argv[1], mod ? O_RDWR : O_RDONLY);
if (fd < 0) {
perror(argv[1]);
exit(1);
}
p = mmap(NULL, size, mod ? PROT_READ|PROT_WRITE : PROT_READ,
MAP_SHARED, fd, 0);
if (p == MAP_FAILED) {
perror("mmap");
exit(1);
}
for (;;) {
if (mod) {
p[0]++;
msync(p, size, MS_ASYNC);
fsync(fd);
}
printf("%02x", p[0]);
fflush(stdout);
sleep(1);
}
}
It runs in two modes: in one mode, it mmaps a file, then sits in a loop
reading the first byte, printing it and sleeping for a second; in the
second mode it mmaps a file, then sits in a loop incrementing the first
byte and flushing, then printing and sleeping.
Two instances of this program can be run on different machines, one doing
the reading and one doing the writing. The reader should see the changes
made by the writer, but without this patch, they aren't because validity
checking is being done lazily - only on entry to the filesystem.
Testing the InitCallBackState change is more complicated. The server has
to be taken offline, the saved callback state file removed and then the
server restarted whilst the reading-mode program continues to run. The
client machine then has to poke the server to trigger the InitCallBackState
call.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Markus Suvanto <markus.suvanto@gmail.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/163111668833.283156.382633263709075739.stgit@warthog.procyon.org.uk/
2021-09-02 16:43:10 +01:00
|
|
|
if (reason != afs_cb_break_for_deleted &&
|
|
|
|
vnode->status.type == AFS_FTYPE_FILE &&
|
|
|
|
atomic_read(&vnode->cb_nr_mmap))
|
|
|
|
queue_work(system_unbound_wq, &vnode->cb_work);
|
|
|
|
|
2019-06-20 18:12:16 +01:00
|
|
|
trace_afs_cb_break(&vnode->fid, vnode->cb_break, reason, true);
|
|
|
|
} else {
|
|
|
|
trace_afs_cb_break(&vnode->fid, vnode->cb_break, reason, false);
|
2007-04-26 15:55:03 -07:00
|
|
|
}
|
2018-10-20 00:57:58 +01:00
|
|
|
}
|
2017-11-02 15:27:49 +00:00
|
|
|
|
2019-06-20 18:12:16 +01:00
|
|
|
void afs_break_callback(struct afs_vnode *vnode, enum afs_cb_break_reason reason)
|
2018-10-20 00:57:58 +01:00
|
|
|
{
|
|
|
|
write_seqlock(&vnode->cb_lock);
|
2019-06-20 18:12:16 +01:00
|
|
|
__afs_break_callback(vnode, reason);
|
2017-11-02 15:27:49 +00:00
|
|
|
write_sequnlock(&vnode->cb_lock);
|
2007-04-26 15:55:03 -07:00
|
|
|
}
|
|
|
|
|
2020-03-27 15:02:44 +00:00
|
|
|
/*
|
2020-04-30 01:03:49 +01:00
|
|
|
* Look up a volume by volume ID under RCU conditions.
|
2020-03-27 15:02:44 +00:00
|
|
|
*/
|
2020-04-30 01:03:49 +01:00
|
|
|
static struct afs_volume *afs_lookup_volume_rcu(struct afs_cell *cell,
|
|
|
|
afs_volid_t vid)
|
2020-03-27 15:02:44 +00:00
|
|
|
{
|
2020-04-30 01:03:49 +01:00
|
|
|
struct afs_volume *volume = NULL;
|
2020-03-27 15:02:44 +00:00
|
|
|
struct rb_node *p;
|
2023-11-30 12:56:06 +01:00
|
|
|
int seq = 1;
|
2020-03-27 15:02:44 +00:00
|
|
|
|
2023-11-07 17:59:33 +00:00
|
|
|
for (;;) {
|
2020-03-27 15:02:44 +00:00
|
|
|
/* Unfortunately, rbtree walking doesn't give reliable results
|
|
|
|
* under just the RCU read lock, so we have to check for
|
|
|
|
* changes.
|
|
|
|
*/
|
2023-11-30 12:56:06 +01:00
|
|
|
seq++; /* 2 on the 1st/lockless path, otherwise odd */
|
2020-04-30 01:03:49 +01:00
|
|
|
read_seqbegin_or_lock(&cell->volume_lock, &seq);
|
2020-03-27 15:02:44 +00:00
|
|
|
|
2020-04-30 01:03:49 +01:00
|
|
|
p = rcu_dereference_raw(cell->volumes.rb_node);
|
2020-03-27 15:02:44 +00:00
|
|
|
while (p) {
|
2020-04-30 01:03:49 +01:00
|
|
|
volume = rb_entry(p, struct afs_volume, cell_node);
|
2020-03-27 15:02:44 +00:00
|
|
|
|
2020-04-30 01:03:49 +01:00
|
|
|
if (volume->vid < vid)
|
2020-03-27 15:02:44 +00:00
|
|
|
p = rcu_dereference_raw(p->rb_left);
|
2020-04-30 01:03:49 +01:00
|
|
|
else if (volume->vid > vid)
|
2020-03-27 15:02:44 +00:00
|
|
|
p = rcu_dereference_raw(p->rb_right);
|
|
|
|
else
|
|
|
|
break;
|
2020-04-30 01:03:49 +01:00
|
|
|
volume = NULL;
|
2020-03-27 15:02:44 +00:00
|
|
|
}
|
|
|
|
|
2023-11-07 17:59:33 +00:00
|
|
|
if (volume && afs_try_get_volume(volume, afs_volume_trace_get_callback))
|
|
|
|
break;
|
|
|
|
if (!need_seqretry(&cell->volume_lock, seq))
|
|
|
|
break;
|
|
|
|
seq |= 1; /* Want a lock next time */
|
|
|
|
}
|
2020-03-27 15:02:44 +00:00
|
|
|
|
2020-04-30 01:03:49 +01:00
|
|
|
done_seqretry(&cell->volume_lock, seq);
|
|
|
|
return volume;
|
2020-03-27 15:02:44 +00:00
|
|
|
}
|
|
|
|
|
2007-04-26 15:55:03 -07:00
|
|
|
/*
|
|
|
|
* allow the fileserver to explicitly break one callback
|
|
|
|
* - happens when
|
|
|
|
* - the backing file is changed
|
|
|
|
* - a lock is released
|
|
|
|
*/
|
2020-04-30 01:03:49 +01:00
|
|
|
static void afs_break_one_callback(struct afs_volume *volume,
|
|
|
|
struct afs_fid *fid)
|
2007-04-26 15:55:03 -07:00
|
|
|
{
|
2020-04-30 01:03:49 +01:00
|
|
|
struct super_block *sb;
|
2007-04-26 15:55:03 -07:00
|
|
|
struct afs_vnode *vnode;
|
2017-11-02 15:27:49 +00:00
|
|
|
struct inode *inode;
|
afs: Parse the VolSync record in the reply of a number of RPC ops
A number of fileserver RPC operations return a VolSync record as part of
their reply that gives some information about the state of the volume being
accessed, including:
(1) A volume Creation timestamp. For an RW volume, this is the time at
which the volume was created; if it changes, the RW volume was
presumably restored from a backup and all cached data should be
scrubbed as Data Version numbers could regress on the files in the
volume.
For an RO volume, this is the time it was last snapshotted from the RW
volume. It is expected to advance each time this happens; if it
regresses, cached data should be scrubbed.
(2) A volume Update timestamp (Auristor only). For an RW volume, this is
updated any time any change is made to a volume or its contents. If
it regresses, all cached data must be scrubbed.
For an RO volume, this is a copy of the RW volume's Update timestamp
at the point of snapshotting. It can be used as a version number when
checking to see if a callback on a RO volume was due to a snapshot.
If it regresses, all cached data must be scrubbed.
but this is currently not made use of by the in-kernel afs filesystem.
Make the afs filesystem use this by:
(1) Add an update time field to the afs_volsync struct and use a value of
TIME64_MIN in both that and the creation time to indicate that they
are unset.
(2) Add creation and update time fields to the afs_volume struct and use
this to track the two timestamps.
(3) Add a volsync_lock mutex to the afs_volume struct to control
modification access for when we detect a change in these values.
(3) Add a 'pre-op volsync' struct to the afs_operation struct to record
the state of the volume tracking before the op.
(4) Add a new counter, cb_scrub, to the afs_volume struct to count events
that require all data to be scrubbed. A copy is placed in the
afs_vnode struct (inode) and if they no longer match, a scrub takes
place.
(5) When the result of an operation is being parsed, parse the VolSync
data too, if it is provided. Note that the two timestamps are handled
separately, since they don't work in quite the same way.
- If the afs_volume tracking is unset, just set it and do nothing
else.
- If the result timestamps are the same as the ones in afs_volume, do
nothing.
- If the timestamps regress, increment cb_scrub if not already done
so.
- If the creation timestamp on a RW volume changes, increment cb_scrub
if not already done so.
- If the creation timestamp on a RO volume advances, update the server
list and see if the current server has been excluded, if so reissue
the op. Once over half of the replication sites have been updated,
increment cb_ro_snapshot to indicate updates may be required and
switch over to excluding unupdated replication sites.
- If the creation timestamp on a Backup volume advances, just
increment cb_ro_snapshot to trigger updates.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-11-05 16:11:07 +00:00
|
|
|
unsigned int cb_v_break;
|
2007-04-26 15:55:03 -07:00
|
|
|
|
2020-04-30 01:03:49 +01:00
|
|
|
if (fid->vnode == 0 && fid->unique == 0) {
|
|
|
|
/* The callback break applies to an entire volume. */
|
|
|
|
write_lock(&volume->cb_v_break_lock);
|
afs: Parse the VolSync record in the reply of a number of RPC ops
A number of fileserver RPC operations return a VolSync record as part of
their reply that gives some information about the state of the volume being
accessed, including:
(1) A volume Creation timestamp. For an RW volume, this is the time at
which the volume was created; if it changes, the RW volume was
presumably restored from a backup and all cached data should be
scrubbed as Data Version numbers could regress on the files in the
volume.
For an RO volume, this is the time it was last snapshotted from the RW
volume. It is expected to advance each time this happens; if it
regresses, cached data should be scrubbed.
(2) A volume Update timestamp (Auristor only). For an RW volume, this is
updated any time any change is made to a volume or its contents. If
it regresses, all cached data must be scrubbed.
For an RO volume, this is a copy of the RW volume's Update timestamp
at the point of snapshotting. It can be used as a version number when
checking to see if a callback on a RO volume was due to a snapshot.
If it regresses, all cached data must be scrubbed.
but this is currently not made use of by the in-kernel afs filesystem.
Make the afs filesystem use this by:
(1) Add an update time field to the afs_volsync struct and use a value of
TIME64_MIN in both that and the creation time to indicate that they
are unset.
(2) Add creation and update time fields to the afs_volume struct and use
this to track the two timestamps.
(3) Add a volsync_lock mutex to the afs_volume struct to control
modification access for when we detect a change in these values.
(3) Add a 'pre-op volsync' struct to the afs_operation struct to record
the state of the volume tracking before the op.
(4) Add a new counter, cb_scrub, to the afs_volume struct to count events
that require all data to be scrubbed. A copy is placed in the
afs_vnode struct (inode) and if they no longer match, a scrub takes
place.
(5) When the result of an operation is being parsed, parse the VolSync
data too, if it is provided. Note that the two timestamps are handled
separately, since they don't work in quite the same way.
- If the afs_volume tracking is unset, just set it and do nothing
else.
- If the result timestamps are the same as the ones in afs_volume, do
nothing.
- If the timestamps regress, increment cb_scrub if not already done
so.
- If the creation timestamp on a RW volume changes, increment cb_scrub
if not already done so.
- If the creation timestamp on a RO volume advances, update the server
list and see if the current server has been excluded, if so reissue
the op. Once over half of the replication sites have been updated,
increment cb_ro_snapshot to indicate updates may be required and
switch over to excluding unupdated replication sites.
- If the creation timestamp on a Backup volume advances, just
increment cb_ro_snapshot to trigger updates.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
2023-11-05 16:11:07 +00:00
|
|
|
cb_v_break = atomic_inc_return(&volume->cb_v_break);
|
|
|
|
trace_afs_cb_break(fid, cb_v_break,
|
2020-04-30 01:03:49 +01:00
|
|
|
afs_cb_break_for_volume_callback, false);
|
|
|
|
write_unlock(&volume->cb_v_break_lock);
|
|
|
|
return;
|
|
|
|
}
|
2018-05-12 22:31:33 +01:00
|
|
|
|
2020-04-30 01:03:49 +01:00
|
|
|
/* See if we can find a matching inode - even an I_NEW inode needs to
|
|
|
|
* be marked as it can have its callback broken before we finish
|
|
|
|
* setting up the local inode.
|
|
|
|
*/
|
|
|
|
sb = rcu_dereference(volume->sb);
|
|
|
|
if (!sb)
|
|
|
|
return;
|
|
|
|
|
|
|
|
inode = find_inode_rcu(sb, fid->vnode, afs_ilookup5_test_by_fid, fid);
|
|
|
|
if (inode) {
|
|
|
|
vnode = AFS_FS_I(inode);
|
|
|
|
afs_break_callback(vnode, afs_cb_break_for_callback);
|
|
|
|
} else {
|
|
|
|
trace_afs_cb_miss(fid, afs_cb_break_for_callback);
|
2017-11-02 15:27:49 +00:00
|
|
|
}
|
2020-03-27 15:02:44 +00:00
|
|
|
}
|
2007-04-26 15:55:03 -07:00
|
|
|
|
2020-03-27 15:02:44 +00:00
|
|
|
static void afs_break_some_callbacks(struct afs_server *server,
|
|
|
|
struct afs_callback_break *cbb,
|
|
|
|
size_t *_count)
|
|
|
|
{
|
|
|
|
struct afs_callback_break *residue = cbb;
|
2020-04-30 01:03:49 +01:00
|
|
|
struct afs_volume *volume;
|
2020-03-27 15:02:44 +00:00
|
|
|
afs_volid_t vid = cbb->fid.vid;
|
|
|
|
size_t i;
|
|
|
|
|
2023-11-07 17:59:33 +00:00
|
|
|
rcu_read_lock();
|
2020-04-30 01:03:49 +01:00
|
|
|
volume = afs_lookup_volume_rcu(server->cell, vid);
|
2020-03-27 15:02:44 +00:00
|
|
|
/* TODO: Find all matching volumes if we couldn't match the server and
|
|
|
|
* break them anyway.
|
|
|
|
*/
|
|
|
|
for (i = *_count; i > 0; cbb++, i--) {
|
|
|
|
if (cbb->fid.vid == vid) {
|
|
|
|
_debug("- Fid { vl=%08llx n=%llu u=%u }",
|
|
|
|
cbb->fid.vid,
|
|
|
|
cbb->fid.vnode,
|
|
|
|
cbb->fid.unique);
|
|
|
|
--*_count;
|
2020-04-30 01:03:49 +01:00
|
|
|
if (volume)
|
|
|
|
afs_break_one_callback(volume, &cbb->fid);
|
2020-03-27 15:02:44 +00:00
|
|
|
} else {
|
|
|
|
*residue++ = *cbb;
|
|
|
|
}
|
|
|
|
}
|
2023-11-07 17:59:33 +00:00
|
|
|
|
|
|
|
rcu_read_unlock();
|
|
|
|
afs_put_volume(volume, afs_volume_trace_put_callback);
|
2007-04-26 15:49:28 -07:00
|
|
|
}
|
2005-04-16 15:20:36 -07:00
|
|
|
|
|
|
|
/*
|
|
|
|
* allow the fileserver to break callback promises
|
|
|
|
*/
|
2007-04-26 15:55:03 -07:00
|
|
|
void afs_break_callbacks(struct afs_server *server, size_t count,
|
2018-04-09 21:12:31 +01:00
|
|
|
struct afs_callback_break *callbacks)
|
2005-04-16 15:20:36 -07:00
|
|
|
{
|
2007-04-26 15:55:03 -07:00
|
|
|
_enter("%p,%zu,", server, count);
|
2005-04-16 15:20:36 -07:00
|
|
|
|
2007-04-26 15:55:03 -07:00
|
|
|
ASSERT(server != NULL);
|
2005-04-16 15:20:36 -07:00
|
|
|
|
2020-03-27 15:02:44 +00:00
|
|
|
while (count > 0)
|
|
|
|
afs_break_some_callbacks(server, callbacks, &count);
|
2007-04-26 15:55:03 -07:00
|
|
|
}
|