linux/mm/swap.h

/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _MM_SWAP_H
#define _MM_SWAP_H
struct mempolicy;
struct swap_iocb;
extern int page_cluster;
#ifdef CONFIG_SWAP
#include <linux/swapops.h> /* for swp_offset */
#include <linux/blk_types.h> /* for bio_end_io_t */
/* linux/mm/page_io.c */
int sio_pool_init(void);
struct swap_iocb;
void swap_read_folio(struct folio *folio, struct swap_iocb **plug);
void __swap_read_unplug(struct swap_iocb *plug);
static inline void swap_read_unplug(struct swap_iocb *plug)
{
        if (unlikely(plug))
                __swap_read_unplug(plug);
}
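
/*
 * Usage sketch: readahead-style callers batch several reads under one
 * plug and flush it once at the end:
 *
 *        struct swap_iocb *splug = NULL;
 *
 *        swap_read_folio(folio, &splug);
 *        ...
 *        swap_read_unplug(splug);
 */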
void swap_write_unplug(struct swap_iocb *sio);
int swap_writeout(struct folio *folio, struct swap_iocb **swap_plug);
void __swap_writepage(struct folio *folio, struct swap_iocb **swap_plug);
/* linux/mm/swap_state.c */
/* One swap address space for each 64M swap space */
#define SWAP_ADDRESS_SPACE_SHIFT 14
#define SWAP_ADDRESS_SPACE_PAGES (1 << SWAP_ADDRESS_SPACE_SHIFT)
#define SWAP_ADDRESS_SPACE_MASK (SWAP_ADDRESS_SPACE_PAGES - 1)
extern struct address_space *swapper_spaces[];
#define swap_address_space(entry) \
        (&swapper_spaces[swp_type(entry)][swp_offset(entry) \
                >> SWAP_ADDRESS_SPACE_SHIFT])
/*
* Return the swap device position of the swap entry.
*/
static inline loff_t swap_dev_pos(swp_entry_t entry)
{
        return ((loff_t)swp_offset(entry)) << PAGE_SHIFT;
}
/*
* Return the swap cache index of the swap entry.
*/
static inline pgoff_t swap_cache_index(swp_entry_t entry)
{
        BUILD_BUG_ON((SWP_OFFSET_MASK | SWAP_ADDRESS_SPACE_MASK) != SWP_OFFSET_MASK);
        return swp_offset(entry) & SWAP_ADDRESS_SPACE_MASK;
}
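
/*
 * Worked example (a sketch, assuming 4 KiB pages): each address space
 * covers 1 << 14 = 16384 pages, i.e. 64 MiB of swap. An entry with
 * swp_offset() 0x5000 is served by swapper_spaces[type][0x5000 >> 14],
 * i.e. the second space of its device, and is stored there at
 * swap_cache_index() 0x5000 & 0x3fff = 0x1000.
 */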
void show_swap_cache_info(void);
void *get_shadow_from_swap_cache(swp_entry_t entry);
int add_to_swap_cache(struct folio *folio, swp_entry_t entry,
                gfp_t gfp, void **shadowp);
void __delete_from_swap_cache(struct folio *folio,
                swp_entry_t entry, void *shadow);
void delete_from_swap_cache(struct folio *folio);
void clear_shadow_from_swap_cache(int type, unsigned long begin,
                unsigned long end);
void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry, int nr);
struct folio *swap_cache_get_folio(swp_entry_t entry,
                struct vm_area_struct *vma, unsigned long addr);
struct folio *filemap_get_incore_folio(struct address_space *mapping,
                pgoff_t index);
struct folio *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
                struct vm_area_struct *vma, unsigned long addr,
                struct swap_iocb **plug);
struct folio *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_flags,
                struct mempolicy *mpol, pgoff_t ilx, bool *new_page_allocated,
                bool skip_if_exists);
struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t flag,
                struct mempolicy *mpol, pgoff_t ilx);
struct folio *swapin_readahead(swp_entry_t entry, gfp_t flag,
                struct vm_fault *vmf);
static inline unsigned int folio_swap_flags(struct folio *folio)
{
        return swp_swap_info(folio->swap)->flags;
}
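
/*
 * Usage sketch: callers test device properties of a swapped folio,
 * e.g. checking folio_swap_flags(folio) & SWP_FS_OPS to decide whether
 * I/O must go through the filesystem rather than a bio.
 */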
/*
 * Return the count of contiguous swap entries that share the same
 * zeromap status as the starting entry. If is_zeromap is not NULL,
 * *is_zeromap is set to the zeromap status of the starting entry.
 */
static inline int swap_zeromap_batch(swp_entry_t entry, int max_nr,
                bool *is_zeromap)
{
        struct swap_info_struct *sis = swp_swap_info(entry);
        unsigned long start = swp_offset(entry);
        unsigned long end = start + max_nr;
        bool first_bit;

        first_bit = test_bit(start, sis->zeromap);
        if (is_zeromap)
                *is_zeromap = first_bit;

        if (max_nr <= 1)
                return max_nr;
        if (first_bit)
                return find_next_zero_bit(sis->zeromap, end, start) - start;
        else
                return find_next_bit(sis->zeromap, end, start) - start;
}
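
/*
 * Usage sketch: a swapin path reading nr entries into one large folio
 * needs one consistent zeromap status across the whole range, so a
 * caller treats a short count as "mixed status, cannot batch":
 *
 *        bool is_zeromap;
 *
 *        if (swap_zeromap_batch(entry, nr, &is_zeromap) != nr)
 *                return false;
 */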
static inline int non_swapcache_batch(swp_entry_t entry, int max_nr)
{
        struct swap_info_struct *si = swp_swap_info(entry);
        pgoff_t offset = swp_offset(entry);
        int i;

        /*
         * While allocating a large folio and doing mTHP swapin, we need to
         * ensure all entries are not cached, otherwise the mTHP folio will
         * conflict with the folio already in the swap cache.
         */
        for (i = 0; i < max_nr; i++) {
                if (si->swap_map[offset + i] & SWAP_HAS_CACHE)
                        return i;
        }

        return i;
}
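
/*
 * Usage sketch: before allocating an order-N folio for a batched
 * swapin of nr entries, a caller can fall back to a small folio when
 * any entry already has a swap cache folio:
 *
 *        if (non_swapcache_batch(entry, nr) != nr)
 *                goto fallback;
 */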
#else /* CONFIG_SWAP */
struct swap_iocb;
static inline void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
{
}
static inline void swap_write_unplug(struct swap_iocb *sio)
{
}
static inline struct address_space *swap_address_space(swp_entry_t entry)
{
        return NULL;
}
static inline pgoff_t swap_cache_index(swp_entry_t entry)
{
        return 0;
}
static inline void show_swap_cache_info(void)
{
}
static inline struct folio *swap_cluster_readahead(swp_entry_t entry,
gfp_t gfp_mask, struct mempolicy *mpol, pgoff_t ilx)
{
return NULL;
}
static inline struct folio *swapin_readahead(swp_entry_t swp, gfp_t gfp_mask,
struct vm_fault *vmf)
{
return NULL;
}
static inline int swap_writeout(struct folio *folio,
struct swap_iocb **swap_plug)
{
return 0;
}
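
/*
 * swapcache_clear() takes an entry count so callers working on large
 * folios can release a run of @nr contiguous swap entries in one call;
 * without CONFIG_SWAP there is nothing to clear.
 */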
static inline void swapcache_clear(struct swap_info_struct *si, swp_entry_t entry, int nr)
{
}
static inline struct folio *swap_cache_get_folio(swp_entry_t entry,
struct vm_area_struct *vma, unsigned long addr)
{
return NULL;
}
static inline
struct folio *filemap_get_incore_folio(struct address_space *mapping,
pgoff_t index)
{
return filemap_get_folio(mapping, index);
}
static inline void *get_shadow_from_swap_cache(swp_entry_t entry)
{
return NULL;
}
static inline int add_to_swap_cache(struct folio *folio, swp_entry_t entry,
gfp_t gfp_mask, void **shadowp)
{
return -1;
}
static inline void __delete_from_swap_cache(struct folio *folio,
swp_entry_t entry, void *shadow)
{
}
static inline void delete_from_swap_cache(struct folio *folio)
{
}
static inline void clear_shadow_from_swap_cache(int type, unsigned long begin,
unsigned long end)
{
}
static inline unsigned int folio_swap_flags(struct folio *folio)
{
return 0;
}
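
/*
 * With CONFIG_SWAP, swap_zeromap_batch() reports how many contiguous
 * entries starting at @entry share the same zeromap status (returned
 * via @has_zeromap), so callers can avoid reading a large folio that
 * is only partially zeromapped. The stub finds nothing to batch.
 */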
static inline int swap_zeromap_batch(swp_entry_t entry, int max_nr,
bool *has_zeromap)
{
return 0;
}
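
/*
 * Assumed counterpart to swap_zeromap_batch() above: with CONFIG_SWAP
 * this counts the leading run of entries not already present in the
 * swap cache. With swap disabled there is no swap cache to consult,
 * so the stub reports a run of zero.
 */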
static inline int non_swapcache_batch(swp_entry_t entry, int max_nr)
{
return 0;
}
#endif /* CONFIG_SWAP */
/**
* folio_index - File index of a folio.
* @folio: The folio.
*
* For a folio which is either in the page cache or the swap cache,
* return its index within the address_space it belongs to. If you know
* the folio is definitely in the page cache, you can look at the folio's
* index directly.
*
* Return: The index (offset in units of pages) of a folio in its file.
*/
static inline pgoff_t folio_index(struct folio *folio)
{
if (unlikely(folio_test_swapcache(folio)))
return swap_cache_index(folio->swap);
return folio->index;
}
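
/*
 * Illustrative sketch, not part of this header: folio_index() lets a
 * caller index a folio of unknown origin without checking the cache
 * type itself, e.g.
 *
 *	pgoff_t idx = folio_index(folio);
 *
 * resolves to swap_cache_index(folio->swap) for swap-cache folios and
 * to folio->index for ordinary page-cache folios.
 */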
#endif /* _MM_SWAP_H */