mirror of
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2025-08-05 16:54:27 +00:00

It is possible for a reclaimer to cause demotions of an lruvec belonging to a cgroup with cpuset.mems set to exclude some nodes. Attempt to apply this limitation based on the lruvec's memcg and prevent demotion.

Notably, this may still allow demotion of shared libraries or any memory first instantiated in another cgroup. This means cpusets still cannot guarantee complete isolation when demotion is enabled, and the docs have been updated to reflect this.

This is useful for isolating workloads on a multi-tenant system from certain classes of memory more consistently, with the noted exceptions.

Note on locking:

The cgroup_get_e_css reference protects the css->effective_mems, and calls of this interface would be subject to the same race conditions associated with a non-atomic access to cs->effective_mems.

So this interface cannot make strong guarantees of correctness, but in exchange it avoids taking a global lock or rcu_read_lock, for performance.

Link: https://lkml.kernel.org/r/20250424202806.52632-3-gourry@gourry.net
Signed-off-by: Gregory Price <gourry@gourry.net>
Suggested-by: Shakeel Butt <shakeel.butt@linux.dev>
Suggested-by: Waiman Long <longman@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
Reviewed-by: Waiman Long <longman@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
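
The check described in the commit message can be pictured with a small userspace model. This is only an illustrative sketch, not the kernel implementation: the kernel performs the check in C against the cpuset's effective memory nodes, and the names `parse_nodelist` and `demotion_allowed` are hypothetical. It assumes the allowed nodes are given in the kernel's list format (for example `0-1,3`), as found in the cgroup v2 file `cpuset.mems.effective`.

```python
from __future__ import annotations


def parse_nodelist(text: str) -> set[int]:
    """Parse a kernel-style list such as "0-1,3" into a set of node IDs."""
    nodes: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty string means no nodes
        lo, _, hi = part.partition("-")
        # A single ID like "3" has no upper bound, so reuse the lower one.
        nodes.update(range(int(lo), int(hi or lo) + 1))
    return nodes


def demotion_allowed(target_node: int, effective_mems: str) -> bool:
    """Toy model (hypothetical helper): permit demotion only to a node
    that appears in the cgroup's effective memory node list."""
    return target_node in parse_nodelist(effective_mems)
```

For instance, with an effective node list of `0-1,3`, this model rejects node 2 as a demotion target and accepts node 3. It does not capture the exceptions noted above, such as memory first instantiated in another cgroup.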
What:		/sys/kernel/mm/numa/
Date:		June 2021
Contact:	Linux memory management mailing list <linux-mm@kvack.org>
Description:	Interface for NUMA

What:		/sys/kernel/mm/numa/demotion_enabled
Date:		June 2021
Contact:	Linux memory management mailing list <linux-mm@kvack.org>
Description:	Enable/disable demoting pages during reclaim

		Page migration during reclaim is intended for systems
		with tiered memory configurations. Unlike plain NUMA
		systems, where the same kind of memory is found at
		varied distances, these systems have multiple types of
		memory with varied performance characteristics.
		Allowing page migration during reclaim enables these
		systems to migrate pages from fast tiers to slow tiers
		when the fast tier is under pressure. This migration
		is performed before swap if an eligible NUMA node is
		present in cpuset.mems for the cgroup (or if cpuset v1
		is being used). If cpuset.mems changes at runtime,
		demotion may move data to a NUMA node that is not in
		the new cpuset.mems, which might be construed as
		violating the guarantees of cpusets. Shared memory,
		such as libraries, owned by another cgroup may still be
		demoted and result in memory use on a node not present
		in cpuset.mems. Demotion should not be enabled on
		systems which need strict cpuset location guarantees.
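
A minimal sketch of how a userspace tool might query or toggle this knob follows. The helper names `read_demotion_enabled` and `set_demotion_enabled` are hypothetical, and the sketch assumes the file reads back a boolean-like token (such as `true`/`false` or `1`/`0`) and accepts `1` or `0` on write.

```python
from pathlib import Path

# Path taken from the "What:" entry above.
DEMOTION_PATH = Path("/sys/kernel/mm/numa/demotion_enabled")


def read_demotion_enabled(path: Path = DEMOTION_PATH) -> bool:
    """Return the current setting, accepting common boolean spellings."""
    value = path.read_text().strip().lower()
    if value in ("1", "true", "y", "yes", "on"):
        return True
    if value in ("0", "false", "n", "no", "off"):
        return False
    raise ValueError(f"unexpected value {value!r} in {path}")


def set_demotion_enabled(enabled: bool, path: Path = DEMOTION_PATH) -> None:
    """Write the setting; on a real system this needs root privileges."""
    path.write_text("1" if enabled else "0")
```

Taking the path as a parameter lets the helpers be exercised against a scratch file. On systems that need strict cpuset placement guarantees, such a tool would leave demotion disabled.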