mirror of
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2025-11-27 01:11:31 +00:00
After fixing the GPF in mem_cgroup_lru_del_list(), three times one machine running a similar load (moving and removing memcgs while swapping) has oopsed in mem_cgroup_zone_nr_lru_pages(), when retrieving memcg zone numbers for get_scan_count() for shrink_mem_cgroup_zone(): this is where a struct mem_cgroup is first accessed after being chosen by mem_cgroup_iter(). Just what protects a struct mem_cgroup from being freed, in between mem_cgroup_iter()'s css_get_next() and its css_tryget()? css_tryget() fails once css->refcnt is zero with CSS_REMOVED set in flags, yes: but what if that memory is freed and reused for something else, which sets "refcnt" non-zero? Hmm, and scope for an indefinite freeze if refcnt is left at zero but flags are cleared. It's tempting to move the css_tryget() into css_get_next(), to make it really "get" the css, but I don't think that actually solves anything: the same difficulty in moving from css_id found to stable css remains. But we already have rcu_read_lock() around the two, so it's easily fixed if __mem_cgroup_free() just uses kfree_rcu() to free mem_cgroup. However, a big struct mem_cgroup is allocated with vzalloc() instead of kzalloc(), and we're not allowed to vfree() at interrupt time: there doesn't appear to be a general vfree_rcu() to help with this, so roll our own using schedule_work(). The compiler decently removes vfree_work() and vfree_rcu() when the config doesn't need them. Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Tejun Heo <tj@kernel.org> Cc: Ying Han <yinghan@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
|---|---|---|
| .. | ||
| backing-dev.c | ||
| bootmem.c | ||
| bounce.c | ||
| cleancache.c | ||
| compaction.c | ||
| debug-pagealloc.c | ||
| dmapool.c | ||
| fadvise.c | ||
| failslab.c | ||
| filemap.c | ||
| filemap_xip.c | ||
| fremap.c | ||
| highmem.c | ||
| huge_memory.c | ||
| hugetlb.c | ||
| hwpoison-inject.c | ||
| init-mm.c | ||
| internal.h | ||
| Kconfig | ||
| Kconfig.debug | ||
| kmemcheck.c | ||
| kmemleak-test.c | ||
| kmemleak.c | ||
| ksm.c | ||
| maccess.c | ||
| madvise.c | ||
| Makefile | ||
| memblock.c | ||
| memcontrol.c | ||
| memory-failure.c | ||
| memory.c | ||
| memory_hotplug.c | ||
| mempolicy.c | ||
| mempool.c | ||
| migrate.c | ||
| mincore.c | ||
| mlock.c | ||
| mm_init.c | ||
| mmap.c | ||
| mmu_context.c | ||
| mmu_notifier.c | ||
| mmzone.c | ||
| mprotect.c | ||
| mremap.c | ||
| msync.c | ||
| nobootmem.c | ||
| nommu.c | ||
| oom_kill.c | ||
| page-writeback.c | ||
| page_alloc.c | ||
| page_cgroup.c | ||
| page_io.c | ||
| page_isolation.c | ||
| pagewalk.c | ||
| percpu-km.c | ||
| percpu-vm.c | ||
| percpu.c | ||
| pgtable-generic.c | ||
| prio_tree.c | ||
| process_vm_access.c | ||
| quicklist.c | ||
| readahead.c | ||
| rmap.c | ||
| shmem.c | ||
| slab.c | ||
| slob.c | ||
| slub.c | ||
| sparse-vmemmap.c | ||
| sparse.c | ||
| swap.c | ||
| swap_state.c | ||
| swapfile.c | ||
| thrash.c | ||
| truncate.c | ||
| util.c | ||
| vmalloc.c | ||
| vmscan.c | ||
| vmstat.c | ||