linux/kernel
Peter Zijlstra f5d39b0208 freezer,sched: Rewrite core freezer logic
Rewrite the core freezer to behave better wrt thawing and be simpler
in general.

By replacing PF_FROZEN with TASK_FROZEN, a special block state, it is
ensured frozen tasks stay frozen until thawed and don't randomly wake
up early, as is currently possible.

As such, it does away with PF_FROZEN and PF_FREEZER_SKIP, freeing up
two PF_flags (yay!).

Specifically; the current scheme works a little like:

	freezer_do_not_count();
	schedule();
	freezer_count();

And either the task is blocked, or it lands in try_to_freezer()
through freezer_count(). Now, when it is blocked, the freezer
considers it frozen and continues.

However, on thawing, once pm_freezing is cleared, freezer_count()
stops working, and any random/spurious wakeup will let a task run
before its time.

That is, thawing tries to thaw things in explicit order; kernel
threads and workqueues before doing bringing SMP back before userspace
etc.. However due to the above mentioned races it is entirely possible
for userspace tasks to thaw (by accident) before SMP is back.

This can be a fatal problem in asymmetric ISA architectures (eg ARMv9)
where the userspace task requires a special CPU to run.

As said; replace this with a special task state TASK_FROZEN and add
the following state transitions:

	TASK_FREEZABLE	-> TASK_FROZEN
	__TASK_STOPPED	-> TASK_FROZEN
	__TASK_TRACED	-> TASK_FROZEN

The new TASK_FREEZABLE can be set on any state part of TASK_NORMAL
(IOW. TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE) -- any such state
is already required to deal with spurious wakeups and the freezer
causes one such when thawing the task (since the original state is
lost).

The special __TASK_{STOPPED,TRACED} states *can* be restored since
their canonical state is in ->jobctl.

With this, frozen tasks need an explicit TASK_FROZEN wakeup and are
free of undue (early / spurious) wakeups.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20220822114649.055452969@infradead.org
2022-09-07 21:53:50 +02:00
..
bpf Networking changes for 6.0. 2022-08-03 16:29:08 -07:00
cgroup freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
configs Char / Misc driver changes for 6.0-rc1 2022-08-04 11:05:48 -07:00
debug Modules updates for v5.19-rc1 2022-05-26 17:13:43 -07:00
dma remoteproc updates for v5.20 2022-08-08 15:16:29 -07:00
entry context_tracking: Take NMI eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
events Misc fixes to kprobes and the faddr2line script, plus a cleanup. 2022-08-06 17:28:12 -07:00
futex freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
gcov
irq irqchip/genirq updates for 5.20: 2022-07-28 12:36:35 +02:00
kcsan kcsan: test: Add a .kunitconfig to run KCSAN tests 2022-07-22 09:22:59 -06:00
livepatch Livepatching changes for 5.19 2022-06-02 08:55:01 -07:00
locking RCU pull request for v5.20 (or whatever) 2022-08-02 19:12:45 -07:00
module Modules updates for 6.0 2022-08-08 14:12:19 -07:00
power freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
printk printk: do not wait for consoles when suspended 2022-07-15 10:52:11 +02:00
rcu - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
sched freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
time freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
trace Tracing updates for 5.20 / 6.0 2022-08-05 09:41:12 -07:00
.gitignore
acct.c
async.c
audit.c audit: make is_audit_feature_set() static 2022-06-13 14:08:57 -04:00
audit.h
audit_fsnotify.c
audit_tree.c
audit_watch.c
auditfilter.c
auditsc.c audit: free module name 2022-06-15 19:28:44 -04:00
backtracetest.c
bounds.c
capability.c
cfi.c context_tracking: Take IRQ eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
compat.c
configs.c
context_tracking.c MAINTAINERS: Add Paul as context tracking maintainer 2022-07-05 13:33:00 -07:00
cpu.c
cpu_pm.c context_tracking: Take IRQ eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
crash_core.c kdump: round up the total memory size to 128M for crashkernel reservation 2022-07-17 17:31:40 -07:00
crash_dump.c
cred.c
delayacct.c delayacct: track delays from write-protect copy 2022-06-01 15:55:25 -07:00
dma.c
exec_domain.c
exit.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
extable.c context_tracking: Take NMI eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
fail_function.c
fork.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
freezer.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
gen_kheaders.sh
groups.c security: Add LSM hook to setgroups() syscall 2022-07-15 18:21:49 +00:00
hung_task.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
iomem.c
irq_work.c
jump_label.c jump_label: make initial NOP patching the special case 2022-06-24 09:48:55 +02:00
kallsyms.c Updates to various subsystems which I help look after. lib, ocfs2, 2022-08-07 10:03:24 -07:00
kallsyms_internal.h kallsyms: move declarations to internal header 2022-07-17 17:31:39 -07:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kcov: update pos before writing pc in trace function 2022-05-25 13:05:42 -07:00
kexec.c
kexec_core.c kexec: drop weak attribute from functions 2022-07-15 12:21:16 -04:00
kexec_elf.c
kexec_file.c Updates to various subsystems which I help look after. lib, ocfs2, 2022-08-07 10:03:24 -07:00
kexec_internal.h
kheaders.c
kmod.c
kprobes.c kprobes: Forbid probing on trampoline and BPF code areas 2022-08-02 11:47:29 +02:00
ksysfs.c
kthread.c kthread: make it clear that kthread_create_on_node() might be terminated by any fatal signal 2022-06-16 19:11:30 -07:00
latencytop.c
Makefile kernel: remove platform_has() infrastructure 2022-08-01 07:42:56 +02:00
module_signature.c
notifier.c
nsproxy.c fs/exec: allow to unshare a time namespace on vfork+exec 2022-06-15 07:58:04 -07:00
padata.c
panic.c linux-kselftest-kunit-5.20-rc1 2022-08-02 19:34:45 -07:00
params.c
pid.c
pid_namespace.c
profile.c profile: setup_profiling_timer() is moslty not implemented 2022-07-29 18:12:36 -07:00
ptrace.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
range.c
reboot.c Merge branch 'rework/kthreads' into for-linus 2022-06-23 19:11:28 +02:00
regset.c
relay.c
resource.c resource: Introduce alloc_free_mem_region() 2022-07-21 17:19:25 -07:00
resource_kunit.c
rseq.c rseq: Kill process when unknown flags are encountered in ABI structures 2022-08-01 15:21:42 +02:00
scftorture.c
scs.c
seccomp.c
signal.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
smp.c locking/csd_lock: Change csdlock_debug from early_param to __setup 2022-07-19 11:40:00 -07:00
smpboot.c
smpboot.h
softirq.c context_tracking: Take IRQ eqs entrypoints over RCU 2022-07-05 13:32:59 -07:00
stackleak.c
stacktrace.c
static_call.c
static_call_inline.c
stop_machine.c
sys.c
sys_ni.c
sysctl-test.c
sysctl.c kernel/sysctl.c: Remove trailing white space 2022-08-08 09:01:36 -07:00
task_work.c
taskstats.c
torture.c
tracepoint.c
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c freezer,sched: Rewrite core freezer logic 2022-09-07 21:53:50 +02:00
up.c
user-return-notifier.c
user.c
user_namespace.c
usermode_driver.c
utsname.c
utsname_sysctl.c
watch_queue.c This was a moderately busy cycle for documentation, but nothing all that 2022-08-02 19:24:24 -07:00
watchdog.c powerpc updates for 6.0 2022-08-06 16:38:17 -07:00
watchdog_hld.c Revert "printk: add functions to prefer direct printing" 2022-06-23 18:41:40 +02:00
workqueue.c drm for 5.20/6.0 2022-08-03 19:52:08 -07:00
workqueue_internal.h