linux/kernel
Sameer Nanda 45226e944c NMI watchdog: fix for lockup detector breakage on resume
On the suspend/resume path the boot CPU does not go though an
offline->online transition.  This breaks the NMI detector post-resume
since it depends on PMU state that is lost when the system gets
suspended.

Fix this by forcing a CPU offline->online transition for the lockup
detector on the boot CPU during resume.

To provide more context, we enable NMI watchdog on Chrome OS.  We have
seen several reports of systems freezing up completely which indicated
that the NMI watchdog was not firing for some reason.

Debugging further, we found a simple way of repro'ing system freezes --
issuing the command 'tasket 1 sh -c "echo nmilockup > /proc/breakme"'
after the system has been suspended/resumed one or more times.

With this patch in place, the system freeze result in panics, as
expected.

These panics provide a nice stack trace for us to debug the actual issue
causing the freeze.

[akpm@linux-foundation.org: fiddle with code comment]
[akpm@linux-foundation.org: make lockup_detector_bootcpu_resume() conditional on CONFIG_SUSPEND]
[akpm@linux-foundation.org: fix section errors]
Signed-off-by: Sameer Nanda <snanda@chromium.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-07-30 17:25:13 -07:00
..
debug kdb: Switch to nolock variants of kmsg_dump functions 2012-07-21 10:34:00 -07:00
events perf: Introduce perf_pmu_migrate_context() 2012-06-18 12:13:21 +02:00
gcov
irq Devicetree updates for 3.6 2012-07-24 14:07:22 -07:00
power NMI watchdog: fix for lockup detector breakage on resume 2012-07-30 17:25:13 -07:00
sched sched: Fix race in task_group() 2012-07-24 13:58:20 +02:00
time Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-07-22 11:35:46 -07:00
trace Staging tree patches for 3.6-rc1 2012-07-26 11:14:49 -07:00
.gitignore
acct.c
async.c [SCSI] async: make async_synchronize_full() flush all work regardless of domain 2012-07-20 09:07:37 +01:00
audit.c netlink: add netlink_kernel_cfg parameter to netlink_kernel_create 2012-06-29 16:46:02 -07:00
audit.h
audit_tree.c VFS: Make clone_mnt()/copy_tree()/collect_mounts() return errors 2012-07-14 16:37:27 +04:00
audit_watch.c get rid of kern_path_parent() 2012-07-14 16:35:02 +04:00
auditfilter.c
auditsc.c seccomp: remove duplicated failure logging 2012-04-14 11:13:20 +10:00
backtracetest.c
bounds.c
capability.c userns: Teach inode_capable to understand inodes whose uids map to other namespaces. 2012-05-15 14:59:24 -07:00
cgroup.c Merge branch 'for-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2012-07-24 17:47:44 -07:00
cgroup_freezer.c
compat.c new helper: sigsuspend() 2012-05-21 23:52:30 -04:00
configs.c
cpu.c kernel/cpu.c: document clear_tasks_mm_cpumask() 2012-05-31 17:49:30 -07:00
cpu_pm.c kernel/cpu_pm.c: fix various typos 2012-05-31 17:49:27 -07:00
cpuset.c cpusets: Remove/update outdated comments 2012-07-24 13:53:28 +02:00
crash_dump.c
cred.c keys: kill task_struct->replacement_session_keyring 2012-05-23 22:11:41 -04:00
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c posix_types.h: Cleanup stale __NFDBITS and related definitions 2012-07-26 13:36:43 -07:00
extable.c extable: Skip sorting if sorted at build time. 2012-04-19 15:06:55 -07:00
fork.c Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-07-23 12:27:27 -07:00
freezer.c
futex.c
futex_compat.c
groups.c userns: Convert in_group_p and in_egroup_p to use kgid_t 2012-05-03 03:29:33 -07:00
hrtimer.c hrtimer: Update hrtimer base offsets each hrtimer_interrupt 2012-07-11 23:34:39 +02:00
hung_task.c hung task debugging: Inject NMI when hung and going to panic 2012-04-25 12:39:25 +02:00
irq_work.c irq_work: fix compile failure on tile from missing include 2012-04-13 13:15:16 -04:00
itimer.c itimer: Use printk_once instead of WARN_ONCE 2012-04-10 11:00:30 +02:00
jump_label.c
kallsyms.c vsprintf: fix %ps on non symbols when using kallsyms 2012-05-29 16:22:32 -07:00
kcmp.c syscalls, x86: add __NR_kcmp syscall 2012-05-31 17:49:32 -07:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec.c
kfifo.c [media] kernel:kfifo: export __kfifo_max_r symbol 2012-04-11 18:24:37 -03:00
kmod.c kmod.c: fix kernel-doc warning 2012-05-31 17:49:28 -07:00
kprobes.c
ksysfs.c
kthread.c kthread_worker: reimplement flush_kthread_work() to allow freeing the work item being executed 2012-07-22 10:15:28 -07:00
latencytop.c
lglock.c brlocks/lglocks: turn into functions 2012-05-29 23:28:41 -04:00
lockdep.c
lockdep_internals.h
lockdep_proc.c
lockdep_states.h
Makefile Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-06-01 10:34:35 -07:00
module.c Guard check in module loader against integer overflow 2012-05-23 22:28:53 +09:30
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
nsproxy.c
padata.c
panic.c panic: fix a possible deadlock in panic() 2012-07-30 17:25:13 -07:00
params.c params: replace printk(KERN_<LVL>...) with pr_<lvl>(...) 2012-05-04 17:28:18 -07:00
pid.c mm: add a low limit to alloc_large_system_hash 2012-05-24 00:28:21 -04:00
pid_namespace.c pidns: guarantee that the pidns init will be the last pidns process reaped 2012-06-20 14:39:36 -07:00
posix-cpu-timers.c
posix-timers.c
printk.c Driver core merge for 3.6-rc1 2012-07-26 11:25:33 -07:00
profile.c
ptrace.c userns: Convert ptrace, kill, set_priority permission checks to work with kuids and kgids 2012-05-03 03:28:51 -07:00
range.c
rcu.h
rcupdate.c rcu: Consolidate tree/tiny __rcu_read_{,un}lock() implementations 2012-07-02 12:34:23 -07:00
rcutiny.c rcu: Fix rcu_is_cpu_idle() #ifdef in TINY_RCU 2012-07-02 12:34:25 -07:00
rcutiny_plugin.h rcu: Fix code-style issues involving "else" 2012-07-06 06:01:48 -07:00
rcutorture.c rcu: Fix broken strings in RCU's source code. 2012-07-06 06:01:49 -07:00
rcutree.c rcu: Fix code-style issues involving "else" 2012-07-06 06:01:48 -07:00
rcutree.h Merge branches 'bigrtm.2012.07.04a', 'doctorture.2012.07.02a', 'fixes.2012.07.06a' and 'fnh.2012.07.02a' into HEAD 2012-07-06 05:59:30 -07:00
rcutree_plugin.h rcu: Fix code-style issues involving "else" 2012-07-06 06:01:48 -07:00
rcutree_trace.c rcu: Fix broken strings in RCU's source code. 2012-07-06 06:01:49 -07:00
relay.c splice: fix racy pipe->buffers uses 2012-06-13 21:16:42 +02:00
res_counter.c rescounters: add res_counter_uncharge_until() 2012-05-29 16:22:27 -07:00
resource.c resources: allow adjust_resource() for resources with no parent 2012-06-13 15:42:22 -06:00
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rtmutex_common.h
rwsem.c
seccomp.c seccomp: fix build warnings when there is no CONFIG_SECCOMP_FILTER 2012-04-18 12:24:52 +10:00
semaphore.c
signal.c signal: make sure we don't get stopped with pending task_work 2012-07-22 23:57:54 +04:00
smp.c smp: Remove ipi_call_lock[_irq]()/ipi_call_unlock[_irq]() 2012-06-05 17:27:14 +02:00
smpboot.c smpboot, idle: Fix comment mismatch over idle_threads_init() 2012-05-24 22:58:08 +02:00
smpboot.h smpboot: Remove leftover declaration 2012-06-11 15:07:52 +02:00
softirq.c
spinlock.c
srcu.c rcu: Implement per-domain single-threaded call_srcu() state machine 2012-04-30 10:48:25 -07:00
stacktrace.c
stop_machine.c
sys.c prctl: remove redunant assignment of "error" to zero 2012-07-30 17:25:11 -07:00
sys_ni.c syscalls, x86: add __NR_kcmp syscall 2012-05-31 17:49:32 -07:00
sysctl.c coredump: warn about unsafe suid_dumpable / core_pattern combo 2012-07-30 17:25:11 -07:00
sysctl_binary.c ipv4: Don't add deprecated new binary sysctl value. 2012-06-22 23:02:22 -07:00
task_work.c deal with task_work callbacks adding more work 2012-07-22 23:57:57 +04:00
taskstats.c
test_kprobes.c
time.c
timeconst.pl
timer.c timers: Improve get_next_timer_interrupt() 2012-06-06 13:49:02 +02:00
tracepoint.c
tsacct.c
uid16.c userns: Convert setting and getting uid and gid system calls to use kuid and kgid 2012-05-03 03:28:41 -07:00
up.c
user-return-notifier.c
user.c userns: Silence silly gcc warning. 2012-05-19 15:44:40 -06:00
user_namespace.c userns: Store uid and gid values in struct cred with kuid_t and kgid_t types 2012-05-03 03:28:38 -07:00
utsname.c userns: Use cred->user_ns instead of cred->user->user_ns 2012-04-07 16:55:51 -07:00
utsname_sysctl.c
wait.c
watchdog.c NMI watchdog: fix for lockup detector breakage on resume 2012-07-30 17:25:13 -07:00
workqueue.c workqueue: fix spurious CPU locality WARN from process_one_work() 2012-07-22 10:16:34 -07:00
workqueue_sched.h