linux/arch/x86/kernel/cpu/mcheck
Youquan Song e503f9e4b0 x86, apic: Fix spurious error interrupts triggering on all non-boot APs
This patch fixes a bug reported by a customer, who found
that many spurious error interrupts were reported on all
non-boot CPUs (APs) during the system boot stage.

According to Chapter 10 of the Intel Software Developer's Manual
Volume 3A, the local APIC may signal an illegal vector error when
an LVT entry is set to an illegal vector value (0~15) under
FIXED delivery mode (bits 8-10 are 0), regardless of whether
the mask bit is set or an interrupt actually happens. These
errors are seen as error interrupts.

The initial value of the thermal LVT entry on all APs always reads
0x10000 because APs are woken up by the BSP issuing an INIT-SIPI-SIPI
sequence to them, and when an AP receives the INIT IPI its LVT
registers are reset to 0s except for the mask bits, which are set to 1s.

When the BIOS takes over the thermal throttling interrupt,
the LVT thermal delivery mode should be SMI, and the kernel is
required to keep each AP's LVT thermal monitor register
programmed the same way.

This issue happens when the BIOS does not take over the thermal
throttling interrupt: each AP's LVT thermal monitor register is restored
to 0x10000, which means vector 0 and fixed delivery mode, so all APs
signal illegal vector error interrupts.
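
To make the 0x10000 case concrete, here is a minimal user-space sketch that
decodes an LVT value into its fields. It assumes the usual LVT layout
(vector in bits 0-7, delivery mode in bits 8-10, mask in bit 16). The macro
and function names are hypothetical and are not the kernel's own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed LVT entry layout; names are illustrative, not kernel symbols. */
#define LVT_VECTOR_MASK 0x000000FFu /* bits 0-7: interrupt vector */
#define LVT_DM_MASK     0x00000700u /* bits 8-10: delivery mode   */
#define LVT_DM_FIXED    0x00000000u /* delivery mode 000b: fixed  */
#define LVT_DM_SMI      0x00000200u /* delivery mode 010b: SMI    */
#define LVT_MASKED      0x00010000u /* bit 16: interrupt masked   */

static inline uint32_t lvt_vector(uint32_t lvt)
{
	return lvt & LVT_VECTOR_MASK;
}

static inline uint32_t lvt_delivery_mode(uint32_t lvt)
{
	return lvt & LVT_DM_MASK;
}

static inline bool lvt_is_masked(uint32_t lvt)
{
	return (lvt & LVT_MASKED) != 0;
}

/*
 * True when the value combines fixed delivery mode with a vector in the
 * illegal range 0..15. The mask bit is deliberately ignored, because the
 * error can be signalled whether or not the entry is masked.
 */
static inline bool lvt_value_is_illegal(uint32_t lvt)
{
	return lvt_delivery_mode(lvt) == LVT_DM_FIXED && lvt_vector(lvt) < 16;
}
```

With these helpers, 0x10000 decodes to vector 0, fixed delivery mode, masked,
so lvt_value_is_illegal(0x10000) is true, while an SMI-mode value such as
0x10200 is not flagged.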

This patch checks that the interrupt delivery mode is not fixed mode before
restoring the AP's LVT thermal monitor register.
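
The guard described above can be sketched roughly as follows. This is a
user-space illustration, not the code in therm_throt.c: the register is
modelled by a plain variable, all names are hypothetical, and it assumes the
check is applied to the value saved from the BSP.

```c
#include <stdint.h>

/* Assumed delivery mode field of an LVT entry (bits 8-10). */
#define THERM_DM_MASK  0x00000700u
#define THERM_DM_FIXED 0x00000000u

/* Hypothetical stand-in for an AP's LVT thermal monitor register,
 * starting at its post-INIT value (masked, vector 0, fixed mode). */
static uint32_t fake_lvtthmr = 0x10000u;

static void fake_lvt_write(uint32_t value)
{
	fake_lvtthmr = value;
}

/*
 * Restore the saved value only when its delivery mode is not fixed,
 * for example when the BIOS programmed SMI delivery. Writing a fixed
 * mode value with vector 0 is what triggers the illegal vector error,
 * so that write is skipped. Returns 1 if the register was written,
 * 0 if the write was skipped.
 */
static int restore_thermal_lvt(uint32_t saved)
{
	if ((saved & THERM_DM_MASK) == THERM_DM_FIXED)
		return 0;

	fake_lvt_write(saved);
	return 1;
}
```

Calling restore_thermal_lvt(0x10000) leaves the register untouched, whereas
restore_thermal_lvt(0x200) (SMI delivery mode) performs the write.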

Signed-off-by: Youquan Song <youquan.song@intel.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Yong Wang <yong.y.wang@intel.com>
Cc: hpa@linux.intel.com
Cc: joe@perches.com
Cc: jbaron@redhat.com
Cc: trenn@suse.de
Cc: kent.liu@intel.com
Cc: chaohong.guo@intel.com
Cc: <stable@kernel.org> # As far back as possible
Link: http://lkml.kernel.org/r/1303402963-17738-1-git-send-email-youquan.song@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-16 13:48:25 +02:00
..
Makefile ACPI, APEI, Generic Hardware Error Source memory error support 2010-05-19 22:41:16 -04:00
mce-apei.c ACPI, APEI, Add ERST record ID cache 2011-03-21 22:59:06 -04:00
mce-inject.c x86: Fix common misspellings 2011-03-18 10:39:30 +01:00
mce-internal.h ACPI, APEI, Use ERST for persistent storage of MCE 2010-05-19 22:41:40 -04:00
mce-severity.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
mce.c rcu: create new rcu_access_index() and use in mce 2011-04-01 07:27:31 -07:00
mce_amd.c x86, mce, AMD: Fix leaving freed data in a list 2011-05-13 17:11:02 +02:00
mce_intel.c x86: Replace uses of current_cpu_data with this_cpu ops 2010-12-30 12:22:03 +01:00
p5.c x86, mce: make mce_disabled boolean 2009-06-16 16:56:07 -07:00
therm_throt.c x86, apic: Fix spurious error interrupts triggering on all non-boot APs 2011-05-16 13:48:25 +02:00
threshold.c x86, mce: enable MCE_INTEL for 32bit new MCE 2009-05-28 09:24:13 -07:00
winchip.c x86, mce: unify mce.h 2009-06-16 16:56:07 -07:00