linux/arch/x86/kernel/cpu
Yan, Zheng 3569c0d7c5 perf/x86/intel: Implement batched PEBS interrupt handling (large PEBS interrupt threshold)
PEBS always had the capability to log samples to its buffers without
an interrupt. Traditionally perf has not used this but always set the
PEBS threshold to one.
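
As a rough illustration of what "threshold" means here, the sketch below computes the buffer address at which the hardware would raise an interrupt, either after a single record or only when the buffer is nearly full. The struct, the function name and the reserved-headroom parameter are assumptions made for this example; they are not the kernel's actual data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a PEBS buffer (illustration only). */
struct pebs_buffer {
	uint64_t base;             /* address of the first record slot */
	uint64_t absolute_maximum; /* address just past the last record slot */
	size_t   record_size;      /* size of one PEBS record in bytes */
};

/*
 * Return the address at which an interrupt would be raised.
 *
 * batched == 0: interrupt after the first record ("threshold one").
 * batched != 0: interrupt only when the buffer is nearly full, leaving
 *               room for reserved_records more records (assumed headroom
 *               for records still being written when the PMI fires).
 */
static uint64_t pebs_threshold(const struct pebs_buffer *buf,
			       size_t reserved_records, int batched)
{
	if (!batched)
		return buf->base + buf->record_size;

	return buf->absolute_maximum - reserved_records * buf->record_size;
}
```

With a buffer of 100 records and a headroom of 8, the batched threshold lands after 92 records, so one PMI covers up to 92 samples instead of one.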

For frequently occurring events (like cycles, branches or load/store)
this in turn requires using a relatively high sampling period to avoid
overloading the system with PMI processing. This in turn increases
sampling error.

For the common cases we still need to use the PMI because the PEBS
hardware has various limitations. The biggest one is that it cannot
supply a callgraph. It also requires setting a fixed period, as the
hardware does not support adaptive periods. Another issue is that it
cannot supply a time stamp and some other options. To supply a TID it
requires flushing on context switch. It can however supply the IP, the
load/store address, TSX information, registers, and some other things.

So we can make PEBS work for some specific cases, basically as long as
you can do without a callgraph and can set the period you can use this
new PEBS mode.
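
The condition above can be sketched as a simple predicate: batching is possible only with a fixed period and only if every requested field is one that PEBS can log on its own. The flag names and the struct below are hypothetical and chosen for this example; they are not the kernel's PERF_SAMPLE_* constants or its actual check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sample-field flags (NOT the kernel's PERF_SAMPLE_* values). */
enum sample_field {
	SAMPLE_IP        = 1u << 0,
	SAMPLE_ADDR      = 1u << 1,
	SAMPLE_TID       = 1u << 2, /* needs a flush on context switch */
	SAMPLE_REGS      = 1u << 3,
	SAMPLE_TSX       = 1u << 4,
	SAMPLE_TIME      = 1u << 5, /* not available without a PMI */
	SAMPLE_CALLCHAIN = 1u << 6, /* not available without a PMI */
};

/* Fields the text above lists as available without a PMI per sample. */
#define BATCHABLE_FIELDS \
	(SAMPLE_IP | SAMPLE_ADDR | SAMPLE_TID | SAMPLE_REGS | SAMPLE_TSX)

/* Hypothetical description of what a user asked to sample. */
struct sample_request {
	uint32_t fields;       /* OR of enum sample_field values */
	bool     fixed_period; /* true for a fixed period (-c), false for adaptive */
};

/* True if the request could be served from batched PEBS records alone. */
static bool can_batch_pebs(const struct sample_request *req)
{
	if (!req->fixed_period)
		return false;

	/* Any field outside the batchable set forces a PMI per sample. */
	return (req->fields & ~(uint32_t)BATCHABLE_FIELDS) == 0;
}
```

For example, a request for IP and load/store address with a fixed period passes the check, while adding a callchain or a time stamp fails it.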

The main benefit is the ability to support much lower sampling periods
(down to -c 1000) without excessive overhead.

One use case is, for example, to increase the resolution of the c2c tool.
Another is double checking when you suspect the standard sampling has
too much sampling error.

Some numbers on the overhead, using cycle soak, comparing the elapsed
time from "kernbench -M -H" between plain (threshold set to one) and
multi (large threshold).

The test command for plain:
  "perf record --time -e cycles:p -c $period -- kernbench -M -H"

The test command for multi:
  "perf record --no-time -e cycles:p -c $period -- kernbench -M -H"

( The only difference between the multi and plain test commands is the
  time stamp option. Since time stamps are not supported with a large
  PEBS threshold, the option can be used as a flag to indicate whether
  the large threshold is enabled during the test. )

	period    plain(Sec)  multi(Sec)  Delta
	10003     32.7        16.5        16.2
	20003     30.2        16.2        14.0
	40003     18.6        14.1        4.5
	80003     16.8        14.6        2.2
	100003    16.9        14.1        2.8
	800003    15.4        15.7        -0.3
	1000003   15.3        15.2        0.2
	2000003   15.3        15.1        0.1

With periods below 100003, plain (threshold one) causes much more
overhead. With a sampling period of 10003, the elapsed time for multi
is about half that of plain.
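
To make the comparison concrete, the sketch below stores the rows of the table above and computes the plain/multi elapsed-time ratio for any row. The struct and function names are made up for this example; the numbers are copied from the table.

```c
#include <stddef.h>

/* One row of the overhead table: period and elapsed times in seconds. */
struct overhead_row {
	long   period;
	double plain_s; /* threshold set to one */
	double multi_s; /* large threshold */
};

/* Values copied from the table above. */
static const struct overhead_row rows[] = {
	{   10003, 32.7, 16.5 },
	{   20003, 30.2, 16.2 },
	{   40003, 18.6, 14.1 },
	{   80003, 16.8, 14.6 },
	{  100003, 16.9, 14.1 },
	{  800003, 15.4, 15.7 },
	{ 1000003, 15.3, 15.2 },
	{ 2000003, 15.3, 15.1 },
};

static const size_t nr_rows = sizeof(rows) / sizeof(rows[0]);

/* Ratio of plain to multi elapsed time; above 1.0 means multi is faster. */
static double plain_to_multi_ratio(const struct overhead_row *r)
{
	return r->plain_s / r->multi_s;
}
```

For the 10003 row the ratio is 32.7 / 16.5, a little under 2, while for periods of 800003 and above it stays close to 1.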

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: eranian@google.com
Link: http://lkml.kernel.org/r/1430940834-8964-5-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-07 16:08:49 +02:00
mcheck Merge branch 'x86-ras-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-13 13:33:20 -07:00
microcode x86/microcode/amd: Drop the pci_ids.h dependency 2015-03-31 09:54:32 +02:00
mtrr x86: mtrr: if: remove use of seq_printf return value 2015-04-15 16:35:24 -07:00
.gitignore
amd.c x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue 2015-04-26 17:57:38 -07:00
bugs.c
bugs_64.c
centaur.c
common.c Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-14 14:37:47 -07:00
cpu.h
cyrix.c
hypervisor.c hypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests 2015-05-05 18:27:43 +01:00
intel.c x86/cpu/intel: Fix trivial typo in intel_tlb_table[] 2015-02-22 08:55:58 +01:00
intel_cacheinfo.c x86/cpu/cacheinfo: Fix cache_get_priv_group() for Intel processors 2015-03-23 10:22:38 +01:00
intel_pt.h perf/x86/intel/pt: Add Intel PT PMU driver 2015-04-02 17:14:20 +02:00
Makefile perf/x86/intel/bts: Add BTS PMU driver 2015-04-02 17:14:21 +02:00
match.c
mkcapflags.sh x86/build: Fix mkcapflags.sh bash-ism 2015-02-19 02:21:00 +01:00
mshyperv.c x86, hyperv: Mark the Hyper-V clocksource as being continuous 2015-01-20 14:36:25 +01:00
perf_event.c perf/x86/intel: Use the PEBS auto reload mechanism when possible 2015-06-07 16:08:35 +02:00
perf_event.h perf/x86/intel: Implement batched PEBS interrupt handling (large PEBS interrupt threshold) 2015-06-07 16:08:49 +02:00
perf_event_amd.c perf/x86: Add 'index' param to get_event_constraint() callback 2015-04-02 17:33:10 +02:00
perf_event_amd_ibs.c perf/x86/amd/ibs: Convert force_ibs_eilvt_setup() to void 2015-02-18 17:01:46 +01:00
perf_event_amd_iommu.c cpumask: factor out show_cpumap into separate helper function 2014-11-07 11:45:00 -08:00
perf_event_amd_iommu.h
perf_event_amd_uncore.c cpumask: factor out show_cpumap into separate helper function 2014-11-07 11:45:00 -08:00
perf_event_intel.c perf/x86/intel: Implement batched PEBS interrupt handling (large PEBS interrupt threshold) 2015-06-07 16:08:49 +02:00
perf_event_intel_bts.c perf/x86/intel/pt: Fix the 32-bit build 2015-04-02 17:58:45 +02:00
perf_event_intel_cqm.c perf/x86/intel/cqm: Use 'u32' data type for RMIDs 2015-05-27 09:17:41 +02:00
perf_event_intel_ds.c perf/x86/intel: Implement batched PEBS interrupt handling (large PEBS interrupt threshold) 2015-06-07 16:08:49 +02:00
perf_event_intel_lbr.c perf/x86/intel: add support for PERF_SAMPLE_BRANCH_IND_JUMP 2015-06-07 16:08:27 +02:00
perf_event_intel_pt.c perf/x86/intel/pt: Remove redundant variable declaration 2015-05-27 09:17:48 +02:00
perf_event_intel_rapl.c perf/x86/rapl: Enable Broadwell-U RAPL support 2015-05-11 11:52:30 +02:00
perf_event_intel_uncore.c Merge branch 'perf/urgent' into perf/core, before applying dependent patches 2015-05-27 09:17:21 +02:00
perf_event_intel_uncore.h Merge branch 'perf/urgent' into perf/core, before applying dependent patches 2015-05-27 09:17:21 +02:00
perf_event_intel_uncore_nhmex.c
perf_event_intel_uncore_snb.c perf/x86/intel/uncore: Add Broadwell-U uncore IMC PMU support 2015-05-11 11:57:47 +02:00
perf_event_intel_uncore_snbep.c perf/x86/intel/uncore: Delete an unnecessary check before pci_dev_put() call 2015-02-18 17:01:42 +01:00
perf_event_knc.c
perf_event_p4.c
perf_event_p6.c
perfctr-watchdog.c
powerflags.c
proc.c x86: Replace seq_printf() with seq_puts() 2014-12-08 11:48:15 +01:00
rdrand.c
scattered.c x86: Add Intel Processor Trace (INTEL_PT) cpu feature detection 2015-04-02 17:14:18 +02:00
topology.c
transmeta.c
umc.c
vmware.c