Blame - tools/perf/Documentation/perf-stat.txt - kernel/msm-4.19

Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	1	perf-stat(1)
Ingo Molnar	6e6b754	2008-04-15 22:39:31 +0200	[diff] [blame]	2	============
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	3
				4	NAME
				5	----
				6	perf-stat - Run a command and gather performance counter statistics
				7
				8	SYNOPSIS
				9	--------
				10	[verse]
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	11	'perf stat' [-e <EVENT> \| --event=EVENT] [-a] <command>
				12	'perf stat' [-e <EVENT> \| --event=EVENT] [-a] -- <command> [<options>]
Jiri Olsa	4979d0c	2015-11-05 15:40:46 +0100	[diff] [blame]	13	'perf stat' [-e <EVENT> \| --event=EVENT] [-a] record [-o file] -- <command> [<options>]
Jiri Olsa	ba6039b6	2015-11-05 15:40:55 +0100	[diff] [blame]	14	'perf stat' report [-i file]
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	15
				16	DESCRIPTION
				17	-----------
				18	This command runs a command and gathers performance counter statistics
				19	from it.
				20
				21
				22	OPTIONS
				23	-------
				24	<command>...::
				25	Any command you can specify in a shell.
				26
Jiri Olsa	4979d0c	2015-11-05 15:40:46 +0100	[diff] [blame]	27	record::
				28	See STAT RECORD.
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	29
Jiri Olsa	ba6039b6	2015-11-05 15:40:55 +0100	[diff] [blame]	30	report::
				31	See STAT REPORT.
				32
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	33	-e::
				34	--event=::
Cody P Schafer	f9ab9c1	2015-01-07 17:13:53 -0800	[diff] [blame]	35	Select the PMU event. Selection can be:
				36
				37	- a symbolic event name (use 'perf list' to list all events)
				38
				39	- a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
				40	hexadecimal event descriptor.
				41
				42	- a symbolically formed event like 'pmu/param1=0x3,param2/' where
				43	param1 and param2 are defined as formats for the PMU in
				44	/sys/bus/event_source/devices/<pmu>/format/*
				45
				46	- a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
				47	where M, N, K are numbers (in decimal, hex, octal format).
				48	Acceptable values for each of 'config', 'config1' and 'config2'
				49	parameters are defined by corresponding entries in
				50	/sys/bus/event_source/devices/<pmu>/format/*
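
The event forms listed above can be assembled programmatically. The following is a minimal sketch, not part of perf itself: the helper names raw_event and pmu_event are hypothetical, and the raw encoding assumes the common x86 layout in which the unit mask occupies bits 8-15 and the event select bits 0-7. Other architectures use different layouts, so check the PMU's format directory before relying on it.

```python
def raw_event(eventsel: int, umask: int = 0) -> str:
    """Build an rNNN raw event string.

    Assumption: x86-style layout, descriptor = (umask << 8) | eventsel.
    """
    if not 0 <= eventsel <= 0xFF or not 0 <= umask <= 0xFF:
        raise ValueError("eventsel and umask must each fit in one byte")
    return f"r{(umask << 8) | eventsel:x}"


def pmu_event(pmu: str, **params: int) -> str:
    """Build a 'pmu/key=value,.../' event string with hexadecimal values.

    Only key=value terms are handled; bare terms such as 'param2' in
    'pmu/param1=0x3,param2/' are outside this sketch.
    """
    terms = ",".join(f"{key}={value:#x}" for key, value in params.items())
    return f"{pmu}/{terms}/"


if __name__ == "__main__":
    print(raw_event(0x2E, 0x41))
    print(pmu_event("pmu", config=0x3, config1=16))
```

The resulting strings are meant to be passed to -e; which parameter names a given PMU accepts is defined by its format entries, not by this helper.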
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	51
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	52	-i::
Stephane Eranian	2e6cdf9	2010-05-12 10:40:01 +0200	[diff] [blame]	53	--no-inherit::
				54	child tasks do not inherit counters
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	55	-p::
				56	--pid=<pid>::
David Ahern	b52956c	2012-02-08 09:32:52 -0700	[diff] [blame]	57	stat events on existing process id (comma separated list)
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	58
				59	-t::
				60	--tid=<tid>::
David Ahern	b52956c	2012-02-08 09:32:52 -0700	[diff] [blame]	61	stat events on existing thread id (comma separated list)
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	62
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	63
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	64	-a::
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	65	--all-cpus::
				66	system-wide collection from all CPUs
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	67
Brice Goglin	b26bc5a	2009-08-07 10:18:39 +0200	[diff] [blame]	68	-c::
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	69	--scale::
				70	scale/normalize counter values
				71
Borislav Petkov	f594bae	2016-03-07 16:44:44 -0300	[diff] [blame]	72	-d::
				73	--detailed::
				74	print more detailed statistics, can be specified up to 3 times
				75
				76	-d: detailed events, L1 and LLC data cache
				77	-d -d: more detailed events, dTLB and iTLB events
				78	-d -d -d: very detailed events, adding prefetch events
				79
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	80	-r::
				81	--repeat=<n>::
Frederik Deweerdt	a7e191c	2013-03-01 13:02:27 -0500	[diff] [blame]	82	repeat command and print average + stddev (max: 100). 0 means forever.
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	83
Stephane Eranian	5af52b5	2010-05-18 15:00:01 +0200	[diff] [blame]	84	-B::
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	85	--big-num::
Stephane Eranian	5af52b5	2010-05-18 15:00:01 +0200	[diff] [blame]	86	print large numbers with thousands' separators according to locale
				87
Stephane Eranian	c45c6ea	2010-05-28 12:00:01 +0200	[diff] [blame]	88	-C::
				89	--cpu=::
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	90	Count only on the list of CPUs provided. Multiple CPUs can be provided as a
				91	comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
Stephane Eranian	c45c6ea	2010-05-28 12:00:01 +0200	[diff] [blame]	92	In per-thread mode, this option is ignored. The -a option is still necessary
				93	to activate system-wide monitoring. Default is to count on all CPUs.
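
The CPU list syntax described above (comma-separated entries, ranges written with -) is easy to validate before invoking perf. This is a minimal sketch; expand_cpu_list is a hypothetical helper name, and it implements only the two forms this page documents.

```python
from typing import List


def expand_cpu_list(spec: str) -> List[int]:
    """Expand a -C style CPU list such as '0,1' or '0-2' into sorted CPU ids."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            # A range such as 0-2 covers both endpoints.
            first_text, last_text = part.split("-")
            first, last = int(first_text), int(last_text)
            if first > last:
                raise ValueError(f"descending range: {part!r}")
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


if __name__ == "__main__":
    print(expand_cpu_list("0,2-3"))
```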
				94
Stephane Eranian	f5b4a9c3	2010-11-16 11:05:01 +0200	[diff] [blame]	95	-A::
				96	--no-aggr::
				97	Do not aggregate counts across all monitored CPUs in system-wide mode (-a).
				98	This option is only valid in system-wide mode.
				99
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	100	-n::
				101	--null::
				102	null run - don't start any counters
				103
				104	-v::
				105	--verbose::
				106	be more verbose (show counter open errors, etc)
				107
Stephane Eranian	d7470b6	2010-12-01 18:49:05 +0200	[diff] [blame]	108	-x SEP::
				109	--field-separator SEP::
				110	print counts using a CSV-style output to make it easy to import directly into
				111	spreadsheets. Columns are separated by the string specified in SEP.
				112
Stephane Eranian	023695d	2011-02-14 11:20:01 +0200	[diff] [blame]	113	-G name::
				114	--cgroup name::
				115	monitor only in the container (cgroup) called "name". This option is available only
				116	in per-cpu mode. The cgroup filesystem must be mounted. All threads belonging to
				117	container "name" are monitored when they run on the monitored CPUs. Multiple cgroups
				118	can be provided. Each cgroup is applied to the corresponding event, i.e., first cgroup
				119	to first event, second cgroup to second event and so on. It is possible to provide
				120	an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have
				121	corresponding events, i.e., they always refer to events defined earlier on the command
				122	line.
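
The positional pairing of cgroups and events can be illustrated with a short sketch. The helper name pair_cgroups and the event names are illustrative only. The sketch deliberately handles just the one-to-one case described above and rejects any other count, because this page does not say how perf treats a cgroup list that is shorter than the event list.

```python
from typing import List, Optional, Tuple


def pair_cgroups(events: List[str], cgroup_arg: str) -> List[Tuple[str, Optional[str]]]:
    """Pair each event with the cgroup at the same position in a -G argument.

    An empty entry (as in 'foo,,bar') means the event is not restricted
    to a cgroup and is returned as None.
    """
    cgroups = cgroup_arg.split(",")
    if len(cgroups) != len(events):
        # Assumption of this sketch: only the one-to-one form is accepted.
        raise ValueError("need exactly one cgroup entry per event")
    return [(event, cgroup or None) for event, cgroup in zip(events, cgroups)]


if __name__ == "__main__":
    for event, cgroup in pair_cgroups(["cycles", "instructions", "cache-misses"], "foo,,bar"):
        print(event, cgroup)
```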
				123
Stephane Eranian	4aa9015	2011-08-15 22:22:33 +0200	[diff] [blame]	124	-o file::
Jim Cromie	56f3bae	2011-09-07 17:14:00 -0600	[diff] [blame]	125	--output file::
Stephane Eranian	4aa9015	2011-08-15 22:22:33 +0200	[diff] [blame]	126	Print the output into the designated file.
				127
				128	--append::
				129	Append to the output file designated with the -o option. Ignored if -o is not specified.
				130
Jim Cromie	56f3bae	2011-09-07 17:14:00 -0600	[diff] [blame]	131	--log-fd::
				132
				133	Log output to fd, instead of stderr. Complementary to --output, and mutually exclusive
				134	with it. --append may be used here. Examples:
				135	3>results perf stat --log-fd 3 -- $cmd
				136	3>>results perf stat --log-fd 3 --append -- $cmd
				137
Peter Zijlstra	1f16c57	2012-10-23 13:40:14 +0200	[diff] [blame]	138	--pre::
				139	--post::
				140	Pre and post measurement hooks, e.g.:
				141
				142	perf stat --repeat 10 --null --sync --pre 'make -s O=defconfig-build/clean' -- make -s -j64 O=defconfig-build/ bzImage
Jim Cromie	56f3bae	2011-09-07 17:14:00 -0600	[diff] [blame]	143
Stephane Eranian	13370a9	2013-01-29 12:47:44 +0100	[diff] [blame]	144	-I msecs::
				145	--interval-print msecs::
Kan Liang	19afd10	2015-10-02 05:04:34 -0400	[diff] [blame]	146	Print count deltas every msecs milliseconds (minimum: 10ms).
				147	Overhead can be high with small intervals, for instance below 100ms. Use with caution.
				148	Example: 'perf stat -I 1000 -e cycles -a sleep 5'
Jim Cromie	56f3bae	2011-09-07 17:14:00 -0600	[diff] [blame]	149
Andi Kleen	54b5091	2016-03-03 15:57:36 -0800	[diff] [blame]	150	--metric-only::
				151	Only print computed metrics. Print them in a single line.
Andi Kleen	206cab6	2016-03-03 15:57:37 -0800	[diff] [blame]	152	Don't show any raw values. Not supported with --per-thread.
Andi Kleen	54b5091	2016-03-03 15:57:36 -0800	[diff] [blame]	153
Stephane Eranian	d430495	2013-02-14 13:57:28 +0100	[diff] [blame]	154	--per-socket::
Stephane Eranian	d7e7a45	2013-02-06 15:46:02 +0100	[diff] [blame]	155	Aggregate counts per processor socket for system-wide mode measurements. This
				156	is a useful mode to detect imbalance between sockets. To enable this mode,
Stephane Eranian	d430495	2013-02-14 13:57:28 +0100	[diff] [blame]	157	use --per-socket in addition to -a (system-wide). The output includes the
Stephane Eranian	d7e7a45	2013-02-06 15:46:02 +0100	[diff] [blame]	158	socket number and the number of online processors on that socket. This is
				159	useful to gauge the amount of aggregation.
				160
Stephane Eranian	12c08a9	2013-02-14 13:57:29 +0100	[diff] [blame]	161	--per-core::
				162	Aggregate counts per physical processor for system-wide mode measurements. This
				163	is a useful mode to detect imbalance between physical cores. To enable this mode,
				164	use --per-core in addition to -a (system-wide). The output includes the
				165	core number and the number of online logical processors on that physical processor.
				166
Jiri Olsa	32b8af8	2015-06-26 11:29:27 +0200	[diff] [blame]	167	--per-thread::
				168	Aggregate counts per monitored thread, when monitoring threads (-t option)
				169	or processes (-p option).
				170
Andi Kleen	4119168	2013-08-02 17:41:11 -0700	[diff] [blame]	171	-D msecs::
Andi Kleen	8f3dd2b	2014-01-07 14:14:06 -0800	[diff] [blame]	172	--delay msecs::
Andi Kleen	4119168	2013-08-02 17:41:11 -0700	[diff] [blame]	173	After starting the program, wait msecs before measuring. This is useful to
				174	filter out the startup phase of the program, which is often very different.
				175
Andi Kleen	4cabc3d	2013-08-21 16:47:26 -0700	[diff] [blame]	176	-T::
				177	--transaction::
				178
				179	Print statistics of transactional execution if supported.
				180
Jiri Olsa	4979d0c	2015-11-05 15:40:46 +0100	[diff] [blame]	181	STAT RECORD
				182	-----------
				183	Stores stat data into perf data file.
				184
				185	-o file::
				186	--output file::
				187	Output file name.
				188
Jiri Olsa	ba6039b6	2015-11-05 15:40:55 +0100	[diff] [blame]	189	STAT REPORT
				190	-----------
				191	Reads and reports stat data from perf data file.
				192
				193	-i file::
				194	--input file::
				195	Input file name.
				196
Jiri Olsa	89af4e0	2015-11-05 15:41:02 +0100	[diff] [blame]	197	--per-socket::
				198	Aggregate counts per processor socket for system-wide mode measurements.
				199
				200	--per-core::
				201	Aggregate counts per physical processor for system-wide mode measurements.
				202
				203	-A::
				204	--no-aggr::
				205	Do not aggregate counts across all monitored CPUs.
				206
Andi Kleen	44b1e60	2016-05-30 12:49:42 -0300	[diff] [blame]	207	--topdown::
				208	Print top down level 1 metrics if supported by the CPU. This helps
				209	determine bottlenecks in the CPU pipeline for CPU bound workloads,
				210	by breaking the cycles consumed down into frontend bound, backend bound,
				211	bad speculation and retiring.
				212
				213	Frontend bound means that the CPU cannot fetch and decode instructions fast
				214	enough. Backend bound means that computation or memory access is the
				215	bottleneck. Bad speculation means that the CPU wasted cycles due to branch
				216	mispredictions and similar issues. Retiring means that the CPU computed without
				217	an apparent bottleneck. The bottleneck is only the real bottleneck
				218	if the workload is actually bound by the CPU and not by something else.
				219
				220	For best results it is usually a good idea to use it with interval
				221	mode like -I 1000, as the bottleneck of workloads can change often.
				222
				223	The top down metrics are collected per core instead of per
				224	CPU thread. Per core mode is automatically enabled
				225	and -a (global monitoring) is needed, requiring root rights or
				226	kernel.perf_event_paranoid=-1.
				227
				228	Topdown uses the full Performance Monitoring Unit. For best results,
				229	disable the NMI watchdog (as root):
				230	echo 0 > /proc/sys/kernel/nmi_watchdog
				231	Otherwise the bottlenecks may be inconsistent
				232	on workloads with changing phases.
				233
				234	This enables --metric-only, unless overridden with --no-metric-only.
				235
				236	To interpret the results it is usually necessary to know which
				237	CPUs the workload runs on. If needed, the CPUs can be forced using
				238	taskset.
Jiri Olsa	4979d0c	2015-11-05 15:40:46 +0100	[diff] [blame]	239
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	240	EXAMPLES
				241	--------
				242
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	243	$ perf stat -- make -j
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	244
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	245	Performance counter stats for 'make -j':
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	246
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	247	8117.370256 task clock ticks # 11.281 CPU utilization factor
				248	678 context switches # 0.000 M/sec
				249	133 CPU migrations # 0.000 M/sec
				250	235724 pagefaults # 0.029 M/sec
				251	24821162526 CPU cycles # 3057.784 M/sec
				252	18687303457 instructions # 2302.138 M/sec
				253	172158895 cache references # 21.209 M/sec
				254	27075259 cache misses # 3.335 M/sec
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	255
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	256	Wall-clock time elapsed: 719.554352 msecs
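
The right-hand column of the sample output can be reproduced from the left-hand values. The sketch below is an illustration rather than perf's own code: the function names are hypothetical, and it assumes that the M/sec figures are counts divided by the task clock time (in milliseconds) and that the CPU utilization factor is task clock time divided by wall-clock time. The numbers in the sample are consistent with that reading.

```python
def rate_m_per_sec(count: int, task_clock_ms: float) -> float:
    """Events per second of task clock time, in millions."""
    seconds = task_clock_ms / 1000.0
    return count / seconds / 1e6


def cpu_utilization(task_clock_ms: float, wall_clock_ms: float) -> float:
    """CPU time consumed per unit of wall-clock time."""
    return task_clock_ms / wall_clock_ms


if __name__ == "__main__":
    # Values copied from the sample output above.
    task_clock_ms = 8117.370256
    wall_clock_ms = 719.554352
    print(f"{rate_m_per_sec(24821162526, task_clock_ms):.3f} M/sec")
    print(f"{cpu_utilization(task_clock_ms, wall_clock_ms):.3f} CPU utilization factor")
```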
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	257
Andi Kleen	6b45f7b	2016-03-03 15:57:35 -0800	[diff] [blame]	258	CSV FORMAT
				259	----------
				260
				261	With -x, perf stat can produce not-quite-CSV output:
				262	commas in the output are not put into "". To make it easy to parse,
				263	it is recommended to use a different separator character, like -x \;
				264
				265	The fields are in this order:
				266
				267	- optional usec time stamp in fractions of second (with -I xxx)
				268	- optional CPU, core, or socket identifier
				269	- optional number of logical CPUs aggregated
				270	- counter value
				271	- unit of the counter value or empty
				272	- event name
				273	- run time of counter
				274	- percentage of measurement time the counter was running
				275	- optional variance if multiple values are collected with -r
				276	- optional metric value
				277	- optional unit of metric
				278
				279	Additional metrics may be printed with all earlier fields being empty.
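
A line in this format can be split into named fields as sketched below. This is a minimal example under stated assumptions: it covers only output produced with -x \; and without -I, per-CPU/core/socket aggregation, or -r, so the optional leading fields and the variance field are absent. StatRow and parse_stat_line are hypothetical names, and the sample line is made up for illustration, not captured from a real run.

```python
from typing import NamedTuple, Optional


class StatRow(NamedTuple):
    value: Optional[float]         # counter value, None if not numeric
    unit: str                      # unit of the counter value, may be empty
    event: str                     # event name
    run_time: Optional[float]      # run time of the counter
    running_pct: Optional[float]   # share of measurement time the counter ran
    metric_value: Optional[float]  # optional metric value
    metric_unit: str               # optional unit of the metric


def _to_number(text: str) -> Optional[float]:
    """Convert a field to float, returning None for empty or non-numeric text."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_stat_line(line: str, sep: str = ";") -> StatRow:
    """Parse one output line, assuming the 7-field layout described in the lead-in."""
    fields = line.rstrip("\n").split(sep)
    fields += [""] * (7 - len(fields))  # pad missing optional trailing fields
    value, unit, event, run_time, pct, metric, metric_unit = fields[:7]
    return StatRow(_to_number(value), unit, event, _to_number(run_time),
                   _to_number(pct), _to_number(metric), metric_unit)


if __name__ == "__main__":
    # Hypothetical sample line.
    print(parse_stat_line("24821162526;;cycles;8117370256;100.00;3.058;GHz"))
```

Lines that carry the optional time stamp, CPU identifier, or variance fields need a different field offset, so a real parser should be told which options were used to produce the file.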
				280
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	281	SEE ALSO
				282	--------
Thomas Gleixner	386b05e	2009-06-06 14:56:33 +0200	[diff] [blame]	283	linkperf:perf-top[1], linkperf:perf-list[1]