Blame - tools/perf/Documentation/perf-stat.txt - kernel/msm-4.19

Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	1	perf-stat(1)
Ingo Molnar	6e6b754	2008-04-15 22:39:31 +0200	[diff] [blame]	2	============
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	3
				4	NAME
				5	----
				6	perf-stat - Run a command and gather performance counter statistics
				7
				8	SYNOPSIS
				9	--------
				10	[verse]
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	11	'perf stat' [-e <EVENT> \| --event=EVENT] [-a] <command>
				12	'perf stat' [-e <EVENT> \| --event=EVENT] [-a] -- <command> [<options>]
Jiri Olsa	4979d0c	2015-11-05 15:40:46 +0100	[diff] [blame]	13	'perf stat' [-e <EVENT> \| --event=EVENT] [-a] record [-o file] -- <command> [<options>]
Jiri Olsa	ba6039b6	2015-11-05 15:40:55 +0100	[diff] [blame]	14	'perf stat' report [-i file]
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	15
				16	DESCRIPTION
				17	-----------
				18	This command runs a command and gathers performance counter statistics
				19	from it.
				20
				21
				22	OPTIONS
				23	-------
				24	<command>...::
				25	Any command you can specify in a shell.
				26
Jiri Olsa	4979d0c	2015-11-05 15:40:46 +0100	[diff] [blame]	27	record::
				28	See STAT RECORD.
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	29
Jiri Olsa	ba6039b6	2015-11-05 15:40:55 +0100	[diff] [blame]	30	report::
				31	See STAT REPORT.
				32
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	33	-e::
				34	--event=::
Cody P Schafer	f9ab9c1	2015-01-07 17:13:53 -0800	[diff] [blame]	35	Select the PMU event. Selection can be:
				36
				37	- a symbolic event name (use 'perf list' to list all events)
				38
				39	- a raw PMU event (eventsel+umask) in the form of rNNN where NNN is a
				40	hexadecimal event descriptor.
				41
				42	- a symbolically formed event like 'pmu/param1=0x3,param2/' where
				43	param1 and param2 are defined as formats for the PMU in
				44	/sys/bus/event_source/devices/<pmu>/format/*
				45
				46	- a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
				47	where M, N, K are numbers (in decimal, hex, octal format).
				48	Acceptable values for each of 'config', 'config1' and 'config2'
				49	parameters are defined by corresponding entries in
				50	/sys/bus/event_source/devices/<pmu>/format/*
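
As a rough illustration of the rNNN form described above, the sketch below builds a raw event string from an event select and a unit mask. It assumes the common x86 layout (unit mask in bits 8-15, event select in bits 0-7); other PMUs encode raw events differently, and the format/ directory of the PMU remains the authority. The helper name `raw_event` and the sample values are hypothetical.

```python
def raw_event(eventsel: int, umask: int) -> str:
    """Return an rNNN string, ASSUMING the x86 layout (umask << 8) | eventsel."""
    if not 0 <= eventsel <= 0xFF or not 0 <= umask <= 0xFF:
        raise ValueError("eventsel and umask must each fit in 8 bits")
    # perf expects NNN in hexadecimal, without a 0x prefix.
    return f"r{(umask << 8) | eventsel:x}"


print(raw_event(0x2E, 0x41))  # r412e
```

The resulting string could then be passed as `perf stat -e r412e -- <command>`.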
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	51
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	52	-i::
Stephane Eranian	2e6cdf9	2010-05-12 10:40:01 +0200	[diff] [blame]	53	--no-inherit::
				54	child tasks do not inherit counters
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	55	-p::
				56	--pid=<pid>::
David Ahern	b52956c	2012-02-08 09:32:52 -0700	[diff] [blame]	57	stat events on existing process id (comma separated list)
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	58
				59	-t::
				60	--tid=<tid>::
David Ahern	b52956c	2012-02-08 09:32:52 -0700	[diff] [blame]	61	stat events on existing thread id (comma separated list)
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	62
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	63
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	64	-a::
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	65	--all-cpus::
Jiri Olsa	0d79f8b	2017-02-17 18:00:34 +0100	[diff] [blame]	66	system-wide collection from all CPUs (default if no target is specified)
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	67
Brice Goglin	b26bc5a	2009-08-07 10:18:39 +0200	[diff] [blame]	68	-c::
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	69	--scale::
				70	scale/normalize counter values
				71
Borislav Petkov	f594bae	2016-03-07 16:44:44 -0300	[diff] [blame]	72	-d::
				73	--detailed::
				74	print more detailed statistics, can be specified up to 3 times
				75
				76	-d: detailed events, L1 and LLC data cache
				77	-d -d: more detailed events, dTLB and iTLB events
				78	-d -d -d: very detailed events, adding prefetch events
				79
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	80	-r::
				81	--repeat=<n>::
Frederik Deweerdt	a7e191c	2013-03-01 13:02:27 -0500	[diff] [blame]	82	repeat command and print average + stddev (max: 100). 0 means forever.
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	83
Stephane Eranian	5af52b5	2010-05-18 15:00:01 +0200	[diff] [blame]	84	-B::
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	85	--big-num::
Stephane Eranian	5af52b5	2010-05-18 15:00:01 +0200	[diff] [blame]	86	print large numbers with thousands' separators according to locale
				87
Stephane Eranian	c45c6ea	2010-05-28 12:00:01 +0200	[diff] [blame]	88	-C::
				89	--cpu=::
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	90	Count only on the list of CPUs provided. Multiple CPUs can be provided as a
				91	comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
Stephane Eranian	c45c6ea	2010-05-28 12:00:01 +0200	[diff] [blame]	92	In per-thread mode, this option is ignored. The -a option is still necessary
				93	to activate system-wide monitoring. Default is to count on all CPUs.
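
The CPU list syntax above (comma-separated entries, ranges written with -) can be expanded with a few lines of Python. This is only a sketch of the syntax as described here, not perf's own parser; the function name `parse_cpu_list` is hypothetical.

```python
def parse_cpu_list(spec: str) -> list:
    """Expand a list such as '0,1' or '0-2' into sorted CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            # A range such as 0-2 includes both endpoints.
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


print(parse_cpu_list("0,1"))    # [0, 1]
print(parse_cpu_list("0-2"))    # [0, 1, 2]
print(parse_cpu_list("0,2-4"))  # [0, 2, 3, 4]
```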
				94
Stephane Eranian	f5b4a9c3	2010-11-16 11:05:01 +0200	[diff] [blame]	95	-A::
				96	--no-aggr::
Ravi Bangoria	efc9c05	2017-03-20 18:07:18 +0530	[diff] [blame]	97	Do not aggregate counts across all monitored CPUs.
Stephane Eranian	f5b4a9c3	2010-11-16 11:05:01 +0200	[diff] [blame]	98
Shawn Bohrer	8c20769	2010-11-30 19:57:19 -0600	[diff] [blame]	99	-n::
				100	--null::
				101	null run - don't start any counters
				102
				103	-v::
				104	--verbose::
				105	be more verbose (show counter open errors, etc)
				106
Stephane Eranian	d7470b6	2010-12-01 18:49:05 +0200	[diff] [blame]	107	-x SEP::
				108	--field-separator SEP::
				109	print counts using a CSV-style output to make it easy to import directly into
				110	spreadsheets. Columns are separated by the string specified in SEP.
				111
Stephane Eranian	023695d	2011-02-14 11:20:01 +0200	[diff] [blame]	112	-G name::
				113	--cgroup name::
				114	monitor only in the container (cgroup) called "name". This option is available only
				115	in per-cpu mode. The cgroup filesystem must be mounted. All threads belonging to
				116	container "name" are monitored when they run on the monitored CPUs. Multiple cgroups
				117	can be provided. Each cgroup is applied to the corresponding event, i.e., first cgroup
				118	to first event, second cgroup to second event and so on. It is possible to provide
				119	an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have
				120	corresponding events, i.e., they always refer to events defined earlier on the command
				121	line.
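
The positional pairing of cgroups and events described above can be sketched as follows. The function name `pair_cgroups` and the event names are hypothetical, and treating missing trailing entries as "no cgroup" is an assumption of this sketch rather than documented behavior.

```python
def pair_cgroups(events, cgroup_arg):
    """Pair each event with the cgroup at the same position in the -G list.

    An empty entry (as in 'foo,,bar') yields None, meaning the event is not
    restricted to a cgroup.
    """
    cgroups = cgroup_arg.split(",")
    if len(cgroups) > len(events):
        raise ValueError("each cgroup must refer to an event defined earlier")
    # ASSUMPTION: events beyond the end of the cgroup list get no cgroup.
    cgroups += [""] * (len(events) - len(cgroups))
    return [(event, cgroup or None) for event, cgroup in zip(events, cgroups)]


print(pair_cgroups(["cycles", "instructions", "cache-misses"], "foo,,bar"))
# [('cycles', 'foo'), ('instructions', None), ('cache-misses', 'bar')]
```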
				122
Stephane Eranian	4aa9015	2011-08-15 22:22:33 +0200	[diff] [blame]	123	-o file::
Jim Cromie	56f3bae	2011-09-07 17:14:00 -0600	[diff] [blame]	124	--output file::
Stephane Eranian	4aa9015	2011-08-15 22:22:33 +0200	[diff] [blame]	125	Print the output into the designated file.
				126
				127	--append::
				128	Append to the output file designated with the -o option. Ignored if -o is not specified.
				129
Jim Cromie	56f3bae	2011-09-07 17:14:00 -0600	[diff] [blame]	130	--log-fd::
				131
				132	Log output to fd, instead of stderr. Complementary to --output, and mutually exclusive
				133	with it. --append may be used here. Examples:
				134	3>results perf stat --log-fd 3 -- $cmd
				135	3>>results perf stat --log-fd 3 --append -- $cmd
				136
Peter Zijlstra	1f16c57	2012-10-23 13:40:14 +0200	[diff] [blame]	137	--pre::
				138	--post::
				139	Pre and post measurement hooks, e.g.:
				140
				141	perf stat --repeat 10 --null --sync --pre 'make -s O=defconfig-build/clean' -- make -s -j64 O=defconfig-build/ bzImage
Jim Cromie	56f3bae	2011-09-07 17:14:00 -0600	[diff] [blame]	142
Stephane Eranian	13370a9	2013-01-29 12:47:44 +0100	[diff] [blame]	143	-I msecs::
				144	--interval-print msecs::
Kan Liang	19afd10	2015-10-02 05:04:34 -0400	[diff] [blame]	145	Print count deltas every msecs milliseconds (minimum: 10ms).
				146	The overhead percentage could be high in some cases, for instance with small, sub-100ms intervals. Use with caution.
				147	example: 'perf stat -I 1000 -e cycles -a sleep 5'
Jim Cromie	56f3bae	2011-09-07 17:14:00 -0600	[diff] [blame]	148
Andi Kleen	54b5091	2016-03-03 15:57:36 -0800	[diff] [blame]	149	--metric-only::
				150	Only print computed metrics. Print them in a single line.
Andi Kleen	206cab6	2016-03-03 15:57:37 -0800	[diff] [blame]	151	Don't show any raw values. Not supported with --per-thread.
Andi Kleen	54b5091	2016-03-03 15:57:36 -0800	[diff] [blame]	152
Stephane Eranian	d430495	2013-02-14 13:57:28 +0100	[diff] [blame]	153	--per-socket::
Stephane Eranian	d7e7a45	2013-02-06 15:46:02 +0100	[diff] [blame]	154	Aggregate counts per processor socket for system-wide mode measurements. This
				155	is a useful mode to detect imbalance between sockets. To enable this mode,
Stephane Eranian	d430495	2013-02-14 13:57:28 +0100	[diff] [blame]	156	use --per-socket in addition to -a (system-wide). The output includes the
Stephane Eranian	d7e7a45	2013-02-06 15:46:02 +0100	[diff] [blame]	157	socket number and the number of online processors on that socket. This is
				158	useful to gauge the amount of aggregation.
				159
Stephane Eranian	12c08a9	2013-02-14 13:57:29 +0100	[diff] [blame]	160	--per-core::
				161	Aggregate counts per physical processor for system-wide mode measurements. This
				162	is a useful mode to detect imbalance between physical cores. To enable this mode,
				163	use --per-core in addition to -a (system-wide). The output includes the
				164	core number and the number of online logical processors on that physical processor.
				165
Jiri Olsa	32b8af8	2015-06-26 11:29:27 +0200	[diff] [blame]	166	--per-thread::
				167	Aggregate counts per monitored threads, when monitoring threads (-t option)
				168	or processes (-p option).
				169
Andi Kleen	4119168	2013-08-02 17:41:11 -0700	[diff] [blame]	170	-D msecs::
Andi Kleen	8f3dd2b	2014-01-07 14:14:06 -0800	[diff] [blame]	171	--delay msecs::
Andi Kleen	4119168	2013-08-02 17:41:11 -0700	[diff] [blame]	172	After starting the program, wait msecs before measuring. This is useful to
				173	filter out the startup phase of the program, which is often very different.
				174
Andi Kleen	4cabc3d	2013-08-21 16:47:26 -0700	[diff] [blame]	175	-T::
				176	--transaction::
				177
				178	Print statistics of transactional execution if supported.
				179
Jiri Olsa	4979d0c	2015-11-05 15:40:46 +0100	[diff] [blame]	180	STAT RECORD
				181	-----------
				182	Stores stat data into perf data file.
				183
				184	-o file::
				185	--output file::
				186	Output file name.
				187
Jiri Olsa	ba6039b6	2015-11-05 15:40:55 +0100	[diff] [blame]	188	STAT REPORT
				189	-----------
				190	Reads and reports stat data from perf data file.
				191
				192	-i file::
				193	--input file::
				194	Input file name.
				195
Jiri Olsa	89af4e0	2015-11-05 15:41:02 +0100	[diff] [blame]	196	--per-socket::
				197	Aggregate counts per processor socket for system-wide mode measurements.
				198
				199	--per-core::
				200	Aggregate counts per physical processor for system-wide mode measurements.
				201
				202	-A::
				203	--no-aggr::
				204	Do not aggregate counts across all monitored CPUs.
				205
Andi Kleen	44b1e60	2016-05-30 12:49:42 -0300	[diff] [blame]	206	--topdown::
				207	Print top down level 1 metrics if supported by the CPU. This helps to
				208	determine bottlenecks in the CPU pipeline for CPU bound workloads,
				209	by breaking the cycles consumed down into frontend bound, backend bound,
				210	bad speculation and retiring.
				211
				212	Frontend bound means that the CPU cannot fetch and decode instructions fast
				213	enough. Backend bound means that computation or memory access is the
				214	bottleneck. Bad speculation means that the CPU wasted cycles due to branch
				215	mispredictions and similar issues. Retiring means that the CPU computed without
				216	an apparent bottleneck. The bottleneck is only the real bottleneck
				217	if the workload is actually bound by the CPU and not by something else.
				218
				219	For best results it is usually a good idea to use it with interval
				220	mode like -I 1000, as the bottleneck of workloads can change often.
				221
				222	The top down metrics are collected per core instead of per
				223	CPU thread. Per core mode is automatically enabled
				224	and -a (global monitoring) is needed, requiring root rights or
				225	kernel.perf_event_paranoid=-1.
				226
				227	Topdown uses the full Performance Monitoring Unit, and needs
				228	disabling of the NMI watchdog (as root):
				229	echo 0 > /proc/sys/kernel/nmi_watchdog
				230	for best results. Otherwise the bottlenecks may be inconsistent
				231	on workloads with changing phases.
				232
				233	This enables --metric-only, unless overridden with --no-metric-only.
				234
				235	To interpret the results it is usually necessary to know which
				236	CPUs the workload runs on. If needed, the CPUs can be forced using
				237	taskset.
Jiri Olsa	4979d0c	2015-11-05 15:40:46 +0100	[diff] [blame]	238
Andi Kleen	430daf2	2017-03-20 13:17:00 -0700	[diff] [blame^]	239	--no-merge::
				240	Do not merge results from same PMUs.
				241
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	242	EXAMPLES
				243	--------
				244
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	245	$ perf stat -- make -j
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	246
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	247	Performance counter stats for 'make -j':
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	248
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	249	8117.370256 task clock ticks # 11.281 CPU utilization factor
				250	678 context switches # 0.000 M/sec
				251	133 CPU migrations # 0.000 M/sec
				252	235724 pagefaults # 0.029 M/sec
				253	24821162526 CPU cycles # 3057.784 M/sec
				254	18687303457 instructions # 2302.138 M/sec
				255	172158895 cache references # 21.209 M/sec
				256	27075259 cache misses # 3.335 M/sec
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	257
Ingo Molnar	20c84e9	2009-06-04 16:33:00 +0200	[diff] [blame]	258	Wall-clock time elapsed: 719.554352 msecs
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	259
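
The derived columns in the sample output above appear consistent with two simple ratios: each rate is the counter value divided by the task clock, and the CPU utilization factor is the task clock divided by the wall-clock time. This is an inference from the numbers shown, not a statement about perf's implementation; the function names are hypothetical.

```python
def rate_m_per_sec(count: int, task_clock_msec: float) -> float:
    """Events per millisecond, divided by 1000, gives millions per second."""
    return count / task_clock_msec / 1000.0


def cpu_utilization(task_clock_msec: float, wall_clock_msec: float) -> float:
    """Ratio of CPU time consumed to elapsed wall-clock time."""
    return task_clock_msec / wall_clock_msec


task_clock = 8117.370256  # "task clock ticks" from the sample, in msecs
print(f"{rate_m_per_sec(24821162526, task_clock):.3f}")  # 3057.784
print(f"{rate_m_per_sec(18687303457, task_clock):.3f}")  # 2302.138
print(f"{cpu_utilization(task_clock, 719.554352):.3f}")  # 11.281
```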
Andi Kleen	6b45f7b	2016-03-03 15:57:35 -0800	[diff] [blame]	260	CSV FORMAT
				261	----------
				262
				263	With -x, perf stat is able to produce a not-quite-CSV format output.
				264	Commas in the output are not put into "". To make it easy to parse,
				265	it is recommended to use a different character like -x \;
				266
				267	The fields are in this order:
				268
				269	- optional usec time stamp in fractions of second (with -I xxx)
				270	- optional CPU, core, or socket identifier
				271	- optional number of logical CPUs aggregated
				272	- counter value
				273	- unit of the counter value or empty
				274	- event name
				275	- run time of counter
				276	- percentage of measurement time the counter was running
				277	- optional variance if multiple values are collected with -r
				278	- optional metric value
				279	- optional unit of metric
				280
				281	Additional metrics may be printed with all earlier fields being empty.
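
A minimal parsing sketch for the field order above, covering only the simplest case: no -I time stamp, no per-CPU/core/socket identifier, and no -r variance column, so the line starts with the counter value. The function name `parse_stat_csv_line` and the sample line are hypothetical, constructed to match the field list rather than copied from real output. Values that are not numeric are returned as None.

```python
def parse_stat_csv_line(line: str, sep: str = ";") -> dict:
    """Split one 'perf stat -x SEP' line in the simplest layout (see above)."""
    fields = line.rstrip("\n").split(sep)
    if len(fields) < 5:
        raise ValueError("expected at least 5 fields, got %d" % len(fields))

    def to_number(text):
        # Non-numeric or empty fields become None instead of raising.
        try:
            return float(text)
        except ValueError:
            return None

    value, unit, event, run_time, pct_running = fields[:5]
    return {
        "value": to_number(value),
        "unit": unit or None,
        "event": event,
        "run_time": to_number(run_time),
        "pct_running": to_number(pct_running),
        "metric_fields": fields[5:],  # optional metric value and unit
    }


# Hypothetical line following the documented order, using -x \;
record = parse_stat_csv_line("1500;;cycles;2000;100.00;;")
print(record["event"], record["value"])  # cycles 1500.0
```

Lines produced with -I, -A, --per-core, --per-socket or -r carry extra leading or trailing fields and would need the offsets adjusted accordingly.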
				282
Ingo Molnar	1d8c8b2	2009-04-20 15:52:29 +0200	[diff] [blame]	283	SEE ALSO
				284	--------
Thomas Gleixner	386b05e	2009-06-06 14:56:33 +0200	[diff] [blame]	285	linkperf:perf-top[1], linkperf:perf-list[1]