perf-stat(1)
============

NAME
----
perf-stat - Run a command and gather performance counter statistics

SYNOPSIS
--------
[verse]
'perf stat' [-e <EVENT> | --event=EVENT] [-a] <command>
'perf stat' [-e <EVENT> | --event=EVENT] [-a] -- <command> [<options>]

DESCRIPTION
-----------
'perf stat' runs the given command and gathers performance counter
statistics from it.


OPTIONS
-------
<command>...::
        Any command you can specify in a shell.


-e::
--event=::
        Select the PMU event. Selection can be a symbolic event name
        (use 'perf list' to list all events) or a raw PMU
        event (eventsel+umask) in the form of rNNN where NNN is a
        hexadecimal event descriptor.
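
As a rough illustration of how an rNNN string relates to eventsel and umask, the sketch below packs the two values into one hexadecimal descriptor. It assumes the common x86 layout (event select in bits 0-7, unit mask in bits 8-15); other architectures encode raw events differently, and the helper name `raw_event` and the sample values are hypothetical.

```python
def raw_event(eventsel: int, umask: int) -> str:
    """Format an event select and unit mask as an rNNN raw event string.

    Assumption: x86-style layout, with the event select in bits 0-7 and
    the unit mask in bits 8-15 of the descriptor.
    """
    if not 0 <= eventsel <= 0xFF or not 0 <= umask <= 0xFF:
        raise ValueError("eventsel and umask must each fit in 8 bits")
    return f"r{(umask << 8) | eventsel:x}"


if __name__ == "__main__":
    # Illustrative values only; real codes come from the CPU vendor's manual.
    print(raw_event(0x2E, 0x41))
```

The resulting string would then be passed to -e in place of a symbolic event name.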

-i::
--no-inherit::
        child tasks do not inherit counters

-p::
--pid=<pid>::
        stat events on existing process id (comma separated list)

-t::
--tid=<tid>::
        stat events on existing thread id (comma separated list)


-a::
--all-cpus::
        system-wide collection from all CPUs

-c::
--scale::
        scale/normalize counter values

-r::
--repeat=<n>::
        repeat command and print average + stddev (max: 100). 0 means forever.
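
The kind of summary that --repeat produces can be sketched as follows. This only illustrates averaging repeated measurements. It is not necessarily the exact estimator perf uses, and the function name `summarize_runs` is hypothetical.

```python
import statistics


def summarize_runs(samples):
    """Return (mean, sample standard deviation) for repeated measurements."""
    mean = statistics.mean(samples)
    # A single run has no spread to estimate; report 0.0 in that case.
    stddev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return mean, stddev


if __name__ == "__main__":
    # Hypothetical elapsed times in msecs from three runs of the same command.
    print(summarize_runs([710.2, 719.5, 725.1]))
```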

-B::
--big-num::
        print large numbers with thousands' separators according to locale

-C::
--cpu=::
        Count only on the list of CPUs provided. Multiple CPUs can be provided as a
        comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
        In per-thread mode, this option is ignored. The -a option is still necessary
        to activate system-wide monitoring. Default is to count on all CPUs.
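
The CPU list syntax described for -C can be expanded as in the sketch below. It is a stand-alone illustration of the syntax, not perf's own parser, and `parse_cpu_list` is a hypothetical name.

```python
def parse_cpu_list(spec: str) -> list:
    """Expand a CPU list such as "0,1" or "0-2" into sorted CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            low, high = int(first), int(last)
            if low > high:
                raise ValueError(f"invalid CPU range: {part!r}")
            cpus.update(range(low, high + 1))
        else:
            # int() rejects empty or non-numeric entries with ValueError.
            cpus.add(int(part))
    return sorted(cpus)


if __name__ == "__main__":
    print(parse_cpu_list("0-2,5"))
```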

-A::
--no-aggr::
        Do not aggregate counts across all monitored CPUs in system-wide mode (-a).
        This option is only valid in system-wide mode.

-n::
--null::
        null run - don't start any counters

-v::
--verbose::
        be more verbose (show counter open errors, etc)

-x SEP::
--field-separator SEP::
        print counts using a CSV-style output to make it easy to import directly into
        spreadsheets. Columns are separated by the string specified in SEP.
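
Output written with -x can be post-processed with a few lines of script. The sketch below assumes that each line starts with the count followed by the event name; the actual column layout depends on the perf version, so the field positions are parameters. The function name `parse_counts` and the sample text are hypothetical.

```python
def parse_counts(text, sep, count_field=0, event_field=1):
    """Map event names to counts from SEP-separated perf stat output.

    Assumption: the count and event name sit at the given field positions.
    Counts that are not numeric are stored as None.
    """
    counts = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split(sep)
        if len(fields) <= max(count_field, event_field):
            continue
        try:
            value = float(fields[count_field].strip())
        except ValueError:
            value = None  # placeholder text instead of a number
        counts[fields[event_field].strip()] = value
    return counts


if __name__ == "__main__":
    sample = "24821162526,cycles\n18687303457,instructions\n"
    print(parse_counts(sample, ","))
```

Because SEP may be any string, the sketch splits with `str.split` rather than a single-character CSV dialect.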

-G name::
--cgroup name::
        monitor only in the container (cgroup) called "name". This option is available only
        in per-cpu mode. The cgroup filesystem must be mounted. All threads belonging to
        container "name" are monitored when they run on the monitored CPUs. Multiple cgroups
        can be provided. Each cgroup is applied to the corresponding event, i.e., first cgroup
        to first event, second cgroup to second event and so on. It is possible to provide
        an empty cgroup (monitor all the time) using, e.g., -G foo,,bar. Cgroups must have
        corresponding events, i.e., they always refer to events defined earlier on the command
        line.
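
The positional pairing of cgroups and events can be pictured with the sketch below. It models only the rule stated above (first cgroup to first event, and so on, with an empty entry meaning no cgroup filter). Treating events without a listed cgroup as unfiltered is an assumption of this sketch, and `pair_cgroups` is a hypothetical name.

```python
def pair_cgroups(events, cgroup_arg):
    """Pair each event with the cgroup at the same position in a -G list.

    An empty entry (as in "foo,,bar") yields None, meaning no cgroup filter.
    """
    cgroups = cgroup_arg.split(",")
    if len(cgroups) > len(events):
        raise ValueError("each cgroup must have a corresponding event")
    # Assumption: events beyond the end of the cgroup list are unfiltered.
    cgroups += [""] * (len(events) - len(cgroups))
    return [(event, cgroup or None) for event, cgroup in zip(events, cgroups)]


if __name__ == "__main__":
    print(pair_cgroups(["cycles", "instructions", "cache-misses"], "foo,,bar"))
```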

-o file::
--output file::
        Print the output into the designated file.

--append::
        Append to the output file designated with the -o option. Ignored if -o is not specified.

--log-fd::
        Log output to fd, instead of stderr. Complementary to --output, and mutually exclusive
        with it. --append may be used here. Examples:

             3>results  perf stat --log-fd 3          -- $cmd
             3>>results perf stat --log-fd 3 --append -- $cmd

--pre::
--post::
        Pre and post measurement hooks, e.g.:

        perf stat --repeat 10 --null --sync --pre 'make -s O=defconfig-build/clean' -- make -s -j64 O=defconfig-build/ bzImage

-I msecs::
--interval-print msecs::
        Print count deltas every msecs milliseconds (minimum: 100ms).
        Example: perf stat -I 1000 -e cycles -a sleep 5
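
A count delta is the difference between successive readings of a running counter. The sketch below shows that arithmetic on a list of cumulative readings; the function name `count_deltas` and the sample numbers are hypothetical.

```python
def count_deltas(cumulative):
    """Turn cumulative counter readings into per-interval deltas."""
    deltas = []
    previous = 0  # the counter is assumed to start at zero
    for value in cumulative:
        deltas.append(value - previous)
        previous = value
    return deltas


if __name__ == "__main__":
    # Hypothetical readings taken at the end of three intervals.
    print(count_deltas([100, 250, 450]))  # prints [100, 150, 200]
```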

--per-socket::
        Aggregate counts per processor socket for system-wide mode measurements. This
        is a useful mode to detect imbalance between sockets. To enable this mode,
        use --per-socket in addition to -a (system-wide). The output includes the
        socket number and the number of online processors on that socket. This is
        useful to gauge the amount of aggregation.

--per-core::
        Aggregate counts per physical core for system-wide mode measurements. This
        is a useful mode to detect imbalance between physical cores. To enable this mode,
        use --per-core in addition to -a (system-wide). The output includes the
        core number and the number of online logical processors on that physical core.
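
The aggregation done by --per-socket and --per-core amounts to grouping per-CPU counts by a topology key. The sketch below assumes a caller-supplied CPU-to-group mapping; it does not read the system topology, and the name `aggregate` is hypothetical.

```python
from collections import defaultdict


def aggregate(per_cpu_counts, cpu_to_group):
    """Sum per-CPU counts by group (socket or core).

    Returns {group: (number_of_cpus, total_count)}, mirroring the idea that
    the output shows how many processors went into each aggregate.
    """
    totals = defaultdict(int)
    members = defaultdict(int)
    for cpu, count in per_cpu_counts.items():
        group = cpu_to_group[cpu]
        totals[group] += count
        members[group] += 1
    return {group: (members[group], totals[group]) for group in sorted(totals)}


if __name__ == "__main__":
    # Hypothetical 4-CPU machine with two sockets.
    print(aggregate({0: 10, 1: 20, 2: 5, 3: 7}, {0: 0, 1: 0, 2: 1, 3: 1}))
```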

-D msecs::
--initial-delay msecs::
        After starting the program, wait msecs before measuring. This is useful to
        filter out the startup phase of the program, which is often very different.

EXAMPLES
--------

$ perf stat -- make -j

 Performance counter stats for 'make -j':

    8117.370256  task clock ticks     #      11.281 CPU utilization factor
            678  context switches     #       0.000 M/sec
            133  CPU migrations       #       0.000 M/sec
         235724  pagefaults           #       0.029 M/sec
    24821162526  CPU cycles           #    3057.784 M/sec
    18687303457  instructions         #    2302.138 M/sec
      172158895  cache references     #      21.209 M/sec
       27075259  cache misses         #       3.335 M/sec

 Wall-clock time elapsed:   719.554352 msecs
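
The values after each `#` in this output appear to be derived from the raw counts. The sketch below reproduces two of them from the numbers above, assuming the task clock is reported in milliseconds, the utilization factor is task clock divided by wall-clock time, and rates are counts per second of task clock. The function names are hypothetical.

```python
def cpu_utilization(task_clock_ms, wall_clock_ms):
    """CPU utilization factor: CPU time consumed per unit of wall-clock time."""
    return task_clock_ms / wall_clock_ms


def rate_m_per_sec(count, task_clock_ms):
    """Event rate in millions of events per second of task clock."""
    return count / (task_clock_ms / 1000.0) / 1e6


if __name__ == "__main__":
    task_clock_ms = 8117.370256
    wall_clock_ms = 719.554352
    print(f"{cpu_utilization(task_clock_ms, wall_clock_ms):.3f} CPU utilization factor")
    print(f"{rate_m_per_sec(24821162526, task_clock_ms):.3f} M/sec (CPU cycles)")
```

A utilization factor above 1 indicates that the workload kept more than one CPU busy on average, as expected for 'make -j'.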

SEE ALSO
--------
linkperf:perf-top[1], linkperf:perf-list[1]