Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 1 | Event Tracing |
| 2 | |
| 3 | Documentation written by Theodore Ts'o |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 4 | Updated by Li Zefan and Tom Zanussi |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 5 | |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 6 | 1. Introduction |
| 7 | =============== |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 8 | |
| 9 | Tracepoints (see Documentation/trace/tracepoints.txt) can be used |
| 10 | without creating custom kernel modules to register probe functions |
| 11 | using the event tracing infrastructure. |
| 12 | |
| 13 | Not all tracepoints can be traced using the event tracing system; |
| 14 | the kernel developer must provide code snippets which define how the |
| 15 | tracing information is saved into the tracing buffer, and how the |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 16 | tracing information should be printed. |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 17 | |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 18 | 2. Using Event Tracing |
| 19 | ====================== |
| 20 | |
| 21 | 2.1 Via the 'set_event' interface |
| 22 | --------------------------------- |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 23 | |
| 24 | The events which are available for tracing can be found in the file |
GeunSik Lim | 52ad51e | 2009-09-07 21:37:17 +0900 | [diff] [blame] | 25 | /sys/kernel/debug/tracing/available_events. |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 26 | |
| 27 | To enable a particular event, such as 'sched_wakeup', simply echo it |
GeunSik Lim | 52ad51e | 2009-09-07 21:37:17 +0900 | [diff] [blame] | 28 | to /sys/kernel/debug/tracing/set_event. For example: |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 29 | |
GeunSik Lim | 52ad51e | 2009-09-07 21:37:17 +0900 | [diff] [blame] | 30 | # echo sched_wakeup >> /sys/kernel/debug/tracing/set_event |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 31 | |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 32 | [ Note: '>>' is necessary, otherwise it will firstly disable |
| 33 | all the events. ] |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 34 | |
| 35 | To disable an event, echo the event name to the set_event file prefixed |
| 36 | with an exclamation point: |
| 37 | |
GeunSik Lim | 52ad51e | 2009-09-07 21:37:17 +0900 | [diff] [blame] | 38 | # echo '!sched_wakeup' >> /sys/kernel/debug/tracing/set_event |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 39 | |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 40 | To disable all events, echo an empty line to the set_event file: |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 41 | |
GeunSik Lim | 52ad51e | 2009-09-07 21:37:17 +0900 | [diff] [blame] | 42 | # echo > /sys/kernel/debug/tracing/set_event |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 43 | |
| 44 | To enable all events, echo '*:*' or '*:' to the set_event file: |
| 45 | |
GeunSik Lim | 52ad51e | 2009-09-07 21:37:17 +0900 | [diff] [blame] | 46 | # echo *:* > /sys/kernel/debug/tracing/set_event |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 47 | |
| 48 | The events are organized into subsystems, such as ext4, irq, sched, |
| 49 | etc., and a full event name looks like this: <subsystem>:<event>. The |
| 50 | subsystem name is optional, but it is displayed in the available_events |
| 51 | file. All of the events in a subsystem can be specified via the syntax |
| 52 | "<subsystem>:*"; for example, to enable all irq events, you can use the |
| 53 | command: |
| 54 | |
GeunSik Lim | 52ad51e | 2009-09-07 21:37:17 +0900 | [diff] [blame] | 55 | # echo 'irq:*' > /sys/kernel/debug/tracing/set_event |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 56 | |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 57 | 2.2 Via the 'enable' toggle |
| 58 | --------------------------- |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 59 | |
GeunSik Lim | 52ad51e | 2009-09-07 21:37:17 +0900 | [diff] [blame] | 60 | The events available are also listed in /sys/kernel/debug/tracing/events/ hierarchy |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 61 | of directories. |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 62 | |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 63 | To enable event 'sched_wakeup': |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 64 | |
GeunSik Lim | 52ad51e | 2009-09-07 21:37:17 +0900 | [diff] [blame] | 65 | # echo 1 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 66 | |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 67 | To disable it: |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 68 | |
GeunSik Lim | 52ad51e | 2009-09-07 21:37:17 +0900 | [diff] [blame] | 69 | # echo 0 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 70 | |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 71 | To enable all events in sched subsystem: |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 72 | |
GeunSik Lim | 52ad51e | 2009-09-07 21:37:17 +0900 | [diff] [blame] | 73 | # echo 1 > /sys/kernel/debug/tracing/events/sched/enable |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 74 | |
Michal Sojka | 6afb1c6 | 2009-09-10 08:02:21 +0200 | [diff] [blame] | 75 | To enable all events: |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 76 | |
GeunSik Lim | 52ad51e | 2009-09-07 21:37:17 +0900 | [diff] [blame] | 77 | # echo 1 > /sys/kernel/debug/tracing/events/enable |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 78 | |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 79 | When reading one of these enable files, there are four results: |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 80 | |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 81 | 0 - all events this file affects are disabled |
| 82 | 1 - all events this file affects are enabled |
| 83 | X - there is a mixture of events enabled and disabled |
| 84 | ? - this file does not affect any event |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 85 | |
Li Zefan | 020e5f8 | 2009-07-01 10:47:05 +0800 | [diff] [blame] | 86 | 2.3 Boot option |
| 87 | --------------- |
| 88 | |
| 89 | In order to facilitate early boot debugging, use boot option: |
| 90 | |
| 91 | trace_event=[event-list] |
| 92 | |
Li Zefan | 03d646e | 2009-12-21 14:27:24 +0800 | [diff] [blame] | 93 | event-list is a comma separated list of events. See section 2.1 for event |
| 94 | format. |
Li Zefan | 020e5f8 | 2009-07-01 10:47:05 +0800 | [diff] [blame] | 95 | |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 96 | 3. Defining an event-enabled tracepoint |
| 97 | ======================================= |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 98 | |
Li Zefan | 143c145 | 2009-05-19 14:43:15 +0800 | [diff] [blame] | 99 | See The example provided in samples/trace_events |
Theodore Ts'o | abd4144 | 2009-04-11 15:51:18 -0400 | [diff] [blame] | 100 | |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 101 | 4. Event formats |
| 102 | ================ |
| 103 | |
| 104 | Each trace event has a 'format' file associated with it that contains |
| 105 | a description of each field in a logged event. This information can |
| 106 | be used to parse the binary trace stream, and is also the place to |
| 107 | find the field names that can be used in event filters (see section 5). |
| 108 | |
| 109 | It also displays the format string that will be used to print the |
| 110 | event in text mode, along with the event name and ID used for |
| 111 | profiling. |
| 112 | |
| 113 | Every event has a set of 'common' fields associated with it; these are |
| 114 | the fields prefixed with 'common_'. The other fields vary between |
| 115 | events and correspond to the fields defined in the TRACE_EVENT |
| 116 | definition for that event. |
| 117 | |
| 118 | Each field in the format has the form: |
| 119 | |
| 120 | field:field-type field-name; offset:N; size:N; |
| 121 | |
| 122 | where offset is the offset of the field in the trace record and size |
| 123 | is the size of the data item, in bytes. |
| 124 | |
| 125 | For example, here's the information displayed for the 'sched_wakeup' |
| 126 | event: |
| 127 | |
Tao Ma | bfc672d | 2011-01-05 14:07:54 +0800 | [diff] [blame] | 128 | # cat /sys/kernel/debug/tracing/events/sched/sched_wakeup/format |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 129 | |
| 130 | name: sched_wakeup |
| 131 | ID: 60 |
| 132 | format: |
| 133 | field:unsigned short common_type; offset:0; size:2; |
| 134 | field:unsigned char common_flags; offset:2; size:1; |
| 135 | field:unsigned char common_preempt_count; offset:3; size:1; |
| 136 | field:int common_pid; offset:4; size:4; |
| 137 | field:int common_tgid; offset:8; size:4; |
| 138 | |
| 139 | field:char comm[TASK_COMM_LEN]; offset:12; size:16; |
| 140 | field:pid_t pid; offset:28; size:4; |
| 141 | field:int prio; offset:32; size:4; |
| 142 | field:int success; offset:36; size:4; |
| 143 | field:int cpu; offset:40; size:4; |
| 144 | |
| 145 | print fmt: "task %s:%d [%d] success=%d [%03d]", REC->comm, REC->pid, |
| 146 | REC->prio, REC->success, REC->cpu |
| 147 | |
| 148 | This event contains 10 fields, the first 5 common and the remaining 5 |
| 149 | event-specific. All the fields for this event are numeric, except for |
| 150 | 'comm' which is a string, a distinction important for event filtering. |
| 151 | |
| 152 | 5. Event filtering |
| 153 | ================== |
| 154 | |
| 155 | Trace events can be filtered in the kernel by associating boolean |
| 156 | 'filter expressions' with them. As soon as an event is logged into |
| 157 | the trace buffer, its fields are checked against the filter expression |
| 158 | associated with that event type. An event with field values that |
| 159 | 'match' the filter will appear in the trace output, and an event whose |
| 160 | values don't match will be discarded. An event with no filter |
| 161 | associated with it matches everything, and is the default when no |
| 162 | filter has been set for an event. |
| 163 | |
| 164 | 5.1 Expression syntax |
| 165 | --------------------- |
| 166 | |
| 167 | A filter expression consists of one or more 'predicates' that can be |
| 168 | combined using the logical operators '&&' and '||'. A predicate is |
| 169 | simply a clause that compares the value of a field contained within a |
| 170 | logged event with a constant value and returns either 0 or 1 depending |
| 171 | on whether the field value matched (1) or didn't match (0): |
| 172 | |
| 173 | field-name relational-operator value |
| 174 | |
| 175 | Parentheses can be used to provide arbitrary logical groupings and |
| 176 | double-quotes can be used to prevent the shell from interpreting |
| 177 | operators as shell metacharacters. |
| 178 | |
| 179 | The field-names available for use in filters can be found in the |
| 180 | 'format' files for trace events (see section 4). |
| 181 | |
| 182 | The relational-operators depend on the type of the field being tested: |
| 183 | |
| 184 | The operators available for numeric fields are: |
| 185 | |
Steven Rostedt | 1a891cf | 2013-06-12 13:16:25 -0400 | [diff] [blame] | 186 | ==, !=, <, <=, >, >=, & |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 187 | |
| 188 | And for string fields they are: |
| 189 | |
Steven Rostedt (Red Hat) | c3e13c7 | 2013-06-17 10:59:17 -0400 | [diff] [blame] | 190 | ==, !=, ~ |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 191 | |
Masami Hiramatsu | 60f1d5e | 2016-10-05 20:58:15 +0900 | [diff] [blame] | 192 | The glob (~) accepts a wild card character (*,?) and character classes |
| 193 | ([). For example: |
Steven Rostedt (Red Hat) | c3e13c7 | 2013-06-17 10:59:17 -0400 | [diff] [blame] | 194 | |
| 195 | prev_comm ~ "*sh" |
| 196 | prev_comm ~ "sh*" |
| 197 | prev_comm ~ "*sh*" |
Masami Hiramatsu | 60f1d5e | 2016-10-05 20:58:15 +0900 | [diff] [blame] | 198 | prev_comm ~ "ba*sh" |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 199 | |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 200 | 5.2 Setting filters |
| 201 | ------------------- |
| 202 | |
| 203 | A filter for an individual event is set by writing a filter expression |
| 204 | to the 'filter' file for the given event. |
| 205 | |
| 206 | For example: |
| 207 | |
Tao Ma | bfc672d | 2011-01-05 14:07:54 +0800 | [diff] [blame] | 208 | # cd /sys/kernel/debug/tracing/events/sched/sched_wakeup |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 209 | # echo "common_preempt_count > 4" > filter |
| 210 | |
| 211 | A slightly more involved example: |
| 212 | |
Tao Ma | bfc672d | 2011-01-05 14:07:54 +0800 | [diff] [blame] | 213 | # cd /sys/kernel/debug/tracing/events/signal/signal_generate |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 214 | # echo "((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter |
| 215 | |
| 216 | If there is an error in the expression, you'll get an 'Invalid |
| 217 | argument' error when setting it, and the erroneous string along with |
| 218 | an error message can be seen by looking at the filter e.g.: |
| 219 | |
Tao Ma | bfc672d | 2011-01-05 14:07:54 +0800 | [diff] [blame] | 220 | # cd /sys/kernel/debug/tracing/events/signal/signal_generate |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 221 | # echo "((sig >= 10 && sig < 15) || dsig == 17) && comm != bash" > filter |
| 222 | -bash: echo: write error: Invalid argument |
| 223 | # cat filter |
| 224 | ((sig >= 10 && sig < 15) || dsig == 17) && comm != bash |
| 225 | ^ |
| 226 | parse_error: Field not found |
| 227 | |
| 228 | Currently the caret ('^') for an error always appears at the beginning of |
| 229 | the filter string; the error message should still be useful though |
| 230 | even without more accurate position info. |
| 231 | |
| 232 | 5.3 Clearing filters |
| 233 | -------------------- |
| 234 | |
| 235 | To clear the filter for an event, write a '0' to the event's filter |
| 236 | file. |
| 237 | |
| 238 | To clear the filters for all events in a subsystem, write a '0' to the |
| 239 | subsystem's filter file. |
| 240 | |
| 241 | 5.3 Subsystem filters |
| 242 | --------------------- |
| 243 | |
| 244 | For convenience, filters for every event in a subsystem can be set or |
| 245 | cleared as a group by writing a filter expression into the filter file |
Thomas Weber | 8839316 | 2010-03-16 11:47:56 +0100 | [diff] [blame] | 246 | at the root of the subsystem. Note however, that if a filter for any |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 247 | event within the subsystem lacks a field specified in the subsystem |
| 248 | filter, or if the filter can't be applied for any other reason, the |
| 249 | filter for that event will retain its previous setting. This can |
| 250 | result in an unintended mixture of filters which could lead to |
| 251 | confusing (to the user who might think different filters are in |
| 252 | effect) trace output. Only filters that reference just the common |
| 253 | fields can be guaranteed to propagate successfully to all events. |
| 254 | |
| 255 | Here are a few subsystem filter examples that also illustrate the |
| 256 | above points: |
| 257 | |
Thomas Weber | 8839316 | 2010-03-16 11:47:56 +0100 | [diff] [blame] | 258 | Clear the filters on all events in the sched subsystem: |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 259 | |
| 260 | # cd /sys/kernel/debug/tracing/events/sched |
| 261 | # echo 0 > filter |
| 262 | # cat sched_switch/filter |
| 263 | none |
| 264 | # cat sched_wakeup/filter |
| 265 | none |
| 266 | |
| 267 | Set a filter using only common fields for all events in the sched |
Thomas Weber | 8839316 | 2010-03-16 11:47:56 +0100 | [diff] [blame] | 268 | subsystem (all events end up with the same filter): |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 269 | |
| 270 | # cd /sys/kernel/debug/tracing/events/sched |
| 271 | # echo common_pid == 0 > filter |
| 272 | # cat sched_switch/filter |
| 273 | common_pid == 0 |
| 274 | # cat sched_wakeup/filter |
| 275 | common_pid == 0 |
| 276 | |
| 277 | Attempt to set a filter using a non-common field for all events in the |
Thomas Weber | 8839316 | 2010-03-16 11:47:56 +0100 | [diff] [blame] | 278 | sched subsystem (all events but those that have a prev_pid field retain |
Tom Zanussi | 95b6960 | 2009-09-10 23:13:51 -0500 | [diff] [blame] | 279 | their old filters): |
| 280 | |
| 281 | # cd /sys/kernel/debug/tracing/events/sched |
| 282 | # echo prev_pid == 0 > filter |
| 283 | # cat sched_switch/filter |
| 284 | prev_pid == 0 |
| 285 | # cat sched_wakeup/filter |
| 286 | common_pid == 0 |
Tom Zanussi | ac38fb8 | 2013-10-24 08:59:30 -0500 | [diff] [blame] | 287 | |
Steven Rostedt (Red Hat) | 627645f | 2015-11-03 16:37:15 -0500 | [diff] [blame] | 288 | 5.4 PID filtering |
| 289 | ----------------- |
| 290 | |
| 291 | The set_event_pid file in the same directory as the top events directory |
| 292 | exists, will filter all events from tracing any task that does not have the |
| 293 | PID listed in the set_event_pid file. |
| 294 | |
| 295 | # cd /sys/kernel/debug/tracing |
| 296 | # echo $$ > set_event_pid |
| 297 | # echo 1 > events/enabled |
| 298 | |
| 299 | Will only trace events for the current task. |
| 300 | |
| 301 | To add more PIDs without losing the PIDs already included, use '>>'. |
| 302 | |
| 303 | # echo 123 244 1 >> set_event_pid |
| 304 | |
| 305 | |
Tom Zanussi | ac38fb8 | 2013-10-24 08:59:30 -0500 | [diff] [blame] | 306 | 6. Event triggers |
| 307 | ================= |
| 308 | |
| 309 | Trace events can be made to conditionally invoke trigger 'commands' |
| 310 | which can take various forms and are described in detail below; |
| 311 | examples would be enabling or disabling other trace events or invoking |
| 312 | a stack trace whenever the trace event is hit. Whenever a trace event |
| 313 | with attached triggers is invoked, the set of trigger commands |
| 314 | associated with that event is invoked. Any given trigger can |
| 315 | additionally have an event filter of the same form as described in |
| 316 | section 5 (Event filtering) associated with it - the command will only |
| 317 | be invoked if the event being invoked passes the associated filter. |
| 318 | If no filter is associated with the trigger, it always passes. |
| 319 | |
| 320 | Triggers are added to and removed from a particular event by writing |
| 321 | trigger expressions to the 'trigger' file for the given event. |
| 322 | |
| 323 | A given event can have any number of triggers associated with it, |
| 324 | subject to any restrictions that individual commands may have in that |
| 325 | regard. |
| 326 | |
| 327 | Event triggers are implemented on top of "soft" mode, which means that |
| 328 | whenever a trace event has one or more triggers associated with it, |
| 329 | the event is activated even if it isn't actually enabled, but is |
| 330 | disabled in a "soft" mode. That is, the tracepoint will be called, |
| 331 | but just will not be traced, unless of course it's actually enabled. |
| 332 | This scheme allows triggers to be invoked even for events that aren't |
| 333 | enabled, and also allows the current event filter implementation to be |
| 334 | used for conditionally invoking triggers. |
| 335 | |
| 336 | The syntax for event triggers is roughly based on the syntax for |
| 337 | set_ftrace_filter 'ftrace filter commands' (see the 'Filter commands' |
| 338 | section of Documentation/trace/ftrace.txt), but there are major |
| 339 | differences and the implementation isn't currently tied to it in any |
| 340 | way, so beware about making generalizations between the two. |
| 341 | |
| 342 | 6.1 Expression syntax |
| 343 | --------------------- |
| 344 | |
| 345 | Triggers are added by echoing the command to the 'trigger' file: |
| 346 | |
| 347 | # echo 'command[:count] [if filter]' > trigger |
| 348 | |
| 349 | Triggers are removed by echoing the same command but starting with '!' |
| 350 | to the 'trigger' file: |
| 351 | |
| 352 | # echo '!command[:count] [if filter]' > trigger |
| 353 | |
| 354 | The [if filter] part isn't used in matching commands when removing, so |
| 355 | leaving that off in a '!' command will accomplish the same thing as |
| 356 | having it in. |
| 357 | |
| 358 | The filter syntax is the same as that described in the 'Event |
| 359 | filtering' section above. |
| 360 | |
| 361 | For ease of use, writing to the trigger file using '>' currently just |
| 362 | adds or removes a single trigger and there's no explicit '>>' support |
| 363 | ('>' actually behaves like '>>') or truncation support to remove all |
| 364 | triggers (you have to use '!' for each one added.) |
| 365 | |
| 366 | 6.2 Supported trigger commands |
| 367 | ------------------------------ |
| 368 | |
| 369 | The following commands are supported: |
| 370 | |
| 371 | - enable_event/disable_event |
| 372 | |
| 373 | These commands can enable or disable another trace event whenever |
| 374 | the triggering event is hit. When these commands are registered, |
| 375 | the other trace event is activated, but disabled in a "soft" mode. |
| 376 | That is, the tracepoint will be called, but just will not be traced. |
| 377 | The event tracepoint stays in this mode as long as there's a trigger |
| 378 | in effect that can trigger it. |
| 379 | |
| 380 | For example, the following trigger causes kmalloc events to be |
| 381 | traced when a read system call is entered, and the :1 at the end |
| 382 | specifies that this enablement happens only once: |
| 383 | |
| 384 | # echo 'enable_event:kmem:kmalloc:1' > \ |
| 385 | /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger |
| 386 | |
| 387 | The following trigger causes kmalloc events to stop being traced |
| 388 | when a read system call exits. This disablement happens on every |
| 389 | read system call exit: |
| 390 | |
| 391 | # echo 'disable_event:kmem:kmalloc' > \ |
| 392 | /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger |
| 393 | |
| 394 | The format is: |
| 395 | |
| 396 | enable_event:<system>:<event>[:count] |
| 397 | disable_event:<system>:<event>[:count] |
| 398 | |
| 399 | To remove the above commands: |
| 400 | |
| 401 | # echo '!enable_event:kmem:kmalloc:1' > \ |
| 402 | /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger |
| 403 | |
| 404 | # echo '!disable_event:kmem:kmalloc' > \ |
| 405 | /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger |
| 406 | |
| 407 | Note that there can be any number of enable/disable_event triggers |
| 408 | per triggering event, but there can only be one trigger per |
| 409 | triggered event. e.g. sys_enter_read can have triggers enabling both |
| 410 | kmem:kmalloc and sched:sched_switch, but can't have two kmem:kmalloc |
| 411 | versions such as kmem:kmalloc and kmem:kmalloc:1 or 'kmem:kmalloc if |
| 412 | bytes_req == 256' and 'kmem:kmalloc if bytes_alloc == 256' (they |
| 413 | could be combined into a single filter on kmem:kmalloc though). |
| 414 | |
| 415 | - stacktrace |
| 416 | |
| 417 | This command dumps a stacktrace in the trace buffer whenever the |
| 418 | triggering event occurs. |
| 419 | |
| 420 | For example, the following trigger dumps a stacktrace every time the |
| 421 | kmalloc tracepoint is hit: |
| 422 | |
| 423 | # echo 'stacktrace' > \ |
| 424 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 425 | |
| 426 | The following trigger dumps a stacktrace the first 5 times a kmalloc |
| 427 | request happens with a size >= 64K |
| 428 | |
| 429 | # echo 'stacktrace:5 if bytes_req >= 65536' > \ |
| 430 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 431 | |
| 432 | The format is: |
| 433 | |
| 434 | stacktrace[:count] |
| 435 | |
| 436 | To remove the above commands: |
| 437 | |
| 438 | # echo '!stacktrace' > \ |
| 439 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 440 | |
| 441 | # echo '!stacktrace:5 if bytes_req >= 65536' > \ |
| 442 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 443 | |
| 444 | The latter can also be removed more simply by the following (without |
| 445 | the filter): |
| 446 | |
| 447 | # echo '!stacktrace:5' > \ |
| 448 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 449 | |
| 450 | Note that there can be only one stacktrace trigger per triggering |
| 451 | event. |
| 452 | |
| 453 | - snapshot |
| 454 | |
| 455 | This command causes a snapshot to be triggered whenever the |
| 456 | triggering event occurs. |
| 457 | |
| 458 | The following command creates a snapshot every time a block request |
| 459 | queue is unplugged with a depth > 1. If you were tracing a set of |
| 460 | events or functions at the time, the snapshot trace buffer would |
Carlos Garcia | c98be0c | 2014-04-04 22:31:00 -0400 | [diff] [blame] | 461 | capture those events when the trigger event occurred: |
Tom Zanussi | ac38fb8 | 2013-10-24 08:59:30 -0500 | [diff] [blame] | 462 | |
| 463 | # echo 'snapshot if nr_rq > 1' > \ |
| 464 | /sys/kernel/debug/tracing/events/block/block_unplug/trigger |
| 465 | |
| 466 | To only snapshot once: |
| 467 | |
| 468 | # echo 'snapshot:1 if nr_rq > 1' > \ |
| 469 | /sys/kernel/debug/tracing/events/block/block_unplug/trigger |
| 470 | |
| 471 | To remove the above commands: |
| 472 | |
| 473 | # echo '!snapshot if nr_rq > 1' > \ |
| 474 | /sys/kernel/debug/tracing/events/block/block_unplug/trigger |
| 475 | |
| 476 | # echo '!snapshot:1 if nr_rq > 1' > \ |
| 477 | /sys/kernel/debug/tracing/events/block/block_unplug/trigger |
| 478 | |
| 479 | Note that there can be only one snapshot trigger per triggering |
| 480 | event. |
| 481 | |
| 482 | - traceon/traceoff |
| 483 | |
| 484 | These commands turn tracing on and off when the specified events are |
| 485 | hit. The parameter determines how many times the tracing system is |
| 486 | turned on and off. If unspecified, there is no limit. |
| 487 | |
| 488 | The following command turns tracing off the first time a block |
| 489 | request queue is unplugged with a depth > 1. If you were tracing a |
| 490 | set of events or functions at the time, you could then examine the |
| 491 | trace buffer to see the sequence of events that led up to the |
| 492 | trigger event: |
| 493 | |
| 494 | # echo 'traceoff:1 if nr_rq > 1' > \ |
| 495 | /sys/kernel/debug/tracing/events/block/block_unplug/trigger |
| 496 | |
| 497 | To always disable tracing when nr_rq > 1 : |
| 498 | |
| 499 | # echo 'traceoff if nr_rq > 1' > \ |
| 500 | /sys/kernel/debug/tracing/events/block/block_unplug/trigger |
| 501 | |
| 502 | To remove the above commands: |
| 503 | |
| 504 | # echo '!traceoff:1 if nr_rq > 1' > \ |
| 505 | /sys/kernel/debug/tracing/events/block/block_unplug/trigger |
| 506 | |
| 507 | # echo '!traceoff if nr_rq > 1' > \ |
| 508 | /sys/kernel/debug/tracing/events/block/block_unplug/trigger |
| 509 | |
| 510 | Note that there can be only one traceon or traceoff trigger per |
| 511 | triggering event. |
Tom Zanussi | 0fc3813 | 2016-03-03 12:54:56 -0600 | [diff] [blame] | 512 | |
| 513 | - hist |
| 514 | |
| 515 | This command aggregates event hits into a hash table keyed on one or |
| 516 | more trace event format fields (or stacktrace) and a set of running |
| 517 | totals derived from one or more trace event format fields and/or |
| 518 | event counts (hitcount). |
| 519 | |
| 520 | The format of a hist trigger is as follows: |
| 521 | |
| 522 | hist:keys=<field1[,field2,...]>[:values=<field1[,field2,...]>] |
| 523 | [:sort=<field1[,field2,...]>][:size=#entries][:pause][:continue] |
Tom Zanussi | 5463bfd | 2016-03-03 12:54:59 -0600 | [diff] [blame] | 524 | [:clear][:name=histname1] [if <filter>] |
Tom Zanussi | 0fc3813 | 2016-03-03 12:54:56 -0600 | [diff] [blame] | 525 | |
| 526 | When a matching event is hit, an entry is added to a hash table |
| 527 | using the key(s) and value(s) named. Keys and values correspond to |
| 528 | fields in the event's format description. Values must correspond to |
| 529 | numeric fields - on an event hit, the value(s) will be added to a |
| 530 | sum kept for that field. The special string 'hitcount' can be used |
| 531 | in place of an explicit value field - this is simply a count of |
| 532 | event hits. If 'values' isn't specified, an implicit 'hitcount' |
| 533 | value will be automatically created and used as the only value. |
| 534 | Keys can be any field, or the special string 'stacktrace', which |
| 535 | will use the event's kernel stacktrace as the key. The keywords |
| 536 | 'keys' or 'key' can be used to specify keys, and the keywords |
| 537 | 'values', 'vals', or 'val' can be used to specify values. Compound |
| 538 | keys consisting of up to two fields can be specified by the 'keys' |
| 539 | keyword. Hashing a compound key produces a unique entry in the |
| 540 | table for each unique combination of component keys, and can be |
| 541 | useful for providing more fine-grained summaries of event data. |
| 542 | Additionally, sort keys consisting of up to two fields can be |
| 543 | specified by the 'sort' keyword. If more than one field is |
| 544 | specified, the result will be a 'sort within a sort': the first key |
| 545 | is taken to be the primary sort key and the second the secondary |
Tom Zanussi | 5463bfd | 2016-03-03 12:54:59 -0600 | [diff] [blame] | 546 | key. If a hist trigger is given a name using the 'name' parameter, |
| 547 | its histogram data will be shared with other triggers of the same |
| 548 | name, and trigger hits will update this common data. Only triggers |
| 549 | with 'compatible' fields can be combined in this way; triggers are |
| 550 | 'compatible' if the fields named in the trigger share the same |
| 551 | number and type of fields and those fields also have the same names. |
| 552 | Note that any two events always share the compatible 'hitcount' and |
| 553 | 'stacktrace' fields and can therefore be combined using those |
| 554 | fields, however pointless that may be. |
Tom Zanussi | 0fc3813 | 2016-03-03 12:54:56 -0600 | [diff] [blame] | 555 | |
| 556 | 'hist' triggers add a 'hist' file to each event's subdirectory. |
| 557 | Reading the 'hist' file for the event will dump the hash table in |
Tom Zanussi | 52a7f16 | 2016-03-03 12:54:57 -0600 | [diff] [blame] | 558 | its entirety to stdout. If there are multiple hist triggers |
| 559 | attached to an event, there will be a table for each trigger in the |
Tom Zanussi | 5463bfd | 2016-03-03 12:54:59 -0600 | [diff] [blame] | 560 | output. The table displayed for a named trigger will be the same as |
| 561 | any other instance having the same name. Each printed hash table |
| 562 | entry is a simple list of the keys and values comprising the entry; |
| 563 | keys are printed first and are delineated by curly braces, and are |
| 564 | followed by the set of value fields for the entry. By default, |
| 565 | numeric fields are displayed as base-10 integers. This can be |
| 566 | modified by appending any of the following modifiers to the field |
| 567 | name: |
Tom Zanussi | 0fc3813 | 2016-03-03 12:54:56 -0600 | [diff] [blame] | 568 | |
| 569 | .hex display a number as a hex value |
| 570 | .sym display an address as a symbol |
| 571 | .sym-offset display an address as a symbol and offset |
| 572 | .syscall display a syscall id as a system call name |
| 573 | .execname display a common_pid as a program name |
| 574 | |
| 575 | Note that in general the semantics of a given field aren't |
| 576 | interpreted when applying a modifier to it, but there are some |
| 577 | restrictions to be aware of in this regard: |
| 578 | |
| 579 | - only the 'hex' modifier can be used for values (because values |
| 580 | are essentially sums, and the other modifiers don't make sense |
| 581 | in that context). |
| 582 | - the 'execname' modifier can only be used on a 'common_pid'. The |
| 583 | reason for this is that the execname is simply the 'comm' value |
| 584 | saved for the 'current' process when an event was triggered, |
| 585 | which is the same as the common_pid value saved by the event |
| 586 | tracing code. Trying to apply that comm value to other pid |
| 587 | values wouldn't be correct, and typically events that care save |
| 588 | pid-specific comm fields in the event itself. |
| 589 | |
| 590 | A typical usage scenario would be the following to enable a hist |
| 591 | trigger, read its current contents, and then turn it off: |
| 592 | |
| 593 | # echo 'hist:keys=skbaddr.hex:vals=len' > \ |
| 594 | /sys/kernel/debug/tracing/events/net/netif_rx/trigger |
| 595 | |
| 596 | # cat /sys/kernel/debug/tracing/events/net/netif_rx/hist |
| 597 | |
| 598 | # echo '!hist:keys=skbaddr.hex:vals=len' > \ |
| 599 | /sys/kernel/debug/tracing/events/net/netif_rx/trigger |
| 600 | |
| 601 | The trigger file itself can be read to show the details of the |
| 602 | currently attached hist trigger. This information is also displayed |
| 603 | at the top of the 'hist' file when read. |
| 604 | |
| 605 | By default, the size of the hash table is 2048 entries. The 'size' |
| 606 | parameter can be used to specify more or fewer than that. The units |
| 607 | are in terms of hashtable entries - if a run uses more entries than |
| 608 | specified, the results will show the number of 'drops', the number |
| 609 | of hits that were ignored. The size should be a power of 2 between |
| 610 | 128 and 131072 (any non- power-of-2 number specified will be rounded |
| 611 | up). |
| 612 | |
| 613 | The 'sort' parameter can be used to specify a value field to sort |
| 614 | on. The default if unspecified is 'hitcount' and the default sort |
| 615 | order is 'ascending'. To sort in the opposite direction, append |
| 616 | .descending' to the sort key. |
| 617 | |
| 618 | The 'pause' parameter can be used to pause an existing hist trigger |
| 619 | or to start a hist trigger but not log any events until told to do |
| 620 | so. 'continue' or 'cont' can be used to start or restart a paused |
| 621 | hist trigger. |
| 622 | |
| 623 | The 'clear' parameter will clear the contents of a running hist |
| 624 | trigger and leave its current paused/active state. |
| 625 | |
| 626 | Note that the 'pause', 'cont', and 'clear' parameters should be |
| 627 | applied using 'append' shell operator ('>>') if applied to an |
| 628 | existing trigger, rather than via the '>' operator, which will cause |
| 629 | the trigger to be removed through truncation. |
| 630 | |
| 631 | - enable_hist/disable_hist |
| 632 | |
| 633 | The enable_hist and disable_hist triggers can be used to have one |
| 634 | event conditionally start and stop another event's already-attached |
| 635 | hist trigger. Any number of enable_hist and disable_hist triggers |
| 636 | can be attached to a given event, allowing that event to kick off |
| 637 | and stop aggregations on a host of other events. |
| 638 | |
| 639 | The format is very similar to the enable/disable_event triggers: |
| 640 | |
| 641 | enable_hist:<system>:<event>[:count] |
| 642 | disable_hist:<system>:<event>[:count] |
| 643 | |
| 644 | Instead of enabling or disabling the tracing of the target event |
| 645 | into the trace buffer as the enable/disable_event triggers do, the |
| 646 | enable/disable_hist triggers enable or disable the aggregation of |
| 647 | the target event into a hash table. |
| 648 | |
| 649 | A typical usage scenario for the enable_hist/disable_hist triggers |
| 650 | would be to first set up a paused hist trigger on some event, |
| 651 | followed by an enable_hist/disable_hist pair that turns the hist |
| 652 | aggregation on and off when conditions of interest are hit: |
| 653 | |
| 654 | # echo 'hist:keys=skbaddr.hex:vals=len:pause' > \ |
| 655 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger |
| 656 | |
| 657 | # echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \ |
| 658 | /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger |
| 659 | |
| 660 | # echo 'disable_hist:net:netif_receive_skb if comm==wget' > \ |
| 661 | /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger |
| 662 | |
| 663 | The above sets up an initially paused hist trigger which is unpaused |
| 664 | and starts aggregating events when a given program is executed, and |
| 665 | which stops aggregating when the process exits and the hist trigger |
| 666 | is paused again. |
| 667 | |
| 668 | The examples below provide a more concrete illustration of the |
| 669 | concepts and typical usage patterns discussed above. |
| 670 | |
| 671 | |
| 672 | 6.2 'hist' trigger examples |
| 673 | --------------------------- |
| 674 | |
| 675 | The first set of examples creates aggregations using the kmalloc |
| 676 | event. The fields that can be used for the hist trigger are listed |
| 677 | in the kmalloc event's format file: |
| 678 | |
| 679 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/format |
| 680 | name: kmalloc |
| 681 | ID: 374 |
| 682 | format: |
| 683 | field:unsigned short common_type; offset:0; size:2; signed:0; |
| 684 | field:unsigned char common_flags; offset:2; size:1; signed:0; |
| 685 | field:unsigned char common_preempt_count; offset:3; size:1; signed:0; |
| 686 | field:int common_pid; offset:4; size:4; signed:1; |
| 687 | |
| 688 | field:unsigned long call_site; offset:8; size:8; signed:0; |
| 689 | field:const void * ptr; offset:16; size:8; signed:0; |
| 690 | field:size_t bytes_req; offset:24; size:8; signed:0; |
| 691 | field:size_t bytes_alloc; offset:32; size:8; signed:0; |
| 692 | field:gfp_t gfp_flags; offset:40; size:4; signed:0; |
| 693 | |
| 694 | We'll start by creating a hist trigger that generates a simple table |
| 695 | that lists the total number of bytes requested for each function in |
| 696 | the kernel that made one or more calls to kmalloc: |
| 697 | |
| 698 | # echo 'hist:key=call_site:val=bytes_req' > \ |
| 699 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 700 | |
| 701 | This tells the tracing system to create a 'hist' trigger using the |
| 702 | call_site field of the kmalloc event as the key for the table, which |
| 703 | just means that each unique call_site address will have an entry |
| 704 | created for it in the table. The 'val=bytes_req' parameter tells |
| 705 | the hist trigger that for each unique entry (call_site) in the |
| 706 | table, it should keep a running total of the number of bytes |
| 707 | requested by that call_site. |
| 708 | |
| 709 | We'll let it run for awhile and then dump the contents of the 'hist' |
| 710 | file in the kmalloc event's subdirectory (for readability, a number |
| 711 | of entries have been omitted): |
| 712 | |
| 713 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist |
| 714 | # trigger info: hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active] |
| 715 | |
| 716 | { call_site: 18446744072106379007 } hitcount: 1 bytes_req: 176 |
| 717 | { call_site: 18446744071579557049 } hitcount: 1 bytes_req: 1024 |
| 718 | { call_site: 18446744071580608289 } hitcount: 1 bytes_req: 16384 |
| 719 | { call_site: 18446744071581827654 } hitcount: 1 bytes_req: 24 |
| 720 | { call_site: 18446744071580700980 } hitcount: 1 bytes_req: 8 |
| 721 | { call_site: 18446744071579359876 } hitcount: 1 bytes_req: 152 |
| 722 | { call_site: 18446744071580795365 } hitcount: 3 bytes_req: 144 |
| 723 | { call_site: 18446744071581303129 } hitcount: 3 bytes_req: 144 |
| 724 | { call_site: 18446744071580713234 } hitcount: 4 bytes_req: 2560 |
| 725 | { call_site: 18446744071580933750 } hitcount: 4 bytes_req: 736 |
| 726 | . |
| 727 | . |
| 728 | . |
| 729 | { call_site: 18446744072106047046 } hitcount: 69 bytes_req: 5576 |
| 730 | { call_site: 18446744071582116407 } hitcount: 73 bytes_req: 2336 |
| 731 | { call_site: 18446744072106054684 } hitcount: 136 bytes_req: 140504 |
| 732 | { call_site: 18446744072106224230 } hitcount: 136 bytes_req: 19584 |
| 733 | { call_site: 18446744072106078074 } hitcount: 153 bytes_req: 2448 |
| 734 | { call_site: 18446744072106062406 } hitcount: 153 bytes_req: 36720 |
| 735 | { call_site: 18446744071582507929 } hitcount: 153 bytes_req: 37088 |
| 736 | { call_site: 18446744072102520590 } hitcount: 273 bytes_req: 10920 |
| 737 | { call_site: 18446744071582143559 } hitcount: 358 bytes_req: 716 |
| 738 | { call_site: 18446744072106465852 } hitcount: 417 bytes_req: 56712 |
| 739 | { call_site: 18446744072102523378 } hitcount: 485 bytes_req: 27160 |
| 740 | { call_site: 18446744072099568646 } hitcount: 1676 bytes_req: 33520 |
| 741 | |
| 742 | Totals: |
| 743 | Hits: 4610 |
| 744 | Entries: 45 |
| 745 | Dropped: 0 |
| 746 | |
| 747 | The output displays a line for each entry, beginning with the key |
| 748 | specified in the trigger, followed by the value(s) also specified in |
| 749 | the trigger. At the beginning of the output is a line that displays |
| 750 | the trigger info, which can also be displayed by reading the |
| 751 | 'trigger' file: |
| 752 | |
| 753 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 754 | hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active] |
| 755 | |
| 756 | At the end of the output are a few lines that display the overall |
| 757 | totals for the run. The 'Hits' field shows the total number of |
| 758 | times the event trigger was hit, the 'Entries' field shows the total |
| 759 | number of used entries in the hash table, and the 'Dropped' field |
| 760 | shows the number of hits that were dropped because the number of |
| 761 | used entries for the run exceeded the maximum number of entries |
| 762 | allowed for the table (normally 0, but if not a hint that you may |
| 763 | want to increase the size of the table using the 'size' parameter). |
| 764 | |
| 765 | Notice in the above output that there's an extra field, 'hitcount', |
| 766 | which wasn't specified in the trigger. Also notice that in the |
| 767 | trigger info output, there's a parameter, 'sort=hitcount', which |
| 768 | wasn't specified in the trigger either. The reason for that is that |
| 769 | every trigger implicitly keeps a count of the total number of hits |
| 770 | attributed to a given entry, called the 'hitcount'. That hitcount |
| 771 | information is explicitly displayed in the output, and in the |
| 772 | absence of a user-specified sort parameter, is used as the default |
| 773 | sort field. |
| 774 | |
| 775 | The value 'hitcount' can be used in place of an explicit value in |
| 776 | the 'values' parameter if you don't really need to have any |
| 777 | particular field summed and are mainly interested in hit |
| 778 | frequencies. |
| 779 | |
| 780 | To turn the hist trigger off, simply call up the trigger in the |
| 781 | command history and re-execute it with a '!' prepended: |
| 782 | |
| 783 | # echo '!hist:key=call_site:val=bytes_req' > \ |
| 784 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 785 | |
| 786 | Finally, notice that the call_site as displayed in the output above |
| 787 | isn't really very useful. It's an address, but normally addresses |
| 788 | are displayed in hex. To have a numeric field displayed as a hex |
| 789 | value, simply append '.hex' to the field name in the trigger: |
| 790 | |
| 791 | # echo 'hist:key=call_site.hex:val=bytes_req' > \ |
| 792 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 793 | |
| 794 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist |
| 795 | # trigger info: hist:keys=call_site.hex:vals=bytes_req:sort=hitcount:size=2048 [active] |
| 796 | |
| 797 | { call_site: ffffffffa026b291 } hitcount: 1 bytes_req: 433 |
| 798 | { call_site: ffffffffa07186ff } hitcount: 1 bytes_req: 176 |
| 799 | { call_site: ffffffff811ae721 } hitcount: 1 bytes_req: 16384 |
| 800 | { call_site: ffffffff811c5134 } hitcount: 1 bytes_req: 8 |
| 801 | { call_site: ffffffffa04a9ebb } hitcount: 1 bytes_req: 511 |
| 802 | { call_site: ffffffff8122e0a6 } hitcount: 1 bytes_req: 12 |
| 803 | { call_site: ffffffff8107da84 } hitcount: 1 bytes_req: 152 |
| 804 | { call_site: ffffffff812d8246 } hitcount: 1 bytes_req: 24 |
| 805 | { call_site: ffffffff811dc1e5 } hitcount: 3 bytes_req: 144 |
| 806 | { call_site: ffffffffa02515e8 } hitcount: 3 bytes_req: 648 |
| 807 | { call_site: ffffffff81258159 } hitcount: 3 bytes_req: 144 |
| 808 | { call_site: ffffffff811c80f4 } hitcount: 4 bytes_req: 544 |
| 809 | . |
| 810 | . |
| 811 | . |
| 812 | { call_site: ffffffffa06c7646 } hitcount: 106 bytes_req: 8024 |
| 813 | { call_site: ffffffffa06cb246 } hitcount: 132 bytes_req: 31680 |
| 814 | { call_site: ffffffffa06cef7a } hitcount: 132 bytes_req: 2112 |
| 815 | { call_site: ffffffff8137e399 } hitcount: 132 bytes_req: 23232 |
| 816 | { call_site: ffffffffa06c941c } hitcount: 185 bytes_req: 171360 |
| 817 | { call_site: ffffffffa06f2a66 } hitcount: 185 bytes_req: 26640 |
| 818 | { call_site: ffffffffa036a70e } hitcount: 265 bytes_req: 10600 |
| 819 | { call_site: ffffffff81325447 } hitcount: 292 bytes_req: 584 |
| 820 | { call_site: ffffffffa072da3c } hitcount: 446 bytes_req: 60656 |
| 821 | { call_site: ffffffffa036b1f2 } hitcount: 526 bytes_req: 29456 |
| 822 | { call_site: ffffffffa0099c06 } hitcount: 1780 bytes_req: 35600 |
| 823 | |
| 824 | Totals: |
| 825 | Hits: 4775 |
| 826 | Entries: 46 |
| 827 | Dropped: 0 |
| 828 | |
| 829 | Even that's only marginally more useful - while hex values do look |
| 830 | more like addresses, what users are typically more interested in |
| 831 | when looking at text addresses are the corresponding symbols |
| 832 | instead. To have an address displayed as symbolic value instead, |
| 833 | simply append '.sym' or '.sym-offset' to the field name in the |
| 834 | trigger: |
| 835 | |
| 836 | # echo 'hist:key=call_site.sym:val=bytes_req' > \ |
| 837 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 838 | |
| 839 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist |
| 840 | # trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=hitcount:size=2048 [active] |
| 841 | |
| 842 | { call_site: [ffffffff810adcb9] syslog_print_all } hitcount: 1 bytes_req: 1024 |
| 843 | { call_site: [ffffffff8154bc62] usb_control_msg } hitcount: 1 bytes_req: 8 |
| 844 | { call_site: [ffffffffa00bf6fe] hidraw_send_report [hid] } hitcount: 1 bytes_req: 7 |
| 845 | { call_site: [ffffffff8154acbe] usb_alloc_urb } hitcount: 1 bytes_req: 192 |
| 846 | { call_site: [ffffffffa00bf1ca] hidraw_report_event [hid] } hitcount: 1 bytes_req: 7 |
| 847 | { call_site: [ffffffff811e3a25] __seq_open_private } hitcount: 1 bytes_req: 40 |
| 848 | { call_site: [ffffffff8109524a] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 |
| 849 | { call_site: [ffffffff811febd5] fsnotify_alloc_group } hitcount: 2 bytes_req: 528 |
| 850 | { call_site: [ffffffff81440f58] __tty_buffer_request_room } hitcount: 2 bytes_req: 2624 |
| 851 | { call_site: [ffffffff81200ba6] inotify_new_group } hitcount: 2 bytes_req: 96 |
| 852 | { call_site: [ffffffffa05e19af] ieee80211_start_tx_ba_session [mac80211] } hitcount: 2 bytes_req: 464 |
| 853 | { call_site: [ffffffff81672406] tcp_get_metrics } hitcount: 2 bytes_req: 304 |
| 854 | { call_site: [ffffffff81097ec2] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 |
| 855 | { call_site: [ffffffff81089b05] sched_create_group } hitcount: 2 bytes_req: 1424 |
| 856 | . |
| 857 | . |
| 858 | . |
| 859 | { call_site: [ffffffffa04a580c] intel_crtc_page_flip [i915] } hitcount: 1185 bytes_req: 123240 |
| 860 | { call_site: [ffffffffa0287592] drm_mode_page_flip_ioctl [drm] } hitcount: 1185 bytes_req: 104280 |
| 861 | { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state [i915] } hitcount: 1402 bytes_req: 190672 |
| 862 | { call_site: [ffffffff812891ca] ext4_find_extent } hitcount: 1518 bytes_req: 146208 |
| 863 | { call_site: [ffffffffa029070e] drm_vma_node_allow [drm] } hitcount: 1746 bytes_req: 69840 |
| 864 | { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 2021 bytes_req: 792312 |
| 865 | { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc [drm] } hitcount: 2592 bytes_req: 145152 |
| 866 | { call_site: [ffffffffa0489a66] intel_ring_begin [i915] } hitcount: 2629 bytes_req: 378576 |
| 867 | { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 2629 bytes_req: 3783248 |
| 868 | { call_site: [ffffffff81325607] apparmor_file_alloc_security } hitcount: 5192 bytes_req: 10384 |
| 869 | { call_site: [ffffffffa00b7c06] hid_report_raw_event [hid] } hitcount: 5529 bytes_req: 110584 |
| 870 | { call_site: [ffffffff8131ebf7] aa_alloc_task_context } hitcount: 21943 bytes_req: 702176 |
| 871 | { call_site: [ffffffff8125847d] ext4_htree_store_dirent } hitcount: 55759 bytes_req: 5074265 |
| 872 | |
| 873 | Totals: |
| 874 | Hits: 109928 |
| 875 | Entries: 71 |
| 876 | Dropped: 0 |
| 877 | |
| 878 | Because the default sort key above is 'hitcount', the above shows a |
| 879 | the list of call_sites by increasing hitcount, so that at the bottom |
| 880 | we see the functions that made the most kmalloc calls during the |
| 881 | run. If instead we we wanted to see the top kmalloc callers in |
| 882 | terms of the number of bytes requested rather than the number of |
| 883 | calls, and we wanted the top caller to appear at the top, we can use |
| 884 | the 'sort' parameter, along with the 'descending' modifier: |
| 885 | |
| 886 | # echo 'hist:key=call_site.sym:val=bytes_req:sort=bytes_req.descending' > \ |
| 887 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 888 | |
| 889 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist |
| 890 | # trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=bytes_req.descending:size=2048 [active] |
| 891 | |
| 892 | { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 2186 bytes_req: 3397464 |
| 893 | { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 1790 bytes_req: 712176 |
| 894 | { call_site: [ffffffff8125847d] ext4_htree_store_dirent } hitcount: 8132 bytes_req: 513135 |
| 895 | { call_site: [ffffffff811e2a1b] seq_buf_alloc } hitcount: 106 bytes_req: 440128 |
| 896 | { call_site: [ffffffffa0489a66] intel_ring_begin [i915] } hitcount: 2186 bytes_req: 314784 |
| 897 | { call_site: [ffffffff812891ca] ext4_find_extent } hitcount: 2174 bytes_req: 208992 |
| 898 | { call_site: [ffffffff811ae8e1] __kmalloc } hitcount: 8 bytes_req: 131072 |
| 899 | { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state [i915] } hitcount: 859 bytes_req: 116824 |
| 900 | { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc [drm] } hitcount: 1834 bytes_req: 102704 |
| 901 | { call_site: [ffffffffa04a580c] intel_crtc_page_flip [i915] } hitcount: 972 bytes_req: 101088 |
| 902 | { call_site: [ffffffffa0287592] drm_mode_page_flip_ioctl [drm] } hitcount: 972 bytes_req: 85536 |
| 903 | { call_site: [ffffffffa00b7c06] hid_report_raw_event [hid] } hitcount: 3333 bytes_req: 66664 |
| 904 | { call_site: [ffffffff8137e559] sg_kmalloc } hitcount: 209 bytes_req: 61632 |
| 905 | . |
| 906 | . |
| 907 | . |
| 908 | { call_site: [ffffffff81095225] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 |
| 909 | { call_site: [ffffffff81097ec2] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 |
| 910 | { call_site: [ffffffff812d8406] copy_semundo } hitcount: 2 bytes_req: 48 |
| 911 | { call_site: [ffffffff81200ba6] inotify_new_group } hitcount: 1 bytes_req: 48 |
| 912 | { call_site: [ffffffffa027121a] drm_getmagic [drm] } hitcount: 1 bytes_req: 48 |
| 913 | { call_site: [ffffffff811e3a25] __seq_open_private } hitcount: 1 bytes_req: 40 |
| 914 | { call_site: [ffffffff811c52f4] bprm_change_interp } hitcount: 2 bytes_req: 16 |
| 915 | { call_site: [ffffffff8154bc62] usb_control_msg } hitcount: 1 bytes_req: 8 |
| 916 | { call_site: [ffffffffa00bf1ca] hidraw_report_event [hid] } hitcount: 1 bytes_req: 7 |
| 917 | { call_site: [ffffffffa00bf6fe] hidraw_send_report [hid] } hitcount: 1 bytes_req: 7 |
| 918 | |
| 919 | Totals: |
| 920 | Hits: 32133 |
| 921 | Entries: 81 |
| 922 | Dropped: 0 |
| 923 | |
| 924 | To display the offset and size information in addition to the symbol |
| 925 | name, just use 'sym-offset' instead: |
| 926 | |
| 927 | # echo 'hist:key=call_site.sym-offset:val=bytes_req:sort=bytes_req.descending' > \ |
| 928 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 929 | |
| 930 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist |
| 931 | # trigger info: hist:keys=call_site.sym-offset:vals=bytes_req:sort=bytes_req.descending:size=2048 [active] |
| 932 | |
| 933 | { call_site: [ffffffffa046041c] i915_gem_execbuffer2+0x6c/0x2c0 [i915] } hitcount: 4569 bytes_req: 3163720 |
| 934 | { call_site: [ffffffffa0489a66] intel_ring_begin+0xc6/0x1f0 [i915] } hitcount: 4569 bytes_req: 657936 |
| 935 | { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23+0x694/0x1020 [i915] } hitcount: 1519 bytes_req: 472936 |
| 936 | { call_site: [ffffffffa045e646] i915_gem_do_execbuffer.isra.23+0x516/0x1020 [i915] } hitcount: 3050 bytes_req: 211832 |
| 937 | { call_site: [ffffffff811e2a1b] seq_buf_alloc+0x1b/0x50 } hitcount: 34 bytes_req: 148384 |
| 938 | { call_site: [ffffffffa04a580c] intel_crtc_page_flip+0xbc/0x870 [i915] } hitcount: 1385 bytes_req: 144040 |
| 939 | { call_site: [ffffffff811ae8e1] __kmalloc+0x191/0x1b0 } hitcount: 8 bytes_req: 131072 |
| 940 | { call_site: [ffffffffa0287592] drm_mode_page_flip_ioctl+0x282/0x360 [drm] } hitcount: 1385 bytes_req: 121880 |
| 941 | { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc+0x32/0x100 [drm] } hitcount: 1848 bytes_req: 103488 |
| 942 | { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state+0x2c/0xa0 [i915] } hitcount: 461 bytes_req: 62696 |
| 943 | { call_site: [ffffffffa029070e] drm_vma_node_allow+0x2e/0xd0 [drm] } hitcount: 1541 bytes_req: 61640 |
| 944 | { call_site: [ffffffff815f8d7b] sk_prot_alloc+0xcb/0x1b0 } hitcount: 57 bytes_req: 57456 |
| 945 | . |
| 946 | . |
| 947 | . |
| 948 | { call_site: [ffffffff8109524a] alloc_fair_sched_group+0x5a/0x1a0 } hitcount: 2 bytes_req: 128 |
| 949 | { call_site: [ffffffffa027b921] drm_vm_open_locked+0x31/0xa0 [drm] } hitcount: 3 bytes_req: 96 |
| 950 | { call_site: [ffffffff8122e266] proc_self_follow_link+0x76/0xb0 } hitcount: 8 bytes_req: 96 |
| 951 | { call_site: [ffffffff81213e80] load_elf_binary+0x240/0x1650 } hitcount: 3 bytes_req: 84 |
| 952 | { call_site: [ffffffff8154bc62] usb_control_msg+0x42/0x110 } hitcount: 1 bytes_req: 8 |
| 953 | { call_site: [ffffffffa00bf6fe] hidraw_send_report+0x7e/0x1a0 [hid] } hitcount: 1 bytes_req: 7 |
| 954 | { call_site: [ffffffffa00bf1ca] hidraw_report_event+0x8a/0x120 [hid] } hitcount: 1 bytes_req: 7 |
| 955 | |
| 956 | Totals: |
| 957 | Hits: 26098 |
| 958 | Entries: 64 |
| 959 | Dropped: 0 |
| 960 | |
| 961 | We can also add multiple fields to the 'values' parameter. For |
| 962 | example, we might want to see the total number of bytes allocated |
| 963 | alongside bytes requested, and display the result sorted by bytes |
| 964 | allocated in a descending order: |
| 965 | |
| 966 | # echo 'hist:keys=call_site.sym:values=bytes_req,bytes_alloc:sort=bytes_alloc.descending' > \ |
| 967 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 968 | |
| 969 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist |
| 970 | # trigger info: hist:keys=call_site.sym:vals=bytes_req,bytes_alloc:sort=bytes_alloc.descending:size=2048 [active] |
| 971 | |
| 972 | { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 7403 bytes_req: 4084360 bytes_alloc: 5958016 |
| 973 | { call_site: [ffffffff811e2a1b] seq_buf_alloc } hitcount: 541 bytes_req: 2213968 bytes_alloc: 2228224 |
| 974 | { call_site: [ffffffffa0489a66] intel_ring_begin [i915] } hitcount: 7404 bytes_req: 1066176 bytes_alloc: 1421568 |
| 975 | { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 1565 bytes_req: 557368 bytes_alloc: 1037760 |
| 976 | { call_site: [ffffffff8125847d] ext4_htree_store_dirent } hitcount: 9557 bytes_req: 595778 bytes_alloc: 695744 |
| 977 | { call_site: [ffffffffa045e646] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 5839 bytes_req: 430680 bytes_alloc: 470400 |
| 978 | { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state [i915] } hitcount: 2388 bytes_req: 324768 bytes_alloc: 458496 |
| 979 | { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc [drm] } hitcount: 3911 bytes_req: 219016 bytes_alloc: 250304 |
| 980 | { call_site: [ffffffff815f8d7b] sk_prot_alloc } hitcount: 235 bytes_req: 236880 bytes_alloc: 240640 |
| 981 | { call_site: [ffffffff8137e559] sg_kmalloc } hitcount: 557 bytes_req: 169024 bytes_alloc: 221760 |
| 982 | { call_site: [ffffffffa00b7c06] hid_report_raw_event [hid] } hitcount: 9378 bytes_req: 187548 bytes_alloc: 206312 |
| 983 | { call_site: [ffffffffa04a580c] intel_crtc_page_flip [i915] } hitcount: 1519 bytes_req: 157976 bytes_alloc: 194432 |
| 984 | . |
| 985 | . |
| 986 | . |
| 987 | { call_site: [ffffffff8109bd3b] sched_autogroup_create_attach } hitcount: 2 bytes_req: 144 bytes_alloc: 192 |
| 988 | { call_site: [ffffffff81097ee8] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128 |
| 989 | { call_site: [ffffffff8109524a] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128 |
| 990 | { call_site: [ffffffff81095225] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128 |
| 991 | { call_site: [ffffffff81097ec2] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128 |
| 992 | { call_site: [ffffffff81213e80] load_elf_binary } hitcount: 3 bytes_req: 84 bytes_alloc: 96 |
| 993 | { call_site: [ffffffff81079a2e] kthread_create_on_node } hitcount: 1 bytes_req: 56 bytes_alloc: 64 |
| 994 | { call_site: [ffffffffa00bf6fe] hidraw_send_report [hid] } hitcount: 1 bytes_req: 7 bytes_alloc: 8 |
| 995 | { call_site: [ffffffff8154bc62] usb_control_msg } hitcount: 1 bytes_req: 8 bytes_alloc: 8 |
| 996 | { call_site: [ffffffffa00bf1ca] hidraw_report_event [hid] } hitcount: 1 bytes_req: 7 bytes_alloc: 8 |
| 997 | |
| 998 | Totals: |
| 999 | Hits: 66598 |
| 1000 | Entries: 65 |
| 1001 | Dropped: 0 |
| 1002 | |
| 1003 | Finally, to finish off our kmalloc example, instead of simply having |
| 1004 | the hist trigger display symbolic call_sites, we can have the hist |
| 1005 | trigger additionally display the complete set of kernel stack traces |
| 1006 | that led to each call_site. To do that, we simply use the special |
| 1007 | value 'stacktrace' for the key parameter: |
| 1008 | |
| 1009 | # echo 'hist:keys=stacktrace:values=bytes_req,bytes_alloc:sort=bytes_alloc' > \ |
| 1010 | /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger |
| 1011 | |
| 1012 | The above trigger will use the kernel stack trace in effect when an |
| 1013 | event is triggered as the key for the hash table. This allows the |
| 1014 | enumeration of every kernel callpath that led up to a particular |
| 1015 | event, along with a running total of any of the event fields for |
| 1016 | that event. Here we tally bytes requested and bytes allocated for |
| 1017 | every callpath in the system that led up to a kmalloc (in this case |
| 1018 | every callpath to a kmalloc for a kernel compile): |
| 1019 | |
| 1020 | # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist |
| 1021 | # trigger info: hist:keys=stacktrace:vals=bytes_req,bytes_alloc:sort=bytes_alloc:size=2048 [active] |
| 1022 | |
| 1023 | { stacktrace: |
| 1024 | __kmalloc_track_caller+0x10b/0x1a0 |
| 1025 | kmemdup+0x20/0x50 |
| 1026 | hidraw_report_event+0x8a/0x120 [hid] |
| 1027 | hid_report_raw_event+0x3ea/0x440 [hid] |
| 1028 | hid_input_report+0x112/0x190 [hid] |
| 1029 | hid_irq_in+0xc2/0x260 [usbhid] |
| 1030 | __usb_hcd_giveback_urb+0x72/0x120 |
| 1031 | usb_giveback_urb_bh+0x9e/0xe0 |
| 1032 | tasklet_hi_action+0xf8/0x100 |
| 1033 | __do_softirq+0x114/0x2c0 |
| 1034 | irq_exit+0xa5/0xb0 |
| 1035 | do_IRQ+0x5a/0xf0 |
| 1036 | ret_from_intr+0x0/0x30 |
| 1037 | cpuidle_enter+0x17/0x20 |
| 1038 | cpu_startup_entry+0x315/0x3e0 |
| 1039 | rest_init+0x7c/0x80 |
| 1040 | } hitcount: 3 bytes_req: 21 bytes_alloc: 24 |
| 1041 | { stacktrace: |
| 1042 | __kmalloc_track_caller+0x10b/0x1a0 |
| 1043 | kmemdup+0x20/0x50 |
| 1044 | hidraw_report_event+0x8a/0x120 [hid] |
| 1045 | hid_report_raw_event+0x3ea/0x440 [hid] |
| 1046 | hid_input_report+0x112/0x190 [hid] |
| 1047 | hid_irq_in+0xc2/0x260 [usbhid] |
| 1048 | __usb_hcd_giveback_urb+0x72/0x120 |
| 1049 | usb_giveback_urb_bh+0x9e/0xe0 |
| 1050 | tasklet_hi_action+0xf8/0x100 |
| 1051 | __do_softirq+0x114/0x2c0 |
| 1052 | irq_exit+0xa5/0xb0 |
| 1053 | do_IRQ+0x5a/0xf0 |
| 1054 | ret_from_intr+0x0/0x30 |
| 1055 | } hitcount: 3 bytes_req: 21 bytes_alloc: 24 |
| 1056 | { stacktrace: |
| 1057 | kmem_cache_alloc_trace+0xeb/0x150 |
| 1058 | aa_alloc_task_context+0x27/0x40 |
| 1059 | apparmor_cred_prepare+0x1f/0x50 |
| 1060 | security_prepare_creds+0x16/0x20 |
| 1061 | prepare_creds+0xdf/0x1a0 |
| 1062 | SyS_capset+0xb5/0x200 |
| 1063 | system_call_fastpath+0x12/0x6a |
| 1064 | } hitcount: 1 bytes_req: 32 bytes_alloc: 32 |
| 1065 | . |
| 1066 | . |
| 1067 | . |
| 1068 | { stacktrace: |
| 1069 | __kmalloc+0x11b/0x1b0 |
| 1070 | i915_gem_execbuffer2+0x6c/0x2c0 [i915] |
| 1071 | drm_ioctl+0x349/0x670 [drm] |
| 1072 | do_vfs_ioctl+0x2f0/0x4f0 |
| 1073 | SyS_ioctl+0x81/0xa0 |
| 1074 | system_call_fastpath+0x12/0x6a |
| 1075 | } hitcount: 17726 bytes_req: 13944120 bytes_alloc: 19593808 |
| 1076 | { stacktrace: |
| 1077 | __kmalloc+0x11b/0x1b0 |
| 1078 | load_elf_phdrs+0x76/0xa0 |
| 1079 | load_elf_binary+0x102/0x1650 |
| 1080 | search_binary_handler+0x97/0x1d0 |
| 1081 | do_execveat_common.isra.34+0x551/0x6e0 |
| 1082 | SyS_execve+0x3a/0x50 |
| 1083 | return_from_execve+0x0/0x23 |
| 1084 | } hitcount: 33348 bytes_req: 17152128 bytes_alloc: 20226048 |
| 1085 | { stacktrace: |
| 1086 | kmem_cache_alloc_trace+0xeb/0x150 |
| 1087 | apparmor_file_alloc_security+0x27/0x40 |
| 1088 | security_file_alloc+0x16/0x20 |
| 1089 | get_empty_filp+0x93/0x1c0 |
| 1090 | path_openat+0x31/0x5f0 |
| 1091 | do_filp_open+0x3a/0x90 |
| 1092 | do_sys_open+0x128/0x220 |
| 1093 | SyS_open+0x1e/0x20 |
| 1094 | system_call_fastpath+0x12/0x6a |
| 1095 | } hitcount: 4766422 bytes_req: 9532844 bytes_alloc: 38131376 |
| 1096 | { stacktrace: |
| 1097 | __kmalloc+0x11b/0x1b0 |
| 1098 | seq_buf_alloc+0x1b/0x50 |
| 1099 | seq_read+0x2cc/0x370 |
| 1100 | proc_reg_read+0x3d/0x80 |
| 1101 | __vfs_read+0x28/0xe0 |
| 1102 | vfs_read+0x86/0x140 |
| 1103 | SyS_read+0x46/0xb0 |
| 1104 | system_call_fastpath+0x12/0x6a |
| 1105 | } hitcount: 19133 bytes_req: 78368768 bytes_alloc: 78368768 |
| 1106 | |
| 1107 | Totals: |
| 1108 | Hits: 6085872 |
| 1109 | Entries: 253 |
| 1110 | Dropped: 0 |
| 1111 | |
| 1112 | If you key a hist trigger on common_pid, in order for example to |
| 1113 | gather and display sorted totals for each process, you can use the |
| 1114 | special .execname modifier to display the executable names for the |
| 1115 | processes in the table rather than raw pids. The example below |
| 1116 | keeps a per-process sum of total bytes read: |
| 1117 | |
| 1118 | # echo 'hist:key=common_pid.execname:val=count:sort=count.descending' > \ |
| 1119 | /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger |
| 1120 | |
| 1121 | # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/hist |
| 1122 | # trigger info: hist:keys=common_pid.execname:vals=count:sort=count.descending:size=2048 [active] |
| 1123 | |
| 1124 | { common_pid: gnome-terminal [ 3196] } hitcount: 280 count: 1093512 |
| 1125 | { common_pid: Xorg [ 1309] } hitcount: 525 count: 256640 |
| 1126 | { common_pid: compiz [ 2889] } hitcount: 59 count: 254400 |
| 1127 | { common_pid: bash [ 8710] } hitcount: 3 count: 66369 |
| 1128 | { common_pid: dbus-daemon-lau [ 8703] } hitcount: 49 count: 47739 |
| 1129 | { common_pid: irqbalance [ 1252] } hitcount: 27 count: 27648 |
| 1130 | { common_pid: 01ifupdown [ 8705] } hitcount: 3 count: 17216 |
| 1131 | { common_pid: dbus-daemon [ 772] } hitcount: 10 count: 12396 |
| 1132 | { common_pid: Socket Thread [ 8342] } hitcount: 11 count: 11264 |
| 1133 | { common_pid: nm-dhcp-client. [ 8701] } hitcount: 6 count: 7424 |
| 1134 | { common_pid: gmain [ 1315] } hitcount: 18 count: 6336 |
| 1135 | . |
| 1136 | . |
| 1137 | . |
| 1138 | { common_pid: postgres [ 1892] } hitcount: 2 count: 32 |
| 1139 | { common_pid: postgres [ 1891] } hitcount: 2 count: 32 |
| 1140 | { common_pid: gmain [ 8704] } hitcount: 2 count: 32 |
| 1141 | { common_pid: upstart-dbus-br [ 2740] } hitcount: 21 count: 21 |
| 1142 | { common_pid: nm-dispatcher.a [ 8696] } hitcount: 1 count: 16 |
| 1143 | { common_pid: indicator-datet [ 2904] } hitcount: 1 count: 16 |
| 1144 | { common_pid: gdbus [ 2998] } hitcount: 1 count: 16 |
| 1145 | { common_pid: rtkit-daemon [ 2052] } hitcount: 1 count: 8 |
| 1146 | { common_pid: init [ 1] } hitcount: 2 count: 2 |
| 1147 | |
| 1148 | Totals: |
| 1149 | Hits: 2116 |
| 1150 | Entries: 51 |
| 1151 | Dropped: 0 |
| 1152 | |
| 1153 | Similarly, if you key a hist trigger on syscall id, for example to |
| 1154 | gather and display a list of systemwide syscall hits, you can use |
| 1155 | the special .syscall modifier to display the syscall names rather |
| 1156 | than raw ids. The example below keeps a running total of syscall |
| 1157 | counts for the system during the run: |
| 1158 | |
| 1159 | # echo 'hist:key=id.syscall:val=hitcount' > \ |
| 1160 | /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger |
| 1161 | |
| 1162 | # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist |
| 1163 | # trigger info: hist:keys=id.syscall:vals=hitcount:sort=hitcount:size=2048 [active] |
| 1164 | |
| 1165 | { id: sys_fsync [ 74] } hitcount: 1 |
| 1166 | { id: sys_newuname [ 63] } hitcount: 1 |
| 1167 | { id: sys_prctl [157] } hitcount: 1 |
| 1168 | { id: sys_statfs [137] } hitcount: 1 |
| 1169 | { id: sys_symlink [ 88] } hitcount: 1 |
| 1170 | { id: sys_sendmmsg [307] } hitcount: 1 |
| 1171 | { id: sys_semctl [ 66] } hitcount: 1 |
| 1172 | { id: sys_readlink [ 89] } hitcount: 3 |
| 1173 | { id: sys_bind [ 49] } hitcount: 3 |
| 1174 | { id: sys_getsockname [ 51] } hitcount: 3 |
| 1175 | { id: sys_unlink [ 87] } hitcount: 3 |
| 1176 | { id: sys_rename [ 82] } hitcount: 4 |
| 1177 | { id: unknown_syscall [ 58] } hitcount: 4 |
| 1178 | { id: sys_connect [ 42] } hitcount: 4 |
| 1179 | { id: sys_getpid [ 39] } hitcount: 4 |
| 1180 | . |
| 1181 | . |
| 1182 | . |
| 1183 | { id: sys_rt_sigprocmask [ 14] } hitcount: 952 |
| 1184 | { id: sys_futex [202] } hitcount: 1534 |
| 1185 | { id: sys_write [ 1] } hitcount: 2689 |
| 1186 | { id: sys_setitimer [ 38] } hitcount: 2797 |
| 1187 | { id: sys_read [ 0] } hitcount: 3202 |
| 1188 | { id: sys_select [ 23] } hitcount: 3773 |
| 1189 | { id: sys_writev [ 20] } hitcount: 4531 |
| 1190 | { id: sys_poll [ 7] } hitcount: 8314 |
| 1191 | { id: sys_recvmsg [ 47] } hitcount: 13738 |
| 1192 | { id: sys_ioctl [ 16] } hitcount: 21843 |
| 1193 | |
| 1194 | Totals: |
| 1195 | Hits: 67612 |
| 1196 | Entries: 72 |
| 1197 | Dropped: 0 |
| 1198 | |
| 1199 | The syscall counts above provide a rough overall picture of system |
| 1200 | call activity on the system; we can see for example that the most |
| 1201 | popular system call on this system was the 'sys_ioctl' system call. |
| 1202 | |
| 1203 | We can use 'compound' keys to refine that number and provide some |
| 1204 | further insight as to which processes exactly contribute to the |
| 1205 | overall ioctl count. |
| 1206 | |
| 1207 | The command below keeps a hitcount for every unique combination of |
| 1208 | system call id and pid - the end result is essentially a table |
| 1209 | that keeps a per-pid sum of system call hits. The results are |
| 1210 | sorted using the system call id as the primary key, and the |
| 1211 | hitcount sum as the secondary key: |
| 1212 | |
| 1213 | # echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount' > \ |
| 1214 | /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger |
| 1215 | |
| 1216 | # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist |
| 1217 | # trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 [active] |
| 1218 | |
| 1219 | { id: sys_read [ 0], common_pid: rtkit-daemon [ 1877] } hitcount: 1 |
| 1220 | { id: sys_read [ 0], common_pid: gdbus [ 2976] } hitcount: 1 |
| 1221 | { id: sys_read [ 0], common_pid: console-kit-dae [ 3400] } hitcount: 1 |
| 1222 | { id: sys_read [ 0], common_pid: postgres [ 1865] } hitcount: 1 |
| 1223 | { id: sys_read [ 0], common_pid: deja-dup-monito [ 3543] } hitcount: 2 |
| 1224 | { id: sys_read [ 0], common_pid: NetworkManager [ 890] } hitcount: 2 |
| 1225 | { id: sys_read [ 0], common_pid: evolution-calen [ 3048] } hitcount: 2 |
| 1226 | { id: sys_read [ 0], common_pid: postgres [ 1864] } hitcount: 2 |
| 1227 | { id: sys_read [ 0], common_pid: nm-applet [ 3022] } hitcount: 2 |
| 1228 | { id: sys_read [ 0], common_pid: whoopsie [ 1212] } hitcount: 2 |
| 1229 | . |
| 1230 | . |
| 1231 | . |
| 1232 | { id: sys_ioctl [ 16], common_pid: bash [ 8479] } hitcount: 1 |
| 1233 | { id: sys_ioctl [ 16], common_pid: bash [ 3472] } hitcount: 12 |
| 1234 | { id: sys_ioctl [ 16], common_pid: gnome-terminal [ 3199] } hitcount: 16 |
| 1235 | { id: sys_ioctl [ 16], common_pid: Xorg [ 1267] } hitcount: 1808 |
| 1236 | { id: sys_ioctl [ 16], common_pid: compiz [ 2994] } hitcount: 5580 |
| 1237 | . |
| 1238 | . |
| 1239 | . |
| 1240 | { id: sys_waitid [247], common_pid: upstart-dbus-br [ 2690] } hitcount: 3 |
| 1241 | { id: sys_waitid [247], common_pid: upstart-dbus-br [ 2688] } hitcount: 16 |
| 1242 | { id: sys_inotify_add_watch [254], common_pid: gmain [ 975] } hitcount: 2 |
| 1243 | { id: sys_inotify_add_watch [254], common_pid: gmain [ 3204] } hitcount: 4 |
| 1244 | { id: sys_inotify_add_watch [254], common_pid: gmain [ 2888] } hitcount: 4 |
| 1245 | { id: sys_inotify_add_watch [254], common_pid: gmain [ 3003] } hitcount: 4 |
| 1246 | { id: sys_inotify_add_watch [254], common_pid: gmain [ 2873] } hitcount: 4 |
| 1247 | { id: sys_inotify_add_watch [254], common_pid: gmain [ 3196] } hitcount: 6 |
| 1248 | { id: sys_openat [257], common_pid: java [ 2623] } hitcount: 2 |
| 1249 | { id: sys_eventfd2 [290], common_pid: ibus-ui-gtk3 [ 2760] } hitcount: 4 |
| 1250 | { id: sys_eventfd2 [290], common_pid: compiz [ 2994] } hitcount: 6 |
| 1251 | |
| 1252 | Totals: |
| 1253 | Hits: 31536 |
| 1254 | Entries: 323 |
| 1255 | Dropped: 0 |
| 1256 | |
| 1257 | The above list does give us a breakdown of the ioctl syscall by |
| 1258 | pid, but it also gives us quite a bit more than that, which we |
| 1259 | don't really care about at the moment. Since we know the syscall |
| 1260 | id for sys_ioctl (16, displayed next to the sys_ioctl name), we |
| 1261 | can use that to filter out all the other syscalls: |
| 1262 | |
| 1263 | # echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount if id == 16' > \ |
| 1264 | /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger |
| 1265 | |
| 1266 | # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist |
| 1267 | # trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 if id == 16 [active] |
| 1268 | |
| 1269 | { id: sys_ioctl [ 16], common_pid: gmain [ 2769] } hitcount: 1 |
| 1270 | { id: sys_ioctl [ 16], common_pid: evolution-addre [ 8571] } hitcount: 1 |
| 1271 | { id: sys_ioctl [ 16], common_pid: gmain [ 3003] } hitcount: 1 |
| 1272 | { id: sys_ioctl [ 16], common_pid: gmain [ 2781] } hitcount: 1 |
| 1273 | { id: sys_ioctl [ 16], common_pid: gmain [ 2829] } hitcount: 1 |
| 1274 | { id: sys_ioctl [ 16], common_pid: bash [ 8726] } hitcount: 1 |
| 1275 | { id: sys_ioctl [ 16], common_pid: bash [ 8508] } hitcount: 1 |
| 1276 | { id: sys_ioctl [ 16], common_pid: gmain [ 2970] } hitcount: 1 |
| 1277 | { id: sys_ioctl [ 16], common_pid: gmain [ 2768] } hitcount: 1 |
| 1278 | . |
| 1279 | . |
| 1280 | . |
| 1281 | { id: sys_ioctl [ 16], common_pid: pool [ 8559] } hitcount: 45 |
| 1282 | { id: sys_ioctl [ 16], common_pid: pool [ 8555] } hitcount: 48 |
| 1283 | { id: sys_ioctl [ 16], common_pid: pool [ 8551] } hitcount: 48 |
| 1284 | { id: sys_ioctl [ 16], common_pid: avahi-daemon [ 896] } hitcount: 66 |
| 1285 | { id: sys_ioctl [ 16], common_pid: Xorg [ 1267] } hitcount: 26674 |
| 1286 | { id: sys_ioctl [ 16], common_pid: compiz [ 2994] } hitcount: 73443 |
| 1287 | |
| 1288 | Totals: |
| 1289 | Hits: 101162 |
| 1290 | Entries: 103 |
| 1291 | Dropped: 0 |
| 1292 | |
| 1293 | The above output shows that 'compiz' and 'Xorg' are far and away |
| 1294 | the heaviest ioctl callers (which might lead to questions about |
| 1295 | whether they really need to be making all those calls and to |
| 1296 | possible avenues for further investigation.) |
| 1297 | |
| 1298 | The compound key examples used a key and a sum value (hitcount) to |
| 1299 | sort the output, but we can just as easily use two keys instead. |
| 1300 | Here's an example where we use a compound key composed of the the |
| 1301 | common_pid and size event fields. Sorting with pid as the primary |
| 1302 | key and 'size' as the secondary key allows us to display an |
| 1303 | ordered summary of the recvfrom sizes, with counts, received by |
| 1304 | each process: |
| 1305 | |
| 1306 | # echo 'hist:key=common_pid.execname,size:val=hitcount:sort=common_pid,size' > \ |
| 1307 | /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/trigger |
| 1308 | |
| 1309 | # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/hist |
| 1310 | # trigger info: hist:keys=common_pid.execname,size:vals=hitcount:sort=common_pid.execname,size:size=2048 [active] |
| 1311 | |
| 1312 | { common_pid: smbd [ 784], size: 4 } hitcount: 1 |
| 1313 | { common_pid: dnsmasq [ 1412], size: 4096 } hitcount: 672 |
| 1314 | { common_pid: postgres [ 1796], size: 1000 } hitcount: 6 |
| 1315 | { common_pid: postgres [ 1867], size: 1000 } hitcount: 10 |
| 1316 | { common_pid: bamfdaemon [ 2787], size: 28 } hitcount: 2 |
| 1317 | { common_pid: bamfdaemon [ 2787], size: 14360 } hitcount: 1 |
| 1318 | { common_pid: compiz [ 2994], size: 8 } hitcount: 1 |
| 1319 | { common_pid: compiz [ 2994], size: 20 } hitcount: 11 |
| 1320 | { common_pid: gnome-terminal [ 3199], size: 4 } hitcount: 2 |
| 1321 | { common_pid: firefox [ 8817], size: 4 } hitcount: 1 |
| 1322 | { common_pid: firefox [ 8817], size: 8 } hitcount: 5 |
| 1323 | { common_pid: firefox [ 8817], size: 588 } hitcount: 2 |
| 1324 | { common_pid: firefox [ 8817], size: 628 } hitcount: 1 |
| 1325 | { common_pid: firefox [ 8817], size: 6944 } hitcount: 1 |
| 1326 | { common_pid: firefox [ 8817], size: 408880 } hitcount: 2 |
| 1327 | { common_pid: firefox [ 8822], size: 8 } hitcount: 2 |
| 1328 | { common_pid: firefox [ 8822], size: 160 } hitcount: 2 |
| 1329 | { common_pid: firefox [ 8822], size: 320 } hitcount: 2 |
| 1330 | { common_pid: firefox [ 8822], size: 352 } hitcount: 1 |
| 1331 | . |
| 1332 | . |
| 1333 | . |
| 1334 | { common_pid: pool [ 8923], size: 1960 } hitcount: 10 |
| 1335 | { common_pid: pool [ 8923], size: 2048 } hitcount: 10 |
| 1336 | { common_pid: pool [ 8924], size: 1960 } hitcount: 10 |
| 1337 | { common_pid: pool [ 8924], size: 2048 } hitcount: 10 |
| 1338 | { common_pid: pool [ 8928], size: 1964 } hitcount: 4 |
| 1339 | { common_pid: pool [ 8928], size: 1965 } hitcount: 2 |
| 1340 | { common_pid: pool [ 8928], size: 2048 } hitcount: 6 |
| 1341 | { common_pid: pool [ 8929], size: 1982 } hitcount: 1 |
| 1342 | { common_pid: pool [ 8929], size: 2048 } hitcount: 1 |
| 1343 | |
| 1344 | Totals: |
| 1345 | Hits: 2016 |
| 1346 | Entries: 224 |
| 1347 | Dropped: 0 |
| 1348 | |
| 1349 | The above example also illustrates the fact that although a compound |
| 1350 | key is treated as a single entity for hashing purposes, the sub-keys |
| 1351 | it's composed of can be accessed independently. |
| 1352 | |
| 1353 | The next example uses a string field as the hash key and |
| 1354 | demonstrates how you can manually pause and continue a hist trigger. |
| 1355 | In this example, we'll aggregate fork counts and don't expect a |
| 1356 | large number of entries in the hash table, so we'll drop it to a |
| 1357 | much smaller number, say 256: |
| 1358 | |
| 1359 | # echo 'hist:key=child_comm:val=hitcount:size=256' > \ |
| 1360 | /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger |
| 1361 | |
| 1362 | # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist |
| 1363 | # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active] |
| 1364 | |
| 1365 | { child_comm: dconf worker } hitcount: 1 |
| 1366 | { child_comm: ibus-daemon } hitcount: 1 |
| 1367 | { child_comm: whoopsie } hitcount: 1 |
| 1368 | { child_comm: smbd } hitcount: 1 |
| 1369 | { child_comm: gdbus } hitcount: 1 |
| 1370 | { child_comm: kthreadd } hitcount: 1 |
| 1371 | { child_comm: dconf worker } hitcount: 1 |
| 1372 | { child_comm: evolution-alarm } hitcount: 2 |
| 1373 | { child_comm: Socket Thread } hitcount: 2 |
| 1374 | { child_comm: postgres } hitcount: 2 |
| 1375 | { child_comm: bash } hitcount: 3 |
| 1376 | { child_comm: compiz } hitcount: 3 |
| 1377 | { child_comm: evolution-sourc } hitcount: 4 |
| 1378 | { child_comm: dhclient } hitcount: 4 |
| 1379 | { child_comm: pool } hitcount: 5 |
| 1380 | { child_comm: nm-dispatcher.a } hitcount: 8 |
| 1381 | { child_comm: firefox } hitcount: 8 |
| 1382 | { child_comm: dbus-daemon } hitcount: 8 |
| 1383 | { child_comm: glib-pacrunner } hitcount: 10 |
| 1384 | { child_comm: evolution } hitcount: 23 |
| 1385 | |
| 1386 | Totals: |
| 1387 | Hits: 89 |
| 1388 | Entries: 20 |
| 1389 | Dropped: 0 |
| 1390 | |
| 1391 | If we want to pause the hist trigger, we can simply append :pause to |
| 1392 | the command that started the trigger. Notice that the trigger info |
| 1393 | displays as [paused]: |
| 1394 | |
| 1395 | # echo 'hist:key=child_comm:val=hitcount:size=256:pause' >> \ |
| 1396 | /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger |
| 1397 | |
| 1398 | # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist |
| 1399 | # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [paused] |
| 1400 | |
| 1401 | { child_comm: dconf worker } hitcount: 1 |
| 1402 | { child_comm: kthreadd } hitcount: 1 |
| 1403 | { child_comm: dconf worker } hitcount: 1 |
| 1404 | { child_comm: gdbus } hitcount: 1 |
| 1405 | { child_comm: ibus-daemon } hitcount: 1 |
| 1406 | { child_comm: Socket Thread } hitcount: 2 |
| 1407 | { child_comm: evolution-alarm } hitcount: 2 |
| 1408 | { child_comm: smbd } hitcount: 2 |
| 1409 | { child_comm: bash } hitcount: 3 |
| 1410 | { child_comm: whoopsie } hitcount: 3 |
| 1411 | { child_comm: compiz } hitcount: 3 |
| 1412 | { child_comm: evolution-sourc } hitcount: 4 |
| 1413 | { child_comm: pool } hitcount: 5 |
| 1414 | { child_comm: postgres } hitcount: 6 |
| 1415 | { child_comm: firefox } hitcount: 8 |
| 1416 | { child_comm: dhclient } hitcount: 10 |
| 1417 | { child_comm: emacs } hitcount: 12 |
| 1418 | { child_comm: dbus-daemon } hitcount: 20 |
| 1419 | { child_comm: nm-dispatcher.a } hitcount: 20 |
| 1420 | { child_comm: evolution } hitcount: 35 |
| 1421 | { child_comm: glib-pacrunner } hitcount: 59 |
| 1422 | |
| 1423 | Totals: |
| 1424 | Hits: 199 |
| 1425 | Entries: 21 |
| 1426 | Dropped: 0 |
| 1427 | |
| 1428 | To manually continue having the trigger aggregate events, append |
| 1429 | :cont instead. Notice that the trigger info displays as [active] |
| 1430 | again, and the data has changed: |
| 1431 | |
| 1432 | # echo 'hist:key=child_comm:val=hitcount:size=256:cont' >> \ |
| 1433 | /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger |
| 1434 | |
| 1435 | # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist |
| 1436 | # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active] |
| 1437 | |
| 1438 | { child_comm: dconf worker } hitcount: 1 |
| 1439 | { child_comm: dconf worker } hitcount: 1 |
| 1440 | { child_comm: kthreadd } hitcount: 1 |
| 1441 | { child_comm: gdbus } hitcount: 1 |
| 1442 | { child_comm: ibus-daemon } hitcount: 1 |
| 1443 | { child_comm: Socket Thread } hitcount: 2 |
| 1444 | { child_comm: evolution-alarm } hitcount: 2 |
| 1445 | { child_comm: smbd } hitcount: 2 |
| 1446 | { child_comm: whoopsie } hitcount: 3 |
| 1447 | { child_comm: compiz } hitcount: 3 |
| 1448 | { child_comm: evolution-sourc } hitcount: 4 |
| 1449 | { child_comm: bash } hitcount: 5 |
| 1450 | { child_comm: pool } hitcount: 5 |
| 1451 | { child_comm: postgres } hitcount: 6 |
| 1452 | { child_comm: firefox } hitcount: 8 |
| 1453 | { child_comm: dhclient } hitcount: 11 |
| 1454 | { child_comm: emacs } hitcount: 12 |
| 1455 | { child_comm: dbus-daemon } hitcount: 22 |
| 1456 | { child_comm: nm-dispatcher.a } hitcount: 22 |
| 1457 | { child_comm: evolution } hitcount: 35 |
| 1458 | { child_comm: glib-pacrunner } hitcount: 59 |
| 1459 | |
| 1460 | Totals: |
| 1461 | Hits: 206 |
| 1462 | Entries: 21 |
| 1463 | Dropped: 0 |
| 1464 | |
| 1465 | The previous example showed how to start and stop a hist trigger by |
| 1466 | appending 'pause' and 'continue' to the hist trigger command. A |
| 1467 | hist trigger can also be started in a paused state by initially |
| 1468 | starting the trigger with ':pause' appended. This allows you to |
| 1469 | start the trigger only when you're ready to start collecting data |
| 1470 | and not before. For example, you could start the trigger in a |
| 1471 | paused state, then unpause it and do something you want to measure, |
| 1472 | then pause the trigger again when done. |
| 1473 | |
| 1474 | Of course, doing this manually can be difficult and error-prone, but |
| 1475 | it is possible to automatically start and stop a hist trigger based |
| 1476 | on some condition, via the enable_hist and disable_hist triggers. |
| 1477 | |
| 1478 | For example, suppose we wanted to take a look at the relative |
| 1479 | weights in terms of skb length for each callpath that leads to a |
| 1480 | netif_receieve_skb event when downloading a decent-sized file using |
| 1481 | wget. |
| 1482 | |
| 1483 | First we set up an initially paused stacktrace trigger on the |
| 1484 | netif_receive_skb event: |
| 1485 | |
| 1486 | # echo 'hist:key=stacktrace:vals=len:pause' > \ |
| 1487 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger |
| 1488 | |
| 1489 | Next, we set up an 'enable_hist' trigger on the sched_process_exec |
| 1490 | event, with an 'if filename==/usr/bin/wget' filter. The effect of |
| 1491 | this new trigger is that it will 'unpause' the hist trigger we just |
| 1492 | set up on netif_receive_skb if and only if it sees a |
| 1493 | sched_process_exec event with a filename of '/usr/bin/wget'. When |
| 1494 | that happens, all netif_receive_skb events are aggregated into a |
| 1495 | hash table keyed on stacktrace: |
| 1496 | |
| 1497 | # echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \ |
| 1498 | /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger |
| 1499 | |
| 1500 | The aggregation continues until the netif_receive_skb is paused |
| 1501 | again, which is what the following disable_hist event does by |
| 1502 | creating a similar setup on the sched_process_exit event, using the |
| 1503 | filter 'comm==wget': |
| 1504 | |
| 1505 | # echo 'disable_hist:net:netif_receive_skb if comm==wget' > \ |
| 1506 | /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger |
| 1507 | |
| 1508 | Whenever a process exits and the comm field of the disable_hist |
| 1509 | trigger filter matches 'comm==wget', the netif_receive_skb hist |
| 1510 | trigger is disabled. |
| 1511 | |
| 1512 | The overall effect is that netif_receive_skb events are aggregated |
| 1513 | into the hash table for only the duration of the wget. Executing a |
| 1514 | wget command and then listing the 'hist' file will display the |
| 1515 | output generated by the wget command: |
| 1516 | |
| 1517 | $ wget https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.19.xz |
| 1518 | |
| 1519 | # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist |
| 1520 | # trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused] |
| 1521 | |
| 1522 | { stacktrace: |
| 1523 | __netif_receive_skb_core+0x46d/0x990 |
| 1524 | __netif_receive_skb+0x18/0x60 |
| 1525 | netif_receive_skb_internal+0x23/0x90 |
| 1526 | napi_gro_receive+0xc8/0x100 |
| 1527 | ieee80211_deliver_skb+0xd6/0x270 [mac80211] |
| 1528 | ieee80211_rx_handlers+0xccf/0x22f0 [mac80211] |
| 1529 | ieee80211_prepare_and_rx_handle+0x4e7/0xc40 [mac80211] |
| 1530 | ieee80211_rx+0x31d/0x900 [mac80211] |
| 1531 | iwlagn_rx_reply_rx+0x3db/0x6f0 [iwldvm] |
| 1532 | iwl_rx_dispatch+0x8e/0xf0 [iwldvm] |
| 1533 | iwl_pcie_irq_handler+0xe3c/0x12f0 [iwlwifi] |
| 1534 | irq_thread_fn+0x20/0x50 |
| 1535 | irq_thread+0x11f/0x150 |
| 1536 | kthread+0xd2/0xf0 |
| 1537 | ret_from_fork+0x42/0x70 |
| 1538 | } hitcount: 85 len: 28884 |
| 1539 | { stacktrace: |
| 1540 | __netif_receive_skb_core+0x46d/0x990 |
| 1541 | __netif_receive_skb+0x18/0x60 |
| 1542 | netif_receive_skb_internal+0x23/0x90 |
| 1543 | napi_gro_complete+0xa4/0xe0 |
| 1544 | dev_gro_receive+0x23a/0x360 |
| 1545 | napi_gro_receive+0x30/0x100 |
| 1546 | ieee80211_deliver_skb+0xd6/0x270 [mac80211] |
| 1547 | ieee80211_rx_handlers+0xccf/0x22f0 [mac80211] |
| 1548 | ieee80211_prepare_and_rx_handle+0x4e7/0xc40 [mac80211] |
| 1549 | ieee80211_rx+0x31d/0x900 [mac80211] |
| 1550 | iwlagn_rx_reply_rx+0x3db/0x6f0 [iwldvm] |
| 1551 | iwl_rx_dispatch+0x8e/0xf0 [iwldvm] |
| 1552 | iwl_pcie_irq_handler+0xe3c/0x12f0 [iwlwifi] |
| 1553 | irq_thread_fn+0x20/0x50 |
| 1554 | irq_thread+0x11f/0x150 |
| 1555 | kthread+0xd2/0xf0 |
| 1556 | } hitcount: 98 len: 664329 |
| 1557 | { stacktrace: |
| 1558 | __netif_receive_skb_core+0x46d/0x990 |
| 1559 | __netif_receive_skb+0x18/0x60 |
| 1560 | process_backlog+0xa8/0x150 |
| 1561 | net_rx_action+0x15d/0x340 |
| 1562 | __do_softirq+0x114/0x2c0 |
| 1563 | do_softirq_own_stack+0x1c/0x30 |
| 1564 | do_softirq+0x65/0x70 |
| 1565 | __local_bh_enable_ip+0xb5/0xc0 |
| 1566 | ip_finish_output+0x1f4/0x840 |
| 1567 | ip_output+0x6b/0xc0 |
| 1568 | ip_local_out_sk+0x31/0x40 |
| 1569 | ip_send_skb+0x1a/0x50 |
| 1570 | udp_send_skb+0x173/0x2a0 |
| 1571 | udp_sendmsg+0x2bf/0x9f0 |
| 1572 | inet_sendmsg+0x64/0xa0 |
| 1573 | sock_sendmsg+0x3d/0x50 |
| 1574 | } hitcount: 115 len: 13030 |
| 1575 | { stacktrace: |
| 1576 | __netif_receive_skb_core+0x46d/0x990 |
| 1577 | __netif_receive_skb+0x18/0x60 |
| 1578 | netif_receive_skb_internal+0x23/0x90 |
| 1579 | napi_gro_complete+0xa4/0xe0 |
| 1580 | napi_gro_flush+0x6d/0x90 |
| 1581 | iwl_pcie_irq_handler+0x92a/0x12f0 [iwlwifi] |
| 1582 | irq_thread_fn+0x20/0x50 |
| 1583 | irq_thread+0x11f/0x150 |
| 1584 | kthread+0xd2/0xf0 |
| 1585 | ret_from_fork+0x42/0x70 |
| 1586 | } hitcount: 934 len: 5512212 |
| 1587 | |
| 1588 | Totals: |
| 1589 | Hits: 1232 |
| 1590 | Entries: 4 |
| 1591 | Dropped: 0 |
| 1592 | |
| 1593 | The above shows all the netif_receive_skb callpaths and their total |
| 1594 | lengths for the duration of the wget command. |
| 1595 | |
| 1596 | The 'clear' hist trigger param can be used to clear the hash table. |
| 1597 | Suppose we wanted to try another run of the previous example but |
| 1598 | this time also wanted to see the complete list of events that went |
| 1599 | into the histogram. In order to avoid having to set everything up |
| 1600 | again, we can just clear the histogram first: |
| 1601 | |
| 1602 | # echo 'hist:key=stacktrace:vals=len:clear' >> \ |
| 1603 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger |
| 1604 | |
| 1605 | Just to verify that it is in fact cleared, here's what we now see in |
| 1606 | the hist file: |
| 1607 | |
| 1608 | # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist |
| 1609 | # trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused] |
| 1610 | |
| 1611 | Totals: |
| 1612 | Hits: 0 |
| 1613 | Entries: 0 |
| 1614 | Dropped: 0 |
| 1615 | |
| 1616 | Since we want to see the detailed list of every netif_receive_skb |
| 1617 | event occurring during the new run, which are in fact the same |
| 1618 | events being aggregated into the hash table, we add some additional |
| 1619 | 'enable_event' events to the triggering sched_process_exec and |
| 1620 | sched_process_exit events as such: |
| 1621 | |
| 1622 | # echo 'enable_event:net:netif_receive_skb if filename==/usr/bin/wget' > \ |
| 1623 | /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger |
| 1624 | |
| 1625 | # echo 'disable_event:net:netif_receive_skb if comm==wget' > \ |
| 1626 | /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger |
| 1627 | |
| 1628 | If you read the trigger files for the sched_process_exec and |
| 1629 | sched_process_exit triggers, you should see two triggers for each: |
| 1630 | one enabling/disabling the hist aggregation and the other |
| 1631 | enabling/disabling the logging of events: |
| 1632 | |
| 1633 | # cat /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger |
| 1634 | enable_event:net:netif_receive_skb:unlimited if filename==/usr/bin/wget |
| 1635 | enable_hist:net:netif_receive_skb:unlimited if filename==/usr/bin/wget |
| 1636 | |
| 1637 | # cat /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger |
| 1638 | enable_event:net:netif_receive_skb:unlimited if comm==wget |
| 1639 | disable_hist:net:netif_receive_skb:unlimited if comm==wget |
| 1640 | |
| 1641 | In other words, whenever either of the sched_process_exec or |
| 1642 | sched_process_exit events is hit and matches 'wget', it enables or |
| 1643 | disables both the histogram and the event log, and what you end up |
| 1644 | with is a hash table and set of events just covering the specified |
| 1645 | duration. Run the wget command again: |
| 1646 | |
| 1647 | $ wget https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.19.xz |
| 1648 | |
| 1649 | Displaying the 'hist' file should show something similar to what you |
| 1650 | saw in the last run, but this time you should also see the |
| 1651 | individual events in the trace file: |
| 1652 | |
| 1653 | # cat /sys/kernel/debug/tracing/trace |
| 1654 | |
| 1655 | # tracer: nop |
| 1656 | # |
| 1657 | # entries-in-buffer/entries-written: 183/1426 #P:4 |
| 1658 | # |
| 1659 | # _-----=> irqs-off |
| 1660 | # / _----=> need-resched |
| 1661 | # | / _---=> hardirq/softirq |
| 1662 | # || / _--=> preempt-depth |
| 1663 | # ||| / delay |
| 1664 | # TASK-PID CPU# |||| TIMESTAMP FUNCTION |
| 1665 | # | | | |||| | | |
| 1666 | wget-15108 [000] ..s1 31769.606929: netif_receive_skb: dev=lo skbaddr=ffff88009c353100 len=60 |
| 1667 | wget-15108 [000] ..s1 31769.606999: netif_receive_skb: dev=lo skbaddr=ffff88009c353200 len=60 |
| 1668 | dnsmasq-1382 [000] ..s1 31769.677652: netif_receive_skb: dev=lo skbaddr=ffff88009c352b00 len=130 |
| 1669 | dnsmasq-1382 [000] ..s1 31769.685917: netif_receive_skb: dev=lo skbaddr=ffff88009c352200 len=138 |
| 1670 | ##### CPU 2 buffer started #### |
| 1671 | irq/29-iwlwifi-559 [002] ..s. 31772.031529: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433d00 len=2948 |
| 1672 | irq/29-iwlwifi-559 [002] ..s. 31772.031572: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d432200 len=1500 |
| 1673 | irq/29-iwlwifi-559 [002] ..s. 31772.032196: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433100 len=2948 |
| 1674 | irq/29-iwlwifi-559 [002] ..s. 31772.032761: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433000 len=2948 |
| 1675 | irq/29-iwlwifi-559 [002] ..s. 31772.033220: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d432e00 len=1500 |
| 1676 | . |
| 1677 | . |
| 1678 | . |
Tom Zanussi | 52a7f16 | 2016-03-03 12:54:57 -0600 | [diff] [blame] | 1679 | |
| 1680 | The following example demonstrates how multiple hist triggers can be |
| 1681 | attached to a given event. This capability can be useful for |
| 1682 | creating a set of different summaries derived from the same set of |
| 1683 | events, or for comparing the effects of different filters, among |
| 1684 | other things. |
| 1685 | |
| 1686 | # echo 'hist:keys=skbaddr.hex:vals=len if len < 0' >> \ |
| 1687 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger |
| 1688 | # echo 'hist:keys=skbaddr.hex:vals=len if len > 4096' >> \ |
| 1689 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger |
| 1690 | # echo 'hist:keys=skbaddr.hex:vals=len if len == 256' >> \ |
| 1691 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger |
| 1692 | # echo 'hist:keys=skbaddr.hex:vals=len' >> \ |
| 1693 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger |
| 1694 | # echo 'hist:keys=len:vals=common_preempt_count' >> \ |
| 1695 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger |
| 1696 | |
| 1697 | The above set of commands create four triggers differing only in |
| 1698 | their filters, along with a completely different though fairly |
| 1699 | nonsensical trigger. Note that in order to append multiple hist |
| 1700 | triggers to the same file, you should use the '>>' operator to |
| 1701 | append them ('>' will also add the new hist trigger, but will remove |
| 1702 | any existing hist triggers beforehand). |
| 1703 | |
| 1704 | Displaying the contents of the 'hist' file for the event shows the |
| 1705 | contents of all five histograms: |
| 1706 | |
| 1707 | # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist |
| 1708 | |
| 1709 | # event histogram |
| 1710 | # |
| 1711 | # trigger info: hist:keys=len:vals=hitcount,common_preempt_count:sort=hitcount:size=2048 [active] |
| 1712 | # |
| 1713 | |
| 1714 | { len: 176 } hitcount: 1 common_preempt_count: 0 |
| 1715 | { len: 223 } hitcount: 1 common_preempt_count: 0 |
| 1716 | { len: 4854 } hitcount: 1 common_preempt_count: 0 |
| 1717 | { len: 395 } hitcount: 1 common_preempt_count: 0 |
| 1718 | { len: 177 } hitcount: 1 common_preempt_count: 0 |
| 1719 | { len: 446 } hitcount: 1 common_preempt_count: 0 |
| 1720 | { len: 1601 } hitcount: 1 common_preempt_count: 0 |
| 1721 | . |
| 1722 | . |
| 1723 | . |
| 1724 | { len: 1280 } hitcount: 66 common_preempt_count: 0 |
| 1725 | { len: 116 } hitcount: 81 common_preempt_count: 40 |
| 1726 | { len: 708 } hitcount: 112 common_preempt_count: 0 |
| 1727 | { len: 46 } hitcount: 221 common_preempt_count: 0 |
| 1728 | { len: 1264 } hitcount: 458 common_preempt_count: 0 |
| 1729 | |
| 1730 | Totals: |
| 1731 | Hits: 1428 |
| 1732 | Entries: 147 |
| 1733 | Dropped: 0 |
| 1734 | |
| 1735 | |
| 1736 | # event histogram |
| 1737 | # |
| 1738 | # trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 [active] |
| 1739 | # |
| 1740 | |
| 1741 | { skbaddr: ffff8800baee5e00 } hitcount: 1 len: 130 |
| 1742 | { skbaddr: ffff88005f3d5600 } hitcount: 1 len: 1280 |
| 1743 | { skbaddr: ffff88005f3d4900 } hitcount: 1 len: 1280 |
| 1744 | { skbaddr: ffff88009fed6300 } hitcount: 1 len: 115 |
| 1745 | { skbaddr: ffff88009fe0ad00 } hitcount: 1 len: 115 |
| 1746 | { skbaddr: ffff88008cdb1900 } hitcount: 1 len: 46 |
| 1747 | { skbaddr: ffff880064b5ef00 } hitcount: 1 len: 118 |
| 1748 | { skbaddr: ffff880044e3c700 } hitcount: 1 len: 60 |
| 1749 | { skbaddr: ffff880100065900 } hitcount: 1 len: 46 |
| 1750 | { skbaddr: ffff8800d46bd500 } hitcount: 1 len: 116 |
| 1751 | { skbaddr: ffff88005f3d5f00 } hitcount: 1 len: 1280 |
| 1752 | { skbaddr: ffff880100064700 } hitcount: 1 len: 365 |
| 1753 | { skbaddr: ffff8800badb6f00 } hitcount: 1 len: 60 |
| 1754 | . |
| 1755 | . |
| 1756 | . |
| 1757 | { skbaddr: ffff88009fe0be00 } hitcount: 27 len: 24677 |
| 1758 | { skbaddr: ffff88009fe0a400 } hitcount: 27 len: 23052 |
| 1759 | { skbaddr: ffff88009fe0b700 } hitcount: 31 len: 25589 |
| 1760 | { skbaddr: ffff88009fe0b600 } hitcount: 32 len: 27326 |
| 1761 | { skbaddr: ffff88006a462800 } hitcount: 68 len: 71678 |
| 1762 | { skbaddr: ffff88006a463700 } hitcount: 70 len: 72678 |
| 1763 | { skbaddr: ffff88006a462b00 } hitcount: 71 len: 77589 |
| 1764 | { skbaddr: ffff88006a463600 } hitcount: 73 len: 71307 |
| 1765 | { skbaddr: ffff88006a462200 } hitcount: 81 len: 81032 |
| 1766 | |
| 1767 | Totals: |
| 1768 | Hits: 1451 |
| 1769 | Entries: 318 |
| 1770 | Dropped: 0 |
| 1771 | |
| 1772 | |
| 1773 | # event histogram |
| 1774 | # |
| 1775 | # trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 if len == 256 [active] |
| 1776 | # |
| 1777 | |
| 1778 | |
| 1779 | Totals: |
| 1780 | Hits: 0 |
| 1781 | Entries: 0 |
| 1782 | Dropped: 0 |
| 1783 | |
| 1784 | |
| 1785 | # event histogram |
| 1786 | # |
| 1787 | # trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 if len > 4096 [active] |
| 1788 | # |
| 1789 | |
| 1790 | { skbaddr: ffff88009fd2c300 } hitcount: 1 len: 7212 |
| 1791 | { skbaddr: ffff8800d2bcce00 } hitcount: 1 len: 7212 |
| 1792 | { skbaddr: ffff8800d2bcd700 } hitcount: 1 len: 7212 |
| 1793 | { skbaddr: ffff8800d2bcda00 } hitcount: 1 len: 21492 |
| 1794 | { skbaddr: ffff8800ae2e2d00 } hitcount: 1 len: 7212 |
| 1795 | { skbaddr: ffff8800d2bcdb00 } hitcount: 1 len: 7212 |
| 1796 | { skbaddr: ffff88006a4df500 } hitcount: 1 len: 4854 |
| 1797 | { skbaddr: ffff88008ce47b00 } hitcount: 1 len: 18636 |
| 1798 | { skbaddr: ffff8800ae2e2200 } hitcount: 1 len: 12924 |
| 1799 | { skbaddr: ffff88005f3e1000 } hitcount: 1 len: 4356 |
| 1800 | { skbaddr: ffff8800d2bcdc00 } hitcount: 2 len: 24420 |
| 1801 | { skbaddr: ffff8800d2bcc200 } hitcount: 2 len: 12996 |
| 1802 | |
| 1803 | Totals: |
| 1804 | Hits: 14 |
| 1805 | Entries: 12 |
| 1806 | Dropped: 0 |
| 1807 | |
| 1808 | |
| 1809 | # event histogram |
| 1810 | # |
| 1811 | # trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 if len < 0 [active] |
| 1812 | # |
| 1813 | |
| 1814 | |
| 1815 | Totals: |
| 1816 | Hits: 0 |
| 1817 | Entries: 0 |
| 1818 | Dropped: 0 |
Tom Zanussi | 5463bfd | 2016-03-03 12:54:59 -0600 | [diff] [blame] | 1819 | |
| 1820 | Named triggers can be used to have triggers share a common set of |
| 1821 | histogram data. This capability is mostly useful for combining the |
| 1822 | output of events generated by tracepoints contained inside inline |
| 1823 | functions, but names can be used in a hist trigger on any event. |
| 1824 | For example, these two triggers when hit will update the same 'len' |
| 1825 | field in the shared 'foo' histogram data: |
| 1826 | |
| 1827 | # echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \ |
| 1828 | /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger |
| 1829 | # echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \ |
| 1830 | /sys/kernel/debug/tracing/events/net/netif_rx/trigger |
| 1831 | |
| 1832 | You can see that they're updating common histogram data by reading |
| 1833 | each event's hist files at the same time: |
| 1834 | |
| 1835 | # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist; |
| 1836 | cat /sys/kernel/debug/tracing/events/net/netif_rx/hist |
| 1837 | |
| 1838 | # event histogram |
| 1839 | # |
| 1840 | # trigger info: hist:name=foo:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 [active] |
| 1841 | # |
| 1842 | |
| 1843 | { skbaddr: ffff88000ad53500 } hitcount: 1 len: 46 |
| 1844 | { skbaddr: ffff8800af5a1500 } hitcount: 1 len: 76 |
| 1845 | { skbaddr: ffff8800d62a1900 } hitcount: 1 len: 46 |
| 1846 | { skbaddr: ffff8800d2bccb00 } hitcount: 1 len: 468 |
| 1847 | { skbaddr: ffff8800d3c69900 } hitcount: 1 len: 46 |
| 1848 | { skbaddr: ffff88009ff09100 } hitcount: 1 len: 52 |
| 1849 | { skbaddr: ffff88010f13ab00 } hitcount: 1 len: 168 |
| 1850 | { skbaddr: ffff88006a54f400 } hitcount: 1 len: 46 |
| 1851 | { skbaddr: ffff8800d2bcc500 } hitcount: 1 len: 260 |
| 1852 | { skbaddr: ffff880064505000 } hitcount: 1 len: 46 |
| 1853 | { skbaddr: ffff8800baf24e00 } hitcount: 1 len: 32 |
| 1854 | { skbaddr: ffff88009fe0ad00 } hitcount: 1 len: 46 |
| 1855 | { skbaddr: ffff8800d3edff00 } hitcount: 1 len: 44 |
| 1856 | { skbaddr: ffff88009fe0b400 } hitcount: 1 len: 168 |
| 1857 | { skbaddr: ffff8800a1c55a00 } hitcount: 1 len: 40 |
| 1858 | { skbaddr: ffff8800d2bcd100 } hitcount: 1 len: 40 |
| 1859 | { skbaddr: ffff880064505f00 } hitcount: 1 len: 174 |
| 1860 | { skbaddr: ffff8800a8bff200 } hitcount: 1 len: 160 |
| 1861 | { skbaddr: ffff880044e3cc00 } hitcount: 1 len: 76 |
| 1862 | { skbaddr: ffff8800a8bfe700 } hitcount: 1 len: 46 |
| 1863 | { skbaddr: ffff8800d2bcdc00 } hitcount: 1 len: 32 |
| 1864 | { skbaddr: ffff8800a1f64800 } hitcount: 1 len: 46 |
| 1865 | { skbaddr: ffff8800d2bcde00 } hitcount: 1 len: 988 |
| 1866 | { skbaddr: ffff88006a5dea00 } hitcount: 1 len: 46 |
| 1867 | { skbaddr: ffff88002e37a200 } hitcount: 1 len: 44 |
| 1868 | { skbaddr: ffff8800a1f32c00 } hitcount: 2 len: 676 |
| 1869 | { skbaddr: ffff88000ad52600 } hitcount: 2 len: 107 |
| 1870 | { skbaddr: ffff8800a1f91e00 } hitcount: 2 len: 92 |
| 1871 | { skbaddr: ffff8800af5a0200 } hitcount: 2 len: 142 |
| 1872 | { skbaddr: ffff8800d2bcc600 } hitcount: 2 len: 220 |
| 1873 | { skbaddr: ffff8800ba36f500 } hitcount: 2 len: 92 |
| 1874 | { skbaddr: ffff8800d021f800 } hitcount: 2 len: 92 |
| 1875 | { skbaddr: ffff8800a1f33600 } hitcount: 2 len: 675 |
| 1876 | { skbaddr: ffff8800a8bfff00 } hitcount: 3 len: 138 |
| 1877 | { skbaddr: ffff8800d62a1300 } hitcount: 3 len: 138 |
| 1878 | { skbaddr: ffff88002e37a100 } hitcount: 4 len: 184 |
| 1879 | { skbaddr: ffff880064504400 } hitcount: 4 len: 184 |
| 1880 | { skbaddr: ffff8800a8bfec00 } hitcount: 4 len: 184 |
| 1881 | { skbaddr: ffff88000ad53700 } hitcount: 5 len: 230 |
| 1882 | { skbaddr: ffff8800d2bcdb00 } hitcount: 5 len: 196 |
| 1883 | { skbaddr: ffff8800a1f90000 } hitcount: 6 len: 276 |
| 1884 | { skbaddr: ffff88006a54f900 } hitcount: 6 len: 276 |
| 1885 | |
| 1886 | Totals: |
| 1887 | Hits: 81 |
| 1888 | Entries: 42 |
| 1889 | Dropped: 0 |
| 1890 | # event histogram |
| 1891 | # |
| 1892 | # trigger info: hist:name=foo:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 [active] |
| 1893 | # |
| 1894 | |
| 1895 | { skbaddr: ffff88000ad53500 } hitcount: 1 len: 46 |
| 1896 | { skbaddr: ffff8800af5a1500 } hitcount: 1 len: 76 |
| 1897 | { skbaddr: ffff8800d62a1900 } hitcount: 1 len: 46 |
| 1898 | { skbaddr: ffff8800d2bccb00 } hitcount: 1 len: 468 |
| 1899 | { skbaddr: ffff8800d3c69900 } hitcount: 1 len: 46 |
| 1900 | { skbaddr: ffff88009ff09100 } hitcount: 1 len: 52 |
| 1901 | { skbaddr: ffff88010f13ab00 } hitcount: 1 len: 168 |
| 1902 | { skbaddr: ffff88006a54f400 } hitcount: 1 len: 46 |
| 1903 | { skbaddr: ffff8800d2bcc500 } hitcount: 1 len: 260 |
| 1904 | { skbaddr: ffff880064505000 } hitcount: 1 len: 46 |
| 1905 | { skbaddr: ffff8800baf24e00 } hitcount: 1 len: 32 |
| 1906 | { skbaddr: ffff88009fe0ad00 } hitcount: 1 len: 46 |
| 1907 | { skbaddr: ffff8800d3edff00 } hitcount: 1 len: 44 |
| 1908 | { skbaddr: ffff88009fe0b400 } hitcount: 1 len: 168 |
| 1909 | { skbaddr: ffff8800a1c55a00 } hitcount: 1 len: 40 |
| 1910 | { skbaddr: ffff8800d2bcd100 } hitcount: 1 len: 40 |
| 1911 | { skbaddr: ffff880064505f00 } hitcount: 1 len: 174 |
| 1912 | { skbaddr: ffff8800a8bff200 } hitcount: 1 len: 160 |
| 1913 | { skbaddr: ffff880044e3cc00 } hitcount: 1 len: 76 |
| 1914 | { skbaddr: ffff8800a8bfe700 } hitcount: 1 len: 46 |
| 1915 | { skbaddr: ffff8800d2bcdc00 } hitcount: 1 len: 32 |
| 1916 | { skbaddr: ffff8800a1f64800 } hitcount: 1 len: 46 |
| 1917 | { skbaddr: ffff8800d2bcde00 } hitcount: 1 len: 988 |
| 1918 | { skbaddr: ffff88006a5dea00 } hitcount: 1 len: 46 |
| 1919 | { skbaddr: ffff88002e37a200 } hitcount: 1 len: 44 |
| 1920 | { skbaddr: ffff8800a1f32c00 } hitcount: 2 len: 676 |
| 1921 | { skbaddr: ffff88000ad52600 } hitcount: 2 len: 107 |
| 1922 | { skbaddr: ffff8800a1f91e00 } hitcount: 2 len: 92 |
| 1923 | { skbaddr: ffff8800af5a0200 } hitcount: 2 len: 142 |
| 1924 | { skbaddr: ffff8800d2bcc600 } hitcount: 2 len: 220 |
| 1925 | { skbaddr: ffff8800ba36f500 } hitcount: 2 len: 92 |
| 1926 | { skbaddr: ffff8800d021f800 } hitcount: 2 len: 92 |
| 1927 | { skbaddr: ffff8800a1f33600 } hitcount: 2 len: 675 |
| 1928 | { skbaddr: ffff8800a8bfff00 } hitcount: 3 len: 138 |
| 1929 | { skbaddr: ffff8800d62a1300 } hitcount: 3 len: 138 |
| 1930 | { skbaddr: ffff88002e37a100 } hitcount: 4 len: 184 |
| 1931 | { skbaddr: ffff880064504400 } hitcount: 4 len: 184 |
| 1932 | { skbaddr: ffff8800a8bfec00 } hitcount: 4 len: 184 |
| 1933 | { skbaddr: ffff88000ad53700 } hitcount: 5 len: 230 |
| 1934 | { skbaddr: ffff8800d2bcdb00 } hitcount: 5 len: 196 |
| 1935 | { skbaddr: ffff8800a1f90000 } hitcount: 6 len: 276 |
| 1936 | { skbaddr: ffff88006a54f900 } hitcount: 6 len: 276 |
| 1937 | |
| 1938 | Totals: |
| 1939 | Hits: 81 |
| 1940 | Entries: 42 |
| 1941 | Dropped: 0 |
| 1942 | |
| 1943 | And here's an example that shows how to combine histogram data from |
| 1944 | any two events even if they don't share any 'compatible' fields |
| 1945 | other than 'hitcount' and 'stacktrace'. These commands create a |
| 1946 | couple of triggers named 'bar' using those fields: |
| 1947 | |
| 1948 | # echo 'hist:name=bar:key=stacktrace:val=hitcount' > \ |
| 1949 | /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger |
| 1950 | # echo 'hist:name=bar:key=stacktrace:val=hitcount' > \ |
| 1951 | /sys/kernel/debug/tracing/events/net/netif_rx/trigger |
| 1952 | |
| 1953 | And displaying the output of either shows some interesting if |
| 1954 | somewhat confusing output: |
| 1955 | |
| 1956 | # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist |
| 1957 | # cat /sys/kernel/debug/tracing/events/net/netif_rx/hist |
| 1958 | |
| 1959 | # event histogram |
| 1960 | # |
| 1961 | # trigger info: hist:name=bar:keys=stacktrace:vals=hitcount:sort=hitcount:size=2048 [active] |
| 1962 | # |
| 1963 | |
| 1964 | { stacktrace: |
| 1965 | _do_fork+0x18e/0x330 |
| 1966 | kernel_thread+0x29/0x30 |
| 1967 | kthreadd+0x154/0x1b0 |
| 1968 | ret_from_fork+0x3f/0x70 |
| 1969 | } hitcount: 1 |
| 1970 | { stacktrace: |
| 1971 | netif_rx_internal+0xb2/0xd0 |
| 1972 | netif_rx_ni+0x20/0x70 |
| 1973 | dev_loopback_xmit+0xaa/0xd0 |
| 1974 | ip_mc_output+0x126/0x240 |
| 1975 | ip_local_out_sk+0x31/0x40 |
| 1976 | igmp_send_report+0x1e9/0x230 |
| 1977 | igmp_timer_expire+0xe9/0x120 |
| 1978 | call_timer_fn+0x39/0xf0 |
| 1979 | run_timer_softirq+0x1e1/0x290 |
| 1980 | __do_softirq+0xfd/0x290 |
| 1981 | irq_exit+0x98/0xb0 |
| 1982 | smp_apic_timer_interrupt+0x4a/0x60 |
| 1983 | apic_timer_interrupt+0x6d/0x80 |
| 1984 | cpuidle_enter+0x17/0x20 |
| 1985 | call_cpuidle+0x3b/0x60 |
| 1986 | cpu_startup_entry+0x22d/0x310 |
| 1987 | } hitcount: 1 |
| 1988 | { stacktrace: |
| 1989 | netif_rx_internal+0xb2/0xd0 |
| 1990 | netif_rx_ni+0x20/0x70 |
| 1991 | dev_loopback_xmit+0xaa/0xd0 |
| 1992 | ip_mc_output+0x17f/0x240 |
| 1993 | ip_local_out_sk+0x31/0x40 |
| 1994 | ip_send_skb+0x1a/0x50 |
| 1995 | udp_send_skb+0x13e/0x270 |
| 1996 | udp_sendmsg+0x2bf/0x980 |
| 1997 | inet_sendmsg+0x67/0xa0 |
| 1998 | sock_sendmsg+0x38/0x50 |
| 1999 | SYSC_sendto+0xef/0x170 |
| 2000 | SyS_sendto+0xe/0x10 |
| 2001 | entry_SYSCALL_64_fastpath+0x12/0x6a |
| 2002 | } hitcount: 2 |
| 2003 | { stacktrace: |
| 2004 | netif_rx_internal+0xb2/0xd0 |
| 2005 | netif_rx+0x1c/0x60 |
| 2006 | loopback_xmit+0x6c/0xb0 |
| 2007 | dev_hard_start_xmit+0x219/0x3a0 |
| 2008 | __dev_queue_xmit+0x415/0x4f0 |
| 2009 | dev_queue_xmit_sk+0x13/0x20 |
| 2010 | ip_finish_output2+0x237/0x340 |
| 2011 | ip_finish_output+0x113/0x1d0 |
| 2012 | ip_output+0x66/0xc0 |
| 2013 | ip_local_out_sk+0x31/0x40 |
| 2014 | ip_send_skb+0x1a/0x50 |
| 2015 | udp_send_skb+0x16d/0x270 |
| 2016 | udp_sendmsg+0x2bf/0x980 |
| 2017 | inet_sendmsg+0x67/0xa0 |
| 2018 | sock_sendmsg+0x38/0x50 |
| 2019 | ___sys_sendmsg+0x14e/0x270 |
| 2020 | } hitcount: 76 |
| 2021 | { stacktrace: |
| 2022 | netif_rx_internal+0xb2/0xd0 |
| 2023 | netif_rx+0x1c/0x60 |
| 2024 | loopback_xmit+0x6c/0xb0 |
| 2025 | dev_hard_start_xmit+0x219/0x3a0 |
| 2026 | __dev_queue_xmit+0x415/0x4f0 |
| 2027 | dev_queue_xmit_sk+0x13/0x20 |
| 2028 | ip_finish_output2+0x237/0x340 |
| 2029 | ip_finish_output+0x113/0x1d0 |
| 2030 | ip_output+0x66/0xc0 |
| 2031 | ip_local_out_sk+0x31/0x40 |
| 2032 | ip_send_skb+0x1a/0x50 |
| 2033 | udp_send_skb+0x16d/0x270 |
| 2034 | udp_sendmsg+0x2bf/0x980 |
| 2035 | inet_sendmsg+0x67/0xa0 |
| 2036 | sock_sendmsg+0x38/0x50 |
| 2037 | ___sys_sendmsg+0x269/0x270 |
| 2038 | } hitcount: 77 |
| 2039 | { stacktrace: |
| 2040 | netif_rx_internal+0xb2/0xd0 |
| 2041 | netif_rx+0x1c/0x60 |
| 2042 | loopback_xmit+0x6c/0xb0 |
| 2043 | dev_hard_start_xmit+0x219/0x3a0 |
| 2044 | __dev_queue_xmit+0x415/0x4f0 |
| 2045 | dev_queue_xmit_sk+0x13/0x20 |
| 2046 | ip_finish_output2+0x237/0x340 |
| 2047 | ip_finish_output+0x113/0x1d0 |
| 2048 | ip_output+0x66/0xc0 |
| 2049 | ip_local_out_sk+0x31/0x40 |
| 2050 | ip_send_skb+0x1a/0x50 |
| 2051 | udp_send_skb+0x16d/0x270 |
| 2052 | udp_sendmsg+0x2bf/0x980 |
| 2053 | inet_sendmsg+0x67/0xa0 |
| 2054 | sock_sendmsg+0x38/0x50 |
| 2055 | SYSC_sendto+0xef/0x170 |
| 2056 | } hitcount: 88 |
| 2057 | { stacktrace: |
| 2058 | _do_fork+0x18e/0x330 |
| 2059 | SyS_clone+0x19/0x20 |
| 2060 | entry_SYSCALL_64_fastpath+0x12/0x6a |
| 2061 | } hitcount: 244 |
| 2062 | |
| 2063 | Totals: |
| 2064 | Hits: 489 |
| 2065 | Entries: 7 |
| 2066 | Dropped: 0 |