blob: 2cc08d4a326e120cab3c87b60f8cf67934604fd9 [file] [log] [blame]
Theodore Ts'oabd41442009-04-11 15:51:18 -04001 Event Tracing
2
3 Documentation written by Theodore Ts'o
Tom Zanussi95b69602009-09-10 23:13:51 -05004 Updated by Li Zefan and Tom Zanussi
Theodore Ts'oabd41442009-04-11 15:51:18 -04005
Li Zefan143c1452009-05-19 14:43:15 +080061. Introduction
7===============
Theodore Ts'oabd41442009-04-11 15:51:18 -04008
9Tracepoints (see Documentation/trace/tracepoints.txt) can be used
10without creating custom kernel modules to register probe functions
11using the event tracing infrastructure.
12
13Not all tracepoints can be traced using the event tracing system;
14the kernel developer must provide code snippets which define how the
15tracing information is saved into the tracing buffer, and how the
Li Zefan143c1452009-05-19 14:43:15 +080016tracing information should be printed.
Theodore Ts'oabd41442009-04-11 15:51:18 -040017
Li Zefan143c1452009-05-19 14:43:15 +0800182. Using Event Tracing
19======================
20
212.1 Via the 'set_event' interface
22---------------------------------
Theodore Ts'oabd41442009-04-11 15:51:18 -040023
24The events which are available for tracing can be found in the file
GeunSik Lim52ad51e2009-09-07 21:37:17 +090025/sys/kernel/debug/tracing/available_events.
Theodore Ts'oabd41442009-04-11 15:51:18 -040026
27To enable a particular event, such as 'sched_wakeup', simply echo it
GeunSik Lim52ad51e2009-09-07 21:37:17 +090028to /sys/kernel/debug/tracing/set_event. For example:
Theodore Ts'oabd41442009-04-11 15:51:18 -040029
GeunSik Lim52ad51e2009-09-07 21:37:17 +090030 # echo sched_wakeup >> /sys/kernel/debug/tracing/set_event
Theodore Ts'oabd41442009-04-11 15:51:18 -040031
Li Zefan143c1452009-05-19 14:43:15 +080032[ Note: '>>' is necessary, otherwise it will firstly disable
33 all the events. ]
Theodore Ts'oabd41442009-04-11 15:51:18 -040034
35To disable an event, echo the event name to the set_event file prefixed
36with an exclamation point:
37
GeunSik Lim52ad51e2009-09-07 21:37:17 +090038 # echo '!sched_wakeup' >> /sys/kernel/debug/tracing/set_event
Theodore Ts'oabd41442009-04-11 15:51:18 -040039
Li Zefan143c1452009-05-19 14:43:15 +080040To disable all events, echo an empty line to the set_event file:
Theodore Ts'oabd41442009-04-11 15:51:18 -040041
GeunSik Lim52ad51e2009-09-07 21:37:17 +090042 # echo > /sys/kernel/debug/tracing/set_event
Li Zefan143c1452009-05-19 14:43:15 +080043
44To enable all events, echo '*:*' or '*:' to the set_event file:
45
GeunSik Lim52ad51e2009-09-07 21:37:17 +090046 # echo *:* > /sys/kernel/debug/tracing/set_event
Theodore Ts'oabd41442009-04-11 15:51:18 -040047
48The events are organized into subsystems, such as ext4, irq, sched,
49etc., and a full event name looks like this: <subsystem>:<event>. The
50subsystem name is optional, but it is displayed in the available_events
51file. All of the events in a subsystem can be specified via the syntax
52"<subsystem>:*"; for example, to enable all irq events, you can use the
53command:
54
GeunSik Lim52ad51e2009-09-07 21:37:17 +090055 # echo 'irq:*' > /sys/kernel/debug/tracing/set_event
Theodore Ts'oabd41442009-04-11 15:51:18 -040056
Li Zefan143c1452009-05-19 14:43:15 +0800572.2 Via the 'enable' toggle
58---------------------------
Theodore Ts'oabd41442009-04-11 15:51:18 -040059
GeunSik Lim52ad51e2009-09-07 21:37:17 +090060The events available are also listed in /sys/kernel/debug/tracing/events/ hierarchy
Li Zefan143c1452009-05-19 14:43:15 +080061of directories.
Theodore Ts'oabd41442009-04-11 15:51:18 -040062
Li Zefan143c1452009-05-19 14:43:15 +080063To enable event 'sched_wakeup':
Theodore Ts'oabd41442009-04-11 15:51:18 -040064
GeunSik Lim52ad51e2009-09-07 21:37:17 +090065 # echo 1 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable
Theodore Ts'oabd41442009-04-11 15:51:18 -040066
Li Zefan143c1452009-05-19 14:43:15 +080067To disable it:
Theodore Ts'oabd41442009-04-11 15:51:18 -040068
GeunSik Lim52ad51e2009-09-07 21:37:17 +090069 # echo 0 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable
Theodore Ts'oabd41442009-04-11 15:51:18 -040070
Li Zefan143c1452009-05-19 14:43:15 +080071To enable all events in sched subsystem:
Theodore Ts'oabd41442009-04-11 15:51:18 -040072
GeunSik Lim52ad51e2009-09-07 21:37:17 +090073 # echo 1 > /sys/kernel/debug/tracing/events/sched/enable
Theodore Ts'oabd41442009-04-11 15:51:18 -040074
Michal Sojka6afb1c62009-09-10 08:02:21 +020075To enable all events:
Theodore Ts'oabd41442009-04-11 15:51:18 -040076
GeunSik Lim52ad51e2009-09-07 21:37:17 +090077 # echo 1 > /sys/kernel/debug/tracing/events/enable
Theodore Ts'oabd41442009-04-11 15:51:18 -040078
Li Zefan143c1452009-05-19 14:43:15 +080079When reading one of these enable files, there are four results:
Theodore Ts'oabd41442009-04-11 15:51:18 -040080
Li Zefan143c1452009-05-19 14:43:15 +080081 0 - all events this file affects are disabled
82 1 - all events this file affects are enabled
83 X - there is a mixture of events enabled and disabled
84 ? - this file does not affect any event
Theodore Ts'oabd41442009-04-11 15:51:18 -040085
Li Zefan020e5f82009-07-01 10:47:05 +0800862.3 Boot option
87---------------
88
89In order to facilitate early boot debugging, use boot option:
90
91 trace_event=[event-list]
92
Li Zefan03d646e2009-12-21 14:27:24 +080093event-list is a comma separated list of events. See section 2.1 for event
94format.
Li Zefan020e5f82009-07-01 10:47:05 +080095
Li Zefan143c1452009-05-19 14:43:15 +0800963. Defining an event-enabled tracepoint
97=======================================
Theodore Ts'oabd41442009-04-11 15:51:18 -040098
Li Zefan143c1452009-05-19 14:43:15 +080099See The example provided in samples/trace_events
Theodore Ts'oabd41442009-04-11 15:51:18 -0400100
Tom Zanussi95b69602009-09-10 23:13:51 -05001014. Event formats
102================
103
104Each trace event has a 'format' file associated with it that contains
105a description of each field in a logged event. This information can
106be used to parse the binary trace stream, and is also the place to
107find the field names that can be used in event filters (see section 5).
108
109It also displays the format string that will be used to print the
110event in text mode, along with the event name and ID used for
111profiling.
112
113Every event has a set of 'common' fields associated with it; these are
114the fields prefixed with 'common_'. The other fields vary between
115events and correspond to the fields defined in the TRACE_EVENT
116definition for that event.
117
118Each field in the format has the form:
119
120 field:field-type field-name; offset:N; size:N;
121
122where offset is the offset of the field in the trace record and size
123is the size of the data item, in bytes.
124
125For example, here's the information displayed for the 'sched_wakeup'
126event:
127
Tao Mabfc672d2011-01-05 14:07:54 +0800128# cat /sys/kernel/debug/tracing/events/sched/sched_wakeup/format
Tom Zanussi95b69602009-09-10 23:13:51 -0500129
130name: sched_wakeup
131ID: 60
132format:
133 field:unsigned short common_type; offset:0; size:2;
134 field:unsigned char common_flags; offset:2; size:1;
135 field:unsigned char common_preempt_count; offset:3; size:1;
136 field:int common_pid; offset:4; size:4;
137 field:int common_tgid; offset:8; size:4;
138
139 field:char comm[TASK_COMM_LEN]; offset:12; size:16;
140 field:pid_t pid; offset:28; size:4;
141 field:int prio; offset:32; size:4;
142 field:int success; offset:36; size:4;
143 field:int cpu; offset:40; size:4;
144
145print fmt: "task %s:%d [%d] success=%d [%03d]", REC->comm, REC->pid,
146 REC->prio, REC->success, REC->cpu
147
148This event contains 10 fields, the first 5 common and the remaining 5
149event-specific. All the fields for this event are numeric, except for
150'comm' which is a string, a distinction important for event filtering.
151
1525. Event filtering
153==================
154
155Trace events can be filtered in the kernel by associating boolean
156'filter expressions' with them. As soon as an event is logged into
157the trace buffer, its fields are checked against the filter expression
158associated with that event type. An event with field values that
159'match' the filter will appear in the trace output, and an event whose
160values don't match will be discarded. An event with no filter
161associated with it matches everything, and is the default when no
162filter has been set for an event.
163
1645.1 Expression syntax
165---------------------
166
167A filter expression consists of one or more 'predicates' that can be
168combined using the logical operators '&&' and '||'. A predicate is
169simply a clause that compares the value of a field contained within a
170logged event with a constant value and returns either 0 or 1 depending
171on whether the field value matched (1) or didn't match (0):
172
173 field-name relational-operator value
174
175Parentheses can be used to provide arbitrary logical groupings and
176double-quotes can be used to prevent the shell from interpreting
177operators as shell metacharacters.
178
179The field-names available for use in filters can be found in the
180'format' files for trace events (see section 4).
181
182The relational-operators depend on the type of the field being tested:
183
184The operators available for numeric fields are:
185
Steven Rostedt1a891cf2013-06-12 13:16:25 -0400186==, !=, <, <=, >, >=, &
Tom Zanussi95b69602009-09-10 23:13:51 -0500187
188And for string fields they are:
189
Steven Rostedt (Red Hat)c3e13c72013-06-17 10:59:17 -0400190==, !=, ~
Tom Zanussi95b69602009-09-10 23:13:51 -0500191
Masami Hiramatsu60f1d5e2016-10-05 20:58:15 +0900192The glob (~) accepts a wild card character (*,?) and character classes
193([). For example:
Steven Rostedt (Red Hat)c3e13c72013-06-17 10:59:17 -0400194
195 prev_comm ~ "*sh"
196 prev_comm ~ "sh*"
197 prev_comm ~ "*sh*"
Masami Hiramatsu60f1d5e2016-10-05 20:58:15 +0900198 prev_comm ~ "ba*sh"
Tom Zanussi95b69602009-09-10 23:13:51 -0500199
Tom Zanussi95b69602009-09-10 23:13:51 -05002005.2 Setting filters
201-------------------
202
203A filter for an individual event is set by writing a filter expression
204to the 'filter' file for the given event.
205
206For example:
207
Tao Mabfc672d2011-01-05 14:07:54 +0800208# cd /sys/kernel/debug/tracing/events/sched/sched_wakeup
Tom Zanussi95b69602009-09-10 23:13:51 -0500209# echo "common_preempt_count > 4" > filter
210
211A slightly more involved example:
212
Tao Mabfc672d2011-01-05 14:07:54 +0800213# cd /sys/kernel/debug/tracing/events/signal/signal_generate
Tom Zanussi95b69602009-09-10 23:13:51 -0500214# echo "((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter
215
216If there is an error in the expression, you'll get an 'Invalid
217argument' error when setting it, and the erroneous string along with
218an error message can be seen by looking at the filter e.g.:
219
Tao Mabfc672d2011-01-05 14:07:54 +0800220# cd /sys/kernel/debug/tracing/events/signal/signal_generate
Tom Zanussi95b69602009-09-10 23:13:51 -0500221# echo "((sig >= 10 && sig < 15) || dsig == 17) && comm != bash" > filter
222-bash: echo: write error: Invalid argument
223# cat filter
224((sig >= 10 && sig < 15) || dsig == 17) && comm != bash
225^
226parse_error: Field not found
227
228Currently the caret ('^') for an error always appears at the beginning of
229the filter string; the error message should still be useful though
230even without more accurate position info.
231
2325.3 Clearing filters
233--------------------
234
235To clear the filter for an event, write a '0' to the event's filter
236file.
237
238To clear the filters for all events in a subsystem, write a '0' to the
239subsystem's filter file.
240
2415.3 Subsystem filters
242---------------------
243
244For convenience, filters for every event in a subsystem can be set or
245cleared as a group by writing a filter expression into the filter file
Thomas Weber88393162010-03-16 11:47:56 +0100246at the root of the subsystem. Note however, that if a filter for any
Tom Zanussi95b69602009-09-10 23:13:51 -0500247event within the subsystem lacks a field specified in the subsystem
248filter, or if the filter can't be applied for any other reason, the
249filter for that event will retain its previous setting. This can
250result in an unintended mixture of filters which could lead to
251confusing (to the user who might think different filters are in
252effect) trace output. Only filters that reference just the common
253fields can be guaranteed to propagate successfully to all events.
254
255Here are a few subsystem filter examples that also illustrate the
256above points:
257
Thomas Weber88393162010-03-16 11:47:56 +0100258Clear the filters on all events in the sched subsystem:
Tom Zanussi95b69602009-09-10 23:13:51 -0500259
260# cd /sys/kernel/debug/tracing/events/sched
261# echo 0 > filter
262# cat sched_switch/filter
263none
264# cat sched_wakeup/filter
265none
266
267Set a filter using only common fields for all events in the sched
Thomas Weber88393162010-03-16 11:47:56 +0100268subsystem (all events end up with the same filter):
Tom Zanussi95b69602009-09-10 23:13:51 -0500269
270# cd /sys/kernel/debug/tracing/events/sched
271# echo common_pid == 0 > filter
272# cat sched_switch/filter
273common_pid == 0
274# cat sched_wakeup/filter
275common_pid == 0
276
277Attempt to set a filter using a non-common field for all events in the
Thomas Weber88393162010-03-16 11:47:56 +0100278sched subsystem (all events but those that have a prev_pid field retain
Tom Zanussi95b69602009-09-10 23:13:51 -0500279their old filters):
280
281# cd /sys/kernel/debug/tracing/events/sched
282# echo prev_pid == 0 > filter
283# cat sched_switch/filter
284prev_pid == 0
285# cat sched_wakeup/filter
286common_pid == 0
Tom Zanussiac38fb82013-10-24 08:59:30 -0500287
Steven Rostedt (Red Hat)627645f2015-11-03 16:37:15 -05002885.4 PID filtering
289-----------------
290
291The set_event_pid file in the same directory as the top events directory
292exists, will filter all events from tracing any task that does not have the
293PID listed in the set_event_pid file.
294
295# cd /sys/kernel/debug/tracing
296# echo $$ > set_event_pid
297# echo 1 > events/enabled
298
299Will only trace events for the current task.
300
301To add more PIDs without losing the PIDs already included, use '>>'.
302
303# echo 123 244 1 >> set_event_pid
304
305
Tom Zanussiac38fb82013-10-24 08:59:30 -05003066. Event triggers
307=================
308
309Trace events can be made to conditionally invoke trigger 'commands'
310which can take various forms and are described in detail below;
311examples would be enabling or disabling other trace events or invoking
312a stack trace whenever the trace event is hit. Whenever a trace event
313with attached triggers is invoked, the set of trigger commands
314associated with that event is invoked. Any given trigger can
315additionally have an event filter of the same form as described in
316section 5 (Event filtering) associated with it - the command will only
317be invoked if the event being invoked passes the associated filter.
318If no filter is associated with the trigger, it always passes.
319
320Triggers are added to and removed from a particular event by writing
321trigger expressions to the 'trigger' file for the given event.
322
323A given event can have any number of triggers associated with it,
324subject to any restrictions that individual commands may have in that
325regard.
326
327Event triggers are implemented on top of "soft" mode, which means that
328whenever a trace event has one or more triggers associated with it,
329the event is activated even if it isn't actually enabled, but is
330disabled in a "soft" mode. That is, the tracepoint will be called,
331but just will not be traced, unless of course it's actually enabled.
332This scheme allows triggers to be invoked even for events that aren't
333enabled, and also allows the current event filter implementation to be
334used for conditionally invoking triggers.
335
336The syntax for event triggers is roughly based on the syntax for
337set_ftrace_filter 'ftrace filter commands' (see the 'Filter commands'
338section of Documentation/trace/ftrace.txt), but there are major
339differences and the implementation isn't currently tied to it in any
340way, so beware about making generalizations between the two.
341
3426.1 Expression syntax
343---------------------
344
345Triggers are added by echoing the command to the 'trigger' file:
346
347 # echo 'command[:count] [if filter]' > trigger
348
349Triggers are removed by echoing the same command but starting with '!'
350to the 'trigger' file:
351
352 # echo '!command[:count] [if filter]' > trigger
353
354The [if filter] part isn't used in matching commands when removing, so
355leaving that off in a '!' command will accomplish the same thing as
356having it in.
357
358The filter syntax is the same as that described in the 'Event
359filtering' section above.
360
361For ease of use, writing to the trigger file using '>' currently just
362adds or removes a single trigger and there's no explicit '>>' support
363('>' actually behaves like '>>') or truncation support to remove all
364triggers (you have to use '!' for each one added.)
365
3666.2 Supported trigger commands
367------------------------------
368
369The following commands are supported:
370
371- enable_event/disable_event
372
373 These commands can enable or disable another trace event whenever
374 the triggering event is hit. When these commands are registered,
375 the other trace event is activated, but disabled in a "soft" mode.
376 That is, the tracepoint will be called, but just will not be traced.
377 The event tracepoint stays in this mode as long as there's a trigger
378 in effect that can trigger it.
379
380 For example, the following trigger causes kmalloc events to be
381 traced when a read system call is entered, and the :1 at the end
382 specifies that this enablement happens only once:
383
384 # echo 'enable_event:kmem:kmalloc:1' > \
385 /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
386
387 The following trigger causes kmalloc events to stop being traced
388 when a read system call exits. This disablement happens on every
389 read system call exit:
390
391 # echo 'disable_event:kmem:kmalloc' > \
392 /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger
393
394 The format is:
395
396 enable_event:<system>:<event>[:count]
397 disable_event:<system>:<event>[:count]
398
399 To remove the above commands:
400
401 # echo '!enable_event:kmem:kmalloc:1' > \
402 /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
403
404 # echo '!disable_event:kmem:kmalloc' > \
405 /sys/kernel/debug/tracing/events/syscalls/sys_exit_read/trigger
406
407 Note that there can be any number of enable/disable_event triggers
408 per triggering event, but there can only be one trigger per
409 triggered event. e.g. sys_enter_read can have triggers enabling both
410 kmem:kmalloc and sched:sched_switch, but can't have two kmem:kmalloc
411 versions such as kmem:kmalloc and kmem:kmalloc:1 or 'kmem:kmalloc if
412 bytes_req == 256' and 'kmem:kmalloc if bytes_alloc == 256' (they
413 could be combined into a single filter on kmem:kmalloc though).
414
415- stacktrace
416
417 This command dumps a stacktrace in the trace buffer whenever the
418 triggering event occurs.
419
420 For example, the following trigger dumps a stacktrace every time the
421 kmalloc tracepoint is hit:
422
423 # echo 'stacktrace' > \
424 /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
425
426 The following trigger dumps a stacktrace the first 5 times a kmalloc
427 request happens with a size >= 64K
428
429 # echo 'stacktrace:5 if bytes_req >= 65536' > \
430 /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
431
432 The format is:
433
434 stacktrace[:count]
435
436 To remove the above commands:
437
438 # echo '!stacktrace' > \
439 /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
440
441 # echo '!stacktrace:5 if bytes_req >= 65536' > \
442 /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
443
444 The latter can also be removed more simply by the following (without
445 the filter):
446
447 # echo '!stacktrace:5' > \
448 /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
449
450 Note that there can be only one stacktrace trigger per triggering
451 event.
452
453- snapshot
454
455 This command causes a snapshot to be triggered whenever the
456 triggering event occurs.
457
458 The following command creates a snapshot every time a block request
459 queue is unplugged with a depth > 1. If you were tracing a set of
460 events or functions at the time, the snapshot trace buffer would
Carlos Garciac98be0c2014-04-04 22:31:00 -0400461 capture those events when the trigger event occurred:
Tom Zanussiac38fb82013-10-24 08:59:30 -0500462
463 # echo 'snapshot if nr_rq > 1' > \
464 /sys/kernel/debug/tracing/events/block/block_unplug/trigger
465
466 To only snapshot once:
467
468 # echo 'snapshot:1 if nr_rq > 1' > \
469 /sys/kernel/debug/tracing/events/block/block_unplug/trigger
470
471 To remove the above commands:
472
473 # echo '!snapshot if nr_rq > 1' > \
474 /sys/kernel/debug/tracing/events/block/block_unplug/trigger
475
476 # echo '!snapshot:1 if nr_rq > 1' > \
477 /sys/kernel/debug/tracing/events/block/block_unplug/trigger
478
479 Note that there can be only one snapshot trigger per triggering
480 event.
481
482- traceon/traceoff
483
484 These commands turn tracing on and off when the specified events are
485 hit. The parameter determines how many times the tracing system is
486 turned on and off. If unspecified, there is no limit.
487
488 The following command turns tracing off the first time a block
489 request queue is unplugged with a depth > 1. If you were tracing a
490 set of events or functions at the time, you could then examine the
491 trace buffer to see the sequence of events that led up to the
492 trigger event:
493
494 # echo 'traceoff:1 if nr_rq > 1' > \
495 /sys/kernel/debug/tracing/events/block/block_unplug/trigger
496
497 To always disable tracing when nr_rq > 1 :
498
499 # echo 'traceoff if nr_rq > 1' > \
500 /sys/kernel/debug/tracing/events/block/block_unplug/trigger
501
502 To remove the above commands:
503
504 # echo '!traceoff:1 if nr_rq > 1' > \
505 /sys/kernel/debug/tracing/events/block/block_unplug/trigger
506
507 # echo '!traceoff if nr_rq > 1' > \
508 /sys/kernel/debug/tracing/events/block/block_unplug/trigger
509
510 Note that there can be only one traceon or traceoff trigger per
511 triggering event.
Tom Zanussi0fc38132016-03-03 12:54:56 -0600512
513- hist
514
515 This command aggregates event hits into a hash table keyed on one or
516 more trace event format fields (or stacktrace) and a set of running
517 totals derived from one or more trace event format fields and/or
518 event counts (hitcount).
519
520 The format of a hist trigger is as follows:
521
522 hist:keys=<field1[,field2,...]>[:values=<field1[,field2,...]>]
523 [:sort=<field1[,field2,...]>][:size=#entries][:pause][:continue]
Tom Zanussi5463bfd2016-03-03 12:54:59 -0600524 [:clear][:name=histname1] [if <filter>]
Tom Zanussi0fc38132016-03-03 12:54:56 -0600525
526 When a matching event is hit, an entry is added to a hash table
527 using the key(s) and value(s) named. Keys and values correspond to
528 fields in the event's format description. Values must correspond to
529 numeric fields - on an event hit, the value(s) will be added to a
530 sum kept for that field. The special string 'hitcount' can be used
531 in place of an explicit value field - this is simply a count of
532 event hits. If 'values' isn't specified, an implicit 'hitcount'
533 value will be automatically created and used as the only value.
534 Keys can be any field, or the special string 'stacktrace', which
535 will use the event's kernel stacktrace as the key. The keywords
536 'keys' or 'key' can be used to specify keys, and the keywords
537 'values', 'vals', or 'val' can be used to specify values. Compound
538 keys consisting of up to two fields can be specified by the 'keys'
539 keyword. Hashing a compound key produces a unique entry in the
540 table for each unique combination of component keys, and can be
541 useful for providing more fine-grained summaries of event data.
542 Additionally, sort keys consisting of up to two fields can be
543 specified by the 'sort' keyword. If more than one field is
544 specified, the result will be a 'sort within a sort': the first key
545 is taken to be the primary sort key and the second the secondary
Tom Zanussi5463bfd2016-03-03 12:54:59 -0600546 key. If a hist trigger is given a name using the 'name' parameter,
547 its histogram data will be shared with other triggers of the same
548 name, and trigger hits will update this common data. Only triggers
549 with 'compatible' fields can be combined in this way; triggers are
550 'compatible' if the fields named in the trigger share the same
551 number and type of fields and those fields also have the same names.
552 Note that any two events always share the compatible 'hitcount' and
553 'stacktrace' fields and can therefore be combined using those
554 fields, however pointless that may be.
Tom Zanussi0fc38132016-03-03 12:54:56 -0600555
556 'hist' triggers add a 'hist' file to each event's subdirectory.
557 Reading the 'hist' file for the event will dump the hash table in
Tom Zanussi52a7f162016-03-03 12:54:57 -0600558 its entirety to stdout. If there are multiple hist triggers
559 attached to an event, there will be a table for each trigger in the
Tom Zanussi5463bfd2016-03-03 12:54:59 -0600560 output. The table displayed for a named trigger will be the same as
561 any other instance having the same name. Each printed hash table
562 entry is a simple list of the keys and values comprising the entry;
563 keys are printed first and are delineated by curly braces, and are
564 followed by the set of value fields for the entry. By default,
565 numeric fields are displayed as base-10 integers. This can be
566 modified by appending any of the following modifiers to the field
567 name:
Tom Zanussi0fc38132016-03-03 12:54:56 -0600568
569 .hex display a number as a hex value
570 .sym display an address as a symbol
571 .sym-offset display an address as a symbol and offset
572 .syscall display a syscall id as a system call name
573 .execname display a common_pid as a program name
574
575 Note that in general the semantics of a given field aren't
576 interpreted when applying a modifier to it, but there are some
577 restrictions to be aware of in this regard:
578
579 - only the 'hex' modifier can be used for values (because values
580 are essentially sums, and the other modifiers don't make sense
581 in that context).
582 - the 'execname' modifier can only be used on a 'common_pid'. The
583 reason for this is that the execname is simply the 'comm' value
584 saved for the 'current' process when an event was triggered,
585 which is the same as the common_pid value saved by the event
586 tracing code. Trying to apply that comm value to other pid
587 values wouldn't be correct, and typically events that care save
588 pid-specific comm fields in the event itself.
589
590 A typical usage scenario would be the following to enable a hist
591 trigger, read its current contents, and then turn it off:
592
593 # echo 'hist:keys=skbaddr.hex:vals=len' > \
594 /sys/kernel/debug/tracing/events/net/netif_rx/trigger
595
596 # cat /sys/kernel/debug/tracing/events/net/netif_rx/hist
597
598 # echo '!hist:keys=skbaddr.hex:vals=len' > \
599 /sys/kernel/debug/tracing/events/net/netif_rx/trigger
600
601 The trigger file itself can be read to show the details of the
602 currently attached hist trigger. This information is also displayed
603 at the top of the 'hist' file when read.
604
605 By default, the size of the hash table is 2048 entries. The 'size'
606 parameter can be used to specify more or fewer than that. The units
607 are in terms of hashtable entries - if a run uses more entries than
608 specified, the results will show the number of 'drops', the number
609 of hits that were ignored. The size should be a power of 2 between
610 128 and 131072 (any non- power-of-2 number specified will be rounded
611 up).
612
613 The 'sort' parameter can be used to specify a value field to sort
614 on. The default if unspecified is 'hitcount' and the default sort
615 order is 'ascending'. To sort in the opposite direction, append
616 .descending' to the sort key.
617
618 The 'pause' parameter can be used to pause an existing hist trigger
619 or to start a hist trigger but not log any events until told to do
620 so. 'continue' or 'cont' can be used to start or restart a paused
621 hist trigger.
622
623 The 'clear' parameter will clear the contents of a running hist
624 trigger and leave its current paused/active state.
625
626 Note that the 'pause', 'cont', and 'clear' parameters should be
627 applied using 'append' shell operator ('>>') if applied to an
628 existing trigger, rather than via the '>' operator, which will cause
629 the trigger to be removed through truncation.
630
631- enable_hist/disable_hist
632
633 The enable_hist and disable_hist triggers can be used to have one
634 event conditionally start and stop another event's already-attached
635 hist trigger. Any number of enable_hist and disable_hist triggers
636 can be attached to a given event, allowing that event to kick off
637 and stop aggregations on a host of other events.
638
639 The format is very similar to the enable/disable_event triggers:
640
641 enable_hist:<system>:<event>[:count]
642 disable_hist:<system>:<event>[:count]
643
644 Instead of enabling or disabling the tracing of the target event
645 into the trace buffer as the enable/disable_event triggers do, the
646 enable/disable_hist triggers enable or disable the aggregation of
647 the target event into a hash table.
648
649 A typical usage scenario for the enable_hist/disable_hist triggers
650 would be to first set up a paused hist trigger on some event,
651 followed by an enable_hist/disable_hist pair that turns the hist
652 aggregation on and off when conditions of interest are hit:
653
654 # echo 'hist:keys=skbaddr.hex:vals=len:pause' > \
655 /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
656
657 # echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \
658 /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger
659
660 # echo 'disable_hist:net:netif_receive_skb if comm==wget' > \
661 /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger
662
663 The above sets up an initially paused hist trigger which is unpaused
664 and starts aggregating events when a given program is executed, and
665 which stops aggregating when the process exits and the hist trigger
666 is paused again.
667
668 The examples below provide a more concrete illustration of the
669 concepts and typical usage patterns discussed above.
670
671
6726.2 'hist' trigger examples
673---------------------------
674
675 The first set of examples creates aggregations using the kmalloc
676 event. The fields that can be used for the hist trigger are listed
677 in the kmalloc event's format file:
678
679 # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/format
680 name: kmalloc
681 ID: 374
682 format:
683 field:unsigned short common_type; offset:0; size:2; signed:0;
684 field:unsigned char common_flags; offset:2; size:1; signed:0;
685 field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
686 field:int common_pid; offset:4; size:4; signed:1;
687
688 field:unsigned long call_site; offset:8; size:8; signed:0;
689 field:const void * ptr; offset:16; size:8; signed:0;
690 field:size_t bytes_req; offset:24; size:8; signed:0;
691 field:size_t bytes_alloc; offset:32; size:8; signed:0;
692 field:gfp_t gfp_flags; offset:40; size:4; signed:0;
693
694 We'll start by creating a hist trigger that generates a simple table
695 that lists the total number of bytes requested for each function in
696 the kernel that made one or more calls to kmalloc:
697
698 # echo 'hist:key=call_site:val=bytes_req' > \
699 /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
700
701 This tells the tracing system to create a 'hist' trigger using the
702 call_site field of the kmalloc event as the key for the table, which
703 just means that each unique call_site address will have an entry
704 created for it in the table. The 'val=bytes_req' parameter tells
705 the hist trigger that for each unique entry (call_site) in the
706 table, it should keep a running total of the number of bytes
707 requested by that call_site.
708
709 We'll let it run for awhile and then dump the contents of the 'hist'
710 file in the kmalloc event's subdirectory (for readability, a number
711 of entries have been omitted):
712
713 # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
714 # trigger info: hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active]
715
716 { call_site: 18446744072106379007 } hitcount: 1 bytes_req: 176
717 { call_site: 18446744071579557049 } hitcount: 1 bytes_req: 1024
718 { call_site: 18446744071580608289 } hitcount: 1 bytes_req: 16384
719 { call_site: 18446744071581827654 } hitcount: 1 bytes_req: 24
720 { call_site: 18446744071580700980 } hitcount: 1 bytes_req: 8
721 { call_site: 18446744071579359876 } hitcount: 1 bytes_req: 152
722 { call_site: 18446744071580795365 } hitcount: 3 bytes_req: 144
723 { call_site: 18446744071581303129 } hitcount: 3 bytes_req: 144
724 { call_site: 18446744071580713234 } hitcount: 4 bytes_req: 2560
725 { call_site: 18446744071580933750 } hitcount: 4 bytes_req: 736
726 .
727 .
728 .
729 { call_site: 18446744072106047046 } hitcount: 69 bytes_req: 5576
730 { call_site: 18446744071582116407 } hitcount: 73 bytes_req: 2336
731 { call_site: 18446744072106054684 } hitcount: 136 bytes_req: 140504
732 { call_site: 18446744072106224230 } hitcount: 136 bytes_req: 19584
733 { call_site: 18446744072106078074 } hitcount: 153 bytes_req: 2448
734 { call_site: 18446744072106062406 } hitcount: 153 bytes_req: 36720
735 { call_site: 18446744071582507929 } hitcount: 153 bytes_req: 37088
736 { call_site: 18446744072102520590 } hitcount: 273 bytes_req: 10920
737 { call_site: 18446744071582143559 } hitcount: 358 bytes_req: 716
738 { call_site: 18446744072106465852 } hitcount: 417 bytes_req: 56712
739 { call_site: 18446744072102523378 } hitcount: 485 bytes_req: 27160
740 { call_site: 18446744072099568646 } hitcount: 1676 bytes_req: 33520
741
742 Totals:
743 Hits: 4610
744 Entries: 45
745 Dropped: 0
746
747 The output displays a line for each entry, beginning with the key
748 specified in the trigger, followed by the value(s) also specified in
749 the trigger. At the beginning of the output is a line that displays
750 the trigger info, which can also be displayed by reading the
751 'trigger' file:
752
753 # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
754 hist:keys=call_site:vals=bytes_req:sort=hitcount:size=2048 [active]
755
756 At the end of the output are a few lines that display the overall
757 totals for the run. The 'Hits' field shows the total number of
758 times the event trigger was hit, the 'Entries' field shows the total
759 number of used entries in the hash table, and the 'Dropped' field
760 shows the number of hits that were dropped because the number of
761 used entries for the run exceeded the maximum number of entries
762 allowed for the table (normally 0, but if not a hint that you may
763 want to increase the size of the table using the 'size' parameter).
764
765 Notice in the above output that there's an extra field, 'hitcount',
766 which wasn't specified in the trigger. Also notice that in the
767 trigger info output, there's a parameter, 'sort=hitcount', which
768 wasn't specified in the trigger either. The reason for that is that
769 every trigger implicitly keeps a count of the total number of hits
770 attributed to a given entry, called the 'hitcount'. That hitcount
771 information is explicitly displayed in the output, and in the
772 absence of a user-specified sort parameter, is used as the default
773 sort field.
774
775 The value 'hitcount' can be used in place of an explicit value in
776 the 'values' parameter if you don't really need to have any
777 particular field summed and are mainly interested in hit
778 frequencies.
779
780 To turn the hist trigger off, simply call up the trigger in the
781 command history and re-execute it with a '!' prepended:
782
783 # echo '!hist:key=call_site:val=bytes_req' > \
784 /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
785
786 Finally, notice that the call_site as displayed in the output above
787 isn't really very useful. It's an address, but normally addresses
788 are displayed in hex. To have a numeric field displayed as a hex
789 value, simply append '.hex' to the field name in the trigger:
790
791 # echo 'hist:key=call_site.hex:val=bytes_req' > \
792 /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
793
794 # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
795 # trigger info: hist:keys=call_site.hex:vals=bytes_req:sort=hitcount:size=2048 [active]
796
797 { call_site: ffffffffa026b291 } hitcount: 1 bytes_req: 433
798 { call_site: ffffffffa07186ff } hitcount: 1 bytes_req: 176
799 { call_site: ffffffff811ae721 } hitcount: 1 bytes_req: 16384
800 { call_site: ffffffff811c5134 } hitcount: 1 bytes_req: 8
801 { call_site: ffffffffa04a9ebb } hitcount: 1 bytes_req: 511
802 { call_site: ffffffff8122e0a6 } hitcount: 1 bytes_req: 12
803 { call_site: ffffffff8107da84 } hitcount: 1 bytes_req: 152
804 { call_site: ffffffff812d8246 } hitcount: 1 bytes_req: 24
805 { call_site: ffffffff811dc1e5 } hitcount: 3 bytes_req: 144
806 { call_site: ffffffffa02515e8 } hitcount: 3 bytes_req: 648
807 { call_site: ffffffff81258159 } hitcount: 3 bytes_req: 144
808 { call_site: ffffffff811c80f4 } hitcount: 4 bytes_req: 544
809 .
810 .
811 .
812 { call_site: ffffffffa06c7646 } hitcount: 106 bytes_req: 8024
813 { call_site: ffffffffa06cb246 } hitcount: 132 bytes_req: 31680
814 { call_site: ffffffffa06cef7a } hitcount: 132 bytes_req: 2112
815 { call_site: ffffffff8137e399 } hitcount: 132 bytes_req: 23232
816 { call_site: ffffffffa06c941c } hitcount: 185 bytes_req: 171360
817 { call_site: ffffffffa06f2a66 } hitcount: 185 bytes_req: 26640
818 { call_site: ffffffffa036a70e } hitcount: 265 bytes_req: 10600
819 { call_site: ffffffff81325447 } hitcount: 292 bytes_req: 584
820 { call_site: ffffffffa072da3c } hitcount: 446 bytes_req: 60656
821 { call_site: ffffffffa036b1f2 } hitcount: 526 bytes_req: 29456
822 { call_site: ffffffffa0099c06 } hitcount: 1780 bytes_req: 35600
823
824 Totals:
825 Hits: 4775
826 Entries: 46
827 Dropped: 0
828
829 Even that's only marginally more useful - while hex values do look
830 more like addresses, what users are typically more interested in
831 when looking at text addresses are the corresponding symbols
832 instead. To have an address displayed as symbolic value instead,
833 simply append '.sym' or '.sym-offset' to the field name in the
834 trigger:
835
836 # echo 'hist:key=call_site.sym:val=bytes_req' > \
837 /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
838
839 # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
840 # trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=hitcount:size=2048 [active]
841
842 { call_site: [ffffffff810adcb9] syslog_print_all } hitcount: 1 bytes_req: 1024
843 { call_site: [ffffffff8154bc62] usb_control_msg } hitcount: 1 bytes_req: 8
844 { call_site: [ffffffffa00bf6fe] hidraw_send_report [hid] } hitcount: 1 bytes_req: 7
845 { call_site: [ffffffff8154acbe] usb_alloc_urb } hitcount: 1 bytes_req: 192
846 { call_site: [ffffffffa00bf1ca] hidraw_report_event [hid] } hitcount: 1 bytes_req: 7
847 { call_site: [ffffffff811e3a25] __seq_open_private } hitcount: 1 bytes_req: 40
848 { call_site: [ffffffff8109524a] alloc_fair_sched_group } hitcount: 2 bytes_req: 128
849 { call_site: [ffffffff811febd5] fsnotify_alloc_group } hitcount: 2 bytes_req: 528
850 { call_site: [ffffffff81440f58] __tty_buffer_request_room } hitcount: 2 bytes_req: 2624
851 { call_site: [ffffffff81200ba6] inotify_new_group } hitcount: 2 bytes_req: 96
852 { call_site: [ffffffffa05e19af] ieee80211_start_tx_ba_session [mac80211] } hitcount: 2 bytes_req: 464
853 { call_site: [ffffffff81672406] tcp_get_metrics } hitcount: 2 bytes_req: 304
854 { call_site: [ffffffff81097ec2] alloc_rt_sched_group } hitcount: 2 bytes_req: 128
855 { call_site: [ffffffff81089b05] sched_create_group } hitcount: 2 bytes_req: 1424
856 .
857 .
858 .
859 { call_site: [ffffffffa04a580c] intel_crtc_page_flip [i915] } hitcount: 1185 bytes_req: 123240
860 { call_site: [ffffffffa0287592] drm_mode_page_flip_ioctl [drm] } hitcount: 1185 bytes_req: 104280
861 { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state [i915] } hitcount: 1402 bytes_req: 190672
862 { call_site: [ffffffff812891ca] ext4_find_extent } hitcount: 1518 bytes_req: 146208
863 { call_site: [ffffffffa029070e] drm_vma_node_allow [drm] } hitcount: 1746 bytes_req: 69840
864 { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 2021 bytes_req: 792312
865 { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc [drm] } hitcount: 2592 bytes_req: 145152
866 { call_site: [ffffffffa0489a66] intel_ring_begin [i915] } hitcount: 2629 bytes_req: 378576
867 { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 2629 bytes_req: 3783248
868 { call_site: [ffffffff81325607] apparmor_file_alloc_security } hitcount: 5192 bytes_req: 10384
869 { call_site: [ffffffffa00b7c06] hid_report_raw_event [hid] } hitcount: 5529 bytes_req: 110584
870 { call_site: [ffffffff8131ebf7] aa_alloc_task_context } hitcount: 21943 bytes_req: 702176
871 { call_site: [ffffffff8125847d] ext4_htree_store_dirent } hitcount: 55759 bytes_req: 5074265
872
873 Totals:
874 Hits: 109928
875 Entries: 71
876 Dropped: 0
877
878 Because the default sort key above is 'hitcount', the above shows a
879 the list of call_sites by increasing hitcount, so that at the bottom
880 we see the functions that made the most kmalloc calls during the
881 run. If instead we we wanted to see the top kmalloc callers in
882 terms of the number of bytes requested rather than the number of
883 calls, and we wanted the top caller to appear at the top, we can use
884 the 'sort' parameter, along with the 'descending' modifier:
885
886 # echo 'hist:key=call_site.sym:val=bytes_req:sort=bytes_req.descending' > \
887 /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
888
889 # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
890 # trigger info: hist:keys=call_site.sym:vals=bytes_req:sort=bytes_req.descending:size=2048 [active]
891
892 { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 2186 bytes_req: 3397464
893 { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 1790 bytes_req: 712176
894 { call_site: [ffffffff8125847d] ext4_htree_store_dirent } hitcount: 8132 bytes_req: 513135
895 { call_site: [ffffffff811e2a1b] seq_buf_alloc } hitcount: 106 bytes_req: 440128
896 { call_site: [ffffffffa0489a66] intel_ring_begin [i915] } hitcount: 2186 bytes_req: 314784
897 { call_site: [ffffffff812891ca] ext4_find_extent } hitcount: 2174 bytes_req: 208992
898 { call_site: [ffffffff811ae8e1] __kmalloc } hitcount: 8 bytes_req: 131072
899 { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state [i915] } hitcount: 859 bytes_req: 116824
900 { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc [drm] } hitcount: 1834 bytes_req: 102704
901 { call_site: [ffffffffa04a580c] intel_crtc_page_flip [i915] } hitcount: 972 bytes_req: 101088
902 { call_site: [ffffffffa0287592] drm_mode_page_flip_ioctl [drm] } hitcount: 972 bytes_req: 85536
903 { call_site: [ffffffffa00b7c06] hid_report_raw_event [hid] } hitcount: 3333 bytes_req: 66664
904 { call_site: [ffffffff8137e559] sg_kmalloc } hitcount: 209 bytes_req: 61632
905 .
906 .
907 .
908 { call_site: [ffffffff81095225] alloc_fair_sched_group } hitcount: 2 bytes_req: 128
909 { call_site: [ffffffff81097ec2] alloc_rt_sched_group } hitcount: 2 bytes_req: 128
910 { call_site: [ffffffff812d8406] copy_semundo } hitcount: 2 bytes_req: 48
911 { call_site: [ffffffff81200ba6] inotify_new_group } hitcount: 1 bytes_req: 48
912 { call_site: [ffffffffa027121a] drm_getmagic [drm] } hitcount: 1 bytes_req: 48
913 { call_site: [ffffffff811e3a25] __seq_open_private } hitcount: 1 bytes_req: 40
914 { call_site: [ffffffff811c52f4] bprm_change_interp } hitcount: 2 bytes_req: 16
915 { call_site: [ffffffff8154bc62] usb_control_msg } hitcount: 1 bytes_req: 8
916 { call_site: [ffffffffa00bf1ca] hidraw_report_event [hid] } hitcount: 1 bytes_req: 7
917 { call_site: [ffffffffa00bf6fe] hidraw_send_report [hid] } hitcount: 1 bytes_req: 7
918
919 Totals:
920 Hits: 32133
921 Entries: 81
922 Dropped: 0
923
924 To display the offset and size information in addition to the symbol
925 name, just use 'sym-offset' instead:
926
927 # echo 'hist:key=call_site.sym-offset:val=bytes_req:sort=bytes_req.descending' > \
928 /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
929
930 # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
931 # trigger info: hist:keys=call_site.sym-offset:vals=bytes_req:sort=bytes_req.descending:size=2048 [active]
932
933 { call_site: [ffffffffa046041c] i915_gem_execbuffer2+0x6c/0x2c0 [i915] } hitcount: 4569 bytes_req: 3163720
934 { call_site: [ffffffffa0489a66] intel_ring_begin+0xc6/0x1f0 [i915] } hitcount: 4569 bytes_req: 657936
935 { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23+0x694/0x1020 [i915] } hitcount: 1519 bytes_req: 472936
936 { call_site: [ffffffffa045e646] i915_gem_do_execbuffer.isra.23+0x516/0x1020 [i915] } hitcount: 3050 bytes_req: 211832
937 { call_site: [ffffffff811e2a1b] seq_buf_alloc+0x1b/0x50 } hitcount: 34 bytes_req: 148384
938 { call_site: [ffffffffa04a580c] intel_crtc_page_flip+0xbc/0x870 [i915] } hitcount: 1385 bytes_req: 144040
939 { call_site: [ffffffff811ae8e1] __kmalloc+0x191/0x1b0 } hitcount: 8 bytes_req: 131072
940 { call_site: [ffffffffa0287592] drm_mode_page_flip_ioctl+0x282/0x360 [drm] } hitcount: 1385 bytes_req: 121880
941 { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc+0x32/0x100 [drm] } hitcount: 1848 bytes_req: 103488
942 { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state+0x2c/0xa0 [i915] } hitcount: 461 bytes_req: 62696
943 { call_site: [ffffffffa029070e] drm_vma_node_allow+0x2e/0xd0 [drm] } hitcount: 1541 bytes_req: 61640
944 { call_site: [ffffffff815f8d7b] sk_prot_alloc+0xcb/0x1b0 } hitcount: 57 bytes_req: 57456
945 .
946 .
947 .
948 { call_site: [ffffffff8109524a] alloc_fair_sched_group+0x5a/0x1a0 } hitcount: 2 bytes_req: 128
949 { call_site: [ffffffffa027b921] drm_vm_open_locked+0x31/0xa0 [drm] } hitcount: 3 bytes_req: 96
950 { call_site: [ffffffff8122e266] proc_self_follow_link+0x76/0xb0 } hitcount: 8 bytes_req: 96
951 { call_site: [ffffffff81213e80] load_elf_binary+0x240/0x1650 } hitcount: 3 bytes_req: 84
952 { call_site: [ffffffff8154bc62] usb_control_msg+0x42/0x110 } hitcount: 1 bytes_req: 8
953 { call_site: [ffffffffa00bf6fe] hidraw_send_report+0x7e/0x1a0 [hid] } hitcount: 1 bytes_req: 7
954 { call_site: [ffffffffa00bf1ca] hidraw_report_event+0x8a/0x120 [hid] } hitcount: 1 bytes_req: 7
955
956 Totals:
957 Hits: 26098
958 Entries: 64
959 Dropped: 0
960
961 We can also add multiple fields to the 'values' parameter. For
962 example, we might want to see the total number of bytes allocated
963 alongside bytes requested, and display the result sorted by bytes
964 allocated in a descending order:
965
966 # echo 'hist:keys=call_site.sym:values=bytes_req,bytes_alloc:sort=bytes_alloc.descending' > \
967 /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
968
969 # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
970 # trigger info: hist:keys=call_site.sym:vals=bytes_req,bytes_alloc:sort=bytes_alloc.descending:size=2048 [active]
971
972 { call_site: [ffffffffa046041c] i915_gem_execbuffer2 [i915] } hitcount: 7403 bytes_req: 4084360 bytes_alloc: 5958016
973 { call_site: [ffffffff811e2a1b] seq_buf_alloc } hitcount: 541 bytes_req: 2213968 bytes_alloc: 2228224
974 { call_site: [ffffffffa0489a66] intel_ring_begin [i915] } hitcount: 7404 bytes_req: 1066176 bytes_alloc: 1421568
975 { call_site: [ffffffffa045e7c4] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 1565 bytes_req: 557368 bytes_alloc: 1037760
976 { call_site: [ffffffff8125847d] ext4_htree_store_dirent } hitcount: 9557 bytes_req: 595778 bytes_alloc: 695744
977 { call_site: [ffffffffa045e646] i915_gem_do_execbuffer.isra.23 [i915] } hitcount: 5839 bytes_req: 430680 bytes_alloc: 470400
978 { call_site: [ffffffffa04c4a3c] intel_plane_duplicate_state [i915] } hitcount: 2388 bytes_req: 324768 bytes_alloc: 458496
979 { call_site: [ffffffffa02911f2] drm_modeset_lock_crtc [drm] } hitcount: 3911 bytes_req: 219016 bytes_alloc: 250304
980 { call_site: [ffffffff815f8d7b] sk_prot_alloc } hitcount: 235 bytes_req: 236880 bytes_alloc: 240640
981 { call_site: [ffffffff8137e559] sg_kmalloc } hitcount: 557 bytes_req: 169024 bytes_alloc: 221760
982 { call_site: [ffffffffa00b7c06] hid_report_raw_event [hid] } hitcount: 9378 bytes_req: 187548 bytes_alloc: 206312
983 { call_site: [ffffffffa04a580c] intel_crtc_page_flip [i915] } hitcount: 1519 bytes_req: 157976 bytes_alloc: 194432
984 .
985 .
986 .
987 { call_site: [ffffffff8109bd3b] sched_autogroup_create_attach } hitcount: 2 bytes_req: 144 bytes_alloc: 192
988 { call_site: [ffffffff81097ee8] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128
989 { call_site: [ffffffff8109524a] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128
990 { call_site: [ffffffff81095225] alloc_fair_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128
991 { call_site: [ffffffff81097ec2] alloc_rt_sched_group } hitcount: 2 bytes_req: 128 bytes_alloc: 128
992 { call_site: [ffffffff81213e80] load_elf_binary } hitcount: 3 bytes_req: 84 bytes_alloc: 96
993 { call_site: [ffffffff81079a2e] kthread_create_on_node } hitcount: 1 bytes_req: 56 bytes_alloc: 64
994 { call_site: [ffffffffa00bf6fe] hidraw_send_report [hid] } hitcount: 1 bytes_req: 7 bytes_alloc: 8
995 { call_site: [ffffffff8154bc62] usb_control_msg } hitcount: 1 bytes_req: 8 bytes_alloc: 8
996 { call_site: [ffffffffa00bf1ca] hidraw_report_event [hid] } hitcount: 1 bytes_req: 7 bytes_alloc: 8
997
998 Totals:
999 Hits: 66598
1000 Entries: 65
1001 Dropped: 0
1002
1003 Finally, to finish off our kmalloc example, instead of simply having
1004 the hist trigger display symbolic call_sites, we can have the hist
1005 trigger additionally display the complete set of kernel stack traces
1006 that led to each call_site. To do that, we simply use the special
1007 value 'stacktrace' for the key parameter:
1008
1009 # echo 'hist:keys=stacktrace:values=bytes_req,bytes_alloc:sort=bytes_alloc' > \
1010 /sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
1011
1012 The above trigger will use the kernel stack trace in effect when an
1013 event is triggered as the key for the hash table. This allows the
1014 enumeration of every kernel callpath that led up to a particular
1015 event, along with a running total of any of the event fields for
1016 that event. Here we tally bytes requested and bytes allocated for
1017 every callpath in the system that led up to a kmalloc (in this case
1018 every callpath to a kmalloc for a kernel compile):
1019
1020 # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/hist
1021 # trigger info: hist:keys=stacktrace:vals=bytes_req,bytes_alloc:sort=bytes_alloc:size=2048 [active]
1022
1023 { stacktrace:
1024 __kmalloc_track_caller+0x10b/0x1a0
1025 kmemdup+0x20/0x50
1026 hidraw_report_event+0x8a/0x120 [hid]
1027 hid_report_raw_event+0x3ea/0x440 [hid]
1028 hid_input_report+0x112/0x190 [hid]
1029 hid_irq_in+0xc2/0x260 [usbhid]
1030 __usb_hcd_giveback_urb+0x72/0x120
1031 usb_giveback_urb_bh+0x9e/0xe0
1032 tasklet_hi_action+0xf8/0x100
1033 __do_softirq+0x114/0x2c0
1034 irq_exit+0xa5/0xb0
1035 do_IRQ+0x5a/0xf0
1036 ret_from_intr+0x0/0x30
1037 cpuidle_enter+0x17/0x20
1038 cpu_startup_entry+0x315/0x3e0
1039 rest_init+0x7c/0x80
1040 } hitcount: 3 bytes_req: 21 bytes_alloc: 24
1041 { stacktrace:
1042 __kmalloc_track_caller+0x10b/0x1a0
1043 kmemdup+0x20/0x50
1044 hidraw_report_event+0x8a/0x120 [hid]
1045 hid_report_raw_event+0x3ea/0x440 [hid]
1046 hid_input_report+0x112/0x190 [hid]
1047 hid_irq_in+0xc2/0x260 [usbhid]
1048 __usb_hcd_giveback_urb+0x72/0x120
1049 usb_giveback_urb_bh+0x9e/0xe0
1050 tasklet_hi_action+0xf8/0x100
1051 __do_softirq+0x114/0x2c0
1052 irq_exit+0xa5/0xb0
1053 do_IRQ+0x5a/0xf0
1054 ret_from_intr+0x0/0x30
1055 } hitcount: 3 bytes_req: 21 bytes_alloc: 24
1056 { stacktrace:
1057 kmem_cache_alloc_trace+0xeb/0x150
1058 aa_alloc_task_context+0x27/0x40
1059 apparmor_cred_prepare+0x1f/0x50
1060 security_prepare_creds+0x16/0x20
1061 prepare_creds+0xdf/0x1a0
1062 SyS_capset+0xb5/0x200
1063 system_call_fastpath+0x12/0x6a
1064 } hitcount: 1 bytes_req: 32 bytes_alloc: 32
1065 .
1066 .
1067 .
1068 { stacktrace:
1069 __kmalloc+0x11b/0x1b0
1070 i915_gem_execbuffer2+0x6c/0x2c0 [i915]
1071 drm_ioctl+0x349/0x670 [drm]
1072 do_vfs_ioctl+0x2f0/0x4f0
1073 SyS_ioctl+0x81/0xa0
1074 system_call_fastpath+0x12/0x6a
1075 } hitcount: 17726 bytes_req: 13944120 bytes_alloc: 19593808
1076 { stacktrace:
1077 __kmalloc+0x11b/0x1b0
1078 load_elf_phdrs+0x76/0xa0
1079 load_elf_binary+0x102/0x1650
1080 search_binary_handler+0x97/0x1d0
1081 do_execveat_common.isra.34+0x551/0x6e0
1082 SyS_execve+0x3a/0x50
1083 return_from_execve+0x0/0x23
1084 } hitcount: 33348 bytes_req: 17152128 bytes_alloc: 20226048
1085 { stacktrace:
1086 kmem_cache_alloc_trace+0xeb/0x150
1087 apparmor_file_alloc_security+0x27/0x40
1088 security_file_alloc+0x16/0x20
1089 get_empty_filp+0x93/0x1c0
1090 path_openat+0x31/0x5f0
1091 do_filp_open+0x3a/0x90
1092 do_sys_open+0x128/0x220
1093 SyS_open+0x1e/0x20
1094 system_call_fastpath+0x12/0x6a
1095 } hitcount: 4766422 bytes_req: 9532844 bytes_alloc: 38131376
1096 { stacktrace:
1097 __kmalloc+0x11b/0x1b0
1098 seq_buf_alloc+0x1b/0x50
1099 seq_read+0x2cc/0x370
1100 proc_reg_read+0x3d/0x80
1101 __vfs_read+0x28/0xe0
1102 vfs_read+0x86/0x140
1103 SyS_read+0x46/0xb0
1104 system_call_fastpath+0x12/0x6a
1105 } hitcount: 19133 bytes_req: 78368768 bytes_alloc: 78368768
1106
1107 Totals:
1108 Hits: 6085872
1109 Entries: 253
1110 Dropped: 0
1111
1112 If you key a hist trigger on common_pid, in order for example to
1113 gather and display sorted totals for each process, you can use the
1114 special .execname modifier to display the executable names for the
1115 processes in the table rather than raw pids. The example below
1116 keeps a per-process sum of total bytes read:
1117
1118 # echo 'hist:key=common_pid.execname:val=count:sort=count.descending' > \
1119 /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/trigger
1120
1121 # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_read/hist
1122 # trigger info: hist:keys=common_pid.execname:vals=count:sort=count.descending:size=2048 [active]
1123
1124 { common_pid: gnome-terminal [ 3196] } hitcount: 280 count: 1093512
1125 { common_pid: Xorg [ 1309] } hitcount: 525 count: 256640
1126 { common_pid: compiz [ 2889] } hitcount: 59 count: 254400
1127 { common_pid: bash [ 8710] } hitcount: 3 count: 66369
1128 { common_pid: dbus-daemon-lau [ 8703] } hitcount: 49 count: 47739
1129 { common_pid: irqbalance [ 1252] } hitcount: 27 count: 27648
1130 { common_pid: 01ifupdown [ 8705] } hitcount: 3 count: 17216
1131 { common_pid: dbus-daemon [ 772] } hitcount: 10 count: 12396
1132 { common_pid: Socket Thread [ 8342] } hitcount: 11 count: 11264
1133 { common_pid: nm-dhcp-client. [ 8701] } hitcount: 6 count: 7424
1134 { common_pid: gmain [ 1315] } hitcount: 18 count: 6336
1135 .
1136 .
1137 .
1138 { common_pid: postgres [ 1892] } hitcount: 2 count: 32
1139 { common_pid: postgres [ 1891] } hitcount: 2 count: 32
1140 { common_pid: gmain [ 8704] } hitcount: 2 count: 32
1141 { common_pid: upstart-dbus-br [ 2740] } hitcount: 21 count: 21
1142 { common_pid: nm-dispatcher.a [ 8696] } hitcount: 1 count: 16
1143 { common_pid: indicator-datet [ 2904] } hitcount: 1 count: 16
1144 { common_pid: gdbus [ 2998] } hitcount: 1 count: 16
1145 { common_pid: rtkit-daemon [ 2052] } hitcount: 1 count: 8
1146 { common_pid: init [ 1] } hitcount: 2 count: 2
1147
1148 Totals:
1149 Hits: 2116
1150 Entries: 51
1151 Dropped: 0
1152
1153 Similarly, if you key a hist trigger on syscall id, for example to
1154 gather and display a list of systemwide syscall hits, you can use
1155 the special .syscall modifier to display the syscall names rather
1156 than raw ids. The example below keeps a running total of syscall
1157 counts for the system during the run:
1158
1159 # echo 'hist:key=id.syscall:val=hitcount' > \
1160 /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
1161
1162 # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
1163 # trigger info: hist:keys=id.syscall:vals=hitcount:sort=hitcount:size=2048 [active]
1164
1165 { id: sys_fsync [ 74] } hitcount: 1
1166 { id: sys_newuname [ 63] } hitcount: 1
1167 { id: sys_prctl [157] } hitcount: 1
1168 { id: sys_statfs [137] } hitcount: 1
1169 { id: sys_symlink [ 88] } hitcount: 1
1170 { id: sys_sendmmsg [307] } hitcount: 1
1171 { id: sys_semctl [ 66] } hitcount: 1
1172 { id: sys_readlink [ 89] } hitcount: 3
1173 { id: sys_bind [ 49] } hitcount: 3
1174 { id: sys_getsockname [ 51] } hitcount: 3
1175 { id: sys_unlink [ 87] } hitcount: 3
1176 { id: sys_rename [ 82] } hitcount: 4
1177 { id: unknown_syscall [ 58] } hitcount: 4
1178 { id: sys_connect [ 42] } hitcount: 4
1179 { id: sys_getpid [ 39] } hitcount: 4
1180 .
1181 .
1182 .
1183 { id: sys_rt_sigprocmask [ 14] } hitcount: 952
1184 { id: sys_futex [202] } hitcount: 1534
1185 { id: sys_write [ 1] } hitcount: 2689
1186 { id: sys_setitimer [ 38] } hitcount: 2797
1187 { id: sys_read [ 0] } hitcount: 3202
1188 { id: sys_select [ 23] } hitcount: 3773
1189 { id: sys_writev [ 20] } hitcount: 4531
1190 { id: sys_poll [ 7] } hitcount: 8314
1191 { id: sys_recvmsg [ 47] } hitcount: 13738
1192 { id: sys_ioctl [ 16] } hitcount: 21843
1193
1194 Totals:
1195 Hits: 67612
1196 Entries: 72
1197 Dropped: 0
1198
1199 The syscall counts above provide a rough overall picture of system
1200 call activity on the system; we can see for example that the most
1201 popular system call on this system was the 'sys_ioctl' system call.
1202
1203 We can use 'compound' keys to refine that number and provide some
1204 further insight as to which processes exactly contribute to the
1205 overall ioctl count.
1206
1207 The command below keeps a hitcount for every unique combination of
1208 system call id and pid - the end result is essentially a table
1209 that keeps a per-pid sum of system call hits. The results are
1210 sorted using the system call id as the primary key, and the
1211 hitcount sum as the secondary key:
1212
1213 # echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount' > \
1214 /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
1215
1216 # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
1217 # trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 [active]
1218
1219 { id: sys_read [ 0], common_pid: rtkit-daemon [ 1877] } hitcount: 1
1220 { id: sys_read [ 0], common_pid: gdbus [ 2976] } hitcount: 1
1221 { id: sys_read [ 0], common_pid: console-kit-dae [ 3400] } hitcount: 1
1222 { id: sys_read [ 0], common_pid: postgres [ 1865] } hitcount: 1
1223 { id: sys_read [ 0], common_pid: deja-dup-monito [ 3543] } hitcount: 2
1224 { id: sys_read [ 0], common_pid: NetworkManager [ 890] } hitcount: 2
1225 { id: sys_read [ 0], common_pid: evolution-calen [ 3048] } hitcount: 2
1226 { id: sys_read [ 0], common_pid: postgres [ 1864] } hitcount: 2
1227 { id: sys_read [ 0], common_pid: nm-applet [ 3022] } hitcount: 2
1228 { id: sys_read [ 0], common_pid: whoopsie [ 1212] } hitcount: 2
1229 .
1230 .
1231 .
1232 { id: sys_ioctl [ 16], common_pid: bash [ 8479] } hitcount: 1
1233 { id: sys_ioctl [ 16], common_pid: bash [ 3472] } hitcount: 12
1234 { id: sys_ioctl [ 16], common_pid: gnome-terminal [ 3199] } hitcount: 16
1235 { id: sys_ioctl [ 16], common_pid: Xorg [ 1267] } hitcount: 1808
1236 { id: sys_ioctl [ 16], common_pid: compiz [ 2994] } hitcount: 5580
1237 .
1238 .
1239 .
1240 { id: sys_waitid [247], common_pid: upstart-dbus-br [ 2690] } hitcount: 3
1241 { id: sys_waitid [247], common_pid: upstart-dbus-br [ 2688] } hitcount: 16
1242 { id: sys_inotify_add_watch [254], common_pid: gmain [ 975] } hitcount: 2
1243 { id: sys_inotify_add_watch [254], common_pid: gmain [ 3204] } hitcount: 4
1244 { id: sys_inotify_add_watch [254], common_pid: gmain [ 2888] } hitcount: 4
1245 { id: sys_inotify_add_watch [254], common_pid: gmain [ 3003] } hitcount: 4
1246 { id: sys_inotify_add_watch [254], common_pid: gmain [ 2873] } hitcount: 4
1247 { id: sys_inotify_add_watch [254], common_pid: gmain [ 3196] } hitcount: 6
1248 { id: sys_openat [257], common_pid: java [ 2623] } hitcount: 2
1249 { id: sys_eventfd2 [290], common_pid: ibus-ui-gtk3 [ 2760] } hitcount: 4
1250 { id: sys_eventfd2 [290], common_pid: compiz [ 2994] } hitcount: 6
1251
1252 Totals:
1253 Hits: 31536
1254 Entries: 323
1255 Dropped: 0
1256
1257 The above list does give us a breakdown of the ioctl syscall by
1258 pid, but it also gives us quite a bit more than that, which we
1259 don't really care about at the moment. Since we know the syscall
1260 id for sys_ioctl (16, displayed next to the sys_ioctl name), we
1261 can use that to filter out all the other syscalls:
1262
1263 # echo 'hist:key=id.syscall,common_pid.execname:val=hitcount:sort=id,hitcount if id == 16' > \
1264 /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/trigger
1265
1266 # cat /sys/kernel/debug/tracing/events/raw_syscalls/sys_enter/hist
1267 # trigger info: hist:keys=id.syscall,common_pid.execname:vals=hitcount:sort=id.syscall,hitcount:size=2048 if id == 16 [active]
1268
1269 { id: sys_ioctl [ 16], common_pid: gmain [ 2769] } hitcount: 1
1270 { id: sys_ioctl [ 16], common_pid: evolution-addre [ 8571] } hitcount: 1
1271 { id: sys_ioctl [ 16], common_pid: gmain [ 3003] } hitcount: 1
1272 { id: sys_ioctl [ 16], common_pid: gmain [ 2781] } hitcount: 1
1273 { id: sys_ioctl [ 16], common_pid: gmain [ 2829] } hitcount: 1
1274 { id: sys_ioctl [ 16], common_pid: bash [ 8726] } hitcount: 1
1275 { id: sys_ioctl [ 16], common_pid: bash [ 8508] } hitcount: 1
1276 { id: sys_ioctl [ 16], common_pid: gmain [ 2970] } hitcount: 1
1277 { id: sys_ioctl [ 16], common_pid: gmain [ 2768] } hitcount: 1
1278 .
1279 .
1280 .
1281 { id: sys_ioctl [ 16], common_pid: pool [ 8559] } hitcount: 45
1282 { id: sys_ioctl [ 16], common_pid: pool [ 8555] } hitcount: 48
1283 { id: sys_ioctl [ 16], common_pid: pool [ 8551] } hitcount: 48
1284 { id: sys_ioctl [ 16], common_pid: avahi-daemon [ 896] } hitcount: 66
1285 { id: sys_ioctl [ 16], common_pid: Xorg [ 1267] } hitcount: 26674
1286 { id: sys_ioctl [ 16], common_pid: compiz [ 2994] } hitcount: 73443
1287
1288 Totals:
1289 Hits: 101162
1290 Entries: 103
1291 Dropped: 0
1292
1293 The above output shows that 'compiz' and 'Xorg' are far and away
1294 the heaviest ioctl callers (which might lead to questions about
1295 whether they really need to be making all those calls and to
1296 possible avenues for further investigation.)
1297
1298 The compound key examples used a key and a sum value (hitcount) to
1299 sort the output, but we can just as easily use two keys instead.
1300 Here's an example where we use a compound key composed of the the
1301 common_pid and size event fields. Sorting with pid as the primary
1302 key and 'size' as the secondary key allows us to display an
1303 ordered summary of the recvfrom sizes, with counts, received by
1304 each process:
1305
1306 # echo 'hist:key=common_pid.execname,size:val=hitcount:sort=common_pid,size' > \
1307 /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/trigger
1308
1309 # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_recvfrom/hist
1310 # trigger info: hist:keys=common_pid.execname,size:vals=hitcount:sort=common_pid.execname,size:size=2048 [active]
1311
1312 { common_pid: smbd [ 784], size: 4 } hitcount: 1
1313 { common_pid: dnsmasq [ 1412], size: 4096 } hitcount: 672
1314 { common_pid: postgres [ 1796], size: 1000 } hitcount: 6
1315 { common_pid: postgres [ 1867], size: 1000 } hitcount: 10
1316 { common_pid: bamfdaemon [ 2787], size: 28 } hitcount: 2
1317 { common_pid: bamfdaemon [ 2787], size: 14360 } hitcount: 1
1318 { common_pid: compiz [ 2994], size: 8 } hitcount: 1
1319 { common_pid: compiz [ 2994], size: 20 } hitcount: 11
1320 { common_pid: gnome-terminal [ 3199], size: 4 } hitcount: 2
1321 { common_pid: firefox [ 8817], size: 4 } hitcount: 1
1322 { common_pid: firefox [ 8817], size: 8 } hitcount: 5
1323 { common_pid: firefox [ 8817], size: 588 } hitcount: 2
1324 { common_pid: firefox [ 8817], size: 628 } hitcount: 1
1325 { common_pid: firefox [ 8817], size: 6944 } hitcount: 1
1326 { common_pid: firefox [ 8817], size: 408880 } hitcount: 2
1327 { common_pid: firefox [ 8822], size: 8 } hitcount: 2
1328 { common_pid: firefox [ 8822], size: 160 } hitcount: 2
1329 { common_pid: firefox [ 8822], size: 320 } hitcount: 2
1330 { common_pid: firefox [ 8822], size: 352 } hitcount: 1
1331 .
1332 .
1333 .
1334 { common_pid: pool [ 8923], size: 1960 } hitcount: 10
1335 { common_pid: pool [ 8923], size: 2048 } hitcount: 10
1336 { common_pid: pool [ 8924], size: 1960 } hitcount: 10
1337 { common_pid: pool [ 8924], size: 2048 } hitcount: 10
1338 { common_pid: pool [ 8928], size: 1964 } hitcount: 4
1339 { common_pid: pool [ 8928], size: 1965 } hitcount: 2
1340 { common_pid: pool [ 8928], size: 2048 } hitcount: 6
1341 { common_pid: pool [ 8929], size: 1982 } hitcount: 1
1342 { common_pid: pool [ 8929], size: 2048 } hitcount: 1
1343
1344 Totals:
1345 Hits: 2016
1346 Entries: 224
1347 Dropped: 0
1348
1349 The above example also illustrates the fact that although a compound
1350 key is treated as a single entity for hashing purposes, the sub-keys
1351 it's composed of can be accessed independently.
1352
1353 The next example uses a string field as the hash key and
1354 demonstrates how you can manually pause and continue a hist trigger.
1355 In this example, we'll aggregate fork counts and don't expect a
1356 large number of entries in the hash table, so we'll drop it to a
1357 much smaller number, say 256:
1358
1359 # echo 'hist:key=child_comm:val=hitcount:size=256' > \
1360 /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger
1361
1362 # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist
1363 # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active]
1364
1365 { child_comm: dconf worker } hitcount: 1
1366 { child_comm: ibus-daemon } hitcount: 1
1367 { child_comm: whoopsie } hitcount: 1
1368 { child_comm: smbd } hitcount: 1
1369 { child_comm: gdbus } hitcount: 1
1370 { child_comm: kthreadd } hitcount: 1
1371 { child_comm: dconf worker } hitcount: 1
1372 { child_comm: evolution-alarm } hitcount: 2
1373 { child_comm: Socket Thread } hitcount: 2
1374 { child_comm: postgres } hitcount: 2
1375 { child_comm: bash } hitcount: 3
1376 { child_comm: compiz } hitcount: 3
1377 { child_comm: evolution-sourc } hitcount: 4
1378 { child_comm: dhclient } hitcount: 4
1379 { child_comm: pool } hitcount: 5
1380 { child_comm: nm-dispatcher.a } hitcount: 8
1381 { child_comm: firefox } hitcount: 8
1382 { child_comm: dbus-daemon } hitcount: 8
1383 { child_comm: glib-pacrunner } hitcount: 10
1384 { child_comm: evolution } hitcount: 23
1385
1386 Totals:
1387 Hits: 89
1388 Entries: 20
1389 Dropped: 0
1390
1391 If we want to pause the hist trigger, we can simply append :pause to
1392 the command that started the trigger. Notice that the trigger info
1393 displays as [paused]:
1394
1395 # echo 'hist:key=child_comm:val=hitcount:size=256:pause' >> \
1396 /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger
1397
1398 # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist
1399 # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [paused]
1400
1401 { child_comm: dconf worker } hitcount: 1
1402 { child_comm: kthreadd } hitcount: 1
1403 { child_comm: dconf worker } hitcount: 1
1404 { child_comm: gdbus } hitcount: 1
1405 { child_comm: ibus-daemon } hitcount: 1
1406 { child_comm: Socket Thread } hitcount: 2
1407 { child_comm: evolution-alarm } hitcount: 2
1408 { child_comm: smbd } hitcount: 2
1409 { child_comm: bash } hitcount: 3
1410 { child_comm: whoopsie } hitcount: 3
1411 { child_comm: compiz } hitcount: 3
1412 { child_comm: evolution-sourc } hitcount: 4
1413 { child_comm: pool } hitcount: 5
1414 { child_comm: postgres } hitcount: 6
1415 { child_comm: firefox } hitcount: 8
1416 { child_comm: dhclient } hitcount: 10
1417 { child_comm: emacs } hitcount: 12
1418 { child_comm: dbus-daemon } hitcount: 20
1419 { child_comm: nm-dispatcher.a } hitcount: 20
1420 { child_comm: evolution } hitcount: 35
1421 { child_comm: glib-pacrunner } hitcount: 59
1422
1423 Totals:
1424 Hits: 199
1425 Entries: 21
1426 Dropped: 0
1427
1428 To manually continue having the trigger aggregate events, append
1429 :cont instead. Notice that the trigger info displays as [active]
1430 again, and the data has changed:
1431
1432 # echo 'hist:key=child_comm:val=hitcount:size=256:cont' >> \
1433 /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger
1434
1435 # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist
1436 # trigger info: hist:keys=child_comm:vals=hitcount:sort=hitcount:size=256 [active]
1437
1438 { child_comm: dconf worker } hitcount: 1
1439 { child_comm: dconf worker } hitcount: 1
1440 { child_comm: kthreadd } hitcount: 1
1441 { child_comm: gdbus } hitcount: 1
1442 { child_comm: ibus-daemon } hitcount: 1
1443 { child_comm: Socket Thread } hitcount: 2
1444 { child_comm: evolution-alarm } hitcount: 2
1445 { child_comm: smbd } hitcount: 2
1446 { child_comm: whoopsie } hitcount: 3
1447 { child_comm: compiz } hitcount: 3
1448 { child_comm: evolution-sourc } hitcount: 4
1449 { child_comm: bash } hitcount: 5
1450 { child_comm: pool } hitcount: 5
1451 { child_comm: postgres } hitcount: 6
1452 { child_comm: firefox } hitcount: 8
1453 { child_comm: dhclient } hitcount: 11
1454 { child_comm: emacs } hitcount: 12
1455 { child_comm: dbus-daemon } hitcount: 22
1456 { child_comm: nm-dispatcher.a } hitcount: 22
1457 { child_comm: evolution } hitcount: 35
1458 { child_comm: glib-pacrunner } hitcount: 59
1459
1460 Totals:
1461 Hits: 206
1462 Entries: 21
1463 Dropped: 0
1464
1465 The previous example showed how to start and stop a hist trigger by
1466 appending 'pause' and 'continue' to the hist trigger command. A
1467 hist trigger can also be started in a paused state by initially
1468 starting the trigger with ':pause' appended. This allows you to
1469 start the trigger only when you're ready to start collecting data
1470 and not before. For example, you could start the trigger in a
1471 paused state, then unpause it and do something you want to measure,
1472 then pause the trigger again when done.
1473
1474 Of course, doing this manually can be difficult and error-prone, but
1475 it is possible to automatically start and stop a hist trigger based
1476 on some condition, via the enable_hist and disable_hist triggers.
1477
1478 For example, suppose we wanted to take a look at the relative
1479 weights in terms of skb length for each callpath that leads to a
1480 netif_receieve_skb event when downloading a decent-sized file using
1481 wget.
1482
1483 First we set up an initially paused stacktrace trigger on the
1484 netif_receive_skb event:
1485
1486 # echo 'hist:key=stacktrace:vals=len:pause' > \
1487 /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
1488
1489 Next, we set up an 'enable_hist' trigger on the sched_process_exec
1490 event, with an 'if filename==/usr/bin/wget' filter. The effect of
1491 this new trigger is that it will 'unpause' the hist trigger we just
1492 set up on netif_receive_skb if and only if it sees a
1493 sched_process_exec event with a filename of '/usr/bin/wget'. When
1494 that happens, all netif_receive_skb events are aggregated into a
1495 hash table keyed on stacktrace:
1496
1497 # echo 'enable_hist:net:netif_receive_skb if filename==/usr/bin/wget' > \
1498 /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger
1499
1500 The aggregation continues until the netif_receive_skb is paused
1501 again, which is what the following disable_hist event does by
1502 creating a similar setup on the sched_process_exit event, using the
1503 filter 'comm==wget':
1504
1505 # echo 'disable_hist:net:netif_receive_skb if comm==wget' > \
1506 /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger
1507
1508 Whenever a process exits and the comm field of the disable_hist
1509 trigger filter matches 'comm==wget', the netif_receive_skb hist
1510 trigger is disabled.
1511
1512 The overall effect is that netif_receive_skb events are aggregated
1513 into the hash table for only the duration of the wget. Executing a
1514 wget command and then listing the 'hist' file will display the
1515 output generated by the wget command:
1516
1517 $ wget https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.19.xz
1518
1519 # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist
1520 # trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused]
1521
1522 { stacktrace:
1523 __netif_receive_skb_core+0x46d/0x990
1524 __netif_receive_skb+0x18/0x60
1525 netif_receive_skb_internal+0x23/0x90
1526 napi_gro_receive+0xc8/0x100
1527 ieee80211_deliver_skb+0xd6/0x270 [mac80211]
1528 ieee80211_rx_handlers+0xccf/0x22f0 [mac80211]
1529 ieee80211_prepare_and_rx_handle+0x4e7/0xc40 [mac80211]
1530 ieee80211_rx+0x31d/0x900 [mac80211]
1531 iwlagn_rx_reply_rx+0x3db/0x6f0 [iwldvm]
1532 iwl_rx_dispatch+0x8e/0xf0 [iwldvm]
1533 iwl_pcie_irq_handler+0xe3c/0x12f0 [iwlwifi]
1534 irq_thread_fn+0x20/0x50
1535 irq_thread+0x11f/0x150
1536 kthread+0xd2/0xf0
1537 ret_from_fork+0x42/0x70
1538 } hitcount: 85 len: 28884
1539 { stacktrace:
1540 __netif_receive_skb_core+0x46d/0x990
1541 __netif_receive_skb+0x18/0x60
1542 netif_receive_skb_internal+0x23/0x90
1543 napi_gro_complete+0xa4/0xe0
1544 dev_gro_receive+0x23a/0x360
1545 napi_gro_receive+0x30/0x100
1546 ieee80211_deliver_skb+0xd6/0x270 [mac80211]
1547 ieee80211_rx_handlers+0xccf/0x22f0 [mac80211]
1548 ieee80211_prepare_and_rx_handle+0x4e7/0xc40 [mac80211]
1549 ieee80211_rx+0x31d/0x900 [mac80211]
1550 iwlagn_rx_reply_rx+0x3db/0x6f0 [iwldvm]
1551 iwl_rx_dispatch+0x8e/0xf0 [iwldvm]
1552 iwl_pcie_irq_handler+0xe3c/0x12f0 [iwlwifi]
1553 irq_thread_fn+0x20/0x50
1554 irq_thread+0x11f/0x150
1555 kthread+0xd2/0xf0
1556 } hitcount: 98 len: 664329
1557 { stacktrace:
1558 __netif_receive_skb_core+0x46d/0x990
1559 __netif_receive_skb+0x18/0x60
1560 process_backlog+0xa8/0x150
1561 net_rx_action+0x15d/0x340
1562 __do_softirq+0x114/0x2c0
1563 do_softirq_own_stack+0x1c/0x30
1564 do_softirq+0x65/0x70
1565 __local_bh_enable_ip+0xb5/0xc0
1566 ip_finish_output+0x1f4/0x840
1567 ip_output+0x6b/0xc0
1568 ip_local_out_sk+0x31/0x40
1569 ip_send_skb+0x1a/0x50
1570 udp_send_skb+0x173/0x2a0
1571 udp_sendmsg+0x2bf/0x9f0
1572 inet_sendmsg+0x64/0xa0
1573 sock_sendmsg+0x3d/0x50
1574 } hitcount: 115 len: 13030
1575 { stacktrace:
1576 __netif_receive_skb_core+0x46d/0x990
1577 __netif_receive_skb+0x18/0x60
1578 netif_receive_skb_internal+0x23/0x90
1579 napi_gro_complete+0xa4/0xe0
1580 napi_gro_flush+0x6d/0x90
1581 iwl_pcie_irq_handler+0x92a/0x12f0 [iwlwifi]
1582 irq_thread_fn+0x20/0x50
1583 irq_thread+0x11f/0x150
1584 kthread+0xd2/0xf0
1585 ret_from_fork+0x42/0x70
1586 } hitcount: 934 len: 5512212
1587
1588 Totals:
1589 Hits: 1232
1590 Entries: 4
1591 Dropped: 0
1592
1593 The above shows all the netif_receive_skb callpaths and their total
1594 lengths for the duration of the wget command.
1595
1596 The 'clear' hist trigger param can be used to clear the hash table.
1597 Suppose we wanted to try another run of the previous example but
1598 this time also wanted to see the complete list of events that went
1599 into the histogram. In order to avoid having to set everything up
1600 again, we can just clear the histogram first:
1601
1602 # echo 'hist:key=stacktrace:vals=len:clear' >> \
1603 /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
1604
1605 Just to verify that it is in fact cleared, here's what we now see in
1606 the hist file:
1607
1608 # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist
1609 # trigger info: hist:keys=stacktrace:vals=len:sort=hitcount:size=2048 [paused]
1610
1611 Totals:
1612 Hits: 0
1613 Entries: 0
1614 Dropped: 0
1615
1616 Since we want to see the detailed list of every netif_receive_skb
1617 event occurring during the new run, which are in fact the same
1618 events being aggregated into the hash table, we add some additional
1619 'enable_event' events to the triggering sched_process_exec and
1620 sched_process_exit events as such:
1621
1622 # echo 'enable_event:net:netif_receive_skb if filename==/usr/bin/wget' > \
1623 /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger
1624
1625 # echo 'disable_event:net:netif_receive_skb if comm==wget' > \
1626 /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger
1627
1628 If you read the trigger files for the sched_process_exec and
1629 sched_process_exit triggers, you should see two triggers for each:
1630 one enabling/disabling the hist aggregation and the other
1631 enabling/disabling the logging of events:
1632
1633 # cat /sys/kernel/debug/tracing/events/sched/sched_process_exec/trigger
1634 enable_event:net:netif_receive_skb:unlimited if filename==/usr/bin/wget
1635 enable_hist:net:netif_receive_skb:unlimited if filename==/usr/bin/wget
1636
1637 # cat /sys/kernel/debug/tracing/events/sched/sched_process_exit/trigger
1638 enable_event:net:netif_receive_skb:unlimited if comm==wget
1639 disable_hist:net:netif_receive_skb:unlimited if comm==wget
1640
1641 In other words, whenever either of the sched_process_exec or
1642 sched_process_exit events is hit and matches 'wget', it enables or
1643 disables both the histogram and the event log, and what you end up
1644 with is a hash table and set of events just covering the specified
1645 duration. Run the wget command again:
1646
1647 $ wget https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.19.xz
1648
1649 Displaying the 'hist' file should show something similar to what you
1650 saw in the last run, but this time you should also see the
1651 individual events in the trace file:
1652
1653 # cat /sys/kernel/debug/tracing/trace
1654
1655 # tracer: nop
1656 #
1657 # entries-in-buffer/entries-written: 183/1426 #P:4
1658 #
1659 # _-----=> irqs-off
1660 # / _----=> need-resched
1661 # | / _---=> hardirq/softirq
1662 # || / _--=> preempt-depth
1663 # ||| / delay
1664 # TASK-PID CPU# |||| TIMESTAMP FUNCTION
1665 # | | | |||| | |
1666 wget-15108 [000] ..s1 31769.606929: netif_receive_skb: dev=lo skbaddr=ffff88009c353100 len=60
1667 wget-15108 [000] ..s1 31769.606999: netif_receive_skb: dev=lo skbaddr=ffff88009c353200 len=60
1668 dnsmasq-1382 [000] ..s1 31769.677652: netif_receive_skb: dev=lo skbaddr=ffff88009c352b00 len=130
1669 dnsmasq-1382 [000] ..s1 31769.685917: netif_receive_skb: dev=lo skbaddr=ffff88009c352200 len=138
1670 ##### CPU 2 buffer started ####
1671 irq/29-iwlwifi-559 [002] ..s. 31772.031529: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433d00 len=2948
1672 irq/29-iwlwifi-559 [002] ..s. 31772.031572: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d432200 len=1500
1673 irq/29-iwlwifi-559 [002] ..s. 31772.032196: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433100 len=2948
1674 irq/29-iwlwifi-559 [002] ..s. 31772.032761: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d433000 len=2948
1675 irq/29-iwlwifi-559 [002] ..s. 31772.033220: netif_receive_skb: dev=wlan0 skbaddr=ffff88009d432e00 len=1500
1676 .
1677 .
1678 .
Tom Zanussi52a7f162016-03-03 12:54:57 -06001679
1680 The following example demonstrates how multiple hist triggers can be
1681 attached to a given event. This capability can be useful for
1682 creating a set of different summaries derived from the same set of
1683 events, or for comparing the effects of different filters, among
1684 other things.
1685
1686 # echo 'hist:keys=skbaddr.hex:vals=len if len < 0' >> \
1687 /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
1688 # echo 'hist:keys=skbaddr.hex:vals=len if len > 4096' >> \
1689 /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
1690 # echo 'hist:keys=skbaddr.hex:vals=len if len == 256' >> \
1691 /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
1692 # echo 'hist:keys=skbaddr.hex:vals=len' >> \
1693 /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
1694 # echo 'hist:keys=len:vals=common_preempt_count' >> \
1695 /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
1696
1697 The above set of commands create four triggers differing only in
1698 their filters, along with a completely different though fairly
1699 nonsensical trigger. Note that in order to append multiple hist
1700 triggers to the same file, you should use the '>>' operator to
1701 append them ('>' will also add the new hist trigger, but will remove
1702 any existing hist triggers beforehand).
1703
1704 Displaying the contents of the 'hist' file for the event shows the
1705 contents of all five histograms:
1706
1707 # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist
1708
1709 # event histogram
1710 #
1711 # trigger info: hist:keys=len:vals=hitcount,common_preempt_count:sort=hitcount:size=2048 [active]
1712 #
1713
1714 { len: 176 } hitcount: 1 common_preempt_count: 0
1715 { len: 223 } hitcount: 1 common_preempt_count: 0
1716 { len: 4854 } hitcount: 1 common_preempt_count: 0
1717 { len: 395 } hitcount: 1 common_preempt_count: 0
1718 { len: 177 } hitcount: 1 common_preempt_count: 0
1719 { len: 446 } hitcount: 1 common_preempt_count: 0
1720 { len: 1601 } hitcount: 1 common_preempt_count: 0
1721 .
1722 .
1723 .
1724 { len: 1280 } hitcount: 66 common_preempt_count: 0
1725 { len: 116 } hitcount: 81 common_preempt_count: 40
1726 { len: 708 } hitcount: 112 common_preempt_count: 0
1727 { len: 46 } hitcount: 221 common_preempt_count: 0
1728 { len: 1264 } hitcount: 458 common_preempt_count: 0
1729
1730 Totals:
1731 Hits: 1428
1732 Entries: 147
1733 Dropped: 0
1734
1735
1736 # event histogram
1737 #
1738 # trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 [active]
1739 #
1740
1741 { skbaddr: ffff8800baee5e00 } hitcount: 1 len: 130
1742 { skbaddr: ffff88005f3d5600 } hitcount: 1 len: 1280
1743 { skbaddr: ffff88005f3d4900 } hitcount: 1 len: 1280
1744 { skbaddr: ffff88009fed6300 } hitcount: 1 len: 115
1745 { skbaddr: ffff88009fe0ad00 } hitcount: 1 len: 115
1746 { skbaddr: ffff88008cdb1900 } hitcount: 1 len: 46
1747 { skbaddr: ffff880064b5ef00 } hitcount: 1 len: 118
1748 { skbaddr: ffff880044e3c700 } hitcount: 1 len: 60
1749 { skbaddr: ffff880100065900 } hitcount: 1 len: 46
1750 { skbaddr: ffff8800d46bd500 } hitcount: 1 len: 116
1751 { skbaddr: ffff88005f3d5f00 } hitcount: 1 len: 1280
1752 { skbaddr: ffff880100064700 } hitcount: 1 len: 365
1753 { skbaddr: ffff8800badb6f00 } hitcount: 1 len: 60
1754 .
1755 .
1756 .
1757 { skbaddr: ffff88009fe0be00 } hitcount: 27 len: 24677
1758 { skbaddr: ffff88009fe0a400 } hitcount: 27 len: 23052
1759 { skbaddr: ffff88009fe0b700 } hitcount: 31 len: 25589
1760 { skbaddr: ffff88009fe0b600 } hitcount: 32 len: 27326
1761 { skbaddr: ffff88006a462800 } hitcount: 68 len: 71678
1762 { skbaddr: ffff88006a463700 } hitcount: 70 len: 72678
1763 { skbaddr: ffff88006a462b00 } hitcount: 71 len: 77589
1764 { skbaddr: ffff88006a463600 } hitcount: 73 len: 71307
1765 { skbaddr: ffff88006a462200 } hitcount: 81 len: 81032
1766
1767 Totals:
1768 Hits: 1451
1769 Entries: 318
1770 Dropped: 0
1771
1772
1773 # event histogram
1774 #
1775 # trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 if len == 256 [active]
1776 #
1777
1778
1779 Totals:
1780 Hits: 0
1781 Entries: 0
1782 Dropped: 0
1783
1784
1785 # event histogram
1786 #
1787 # trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 if len > 4096 [active]
1788 #
1789
1790 { skbaddr: ffff88009fd2c300 } hitcount: 1 len: 7212
1791 { skbaddr: ffff8800d2bcce00 } hitcount: 1 len: 7212
1792 { skbaddr: ffff8800d2bcd700 } hitcount: 1 len: 7212
1793 { skbaddr: ffff8800d2bcda00 } hitcount: 1 len: 21492
1794 { skbaddr: ffff8800ae2e2d00 } hitcount: 1 len: 7212
1795 { skbaddr: ffff8800d2bcdb00 } hitcount: 1 len: 7212
1796 { skbaddr: ffff88006a4df500 } hitcount: 1 len: 4854
1797 { skbaddr: ffff88008ce47b00 } hitcount: 1 len: 18636
1798 { skbaddr: ffff8800ae2e2200 } hitcount: 1 len: 12924
1799 { skbaddr: ffff88005f3e1000 } hitcount: 1 len: 4356
1800 { skbaddr: ffff8800d2bcdc00 } hitcount: 2 len: 24420
1801 { skbaddr: ffff8800d2bcc200 } hitcount: 2 len: 12996
1802
1803 Totals:
1804 Hits: 14
1805 Entries: 12
1806 Dropped: 0
1807
1808
1809 # event histogram
1810 #
1811 # trigger info: hist:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 if len < 0 [active]
1812 #
1813
1814
1815 Totals:
1816 Hits: 0
1817 Entries: 0
1818 Dropped: 0
Tom Zanussi5463bfd2016-03-03 12:54:59 -06001819
1820 Named triggers can be used to have triggers share a common set of
1821 histogram data. This capability is mostly useful for combining the
1822 output of events generated by tracepoints contained inside inline
1823 functions, but names can be used in a hist trigger on any event.
1824 For example, these two triggers when hit will update the same 'len'
1825 field in the shared 'foo' histogram data:
1826
1827 # echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \
1828 /sys/kernel/debug/tracing/events/net/netif_receive_skb/trigger
1829 # echo 'hist:name=foo:keys=skbaddr.hex:vals=len' > \
1830 /sys/kernel/debug/tracing/events/net/netif_rx/trigger
1831
1832 You can see that they're updating common histogram data by reading
1833 each event's hist files at the same time:
1834
1835 # cat /sys/kernel/debug/tracing/events/net/netif_receive_skb/hist;
1836 cat /sys/kernel/debug/tracing/events/net/netif_rx/hist
1837
1838 # event histogram
1839 #
1840 # trigger info: hist:name=foo:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 [active]
1841 #
1842
1843 { skbaddr: ffff88000ad53500 } hitcount: 1 len: 46
1844 { skbaddr: ffff8800af5a1500 } hitcount: 1 len: 76
1845 { skbaddr: ffff8800d62a1900 } hitcount: 1 len: 46
1846 { skbaddr: ffff8800d2bccb00 } hitcount: 1 len: 468
1847 { skbaddr: ffff8800d3c69900 } hitcount: 1 len: 46
1848 { skbaddr: ffff88009ff09100 } hitcount: 1 len: 52
1849 { skbaddr: ffff88010f13ab00 } hitcount: 1 len: 168
1850 { skbaddr: ffff88006a54f400 } hitcount: 1 len: 46
1851 { skbaddr: ffff8800d2bcc500 } hitcount: 1 len: 260
1852 { skbaddr: ffff880064505000 } hitcount: 1 len: 46
1853 { skbaddr: ffff8800baf24e00 } hitcount: 1 len: 32
1854 { skbaddr: ffff88009fe0ad00 } hitcount: 1 len: 46
1855 { skbaddr: ffff8800d3edff00 } hitcount: 1 len: 44
1856 { skbaddr: ffff88009fe0b400 } hitcount: 1 len: 168
1857 { skbaddr: ffff8800a1c55a00 } hitcount: 1 len: 40
1858 { skbaddr: ffff8800d2bcd100 } hitcount: 1 len: 40
1859 { skbaddr: ffff880064505f00 } hitcount: 1 len: 174
1860 { skbaddr: ffff8800a8bff200 } hitcount: 1 len: 160
1861 { skbaddr: ffff880044e3cc00 } hitcount: 1 len: 76
1862 { skbaddr: ffff8800a8bfe700 } hitcount: 1 len: 46
1863 { skbaddr: ffff8800d2bcdc00 } hitcount: 1 len: 32
1864 { skbaddr: ffff8800a1f64800 } hitcount: 1 len: 46
1865 { skbaddr: ffff8800d2bcde00 } hitcount: 1 len: 988
1866 { skbaddr: ffff88006a5dea00 } hitcount: 1 len: 46
1867 { skbaddr: ffff88002e37a200 } hitcount: 1 len: 44
1868 { skbaddr: ffff8800a1f32c00 } hitcount: 2 len: 676
1869 { skbaddr: ffff88000ad52600 } hitcount: 2 len: 107
1870 { skbaddr: ffff8800a1f91e00 } hitcount: 2 len: 92
1871 { skbaddr: ffff8800af5a0200 } hitcount: 2 len: 142
1872 { skbaddr: ffff8800d2bcc600 } hitcount: 2 len: 220
1873 { skbaddr: ffff8800ba36f500 } hitcount: 2 len: 92
1874 { skbaddr: ffff8800d021f800 } hitcount: 2 len: 92
1875 { skbaddr: ffff8800a1f33600 } hitcount: 2 len: 675
1876 { skbaddr: ffff8800a8bfff00 } hitcount: 3 len: 138
1877 { skbaddr: ffff8800d62a1300 } hitcount: 3 len: 138
1878 { skbaddr: ffff88002e37a100 } hitcount: 4 len: 184
1879 { skbaddr: ffff880064504400 } hitcount: 4 len: 184
1880 { skbaddr: ffff8800a8bfec00 } hitcount: 4 len: 184
1881 { skbaddr: ffff88000ad53700 } hitcount: 5 len: 230
1882 { skbaddr: ffff8800d2bcdb00 } hitcount: 5 len: 196
1883 { skbaddr: ffff8800a1f90000 } hitcount: 6 len: 276
1884 { skbaddr: ffff88006a54f900 } hitcount: 6 len: 276
1885
1886 Totals:
1887 Hits: 81
1888 Entries: 42
1889 Dropped: 0
1890 # event histogram
1891 #
1892 # trigger info: hist:name=foo:keys=skbaddr.hex:vals=hitcount,len:sort=hitcount:size=2048 [active]
1893 #
1894
1895 { skbaddr: ffff88000ad53500 } hitcount: 1 len: 46
1896 { skbaddr: ffff8800af5a1500 } hitcount: 1 len: 76
1897 { skbaddr: ffff8800d62a1900 } hitcount: 1 len: 46
1898 { skbaddr: ffff8800d2bccb00 } hitcount: 1 len: 468
1899 { skbaddr: ffff8800d3c69900 } hitcount: 1 len: 46
1900 { skbaddr: ffff88009ff09100 } hitcount: 1 len: 52
1901 { skbaddr: ffff88010f13ab00 } hitcount: 1 len: 168
1902 { skbaddr: ffff88006a54f400 } hitcount: 1 len: 46
1903 { skbaddr: ffff8800d2bcc500 } hitcount: 1 len: 260
1904 { skbaddr: ffff880064505000 } hitcount: 1 len: 46
1905 { skbaddr: ffff8800baf24e00 } hitcount: 1 len: 32
1906 { skbaddr: ffff88009fe0ad00 } hitcount: 1 len: 46
1907 { skbaddr: ffff8800d3edff00 } hitcount: 1 len: 44
1908 { skbaddr: ffff88009fe0b400 } hitcount: 1 len: 168
1909 { skbaddr: ffff8800a1c55a00 } hitcount: 1 len: 40
1910 { skbaddr: ffff8800d2bcd100 } hitcount: 1 len: 40
1911 { skbaddr: ffff880064505f00 } hitcount: 1 len: 174
1912 { skbaddr: ffff8800a8bff200 } hitcount: 1 len: 160
1913 { skbaddr: ffff880044e3cc00 } hitcount: 1 len: 76
1914 { skbaddr: ffff8800a8bfe700 } hitcount: 1 len: 46
1915 { skbaddr: ffff8800d2bcdc00 } hitcount: 1 len: 32
1916 { skbaddr: ffff8800a1f64800 } hitcount: 1 len: 46
1917 { skbaddr: ffff8800d2bcde00 } hitcount: 1 len: 988
1918 { skbaddr: ffff88006a5dea00 } hitcount: 1 len: 46
1919 { skbaddr: ffff88002e37a200 } hitcount: 1 len: 44
1920 { skbaddr: ffff8800a1f32c00 } hitcount: 2 len: 676
1921 { skbaddr: ffff88000ad52600 } hitcount: 2 len: 107
1922 { skbaddr: ffff8800a1f91e00 } hitcount: 2 len: 92
1923 { skbaddr: ffff8800af5a0200 } hitcount: 2 len: 142
1924 { skbaddr: ffff8800d2bcc600 } hitcount: 2 len: 220
1925 { skbaddr: ffff8800ba36f500 } hitcount: 2 len: 92
1926 { skbaddr: ffff8800d021f800 } hitcount: 2 len: 92
1927 { skbaddr: ffff8800a1f33600 } hitcount: 2 len: 675
1928 { skbaddr: ffff8800a8bfff00 } hitcount: 3 len: 138
1929 { skbaddr: ffff8800d62a1300 } hitcount: 3 len: 138
1930 { skbaddr: ffff88002e37a100 } hitcount: 4 len: 184
1931 { skbaddr: ffff880064504400 } hitcount: 4 len: 184
1932 { skbaddr: ffff8800a8bfec00 } hitcount: 4 len: 184
1933 { skbaddr: ffff88000ad53700 } hitcount: 5 len: 230
1934 { skbaddr: ffff8800d2bcdb00 } hitcount: 5 len: 196
1935 { skbaddr: ffff8800a1f90000 } hitcount: 6 len: 276
1936 { skbaddr: ffff88006a54f900 } hitcount: 6 len: 276
1937
1938 Totals:
1939 Hits: 81
1940 Entries: 42
1941 Dropped: 0
1942
1943 And here's an example that shows how to combine histogram data from
1944 any two events even if they don't share any 'compatible' fields
1945 other than 'hitcount' and 'stacktrace'. These commands create a
1946 couple of triggers named 'bar' using those fields:
1947
1948 # echo 'hist:name=bar:key=stacktrace:val=hitcount' > \
1949 /sys/kernel/debug/tracing/events/sched/sched_process_fork/trigger
1950 # echo 'hist:name=bar:key=stacktrace:val=hitcount' > \
1951 /sys/kernel/debug/tracing/events/net/netif_rx/trigger
1952
1953 And displaying the output of either shows some interesting if
1954 somewhat confusing output:
1955
1956 # cat /sys/kernel/debug/tracing/events/sched/sched_process_fork/hist
1957 # cat /sys/kernel/debug/tracing/events/net/netif_rx/hist
1958
1959 # event histogram
1960 #
1961 # trigger info: hist:name=bar:keys=stacktrace:vals=hitcount:sort=hitcount:size=2048 [active]
1962 #
1963
1964 { stacktrace:
1965 _do_fork+0x18e/0x330
1966 kernel_thread+0x29/0x30
1967 kthreadd+0x154/0x1b0
1968 ret_from_fork+0x3f/0x70
1969 } hitcount: 1
1970 { stacktrace:
1971 netif_rx_internal+0xb2/0xd0
1972 netif_rx_ni+0x20/0x70
1973 dev_loopback_xmit+0xaa/0xd0
1974 ip_mc_output+0x126/0x240
1975 ip_local_out_sk+0x31/0x40
1976 igmp_send_report+0x1e9/0x230
1977 igmp_timer_expire+0xe9/0x120
1978 call_timer_fn+0x39/0xf0
1979 run_timer_softirq+0x1e1/0x290
1980 __do_softirq+0xfd/0x290
1981 irq_exit+0x98/0xb0
1982 smp_apic_timer_interrupt+0x4a/0x60
1983 apic_timer_interrupt+0x6d/0x80
1984 cpuidle_enter+0x17/0x20
1985 call_cpuidle+0x3b/0x60
1986 cpu_startup_entry+0x22d/0x310
1987 } hitcount: 1
1988 { stacktrace:
1989 netif_rx_internal+0xb2/0xd0
1990 netif_rx_ni+0x20/0x70
1991 dev_loopback_xmit+0xaa/0xd0
1992 ip_mc_output+0x17f/0x240
1993 ip_local_out_sk+0x31/0x40
1994 ip_send_skb+0x1a/0x50
1995 udp_send_skb+0x13e/0x270
1996 udp_sendmsg+0x2bf/0x980
1997 inet_sendmsg+0x67/0xa0
1998 sock_sendmsg+0x38/0x50
1999 SYSC_sendto+0xef/0x170
2000 SyS_sendto+0xe/0x10
2001 entry_SYSCALL_64_fastpath+0x12/0x6a
2002 } hitcount: 2
2003 { stacktrace:
2004 netif_rx_internal+0xb2/0xd0
2005 netif_rx+0x1c/0x60
2006 loopback_xmit+0x6c/0xb0
2007 dev_hard_start_xmit+0x219/0x3a0
2008 __dev_queue_xmit+0x415/0x4f0
2009 dev_queue_xmit_sk+0x13/0x20
2010 ip_finish_output2+0x237/0x340
2011 ip_finish_output+0x113/0x1d0
2012 ip_output+0x66/0xc0
2013 ip_local_out_sk+0x31/0x40
2014 ip_send_skb+0x1a/0x50
2015 udp_send_skb+0x16d/0x270
2016 udp_sendmsg+0x2bf/0x980
2017 inet_sendmsg+0x67/0xa0
2018 sock_sendmsg+0x38/0x50
2019 ___sys_sendmsg+0x14e/0x270
2020 } hitcount: 76
2021 { stacktrace:
2022 netif_rx_internal+0xb2/0xd0
2023 netif_rx+0x1c/0x60
2024 loopback_xmit+0x6c/0xb0
2025 dev_hard_start_xmit+0x219/0x3a0
2026 __dev_queue_xmit+0x415/0x4f0
2027 dev_queue_xmit_sk+0x13/0x20
2028 ip_finish_output2+0x237/0x340
2029 ip_finish_output+0x113/0x1d0
2030 ip_output+0x66/0xc0
2031 ip_local_out_sk+0x31/0x40
2032 ip_send_skb+0x1a/0x50
2033 udp_send_skb+0x16d/0x270
2034 udp_sendmsg+0x2bf/0x980
2035 inet_sendmsg+0x67/0xa0
2036 sock_sendmsg+0x38/0x50
2037 ___sys_sendmsg+0x269/0x270
2038 } hitcount: 77
2039 { stacktrace:
2040 netif_rx_internal+0xb2/0xd0
2041 netif_rx+0x1c/0x60
2042 loopback_xmit+0x6c/0xb0
2043 dev_hard_start_xmit+0x219/0x3a0
2044 __dev_queue_xmit+0x415/0x4f0
2045 dev_queue_xmit_sk+0x13/0x20
2046 ip_finish_output2+0x237/0x340
2047 ip_finish_output+0x113/0x1d0
2048 ip_output+0x66/0xc0
2049 ip_local_out_sk+0x31/0x40
2050 ip_send_skb+0x1a/0x50
2051 udp_send_skb+0x16d/0x270
2052 udp_sendmsg+0x2bf/0x980
2053 inet_sendmsg+0x67/0xa0
2054 sock_sendmsg+0x38/0x50
2055 SYSC_sendto+0xef/0x170
2056 } hitcount: 88
2057 { stacktrace:
2058 _do_fork+0x18e/0x330
2059 SyS_clone+0x19/0x20
2060 entry_SYSCALL_64_fastpath+0x12/0x6a
2061 } hitcount: 244
2062
2063 Totals:
2064 Hits: 489
2065 Entries: 7
2066 Dropped: 0