| 1 | Event Tracing |
| 2 | |
| 3 | Documentation written by Theodore Ts'o |
| 4 | |
| 5 | Introduction |
| 6 | ============ |
| 7 | |
| 8 | Using the event tracing infrastructure, tracepoints (see |
| 9 | Documentation/trace/tracepoints.txt) can be used without creating |
| 10 | custom kernel modules to register probe functions. |
| 11 | |
| 12 | Not all tracepoints can be traced using the event tracing system; |
| 13 | the kernel developer must provide code snippets which define how the |
| 14 | tracing information is saved into the tracing buffer, and how the |
| 15 | tracing information should be printed. |
| 16 | |
| 17 | Using Event Tracing |
| 18 | =================== |
| 19 | |
| 20 | The events which are available for tracing can be found in the file |
| 21 | /sys/kernel/debug/tracing/available_events. |
| 22 | |
| 23 | To enable a particular event, such as 'sched_wakeup', simply echo it |
| 24 | to /sys/kernel/debug/tracing/set_event. For example: |
| 25 | |
| 26 | # echo sched_wakeup > /sys/kernel/debug/tracing/set_event |
| 27 | |
| 28 | [ Note: events can also be enabled/disabled via the 'enable' toggle |
| 29 | found in the /sys/kernel/debug/tracing/events/ hierarchy of directories. ] |
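
The per-event toggle mentioned in the note can be sketched as a small shell helper. This is only an illustration: the function name and the TRACING_DIR override are hypothetical, and it assumes the control file is named 'enable' and lives under events/<subsystem>/<event>/, which should be checked against the running kernel.

```shell
# Hypothetical helper: switch one event on (1) or off (0) through its
# per-event control file instead of going through set_event.
# TRACING_DIR is an assumed override; it defaults to the debugfs path
# used elsewhere in this document.
set_event_toggle() {
    toggle_file="${TRACING_DIR:-/sys/kernel/debug/tracing}/events/$1/$2/enable"
    echo "$3" > "$toggle_file"
}

# Usage (as root, with debugfs mounted):
#   set_event_toggle sched sched_wakeup 1
#   set_event_toggle sched sched_wakeup 0
```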
| 30 | |
| 31 | To disable an event, echo the event name to the set_event file prefixed |
| 32 | with an exclamation point: |
| 33 | |
| 34 | # echo '!sched_wakeup' >> /sys/kernel/debug/tracing/set_event |
| 35 | |
| 36 | To disable all events, echo an empty line to the set_event file: |
| 37 | |
| 38 | # echo > /sys/kernel/debug/tracing/set_event |
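
The three set_event operations above could be wrapped in helpers along the following lines. The function names and the TRACING_DIR override are hypothetical. Note that enable_event appends with '>>' rather than using '>': as far as we understand, opening set_event with a plain '>' replaces the current selection, so appending is the safer choice when other events are already enabled.

```shell
# Hypothetical wrappers around the set_event file.
# TRACING_DIR is an assumed override for the tracing directory.
set_event_file() {
    echo "${TRACING_DIR:-/sys/kernel/debug/tracing}/set_event"
}

# Add one event to the current selection ('>>' appends).
enable_event() {
    printf '%s\n' "$1" >> "$(set_event_file)"
}

# Remove one event from the current selection by prefixing it with '!'.
disable_event() {
    printf '!%s\n' "$1" >> "$(set_event_file)"
}

# Disable every event by writing a single empty line.
disable_all_events() {
    echo > "$(set_event_file)"
}

# Usage (as root):
#   enable_event sched_wakeup
#   disable_event sched_wakeup
#   disable_all_events
```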
| 39 | |
| 40 | The events are organized into subsystems, such as ext4, irq, sched, |
| 41 | etc., and a full event name looks like this: <subsystem>:<event>. The |
| 42 | subsystem name is optional, but it is displayed in the available_events |
| 43 | file. All of the events in a subsystem can be specified via the syntax |
| 44 | "<subsystem>:*"; for example, to enable all irq events, you can use the |
| 45 | command: |
| 46 | |
| 47 | # echo 'irq:*' > /sys/kernel/debug/tracing/set_event |
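
Since available_events lists each event with its subsystem prefix, a subsystem's events can be inspected before enabling them all. The helpers below are a minimal sketch; their names and the TRACING_DIR override are hypothetical, and they assume the <subsystem>:<event> line format described above.

```shell
# Hypothetical helper: print the available events of one subsystem,
# assuming available_events holds one <subsystem>:<event> per line.
list_subsystem_events() {
    grep "^$1:" "${TRACING_DIR:-/sys/kernel/debug/tracing}/available_events"
}

# Hypothetical helper: enable every event of one subsystem using the
# "<subsystem>:*" syntax.
enable_subsystem() {
    printf '%s:*\n' "$1" > "${TRACING_DIR:-/sys/kernel/debug/tracing}/set_event"
}

# Usage (as root):
#   list_subsystem_events irq
#   enable_subsystem irq
```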
| 48 | |
| 49 | Defining an event-enabled tracepoint |
| 50 | ------------------------------------ |
| 51 | |
| 52 | A kernel developer who wishes to define an event-enabled tracepoint |
| 53 | must declare the tracepoint using TRACE_EVENT instead of DECLARE_TRACE. |
| 54 | This is done via two header files in include/trace. For example, to |
| 55 | event-enable the jbd2 subsystem, we must create two files, |
| 56 | include/trace/jbd2.h and include/trace/jbd2_event_types.h. The |
| 57 | include/trace/jbd2.h file should be included by kernel source files that |
| 58 | will have a tracepoint inserted, and might look like this: |
| 59 | |
| 60 | #ifndef _TRACE_JBD2_H |
| 61 | #define _TRACE_JBD2_H |
| 62 | |
| 63 | #include <linux/jbd2.h> |
| 64 | #include <linux/tracepoint.h> |
| 65 | |
| 66 | #include <trace/jbd2_event_types.h> |
| 67 | |
| 68 | #endif |
| 69 | |
| 70 | In a file that utilizes a jbd2 tracepoint, this header file would be |
| 71 | included. Note that you still have to use DEFINE_TRACE(). So for |
| 72 | example, if fs/jbd2/commit.c planned to use the jbd2_start_commit |
| 73 | tracepoint, it would have the following near the beginning of the file: |
| 74 | |
| 75 | #include <trace/jbd2.h> |
| 76 | |
| 77 | DEFINE_TRACE(jbd2_start_commit); |
| 78 | |
| 79 | Then, at the point in the code where the event occurs, call the |
| 80 | tracepoint function. (For more information, please see the tracepoint |
| 81 | documentation in Documentation/trace/tracepoints.txt): |
| 82 | |
| 83 | trace_jbd2_start_commit(journal, commit_transaction); |
| 84 | |
| 85 | The code snippets which allow jbd2_start_commit to be an event-enabled |
| 86 | tracepoint are placed in the file include/trace/jbd2_event_types.h: |
| 87 | |
| 88 | /* use <trace/jbd2.h> instead */ |
| 89 | #ifndef TRACE_EVENT |
| 90 | # error Do not include this file directly. |
| 91 | # error Unless you know what you are doing. |
| 92 | #endif |
| 93 | |
| 94 | #undef TRACE_SYSTEM |
| 95 | #define TRACE_SYSTEM jbd2 |
| 96 | |
| 97 | #include <linux/jbd2.h> |
| 98 | |
| 99 | TRACE_EVENT(jbd2_start_commit, |
| 100 | TP_PROTO(journal_t *journal, transaction_t *commit_transaction), |
| 101 | TP_ARGS(journal, commit_transaction), |
| 102 | TP_STRUCT__entry( |
| 103 | __array( char, devname, BDEVNAME_SIZE+24 ) |
| 104 | __field( int, transaction ) |
| 105 | ), |
| 106 | TP_fast_assign( |
| 107 | memcpy(__entry->devname, journal->j_devname, BDEVNAME_SIZE+24); |
| 108 | __entry->transaction = commit_transaction->t_tid; |
| 109 | ), |
| 110 | TP_printk("dev %s transaction %d", |
| 111 | __entry->devname, __entry->transaction) |
| 112 | ); |
| 113 | |
| 114 | The TP_PROTO and TP_ARGS are unchanged from DECLARE_TRACE. The new |
| 115 | arguments to TRACE_EVENT are TP_STRUCT__entry, TP_fast_assign, and |
| 116 | TP_printk. |
| 117 | |
| 118 | TP_STRUCT__entry defines the data structure which will be stored in the |
| 119 | trace buffer. Normally, fields in __entry will be arrays or simple |
| 120 | types. It is possible to place data structures in __entry. However, |
| 121 | pointers in such a data structure cannot be trusted, since they will be |
| 122 | accessed some time later by TP_printk. In addition, any fields that |
| 123 | will not or cannot be used by TP_printk waste space in the trace |
| 124 | buffer. In general, data structures should be avoided unless they |
| 125 | contain only non-pointer types and all of their fields will be used by |
| 126 | TP_printk. |
| 127 | |
| 128 | TP_fast_assign defines the code snippet which saves information into the |
| 129 | __entry data structure, using the passed-in arguments defined in |
| 130 | TP_PROTO and TP_ARGS. |
| 131 | |
| 132 | Finally, TP_printk will print the __entry data structure. At the time |
| 133 | when the code snippet defined by TP_printk is executed, it will not have |
| 134 | access to the TP_ARGS arguments; it can only use the information saved |
| 135 | in the __entry data structure. |