| Using the Linux Kernel Markers |
| |
| Mathieu Desnoyers |
| |
| |
| This document introduces Linux Kernel Markers and their use. It provides |
| examples of how to insert markers in the kernel and connect probe functions to |
| them and provides some examples of probe functions. |
| |
| |
| * Purpose of markers |
| |
| A marker placed in code provides a hook to call a function (probe) that you can |
| provide at runtime. A marker can be "on" (a probe is connected to it) or "off" |
| (no probe is attached). When a marker is "off" it has no effect, except for |
| adding a tiny time penalty (checking a condition for a branch) and space |
| penalty (adding a few bytes for the function call at the end of the |
| instrumented function and adds a data structure in a separate section). When a |
| marker is "on", the function you provide is called each time the marker is |
| executed, in the execution context of the caller. When the function provided |
| ends its execution, it returns to the caller (continuing from the marker site). |
| |
| You can put markers at important locations in the code. Markers are |
| lightweight hooks that can pass an arbitrary number of parameters, |
| described in a printk-like format string, to the attached probe function. |
| |
| They can be used for tracing and performance accounting. |
| |
| |
| * Usage |
| |
| In order to use the macro trace_mark, you should include linux/marker.h. |
| |
| #include <linux/marker.h> |
| |
| And, |
| |
| trace_mark(subsystem_event, "myint %d mystring %s", someint, somestring); |
| Where : |
| - subsystem_event is an identifier unique to your event |
| - subsystem is the name of your subsystem. |
| - event is the name of the event to mark. |
| - "myint %d mystring %s" is the formatted string for the serializer. "myint" and |
| "mystring" are repectively the field names associated with the first and |
| second parameter. |
| - someint is an integer. |
| - somestring is a char pointer. |
| |
| Connecting a function (probe) to a marker is done by providing a probe (function |
| to call) for the specific marker through marker_probe_register() and can be |
| activated by calling marker_arm(). Marker deactivation can be done by calling |
| marker_disarm() as many times as marker_arm() has been called. Removing a probe |
| is done through marker_probe_unregister(); it will disarm the probe. |
| |
| marker_synchronize_unregister() must be called between probe unregistration and |
| the first occurrence of |
| - the end of module exit function, |
| to make sure there is no caller left using the probe; |
| - the free of any resource used by the probes, |
| to make sure the probes wont be accessing invalid data. |
| This, and the fact that preemption is disabled around the probe call, make sure |
| that probe removal and module unload are safe. See the "Probe example" section |
| below for a sample probe module. |
| |
| The marker mechanism supports inserting multiple instances of the same marker. |
| Markers can be put in inline functions, inlined static functions, and |
| unrolled loops as well as regular functions. |
| |
| The naming scheme "subsystem_event" is suggested here as a convention intended |
| to limit collisions. Marker names are global to the kernel: they are considered |
| as being the same whether they are in the core kernel image or in modules. |
| Conflicting format strings for markers with the same name will cause the markers |
| to be detected to have a different format string not to be armed and will output |
| a printk warning which identifies the inconsistency: |
| |
| "Format mismatch for probe probe_name (format), marker (format)" |
| |
| Another way to use markers is to simply define the marker without generating any |
| function call to actually call into the marker. This is useful in combination |
| with tracepoint probes in a scheme like this : |
| |
| void probe_tracepoint_name(unsigned int arg1, struct task_struct *tsk); |
| |
| DEFINE_MARKER_TP(marker_eventname, tracepoint_name, probe_tracepoint_name, |
| "arg1 %u pid %d"); |
| |
| notrace void probe_tracepoint_name(unsigned int arg1, struct task_struct *tsk) |
| { |
| struct marker *marker = &GET_MARKER(kernel_irq_entry); |
| /* write data to trace buffers ... */ |
| } |
| |
| * Probe / marker example |
| |
| See the example provided in samples/markers/src |
| |
| Compile them with your kernel. |
| |
| Run, as root : |
| modprobe marker-example (insmod order is not important) |
| modprobe probe-example |
| cat /proc/marker-example (returns an expected error) |
| rmmod marker-example probe-example |
| dmesg |