| Demonstrations of reset-trace, for Linux bcc/BPF. |
| |
| |
| You will probably never need this tool. If you kill -9 a bcc tool (plus other |
| signals, like SIGTERM), or if a bcc tool crashes, then kernel tracing can be |
| left in a semi-enabled state. It's not as bad as it sounds: there may just be |
| overhead for writing to ring buffers that are never read. This tool can be |
| used to clean up the tracing state, and reset and disable active tracing. |
| |
| WARNING: Make sure no other tracing sessions are active, as it will likely |
| stop them from functioning (perhaps ungracefully). |
| |
| This specifically clears the state in at least the following files in |
| /sys/kernel/debug/tracing: kprobe_events, uprobe_events, trace_pipe. |
| Other tracing facilities (ftrace) are checked, and if not in an expected state, |
| a note is printed. All tracing files can be reset with -F for force, but this |
| will interfere with any other running tracing sessions (eg, ftrace). |
| |
| Here's an example: |
| |
| # ./reset-trace.sh |
| # |
| |
| That's it. |
| |
| |
| You can use -v to see what it does: |
| |
| # ./reset-trace.sh -v |
| Reseting tracing state... |
| |
| Checking /sys/kernel/debug/tracing/kprobe_events |
| Checking /sys/kernel/debug/tracing/uprobe_events |
| Checking /sys/kernel/debug/tracing/trace |
| Checking /sys/kernel/debug/tracing/current_tracer |
| Checking /sys/kernel/debug/tracing/set_ftrace_filter |
| Checking /sys/kernel/debug/tracing/set_graph_function |
| Checking /sys/kernel/debug/tracing/set_ftrace_pid |
| Checking /sys/kernel/debug/tracing/events/enable |
| Checking /sys/kernel/debug/tracing/tracing_thresh |
| Checking /sys/kernel/debug/tracing/tracing_on |
| |
| Done. |
| |
| In this example, no resetting was necessary. |
| |
| |
| Here's an example of actually needing it: |
| |
| # ./funccount 'bash:r*' |
| Tracing 317 functions for "bash:r*"... Hit Ctrl-C to end. |
| ^C |
| FUNC COUNT |
| rl_free_undo_list 1 |
| rl_deprep_terminal 1 |
| readline_internal_teardown 1 |
| rl_on_new_line 1 |
| rl_crlf 1 |
| rl_clear_signals 1 |
| rl_prep_terminal 1 |
| rl_reset_line_state 1 |
| rl_initialize 1 |
| rl_newline 1 |
| readline_internal_setup 1 |
| rl_set_screen_size 1 |
| readline 1 |
| rl_set_signals 1 |
| rl_expand_prompt 1 |
| replace_history_data 1 |
| rl_set_prompt 1 |
| rl_add_undo 1 |
| rl_insert_text 2 |
| rl_insert 2 |
| rl_redisplay 3 |
| rl_read_key 3 |
| rl_getc 3 |
| readline_internal_char 3 |
| restore_parser_state 6 |
| reap_dead_jobs 6 |
| reset_parser 6 |
| restore_input_line_state 6 |
| realloc 7 |
| read_octal 10 |
| read_tty_modified 13 |
| run_exit_trap 13 |
| redirection_expand 13 |
| restore_pipestatus_array 18 |
| reader_loop 20 |
| run_return_trap 21 |
| remember_args 25 |
| reset_signal_handlers 30 |
| remove_quoted_escapes 60 |
| run_unwind_frame 102 |
| reset_terminating_signals 125 |
| restore_original_signals 139 |
| reset_internal_getopt 405 |
| run_debug_trap 719 |
| read_command 940 |
| remove_quoted_nulls 1830 |
| run_pending_traps 3207 |
| ^C |
| ^C |
| ^C |
| |
| I've traced 317 functions using funccount, and when I hit Ctrl-C, funccount is |
| not exiting (it can normally take many seconds, but this really looks stuck): |
| |
| # pidstat 1 |
| Linux 4.9.0-rc1-virtual (bgregg-xenial-bpf-i-xxx) 10/18/2016 _x86_64_ (8 CPU) |
| |
| 10:00:33 PM UID PID %usr %system %guest %CPU CPU Command |
| 10:00:34 PM 60004 3277 0.00 0.98 0.00 0.98 0 redis-server |
| 10:00:34 PM 0 27980 87.25 10.78 0.00 98.04 3 funccount.py |
| 10:00:34 PM 0 29965 0.00 0.98 0.00 0.98 6 pidstat |
| |
| 10:00:34 PM UID PID %usr %system %guest %CPU CPU Command |
| 10:00:35 PM 65534 3276 0.00 1.00 0.00 1.00 2 multilog |
| 10:00:35 PM 0 27980 77.00 23.00 0.00 100.00 3 funccount.py |
| 10:00:35 PM 0 29965 0.00 1.00 0.00 1.00 6 pidstat |
| 10:00:35 PM 60004 29990 0.00 1.00 0.00 1.00 6 catalina.sh |
| |
| funccount looks a lot like it's in an infinite loop (I can use a stack-sampling |
| profiler to confirm). This is a known bug (#665) and may be fixed by the time |
| you read this. But right now it's a good example of needing reset-trace. |
| |
| I'll send a SIGTERM, before resorting to a SIGKILL: |
| |
| # kill 27980 |
| Terminated |
| |
| Ok, so the process is now gone, but it did leave tracing in a semi-enabled |
| state. Using reset-trace: |
| |
| # ./reset-trace.sh -v |
| Reseting tracing state... |
| |
| Checking /sys/kernel/debug/tracing/kprobe_events |
| Checking /sys/kernel/debug/tracing/uprobe_events |
| Needed to reset /sys/kernel/debug/tracing/uprobe_events |
| uprobe_events, before (line enumerated): |
| 1 p:uprobes/p__bin_bash_0xa2540 /bin/bash:0x00000000000a2540 |
| 2 p:uprobes/p__bin_bash_0x21220 /bin/bash:0x0000000000021220 |
| 3 p:uprobes/p__bin_bash_0x78530 /bin/bash:0x0000000000078530 |
| 4 p:uprobes/p__bin_bash_0xa3840 /bin/bash:0x00000000000a3840 |
| 5 p:uprobes/p__bin_bash_0x9c550 /bin/bash:0x000000000009c550 |
| 6 p:uprobes/p__bin_bash_0x5e360 /bin/bash:0x000000000005e360 |
| 7 p:uprobes/p__bin_bash_0xb2630 /bin/bash:0x00000000000b2630 |
| 8 p:uprobes/p__bin_bash_0xb1e70 /bin/bash:0x00000000000b1e70 |
| 9 p:uprobes/p__bin_bash_0xb2540 /bin/bash:0x00000000000b2540 |
| 10 p:uprobes/p__bin_bash_0xb16e0 /bin/bash:0x00000000000b16e0 |
| [...] |
| 312 p:uprobes/p__bin_bash_0xa80b0 /bin/bash:0x00000000000a80b0 |
| 313 p:uprobes/p__bin_bash_0x9e280 /bin/bash:0x000000000009e280 |
| 314 p:uprobes/p__bin_bash_0x9e100 /bin/bash:0x000000000009e100 |
| 315 p:uprobes/p__bin_bash_0xb2bd0 /bin/bash:0x00000000000b2bd0 |
| 316 p:uprobes/p__bin_bash_0x9d9c0 /bin/bash:0x000000000009d9c0 |
| 317 p:uprobes/p__bin_bash_0x4a930 /bin/bash:0x000000000004a930 |
| uprobe_events, after (line enumerated): |
| |
| Checking /sys/kernel/debug/tracing/trace |
| Checking /sys/kernel/debug/tracing/current_tracer |
| Checking /sys/kernel/debug/tracing/set_ftrace_filter |
| Checking /sys/kernel/debug/tracing/set_graph_function |
| Checking /sys/kernel/debug/tracing/set_ftrace_pid |
| Checking /sys/kernel/debug/tracing/events/enable |
| Checking /sys/kernel/debug/tracing/tracing_thresh |
| Checking /sys/kernel/debug/tracing/tracing_on |
| |
| Done. |
| |
| Now looks clean. I did truncate the output here: there were a few hundred lines |
| from uprobe_events. |
| |
| Here's the same situation, but without the verbose option: |
| |
| # ./reset-trace.sh |
| Needed to reset /sys/kernel/debug/tracing/uprobe_events |
| # |
| |
| And again with quiet: |
| |
| # ./reset-trace.sh -q |
| # |
| |
| |
| Here is an example of reset-trace detecting an unrelated tracing session: |
| |
| # ./reset-trace.sh |
| Noticed unrelated tracing file /sys/kernel/debug/tracing/set_ftrace_filter isn't set as expected. Not reseting (-F to force, -v for verbose). |
| |
| And verbose: |
| |
| # ./reset-trace.sh -v |
| Reseting tracing state... |
| |
| Checking /sys/kernel/debug/tracing/kprobe_events |
| Checking /sys/kernel/debug/tracing/uprobe_events |
| Checking /sys/kernel/debug/tracing/trace |
| Checking /sys/kernel/debug/tracing/current_tracer |
| Checking /sys/kernel/debug/tracing/set_ftrace_filter |
| Noticed unrelated tracing file /sys/kernel/debug/tracing/set_ftrace_filter isn't set as expected. Not reseting (-F to force, -v for verbose). |
| Contents of set_ftrace_filter is (line enumerated): |
| 1 tcp_send_mss |
| 2 tcp_sendpage |
| 3 tcp_sendmsg |
| 4 tcp_send_dupack |
| 5 tcp_send_challenge_ack.isra.53 |
| 6 tcp_send_rcvq |
| 7 tcp_send_ack |
| 8 tcp_send_loss_probe |
| 9 tcp_send_fin |
| 10 tcp_send_active_reset |
| 11 tcp_send_synack |
| 12 tcp_send_delayed_ack |
| 13 tcp_send_window_probe |
| 14 tcp_send_probe0 |
| Expected "". |
| Checking /sys/kernel/debug/tracing/set_graph_function |
| Checking /sys/kernel/debug/tracing/set_ftrace_pid |
| Checking /sys/kernel/debug/tracing/events/enable |
| Checking /sys/kernel/debug/tracing/tracing_thresh |
| Checking /sys/kernel/debug/tracing/tracing_on |
| |
| Done. |
| |
| So this file is not currently used by bcc, but it may be useful to know that |
| it's not in the default state -- something is either using it or has left it |
| enabled. These files can be reset with -F, but that may break other tools that |
| are currently using them. |
| |
| |
| Use -h to print the USAGE message: |
| |
| # ./reset-trace.sh -h |
| USAGE: reset-trace [-Fhqv] |
| -F # force: reset all tracing files |
| -v # verbose: print details while working |
| -h # this usage message |
| -q # quiet: no output |
| eg, |
| reset-trace # disable semi-enabled tracing |