Joel | 0b813f8 | 2018-06-18 19:37:38 -0700 | [diff] [blame] | 1 | Demonstrations of criticalstat: Find long atomic critical sections in the kernel. |
| 2 | |
| 3 | criticalstat traces and reports occurences of atomic critical sections in the |
| 4 | kernel with useful stacktraces showing the origin of them. Such critical |
| 5 | sections frequently occur due to use of spinlocks, or if interrupts or |
| 6 | preemption were explicity disabled by a driver. IRQ routines in Linux are also |
| 7 | executed with interrupts disabled. There are many reasons. Such critical |
| 8 | sections are a source of long latency/responsive issues for real-time systems. |
| 9 | |
| 10 | This works by probing the preempt/irq and cpuidle tracepoints in the kernel. |
| 11 | Since this uses BPF, only the root user can use this tool. Further, the kernel |
| 12 | has to be built with certain CONFIG options enabled inorder for it to work: |
| 13 | CONFIG_PREEMPTIRQ_EVENTS |
| 14 | CONFIG_DEBUG_PREEMPT |
| 15 | Additionally, the following options should be turned off on older kernels: |
| 16 | CONFIG_PROVE_LOCKING |
| 17 | CONFIG_LOCKDEP |
| 18 | |
| 19 | USAGE: |
| 20 | # ./criticalstat -h |
| 21 | usage: criticalstat [-h] [-p] [-i] [-d DURATION] |
| 22 | |
| 23 | Trace long critical sections |
| 24 | |
| 25 | optional arguments: |
| 26 | -h, --help Show this help message and exit |
| 27 | -p, --preemptoff Find long sections where preemption was off |
| 28 | -i, --irqoff Find long sections where IRQ was off |
| 29 | -d DURATION, --duration DURATION |
| 30 | Duration in uS (microseconds) below which we filter |
| 31 | |
| 32 | examples: |
| 33 | ./criticalstat # run with default options: irq off for more than 100 uS |
| 34 | ./criticalstat -p # find sections with preemption disabled for more than 100 uS |
| 35 | ./criticalstat -d 500 # find sections with IRQs disabled for more than 500 uS |
| 36 | ./criticalstat -p -d 500 # find sections with preemption disabled for more than 500 uS |
| 37 | |
| 38 | The tool runs continuously until interrupted by Ctrl-C |
| 39 | |
| 40 | |
| 41 | By default, criticalstat finds IRQ disable sections for > 100us. |
| 42 | |
| 43 | # ./criticalstat |
| 44 | Finding critical section with IRQ disabled for > 100us |
| 45 | =================================== |
| 46 | TASK: kworker/u16:5 (pid 5903 tid 5903) Total Time: 194.427 us |
| 47 | |
| 48 | Section start: __schedule -> schedule |
| 49 | Section end: _raw_spin_unlock_irq -> finish_task_switch |
| 50 | trace_hardirqs_on+0xdc |
| 51 | trace_hardirqs_on+0xdc |
| 52 | _raw_spin_unlock_irq+0x18 |
| 53 | finish_task_switch+0xf0 |
| 54 | __schedule+0x8c8 |
| 55 | preempt_schedule_irq+0x38 |
| 56 | el1_preempt+0x8 |
| 57 | =================================== |
| 58 | |
| 59 | |
| 60 | If too many sections are showing up, the user can raise the threshold to only |
| 61 | show critical sections that are > 500us by passing "-d" option: |
| 62 | |
| 63 | # ./criticalstat -d 500 |
| 64 | Finding critical section with IRQ disabled for > 500us |
| 65 | =================================== |
| 66 | TASK: crtc_commit:111 (pid 246 tid 246) Total Time: 580.730 us |
| 67 | |
| 68 | Section start: clk_enable_lock -> clk_enable |
| 69 | Section end: _raw_spin_unlock_irqrestore -> clk_enable |
| 70 | trace_hardirqs_on+0xdc |
| 71 | trace_hardirqs_on+0xdc |
| 72 | _raw_spin_unlock_irqrestore+0x24 |
| 73 | clk_enable+0x80 |
| 74 | msm_dss_enable_clk+0x7c |
| 75 | sde_power_resource_enable+0x578 |
| 76 | _sde_crtc_vblank_enable_no_lock+0x68 |
| 77 | sde_crtc_vblank+0x8c |
| 78 | sde_kms_enable_vblank+0x18 |
| 79 | vblank_ctrl_worker+0xd0 |
| 80 | kthread_worker_fn+0xf8 |
| 81 | kthread+0x114 |
| 82 | ret_from_fork+0x10 |
| 83 | =================================== |
| 84 | |
| 85 | |
| 86 | If instead of irq disabled sections, we want to see preempt disabled sections, |
| 87 | then pass the "-p" option. Below we try to find preempt-disabled critical |
| 88 | sections that are > 500us. |
| 89 | |
| 90 | # ./criticalstat -p -d 500 |
| 91 | Finding critical section with preempt disabled for > 500us |
| 92 | =================================== |
| 93 | TASK: swapper/1 (pid 0 tid 0) Total Time: 618.437 us |
| 94 | |
| 95 | Section start: preempt_count_add -> preempt_count_add |
| 96 | Section end: preempt_count_sub -> preempt_count_sub |
| 97 | trace_preempt_on+0x98 |
| 98 | trace_preempt_on+0x98 |
| 99 | preempt_latency_stop+0x164 |
| 100 | preempt_count_sub+0x50 |
| 101 | schedule+0x74 |
| 102 | schedule_preempt_disabled+0x14 |
| 103 | cpu_startup_entry+0x84 |
| 104 | secondary_start_kernel+0x1c8 |
| 105 | [unknown] |
| 106 | =================================== |
| 107 | |
| 108 | |
| 109 | criticalstat -p can also reflect kernel scheduler issues sometimes. These may |
| 110 | show up as long preempt-off sections if the functions in the scheduler take a |
| 111 | long time to run (such as pick_next_task_fair which selects the CPU for a task |
| 112 | |
| 113 | Follow is a report showing a preempt-off latency of 700us during the schedule |
| 114 | loop's execution: |
| 115 | |
| 116 | =================================== |
| 117 | TASK: irq/296-cs35l36 (pid 666 tid 666) Total Time: 732.657 us |
| 118 | |
| 119 | Section start: schedule -> schedule |
| 120 | Section end: schedule -> schedule |
| 121 | trace_preempt_on+0x98 |
| 122 | trace_preempt_on+0x98 |
| 123 | preempt_count_sub+0xa4 |
| 124 | schedule+0x78 |
| 125 | schedule_timeout+0x80 |
| 126 | wait_for_common+0xb4 |
| 127 | wait_for_completion_timeout+0x28 |
| 128 | geni_i2c_xfer+0x298 |
| 129 | __i2c_transfer+0x4e0 |
| 130 | i2c_transfer+0x8 |
| 131 | irq_thread_fn+0x2c |
| 132 | irq_thread+0x160 |
| 133 | kthread+0x118 |
| 134 | ret_from_fork+0x10 |
| 135 | =================================== |
| 136 | |
| 137 | |
| 138 | See Also: Linux kernel's preemptoff and irqoff tracers which provide similar |
| 139 | tracing but with some limitations. |