| Demonstrations of criticalstat: Find long atomic critical sections in the kernel. |
| |
| criticalstat traces and reports occurences of atomic critical sections in the |
| kernel with useful stacktraces showing the origin of them. Such critical |
| sections frequently occur due to use of spinlocks, or if interrupts or |
| preemption were explicity disabled by a driver. IRQ routines in Linux are also |
| executed with interrupts disabled. There are many reasons. Such critical |
| sections are a source of long latency/responsive issues for real-time systems. |
| |
| This works by probing the preempt/irq and cpuidle tracepoints in the kernel. |
| Since this uses BPF, only the root user can use this tool. Further, the kernel |
| has to be built with certain CONFIG options enabled inorder for it to work: |
| CONFIG_PREEMPTIRQ_EVENTS |
| CONFIG_DEBUG_PREEMPT |
| Additionally, the following options should be turned off on older kernels: |
| CONFIG_PROVE_LOCKING |
| CONFIG_LOCKDEP |
| |
| USAGE: |
| # ./criticalstat -h |
| usage: criticalstat [-h] [-p] [-i] [-d DURATION] |
| |
| Trace long critical sections |
| |
| optional arguments: |
| -h, --help Show this help message and exit |
| -p, --preemptoff Find long sections where preemption was off |
| -i, --irqoff Find long sections where IRQ was off |
| -d DURATION, --duration DURATION |
| Duration in uS (microseconds) below which we filter |
| |
| examples: |
| ./criticalstat # run with default options: irq off for more than 100 uS |
| ./criticalstat -p # find sections with preemption disabled for more than 100 uS |
| ./criticalstat -d 500 # find sections with IRQs disabled for more than 500 uS |
| ./criticalstat -p -d 500 # find sections with preemption disabled for more than 500 uS |
| |
| The tool runs continuously until interrupted by Ctrl-C |
| |
| |
| By default, criticalstat finds IRQ disable sections for > 100us. |
| |
| # ./criticalstat |
| Finding critical section with IRQ disabled for > 100us |
| =================================== |
| TASK: kworker/u16:5 (pid 5903 tid 5903) Total Time: 194.427 us |
| |
| Section start: __schedule -> schedule |
| Section end: _raw_spin_unlock_irq -> finish_task_switch |
| trace_hardirqs_on+0xdc |
| trace_hardirqs_on+0xdc |
| _raw_spin_unlock_irq+0x18 |
| finish_task_switch+0xf0 |
| __schedule+0x8c8 |
| preempt_schedule_irq+0x38 |
| el1_preempt+0x8 |
| =================================== |
| |
| |
| If too many sections are showing up, the user can raise the threshold to only |
| show critical sections that are > 500us by passing "-d" option: |
| |
| # ./criticalstat -d 500 |
| Finding critical section with IRQ disabled for > 500us |
| =================================== |
| TASK: crtc_commit:111 (pid 246 tid 246) Total Time: 580.730 us |
| |
| Section start: clk_enable_lock -> clk_enable |
| Section end: _raw_spin_unlock_irqrestore -> clk_enable |
| trace_hardirqs_on+0xdc |
| trace_hardirqs_on+0xdc |
| _raw_spin_unlock_irqrestore+0x24 |
| clk_enable+0x80 |
| msm_dss_enable_clk+0x7c |
| sde_power_resource_enable+0x578 |
| _sde_crtc_vblank_enable_no_lock+0x68 |
| sde_crtc_vblank+0x8c |
| sde_kms_enable_vblank+0x18 |
| vblank_ctrl_worker+0xd0 |
| kthread_worker_fn+0xf8 |
| kthread+0x114 |
| ret_from_fork+0x10 |
| =================================== |
| |
| |
| If instead of irq disabled sections, we want to see preempt disabled sections, |
| then pass the "-p" option. Below we try to find preempt-disabled critical |
| sections that are > 500us. |
| |
| # ./criticalstat -p -d 500 |
| Finding critical section with preempt disabled for > 500us |
| =================================== |
| TASK: swapper/1 (pid 0 tid 0) Total Time: 618.437 us |
| |
| Section start: preempt_count_add -> preempt_count_add |
| Section end: preempt_count_sub -> preempt_count_sub |
| trace_preempt_on+0x98 |
| trace_preempt_on+0x98 |
| preempt_latency_stop+0x164 |
| preempt_count_sub+0x50 |
| schedule+0x74 |
| schedule_preempt_disabled+0x14 |
| cpu_startup_entry+0x84 |
| secondary_start_kernel+0x1c8 |
| [unknown] |
| =================================== |
| |
| |
| criticalstat -p can also reflect kernel scheduler issues sometimes. These may |
| show up as long preempt-off sections if the functions in the scheduler take a |
| long time to run (such as pick_next_task_fair which selects the CPU for a task |
| |
| Follow is a report showing a preempt-off latency of 700us during the schedule |
| loop's execution: |
| |
| =================================== |
| TASK: irq/296-cs35l36 (pid 666 tid 666) Total Time: 732.657 us |
| |
| Section start: schedule -> schedule |
| Section end: schedule -> schedule |
| trace_preempt_on+0x98 |
| trace_preempt_on+0x98 |
| preempt_count_sub+0xa4 |
| schedule+0x78 |
| schedule_timeout+0x80 |
| wait_for_common+0xb4 |
| wait_for_completion_timeout+0x28 |
| geni_i2c_xfer+0x298 |
| __i2c_transfer+0x4e0 |
| i2c_transfer+0x8 |
| irq_thread_fn+0x2c |
| irq_thread+0x160 |
| kthread+0x118 |
| ret_from_fork+0x10 |
| =================================== |
| |
| |
| See Also: Linux kernel's preemptoff and irqoff tracers which provide similar |
| tracing but with some limitations. |