blob: 01b5b8ad3fad592c400e9296e0df029c9788907d [file] [log] [blame]
Brendan Gregg38cef482016-01-15 17:26:30 -08001Demonstrations of stackcount, the Linux eBPF/bcc version.
2
3
4This program traces kernel functions and frequency counts them with their entire
5kernel stack trace, summarized in-kernel for efficiency. For example, counting
6stack traces that led to submit_bio(), which creates block device I/O:
7
8# ./stackcount submit_bio
9Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
10^C
11 submit_bio
12 submit_bh
13 journal_submit_commit_record.isra.13
14 jbd2_journal_commit_transaction
15 kjournald2
16 kthread
17 ret_from_fork
18 mb_cache_list
19 1
20
21 submit_bio
22 __block_write_full_page.constprop.39
23 block_write_full_page
24 blkdev_writepage
25 __writepage
26 write_cache_pages
27 generic_writepages
28 do_writepages
29 __writeback_single_inode
30 writeback_sb_inodes
31 __writeback_inodes_wb
32 2
33
34 submit_bio
35 __block_write_full_page.constprop.39
36 block_write_full_page
37 blkdev_writepage
38 __writepage
39 write_cache_pages
40 generic_writepages
41 do_writepages
42 __filemap_fdatawrite_range
43 filemap_fdatawrite
44 fdatawrite_one_bdev
45 36
46
47 submit_bio
48 submit_bh
49 jbd2_journal_commit_transaction
50 kjournald2
51 kthread
52 ret_from_fork
53 mb_cache_list
54 38
55
56 submit_bio
57 ext4_writepages
58 do_writepages
59 __filemap_fdatawrite_range
60 filemap_flush
61 ext4_alloc_da_blocks
62 ext4_rename
63 ext4_rename2
64 vfs_rename
65 sys_rename
66 entry_SYSCALL_64_fastpath
67 79
68
69Detaching...
70
71The output shows unique stack traces, in order from leaf (on-CPU) to root,
72followed by their occurrence count. The last stack trace in the above output
73shows syscall handling, ext4_rename(), and filemap_flush(): looks like an
74application issued file rename has caused back end disk I/O due to ext4
75block allocation and a filemap_flush(). I'd have to browse the code to those
76functions to confirm!
77
78The order of printed stack traces is from least to most frequent. The most
79frequent in this case, the ext4_rename() stack, was taken 79 times during
80tracing.
81
82It can be useful to trace the path to submit_bio to explain unusual rates of
83disk IOPS. These could have in-kernel origins (eg, background scrub).
84
Brendan Gregg38cef482016-01-15 17:26:30 -080085
86As another example, here are the code paths that led to ip_output(), which
87sends a packet at the IP level:
88
89# ./stackcount ip_output
90Tracing 1 functions for "ip_output"... Hit Ctrl-C to end.
91^C
92 ip_output
93 ip_queue_xmit
94 tcp_transmit_skb
95 tcp_write_xmit
96 tcp_tsq_handler.part.32
97 tcp_tasklet_func
98 tasklet_action
99 __do_softirq
100 do_softirq_own_stack
101 do_softirq
102 __local_bh_enable_ip
103 1
104
105 ip_output
106 ip_queue_xmit
107 tcp_transmit_skb
108 tcp_send_ack
109 tcp_send_delayed_ack
110 __tcp_ack_snd_check
111 tcp_rcv_established
112 tcp_v4_do_rcv
113 tcp_v4_rcv
114 ip_local_deliver_finish
115 ip_local_deliver
116 1
117
118 ip_output
119 ip_queue_xmit
120 tcp_transmit_skb
121 tcp_send_ack
122 tcp_send_delayed_ack
123 __tcp_ack_snd_check
124 tcp_rcv_established
125 tcp_v4_do_rcv
126 tcp_v4_rcv
127 ip_local_deliver_finish
128 ip_local_deliver
129 1
130
131 ip_output
132 ip_queue_xmit
133 tcp_transmit_skb
134 tcp_send_ack
135 tcp_delack_timer_handler
136 tcp_delack_timer
137 call_timer_fn
138 run_timer_softirq
139 __do_softirq
140 irq_exit
141 xen_evtchn_do_upcall
142 1
143
144 ip_output
145 ip_queue_xmit
146 tcp_transmit_skb
147 tcp_write_xmit
148 __tcp_push_pending_frames
149 tcp_rcv_established
150 tcp_v4_do_rcv
151 release_sock
152 tcp_sendmsg
153 inet_sendmsg
154 sock_sendmsg
155 3
156
157 ip_output
158 ip_queue_xmit
159 tcp_transmit_skb
160 tcp_write_xmit
161 tcp_tsq_handler.part.32
162 tcp_tasklet_func
163 tasklet_action
164 __do_softirq
165 run_ksoftirqd
166 smpboot_thread_fn
167 kthread
168 3
169
170 ip_output
171 ip_queue_xmit
172 tcp_transmit_skb
173 tcp_write_xmit
174 __tcp_push_pending_frames
175 tcp_rcv_established
176 tcp_v4_do_rcv
177 tcp_v4_rcv
178 ip_local_deliver_finish
179 ip_local_deliver
180 ip_rcv_finish
181 4
182
183 ip_output
184 ip_queue_xmit
185 tcp_transmit_skb
186 tcp_send_ack
187 tcp_cleanup_rbuf
188 tcp_recvmsg
189 inet_recvmsg
190 sock_recvmsg
191 sock_read_iter
192 __vfs_read
193 vfs_read
194 5
195
196 ip_output
197 ip_queue_xmit
198 tcp_transmit_skb
199 tcp_write_xmit
200 __tcp_push_pending_frames
201 tcp_rcv_established
202 tcp_v4_do_rcv
203 release_sock
204 tcp_sendmsg
205 inet_sendmsg
206 sock_sendmsg
207 9
208
209 ip_output
210 ip_queue_xmit
211 tcp_transmit_skb
212 tcp_write_xmit
213 __tcp_push_pending_frames
214 tcp_push
215 tcp_sendmsg
216 inet_sendmsg
217 sock_sendmsg
218 sock_write_iter
219 __vfs_write
220 51
221
222 ip_output
223 ip_queue_xmit
224 tcp_transmit_skb
225 tcp_write_xmit
226 __tcp_push_pending_frames
227 tcp_rcv_established
228 tcp_v4_do_rcv
229 tcp_v4_rcv
230 ip_local_deliver_finish
231 ip_local_deliver
232 ip_rcv_finish
233 171
234
235 ip_output
236 ip_queue_xmit
237 tcp_transmit_skb
238 tcp_write_xmit
239 __tcp_push_pending_frames
240 tcp_rcv_established
241 tcp_v4_do_rcv
242 tcp_v4_rcv
243 ip_local_deliver_finish
244 ip_local_deliver
245 ip_rcv_finish
246 994
247
248 ip_output
249 ip_queue_xmit
250 tcp_transmit_skb
251 tcp_write_xmit
252 tcp_tsq_handler.part.32
253 tcp_tasklet_func
254 tasklet_action
255 __do_softirq
256 irq_exit
257 xen_evtchn_do_upcall
258 xen_do_hypervisor_callback
259 1002
260
261Detaching...
262
263The last two most frequent stack traces are via tcp_transmit_skb(). Note the
264send last begins with ip_rcv_finish(), for a local receive, which then becomes
265a transmit: likely pushing more frames when processing the newly received ones.
266
267As may be obvious, this is a great tool for quickly understanding kernel code
268flow.
269
270
271A -i option can be used to set an output interval, and -T to include a
272timestamp. For example:
273
274# ./stackcount -Ti 1 submit_bio
275Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
276
27701:11:37
278 submit_bio
279 submit_bh
280 journal_submit_commit_record.isra.13
281 jbd2_journal_commit_transaction
282 kjournald2
283 kthread
284 ret_from_fork
285 mb_cache_list
286 1
287
288 submit_bio
289 ext4_writepages
290 do_writepages
291 __filemap_fdatawrite_range
292 filemap_flush
293 ext4_alloc_da_blocks
294 ext4_rename
295 ext4_rename2
296 vfs_rename
297 sys_rename
298 entry_SYSCALL_64_fastpath
299 20
300
301 submit_bio
302 submit_bh
303 jbd2_journal_commit_transaction
304 kjournald2
305 kthread
306 ret_from_fork
307 mb_cache_list
308 39
309
310
31101:11:38
312 submit_bio
313 ext4_writepages
314 do_writepages
315 __filemap_fdatawrite_range
316 filemap_flush
317 ext4_alloc_da_blocks
318 ext4_rename
319 ext4_rename2
320 vfs_rename
321 sys_rename
322 entry_SYSCALL_64_fastpath
323 20
324
325
32601:11:39
327 submit_bio
328 ext4_writepages
329 do_writepages
330 __filemap_fdatawrite_range
331 filemap_flush
332 ext4_alloc_da_blocks
333 ext4_rename
334 ext4_rename2
335 vfs_rename
336 sys_rename
337 entry_SYSCALL_64_fastpath
338 20
339
340^C
34101:11:39
342 submit_bio
343 ext4_writepages
344 do_writepages
345 __filemap_fdatawrite_range
346 filemap_flush
347 ext4_alloc_da_blocks
348 ext4_rename
349 ext4_rename2
350 vfs_rename
351 sys_rename
352 entry_SYSCALL_64_fastpath
353 20
354
355Detaching...
356
357
358The -s output prints the return instruction offset for each function (aka
359symbol offset). Eg:
360
361# ./stackcount -s tcp_sendmsg
362Tracing 1 functions for "tcp_sendmsg"... Hit Ctrl-C to end.
363^C
364 tcp_sendmsg0x1
365 sock_sendmsg0x38
366 sock_write_iter0x78
367 __vfs_write0xaa
368 vfs_write0xa9
369 sys_write0x46
370 entry_SYSCALL_64_fastpath0x16
371 29
372
373Detaching...
374
375If it wasn't clear how one function called another, knowing the instruction
Alex Bagehot3b9679a2016-02-06 16:01:02 +0000376offset can help you locate the lines of code from a disassembly dump.
Brendan Gregg38cef482016-01-15 17:26:30 -0800377
378
379A wildcard can also be used. Eg, all functions beginning with "tcp_send":
380
381# ./stackcount -s 'tcp_send*'
382Tracing 14 functions for "tcp_send*"... Hit Ctrl-C to end.
383^C
384 tcp_send_delayed_ack0x1
385 tcp_rcv_established0x3b1
386 tcp_v4_do_rcv0x130
387 tcp_v4_rcv0x8e0
388 ip_local_deliver_finish0x9f
389 ip_local_deliver0x51
390 ip_rcv_finish0x8a
391 ip_rcv0x29d
392 __netif_receive_skb_core0x637
393 __netif_receive_skb0x18
394 netif_receive_skb_internal0x23
395 1
396
397 tcp_send_delayed_ack0x1
398 tcp_rcv_established0x222
399 tcp_v4_do_rcv0x130
400 tcp_v4_rcv0x8e0
401 ip_local_deliver_finish0x9f
402 ip_local_deliver0x51
403 ip_rcv_finish0x8a
404 ip_rcv0x29d
405 __netif_receive_skb_core0x637
406 __netif_receive_skb0x18
407 netif_receive_skb_internal0x23
408 4
409
410 tcp_send_mss0x1
411 inet_sendmsg0x67
412 sock_sendmsg0x38
413 sock_write_iter0x78
414 __vfs_write0xaa
415 vfs_write0xa9
416 sys_write0x46
417 entry_SYSCALL_64_fastpath0x16
418 7
419
420 tcp_sendmsg0x1
421 sock_sendmsg0x38
422 sock_write_iter0x78
423 __vfs_write0xaa
424 vfs_write0xa9
425 sys_write0x46
426 entry_SYSCALL_64_fastpath0x16
427 7
428
429Detaching...
430
431Use -r to allow regular expressions.
432
433
434USAGE message:
435
436# ./stackcount -h
437usage: stackcount [-h] [-p PID] [-i INTERVAL] [-T] [-r] [-s] [-v] pattern
438
439Count kernel function calls and their stack traces
440
441positional arguments:
442 pattern search expression for kernel functions
443
444optional arguments:
445 -h, --help show this help message and exit
446 -p PID, --pid PID trace this PID only
447 -i INTERVAL, --interval INTERVAL
448 summary interval, seconds
449 -T, --timestamp include timestamp on output
450 -r, --regexp use regular expressions. Default is "*" wildcards
451 only.
452 -s, --offset show address offsets
453 -v, --verbose show raw addresses
454
455examples:
456 ./stackcount submit_bio # count kernel stack traces for submit_bio
457 ./stackcount ip_output # count kernel stack traces for ip_output
458 ./stackcount -s ip_output # show symbol offsets
459 ./stackcount -sv ip_output # show offsets and raw addresses (verbose)
460 ./stackcount 'tcp_send*' # count stacks for funcs matching tcp_send*
461 ./stackcount -r '^tcp_send.*' # same as above, using regular expressions
462 ./stackcount -Ti 5 ip_output # output every 5 seconds, with timestamps
463 ./stackcount -p 185 ip_output # count ip_output stacks for PID 185 only