blob: d76387c81bb96d455942f76929d09ccff029f811 [file] [log] [blame]
Brendan Gregg74016c32015-09-21 15:49:21 -07001Demonstrations of funclatency, the Linux eBPF/bcc version.
2
3
4Timing the do_sys_open() kernel function until Ctrl-C:
5
6# ./funclatency do_sys_open
7Tracing do_sys_open... Hit Ctrl-C to end.
8^C
9 nsecs : count distribution
10 0 -> 1 : 0 | |
11 2 -> 3 : 0 | |
12 4 -> 7 : 0 | |
13 8 -> 15 : 0 | |
14 16 -> 31 : 0 | |
15 32 -> 63 : 0 | |
16 64 -> 127 : 0 | |
17 128 -> 255 : 0 | |
18 256 -> 511 : 0 | |
19 512 -> 1023 : 0 | |
20 1024 -> 2047 : 0 | |
21 2048 -> 4095 : 124 |**************** |
22 4096 -> 8191 : 291 |**************************************|
23 8192 -> 16383 : 36 |**** |
24 16384 -> 32767 : 16 |** |
25 32768 -> 65535 : 8 |* |
26 65536 -> 131071 : 0 | |
27 131072 -> 262143 : 0 | |
28 262144 -> 524287 : 0 | |
29 524288 -> 1048575 : 0 | |
30 1048576 -> 2097151 : 0 | |
31 2097152 -> 4194303 : 1 | |
chenyue.zhou0cf81662021-04-14 13:37:07 -040032
Haoaf90bd42020-09-29 02:01:36 +080033avg = 13746 nsecs, total: 6543360 nsecs, count: 476
34
Brendan Gregg74016c32015-09-21 15:49:21 -070035Detaching...
36
37The output shows a histogram of function latency (call time), measured from when
38the function began executing (was called) to when it finished (returned).
39
40This example output shows that most of the time, do_sys_open() took between
412048 and 65536 nanoseconds (2 to 65 microseconds). The peak of this distribution
42shows 291 calls of between 4096 and 8191 nanoseconds. There was also one
Alex Bagehot3b9679a2016-02-06 16:01:02 +000043occurrence, an outlier, in the 2 to 4 millisecond range.
Brendan Gregg74016c32015-09-21 15:49:21 -070044
45How this works: the function entry and return are traced using the kernel kprobe
46and kretprobe tracer. Timestamps are collected, the delta time calculated, which
47is the bucketized and stored as an in-kernel histogram for efficiency. The
Alex Bagehot3b9679a2016-02-06 16:01:02 +000048histogram is visible in the output: it's the "count" column; everything else is
Brendan Gregg74016c32015-09-21 15:49:21 -070049decoration. Only the count column is copied to user-level on output. This is an
50efficient way to time kernel functions and examine their latency distribution.
51
52
Sasha Goldshteina466c462016-10-06 21:17:59 +030053Now trace a user function, pthread_mutex_lock in libpthread, to determine if
54there is considerable lock contention:
55
56# ./funclatency pthread:pthread_mutex_lock -p $(pidof contentions)
57Tracing 1 function for "pthread:pthread_mutex_lock"... Hit Ctrl-C to end.
58 nsecs : count distribution
59 0 -> 1 : 0 | |
60 2 -> 3 : 0 | |
61 4 -> 7 : 0 | |
62 8 -> 15 : 0 | |
63 16 -> 31 : 0 | |
64 32 -> 63 : 0 | |
65 64 -> 127 : 0 | |
66 128 -> 255 : 0 | |
67 256 -> 511 : 0 | |
68 512 -> 1023 : 0 | |
69 1024 -> 2047 : 0 | |
70 2048 -> 4095 : 508967 |****************************************|
71 4096 -> 8191 : 70072 |***** |
72 8192 -> 16383 : 27686 |** |
73 16384 -> 32767 : 5075 | |
74 32768 -> 65535 : 2318 | |
75 65536 -> 131071 : 581 | |
76 131072 -> 262143 : 38 | |
77 262144 -> 524287 : 5 | |
78 524288 -> 1048575 : 1 | |
79 1048576 -> 2097151 : 9 | |
Haoaf90bd42020-09-29 02:01:36 +080080
81avg = 4317 nsecs, total: 2654426112 nsecs, count: 614752
82
Sasha Goldshteina466c462016-10-06 21:17:59 +030083Detaching...
84
Taekho Nam0bd1a6d2017-06-29 23:20:47 +090085It seems that most calls to pthread_mutex_lock completed rather quickly (in
Sasha Goldshteina466c462016-10-06 21:17:59 +030086under 4us), but there were some cases of considerable contention, sometimes
87over a full millisecond.
88
89
90Run a quick-and-dirty profiler over all the functions in an executable:
91# ./funclatency /home/user/primes:* -p $(pidof primes) -F
92Tracing 15 functions for "/home/user/primes:*"... Hit Ctrl-C to end.
93^C
94
95Function = is_prime [6556]
96 nsecs : count distribution
97 0 -> 1 : 0 | |
98 2 -> 3 : 0 | |
99 4 -> 7 : 0 | |
100 8 -> 15 : 0 | |
101 16 -> 31 : 0 | |
102 32 -> 63 : 0 | |
103 64 -> 127 : 0 | |
104 128 -> 255 : 0 | |
105 256 -> 511 : 0 | |
106 512 -> 1023 : 0 | |
107 1024 -> 2047 : 0 | |
108 2048 -> 4095 : 1495322 |****************************************|
109 4096 -> 8191 : 95744 |** |
110 8192 -> 16383 : 9926 | |
111 16384 -> 32767 : 3070 | |
112 32768 -> 65535 : 1415 | |
113 65536 -> 131071 : 112 | |
114 131072 -> 262143 : 9 | |
115 262144 -> 524287 : 3 | |
116 524288 -> 1048575 : 0 | |
117 1048576 -> 2097151 : 8 | |
118
119Function = insert_result [6556]
120 nsecs : count distribution
121 0 -> 1 : 0 | |
122 2 -> 3 : 0 | |
123 4 -> 7 : 0 | |
124 8 -> 15 : 0 | |
125 16 -> 31 : 0 | |
126 32 -> 63 : 0 | |
127 64 -> 127 : 0 | |
128 128 -> 255 : 0 | |
129 256 -> 511 : 0 | |
130 512 -> 1023 : 0 | |
131 1024 -> 2047 : 0 | |
132 2048 -> 4095 : 111047 |****************************************|
133 4096 -> 8191 : 3998 |* |
134 8192 -> 16383 : 720 | |
135 16384 -> 32767 : 238 | |
136 32768 -> 65535 : 106 | |
137 65536 -> 131071 : 5 | |
138 131072 -> 262143 : 4 | |
Haoaf90bd42020-09-29 02:01:36 +0800139
140avg = 3404 nsecs, total: 5862276096 nsecs, count: 1721727
141
Sasha Goldshteina466c462016-10-06 21:17:59 +0300142Detaching...
143
144From the results, we can see that the is_prime function has something resembling
145an exponential distribution -- very few primes take a very long time to test,
146while most numbers are verified as prime or composite in less than 4us. The
147insert_result function exhibits a similar phenomenon, likely due to contention
148over a shared results container.
149
150
Brendan Gregg74016c32015-09-21 15:49:21 -0700151Now vfs_read() is traced, and a microseconds histogram printed:
152
153# ./funclatency -u vfs_read
154Tracing vfs_read... Hit Ctrl-C to end.
155^C
156 usecs : count distribution
157 0 -> 1 : 1143 |**************************************|
158 2 -> 3 : 420 |************* |
159 4 -> 7 : 159 |***** |
160 8 -> 15 : 295 |********* |
161 16 -> 31 : 25 | |
162 32 -> 63 : 5 | |
163 64 -> 127 : 1 | |
164 128 -> 255 : 0 | |
165 256 -> 511 : 0 | |
166 512 -> 1023 : 0 | |
167 1024 -> 2047 : 1 | |
168 2048 -> 4095 : 0 | |
169 4096 -> 8191 : 5 | |
170 8192 -> 16383 : 0 | |
171 16384 -> 32767 : 0 | |
172 32768 -> 65535 : 0 | |
173 65536 -> 131071 : 7 | |
174 131072 -> 262143 : 7 | |
175 262144 -> 524287 : 3 | |
176 524288 -> 1048575 : 7 | |
Haoaf90bd42020-09-29 02:01:36 +0800177
178avg = 4229 nsecs, total: 8789145 nsecs, count: 2078
179
Brendan Gregg74016c32015-09-21 15:49:21 -0700180Detaching...
181
182This shows a bimodal distribution. Many vfs_read() calls were faster than 15
183microseconds, however, there was also a small handful between 65 milliseconds
184and 1 second, seen at the bottom of the table. These are likely network reads
185from SSH, waiting on interactive keystrokes.
186
187
188Tracing do_nanosleep() in milliseconds:
189
190# ./funclatency -m do_nanosleep
191Tracing do_nanosleep... Hit Ctrl-C to end.
192^C
193 msecs : count distribution
194 0 -> 1 : 0 | |
195 2 -> 3 : 0 | |
196 4 -> 7 : 0 | |
197 8 -> 15 : 0 | |
198 16 -> 31 : 0 | |
199 32 -> 63 : 0 | |
200 64 -> 127 : 0 | |
201 128 -> 255 : 0 | |
202 256 -> 511 : 0 | |
203 512 -> 1023 : 328 |**************************************|
204 1024 -> 2047 : 0 | |
205 2048 -> 4095 : 0 | |
206 4096 -> 8191 : 32 |*** |
207 8192 -> 16383 : 0 | |
208 16384 -> 32767 : 0 | |
209 32768 -> 65535 : 2 | |
Haoaf90bd42020-09-29 02:01:36 +0800210
211avg = 1510 nsecs, total: 546816 nsecs, count: 326
212
Brendan Gregg74016c32015-09-21 15:49:21 -0700213Detaching...
214
215This looks like it has found threads that are sleeping every 1, 5, and 60
216seconds.
217
218
219An interval can be provided using -i, and timestamps added using -T. For
220example, tracing vfs_read() latency in milliseconds and printing output
221every 5 seconds:
222
223# ./funclatency -mTi 5 vfs_read
224Tracing vfs_read... Hit Ctrl-C to end.
225
22620:10:08
227 msecs : count distribution
228 0 -> 1 : 1500 |*************************************+|
229 2 -> 3 : 3 | |
230 4 -> 7 : 1 | |
231 8 -> 15 : 2 | |
232 16 -> 31 : 0 | |
233 32 -> 63 : 0 | |
234 64 -> 127 : 4 | |
235 128 -> 255 : 3 | |
236 256 -> 511 : 1 | |
237 512 -> 1023 : 7 | |
238
Haoaf90bd42020-09-29 02:01:36 +0800239avg = 5 nsecs, total: 8259 nsecs, count: 1521
240
Brendan Gregg74016c32015-09-21 15:49:21 -070024120:10:13
242 msecs : count distribution
243 0 -> 1 : 1251 |*************************************+|
244 2 -> 3 : 3 | |
245 4 -> 7 : 2 | |
246 8 -> 15 : 0 | |
247 16 -> 31 : 2 | |
248 32 -> 63 : 3 | |
249 64 -> 127 : 5 | |
250 128 -> 255 : 5 | |
251 256 -> 511 : 3 | |
252 512 -> 1023 : 6 | |
253 1024 -> 2047 : 2 | |
254
Haoaf90bd42020-09-29 02:01:36 +0800255avg = 9 nsecs, total: 11736 nsecs, count: 1282
256
Brendan Gregg74016c32015-09-21 15:49:21 -070025720:10:18
258 msecs : count distribution
259 0 -> 1 : 1265 |*************************************+|
260 2 -> 3 : 0 | |
261 4 -> 7 : 5 | |
262 8 -> 15 : 9 | |
263 16 -> 31 : 7 | |
264 32 -> 63 : 1 | |
265 64 -> 127 : 2 | |
266 128 -> 255 : 3 | |
267 256 -> 511 : 5 | |
268 512 -> 1023 : 5 | |
269 1024 -> 2047 : 0 | |
270 2048 -> 4095 : 1 | |
Haoaf90bd42020-09-29 02:01:36 +0800271
272avg = 8 nsecs, total: 11219 nsecs, count: 1303
273
Brendan Gregg74016c32015-09-21 15:49:21 -0700274^C
27520:10:20
276 msecs : count distribution
277 0 -> 1 : 249 |*************************************+|
278 2 -> 3 : 0 | |
279 4 -> 7 : 0 | |
280 8 -> 15 : 1 | |
281 16 -> 31 : 0 | |
282 32 -> 63 : 0 | |
283 64 -> 127 : 0 | |
284 128 -> 255 : 0 | |
285 256 -> 511 : 0 | |
286 512 -> 1023 : 1 | |
Haoaf90bd42020-09-29 02:01:36 +0800287
288avg = 4 nsecs, total: 1029 nsecs, count: 251
289
Brendan Gregg74016c32015-09-21 15:49:21 -0700290Detaching...
291
292
293A single process can be traced, which filters in-kernel for efficiency. Here,
294the vfs_read() function is timed as milliseconds for PID 17064, which is a
295bash shell:
296
297# ./funclatency -mp 17064 vfs_read
298Tracing vfs_read... Hit Ctrl-C to end.
299^C
300 msecs : count distribution
301 0 -> 1 : 1 |** |
302 2 -> 3 : 0 | |
303 4 -> 7 : 0 | |
304 8 -> 15 : 1 |** |
305 16 -> 31 : 2 |***** |
306 32 -> 63 : 0 | |
307 64 -> 127 : 13 |**************************************|
308 128 -> 255 : 10 |***************************** |
309 256 -> 511 : 4 |*********** |
Haoaf90bd42020-09-29 02:01:36 +0800310
311avg = 153 nsecs, total: 4765 nsecs, count: 31
312
Brendan Gregg74016c32015-09-21 15:49:21 -0700313Detaching...
314
315The distribution between 64 and 511 milliseconds shows keystroke latency.
316
317
Brendan Gregg50bbca42015-09-25 12:47:53 -0700318The -F option can be used to print a histogram per function. Eg:
319
320# ./funclatency -uF 'vfs_r*'
321Tracing 5 functions for "vfs_r*"... Hit Ctrl-C to end.
322^C
323
324Function = vfs_read
325 usecs : count distribution
326 0 -> 1 : 1044 |****************************************|
327 2 -> 3 : 383 |************** |
328 4 -> 7 : 76 |** |
329 8 -> 15 : 41 |* |
330 16 -> 31 : 26 | |
331 32 -> 63 : 0 | |
332 64 -> 127 : 1 | |
333 128 -> 255 : 0 | |
334 256 -> 511 : 0 | |
335 512 -> 1023 : 0 | |
336 1024 -> 2047 : 0 | |
337 2048 -> 4095 : 4 | |
338 4096 -> 8191 : 2 | |
339 8192 -> 16383 : 0 | |
340 16384 -> 32767 : 0 | |
341 32768 -> 65535 : 2 | |
342 65536 -> 131071 : 5 | |
343 131072 -> 262143 : 5 | |
344 262144 -> 524287 : 3 | |
345 524288 -> 1048575 : 7 | |
346
347Function = vfs_rename
348 usecs : count distribution
349 0 -> 1 : 2 |**** |
350 2 -> 3 : 2 |**** |
351 4 -> 7 : 2 |**** |
352 8 -> 15 : 0 | |
353 16 -> 31 : 6 |************* |
354 32 -> 63 : 18 |****************************************|
Haoaf90bd42020-09-29 02:01:36 +0800355
356avg = 5087 nsecs, total: 8287001 nsecs, count: 1629
357
Brendan Gregg50bbca42015-09-25 12:47:53 -0700358Detaching...
359
360
361
Brendan Gregg74016c32015-09-21 15:49:21 -0700362USAGE message:
363
Brendan Gregg50bbca42015-09-25 12:47:53 -0700364# ./funclatency -h
Sasha Goldshteina466c462016-10-06 21:17:59 +0300365usage: funclatency [-h] [-p PID] [-i INTERVAL] [-T] [-u] [-m] [-F] [-r] [-v]
Brendan Gregg50bbca42015-09-25 12:47:53 -0700366 pattern
Brendan Gregg74016c32015-09-21 15:49:21 -0700367
Sasha Goldshteina466c462016-10-06 21:17:59 +0300368Time functions and print latency as a histogram
Brendan Gregg74016c32015-09-21 15:49:21 -0700369
370positional arguments:
Sasha Goldshteina466c462016-10-06 21:17:59 +0300371 pattern search expression for functions
Brendan Gregg74016c32015-09-21 15:49:21 -0700372
373optional arguments:
374 -h, --help show this help message and exit
375 -p PID, --pid PID trace this PID only
376 -i INTERVAL, --interval INTERVAL
Akilesh Kailash89967192018-05-18 13:36:54 -0700377 summary interval, in seconds
378 -d DURATION, --duration DURATION
379 total duration of trace, in seconds
Brendan Gregg74016c32015-09-21 15:49:21 -0700380 -T, --timestamp include timestamp on output
381 -u, --microseconds microsecond histogram
382 -m, --milliseconds millisecond histogram
Brendan Gregg50bbca42015-09-25 12:47:53 -0700383 -F, --function show a separate histogram per function
Brendan Gregg74016c32015-09-21 15:49:21 -0700384 -r, --regexp use regular expressions. Default is "*" wildcards
385 only.
chenyue.zhou0cf81662021-04-14 13:37:07 -0400386 -l LEVEL, --level LEVEL
387 set the level of nested or recursive functions
Sasha Goldshteina466c462016-10-06 21:17:59 +0300388 -v, --verbose print the BPF program (for debugging purposes)
Brendan Gregg74016c32015-09-21 15:49:21 -0700389
390examples:
Alex Bagehot3b9679a2016-02-06 16:01:02 +0000391 ./funclatency do_sys_open # time the do_sys_open() kernel function
Sasha Goldshteina466c462016-10-06 21:17:59 +0300392 ./funclatency c:read # time the read() C library function
Brendan Gregg74016c32015-09-21 15:49:21 -0700393 ./funclatency -u vfs_read # time vfs_read(), in microseconds
394 ./funclatency -m do_nanosleep # time do_nanosleep(), in milliseconds
Akilesh Kailash89967192018-05-18 13:36:54 -0700395 ./funclatency -i 2 -d 10 c:open # output every 2 seconds, for duration 10s
Brendan Gregg74016c32015-09-21 15:49:21 -0700396 ./funclatency -mTi 5 vfs_read # output every 5 seconds, with timestamps
397 ./funclatency -p 181 vfs_read # time process 181 only
398 ./funclatency 'vfs_fstat*' # time both vfs_fstat() and vfs_fstatat()
Sasha Goldshteina466c462016-10-06 21:17:59 +0300399 ./funclatency 'c:*printf' # time the *printf family of functions
Brendan Gregg50bbca42015-09-25 12:47:53 -0700400 ./funclatency -F 'vfs_r*' # show one histogram per matched function