Brendan Gregg | 74016c3 | 2015-09-21 15:49:21 -0700 | [diff] [blame] | 1 | Demonstrations of funclatency, the Linux eBPF/bcc version. |
| 2 | |
| 3 | |
| 4 | Timing the do_sys_open() kernel function until Ctrl-C: |
| 5 | |
| 6 | # ./funclatency do_sys_open |
| 7 | Tracing do_sys_open... Hit Ctrl-C to end. |
| 8 | ^C |
| 9 | nsecs : count distribution |
| 10 | 0 -> 1 : 0 | | |
| 11 | 2 -> 3 : 0 | | |
| 12 | 4 -> 7 : 0 | | |
| 13 | 8 -> 15 : 0 | | |
| 14 | 16 -> 31 : 0 | | |
| 15 | 32 -> 63 : 0 | | |
| 16 | 64 -> 127 : 0 | | |
| 17 | 128 -> 255 : 0 | | |
| 18 | 256 -> 511 : 0 | | |
| 19 | 512 -> 1023 : 0 | | |
| 20 | 1024 -> 2047 : 0 | | |
| 21 | 2048 -> 4095 : 124 |**************** | |
| 22 | 4096 -> 8191 : 291 |**************************************| |
| 23 | 8192 -> 16383 : 36 |**** | |
| 24 | 16384 -> 32767 : 16 |** | |
| 25 | 32768 -> 65535 : 8 |* | |
| 26 | 65536 -> 131071 : 0 | | |
| 27 | 131072 -> 262143 : 0 | | |
| 28 | 262144 -> 524287 : 0 | | |
| 29 | 524288 -> 1048575 : 0 | | |
| 30 | 1048576 -> 2097151 : 0 | | |
| 31 | 2097152 -> 4194303 : 1 | | |
| 32 | Detaching... |
| 33 | |
| 34 | The output shows a histogram of function latency (call time), measured from when |
| 35 | the function began executing (was called) to when it finished (returned). |
| 36 | |
| 37 | This example output shows that most of the time, do_sys_open() took between |
| 38 | 2048 and 65536 nanoseconds (2 to 65 microseconds). The peak of this distribution |
| 39 | shows 291 calls of between 4096 and 8191 nanoseconds. There was also one |
Alex Bagehot | 3b9679a | 2016-02-06 16:01:02 +0000 | [diff] [blame] | 40 | occurrence, an outlier, in the 2 to 4 millisecond range. |
Brendan Gregg | 74016c3 | 2015-09-21 15:49:21 -0700 | [diff] [blame] | 41 | |
| 42 | How this works: the function entry and return are traced using the kernel kprobe |
| 43 | and kretprobe tracer. Timestamps are collected, the delta time calculated, which |
| 44 | is the bucketized and stored as an in-kernel histogram for efficiency. The |
Alex Bagehot | 3b9679a | 2016-02-06 16:01:02 +0000 | [diff] [blame] | 45 | histogram is visible in the output: it's the "count" column; everything else is |
Brendan Gregg | 74016c3 | 2015-09-21 15:49:21 -0700 | [diff] [blame] | 46 | decoration. Only the count column is copied to user-level on output. This is an |
| 47 | efficient way to time kernel functions and examine their latency distribution. |
| 48 | |
| 49 | |
Sasha Goldshtein | a466c46 | 2016-10-06 21:17:59 +0300 | [diff] [blame] | 50 | Now trace a user function, pthread_mutex_lock in libpthread, to determine if |
| 51 | there is considerable lock contention: |
| 52 | |
| 53 | # ./funclatency pthread:pthread_mutex_lock -p $(pidof contentions) |
| 54 | Tracing 1 function for "pthread:pthread_mutex_lock"... Hit Ctrl-C to end. |
| 55 | nsecs : count distribution |
| 56 | 0 -> 1 : 0 | | |
| 57 | 2 -> 3 : 0 | | |
| 58 | 4 -> 7 : 0 | | |
| 59 | 8 -> 15 : 0 | | |
| 60 | 16 -> 31 : 0 | | |
| 61 | 32 -> 63 : 0 | | |
| 62 | 64 -> 127 : 0 | | |
| 63 | 128 -> 255 : 0 | | |
| 64 | 256 -> 511 : 0 | | |
| 65 | 512 -> 1023 : 0 | | |
| 66 | 1024 -> 2047 : 0 | | |
| 67 | 2048 -> 4095 : 508967 |****************************************| |
| 68 | 4096 -> 8191 : 70072 |***** | |
| 69 | 8192 -> 16383 : 27686 |** | |
| 70 | 16384 -> 32767 : 5075 | | |
| 71 | 32768 -> 65535 : 2318 | | |
| 72 | 65536 -> 131071 : 581 | | |
| 73 | 131072 -> 262143 : 38 | | |
| 74 | 262144 -> 524287 : 5 | | |
| 75 | 524288 -> 1048575 : 1 | | |
| 76 | 1048576 -> 2097151 : 9 | | |
| 77 | Detaching... |
| 78 | |
Taekho Nam | 0bd1a6d | 2017-06-29 23:20:47 +0900 | [diff] [blame] | 79 | It seems that most calls to pthread_mutex_lock completed rather quickly (in |
Sasha Goldshtein | a466c46 | 2016-10-06 21:17:59 +0300 | [diff] [blame] | 80 | under 4us), but there were some cases of considerable contention, sometimes |
| 81 | over a full millisecond. |
| 82 | |
| 83 | |
| 84 | Run a quick-and-dirty profiler over all the functions in an executable: |
| 85 | # ./funclatency /home/user/primes:* -p $(pidof primes) -F |
| 86 | Tracing 15 functions for "/home/user/primes:*"... Hit Ctrl-C to end. |
| 87 | ^C |
| 88 | |
| 89 | Function = is_prime [6556] |
| 90 | nsecs : count distribution |
| 91 | 0 -> 1 : 0 | | |
| 92 | 2 -> 3 : 0 | | |
| 93 | 4 -> 7 : 0 | | |
| 94 | 8 -> 15 : 0 | | |
| 95 | 16 -> 31 : 0 | | |
| 96 | 32 -> 63 : 0 | | |
| 97 | 64 -> 127 : 0 | | |
| 98 | 128 -> 255 : 0 | | |
| 99 | 256 -> 511 : 0 | | |
| 100 | 512 -> 1023 : 0 | | |
| 101 | 1024 -> 2047 : 0 | | |
| 102 | 2048 -> 4095 : 1495322 |****************************************| |
| 103 | 4096 -> 8191 : 95744 |** | |
| 104 | 8192 -> 16383 : 9926 | | |
| 105 | 16384 -> 32767 : 3070 | | |
| 106 | 32768 -> 65535 : 1415 | | |
| 107 | 65536 -> 131071 : 112 | | |
| 108 | 131072 -> 262143 : 9 | | |
| 109 | 262144 -> 524287 : 3 | | |
| 110 | 524288 -> 1048575 : 0 | | |
| 111 | 1048576 -> 2097151 : 8 | | |
| 112 | |
| 113 | Function = insert_result [6556] |
| 114 | nsecs : count distribution |
| 115 | 0 -> 1 : 0 | | |
| 116 | 2 -> 3 : 0 | | |
| 117 | 4 -> 7 : 0 | | |
| 118 | 8 -> 15 : 0 | | |
| 119 | 16 -> 31 : 0 | | |
| 120 | 32 -> 63 : 0 | | |
| 121 | 64 -> 127 : 0 | | |
| 122 | 128 -> 255 : 0 | | |
| 123 | 256 -> 511 : 0 | | |
| 124 | 512 -> 1023 : 0 | | |
| 125 | 1024 -> 2047 : 0 | | |
| 126 | 2048 -> 4095 : 111047 |****************************************| |
| 127 | 4096 -> 8191 : 3998 |* | |
| 128 | 8192 -> 16383 : 720 | | |
| 129 | 16384 -> 32767 : 238 | | |
| 130 | 32768 -> 65535 : 106 | | |
| 131 | 65536 -> 131071 : 5 | | |
| 132 | 131072 -> 262143 : 4 | | |
| 133 | Detaching... |
| 134 | |
| 135 | From the results, we can see that the is_prime function has something resembling |
| 136 | an exponential distribution -- very few primes take a very long time to test, |
| 137 | while most numbers are verified as prime or composite in less than 4us. The |
| 138 | insert_result function exhibits a similar phenomenon, likely due to contention |
| 139 | over a shared results container. |
| 140 | |
| 141 | |
Brendan Gregg | 74016c3 | 2015-09-21 15:49:21 -0700 | [diff] [blame] | 142 | Now vfs_read() is traced, and a microseconds histogram printed: |
| 143 | |
| 144 | # ./funclatency -u vfs_read |
| 145 | Tracing vfs_read... Hit Ctrl-C to end. |
| 146 | ^C |
| 147 | usecs : count distribution |
| 148 | 0 -> 1 : 1143 |**************************************| |
| 149 | 2 -> 3 : 420 |************* | |
| 150 | 4 -> 7 : 159 |***** | |
| 151 | 8 -> 15 : 295 |********* | |
| 152 | 16 -> 31 : 25 | | |
| 153 | 32 -> 63 : 5 | | |
| 154 | 64 -> 127 : 1 | | |
| 155 | 128 -> 255 : 0 | | |
| 156 | 256 -> 511 : 0 | | |
| 157 | 512 -> 1023 : 0 | | |
| 158 | 1024 -> 2047 : 1 | | |
| 159 | 2048 -> 4095 : 0 | | |
| 160 | 4096 -> 8191 : 5 | | |
| 161 | 8192 -> 16383 : 0 | | |
| 162 | 16384 -> 32767 : 0 | | |
| 163 | 32768 -> 65535 : 0 | | |
| 164 | 65536 -> 131071 : 7 | | |
| 165 | 131072 -> 262143 : 7 | | |
| 166 | 262144 -> 524287 : 3 | | |
| 167 | 524288 -> 1048575 : 7 | | |
| 168 | Detaching... |
| 169 | |
| 170 | This shows a bimodal distribution. Many vfs_read() calls were faster than 15 |
| 171 | microseconds, however, there was also a small handful between 65 milliseconds |
| 172 | and 1 second, seen at the bottom of the table. These are likely network reads |
| 173 | from SSH, waiting on interactive keystrokes. |
| 174 | |
| 175 | |
| 176 | Tracing do_nanosleep() in milliseconds: |
| 177 | |
| 178 | # ./funclatency -m do_nanosleep |
| 179 | Tracing do_nanosleep... Hit Ctrl-C to end. |
| 180 | ^C |
| 181 | msecs : count distribution |
| 182 | 0 -> 1 : 0 | | |
| 183 | 2 -> 3 : 0 | | |
| 184 | 4 -> 7 : 0 | | |
| 185 | 8 -> 15 : 0 | | |
| 186 | 16 -> 31 : 0 | | |
| 187 | 32 -> 63 : 0 | | |
| 188 | 64 -> 127 : 0 | | |
| 189 | 128 -> 255 : 0 | | |
| 190 | 256 -> 511 : 0 | | |
| 191 | 512 -> 1023 : 328 |**************************************| |
| 192 | 1024 -> 2047 : 0 | | |
| 193 | 2048 -> 4095 : 0 | | |
| 194 | 4096 -> 8191 : 32 |*** | |
| 195 | 8192 -> 16383 : 0 | | |
| 196 | 16384 -> 32767 : 0 | | |
| 197 | 32768 -> 65535 : 2 | | |
| 198 | Detaching... |
| 199 | |
| 200 | This looks like it has found threads that are sleeping every 1, 5, and 60 |
| 201 | seconds. |
| 202 | |
| 203 | |
| 204 | An interval can be provided using -i, and timestamps added using -T. For |
| 205 | example, tracing vfs_read() latency in milliseconds and printing output |
| 206 | every 5 seconds: |
| 207 | |
| 208 | # ./funclatency -mTi 5 vfs_read |
| 209 | Tracing vfs_read... Hit Ctrl-C to end. |
| 210 | |
| 211 | 20:10:08 |
| 212 | msecs : count distribution |
| 213 | 0 -> 1 : 1500 |*************************************+| |
| 214 | 2 -> 3 : 3 | | |
| 215 | 4 -> 7 : 1 | | |
| 216 | 8 -> 15 : 2 | | |
| 217 | 16 -> 31 : 0 | | |
| 218 | 32 -> 63 : 0 | | |
| 219 | 64 -> 127 : 4 | | |
| 220 | 128 -> 255 : 3 | | |
| 221 | 256 -> 511 : 1 | | |
| 222 | 512 -> 1023 : 7 | | |
| 223 | |
| 224 | 20:10:13 |
| 225 | msecs : count distribution |
| 226 | 0 -> 1 : 1251 |*************************************+| |
| 227 | 2 -> 3 : 3 | | |
| 228 | 4 -> 7 : 2 | | |
| 229 | 8 -> 15 : 0 | | |
| 230 | 16 -> 31 : 2 | | |
| 231 | 32 -> 63 : 3 | | |
| 232 | 64 -> 127 : 5 | | |
| 233 | 128 -> 255 : 5 | | |
| 234 | 256 -> 511 : 3 | | |
| 235 | 512 -> 1023 : 6 | | |
| 236 | 1024 -> 2047 : 2 | | |
| 237 | |
| 238 | 20:10:18 |
| 239 | msecs : count distribution |
| 240 | 0 -> 1 : 1265 |*************************************+| |
| 241 | 2 -> 3 : 0 | | |
| 242 | 4 -> 7 : 5 | | |
| 243 | 8 -> 15 : 9 | | |
| 244 | 16 -> 31 : 7 | | |
| 245 | 32 -> 63 : 1 | | |
| 246 | 64 -> 127 : 2 | | |
| 247 | 128 -> 255 : 3 | | |
| 248 | 256 -> 511 : 5 | | |
| 249 | 512 -> 1023 : 5 | | |
| 250 | 1024 -> 2047 : 0 | | |
| 251 | 2048 -> 4095 : 1 | | |
| 252 | ^C |
| 253 | 20:10:20 |
| 254 | msecs : count distribution |
| 255 | 0 -> 1 : 249 |*************************************+| |
| 256 | 2 -> 3 : 0 | | |
| 257 | 4 -> 7 : 0 | | |
| 258 | 8 -> 15 : 1 | | |
| 259 | 16 -> 31 : 0 | | |
| 260 | 32 -> 63 : 0 | | |
| 261 | 64 -> 127 : 0 | | |
| 262 | 128 -> 255 : 0 | | |
| 263 | 256 -> 511 : 0 | | |
| 264 | 512 -> 1023 : 1 | | |
| 265 | Detaching... |
| 266 | |
| 267 | |
| 268 | A single process can be traced, which filters in-kernel for efficiency. Here, |
| 269 | the vfs_read() function is timed as milliseconds for PID 17064, which is a |
| 270 | bash shell: |
| 271 | |
| 272 | # ./funclatency -mp 17064 vfs_read |
| 273 | Tracing vfs_read... Hit Ctrl-C to end. |
| 274 | ^C |
| 275 | msecs : count distribution |
| 276 | 0 -> 1 : 1 |** | |
| 277 | 2 -> 3 : 0 | | |
| 278 | 4 -> 7 : 0 | | |
| 279 | 8 -> 15 : 1 |** | |
| 280 | 16 -> 31 : 2 |***** | |
| 281 | 32 -> 63 : 0 | | |
| 282 | 64 -> 127 : 13 |**************************************| |
| 283 | 128 -> 255 : 10 |***************************** | |
| 284 | 256 -> 511 : 4 |*********** | |
| 285 | Detaching... |
| 286 | |
| 287 | The distribution between 64 and 511 milliseconds shows keystroke latency. |
| 288 | |
| 289 | |
Brendan Gregg | 50bbca4 | 2015-09-25 12:47:53 -0700 | [diff] [blame] | 290 | The -F option can be used to print a histogram per function. Eg: |
| 291 | |
| 292 | # ./funclatency -uF 'vfs_r*' |
| 293 | Tracing 5 functions for "vfs_r*"... Hit Ctrl-C to end. |
| 294 | ^C |
| 295 | |
| 296 | Function = vfs_read |
| 297 | usecs : count distribution |
| 298 | 0 -> 1 : 1044 |****************************************| |
| 299 | 2 -> 3 : 383 |************** | |
| 300 | 4 -> 7 : 76 |** | |
| 301 | 8 -> 15 : 41 |* | |
| 302 | 16 -> 31 : 26 | | |
| 303 | 32 -> 63 : 0 | | |
| 304 | 64 -> 127 : 1 | | |
| 305 | 128 -> 255 : 0 | | |
| 306 | 256 -> 511 : 0 | | |
| 307 | 512 -> 1023 : 0 | | |
| 308 | 1024 -> 2047 : 0 | | |
| 309 | 2048 -> 4095 : 4 | | |
| 310 | 4096 -> 8191 : 2 | | |
| 311 | 8192 -> 16383 : 0 | | |
| 312 | 16384 -> 32767 : 0 | | |
| 313 | 32768 -> 65535 : 2 | | |
| 314 | 65536 -> 131071 : 5 | | |
| 315 | 131072 -> 262143 : 5 | | |
| 316 | 262144 -> 524287 : 3 | | |
| 317 | 524288 -> 1048575 : 7 | | |
| 318 | |
| 319 | Function = vfs_rename |
| 320 | usecs : count distribution |
| 321 | 0 -> 1 : 2 |**** | |
| 322 | 2 -> 3 : 2 |**** | |
| 323 | 4 -> 7 : 2 |**** | |
| 324 | 8 -> 15 : 0 | | |
| 325 | 16 -> 31 : 6 |************* | |
| 326 | 32 -> 63 : 18 |****************************************| |
| 327 | Detaching... |
| 328 | |
| 329 | |
| 330 | |
Brendan Gregg | 74016c3 | 2015-09-21 15:49:21 -0700 | [diff] [blame] | 331 | USAGE message: |
| 332 | |
Brendan Gregg | 50bbca4 | 2015-09-25 12:47:53 -0700 | [diff] [blame] | 333 | # ./funclatency -h |
Sasha Goldshtein | a466c46 | 2016-10-06 21:17:59 +0300 | [diff] [blame] | 334 | usage: funclatency [-h] [-p PID] [-i INTERVAL] [-T] [-u] [-m] [-F] [-r] [-v] |
Brendan Gregg | 50bbca4 | 2015-09-25 12:47:53 -0700 | [diff] [blame] | 335 | pattern |
Brendan Gregg | 74016c3 | 2015-09-21 15:49:21 -0700 | [diff] [blame] | 336 | |
Sasha Goldshtein | a466c46 | 2016-10-06 21:17:59 +0300 | [diff] [blame] | 337 | Time functions and print latency as a histogram |
Brendan Gregg | 74016c3 | 2015-09-21 15:49:21 -0700 | [diff] [blame] | 338 | |
| 339 | positional arguments: |
Sasha Goldshtein | a466c46 | 2016-10-06 21:17:59 +0300 | [diff] [blame] | 340 | pattern search expression for functions |
Brendan Gregg | 74016c3 | 2015-09-21 15:49:21 -0700 | [diff] [blame] | 341 | |
| 342 | optional arguments: |
| 343 | -h, --help show this help message and exit |
| 344 | -p PID, --pid PID trace this PID only |
| 345 | -i INTERVAL, --interval INTERVAL |
Akilesh Kailash | 8996719 | 2018-05-18 13:36:54 -0700 | [diff] [blame] | 346 | summary interval, in seconds |
| 347 | -d DURATION, --duration DURATION |
| 348 | total duration of trace, in seconds |
Brendan Gregg | 74016c3 | 2015-09-21 15:49:21 -0700 | [diff] [blame] | 349 | -T, --timestamp include timestamp on output |
| 350 | -u, --microseconds microsecond histogram |
| 351 | -m, --milliseconds millisecond histogram |
Brendan Gregg | 50bbca4 | 2015-09-25 12:47:53 -0700 | [diff] [blame] | 352 | -F, --function show a separate histogram per function |
Brendan Gregg | 74016c3 | 2015-09-21 15:49:21 -0700 | [diff] [blame] | 353 | -r, --regexp use regular expressions. Default is "*" wildcards |
| 354 | only. |
Sasha Goldshtein | a466c46 | 2016-10-06 21:17:59 +0300 | [diff] [blame] | 355 | -v, --verbose print the BPF program (for debugging purposes) |
Brendan Gregg | 74016c3 | 2015-09-21 15:49:21 -0700 | [diff] [blame] | 356 | |
| 357 | examples: |
Alex Bagehot | 3b9679a | 2016-02-06 16:01:02 +0000 | [diff] [blame] | 358 | ./funclatency do_sys_open # time the do_sys_open() kernel function |
Sasha Goldshtein | a466c46 | 2016-10-06 21:17:59 +0300 | [diff] [blame] | 359 | ./funclatency c:read # time the read() C library function |
Brendan Gregg | 74016c3 | 2015-09-21 15:49:21 -0700 | [diff] [blame] | 360 | ./funclatency -u vfs_read # time vfs_read(), in microseconds |
| 361 | ./funclatency -m do_nanosleep # time do_nanosleep(), in milliseconds |
Akilesh Kailash | 8996719 | 2018-05-18 13:36:54 -0700 | [diff] [blame] | 362 | ./funclatency -i 2 -d 10 c:open # output every 2 seconds, for duration 10s |
Brendan Gregg | 74016c3 | 2015-09-21 15:49:21 -0700 | [diff] [blame] | 363 | ./funclatency -mTi 5 vfs_read # output every 5 seconds, with timestamps |
| 364 | ./funclatency -p 181 vfs_read # time process 181 only |
| 365 | ./funclatency 'vfs_fstat*' # time both vfs_fstat() and vfs_fstatat() |
Sasha Goldshtein | a466c46 | 2016-10-06 21:17:59 +0300 | [diff] [blame] | 366 | ./funclatency 'c:*printf' # time the *printf family of functions |
Brendan Gregg | 50bbca4 | 2015-09-25 12:47:53 -0700 | [diff] [blame] | 367 | ./funclatency -F 'vfs_r*' # show one histogram per matched function |