Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 1 | Demonstrations of argdist. |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 2 | |
| 3 | |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 4 | argdist probes functions you specify and collects parameter values into a |
| 5 | histogram or a frequency count. This can be used to understand the distribution |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 6 | of values a certain parameter takes, filter and print interesting parameters |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 7 | without attaching a debugger, and obtain general execution statistics on |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 8 | various functions. |
| 9 | |
| 10 | For example, suppose you want to find what allocation sizes are common in |
| 11 | your application: |
| 12 | |
Sasha Goldshtein | d2f4762 | 2016-10-04 18:40:15 +0300 | [diff] [blame] | 13 | # ./argdist -p 2420 -c -C 'p:c:malloc(size_t size):size_t:size' |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 14 | [01:42:29] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 15 | p:c:malloc(size_t size):size_t:size |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 16 | COUNT EVENT |
| 17 | [01:42:30] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 18 | p:c:malloc(size_t size):size_t:size |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 19 | COUNT EVENT |
| 20 | [01:42:31] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 21 | p:c:malloc(size_t size):size_t:size |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 22 | COUNT EVENT |
| 23 | 1 size = 16 |
| 24 | [01:42:32] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 25 | p:c:malloc(size_t size):size_t:size |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 26 | COUNT EVENT |
| 27 | 2 size = 16 |
| 28 | [01:42:33] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 29 | p:c:malloc(size_t size):size_t:size |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 30 | COUNT EVENT |
| 31 | 3 size = 16 |
| 32 | [01:42:34] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 33 | p:c:malloc(size_t size):size_t:size |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 34 | COUNT EVENT |
| 35 | 4 size = 16 |
| 36 | ^C |
| 37 | |
| 38 | It seems that the application is allocating blocks of size 16. The COUNT |
| 39 | column contains the number of occurrences of a particular event, and the |
| 40 | EVENT column describes the event. In this case, the "size" parameter was |
| 41 | probed and its value was 16, repeatedly. |
| 42 | |
| 43 | Now, suppose you wanted a histogram of buffer sizes passed to the write() |
| 44 | function across the system: |
| 45 | |
Sasha Goldshtein | d2f4762 | 2016-10-04 18:40:15 +0300 | [diff] [blame] | 46 | # ./argdist -c -H 'p:c:write(int fd, void *buf, size_t len):size_t:len' |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 47 | [01:45:22] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 48 | p:c:write(int fd, void *buf, size_t len):size_t:len |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 49 | len : count distribution |
| 50 | 0 -> 1 : 0 | | |
| 51 | 2 -> 3 : 2 |************* | |
| 52 | 4 -> 7 : 0 | | |
| 53 | 8 -> 15 : 2 |************* | |
| 54 | 16 -> 31 : 0 | | |
| 55 | 32 -> 63 : 6 |****************************************| |
| 56 | [01:45:23] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 57 | p:c:write(int fd, void *buf, size_t len):size_t:len |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 58 | len : count distribution |
| 59 | 0 -> 1 : 0 | | |
| 60 | 2 -> 3 : 11 |*************** | |
| 61 | 4 -> 7 : 0 | | |
| 62 | 8 -> 15 : 4 |***** | |
| 63 | 16 -> 31 : 0 | | |
| 64 | 32 -> 63 : 28 |****************************************| |
| 65 | 64 -> 127 : 12 |***************** | |
| 66 | [01:45:24] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 67 | p:c:write(int fd, void *buf, size_t len):size_t:len |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 68 | len : count distribution |
| 69 | 0 -> 1 : 0 | | |
| 70 | 2 -> 3 : 21 |**************** | |
| 71 | 4 -> 7 : 0 | | |
| 72 | 8 -> 15 : 6 |**** | |
| 73 | 16 -> 31 : 0 | | |
| 74 | 32 -> 63 : 52 |****************************************| |
| 75 | 64 -> 127 : 26 |******************** | |
| 76 | ^C |
| 77 | |
| 78 | It seems that most writes fall into three buckets: very small writes of 2-3 |
| 79 | bytes, medium writes of 32-63 bytes, and larger writes of 64-127 bytes. |
| 80 | |
| 81 | But these are writes across the board -- what if you wanted to focus on writes |
| 82 | to STDOUT? |
| 83 | |
Sasha Goldshtein | d2f4762 | 2016-10-04 18:40:15 +0300 | [diff] [blame] | 84 | # ./argdist -c -H 'p:c:write(int fd, void *buf, size_t len):size_t:len:fd==1' |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 85 | [01:47:17] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 86 | p:c:write(int fd, void *buf, size_t len):size_t:len:fd==1 |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 87 | len : count distribution |
| 88 | 0 -> 1 : 0 | | |
| 89 | 2 -> 3 : 0 | | |
| 90 | 4 -> 7 : 0 | | |
| 91 | 8 -> 15 : 1 |****************************************| |
| 92 | 16 -> 31 : 0 | | |
| 93 | 32 -> 63 : 1 |****************************************| |
| 94 | [01:47:18] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 95 | p:c:write(int fd, void *buf, size_t len):size_t:len:fd==1 |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 96 | len : count distribution |
| 97 | 0 -> 1 : 0 | | |
| 98 | 2 -> 3 : 0 | | |
| 99 | 4 -> 7 : 0 | | |
| 100 | 8 -> 15 : 2 |************* | |
| 101 | 16 -> 31 : 0 | | |
| 102 | 32 -> 63 : 3 |******************** | |
| 103 | 64 -> 127 : 6 |****************************************| |
| 104 | [01:47:19] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 105 | p:c:write(int fd, void *buf, size_t len):size_t:len:fd==1 |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 106 | len : count distribution |
| 107 | 0 -> 1 : 0 | | |
| 108 | 2 -> 3 : 0 | | |
| 109 | 4 -> 7 : 0 | | |
| 110 | 8 -> 15 : 3 |********* | |
| 111 | 16 -> 31 : 0 | | |
| 112 | 32 -> 63 : 5 |*************** | |
| 113 | 64 -> 127 : 13 |****************************************| |
| 114 | ^C |
| 115 | |
| 116 | The "fd==1" part is a filter that is applied to every invocation of write(). |
| 117 | Only if the filter condition is true, the value is recorded. |
| 118 | |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 119 | You can also use argdist to trace kernel functions. For example, suppose you |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 120 | wanted a histogram of kernel allocation (kmalloc) sizes across the system, |
| 121 | printed twice with 3 second intervals: |
| 122 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 123 | # ./argdist -i 3 -n 2 -H 'p::__kmalloc(size_t size):size_t:size' |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 124 | [01:50:00] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 125 | p::__kmalloc(size_t size):size_t:size |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 126 | size : count distribution |
| 127 | 0 -> 1 : 0 | | |
| 128 | 2 -> 3 : 0 | | |
| 129 | 4 -> 7 : 0 | | |
| 130 | 8 -> 15 : 6 |****************************************| |
| 131 | [01:50:03] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 132 | p::__kmalloc(size_t size):size_t:size |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 133 | size : count distribution |
| 134 | 0 -> 1 : 0 | | |
| 135 | 2 -> 3 : 0 | | |
| 136 | 4 -> 7 : 0 | | |
| 137 | 8 -> 15 : 22 |****************************************| |
| 138 | 16 -> 31 : 0 | | |
| 139 | 32 -> 63 : 0 | | |
| 140 | 64 -> 127 : 5 |********* | |
| 141 | 128 -> 255 : 2 |*** | |
| 142 | |
| 143 | Occasionally, numeric information isn't enough and you want to capture strings. |
| 144 | What are the strings printed by puts() across the system? |
| 145 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 146 | # ./argdist -i 10 -n 1 -C 'p:c:puts(char *str):char*:str' |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 147 | [01:53:54] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 148 | p:c:puts(char *str):char*:str |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 149 | COUNT EVENT |
| 150 | 2 str = Press ENTER to start. |
| 151 | |
| 152 | It looks like the message "Press ENTER to start." was printed twice during the |
| 153 | 10 seconds we were tracing. |
| 154 | |
Sasha Goldshtein | 8058eb1 | 2016-02-11 02:17:22 -0800 | [diff] [blame] | 155 | What about reads? You could trace gets() across the system and print the |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 156 | strings input by the user (note how "r" is used instead of "p" to attach a |
| 157 | probe to the function's return): |
Sasha Goldshtein | 8058eb1 | 2016-02-11 02:17:22 -0800 | [diff] [blame] | 158 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 159 | # ./argdist -i 10 -n 1 -C 'r:c:gets():char*:(char*)$retval:$retval!=0' |
Sasha Goldshtein | 8058eb1 | 2016-02-11 02:17:22 -0800 | [diff] [blame] | 160 | [02:12:23] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 161 | r:c:gets():char*:$retval:$retval!=0 |
Sasha Goldshtein | 8058eb1 | 2016-02-11 02:17:22 -0800 | [diff] [blame] | 162 | COUNT EVENT |
Sasha Goldshtein | cc27edf | 2016-02-14 03:49:01 -0800 | [diff] [blame] | 163 | 1 (char*)$retval = hi there |
| 164 | 3 (char*)$retval = sasha |
| 165 | 8 (char*)$retval = hello |
Sasha Goldshtein | 8058eb1 | 2016-02-11 02:17:22 -0800 | [diff] [blame] | 166 | |
| 167 | Similarly, we could get a histogram of the error codes returned by read(): |
| 168 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 169 | # ./argdist -i 10 -c 1 -H 'r:c:read()' |
Sasha Goldshtein | 8058eb1 | 2016-02-11 02:17:22 -0800 | [diff] [blame] | 170 | [02:15:36] |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 171 | r:c:read() |
| 172 | retval : count distribution |
Sasha Goldshtein | 8058eb1 | 2016-02-11 02:17:22 -0800 | [diff] [blame] | 173 | 0 -> 1 : 29 |****************************************| |
| 174 | 2 -> 3 : 11 |*************** | |
| 175 | 4 -> 7 : 0 | | |
| 176 | 8 -> 15 : 3 |**** | |
| 177 | 16 -> 31 : 2 |** | |
| 178 | 32 -> 63 : 22 |****************************** | |
| 179 | 64 -> 127 : 5 |****** | |
| 180 | 128 -> 255 : 0 | | |
| 181 | 256 -> 511 : 1 |* | |
| 182 | 512 -> 1023 : 1 |* | |
| 183 | 1024 -> 2047 : 0 | | |
| 184 | 2048 -> 4095 : 2 |** | |
| 185 | |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 186 | In return probes, you can also trace the latency of the function (unless it is |
| 187 | recursive) and the parameters it had on entry. For example, we can identify |
| 188 | which processes are performing slow synchronous filesystem reads -- say, |
| 189 | longer than 0.1ms (100,000ns): |
| 190 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 191 | # ./argdist -C 'r::__vfs_read():u32:$PID:$latency > 100000' |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 192 | [01:08:48] |
| 193 | r::__vfs_read():u32:$PID:$latency > 100000 |
| 194 | COUNT EVENT |
Sasha Goldshtein | cc27edf | 2016-02-14 03:49:01 -0800 | [diff] [blame] | 195 | 1 $PID = 10457 |
| 196 | 21 $PID = 2780 |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 197 | [01:08:49] |
| 198 | r::__vfs_read():u32:$PID:$latency > 100000 |
| 199 | COUNT EVENT |
Sasha Goldshtein | cc27edf | 2016-02-14 03:49:01 -0800 | [diff] [blame] | 200 | 1 $PID = 10457 |
| 201 | 21 $PID = 2780 |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 202 | ^C |
| 203 | |
Sasha Goldshtein | cc27edf | 2016-02-14 03:49:01 -0800 | [diff] [blame] | 204 | It looks like process 2780 performed 21 slow reads. |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 205 | |
| 206 | Occasionally, entry parameter values are also interesting. For example, you |
| 207 | might be curious how long it takes malloc() to allocate memory -- nanoseconds |
| 208 | per byte allocated. Let's go: |
| 209 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 210 | # ./argdist -H 'r:c:malloc(size_t size):u64:$latency/$entry(size);ns per byte' -n 1 -i 10 |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 211 | [01:11:13] |
| 212 | ns per byte : count distribution |
| 213 | 0 -> 1 : 0 | | |
| 214 | 2 -> 3 : 4 |***************** | |
| 215 | 4 -> 7 : 3 |************* | |
| 216 | 8 -> 15 : 2 |******** | |
| 217 | 16 -> 31 : 1 |**** | |
| 218 | 32 -> 63 : 0 | | |
| 219 | 64 -> 127 : 7 |******************************* | |
| 220 | 128 -> 255 : 1 |**** | |
| 221 | 256 -> 511 : 0 | | |
| 222 | 512 -> 1023 : 1 |**** | |
| 223 | 1024 -> 2047 : 1 |**** | |
| 224 | 2048 -> 4095 : 9 |****************************************| |
| 225 | 4096 -> 8191 : 1 |**** | |
| 226 | |
| 227 | It looks like a tri-modal distribution. Some allocations are extremely cheap, |
| 228 | and take 2-15 nanoseconds per byte. Other allocations are slower, and take |
| 229 | 64-127 nanoseconds per byte. And some allocations are slower still, and take |
| 230 | multiple microseconds per byte. |
| 231 | |
Sasha Goldshtein | cc27edf | 2016-02-14 03:49:01 -0800 | [diff] [blame] | 232 | You could also group results by more than one field. For example, __kmalloc |
| 233 | takes an additional flags parameter that describes how to allocate memory: |
| 234 | |
Sasha Goldshtein | d2f4762 | 2016-10-04 18:40:15 +0300 | [diff] [blame] | 235 | # ./argdist -c -C 'p::__kmalloc(size_t size, gfp_t flags):gfp_t,size_t:flags,size' |
Sasha Goldshtein | cc27edf | 2016-02-14 03:49:01 -0800 | [diff] [blame] | 236 | [03:42:29] |
| 237 | p::__kmalloc(size_t size, gfp_t flags):gfp_t,size_t:flags,size |
| 238 | COUNT EVENT |
| 239 | 1 flags = 16, size = 152 |
| 240 | 2 flags = 131280, size = 8 |
| 241 | 7 flags = 131280, size = 16 |
| 242 | [03:42:30] |
| 243 | p::__kmalloc(size_t size, gfp_t flags):gfp_t,size_t:flags,size |
| 244 | COUNT EVENT |
| 245 | 1 flags = 16, size = 152 |
| 246 | 6 flags = 131280, size = 8 |
| 247 | 19 flags = 131280, size = 16 |
| 248 | [03:42:31] |
| 249 | p::__kmalloc(size_t size, gfp_t flags):gfp_t,size_t:flags,size |
| 250 | COUNT EVENT |
| 251 | 2 flags = 16, size = 152 |
| 252 | 10 flags = 131280, size = 8 |
| 253 | 31 flags = 131280, size = 16 |
| 254 | [03:42:32] |
| 255 | p::__kmalloc(size_t size, gfp_t flags):gfp_t,size_t:flags,size |
| 256 | COUNT EVENT |
| 257 | 2 flags = 16, size = 152 |
| 258 | 14 flags = 131280, size = 8 |
| 259 | 43 flags = 131280, size = 16 |
| 260 | ^C |
| 261 | |
| 262 | The flags value must be expanded by hand, but it's still helpful to eliminate |
| 263 | certain kinds of allocations or visually group them together. |
| 264 | |
Sasha Goldshtein | fd60d55 | 2016-03-01 12:15:34 -0800 | [diff] [blame] | 265 | argdist also has basic support for kernel tracepoints. It is sometimes more |
| 266 | convenient to use tracepoints because they are documented and don't vary a lot |
Sasha Goldshtein | 376ae5c | 2016-10-04 19:49:57 +0300 | [diff] [blame] | 267 | between kernel versions. For example, let's trace the net:net_dev_start_xmit |
| 268 | tracepoint and print out the protocol field from the tracepoint structure: |
Sasha Goldshtein | fd60d55 | 2016-03-01 12:15:34 -0800 | [diff] [blame] | 269 | |
Sasha Goldshtein | 376ae5c | 2016-10-04 19:49:57 +0300 | [diff] [blame] | 270 | # argdist -C 't:net:net_dev_start_xmit():u16:args->protocol' |
| 271 | [13:01:49] |
| 272 | t:net:net_dev_start_xmit():u16:args->protocol |
Sasha Goldshtein | fd60d55 | 2016-03-01 12:15:34 -0800 | [diff] [blame] | 273 | COUNT EVENT |
Sasha Goldshtein | 376ae5c | 2016-10-04 19:49:57 +0300 | [diff] [blame] | 274 | 8 args->protocol = 2048 |
| 275 | ^C |
Sasha Goldshtein | fd60d55 | 2016-03-01 12:15:34 -0800 | [diff] [blame] | 276 | |
Sasha Goldshtein | 376ae5c | 2016-10-04 19:49:57 +0300 | [diff] [blame] | 277 | Note that to discover the format of the net:net_dev_start_xmit tracepoint, you |
| 278 | use the tplist tool (tplist -v net:net_dev_start_xmit). |
Sasha Goldshtein | fd60d55 | 2016-03-01 12:15:34 -0800 | [diff] [blame] | 279 | |
Sasha Goldshtein | c8f752f | 2016-10-17 02:18:43 -0700 | [diff] [blame] | 280 | |
| 281 | Occasionally, it is useful to filter certain expressions by string. This is not |
| 282 | trivially supported by BPF, but argdist provides a STRCMP helper you can use in |
| 283 | filter expressions. For example, to get a histogram of latencies opening a |
| 284 | specific file, run this: |
| 285 | |
| 286 | # argdist -c -H 'r:c:open(char *file):u64:$latency/1000:STRCMP("test.txt",$entry(file))' |
| 287 | [02:16:38] |
| 288 | [02:16:39] |
| 289 | [02:16:40] |
| 290 | $latency/1000 : count distribution |
| 291 | 0 -> 1 : 0 | | |
| 292 | 2 -> 3 : 0 | | |
| 293 | 4 -> 7 : 0 | | |
| 294 | 8 -> 15 : 0 | | |
| 295 | 16 -> 31 : 2 |****************************************| |
| 296 | [02:16:41] |
| 297 | $latency/1000 : count distribution |
| 298 | 0 -> 1 : 0 | | |
| 299 | 2 -> 3 : 0 | | |
| 300 | 4 -> 7 : 0 | | |
| 301 | 8 -> 15 : 1 |********** | |
| 302 | 16 -> 31 : 4 |****************************************| |
| 303 | [02:16:42] |
| 304 | $latency/1000 : count distribution |
| 305 | 0 -> 1 : 0 | | |
| 306 | 2 -> 3 : 0 | | |
| 307 | 4 -> 7 : 0 | | |
| 308 | 8 -> 15 : 1 |******** | |
| 309 | 16 -> 31 : 5 |****************************************| |
| 310 | [02:16:43] |
| 311 | $latency/1000 : count distribution |
| 312 | 0 -> 1 : 0 | | |
| 313 | 2 -> 3 : 0 | | |
| 314 | 4 -> 7 : 0 | | |
| 315 | 8 -> 15 : 1 |******** | |
| 316 | 16 -> 31 : 5 |****************************************| |
| 317 | |
| 318 | |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 319 | Here's a final example that finds how many write() system calls are performed |
| 320 | by each process on the system: |
| 321 | |
Sasha Goldshtein | d2f4762 | 2016-10-04 18:40:15 +0300 | [diff] [blame] | 322 | # argdist -c -C 'p:c:write():int:$PID;write per process' -n 2 |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 323 | [06:47:18] |
| 324 | write by process |
| 325 | COUNT EVENT |
Sasha Goldshtein | cc27edf | 2016-02-14 03:49:01 -0800 | [diff] [blame] | 326 | 3 $PID = 8889 |
| 327 | 7 $PID = 7615 |
| 328 | 7 $PID = 2480 |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 329 | [06:47:19] |
| 330 | write by process |
| 331 | COUNT EVENT |
Sasha Goldshtein | cc27edf | 2016-02-14 03:49:01 -0800 | [diff] [blame] | 332 | 9 $PID = 8889 |
| 333 | 23 $PID = 7615 |
| 334 | 23 $PID = 2480 |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 335 | |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 336 | |
| 337 | USAGE message: |
| 338 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 339 | # argdist -h |
| 340 | usage: argdist [-h] [-p PID] [-z STRING_SIZE] [-i INTERVAL] [-n COUNT] [-v] |
Sasha Goldshtein | 4725a72 | 2016-10-18 20:54:47 +0300 | [diff] [blame] | 341 | [-c] [-T TOP] [-H specifier] [-C[specifier] [-I header] |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 342 | |
| 343 | Trace a function and display a summary of its parameter values. |
| 344 | |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 345 | optional arguments: |
| 346 | -h, --help show this help message and exit |
| 347 | -p PID, --pid PID id of the process to trace (optional) |
| 348 | -z STRING_SIZE, --string-size STRING_SIZE |
| 349 | maximum string size to read from char* arguments |
Sasha Goldshtein | 0a9783b | 2016-02-11 11:21:01 -0800 | [diff] [blame] | 350 | -i INTERVAL, --interval INTERVAL |
Akilesh Kailash | 8996719 | 2018-05-18 13:36:54 -0700 | [diff] [blame] | 351 | output interval, in seconds (default 1 second) |
| 352 | -d DURATION, --duration DURATION |
| 353 | total duration of trace, in seconds |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 354 | -n COUNT, --number COUNT |
Sasha Goldshtein | 0a9783b | 2016-02-11 11:21:01 -0800 | [diff] [blame] | 355 | number of outputs |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 356 | -v, --verbose print resulting BPF program code before executing |
Sasha Goldshtein | d2f4762 | 2016-10-04 18:40:15 +0300 | [diff] [blame] | 357 | -c, --cumulative do not clear histograms and freq counts at each interval |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 358 | -T TOP, --top TOP number of top results to show (not applicable to |
| 359 | histograms) |
Sasha Goldshtein | 4725a72 | 2016-10-18 20:54:47 +0300 | [diff] [blame] | 360 | -H specifier, --histogram specifier |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 361 | probe specifier to capture histogram of (see examples |
| 362 | below) |
Sasha Goldshtein | 4725a72 | 2016-10-18 20:54:47 +0300 | [diff] [blame] | 363 | -C specifier, --count specifier |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 364 | probe specifier to capture count of (see examples |
| 365 | below) |
Sasha Goldshtein | 4725a72 | 2016-10-18 20:54:47 +0300 | [diff] [blame] | 366 | -I header, --include header |
Sasha Goldshtein | 7983d6b | 2016-02-13 23:14:18 -0800 | [diff] [blame] | 367 | additional header files to include in the BPF program |
Yonghong Song | f4470dc | 2017-12-13 14:12:13 -0800 | [diff] [blame] | 368 | as either full path, or relative to current working directory, |
| 369 | or relative to default kernel header search path |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 370 | |
| 371 | Probe specifier syntax: |
Sasha Goldshtein | 3e39a08 | 2016-03-24 08:39:47 -0700 | [diff] [blame] | 372 | {p,r,t,u}:{[library],category}:function(signature)[:type[,type...]:expr[,expr...][:filter]][#label] |
Sasha Goldshtein | 8058eb1 | 2016-02-11 02:17:22 -0800 | [diff] [blame] | 373 | Where: |
Sasha Goldshtein | 3e39a08 | 2016-03-24 08:39:47 -0700 | [diff] [blame] | 374 | p,r,t,u -- probe at function entry, function exit, kernel tracepoint, |
| 375 | or USDT probe |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 376 | in exit probes: can use $retval, $entry(param), $latency |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 377 | library -- the library that contains the function |
| 378 | (leave empty for kernel functions) |
Sasha Goldshtein | fd60d55 | 2016-03-01 12:15:34 -0800 | [diff] [blame] | 379 | category -- the category of the kernel tracepoint (e.g. net, sched) |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 380 | signature -- the function's parameters, as in the C header |
Sasha Goldshtein | cc27edf | 2016-02-14 03:49:01 -0800 | [diff] [blame] | 381 | type -- the type of the expression to collect (supports multiple) |
| 382 | expr -- the expression to collect (supports multiple) |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 383 | filter -- the filter that is applied to collected values |
| 384 | label -- the label for this probe in the resulting output |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 385 | |
| 386 | EXAMPLES: |
| 387 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 388 | argdist -H 'p::__kmalloc(u64 size):u64:size' |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 389 | Print a histogram of allocation sizes passed to kmalloc |
| 390 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 391 | argdist -p 1005 -C 'p:c:malloc(size_t size):size_t:size:size==16' |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 392 | Print a frequency count of how many times process 1005 called malloc |
| 393 | with an allocation size of 16 bytes |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 394 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 395 | argdist -C 'r:c:gets():char*:$retval#snooped strings' |
Sasha Goldshtein | 8058eb1 | 2016-02-11 02:17:22 -0800 | [diff] [blame] | 396 | Snoop on all strings returned by gets() |
| 397 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 398 | argdist -H 'r::__kmalloc(size_t size):u64:$latency/$entry(size)#ns per byte' |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 399 | Print a histogram of nanoseconds per byte from kmalloc allocations |
| 400 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 401 | argdist -C 'p::__kmalloc(size_t size, gfp_t flags):size_t:size:flags&GFP_ATOMIC' |
Sasha Goldshtein | 7983d6b | 2016-02-13 23:14:18 -0800 | [diff] [blame] | 402 | Print frequency count of kmalloc allocation sizes that have GFP_ATOMIC |
| 403 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 404 | argdist -p 1005 -C 'p:c:write(int fd):int:fd' -T 5 |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 405 | Print frequency counts of how many times writes were issued to a |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 406 | particular file descriptor number, in process 1005, but only show |
| 407 | the top 5 busiest fds |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 408 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 409 | argdist -p 1005 -H 'r:c:read()' |
Sasha Goldshtein | 8058eb1 | 2016-02-11 02:17:22 -0800 | [diff] [blame] | 410 | Print a histogram of error codes returned by read() in process 1005 |
| 411 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 412 | argdist -C 'r::__vfs_read():u32:$PID:$latency > 100000' |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 413 | Print frequency of reads by process where the latency was >0.1ms |
| 414 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 415 | argdist -H 'r::__vfs_read(void *file, void *buf, size_t count):size_t:$entry(count):$latency > 1000000' |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 416 | Print a histogram of read sizes that were longer than 1ms |
| 417 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 418 | argdist -H \ |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 419 | 'p:c:write(int fd, const void *buf, size_t count):size_t:count:fd==1' |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 420 | Print a histogram of buffer sizes passed to write() across all |
| 421 | processes, where the file descriptor was 1 (STDOUT) |
| 422 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 423 | argdist -C 'p:c:fork()#fork calls' |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 424 | Count fork() calls in libc across all processes |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 425 | Can also use funccount.py, which is easier and more flexible |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 426 | |
Sasha Goldshtein | 376ae5c | 2016-10-04 19:49:57 +0300 | [diff] [blame] | 427 | argdist -H 't:block:block_rq_complete():u32:args->nr_sector' |
Sasha Goldshtein | fd60d55 | 2016-03-01 12:15:34 -0800 | [diff] [blame] | 428 | Print histogram of number of sectors in completing block I/O requests |
| 429 | |
Sasha Goldshtein | 376ae5c | 2016-10-04 19:49:57 +0300 | [diff] [blame] | 430 | argdist -C 't:irq:irq_handler_entry():int:args->irq' |
Sasha Goldshtein | fd60d55 | 2016-03-01 12:15:34 -0800 | [diff] [blame] | 431 | Aggregate interrupts by interrupt request (IRQ) |
| 432 | |
Sasha Goldshtein | 3e39a08 | 2016-03-24 08:39:47 -0700 | [diff] [blame] | 433 | argdist -C 'u:pthread:pthread_start():u64:arg2' -p 1337 |
| 434 | Print frequency of function addresses used as a pthread start function, |
| 435 | relying on the USDT pthread_start probe in process 1337 |
| 436 | |
Sasha Goldshtein | 4725a72 | 2016-10-18 20:54:47 +0300 | [diff] [blame] | 437 | argdist -H 'p:c:sleep(u32 seconds):u32:seconds' \ |
| 438 | -H 'p:c:nanosleep(struct timespec *req):long:req->tv_nsec' |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 439 | Print histograms of sleep() and nanosleep() parameter values |
| 440 | |
Sasha Goldshtein | 7df65da | 2016-02-14 05:12:27 -0800 | [diff] [blame] | 441 | argdist -p 2780 -z 120 \ |
Sasha Goldshtein | 5e4e1f4 | 2016-02-12 06:52:19 -0800 | [diff] [blame] | 442 | -C 'p:c:write(int fd, char* buf, size_t len):char*:buf:fd==1' |
Sasha Goldshtein | 4cd7063 | 2016-02-11 01:55:11 -0800 | [diff] [blame] | 443 | Spy on writes to STDOUT performed by process 2780, up to a string size |
Sasha Goldshtein | e350115 | 2016-02-13 03:56:29 -0800 | [diff] [blame] | 444 | of 120 characters |
Yonghong Song | f4470dc | 2017-12-13 14:12:13 -0800 | [diff] [blame] | 445 | |
| 446 | argdist -I 'kernel/sched/sched.h' \ |
| 447 | -C 'p::__account_cfs_rq_runtime(struct cfs_rq *cfs_rq):s64:cfs_rq->runtime_remaining' |
| 448 | Trace on the cfs scheduling runqueue remaining runtime. The struct cfs_rq is defined |
| 449 | in kernel/sched/sched.h which is in kernel source tree and not in kernel-devel |
| 450 | package. So this command needs to run at the kernel source tree root directory |
| 451 | so that the added header file can be found by the compiler. |