Erwan Velu | 8c12794 | 2020-05-20 20:25:32 +0200

Demonstrations of dirtop, the Linux eBPF/bcc version.


dirtop shows reads and writes by directory. For example:

# ./dirtop.py -d '/hdfs/uuid/*/yarn'
Tracing... Output every 1 secs. Hit Ctrl-C to end

14:28:12 loadavg: 25.00 22.85 21.22 31/2921 66450

READS  WRITES R_Kb     W_Kb     PATH
1030   2852   8        147341   /hdfs/uuid/c11da291-28de-4a77-873e-44bb452d238b/yarn
3308   2459   10980    24893    /hdfs/uuid/bf829d08-1455-45b8-81fa-05c3303e8c45/yarn
2227   7165   6484     11157    /hdfs/uuid/76dc0b77-e2fd-4476-818f-2b5c3c452396/yarn
1985   9576   6431     6616     /hdfs/uuid/99c178d5-a209-4af2-8467-7382c7f03c1b/yarn
1986   398    6474     6486     /hdfs/uuid/7d512fe7-b20d-464c-a75a-dbf8b687ee1c/yarn
764    3685   5        7069     /hdfs/uuid/250b21c8-1714-45fe-8c08-d45d0271c6bd/yarn
432    1603   259      6402     /hdfs/uuid/4a833770-767e-43b3-b696-dc98901bce26/yarn
993    5856   320      129      /hdfs/uuid/b94cbf3f-76b1-4ced-9043-02d450b9887c/yarn
612    5645   4        249      /hdfs/uuid/8138a53b-b942-44d3-82df-51575f1a3901/yarn
818    21     6        166      /hdfs/uuid/fada8004-53ff-48df-9396-165d8e42925b/yarn
174    23     1        171      /hdfs/uuid/d04fccd8-bc72-4ed9-bda4-c5b6893f1405/yarn
376    6281   2        97       /hdfs/uuid/0cc3683f-4800-4c73-8075-8d77dc7cf116/yarn
370    4588   2        96       /hdfs/uuid/a78f846a-58c4-4d10-a9f5-42f16a6134a0/yarn
190    6420   1        86       /hdfs/uuid/2c6a7223-cb18-4916-a1b6-8cd02bda1d31/yarn
178    123    1        17       /hdfs/uuid/b3b2a2ed-f6c1-4641-86bf-2989dd932411/yarn
[...]

This shows the directories read and written while Hadoop runs.
The usage message lists "all" as the default sort column, and the rows above
are in descending order of combined read and write size (R_Kb + W_Kb).
The sort column can be changed with the -s option.
This instruments the VFS interface, so the counts include reads and writes that
may be served entirely from the file system cache (page cache).
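
To illustrate the row ordering seen in the sample output, here is a minimal
sketch that re-sorts a few of the rows above. The names Row and sort_key are
hypothetical and are not taken from dirtop itself. Treating "all" as
R_Kb + W_Kb is an assumption that merely agrees with the sample; the tool may
compute its key differently.

```python
from collections import namedtuple

# Hypothetical record type for one output row; not part of dirtop.
Row = namedtuple("Row", "reads writes r_kb w_kb path")

# Four rows copied from the first sample output, deliberately out of order.
rows = [
    Row(1985, 9576, 6431, 6616, "/hdfs/uuid/99c178d5-a209-4af2-8467-7382c7f03c1b/yarn"),
    Row(1030, 2852, 8, 147341, "/hdfs/uuid/c11da291-28de-4a77-873e-44bb452d238b/yarn"),
    Row(2227, 7165, 6484, 11157, "/hdfs/uuid/76dc0b77-e2fd-4476-818f-2b5c3c452396/yarn"),
    Row(3308, 2459, 10980, 24893, "/hdfs/uuid/bf829d08-1455-45b8-81fa-05c3303e8c45/yarn"),
]


def sort_key(column):
    """Return a key function for the given column name."""
    if column == "all":
        # Assumption: "all" is approximated here as combined Kbytes.
        return lambda row: row.r_kb + row.w_kb
    return lambda row: getattr(row, column)


by_total = sorted(rows, key=sort_key("all"), reverse=True)
by_reads = sorted(rows, key=sort_key("reads"), reverse=True)

for row in by_total:
    print(f"{row.reads:<6} {row.writes:<6} {row.r_kb:<8} {row.w_kb:<8} {row.path}")
```

Sorted this way, the four rows come out in the same relative order as in the
sample output, while sorting by reads puts the bf829d08 directory first.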

While not printed, the average read size can be calculated by dividing R_Kb
by READS, and the average write size by dividing W_Kb by WRITES.
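
As a worked example of that division, the sketch below uses the second row of
the first sample output. The helper name average_kb is hypothetical. Because
the tool reports whole Kbytes, the result is only approximate, especially for
rows with small R_Kb or W_Kb values.

```python
def average_kb(total_kb, operations):
    """Return the mean size in Kbytes per operation, or 0.0 if there were none."""
    return total_kb / operations if operations else 0.0


# READS, WRITES, R_Kb, W_Kb copied from the bf829d08 row of the first sample.
reads, writes, r_kb, w_kb = 3308, 2459, 10980, 24893

avg_read_kb = average_kb(r_kb, reads)
avg_write_kb = average_kb(w_kb, writes)

print(f"average read size:  {avg_read_kb:.2f} Kb")
print(f"average write size: {avg_write_kb:.2f} Kb")
```

For this row the averages come to roughly 3.3 Kb per read and 10.1 Kb per
write.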

This script works by tracing the vfs_read() and vfs_write() functions using
kernel dynamic tracing, which instruments explicit read and write calls. If
files are read or written by other means (e.g., via mmap()), that I/O will
not be visible to this tool.
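
The difference between the two access styles can be sketched in user space.
The snippet below reads the same temporary file once with an explicit read
call and once through a memory mapping. The file name and payload are made up
for illustration, and the comments about which accesses dirtop would count
restate the paragraph above; the snippet itself does not measure anything in
the kernel.

```python
import mmap
import os
import tempfile

payload = b"dirtop example payload\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.dat")

    # Explicit write call: the kind of I/O the text says is traced.
    with open(path, "wb") as f:
        f.write(payload)

    # Explicit read call: also the kind of I/O the text says is traced.
    with open(path, "rb") as f:
        via_read = f.read()

    # Memory-mapped access: the data is fetched through the mapping rather
    # than a read call, so per the text it would not show up in dirtop.
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            via_mmap = mapped[:]

print(via_read == via_mmap)
```

Both variables end up holding the same bytes; only the path the data takes
differs.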

This should be useful for file system workload characterization when analyzing
the performance of applications.

Note that VFS-level reads and writes can be very frequent, and this tool can
begin to cost measurable overhead at high I/O rates.


The -C option stops the screen from being cleared between updates, and -r with
a number restricts the output to that many rows (20 by default). For example,
not clearing the screen and showing the top 5 only:

# ./dirtop -d '/hdfs/uuid/*/yarn' -Cr 5
Tracing... Output every 1 secs. Hit Ctrl-C to end

14:29:08 loadavg: 25.66 23.42 21.51 17/2850 67167

READS  WRITES R_Kb     W_Kb     PATH
100    8429   0        48243    /hdfs/uuid/b94cbf3f-76b1-4ced-9043-02d450b9887c/yarn
2066   4091   8176     26457    /hdfs/uuid/d04fccd8-bc72-4ed9-bda4-c5b6893f1405/yarn
10     2043   0        8172     /hdfs/uuid/b3b2a2ed-f6c1-4641-86bf-2989dd932411/yarn
38     1368   0        2652     /hdfs/uuid/a78f846a-58c4-4d10-a9f5-42f16a6134a0/yarn
86     19     0        123      /hdfs/uuid/c11da291-28de-4a77-873e-44bb452d238b/yarn

14:29:09 loadavg: 25.66 23.42 21.51 15/2849 67170

READS  WRITES R_Kb     W_Kb     PATH
1204   5619   4388     33767    /hdfs/uuid/b94cbf3f-76b1-4ced-9043-02d450b9887c/yarn
2208   3511   8744     22992    /hdfs/uuid/d04fccd8-bc72-4ed9-bda4-c5b6893f1405/yarn
62     4010   0        21181    /hdfs/uuid/8138a53b-b942-44d3-82df-51575f1a3901/yarn
22     2187   0        8748     /hdfs/uuid/b3b2a2ed-f6c1-4641-86bf-2989dd932411/yarn
74     1097   0        4388     /hdfs/uuid/4a833770-767e-43b3-b696-dc98901bce26/yarn

[...]



USAGE message:

# ./dirtop.py -h
usage: dirtop.py [-h] [-C] [-r MAXROWS] [-s {all,reads,writes,rbytes,wbytes}]
                 [-p PID] -d ROOTDIRS
                 [interval] [count]

File reads and writes by process

positional arguments:
  interval              output interval, in seconds
  count                 number of outputs

optional arguments:
  -h, --help            show this help message and exit
  -C, --noclear         don't clear the screen
  -r MAXROWS, --maxrows MAXROWS
                        maximum rows to print, default 20
  -s {all,reads,writes,rbytes,wbytes}, --sort {all,reads,writes,rbytes,wbytes}
                        sort column, default all
  -p PID, --pid PID     trace this PID only
  -d ROOTDIRS, --root-directories ROOTDIRS
                        select the directories to observe, separated by commas

examples:
    ./dirtop -d '/hdfs/uuid/*/yarn'                    # directory I/O top, 1 second refresh
    ./dirtop -d '/hdfs/uuid/*/yarn' -C                 # don't clear the screen
    ./dirtop -d '/hdfs/uuid/*/yarn' 5                  # 5 second summaries
    ./dirtop -d '/hdfs/uuid/*/yarn' 5 10               # 5 second summaries, 10 times only
    ./dirtop -d '/hdfs/uuid/*/yarn,/hdfs/uuid/*/data'  # Running dirtop on two set of directories
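

The -d argument in the examples above combines comma separation with shell-style
wildcards. The sketch below shows one plausible way such a value could be
expanded into concrete directories. The function name expand_root_directories
is hypothetical, and the use of Python's glob module is an assumption about the
approach, not a description of dirtop's actual code. The directory layout is
created in a temporary location purely for the demonstration.

```python
import glob
import os
import tempfile


def expand_root_directories(rootdirs):
    """Split a comma-separated list of glob patterns and return matching directories."""
    matches = []
    for pattern in rootdirs.split(","):
        # Keep directories only; sort each pattern's matches for stable output.
        matches.extend(sorted(p for p in glob.glob(pattern) if os.path.isdir(p)))
    return matches


with tempfile.TemporaryDirectory() as base:
    # Made-up layout that mimics the '<uuid>/yarn' and '<uuid>/data' shape.
    for name in ("a/yarn", "b/yarn", "a/data"):
        os.makedirs(os.path.join(base, name))

    spec = f"{base}/*/yarn,{base}/*/data"
    expanded = expand_root_directories(spec)
    relative = [os.path.relpath(p, base) for p in expanded]

print(relative)
```

Each pattern is expanded independently, so the yarn matches are listed before
the data matches, in the order the patterns were given.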