Demonstrations of dirtop, the Linux eBPF/bcc version.


dirtop shows reads and writes by directory. For example:

# ./dirtop.py -d '/hdfs/uuid/*/yarn'
Tracing... Output every 1 secs. Hit Ctrl-C to end

14:28:12 loadavg: 25.00 22.85 21.22 31/2921 66450

READS  WRITES R_Kb     W_Kb     PATH
1030   2852   8        147341   /hdfs/uuid/c11da291-28de-4a77-873e-44bb452d238b/yarn
3308   2459   10980    24893    /hdfs/uuid/bf829d08-1455-45b8-81fa-05c3303e8c45/yarn
2227   7165   6484     11157    /hdfs/uuid/76dc0b77-e2fd-4476-818f-2b5c3c452396/yarn
1985   9576   6431     6616     /hdfs/uuid/99c178d5-a209-4af2-8467-7382c7f03c1b/yarn
1986   398    6474     6486     /hdfs/uuid/7d512fe7-b20d-464c-a75a-dbf8b687ee1c/yarn
764    3685   5        7069     /hdfs/uuid/250b21c8-1714-45fe-8c08-d45d0271c6bd/yarn
432    1603   259      6402     /hdfs/uuid/4a833770-767e-43b3-b696-dc98901bce26/yarn
993    5856   320      129      /hdfs/uuid/b94cbf3f-76b1-4ced-9043-02d450b9887c/yarn
612    5645   4        249      /hdfs/uuid/8138a53b-b942-44d3-82df-51575f1a3901/yarn
818    21     6        166      /hdfs/uuid/fada8004-53ff-48df-9396-165d8e42925b/yarn
174    23     1        171      /hdfs/uuid/d04fccd8-bc72-4ed9-bda4-c5b6893f1405/yarn
376    6281   2        97       /hdfs/uuid/0cc3683f-4800-4c73-8075-8d77dc7cf116/yarn
370    4588   2        96       /hdfs/uuid/a78f846a-58c4-4d10-a9f5-42f16a6134a0/yarn
190    6420   1        86       /hdfs/uuid/2c6a7223-cb18-4916-a1b6-8cd02bda1d31/yarn
178    123    1        17       /hdfs/uuid/b3b2a2ed-f6c1-4641-86bf-2989dd932411/yarn
[...]

This shows the directories read and written while Hadoop runs.
By default the output is sorted by the "all" key (see the -s option in the
USAGE message below); in the sample above this orders the rows by combined
read and write size (R_Kb + W_Kb), not by R_Kb alone.
The sort order can be changed with the -s option.
This instruments the VFS interface, so these are reads and writes that
may be served entirely from the file system cache (page cache).

While not printed, the average read and write sizes can be calculated by
dividing R_Kb by READS and W_Kb by WRITES, respectively.

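As an illustration of that division, here is a minimal Python sketch that
parses one output row and derives the average sizes. The names Row, parse_row
and average_sizes are hypothetical helpers, not part of dirtop, and the sketch
assumes the five-column layout shown in the sample output above.

```python
from typing import NamedTuple, Tuple


class Row(NamedTuple):
    """One line of dirtop output (hypothetical helper type)."""
    reads: int
    writes: int
    r_kb: int
    w_kb: int
    path: str


def parse_row(line: str) -> Row:
    # Split on whitespace into at most five fields; the path is the last one.
    reads, writes, r_kb, w_kb, path = line.split(None, 4)
    return Row(int(reads), int(writes), int(r_kb), int(w_kb), path.strip())


def average_sizes(row: Row) -> Tuple[float, float]:
    # Average Kbytes per read and per write; 0.0 when there were no calls.
    avg_read = row.r_kb / row.reads if row.reads else 0.0
    avg_write = row.w_kb / row.writes if row.writes else 0.0
    return avg_read, avg_write


if __name__ == "__main__":
    sample = "1030   2852   8        147341   /hdfs/uuid/c11da291-28de-4a77-873e-44bb452d238b/yarn"
    avg_r, avg_w = average_sizes(parse_row(sample))
    print("avg read Kb: %.3f, avg write Kb: %.3f" % (avg_r, avg_w))
```

Because R_Kb and W_Kb are already rounded to whole Kbytes in the output, the
averages obtained this way are approximate for small I/O sizes.
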
This script works by tracing the vfs_read() and vfs_write() functions using
kernel dynamic tracing, which instruments explicit read and write calls. If
files are read or written by other means (e.g., via mmap()), that activity
will not be visible to this tool.

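The tracing approach described above can be sketched with the bcc Python
bindings. This is a simplified illustration, not dirtop's actual code: it only
counts calls to vfs_read() and vfs_write(), whereas the real tool also records
byte counts and filters by directory. The names BPF_TEXT, KPROBES and run are
assumptions made for this example. It needs bcc installed and root privileges.

```python
import time

# Assumed, simplified BPF program: one counter per probed function.
# Key 0 counts vfs_read() calls, key 1 counts vfs_write() calls.
BPF_TEXT = r"""
#include <uapi/linux/ptrace.h>

BPF_HASH(counts, u32, u64);

int trace_read(struct pt_regs *ctx) {
    u32 key = 0;
    counts.increment(key);
    return 0;
}

int trace_write(struct pt_regs *ctx) {
    u32 key = 1;
    counts.increment(key);
    return 0;
}
"""

# Kernel function -> BPF handler attached to it as a kprobe.
KPROBES = {"vfs_read": "trace_read", "vfs_write": "trace_write"}


def run(interval=1, iterations=5):
    """Attach the kprobes and print per-interval call counts."""
    from bcc import BPF  # imported here so the module loads without bcc

    b = BPF(text=BPF_TEXT)
    for event, fn_name in KPROBES.items():
        b.attach_kprobe(event=event, fn_name=fn_name)

    counts = b["counts"]
    for _ in range(iterations):
        time.sleep(interval)
        # Copy the counters out, then reset them for the next interval.
        snapshot = {k.value: v.value for k, v in counts.items()}
        counts.clear()
        print("reads=%d writes=%d" % (snapshot.get(0, 0), snapshot.get(1, 0)))
```

Calling run() as root prints one line per interval. I/O performed through
mmap() never reaches these two functions, which is why it is not counted.
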
This should be useful for file system workload characterization when analyzing
the performance of applications.

Note that VFS-level reads and writes can be very frequent, so this tool can
add measurable overhead at high I/O rates.


The -C option stops the screen from being cleared between updates, and -r with
a number restricts the output to that many rows (20 by default). For example,
to leave the screen uncleared and show only the top 5 rows:

# ./dirtop -d '/hdfs/uuid/*/yarn' -Cr 5
Tracing... Output every 1 secs. Hit Ctrl-C to end

14:29:08 loadavg: 25.66 23.42 21.51 17/2850 67167

READS  WRITES R_Kb     W_Kb     PATH
100    8429   0        48243    /hdfs/uuid/b94cbf3f-76b1-4ced-9043-02d450b9887c/yarn
2066   4091   8176     26457    /hdfs/uuid/d04fccd8-bc72-4ed9-bda4-c5b6893f1405/yarn
10     2043   0        8172     /hdfs/uuid/b3b2a2ed-f6c1-4641-86bf-2989dd932411/yarn
38     1368   0        2652     /hdfs/uuid/a78f846a-58c4-4d10-a9f5-42f16a6134a0/yarn
86     19     0        123      /hdfs/uuid/c11da291-28de-4a77-873e-44bb452d238b/yarn

14:29:09 loadavg: 25.66 23.42 21.51 15/2849 67170

READS  WRITES R_Kb     W_Kb     PATH
1204   5619   4388     33767    /hdfs/uuid/b94cbf3f-76b1-4ced-9043-02d450b9887c/yarn
2208   3511   8744     22992    /hdfs/uuid/d04fccd8-bc72-4ed9-bda4-c5b6893f1405/yarn
62     4010   0        21181    /hdfs/uuid/8138a53b-b942-44d3-82df-51575f1a3901/yarn
22     2187   0        8748     /hdfs/uuid/b3b2a2ed-f6c1-4641-86bf-2989dd932411/yarn
74     1097   0        4388     /hdfs/uuid/4a833770-767e-43b3-b696-dc98901bce26/yarn

[..]



USAGE message:

# ./dirtop.py -h
usage: dirtop.py [-h] [-C] [-r MAXROWS] [-s {all,reads,writes,rbytes,wbytes}]
                 [-p PID] -d ROOTDIRS
                 [interval] [count]

File reads and writes by process

positional arguments:
  interval              output interval, in seconds
  count                 number of outputs

optional arguments:
  -h, --help            show this help message and exit
  -C, --noclear         don't clear the screen
  -r MAXROWS, --maxrows MAXROWS
                        maximum rows to print, default 20
  -s {all,reads,writes,rbytes,wbytes}, --sort {all,reads,writes,rbytes,wbytes}
                        sort column, default all
  -p PID, --pid PID     trace this PID only
  -d ROOTDIRS, --root-directories ROOTDIRS
                        select the directories to observe, separated by commas

examples:
    ./dirtop -d '/hdfs/uuid/*/yarn'       # directory I/O top, 1 second refresh
    ./dirtop -d '/hdfs/uuid/*/yarn' -C    # don't clear the screen
    ./dirtop -d '/hdfs/uuid/*/yarn' 5     # 5 second summaries
    ./dirtop -d '/hdfs/uuid/*/yarn' 5 10  # 5 second summaries, 10 times only
    ./dirtop -d '/hdfs/uuid/*/yarn,/hdfs/uuid/*/data' # Running dirtop on two set of directories
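
The -d argument takes a comma-separated list of directory patterns, which is
why it is quoted in the examples: quoting stops the shell from expanding the
wildcards itself. The following Python sketch shows one way such a list could
be expanded into concrete directories and their inode numbers. It is an
assumption about the general approach, not a copy of dirtop's implementation,
and expand_root_dirs is a hypothetical helper name.

```python
import glob
import os
import sys
from typing import List, Tuple


def expand_root_dirs(spec: str) -> List[Tuple[str, int]]:
    """Expand 'pattern1,pattern2,...' into (directory, inode) pairs."""
    result = []
    for pattern in spec.split(","):
        # Sort the matches so the output order is deterministic.
        for path in sorted(glob.glob(pattern)):
            if os.path.isdir(path):
                result.append((path, os.stat(path).st_ino))
    return result


if __name__ == "__main__":
    # Pass the pattern list quoted, exactly as it would be given to -d.
    spec = sys.argv[1] if len(sys.argv) > 1 else "/hdfs/uuid/*/yarn"
    for path, inode in expand_root_dirs(spec):
        print("%d %s" % (inode, path))
```

Patterns that match nothing simply contribute no entries, so a typo in one
pattern does not affect the others.
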