blob: a02d4dc0edd19bafa2ac7b25fcc03a41e4f3e4ec [file] [log] [blame]
Brendan Gregg9c1f3622016-02-14 23:14:12 -08001Demonstrations of zfsdist, the Linux eBPF/bcc version.
2
3
4zfsdist traces ZFS reads, writes, opens, and fsyncs, and summarizes their
5latency as a power-of-2 histogram. It has been written to work on ZFS on Linux
6(http://zfsonlinux.org). For example:
7
8# ./zfsdist
9Tracing ZFS operation latency... Hit Ctrl-C to end.
10^C
11
12operation = 'read'
13 usecs : count distribution
14 0 -> 1 : 0 | |
15 2 -> 3 : 0 | |
16 4 -> 7 : 4479 |****************************************|
17 8 -> 15 : 1028 |********* |
18 16 -> 31 : 14 | |
19 32 -> 63 : 1 | |
20 64 -> 127 : 2 | |
21 128 -> 255 : 6 | |
22 256 -> 511 : 1 | |
23 512 -> 1023 : 1256 |*********** |
24 1024 -> 2047 : 9 | |
25 2048 -> 4095 : 1 | |
26 4096 -> 8191 : 2 | |
27
28operation = 'write'
29 usecs : count distribution
30 0 -> 1 : 0 | |
31 2 -> 3 : 0 | |
32 4 -> 7 : 0 | |
33 8 -> 15 : 0 | |
34 16 -> 31 : 0 | |
35 32 -> 63 : 0 | |
36 64 -> 127 : 0 | |
37 128 -> 255 : 75 |****************************************|
38 256 -> 511 : 11 |***** |
39 512 -> 1023 : 0 | |
40 1024 -> 2047 : 0 | |
41 2048 -> 4095 : 0 | |
42 4096 -> 8191 : 0 | |
43 8192 -> 16383 : 0 | |
44 16384 -> 32767 : 0 | |
45 32768 -> 65535 : 0 | |
46 65536 -> 131071 : 13 |****** |
47 131072 -> 262143 : 1 | |
48
49operation = 'open'
50 usecs : count distribution
51 0 -> 1 : 0 | |
52 2 -> 3 : 2 |****************************************|
53
54This output shows a bimodal distribution for read latency, with a faster
55mode of around 5 thousand reads that took between 4 and 15 microseconds, and a
56slower mode of 1256 reads that took between 512 and 1023 microseconds. It's
57likely that the faster mode was a hit from the in-memory file system cache,
58and the slower mode is a read from a storage device (disk).
59
60The write latency is also bimodal, with a faster mode between 128 and 511 us,
61and the slower mode between 65 and 131 ms.
62
63This "latency" is measured from when the operation was issued from the VFS
64interface to the file system (via the ZFS POSIX layer), to when it completed.
65This spans everything: block device I/O (disk I/O), file system CPU cycles,
66file system locks, run queue latency, etc. This is a better measure of the
67latency suffered by applications reading from the file system than measuring
68this down at the block device interface.
69
70Note that this only traces the common file system operations previously
71listed: other file system operations (eg, inode operations including
72getattr()) are not traced.
73
74
75An optional interval and a count can be provided, as well as -m to show the
76distributions in milliseconds. For example:
77
78# ./zfsdist 1 5
79Tracing ZFS operation latency... Hit Ctrl-C to end.
80
8106:55:41:
82
83operation = 'read'
84 usecs : count distribution
85 0 -> 1 : 0 | |
86 2 -> 3 : 0 | |
87 4 -> 7 : 3976 |****************************************|
88 8 -> 15 : 1181 |*********** |
89 16 -> 31 : 18 | |
90 32 -> 63 : 4 | |
91 64 -> 127 : 17 | |
92 128 -> 255 : 16 | |
93 256 -> 511 : 0 | |
94 512 -> 1023 : 1275 |************ |
95 1024 -> 2047 : 36 | |
96 2048 -> 4095 : 3 | |
97 4096 -> 8191 : 0 | |
98 8192 -> 16383 : 1 | |
99 16384 -> 32767 : 1 | |
100
10106:55:42:
102
103operation = 'read'
104 usecs : count distribution
105 0 -> 1 : 0 | |
106 2 -> 3 : 0 | |
107 4 -> 7 : 12751 |****************************************|
108 8 -> 15 : 1190 |*** |
109 16 -> 31 : 38 | |
110 32 -> 63 : 7 | |
111 64 -> 127 : 85 | |
112 128 -> 255 : 47 | |
113 256 -> 511 : 0 | |
114 512 -> 1023 : 1010 |*** |
115 1024 -> 2047 : 49 | |
116 2048 -> 4095 : 12 | |
117
11806:55:43:
119
120operation = 'read'
121 usecs : count distribution
122 0 -> 1 : 0 | |
123 2 -> 3 : 0 | |
124 4 -> 7 : 80925 |****************************************|
125 8 -> 15 : 1645 | |
126 16 -> 31 : 251 | |
127 32 -> 63 : 24 | |
128 64 -> 127 : 16 | |
129 128 -> 255 : 12 | |
130 256 -> 511 : 0 | |
131 512 -> 1023 : 80 | |
132 1024 -> 2047 : 1 | |
133
13406:55:44:
135
136operation = 'read'
137 usecs : count distribution
138 0 -> 1 : 0 | |
139 2 -> 3 : 0 | |
140 4 -> 7 : 81207 |****************************************|
141 8 -> 15 : 2075 |* |
142 16 -> 31 : 2005 | |
143 32 -> 63 : 177 | |
144 64 -> 127 : 3 | |
145
14606:55:45:
147
148operation = 'read'
149 usecs : count distribution
150 0 -> 1 : 0 | |
151 2 -> 3 : 0 | |
152 4 -> 7 : 74364 |****************************************|
153 8 -> 15 : 865 | |
154 16 -> 31 : 4960 |** |
155 32 -> 63 : 625 | |
156 64 -> 127 : 2 | |
157
158This workload was randomly reading from a file that became cached. The slower
159mode can be seen to disappear by the final summaries.
160
161
162USAGE message:
163
164# ./zfsdist -h
165usage: zfsdist [-h] [-T] [-m] [-p PID] [interval] [count]
166
167Summarize ZFS operation latency
168
169positional arguments:
170 interval output interval, in seconds
171 count number of outputs
172
173optional arguments:
174 -h, --help show this help message and exit
175 -T, --notimestamp don't include timestamp on interval output
176 -m, --milliseconds output in milliseconds
177 -p PID, --pid PID trace this PID only
178
179examples:
180 ./zfsdist # show operation latency as a histogram
181 ./zfsdist -p 181 # trace PID 181 only
182 ./zfsdist 1 10 # print 1 second summaries, 10 times
183 ./zfsdist -m 5 # 5s summaries, milliseconds