blob: b057569bd8b4baa241a7fea085bfc3c007758736 [file] [log] [blame]
samuelnairceff3e02017-09-12 13:13:10 +08001Demonstrations of nfsdist, the Linux eBPF/bcc version.
2
3nfsdist traces NFS reads, writes, opens, and getattr, and summarizes their
4latency as a power-of-2 histogram. For example:
5
6
7./nfsdist.py
8
9Tracing NFS operation latency... Hit Ctrl-C to end.
10
11operation = read
12 usecs : count distribution
13 0 -> 1 : 4 | |
14 2 -> 3 : 0 | |
15 4 -> 7 : 0 | |
16 8 -> 15 : 7107 |************** |
17 16 -> 31 : 19864 |****************************************|
18 32 -> 63 : 1494 |*** |
19 64 -> 127 : 491 | |
20 128 -> 255 : 1810 |*** |
21 256 -> 511 : 6356 |************ |
22 512 -> 1023 : 4860 |********* |
23 1024 -> 2047 : 3070 |****** |
24 2048 -> 4095 : 1853 |*** |
25 4096 -> 8191 : 921 |* |
26 8192 -> 16383 : 122 | |
27 16384 -> 32767 : 15 | |
28 32768 -> 65535 : 5 | |
29 65536 -> 131071 : 2 | |
30 131072 -> 262143 : 1 | |
31
32operation = write
33 usecs : count distribution
34 0 -> 1 : 0 | |
35 2 -> 3 : 0 | |
36 4 -> 7 : 0 | |
37 8 -> 15 : 1 | |
38 16 -> 31 : 0 | |
39 32 -> 63 : 9 | |
40 64 -> 127 : 19491 |****************************************|
41 128 -> 255 : 3064 |****** |
42 256 -> 511 : 940 |* |
43 512 -> 1023 : 365 | |
44 1024 -> 2047 : 312 | |
45 2048 -> 4095 : 119 | |
46 4096 -> 8191 : 31 | |
47 8192 -> 16383 : 84 | |
48 16384 -> 32767 : 31 | |
49 32768 -> 65535 : 5 | |
50 65536 -> 131071 : 3 | |
51 131072 -> 262143 : 0 | |
52 262144 -> 524287 : 1 | |
53
54operation = getattr
55 usecs : count distribution
56 0 -> 1 : 27 |****************************************|
57 2 -> 3 : 2 |** |
58 4 -> 7 : 3 |**** |
59 8 -> 15 : 0 | |
60 16 -> 31 : 0 | |
61 32 -> 63 : 0 | |
62 64 -> 127 : 0 | |
63 128 -> 255 : 0 | |
64 256 -> 511 : 2 |** |
65 512 -> 1023 : 2 |** |
66
67operation = open
68 usecs : count distribution
69 0 -> 1 : 0 | |
70 2 -> 3 : 0 | |
71 4 -> 7 : 0 | |
72 8 -> 15 : 0 | |
73 16 -> 31 : 0 | |
74 32 -> 63 : 0 | |
75 64 -> 127 : 0 | |
76 128 -> 255 : 0 | |
77 256 -> 511 : 2 |****************************************|
78
79
80In this example you can see that the read traffic is rather bi-modal, with about
8126K reads falling within 8 - 30 usecs and about 18K reads spread between 128 -
828191 usecs. Write traffic is largely clustered in the 64 - 127 usecs bracket.
83The faster read traffic is probably coming from a filesystem cache and the slower
84traffic from disk. The reason why the writes are so consistently fast is because
85this example test was run on a couple of VM's and I believe the hypervisor was
86caching all the write traffic to memory.
87
88This "latency" is measured from when the operation was issued from the VFS
89interface to the file system, to when it completed. This spans everything:
90RPC latency, network latency, file system CPU cycles, file system locks, run
91queue latency, etc. This is a better measure of the latency suffered by
92applications reading from a NFS share and can better expose problems
93experienced by NFS clients.
94
95Note that this only traces the common NFS operations (read, write, open and
96getattr). I chose to include getattr as a significant percentage of NFS
97traffic end up being getattr calls and are a good indicator of problems
98with an NFS server.
99
100An optional interval and a count can be provided, as well as -m to show the
101distributions in milliseconds. For example:
102
103./nfsdist -m 1 5
104Tracing NFS operation latency... Hit Ctrl-C to end.
105
10611:02:39:
107
108operation = write
109 msecs : count distribution
110 0 -> 1 : 1 | |
111 2 -> 3 : 24 |******** |
112 4 -> 7 : 114 |****************************************|
113 8 -> 15 : 9 |*** |
114 16 -> 31 : 1 | |
115 32 -> 63 : 1 | |
116
11711:02:40:
118
119operation = write
120 msecs : count distribution
121 0 -> 1 : 0 | |
122 2 -> 3 : 11 |*** |
123 4 -> 7 : 111 |****************************************|
124 8 -> 15 : 13 |**** |
125 16 -> 31 : 1 | |
126
12711:02:41:
128
129operation = write
130 msecs : count distribution
131 0 -> 1 : 0 | |
132 2 -> 3 : 21 |****** |
133 4 -> 7 : 137 |****************************************|
134 8 -> 15 : 3 | |
135
136This shows a write workload, with writes hovering primarily in the 4-7ms range.
137
138USAGE message:
139
140
141./nfsdist -h
142usage: nfsdist.py [-h] [-T] [-m] [-p PID] [interval] [count]
143
144Summarize NFS operation latency
145
146positional arguments:
147 interval output interval, in seconds
148 count number of outputs
149
150optional arguments:
151 -h, --help show this help message and exit
152 -T, --notimestamp don't include timestamp on interval output
153 -m, --milliseconds output in milliseconds
154 -p PID, --pid PID trace this PID only
155
156examples:
157 ./nfsdist # show operation latency as a histogram
158 ./nfsdist -p 181 # trace PID 181 only
159 ./nfsdist 1 10 # print 1 second summaries, 10 times
160 ./nfsdist -m 5 # 5s summaries, milliseconds