Demonstrations of kvm exit reasons, the Linux eBPF/bcc version.


Frequent virtual machine exits can cause performance problems. This tool
aims to find the most frequent exit reasons, so that the exits can be reduced
or even avoided, by displaying the detailed exit reasons and the count of
each VM exit for all VMs running on one physical machine.


Features of this tool
=====================

- Although there is a patch: [KVM: x86: add full vm-exit reason debug entries]
  (https://patchwork.kernel.org/project/kvm/patch/1555939499-30854-1-git-send-email-pizhenwei@bytedance.com/)
  that tries to add more vm-exit reason debug entries, the comments on it
  point out that the code allocates lots of memory that may never be consumed,
  misses some arch-specific kvm causes, and cannot do kernel aggregation.
  bcc, as a user space tool, can implement all of this more easily and
  flexibly.
- The bcc Python logic provides kernel aggregation and custom output, such as
  collapsing all TIDs of one PID (i.e. one VM's qemu process ID) with exit
  reasons sorted in descending order. For more information, see the
  "USAGE message" section below.
- The BPF in-kernel percpu_array and percpu_cache further improve performance.
  For more information, see the "Help to understand" section below.
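
As a rough illustration of the "collapse all TIDs for one PID" output described
above, here is a minimal Python sketch. It is not taken from kvmexit.py: the
function name collapse_tids and the sample rows are hypothetical, and the
counts are made up for the example.

```python
from collections import defaultdict

# Hypothetical sample rows: (pid, tid, exit_reason, count). The numbers are
# invented for illustration and do not come from a real trace.
rows = [
    (1000, 1001, "HLT", 30),
    (1000, 1001, "MSR_WRITE", 70),
    (1000, 1002, "HLT", 20),
    (1000, 1002, "MSR_WRITE", 50),
    (1000, 1002, "EPT_MISCONFIG", 5),
    (2000, 2001, "HLT", 9),
]


def collapse_tids(rows, pid):
    """Sum counts per exit reason over all TIDs of `pid`, largest first."""
    totals = defaultdict(int)
    for row_pid, _tid, reason, count in rows:
        if row_pid == pid:
            totals[reason] += count
    # Sort by count in descending order; break ties by reason name.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


collapsed = collapse_tids(rows, 1000)
for reason, count in collapsed:
    print("%-24s%d" % (reason, count))
```

The per-TID rows stay available for the "-a" style of output; only the display
step sums them, so the same raw data can serve both views.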


Limitations
===========

Hardware-assisted virtualization technology differs between architectures.
Currently this tool only supports VMX on Intel; AMD support is in progress.


Example output:
===============

# ./kvmexit.py
Display kvm exit reasons and statistics for all threads... Hit Ctrl-C to end.
PID       TID       KVM_EXIT_REASON                 COUNT
^C1273551 1273568   EXIT_REASON_HLT                 12
1273551   1273568   EXIT_REASON_MSR_WRITE           6
1274253   1274261   EXIT_REASON_EXTERNAL_INTERRUPT  1
1274253   1274261   EXIT_REASON_HLT                 12
1274253   1274261   EXIT_REASON_MSR_WRITE           4

# ./kvmexit.py 6
Display kvm exit reasons and statistics for all threads after sleeping 6 secs.
PID       TID       KVM_EXIT_REASON                 COUNT
1273903   1273922   EXIT_REASON_EXTERNAL_INTERRUPT  175
1273903   1273922   EXIT_REASON_CPUID               10
1273903   1273922   EXIT_REASON_HLT                 6043
1273903   1273922   EXIT_REASON_IO_INSTRUCTION      24
1273903   1273922   EXIT_REASON_MSR_WRITE           15025
1273903   1273922   EXIT_REASON_PAUSE_INSTRUCTION   11
1273903   1273922   EXIT_REASON_EOI_INDUCED         12
1273903   1273922   EXIT_REASON_EPT_VIOLATION       6
1273903   1273922   EXIT_REASON_EPT_MISCONFIG       380
1273903   1273922   EXIT_REASON_PREEMPTION_TIMER    194
1273551   1273568   EXIT_REASON_EXTERNAL_INTERRUPT  18
1273551   1273568   EXIT_REASON_HLT                 989
1273551   1273568   EXIT_REASON_IO_INSTRUCTION      10
1273551   1273568   EXIT_REASON_MSR_WRITE           2205
1273551   1273568   EXIT_REASON_PAUSE_INSTRUCTION   1
1273551   1273568   EXIT_REASON_EOI_INDUCED         5
1273551   1273568   EXIT_REASON_EPT_MISCONFIG       61
1273551   1273568   EXIT_REASON_PREEMPTION_TIMER    14

# ./kvmexit.py -p 1273795 5
Display kvm exit reasons and statistics for PID 1273795 after sleeping 5 secs.
KVM_EXIT_REASON         COUNT
MSR_WRITE               13467
HLT                     5060
PREEMPTION_TIMER        345
EPT_MISCONFIG           264
EXTERNAL_INTERRUPT      169
EPT_VIOLATION           18
PAUSE_INSTRUCTION       6
IO_INSTRUCTION          4
EOI_INDUCED             2

# ./kvmexit.py -p 1273795 5 -a
Display kvm exit reasons and statistics for PID 1273795 and its all threads after sleeping 5 secs.
TID       KVM_EXIT_REASON         COUNT
1273819   EXTERNAL_INTERRUPT      64
1273819   HLT                     2802
1273819   IO_INSTRUCTION          4
1273819   MSR_WRITE               7196
1273819   PAUSE_INSTRUCTION       2
1273819   EOI_INDUCED             2
1273819   EPT_VIOLATION           6
1273819   EPT_MISCONFIG           162
1273819   PREEMPTION_TIMER        194
1273820   EXTERNAL_INTERRUPT      78
1273820   HLT                     2054
1273820   MSR_WRITE               5199
1273820   EPT_VIOLATION           2
1273820   EPT_MISCONFIG           77
1273820   PREEMPTION_TIMER        102

# ./kvmexit.py -p 1273795 -v 0
Display kvm exit reasons and statistics for PID 1273795 VCPU 0... Hit Ctrl-C to end.
KVM_EXIT_REASON         COUNT
^CMSR_WRITE             2076
HLT                     795
PREEMPTION_TIMER        86
EXTERNAL_INTERRUPT      20
EPT_MISCONFIG           10
PAUSE_INSTRUCTION       2
IO_INSTRUCTION          2
EPT_VIOLATION           1
EOI_INDUCED             1

# ./kvmexit.py -p 1273795 -v 0 4
Display kvm exit reasons and statistics for PID 1273795 VCPU 0 after sleeping 4 secs.
KVM_EXIT_REASON         COUNT
MSR_WRITE               4726
HLT                     1827
PREEMPTION_TIMER        78
EPT_MISCONFIG           67
EXTERNAL_INTERRUPT      28
IO_INSTRUCTION          4
EOI_INDUCED             2
PAUSE_INSTRUCTION       2

# ./kvmexit.py -p 1273795 -v 4 4
Traceback (most recent call last):
  File "tools/kvmexit.py", line 306, in <module>
    raise Exception("There's no v%s for PID %d." % (tgt_vcpu, args.pid))
Exception: There's no vCPU 4 for PID 1273795.

# ./kvmexit.py -t 1273819 10
Display kvm exit reasons and statistics for TID 1273819 after sleeping 10 secs.
KVM_EXIT_REASON         COUNT
MSR_WRITE               13318
HLT                     5274
EPT_MISCONFIG           263
PREEMPTION_TIMER        171
EXTERNAL_INTERRUPT      109
IO_INSTRUCTION          8
PAUSE_INSTRUCTION       5
EOI_INDUCED             4
EPT_VIOLATION           2

# ./kvmexit.py -T '1273820,1273819'
Display kvm exit reasons and statistics for TIDS ['1273820', '1273819']... Hit Ctrl-C to end.
TIDS      KVM_EXIT_REASON         COUNT
^C1273819 EXTERNAL_INTERRUPT      300
1273819   HLT                     13718
1273819   IO_INSTRUCTION          26
1273819   MSR_WRITE               37457
1273819   PAUSE_INSTRUCTION       13
1273819   EOI_INDUCED             13
1273819   EPT_VIOLATION           53
1273819   EPT_MISCONFIG           654
1273819   PREEMPTION_TIMER        958
1273820   EXTERNAL_INTERRUPT      212
1273820   HLT                     9002
1273820   MSR_WRITE               25495
1273820   PAUSE_INSTRUCTION       2
1273820   EPT_VIOLATION           64
1273820   EPT_MISCONFIG           396
1273820   PREEMPTION_TIMER        268


Help to understand
==================

We use a PERCPU_ARRAY (pcpuArrayA) together with a percpu_hash (hashA) to
store each kvm exit reason and its count. The reasoning is that when a vcpu
exits and re-enters, it tends to keep running on the same physical cpu (pcpu
below) as in the previous cycle; we call this a 'cache hit'. We therefore use
the PERCPU_ARRAY to record the 'cache hit' case and speed things up, and fall
back to the percpu_hash for all other cases.

BTW, we originally used a common hash for this, with a u64(exit_reason)
key and a struct exit_info {tgid_pid, exit_reason} value. But because of
the big lock in bpf_hash, each update was quite expensive.

Now suppose pid_tgidA (vcpu A) exits and is going to run on the pcpu of
pcpuArrayA. The BPF code flow is as follows:

                       pid_tgidA keeps running on the same pcpu
                             //                          \\
                            //                            \\
                           //        Y            N        \\
                          //                                \\
                   a. cache_hit                       b. cache_miss
(cacheA's pid_tgid matches pid_tgidA)                        ||
                         |                                   ||
                         |                                   ||
        "increase percpu exit_ct and return"                 ||
                      [*Note*]                               ||
                                       pid_tgidA ever been exited on pcpuArrayA?
                                                     //                   \\
                                                    //                     \\
                                                   //                       \\
                                                  //      Y            N     \\
                                                 //                           \\
                                      b.a load_last_hashA      b.b initialize_hashA_with_zero
                                                 \                           /
                                                  \                         /
                                                   \                       /
                                                  "increase percpu exit_ct"
                                                            ||
                                                            ||
                                       is another pid_tgid been running on pcpuArrayA?
                                                    //                \\
                                                   //     Y       N    \\
                                                  //                    \\
                                     b.*.a save_theLastHit_hashB      do_nothing
                                                    \\                  //
                                                     \\                //
                                                      \\              //
                                                    b.* save_to_pcpuArrayA


[*Note*] In case "a." we do not update the hash table, because the vcpu may
hit the same pcpu again on its next exit. The table is only updated once this
pcpu is no longer hit by the same tgidpid (vcpu), which happens in "b.*.a"
and "b.*".
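
To make the flow above easier to follow, here is a small user-space model of it
in Python. This is only a sketch of the described logic, not the BPF code of
kvmexit.py: the class PcpuState and the functions on_kvm_exit and total_counts
are hypothetical names, and a plain dict stands in for the per-pcpu hash.

```python
from collections import Counter


class PcpuState:
    """Hypothetical model of one pcpu: a one-entry cache plus its hash."""

    def __init__(self):
        self.cache_owner = None        # pid_tgid held in the cache, if any
        self.cache_counts = Counter()  # exit reason -> count for that owner
        self.hash = {}                 # pid_tgid -> Counter of exit reasons


def on_kvm_exit(pcpu, pid_tgid, reason):
    """Record one exit on `pcpu`; returns 'hit' or 'miss' for illustration."""
    if pcpu.cache_owner == pid_tgid:
        # a. cache hit: bump the cached counter only, leave the hash alone.
        pcpu.cache_counts[reason] += 1
        return "hit"
    # b. cache miss: b.a reload the earlier counts, or b.b start from zero.
    counts = Counter(pcpu.hash.get(pid_tgid, {}))
    counts[reason] += 1
    if pcpu.cache_owner is not None:
        # b.*.a: another vcpu owned the cache, so save its counts to the hash.
        pcpu.hash[pcpu.cache_owner] = pcpu.cache_counts
    # b.*: the current vcpu takes over the cache slot.
    pcpu.cache_owner, pcpu.cache_counts = pid_tgid, counts
    return "miss"


def total_counts(pcpus, pid_tgid):
    """Sum over pcpus, preferring the cache where it is newer than the hash."""
    total = Counter()
    for pcpu in pcpus:
        if pcpu.cache_owner == pid_tgid:
            total += pcpu.cache_counts
        else:
            total += pcpu.hash.get(pid_tgid, Counter())
    return total


cpu0 = PcpuState()
events = [("A", "HLT"), ("A", "HLT"), ("B", "MSR_WRITE"), ("A", "HLT")]
results = [on_kvm_exit(cpu0, who, why) for who, why in events]
print(results)
print(total_counts([cpu0], "A"))
```

In this model the hash entry of the cached vcpu may lag behind the cache, which
is why the reader has to consult the cache first. This mirrors the [*Note*]
above.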


USAGE message:
==============

# ./kvmexit.py -h
usage: kvmexit.py [-h] [-p PID [-v VCPU | -a] ] [-t TID | -T 'TID1,TID2'] [duration]

Display kvm_exit_reason and its statistics at a timed interval

optional arguments:
  -h, --help            show this help message and exit
  -p PID, --pid PID     display process with this PID only, collapse all tids with exit reasons sorted in descending order
  -v VCPU, --v VCPU     display this VCPU only for this PID
  -a, --alltids         display all TIDS for this PID
  -t TID, --tid TID     display thread with this TID only with exit reasons sorted in descending order
  -T 'TID1,TID2', --tids 'TID1,TID2'
                        display threads for a union like {395490, 395491}
  duration              duration of display, after sleeping several seconds

examples:
    ./kvmexit                     # Display kvm_exit_reason and its statistics in real-time until Ctrl-C
    ./kvmexit 5                   # Display in real-time after sleeping 5s
    ./kvmexit -p 3195281          # Collapse all tids for pid 3195281 with exit reasons sorted in descending order
    ./kvmexit -p 3195281 20       # Collapse all tids for pid 3195281 with exit reasons sorted in descending order, and display after sleeping 20s
    ./kvmexit -p 3195281 -v 0     # Display only vcpu0 for pid 3195281, descending sort by default
    ./kvmexit -p 3195281 -a       # Display all tids for pid 3195281
    ./kvmexit -t 395490           # Display only for tid 395490 with exit reasons sorted in descending order
    ./kvmexit -t 395490 20        # Display only for tid 395490 with exit reasons sorted in descending order after sleeping 20s
    ./kvmexit -T '395490,395491'  # Display for a union like {395490, 395491}
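
The option combinations in the usage line above can be sketched with Python's
argparse as follows. This is an illustration only and not the source of
kvmexit.py: the helpers build_parser and parse are hypothetical, and the rule
that -v and -a need -p is an assumption read off the usage line. Only the
option names and the description come from the help text.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the usage line above."""
    parser = argparse.ArgumentParser(
        prog="kvmexit.py",
        description="Display kvm_exit_reason and its statistics "
                    "at a timed interval")
    parser.add_argument("duration", nargs="?", type=int, default=None,
                        help="duration of display, in seconds")
    parser.add_argument("-p", "--pid", type=int,
                        help="display process with this PID only")
    # -v and -a are alternatives, and so are -t and -T.
    per_pid = parser.add_mutually_exclusive_group()
    per_pid.add_argument("-v", "--v", dest="vcpu", type=int,
                         help="display this VCPU only for this PID")
    per_pid.add_argument("-a", "--alltids", action="store_true",
                         help="display all TIDS for this PID")
    per_tid = parser.add_mutually_exclusive_group()
    per_tid.add_argument("-t", "--tid", type=int,
                         help="display thread with this TID only")
    per_tid.add_argument("-T", "--tids", type=lambda s: s.split(","),
                         help="display threads for a union of TIDs")
    return parser


def parse(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # argparse groups cannot express "-v and -a need -p", so check by hand.
    # This rule is an assumption based on the usage line.
    if (args.vcpu is not None or args.alltids) and args.pid is None:
        parser.error("-v and -a require -p PID")
    return args


args = parse(["-p", "1273795", "-v", "0", "4"])
print(args.pid, args.vcpu, args.duration)
print(parse(["-T", "1273820,1273819"]).tids)
```

The TID list is kept as strings here, which matches the "TIDS ['1273820',
'1273819']" banner in the example output.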