Demonstrations of kvm exit reasons, the Linux eBPF/bcc version.


Frequent exits from virtual machines can cause performance problems. This tool
helps locate the most frequent exit reasons, so that the exits can be reduced
or even avoided, by displaying the detailed exit reasons and the count of each
vm exit for all vms running on one physical machine.


Features of this tool
=====================

- There is a patch, [KVM: x86: add full vm-exit reason debug entries]
  (https://patchwork.kernel.org/project/kvm/patch/1555939499-30854-1-git-send-email-pizhenwei@bytedance.com/),
  that tries to add more vm-exit reason debug entries. However, as the comments
  on that patch point out, the code allocates lots of memory that may never be
  consumed, misses some arch-specific kvm causes, and cannot do kernel
  aggregation. bcc, as a user space tool, can implement all of these functions
  more easily and flexibly.
- The bcc python logic can provide nice kernel aggregation and custom output,
  such as collapsing all tids for one pid (i.e. one vm's qemu process id) with
  exit reasons sorted in descending order. For more information, see the
  "USAGE message" section below.
- The bpf in-kernel percpu_array and percpu_cache further improve performance.
  For more information, see the "Help to understand" section below.


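As an illustration of the "collapse all tids for one pid" output described
above, here is a minimal pure-Python sketch. It is not taken from kvmexit.py:
the function name collapse_by_pid and the (pid, tid, reason, count) row layout
are assumptions made for this example, and the sample rows are copied from the
"-p 1273795 5 -a" example output in this file.

```python
from collections import Counter


def collapse_by_pid(rows):
    """Sum per-TID exit counts into per-PID totals, sorted by count descending.

    rows is an iterable of (pid, tid, reason, count) tuples (hypothetical layout).
    Returns {pid: [(reason, total), ...]}.
    """
    totals = {}
    for pid, _tid, reason, count in rows:
        totals.setdefault(pid, Counter())[reason] += count
    # Sort by descending count; break ties by reason name for stable output.
    return {
        pid: sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        for pid, counts in totals.items()
    }


rows = [
    (1273795, 1273819, "HLT", 2802),
    (1273795, 1273819, "MSR_WRITE", 7196),
    (1273795, 1273820, "HLT", 2054),
    (1273795, 1273820, "MSR_WRITE", 5199),
    (1273795, 1273820, "EPT_VIOLATION", 2),
]
collapsed = collapse_by_pid(rows)
for reason, total in collapsed[1273795]:
    print("%-20s %d" % (reason, total))
```
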
Limitations
===========

Because hardware-assisted virtualization technology differs between
architectures, the tool currently supports only VMX on Intel.
AMD support is planned.


Example output:
===============

# ./kvmexit.py
Display kvm exit reasons and statistics for all threads... Hit Ctrl-C to end.
PID      TID      KVM_EXIT_REASON                 COUNT
^C1273551  1273568  EXIT_REASON_HLT                 12
1273551  1273568  EXIT_REASON_MSR_WRITE           6
1274253  1274261  EXIT_REASON_EXTERNAL_INTERRUPT  1
1274253  1274261  EXIT_REASON_HLT                 12
1274253  1274261  EXIT_REASON_MSR_WRITE           4

# ./kvmexit.py 6
Display kvm exit reasons and statistics for all threads after sleeping 6 secs.
PID      TID      KVM_EXIT_REASON                 COUNT
1273903  1273922  EXIT_REASON_EXTERNAL_INTERRUPT  175
1273903  1273922  EXIT_REASON_CPUID               10
1273903  1273922  EXIT_REASON_HLT                 6043
1273903  1273922  EXIT_REASON_IO_INSTRUCTION      24
1273903  1273922  EXIT_REASON_MSR_WRITE           15025
1273903  1273922  EXIT_REASON_PAUSE_INSTRUCTION   11
1273903  1273922  EXIT_REASON_EOI_INDUCED         12
1273903  1273922  EXIT_REASON_EPT_VIOLATION       6
1273903  1273922  EXIT_REASON_EPT_MISCONFIG       380
1273903  1273922  EXIT_REASON_PREEMPTION_TIMER    194
1273551  1273568  EXIT_REASON_EXTERNAL_INTERRUPT  18
1273551  1273568  EXIT_REASON_HLT                 989
1273551  1273568  EXIT_REASON_IO_INSTRUCTION      10
1273551  1273568  EXIT_REASON_MSR_WRITE           2205
1273551  1273568  EXIT_REASON_PAUSE_INSTRUCTION   1
1273551  1273568  EXIT_REASON_EOI_INDUCED         5
1273551  1273568  EXIT_REASON_EPT_MISCONFIG       61
1273551  1273568  EXIT_REASON_PREEMPTION_TIMER    14

# ./kvmexit.py -p 1273795 5
Display kvm exit reasons and statistics for PID 1273795 after sleeping 5 secs.
KVM_EXIT_REASON     COUNT
MSR_WRITE           13467
HLT                 5060
PREEMPTION_TIMER    345
EPT_MISCONFIG       264
EXTERNAL_INTERRUPT  169
EPT_VIOLATION       18
PAUSE_INSTRUCTION   6
IO_INSTRUCTION      4
EOI_INDUCED         2

# ./kvmexit.py -p 1273795 5 -a
Display kvm exit reasons and statistics for PID 1273795 and its all threads after sleeping 5 secs.
TID      KVM_EXIT_REASON     COUNT
1273819  EXTERNAL_INTERRUPT  64
1273819  HLT                 2802
1273819  IO_INSTRUCTION      4
1273819  MSR_WRITE           7196
1273819  PAUSE_INSTRUCTION   2
1273819  EOI_INDUCED         2
1273819  EPT_VIOLATION       6
1273819  EPT_MISCONFIG       162
1273819  PREEMPTION_TIMER    194
1273820  EXTERNAL_INTERRUPT  78
1273820  HLT                 2054
1273820  MSR_WRITE           5199
1273820  EPT_VIOLATION       2
1273820  EPT_MISCONFIG       77
1273820  PREEMPTION_TIMER    102

# ./kvmexit.py -p 1273795 -v 0
Display kvm exit reasons and statistics for PID 1273795 VCPU 0... Hit Ctrl-C to end.
KVM_EXIT_REASON     COUNT
^CMSR_WRITE           2076
HLT                 795
PREEMPTION_TIMER    86
EXTERNAL_INTERRUPT  20
EPT_MISCONFIG       10
PAUSE_INSTRUCTION   2
IO_INSTRUCTION      2
EPT_VIOLATION       1
EOI_INDUCED         1

# ./kvmexit.py -p 1273795 -v 0 4
Display kvm exit reasons and statistics for PID 1273795 VCPU 0 after sleeping 4 secs.
KVM_EXIT_REASON     COUNT
MSR_WRITE           4726
HLT                 1827
PREEMPTION_TIMER    78
EPT_MISCONFIG       67
EXTERNAL_INTERRUPT  28
IO_INSTRUCTION      4
EOI_INDUCED         2
PAUSE_INSTRUCTION   2

# ./kvmexit.py -p 1273795 -v 4 4
Traceback (most recent call last):
  File "tools/kvmexit.py", line 306, in <module>
    raise Exception("There's no v%s for PID %d." % (tgt_vcpu, args.pid))
Exception: There's no vCPU 4 for PID 1273795.

# ./kvmexit.py -t 1273819 10
Display kvm exit reasons and statistics for TID 1273819 after sleeping 10 secs.
KVM_EXIT_REASON     COUNT
MSR_WRITE           13318
HLT                 5274
EPT_MISCONFIG       263
PREEMPTION_TIMER    171
EXTERNAL_INTERRUPT  109
IO_INSTRUCTION      8
PAUSE_INSTRUCTION   5
EOI_INDUCED         4
EPT_VIOLATION       2

# ./kvmexit.py -T '1273820,1273819'
Display kvm exit reasons and statistics for TIDS ['1273820', '1273819']... Hit Ctrl-C to end.
TIDS     KVM_EXIT_REASON     COUNT
^C1273819  EXTERNAL_INTERRUPT  300
1273819  HLT                 13718
1273819  IO_INSTRUCTION      26
1273819  MSR_WRITE           37457
1273819  PAUSE_INSTRUCTION   13
1273819  EOI_INDUCED         13
1273819  EPT_VIOLATION       53
1273819  EPT_MISCONFIG       654
1273819  PREEMPTION_TIMER    958
1273820  EXTERNAL_INTERRUPT  212
1273820  HLT                 9002
1273820  MSR_WRITE           25495
1273820  PAUSE_INSTRUCTION   2
1273820  EPT_VIOLATION       64
1273820  EPT_MISCONFIG       396
1273820  PREEMPTION_TIMER    268


Help to understand
==================

We use a PERCPU_ARRAY (pcpuArrayA) and a percpu_hash (hashA) together to store
each kvm exit reason and its count. The reasoning is that when a vcpu exits and
re-enters, it tends to keep running on the same physical cpu (pcpu below) as in
the previous cycle, which we call a 'cache hit'. We therefore use the
PERCPU_ARRAY to record the 'cache hit' case and speed things up, and fall back
to the percpu_hash for all other cases.

BTW, we originally used a common hash for this, with a u64(exit_reason)
key and a struct exit_info {tgid_pid, exit_reason} value. But because of
the big lock in bpf_hash, each update was quite expensive.

Now imagine that pid_tgidA (vcpu A) exits and is going to run on
pcpuArrayA. The BPF code flow is as follows:

           pid_tgidA keeps running on the same pcpu
                  //                  \\
                 //                    \\
                //      Y          N    \\
               //                        \\
         a. cache_hit               b. cache_miss
(cacheA's pid_tgid matches pid_tgidA)     ||
               |                          ||
               |                          ||
"increase percpu exit_ct and return"      ||
           [*Note*]                       ||
                      pid_tgidA ever been exited on pcpuArrayA?
                               //                  \\
                              //                    \\
                             //                      \\
                            //      Y          N      \\
                           //                          \\
                 b.a load_last_hashA       b.b initialize_hashA_with_zero
                            \                          /
                             \                        /
                              \                      /
                             "increase percpu exit_ct"
                                        ||
                                        ||
                   is another pid_tgid been running on pcpuArrayA?
                               //                  \\
                              //      Y      N      \\
                             //                      \\
              b.*.a save_theLastHit_hashB         do_nothing
                             \\                      //
                              \\                    //
                               \\                  //
                                b.* save_to_pcpuArrayA


[*Note*] In case "a." above we do not update the table, because the vcpu may
hit the same pcpu again on its next exit. We update the table only once this
pcpu is no longer hit by the same tgidpid (vcpu), which happens in "b.*.a" and
"b.*".


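The flow above can be modelled in a few lines of plain Python. This is only a
user-space sketch of the cache hit / cache miss logic, not the BPF program
shipped in kvmexit.py. The class name ExitCounter, its method names, and the
choice to remove an entry from the hash when it is loaded back into the cache
are all assumptions made for this example.

```python
from collections import Counter, defaultdict


class ExitCounter:
    """Toy model: one cache slot per pcpu, plus a per-pcpu hash for evicted owners."""

    def __init__(self):
        self.cache = {}                # pcpu -> (pid_tgid, Counter of exit reasons)
        self.hash = defaultdict(dict)  # pcpu -> {pid_tgid: Counter of exit reasons}

    def record_exit(self, pcpu, pid_tgid, reason):
        slot = self.cache.get(pcpu)
        if slot is not None and slot[0] == pid_tgid:
            # a. cache hit: only bump the cached counter and return.
            slot[1][reason] += 1
            return "hit"
        # b. cache miss: load the last saved counts (b.a) or start from zero (b.b).
        counts = self.hash[pcpu].pop(pid_tgid, None)
        if counts is None:
            counts = Counter()
        counts[reason] += 1
        if slot is not None:
            # b.*.a: another pid_tgid owned this slot, so save its counts first.
            self.hash[pcpu][slot[0]] = slot[1]
        # b.*: the current pid_tgid now owns the cache slot of this pcpu.
        self.cache[pcpu] = (pid_tgid, counts)
        return "miss"

    def totals(self, pid_tgid):
        """Sum the counts of one pid_tgid over every pcpu (hash plus cache)."""
        total = Counter()
        for per_cpu in self.hash.values():
            total.update(per_cpu.get(pid_tgid, {}))
        for owner, counts in self.cache.values():
            if owner == pid_tgid:
                total.update(counts)
        return total


VCPU_A, VCPU_B = 1273819, 1273820  # TIDs borrowed from the example output
sim = ExitCounter()
results = [
    sim.record_exit(0, VCPU_A, "HLT"),        # first exit on pcpu 0
    sim.record_exit(0, VCPU_A, "HLT"),        # same vcpu, same pcpu
    sim.record_exit(0, VCPU_B, "MSR_WRITE"),  # evicts VCPU_A into the hash
    sim.record_exit(0, VCPU_A, "MSR_WRITE"),  # reloads VCPU_A, evicts VCPU_B
    sim.record_exit(1, VCPU_A, "HLT"),        # VCPU_A migrates to pcpu 1
]
print(results)
print(dict(sim.totals(VCPU_A)), dict(sim.totals(VCPU_B)))
```

Only the second call is a cache hit. Every other call takes the slower miss
path, which is why the design pays off when a vcpu mostly stays on one pcpu.
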
USAGE message:
==============

# ./kvmexit.py -h
usage: kvmexit.py [-h] [-p PID [-v VCPU | -a] ] [-t TID | -T 'TID1,TID2'] [duration]

Display kvm_exit_reason and its statistics at a timed interval

optional arguments:
  -h, --help            show this help message and exit
  -p PID, --pid PID     display process with this PID only, collpase all tids with exit reasons sorted in descending order
  -v VCPU, --v VCPU     display this VCPU only for this PID
  -a, --alltids         display all TIDS for this PID
  -t TID, --tid TID     display thread with this TID only with exit reasons sorted in descending order
  -T 'TID1,TID2', --tids 'TID1,TID2'
                        display threads for a union like {395490, 395491}
  duration              duration of display, after sleeping several seconds

examples:
  ./kvmexit                      # Display kvm_exit_reason and its statistics in real-time until Ctrl-C
  ./kvmexit 5                    # Display in real-time after sleeping 5s
  ./kvmexit -p 3195281           # Collpase all tids for pid 3195281 with exit reasons sorted in descending order
  ./kvmexit -p 3195281 20        # Collpase all tids for pid 3195281 with exit reasons sorted in descending order, and display after sleeping 20s
  ./kvmexit -p 3195281 -v 0      # Display only vcpu0 for pid 3195281, descending sort by default
  ./kvmexit -p 3195281 -a        # Display all tids for pid 3195281
  ./kvmexit -t 395490            # Display only for tid 395490 with exit reasons sorted in descending order
  ./kvmexit -t 395490 20         # Display only for tid 395490 with exit reasons sorted in descending order after sleeping 20s
  ./kvmexit -T '395490,395491'   # Display for a union like {395490, 395491}
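
For readers who want to experiment with the option rules shown in the usage
line, the following argparse sketch reproduces them. It is an illustration
only and is not copied from kvmexit.py: the helper names parse_tids,
build_parser and parse_args are hypothetical, and rejecting -v or -a without
-p is an assumption based on the "[-p PID [-v VCPU | -a] ]" grouping above.

```python
import argparse


def parse_tids(text):
    """Turn a string such as '395490,395491' into a list of integer TIDs."""
    return [int(tid) for tid in text.split(",") if tid.strip()]


def build_parser():
    parser = argparse.ArgumentParser(
        prog="kvmexit.py",
        description="Display kvm_exit_reason and its statistics at a timed interval",
    )
    parser.add_argument("duration", nargs="?", type=int, default=None,
                        help="duration of display, after sleeping several seconds")
    parser.add_argument("-p", "--pid", type=int,
                        help="display process with this PID only")
    # -v and -a are alternatives to each other, as in "[-v VCPU | -a]".
    per_pid = parser.add_mutually_exclusive_group()
    per_pid.add_argument("-v", "--v", dest="vcpu", metavar="VCPU", type=int,
                         help="display this VCPU only for this PID")
    per_pid.add_argument("-a", "--alltids", action="store_true",
                         help="display all TIDS for this PID")
    # -t and -T are alternatives to each other, as in "[-t TID | -T 'TID1,TID2']".
    per_thread = parser.add_mutually_exclusive_group()
    per_thread.add_argument("-t", "--tid", type=int,
                            help="display thread with this TID only")
    per_thread.add_argument("-T", "--tids", type=parse_tids, metavar="'TID1,TID2'",
                            help="display threads for a union of TIDs")
    return parser


def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumption for this sketch: -v and -a only make sense together with -p.
    if (args.vcpu is not None or args.alltids) and args.pid is None:
        parser.error("-v and -a require -p PID")
    return args


args = parse_args(["-p", "1273795", "-v", "0", "4"])
print(args.pid, args.vcpu, args.duration)
```
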