KVM: x86: Unconditionally enable irqs in guest context
On VMX, KVM currently does not re-enable irqs until after it has exited
the guest context. As a result, a tick that fires in the window between
VM-Exit and guest_exit_irqoff() will be accounted as system time. While
said window is relatively small, it's large enough to be problematic in
some configurations, e.g. if VM-Exits are consistently occurring a hair
earlier than the tick irq.
Intentionally toggle irqs back off so that guest_exit_irqoff() can be
used in lieu of guest_exit() in order to avoid the save/restore of flags
in guest_exit(). On my Haswell system, "nop; cli; sti" is ~6 cycles,
versus ~28 cycles for "pushf; pop <reg>; cli; push <reg>; popf".
Fixes: f2485b3e0c6c0 ("KVM: x86: use guest_exit_irqoff")
Reported-by: Wei Yang <w90p710@gmail.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 5270711..98b848f 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -6184,15 +6184,7 @@
static void svm_handle_exit_irqoff(struct kvm_vcpu *vcpu)
{
- kvm_before_interrupt(vcpu);
- local_irq_enable();
- /*
- * We must have an instruction with interrupts enabled, so
- * the timer interrupt isn't delayed by the interrupt shadow.
- */
- asm("nop");
- local_irq_disable();
- kvm_after_interrupt(vcpu);
+
}
static void svm_sched_in(struct kvm_vcpu *vcpu, int cpu)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 81faceb..0347b1e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8046,7 +8046,18 @@
kvm_x86_ops->handle_exit_irqoff(vcpu);
+ /*
+ * Consume any pending interrupts, including the possible source of
+ * VM-Exit on SVM and any ticks that occur between VM-Exit and now.
+ * An instruction is required after local_irq_enable() to fully unblock
+ * interrupts on processors that implement an interrupt shadow, the
+ * stat.exits increment will do nicely.
+ */
+ kvm_before_interrupt(vcpu);
+ local_irq_enable();
++vcpu->stat.exits;
+ local_irq_disable();
+ kvm_after_interrupt(vcpu);
guest_exit_irqoff();
if (lapic_in_kernel(vcpu)) {