KVM: Prevent system selectors leaking into guest on real->protected mode transition on vmx

Intel virtualization extensions do not support virtualizing real mode.  So
kvm uses virtualized vm86 mode to run real mode code.  Unfortunately, this
virtualized vm86 mode does not support the so-called "big real" mode, where
the segment selector and base do not agree with each other according to the
real mode rules (base == selector << 4).
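
As an illustration of the rule above, here is a minimal sketch of a
conformance check.  The helper name is hypothetical and does not appear
in kvm; it only restates base == selector << 4 in code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kvm function): returns true when a
 * selector/base pair obeys the real mode rule base == selector << 4,
 * i.e. when virtualized vm86 mode can represent it as-is.
 */
static bool seg_obeys_real_mode_rule(uint16_t selector, uint32_t base)
{
	return base == ((uint32_t)selector << 4);
}
```

A "big real" mode segment, for example selector 0 with a base above 1MB,
fails this check and is the case that has to be forced into conformance.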

To work around this, kvm checks whether a selector/base pair violates the
virtualized vm86 rules, and if so, forces it into conformance.  On a
transition back to protected mode, if we see that the guest did not touch
a forced segment, we restore it to its original protected mode value.

This pile of hacks breaks down if the GDT has changed in real mode, since
a saved selector can then point to a system descriptor instead of a
normal data segment.  In fact, this happens with the Windows bootloader
and the qemu ACPI BIOS, where a protected mode memcpy routine issues an
innocent 'pop %es' and traps on an attempt to load a system descriptor.

"Fix" by checking if the to-be-restored selector points at a system segment,
and if so, coercing it into a normal data segment.  The long term solution,
of course, is to abandon vm86 mode and use emulation for big real mode.
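
A rough sketch of the check and coercion described above, assuming the
VMCS access rights layout (type in bits 3:0, S in bit 4, DPL in bits 6:5,
P in bit 7).  The SKETCH_* names and both helpers are hypothetical and
are not the code in this patch; 0x93-style rights (present, read/write
data, accessed) are just one plausible choice of "normal data segment".

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed access rights bit layout; names are illustrative only. */
#define SKETCH_AR_TYPE_DATA_RW_ACCESSED	0x3u
#define SKETCH_AR_S_MASK		(1u << 4)
#define SKETCH_AR_DPL_MASK		(3u << 5)
#define SKETCH_AR_P_MASK		(1u << 7)

/* S == 0 marks a system descriptor (TSS, LDT, gate, ...). */
static bool ar_is_system_segment(uint32_t ar)
{
	return (ar & SKETCH_AR_S_MASK) == 0;
}

/*
 * Leave code/data segments alone; turn a system descriptor into a
 * present read/write data segment, keeping its DPL.
 */
static uint32_t coerce_to_data_segment(uint32_t ar)
{
	if (!ar_is_system_segment(ar))
		return ar;

	return SKETCH_AR_P_MASK | SKETCH_AR_S_MASK |
	       SKETCH_AR_TYPE_DATA_RW_ACCESSED | (ar & SKETCH_AR_DPL_MASK);
}
```

With this layout, a busy TSS descriptor (access rights 0x8b) would be
coerced to 0x93, while an existing data segment passes through unchanged.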

Signed-off-by: Avi Kivity <avi@qumranet.com>
diff --git a/drivers/kvm/vmx.c b/drivers/kvm/vmx.c
index bfa0ce4..25b2471 100644
--- a/drivers/kvm/vmx.c
+++ b/drivers/kvm/vmx.c
@@ -618,7 +618,7 @@
 {
 	struct kvm_vmx_segment_field *sf = &kvm_vmx_segment_fields[seg];
 
-	if (vmcs_readl(sf->base) == save->base) {
+	if (vmcs_readl(sf->base) == save->base && (save->ar & AR_S_MASK)) {
 		vmcs_write16(sf->selector, save->selector);
 		vmcs_writel(sf->base, save->base);
 		vmcs_write32(sf->limit, save->limit);