x86: fix gart_iommu_init()

When the GART table is unmapped from the kernel direct mappings
during early bootup, make sure we have no leftover cachelines in it.

Note: the clflush done by set_memory_np() was not enough, because
clflush does not work on unmapped pages.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
index 65f6acb..faf3229 100644
--- a/arch/x86/kernel/pci-gart_64.c
+++ b/arch/x86/kernel/pci-gart_64.c
@@ -749,6 +749,15 @@
 	 */
 	set_memory_np((unsigned long)__va(iommu_bus_base),
 				iommu_size >> PAGE_SHIFT);
+	/*
+	 * Tricky. The GART table remaps the physical memory range,
+	 * so the CPU wont notice potential aliases and if the memory
+	 * is remapped to UC later on, we might surprise the PCI devices
+	 * with a stray writeout of a cacheline. So play it sure and
+	 * do an explicit, full-scale wbinvd() _after_ having marked all
+	 * the pages as Not-Present:
+	 */
+	wbinvd();
 
 	/*
 	 * Try to workaround a bug (thanks to BenH)