drm/i915: flush system agent TLBs on SNB

This allows us to map the PTEs WC. I've not done thorough testing or
performance measurements with this patch, but it should be decent.

This is based on a patch from Jesse with the original commit message
> I've only lightly tested this so far, but the corruption seems to be
> gone if I write the GFX_FLSH_CNTL reg after binding an object.  This
> register should control the TLB for the system agent, which is what CPU
> mapped objects will go through.

It has been updated for the new AGP-less code by me, and included with
it is feedback from the original patch.

v2: Updated to reflect paranoia on pte updates/register posting reads.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by [v1]: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 814ed3e..a3e509a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -379,6 +379,13 @@
 	 */
 	if (i != 0)
 		WARN_ON(readl(&gtt_entries[i-1]) != pte_encode(dev, addr, level));
+
+	/* This next bit makes the above posting read even more important. We
+	 * want to flush the TLBs only after we're certain all the PTE updates
+	 * have finished.
+	 */
+	I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN);
+	POSTING_READ(GFX_FLSH_CNTL_GEN6);
 }
 
 void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
@@ -589,8 +596,8 @@
 		goto err_out;
 	}
 
-	dev_priv->mm.gtt->gtt = ioremap(gtt_bus_addr,
-					dev_priv->mm.gtt->gtt_total_entries * sizeof(gtt_pte_t));
+	dev_priv->mm.gtt->gtt = ioremap_wc(gtt_bus_addr,
+					   dev_priv->mm.gtt->gtt_total_entries * sizeof(gtt_pte_t));
 	if (!dev_priv->mm.gtt->gtt) {
 		DRM_ERROR("Failed to map the gtt page table\n");
 		teardown_scratch_page(dev);