drm: radeon add a tcl state flush before accessing tcl vector space

Do a tcl state flush before accessing tcl vector space. This fixes some
more problems with flickering (bug #6637). drm may not be appropriate
place for this, since doing that flush there might both be overkill and
insufficient in some cases. However, it's hard to figure out when that
flush is needed, so this has to suffice. There does not seem to be a
performance penalty associated with it.

From: Roland Scheidegger (DRM CVS)
Signed-off-by: Dave Airlie <airlied@linux.ie>
diff --git a/drivers/char/drm/radeon_state.c b/drivers/char/drm/radeon_state.c
index c5b8f77..4ca6bd1 100644
--- a/drivers/char/drm/radeon_state.c
+++ b/drivers/char/drm/radeon_state.c
@@ -2595,7 +2595,8 @@
 	int stride = header.vectors.stride;
 	RING_LOCALS;
 
-	BEGIN_RING(3 + sz);
+	BEGIN_RING(5 + sz);
+	OUT_RING_REG(RADEON_SE_TCL_STATE_FLUSH, 0);
 	OUT_RING(CP_PACKET0(RADEON_SE_TCL_VECTOR_INDX_REG, 0));
 	OUT_RING(start | (stride << RADEON_VEC_INDX_OCTWORD_STRIDE_SHIFT));
 	OUT_RING(CP_PACKET0_TABLE(RADEON_SE_TCL_VECTOR_DATA_REG, (sz - 1)));