i965/skl: Use an exec size of 8 to initialise the message header

Commit e93566a15c61c33faa changed the message header code needed to
make Skylake use SIMD4x2 so that it uses a register with width 4
instead of 8 as the source register in the send message. However it
also changed the width for the dest in the MOV instruction which is
used to initialise the header register with the values from g0. The
width of the destination is used to determine the exec size in
brw_set_dest so this would end up making the MOV have an exec size of
4. I think this would end up leaving the top half of the register
uninitialised. The top half of the header has meaningful values so
this probably isn't a good idea.

This patch just casts the dest register for the MOV instruction back
to a vec8 to fix it. It doesn't cause any changes to a Piglit run.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index 2743297..7c00020 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1046,7 +1046,7 @@
 
       brw_push_insn_state(p);
       brw_set_default_mask_control(p, BRW_MASK_DISABLE);
-      brw_MOV(p, src, retype(brw_vec4_grf(0, 0), BRW_REGISTER_TYPE_UD));
+      brw_MOV(p, vec8(src), retype(brw_vec8_grf(0, 0), BRW_REGISTER_TYPE_UD));
       brw_set_default_access_mode(p, BRW_ALIGN_1);
 
       brw_MOV(p, get_element_ud(src, 2),