i965: Improve (i.e. remove) some grf-to-mrf unnecessary moves

Several routines directly analyze the grf-to-mrf moves from the Gen
binary code. When it is possible, the mov is removed and the message
register is directly written in the arithmetic instruction

Also redundant mrf-to-grf moves are removed (frequently for example,
when sampling many textures with the same uv)

Code was tested with piglit, warsow and nexuiz on an Ironlake
machine. No regression was found there

Note that the optimizations are *deactivated* on Gen4 and Gen6 since I
did test them properly yet. No reason there are bugs but who knows

The optimizations are currently done in branch free programs *only*.
Considering branches is more complicated and there are actually two
paths: one for branch free programs and one for programs with branches

Also some other optimizations should be done during the emission
itself but considering that some code is shader between vertex shaders
(AOS) and pixel shaders (SOA) and that we may have branches or not, it
is pretty hard to both factorize the code and have one good set of
strategies
1 file changed