Implement support for the MMX instruction set.  The scheme used is
the same as that for FPU instructions.  That is, regard the MMX state
(which is the same as the FPU state) opaquely, and every time we
need to do an MMX instruction, move the simulated MMX state into the
real CPU, do the instruction, and move it back.  JeremyF's optimisation
to minimise FPU saves/restores applies automatically here.

So, this scheme is simple.  It will cause memcheck to complain bitterly
if uninitialised data is copied through the MMX registers, in the same
way that memcheck complains if you move uninit data through the FPU
registers.  Whether this turns out to be a problem remains to be seen.

Most instructions are done, and doing the rest is easy enough; I just
need people to send test cases so I can do them on demand.

(Core) UCode has been extended with 7 new uinstrs:

   MMX1 MMX2 MMX3
      -- 1/2/3 byte mmx insns, no references to
         integer regs or memory, copy exactly to the output stream.

   MMX2_MemRd  MMX2_MemWr
      -- 2 byte mmx insns which read/write memory and therefore need
         to have an address register patched in at code generation
         time.  These are the analogues to FPU_R / FPU_W.

   MMX2_RegRd  MMX2_RegWr
      -- These have no analogues in FPU land.  They hold 2 byte insns
         which move data to/from a normal integer register (%eax etc),
         and so this has to be made explicit so that (1) a suitable
         int reg can be patched in at codegen time, and (2) so that
         memcheck can do suitable magic with the V bits going into/
         out of the MMX regs.
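
As an illustration of "patching in an address register at code
generation time", here is a minimal sketch.  It is not the emitter
added by this commit: the helper name patch_mmx_mem_operand, the
packing of val1 (opcode in the high byte, mod/reg/rm byte in the low
byte) and the leading 0x0F escape byte are assumptions made for the
example.  The %esp and %ebp cases need special encodings on x86,
which is why the mod and r/m fields cannot simply be overwritten.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not Valgrind's emitter): take a 2-byte MMX insn
   packed as (opcode << 8) | modrm, keep its mmxreg field, and rewrite
   mod and r/m so the memory operand becomes (%areg).  Emits the 0x0F
   escape, the opcode and the patched operand bytes into out[], which
   must hold at least 4 bytes.  Returns the number of bytes written. */
size_t patch_mmx_mem_operand(uint16_t insn, unsigned areg, uint8_t *out)
{
    uint8_t opcode = (uint8_t)(insn >> 8);        /* bbbbbbbb          */
    uint8_t modrm  = (uint8_t)(insn & 0xFF);      /* mod mmxreg r/m    */
    uint8_t mmxreg = (uint8_t)((modrm >> 3) & 7); /* field we preserve */
    size_t  n      = 0;

    assert(areg < 8);
    out[n++] = 0x0F;      /* assumed two-byte-opcode escape */
    out[n++] = opcode;

    if (areg == 4) {
        /* %esp: r/m = 100 selects a SIB byte, so encode (%esp) via SIB. */
        out[n++] = (uint8_t)((mmxreg << 3) | 4);        /* mod = 00 */
        out[n++] = 0x24;          /* scale 1, no index, base = %esp */
    } else if (areg == 5) {
        /* %ebp: mod = 00, r/m = 101 means disp32 only, so use
           mod = 01 with an 8-bit displacement of zero: 0(%ebp). */
        out[n++] = (uint8_t)(0x40 | (mmxreg << 3) | 5);
        out[n++] = 0x00;
    } else {
        /* Plain register-indirect: mod = 00, r/m = areg. */
        out[n++] = (uint8_t)((mmxreg << 3) | areg);
    }
    return n;
}
```

With insn = 0x6F4B (a movq load into %mm1) and areg = 0 this yields
0F 6F 08, i.e. movq (%eax), %mm1.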

Nulgrind (ok, this is a nop, but still ...) and AddrCheck's
instrumenters have been extended to cover these new UInstrs.  All
others (cachesim, memcheck, lackey, helgrind, did I forget any?)
abort when they see any of them.  This may be overkill but at least
it ensures we don't forget to implement it in those skins.
[A bad thing would be that some skin silently passes along
MMX uinstrs because of a default: case, when it should actually
do something with them.]
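
The difference between the two behaviours can be sketched as
follows.  This is an illustration only: the Opcode subset, the Action
type and both function names are hypothetical and do not appear in
the patch, and what an extended skin actually does with each uinstr
is an assumption.

```c
#include <assert.h>

/* Hypothetical opcode subset; the MMX names mirror vg_skin.h. */
typedef enum {
    OP_FPU, OP_LEA1,
    OP_MMX1, OP_MMX2, OP_MMX3,
    OP_MMX2_MemRd, OP_MMX2_MemWr,
    OP_MMX2_RegRd, OP_MMX2_RegWr
} Opcode;

/* What the instrumenter decides to do with one uinstr. */
typedef enum { ACT_COPY, ACT_CHECK_ADDR, ACT_PANIC } Action;

/* A skin that has NOT been extended: anything it does not list
   explicitly reaches default and is reported as ACT_PANIC, where a
   real skin would abort.  Had default returned ACT_COPY instead, the
   MMX uinstrs would be passed along silently. */
Action unextended_skin_action(Opcode op)
{
    switch (op) {
    case OP_FPU:
    case OP_LEA1:
        return ACT_COPY;
    default:
        return ACT_PANIC;
    }
}

/* A skin that HAS been extended (assumed address-checking behaviour):
   the memory-touching MMX uinstrs get an address check, the rest are
   copied through unchanged. */
Action extended_skin_action(Opcode op)
{
    switch (op) {
    case OP_FPU:
    case OP_LEA1:
    case OP_MMX1:
    case OP_MMX2:
    case OP_MMX3:
    case OP_MMX2_RegRd:
    case OP_MMX2_RegWr:
        return ACT_COPY;
    case OP_MMX2_MemRd:
    case OP_MMX2_MemWr:
        return ACT_CHECK_ADDR;
    default:
        return ACT_PANIC;
    }
}
```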

If this works out well, I propose to backport this to 2_0_BRANCH.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@1483 a5019735-40e9-0310-863c-91ae7b9d1cf9
diff --git a/include/vg_skin.h b/include/vg_skin.h
index 1ff3f80..71940d7 100644
--- a/include/vg_skin.h
+++ b/include/vg_skin.h
@@ -516,6 +516,41 @@
       FPU,           /* Doesn't touch memory */
       FPU_R, FPU_W,  /* Reads/writes memory  */
 
+      /* ------------ MMX ops ------------ */
+
+      /* 1 byte, no memrefs, no iregdefs, copy exactly to the
+	 output.  Held in val1[7:0]. */
+      MMX1,
+
+      /* 2 bytes, no memrefs, no iregdefs, copy exactly to the
+	 output.  Held in val1[15:0]. */
+      MMX2,
+
+      /* 3 bytes, no memrefs, no iregdefs, copy exactly to the
+         output.  Held in val1[15:0] and val2[7:0]. */
+      MMX3,
+
+      /* 2 bytes, reads/writes mem.  Insns of the form
+         bbbbbbbb:mod mmxreg r/m.
+         Held in val1[15:0], and mod and rm are to be replaced
+         at codegen time by a reference to the Temp/RealReg holding 
+         the address.  Arg2 holds this Temp/Real Reg.
+         Transfer is always at size 8.
+      */
+      MMX2_MemRd,
+      MMX2_MemWr,
+
+      /* 2 bytes, reads/writes an integer register.  Insns of the form
+         bbbbbbbb:11 mmxreg ireg.
+         Held in val1[15:0], and ireg is to be replaced
+         at codegen time by a reference to the relevant RealReg.
+         Transfer is always at size 4.  Arg2 holds this Temp/Real Reg.
+      */
+      MMX2_RegRd,
+      MMX2_RegWr,
+
+      /* ------------------------ */
+
       /* Not strictly needed, but improve address calculation translations. */
       LEA1,  /* reg2 := const + reg1 */
       LEA2,  /* reg3 := const + reg1 + reg2 * 1,2,4 or 8 */
@@ -931,14 +966,18 @@
 #define R_GS 5
 
 /* For pretty printing x86 code */
+extern Char* VG_(name_of_mmx_gran) ( UChar gran );
+extern Char* VG_(name_of_mmx_reg)  ( Int mmxreg );
 extern Char* VG_(name_of_seg_reg)  ( Int sreg );
 extern Char* VG_(name_of_int_reg)  ( Int size, Int reg );
 extern Char  VG_(name_of_int_size) ( Int size );
 
 /* Shorter macros for convenience */
-#define nameIReg  VG_(name_of_int_reg)
-#define nameISize VG_(name_of_int_size)
-#define nameSReg  VG_(name_of_seg_reg)
+#define nameIReg    VG_(name_of_int_reg)
+#define nameISize   VG_(name_of_int_size)
+#define nameSReg    VG_(name_of_seg_reg)
+#define nameMMXReg  VG_(name_of_mmx_reg)
+#define nameMMXGran VG_(name_of_mmx_gran)
 
 /* Randomly useful things */
 extern UInt  VG_(extend_s_8to32) ( UInt x );