Parallel Move Resolver: Perform stack/stack moves first

On register-constrained machines like x86, by the time the other
parallel moves are done there may be no free register left, so a
stack/stack move or swap has to save and restore a register.

To avoid this, perform stack/stack moves first, while there is a good
chance that a destination register is still available as a temporary.
On x86, this avoids a lot of push eax/pop eax code.
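
The effect of the ordering can be illustrated with a small standalone
sketch. This is not the ART code: Loc, Move, CountSpills and
StackMovesFirst are hypothetical names, moves are assumed to be
independent of each other, and a register is assumed to stay free until
some move writes it.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Hypothetical, simplified stand-in for a move operand location.
enum class Loc { kRegister, kStackSlot };

// Hypothetical move record: src/dst are a register number or a stack offset.
struct Move {
  Loc src_kind;
  int src;
  Loc dst_kind;
  int dst;
};

bool IsStackToStack(const Move& m) {
  return m.src_kind == Loc::kStackSlot && m.dst_kind == Loc::kStackSlot;
}

// Counts how many stack/stack moves find no free register when the moves are
// emitted in the given order. Each such move would need a save/restore pair
// (push/pop) around the copy. Simplification: a register leaves the free set
// when a move writes it and never comes back.
int CountSpills(const std::vector<Move>& moves, std::set<int> free_regs) {
  int spills = 0;
  for (const Move& m : moves) {
    if (IsStackToStack(m)) {
      if (free_regs.empty()) {
        ++spills;
      }
    } else if (m.dst_kind == Loc::kRegister) {
      free_regs.erase(m.dst);
    }
  }
  return spills;
}

// Returns the moves with all stack/stack moves placed first, keeping the
// relative order inside each group. Only valid if the moves are independent.
std::vector<Move> StackMovesFirst(std::vector<Move> moves) {
  std::stable_partition(moves.begin(), moves.end(), IsStackToStack);
  return moves;
}
```

With one free register, a load into that register followed by a
stack/stack copy costs one save/restore in this model, while the
reordered list costs none. The real resolver does not reorder a list
like this; it calls PerformMove on each stack/stack move, which also
deals with moves that block one another.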

Change-Id: I57076271b5672c931a93888ff23e30b2567f43b8
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
diff --git a/compiler/optimizing/parallel_move_resolver.cc b/compiler/optimizing/parallel_move_resolver.cc
index 54ea6f1..f9d812f 100644
--- a/compiler/optimizing/parallel_move_resolver.cc
+++ b/compiler/optimizing/parallel_move_resolver.cc
@@ -38,6 +38,20 @@
   // Build up a worklist of moves.
   BuildInitialMoveList(parallel_move);
 
+  // Move stack/stack slot to take advantage of a free register on constrained machines.
+  for (size_t i = 0; i < moves_.Size(); ++i) {
+    const MoveOperands& move = *moves_.Get(i);
+    // Ignore constants and moves already eliminated.
+    if (move.IsEliminated() || move.GetSource().IsConstant()) {
+      continue;
+    }
+
+    if ((move.GetSource().IsStackSlot() || move.GetSource().IsDoubleStackSlot()) &&
+        (move.GetDestination().IsStackSlot() || move.GetDestination().IsDoubleStackSlot())) {
+      PerformMove(i);
+    }
+  }
+
   for (size_t i = 0; i < moves_.Size(); ++i) {
     const MoveOperands& move = *moves_.Get(i);
     // Skip constants to perform them last.  They don't block other moves