Optimization fixes

Two primary fixes.  First, the save/restore mechanism for FP callee saves
was broken if the save mask had any holes.  The ARM load/store multiple
instructions for floating point take a start register plus a count,
rather than the bit mask used for core registers, so a mask with holes
cannot be encoded directly.
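
To illustrate the start + count constraint, here is a minimal sketch of
one way to turn a save mask with holes into a single contiguous range.
FpRange and CoveringRange are hypothetical names, not code from this
change, and widening the range over the holes is only one possible
approach (splitting the mask into several runs is another).

```cpp
#include <cstdint>

// Hypothetical helper type, not taken from the patch.
struct FpRange {
    int start;  // index of the first register in the range
    int count;  // number of consecutive registers in the range
};

// Returns the smallest contiguous range that covers every set bit in
// mask.  Registers in holes are included, since a start + count
// encoding cannot skip them.  An empty mask yields {0, 0}.
FpRange CoveringRange(uint32_t mask) {
    if (mask == 0) {
        return {0, 0};
    }
    int lo = 0;
    while (((mask >> lo) & 1u) == 0) {
        ++lo;  // advance to the lowest set bit
    }
    int hi = 31;
    while (((mask >> hi) & 1u) == 0) {
        --hi;  // back up to the highest set bit
    }
    return {lo, hi - lo + 1};
}
```

For the mask 0xB (bits 0, 1 and 3 set, bit 2 a hole) this gives start 0
and count 4.  Counting the set bits would give 3, which describes
registers 0 through 2 and misses register 3.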

The second fix corrects a problem introduced by the recent enhancements
to loading floating point literals.  The load->copy optimization for
literal loads used the value of the loaded literal to identify
redundant loads.  However, it used only the first 32 bits of the
literal, which worked previously because 64-bit literal loads were
treated as a pair of 32-bit loads.  With true 64-bit loads, two
different literals that share those 32 bits were wrongly treated as
the same.  The fix is to record the label of the literal, rather than
its value, in the aliasInfo, which works for all sizes.
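
The collision can be sketched as follows.  This is an illustration
only: Literal, SameByValue and SameByLabel are hypothetical names, the
integer label stands in for whatever identifies the literal's label in
the compiler, and treating "the first 32 bits" as the low word is an
assumption.

```cpp
#include <cstdint>

// Hypothetical literal-pool entry, not taken from the patch.
struct Literal {
    uint64_t value;  // literal bits (32- or 64-bit payload)
    int label;       // unique id standing in for the literal's label
};

// Old-style comparison: only 32 bits of the value are compared
// (assumed here to be the low word).
bool SameByValue(const Literal& a, const Literal& b) {
    return static_cast<uint32_t>(a.value) == static_cast<uint32_t>(b.value);
}

// New-style comparison: two loads match only if they refer to the
// same literal label, regardless of the literal's size.
bool SameByLabel(const Literal& a, const Literal& b) {
    return a.label == b.label;
}
```

With a = {0x0000000100000000, 0} and b = {0x0000000200000000, 1},
SameByValue(a, b) is true even though the literals differ, while
SameByLabel(a, b) is false.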

Change-Id: Ic4779adf73b2c7d80059a988b0ecdef39921a81f
diff --git a/src/compiler/codegen/arm/ArchUtility.cc b/src/compiler/codegen/arm/ArchUtility.cc
index 3ceffae..edce114 100644
--- a/src/compiler/codegen/arm/ArchUtility.cc
+++ b/src/compiler/codegen/arm/ArchUtility.cc
@@ -404,7 +404,8 @@
     LOG(INFO) << "Regs (excluding ins) : " << cUnit->numRegs;
     LOG(INFO) << "Ins                  : " << cUnit->numIns;
     LOG(INFO) << "Outs                 : " << cUnit->numOuts;
-    LOG(INFO) << "Spills               : " << cUnit->numSpills;
+    LOG(INFO) << "CoreSpills           : " << cUnit->numCoreSpills;
+    LOG(INFO) << "FPSpills             : " << cUnit->numFPSpills;
     LOG(INFO) << "Padding              : " << cUnit->numPadding;
     LOG(INFO) << "Frame size           : " << cUnit->frameSize;
     LOG(INFO) << "Start of ins         : " << cUnit->insOffset;