Fix perf regression in Color32.

The regression was caused by calling PlatformColorProc() for every span, which
in turn executes CPUID, a fairly expensive instruction.  Since we draw a lot of
rects, and rects have 1-pixel-wide spans for the vertical segments, that adds
up to a lot of CPUID calls.

Fixed by caching the result of PlatformColorProc(), as is done for the other
platform-specific blitters.
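
A minimal sketch of the caching pattern described above, not the actual patch.
All names here (ColorProc, platform_color_proc, portable_color32, color32,
g_probe_calls) are hypothetical stand-ins, and the probe only pretends to run
CPUID.  A function-local static is one way to make sure the probe runs once;
the real change may cache the pointer differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical signature for a "fill a span with one color" routine.
using ColorProc = void (*)(uint32_t* dst, size_t count, uint32_t color);

// Counts how often the expensive probe runs (demonstration only).
static int g_probe_calls = 0;

// Portable fallback: write `color` into `count` pixels.
static void portable_color32(uint32_t* dst, size_t count, uint32_t color) {
    for (size_t i = 0; i < count; ++i) {
        dst[i] = color;
    }
}

// Stand-in for PlatformColorProc().  Assume this is where CPUID would be
// executed; this sketch never finds a specialized routine.
static ColorProc platform_color_proc() {
    ++g_probe_calls;
    return nullptr;
}

// Pick the platform routine if there is one, else the portable fallback.
static ColorProc choose_color_proc() {
    ColorProc proc = platform_color_proc();
    return proc ? proc : portable_color32;
}

// Per-span entry point.  The static local is initialized on the first call
// only, so later spans reuse the cached pointer and skip the probe.
static void color32(uint32_t* dst, size_t count, uint32_t color) {
    static const ColorProc proc = choose_color_proc();
    assert(proc != nullptr);
    proc(dst, count, color);
}
```

With this shape, drawing many 1-pixel-wide spans costs one probe in total
instead of one probe per span.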

Review URL:  http://codereview.appspot.com/3669042/



git-svn-id: http://skia.googlecode.com/svn/trunk@636 2bbb7eff-a529-9590-31e7-b0007b416f81
diff --git a/Makefile b/Makefile
index 65f0b71..146d11a 100644
--- a/Makefile
+++ b/Makefile
@@ -94,6 +94,7 @@
 
 # For these files, and these files only, compile with -msse2.
 SSE2_OBJS := out/src/opts/SkBlitRow_opts_SSE2.o \
+             out/src/opts/SkBitmapProcState_opts_SSE2.o \
              out/src/opts/SkUtils_opts_SSE2.o
 $(SSE2_OBJS) : CFLAGS := $(CFLAGS_SSE2)