Add an inline version of premultiply to speed up gradient cache creation.

We could speed this up further if we:
- respected kDither and built only half of the table for non-dither requests
- passed simple params to the GPU rather than always generating a texture
- detected that the gradient has no alpha, so we can skip the per-entry premul
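
For reference, a minimal sketch of what an inline premultiply can look like. This is not the Skia implementation: the helper names (mul_div_255_round, premultiply_argb_inline) are hypothetical, and the packed layout is assumed to be fixed A,R,G,B from high byte to low byte, whereas Skia's real packing is platform-dependent.

```cpp
#include <cstdint>

// Hypothetical helper: rounded (value * alpha) / 255 without a divide.
// Both inputs are assumed to be in the range 0..255.
static inline uint32_t mul_div_255_round(uint32_t value, uint32_t alpha) {
    uint32_t prod = value * alpha + 128;
    return (prod + (prod >> 8)) >> 8;
}

// Hypothetical inline premultiply: scales r, g, b by alpha and packs the
// result as 0xAARRGGBB (assumed layout, for illustration only).
static inline uint32_t premultiply_argb_inline(uint32_t a, uint32_t r,
                                               uint32_t g, uint32_t b) {
    if (a != 255) {
        // Opaque pixels skip the three multiplies entirely.
        r = mul_div_255_round(r, a);
        g = mul_div_255_round(g, a);
        b = mul_div_255_round(b, a);
    }
    return (a << 24) | (r << 16) | (g << 8) | b;
}
```

Because the function is defined inline, the compiler can expand it inside the cache-building loop and avoid a call per table entry. The alpha check here is still per entry; the last item above would hoist that decision out of the loop.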



git-svn-id: http://skia.googlecode.com/svn/trunk@1772 2bbb7eff-a529-9590-31e7-b0007b416f81
diff --git a/src/effects/SkGradientShader.cpp b/src/effects/SkGradientShader.cpp
index d8ffe5a..f44e038 100644
--- a/src/effects/SkGradientShader.cpp
+++ b/src/effects/SkGradientShader.cpp
@@ -538,11 +538,11 @@
     b = SkIntToFixed(b) + 0x8000;
 
     do {
-        cache[0] = SkPreMultiplyARGB(a >> 16, r >> 16, g >> 16, b >> 16);
-        cache[kCache32Count] = SkPreMultiplyARGB(dither_ceil_fixed_to_8(a),
-                                                 dither_fixed_to_8(r),
-                                                 dither_fixed_to_8(g),
-                                                 dither_fixed_to_8(b));
+        cache[0] = SkPremultiplyARGBInline(a >> 16, r >> 16, g >> 16, b >> 16);
+        cache[kCache32Count] = SkPremultiplyARGBInline(dither_ceil_fixed_to_8(a),
+                                                       dither_fixed_to_8(r),
+                                                       dither_fixed_to_8(g),
+                                                       dither_fixed_to_8(b));
         cache += 1;
         a += da;
         r += dr;