Replace interp() with clut_{3,4}D stages.

As a start, I tried to follow exactly the same strategy as interp()
(though I did fix the off-by-one dimensions).

Now that I've looked at the call sites, it does rather look like
we only need the 3D and 4D cases.

Looks like about a 20% speedup.
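
For illustration only, here is a scalar sketch of how a 3D stage might
read the new SkJumper_ColorLookupTableCtx. It assumes limits[i] is the
number of grid points on axis i and that table stores interleaved RGB
floats with the last axis varying fastest. Neither assumption is
confirmed by this change, and clut_3D_sketch is a hypothetical helper,
not the stage added here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Mirrors the struct this change adds to SkJumper.h.
struct SkJumper_ColorLookupTableCtx {
    const float* table;
    int limits[4];
};

// Hypothetical helper: trilinear lookup of rgb (each in [0,1]) in a 3D table.
// ASSUMPTION: limits[i] is the grid-point count on axis i (must be >= 1).
// ASSUMPTION: table holds RGB triples, last axis varying fastest.
static void clut_3D_sketch(const SkJumper_ColorLookupTableCtx* ctx, float rgb[3]) {
    int   lo[3], hi[3];
    float t[3];
    for (int i = 0; i < 3; i++) {
        assert(ctx->limits[i] >= 1);
        // Scale by (count - 1) so an input of 1.0 lands on the last grid point.
        float x = std::min(std::max(rgb[i], 0.0f), 1.0f) * (ctx->limits[i] - 1);
        lo[i] = (int)x;
        hi[i] = std::min(lo[i] + 1, ctx->limits[i] - 1);
        t[i]  = x - lo[i];
    }

    float out[3] = {0.0f, 0.0f, 0.0f};
    // Blend the 8 corners of the enclosing grid cell.
    for (int corner = 0; corner < 8; corner++) {
        int   idx[3];
        float w = 1.0f;
        for (int i = 0; i < 3; i++) {
            bool upper = (corner >> i) & 1;
            idx[i] = upper ? hi[i] : lo[i];
            w     *= upper ? t[i]  : 1.0f - t[i];
        }
        const float* entry = ctx->table
            + 3 * ((idx[0] * ctx->limits[1] + idx[1]) * ctx->limits[2] + idx[2]);
        for (int c = 0; c < 3; c++) {
            out[c] += w * entry[c];
        }
    }
    for (int c = 0; c < 3; c++) {
        rgb[c] = out[c];
    }
}
```

Scaling by (count - 1) rather than count is the kind of detail where an
off-by-one in the dimensions tends to creep in.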

Change-Id: I8b1af64750ad1750716ee1ab0767e64591c7206a
Reviewed-on: https://skia-review.googlesource.com/32842
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Brian Osman <brianosman@google.com>
diff --git a/src/jumper/SkJumper.h b/src/jumper/SkJumper.h
index a22bb22..d4e8ef4 100644
--- a/src/jumper/SkJumper.h
+++ b/src/jumper/SkJumper.h
@@ -121,4 +121,9 @@
     uint32_t rgba;
 };
 
+struct SkJumper_ColorLookupTableCtx {
+    const float* table;
+    int limits[4];
+};
+
 #endif//SkJumper_DEFINED