jumper, rework callback a bit, use it for color_lookup_table

Looks like the color-space images have this well tested (even without
lab_to_xyz) and the diffs look like rounding/FMA.

The old plan to keep loads and stores outside the callback was:
  1) awkward, with too many pointers and pointers to pointers to track
  2) misguided... load and store stages march ahead by x,
     working at ptr+0, ptr+8, ptr+16, etc. while callback
     always wants to be working at the same spot in the buffer.

I spent a frustrating day in lldb to understand 2).  :/

So now the stage always store4's its pixels to a buffer in the context
before the callback, and when the callback returns it load4's them back
from a pointer in the context, defaulting to that same buffer.
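
A rough sketch of that flow, for illustration only. CallbackCtx,
callback_stage, invert_red, the field names, and the 8-lane width are all
hypothetical stand-ins, not the actual jumper code:

```cpp
constexpr int kLanes = 8;  // assumed vector width, for illustration only

// Hypothetical context: a scratch buffer plus a pointer to reload from.
struct CallbackCtx {
    void (*fn)(CallbackCtx* self, int active_pixels) = nullptr;
    float rgba[4 * kLanes] = {};     // interleaved RGBA scratch buffer
    const float* read_from = rgba;   // defaults to that same buffer
};

// Hypothetical stage: store4 into the context, call back, load4 back out.
void callback_stage(CallbackCtx* ctx,
                    float r[], float g[], float b[], float a[], int n) {
    // "store4": planar registers -> interleaved pixels at a fixed spot.
    for (int i = 0; i < n; i++) {
        ctx->rgba[4*i + 0] = r[i];
        ctx->rgba[4*i + 1] = g[i];
        ctx->rgba[4*i + 2] = b[i];
        ctx->rgba[4*i + 3] = a[i];
    }

    ctx->fn(ctx, n);  // the callback always works at the same place

    // "load4": reload from wherever the callback left read_from pointing.
    for (int i = 0; i < n; i++) {
        r[i] = ctx->read_from[4*i + 0];
        g[i] = ctx->read_from[4*i + 1];
        b[i] = ctx->read_from[4*i + 2];
        a[i] = ctx->read_from[4*i + 3];
    }
}

// Example callback: edits the scratch buffer in place, leaves read_from alone.
void invert_red(CallbackCtx* self, int active_pixels) {
    for (int i = 0; i < active_pixels; i++) {
        self->rgba[4*i + 0] = 1.0f - self->rgba[4*i + 0];
    }
}
```

Because the buffer lives in the context, the callback never needs to know
how far the surrounding load and store stages have advanced in x.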

Instead of passing a void* into the callback, we pass the context
itself.  This lets us subclass the context and add our own data...
C-compatible object-oriented programming.
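
A minimal sketch of the subclassing idea, assuming a base context shaped
roughly as described above. CallbackCtx, ScaleCtx, scale_pixels, and their
fields are hypothetical names, not the real ones:

```cpp
// Hypothetical base context, as seen by the stage.
struct CallbackCtx {
    void (*fn)(CallbackCtx* self, int active_pixels) = nullptr;
    float rgba[4 * 8] = {};          // scratch pixels written by the stage
    const float* read_from = rgba;   // where the stage reloads pixels from
};

// "Subclass": inherits the base and appends callback-specific data.
struct ScaleCtx : CallbackCtx {
    float scale = 1.0f;
    float out[4 * 8] = {};           // separate output buffer
};

// The callback receives the base pointer and downcasts to reach its data.
void scale_pixels(CallbackCtx* self, int active_pixels) {
    auto* ctx = static_cast<ScaleCtx*>(self);
    for (int i = 0; i < 4 * active_pixels; i++) {
        ctx->out[i] = ctx->rgba[i] * ctx->scale;
    }
    // Redirect the stage to reload from our buffer instead of the default.
    ctx->read_from = ctx->out;
}
```

The stage only ever sees a CallbackCtx*, so no extra void* argument is
needed to carry per-callback state.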

Change-Id: I7a03439b3abd2efb000a6973631a9336452e9a43
Reviewed-on: https://skia-review.googlesource.com/13985
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
7 files changed