Split srgb out of accum stages.

By stashing the scales in the context (i.e. on the stack), we can free up enough registers to really simplify how the bitmap sample stages interact.

Nearest neighbor is straightforward now: just call the appropriate gather_ function, and you're done.  The source pixels end up in the source registers.  If they're sRGB encoded, follow up with from_srgb_s

To bilerp, we bracket those 1 or 2 gather+from_srgb_s stages with a stage setting up each corner (x += dx, y += dy, save off scale) and a stage that accumulates into the d-registers (load saved scale, dr += scale * r, etc.).  When all the samples are accumulated, copy the d-registers into the s-registers.

from_srgb_d and to_srgb are lightly sketched here and will be used in the next CL, where I apply this same factoring to non-bitmap loads and stores.  This is a little tricky, because we don't actually have a float->float to_srgb yet.

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD

Change-Id: I272a1f278f0ea1b29a2f07ac225f753faa8dae81
Reviewed-on: https://skia-review.googlesource.com/5271
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
4 files changed