Some simple pipeline refactoring.

This is a batch of little tweaks that all preserve the existing logical behavior:
  - rename dst to move_dst_src to parallel move_src_dst
  - remove unused swap_src_dst
  - move swap_rb up with the other utility stages
  - factor out from_8888() to parallel from_565() and from_4444()
  - factor out gather() from the accum_* stages

This changes the order of the math in accum_8888[_srgb] ever so slightly, from (scale * C) * (1/255.0f) to scale * (1/255.0f * C).  Floating-point multiplication is not associative, so the reordering causes a few pixel diffs, but nothing noticeable.  It also makes the 8888 bilerp logic consistent with the other formats, which all convert to [0,1] float before being scaled.
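
To illustrate the reordering above, here is a minimal scalar sketch.  The function names old_order, new_order, and max_order_diff are hypothetical and exist only for this example; the real stages work on vectors of pixels, not single floats.  It only shows that the two groupings can round differently while staying within a few ulps of each other.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical scalar stand-in for the previous ordering:
// scale the raw byte value first, then normalize to [0,1].
static float old_order(float scale, uint8_t c) {
    return (scale * static_cast<float>(c)) * (1 / 255.0f);
}

// Hypothetical scalar stand-in for the new ordering:
// normalize the byte to [0,1] first, then scale.
static float new_order(float scale, uint8_t c) {
    return scale * ((1 / 255.0f) * static_cast<float>(c));
}

// Largest absolute difference between the two orderings over
// every possible byte value, for one given scale.
static float max_order_diff(float scale) {
    float worst = 0.0f;
    for (int c = 0; c <= 255; c++) {
        float d = std::fabs(old_order(scale, static_cast<uint8_t>(c)) -
                            new_order(scale, static_cast<uint8_t>(c)));
        if (d > worst) { worst = d; }
    }
    return worst;
}
```

For scales in [0,1] the results are at most 1, so any difference is on the order of float rounding error, far below one 8-bit step (1/255).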

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD

Change-Id: Id37857b91be3086565169dcc9b1a537574e532aa
Reviewed-on: https://skia-review.googlesource.com/5226
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>