8-bit jumper on ARMv8

The GM diffs are all minor and what you'd expect.

I did a quick performance sanity check, which also looks fine.

  $ out/ok bench rp filter:search=Modulate
    [blendmode_rect_Modulate] 30.2ms  @0  32ms    @95 32ms    @100
    [blendmode_mask_Modulate] 12.6ms  @0  12.6ms  @95 14.5ms  @100
  ~~~>
    [blendmode_rect_Modulate] 11.2ms  @0  11.7ms  @95 12.4ms  @100
    [blendmode_mask_Modulate] 10.5ms  @0  23.6ms  @95 23.9ms  @100

This isn't even really the fastest we can make 8-bit go on ARMv8;
it's actually much more natural to work de-interleaved there.  Lots
of room to follow up.
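
For reference, here is a rough scalar sketch of what working on separate
R, G, B, A planes (rather than on interleaved RGBA bytes) could look like
for Modulate.  It is not the code in this change: kLanes, Planes, load4,
store4, mul_div255 and modulate8 are all hypothetical names, and the
rounding in mul_div255 is an assumption rather than whatever the jumper
stages actually use.  On ARMv8 the load4/store4 loops correspond to what
the NEON intrinsics vld4_u8 and vst4_u8 do for 8 pixels at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical: number of pixels handled per step (vld4_u8 loads 8).
constexpr size_t kLanes = 8;

// One plane per channel, i.e. the de-interleaved form of 8 RGBA pixels.
struct Planes {
    uint8_t r[kLanes], g[kLanes], b[kLanes], a[kLanes];
};

// Rounded (x*y)/255 in integer math.  The rounding mode is an assumption.
static inline uint8_t mul_div255(uint8_t x, uint8_t y) {
    uint32_t p = uint32_t(x) * y + 128;
    return static_cast<uint8_t>((p + (p >> 8)) >> 8);
}

// Split 8 interleaved RGBA pixels into planes (scalar stand-in for vld4_u8).
Planes load4(const uint8_t* rgba) {
    assert(rgba != nullptr);
    Planes p;
    for (size_t i = 0; i < kLanes; i++) {
        p.r[i] = rgba[4*i + 0];
        p.g[i] = rgba[4*i + 1];
        p.b[i] = rgba[4*i + 2];
        p.a[i] = rgba[4*i + 3];
    }
    return p;
}

// Re-interleave planes back into RGBA memory (scalar stand-in for vst4_u8).
void store4(uint8_t* rgba, const Planes& p) {
    assert(rgba != nullptr);
    for (size_t i = 0; i < kLanes; i++) {
        rgba[4*i + 0] = p.r[i];
        rgba[4*i + 1] = p.g[i];
        rgba[4*i + 2] = p.b[i];
        rgba[4*i + 3] = p.a[i];
    }
}

// Modulate: dst = src * dst / 255, applied per channel on 8 pixels.
void modulate8(uint8_t* dst, const uint8_t* src) {
    Planes s = load4(src);
    Planes d = load4(dst);
    for (size_t i = 0; i < kLanes; i++) {
        d.r[i] = mul_div255(s.r[i], d.r[i]);
        d.g[i] = mul_div255(s.g[i], d.g[i]);
        d.b[i] = mul_div255(s.b[i], d.b[i]);
        d.a[i] = mul_div255(s.a[i], d.a[i]);
    }
    store4(dst, d);
}
```

Modulate treats every channel the same way, so the planar layout buys
little for this one stage.  The layout matters more for stages that need
a per-pixel alpha, where the alpha plane is then already a whole vector.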

Change-Id: I86b1099f6742bcb0b8b4fa153e85eaba9567cbf7
Reviewed-on: https://skia-review.googlesource.com/39740
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
4 files changed