8-bit hacking
I think we can replace a lot of legacy code with an SkRasterPipeline
backend that works in 8-bit and stays interleaved. Think of this as a
"lowerp" replacement for lowp.
I'm having some trouble getting ARMv8 working.
ARMv7 should be fine, but I want to turn it on separately from x86.
I haven't looked at 32-bit x86 yet, but that's also on the todo list.
Open questions to follow up on:
  - is it better to fold every multiply back down to 8-bit
    (as seen here), or to allow intermediates to accumulate
    in 16-bit and divide by 255 when done/needed?
  - is it better to pass tightly packed 8-bit vectors between stages (as
    seen here), or to keep the 8-bit values unpacked in 16-bit lanes?
  - should we make V wider than 1 register?
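
To make the first open question concrete, here is a minimal scalar sketch of the two options for a lerp. It is not the code in this change: div255, mul8, lerp_folded and lerp_deferred are hypothetical names, and the rounding shown (exact round-to-nearest) is an assumption rather than what the backend necessarily does.

```cpp
#include <cstdint>

// Hypothetical helper: exact round-to-nearest v/255 for v in [0, 255*255],
// using only adds and shifts so it maps well onto 16-bit SIMD lanes.
static inline uint16_t div255(uint16_t v) {
    uint16_t t = static_cast<uint16_t>(v + 128);
    return static_cast<uint16_t>((t + (t >> 8)) >> 8);
}

// Option A building block: fold a single multiply straight back to 8-bit.
static inline uint8_t mul8(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>(div255(static_cast<uint16_t>(a * b)));
}

// Option A: every multiply is rounded to 8-bit before the add,
// so the result carries two rounding steps.
static inline uint8_t lerp_folded(uint8_t from, uint8_t to, uint8_t t) {
    return static_cast<uint8_t>(mul8(from, static_cast<uint8_t>(255 - t)) +
                                mul8(to, t));
}

// Option B: accumulate both products in 16-bit and divide by 255 once.
// from*(255-t) + to*t never exceeds 255*255, so it fits in 16 bits.
static inline uint8_t lerp_deferred(uint8_t from, uint8_t to, uint8_t t) {
    uint16_t acc = static_cast<uint16_t>(from * (255 - t) + to * t);
    return static_cast<uint8_t>(div255(acc));
}
```

With this rounding the two variants can differ by at most one code value per lerp; the trade-off is extra rounding steps in option A against wider intermediates (and so fewer pixels per register) in option B.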
GMs look good. All diffs invisible and plausibly due to the 15->8 bit
precision drop. A quick bench run showed this running in about 0.75x
the time of the existing lowp backend.
Change-Id: I24aa46ff1d19c0b9b8dc192d5b1821cab0b8843c
Reviewed-on: https://skia-review.googlesource.com/29886
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Florin Malita <fmalita@chromium.org>
6 files changed