use NEON 8-bit stages on ARMv7 too

The 8-bit NEON stages use almost nothing that is specific to ARMv8, so
the same code extends naturally to ARMv7.
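
As a rough illustration of why this ports so easily (a hypothetical
sketch, not code from this change): a typical 8-bit operation such as a
rounded multiply by a coverage or alpha byte needs only intrinsics that
exist in both ARMv7 NEON and ARMv8.  The names mul_div255 and
mul_div255_x8 are made up for this example.

```cpp
#include <cstdint>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    #include <arm_neon.h>
#endif

// Hypothetical helper: exact rounded p/255 for p in [0, 255*255].
constexpr uint8_t div255(uint32_t p) {
    return uint8_t((p + ((p + 128) >> 8) + 128) >> 8);
}

// Hypothetical helper: rounded (a*b)/255 for one pair of bytes.
constexpr uint8_t mul_div255(uint8_t a, uint8_t b) {
    return div255(uint32_t(a) * b);
}

// Hypothetical 8-lane version of the same math.
inline void mul_div255_x8(const uint8_t a[8], const uint8_t b[8],
                          uint8_t out[8]) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    // Widening multiply, rounding shift, rounding add-and-narrow:
    // all three intrinsics are plain NEON, available on ARMv7 and ARMv8.
    uint16x8_t p = vmull_u8(vld1_u8(a), vld1_u8(b));
    vst1_u8(out, vraddhn_u16(p, vrshrq_n_u16(p, 8)));
#else
    // Portable fallback so the sketch builds on non-ARM hosts.
    for (int i = 0; i < 8; i++) {
        out[i] = mul_div255(a[i], b[i]);
    }
#endif
}
```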

Note that unlike the float stages, we're not requiring VFPv4 either,
just NEON.  VFPv4 is for FMA and F16<->F32 conversion, both of which are
unnecessary for the integer pipeline.
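
The gating described above could be sketched like this.  This is a
minimal illustration under assumptions, not the actual build logic in
this change: the constant and function names are hypothetical, and
__ARM_FEATURE_FMA is used here as a stand-in for "VFPv4 is available".

```cpp
// ACLE feature macros; older GCC spells the NEON one __ARM_NEON__.
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    constexpr bool kHasNeon = true;
#else
    constexpr bool kHasNeon = false;
#endif

// Assumption: treat AArch64 or the ACLE FMA macro as "VFPv4-class"
// (FMA and F16<->F32 conversion available).
#if defined(__aarch64__) || defined(__ARM_FEATURE_FMA)
    constexpr bool kHasFmaAndF16 = true;
#else
    constexpr bool kHasFmaAndF16 = false;
#endif

// Hypothetical predicate: float stages want NEON plus FMA/F16 support.
constexpr bool can_use_float_stages(bool neon, bool fma_f16) {
    return neon && fma_f16;
}

// Hypothetical predicate: the integer pipeline only needs NEON.
constexpr bool can_use_8bit_stages(bool neon, bool /*fma_f16*/) {
    return neon;
}

constexpr bool kUseFloatStages = can_use_float_stages(kHasNeon, kHasFmaAndF16);
constexpr bool kUse8BitStages  = can_use_8bit_stages(kHasNeon, kHasFmaAndF16);
```

Under these assumptions, an ARMv7 target with NEON but without VFPv4
would enable the 8-bit stages and skip the float ones.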

GM results and the perf improvement are similar to the previous ARMv8
change.

Change-Id: Id618801ea1920564c1deee144a640a4133c4505f
Reviewed-on: https://skia-review.googlesource.com/39840
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
4 files changed