Add mulHi to SkNx

Add mulHi to the base SkNx template, and add specialized Sk4u
implementations for NEON and SSE.
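
For reference, a scalar sketch of the per-lane semantics, assuming mulHi
returns the upper 32 bits of the full 64-bit product of two unsigned
32-bit lanes. The names mul_hi_u32 and mul_hi_4 are hypothetical and are
not part of SkNx; this is not the code in this change.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference: upper 32 bits of the 64-bit product a * b.
static inline uint32_t mul_hi_u32(uint32_t a, uint32_t b) {
    return static_cast<uint32_t>((static_cast<uint64_t>(a) * b) >> 32);
}

// Hypothetical 4-lane version, standing in for what an Sk4u operation
// would compute lane by lane (assumption, not the SkNx implementation).
static inline std::array<uint32_t, 4> mul_hi_4(const std::array<uint32_t, 4>& a,
                                               const std::array<uint32_t, 4>& b) {
    std::array<uint32_t, 4> r{};
    for (std::size_t i = 0; i < r.size(); ++i) {
        r[i] = mul_hi_u32(a[i], b[i]);
    }
    return r;
}
```

A portable fallback can be written this way; the NEON and SSE
specializations exist because those instruction sets offer widening
multiplies that avoid the per-lane 64-bit scalar work.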

Add casts for converting 4 x uint8_t vectors to 4 x uint32_t vectors.
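
A scalar sketch of what such a cast computes, assuming each 8-bit lane
is zero-extended to 32 bits. The name widen_u8x4 is hypothetical and is
not the cast added in this change.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical reference: zero-extend each uint8_t lane to a uint32_t lane.
static inline std::array<uint32_t, 4> widen_u8x4(const std::array<uint8_t, 4>& src) {
    std::array<uint32_t, 4> dst{};
    for (std::size_t i = 0; i < dst.size(); ++i) {
        dst[i] = static_cast<uint32_t>(src[i]);  // unsigned source, so no sign extension
    }
    return dst;
}
```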

Cq-Include-Trybots: skia.primary:Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: I29a32e2ad9812a47fff841ceca334e562362836f
Reviewed-on: https://skia-review.googlesource.com/57960
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Herb Derby <herb@google.com>
4 files changed