impl gather32 for x86

Some TODOs left over to make the scalar
tail case better... as is it issues a
256-bit gather for each 32-bit load!

I added a trimmed down variant of the existing
SkVM_gathers unit test to test just gather32,
covering this new JIT code.

Change-Id: Iabd2e6a61f0213b6d02d222b9f7aec2be000b70b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/264217
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
2 files changed