baby's first use of st2/st4

This isn't really how I intended to use st2/st4, but we might as well
check whether the registers already happen to be lined up the way we'd
like. And it happens quite often. I didn't gather detailed numbers, but
both sides of the "are we lined up?" conditions are hit many times, in
both store64 and store128.

Also rewrote the scalar flow of store128 to mirror store64. The old
code was just fine, but this makes it easier to follow the conditions
under which we can use st4. (This does remind me that we could also use
single-lane st2/st4/ld2/ld4 instructions to handle the scalar paths.)
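
For readers unfamiliar with st2/st4, here is a rough sketch of what an
"are we lined up?" check could look like. It is not the code in this
change. It assumes "lined up" means the source vectors sit in consecutive
registers (wrapping from v31 to v0), which is the register-list form the
AArch64 st2/st4 encodings accept; the actual check may be stricter. The
names lined_up and st2_model are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: true when the N register numbers are consecutive
// modulo 32, i.e. usable directly as an st2 (N=2) or st4 (N=4) register list.
template <std::size_t N>
bool lined_up(const std::array<int, N>& regs) {
    assert(regs[0] >= 0 && regs[0] < 32);  // v0..v31
    for (std::size_t i = 1; i < N; i++) {
        if (regs[i] != (regs[0] + static_cast<int>(i)) % 32) {
            return false;
        }
    }
    return true;
}

// Scalar model of a 4-lane, 32-bit st2: the two source vectors are
// interleaved in memory as lo[0], hi[0], lo[1], hi[1], ...
std::vector<std::uint32_t> st2_model(const std::array<std::uint32_t, 4>& lo,
                                     const std::array<std::uint32_t, 4>& hi) {
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < 4; i++) {
        out.push_back(lo[i]);
        out.push_back(hi[i]);
    }
    return out;
}
```

When lined_up() is false, a JIT would need some other route to the same
interleaved memory layout, such as shuffling the vectors first.
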
Change-Id: Id2c94f68e5ea14031f7a23bdc76583dff4a7b65f
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/356436
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
1 file changed