Fix Sk8f::Store4 (for HSW)
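This should fix the colorspacexform gm in Gold. The _mm256_permute2f128_ps
immediates were written as if each lane selector were a 3-bit field, but the
instruction reads a 4-bit field per 128-bit lane, so 16 and 25 did not
select lo,lo and hi,hi. The correct immediates are 32 and 49.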
https://gold.skia.org/search?head=true&include=false&limit=50&neg=false&pos=false&query=name%3Dcolorspacexform%26source_type%3Dgm&unt=true
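For reviewers who want to check the immediates by hand, here is a minimal scalar sketch of how the control byte is decoded. It is a model, not Skia code: the names permute2f128_model, select_lane, kZeroed and check_store4_immediates are hypothetical, and each 128-bit lane is reduced to a single int tag (the pixel index it carries). It assumes the documented VPERM2F128 layout, where bits 1:0 pick the source for the low lane, bits 5:4 pick the source for the high lane, and bits 3 and 7 zero the respective lane.

```cpp
#include <array>
#include <cassert>

// Hypothetical scalar model: a 256-bit vector is two 128-bit lanes {lo, hi},
// and each lane is represented by one int tag (the pixel index it holds).
using Vec2Lanes = std::array<int, 2>;

// Tag used by this model for a lane that the instruction zeroes.
const int kZeroed = -1;

// Decode one 4-bit selector: bit 3 zeroes the lane, bits 1:0 choose among
// a.lo (0), a.hi (1), b.lo (2), b.hi (3).
static int select_lane(const Vec2Lanes& a, const Vec2Lanes& b, unsigned sel) {
    if (sel & 0x8) {
        return kZeroed;
    }
    switch (sel & 0x3) {
        case 0:  return a[0];
        case 1:  return a[1];
        case 2:  return b[0];
        default: return b[1];
    }
}

// Model of _mm256_permute2f128_ps(a, b, imm): the low nibble of imm controls
// the low result lane, the high nibble controls the high result lane.
static Vec2Lanes permute2f128_model(const Vec2Lanes& a, const Vec2Lanes& b,
                                    unsigned imm) {
    Vec2Lanes out;
    out[0] = select_lane(a, b, imm & 0xF);
    out[1] = select_lane(a, b, (imm >> 4) & 0xF);
    return out;
}

// Compare the old and new immediates using _04 = {0, 4} and _15 = {1, 5}.
static void check_store4_immediates() {
    const Vec2Lanes v04 = {0, 4};
    const Vec2Lanes v15 = {1, 5};

    // New immediates: 32 == 0x20 gives lo,lo and 49 == 0x31 gives hi,hi.
    assert((permute2f128_model(v04, v15, 32) == Vec2Lanes{0, 1}));
    assert((permute2f128_model(v04, v15, 49) == Vec2Lanes{4, 5}));

    // Old immediates: 16 == 0x10 returns _04 unchanged, and 25 == 0x19
    // zeroes the low lane because bit 3 is set.
    assert((permute2f128_model(v04, v15, 16) == Vec2Lanes{0, 4}));
    assert((permute2f128_model(v04, v15, 25) == Vec2Lanes{kZeroed, 4}));
}
```

Calling check_store4_immediates() from any test harness exercises all four constants from the diff. The model only tracks which lane ends up where, which is all that matters for this bug.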
BUG=skia:
Change-Id: I05e2c2c0e7d7095f6935e60ff1bf89858380335f
Reviewed-on: https://skia-review.googlesource.com/6721
Commit-Queue: Matt Sarett <msarett@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
diff --git a/src/opts/SkNx_sse.h b/src/opts/SkNx_sse.h
index d52509a..78b6d39 100644
--- a/src/opts/SkNx_sse.h
+++ b/src/opts/SkNx_sse.h
@@ -567,10 +567,10 @@
_26 = unpacklo_pd(rg2367, ba2367), // r2 ... | r6 ...
_37 = unpackhi_pd(rg2367, ba2367); // r3 ... | r7 ...
- __m256 _01 = _mm256_permute2f128_ps(_04, _15, 16), // 16 == 010 000 == lo, lo
- _23 = _mm256_permute2f128_ps(_26, _37, 16),
- _45 = _mm256_permute2f128_ps(_04, _15, 25), // 25 == 011 001 == hi, hi
- _67 = _mm256_permute2f128_ps(_26, _37, 25);
+ __m256 _01 = _mm256_permute2f128_ps(_04, _15, 32), // 32 == 0010 0000 == lo, lo
+ _23 = _mm256_permute2f128_ps(_26, _37, 32),
+ _45 = _mm256_permute2f128_ps(_04, _15, 49), // 49 == 0011 0001 == hi, hi
+ _67 = _mm256_permute2f128_ps(_26, _37, 49);
_mm256_storeu_ps((float*)ptr + 0*8, _01);
_mm256_storeu_ps((float*)ptr + 1*8, _23);