Expand _01 half<->float limitation to _finite. Simplify.

It's become clear we need to sometimes deal with values <0 or >1. I'm not yet convinced we care about NaN or +-inf.

We had some fairly clever tricks and optimizations here for NEON and SSE. I've thrown them out in favor of a single implementation. If we find the specializations mattered, we can certainly figure out how to extend them to this new range/domain.

This happens to add a vectorized float -> half for ARMv7, which was missing from the _01 version. (The SSE strategy was not portable to platforms that flush denorm floats to zero.)

I've tested the full float range for FloatToHalf on my desktop and a 5x.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145663003
CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot

Committed: https://skia.googlesource.com/skia/+/3296bee70d074bb8094b3229dbe12fa016657e90
Review-Url: https://codereview.chromium.org/2145663003

commit: 58e389b0518b46bbe58ba01c23443cf23c18435c
author: mtklein <mtklein@chromium.org> Fri Jul 15 07:00:11 2016 -0700
committer: Commit bot <commit-bot@chromium.org> Fri Jul 15 07:00:11 2016 -0700
tree: 51f6d91fa6a116666c9c318897211cbc7ca0395b
parent: 428036621e1667b504051872869ac38cf6fac9c8
diff --git a/tests/SkNxTest.cpp b/tests/SkNxTest.cpp
index 5509814..51d937d 100644
--- a/tests/SkNxTest.cpp
+++ b/tests/SkNxTest.cpp

@@ -288,3 +288,22 @@
         REPORTER_ASSERT(r, !memcmp(s16, d16, sizeof(s16)));
     }
 }
+
+// The SSE2 implementation of SkNx_cast<uint16_t>(Sk4i) is non-trivial, so worth a test.
+DEF_TEST(SkNx_int_u16, r) {
+    // These are pretty hard to get wrong.
+    for (int i = 0; i <= 0x7fff; i++) {
+        uint16_t expected = (uint16_t)i;
+        uint16_t actual = SkNx_cast<uint16_t>(Sk4i(i))[0];
+
+        REPORTER_ASSERT(r, expected == actual);
+    }
+
+    // A naive implementation with _mm_packs_epi32 would succeed up to 0x7fff but fail here:
+    for (int i = 0x8000; (1) && i <= 0xffff; i++) {
+        uint16_t expected = (uint16_t)i;
+        uint16_t actual = SkNx_cast<uint16_t>(Sk4i(i))[0];
+
+        REPORTER_ASSERT(r, expected == actual);
+    }
+}
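
For readers unfamiliar with why the second loop in the test is the interesting one, here is a minimal scalar sketch, not code from this commit. It models one lane of a signed-saturating 32-to-16-bit pack, which is how _mm_packs_epi32 behaves per lane, and shows that values from 0x8000 up clamp to 0x7fff instead of truncating. The helper names (pack_signed_saturate, naive_cast, sign_extend_then_pack, check_full_u16_range) are hypothetical, and the sign-extension workaround is one common approach assumed here for illustration; the diff above does not show which strategy the SSE2 SkNx_cast actually uses.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one lane of a signed-saturating 32 -> 16 bit pack:
// the input is clamped to the int16_t range [-32768, 32767].
int16_t pack_signed_saturate(int32_t v) {
    if (v >  32767) return  32767;
    if (v < -32768) return -32768;
    return static_cast<int16_t>(v);
}

// Naive cast: saturate as signed, then reinterpret the 16 bits as unsigned.
// Any input above 0x7fff comes out as 0x7fff.
uint16_t naive_cast(int32_t v) {
    return static_cast<uint16_t>(pack_signed_saturate(v));
}

// Assumed workaround: sign-extend the low 16 bits first, so the value
// already fits in int16_t and the saturation never changes it.
uint16_t sign_extend_then_pack(int32_t v) {
    int32_t low = v & 0xffff;
    int32_t extended = (low >= 0x8000) ? low - 0x10000 : low;
    return static_cast<uint16_t>(pack_signed_saturate(extended));
}

// Mirrors the two loops of the test: the workaround matches a plain
// truncating cast everywhere, the naive version only up to 0x7fff.
void check_full_u16_range() {
    for (int i = 0; i <= 0xffff; i++) {
        uint16_t expected = static_cast<uint16_t>(i);
        assert(sign_extend_then_pack(i) == expected);
        assert((naive_cast(i) == expected) == (i <= 0x7fff));
    }
}
```

Calling check_full_u16_range() walks the same 0 to 0xffff range as DEF_TEST(SkNx_int_u16, r), which is why the test splits the range at 0x8000.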