prune unused SkNx features

  - remove float -> int conversion, keeping float -> byte
  - remove support for doubles

I was thinking of specializing Sk8f for AVX.  This will help keep the complexity down.

This may cause minor diffs in radial gradients: toBytes() rounds where castTrunc() truncated.  But I don't see any diffs in Gold.
https://gold.skia.org/search2?issue=1411563008&unt=true&query=source_type%3Dgm&master=false

BUG=skia:4117
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1411563008
diff --git a/src/effects/gradients/SkRadialGradient.cpp b/src/effects/gradients/SkRadialGradient.cpp
index de0f764..a9cdb2a 100644
--- a/src/effects/gradients/SkRadialGradient.cpp
+++ b/src/effects/gradients/SkRadialGradient.cpp
@@ -306,8 +306,8 @@
             R = R + dR;
             dR = dR + ddR;
 
-            int fi[4];
-            dist.castTrunc().store(fi);
+            uint8_t fi[4];
+            dist.toBytes(fi);
 
             for (int i = 0; i < 4; i++) {
                 *dstC++ = cache[toggle + fi[i]];
@@ -318,8 +318,8 @@
         if (count) {
             Sk4f dist = Sk4f::Min(fast_sqrt(R), max);
 
-            int fi[4];
-            dist.castTrunc().store(fi);
+            uint8_t fi[4];
+            dist.toBytes(fi);
             for (int i = 0; i < count; i++) {
                 *dstC++ = cache[toggle + fi[i]];
                 toggle = next_dither_toggle(toggle);