Update some Sk4px APIs.

Mostly this is about ergonomics: making the good operations easier to write and the bad ones hard or impossible.

- SkAlpha / SkPMColor constructors become static factories.
- Remove div255TruncNarrow(), rename div255RoundNarrow() to div255().  In practice we always want to round, and the narrowing to 8-bit is contextually obvious.
- Rename fastMulDiv255Round() to approxMulDiv255() to stress its approximateness over its speed.  Drop Round for the same reason as above: we should always round.
- Add operator overloads so we don't have to keep throwing in seemingly-random Sk4px() or Sk4px::Wide() casts.
- Use operator*() for 8-bit x 8-bit -> 16-bit math.  It's always what we want, and there's generally no 8x8->8 alternative.
- MapFoo can take a const Func&.  Don't think it makes a big difference, but nice to do.
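
To make the exact vs. approximate distinction concrete, here is a scalar sketch of the two contracts the test below checks.  The function names are hypothetical, and the (a*b + a) >> 8 formula is only one assumed approximation that satisfies the stated tolerance; it is not necessarily what Sk4px does internally.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar stand-in for (av * bv).div255():
// widen 8-bit x 8-bit to 16-bit, then divide by 255 with rounding.
// Same formula as the `exact` reference value in the test.
inline uint8_t mulDiv255Exact(uint8_t a, uint8_t b) {
    unsigned wide = unsigned(a) * unsigned(b);   // at most 255*255 = 65025
    return uint8_t((wide + 127) / 255);
}

// Hypothetical scalar stand-in for approxMulDiv255().
// ASSUMPTION: (a*b + a) >> 8 is used here purely for illustration.
inline uint8_t mulDiv255Approx(uint8_t a, uint8_t b) {
    unsigned wide = unsigned(a) * unsigned(b);
    assert(wide + a <= 0xFFFF);                  // still fits in 16 bits
    return uint8_t((wide + a) >> 8);
}

// Exhaustively checks the tolerance the test asserts: exact when either
// input is 0 or 255, otherwise off by at most 1.
inline bool approxWithinTolerance() {
    for (int a = 0; a < 256; a++) {
        for (int b = 0; b < 256; b++) {
            int exact  = mulDiv255Exact(uint8_t(a), uint8_t(b));
            int approx = mulDiv255Approx(uint8_t(a), uint8_t(b));
            int diff   = approx - exact;
            if (diff < -1 || diff > 1) {
                return false;
            }
            bool edge = (a == 0 || a == 255 || b == 0 || b == 255);
            if (edge && diff != 0) {
                return false;
            }
        }
    }
    return true;
}
```

Calling approxWithinTolerance() walks all 65536 input pairs, which mirrors the loop in tests/SkNxTest.cpp without needing the SIMD types.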

BUG=skia:

Review URL: https://codereview.chromium.org/1202013002
diff --git a/tests/SkNxTest.cpp b/tests/SkNxTest.cpp
index eab625d..3719044 100644
--- a/tests/SkNxTest.cpp
+++ b/tests/SkNxTest.cpp
@@ -174,15 +174,15 @@
         int exact = (a*b+127)/255;
 
         // Duplicate a and b 16x each.
-        Sk4px av((SkAlpha)a),
-              bv((SkAlpha)b);
+        auto av = Sk4px::DupAlpha(a),
+             bv = Sk4px::DupAlpha(b);
 
         // This way should always be exactly correct.
-        int correct = av.mulWiden(bv).div255RoundNarrow().kth<0>();
+        int correct = (av * bv).div255().kth<0>();
         REPORTER_ASSERT(r, correct == exact);
 
         // We're a bit more flexible on this method: correct for 0 or 255, otherwise off by <=1.
-        int fast = av.fastMulDiv255Round(bv).kth<0>();
+        int fast = av.approxMulDiv255(bv).kth<0>();
         REPORTER_ASSERT(r, fast-exact >= -1 && fast-exact <= 1);
         if (a == 0 || a == 255 || b == 0 || b == 255) {
             REPORTER_ASSERT(r, fast == exact);