Sk2x for NEON

Also decreases the precision of Sk4f::rsqrt() for speed while keeping Sk4f::sqrt() the same:
rsqrt() now does one estimation step instead of two, and sqrt() does the second step itself.
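
A rough scalar sketch of that split, for illustration only. It assumes the
estimation steps are Newton-Raphson refinements of 1/sqrt(x). The helper names
(rsqrt_estimate, refine, rsqrt_fast, sqrt_via_rsqrt) are hypothetical, and the
bit-trick initial guess is only a portable stand-in for the hardware estimate;
this is not the NEON code in the change.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a hardware reciprocal-sqrt estimate instruction.
// Uses the well-known integer bit trick to get a crude first guess.
static float rsqrt_estimate(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    float y;
    std::memcpy(&y, &bits, sizeof y);
    return y;
}

// One Newton-Raphson step for y ~ 1/sqrt(x): y' = y * (3 - x*y*y) / 2.
static float refine(float x, float y) {
    return y * (3.0f - x * y * y) * 0.5f;
}

// Faster, less precise: initial estimate plus a single refinement step.
static float rsqrt_fast(float x) {
    return refine(x, rsqrt_estimate(x));
}

// Same precision as before: do the second refinement step here,
// then use sqrt(x) = x * (1/sqrt(x)).
static float sqrt_via_rsqrt(float x) {
    float y = refine(x, rsqrt_fast(x));
    return x * y;
}
```

Callers of rsqrt() get the cheaper one-step result; callers of sqrt() still
pay for two steps in total, so their precision does not change.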

Tests pass on my Nexus 7.  float64x2_t is still a TODO for when I get a hold of a Nexus 9.

BUG=skia:

Review URL: https://codereview.chromium.org/1018423003
3 files changed