Runtime CPU detection for rsqrt().

This enables the NEON sk_float_rsqrt() code for configurations that have NEON at run-time but not compile-time.

These devices will see about a 2x (1.26 -> 2.33) slowdown in sk_float_rsqrt(), but it should be more precise than our portable fallback.

(When inlined, the portable fallback and the NEON code are almost identical in speed.  The only difference is precision.  Going through a function pointer is causing all this slowdown.  This is a good example of a place where Skia really benefits from compile-time NEON.)

BUG=skia:4117,skia:4114

No public API changes.
TBR=reed@google.com

Review URL: https://codereview.chromium.org/1264893002
4 files changed