Sk4i version of blur.

For the blur_1.50_normal_low_quality benchmark, this code goes from about 120us to 85us.
The original implementation executes at about 95us.

This changed in controlled by the flag:

SK_SUPPORT_LEGACY_SLOW_SMALL_BLUR

BUG=chromium:759070

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD
Change-Id: If722cb8ffd8c47a94b7a6b4e6dd26fd1474b6209
Reviewed-on: https://skia-review.googlesource.com/45300
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
3 files changed