Experimental blur code with 32 bit fix.

This uses a new method of blurring that runs the three
passes of the box filter in a single pass. This implementation
currently only does 1x1 pixel at a time, but it should be simple
to expand to 4x4 pixels at a time.

On the  blur_10_normal_high_quality benchmark, the new is 7% faster
than the old code. For the blur_100.50_normal_high_quality
benchmark, the new code is 11% slower.

Bug: skia:
Change-Id: I847270906b0ceac1dfbf43ab5446756689ef660f
Reviewed-on: https://skia-review.googlesource.com/22700
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Herb Derby <herb@google.com>
5 files changed