ARM Skia NEON patches - 12 - S32_Blend

Blitrow32: S32_Blend fix and small speed improvement

- the results are now exactly the same as those of the C code
- the speed has improved, especially for small values of count

+-------+-----------+------------+
| count | Cortex-A9 | Cortex-A15 |
+-------+-----------+------------+
| 1     | +30%      | +18%       |
+-------+-----------+------------+
| 2     | 0         | 0          |
+-------+-----------+------------+
| 4     | - <1%     | +14%       |
+-------+-----------+------------+
| > 4   | -0.5..+5% | -0.5..+4%  |
+-------+-----------+------------+
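
For reference, the scalar blend that the NEON results are compared
against can be sketched roughly as follows. This is a minimal,
standalone approximation and not the code from the patch: the names
alpha_mul_q and s32_blend_row are hypothetical, and the 0..255 to
0..256 scale mapping (alpha + 1) and the per-channel truncation are
assumptions about how the C path rounds.

```cpp
#include <cstdint>

// Hypothetical helper: scale all four 8-bit channels of a packed
// 32-bit premultiplied pixel by scale/256 (scale in 0..256),
// truncating each channel independently.
static inline uint32_t alpha_mul_q(uint32_t c, uint32_t scale) {
    const uint32_t mask = 0x00FF00FFu;
    uint32_t rb = ((c & mask) * scale) >> 8;   // channels 0 and 2
    uint32_t ag = ((c >> 8) & mask) * scale;   // channels 1 and 3
    return (rb & mask) | (ag & ~mask);
}

// Hypothetical scalar reference: blend `count` source pixels over
// dst using a single global alpha (0..255).
static void s32_blend_row(uint32_t* dst, const uint32_t* src,
                          int count, unsigned alpha) {
    uint32_t src_scale = alpha + 1;        // assumed 0..255 -> 1..256
    uint32_t dst_scale = 256 - src_scale;
    for (int i = 0; i < count; ++i) {
        dst[i] = alpha_mul_q(src[i], src_scale) +
                 alpha_mul_q(dst[i], dst_scale);
    }
}
```

A vectorized version only matches this bit for bit if it truncates
each product in the same way, which is why rounding differences show
up as off-by-one channel values.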

Signed-off-by: Kévin PETIT <kevin.petit@arm.com>

BUG=skia:
R=djsollen@google.com, mtklein@google.com

Author: kevin.petit@arm.com

Review URL: https://codereview.chromium.org/158973002

git-svn-id: http://skia.googlecode.com/svn/trunk@13532 2bbb7eff-a529-9590-31e7-b0007b416f81
2 files changed