ARM Skia NEON patches - 27 - S32A_D565_Blend


BlitRow565: new intrinsics version of S32A_D565_Blend

This new version is essentially a rewrite of the existing code with
a few speed and accuracy improvements. A switch enables pixel-perfect
results at the cost of a fairly large drop in performance; it is
disabled in this patch.
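
For readers unfamiliar with the routine, here is a minimal scalar sketch
of the kind of operation S32A_D565_Blend performs: blending premultiplied
32-bit source pixels onto RGB565 destination pixels with an extra global
alpha. It is an illustration only, not the NEON code from this patch. The
names blend_s32a_d565 and div255_round are hypothetical, and the 0xAARRGGBB
source layout is an assumption (the real channel order depends on the build
configuration).

```cpp
#include <cstddef>
#include <cstdint>

// Rounded division by 255 for inputs up to 255 * 255.
// Hypothetical helper, written out so the sketch is self-contained.
static inline uint32_t div255_round(uint32_t x) {
    x += 128;
    return (x + (x >> 8)) >> 8;
}

// Scalar sketch: blend `count` premultiplied ARGB pixels from `src` onto
// RGB565 pixels in `dst`, scaled by a global `alpha` in [0, 255].
// Assumption: source pixels are laid out as 0xAARRGGBB.
void blend_s32a_d565(uint16_t* dst, const uint32_t* src,
                     size_t count, uint32_t alpha) {
    for (size_t i = 0; i < count; ++i) {
        const uint32_t sc = src[i];
        if (sc == 0) {
            continue;  // fully transparent source leaves dst untouched
        }

        // Unpack the source and reduce each colour channel to 565 precision.
        const uint32_t sa = sc >> 24;
        const uint32_t sr = ((sc >> 16) & 0xFF) >> 3;
        const uint32_t sg = ((sc >> 8) & 0xFF) >> 2;
        const uint32_t sb = (sc & 0xFF) >> 3;

        // Unpack the RGB565 destination.
        const uint16_t dc = dst[i];
        const uint32_t dr = dc >> 11;
        const uint32_t dg = (dc >> 5) & 0x3F;
        const uint32_t db = dc & 0x1F;

        // The destination keeps whatever the scaled source alpha leaves over.
        const uint32_t dst_scale = 255 - div255_round(sa * alpha);

        const uint32_t r = div255_round(sr * alpha + dr * dst_scale);
        const uint32_t g = div255_round(sg * alpha + dg * dst_scale);
        const uint32_t b = div255_round(sb * alpha + db * dst_scale);

        dst[i] = static_cast<uint16_t>((r << 11) | (g << 5) | b);
    }
}
```

A vectorised version processes several pixels per iteration and may trade
the exact rounding above for cheaper arithmetic, which is the sort of
trade-off an accuracy switch typically controls.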

Here are the benchmark results (speedup vs. existing code):

+-------+------------+------------+
| count | Cortex-A9  | Cortex-A15 |
+-------+------------+------------+
| 1     | +103.6%    | +12%       |
+-------+------------+------------+
| 2     | +3.6%      | +21.6%     |
+-------+------------+------------+
| 4     | +0.8%      | -0.8%      |
+-------+------------+------------+
| 8     | +3.9%      | -1%        |
+-------+------------+------------+
| 16    | +14.7%     | +5.7%      |
+-------+------------+------------+
| 64    | +18.1%     | +13.2%     |
+-------+------------+------------+
| 256   | +16.3%     | +27.4%     |
+-------+------------+------------+
| 1024  | +78.2%     | +17.4%     |
+-------+------------+------------+

Signed-off-by: Kévin PETIT <kevin.petit@arm.com>

BUG=skia:
R=djsollen@google.com, mtklein@google.com, halcanary@google.com

Author: kevin.petit@arm.com

Review URL: https://codereview.chromium.org/156113005

git-svn-id: http://skia.googlecode.com/svn/trunk@13438 2bbb7eff-a529-9590-31e7-b0007b416f81
2 files changed