ARM Skia NEON patches - 41 - arm64: SkXfermode::xfer32
Currently, the NEON code for Xfermodes performs well on arm64
targets, except for dstout and dstin, which are significantly
slower than the C code. This patch fixes that and also
improves performance for the other modes.
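For reference, the two modes singled out above are the Porter-Duff
operators dstin (result = dst * src_alpha) and dstout
(result = dst * (1 - src_alpha)). The scalar sketch below only
illustrates that per-pixel arithmetic. It is not the code touched by
this patch: the helper names are hypothetical, and it assumes
premultiplied 32-bit pixels with alpha in the top byte.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: rounded division by 255, valid for x in [0, 255*255].
static inline uint32_t div255_round(uint32_t x) {
    x += 128;
    return (x + (x >> 8)) >> 8;
}

// Hypothetical helper: scale each 8-bit channel of a pixel by scale/255.
static inline uint32_t scale_pixel(uint32_t p, uint32_t scale) {
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t c = (p >> shift) & 0xFF;
        out |= div255_round(c * scale) << shift;
    }
    return out;
}

// Assumption: alpha lives in the top byte of the pixel.
static inline uint32_t alpha_of(uint32_t p) { return p >> 24; }

// dstin: keep the destination only where the source is opaque.
void dstin_xfer32(uint32_t* dst, const uint32_t* src, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        dst[i] = scale_pixel(dst[i], alpha_of(src[i]));
    }
}

// dstout: keep the destination only where the source is transparent.
void dstout_xfer32(uint32_t* dst, const uint32_t* src, size_t count) {
    for (size_t i = 0; i < count; ++i) {
        dst[i] = scale_pixel(dst[i], 255 - alpha_of(src[i]));
    }
}
```

Both modes read only the source alpha and apply a single scale to
every destination channel, which is why they share almost all of
their arithmetic.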
Here are some perf results:
+------------+------------+------------+
| mode       | Cortex-A53 | Cortex-A57 |
+------------+------------+------------+
| multiply   |    +24.58% |    +23.71% |
+------------+------------+------------+
| exclusion  |    +22.72% |    +22.05% |
+------------+------------+------------+
| difference |    +34.67% |    +36.82% |
+------------+------------+------------+
| hardlight  |    +17.07% |    +14.74% |
+------------+------------+------------+
| lighten    |    +38.21% |    +32.87% |
+------------+------------+------------+
| darken     |    +37.59% |    +32.99% |
+------------+------------+------------+
| overlay    |    +17.36% |    +16.88% |
+------------+------------+------------+
| screen     |    +52.56% |    +54.43% |
+------------+------------+------------+
| modulate   |    +62.85% |    +61.32% |
+------------+------------+------------+
| plus       |    +91.52% |   +117.41% |
+------------+------------+------------+
| xor        |    +42.86% |    +43.38% |
+------------+------------+------------+
| dstatop    |    +48.46% |    +48.99% |
+------------+------------+------------+
| srcatop    |    +50.50% |    +48.51% |
+------------+------------+------------+
| dstout     |    +67.83% |    +78.09% |
+------------+------------+------------+
| srcout     |    +69.02% |    +78.26% |
+------------+------------+------------+
| dstin      |    +70.92% |    +79.24% |
+------------+------------+------------+
| srcin      |    +68.90% |    +78.23% |
+------------+------------+------------+
| dstover    |    +73.80% |    +68.10% |
+------------+------------+------------+
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia
R=mtklein@google.com, djsollen@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/350343002
1 file changed