1aebdaee0e2aa4324509fd3ad4c40c21703ae4a2 - platform/external/skia

commit	1aebdaee0e2aa4324509fd3ad4c40c21703ae4a2
author	Mike Klein <mtklein@chromium.org>	Thu Oct 06 15:06:38 2016 -0400
committer	Skia Commit-Bot <skia-commit-bot@chromium.org>	Fri Oct 07 12:52:29 2016 +0000
tree	c5ffae6c59217f3d228891177e1d50d7f784801a
parent	2766cc567d5c939730fadd2d865e4bdf05477263

SkRasterPipeline: 8x pipelines

Bench runtime changes:
sRGB: 7194 -> 3735  = 1.93x faster 
F16:  6531 -> 2559  = 2.55x faster

Instead of building 4x and 1-3x pipelines and then maybe 8x and 1-7x, build either the short ones or the long ones, but not both.  If we just take care to use a compatible run_pipeline(), there's some cross-module type disagreement, but everything works out in the end.
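
To illustrate the body/tail split described above, here is a minimal sketch, not the code from this change.  Span and plan_spans() are hypothetical names, and stride stands in for whichever pipeline width (4 or 8) was picked when the pipeline was built.  It only shows how n pixels break into full-width body runs plus at most one shorter tail run.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record of one pipeline invocation: start index and pixel count.
struct Span {
    size_t x;
    size_t n;
};

// Split n pixels into full-width body spans followed by at most one tail span.
// `stride` is an assumed stand-in for the width chosen at build time (4 or 8)
// and must be greater than zero.
inline std::vector<Span> plan_spans(size_t n, size_t stride) {
    std::vector<Span> spans;
    size_t x = 0;
    // Body: as many full-width runs as fit.
    while (n - x >= stride) {
        spans.push_back({x, stride});
        x += stride;
    }
    // Tail: the remaining 1..stride-1 pixels, if any.
    if (x < n) {
        spans.push_back({x, n - x});
    }
    return spans;
}
```

With stride 8 the tail covers 1-7 pixels; with stride 4 it covers 1-3.  Only one of the two widths is planned for a given run.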

Oddly, a few places that looked like they'd be faster using SkNx_fma() or Sk4f_round()/Sk8f_round() are actually faster the long way, e.g. multiply, add 0.5, truncate.  Curious!  In all the other places you see here that I've used SkNx_fma(), it's been a significant speedup.
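
For reference, a scalar sketch of the variants being compared.  These are hypothetical stand-ins written with the standard library, not the SkNx code itself, and the 255 scale factor is an assumed example.  All three assume non-negative inputs.

```cpp
#include <cmath>
#include <cstdint>

// "The long way": multiply, add 0.5, truncate.
inline uint8_t to_byte_long_way(float unit) {
    return static_cast<uint8_t>(unit * 255.0f + 0.5f);
}

// Fused multiply-add, then truncate.  The multiply and add round once
// instead of twice.
inline uint8_t to_byte_fma(float unit) {
    return static_cast<uint8_t>(std::fma(unit, 255.0f, 0.5f));
}

// Multiply, then round explicitly (halfway cases away from zero).
inline uint8_t to_byte_round(float unit) {
    return static_cast<uint8_t>(std::lround(unit * 255.0f));
}
```

The three give the same answers for ordinary inputs, so which one to use in a given stage comes down to measured speed.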

This folds in a couple of refactors and cleanups that I've been meaning to do.  Hope you don't mind... I find the new code considerably easier to read than the old code.

BUG=skia:

GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2990
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Change-Id: I1c82e5755d8e44cc0b9c6673d04b117f85d71a3a
Reviewed-on: https://skia-review.googlesource.com/2990
Reviewed-by: Matt Sarett <msarett@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>

11 files changed