ca0cfb4a7a52ae894ca005475ad9de5ac1329900 - platform/external/skia

commit	ca0cfb4a7a52ae894ca005475ad9de5ac1329900
author	Mike Klein <mtklein@chromium.org>	Thu Feb 23 08:04:49 2017 -0500
committer	Mike Klein <mtklein@chromium.org>	Thu Feb 23 13:37:39 2017 +0000
tree	3f7defe919b4120bb4cef3496c207291e6d1e955
parent	a6e431b2c1baa564d2619bdc2a51a3b5bfa7e276

Add AVX to the SkJumper mix.

AVX is a nice little halfway point between SSE4.1 and HSW, in terms
of instructions available, performance, and availability.

Intel chips have had AVX since ~2011, compared to ~2013 for HSW and
~2007 for SSE4.1.  Like HSW it's got 8-wide 256-bit float vectors,
but integer (and double) operations are essentially still only 128-bit.
It also doesn't have F16 conversion or FMA instructions.
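To make the tiering above concrete, here is a minimal sketch of how a
runtime dispatcher might pick between the three instruction-set levels.
It is not SkJumper's actual code: CpuFeatures, Tier, choose_tier() and
float_lanes() are hypothetical names invented for illustration, the
flags are assumed to have been filled in from CPUID elsewhere, and the
1-lane portable fallback is likewise an assumption.

```cpp
// Hypothetical feature flags; real code would populate these from CPUID.
struct CpuFeatures {
    bool sse41;
    bool avx;
    bool avx2;
    bool fma;
    bool f16c;
};

// The tiers discussed in the commit message, plus an assumed scalar fallback.
enum class Tier { Portable, SSE41, AVX, HSW };

// Pick the widest tier whose required features are all present.
// "HSW" here means AVX plus AVX2, FMA and F16C; plain AVX lacks the last three.
constexpr Tier choose_tier(const CpuFeatures& f) {
    if (f.avx && f.avx2 && f.fma && f.f16c) return Tier::HSW;
    if (f.avx)   return Tier::AVX;
    if (f.sse41) return Tier::SSE41;
    return Tier::Portable;
}

// Number of 32-bit float lanes per vector in each tier:
// 256-bit registers hold 8 floats, 128-bit registers hold 4.
constexpr int float_lanes(Tier t) {
    return (t == Tier::AVX || t == Tier::HSW) ? 8
         : (t == Tier::SSE41)                 ? 4
         : 1;
}
```

A machine reporting AVX but not AVX2, FMA or F16C would land in the AVX
tier and still get 8 float lanes, which is the case this change targets.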

It doesn't look like this is going to be a burden to maintain, and only
adds a few KB of code size.  In exchange, we now run 8x wide on 45% to
70% of x86 machines, depending on the OS.

In my brief testing, speed eerily resembles an exact geometric progression:
   SSE4.1:        1x speed (baseline)
      AVX: ~sqrt(2)x speed
      HSW:       ~2x speed

This adds all the basic plumbing for AVX but leaves it disabled.
I'll flip it on once I've implemented the f16 TODOs.

Change-Id: I1c378dabb8a06386646371bf78ade9e9432b006f
Reviewed-on: https://skia-review.googlesource.com/8898
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>

5 files changed
