grand unified lowp stages

I have text_16_AA_FF -> 8888 (forcing RP) faster than head now on my
laptop. I'm feeling confident that we can make this perform well.
After looking at performance a bit more today, everything looks
comparable to me, especially on ARM. On x86-64, big bulk blits look a
little slower and small mask blits a little faster.
Quality looks good, and maybe improved for 565.
There are fewer platform-specific differences now in _lowp, and I think
they're few enough now that we could even consider completing the
unification by folding the 8-bit and float code together. Rename
"div255()" to "rebias()", slap on a few coats of paint...
Guarded for Chrome with SK_JUMPER_LEGACY_LOWP.
Change-Id: I36309c07cf736f3cb31952cca66030ad56026318
Reviewed-on: https://skia-review.googlesource.com/45982
Reviewed-by: Herb Derby <herb@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
diff --git a/BUILD.gn b/BUILD.gn
index ef2ac07..9b46480 100644
--- a/BUILD.gn
+++ b/BUILD.gn
@@ -1804,6 +1804,7 @@
inputs = [
"src/jumper/SkJumper_stages.cpp",
"src/jumper/SkJumper_stages_8bit.cpp",
+ "src/jumper/SkJumper_stages_lowp.cpp",
]
# GN insists its outputs should go somewhere underneath target_out_dir, so we trick it.