SkSplicer

I think I may have cracked the compile-ahead-of-time-splice-at-runtime nut.

This compiles stages ahead of time using clang, then splices them together at runtime.  This means the stages can be written in simple C++, with some mild restrictions.

This performs identically to our Xbyak experiment, and already supports more stages.  As written this stands alone from SkRasterPipeline_opts.h, but I'm fairly confident that the bulk (the STAGE implementations) can ultimately be shared.

As of PS 25 or so, this also supports all the stages used by bench/SkRasterPipelineBench.cpp:

    SkRasterPipeline_…
    400  …f16_compile 1x  …f16_run 1.38x  …srgb_compile 1.89x  …srgb_run 2.21x

That is, ~30% faster than baseline for f16, ~15% faster for sRGB.

Change-Id: I1ec7dcb769613713ce56978c58038f606f87d63d
Reviewed-on: https://skia-review.googlesource.com/6733
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
7 files changed