Reland "Reland "make SkJumper stages normal Skia code""

This is a reland of 78cb579f33943421afc8423a39867fcfd69fed44

This time, lowp stages are controlled by !defined(JUMPER_IS_SCALAR), not
by defined(__clang__).  The two are usually the same, except when we opt
Clang builds into JUMPER_IS_SCALAR artificially.

Some Google3 builds use compilers old enough that they barf when
compiling our NEON code, so those are the Clang builds we opt into
JUMPER_IS_SCALAR.  It's also possible to define JUMPER_IS_SCALAR
yourself, but I don't think anyone does that.
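
A rough sketch of the gating difference, for illustration only.  Only
JUMPER_IS_SCALAR and __clang__ come from this change; every DEMO_* and
demo_* name is made up, and the rule that non-Clang builds define
JUMPER_IS_SCALAR is an assumption read off "the two are usually the
same" above, not a copy of the real headers.

```cpp
// Illustrative only: DEMO_* / demo_* names are hypothetical, not Skia's.
// Assumption: non-Clang builds end up with JUMPER_IS_SCALAR defined, and some
// Clang builds are opted into it too (modeled by the made-up DEMO_FORCE_SCALAR).
#if (!defined(__clang__) || defined(DEMO_FORCE_SCALAR)) && !defined(JUMPER_IS_SCALAR)
    #define JUMPER_IS_SCALAR
#endif

// Old gating: lowp stages keyed on the compiler.
#if defined(__clang__)
    #define DEMO_OLD_HAS_LOWP 1
#else
    #define DEMO_OLD_HAS_LOWP 0
#endif

// New gating: lowp stages keyed on whether we are stuck with scalar code.
#if !defined(JUMPER_IS_SCALAR)
    #define DEMO_NEW_HAS_LOWP 1
#else
    #define DEMO_NEW_HAS_LOWP 0
#endif

constexpr bool demo_is_clang()     { return DEMO_OLD_HAS_LOWP == 1; }
constexpr bool demo_is_scalar()    { return DEMO_NEW_HAS_LOWP == 0; }
constexpr bool demo_old_has_lowp() { return DEMO_OLD_HAS_LOWP == 1; }
constexpr bool demo_new_has_lowp() { return DEMO_NEW_HAS_LOWP == 1; }

// The two gates can disagree only for a Clang build that is in scalar mode.
static_assert(demo_old_has_lowp() == demo_new_has_lowp() ||
              (demo_is_clang() && demo_is_scalar()),
              "gates should differ only for Clang builds in scalar mode");
```

Building this with and without -DDEMO_FORCE_SCALAR under Clang shows the
one case where the old and new conditions part ways.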

Original change's description:
> Reland "make SkJumper stages normal Skia code"
>
> This is a reland of 22e536e3a1a09405d1c0e6f071717a726d86e8d4
>
> Now with fixed #include paths in SkRasterPipeline_opts.h,
> and -ffp-contract=fast for the :hsw target to minimize
> diffs on non-Windows Clang AVX2/AVX-512 bots.
>
> Original change's description:
> > make SkJumper stages normal Skia code
> >
> > Enough clients are using Clang now that we can say, use Clang to build
> > if you want these software pipeline stages to go fast.
> >
> > This lets us drop the offline build aspect of SkJumper stages, instead
> > building as part of Skia using the SkOpts framework.
> >
> > I think everything should work, except I've (temporarily) removed
> > AVX-512 support.  I will put this back in a follow up.
> >
> > I have had to drop Windows down to __vectorcall and our narrower
> > stage calling convention that keeps the d-registers on the stack.
> > I tried forcing sysv_abi, but that crashed Clang.  :/
> >
> > Added a TODO to use the same narrower stage calling convention
> > for lowp stages... we just *don't* today, for no good reason.
> >
> > Change-Id: Iaaa792ffe4deab3508d2dc5d0008c163c24b3383
> > Reviewed-on: https://skia-review.googlesource.com/110641
> > Commit-Queue: Mike Klein <mtklein@chromium.org>
> > Reviewed-by: Herb Derby <herb@google.com>
> > Reviewed-by: Florin Malita <fmalita@chromium.org>
>
> Change-Id: I44f2c03d33958e3807747e40904b6351957dd448
> Reviewed-on: https://skia-review.googlesource.com/112742
> Reviewed-by: Mike Klein <mtklein@chromium.org>

Change-Id: I3d71197d4bbb19ca4a94961a97fa2e54d5cbfb0d
Reviewed-on: https://skia-review.googlesource.com/112744
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
diff --git a/BUILD.gn b/BUILD.gn
index f2a69e8..b1f9c0b 100644
--- a/BUILD.gn
+++ b/BUILD.gn
@@ -48,10 +48,6 @@
   skia_compile_processors = false
   skia_lex = false
 
-  skia_jumper_clang = ""
-  skia_jumper_objdump = ""
-  skia_jumper_ccache = ""
-
   skia_skqp_enable_driver_correctness_workarounds = false
   skia_skqp_global_error_tolerance = 0
 }
@@ -314,6 +310,28 @@
   }
 }
 
+opts("hsw") {
+  enabled = is_x86
+  sources = skia_opts.hsw_sources
+  if (!is_clang && is_win) {
+    cflags = [ "/arch:AVX2" ]
+  } else {
+    cflags = [
+      "-mavx2",
+      "-mf16c",
+      "-mfma",
+    ]
+  }
+
+  # Oddly, clang-cl doesn't recognize this as a valid flag.
+  # If it ever does, it'd be nice to move this up with -mavx2 and co.
+  if (is_clang && !is_win) {
+    # This flag lets Clang generate FMAs when it sees a mul-then-add.  It's optional,
+    # but nice to have, generating slightly better code for paths without explicit FMAs.
+    cflags += [ "-ffp-contract=fast" ]
+  }
+}
+
 # Any feature of Skia that requires third-party code should be optional and use this template.
 template("optional") {
   if (invoker.enabled) {
@@ -775,6 +793,7 @@
     ":fontmgr_fuchsia",
     ":gpu",
     ":heif",
+    ":hsw",
     ":jpeg",
     ":none",
     ":pdf",
@@ -2101,28 +2120,3 @@
     }
   }
 }
-
-if (skia_jumper_clang != "") {
-  action("regen_jumper") {
-    script = "src/jumper/build_stages.py"
-
-    inputs = [
-      "src/jumper/SkJumper_stages.cpp",
-      "src/jumper/SkJumper_stages_lowp.cpp",
-    ]
-
-    # GN insists its outputs should go somewhere underneath target_out_dir, so we trick it.
-    outputs = [
-      "$target_out_dir/" +
-          rebase_path("src/jumper/SkJumper_generated.S", target_out_dir),
-      "$target_out_dir/" +
-          rebase_path("src/jumper/SkJumper_generated_win.S", target_out_dir),
-    ]
-
-    args = [
-             skia_jumper_clang,
-             skia_jumper_objdump,
-             skia_jumper_ccache,
-           ] + rebase_path(inputs) + rebase_path(outputs)
-  }
-}