Threaded generation of software paths

Re-land of: https://skia-review.googlesource.com/36560

All information needed by the thread is captured by the prepare
callback object. The lambda captures a pointer to that object and does
the mask render. Once it's done, it signals the semaphore (also owned by
the callback). The callback defers the semaphore wait even longer (into
the ASAP upload), so the odds of waiting for the thread are REALLY low.
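
The ownership and signalling pattern described above can be sketched
roughly as follows. This is a standalone illustration, not the code in
this change: SimpleSemaphore, MaskUploadJob and RunThreadedMaskExample
are hypothetical names, the "render" is a placeholder fill, and a plain
std::thread stands in for whatever executor the context is given.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a semaphore; not Skia's own class.
class SimpleSemaphore {
public:
    void signal() {
        {
            std::lock_guard<std::mutex> lock(fMutex);
            ++fCount;
        }
        fCond.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(fMutex);
        fCond.wait(lock, [this] { return fCount > 0; });
        --fCount;
    }

private:
    std::mutex fMutex;
    std::condition_variable fCond;
    int fCount = 0;
};

// Hypothetical callback object: it owns every input the worker needs,
// the output pixels, and the semaphore the worker signals.
struct MaskUploadJob {
    int fWidth = 0;
    int fHeight = 0;
    std::vector<uint8_t> fPixels;
    SimpleSemaphore fDone;

    // Runs on the worker thread. Placeholder for the software mask render.
    void render() {
        fPixels.assign(static_cast<size_t>(fWidth) * fHeight, 0xFF);
    }

    // Runs as late as possible on the main thread. The wait lives here,
    // not in the callback that scheduled the work.
    size_t upload() {
        fDone.wait();
        return fPixels.size();  // pretend these bytes were uploaded
    }
};

// Schedules the render, does no waiting until upload, and returns the
// number of bytes "uploaded".
size_t RunThreadedMaskExample(int width, int height) {
    auto job = std::make_unique<MaskUploadJob>();
    job->fWidth = width;
    job->fHeight = height;

    MaskUploadJob* jobPtr = job.get();  // the lambda captures only a pointer
    std::thread worker([jobPtr] {
        jobPtr->render();
        jobPtr->fDone.signal();
    });

    // ... other flush work would happen here, overlapping the render ...

    size_t uploaded = job->upload();  // deferred wait
    worker.join();                    // job must outlive the worker
    return uploaded;
}
```

Calling RunThreadedMaskExample(4, 8) renders a 4x8 mask on the worker
and returns 32 once the deferred wait has completed. The later the wait
is placed, the more likely the worker has already signalled.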

Also did a bunch of cleanup along the way, and put in some trace markers
so we can monitor how well this is working.

Traces of a GM that includes GPU and SW path rendering (path-reverse):

Original:
    https://screenshot.googleplex.com/f5BG3901tQg.png
Threaded, with wait in the callback (notice pre flush callback blocking):
    https://screenshot.googleplex.com/htOSZFE2s04.png
Current version, with wait deferred to ASAP upload function:
    https://screenshot.googleplex.com/GHjD0U3C34q.png

Bug: skia:
Change-Id: Idb92f385590749f41328a9aec65b2a93f4775079
Reviewed-on: https://skia-review.googlesource.com/40775
Reviewed-by: Brian Salomon <bsalomon@google.com>
Commit-Queue: Brian Osman <brianosman@google.com>
diff --git a/dm/DM.cpp b/dm/DM.cpp
index a442e09..1e02d25 100644
--- a/dm/DM.cpp
+++ b/dm/DM.cpp
@@ -18,6 +18,7 @@
 #include "SkColorSpacePriv.h"
 #include "SkCommonFlags.h"
 #include "SkCommonFlagsConfig.h"
+#include "SkCommonFlagsGpuThreads.h"
 #include "SkCommonFlagsPathRenderer.h"
 #include "SkData.h"
 #include "SkDocument.h"
@@ -858,10 +859,19 @@
                      "GM tests will be skipped.\n", gpuConfig->getTag().c_str());
                 return nullptr;
             }
-            return new GPUSink(contextType, contextOverrides, gpuConfig->getSamples(),
-                               gpuConfig->getUseDIText(), gpuConfig->getColorType(),
-                               gpuConfig->getAlphaType(), sk_ref_sp(gpuConfig->getColorSpace()),
-                               FLAGS_gpu_threading, grCtxOptions);
+            if (gpuConfig->getTestThreading()) {
+                return new GPUThreadTestingSink(contextType, contextOverrides,
+                                                gpuConfig->getSamples(), gpuConfig->getUseDIText(),
+                                                gpuConfig->getColorType(),
+                                                gpuConfig->getAlphaType(),
+                                                sk_ref_sp(gpuConfig->getColorSpace()),
+                                                FLAGS_gpu_threading, grCtxOptions);
+            } else {
+                return new GPUSink(contextType, contextOverrides, gpuConfig->getSamples(),
+                                   gpuConfig->getUseDIText(), gpuConfig->getColorType(),
+                                   gpuConfig->getAlphaType(), sk_ref_sp(gpuConfig->getColorSpace()),
+                                   FLAGS_gpu_threading, grCtxOptions);
+            }
         }
     }
 #endif
@@ -1320,6 +1330,7 @@
     GrContextOptions grCtxOptions;
 #if SK_SUPPORT_GPU
     grCtxOptions.fGpuPathRenderers = CollectGpuPathRenderersFromFlags();
+    grCtxOptions.fExecutor = GpuExecutorForTools();
 #endif
 
     JsonWriter::DumpJson();  // It's handy for the bots to assume this is ~never missing.