With threaded software paths, free mask memory earlier

Reset the mask's pixel storage as soon as it has been uploaded. In my
benchmarking this alleviates memory pressure and measurably improves
overall time when drawing many software (SW) paths.
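
The effect can be illustrated with a minimal standalone sketch. It is not
Skia code: MaskTask, releasePixels, and bytesHeldAfterUploads are
hypothetical names, and a std::vector stands in for the mask's pixel
storage. It assumes that each deferred upload callback keeps its mask
alive until the whole batch is torn down, and compares how many bytes are
still held after the uploads have run, with and without the early release.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in for a software mask rendered on a worker thread.
struct MaskTask {
    std::vector<std::uint8_t> pixels;

    explicit MaskTask(std::size_t byteCount) : pixels(byteCount, 0xFF) {}

    // Drop the allocation now instead of waiting for the destructor.
    void releasePixels() { std::vector<std::uint8_t>().swap(pixels); }
};

// Queues `count` deferred uploads, runs them all, and returns the number of
// mask bytes still held while the tasks and callbacks are both still alive.
std::size_t bytesHeldAfterUploads(bool freeEarly, std::size_t count,
                                  std::size_t maskBytes) {
    std::vector<std::shared_ptr<MaskTask>> tasks;
    std::vector<std::function<void()>> uploads;
    std::size_t uploaded = 0;

    for (std::size_t i = 0; i < count; ++i) {
        auto task = std::make_shared<MaskTask>(maskBytes);
        tasks.push_back(task);
        uploads.push_back([task, freeEarly, &uploaded] {
            // Stand-in for copying the mask into its texture.
            uploaded += task->pixels.size();
            if (freeEarly) {
                // The pixels are no longer needed once they are uploaded.
                task->releasePixels();
            }
        });
    }

    for (auto& upload : uploads) {
        upload();
    }
    assert(uploaded == count * maskBytes);  // every mask was uploaded once

    std::size_t held = 0;
    for (const auto& task : tasks) {
        held += task->pixels.size();
    }
    return held;
}
```

In this sketch, bytesHeldAfterUploads(true, 4, 1024) leaves no mask bytes
held, while bytesHeldAfterUploads(false, 4, 1024) still holds all 4096.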

Bug: skia:
Change-Id: Iacabc9aa51522578da9f4d9411995b8d4fd381ba
Reviewed-on: https://skia-review.googlesource.com/41848
Commit-Queue: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Brian Salomon <bsalomon@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
diff --git a/src/gpu/GrSoftwarePathRenderer.cpp b/src/gpu/GrSoftwarePathRenderer.cpp
index 1363e1a..3070d4d 100644
--- a/src/gpu/GrSoftwarePathRenderer.cpp
+++ b/src/gpu/GrSoftwarePathRenderer.cpp
@@ -205,6 +205,9 @@
                               this->fPixels.width(), this->fPixels.height(),
                               kAlpha_8_GrPixelConfig,
                               this->fPixels.addr(), this->fPixels.rowBytes());
+                // Free this memory immediately, so it can be recycled. This avoids memory pressure
+                // when there is a large amount of threaded work still running during flush.
+                this->fPixels.reset();
             }
         };
         flushState->addASAPUpload(std::move(uploadMask));