better exclusion for stack traces

Instead of two synchronization systems (in_signal_handler, gMutex),
we can just use one.  This simplifies the signal handler logic to:
   - first thread through grabs the lock, prints what's running and a stack trace,
     then exits
   - all other threads just sit waiting on that lock until exit kills them
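
A minimal sketch of that single-lock flow, not the code from this change. Every
name below (handler_mutex, print_report, handle_crash, park_forever,
simulate_concurrent_crashes) is hypothetical, and finish() stands in for the
real exit call so the behavior can be observed without killing the process. It
also ignores async-signal-safety, which a real handler has to think about.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdio>
#include <mutex>
#include <thread>

// Hypothetical single lock. Leaked on purpose: it is never unlocked or destroyed.
static std::mutex& handler_mutex() {
    static std::mutex* m = new std::mutex;
    return *m;
}

static std::atomic<int> gReports{0};

// Stand-in for "print what's running and a stack trace".
static void print_report(int sig) {
    gReports.fetch_add(1);
    std::fprintf(stderr, "caught signal %d\n", sig);
}

// finish() must not return; a real handler would end the process there.
static void handle_crash(int sig, void (*finish)(int)) {
    assert(finish != nullptr);
    handler_mutex().lock();  // first thread gets through; the rest block here
    print_report(sig);
    finish(sig);             // the lock is intentionally still held
}

// Demo-only finish(): parks the thread instead of exiting.
static std::atomic<bool> gFinished{false};
static void park_forever(int) {
    gFinished.store(true);
    for (;;) std::this_thread::sleep_for(std::chrono::seconds(1));
}

// Simulates several threads crashing at once; returns how many reports were printed.
static int simulate_concurrent_crashes(int threads) {
    for (int i = 0; i < threads; i++) {
        std::thread(handle_crash, 11, park_forever).detach();
    }
    while (!gFinished.load()) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    // Give the losing threads time to pile up on the lock.
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return gReports.load();
}
```

Because the winner never releases the lock, only one report can ever be
printed, no matter how many threads enter the handler.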

Previously I think all threads were racing to exit, which could kill the process
before the printing thread was done.  That truncated the output, which is dumb.

Plus...
   refactor slightly so that crash_handler() shows up at the top of the stack
   trace rather than some odd name for a lambda inside setup_crash_handler().
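
Roughly the shape of that refactor, as a sketch only. The handler body, the
gLastSignal variable, and the choice of signals are assumptions made for
illustration; only the two function names come from this change.

```cpp
#include <csignal>

// Assumed for illustration: records the last signal seen.
static volatile std::sig_atomic_t gLastSignal = 0;

// A named function shows up in a backtrace under its own name,
// where a lambda would show up under a compiler-generated one.
static void crash_handler(int sig) {
    gLastSignal = sig;
}

static void setup_crash_handler() {
    // Assumed signal list; the real set may differ.
    const int kSignals[] = { SIGABRT, SIGFPE, SIGSEGV };
    for (int sig : kSignals) {
        std::signal(sig, crash_handler);
    }
}
```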

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2051863002

Review-Url: https://codereview.chromium.org/2051863002
1 file changed