Change the way thread termination is handled.  Until now, there has
been a concept of a 'master thread'.  This is the first thread in the
process.  There was special logic which kept the master thread alive
artificially should it attempt to exit before its children.  So the
master would wait for all children to exit and then exit itself, in
the process emitting the final summary of errors, leaks, etc.

This has the advantage that any process waiting on this one will see
the final summaries appearing before its sys_wait call returns.  In
other words, the final summary output is synchronous with the
master-thread exiting.

Unfortunately the master-thread idea has a serious drawback, namely
that it can and sometimes does cause threaded programs to deadlock at
exit.  It introduces an artificial dependency which is that the master
thread cannot really exit until all its children have exited.  If --
by any means at all -- the children are waiting for the master to exit
before exiting themselves, deadlock results.  There are now two known
examples of such deadlocks.

This commit removes the master thread concept and lets threads exit in
the order in which they would have exited without Valgrind's
involvement.  The last thread to exit prints the final summaries.  This
has the disadvantage that final output may appear arbitrarily later
than the exit of the initial thread.  Whether this is a problem in
practice remains to be seen.
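
The "last thread out prints the summary" rule can be illustrated with
a small standalone sketch.  This is not the code in this commit: it
uses plain pthreads and C11 atomics, and every name in it
(live_threads, thread_exit_hook, print_final_summary, run_demo) is
hypothetical.  It only shows one way a live-thread counter can pick
exactly one thread to emit the final output, without the initial
thread waiting for anyone.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical names throughout; an illustration, not Valgrind's code. */
static atomic_int live_threads;       /* threads that have not yet exited */
static atomic_int summaries_printed;  /* how many times the summary ran   */

static void print_final_summary(void)
{
    atomic_fetch_add(&summaries_printed, 1);
    puts("final summary: errors, leaks, etc.");
}

/* Every thread calls this as its last action.  atomic_fetch_sub returns
   the value before the decrement, so 1 means "I was the last one". */
static void thread_exit_hook(void)
{
    if (atomic_fetch_sub(&live_threads, 1) == 1)
        print_final_summary();
}

static void *worker(void *arg)
{
    (void)arg;
    thread_exit_hook();
    return NULL;
}

/* Start up to 16 workers, let the initial thread "exit" without waiting
   for them, and return how many summaries were printed. */
static int run_demo(int n)
{
    pthread_t tids[16];
    int created = 0;

    if (n > 16)
        n = 16;
    atomic_store(&live_threads, 1);   /* the initial thread itself */
    atomic_store(&summaries_printed, 0);

    for (int i = 0; i < n; i++) {
        /* Count the thread before it can run, so the counter cannot
           reach zero while threads are still being created. */
        atomic_fetch_add(&live_threads, 1);
        if (pthread_create(&tids[created], NULL, worker, NULL) == 0)
            created++;
        else
            atomic_fetch_sub(&live_threads, 1);  /* undo on failure */
    }

    thread_exit_hook();  /* initial thread leaves in its natural order */

    /* The joins exist only so the demo can inspect the result. */
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    assert(atomic_load(&live_threads) == 0);
    return atomic_load(&summaries_printed);
}
```

Calling run_demo(4) prints the summary once, from whichever thread
happens to finish last.  Because no thread waits on another in order to
exit, the artificial dependency described above does not arise; the
cost is that the summary is no longer tied to the initial thread's
exit.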

As a minor side effect of this change, some functions have had
_NORETURN added to their names.  Such functions never return to their
caller: the thread in which they execute is guaranteed to exit first.
This makes the logic somewhat easier to follow.

amd64 compilation is now broken.  I will fix it shortly.

git-svn-id: svn://svn.valgrind.org/valgrind/trunk@3816 a5019735-40e9-0310-863c-91ae7b9d1cf9
8 files changed