Import thrcheck from the THRCHECK branch, and rename it Helgrind (with
permission of the existing Helgrind authors).
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@7116 a5019735-40e9-0310-863c-91ae7b9d1cf9
diff --git a/helgrind/docs/hg-manual.xml b/helgrind/docs/hg-manual.xml
new file mode 100644
index 0000000..5090cfc
--- /dev/null
+++ b/helgrind/docs/hg-manual.xml
@@ -0,0 +1,1311 @@
+<?xml version="1.0"?> <!-- -*- sgml -*- -->
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+ "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">
+
+
+<chapter id="tc-manual" xreflabel="Thrcheck: thread error detector">
+ <title>Thrcheck: a thread error detector</title>
+
+<para>To use this tool, you must specify
+<computeroutput>--tool=thrcheck</computeroutput> on the Valgrind
+command line.</para>
+
+
+
+
+<sect1 id="tc-manual.overview" xreflabel="Overview">
+<title>Overview</title>
+
+<para>Thrcheck is a Valgrind tool for detecting synchronisation errors
+in C, C++ and Fortran programs that use the POSIX pthreads
+threading primitives.</para>
+
+<para>The main abstractions in POSIX pthreads are: a set of threads
+sharing a common address space, thread creation, thread joining,
+thread exit, mutexes (locks), condition variables (inter-thread event
+notifications), reader-writer locks, and semaphores.</para>
+
+<para>Thrcheck is aware of all these abstractions and tracks their
+effects as accurately as it can. Currently it does not correctly
+handle pthread barriers and pthread spinlocks, although it will not
+object if you use them. On x86 and amd64 platforms, it understands
+and partially handles implicit locking arising from the use of the
+LOCK instruction prefix.
+</para>
+
+<para>Thrcheck can detect three classes of errors, which are discussed
+in detail in the next three sections:</para>
+
+<orderedlist>
+ <listitem>
+ <para><link linkend="tc-manual.api-checks">
+ Misuses of the POSIX pthreads API.</link></para>
+ </listitem>
+ <listitem>
+ <para><link linkend="tc-manual.lock-orders">
+ Potential deadlocks arising from lock
+ ordering problems.</link></para>
+ </listitem>
+ <listitem>
+ <para><link linkend="tc-manual.data-races">
+ Data races -- accessing memory without adequate locking.
+ </link></para>
+ </listitem>
+</orderedlist>
+
+<para>Following those is a section containing
+<link linkend="tc-manual.effective-use">
+hints and tips on how to get the best out of Thrcheck.</link>
+</para>
+
+<para>Then there is a
+<link linkend="tc-manual.options">summary of command-line
+options.</link>
+</para>
+
+<para>Finally, there is
+<link linkend="tc-manual.todolist">a brief summary of areas in which Thrcheck
+could be improved.</link>
+</para>
+
+</sect1>
+
+
+
+
+<sect1 id="tc-manual.api-checks" xreflabel="API Checks">
+<title>Detected errors: Misuses of the POSIX pthreads API</title>
+
+<para>Thrcheck intercepts calls to many POSIX pthreads functions, and
+is therefore able to report on various common problems. Although
+these are unglamorous errors, their presence can lead to undefined
+program behaviour and hard-to-find bugs later in execution. The
+detected errors are:</para>
+
+<itemizedlist>
+ <listitem><para>unlocking an invalid mutex</para></listitem>
+ <listitem><para>unlocking a not-locked mutex</para></listitem>
+ <listitem><para>unlocking a mutex held by a different
+ thread</para></listitem>
+ <listitem><para>destroying an invalid or a locked mutex</para></listitem>
+ <listitem><para>recursively locking a non-recursive mutex</para></listitem>
+ <listitem><para>deallocation of memory that contains a
+ locked mutex</para></listitem>
+ <listitem><para>passing mutex arguments to functions expecting
+ reader-writer lock arguments, and vice
+ versa</para></listitem>
+ <listitem><para>a POSIX pthread function failing with an
+ error code that must be handled</para></listitem>
+ <listitem><para>a thread exiting whilst still holding locked
+ locks</para></listitem>
+ <listitem><para>calling <computeroutput>pthread_cond_wait</computeroutput>
+ with a not-locked mutex, or one locked by a different
+ thread</para></listitem>
+</itemizedlist>
+
+<para>Checks pertaining to the validity of mutexes are generally also
+performed for reader-writer locks.</para>
+
+<para>Various kinds of this-can't-possibly-happen events are also
+reported. These usually indicate bugs in the system threading
+library.</para>
+
+<para>Reported errors always contain a primary stack trace indicating
+where the error was detected. They may also contain auxiliary stack
+traces giving additional information. In particular, most errors
+relating to mutexes will also tell you where that mutex first came to
+Thrcheck's attention (the "<computeroutput>was first observed
+at</computeroutput>" part), so you have a chance of figuring out which
+mutex it is referring to. For example:</para>
+
+<programlisting><![CDATA[
+Thread #1 unlocked a not-locked lock at 0x7FEFFFA90
+ at 0x4C2408D: pthread_mutex_unlock (tc_intercepts.c:492)
+ by 0x40073A: nearly_main (tc09_bad_unlock.c:27)
+ by 0x40079B: main (tc09_bad_unlock.c:50)
+ Lock at 0x7FEFFFA90 was first observed
+ at 0x4C25D01: pthread_mutex_init (tc_intercepts.c:326)
+ by 0x40071F: nearly_main (tc09_bad_unlock.c:23)
+ by 0x40079B: main (tc09_bad_unlock.c:50)
+]]></programlisting>
+
+<para>Thrcheck has a way of summarising thread identities, as
+evidenced here by the text "<computeroutput>Thread
+#1</computeroutput>". This is so that it can speak about threads and
+sets of threads without overwhelming you with details. See
+<link linkend="tc-manual.data-races.errmsgs">below</link>
+for more information on interpreting error messages.</para>
+
+</sect1>
+
+
+
+
+<sect1 id="tc-manual.lock-orders" xreflabel="Lock Orders">
+<title>Detected errors: Inconsistent Lock Orderings</title>
+
+<para>In this section, and in general, to "acquire" a lock simply
+means to lock that lock, and to "release" a lock means to unlock
+it.</para>
+
+<para>Thrcheck monitors the order in which threads acquire locks.
+This allows it to detect potential deadlocks which could arise from
+the formation of cycles of locks. Detecting such inconsistencies is
+useful because, whilst actual deadlocks are fairly obvious, potential
+deadlocks may never be discovered during testing and could later lead
+to hard-to-diagnose in-service failures.</para>
+
+<para>The simplest example of such a problem is as
+follows.</para>
+
+<itemizedlist>
+ <listitem><para>Imagine some shared resource R, which, for whatever
+ reason, is guarded by two locks, L1 and L2, which must both be held
+ when R is accessed.</para>
+ </listitem>
+ <listitem><para>Suppose a thread acquires L1, then L2, and proceeds
+ to access R. The implication of this is that all threads in the
+ program must acquire the two locks in the order first L1 then L2.
+ Not doing so risks deadlock.</para>
+ </listitem>
+ <listitem><para>The deadlock could happen if two threads -- call them
+ T1 and T2 -- both want to access R. Suppose T1 acquires L1 first,
+ and T2 acquires L2 first. Then T1 tries to acquire L2, and T2 tries
+ to acquire L1, but those locks are both already held. So T1 and T2
+ become deadlocked.</para>
+ </listitem>
+</itemizedlist>
+
+<para>Thrcheck builds a directed graph indicating the order in which
+locks have been acquired in the past. When a thread acquires a new
+lock, the graph is updated, and then checked to see if it now contains
+a cycle. The presence of a cycle indicates a potential deadlock involving
+the locks in the cycle.</para>
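+
+<para>The graph check described above can be sketched as follows.
+This is a minimal illustration only, not Thrcheck's implementation:
+the names <computeroutput>record_acquisition</computeroutput>,
+<computeroutput>reset_lock_graph</computeroutput> and
+<computeroutput>MAX_LOCKS</computeroutput> are hypothetical, locks are
+identified by small integers, and the graph is assumed to fit in a
+fixed-size adjacency matrix.</para>
+
+<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 16   /* hypothetical fixed bound, for the sketch only */

/* edge[a][b] is true if some thread acquired lock b while holding lock a. */
static bool edge[MAX_LOCKS][MAX_LOCKS];

/* Depth-first search: is 'target' reachable from 'from'? */
static bool reachable(int from, int target, bool *visited)
{
    if (from == target)
        return true;
    visited[from] = true;
    for (int next = 0; next < MAX_LOCKS; next++) {
        if (edge[from][next] && !visited[next]
            && reachable(next, target, visited))
            return true;
    }
    return false;
}

/* Record that lock 'acquired' was taken while lock 'held' was held.
   Returns true if the new edge closes a cycle, that is, if a path
   from 'acquired' back to 'held' had already been recorded. */
bool record_acquisition(int held, int acquired)
{
    bool visited[MAX_LOCKS];

    assert(held >= 0 && held < MAX_LOCKS);
    assert(acquired >= 0 && acquired < MAX_LOCKS);

    memset(visited, 0, sizeof visited);
    bool cycle = reachable(acquired, held, visited);
    edge[held][acquired] = true;
    return cycle;
}

/* Forget all recorded orderings. */
void reset_lock_graph(void)
{
    memset(edge, 0, sizeof edge);
}
```
+]]></programlisting>
+
+<para>For instance, recording the acquisitions "1 then 2", "2 then 3"
+and finally "3 then 1" makes the third call return true, because a
+path from lock 1 to lock 3 has already been recorded.</para>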
+
+<para>In simple situations, where the cycle only contains two locks,
+Thrcheck will show where the required order was established:</para>
+
+<programlisting><![CDATA[
+Thread #1: lock order "0x7FEFFFAB0 before 0x7FEFFFA80" violated
+ at 0x4C23C91: pthread_mutex_lock (tc_intercepts.c:388)
+ by 0x40081F: main (tc13_laog1.c:24)
+ Required order was established by acquisition of lock at 0x7FEFFFAB0
+ at 0x4C23C91: pthread_mutex_lock (tc_intercepts.c:388)
+ by 0x400748: main (tc13_laog1.c:17)
+ followed by a later acquisition of lock at 0x7FEFFFA80
+ at 0x4C23C91: pthread_mutex_lock (tc_intercepts.c:388)
+ by 0x400773: main (tc13_laog1.c:18)
+]]></programlisting>
+
+<para>When there are more than two locks in the cycle, the error is
+equally serious. However, at present Thrcheck does not show the locks
+involved, so as to avoid flooding you with information. That could be
+fixed in future. Here is an example involving a cycle
+of five locks, from a naive implementation of the famous Dining
+Philosophers problem
+(see <computeroutput>thrcheck/tests/tc14_laog_dinphils.c</computeroutput>).
+In this case Thrcheck has detected that all 5 philosophers could
+simultaneously pick up their left fork and then deadlock whilst
+waiting to pick up their right forks.</para>
+
+<programlisting><![CDATA[
+Thread #6: lock order "0x6010C0 before 0x601160" violated
+ at 0x4C23C91: pthread_mutex_lock (tc_intercepts.c:388)
+ by 0x4007C0: dine (tc14_laog_dinphils.c:19)
+ by 0x4C25DF7: mythread_wrapper (tc_intercepts.c:178)
+ by 0x4E2F09D: start_thread (in /lib64/libpthread-2.5.so)
+ by 0x51054CC: clone (in /lib64/libc-2.5.so)
+]]></programlisting>
+
+</sect1>
+
+
+
+
+<sect1 id="tc-manual.data-races" xreflabel="Data Races">
+<title>Detected errors: Data Races</title>
+
+<para>A data race happens, or could happen, when two threads
+access a shared memory location without using suitable locks to
+ensure single-threaded access. Such missing locking can cause
+obscure timing-dependent bugs. Ensuring programs are race-free is
+one of the central difficulties of threaded programming.</para>
+
+<para>Reliably detecting races is a difficult problem, and most
+of Thrcheck's internals are devoted to dealing with it.
+As a consequence this section is somewhat long and involved.
+We begin with a simple example.</para>
+
+
+<sect2 id="tc-manual.data-races.example" xreflabel="Simple Race">
+<title>A Simple Data Race</title>
+
+<para>About the simplest possible example of a race is as follows. In
+this program, it is impossible to know what the value
+of <computeroutput>var</computeroutput> is at the end of the program.
+Is it 2? Or 1?</para>
+
+<programlisting><![CDATA[
+#include <pthread.h>
+
+int var = 0;
+
+void* child_fn ( void* arg ) {
+ var++; /* Unprotected relative to parent */ /* this is line 6 */
+ return NULL;
+}
+
+int main ( void ) {
+ pthread_t child;
+ pthread_create(&child, NULL, child_fn, NULL);
+ var++; /* Unprotected relative to child */ /* this is line 13 */
+ pthread_join(child, NULL);
+ return 0;
+}
+]]></programlisting>
+
+<para>The problem is there is nothing to
+stop <computeroutput>var</computeroutput> being updated simultaneously
+by both threads. A correct program would
+protect <computeroutput>var</computeroutput> with a lock of type
+<computeroutput>pthread_mutex_t</computeroutput>, which is acquired
+before each access and released afterwards. Thrcheck's output for
+this program is:</para>
+
+<programlisting><![CDATA[
+Thread #1 is the program's root thread
+
+Thread #2 was created
+ at 0x510548E: clone (in /lib64/libc-2.5.so)
+ by 0x4E2F305: do_clone (in /lib64/libpthread-2.5.so)
+ by 0x4E2F7C5: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.5.so)
+ by 0x4C23870: pthread_create@* (tc_intercepts.c:198)
+ by 0x4005F1: main (simple_race.c:12)
+
+Possible data race during write of size 4 at 0x601034
+ at 0x4005F2: main (simple_race.c:13)
+ Old state: shared-readonly by threads #1, #2
+ New state: shared-modified by threads #1, #2
+ Reason: this thread, #1, holds no consistent locks
+ Location 0x601034 has never been protected by any lock
+]]></programlisting>
+
+<para>This is quite a lot of detail for an apparently simple error.
+The last clause is the main error message. It says there is a race as
+a result of a write of size 4 (bytes), at 0x601034, which is
+presumably the address of <computeroutput>var</computeroutput>,
+happening in function <computeroutput>main</computeroutput> at line 13
+in the program.</para>
+
+<para>Note that it is purely by chance that the race is
+reported for the parent thread's access. It could equally have been
+reported instead for the child's access, at line 6. The error will
+only be reported for one of the locations, since neither the parent
+nor child is, by itself, incorrect. It is only when both access
+<computeroutput>var</computeroutput> without a lock that an error
+exists.</para>
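+
+<para>For comparison, a race-free variant of the program might look
+like the sketch below. It is an illustration under stated assumptions
+rather than part of the Thrcheck test suite: the names
+<computeroutput>var_lock</computeroutput> and
+<computeroutput>run_locked_increments</computeroutput> are
+hypothetical, and the body of <computeroutput>main</computeroutput>
+has been moved into a function so that the result can be inspected.
+Every access to <computeroutput>var</computeroutput> is made with the
+mutex held.</para>
+
+<programlisting><![CDATA[
```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static int var = 0;
static pthread_mutex_t var_lock = PTHREAD_MUTEX_INITIALIZER; /* guards var */

static void *child_fn(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&var_lock);
    var++;                            /* protected by var_lock */
    pthread_mutex_unlock(&var_lock);
    return NULL;
}

/* Increment var once in the parent and once in a child thread, always
   under var_lock, and return the value of var seen afterwards. */
int run_locked_increments(void)
{
    pthread_t child;
    int rc, seen;

    rc = pthread_create(&child, NULL, child_fn, NULL);
    assert(rc == 0);

    pthread_mutex_lock(&var_lock);
    var++;                            /* protected by var_lock */
    pthread_mutex_unlock(&var_lock);

    rc = pthread_join(child, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&var_lock);
    seen = var;                       /* read under the same lock */
    pthread_mutex_unlock(&var_lock);
    return seen;
}
```
+]]></programlisting>
+
+<para>With the lock in place the two increments can no longer
+overlap, so the first call returns 2 regardless of which thread runs
+first.</para>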
+
+<para>The error message shows some other interesting details. The
+sections below explain them. Here we merely note their presence:</para>
+
+<itemizedlist>
+ <listitem><para>Thrcheck maintains some kind of state machine for the
+ memory location in question, hence the "<computeroutput>Old
+ state:</computeroutput>" and "<computeroutput>New
+ state:</computeroutput>" lines.</para>
+ </listitem>
+ <listitem><para>Thrcheck keeps track of which threads have accessed
+ the location: "<computeroutput>threads #1, #2</computeroutput>".
+ Before printing the main error message, it prints the creation
+ points of these two threads, so you can see which threads it is
+ referring to.</para>
+ </listitem>
+ <listitem><para>Thrcheck tries to provide an explanation of why the
+ race exists: "<computeroutput>Location 0x601034 has never been
+ protected by any lock</computeroutput>".</para>
+ </listitem>
+</itemizedlist>
+
+<para>Understanding the memory state machine is central to
+understanding Thrcheck's race-detection algorithm. The next three
+subsections explain this.</para>
+
+</sect2>
+
+
+<sect2 id="tc-manual.data-races.memstates" xreflabel="Memory States">
+<title>Thrcheck's Memory State Machine</title>
+
+<para>Thrcheck tracks the state of every byte of memory used by your
+program. There are a number of states, but only three are
+interesting:</para>
+
+<itemizedlist>
+ <listitem><para>Exclusive: memory in this state is regarded as owned
+ exclusively by one particular thread. That thread may read and
+ write it without a lock. Even in highly threaded programs, the
+ majority of locations never leave the Exclusive state, since most
+ data is thread-private.</para>
+ </listitem>
+ <listitem><para>Shared-Readonly: memory in this state is regarded as
+ shared by multiple threads. In this state, any thread may read the
+ memory without a lock, reflecting the fact that readonly data may
+ safely be shared between threads without locking.</para>
+ </listitem>
+ <listitem><para>Shared-Modified: memory in this state is regarded as
+ shared by multiple threads, at least one of which has written to it.
+ All participating threads must hold at least one lock in common when
+ accessing the memory. If no such lock exists, Thrcheck reports a
+ race error.</para>
+ </listitem>
+</itemizedlist>
+
+<para>Let's review the simple example above with this in mind. When
+the program starts, <computeroutput>var</computeroutput> is not in any
+of these states. Either the parent or child thread gets to its
+<computeroutput>var++</computeroutput> first, and
+thereby gets Exclusive ownership of the location.</para>
+
+<para>The later-running thread now arrives at
+its <computeroutput>var++</computeroutput> statement. It first reads
+the existing value from memory.
+Because <computeroutput>var</computeroutput> is currently marked as
+owned exclusively by the other thread, its state is changed to
+shared-readonly by both threads.</para>
+
+<para>This same thread adds one to the value it has and stores it back
+in <computeroutput>var</computeroutput>. This causes another state
+change, this time to the shared-modified state. Because Thrcheck has
+also been tracking which threads hold which locks, it can see that
+<computeroutput>var</computeroutput> is in shared-modified state but
+no lock has been used to consistently protect it. Hence a race is
+reported exactly at the transition from shared-readonly to
+shared-modified.</para>
+
+<para>The essence of the algorithm is this. Thrcheck keeps track of
+each memory location that has been accessed by more than one thread.
+For each such location it incrementally infers the set of locks which
+have consistently been used to protect that location. If the
+location's lockset becomes empty, and at some point one of the threads
+attempts to write to it, a race is then reported.</para>
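+
+<para>The basic, unrefined form of this inference can be sketched as
+below. This is a simplified model for illustration, not Thrcheck's
+actual data structures: the types <computeroutput>LockSet</computeroutput>
+and <computeroutput>Location</computeroutput> and the function
+<computeroutput>on_access</computeroutput> are hypothetical, locksets
+are assumed to fit in a 32-bit mask, and the refinements described in
+later sections are deliberately left out.</para>
+
+<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t LockSet;   /* bit i set <=> lock number i is in the set */

typedef enum { NEW, EXCLUSIVE, SHARED_READONLY, SHARED_MODIFIED } State;

typedef struct {
    State   state;
    int     owner;       /* meaningful only in the EXCLUSIVE state */
    LockSet candidates;  /* locks held during every shared access so far */
} Location;

void location_init(Location *loc)
{
    assert(loc != NULL);
    loc->state = NEW;
    loc->owner = -1;
    loc->candidates = 0;
}

/* Update the state of 'loc' for one access by 'thread', which holds
   the locks in 'held'. Returns true if the access should be reported
   as a possible race. */
bool on_access(Location *loc, int thread, LockSet held, bool is_write)
{
    assert(loc != NULL);
    switch (loc->state) {
    case NEW:
        loc->state = EXCLUSIVE;          /* first toucher owns it */
        loc->owner = thread;
        return false;
    case EXCLUSIVE:
        if (thread == loc->owner)
            return false;                /* owner needs no lock */
        loc->candidates = held;          /* inference starts here */
        loc->state = is_write ? SHARED_MODIFIED : SHARED_READONLY;
        break;
    case SHARED_READONLY:
        loc->candidates &= held;         /* keep only the common locks */
        if (is_write)
            loc->state = SHARED_MODIFIED;
        break;
    case SHARED_MODIFIED:
        loc->candidates &= held;
        break;
    }
    /* A race is reported only once the location has been written
       while shared and no lock has protected every shared access. */
    return loc->state == SHARED_MODIFIED && loc->candidates == 0;
}
```
+]]></programlisting>
+
+<para>Replaying the simple example through this model gives the same
+sequence as in the text: the first thread's read and write leave the
+location Exclusive, the second thread's unlocked read moves it to
+shared-readonly, and the second thread's unlocked write moves it to
+shared-modified with an empty candidate set, which is reported.</para>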
+
+<para>This technique is known as "lockset inference" and was
+introduced in: "Eraser: A Dynamic Data Race Detector for Multithreaded
+Programs" (Stefan Savage, Michael Burrows, Greg Nelson, Patrick
+Sobalvarro and Thomas Anderson, ACM Transactions on Computer Systems,
+15(4):391-411, November 1997).</para>
+
+<para>Lockset inference has since been widely implemented, studied and
+extended. Thrcheck incorporates several refinements aimed at avoiding
+the high false error rate that naive versions of the algorithm suffer
+from. A
+<link linkend="tc-manual.data-races.summary">summary of the complete
+algorithm used by Thrcheck</link> is presented below. First, however,
+it is important to understand details of transitions pertaining to the
+Exclusive-ownership state.</para>
+
+</sect2>
+
+
+
+<sect2 id="tc-manual.data-races.exclusive" xreflabel="Excl Transfers">
+<title>Transfers of Exclusive Ownership Between Threads</title>
+
+<para>As presented, the algorithm is far too strict. It reports many
+errors in perfectly correct, widely used parallel programming
+constructions, for example, using child worker threads and worker
+thread pools.</para>
+
+<para>To avoid these false errors, we must refine the algorithm so
+that it keeps memory in an Exclusive ownership state in cases where it
+would otherwise decay into a shared-readonly or shared-modified state.
+Recall that Exclusive ownership is special in that it grants the
+owning thread the right to access memory without use of any locks. In
+order to support worker-thread and worker-thread-pool idioms, we will
+allow threads to steal exclusive ownership of memory from other
+threads under certain circumstances.</para>
+
+<para>Here's an example. Imagine a parent thread creates child
+threads to do units of work. For each unit of work, the parent
+allocates a work buffer, fills it in, and creates the child thread,
+handing it a pointer to the buffer. The child reads/writes the buffer
+and eventually exits, and the waiting parent then extracts the results
+from the buffer:</para>
+
+<programlisting><![CDATA[
+typedef ... Buffer;
+
+pthread_t child;
+Buffer buf;
+
+/* ---- Parent ---- */                          /* ---- Child ---- */
+
+/* parent writes workload into buf */
+pthread_create( &child, NULL, child_fn, &buf );
+
+/* parent does not read */                      void* child_fn ( void* arg ) {
+/* or write buf */                                 Buffer* buf = arg;
+                                                   /* read/write buf */
+                                                   return NULL;
+                                                }
+
+pthread_join ( child, NULL );
+/* parent reads results from buf */
+]]></programlisting>
+
+<para>Although <computeroutput>buf</computeroutput> is accessed by
+both threads, neither uses locks, yet the program is race-free. The
+essential observation is that the child's creation and exit create
+synchronisation events between it and the parent. These force the
+child's accesses to <computeroutput>buf</computeroutput> to happen
+after the parent initialises <computeroutput>buf</computeroutput>, and
+before the parent reads the results
+from <computeroutput>buf</computeroutput>.</para>
+
+<para>To model this, Thrcheck allows the child to steal, from the
+parent, exclusive ownership of any memory exclusively owned by the
+parent before the pthread_create call. Similarly, once the parent's
+pthread_join call returns, it can steal back ownership of memory
+exclusively owned by the child. In this way ownership
+of <computeroutput>buf</computeroutput> is transferred from parent to
+child and back, so the basic algorithm does not report any races
+despite the absence of any locking.</para>
+
+<para>Note that the child may only steal memory owned by the parent
+prior to the pthread_create call. If the child attempts to read or
+write memory which is also accessed by the parent in between the
+pthread_create and pthread_join calls, an error is still
+reported.</para>
+
+<para>This technique was introduced with the name "thread lifetime
+segments" in "Runtime Checking of Multithreaded Applications with
+Visual Threads" (Jerry J. Harrow, Jr, Proceedings of the 7th
+International SPIN Workshop on Model Checking of Software, Stanford,
+California, USA, August 2000, LNCS 1885, pp331--342). Thrcheck
+implements an extended version of it. Specifically, Thrcheck allows
+transfer of exclusive ownership in the following situations:</para>
+
+<itemizedlist>
+ <listitem><para>At thread creation: a child can acquire ownership of
+ memory held exclusively by the parent prior to the child's
+ creation.</para>
+ </listitem>
+ <listitem><para>At thread joining: the joiner (thread not exiting)
+ can acquire ownership of memory held exclusively by the joinee
+ (thread that is exiting) at the point it exited.</para>
+ </listitem>
+ <listitem><para>At condition variable signallings and broadcasts. A
+ thread Tw which completes a pthread_cond_wait call as a result of
+ a signal or broadcast on the same condition variable by some other
+ thread Ts, may acquire ownership of memory held exclusively by
+ Ts prior to the pthread_cond_signal/broadcast
+ call.</para>
+ </listitem>
+ <listitem><para>At semaphore post (sem_post) calls. A thread Tw
+ which completes a sem_wait call as a result of a sem_post call
+ on the same semaphore by some other thread Tp, may acquire
+ ownership of memory held exclusively by Tp prior to the sem_post
+ call.</para>
+ </listitem>
+</itemizedlist>
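+
+<para>The semaphore case can be illustrated with the sketch below, in
+which a buffer is handed from one thread to another with no mutex at
+all. It is a hypothetical example written for this section, not taken
+from the Thrcheck tests: the names
+<computeroutput>run_handoff</computeroutput>,
+<computeroutput>work_buf</computeroutput> and
+<computeroutput>ready</computeroutput> are invented, and it assumes a
+platform that supports unnamed POSIX semaphores (for example
+Linux).</para>
+
+<programlisting><![CDATA[
```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

static int   work_buf;   /* written by the poster, then read by the waiter */
static int   result;     /* written by the waiter, read after the join */
static sem_t ready;

static void *waiter_fn(void *arg)
{
    (void)arg;
    /* Tw: this wait can only complete after the sem_post below. */
    while (sem_wait(&ready) == -1 && errno == EINTR)
        continue;               /* retry if interrupted by a signal */
    result = work_buf * 2;      /* no lock held */
    return NULL;
}

/* Hand 'value' to a waiter thread through work_buf, using only a
   semaphore for ordering, and return what the waiter computed. */
int run_handoff(int value)
{
    pthread_t waiter;
    int rc;

    rc = sem_init(&ready, 0, 0);
    assert(rc == 0);
    rc = pthread_create(&waiter, NULL, waiter_fn, NULL);
    assert(rc == 0);

    /* Tp: written after pthread_create, so only the sem_post orders
       this write before the waiter's read. */
    work_buf = value;
    rc = sem_post(&ready);
    assert(rc == 0);

    rc = pthread_join(waiter, NULL);
    assert(rc == 0);
    (void)rc;
    sem_destroy(&ready);
    return result;
}
```
+]]></programlisting>
+
+<para>Under the rules above, ownership of
+<computeroutput>work_buf</computeroutput> would be expected to pass
+from the posting thread to the waiting thread at the sem_post, and
+ownership of <computeroutput>result</computeroutput> to pass back at
+the join.</para>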
+
+</sect2>
+
+
+
+<sect2 id="tc-manual.data-races.re-excl" xreflabel="Re-Excl Transfers">
+<title>Restoration of Exclusive Ownership</title>
+
+<para>Another common idiom is to partition the lifetime of the program
+as a whole into several distinct phases. In some of those phases, a
+memory location may be accessed by multiple threads and so require
+locking. In other phases only one thread exists and so can access the
+memory without locking. For example:</para>
+
+<programlisting><![CDATA[
+int var = 0; /* shared variable */
+pthread_mutex_t mx = PTHREAD_MUTEX_INITIALIZER; /* guard for var */
+pthread_t child;
+
+/* ---- Parent ---- */                          /* ---- Child ---- */
+
+var += 1; /* no lock used */
+
+pthread_create( &child, NULL, child_fn, NULL );
+
+                                                void* child_fn ( void* uu ) {
+pthread_mutex_lock(&mx);                           pthread_mutex_lock(&mx);
+var += 2;                                          var += 3;
+pthread_mutex_unlock(&mx);                         pthread_mutex_unlock(&mx);
+                                                   return NULL;
+                                                }
+
+pthread_join ( child, NULL );
+
+var += 4; /* no lock used */
+]]></programlisting>
+
+<para>This program is correct, but using only the mechanisms described
+so far, Thrcheck would report an error at
+<computeroutput>var += 4</computeroutput>. This is because, by that
+point, <computeroutput>var</computeroutput> is marked as being in the
+state "shared-modified and protected by the
+lock <computeroutput>mx</computeroutput>", but is being accessed
+without locking. Really, what we want is
+for <computeroutput>var</computeroutput> to return to the parent
+thread's exclusive ownership after the child thread has exited.</para>
+
+<para>To make this possible, for every memory location Thrcheck also keeps
+track of all the threads that have accessed that location
+-- its threadset. When a thread Tquitter joins back to Tstayer,
+Thrcheck examines the threadsets of all memory in shared-modified or
+shared-readonly state. In each such threadset, if Tquitter is
+mentioned, it is removed and replaced by Tstayer. If, as a result, a
+threadset becomes a singleton set containing Tstayer, then the
+location's state is changed to belongs-exclusively-to-Tstayer.</para>
+
+<para>In our example, the result is exactly as we desire:
+<computeroutput>var</computeroutput> is reacquired exclusively by the
+parent after the child exits.</para>
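+
+<para>The substitution performed at a join can be sketched as below.
+This is an illustration only: the type
+<computeroutput>ThreadSet</computeroutput> and the functions
+<computeroutput>threadset_on_join</computeroutput> and
+<computeroutput>threadset_is_singleton</computeroutput> are
+hypothetical, and thread numbers are assumed to be small enough to
+index a 32-bit mask.</para>
+
+<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t ThreadSet;  /* bit i set <=> thread #i accessed the location */

/* Apply "quitter joins back to stayer" to one location's threadset:
   if quitter is mentioned, remove it and add stayer instead. */
ThreadSet threadset_on_join(ThreadSet ts, int quitter, int stayer)
{
    assert(quitter >= 0 && quitter < 32);
    assert(stayer >= 0 && stayer < 32);

    ThreadSet q = (ThreadSet)1 << quitter;
    ThreadSet s = (ThreadSet)1 << stayer;
    if (ts & q)
        ts = (ts & ~q) | s;
    return ts;
}

/* True if exactly one thread remains in the set, in which case the
   location can revert to exclusive ownership by that thread. */
bool threadset_is_singleton(ThreadSet ts)
{
    return ts != 0 && (ts & (ts - 1)) == 0;
}
```
+]]></programlisting>
+
+<para>In the example, the threadset of
+<computeroutput>var</computeroutput> names the parent and the child.
+After the join it names only the parent, so it is a singleton. A
+location also touched by a third, still-running thread would stay
+shared.</para>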
+
+<para>More generally, when a group of threads merges back to a single
+thread via a cascade of pthread_join calls, any memory shared by the
+group (or a subset of it) ends up being owned exclusively by the sole
+surviving thread. This significantly enhances Thrcheck's flexibility,
+since it means that each memory location may make arbitrarily many
+transitions between exclusive and shared ownership. Furthermore, a
+different lock may protect the location during each period of shared
+ownership.</para>
+
+</sect2>
+
+
+
+<sect2 id="tc-manual.data-races.summary" xreflabel="Race Det Summary">
+<title>A Summary of the Race Detection Algorithm</title>
+
+<para>Thrcheck looks for memory locations which are accessed by more
+than one thread. For each such location, Thrcheck records which of
+the program's locks were held by the accessing thread at the time of
+each access. The hope is to discover that there is indeed at least
+one lock which is consistently used by all threads to protect that
+location. If no such lock can be found, then there is apparently no
+consistent locking strategy being applied for that location, and so a
+possible data race might result. Thrcheck accordingly reports an
+error.</para>
+
+<para>In practice this discipline is far too simplistic, and is
+unusable since it reports many races in some widely used and
+known-correct programming disciplines. Thrcheck's checking therefore
+incorporates many refinements to this basic idea, and can be
+summarised as follows:</para>
+
+<para>The following thread events are intercepted and monitored:</para>
+
+<itemizedlist>
+ <listitem><para>thread creation and exiting (pthread_create,
+ pthread_join, pthread_exit)</para>
+ </listitem>
+ <listitem>
+ <para>lock acquisition and release (pthread_mutex_lock,
+ pthread_mutex_unlock, pthread_rwlock_rdlock,
+ pthread_rwlock_wrlock,
+ pthread_rwlock_unlock)</para>
+ </listitem>
+ <listitem>
+ <para>inter-thread event notifications (pthread_cond_wait,
+ pthread_cond_signal, pthread_cond_broadcast,
+ sem_wait, sem_post)</para>
+ </listitem>
+</itemizedlist>
+
+<para>Memory allocation and deallocation events are intercepted and
+monitored:</para>
+
+<itemizedlist>
+ <listitem>
+ <para>malloc/new/free/delete and variants</para>
+ </listitem>
+ <listitem>
+ <para>stack allocation and deallocation</para>
+ </listitem>
+</itemizedlist>
+
+<para>All memory accesses are intercepted and monitored.</para>
+
+<para>By observing the above events, Thrcheck can infer certain
+aspects of the program's locking discipline. Programs which adhere to
+the following rules are considered to be acceptable:
+</para>
+
+<itemizedlist>
+ <listitem>
+ <para>A thread may allocate memory, and write initial values into
+ it, without locking. That thread is regarded as owning the memory
+ exclusively.</para>
+ </listitem>
+ <listitem>
+ <para>A thread may read and write memory which it owns exclusively,
+ without locking.</para>
+ </listitem>
+ <listitem>
+ <para>Memory which is owned exclusively by one thread may be read by
+ that thread and others without locking. However, in this situation
+ no thread may do unlocked writes to the memory (except for the owner
+ thread's initializing write).</para>
+ </listitem>
+ <listitem>
+ <para>Memory which is shared between multiple threads, one or more
+ of which writes to it, must be protected by a lock which is
+ correctly acquired and released by all threads accessing the
+ memory.</para>
+ </listitem>
+</itemizedlist>
+
+<para>Any violation of this discipline will cause an error to be reported.
+However, two exemptions apply:</para>
+
+<itemizedlist>
+ <listitem>
+ <para>A thread Y can acquire exclusive ownership of memory
+ previously owned exclusively by a different thread X providing
+ X's last access and Y's first access are separated by one of the
+ following synchronization events:</para>
+ <itemizedlist>
+ <listitem><para>X creates thread Y</para></listitem>
+ <listitem><para>X joins back to Y</para></listitem>
+ <listitem><para>X uses a condition-variable to signal at Y, and Y is
+ waiting for that event</para></listitem>
+ <listitem><para>Y completes a semaphore wait as a result of X signalling
+ on that same semaphore</para></listitem>
+ </itemizedlist>
+ <para>
+ This refinement allows Thrcheck to correctly track the ownership
+ state of inter-thread buffers used in the worker-thread and
+ worker-thread-pool concurrent programming idioms (styles).</para>
+ </listitem>
+ <listitem>
+ <para>Similarly, if thread Y joins back to thread X, memory
+ exclusively owned by Y becomes exclusively owned by X instead.
+ Also, memory that has been shared only by X and Y becomes
+ exclusively owned by X. More generally, memory that has been shared
+ by X, Y and some arbitrary other set S of threads is re-marked as
+ shared by X and S. Hence, under the right circumstances, memory
+ shared amongst multiple threads, all of which join into just one,
+ can revert to the exclusive ownership state.</para>
+ <para>
+ In effect, each memory location may make arbitrarily many
+ transitions between exclusive and shared ownership. Furthermore, a
+ different lock may protect the location during each period of shared
+ ownership. This significantly enhances the flexibility of the
+ algorithm.</para>
+ </listitem>
+</itemizedlist>
+
+<para>The ownership state, accessing thread-set and related lock-set
+for each memory location are tracked at 8-bit granularity. This means
+the algorithm is precise even for 16- and 8-bit memory
+accesses.</para>
+
+<para>Thrcheck correctly handles reader-writer locks in this
+framework. Locations shared between multiple threads can be protected
+during reads by locks held in either read-mode or write-mode, but can
+only be protected during writes by locks held in write-mode. Normal
+POSIX mutexes are treated as if they are reader-writer locks which are
+only ever held in write-mode.</para>
+
+<para>Thrcheck correctly handles POSIX mutexes for which recursive
+locking is allowed.</para>
+
+<para>Thrcheck partially correctly handles x86 and amd64 memory access
+instructions preceded by a LOCK prefix. Writes are correctly handled,
+by pretending that the LOCK prefix implies acquisition and release of
+a magic "bus hardware lock" mutex before and after the instruction.
+This unfortunately requires subsequent reads from such locations to
+also use a LOCK prefix, which is not required by the real hardware.
+Thrcheck does not offer any equivalent handling for atomic sequences
+on PowerPC/POWER platforms created by the use of lwarx/stwcx
+instructions.</para>
+
+</sect2>
+
+
+
+<sect2 id="tc-manual.data-races.errmsgs" xreflabel="Race Error Messages">
+<title>Interpreting Race Error Messages</title>
+
+<para>Thrcheck's race detection algorithm collects a lot of
+information, and tries to present it in a helpful way when a race is
+detected. Here's an example:</para>
+
+<programlisting><![CDATA[
+Thread #2 was created
+ at 0x510548E: clone (in /lib64/libc-2.5.so)
+ by 0x4E2F305: do_clone (in /lib64/libpthread-2.5.so)
+ by 0x4E2F7C5: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.5.so)
+ by 0x4C23870: pthread_create@* (tc_intercepts.c:198)
+ by 0x400CEF: main (tc17_sembar.c:195)
+
+// And the same for threads #3, #4 and #5 -- omitted for conciseness
+
+Possible data race during read of size 4 at 0x602174
+ at 0x400BE5: gomp_barrier_wait (tc17_sembar.c:122)
+ by 0x400C44: child (tc17_sembar.c:161)
+ by 0x4C25DF7: mythread_wrapper (tc_intercepts.c:178)
+ by 0x4E2F09D: start_thread (in /lib64/libpthread-2.5.so)
+ by 0x51054CC: clone (in /lib64/libc-2.5.so)
+ Old state: shared-modified by threads #2, #3, #4, #5
+ New state: shared-modified by threads #2, #3, #4, #5
+ Reason: this thread, #2, holds no consistent locks
+ Last consistently used lock for 0x602174 was first observed
+ at 0x4C25D01: pthread_mutex_init (tc_intercepts.c:326)
+ by 0x4009E4: gomp_barrier_init (tc17_sembar.c:46)
+ by 0x400CBC: main (tc17_sembar.c:192)
+]]></programlisting>
+
+<para>Thrcheck first announces the creation points of any threads
+referenced in the error message. This is so it can speak concisely
+about threads and sets of threads without repeatedly printing their
+creation point call stacks. Each thread is only ever announced once,
+the first time it appears in any Thrcheck error message.</para>
+
+<para>The main error message begins at the text
+"<computeroutput>Possible data race during read</computeroutput>".
+At the start is information you would expect to see -- address and
+size of the racing access, whether a read or a write, and the call
+stack at the point it was detected.</para>
+
+<para>More interesting is the state transition caused by this access.
+This memory is already in the shared-modified state, and up to now has
+been consistently protected by at least one lock. However, the thread
+making the access in question (thread #2, here) does not hold any
+locks in common with those held during all previous accesses to the
+location -- "no consistent locks", in other words.</para>
+
+<para>Finally, Thrcheck shows the lock which has protected this
+location in all previous accesses. (If there is more than one, only
+one is shown.) This can be a useful hint, because it typically shows
+the lock that the programmers intended to use to protect the location,
+but in this case forgot.</para>
+
+<para>Here are some more examples of race reports. This is not an
+exhaustive list of combinations, but should give you some insight into
+how to interpret the output.</para>
+
+<programlisting><![CDATA[
+Possible data race during write ...
+ Old state: shared-readonly by threads #1, #2, #3
+ New state: shared-modified by threads #1, #2, #3
+ Reason: this thread, #3, holds no consistent locks
+ Location ... has never been protected by any lock
+]]></programlisting>
+
+<para>The location is shared by 3 threads, all of which have been
+reading it without locking ("has never been protected by any lock").
+Now one of them is writing it. Regardless of whether the writer has a
+lock or not, this is still an error, because the write races against
+the previously observed reads.</para>
+
+<programlisting><![CDATA[
+Possible data race during read ...
+ Old state: shared-modified by threads #1, #2, #3
+ New state: shared-modified by threads #1, #2, #3
+ Reason: this thread, #3, holds no consistent locks
+ Last consistently used lock for ... was first observed ...
+]]></programlisting>
+
+<para>The location is shared by 3 threads, all of which have been
+reading and writing it while (as required) holding at least one lock
+in common. Now it is being read without that lock being held. In the
+"Last consistently used lock" part, Thrcheck offers its best guess as
+to the identity of the lock that should have been used.</para>
+
+<programlisting><![CDATA[
+Possible data race during write ...
+ Old state: owned exclusively by thread #4
+ New state: shared-modified by threads #4, #5
+ Reason: this thread, #5, holds no locks at all
+]]></programlisting>
+
+<para>A location that has so far been accessed exclusively by thread
+#4 has now been written by thread #5, without use of any lock. This
+can be a sign that the programmer did not consider the possibility of
+the location being shared between threads, or, alternatively, forgot
+to use the appropriate lock.</para>
+
+<para>Note that thread #4 exclusively owns the location, and so has
+the right to access it without holding a lock. However, this message
+does not say that thread #4 is not using a lock for this location.
+Indeed, it could be using a lock for the location because it intends
+to make it available to other threads, one of which is thread #5 --
+and thread #5 has forgotten to use the lock.</para>
+
+<para>Also, this message implies that Thrcheck did not see any
+synchronisation event between threads #4 and #5 that would have
+allowed #5 to acquire exclusive ownership from #4. See
+<link linkend="tc-manual.data-races.exclusive">above</link>
+for a discussion of transfers of exclusive ownership states between
+threads.</para>
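+
+<para>To make the last report more concrete, here is a minimal sketch
+of the kind of program that could be expected to provoke it. It is an
+illustration only and is not taken from the Thrcheck test suite: the
+names counter, counter_mx, careful_writer, careless_writer and
+run_racy_example are all hypothetical. One thread writes a location
+while holding a mutex, and a second thread, with no intervening
+synchronisation event, writes it with no lock held. Which thread is
+reported, and the exact wording, will depend on scheduling.</para>
+
+<programlisting><![CDATA[
```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical names throughout; this is a deliberately buggy sketch. */
static int counter = 0;
static pthread_mutex_t counter_mx = PTHREAD_MUTEX_INITIALIZER;

/* Takes the lock that is meant to protect counter. */
static void *careful_writer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&counter_mx);
    counter = 1;
    pthread_mutex_unlock(&counter_mx);
    return NULL;
}

/* BUG (deliberate): writes counter without holding counter_mx. */
static void *careless_writer(void *arg)
{
    (void)arg;
    counter = 2;
    return NULL;
}

/* Runs both writers and returns the final value of counter, which is
   1 or 2 depending on scheduling.  Returns -1 if a thread cannot be
   created. */
int run_racy_example(void)
{
    pthread_t a, b;

    if (pthread_create(&a, NULL, careful_writer, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, careless_writer, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```
+]]></programlisting>
+
+<para>The deliberate bug is in careless_writer. Taking counter_mx
+around its write, as careful_writer does, should make the report go
+away.</para>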
+
+</sect2>
+
+
+</sect1>
+
+<sect1 id="tc-manual.effective-use" xreflabel="Thrcheck Effective Use">
+<title>Hints and Tips for Effective Use of Thrcheck</title>
+
+<para>Thrcheck can be very helpful in finding and resolving
+threading-related problems. Like all sophisticated tools, it is most
+effective when you understand how to play to its strengths.</para>
+
+<para>Thrcheck will be less effective when you merely throw an
+existing threaded program at it and try to make sense of any reported
+errors. It will be more effective if you design threaded programs
+from the start in a way that helps Thrcheck verify correctness. The
+same is true for finding memory errors with Memcheck, but applies more
+here, because thread checking is a harder problem. Consequently it is
+much easier to write a correct program for which Thrcheck falsely
+reports (threading) errors than it is to write a correct program for
+which Memcheck falsely reports (memory) errors.</para>
+
+<para>With that in mind, here are some tips, listed most important first,
+for getting reliable results and avoiding false errors. The first two
+are critical. Any violations of them will swamp you with huge numbers
+of false data-race errors.</para>
+
+
+<orderedlist>
+
+ <listitem>
+ <para>Make sure your application, and all the libraries it uses,
+ use the POSIX threading primitives. Thrcheck needs to be able to
+ see all events pertaining to thread creation, exit, locking and
+ other synchronisation events. To do so it intercepts many POSIX
+ pthread_ functions.</para>
+
+ <para>Do not roll your own threading primitives (mutexes, etc)
+ from combinations of the Linux futex syscall, counters and the like.
+ These throw Thrcheck's internal what's-going-on models way off
+ course and will give bogus results.</para>
+
+ <para>Also, do not reimplement existing POSIX abstractions using
+ other POSIX abstractions. For example, don't build your own
+ semaphore routines or reader-writer locks from POSIX mutexes and
+ condition variables. Instead use POSIX reader-writer locks and
+ semaphores directly, since Thrcheck supports them directly.</para>
+
+ <para>Thrcheck directly supports the following POSIX threading
+ abstractions: mutexes, reader-writer locks, condition variables
+ (but see below), and semaphores. Currently spinlocks and barriers
+ are not supported, although they could be in future. A prototype
+ "safe" implementation of barriers, based on semaphores, is
+ available: please contact the Valgrind authors for details.</para>
+
+ <para>At the time of writing, the following popular Linux packages
+ are known to implement their own threading primitives:</para>
+
+ <itemizedlist>
+ <listitem><para>Qt version 4.X. Qt 3.X is fine, but not 4.X.
+ Thrcheck contains partial direct support for Qt 4.X threading,
+ but this is not yet in a usable state. Assistance from folks
+ knowledgeable in Qt 4 threading internals would be
+ appreciated.</para></listitem>
+
+ <listitem><para>Runtime support library for GNU OpenMP (part of
+ GCC), at least GCC versions 4.2 and 4.3. With some minor effort
+ of modifying the GNU OpenMP runtime support sources, it is
+ possible to use Thrcheck on GNU OpenMP compiled codes. Please
+ contact the Valgrind authors for details.</para></listitem>
+ </itemizedlist>
+ </listitem>
+
+ <listitem>
+ <para>Avoid memory recycling. If you can't avoid it, you must
+ tell Thrcheck what is going on via the VALGRIND_HG_CLEAN_MEMORY
+ client request
+ (in <computeroutput>thrcheck.h</computeroutput>).</para>
+
+ <para>Thrcheck is aware of standard memory allocation and
+ deallocation that occurs via malloc/free/new/delete and from entry
+ and exit of stack frames. In particular, when memory is
+ deallocated via free, delete, or function exit, Thrcheck considers
+ that memory clean, so when it is eventually reallocated, its
+ history is irrelevant.</para>
+
+ <para>However, it is common practice to implement memory recycling
+ schemes. In these, memory to be freed is not handed to
+ free/delete, but instead put into a pool of free buffers to be
+ handed out again as required. The problem is that Thrcheck has no
+ way to know that such memory is logically no longer in use, and
+ its history is irrelevant. Hence you must make that explicit,
+ using the VALGRIND_HG_CLEAN_MEMORY client request to specify the
+ relevant address ranges. It's easiest to put these requests into
+ the pool manager code, and use them either when memory is returned
+ to the pool, or is allocated from it.</para>
+ </listitem>
+
+ <listitem>
+ <para>Avoid POSIX condition variables. If you can, use POSIX
+ semaphores (sem_t, sem_post, sem_wait) to do inter-thread event
+ signalling. Semaphores with an initial value of zero are
+ particularly useful for this.</para>
+
+ <para>Thrcheck only partially correctly handles POSIX condition
+ variables. This is because Thrcheck can see inter-thread
+ dependencies between a pthread_cond_wait call and a
+ pthread_cond_signal/broadcast call only if the waiting thread
+ actually gets to the rendezvous first (so that it actually calls
+ pthread_cond_wait). It can't see dependencies between the threads
+ if the signaller arrives first. In the latter case, POSIX
+ guidelines imply that the associated boolean condition still
+ provides an inter-thread synchronisation event, but one which is
+ invisible to Thrcheck.</para>
+
+ <para>The result of Thrcheck missing some inter-thread
+ synchronisation events is to cause it to report false positives.
+ That's because missing such events reduces the extent to which it
+ can transfer exclusive memory ownership between threads. So
+ memory may end up in a shared-modified state when that was not
+ intended by the application programmers.</para>
+
+ <para>The root cause of this synchronisation lossage is
+ particularly hard to understand, so an example is helpful. It was
+ discussed at length by Arndt Muehlenfeld ("Runtime Race Detection
+ in Multi-Threaded Programs", Dissertation, TU Graz, Austria). The
+ canonical POSIX-recommended usage scheme for condition variables
+ is as follows:</para>
+
+<programlisting><![CDATA[
+b is a Boolean condition, which is False most of the time
+cv is a condition variable
+mx is its associated mutex
+
+Signaller: Waiter:
+
+lock(mx) lock(mx)
+b = True while (b == False)
+signal(cv) wait(cv,mx)
+unlock(mx) unlock(mx)
+]]></programlisting>
+
+ <para>Assume <computeroutput>b</computeroutput> is False most of
+ the time. If the waiter arrives at the rendezvous first, it
+ enters its while-loop, waits for the signaller to signal, and
+ eventually proceeds. Thrcheck sees the signal, notes the
+ dependency, and all is well.</para>
+
+ <para>If the signaller arrives
+ first, <computeroutput>b</computeroutput> is set to True, and the
+ signal disappears into nowhere. When the waiter later arrives, it
+ does not enter its while-loop and simply carries on. But even in
+ this case, the waiter code following the while-loop cannot execute
+ until the signaller sets <computeroutput>b</computeroutput> to
+ True. Hence there is still the same inter-thread dependency, but
+ this time it is through an arbitrary in-memory condition, and
+ Thrcheck cannot see it.</para>
+
+ <para>By comparison, Thrcheck's detection of inter-thread
+ dependencies caused by semaphore operations is believed to be
+ exactly correct.</para>
+
+ <para>As far as I know, a solution to this problem that does not
+ require source-level annotation of condition-variable wait loops
+ is beyond the current state of the art.</para>
+ </listitem>
+
+ <listitem>
+ <para>Make sure you are using a supported Linux distribution. At
+ present, Thrcheck only properly supports x86-linux and amd64-linux
+ with glibc-2.3 or later. The latter restriction means we only
+ support glibc's NPTL threading implementation. The old
+ LinuxThreads implementation is not supported.</para>
+
+ <para>Unsupported targets may work to varying degrees. In
+ particular ppc32-linux and ppc64-linux running NPTL should work,
+ but you will get false race errors because Thrcheck does not know
+ how to properly handle atomic instruction sequences created using
+ the lwarx/stwcx instructions.</para>
+ </listitem>
+
+ <listitem>
+ <para>Round up all finished threads using pthread_join. Avoid
+ detaching threads: don't create threads in the detached state, and
+ don't call pthread_detach on existing threads.</para>
+
+ <para>Using pthread_join to round up finished threads provides a
+ clear synchronisation point that both Thrcheck and programmers can
+ see. This synchronisation point allows Thrcheck to adjust its
+ memory ownership
+ models <link linkend="tc-manual.data-races.exclusive">as described
+ extensively above</link>, which helps Thrcheck produce more
+ accurate error reports.</para>
+
+ <para>If you don't call pthread_join on a thread, Thrcheck has no
+ way to know when it finishes, relative to any significant
+ synchronisation points for other threads in the program. So it
+ assumes that the thread lingers indefinitely and can potentially
+ interfere indefinitely with the memory state of the program. It
+ has every right to assume that -- after all, it might really be
+ the case that, for scheduling reasons, the exiting thread did run
+ very slowly in the last stages of its life.</para>
+ </listitem>
+
+ <listitem>
+ <para>Perform thread debugging (with Thrcheck) and memory
+ debugging (with Memcheck) together.</para>
+
+ <para>Thrcheck tracks the state of memory in detail, and memory
+ management bugs in the application are liable to cause confusion.
+ In extreme cases, applications which do many invalid reads and
+ writes (particularly to freed memory) have been known to crash
+ Thrcheck. So, ideally, you should make your application
+ Memcheck-clean before using Thrcheck.</para>
+
+ <para>It may be impossible to make your application Memcheck-clean
+ unless you first remove threading bugs. In particular, it may be
+ difficult to remove all reads and writes to freed memory in
+ multithreaded C++ destructor sequences at program termination.
+ So, ideally, you should make your application Thrcheck-clean
+ before using Memcheck.</para>
+
+ <para>Since this circularity is obviously unresolvable, at least
+ bear in mind that Memcheck and Thrcheck are to some extent
+ complementary, and you may need to use them together.</para>
+ </listitem>
+
+ <listitem>
+ <para>POSIX requires that implementations of standard I/O (printf,
+ fprintf, fwrite, fread, etc) are thread safe. Unfortunately GNU
+ libc implements this by using internal locking primitives that
+ Thrcheck is unable to intercept. Consequently Thrcheck generates
+ many false race reports when you use these functions.</para>
+
+ <para>Thrcheck attempts to hide these errors using the standard
+ Valgrind error-suppression mechanism. So, at least for simple
+ test cases, you don't see any. Nevertheless, some may slip
+ through. Just something to be aware of.</para>
+ </listitem>
+
+ <listitem>
+ <para>Thrcheck's error checks do not work properly inside the
+ system threading library itself
+ (<computeroutput>libpthread.so</computeroutput>), and it usually
+ observes large numbers of (false) errors in there. Valgrind's
+ suppression system then filters these out, so you should not see
+ them.</para>
+
+ <para>If you see any race errors reported
+ where <computeroutput>libpthread.so</computeroutput> or
+ <computeroutput>ld.so</computeroutput> is the object associated
+ with the innermost stack frame, please file a bug report at
+ http://www.valgrind.org.</para>
+ </listitem>
+
+</orderedlist>
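+
+<para>The advice above on semaphores and on pthread_join can be
+sketched as follows. This is a minimal illustration, assuming a POSIX
+system that supports unnamed semaphores (sem_init); struct handoff,
+producer and run_handoff are hypothetical names. A semaphore with an
+initial value of zero carries the "data is ready" event, so the
+inter-thread dependency goes through sem_post and sem_wait regardless
+of which thread reaches the rendezvous first, and the finished thread
+is rounded up with pthread_join.</para>
+
+<programlisting><![CDATA[
```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical names throughout; a sketch, not a definitive recipe. */
struct handoff {
    sem_t ready;   /* starts at zero; posted once payload is valid */
    int payload;   /* written by the producer, read by the consumer */
};

static void *producer(void *arg)
{
    struct handoff *h = arg;
    h->payload = 42;        /* prepare the data first ... */
    sem_post(&h->ready);    /* ... then announce that it is ready */
    return NULL;
}

/* Returns the payload received from the producer, or -1 on failure. */
int run_handoff(void)
{
    struct handoff h;
    pthread_t t;
    int rc, result;

    h.payload = 0;
    if (sem_init(&h.ready, 0, 0) != 0)   /* initial value zero */
        return -1;
    if (pthread_create(&t, NULL, producer, &h) != 0) {
        sem_destroy(&h.ready);
        return -1;
    }

    /* Wait for the producer's post; retry if interrupted by a signal. */
    do {
        rc = sem_wait(&h.ready);
    } while (rc != 0 && errno == EINTR);
    result = (rc == 0) ? h.payload : -1;

    pthread_join(t, NULL);   /* round up the finished thread */
    sem_destroy(&h.ready);
    return result;
}
```
+]]></programlisting>
+
+<para>Unlike the condition-variable scheme shown earlier, there is no
+separate Boolean condition here: the semaphore's own count records
+that the event has happened, so a post that arrives before the wait
+is not lost.</para>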
+
+</sect1>
+
+
+
+
+<sect1 id="tc-manual.options" xreflabel="Thrcheck Options">
+<title>Thrcheck Options</title>
+
+<para>The following end-user options are available:</para>
+
+<!-- start of xi:include in the manpage -->
+<variablelist id="tc.opts.list">
+
+ <varlistentry id="opt.happens-before" xreflabel="--happens-before">
+ <term>
+ <option><![CDATA[--happens-before=none|threads|all
+ [default: all] ]]></option>
+ </term>
+ <listitem>
+ <para>Thrcheck always regards locks as the basis for
+ inter-thread synchronisation. However, by default, before
+ reporting a race error, Thrcheck will also check whether
+ certain other kinds of inter-thread synchronisation events
+ happened. It may be that if such events took place, then no
+ race really occurred, and so no error needs to be reported.
+ See <link linkend="tc-manual.data-races.exclusive">above</link>
+ for a discussion of transfers of exclusive ownership states
+ between threads.
+ </para>
+ <para>With <varname>--happens-before=all</varname>, the
+ following events are regarded as sources of synchronisation:
+ thread creation/joinage, condition variable
+ signal/broadcast/waits, and semaphore posts/waits.
+ </para>
+ <para>With <varname>--happens-before=threads</varname>, only
+ thread creation/joinage events are regarded as sources of
+ synchronisation.
+ </para>
+ <para>With <varname>--happens-before=none</varname>, no events
+ (apart, of course, from locking) are regarded as sources of
+ synchronisation.
+ </para>
+ <para>Changing this setting from the default will increase your
+ false-error rate but give little or no gain. The only advantage
+ is that <option>--happens-before=threads</option> and
+ <option>--happens-before=none</option> should make Thrcheck
+ less and less sensitive to the scheduling of threads, and hence
+ the output more and more repeatable across runs.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="opt.trace-addr" xreflabel="--trace-addr">
+ <term>
+ <option><![CDATA[--trace-addr=0xXXYYZZ
+ ]]></option> and
+ <option><![CDATA[--trace-level=0|1|2 [default: 1]
+ ]]></option>
+ </term>
+ <listitem>
+ <para>Requests that Thrcheck produces a log of all state changes
+ to location 0xXXYYZZ. This can be helpful in tracking down
+ tricky races. <varname>--trace-level</varname> controls the
+ verbosity of the log. At the default setting (1), a one-line
+ summary is printed for each state change. At level 2 a
+ complete stack trace is printed for each state change.</para>
+ </listitem>
+ </varlistentry>
+
+</variablelist>
+<!-- end of xi:include in the manpage -->
+
+<!-- start of xi:include in the manpage -->
+<para>In addition, the following debugging options are available for
+Thrcheck:</para>
+
+<variablelist id="tc.debugopts.list">
+
+ <varlistentry id="opt.trace-malloc" xreflabel="--trace-malloc">
+ <term>
+ <option><![CDATA[--trace-malloc=no|yes [no]
+ ]]></option>
+ </term>
+ <listitem>
+ <para>Show all client malloc (etc) and free (etc) requests.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="opt.gen-vcg" xreflabel="--gen-vcg">
+ <term>
+ <option><![CDATA[--gen-vcg=no|yes|yes-w-vts [no]
+ ]]></option>
+ </term>
+ <listitem>
+ <para>At exit, write to stderr a dump of the happens-before
+ graph computed by Thrcheck, in a format suitable for the VCG
+ graph visualisation tool. A suitable command line is:</para>
+ <para><computeroutput>valgrind --tool=thrcheck
+ --gen-vcg=yes my_app 2>&amp;1
+ | grep xxxxxx | sed "s/xxxxxx//g"
+ | xvcg -</computeroutput></para>
+ <para>With <varname>--gen-vcg=yes</varname>, the basic
+ happens-before graph is shown. With
+ <varname>--gen-vcg=yes-w-vts</varname>, the vector timestamp
+ for each node is also shown.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="opt.cmp-race-err-addrs"
+ xreflabel="--cmp-race-err-addrs">
+ <term>
+ <option><![CDATA[--cmp-race-err-addrs=no|yes [no]
+ ]]></option>
+ </term>
+ <listitem>
+ <para>Controls whether or not race (data) addresses should be
+ taken into account when removing duplicates of race errors.
+ With <varname>--cmp-race-err-addrs=no</varname>, two otherwise
+ identical race errors will be considered to be the same even if
+ their race addresses differ.
+ With <varname>--cmp-race-err-addrs=yes</varname> they will be
+ considered different. This is provided to help make certain
+ regression tests work reliably.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="opt.tc-sanity-flags" xreflabel="--tc-sanity-flags">
+ <term>
+ <option><![CDATA[--tc-sanity-flags=<XXXXX> (X = 0|1) [00000]
+ ]]></option>
+ </term>
+ <listitem>
+ <para>Run extensive sanity checks on Thrcheck's internal
+ data structures at events defined by the bitstring, as
+ follows:</para>
+ <para><computeroutput>10000 </computeroutput>after changes to
+ the lock order acquisition graph</para>
+ <para><computeroutput>01000 </computeroutput>after every client
+ memory access (NB: not currently used)</para>
+ <para><computeroutput>00100 </computeroutput>after every client
+ memory range permission setting of 256 bytes or greater</para>
+ <para><computeroutput>00010 </computeroutput>after every client
+ lock or unlock event</para>
+ <para><computeroutput>00001 </computeroutput>after every client
+ thread creation or joinage event</para>
+ <para>Note these will make Thrcheck run very slowly, often to
+ the point of being completely unusable.</para>
+ </listitem>
+ </varlistentry>
+
+</variablelist>
+<!-- end of xi:include in the manpage -->
+
+
+</sect1>
+
+<sect1 id="tc-manual.todolist" xreflabel="To Do List">
+<title>A To-Do List for Thrcheck</title>
+
+<para>The following is a list of loose ends which should be tidied up
+some time.</para>
+
+<itemizedlist>
+ <listitem><para>Track which mutexes are associated with which
+ condition variables, and emit a warning if this becomes
+ inconsistent.</para>
+ </listitem>
+ <listitem><para>For lock order errors, print the complete lock
+ cycle, rather than only doing so for size-2 cycles as at
+ present.</para>
+ </listitem>
+ <listitem><para>Document the VALGRIND_HG_CLEAN_MEMORY client
+ request.</para>
+ </listitem>
+ <listitem><para>Possibly a client request to forcibly transfer
+ ownership of memory from one thread to another. Requires further
+ consideration.</para>
+ </listitem>
+ <listitem><para>Add a new client request that marks an address range
+ as being "shared-modified with empty lockset" (the error state),
+ and describe how to use it.</para>
+ </listitem>
+ <listitem><para>Document races caused by gcc's thread-unsafe code
+ generation for speculative stores. In the interim see
+ <computeroutput>http://gcc.gnu.org/ml/gcc/2007-10/msg00266.html
+ </computeroutput>
+ and <computeroutput>http://lkml.org/lkml/2007/10/24/673</computeroutput>.
+ </para>
+ </listitem>
+ <listitem><para>Don't update the lock-order graph, and don't check
+ for errors, when a "try"-style lock operation happens (eg
+ pthread_mutex_trylock). Such calls do not add any real
+ restrictions to the locking order, since they can always fail to
+ acquire the lock, resulting in the caller going off and doing Plan
+ B (presumably it will have a Plan B). Doing such checks could
+ generate false lock-order errors and confuse users.</para>
+ </listitem>
+ <listitem><para> Performance can be very poor. Slowdowns on the
+ order of 100:1 are not unusual. There is quite some scope for
+ performance improvements, though.
+ </para>
+ </listitem>
+
+</itemizedlist>
+
+</sect1>
+
+</chapter>