<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">


<chapter id="tc-manual" xreflabel="Thrcheck: thread error detector">
  <title>Thrcheck: a thread error detector</title>

<para>To use this tool, you must specify
<computeroutput>--tool=thrcheck</computeroutput> on the Valgrind
command line.</para>




<sect1 id="tc-manual.overview" xreflabel="Overview">
<title>Overview</title>

<para>Thrcheck is a Valgrind tool for detecting synchronisation errors
in C, C++ and Fortran programs that use the POSIX pthreads
threading primitives.</para>

<para>The main abstractions in POSIX pthreads are: a set of threads
sharing a common address space, thread creation, thread joining,
thread exit, mutexes (locks), condition variables (inter-thread event
notifications), reader-writer locks, and semaphores.</para>

<para>Thrcheck is aware of all these abstractions and tracks their
effects as accurately as it can. Currently it does not correctly
handle pthread barriers and pthread spinlocks, although it will not
object if you use them. On x86 and amd64 platforms, it understands
and partially handles implicit locking arising from the use of the
LOCK instruction prefix.
</para>

<para>Thrcheck can detect three classes of errors, which are discussed
in detail in the next three sections:</para>

<orderedlist>
  <listitem>
    <para><link linkend="tc-manual.api-checks">
      Misuses of the POSIX pthreads API.</link></para>
  </listitem>
  <listitem>
    <para><link linkend="tc-manual.lock-orders">
      Potential deadlocks arising from lock
      ordering problems.</link></para>
  </listitem>
  <listitem>
    <para><link linkend="tc-manual.data-races">
      Data races -- accessing memory without adequate locking.
    </link></para>
  </listitem>
</orderedlist>

<para>Following those is a section containing
<link linkend="tc-manual.effective-use">
hints and tips on how to get the best out of Thrcheck.</link>
</para>

<para>Then there is a
<link linkend="tc-manual.options">summary of command-line
options.</link>
</para>

<para>Finally, there is
<link linkend="tc-manual.todolist">a brief summary of areas in which Thrcheck
could be improved.</link>
</para>

</sect1>




<sect1 id="tc-manual.api-checks" xreflabel="API Checks">
<title>Detected errors: Misuses of the POSIX pthreads API</title>

<para>Thrcheck intercepts calls to many POSIX pthreads functions, and
is therefore able to report on various common problems. Although
these are unglamorous errors, their presence can lead to undefined
program behaviour and hard-to-find bugs later in execution. The
detected errors are:</para>

<itemizedlist>
  <listitem><para>unlocking an invalid mutex</para></listitem>
  <listitem><para>unlocking a not-locked mutex</para></listitem>
  <listitem><para>unlocking a mutex held by a different
                  thread</para></listitem>
  <listitem><para>destroying an invalid or a locked mutex</para></listitem>
  <listitem><para>recursively locking a non-recursive mutex</para></listitem>
  <listitem><para>deallocation of memory that contains a
                  locked mutex</para></listitem>
  <listitem><para>passing mutex arguments to functions expecting
                  reader-writer lock arguments, and vice
                  versa</para></listitem>
  <listitem><para>when a POSIX pthread function fails with an
                  error code that must be handled</para></listitem>
  <listitem><para>when a thread exits whilst still holding locked
                  locks</para></listitem>
  <listitem><para>calling <computeroutput>pthread_cond_wait</computeroutput>
                  with a not-locked mutex, or one locked by a different
                  thread</para></listitem>
</itemizedlist>

<para>Checks pertaining to the validity of mutexes are generally also
performed for reader-writer locks.</para>

<para>Various kinds of this-can't-possibly-happen events are also
reported. These usually indicate bugs in the system threading
library.</para>

<para>Reported errors always contain a primary stack trace indicating
where the error was detected. They may also contain auxiliary stack
traces giving additional information. In particular, most errors
relating to mutexes will also tell you where that mutex first came to
Thrcheck's attention (the "<computeroutput>was first observed
at</computeroutput>" part), so you have a chance of figuring out which
mutex it is referring to. For example:</para>

<programlisting><![CDATA[
Thread #1 unlocked a not-locked lock at 0x7FEFFFA90
   at 0x4C2408D: pthread_mutex_unlock (tc_intercepts.c:492)
   by 0x40073A: nearly_main (tc09_bad_unlock.c:27)
   by 0x40079B: main (tc09_bad_unlock.c:50)
  Lock at 0x7FEFFFA90 was first observed
   at 0x4C25D01: pthread_mutex_init (tc_intercepts.c:326)
   by 0x40071F: nearly_main (tc09_bad_unlock.c:23)
   by 0x40079B: main (tc09_bad_unlock.c:50)
]]></programlisting>

<para>Thrcheck has a way of summarising thread identities, as
evidenced here by the text "<computeroutput>Thread
#1</computeroutput>". This is so that it can speak about threads and
sets of threads without overwhelming you with details. See
<link linkend="tc-manual.data-races.errmsgs">below</link>
for more information on interpreting error messages.</para>
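
<para>As an illustration, the following minimal sketch performs the
kind of misuse listed above: it unlocks a mutex that is not locked.
It is not the <computeroutput>tc09_bad_unlock.c</computeroutput> test
itself. The function name
<computeroutput>unlock_without_lock</computeroutput> is hypothetical,
and an error-checking mutex is assumed so that the bad unlock returns
an error code instead of invoking undefined behaviour. Run under
Thrcheck, a program along these lines would be expected to produce a
report resembling the one shown earlier in this section.</para>

<programlisting><![CDATA[
#include <errno.h>
#include <pthread.h>
#include <stdio.h>

/* Unlocks a mutex that was never locked and returns the error code.
   An error-checking mutex is used (an assumption of this sketch) so
   that the misuse has a defined result rather than undefined
   behaviour. */
int unlock_without_lock ( void ) {
   pthread_mutexattr_t attr;
   pthread_mutex_t     mx;
   int                 r;

   pthread_mutexattr_init(&attr);
   pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
   pthread_mutex_init(&mx, &attr);
   pthread_mutexattr_destroy(&attr);

   r = pthread_mutex_unlock(&mx);   /* misuse: mx is not locked */

   pthread_mutex_destroy(&mx);
   return r;
}

int main ( void ) {
   int r = unlock_without_lock();
   printf("pthread_mutex_unlock returned %d (EPERM is %d)\n", r, EPERM);
   return 0;
}
]]></programlisting>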

</sect1>




<sect1 id="tc-manual.lock-orders" xreflabel="Lock Orders">
<title>Detected errors: Inconsistent Lock Orderings</title>

<para>In this section, and in general, to "acquire" a lock simply
means to lock that lock, and to "release" a lock means to unlock
it.</para>

<para>Thrcheck monitors the order in which threads acquire locks.
This allows it to detect potential deadlocks which could arise from
the formation of cycles of locks. Detecting such inconsistencies is
useful because, whilst actual deadlocks are fairly obvious, potential
deadlocks may never be discovered during testing and could later lead
to hard-to-diagnose in-service failures.</para>

<para>The simplest example of such a problem is as
follows.</para>

<itemizedlist>
  <listitem><para>Imagine some shared resource R, which, for whatever
  reason, is guarded by two locks, L1 and L2, which must both be held
  when R is accessed.</para>
  </listitem>
  <listitem><para>Suppose a thread acquires L1, then L2, and proceeds
  to access R. The implication of this is that all threads in the
  program must acquire the two locks in the order first L1 then L2.
  Not doing so risks deadlock.</para>
  </listitem>
  <listitem><para>The deadlock could happen if two threads -- call them
  T1 and T2 -- both want to access R. Suppose T1 acquires L1 first,
  and T2 acquires L2 first. Then T1 tries to acquire L2, and T2 tries
  to acquire L1, but those locks are both already held. So T1 and T2
  become deadlocked.</para>
  </listitem>
</itemizedlist>
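
<para>The scenario above can be sketched as a small program. This is
an illustration only, with hypothetical function names. Both
orderings are exercised from a single thread, so no real deadlock can
occur, yet the two inconsistent acquisition orders are still
observable, which is the kind of situation Thrcheck is described as
reporting.</para>

<programlisting><![CDATA[
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t L1 = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t L2 = PTHREAD_MUTEX_INITIALIZER;
static int R = 0;   /* the shared resource guarded by L1 and L2 */

/* Acquires L1 then L2: this establishes the order "L1 before L2". */
void access_in_order ( void ) {
   pthread_mutex_lock(&L1);
   pthread_mutex_lock(&L2);
   R++;
   pthread_mutex_unlock(&L2);
   pthread_mutex_unlock(&L1);
}

/* Acquires L2 then L1: this contradicts the established order. */
void access_out_of_order ( void ) {
   pthread_mutex_lock(&L2);
   pthread_mutex_lock(&L1);
   R++;
   pthread_mutex_unlock(&L1);
   pthread_mutex_unlock(&L2);
}

/* Runs both orderings sequentially and returns the final value of R. */
int run_both_orders ( void ) {
   R = 0;
   access_in_order();
   access_out_of_order();
   return R;
}

int main ( void ) {
   printf("R = %d\n", run_both_orders());
   return 0;
}
]]></programlisting>

<para>If <computeroutput>access_in_order</computeroutput> and
<computeroutput>access_out_of_order</computeroutput> were instead
called concurrently from two different threads, the deadlock
described above could occur.</para>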

<para>Thrcheck builds a directed graph indicating the order in which
locks have been acquired in the past. When a thread acquires a new
lock, the graph is updated, and then checked to see if it now contains
a cycle. The presence of a cycle indicates a potential deadlock involving
the locks in the cycle.</para>
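
<para>The idea of a lock-order graph with cycle checking can be
sketched as follows. This is not Thrcheck's implementation: the
fixed-size adjacency matrix, the small integer lock identifiers and
all function names are assumptions made for the illustration. The
sketch tests for a cycle just before adding the new edge, which gives
the same answer as adding the edge and then searching for a
cycle.</para>

<programlisting><![CDATA[
#include <stdio.h>
#include <string.h>

#define MAX_LOCKS 8

/* edge[a][b] != 0 means "lock a was held while lock b was acquired". */
static unsigned char edge[MAX_LOCKS][MAX_LOCKS];

/* Returns 1 if 'to' is reachable from 'from' along recorded edges. */
static int reachable ( int from, int to, unsigned char* seen ) {
   int i;
   if (from == to) return 1;
   seen[from] = 1;
   for (i = 0; i < MAX_LOCKS; i++)
      if (edge[from][i] && !seen[i] && reachable(i, to, seen))
         return 1;
   return 0;
}

/* Forgets all recorded acquisition orders. */
void graph_reset ( void ) {
   memset(edge, 0, sizeof edge);
}

/* Records that 'acquired' was taken while 'held' was held.  Returns 1
   if the new edge held->acquired closes a cycle, that is, if 'held'
   was already reachable from 'acquired'. */
int note_acquisition ( int held, int acquired ) {
   unsigned char seen[MAX_LOCKS];
   int cycle;
   memset(seen, 0, sizeof seen);
   cycle = reachable(acquired, held, seen);
   edge[held][acquired] = 1;
   return cycle;
}

int main ( void ) {
   graph_reset();
   printf("%d\n", note_acquisition(0, 1));  /* lock 0 before lock 1 */
   printf("%d\n", note_acquisition(1, 2));  /* lock 1 before lock 2 */
   printf("%d\n", note_acquisition(2, 0));  /* lock 2 before lock 0 */
   return 0;
}
]]></programlisting>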

<para>In simple situations, where the cycle only contains two locks,
Thrcheck will show where the required order was established:</para>

<programlisting><![CDATA[
Thread #1: lock order "0x7FEFFFAB0 before 0x7FEFFFA80" violated
   at 0x4C23C91: pthread_mutex_lock (tc_intercepts.c:388)
   by 0x40081F: main (tc13_laog1.c:24)
  Required order was established by acquisition of lock at 0x7FEFFFAB0
   at 0x4C23C91: pthread_mutex_lock (tc_intercepts.c:388)
   by 0x400748: main (tc13_laog1.c:17)
  followed by a later acquisition of lock at 0x7FEFFFA80
   at 0x4C23C91: pthread_mutex_lock (tc_intercepts.c:388)
   by 0x400773: main (tc13_laog1.c:18)
]]></programlisting>

<para>When there are more than two locks in the cycle, the error is
equally serious. However, at present Thrcheck does not show the locks
involved, so as to avoid flooding you with information. That could be
fixed in future. Here is an example involving a cycle
of five locks from a naive implementation of the famous Dining
Philosophers problem
(see <computeroutput>thrcheck/tests/tc14_laog_dinphils.c</computeroutput>).
In this case Thrcheck has detected that all 5 philosophers could
simultaneously pick up their left fork and then deadlock whilst
waiting to pick up their right forks.</para>

<programlisting><![CDATA[
Thread #6: lock order "0x6010C0 before 0x601160" violated
   at 0x4C23C91: pthread_mutex_lock (tc_intercepts.c:388)
   by 0x4007C0: dine (tc14_laog_dinphils.c:19)
   by 0x4C25DF7: mythread_wrapper (tc_intercepts.c:178)
   by 0x4E2F09D: start_thread (in /lib64/libpthread-2.5.so)
   by 0x51054CC: clone (in /lib64/libc-2.5.so)
]]></programlisting>

</sect1>




<sect1 id="tc-manual.data-races" xreflabel="Data Races">
<title>Detected errors: Data Races</title>

<para>A data race happens, or could happen, when two threads
access a shared memory location without using suitable locks to
ensure single-threaded access. Such missing locking can cause
obscure timing-dependent bugs. Ensuring programs are race-free is
one of the central difficulties of threaded programming.</para>

<para>Reliably detecting races is a difficult problem, and most
of Thrcheck's internals are devoted to dealing with it.
As a consequence this section is somewhat long and involved.
We begin with a simple example.</para>


<sect2 id="tc-manual.data-races.example" xreflabel="Simple Race">
<title>A Simple Data Race</title>

<para>About the simplest possible example of a race is as follows. In
this program, it is impossible to know what the value
of <computeroutput>var</computeroutput> is at the end of the program.
Is it 2? Or 1?</para>

<programlisting><![CDATA[
#include <pthread.h>

int var = 0;

void* child_fn ( void* arg ) {
   var++; /* Unprotected relative to parent */ /* this is line 6 */
   return NULL;
}

int main ( void ) {
   pthread_t child;
   pthread_create(&child, NULL, child_fn, NULL);
   var++; /* Unprotected relative to child */ /* this is line 13 */
   pthread_join(child, NULL);
   return 0;
}
]]></programlisting>

<para>The problem is there is nothing to
stop <computeroutput>var</computeroutput> being updated simultaneously
by both threads. A correct program would
protect <computeroutput>var</computeroutput> with a lock of type
<computeroutput>pthread_mutex_t</computeroutput>, which is acquired
before each access and released afterwards. Thrcheck's output for
this program is:</para>

<programlisting><![CDATA[
Thread #1 is the program's root thread

Thread #2 was created
   at 0x510548E: clone (in /lib64/libc-2.5.so)
   by 0x4E2F305: do_clone (in /lib64/libpthread-2.5.so)
   by 0x4E2F7C5: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.5.so)
   by 0x4C23870: pthread_create@* (tc_intercepts.c:198)
   by 0x4005F1: main (simple_race.c:12)

Possible data race during write of size 4 at 0x601034
   at 0x4005F2: main (simple_race.c:13)
  Old state: shared-readonly by threads #1, #2
  New state: shared-modified by threads #1, #2
  Reason: this thread, #1, holds no consistent locks
  Location 0x601034 has never been protected by any lock
]]></programlisting>

<para>This is quite a lot of detail for an apparently simple error.
The last clause is the main error message. It says there is a race as
a result of a write of size 4 (bytes), at 0x601034, which is
presumably the address of <computeroutput>var</computeroutput>,
happening in function <computeroutput>main</computeroutput> at line 13
in the program.</para>

<para>Note that it is purely by chance that the race is
reported for the parent thread's access. It could equally have been
reported instead for the child's access, at line 6. The error will
only be reported for one of the locations, since neither the parent
nor child is, by itself, incorrect. It is only when both access
<computeroutput>var</computeroutput> without a lock that an error
exists.</para>
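
<para>For comparison, here is a minimal sketch of the corrected
program, in which every access to
<computeroutput>var</computeroutput> is made while holding a
<computeroutput>pthread_mutex_t</computeroutput>. The mutex name
<computeroutput>mx</computeroutput> and the wrapper function
<computeroutput>run_locked_increments</computeroutput> are choices
made for this illustration. With the lock in place, both increments
take effect and the final value is no longer in doubt.</para>

<programlisting><![CDATA[
#include <pthread.h>
#include <stdio.h>

static int var = 0;
static pthread_mutex_t mx = PTHREAD_MUTEX_INITIALIZER;  /* guards var */

static void* child_fn ( void* arg ) {
   (void)arg;
   pthread_mutex_lock(&mx);
   var++;                       /* protected by mx */
   pthread_mutex_unlock(&mx);
   return NULL;
}

/* Increments var once in a child thread and once in the caller, both
   under mx, and returns the final value (or -1 if the child thread
   could not be created). */
int run_locked_increments ( void ) {
   pthread_t child;
   var = 0;                     /* no other thread exists yet */
   if (pthread_create(&child, NULL, child_fn, NULL) != 0)
      return -1;
   pthread_mutex_lock(&mx);
   var++;                       /* protected by mx */
   pthread_mutex_unlock(&mx);
   pthread_join(child, NULL);
   return var;                  /* child has exited */
}

int main ( void ) {
   printf("var = %d\n", run_locked_increments());
   return 0;
}
]]></programlisting>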

<para>The error message shows some other interesting details. The
sections below explain them. Here we merely note their presence:</para>

<itemizedlist>
  <listitem><para>Thrcheck maintains some kind of state machine for the
  memory location in question, hence the "<computeroutput>Old
  state:</computeroutput>" and "<computeroutput>New
  state:</computeroutput>" lines.</para>
  </listitem>
  <listitem><para>Thrcheck keeps track of which threads have accessed
  the location: "<computeroutput>threads #1, #2</computeroutput>".
  Before printing the main error message, it prints the creation
  points of these two threads, so you can see which threads it is
  referring to.</para>
  </listitem>
  <listitem><para>Thrcheck tries to provide an explanation of why the
  race exists: "<computeroutput>Location 0x601034 has never been
  protected by any lock</computeroutput>".</para>
  </listitem>
</itemizedlist>

<para>Understanding the memory state machine is central to
understanding Thrcheck's race-detection algorithm. The next three
subsections explain this.</para>

</sect2>


<sect2 id="tc-manual.data-races.memstates" xreflabel="Memory States">
<title>Thrcheck's Memory State Machine</title>

<para>Thrcheck tracks the state of every byte of memory used by your
program. There are a number of states, but only three are
interesting:</para>

<itemizedlist>
  <listitem><para>Exclusive: memory in this state is regarded as owned
  exclusively by one particular thread. That thread may read and
  write it without a lock. Even in highly threaded programs, the
  majority of locations never leave the Exclusive state, since most
  data is thread-private.</para>
  </listitem>
  <listitem><para>Shared-Readonly: memory in this state is regarded as
  shared by multiple threads. In this state, any thread may read the
  memory without a lock, reflecting the fact that readonly data may
  safely be shared between threads without locking.</para>
  </listitem>
  <listitem><para>Shared-Modified: memory in this state is regarded as
  shared by multiple threads, at least one of which has written to it.
  All participating threads must hold at least one lock in common when
  accessing the memory. If no such lock exists, Thrcheck reports a
  race error.</para>
  </listitem>
</itemizedlist>

<para>Let's review the simple example above with this in mind. When
the program starts, <computeroutput>var</computeroutput> is not in any
of these states. Either the parent or child thread gets to its
<computeroutput>var++</computeroutput> first, and
thereby gets Exclusive ownership of the location.</para>

<para>The later-running thread now arrives at
its <computeroutput>var++</computeroutput> statement. It first reads
the existing value from memory.
Because <computeroutput>var</computeroutput> is currently marked as
owned exclusively by the other thread, its state is changed to
shared-readonly by both threads.</para>

<para>This same thread adds one to the value it has and stores it back
in <computeroutput>var</computeroutput>. This causes another state
change, this time to the shared-modified state. Because Thrcheck has
also been tracking which threads hold which locks, it can see that
<computeroutput>var</computeroutput> is in shared-modified state but
no lock has been used to consistently protect it. Hence a race is
reported exactly at the transition from shared-readonly to
shared-modified.</para>

<para>The essence of the algorithm is this. Thrcheck keeps track of
each memory location that has been accessed by more than one thread.
For each such location it incrementally infers the set of locks which
have consistently been used to protect that location. If the
location's lockset becomes empty, and at some point one of the threads
attempts to write to it, a race is then reported.</para>
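
<para>The state machine and lockset narrowing described above can be
sketched for a single memory location as follows. This is a
simplified illustration, not Thrcheck's data structures: the
<computeroutput>Shadow</computeroutput> type, the representation of
locksets as bitmasks with one bit per lock, and the function names
are all assumptions. The sketch also omits the refinements described
in the following sections.</para>

<programlisting><![CDATA[
#include <stdio.h>

typedef enum { ST_NEW, ST_EXCLUSIVE, ST_SHARED_RO, ST_SHARED_MOD } State;

typedef struct {
   State    state;
   int      owner;     /* owning thread while in ST_EXCLUSIVE */
   unsigned lockset;   /* candidate protecting locks, one bit per lock */
} Shadow;

void shadow_init ( Shadow* s ) {
   s->state   = ST_NEW;
   s->owner   = -1;
   s->lockset = ~0u;   /* initially every lock is a candidate */
}

/* Updates the shadow state for one access by thread 'tid', which
   holds the locks in bitmask 'held'.  Returns 1 if this access would
   be reported as a race, else 0. */
int shadow_access ( Shadow* s, int tid, int is_write, unsigned held ) {
   if (s->state == ST_NEW) {
      s->state = ST_EXCLUSIVE;          /* first access takes ownership */
      s->owner = tid;
      return 0;
   }
   if (s->state == ST_EXCLUSIVE && s->owner == tid)
      return 0;                         /* owner needs no lock */

   /* The location is, or has just become, shared. */
   if (is_write)
      s->state = ST_SHARED_MOD;
   else if (s->state == ST_EXCLUSIVE)
      s->state = ST_SHARED_RO;
   s->lockset &= held;                  /* keep only consistently held locks */

   return s->state == ST_SHARED_MOD && s->lockset == 0;
}

int main ( void ) {
   Shadow s;
   shadow_init(&s);
   shadow_access(&s, 1, 0, 0);   /* thread 1 reads:  Exclusive */
   shadow_access(&s, 1, 1, 0);   /* thread 1 writes: still Exclusive */
   shadow_access(&s, 2, 0, 0);   /* thread 2 reads:  Shared-Readonly */
   printf("race: %d\n", shadow_access(&s, 2, 1, 0));  /* thread 2 writes */
   return 0;
}
]]></programlisting>

<para>The sequence of calls in
<computeroutput>main</computeroutput> replays the two unlocked
<computeroutput>var++</computeroutput> operations from the simple
example, with the race flagged at the final write.</para>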

<para>This technique is known as "lockset inference" and was
introduced in: "Eraser: A Dynamic Data Race Detector for Multithreaded
Programs" (Stefan Savage, Michael Burrows, Greg Nelson, Patrick
Sobalvarro and Thomas Anderson, ACM Transactions on Computer Systems,
15(4):391-411, November 1997).</para>

<para>Lockset inference has since been widely implemented, studied and
extended. Thrcheck incorporates several refinements aimed at avoiding
the high false error rate that naive versions of the algorithm suffer
from. A
<link linkend="tc-manual.data-races.summary">summary of the complete
algorithm used by Thrcheck</link> is presented below. First, however,
it is important to understand details of transitions pertaining to the
Exclusive-ownership state.</para>

</sect2>



<sect2 id="tc-manual.data-races.exclusive" xreflabel="Excl Transfers">
<title>Transfers of Exclusive Ownership Between Threads</title>

<para>As presented, the algorithm is far too strict. It reports many
errors in perfectly correct, widely used parallel programming
constructions, for example, using child worker threads and worker
thread pools.</para>

<para>To avoid these false errors, we must refine the algorithm so
that it keeps memory in an Exclusive ownership state in cases where it
would otherwise decay into a shared-readonly or shared-modified state.
Recall that Exclusive ownership is special in that it grants the
owning thread the right to access memory without use of any locks. In
order to support worker-thread and worker-thread-pool idioms, we will
allow threads to steal exclusive ownership of memory from other
threads under certain circumstances.</para>

<para>Here's an example. Imagine a parent thread creates child
threads to do units of work. For each unit of work, the parent
allocates a work buffer, fills it in, and creates the child thread,
handing it a pointer to the buffer. The child reads/writes the buffer
and eventually exits, and the waiting parent then extracts the results
from the buffer:</para>

<programlisting><![CDATA[
435<programlisting><![CDATA[
436typedef ... Buffer;
437
438pthread_t child;
439Buffer buf;
440
441/* ---- Parent ---- */ /* ---- Child ---- */
442
443/* parent writes workload into buf */
444pthread_create( &child, child_fn, &buf );
445
446/* parent does not read */ void child_fn ( Buffer* buf ) {
447/* or write buf */ /* read/write buf */
448 }
449
450pthread_join ( child );
451/* parent reads results from buf */
]]></programlisting>

<para>Although <computeroutput>buf</computeroutput> is accessed by
both threads, neither uses locks, yet the program is race-free. The
essential observation is that the child's creation and exit create
synchronisation events between it and the parent. These force the
child's accesses to <computeroutput>buf</computeroutput> to happen
after the parent initialises <computeroutput>buf</computeroutput>, and
before the parent reads the results
from <computeroutput>buf</computeroutput>.</para>

<para>To model this, Thrcheck allows the child to steal, from the
parent, exclusive ownership of any memory exclusively owned by the
parent before the pthread_create call. Similarly, once the parent's
pthread_join call returns, it can steal back ownership of memory
exclusively owned by the child. In this way ownership
of <computeroutput>buf</computeroutput> is transferred from parent to
child and back, so the basic algorithm does not report any races
despite the absence of any locking.</para>
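
<para>A complete, compilable version of this hand-off might look like
the following sketch. The layout of
<computeroutput>Buffer</computeroutput>, the doubling "work" done by
the child and the function name
<computeroutput>run_worker</computeroutput> are all hypothetical,
chosen only to make the example self-contained. Note that no lock is
used: the parent touches the buffer only before
<computeroutput>pthread_create</computeroutput> and after
<computeroutput>pthread_join</computeroutput>.</para>

<programlisting><![CDATA[
#include <pthread.h>
#include <stdio.h>

/* Hypothetical work buffer: one input value and one result slot. */
typedef struct { int input; int result; } Buffer;

static void* child_fn ( void* arg ) {
   Buffer* buf = (Buffer*)arg;
   buf->result = buf->input * 2;   /* child reads and writes buf */
   return NULL;
}

/* Hands a buffer to a child thread and collects the result.
   Returns -1 if the child thread could not be created. */
int run_worker ( int input ) {
   pthread_t child;
   Buffer    buf;

   buf.input  = input;             /* parent writes workload into buf */
   buf.result = 0;

   if (pthread_create(&child, NULL, child_fn, &buf) != 0)
      return -1;

   /* parent does not read or write buf here */

   pthread_join(child, NULL);
   return buf.result;              /* parent reads results from buf */
}

int main ( void ) {
   printf("result = %d\n", run_worker(21));
   return 0;
}
]]></programlisting>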

<para>Note that the child may only steal memory owned by the parent
prior to the pthread_create call. If the child attempts to read or
write memory which is also accessed by the parent in between the
pthread_create and pthread_join calls, an error is still
reported.</para>

<para>This technique was introduced with the name "thread lifetime
segments" in "Runtime Checking of Multithreaded Applications with
Visual Threads" (Jerry J. Harrow, Jr, Proceedings of the 7th
International SPIN Workshop on Model Checking of Software, Stanford,
California, USA, August 2000, LNCS 1885, pp331--342). Thrcheck
implements an extended version of it. Specifically, Thrcheck allows
transfer of exclusive ownership in the following situations:</para>

<itemizedlist>
  <listitem><para>At thread creation: a child can acquire ownership of
  memory held exclusively by the parent prior to the child's
  creation.</para>
  </listitem>
  <listitem><para>At thread joining: the joiner (thread not exiting)
  can acquire ownership of memory held exclusively by the joinee
  (thread that is exiting) at the point it exited.</para>
  </listitem>
  <listitem><para>At condition variable signallings and broadcasts. A
  thread Tw which completes a pthread_cond_wait call as a result of
  a signal or broadcast on the same condition variable by some other
  thread Ts, may acquire ownership of memory held exclusively by
  Ts prior to the pthread_cond_signal/broadcast
  call.</para>
  </listitem>
  <listitem><para>At semaphore post (sem_post) calls. A thread Tw
  which completes a sem_wait call as a result of a sem_post call
  on the same semaphore by some other thread Tp, may acquire
  ownership of memory held exclusively by Tp prior to the sem_post
  call.</para>
  </listitem>
</itemizedlist>
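
<para>The semaphore case in the list above can be illustrated by the
following sketch. The names <computeroutput>ready</computeroutput>,
<computeroutput>message</computeroutput> and
<computeroutput>run_handoff</computeroutput> are hypothetical. The
posting thread (Tp, here the caller) writes
<computeroutput>message</computeroutput> without a lock before
calling <computeroutput>sem_post</computeroutput>, and the waiting
thread (Tw) reads it only after its
<computeroutput>sem_wait</computeroutput> completes, which is the
pattern under which ownership is described as passing from Tp to
Tw.</para>

<programlisting><![CDATA[
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t ready;      /* posted once message has been written */
static int   message;    /* handed from Tp to Tw without a lock */

/* Tw: waits for the post, then reads message. */
static void* waiter_fn ( void* arg ) {
   int* out = (int*)arg;
   while (sem_wait(&ready) == -1 && errno == EINTR)
      continue;                    /* retry if interrupted by a signal */
   *out = message;
   return NULL;
}

/* Tp: writes message, posts, and returns what Tw received.
   Returns -1 if setup fails. */
int run_handoff ( int value ) {
   pthread_t tw;
   int       received = 0;

   if (sem_init(&ready, 0, 0) != 0)
      return -1;
   if (pthread_create(&tw, NULL, waiter_fn, &received) != 0) {
      sem_destroy(&ready);
      return -1;
   }

   message = value;                /* Tp's last access before the post */
   sem_post(&ready);

   pthread_join(tw, NULL);         /* ownership of received returns here */
   sem_destroy(&ready);
   return received;
}

int main ( void ) {
   printf("received = %d\n", run_handoff(7));
   return 0;
}
]]></programlisting>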

</sect2>



<sect2 id="tc-manual.data-races.re-excl" xreflabel="Re-Excl Transfers">
<title>Restoration of Exclusive Ownership</title>

<para>Another common idiom is to partition the lifetime of the program
as a whole into several distinct phases. In some of those phases, a
memory location may be accessed by multiple threads and so require
locking. In other phases only one thread exists and so can access the
memory without locking. For example:</para>

<programlisting><![CDATA[
523<programlisting><![CDATA[
524int var = 0; /* shared variable */
525pthread_mutex_t mx = PTHREAD_MUTEX_INITIALIZER; /* guard for var */
526pthread_t child;
527
528/* ---- Parent ---- */ /* ---- Child ---- */
529
530var += 1; /* no lock used */
531
532pthread_create( &child, child_fn, NULL );
533
534 void child_fn ( void* uu ) {
535pthread_mutex_lock(&mx); pthread_mutex_lock(&mx);
536var += 2; var += 3;
537pthread_mutex_unlock(&mx); pthread_mutex_unlock(&mx);
538 }
539
540pthread_join ( child );
541
542var += 4; /* no lock used */
]]></programlisting>

<para>This program is correct, but using only the mechanisms described
so far, Thrcheck would report an error at
<computeroutput>var += 4</computeroutput>. This is because, by that
point, <computeroutput>var</computeroutput> is marked as being in the
state "shared-modified and protected by the
lock <computeroutput>mx</computeroutput>", but is being accessed
without locking. Really, what we want is
for <computeroutput>var</computeroutput> to return to the parent
thread's exclusive ownership after the child thread has exited.</para>

<para>To make this possible, for every memory location Thrcheck also keeps
track of all the threads that have accessed that location
-- its threadset. When a thread Tquitter joins back to Tstayer,
Thrcheck examines the threadsets of all memory in shared-modified or
shared-readonly state. In each such threadset, if Tquitter is
mentioned, it is removed and replaced by Tstayer. If, as a result, a
threadset becomes a singleton set containing Tstayer, then the
location's state is changed to belongs-exclusively-to-Tstayer.</para>
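
<para>The threadset update performed at a join can be sketched as
follows. This is an illustration under stated assumptions, not
Thrcheck's representation: threadsets are modelled as bitmasks with
one bit per thread number, and both function names are
hypothetical.</para>

<programlisting><![CDATA[
#include <stdio.h>

/* Returns the threadset that results when thread 'quitter' joins back
   to thread 'stayer': if quitter is a member it is removed and
   replaced by stayer, otherwise the set is unchanged. */
unsigned threadset_after_join ( unsigned threadset,
                                int quitter, int stayer ) {
   if (threadset & (1u << quitter)) {
      threadset &= ~(1u << quitter);
      threadset |=  (1u << stayer);
   }
   return threadset;
}

/* Returns 1 if the set is the singleton containing only 'tid', in
   which case the location can revert to exclusive ownership by tid. */
int is_exclusive_to ( unsigned threadset, int tid ) {
   return threadset == (1u << tid);
}

int main ( void ) {
   /* var was shared by threads #1 and #2; thread #2 joins to #1. */
   unsigned ts = (1u << 1) | (1u << 2);
   ts = threadset_after_join(ts, 2, 1);
   printf("exclusive to #1: %d\n", is_exclusive_to(ts, 1));
   return 0;
}
]]></programlisting>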

<para>In our example, the result is exactly as we desire:
<computeroutput>var</computeroutput> is reacquired exclusively by the
parent after the child exits.</para>

<para>More generally, when a group of threads merges back to a single
thread via a cascade of pthread_join calls, any memory shared by the
group (or a subset of it) ends up being owned exclusively by the sole
surviving thread. This significantly enhances Thrcheck's flexibility,
since it means that each memory location may make arbitrarily many
transitions between exclusive and shared ownership. Furthermore, a
different lock may protect the location during each period of shared
ownership.</para>

</sect2>



<sect2 id="tc-manual.data-races.summary" xreflabel="Race Det Summary">
<title>A Summary of the Race Detection Algorithm</title>

<para>Thrcheck looks for memory locations which are accessed by more
than one thread. For each such location, Thrcheck records which of
the program's locks were held by the accessing thread at the time of
each access. The hope is to discover that there is indeed at least
one lock which is consistently used by all threads to protect that
location. If no such lock can be found, then there is apparently no
consistent locking strategy being applied for that location, and so a
possible data race might result. Thrcheck accordingly reports an
error.</para>

<para>In practice this discipline is far too simplistic, and is
unusable since it reports many races in some widely used and
known-correct programming disciplines. Thrcheck's checking therefore
incorporates many refinements to this basic idea, and can be
summarised as follows:</para>

<para>The following thread events are intercepted and monitored:</para>

<itemizedlist>
602<itemizedlist>
603 <listitem><para>thread creation and exiting (pthread_create,
604 pthread_join, pthread_exit)</para>
605 </listitem>
606 <listitem>
607 <para>lock acquisition and release (pthread_mutex_lock,
608 pthread_mutex_unlock, pthread_rwlock_rdlock,
609 pthread_rwlock_wrlock,
610 pthread_rwlock_unlock)</para>
611 </listitem>
612 <listitem>
613 <para>inter-thread event notifications (pthread_cond_wait,
614 pthread_cond_signal, pthread_cond_broadcast,
615 sem_wait, sem_post)</para>
616 </listitem>
617</itemizedlist>
618
619<para>Memory allocation and deallocation events are intercepted and
620monitored:</para>
621
622<itemizedlist>
623 <listitem>
624 <para>malloc/new/free/delete and variants</para>
625 </listitem>
626 <listitem>
627 <para>stack allocation and deallocation</para>
628 </listitem>
629</itemizedlist>
630
631<para>All memory accesses are intercepted and monitored.</para>
632
633<para>By observing the above events, Thrcheck can infer certain
634aspects of the program's locking discipline. Programs which adhere to
635the following rules are considered to be acceptable:
636</para>
637
638<itemizedlist>
639 <listitem>
640 <para>A thread may allocate memory, and write initial values into
641 it, without locking. That thread is regarded as owning the memory
642 exclusively.</para>
643 </listitem>
644 <listitem>
645 <para>A thread may read and write memory which it owns exclusively,
646 without locking.</para>
647 </listitem>
648 <listitem>
649 <para>Memory which is owned exclusively by one thread may be read by
650 that thread and others without locking. However, in this situation
651 no thread may do unlocked writes to the memory (except for the owner
652 thread's initializing write).</para>
653 </listitem>
654 <listitem>
655 <para>Memory which is shared between multiple threads, one or more
656 of which writes to it, must be protected by a lock which is
657 correctly acquired and released by all threads accessing the
658 memory.</para>
659 </listitem>
660</itemizedlist>
661
662<para>Any violation of this discipline will cause an error to be reported.
663However, two exemptions apply:</para>
664
665<itemizedlist>
666 <listitem>
667 <para>A thread Y can acquire exclusive ownership of memory
668 previously owned exclusively by a different thread X providing
669 X's last access and Y's first access are separated by one of the
670 following synchronization events:</para>
671 <itemizedlist>
672 <listitem><para>X creates thread Y</para></listitem>
673 <listitem><para>X joins back to Y</para></listitem>
674 <listitem><para>X uses a condition-variable to signal at Y, and Y is
675 waiting for that event</para></listitem>
676 <listitem><para>Y completes a semaphore wait as a result of X signalling
677 on that same semaphore</para></listitem>
678 </itemizedlist>
679 <para>
680 This refinement allows Thrcheck to correctly track the ownership
681 state of inter-thread buffers used in the worker-thread and
682 worker-thread-pool concurrent programming idioms (styles).</para>
683 </listitem>
684 <listitem>
685 <para>Similarly, if thread Y joins back to thread X, memory
686 exclusively owned by Y becomes exclusively owned by X instead.
687 Also, memory that has been shared only by X and Y becomes
688 exclusively owned by X. More generally, memory that has been shared
689 by X, Y and some arbitrary other set S of threads is re-marked as
690 shared by X and S. Hence, under the right circumstances, memory
691 shared amongst multiple threads, all of which join into just one,
692 can revert to the exclusive ownership state.</para>
693 <para>
694 In effect, each memory location may make arbitrarily many
695 transitions between exclusive and shared ownership. Furthermore, a
696 different lock may protect the location during each period of shared
697 ownership. This significantly enhances the flexibility of the
698 algorithm.</para>
699 </listitem>
</itemizedlist>

<para>The ownership state, accessing thread-set and related lock-set
for each memory location are tracked at 8-bit granularity. This means
the algorithm is precise even for 16- and 8-bit memory
accesses.</para>

<para>Thrcheck correctly handles reader-writer locks in this
framework. Locations shared between multiple threads can be protected
during reads by locks held in either read-mode or write-mode, but can
only be protected during writes by locks held in write-mode. Normal
POSIX mutexes are treated as if they are reader-writer locks which are
only ever held in write-mode.</para>
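
<para>The reader-writer rule above can be expressed as a small helper
that selects which of a thread's held locks count as protecting a
given access. This is a sketch only: the bitmask representation and
the function name
<computeroutput>protecting_locks</computeroutput> are assumptions, not
part of Thrcheck. Under this model an ordinary mutex would simply
always appear in the write-mode set.</para>

<programlisting><![CDATA[
#include <stdio.h>

/* 'held_rd' and 'held_wr' are bitmasks of the locks the accessing
   thread holds in read-mode and write-mode respectively.  Returns the
   locks that can be regarded as protecting the access: any held lock
   for a read, but only write-mode locks for a write. */
unsigned protecting_locks ( unsigned held_rd, unsigned held_wr,
                            int is_write ) {
   return is_write ? held_wr : (held_rd | held_wr);
}

int main ( void ) {
   unsigned rd = 0x1;   /* lock 0 held in read-mode */
   unsigned wr = 0x2;   /* lock 1 held in write-mode */
   printf("read  protected by 0x%x\n", protecting_locks(rd, wr, 0));
   printf("write protected by 0x%x\n", protecting_locks(rd, wr, 1));
   return 0;
}
]]></programlisting>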

<para>Thrcheck correctly handles POSIX mutexes for which recursive
locking is allowed.</para>

<para>Thrcheck only partially handles x86 and amd64 memory access
instructions preceded by a LOCK prefix. Writes are correctly handled,
by pretending that the LOCK prefix implies acquisition and release of
a magic "bus hardware lock" mutex before and after the instruction.
This unfortunately requires subsequent reads from such locations to
also use a LOCK prefix, which is not required by the real hardware.
Thrcheck does not offer any equivalent handling for atomic sequences
on PowerPC/POWER platforms created by the use of lwarx/stwcx
instructions.</para>

</sect2>



<sect2 id="tc-manual.data-races.errmsgs" xreflabel="Race Error Messages">
<title>Interpreting Race Error Messages</title>

<para>Thrcheck's race detection algorithm collects a lot of
information, and tries to present it in a helpful way when a race is
detected. Here's an example:</para>

<programlisting><![CDATA[
Thread #2 was created
   at 0x510548E: clone (in /lib64/libc-2.5.so)
   by 0x4E2F305: do_clone (in /lib64/libpthread-2.5.so)
   by 0x4E2F7C5: pthread_create@@GLIBC_2.2.5 (in /lib64/libpthread-2.5.so)
   by 0x4C23870: pthread_create@* (tc_intercepts.c:198)
   by 0x400CEF: main (tc17_sembar.c:195)

// And the same for threads #3, #4 and #5 -- omitted for conciseness

Possible data race during read of size 4 at 0x602174
   at 0x400BE5: gomp_barrier_wait (tc17_sembar.c:122)
   by 0x400C44: child (tc17_sembar.c:161)
   by 0x4C25DF7: mythread_wrapper (tc_intercepts.c:178)
   by 0x4E2F09D: start_thread (in /lib64/libpthread-2.5.so)
   by 0x51054CC: clone (in /lib64/libc-2.5.so)
  Old state: shared-modified by threads #2, #3, #4, #5
  New state: shared-modified by threads #2, #3, #4, #5
  Reason: this thread, #2, holds no consistent locks
  Last consistently used lock for 0x602174 was first observed
   at 0x4C25D01: pthread_mutex_init (tc_intercepts.c:326)
   by 0x4009E4: gomp_barrier_init (tc17_sembar.c:46)
   by 0x400CBC: main (tc17_sembar.c:192)
]]></programlisting>

<para>Thrcheck first announces the creation points of any threads
referenced in the error message. This is so it can speak concisely
about threads and sets of threads without repeatedly printing their
creation point call stacks. Each thread is only ever announced once,
the first time it appears in any Thrcheck error message.</para>

<para>The main error message begins at the text
"<computeroutput>Possible data race during read</computeroutput>".
At the start is information you would expect to see -- address and
size of the racing access, whether a read or a write, and the call
stack at the point it was detected.</para>

<para>More interesting is the state transition caused by this access.
This memory is already in the shared-modified state, and up to now has
been consistently protected by at least one lock. However, the thread
making the access in question (thread #2, here) does not hold any
locks in common with those held during all previous accesses to the
779locks in common with those held during all previous accesses to the
780location -- "no consistent locks", in other words.</para>
781
782<para>Finally, Thrcheck shows the lock which has protected this
783location in all previous accesses. (If there is more than one, only
one is shown.)  This can be a useful hint, because it typically shows
785the lock that the programmers intended to use to protect the location,
786but in this case forgot.</para>
787
<para>Here are some more examples of race reports.  This is not an
exhaustive list of combinations, but should give you some insight into
how to interpret the output.</para>
791
792<programlisting><![CDATA[
793Possible data race during write ...
794 Old state: shared-readonly by threads #1, #2, #3
795 New state: shared-modified by threads #1, #2, #3
796 Reason: this thread, #3, holds no consistent locks
797 Location ... has never been protected by any lock
798]]></programlisting>
799
800<para>The location is shared by 3 threads, all of which have been
801reading it without locking ("has never been protected by any lock").
802Now one of them is writing it. Regardless of whether the writer has a
803lock or not, this is still an error, because the write races against
804the previously observed reads.</para>
805
806<programlisting><![CDATA[
807Possible data race during read ...
808 Old state: shared-modified by threads #1, #2, #3
809 New state: shared-modified by threads #1, #2, #3
810 Reason: this thread, #3, holds no consistent locks
811 Last consistently used lock for ... was first observed ...
812]]></programlisting>
813
814<para>The location is shared by 3 threads, all of which have been
815reading and writing it while (as required) holding at least one lock
816in common. Now it is being read without that lock being held. In the
817"Last consistently used lock" part, Thrcheck offers its best guess as
818to the identity of the lock that should have been used.</para>
819
820<programlisting><![CDATA[
821Possible data race during write ...
822 Old state: owned exclusively by thread #4
823 New state: shared-modified by threads #4, #5
824 Reason: this thread, #5, holds no locks at all
825]]></programlisting>
826
827<para>A location that has so far been accessed exclusively by thread
828#4 has now been written by thread #5, without use of any lock. This
829can be a sign that the programmer did not consider the possibility of
830the location being shared between threads, or, alternatively, forgot
831to use the appropriate lock.</para>
832
833<para>Note that thread #4 exclusively owns the location, and so has
834the right to access it without holding a lock. However, this message
835does not say that thread #4 is not using a lock for this location.
836Indeed, it could be using a lock for the location because it intends
837to make it available to other threads, one of which is thread #5 --
838and thread #5 has forgotten to use the lock.</para>
839
840<para>Also, this message implies that Thrcheck did not see any
841synchronisation event between threads #4 and #5 that would have
842allowed #5 to acquire exclusive ownership from #4. See
843<link linkend="tc-manual.data-races.exclusive">above</link>
844for a discussion of transfers of exclusive ownership states between
845threads.</para>
846
847</sect2>
848
849
850</sect1>
851
852<sect1 id="tc-manual.effective-use" xreflabel="Thrcheck Effective Use">
853<title>Hints and Tips for Effective Use of Thrcheck</title>
854
855<para>Thrcheck can be very helpful in finding and resolving
856threading-related problems. Like all sophisticated tools, it is most
857effective when you understand how to play to its strengths.</para>
858
859<para>Thrcheck will be less effective when you merely throw an
860existing threaded program at it and try to make sense of any reported
861errors. It will be more effective if you design threaded programs
862from the start in a way that helps Thrcheck verify correctness. The
863same is true for finding memory errors with Memcheck, but applies more
864here, because thread checking is a harder problem. Consequently it is
865much easier to write a correct program for which Thrcheck falsely
866reports (threading) errors than it is to write a correct program for
867which Memcheck falsely reports (memory) errors.</para>
868
869<para>With that in mind, here are some tips, listed most important first,
870for getting reliable results and avoiding false errors. The first two
871are critical. Any violations of them will swamp you with huge numbers
872of false data-race errors.</para>
873
874
875<orderedlist>
876
877 <listitem>
878 <para>Make sure your application, and all the libraries it uses,
879 use the POSIX threading primitives. Thrcheck needs to be able to
880 see all events pertaining to thread creation, exit, locking and
  other synchronisation events.  To do so it intercepts many POSIX
882 pthread_ functions.</para>
883
884 <para>Do not roll your own threading primitives (mutexes, etc)
885 from combinations of the Linux futex syscall, counters and wotnot.
886 These throw Thrcheck's internal what's-going-on models way off
887 course and will give bogus results.</para>
888
889 <para>Also, do not reimplement existing POSIX abstractions using
890 other POSIX abstractions. For example, don't build your own
891 semaphore routines or reader-writer locks from POSIX mutexes and
892 condition variables. Instead use POSIX reader-writer locks and
893 semaphores directly, since Thrcheck supports them directly.</para>
894
895 <para>Thrcheck directly supports the following POSIX threading
896 abstractions: mutexes, reader-writer locks, condition variables
897 (but see below), and semaphores. Currently spinlocks and barriers
898 are not supported, although they could be in future. A prototype
899 "safe" implementation of barriers, based on semaphores, is
900 available: please contact the Valgrind authors for details.</para>
901
902 <para>At the time of writing, the following popular Linux packages
903 are known to implement their own threading primitives:</para>
904
905 <itemizedlist>
906 <listitem><para>Qt version 4.X. Qt 3.X is fine, but not 4.X.
907 Thrcheck contains partial direct support for Qt 4.X threading,
908 but this is not yet in a usable state. Assistance from folks
909 knowledgeable in Qt 4 threading internals would be
910 appreciated.</para></listitem>
911
912 <listitem><para>Runtime support library for GNU OpenMP (part of
913 GCC), at least GCC versions 4.2 and 4.3. With some minor effort
914 of modifying the GNU OpenMP runtime support sources, it is
915 possible to use Thrcheck on GNU OpenMP compiled codes. Please
916 contact the Valgrind authors for details.</para></listitem>
917 </itemizedlist>
918 </listitem>
919
920 <listitem>
  <para>Avoid memory recycling.  If you can't avoid it, you must
922 tell Thrcheck what is going on via the VALGRIND_HG_CLEAN_MEMORY
923 client request
924 (in <computeroutput>thrcheck.h</computeroutput>).</para>
925
926 <para>Thrcheck is aware of standard memory allocation and
927 deallocation that occurs via malloc/free/new/delete and from entry
928 and exit of stack frames. In particular, when memory is
929 deallocated via free, delete, or function exit, Thrcheck considers
930 that memory clean, so when it is eventually reallocated, its
931 history is irrelevant.</para>
932
933 <para>However, it is common practice to implement memory recycling
934 schemes. In these, memory to be freed is not handed to
  free/delete, but instead put into a pool of free buffers to be
936 handed out again as required. The problem is that Thrcheck has no
937 way to know that such memory is logically no longer in use, and
938 its history is irrelevant. Hence you must make that explicit,
939 using the VALGRIND_HG_CLEAN_MEMORY client request to specify the
940 relevant address ranges. It's easiest to put these requests into
941 the pool manager code, and use them either when memory is returned
942 to the pool, or is allocated from it.</para>
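  <para>A minimal sketch of where such a request might go in a pool
  manager is shown below.  The pool itself and the
  <computeroutput>POOL_MARK_CLEAN</computeroutput> macro are
  hypothetical.  The macro is a do-nothing placeholder so that the
  example builds on its own; the assumption is that a real build would
  define it as VALGRIND_HG_CLEAN_MEMORY applied to a start address and a
  length, which should be checked against
  <computeroutput>thrcheck.h</computeroutput>.</para>

<programlisting><![CDATA[
```c
#include <pthread.h>
#include <stddef.h>

/* Placeholder for the client request.  Assumption: when building for
   Thrcheck this would be defined in terms of VALGRIND_HG_CLEAN_MEMORY
   from thrcheck.h, taking a start address and a length. */
#ifndef POOL_MARK_CLEAN
#define POOL_MARK_CLEAN(start, len) ((void)(start), (void)(len))
#endif

#define POOL_BUFS     4
#define POOL_BUF_SIZE 256

static unsigned char   pool_mem[POOL_BUFS][POOL_BUF_SIZE];
static void           *pool_free_list[POOL_BUFS];
static int             pool_free_count = 0;
static int             pool_ready = 0;
static pthread_mutex_t pool_mx = PTHREAD_MUTEX_INITIALIZER;

/* Hand out a buffer from the pool, or NULL if none are left. */
void *pool_get(void)
{
    void *buf = NULL;

    pthread_mutex_lock(&pool_mx);
    if (!pool_ready) {
        /* First use: put every buffer on the free list. */
        for (int i = 0; i < POOL_BUFS; i++)
            pool_free_list[pool_free_count++] = pool_mem[i];
        pool_ready = 1;
    }
    if (pool_free_count > 0)
        buf = pool_free_list[--pool_free_count];
    pthread_mutex_unlock(&pool_mx);
    return buf;
}

/* Return a buffer to the pool.  The buffer is logically dead from here
   on, so its access history is declared irrelevant before it can be
   handed out again. */
void pool_put(void *buf)
{
    POOL_MARK_CLEAN(buf, POOL_BUF_SIZE);

    pthread_mutex_lock(&pool_mx);
    pool_free_list[pool_free_count++] = buf;
    pthread_mutex_unlock(&pool_mx);
}
```
]]></programlisting>

  <para>Here the request is issued when memory is returned to the pool.
  Issuing it in <computeroutput>pool_get</computeroutput> instead, just
  before the buffer is handed out, would serve the same purpose.</para>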
943 </listitem>
944
945 <listitem>
946 <para>Avoid POSIX condition variables. If you can, use POSIX
947 semaphores (sem_t, sem_post, sem_wait) to do inter-thread event
948 signalling. Semaphores with an initial value of zero are
949 particularly useful for this.</para>
950
951 <para>Thrcheck only partially correctly handles POSIX condition
952 variables. This is because Thrcheck can see inter-thread
953 dependencies between a pthread_cond_wait call and a
954 pthread_cond_signal/broadcast call only if the waiting thread
955 actually gets to the rendezvous first (so that it actually calls
956 pthread_cond_wait). It can't see dependencies between the threads
957 if the signaller arrives first. In the latter case, POSIX
958 guidelines imply that the associated boolean condition still
959 provides an inter-thread synchronisation event, but one which is
960 invisible to Thrcheck.</para>
961
962 <para>The result of Thrcheck missing some inter-thread
963 synchronisation events is to cause it to report false positives.
964 That's because missing such events reduces the extent to which it
965 can transfer exclusive memory ownership between threads. So
966 memory may end up in a shared-modified state when that was not
967 intended by the application programmers.</para>
968
969 <para>The root cause of this synchronisation lossage is
970 particularly hard to understand, so an example is helpful. It was
971 discussed at length by Arndt Muehlenfeld ("Runtime Race Detection
972 in Multi-Threaded Programs", Dissertation, TU Graz, Austria). The
973 canonical POSIX-recommended usage scheme for condition variables
974 is as follows:</para>
975
976<programlisting><![CDATA[
977b is a Boolean condition, which is False most of the time
978cv is a condition variable
979mx is its associated mutex
980
981Signaller: Waiter:
982
983lock(mx) lock(mx)
984b = True while (b == False)
985signal(cv) wait(cv,mx)
986unlock(mx) unlock(mx)
987]]></programlisting>
988
989 <para>Assume <computeroutput>b</computeroutput> is False most of
990 the time. If the waiter arrives at the rendezvous first, it
991 enters its while-loop, waits for the signaller to signal, and
992 eventually proceeds. Thrcheck sees the signal, notes the
993 dependency, and all is well.</para>
994
995 <para>If the signaller arrives
996 first, <computeroutput>b</computeroutput> is set to true, and the
997 signal disappears into nowhere. When the waiter later arrives, it
998 does not enter its while-loop and simply carries on. But even in
999 this case, the waiter code following the while-loop cannot execute
1000 until the signaller sets <computeroutput>b</computeroutput> to
1001 True. Hence there is still the same inter-thread dependency, but
1002 this time it is through an arbitrary in-memory condition, and
1003 Thrcheck cannot see it.</para>
1004
1005 <para>By comparison, Thrcheck's detection of inter-thread
1006 dependencies caused by semaphore operations is believed to be
1007 exactly correct.</para>
1008
1009 <para>As far as I know, a solution to this problem that does not
1010 require source-level annotation of condition-variable wait loops
1011 is beyond the current state of the art.</para>
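  <para>For comparison, the same kind of one-shot notification can be
  sketched with a semaphore whose initial value is zero.  The names
  below are hypothetical and the example is only an illustration of the
  advice above.  Whichever thread arrives first, the waiter cannot pass
  <computeroutput>sem_wait</computeroutput> until the signaller has
  called <computeroutput>sem_post</computeroutput>, so the dependency is
  carried by the semaphore itself rather than by an in-memory
  condition.</para>

<programlisting><![CDATA[
```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

static sem_t data_ready;   /* initial value zero: nothing has happened yet */
static int   payload = 0;  /* written by the signaller, read by the waiter */

static void *signaller(void *arg)
{
    (void)arg;
    payload = 42;            /* prepare the data ... */
    sem_post(&data_ready);   /* ... then announce that it is ready */
    return NULL;
}

/* Starts the signaller, waits for its notification, and returns the
   value it published, or -1 if setup failed. */
int semaphore_demo(void)
{
    pthread_t t;
    int seen;

    if (sem_init(&data_ready, 0, 0) != 0)
        return -1;
    if (pthread_create(&t, NULL, signaller, NULL) != 0) {
        sem_destroy(&data_ready);
        return -1;
    }

    /* Retry if the wait is interrupted by a signal handler. */
    while (sem_wait(&data_ready) == -1 && errno == EINTR)
        continue;

    seen = payload;          /* ordered after the signaller's write */

    pthread_join(t, NULL);
    sem_destroy(&data_ready);
    return seen;
}
```
]]></programlisting>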
1012 </listitem>
1013
1014 <listitem>
1015 <para>Make sure you are using a supported Linux distribution. At
1016 present, Thrcheck only properly supports x86-linux and amd64-linux
1017 with glibc-2.3 or later. The latter restriction means we only
1018 support glibc's NPTL threading implementation. The old
1019 LinuxThreads implementation is not supported.</para>
1020
1021 <para>Unsupported targets may work to varying degrees. In
  particular ppc32-linux and ppc64-linux running NPTL should work,
1023 but you will get false race errors because Thrcheck does not know
1024 how to properly handle atomic instruction sequences created using
1025 the lwarx/stwcx instructions.</para>
1026 </listitem>
1027
1028 <listitem>
1029 <para>Round up all finished threads using pthread_join. Avoid
1030 detaching threads: don't create threads in the detached state, and
1031 don't call pthread_detach on existing threads.</para>
1032
1033 <para>Using pthread_join to round up finished threads provides a
1034 clear synchronisation point that both Thrcheck and programmers can
1035 see. This synchronisation point allows Thrcheck to adjust its
1036 memory ownership
1037 models <link linkend="tc-manual.data-races.exclusive">as described
1038 extensively above</link>, which helps Thrcheck produce more
1039 accurate error reports.</para>
1040
1041 <para>If you don't call pthread_join on a thread, Thrcheck has no
1042 way to know when it finishes, relative to any significant
1043 synchronisation points for other threads in the program. So it
1044 assumes that the thread lingers indefinitely and can potentially
1045 interfere indefinitely with the memory state of the program. It
1046 has every right to assume that -- after all, it might really be
1047 the case that, for scheduling reasons, the exiting thread did run
1048 very slowly in the last stages of its life.</para>
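  <para>The pattern recommended here can be sketched as follows, using
  hypothetical names.  Each worker writes only its own slot, and the
  parent reads the slots without any lock, but only after joining every
  worker, so that each join provides the synchronisation point described
  above.</para>

<programlisting><![CDATA[
```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 4

static int results[NUM_WORKERS];

/* Each worker writes only to its own slot, so no lock is needed while
   the workers are running. */
static void *worker(void *arg)
{
    int *slot = arg;
    *slot = (int)(slot - results) + 1;
    return NULL;
}

/* Creates joinable workers, joins all of them, and only then reads the
   results.  Returns the sum of the slots, or -1 if a thread could not
   be created. */
int join_demo(void)
{
    pthread_t tids[NUM_WORKERS];
    int started = 0, sum = 0;

    for (int i = 0; i < NUM_WORKERS; i++) {
        if (pthread_create(&tids[i], NULL, worker, &results[i]) != 0)
            break;
        started++;
    }

    /* Round up every thread that was started. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    if (started != NUM_WORKERS)
        return -1;

    /* All workers have been joined, so these reads cannot overlap with
       their writes. */
    for (int i = 0; i < NUM_WORKERS; i++)
        sum += results[i];
    return sum;
}
```
]]></programlisting>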
1049 </listitem>
1050
1051 <listitem>
1052 <para>Perform thread debugging (with Thrcheck) and memory
1053 debugging (with Memcheck) together.</para>
1054
1055 <para>Thrcheck tracks the state of memory in detail, and memory
1056 management bugs in the application are liable to cause confusion.
1057 In extreme cases, applications which do many invalid reads and
1058 writes (particularly to freed memory) have been known to crash
1059 Thrcheck. So, ideally, you should make your application
1060 Memcheck-clean before using Thrcheck.</para>
1061
1062 <para>It may be impossible to make your application Memcheck-clean
1063 unless you first remove threading bugs. In particular, it may be
1064 difficult to remove all reads and writes to freed memory in
1065 multithreaded C++ destructor sequences at program termination.
1066 So, ideally, you should make your application Thrcheck-clean
1067 before using Memcheck.</para>
1068
1069 <para>Since this circularity is obviously unresolvable, at least
1070 bear in mind that Memcheck and Thrcheck are to some extent
1071 complementary, and you may need to use them together.</para>
1072 </listitem>
1073
1074 <listitem>
1075 <para>POSIX requires that implementations of standard I/O (printf,
1076 fprintf, fwrite, fread, etc) are thread safe. Unfortunately GNU
1077 libc implements this by using internal locking primitives that
1078 Thrcheck is unable to intercept. Consequently Thrcheck generates
1079 many false race reports when you use these functions.</para>
1080
1081 <para>Thrcheck attempts to hide these errors using the standard
1082 Valgrind error-suppression mechanism. So, at least for simple
1083 test cases, you don't see any. Nevertheless, some may slip
1084 through. Just something to be aware of.</para>
1085 </listitem>
1086
1087 <listitem>
1088 <para>Thrcheck's error checks do not work properly inside the
1089 system threading library itself
1090 (<computeroutput>libpthread.so</computeroutput>), and it usually
1091 observes large numbers of (false) errors in there. Valgrind's
1092 suppression system then filters these out, so you should not see
1093 them.</para>
1094
1095 <para>If you see any race errors reported
1096 where <computeroutput>libpthread.so</computeroutput> or
1097 <computeroutput>ld.so</computeroutput> is the object associated
1098 with the innermost stack frame, please file a bug report at
1099 http://www.valgrind.org.</para>
1100 </listitem>
1101
1102</orderedlist>
1103
1104</sect1>
1105
1106
1107
1108
1109<sect1 id="tc-manual.options" xreflabel="Thrcheck Options">
1110<title>Thrcheck Options</title>
1111
1112<para>The following end-user options are available:</para>
1113
1114<!-- start of xi:include in the manpage -->
1115<variablelist id="tc.opts.list">
1116
1117 <varlistentry id="opt.happens-before" xreflabel="--happens-before">
1118 <term>
1119 <option><![CDATA[--happens-before=none|threads|all
1120 [default: all] ]]></option>
1121 </term>
1122 <listitem>
1123 <para>Thrcheck always regards locks as the basis for
1124 inter-thread synchronisation. However, by default, before
1125 reporting a race error, Thrcheck will also check whether
1126 certain other kinds of inter-thread synchronisation events
1127 happened. It may be that if such events took place, then no
1128 race really occurred, and so no error needs to be reported.
1129 See <link linkend="tc-manual.data-races.exclusive">above</link>
1130 for a discussion of transfers of exclusive ownership states
1131 between threads.
1132 </para>
1133 <para>With <varname>--happens-before=all</varname>, the
1134 following events are regarded as sources of synchronisation:
1135 thread creation/joinage, condition variable
1136 signal/broadcast/waits, and semaphore posts/waits.
1137 </para>
1138 <para>With <varname>--happens-before=threads</varname>, only
1139 thread creation/joinage events are regarded as sources of
1140 synchronisation.
1141 </para>
1142 <para>With <varname>--happens-before=none</varname>, no events
1143 (apart, of course, from locking) are regarded as sources of
1144 synchronisation.
1145 </para>
1146 <para>Changing this setting from the default will increase your
1147 false-error rate but give little or no gain. The only advantage
1148 is that <option>--happens-before=threads</option> and
1149 <option>--happens-before=none</option> should make Thrcheck
1150 less and less sensitive to the scheduling of threads, and hence
1151 the output more and more repeatable across runs.
1152 </para>
1153 </listitem>
1154 </varlistentry>
1155
1156 <varlistentry id="opt.trace-addr" xreflabel="--trace-addr">
1157 <term>
1158 <option><![CDATA[--trace-addr=0xXXYYZZ
1159 ]]></option> and
1160 <option><![CDATA[--trace-level=0|1|2 [default: 1]
1161 ]]></option>
1162 </term>
1163 <listitem>
    <para>Requests that Thrcheck produce a log of all state changes
    to location 0xXXYYZZ.  This can be helpful in tracking down
    tricky races.  <varname>--trace-level</varname> controls the
    verbosity of the log.  At the default setting (1), a one-line
    summary is printed for each state change.  At level 2 a
    complete stack trace is printed for each state change.</para>
1170 </listitem>
1171 </varlistentry>
1172
1173</variablelist>
1174<!-- end of xi:include in the manpage -->
1175
1176<!-- start of xi:include in the manpage -->
1177<para>In addition, the following debugging options are available for
1178Thrcheck:</para>
1179
1180<variablelist id="tc.debugopts.list">
1181
1182 <varlistentry id="opt.trace-malloc" xreflabel="--trace-malloc">
1183 <term>
1184 <option><![CDATA[--trace-malloc=no|yes [no]
1185 ]]></option>
1186 </term>
1187 <listitem>
1188 <para>Show all client malloc (etc) and free (etc) requests.</para>
1189 </listitem>
1190 </varlistentry>
1191
1192 <varlistentry id="opt.gen-vcg" xreflabel="--gen-vcg">
1193 <term>
1194 <option><![CDATA[--gen-vcg=no|yes|yes-w-vts [no]
1195 ]]></option>
1196 </term>
1197 <listitem>
1198 <para>At exit, write to stderr a dump of the happens-before
1199 graph computed by Thrcheck, in a format suitable for the VCG
1200 graph visualisation tool. A suitable command line is:</para>
1201 <para><computeroutput>valgrind --tool=thrcheck
1202 --gen-vcg=yes my_app 2&gt;&amp;1
1203 | grep xxxxxx | sed "s/xxxxxx//g"
1204 | xvcg -</computeroutput></para>
1205 <para>With <varname>--gen-vcg=yes</varname>, the basic
1206 happens-before graph is shown. With
1207 <varname>--gen-vcg=yes-w-vts</varname>, the vector timestamp
1208 for each node is also shown.</para>
1209 </listitem>
1210 </varlistentry>
1211
1212 <varlistentry id="opt.cmp-race-err-addrs"
1213 xreflabel="--cmp-race-err-addrs">
1214 <term>
1215 <option><![CDATA[--cmp-race-err-addrs=no|yes [no]
1216 ]]></option>
1217 </term>
1218 <listitem>
1219 <para>Controls whether or not race (data) addresses should be
1220 taken into account when removing duplicates of race errors.
    With <varname>--cmp-race-err-addrs=no</varname>, two otherwise
    identical race errors will be considered to be the same even if
    their race addresses differ.  With
    <varname>--cmp-race-err-addrs=yes</varname> they will be
1225 considered different. This is provided to help make certain
1226 regression tests work reliably.</para>
1227 </listitem>
1228 </varlistentry>
1229
1230 <varlistentry id="opt.tc-sanity-flags" xreflabel="--tc-sanity-flags">
1231 <term>
1232 <option><![CDATA[--tc-sanity-flags=<XXXXX> (X = 0|1) [00000]
1233 ]]></option>
1234 </term>
1235 <listitem>
1236 <para>Run extensive sanity checks on Thrcheck's internal
1237 data structures at events defined by the bitstring, as
1238 follows:</para>
1239 <para><computeroutput>10000 </computeroutput>after changes to
1240 the lock order acquisition graph</para>
1241 <para><computeroutput>01000 </computeroutput>after every client
1242 memory access (NB: not currently used)</para>
1243 <para><computeroutput>00100 </computeroutput>after every client
1244 memory range permission setting of 256 bytes or greater</para>
1245 <para><computeroutput>00010 </computeroutput>after every client
1246 lock or unlock event</para>
1247 <para><computeroutput>00001 </computeroutput>after every client
1248 thread creation or joinage event</para>
1249 <para>Note these will make Thrcheck run very slowly, often to
1250 the point of being completely unusable.</para>
1251 </listitem>
1252 </varlistentry>
1253
1254</variablelist>
1255<!-- end of xi:include in the manpage -->
1256
1257
1258</sect1>
1259
1260<sect1 id="tc-manual.todolist" xreflabel="To Do List">
1261<title>A To-Do List for Thrcheck</title>
1262
1263<para>The following is a list of loose ends which should be tidied up
1264some time.</para>
1265
1266<itemizedlist>
1267 <listitem><para>Track which mutexes are associated with which
1268 condition variables, and emit a warning if this becomes
1269 inconsistent.</para>
1270 </listitem>
1271 <listitem><para>For lock order errors, print the complete lock
  cycle, rather than only doing so for size-2 cycles as at
1273 present.</para>
1274 </listitem>
1275 <listitem><para>Document the VALGRIND_HG_CLEAN_MEMORY client
1276 request.</para>
1277 </listitem>
1278 <listitem><para>Possibly a client request to forcibly transfer
1279 ownership of memory from one thread to another. Requires further
1280 consideration.</para>
1281 </listitem>
1282 <listitem><para>Add a new client request that marks an address range
1283 as being "shared-modified with empty lockset" (the error state),
1284 and describe how to use it.</para>
1285 </listitem>
1286 <listitem><para>Document races caused by gcc's thread-unsafe code
1287 generation for speculative stores. In the interim see
1288 <computeroutput>http://gcc.gnu.org/ml/gcc/2007-10/msg00266.html
1289 </computeroutput>
1290 and <computeroutput>http://lkml.org/lkml/2007/10/24/673</computeroutput>.
1291 </para>
1292 </listitem>
1293 <listitem><para>Don't update the lock-order graph, and don't check
1294 for errors, when a "try"-style lock operation happens (eg
1295 pthread_mutex_trylock). Such calls do not add any real
1296 restrictions to the locking order, since they can always fail to
1297 acquire the lock, resulting in the caller going off and doing Plan
1298 B (presumably it will have a Plan B). Doing such checks could
1299 generate false lock-order errors and confuse users.</para>
1300 </listitem>
1301 <listitem><para> Performance can be very poor. Slowdowns on the
1302 order of 100:1 are not unusual. There is quite some scope for
1303 performance improvements, though.
1304 </para>
1305 </listitem>
1306
1307</itemizedlist>
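<para>To make the "try"-style lock item above more concrete, here is a
small sketch with hypothetical names.  The caller holds one mutex and
attempts a second with
<computeroutput>pthread_mutex_trylock</computeroutput>.  Because it
never blocks on the second mutex, and falls back to a Plan B when the
attempt fails, this acquisition cannot by itself take part in a
deadlock cycle.</para>

<programlisting><![CDATA[
```c
#include <pthread.h>

static pthread_mutex_t mx_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mx_b = PTHREAD_MUTEX_INITIALIZER;
static int plan_a_count = 0;   /* updated while holding mx_a and mx_b */
static int plan_b_count = 0;   /* updated while holding mx_a only */

/* Holds mx_a and then only *tries* to take mx_b.  Returns 1 if mx_b was
   obtained (Plan A) and 0 if the fallback (Plan B) was used. */
int update_with_trylock(void)
{
    int took_plan_a = 0;

    pthread_mutex_lock(&mx_a);
    if (pthread_mutex_trylock(&mx_b) == 0) {
        plan_a_count++;
        pthread_mutex_unlock(&mx_b);
        took_plan_a = 1;
    } else {
        /* mx_b is busy: do not wait for it, do something else. */
        plan_b_count++;
    }
    pthread_mutex_unlock(&mx_a);
    return took_plan_a;
}
```
]]></programlisting>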
1308
1309</sect1>
1310
1311</chapter>