<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>
<chapter id="drd-manual" xreflabel="DRD: a thread error detector">
<title>DRD: a thread error detector</title>
<para>To use this tool, you must specify
<computeroutput>--tool=exp-drd</computeroutput>
on the Valgrind command line.</para>
<sect1 id="drd-manual.overview" xreflabel="Overview">
<title>Introduction</title>
<para>
DRD is a Valgrind tool for detecting errors in multithreaded C and C++
shared-memory programs. The tool works for any program that uses the
POSIX threading primitives or a threading library built on top of the
POSIX threading primitives. POSIX threads, also known as Pthreads, is
the most widely available threading library on Unix systems.
</para>
<para>
Multithreaded programming is error prone. Depending on how multithreading is
expressed in a program, one or more of the following problems can occur:
<itemizedlist>
<listitem>
<para>
A data race, i.e. two or more threads access the same memory
location without sufficient locking, and at least one of the
accesses is a write.
</para>
</listitem>
<listitem>
<para>
Lock contention: one thread blocks the progress of another thread
by holding a lock too long.
</para>
</listitem>
<listitem>
<para>
Deadlock: two or more threads wait for each other indefinitely.
</para>
</listitem>
<listitem>
<para>
False sharing: threads on two different processors access different
variables in the same cache line frequently, causing frequent exchange
of cache lines and slowing down both threads.
</para>
</listitem>
<listitem>
<para>
Improper use of the POSIX threads API.
</para>
</listitem>
</itemizedlist>
</para>
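<para>
As an illustration of the first item in the list above, the following
minimal sketch lets two threads increment a shared counter. All names
in it (<computeroutput>run_counter</computeroutput>,
<computeroutput>worker</computeroutput>,
<computeroutput>use_lock</computeroutput>) are hypothetical and chosen
for this example only. With <computeroutput>use_lock</computeroutput>
set to zero, the increment is an unprotected read-modify-write, which
is the kind of conflicting access described as a data race; with the
mutex held, the result is deterministic.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define INCREMENTS 100000

static long counter;                 /* shared between both threads */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int use_lock;                 /* 0: racy variant, 1: locked variant */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        if (use_lock)
            pthread_mutex_lock(&counter_lock);
        counter++;                   /* a data race when use_lock == 0 */
        if (use_lock)
            pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs two workers and returns the final value of the counter. */
long run_counter(int with_lock)
{
    pthread_t a, b;
    int rc;

    counter = 0;
    use_lock = with_lock;            /* written before the threads start */

    rc = pthread_create(&a, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_join(a, NULL);
    assert(rc == 0);
    rc = pthread_join(b, NULL);
    assert(rc == 0);
    (void)rc;

    return counter;
}
```
]]></programlisting>
<para>
Calling <computeroutput>run_counter(1)</computeroutput> always returns
twice the value of <computeroutput>INCREMENTS</computeroutput>. Calling
<computeroutput>run_counter(0)</computeroutput> may return a smaller
value, because increments from the two threads can overwrite each
other.
</para>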
<para>
Although the likelihood of some classes of multithreaded programming
errors can be reduced by a disciplined programming style, a tool for
automatic detection of runtime threading errors is always a great help
when developing multithreaded software.
</para>
<para>
The remainder of this manual is organized as follows. The next
section discusses the available <link
linkend="drd-manual.mt-progr-models">multithreading programming
paradigms</link>.
</para>
<para>Then there is a
<link linkend="drd-manual.options">summary of command-line
options</link>.
</para>
<para>
DRD can detect three classes of errors, which are discussed in detail:
</para>
<orderedlist>
<listitem>
<para><link linkend="drd-manual.data-races">Data races</link>.</para>
</listitem>
<listitem>
<para><link linkend="drd-manual.lock-contention">Lock contention</link>.
</para>
</listitem>
<listitem>
<para><link linkend="drd-manual.api-checks">
Misuse of the POSIX threads API</link>.</para>
</listitem>
</orderedlist>
<para>Finally, there is a section about the current
<link linkend="drd-manual.limitations">limitations</link>
of DRD.
</para>
</sect1>
<sect1 id="drd-manual.mt-progr-models" xreflabel="MT-progr-models">
<title>Multithreaded Programming Paradigms</title>
<para>
For many applications multithreading is a necessity. There are two
reasons why the use of threads may be required:
<itemizedlist>
<listitem>
<para>
To model concurrent activities. Managing the state of one activity
per thread is a simpler programming model than multiplexing the states
of multiple activities in a single thread. This is why most server and
embedded software is multithreaded.
</para>
</listitem>
<listitem>
<para>
To let computations run on multiple CPU cores simultaneously. This is
why many High Performance Computing (HPC) applications are multithreaded.
</para>
</listitem>
</itemizedlist>
</para>
<para>
Multithreaded programs can be developed by using one or more of the
following paradigms. Which paradigm is appropriate also depends on the
application type -- modeling concurrent activities versus HPC.
<itemizedlist>
<listitem>
<para>
Locking: data that is shared between threads may only be accessed
after a lock is obtained on the mutex(es) associated with the
shared data item. The POSIX threads library, the Qt library
and the Boost.Thread library support this paradigm directly.
</para>
</listitem>
<listitem>
<para>
Message passing: any data that has to be passed from one thread to
another is sent via a message to that other thread. No data is explicitly
shared. Well-known implementations of the message passing paradigm are
MPI and CORBA.
</para>
</listitem>
<listitem>
<para>
Software Transactional Memory (STM). Just like with the locking
paradigm, with STM data is shared between threads. While the
locking paradigm requires that all associated mutexes are locked
before the shared data is accessed, the STM paradigm verifies at
the end of each transaction whether a conflicting transaction
occurred. If there was a conflict, the transaction is aborted;
otherwise it is committed. This is a so-called optimistic
approach. Not all C, C++ and Fortran compilers support STM yet.
</para>
</listitem>
<listitem>
<para>
Automatic parallelization: a compiler converts a sequential
program into a multithreaded program. The original program can
contain parallelization hints. As an example, gcc version 4.2.0
and later supports OpenMP, a set of standardized compiler
directives which tell a compiler how to parallelize a C, C++ or
Fortran program.
</para>
</listitem>
</itemizedlist>
</para>
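<para>
To make the message passing paradigm more concrete, the sketch below
implements a one-slot mailbox on top of POSIX threads. It is an
illustration only: the <computeroutput>mailbox</computeroutput> type
and the functions <computeroutput>mailbox_send</computeroutput>,
<computeroutput>mailbox_receive</computeroutput> and
<computeroutput>run_mailbox</computeroutput> are hypothetical and do
not belong to MPI, CORBA or any other library. The application threads
only exchange values through the mailbox; the mutex and the condition
variable stay hidden inside it.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical one-slot mailbox built on a mutex and a condition variable. */
struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  changed;   /* signalled whenever 'full' changes */
    int             full;      /* nonzero while 'value' holds a message */
    int             value;
};

static void mailbox_send(struct mailbox *mb, int value)
{
    pthread_mutex_lock(&mb->lock);
    while (mb->full)                       /* wait until the slot is free */
        pthread_cond_wait(&mb->changed, &mb->lock);
    mb->value = value;
    mb->full = 1;
    pthread_cond_broadcast(&mb->changed);
    pthread_mutex_unlock(&mb->lock);
}

static int mailbox_receive(struct mailbox *mb)
{
    int value;

    pthread_mutex_lock(&mb->lock);
    while (!mb->full)                      /* wait until a message arrives */
        pthread_cond_wait(&mb->changed, &mb->lock);
    value = mb->value;
    mb->full = 0;
    pthread_cond_broadcast(&mb->changed);
    pthread_mutex_unlock(&mb->lock);
    return value;
}

static void *producer(void *arg)
{
    struct mailbox *mb = arg;

    for (int i = 1; i <= 10; i++)
        mailbox_send(mb, i);
    mailbox_send(mb, 0);                   /* 0 marks the end of the stream */
    return NULL;
}

/* Receives the messages 1..10 from a producer thread and returns their sum. */
int run_mailbox(void)
{
    struct mailbox mb;
    pthread_t tid;
    int rc, msg, sum = 0;

    rc = pthread_mutex_init(&mb.lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&mb.changed, NULL);
    assert(rc == 0);
    mb.full = 0;
    mb.value = 0;

    rc = pthread_create(&tid, NULL, producer, &mb);
    assert(rc == 0);
    while ((msg = mailbox_receive(&mb)) != 0)
        sum += msg;
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_cond_destroy(&mb.changed);
    pthread_mutex_destroy(&mb.lock);
    return sum;
}
```
]]></programlisting>
<para>
A single condition variable with
<computeroutput>pthread_cond_broadcast</computeroutput> is sufficient
here because there is exactly one sender and one receiver; a mailbox
shared by more threads would probably use separate conditions for
"slot free" and "message available".
</para>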
<para>
In addition to the above paradigms, most CPU instruction sets support
atomic memory accesses. Such operations are the most efficient way to
update a single value on a system with multiple CPU cores.
</para>
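<para>
The following sketch shows what updating a single value with atomic
memory accesses can look like. It assumes a compiler that provides the
C11 header <computeroutput>stdatomic.h</computeroutput>, which is a
choice made for this example and not something this manual prescribes;
the names <computeroutput>hits</computeroutput>,
<computeroutput>count_hits</computeroutput> and
<computeroutput>run_atomic_counter</computeroutput> are hypothetical.
</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define HITS_PER_THREAD 100000

static atomic_long hits;             /* updated without any mutex */

static void *count_hits(void *arg)
{
    (void)arg;
    for (int i = 0; i < HITS_PER_THREAD; i++)
        atomic_fetch_add(&hits, 1);  /* indivisible read-modify-write */
    return NULL;
}

/* Runs two threads and returns the total number of hits they recorded. */
long run_atomic_counter(void)
{
    pthread_t a, b;
    int rc;

    atomic_store(&hits, 0);

    rc = pthread_create(&a, NULL, count_hits, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, count_hits, NULL);
    assert(rc == 0);
    rc = pthread_join(a, NULL);
    assert(rc == 0);
    rc = pthread_join(b, NULL);
    assert(rc == 0);
    (void)rc;

    return atomic_load(&hits);
}
```
]]></programlisting>
<para>
Because every increment is atomic, no update is lost and the function
returns twice the value of
<computeroutput>HITS_PER_THREAD</computeroutput>, even though no lock
is taken.
</para>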
<para>
DRD supports any combination of multithreaded programming paradigms
and atomic memory accesses, as long as the libraries that implement
the paradigms are based on POSIX threads. Direct use of lower-level
primitives such as Linux futexes is not recognized by DRD and will
result in false positives.
</para>
</sect1>
<sect1 id="drd-manual.options" xreflabel="DRD Options">
<title>Command Line Options</title>
<para>The following end-user options are available:</para>
<!-- start of xi:include in the manpage -->
<variablelist id="drd.opts.list">
</variablelist>
<!-- end of xi:include in the manpage -->
<!-- start of xi:include in the manpage -->
<para>In addition, the following debugging options are available for
DRD:</para>
<variablelist id="drd.debugopts.list">
</variablelist>
<!-- end of xi:include in the manpage -->
</sect1>
<sect1 id="drd-manual.data-races" xreflabel="Data Races">
<title>Data Races</title>
</sect1>
<sect1 id="drd-manual.lock-contention" xreflabel="Lock Contention">
<title>Lock Contention</title>
</sect1>
<sect1 id="drd-manual.api-checks" xreflabel="API Checks">
<title>Misuse of the POSIX threads API</title>
</sect1>
<sect1 id="drd-manual.clientreqs" xreflabel="Client requests">
<title>Client Requests</title>
<para>
Just as for other Valgrind tools it is possible to pass information
from a client program to the DRD tool.
</para>
</sect1>
<sect1 id="drd-manual.openmp" xreflabel="OpenMP">
<title>Debugging OpenMP Programs With DRD</title>
</sect1>
<sect1 id="drd-manual.limitations" xreflabel="Limitations">
<title>Limitations</title>
<para>DRD currently has the following limitations:</para>
<itemizedlist>
<listitem><para>DRD has only been tested on the Linux operating
system, and not on any of the other operating systems supported by
Valgrind.</para>
</listitem>
<listitem><para>Of the two POSIX threads implementations for Linux,
only the NPTL (Native POSIX Thread Library) is supported. The older
LinuxThreads library is not supported.</para>
</listitem>
<listitem><para>When running DRD on a PowerPC CPU, DRD will report
false positives on atomic operations. See also <ulink
url="http://bugs.kde.org/show_bug.cgi?id=162354">KDE bug 162354</ulink>.
</para></listitem>
<listitem><para>DRD, just like Memcheck, will refuse to
start on Linux distributions where all symbol information has been
removed from ld.so. This is the case, for example, on openSUSE 10.3;
see also <ulink url="http://bugzilla.novell.com/show_bug.cgi?id=396197">
Novell bug 396197</ulink>.
</para></listitem>
<listitem><para>If you compile the DRD source code yourself, you need
gcc 3.0 or later. gcc 2.95 is not supported.</para>
</listitem>
</itemizedlist>
</sect1>
</chapter>