| <?xml version="1.0"?> <!-- -*- sgml -*- --> |
| <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" |
| "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" |
| [ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]> |
| |
| |
| <chapter id="drd-manual" xreflabel="DRD: a thread error detector"> |
| <title>DRD: a thread error detector</title> |
| |
| <para>To use this tool, you must specify |
| <computeroutput>--tool=exp-drd</computeroutput> |
| on the Valgrind command line.</para> |
| |
| <sect1 id="drd-manual.overview" xreflabel="Overview"> |
| <title>Introduction</title> |
| |
| <para> |
| DRD is a Valgrind tool for detecting errors in multithreaded C and C++ |
| shared-memory programs. The tool works for any program that uses the |
| POSIX threading primitives or a threading library built on top of the |
| POSIX threading primitives. POSIX threads, also known as Pthreads, is |
| the most widely available threading library on Unix systems. |
| </para> |
| |
| <para> |
| Multithreaded programming is error prone. Depending on how multithreading is |
| expressed in a program, one or more of the following problems can occur: |
| <itemizedlist> |
| <listitem> |
| <para> |
| A data race: two or more threads access the same memory |
| location without sufficient locking, and at least one of these |
| accesses is a write. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| Lock contention: one thread blocks the progress of another thread |
| by holding a lock too long. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| Deadlock: two or more threads wait for each other indefinitely. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| False sharing: threads on two different processors access different |
| variables in the same cache line frequently, causing frequent exchange |
| of cache lines and slowing down both threads. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| Improper use of the POSIX threads API. |
| </para> |
| </listitem> |
| </itemizedlist> |
| </para> |
| |
| <para> |
| Although the likelihood of some classes of multithreaded programming |
| errors can be reduced by a disciplined programming style, a tool for |
| automatic detection of runtime threading errors is always a great help |
| when developing multithreaded software. |
| </para> |
| |
| <para> |
| The remainder of this manual is organized as follows. The next |
| section discusses which <link |
| linkend="drd-manual.mt-progr-models">multithreading programming |
| paradigms</link> exist. |
| </para> |
| |
| <para>Then there is a |
| <link linkend="drd-manual.options">summary of command-line |
| options</link>. |
| </para> |
| |
| <para> |
| DRD can detect three classes of errors, which are discussed in detail: |
| </para> |
| |
| <orderedlist> |
| <listitem> |
| <para><link linkend="drd-manual.data-races">Data races</link>.</para> |
| </listitem> |
| <listitem> |
| <para><link linkend="drd-manual.lock-contention">Lock contention</link>. |
| </para> |
| </listitem> |
| <listitem> |
| <para><link linkend="drd-manual.api-checks"> |
| Misuse of the POSIX threads API</link>.</para> |
| </listitem> |
| </orderedlist> |
| |
| <para>Finally, there is a section about the current |
| <link linkend="drd-manual.limitations">limitations</link> |
| of DRD. |
| </para> |
| |
| </sect1> |
| |
| |
| <sect1 id="drd-manual.mt-progr-models" xreflabel="MT-progr-models"> |
| <title>Multithreaded Programming Paradigms</title> |
| |
| <para> |
| For many applications multithreading is a necessity. There are two |
| reasons why the use of threads may be required: |
| <itemizedlist> |
| <listitem> |
| <para> |
| To model concurrent activities. Managing the state of one activity |
| per thread is a simpler programming model than multiplexing the states |
| of multiple activities in a single thread. This is why most server and |
| embedded software is multithreaded. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| To let computations run on multiple CPU cores simultaneously. This is |
| why many High Performance Computing (HPC) applications are multithreaded. |
| </para> |
| </listitem> |
| </itemizedlist> |
| </para> |
| |
| <para> |
| Multithreaded programs can be developed by using one or more of the |
| following paradigms. Which paradigm is appropriate also depends on the |
| application type -- modeling concurrent activities versus HPC. |
| <itemizedlist> |
| <listitem> |
| <para> |
| Locking: data that is shared between threads may only be accessed |
| after a lock is obtained on the mutex(es) associated with the |
| shared data item. The POSIX threads library, the Qt library |
| and the Boost.Thread library support this paradigm directly. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| Message passing: any data that has to be passed from one thread to |
| another is sent via a message to that other thread. No data is explicitly |
| shared. Well-known implementations of the message-passing paradigm are |
| MPI and CORBA. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| Software Transactional Memory (STM). As with the locking |
| paradigm, data is shared between threads. While the |
| locking paradigm requires that all associated mutexes are locked |
| before the shared data is accessed, the STM paradigm verifies after |
| each transaction whether there were conflicting |
| transactions. If there were conflicts, the transaction is aborted; |
| otherwise it is committed. This is a so-called optimistic |
| approach. Not all C, C++ and Fortran compilers support STM yet. |
| </para> |
| </listitem> |
| <listitem> |
| <para> |
| Automatic parallelization: a compiler converts a sequential |
| program into a multithreaded program. The original program can |
| contain parallelization hints. As an example, gcc version 4.2.0 |
| and later support OpenMP, a set of standardized compiler |
| directives which tell a compiler how to parallelize a C, C++ or |
| Fortran program. |
| </para> |
| </listitem> |
| </itemizedlist> |
| </para> |
| |
| <para> |
| In addition to the above paradigms, most CPU instruction sets support atomic |
| memory accesses. Such operations are the most efficient way to update |
| a single value on a system with multiple CPU cores. |
| </para> |
| |
| <para> |
| DRD supports any combination of multithreaded programming paradigms |
| and atomic memory accesses, as long as the libraries that implement |
| the paradigms are based on POSIX threads. Direct use of, e.g., Linux |
| futexes is not recognized by DRD and will result in false positives. |
| </para> |
| |
| </sect1> |
| |
| |
| <sect1 id="drd-manual.options" xreflabel="DRD Options"> |
| <title>Command Line Options</title> |
| |
| <para>The following end-user options are available:</para> |
| |
| <!-- start of xi:include in the manpage --> |
| <variablelist id="drd.opts.list"> |
| </variablelist> |
| <!-- end of xi:include in the manpage --> |
| |
| <!-- start of xi:include in the manpage --> |
| <para>In addition, the following debugging options are available for |
| DRD:</para> |
| <variablelist id="drd.debugopts.list"> |
| </variablelist> |
| <!-- end of xi:include in the manpage --> |
| |
| </sect1> |
| |
| |
| <sect1 id="drd-manual.data-races" xreflabel="Data Races"> |
| <title>Data Races</title> |
| </sect1> |
| |
| |
| <sect1 id="drd-manual.lock-contention" xreflabel="Lock Contention"> |
| <title>Lock Contention</title> |
| </sect1> |
| |
| |
| <sect1 id="drd-manual.api-checks" xreflabel="API Checks"> |
| <title>Misuse of the POSIX threads API</title> |
| </sect1> |
| |
| |
| <sect1 id="drd-manual.clientreqs" xreflabel="Client requests"> |
| <title>Client Requests</title> |
| |
| <para> |
| Just as for other Valgrind tools, it is possible to pass information |
| from a client program to the DRD tool. |
| </para> |
| |
| </sect1> |
| |
| |
| <sect1 id="drd-manual.openmp" xreflabel="OpenMP"> |
| <title>Debugging OpenMP Programs With DRD</title> |
| |
| |
| </sect1> |
| |
| |
| <sect1 id="drd-manual.limitations" xreflabel="Limitations"> |
| <title>Limitations</title> |
| |
| <para>DRD currently has the following limitations:</para> |
| |
| <itemizedlist> |
| <listitem><para>DRD has only been tested on the Linux operating |
| system, and not on any of the other operating systems supported by |
| Valgrind.</para> |
| </listitem> |
| <listitem><para>Of the two POSIX threads implementations for Linux, |
| only the NPTL (Native POSIX Thread Library) is supported. The older |
| LinuxThreads library is not supported.</para> |
| </listitem> |
| <listitem><para>When running DRD on a PowerPC CPU, DRD will report |
| false positives on atomic operations. See also <ulink |
| url="http://bugs.kde.org/show_bug.cgi?id=162354">KDE bug 162354</ulink>. |
| </para></listitem> |
| <listitem><para>DRD, just like Memcheck, will refuse to |
| start on Linux distributions where all symbol information has been |
| removed from ld.so. This is the case for openSUSE 10.3, for example; see |
| also <ulink url="http://bugzilla.novell.com/show_bug.cgi?id=396197"> |
| Novell bug 396197</ulink>. |
| </para></listitem> |
| <listitem><para>If you compile the DRD source code yourself, you need |
| gcc 3.0 or later. gcc 2.95 is not supported.</para> |
| </listitem> |
| |
| </itemizedlist> |
| |
| </sect1> |
| |
| </chapter> |