Elliott Hughes | a0664b9 | 2017-04-18 17:46:52 -0700 | [diff] [blame^] | 1 | <html> |
| 2 | <head> |
| 3 | <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"> |
| 4 | <title>8. DRD: a thread error detector</title> |
| 5 | <link rel="stylesheet" type="text/css" href="vg_basic.css"> |
| 6 | <meta name="generator" content="DocBook XSL Stylesheets V1.78.1"> |
| 7 | <link rel="home" href="index.html" title="Valgrind Documentation"> |
| 8 | <link rel="up" href="manual.html" title="Valgrind User Manual"> |
| 9 | <link rel="prev" href="hg-manual.html" title="7. Helgrind: a thread error detector"> |
| 10 | <link rel="next" href="ms-manual.html" title="9. Massif: a heap profiler"> |
| 11 | </head> |
| 12 | <body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"> |
| 13 | <div><table class="nav" width="100%" cellspacing="3" cellpadding="3" border="0" summary="Navigation header"><tr> |
| 14 | <td width="22px" align="center" valign="middle"><a accesskey="p" href="hg-manual.html"><img src="images/prev.png" width="18" height="21" border="0" alt="Prev"></a></td> |
| 15 | <td width="25px" align="center" valign="middle"><a accesskey="u" href="manual.html"><img src="images/up.png" width="21" height="18" border="0" alt="Up"></a></td> |
| 16 | <td width="31px" align="center" valign="middle"><a accesskey="h" href="index.html"><img src="images/home.png" width="27" height="20" border="0" alt="Up"></a></td> |
| 17 | <th align="center" valign="middle">Valgrind User Manual</th> |
| 18 | <td width="22px" align="center" valign="middle"><a accesskey="n" href="ms-manual.html"><img src="images/next.png" width="18" height="21" border="0" alt="Next"></a></td> |
| 19 | </tr></table></div> |
| 20 | <div class="chapter"> |
| 21 | <div class="titlepage"><div><div><h1 class="title"> |
| 22 | <a name="drd-manual"></a>8. DRD: a thread error detector</h1></div></div></div> |
| 23 | <div class="toc"> |
| 24 | <p><b>Table of Contents</b></p> |
| 25 | <dl class="toc"> |
| 26 | <dt><span class="sect1"><a href="drd-manual.html#drd-manual.overview">8.1. Overview</a></span></dt> |
| 27 | <dd><dl> |
| 28 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.mt-progr-models">8.1.1. Multithreaded Programming Paradigms</a></span></dt> |
| 29 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.pthreads-model">8.1.2. POSIX Threads Programming Model</a></span></dt> |
| 30 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.mt-problems">8.1.3. Multithreaded Programming Problems</a></span></dt> |
| 31 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.data-race-detection">8.1.4. Data Race Detection</a></span></dt> |
| 32 | </dl></dd> |
| 33 | <dt><span class="sect1"><a href="drd-manual.html#drd-manual.using-drd">8.2. Using DRD</a></span></dt> |
| 34 | <dd><dl> |
| 35 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.options">8.2.1. DRD Command-line Options</a></span></dt> |
| 36 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.data-races">8.2.2. Detected Errors: Data Races</a></span></dt> |
| 37 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.lock-contention">8.2.3. Detected Errors: Lock Contention</a></span></dt> |
| 38 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.api-checks">8.2.4. Detected Errors: Misuse of the POSIX threads API</a></span></dt> |
| 39 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.clientreqs">8.2.5. Client Requests</a></span></dt> |
| 40 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.C++11">8.2.6. Debugging C++11 Programs</a></span></dt> |
| 41 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.gnome">8.2.7. Debugging GNOME Programs</a></span></dt> |
| 42 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.boost.thread">8.2.8. Debugging Boost.Thread Programs</a></span></dt> |
| 43 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.openmp">8.2.9. Debugging OpenMP Programs</a></span></dt> |
| 44 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.cust-mem-alloc">8.2.10. DRD and Custom Memory Allocators</a></span></dt> |
| 45 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.drd-versus-memcheck">8.2.11. DRD Versus Memcheck</a></span></dt> |
| 46 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.resource-requirements">8.2.12. Resource Requirements</a></span></dt> |
| 47 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.effective-use">8.2.13. Hints and Tips for Effective Use of DRD</a></span></dt> |
| 48 | </dl></dd> |
| 49 | <dt><span class="sect1"><a href="drd-manual.html#drd-manual.Pthreads">8.3. Using the POSIX Threads API Effectively</a></span></dt> |
| 50 | <dd><dl> |
| 51 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.mutex-types">8.3.1. Mutex types</a></span></dt> |
| 52 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.condvar">8.3.2. Condition variables</a></span></dt> |
| 53 | <dt><span class="sect2"><a href="drd-manual.html#drd-manual.pctw">8.3.3. pthread_cond_timedwait and timeouts</a></span></dt> |
| 54 | </dl></dd> |
| 55 | <dt><span class="sect1"><a href="drd-manual.html#drd-manual.limitations">8.4. Limitations</a></span></dt> |
| 56 | <dt><span class="sect1"><a href="drd-manual.html#drd-manual.feedback">8.5. Feedback</a></span></dt> |
| 57 | </dl> |
| 58 | </div> |
| 59 | <p>To use this tool, you must specify |
| 60 | <code class="option">--tool=drd</code> |
| 61 | on the Valgrind command line.</p> |
| 62 | <div class="sect1"> |
| 63 | <div class="titlepage"><div><div><h2 class="title" style="clear: both"> |
| 64 | <a name="drd-manual.overview"></a>8.1. Overview</h2></div></div></div> |
| 65 | <p> |
| 66 | DRD is a Valgrind tool for detecting errors in multithreaded C and C++ |
| 67 | programs. The tool works for any program that uses the POSIX threading |
| 68 | primitives or that uses threading concepts built on top of the POSIX threading |
| 69 | primitives. |
| 70 | </p> |
| 71 | <div class="sect2"> |
| 72 | <div class="titlepage"><div><div><h3 class="title"> |
| 73 | <a name="drd-manual.mt-progr-models"></a>8.1.1. Multithreaded Programming Paradigms</h3></div></div></div> |
| 74 | <p> |
| 75 | There are two possible reasons for using multithreading in a program: |
| 76 | </p> |
| 77 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 78 | <li class="listitem"><p> |
| 79 | To model concurrent activities. Assigning one thread to each activity |
| 80 | can be a great simplification compared to multiplexing the states of |
| 81 | multiple activities in a single thread. This is why most server software |
| 82 | and embedded software is multithreaded. |
| 83 | </p></li> |
| 84 | <li class="listitem"><p> |
| 85 | To use multiple CPU cores simultaneously for speeding up |
| 86 | computations. This is why many High Performance Computing (HPC) |
| 87 | applications are multithreaded. |
| 88 | </p></li> |
| 89 | </ul></div> |
| 90 | <p> |
| 91 | </p> |
| 92 | <p> |
| 93 | Multithreaded programs can use one or more of the following programming |
| 94 | paradigms. Which paradigm is appropriate depends e.g. on the application type. |
| 95 | Some examples of multithreaded programming paradigms are: |
| 96 | </p> |
| 97 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 98 | <li class="listitem"><p> |
| 99 | Locking. Data that is shared over threads is protected from concurrent |
| 100 | accesses via locking. E.g. the POSIX threads library, the Qt library |
| 101 | and the Boost.Thread library support this paradigm directly. |
| 102 | </p></li> |
| 103 | <li class="listitem"><p> |
| 104 | Message passing. No data is shared between threads, but threads exchange |
| 105 | data by passing messages to each other. Examples of implementations of |
| 106 | the message passing paradigm are MPI and CORBA. |
| 107 | </p></li> |
| 108 | <li class="listitem"><p> |
| 109 | Automatic parallelization. A compiler converts a sequential program into |
| 110 | a multithreaded program. The original program may or may not contain |
| 111 | parallelization hints. One example of such parallelization hints is the |
| 112 | OpenMP standard. In this standard a set of directives are defined which |
| 113 | tell a compiler how to parallelize a C, C++ or Fortran program. OpenMP |
| 114 | is well suited for computational intensive applications. As an example, |
| 115 | an open source image processing software package is using OpenMP to |
| 116 | maximize performance on systems with multiple CPU |
| 117 | cores. GCC supports the |
| 118 | OpenMP standard from version 4.2.0 on. |
| 119 | </p></li> |
| 120 | <li class="listitem"><p> |
| 121 | Software Transactional Memory (STM). Any data that is shared between |
| 122 | threads is updated via transactions. After each transaction it is |
| 123 | verified whether there were any conflicting transactions. If there were |
| 124 | conflicts, the transaction is aborted, otherwise it is committed. This |
| 125 | is a so-called optimistic approach. There is a prototype of the Intel C++ |
| 126 | Compiler available that supports STM. Research about the addition of |
| 127 | STM support to GCC is ongoing. |
| 128 | </p></li> |
| 129 | </ul></div> |
| 130 | <p> |
| 131 | </p> |
| 132 | <p> |
| 133 | DRD supports any combination of multithreaded programming paradigms as |
| 134 | long as the implementation of these paradigms is based on the POSIX |
| 135 | threads primitives. DRD however does not support programs that use |
| 136 | e.g. Linux' futexes directly. Attempts to analyze such programs with |
| 137 | DRD will cause DRD to report many false positives. |
| 138 | </p> |
| 139 | </div> |
| 140 | <div class="sect2"> |
| 141 | <div class="titlepage"><div><div><h3 class="title"> |
| 142 | <a name="drd-manual.pthreads-model"></a>8.1.2. POSIX Threads Programming Model</h3></div></div></div> |
| 143 | <p> |
| 144 | POSIX threads, also known as Pthreads, is the most widely available |
| 145 | threading library on Unix systems. |
| 146 | </p> |
| 147 | <p> |
| 148 | The POSIX threads programming model is based on the following abstractions: |
| 149 | </p> |
| 150 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 151 | <li class="listitem"><p> |
| 152 | A shared address space. All threads running within the same |
| 153 | process share the same address space. All data, whether shared or |
| 154 | not, is identified by its address. |
| 155 | </p></li> |
| 156 | <li class="listitem"><p> |
| 157 | Regular load and store operations, which allow to read values |
| 158 | from or to write values to the memory shared by all threads |
| 159 | running in the same process. |
| 160 | </p></li> |
| 161 | <li class="listitem"><p> |
| 162 | Atomic store and load-modify-store operations. While these are |
| 163 | not mentioned in the POSIX threads standard, most |
| 164 | microprocessors support atomic memory operations. |
| 165 | </p></li> |
| 166 | <li class="listitem"><p> |
| 167 | Threads. Each thread represents a concurrent activity. |
| 168 | </p></li> |
| 169 | <li class="listitem"><p> |
| 170 | Synchronization objects and operations on these synchronization |
| 171 | objects. The following types of synchronization objects have been |
| 172 | defined in the POSIX threads standard: mutexes, condition variables, |
| 173 | semaphores, reader-writer synchronization objects, barriers and |
| 174 | spinlocks. |
| 175 | </p></li> |
| 176 | </ul></div> |
| 177 | <p> |
| 178 | </p> |
| 179 | <p> |
| 180 | Which source code statements generate which memory accesses depends on |
| 181 | the <span class="emphasis"><em>memory model</em></span> of the programming language being |
| 182 | used. There is not yet a definitive memory model for the C and C++ |
| 183 | languages. For a draft memory model, see also the document |
| 184 | <a class="ulink" href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2338.html" target="_top"> |
| 185 | WG21/N2338: Concurrency memory model compiler consequences</a>. |
| 186 | </p> |
| 187 | <p> |
| 188 | For more information about POSIX threads, see also the Single UNIX |
| 189 | Specification version 3, also known as |
| 190 | <a class="ulink" href="http://www.opengroup.org/onlinepubs/000095399/idx/threads.html" target="_top"> |
| 191 | IEEE Std 1003.1</a>. |
| 192 | </p> |
| 193 | </div> |
| 194 | <div class="sect2"> |
| 195 | <div class="titlepage"><div><div><h3 class="title"> |
| 196 | <a name="drd-manual.mt-problems"></a>8.1.3. Multithreaded Programming Problems</h3></div></div></div> |
| 197 | <p> |
| 198 | Depending on which multithreading paradigm is being used in a program, |
| 199 | one or more of the following problems can occur: |
| 200 | </p> |
| 201 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 202 | <li class="listitem"><p> |
| 203 | Data races. One or more threads access the same memory location without |
| 204 | sufficient locking. Most but not all data races are programming errors |
| 205 | and are the cause of subtle and hard-to-find bugs. |
| 206 | </p></li> |
| 207 | <li class="listitem"><p> |
| 208 | Lock contention. One thread blocks the progress of one or more other |
| 209 | threads by holding a lock too long. |
| 210 | </p></li> |
| 211 | <li class="listitem"><p> |
| 212 | Improper use of the POSIX threads API. Most implementations of the POSIX |
| 213 | threads API have been optimized for runtime speed. Such implementations |
| 214 | will not complain on certain errors, e.g. when a mutex is being unlocked |
| 215 | by another thread than the thread that obtained a lock on the mutex. |
| 216 | </p></li> |
| 217 | <li class="listitem"><p> |
| 218 | Deadlock. A deadlock occurs when two or more threads wait for |
| 219 | each other indefinitely. |
| 220 | </p></li> |
| 221 | <li class="listitem"><p> |
| 222 | False sharing. If threads that run on different processor cores |
| 223 | access different variables located in the same cache line |
| 224 | frequently, this will slow down the involved threads a lot due |
| 225 | to frequent exchange of cache lines. |
| 226 | </p></li> |
| 227 | </ul></div> |
| 228 | <p> |
| 229 | </p> |
| 230 | <p> |
| 231 | Although the likelihood of the occurrence of data races can be reduced |
| 232 | through a disciplined programming style, a tool for automatic |
| 233 | detection of data races is a necessity when developing multithreaded |
| 234 | software. DRD can detect these, as well as lock contention and |
| 235 | improper use of the POSIX threads API. |
| 236 | </p> |
| 237 | </div> |
| 238 | <div class="sect2"> |
| 239 | <div class="titlepage"><div><div><h3 class="title"> |
| 240 | <a name="drd-manual.data-race-detection"></a>8.1.4. Data Race Detection</h3></div></div></div> |
| 241 | <p> |
| 242 | The result of load and store operations performed by a multithreaded program |
| 243 | depends on the order in which memory operations are performed. This order is |
| 244 | determined by: |
| 245 | </p> |
| 246 | <div class="orderedlist"><ol class="orderedlist" type="1"> |
| 247 | <li class="listitem"><p> |
| 248 | All memory operations performed by the same thread are performed in |
| 249 | <span class="emphasis"><em>program order</em></span>, that is, the order determined by the |
| 250 | program source code and the results of previous load operations. |
| 251 | </p></li> |
| 252 | <li class="listitem"><p> |
| 253 | Synchronization operations determine certain ordering constraints on |
| 254 | memory operations performed by different threads. These ordering |
| 255 | constraints are called the <span class="emphasis"><em>synchronization order</em></span>. |
| 256 | </p></li> |
| 257 | </ol></div> |
| 258 | <p> |
| 259 | The combination of program order and synchronization order is called the |
| 260 | <span class="emphasis"><em>happens-before relationship</em></span>. This concept was first |
| 261 | defined by S. Adve et al in the paper <span class="emphasis"><em>Detecting data races on weak |
| 262 | memory systems</em></span>, ACM SIGARCH Computer Architecture News, v.19 n.3, |
| 263 | p.234-243, May 1991. |
| 264 | </p> |
| 265 | <p> |
| 266 | Two memory operations <span class="emphasis"><em>conflict</em></span> if both operations are |
| 267 | performed by different threads, refer to the same memory location and at least |
| 268 | one of them is a store operation. |
| 269 | </p> |
| 270 | <p> |
| 271 | A multithreaded program is <span class="emphasis"><em>data-race free</em></span> if all |
| 272 | conflicting memory accesses are ordered by synchronization |
| 273 | operations. |
| 274 | </p> |
| 275 | <p> |
| 276 | A well known way to ensure that a multithreaded program is data-race |
| 277 | free is to ensure that a locking discipline is followed. It is e.g. |
| 278 | possible to associate a mutex with each shared data item, and to hold |
| 279 | a lock on the associated mutex while the shared data is accessed. |
| 280 | </p> |
| 281 | <p> |
| 282 | All programs that follow a locking discipline are data-race free, but not all |
| 283 | data-race free programs follow a locking discipline. There exist multithreaded |
| 284 | programs where access to shared data is arbitrated via condition variables, |
| 285 | semaphores or barriers. As an example, a certain class of HPC applications |
| 286 | consists of a sequence of computation steps separated in time by barriers, and |
| 287 | where these barriers are the only means of synchronization. Although there are |
| 288 | many conflicting memory accesses in such applications and although such |
| 289 | applications do not make use mutexes, most of these applications do not |
| 290 | contain data races. |
| 291 | </p> |
| 292 | <p> |
| 293 | There exist two different approaches for verifying the correctness of |
| 294 | multithreaded programs at runtime. The approach of the so-called Eraser |
| 295 | algorithm is to verify whether all shared memory accesses follow a consistent |
| 296 | locking strategy. And the happens-before data race detectors verify directly |
| 297 | whether all interthread memory accesses are ordered by synchronization |
| 298 | operations. While the last approach is more complex to implement, and while it |
| 299 | is more sensitive to OS scheduling, it is a general approach that works for |
| 300 | all classes of multithreaded programs. An important advantage of |
| 301 | happens-before data race detectors is that these do not report any false |
| 302 | positives. |
| 303 | </p> |
| 304 | <p> |
| 305 | DRD is based on the happens-before algorithm. |
| 306 | </p> |
| 307 | </div> |
| 308 | </div> |
| 309 | <div class="sect1"> |
| 310 | <div class="titlepage"><div><div><h2 class="title" style="clear: both"> |
| 311 | <a name="drd-manual.using-drd"></a>8.2. Using DRD</h2></div></div></div> |
| 312 | <div class="sect2"> |
| 313 | <div class="titlepage"><div><div><h3 class="title"> |
| 314 | <a name="drd-manual.options"></a>8.2.1. DRD Command-line Options</h3></div></div></div> |
| 315 | <p>The following command-line options are available for controlling the |
| 316 | behavior of the DRD tool itself:</p> |
| 317 | <div class="variablelist"> |
| 318 | <a name="drd.opts.list"></a><dl class="variablelist"> |
| 319 | <dt><span class="term"> |
| 320 | <code class="option">--check-stack-var=<yes|no> [default: no]</code> |
| 321 | </span></dt> |
| 322 | <dd><p> |
| 323 | Controls whether DRD detects data races on stack |
| 324 | variables. Verifying stack variables is disabled by default because |
| 325 | most programs do not share stack variables over threads. |
| 326 | </p></dd> |
| 327 | <dt><span class="term"> |
| 328 | <code class="option">--exclusive-threshold=<n> [default: off]</code> |
| 329 | </span></dt> |
| 330 | <dd><p> |
| 331 | Print an error message if any mutex or writer lock has been |
| 332 | held longer than the time specified in milliseconds. This |
| 333 | option enables the detection of lock contention. |
| 334 | </p></dd> |
| 335 | <dt><span class="term"> |
| 336 | <code class="option">--join-list-vol=<n> [default: 10]</code> |
| 337 | </span></dt> |
| 338 | <dd><p> |
| 339 | Data races that occur between a statement at the end of one thread |
| 340 | and another thread can be missed if memory access information is |
| 341 | discarded immediately after a thread has been joined. This option |
| 342 | allows to specify for how many joined threads memory access information |
| 343 | should be retained. |
| 344 | </p></dd> |
| 345 | <dt><span class="term"> |
| 346 | <code class="option"> |
| 347 | --first-race-only=<yes|no> [default: no] |
| 348 | </code> |
| 349 | </span></dt> |
| 350 | <dd><p> |
| 351 | Whether to report only the first data race that has been detected on a |
| 352 | memory location or all data races that have been detected on a memory |
| 353 | location. |
| 354 | </p></dd> |
| 355 | <dt><span class="term"> |
| 356 | <code class="option"> |
| 357 | --free-is-write=<yes|no> [default: no] |
| 358 | </code> |
| 359 | </span></dt> |
| 360 | <dd> |
| 361 | <p> |
| 362 | Whether to report races between accessing memory and freeing |
| 363 | memory. Enabling this option may cause DRD to run slightly |
| 364 | slower. Notes:</p> |
| 365 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 366 | <li class="listitem"><p> |
| 367 | Don't enable this option when using custom memory allocators |
| 368 | that use |
| 369 | the <code class="computeroutput">VG_USERREQ__MALLOCLIKE_BLOCK</code> |
| 370 | and <code class="computeroutput">VG_USERREQ__FREELIKE_BLOCK</code> |
| 371 | because that would result in false positives. |
| 372 | </p></li> |
| 373 | <li class="listitem"><p>Don't enable this option when using reference-counted |
| 374 | objects because that will result in false positives, even when |
| 375 | that code has been annotated properly with |
| 376 | <code class="computeroutput">ANNOTATE_HAPPENS_BEFORE</code> |
| 377 | and <code class="computeroutput">ANNOTATE_HAPPENS_AFTER</code>. See |
| 378 | e.g. the output of the following command for an example: |
| 379 | <code class="computeroutput">valgrind --tool=drd --free-is-write=yes |
| 380 | drd/tests/annotate_smart_pointer</code>. |
| 381 | </p></li> |
| 382 | </ul></div> |
| 383 | </dd> |
| 384 | <dt><span class="term"> |
| 385 | <code class="option"> |
| 386 | --report-signal-unlocked=<yes|no> [default: yes] |
| 387 | </code> |
| 388 | </span></dt> |
| 389 | <dd><p> |
| 390 | Whether to report calls to |
| 391 | <code class="function">pthread_cond_signal</code> and |
| 392 | <code class="function">pthread_cond_broadcast</code> where the mutex |
| 393 | associated with the signal through |
| 394 | <code class="function">pthread_cond_wait</code> or |
| 395 | <code class="function">pthread_cond_timed_wait</code>is not locked at |
| 396 | the time the signal is sent. Sending a signal without holding |
| 397 | a lock on the associated mutex is a common programming error |
| 398 | which can cause subtle race conditions and unpredictable |
| 399 | behavior. There exist some uncommon synchronization patterns |
| 400 | however where it is safe to send a signal without holding a |
| 401 | lock on the associated mutex. |
| 402 | </p></dd> |
| 403 | <dt><span class="term"> |
| 404 | <code class="option">--segment-merging=<yes|no> [default: yes]</code> |
| 405 | </span></dt> |
| 406 | <dd><p> |
| 407 | Controls segment merging. Segment merging is an algorithm to |
| 408 | limit memory usage of the data race detection |
| 409 | algorithm. Disabling segment merging may improve the accuracy |
| 410 | of the so-called 'other segments' displayed in race reports |
| 411 | but can also trigger an out of memory error. |
| 412 | </p></dd> |
| 413 | <dt><span class="term"> |
| 414 | <code class="option">--segment-merging-interval=<n> [default: 10]</code> |
| 415 | </span></dt> |
| 416 | <dd><p> |
| 417 | Perform segment merging only after the specified number of new |
| 418 | segments have been created. This is an advanced configuration option |
| 419 | that allows to choose whether to minimize DRD's memory usage by |
| 420 | choosing a low value or to let DRD run faster by choosing a slightly |
| 421 | higher value. The optimal value for this parameter depends on the |
| 422 | program being analyzed. The default value works well for most programs. |
| 423 | </p></dd> |
| 424 | <dt><span class="term"> |
| 425 | <code class="option">--shared-threshold=<n> [default: off]</code> |
| 426 | </span></dt> |
| 427 | <dd><p> |
| 428 | Print an error message if a reader lock has been held longer |
| 429 | than the specified time (in milliseconds). This option enables |
| 430 | the detection of lock contention. |
| 431 | </p></dd> |
| 432 | <dt><span class="term"> |
| 433 | <code class="option">--show-confl-seg=<yes|no> [default: yes]</code> |
| 434 | </span></dt> |
| 435 | <dd><p> |
| 436 | Show conflicting segments in race reports. Since this |
| 437 | information can help to find the cause of a data race, this |
| 438 | option is enabled by default. Disabling this option makes the |
| 439 | output of DRD more compact. |
| 440 | </p></dd> |
| 441 | <dt><span class="term"> |
| 442 | <code class="option">--show-stack-usage=<yes|no> [default: no]</code> |
| 443 | </span></dt> |
| 444 | <dd><p> |
| 445 | Print stack usage at thread exit time. When a program creates a large |
| 446 | number of threads it becomes important to limit the amount of virtual |
| 447 | memory allocated for thread stacks. This option makes it possible to |
| 448 | observe how much stack memory has been used by each thread of the |
| 449 | client program. Note: the DRD tool itself allocates some temporary |
| 450 | data on the client thread stack. The space necessary for this |
| 451 | temporary data must be allocated by the client program when it |
| 452 | allocates stack memory, but is not included in stack usage reported by |
| 453 | DRD. |
| 454 | </p></dd> |
| 455 | <dt><span class="term"> |
| 456 | <code class="option">--ignore-thread-creation=<yes|no> [default: no]</code> |
| 457 | </span></dt> |
| 458 | <dd> |
| 459 | <p> |
| 460 | Controls whether all activities during thread creation should be |
| 461 | ignored. By default enabled only on Solaris. |
| 462 | Solaris provides higher throughput, parallelism and scalability than |
| 463 | other operating systems, at the cost of more fine-grained locking |
| 464 | activity. This means for example that when a thread is created under |
| 465 | glibc, just one big lock is used for all thread setup. Solaris libc |
| 466 | uses several fine-grained locks and the creator thread resumes its |
| 467 | activities as soon as possible, leaving for example stack and TLS setup |
| 468 | sequence to the created thread. |
| 469 | This situation confuses DRD as it assumes there is some false ordering |
| 470 | in place between creator and created thread; and therefore many types |
| 471 | of race conditions in the application would not be reported. To prevent |
| 472 | such false ordering, this command line option is set to |
| 473 | <code class="computeroutput">yes</code> by default on Solaris. |
| 474 | All activity (loads, stores, client requests) is therefore ignored |
| 475 | during:</p> |
| 476 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 477 | <li class="listitem"><p> |
| 478 | pthread_create() call in the creator thread |
| 479 | </p></li> |
| 480 | <li class="listitem"><p> |
| 481 | thread creation phase (stack and TLS setup) in the created thread |
| 482 | </p></li> |
| 483 | </ul></div> |
| 484 | </dd> |
| 485 | </dl> |
| 486 | </div> |
| 487 | <p> |
| 488 | The following options are available for monitoring the behavior of the |
| 489 | client program: |
| 490 | </p> |
| 491 | <div class="variablelist"> |
| 492 | <a name="drd.debugopts.list"></a><dl class="variablelist"> |
| 493 | <dt><span class="term"> |
| 494 | <code class="option">--trace-addr=<address> [default: none]</code> |
| 495 | </span></dt> |
| 496 | <dd><p> |
| 497 | Trace all load and store activity for the specified |
| 498 | address. This option may be specified more than once. |
| 499 | </p></dd> |
| 500 | <dt><span class="term"> |
| 501 | <code class="option">--ptrace-addr=<address> [default: none]</code> |
| 502 | </span></dt> |
| 503 | <dd><p> |
| 504 | Trace all load and store activity for the specified address and keep |
| 505 | doing that even after the memory at that address has been freed and |
| 506 | reallocated. |
| 507 | </p></dd> |
| 508 | <dt><span class="term"> |
| 509 | <code class="option">--trace-alloc=<yes|no> [default: no]</code> |
| 510 | </span></dt> |
| 511 | <dd><p> |
| 512 | Trace all memory allocations and deallocations. May produce a huge |
| 513 | amount of output. |
| 514 | </p></dd> |
| 515 | <dt><span class="term"> |
| 516 | <code class="option">--trace-barrier=<yes|no> [default: no]</code> |
| 517 | </span></dt> |
| 518 | <dd><p> |
| 519 | Trace all barrier activity. |
| 520 | </p></dd> |
| 521 | <dt><span class="term"> |
| 522 | <code class="option">--trace-cond=<yes|no> [default: no]</code> |
| 523 | </span></dt> |
| 524 | <dd><p> |
| 525 | Trace all condition variable activity. |
| 526 | </p></dd> |
| 527 | <dt><span class="term"> |
| 528 | <code class="option">--trace-fork-join=<yes|no> [default: no]</code> |
| 529 | </span></dt> |
| 530 | <dd><p> |
| 531 | Trace all thread creation and all thread termination events. |
| 532 | </p></dd> |
| 533 | <dt><span class="term"> |
| 534 | <code class="option">--trace-hb=<yes|no> [default: no]</code> |
| 535 | </span></dt> |
| 536 | <dd><p> |
| 537 | Trace execution of the <code class="literal">ANNOTATE_HAPPENS_BEFORE()</code>, |
| 538 | <code class="literal">ANNOTATE_HAPPENS_AFTER()</code> and |
| 539 | <code class="literal">ANNOTATE_HAPPENS_DONE()</code> client requests. |
| 540 | </p></dd> |
| 541 | <dt><span class="term"> |
| 542 | <code class="option">--trace-mutex=<yes|no> [default: no]</code> |
| 543 | </span></dt> |
| 544 | <dd><p> |
| 545 | Trace all mutex activity. |
| 546 | </p></dd> |
| 547 | <dt><span class="term"> |
| 548 | <code class="option">--trace-rwlock=<yes|no> [default: no]</code> |
| 549 | </span></dt> |
| 550 | <dd><p> |
| 551 | Trace all reader-writer lock activity. |
| 552 | </p></dd> |
| 553 | <dt><span class="term"> |
| 554 | <code class="option">--trace-semaphore=<yes|no> [default: no]</code> |
| 555 | </span></dt> |
| 556 | <dd><p> |
| 557 | Trace all semaphore activity. |
| 558 | </p></dd> |
| 559 | </dl> |
| 560 | </div> |
| 561 | </div> |
| 562 | <div class="sect2"> |
| 563 | <div class="titlepage"><div><div><h3 class="title"> |
| 564 | <a name="drd-manual.data-races"></a>8.2.2. Detected Errors: Data Races</h3></div></div></div> |
| 565 | <p> |
| 566 | DRD prints a message every time it detects a data race. Please keep |
| 567 | the following in mind when interpreting DRD's output: |
| 568 | </p> |
| 569 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 570 | <li class="listitem"><p> |
| 571 | Every thread is assigned a <span class="emphasis"><em>thread ID</em></span> by the DRD |
| 572 | tool. A thread ID is a number. Thread ID's start at one and are never |
| 573 | recycled. |
| 574 | </p></li> |
| 575 | <li class="listitem"><p> |
| 576 | The term <span class="emphasis"><em>segment</em></span> refers to a consecutive |
| 577 | sequence of load, store and synchronization operations, all |
| 578 | issued by the same thread. A segment always starts and ends at a |
| 579 | synchronization operation. Data race analysis is performed |
| 580 | between segments instead of between individual load and store |
| 581 | operations because of performance reasons. |
| 582 | </p></li> |
| 583 | <li class="listitem"><p> |
| 584 | There are always at least two memory accesses involved in a data |
| 585 | race. Memory accesses involved in a data race are called |
| 586 | <span class="emphasis"><em>conflicting memory accesses</em></span>. DRD prints a |
| 587 | report for each memory access that conflicts with a past memory |
| 588 | access. |
| 589 | </p></li> |
| 590 | </ul></div> |
| 591 | <p> |
| 592 | </p> |
| 593 | <p> |
| 594 | Below you can find an example of a message printed by DRD when it |
| 595 | detects a data race: |
| 596 | </p> |
| 597 | <pre class="programlisting"> |
| 598 | $ valgrind --tool=drd --read-var-info=yes drd/tests/rwlock_race |
| 599 | ... |
| 600 | ==9466== Thread 3: |
| 601 | ==9466== Conflicting load by thread 3 at 0x006020b8 size 4 |
| 602 | ==9466== at 0x400B6C: thread_func (rwlock_race.c:29) |
| 603 | ==9466== by 0x4C291DF: vg_thread_wrapper (drd_pthread_intercepts.c:186) |
| 604 | ==9466== by 0x4E3403F: start_thread (in /lib64/libpthread-2.8.so) |
| 605 | ==9466== by 0x53250CC: clone (in /lib64/libc-2.8.so) |
| 606 | ==9466== Location 0x6020b8 is 0 bytes inside local var "s_racy" |
| 607 | ==9466== declared at rwlock_race.c:18, in frame #0 of thread 3 |
| 608 | ==9466== Other segment start (thread 2) |
| 609 | ==9466== at 0x4C2847D: pthread_rwlock_rdlock* (drd_pthread_intercepts.c:813) |
| 610 | ==9466== by 0x400B6B: thread_func (rwlock_race.c:28) |
| 611 | ==9466== by 0x4C291DF: vg_thread_wrapper (drd_pthread_intercepts.c:186) |
| 612 | ==9466== by 0x4E3403F: start_thread (in /lib64/libpthread-2.8.so) |
| 613 | ==9466== by 0x53250CC: clone (in /lib64/libc-2.8.so) |
| 614 | ==9466== Other segment end (thread 2) |
| 615 | ==9466== at 0x4C28B54: pthread_rwlock_unlock* (drd_pthread_intercepts.c:912) |
| 616 | ==9466== by 0x400B84: thread_func (rwlock_race.c:30) |
| 617 | ==9466== by 0x4C291DF: vg_thread_wrapper (drd_pthread_intercepts.c:186) |
| 618 | ==9466== by 0x4E3403F: start_thread (in /lib64/libpthread-2.8.so) |
| 619 | ==9466== by 0x53250CC: clone (in /lib64/libc-2.8.so) |
| 620 | ... |
| 621 | </pre> |
| 622 | <p> |
| 623 | The above report has the following meaning: |
| 624 | </p> |
| 625 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 626 | <li class="listitem"><p> |
| 627 | The number in the column on the left is the process ID of the |
| 628 | process being analyzed by DRD. |
| 629 | </p></li> |
| 630 | <li class="listitem"><p> |
| 631 | The first line ("Thread 3") tells you the thread ID for |
| 632 | the thread in which context the data race has been detected. |
| 633 | </p></li> |
| 634 | <li class="listitem"><p> |
| 635 | The next line tells which kind of operation was performed (load or |
| 636 | store) and by which thread. On the same line the start address and the |
| 637 | number of bytes involved in the conflicting access are also displayed. |
| 638 | </p></li> |
| 639 | <li class="listitem"><p> |
| 640 | Next, the call stack of the conflicting access is displayed. If |
| 641 | your program has been compiled with debug information |
| 642 | (<code class="option">-g</code>), this call stack will include file names and |
| 643 | line numbers. The two |
| 644 | bottommost frames in this call stack (<code class="function">clone</code> |
| 645 | and <code class="function">start_thread</code>) show how the NPTL starts |
| 646 | a thread. The third frame |
| 647 | (<code class="function">vg_thread_wrapper</code>) is added by DRD. The |
| 648 | fourth frame (<code class="function">thread_func</code>) is the first |
| 649 | interesting line because it shows the thread entry point, that |
| 650 | is the function that has been passed as the third argument to |
| 651 | <code class="function">pthread_create</code>. |
| 652 | </p></li> |
| 653 | <li class="listitem"><p> |
| 654 | Next, the allocation context for the conflicting address is |
| 655 | displayed. For dynamically allocated data the allocation call |
| 656 | stack is shown. For static variables and stack variables the |
| 657 | allocation context is only shown when the option |
| 658 | <code class="option">--read-var-info=yes</code> has been |
| 659 | specified. Otherwise DRD will print <code class="computeroutput">Allocation |
| 660 | context: unknown</code>. |
| 661 | </p></li> |
| 662 | <li class="listitem"> |
| 663 | <p> |
| 664 | A conflicting access involves at least two memory accesses. For |
| 665 | one of these accesses an exact call stack is displayed, and for |
| 666 | the other accesses an approximate call stack is displayed, |
| 667 | namely the start and the end of the segments of the other |
| 668 | accesses. This information can be interpreted as follows: |
| 669 | </p> |
| 670 | <div class="orderedlist"><ol class="orderedlist" type="1"> |
| 671 | <li class="listitem"><p> |
| 672 | Start at the bottom of both call stacks, and count the |
| 673 | number stack frames with identical function name, file |
| 674 | name and line number. In the above example the three |
| 675 | bottommost frames are identical |
| 676 | (<code class="function">clone</code>, |
| 677 | <code class="function">start_thread</code> and |
| 678 | <code class="function">vg_thread_wrapper</code>). |
| 679 | </p></li> |
| 680 | <li class="listitem"><p> |
| 681 | The next higher stack frame in both call stacks now tells |
| 682 | you between in which source code region the other memory |
| 683 | access happened. The above output tells that the other |
| 684 | memory access involved in the data race happened between |
| 685 | source code lines 28 and 30 in file |
| 686 | <code class="computeroutput">rwlock_race.c</code>. |
| 687 | </p></li> |
| 688 | </ol></div> |
| 689 | <p> |
| 690 | </p> |
| 691 | </li> |
| 692 | </ul></div> |
| 693 | <p> |
| 694 | </p> |
| 695 | </div> |
| 696 | <div class="sect2"> |
| 697 | <div class="titlepage"><div><div><h3 class="title"> |
| 698 | <a name="drd-manual.lock-contention"></a>8.2.3. Detected Errors: Lock Contention</h3></div></div></div> |
| 699 | <p> |
| 700 | Threads must be able to make progress without being blocked for too long by |
| 701 | other threads. Sometimes a thread has to wait until a mutex or reader-writer |
| 702 | synchronization object is unlocked by another thread. This is called |
| 703 | <span class="emphasis"><em>lock contention</em></span>. |
| 704 | </p> |
| 705 | <p> |
| 706 | Lock contention causes delays. Such delays should be as short as |
| 707 | possible. The two command line options |
| 708 | <code class="literal">--exclusive-threshold=<n></code> and |
| 709 | <code class="literal">--shared-threshold=<n></code> make it possible to |
| 710 | detect excessive lock contention by making DRD report any lock that |
| 711 | has been held longer than the specified threshold. An example: |
| 712 | </p> |
| 713 | <pre class="programlisting"> |
| 714 | $ valgrind --tool=drd --exclusive-threshold=10 drd/tests/hold_lock -i 500 |
| 715 | ... |
| 716 | ==10668== Acquired at: |
| 717 | ==10668== at 0x4C267C8: pthread_mutex_lock (drd_pthread_intercepts.c:395) |
| 718 | ==10668== by 0x400D92: main (hold_lock.c:51) |
| 719 | ==10668== Lock on mutex 0x7fefffd50 was held during 503 ms (threshold: 10 ms). |
| 720 | ==10668== at 0x4C26ADA: pthread_mutex_unlock (drd_pthread_intercepts.c:441) |
| 721 | ==10668== by 0x400DB5: main (hold_lock.c:55) |
| 722 | ... |
| 723 | </pre> |
| 724 | <p> |
| 725 | The <code class="literal">hold_lock</code> test program holds a lock as long as |
| 726 | specified by the <code class="literal">-i</code> (interval) argument. The DRD |
| 727 | output reports that the lock acquired at line 51 in source file |
| 728 | <code class="literal">hold_lock.c</code> and released at line 55 was held during |
| 729 | 503 ms, while a threshold of 10 ms was specified to DRD. |
| 730 | </p> |
| 731 | </div> |
| 732 | <div class="sect2"> |
| 733 | <div class="titlepage"><div><div><h3 class="title"> |
| 734 | <a name="drd-manual.api-checks"></a>8.2.4. Detected Errors: Misuse of the POSIX threads API</h3></div></div></div> |
| 735 | <p> |
| 736 | DRD is able to detect and report the following misuses of the POSIX |
| 737 | threads API: |
| 738 | </p> |
| 739 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 740 | <li class="listitem"><p> |
| 741 | Passing the address of one type of synchronization object |
| 742 | (e.g. a mutex) to a POSIX API call that expects a pointer to |
| 743 | another type of synchronization object (e.g. a condition |
| 744 | variable). |
| 745 | </p></li> |
| 746 | <li class="listitem"><p> |
| 747 | Attempts to unlock a mutex that has not been locked. |
| 748 | </p></li> |
| 749 | <li class="listitem"><p> |
| 750 | Attempts to unlock a mutex that was locked by another thread. |
| 751 | </p></li> |
| 752 | <li class="listitem"><p> |
| 753 | Attempts to lock a mutex of type |
| 754 | <code class="literal">PTHREAD_MUTEX_NORMAL</code> or a spinlock |
| 755 | recursively. |
| 756 | </p></li> |
| 757 | <li class="listitem"><p> |
| 758 | Destruction or deallocation of a locked mutex. |
| 759 | </p></li> |
| 760 | <li class="listitem"><p> |
| 761 | Sending a signal to a condition variable while no lock is held |
| 762 | on the mutex associated with the condition variable. |
| 763 | </p></li> |
| 764 | <li class="listitem"><p> |
| 765 | Calling <code class="function">pthread_cond_wait</code> on a mutex |
| 766 | that is not locked, that is locked by another thread or that |
| 767 | has been locked recursively. |
| 768 | </p></li> |
| 769 | <li class="listitem"><p> |
| 770 | Associating two different mutexes with a condition variable |
| 771 | through <code class="function">pthread_cond_wait</code>. |
| 772 | </p></li> |
| 773 | <li class="listitem"><p> |
| 774 | Destruction or deallocation of a condition variable that is |
| 775 | being waited upon. |
| 776 | </p></li> |
| 777 | <li class="listitem"><p> |
| 778 | Destruction or deallocation of a locked reader-writer synchronization |
| 779 | object. |
| 780 | </p></li> |
| 781 | <li class="listitem"><p> |
| 782 | Attempts to unlock a reader-writer synchronization object that was not |
| 783 | locked by the calling thread. |
| 784 | </p></li> |
| 785 | <li class="listitem"><p> |
| 786 | Attempts to recursively lock a reader-writer synchronization object |
| 787 | exclusively. |
| 788 | </p></li> |
| 789 | <li class="listitem"><p> |
| 790 | Attempts to pass the address of a user-defined reader-writer |
| 791 | synchronization object to a POSIX threads function. |
| 792 | </p></li> |
| 793 | <li class="listitem"><p> |
| 794 | Attempts to pass the address of a POSIX reader-writer synchronization |
| 795 | object to one of the annotations for user-defined reader-writer |
| 796 | synchronization objects. |
| 797 | </p></li> |
| 798 | <li class="listitem"><p> |
| 799 | Reinitialization of a mutex, condition variable, reader-writer |
| 800 | lock, semaphore or barrier. |
| 801 | </p></li> |
| 802 | <li class="listitem"><p> |
| 803 | Destruction or deallocation of a semaphore or barrier that is |
| 804 | being waited upon. |
| 805 | </p></li> |
| 806 | <li class="listitem"><p> |
| 807 | Missing synchronization between barrier wait and barrier destruction. |
| 808 | </p></li> |
| 809 | <li class="listitem"><p> |
| 810 | Exiting a thread without first unlocking the spinlocks, mutexes or |
| 811 | reader-writer synchronization objects that were locked by that thread. |
| 812 | </p></li> |
| 813 | <li class="listitem"><p> |
| 814 | Passing an invalid thread ID to <code class="function">pthread_join</code> |
| 815 | or <code class="function">pthread_cancel</code>. |
| 816 | </p></li> |
| 817 | </ul></div> |
| 818 | <p> |
| 819 | </p> |
| 820 | </div> |
| 821 | <div class="sect2"> |
| 822 | <div class="titlepage"><div><div><h3 class="title"> |
| 823 | <a name="drd-manual.clientreqs"></a>8.2.5. Client Requests</h3></div></div></div> |
| 824 | <p> |
| 825 | Just as for other Valgrind tools it is possible to let a client program |
| 826 | interact with the DRD tool through client requests. In addition to the |
| 827 | client requests several macros have been defined that allow to use the |
| 828 | client requests in a convenient way. |
| 829 | </p> |
| 830 | <p> |
| 831 | The interface between client programs and the DRD tool is defined in |
| 832 | the header file <code class="literal"><valgrind/drd.h></code>. The |
| 833 | available macros and client requests are: |
| 834 | </p> |
| 835 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 836 | <li class="listitem"><p> |
| 837 | The macro <code class="literal">DRD_GET_VALGRIND_THREADID</code> and the |
| 838 | corresponding client |
| 839 | request <code class="varname">VG_USERREQ__DRD_GET_VALGRIND_THREAD_ID</code>. |
| 840 | Query the thread ID that has been assigned by the Valgrind core to the |
| 841 | thread executing this client request. Valgrind's thread ID's start at |
| 842 | one and are recycled in case a thread stops. |
| 843 | </p></li> |
| 844 | <li class="listitem"><p> |
| 845 | The macro <code class="literal">DRD_GET_DRD_THREADID</code> and the corresponding |
| 846 | client request <code class="varname">VG_USERREQ__DRD_GET_DRD_THREAD_ID</code>. |
| 847 | Query the thread ID that has been assigned by DRD to the thread |
| 848 | executing this client request. These are the thread ID's reported by DRD |
| 849 | in data race reports and in trace messages. DRD's thread ID's start at |
| 850 | one and are never recycled. |
| 851 | </p></li> |
| 852 | <li class="listitem"><p> |
| 853 | The macros <code class="literal">DRD_IGNORE_VAR(x)</code>, |
| 854 | <code class="literal">ANNOTATE_TRACE_MEMORY(&x)</code> and the corresponding |
| 855 | client request <code class="varname">VG_USERREQ__DRD_START_SUPPRESSION</code>. Some |
| 856 | applications contain intentional races. There exist e.g. applications |
| 857 | where the same value is assigned to a shared variable from two different |
| 858 | threads. It may be more convenient to suppress such races than to solve |
| 859 | these. This client request allows to suppress such races. |
| 860 | </p></li> |
| 861 | <li class="listitem"><p> |
| 862 | The macro <code class="literal">DRD_STOP_IGNORING_VAR(x)</code> and the |
| 863 | corresponding client request |
| 864 | <code class="varname">VG_USERREQ__DRD_FINISH_SUPPRESSION</code>. Tell DRD |
| 865 | to no longer ignore data races for the address range that was suppressed |
| 866 | either via the macro <code class="literal">DRD_IGNORE_VAR(x)</code> or via the |
| 867 | client request <code class="varname">VG_USERREQ__DRD_START_SUPPRESSION</code>. |
| 868 | </p></li> |
| 869 | <li class="listitem"><p> |
| 870 | The macro <code class="literal">DRD_TRACE_VAR(x)</code>. Trace all load and store |
| 871 | activity for the address range starting at <code class="literal">&x</code> and |
| 872 | occupying <code class="literal">sizeof(x)</code> bytes. When DRD reports a data |
| 873 | race on a specified variable, and it's not immediately clear which |
| 874 | source code statements triggered the conflicting accesses, it can be |
| 875 | very helpful to trace all activity on the offending memory location. |
| 876 | </p></li> |
| 877 | <li class="listitem"><p> |
| 878 | The macro <code class="literal">DRD_STOP_TRACING_VAR(x)</code>. Stop tracing load |
| 879 | and store activity for the address range starting |
| 880 | at <code class="literal">&x</code> and occupying <code class="literal">sizeof(x)</code> |
| 881 | bytes. |
| 882 | </p></li> |
| 883 | <li class="listitem"><p> |
| 884 | The macro <code class="literal">ANNOTATE_TRACE_MEMORY(&x)</code>. Trace all |
| 885 | load and store activity that touches at least the single byte at the |
| 886 | address <code class="literal">&x</code>. |
| 887 | </p></li> |
| 888 | <li class="listitem"><p> |
| 889 | The client request <code class="varname">VG_USERREQ__DRD_START_TRACE_ADDR</code>, |
| 890 | which allows to trace all load and store activity for the specified |
| 891 | address range. |
| 892 | </p></li> |
| 893 | <li class="listitem"><p> |
| 894 | The client |
| 895 | request <code class="varname">VG_USERREQ__DRD_STOP_TRACE_ADDR</code>. Do no longer |
| 896 | trace load and store activity for the specified address range. |
| 897 | </p></li> |
| 898 | <li class="listitem"><p> |
| 899 | The macro <code class="literal">ANNOTATE_HAPPENS_BEFORE(addr)</code> tells DRD to |
| 900 | insert a mark. Insert this macro just after an access to the variable at |
| 901 | the specified address has been performed. |
| 902 | </p></li> |
| 903 | <li class="listitem"><p> |
| 904 | The macro <code class="literal">ANNOTATE_HAPPENS_AFTER(addr)</code> tells DRD that |
| 905 | the next access to the variable at the specified address should be |
| 906 | considered to have happened after the access just before the latest |
| 907 | <code class="literal">ANNOTATE_HAPPENS_BEFORE(addr)</code> annotation that |
| 908 | references the same variable. The purpose of these two macros is to tell |
| 909 | DRD about the order of inter-thread memory accesses implemented via |
| 910 | atomic memory operations. See |
| 911 | also <code class="literal">drd/tests/annotate_smart_pointer.cpp</code> for an |
| 912 | example. |
| 913 | </p></li> |
| 914 | <li class="listitem"><p> |
| 915 | The macro <code class="literal">ANNOTATE_RWLOCK_CREATE(rwlock)</code> tells DRD |
| 916 | that the object at address <code class="literal">rwlock</code> is a |
| 917 | reader-writer synchronization object that is not a |
| 918 | <code class="literal">pthread_rwlock_t</code> synchronization object. See |
| 919 | also <code class="literal">drd/tests/annotate_rwlock.c</code> for an example. |
| 920 | </p></li> |
| 921 | <li class="listitem"><p> |
| 922 | The macro <code class="literal">ANNOTATE_RWLOCK_DESTROY(rwlock)</code> tells DRD |
| 923 | that the reader-writer synchronization object at |
| 924 | address <code class="literal">rwlock</code> has been destroyed. |
| 925 | </p></li> |
| 926 | <li class="listitem"><p> |
| 927 | The macro <code class="literal">ANNOTATE_WRITERLOCK_ACQUIRED(rwlock)</code> tells |
| 928 | DRD that a writer lock has been acquired on the reader-writer |
| 929 | synchronization object at address <code class="literal">rwlock</code>. |
| 930 | </p></li> |
| 931 | <li class="listitem"><p> |
| 932 | The macro <code class="literal">ANNOTATE_READERLOCK_ACQUIRED(rwlock)</code> tells |
| 933 | DRD that a reader lock has been acquired on the reader-writer |
| 934 | synchronization object at address <code class="literal">rwlock</code>. |
| 935 | </p></li> |
| 936 | <li class="listitem"><p> |
| 937 | The macro <code class="literal">ANNOTATE_RWLOCK_ACQUIRED(rwlock, is_w)</code> |
| 938 | tells DRD that a writer lock (when <code class="literal">is_w != 0</code>) or that |
| 939 | a reader lock (when <code class="literal">is_w == 0</code>) has been acquired on |
| 940 | the reader-writer synchronization object at |
| 941 | address <code class="literal">rwlock</code>. |
| 942 | </p></li> |
| 943 | <li class="listitem"><p> |
| 944 | The macro <code class="literal">ANNOTATE_WRITERLOCK_RELEASED(rwlock)</code> tells |
| 945 | DRD that a writer lock has been released on the reader-writer |
| 946 | synchronization object at address <code class="literal">rwlock</code>. |
| 947 | </p></li> |
| 948 | <li class="listitem"><p> |
| 949 | The macro <code class="literal">ANNOTATE_READERLOCK_RELEASED(rwlock)</code> tells |
| 950 | DRD that a reader lock has been released on the reader-writer |
| 951 | synchronization object at address <code class="literal">rwlock</code>. |
| 952 | </p></li> |
| 953 | <li class="listitem"><p> |
| 954 | The macro <code class="literal">ANNOTATE_RWLOCK_RELEASED(rwlock, is_w)</code> |
| 955 | tells DRD that a writer lock (when <code class="literal">is_w != 0</code>) or that |
| 956 | a reader lock (when <code class="literal">is_w == 0</code>) has been released on |
| 957 | the reader-writer synchronization object at |
| 958 | address <code class="literal">rwlock</code>. |
| 959 | </p></li> |
| 960 | <li class="listitem"><p> |
| 961 | The macro <code class="literal">ANNOTATE_BARRIER_INIT(barrier, count, |
| 962 | reinitialization_allowed)</code> tells DRD that a new barrier object |
| 963 | at the address <code class="literal">barrier</code> has been initialized, |
| 964 | that <code class="literal">count</code> threads participate in each barrier and |
| 965 | also whether or not barrier reinitialization without intervening |
| 966 | destruction should be reported as an error. See |
| 967 | also <code class="literal">drd/tests/annotate_barrier.c</code> for an example. |
| 968 | </p></li> |
| 969 | <li class="listitem"><p> |
| 970 | The macro <code class="literal">ANNOTATE_BARRIER_DESTROY(barrier)</code> |
| 971 | tells DRD that a barrier object is about to be destroyed. |
| 972 | </p></li> |
| 973 | <li class="listitem"><p> |
| 974 | The macro <code class="literal">ANNOTATE_BARRIER_WAIT_BEFORE(barrier)</code> |
| 975 | tells DRD that waiting for a barrier will start. |
| 976 | </p></li> |
| 977 | <li class="listitem"><p> |
| 978 | The macro <code class="literal">ANNOTATE_BARRIER_WAIT_AFTER(barrier)</code> |
| 979 | tells DRD that waiting for a barrier has finished. |
| 980 | </p></li> |
| 981 | <li class="listitem"><p> |
| 982 | The macro <code class="literal">ANNOTATE_BENIGN_RACE_SIZED(addr, size, |
| 983 | descr)</code> tells DRD that any races detected on the specified |
| 984 | address are benign and hence should not be |
| 985 | reported. The <code class="literal">descr</code> argument is ignored but can be |
| 986 | used to document why data races on <code class="literal">addr</code> are benign. |
| 987 | </p></li> |
| 988 | <li class="listitem"><p> |
| 989 | The macro <code class="literal">ANNOTATE_BENIGN_RACE_STATIC(var, descr)</code> |
| 990 | tells DRD that any races detected on the specified static variable are |
| 991 | benign and hence should not be reported. The <code class="literal">descr</code> |
| 992 | argument is ignored but can be used to document why data races |
| 993 | on <code class="literal">var</code> are benign. Note: this macro can only be |
| 994 | used in C++ programs and not in C programs. |
| 995 | </p></li> |
| 996 | <li class="listitem"><p> |
| 997 | The macro <code class="literal">ANNOTATE_IGNORE_READS_BEGIN</code> tells |
| 998 | DRD to ignore all memory loads performed by the current thread. |
| 999 | </p></li> |
| 1000 | <li class="listitem"><p> |
| 1001 | The macro <code class="literal">ANNOTATE_IGNORE_READS_END</code> tells |
| 1002 | DRD to stop ignoring the memory loads performed by the current thread. |
| 1003 | </p></li> |
| 1004 | <li class="listitem"><p> |
| 1005 | The macro <code class="literal">ANNOTATE_IGNORE_WRITES_BEGIN</code> tells |
| 1006 | DRD to ignore all memory stores performed by the current thread. |
| 1007 | </p></li> |
| 1008 | <li class="listitem"><p> |
| 1009 | The macro <code class="literal">ANNOTATE_IGNORE_WRITES_END</code> tells |
| 1010 | DRD to stop ignoring the memory stores performed by the current thread. |
| 1011 | </p></li> |
| 1012 | <li class="listitem"><p> |
| 1013 | The macro <code class="literal">ANNOTATE_IGNORE_READS_AND_WRITES_BEGIN</code> tells |
| 1014 | DRD to ignore all memory accesses performed by the current thread. |
| 1015 | </p></li> |
| 1016 | <li class="listitem"><p> |
| 1017 | The macro <code class="literal">ANNOTATE_IGNORE_READS_AND_WRITES_END</code> tells |
| 1018 | DRD to stop ignoring the memory accesses performed by the current thread. |
| 1019 | </p></li> |
| 1020 | <li class="listitem"><p> |
| 1021 | The macro <code class="literal">ANNOTATE_NEW_MEMORY(addr, size)</code> tells |
| 1022 | DRD that the specified memory range has been allocated by a custom |
| 1023 | memory allocator in the client program and that the client program |
| 1024 | will start using this memory range. |
| 1025 | </p></li> |
| 1026 | <li class="listitem"><p> |
| 1027 | The macro <code class="literal">ANNOTATE_THREAD_NAME(name)</code> tells DRD to |
| 1028 | associate the specified name with the current thread and to include this |
| 1029 | name in the error messages printed by DRD. |
| 1030 | </p></li> |
| 1031 | <li class="listitem"><p> |
| 1032 | The macros <code class="literal">VALGRIND_MALLOCLIKE_BLOCK</code> and |
| 1033 | <code class="literal">VALGRIND_FREELIKE_BLOCK</code> from the Valgrind core are |
| 1034 | implemented; they are described in |
| 1035 | <a class="xref" href="manual-core-adv.html#manual-core-adv.clientreq" title="3.1. The Client Request mechanism">The Client Request mechanism</a>. |
| 1036 | </p></li> |
| 1037 | </ul></div> |
| 1038 | <p> |
| 1039 | </p> |
| 1040 | <p> |
| 1041 | Note: if you compiled Valgrind yourself, the header file |
| 1042 | <code class="literal"><valgrind/drd.h></code> will have been installed in |
| 1043 | the directory <code class="literal">/usr/include</code> by the command |
| 1044 | <code class="literal">make install</code>. If you obtained Valgrind by |
| 1045 | installing it as a package however, you will probably have to install |
| 1046 | another package with a name like <code class="literal">valgrind-devel</code> |
| 1047 | before Valgrind's header files are available. |
| 1048 | </p> |
| 1049 | </div> |
| 1050 | <div class="sect2"> |
| 1051 | <div class="titlepage"><div><div><h3 class="title"> |
| 1052 | <a name="drd-manual.C++11"></a>8.2.6. Debugging C++11 Programs</h3></div></div></div> |
| 1053 | <p>If you want to use the C++11 class std::thread you will need to do the |
| 1054 | following to annotate the std::shared_ptr<> objects used in the |
| 1055 | implementation of that class: |
| 1056 | </p> |
| 1057 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 1058 | <li class="listitem"> |
| 1059 | <p>Add the following code at the start of a common header or at the |
| 1060 | start of each source file, before any C++ header files are included:</p> |
| 1061 | <pre class="programlisting"> |
| 1062 | #include <valgrind/drd.h> |
| 1063 | #define _GLIBCXX_SYNCHRONIZATION_HAPPENS_BEFORE(addr) ANNOTATE_HAPPENS_BEFORE(addr) |
| 1064 | #define _GLIBCXX_SYNCHRONIZATION_HAPPENS_AFTER(addr) ANNOTATE_HAPPENS_AFTER(addr) |
| 1065 | </pre> |
| 1066 | </li> |
| 1067 | <li class="listitem"><p>Download the gcc source code and from source file |
| 1068 | libstdc++-v3/src/c++11/thread.cc copy the implementation of the |
| 1069 | <code class="computeroutput">execute_native_thread_routine()</code> |
| 1070 | and <code class="computeroutput">std::thread::_M_start_thread()</code> |
| 1071 | functions into a source file that is linked with your application. Make |
| 1072 | sure that also in this source file the |
| 1073 | _GLIBCXX_SYNCHRONIZATION_HAPPENS_*() macros are defined properly.</p></li> |
| 1074 | </ul></div> |
| 1075 | <p> |
| 1076 | </p> |
| 1077 | <p>For more information, see also <span class="emphasis"><em>The |
| 1078 | GNU C++ Library Manual, Debugging Support</em></span> |
| 1079 | (<a class="ulink" href="http://gcc.gnu.org/onlinedocs/libstdc++/manual/debug.html" target="_top">http://gcc.gnu.org/onlinedocs/libstdc++/manual/debug.html</a>).</p> |
| 1080 | </div> |
| 1081 | <div class="sect2"> |
| 1082 | <div class="titlepage"><div><div><h3 class="title"> |
| 1083 | <a name="drd-manual.gnome"></a>8.2.7. Debugging GNOME Programs</h3></div></div></div> |
| 1084 | <p> |
| 1085 | GNOME applications use the threading primitives provided by the |
| 1086 | <code class="computeroutput">glib</code> and |
| 1087 | <code class="computeroutput">gthread</code> libraries. These libraries |
| 1088 | are built on top of POSIX threads, and hence are directly supported by |
| 1089 | DRD. Please keep in mind that you have to call |
| 1090 | <code class="function">g_thread_init</code> before creating any threads, or |
| 1091 | DRD will report several data races on glib functions. See also the |
| 1092 | <a class="ulink" href="http://library.gnome.org/devel/glib/stable/glib-Threads.html" target="_top">GLib |
| 1093 | Reference Manual</a> for more information about |
| 1094 | <code class="function">g_thread_init</code>. |
| 1095 | </p> |
| 1096 | <p> |
| 1097 | One of the many facilities provided by the <code class="literal">glib</code> |
| 1098 | library is a block allocator, called <code class="literal">g_slice</code>. You |
| 1099 | have to disable this block allocator when using DRD by adding the |
| 1100 | following to the shell environment variables: |
| 1101 | <code class="literal">G_SLICE=always-malloc</code>. See also the <a class="ulink" href="http://library.gnome.org/devel/glib/stable/glib-Memory-Slices.html" target="_top">GLib |
| 1102 | Reference Manual</a> for more information. |
| 1103 | </p> |
| 1104 | </div> |
| 1105 | <div class="sect2"> |
| 1106 | <div class="titlepage"><div><div><h3 class="title"> |
| 1107 | <a name="drd-manual.boost.thread"></a>8.2.8. Debugging Boost.Thread Programs</h3></div></div></div> |
| 1108 | <p> |
| 1109 | The Boost.Thread library is the threading library included with the |
| 1110 | cross-platform Boost Libraries. This threading library is an early |
| 1111 | implementation of the upcoming C++0x threading library. |
| 1112 | </p> |
| 1113 | <p> |
| 1114 | Applications that use the Boost.Thread library should run fine under DRD. |
| 1115 | </p> |
| 1116 | <p> |
| 1117 | More information about Boost.Thread can be found here: |
| 1118 | </p> |
| 1119 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 1120 | <li class="listitem"><p> |
| 1121 | Anthony Williams, <a class="ulink" href="http://www.boost.org/doc/libs/1_37_0/doc/html/thread.html" target="_top">Boost.Thread</a> |
| 1122 | Library Documentation, Boost website, 2007. |
| 1123 | </p></li> |
| 1124 | <li class="listitem"><p> |
| 1125 | Anthony Williams, <a class="ulink" href="http://www.ddj.com/cpp/211600441" target="_top">What's New in Boost |
| 1126 | Threads?</a>, Recent changes to the Boost Thread library, |
| 1127 | Dr. Dobbs Magazine, October 2008. |
| 1128 | </p></li> |
| 1129 | </ul></div> |
| 1130 | <p> |
| 1131 | </p> |
| 1132 | </div> |
| 1133 | <div class="sect2"> |
| 1134 | <div class="titlepage"><div><div><h3 class="title"> |
| 1135 | <a name="drd-manual.openmp"></a>8.2.9. Debugging OpenMP Programs</h3></div></div></div> |
| 1136 | <p> |
| 1137 | OpenMP stands for <span class="emphasis"><em>Open Multi-Processing</em></span>. The OpenMP |
| 1138 | standard consists of a set of compiler directives for C, C++ and Fortran |
| 1139 | programs that allows a compiler to transform a sequential program into a |
| 1140 | parallel program. OpenMP is well suited for HPC applications and allows to |
| 1141 | work at a higher level compared to direct use of the POSIX threads API. While |
| 1142 | OpenMP ensures that the POSIX API is used correctly, OpenMP programs can still |
| 1143 | contain data races. So it definitely makes sense to verify OpenMP programs |
| 1144 | with a thread checking tool. |
| 1145 | </p> |
| 1146 | <p> |
| 1147 | DRD supports OpenMP shared-memory programs generated by GCC. GCC |
| 1148 | supports OpenMP since version 4.2.0. GCC's runtime support |
| 1149 | for OpenMP programs is provided by a library called |
| 1150 | <code class="literal">libgomp</code>. The synchronization primitives implemented |
| 1151 | in this library use Linux' futex system call directly, unless the |
| 1152 | library has been configured with the |
| 1153 | <code class="literal">--disable-linux-futex</code> option. DRD only supports |
| 1154 | libgomp libraries that have been configured with this option and in |
| 1155 | which symbol information is present. For most Linux distributions this |
| 1156 | means that you will have to recompile GCC. See also the script |
| 1157 | <code class="literal">drd/scripts/download-and-build-gcc</code> in the |
| 1158 | Valgrind source tree for an example of how to compile GCC. You will |
| 1159 | also have to make sure that the newly compiled |
| 1160 | <code class="literal">libgomp.so</code> library is loaded when OpenMP programs |
| 1161 | are started. This is possible by adding a line similar to the |
| 1162 | following to your shell startup script: |
| 1163 | </p> |
| 1164 | <pre class="programlisting"> |
| 1165 | export LD_LIBRARY_PATH=~/gcc-4.4.0/lib64:~/gcc-4.4.0/lib: |
| 1166 | </pre> |
| 1167 | <p> |
| 1168 | As an example, the test OpenMP test program |
| 1169 | <code class="literal">drd/tests/omp_matinv</code> triggers a data race |
| 1170 | when the option -r has been specified on the command line. The data |
| 1171 | race is triggered by the following code: |
| 1172 | </p> |
| 1173 | <pre class="programlisting"> |
| 1174 | #pragma omp parallel for private(j) |
| 1175 | for (j = 0; j < rows; j++) |
| 1176 | { |
| 1177 | if (i != j) |
| 1178 | { |
| 1179 | const elem_t factor = a[j * cols + i]; |
| 1180 | for (k = 0; k < cols; k++) |
| 1181 | { |
| 1182 | a[j * cols + k] -= a[i * cols + k] * factor; |
| 1183 | } |
| 1184 | } |
| 1185 | } |
| 1186 | </pre> |
| 1187 | <p> |
| 1188 | The above code is racy because the variable <code class="literal">k</code> has |
| 1189 | not been declared private. DRD will print the following error message |
| 1190 | for the above code: |
| 1191 | </p> |
| 1192 | <pre class="programlisting"> |
| 1193 | $ valgrind --tool=drd --check-stack-var=yes --read-var-info=yes drd/tests/omp_matinv 3 -t 2 -r |
| 1194 | ... |
| 1195 | Conflicting store by thread 1/1 at 0x7fefffbc4 size 4 |
| 1196 | at 0x4014A0: gj.omp_fn.0 (omp_matinv.c:203) |
| 1197 | by 0x401211: gj (omp_matinv.c:159) |
| 1198 | by 0x40166A: invert_matrix (omp_matinv.c:238) |
| 1199 | by 0x4019B4: main (omp_matinv.c:316) |
| 1200 | Location 0x7fefffbc4 is 0 bytes inside local var "k" |
| 1201 | declared at omp_matinv.c:160, in frame #0 of thread 1 |
| 1202 | ... |
| 1203 | </pre> |
| 1204 | <p> |
| 1205 | In the above output the function name <code class="function">gj.omp_fn.0</code> |
| 1206 | has been generated by GCC from the function name |
| 1207 | <code class="function">gj</code>. The allocation context information shows that the |
| 1208 | data race has been caused by modifying the variable <code class="literal">k</code>. |
| 1209 | </p> |
| 1210 | <p> |
| 1211 | Note: for GCC versions before 4.4.0, no allocation context information is |
| 1212 | shown. With these GCC versions the most usable information in the above output |
| 1213 | is the source file name and the line number where the data race has been |
| 1214 | detected (<code class="literal">omp_matinv.c:203</code>). |
| 1215 | </p> |
| 1216 | <p> |
| 1217 | For more information about OpenMP, see also |
| 1218 | <a class="ulink" href="http://openmp.org/" target="_top">openmp.org</a>. |
| 1219 | </p> |
| 1220 | </div> |
| 1221 | <div class="sect2"> |
| 1222 | <div class="titlepage"><div><div><h3 class="title"> |
| 1223 | <a name="drd-manual.cust-mem-alloc"></a>8.2.10. DRD and Custom Memory Allocators</h3></div></div></div> |
| 1224 | <p> |
| 1225 | DRD tracks all memory allocation events that happen via the |
| 1226 | standard memory allocation and deallocation functions |
| 1227 | (<code class="function">malloc</code>, <code class="function">free</code>, |
| 1228 | <code class="function">new</code> and <code class="function">delete</code>), via entry |
| 1229 | and exit of stack frames or that have been annotated with Valgrind's |
| 1230 | memory pool client requests. DRD uses memory allocation and deallocation |
| 1231 | information for two purposes: |
| 1232 | </p> |
| 1233 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 1234 | <li class="listitem"><p> |
| 1235 | To know where the scope ends of POSIX objects that have not been |
| 1236 | destroyed explicitly. It is e.g. not required by the POSIX |
| 1237 | threads standard to call |
| 1238 | <code class="function">pthread_mutex_destroy</code> before freeing the |
| 1239 | memory in which a mutex object resides. |
| 1240 | </p></li> |
| 1241 | <li class="listitem"><p> |
| 1242 | To know where the scope of variables ends. If e.g. heap memory |
| 1243 | has been used by one thread, that thread frees that memory, and |
| 1244 | another thread allocates and starts using that memory, no data |
| 1245 | races must be reported for that memory. |
| 1246 | </p></li> |
| 1247 | </ul></div> |
| 1248 | <p> |
| 1249 | </p> |
| 1250 | <p> |
| 1251 | It is essential for correct operation of DRD that the tool knows about |
| 1252 | memory allocation and deallocation events. When analyzing a client program |
| 1253 | with DRD that uses a custom memory allocator, either instrument the custom |
| 1254 | memory allocator with the <code class="literal">VALGRIND_MALLOCLIKE_BLOCK</code> |
| 1255 | and <code class="literal">VALGRIND_FREELIKE_BLOCK</code> macros or disable the |
| 1256 | custom memory allocator. |
| 1257 | </p> |
| 1258 | <p> |
| 1259 | As an example, the GNU libstdc++ library can be configured |
| 1260 | to use standard memory allocation functions instead of memory pools by |
| 1261 | setting the environment variable |
| 1262 | <code class="literal">GLIBCXX_FORCE_NEW</code>. For more information, see also |
| 1263 | the <a class="ulink" href="http://gcc.gnu.org/onlinedocs/libstdc++/manual/bk01pt04ch11.html" target="_top">libstdc++ |
| 1264 | manual</a>. |
| 1265 | </p> |
| 1266 | </div> |
| 1267 | <div class="sect2"> |
| 1268 | <div class="titlepage"><div><div><h3 class="title"> |
| 1269 | <a name="drd-manual.drd-versus-memcheck"></a>8.2.11. DRD Versus Memcheck</h3></div></div></div> |
| 1270 | <p> |
| 1271 | It is essential for correct operation of DRD that there are no memory |
| 1272 | errors such as dangling pointers in the client program. Which means that |
| 1273 | it is a good idea to make sure that your program is Memcheck-clean |
| 1274 | before you analyze it with DRD. It is possible however that some of |
| 1275 | the Memcheck reports are caused by data races. In this case it makes |
| 1276 | sense to run DRD before Memcheck. |
| 1277 | </p> |
| 1278 | <p> |
| 1279 | So which tool should be run first? In case both DRD and Memcheck |
| 1280 | complain about a program, a possible approach is to run both tools |
| 1281 | alternatingly and to fix as many errors as possible after each run of |
| 1282 | each tool until none of the two tools prints any more error messages. |
| 1283 | </p> |
| 1284 | </div> |
| 1285 | <div class="sect2"> |
| 1286 | <div class="titlepage"><div><div><h3 class="title"> |
| 1287 | <a name="drd-manual.resource-requirements"></a>8.2.12. Resource Requirements</h3></div></div></div> |
| 1288 | <p> |
| 1289 | The requirements of DRD with regard to heap and stack memory and the |
| 1290 | effect on the execution time of client programs are as follows: |
| 1291 | </p> |
| 1292 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 1293 | <li class="listitem"><p> |
| 1294 | When running a program under DRD with default DRD options, |
| 1295 | between 1.1 and 3.6 times more memory will be needed compared to |
| 1296 | a native run of the client program. More memory will be needed |
| 1297 | if loading debug information has been enabled |
| 1298 | (<code class="literal">--read-var-info=yes</code>). |
| 1299 | </p></li> |
| 1300 | <li class="listitem"><p> |
| 1301 | DRD allocates some of its temporary data structures on the stack |
| 1302 | of the client program threads. This amount of data is limited to |
| 1303 | 1 - 2 KB. Make sure that thread stacks are sufficiently large. |
| 1304 | </p></li> |
| 1305 | <li class="listitem"><p> |
| 1306 | Most applications will run between 20 and 50 times slower under |
| 1307 | DRD than a native single-threaded run. The slowdown will be most |
| 1308 | noticeable for applications which perform frequent mutex lock / |
| 1309 | unlock operations. |
| 1310 | </p></li> |
| 1311 | </ul></div> |
| 1312 | <p> |
| 1313 | </p> |
| 1314 | </div> |
| 1315 | <div class="sect2"> |
| 1316 | <div class="titlepage"><div><div><h3 class="title"> |
| 1317 | <a name="drd-manual.effective-use"></a>8.2.13. Hints and Tips for Effective Use of DRD</h3></div></div></div> |
| 1318 | <p> |
| 1319 | The following information may be helpful when using DRD: |
| 1320 | </p> |
| 1321 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 1322 | <li class="listitem"><p> |
| 1323 | Make sure that debug information is present in the executable |
| 1324 | being analyzed, such that DRD can print function name and line |
| 1325 | number information in stack traces. Most compilers can be told |
| 1326 | to include debug information via compiler option |
| 1327 | <code class="option">-g</code>. |
| 1328 | </p></li> |
| 1329 | <li class="listitem"><p> |
| 1330 | Compile with option <code class="option">-O1</code> instead of |
| 1331 | <code class="option">-O0</code>. This will reduce the amount of generated |
| 1332 | code, may reduce the amount of debug info and will speed up |
| 1333 | DRD's processing of the client program. For more information, |
| 1334 | see also <a class="xref" href="manual-core.html#manual-core.started" title="2.2. Getting started">Getting started</a>. |
| 1335 | </p></li> |
| 1336 | <li class="listitem"><p> |
| 1337 | If DRD reports any errors on libraries that are part of your |
| 1338 | Linux distribution like e.g. <code class="literal">libc.so</code> or |
| 1339 | <code class="literal">libstdc++.so</code>, installing the debug packages |
| 1340 | for these libraries will make the output of DRD a lot more |
| 1341 | detailed. |
| 1342 | </p></li> |
| 1343 | <li class="listitem"> |
| 1344 | <p> |
| 1345 | When using C++, do not send output from more than one thread to |
| 1346 | <code class="literal">std::cout</code>. Doing so would not only |
| 1347 | generate multiple data race reports, it could also result in |
| 1348 | output from several threads getting mixed up. Either use |
| 1349 | <code class="function">printf</code> or do the following: |
| 1350 | </p> |
| 1351 | <div class="orderedlist"><ol class="orderedlist" type="1"> |
| 1352 | <li class="listitem"><p>Derive a class from <code class="literal">std::ostreambuf</code> |
| 1353 | and let that class send output line by line to |
| 1354 | <code class="literal">stdout</code>. This will avoid that individual |
| 1355 | lines of text produced by different threads get mixed |
| 1356 | up.</p></li> |
| 1357 | <li class="listitem"><p>Create one instance of <code class="literal">std::ostream</code> |
| 1358 | for each thread. This makes stream formatting settings |
| 1359 | thread-local. Pass a per-thread instance of the class |
| 1360 | derived from <code class="literal">std::ostreambuf</code> to the |
| 1361 | constructor of each instance. </p></li> |
| 1362 | <li class="listitem"><p>Let each thread send its output to its own instance of |
| 1363 | <code class="literal">std::ostream</code> instead of |
| 1364 | <code class="literal">std::cout</code>.</p></li> |
| 1365 | </ol></div> |
| 1366 | <p> |
| 1367 | </p> |
| 1368 | </li> |
| 1369 | </ul></div> |
| 1370 | <p> |
| 1371 | </p> |
| 1372 | </div> |
| 1373 | </div> |
| 1374 | <div class="sect1"> |
| 1375 | <div class="titlepage"><div><div><h2 class="title" style="clear: both"> |
| 1376 | <a name="drd-manual.Pthreads"></a>8.3. Using the POSIX Threads API Effectively</h2></div></div></div> |
| 1377 | <div class="sect2"> |
| 1378 | <div class="titlepage"><div><div><h3 class="title"> |
| 1379 | <a name="drd-manual.mutex-types"></a>8.3.1. Mutex types</h3></div></div></div> |
| 1380 | <p> |
| 1381 | The Single UNIX Specification version two defines the following four |
| 1382 | mutex types (see also the documentation of <a class="ulink" href="http://www.opengroup.org/onlinepubs/007908799/xsh/pthread_mutexattr_settype.html" target="_top"><code class="function">pthread_mutexattr_settype</code></a>): |
| 1383 | </p> |
| 1384 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 1385 | <li class="listitem"><p> |
| 1386 | <span class="emphasis"><em>normal</em></span>, which means that no error checking |
| 1387 | is performed, and that the mutex is non-recursive. |
| 1388 | </p></li> |
| 1389 | <li class="listitem"><p> |
| 1390 | <span class="emphasis"><em>error checking</em></span>, which means that the mutex |
| 1391 | is non-recursive and that error checking is performed. |
| 1392 | </p></li> |
| 1393 | <li class="listitem"><p> |
| 1394 | <span class="emphasis"><em>recursive</em></span>, which means that a mutex may be |
| 1395 | locked recursively. |
| 1396 | </p></li> |
| 1397 | <li class="listitem"><p> |
| 1398 | <span class="emphasis"><em>default</em></span>, which means that error checking |
| 1399 | behavior is undefined, and that the behavior for recursive |
| 1400 | locking is also undefined. Or: portable code must neither |
| 1401 | trigger error conditions through the Pthreads API nor attempt to |
| 1402 | lock a mutex of default type recursively. |
| 1403 | </p></li> |
| 1404 | </ul></div> |
| 1405 | <p> |
| 1406 | </p> |
| 1407 | <p> |
| 1408 | In complex applications it is not always clear from beforehand which |
| 1409 | mutex will be locked recursively and which mutex will not be locked |
| 1410 | recursively. Attempts lock a non-recursive mutex recursively will |
| 1411 | result in race conditions that are very hard to find without a thread |
| 1412 | checking tool. So either use the error checking mutex type and |
| 1413 | consistently check the return value of Pthread API mutex calls, or use |
| 1414 | the recursive mutex type. |
| 1415 | </p> |
| 1416 | </div> |
| 1417 | <div class="sect2"> |
| 1418 | <div class="titlepage"><div><div><h3 class="title"> |
| 1419 | <a name="drd-manual.condvar"></a>8.3.2. Condition variables</h3></div></div></div> |
| 1420 | <p> |
| 1421 | A condition variable allows one thread to wake up one or more other |
| 1422 | threads. Condition variables are often used to notify one or more |
| 1423 | threads about state changes of shared data. Unfortunately it is very |
| 1424 | easy to introduce race conditions by using condition variables as the |
| 1425 | only means of state information propagation. A better approach is to |
| 1426 | let threads poll for changes of a state variable that is protected by |
| 1427 | a mutex, and to use condition variables only as a thread wakeup |
| 1428 | mechanism. See also the source file |
| 1429 | <code class="computeroutput">drd/tests/monitor_example.cpp</code> for an |
| 1430 | example of how to implement this concept in C++. The monitor concept |
| 1431 | used in this example is a well known and very useful concept -- see |
| 1432 | also Wikipedia for more information about the <a class="ulink" href="http://en.wikipedia.org/wiki/Monitor_(synchronization)" target="_top">monitor</a> |
| 1433 | concept. |
| 1434 | </p> |
| 1435 | </div> |
| 1436 | <div class="sect2"> |
| 1437 | <div class="titlepage"><div><div><h3 class="title"> |
| 1438 | <a name="drd-manual.pctw"></a>8.3.3. pthread_cond_timedwait and timeouts</h3></div></div></div> |
| 1439 | <p> |
| 1440 | Historically the function |
| 1441 | <code class="function">pthread_cond_timedwait</code> only allowed the |
| 1442 | specification of an absolute timeout, that is a timeout independent of |
| 1443 | the time when this function was called. However, almost every call to |
| 1444 | this function expresses a relative timeout. This typically happens by |
| 1445 | passing the sum of |
| 1446 | <code class="computeroutput">clock_gettime(CLOCK_REALTIME)</code> and a |
| 1447 | relative timeout as the third argument. This approach is incorrect |
| 1448 | since forward or backward clock adjustments by e.g. ntpd will affect |
| 1449 | the timeout. A more reliable approach is as follows: |
| 1450 | </p> |
| 1451 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 1452 | <li class="listitem"><p> |
| 1453 | When initializing a condition variable through |
| 1454 | <code class="function">pthread_cond_init</code>, specify that the timeout of |
| 1455 | <code class="function">pthread_cond_timedwait</code> will use the clock |
| 1456 | <code class="literal">CLOCK_MONOTONIC</code> instead of |
| 1457 | <code class="literal">CLOCK_REALTIME</code>. You can do this via |
| 1458 | <code class="computeroutput">pthread_condattr_setclock(..., |
| 1459 | CLOCK_MONOTONIC)</code>. |
| 1460 | </p></li> |
| 1461 | <li class="listitem"><p> |
| 1462 | When calling <code class="function">pthread_cond_timedwait</code>, pass |
| 1463 | the sum of |
| 1464 | <code class="computeroutput">clock_gettime(CLOCK_MONOTONIC)</code> |
| 1465 | and a relative timeout as the third argument. |
| 1466 | </p></li> |
| 1467 | </ul></div> |
| 1468 | <p> |
| 1469 | See also |
| 1470 | <code class="computeroutput">drd/tests/monitor_example.cpp</code> for an |
| 1471 | example. |
| 1472 | </p> |
| 1473 | </div> |
| 1474 | </div> |
| 1475 | <div class="sect1"> |
| 1476 | <div class="titlepage"><div><div><h2 class="title" style="clear: both"> |
| 1477 | <a name="drd-manual.limitations"></a>8.4. Limitations</h2></div></div></div> |
| 1478 | <p>DRD currently has the following limitations:</p> |
| 1479 | <div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "> |
| 1480 | <li class="listitem"><p> |
| 1481 | DRD, just like Memcheck, will refuse to start on Linux |
| 1482 | distributions where all symbol information has been removed from |
| 1483 | <code class="filename">ld.so</code>. This is e.g. the case for the PPC editions |
| 1484 | of openSUSE and Gentoo. You will have to install the glibc debuginfo |
| 1485 | package on these platforms before you can use DRD. See also openSUSE |
| 1486 | bug <a class="ulink" href="http://bugzilla.novell.com/show_bug.cgi?id=396197" target="_top"> |
| 1487 | 396197</a> and Gentoo bug <a class="ulink" href="http://bugs.gentoo.org/214065" target="_top">214065</a>. |
| 1488 | </p></li> |
| 1489 | <li class="listitem"><p> |
| 1490 | With gcc 4.4.3 and before, DRD may report data races on the C++ |
| 1491 | class <code class="literal">std::string</code> in a multithreaded program. This is |
| 1492 | a know <code class="literal">libstdc++</code> issue -- see also GCC bug |
| 1493 | <a class="ulink" href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40518" target="_top">40518</a> |
| 1494 | for more information. |
| 1495 | </p></li> |
| 1496 | <li class="listitem"><p> |
| 1497 | If you compile the DRD source code yourself, you need GCC 3.0 or |
| 1498 | later. GCC 2.95 is not supported. |
| 1499 | </p></li> |
| 1500 | <li class="listitem"><p> |
| 1501 | Of the two POSIX threads implementations for Linux, only the |
| 1502 | NPTL (Native POSIX Thread Library) is supported. The older |
| 1503 | LinuxThreads library is not supported. |
| 1504 | </p></li> |
| 1505 | </ul></div> |
| 1506 | </div> |
| 1507 | <div class="sect1"> |
| 1508 | <div class="titlepage"><div><div><h2 class="title" style="clear: both"> |
| 1509 | <a name="drd-manual.feedback"></a>8.5. Feedback</h2></div></div></div> |
| 1510 | <p> |
| 1511 | If you have any comments, suggestions, feedback or bug reports about |
| 1512 | DRD, feel free to either post a message on the Valgrind users mailing |
| 1513 | list or to file a bug report. See also <a class="ulink" href="http://www.valgrind.org/" target="_top">http://www.valgrind.org/</a> for more information. |
| 1514 | </p> |
| 1515 | </div> |
| 1516 | </div> |
| 1517 | <div> |
| 1518 | <br><table class="nav" width="100%" cellspacing="3" cellpadding="2" border="0" summary="Navigation footer"> |
| 1519 | <tr> |
| 1520 | <td rowspan="2" width="40%" align="left"> |
| 1521 | <a accesskey="p" href="hg-manual.html"><< 7. Helgrind: a thread error detector</a> </td> |
| 1522 | <td width="20%" align="center"><a accesskey="u" href="manual.html">Up</a></td> |
| 1523 | <td rowspan="2" width="40%" align="right"> <a accesskey="n" href="ms-manual.html">9. Massif: a heap profiler >></a> |
| 1524 | </td> |
| 1525 | </tr> |
| 1526 | <tr><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td></tr> |
| 1527 | </table> |
| 1528 | </div> |
| 1529 | </body> |
| 1530 | </html> |