<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">


<chapter id="mc-manual" xreflabel="Memcheck: a memory error detector">
<title>Memcheck: a memory error detector</title>

<para>To use this tool, you may specify <option>--tool=memcheck</option>
on the Valgrind command line. You don't have to, though, since Memcheck
is the default tool.</para>


<sect1 id="mc-manual.overview" xreflabel="Overview">
<title>Overview</title>

<para>Memcheck is a memory error detector. It can detect the following
problems that are common in C and C++ programs.</para>

<itemizedlist>
  <listitem>
    <para>Accessing memory you shouldn't, e.g. overrunning and underrunning
    heap blocks, overrunning the top of the stack, and accessing memory after
    it has been freed.</para>
  </listitem>

  <listitem>
    <para>Using undefined values, i.e. values that have not been initialised,
    or that have been derived from other undefined values.</para>
  </listitem>

  <listitem>
    <para>Incorrect freeing of heap memory, such as double-freeing heap
    blocks, or mismatched use of
    <function>malloc</function>/<computeroutput>new</computeroutput>/<computeroutput>new[]</computeroutput>
    versus
    <function>free</function>/<computeroutput>delete</computeroutput>/<computeroutput>delete[]</computeroutput>.</para>
  </listitem>

  <listitem>
    <para>Overlapping <computeroutput>src</computeroutput> and
    <computeroutput>dst</computeroutput> pointers in
    <computeroutput>memcpy</computeroutput> and related
    functions.</para>
  </listitem>

  <listitem>
    <para>Memory leaks.</para>
  </listitem>
</itemizedlist>

<para>Problems like these can be difficult to find by other means,
often remaining undetected for long periods, then causing occasional,
difficult-to-diagnose crashes.</para>

</sect1>



<sect1 id="mc-manual.errormsgs"
       xreflabel="Explanation of error messages from Memcheck">
<title>Explanation of error messages from Memcheck</title>

<para>Memcheck issues a range of error messages. This section presents a
quick summary of what error messages mean. The precise behaviour of the
error-checking machinery is described in <xref
linkend="mc-manual.machine"/>.</para>


<sect2 id="mc-manual.badrw"
       xreflabel="Illegal read / Illegal write errors">
<title>Illegal read / Illegal write errors</title>

<para>For example:</para>
<programlisting><![CDATA[
Invalid read of size 4
   at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9)
   by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9)
   by 0x40B07FF4: read_png_image(QImageIO *) (kernel/qpngio.cpp:326)
   by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621)
 Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd
]]></programlisting>

<para>This happens when your program reads or writes memory at a place
which Memcheck reckons it shouldn't. In this example, the program did a
4-byte read at address 0xBFFFF0E0, somewhere within the system-supplied
library libpng.so.2.1.0.9, which was called from somewhere else in the
same library, called from line 326 of <filename>qpngio.cpp</filename>,
and so on.</para>

<para>Memcheck tries to establish what the illegal address might relate
to, since that's often useful. So, if it points into a block of memory
which has already been freed, you'll be informed of this, and also where
the block was freed. Likewise, if it should turn out to be just off
the end of a heap block, a common result of off-by-one errors in
array subscripting, you'll be informed of this fact, and also where the
block was allocated. If you use the <option><xref
linkend="opt.read-var-info"/></option> option, Memcheck will run more slowly
but may give a more detailed description of any illegal address.</para>

<para>In this example, Memcheck can't identify the address. Actually
the address is on the stack, but, for some reason, this is not a valid
stack address -- it is below the stack pointer and that isn't allowed.
In this particular case it's probably caused by GCC generating invalid
code, a known bug in some ancient versions of GCC.</para>

<para>Note that Memcheck only tells you that your program is about to
access memory at an illegal address. It can't stop the access from
happening. So, if your program makes an access which normally would
result in a segmentation fault, your program will still suffer the same
fate -- but you will get a message from Memcheck immediately prior to
this. In this particular example, reading junk on the stack is
non-fatal, and the program stays alive.</para>
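
<para>The heap off-by-one case mentioned above can be sketched as follows.
This is only an illustration: <function>make_sequence</function> is a
hypothetical helper, not part of Valgrind. The loop bound is written
correctly here; changing it to <computeroutput>i &lt;= n</computeroutput>
would store one element just past the end of the block, which is the kind of
access Memcheck would be expected to report as an invalid write at an address
just after a block.</para>
<programlisting><![CDATA[
#include <stdlib.h>

/* Hypothetical helper: returns a heap array holding 0, 1, ..., n-1,
   or NULL if n is 0 or the allocation fails.  The caller frees it. */
int *make_sequence(size_t n)
{
    if (n == 0)
        return NULL;
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    /* The bound must be i < n.  Writing i <= n here would store one int
       just past the end of the block. */
    for (size_t i = 0; i < n; i++)
        a[i] = (int)i;
    return a;
}
]]></programlisting>
<para>A caller would use it as
<computeroutput>int *a = make_sequence(10);</computeroutput> and later call
<function>free</function> on the result.</para>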

</sect2>



<sect2 id="mc-manual.uninitvals"
       xreflabel="Use of uninitialised values">
<title>Use of uninitialised values</title>

<para>For example:</para>
<programlisting><![CDATA[
Conditional jump or move depends on uninitialised value(s)
   at 0x402DFA94: _IO_vfprintf (_itoa.h:49)
   by 0x402E8476: _IO_printf (printf.c:36)
   by 0x8048472: main (tests/manuel1.c:8)
]]></programlisting>

<para>An uninitialised-value use error is reported when your program
uses a value which hasn't been initialised -- in other words, is
undefined. Here, the undefined value is used somewhere inside the
<function>printf</function> machinery of the C library. This error was
reported when running the following small program:</para>
<programlisting><![CDATA[
#include <stdio.h>

int main()
{
  int x;                   /* deliberately left uninitialised */
  printf ("x = %d\n", x);
  return 0;
}]]></programlisting>

<para>It is important to understand that your program can copy around
junk (uninitialised) data as much as it likes. Memcheck observes this
and keeps track of the data, but does not complain. A complaint is
issued only when your program attempts to make use of uninitialised
data in a way that might affect your program's externally-visible behaviour.
In this example, <varname>x</varname> is uninitialised. Memcheck observes
the value being passed to <function>_IO_printf</function> and thence to
<function>_IO_vfprintf</function>, but makes no comment. However,
<function>_IO_vfprintf</function> has to examine the value of
<varname>x</varname> so it can turn it into the corresponding ASCII string,
and it is at this point that Memcheck complains.</para>

<para>Sources of uninitialised data tend to be:</para>
<itemizedlist>
  <listitem>
    <para>Local variables in procedures which have not been initialised,
    as in the example above.</para>
  </listitem>
  <listitem>
    <para>The contents of heap blocks (allocated with
    <function>malloc</function>, <function>new</function>, or a similar
    function) before you (or a constructor) write something there.
    </para>
  </listitem>
</itemizedlist>

<para>To see information on the sources of uninitialised data in your
program, use the <option>--track-origins=yes</option> option. This
makes Memcheck run more slowly, but can make it much easier to track down
the root causes of uninitialised value errors.</para>
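
<para>For the heap-block source of uninitialised data, a minimal sketch of
the usual remedy is shown below. The <computeroutput>struct
reading</computeroutput> type and both functions are hypothetical names
invented for this illustration. The point is that the block is given defined
contents before any code branches on one of its fields.</para>
<programlisting><![CDATA[
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type used only for this illustration. */
struct reading {
    int sensor_id;
    int value;
};

/* malloc leaves the block's contents undefined, so every byte is
   written before the caller can branch on any field. */
struct reading *new_reading(int sensor_id)
{
    struct reading *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    memset(r, 0, sizeof *r);   /* define every byte, padding included */
    r->sensor_id = sensor_id;
    return r;
}

/* Branches on r->value.  Without the memset above, this comparison is
   the kind of use that Memcheck would be expected to flag. */
int is_alarm(const struct reading *r)
{
    return r->value > 100;
}
]]></programlisting>
<para>Using <function>calloc</function> instead of
<function>malloc</function> plus <function>memset</function> would have the
same effect here.</para>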

</sect2>



<sect2 id="mc-manual.bad-syscall-args"
       xreflabel="Use of uninitialised or unaddressable values in system
       calls">
<title>Use of uninitialised or unaddressable values in system
       calls</title>

<para>Memcheck checks all parameters to system calls:
<itemizedlist>
  <listitem>
    <para>It checks all the direct parameters themselves, whether they are
    initialised.</para>
  </listitem>
  <listitem>
    <para>Also, if a system call needs to read from a buffer provided by
    your program, Memcheck checks that the entire buffer is addressable
    and its contents are initialised.</para>
  </listitem>
  <listitem>
    <para>Also, if the system call needs to write to a user-supplied
    buffer, Memcheck checks that the buffer is addressable.</para>
  </listitem>
</itemizedlist>
</para>

<para>After the system call, Memcheck updates its tracked information to
precisely reflect any changes in memory state caused by the system
call.</para>

<para>Here's an example of two system calls with invalid parameters:</para>
<programlisting><![CDATA[
  #include <stdlib.h>
  #include <unistd.h>
  int main( void )
  {
    char* arr  = malloc(10);
    int*  arr2 = malloc(sizeof(int));
    write( 1 /* stdout */, arr, 10 );
    exit(arr2[0]);
  }
]]></programlisting>

<para>You get these complaints ...</para>
<programlisting><![CDATA[
  Syscall param write(buf) points to uninitialised byte(s)
     at 0x25A48723: __write_nocancel (in /lib/tls/libc-2.3.3.so)
     by 0x259AFAD3: __libc_start_main (in /lib/tls/libc-2.3.3.so)
     by 0x8048348: (within /auto/homes/njn25/grind/head4/a.out)
   Address 0x25AB8028 is 0 bytes inside a block of size 10 alloc'd
     at 0x259852B0: malloc (vg_replace_malloc.c:130)
     by 0x80483F1: main (a.c:5)

  Syscall param exit(error_code) contains uninitialised byte(s)
     at 0x25A21B44: __GI__exit (in /lib/tls/libc-2.3.3.so)
     by 0x8048426: main (a.c:8)
]]></programlisting>

<para>... because the program has (a) written uninitialised junk
from the heap block to the standard output, and (b) passed an
uninitialised value to <function>exit</function>. Note that the first
error refers to the memory pointed to by
<computeroutput>buf</computeroutput> (not
<computeroutput>buf</computeroutput> itself), but the second error
refers directly to <computeroutput>exit</computeroutput>'s argument
<computeroutput>arr2[0]</computeroutput>.</para>
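
<para>One way to avoid the first complaint is to give the buffer defined
contents before handing it to <function>write</function>. The sketch below
assumes a POSIX system; <function>write_filled</function> is a hypothetical
helper written for this illustration.</para>
<programlisting><![CDATA[
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: allocates len bytes, fills them with a known
   value, and writes them to fd.  Returns the number of bytes written,
   or -1 on allocation or write failure. */
ssize_t write_filled(int fd, size_t len, char fill)
{
    char *buf = malloc(len);
    if (buf == NULL)
        return -1;
    memset(buf, fill, len);           /* every byte is now defined */
    ssize_t n = write(fd, buf, len);  /* buffer is addressable and defined */
    free(buf);
    return n;
}
]]></programlisting>
<para>The second complaint is avoided in the same spirit: store a value in
<computeroutput>arr2[0]</computeroutput> before passing it to
<function>exit</function>.</para>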

</sect2>


<sect2 id="mc-manual.badfrees" xreflabel="Illegal frees">
<title>Illegal frees</title>

<para>For example:</para>
<programlisting><![CDATA[
Invalid free()
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
 Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
]]></programlisting>

<para>Memcheck keeps track of the blocks allocated by your program
with <function>malloc</function>/<computeroutput>new</computeroutput>,
so it knows exactly whether the argument to
<function>free</function>/<computeroutput>delete</computeroutput> is
legitimate. Here, this test program has freed the same block
twice. As with the illegal read/write errors, Memcheck attempts to
make sense of the address freed. If, as here, the address is one
which has previously been freed, you will be told that -- making
duplicate frees of the same block easy to spot. You will also get this
message if you try to free a pointer that doesn't point to the start of a
heap block.</para>
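
<para>A common defensive idiom against duplicate frees is to clear the
pointer as soon as the block is released. The helper below is a hypothetical
sketch of that idiom, not something Memcheck provides. It relies only on the
C library guarantee that <computeroutput>free(NULL)</computeroutput> does
nothing.</para>
<programlisting><![CDATA[
#include <stdlib.h>

/* Hypothetical helper: frees *p and clears the caller's pointer, so a
   second call is a harmless free(NULL) instead of a double free. */
void release(char **p)
{
    free(*p);      /* free(NULL) is defined to do nothing */
    *p = NULL;
}
]]></programlisting>
<para>This protects only the one pointer variable passed in. Any other copy
of the old address is still dangling, and freeing it would still be reported
by Memcheck.</para>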

</sect2>


<sect2 id="mc-manual.rudefn"
       xreflabel="When a heap block is freed with an inappropriate deallocation
function">
<title>When a heap block is freed with an inappropriate deallocation
function</title>

<para>In the following example, a block allocated with
<function>new[]</function> has wrongly been deallocated with
<function>free</function>:</para>
<programlisting><![CDATA[
Mismatched free() / delete / delete []
   at 0x40043249: free (vg_clientfuncs.c:171)
   by 0x4102BB4E: QGArray::~QGArray(void) (tools/qgarray.cpp:149)
   by 0x4C261C41: PptDoc::~PptDoc(void) (include/qmemarray.h:60)
   by 0x4C261F0E: PptXml::~PptXml(void) (pptxml.cc:44)
 Address 0x4BB292A8 is 0 bytes inside a block of size 64 alloc'd
   at 0x4004318C: operator new[](unsigned int) (vg_clientfuncs.c:152)
   by 0x4C21BC15: KLaola::readSBStream(int) const (klaola.cc:314)
   by 0x4C21C155: KLaola::stream(KLaola::OLENode const *) (klaola.cc:416)
   by 0x4C21788F: OLEFilter::convert(QCString const &) (olefilter.cc:272)
]]></programlisting>

<para>In <literal>C++</literal> it's important to deallocate memory in a
way compatible with how it was allocated. The deal is:</para>
<itemizedlist>
  <listitem>
    <para>If allocated with
    <function>malloc</function>,
    <function>calloc</function>,
    <function>realloc</function>,
    <function>valloc</function> or
    <function>memalign</function>, you must
    deallocate with <function>free</function>.</para>
  </listitem>
  <listitem>
    <para>If allocated with <function>new</function>, you must deallocate
    with <function>delete</function>.</para>
  </listitem>
  <listitem>
    <para>If allocated with <function>new[]</function>, you must
    deallocate with <function>delete[]</function>.</para>
  </listitem>
</itemizedlist>

<para>The worst thing is that on Linux apparently it doesn't matter if
you do mix these up, but the same program may then crash on a
different platform, Solaris for example. So it's best to fix it
properly. According to the KDE folks "it's amazing how many C++
programmers don't know this".</para>

<para>The reason behind the requirement is as follows. In some C++
implementations, <function>delete[]</function> must be used for
objects allocated by <function>new[]</function> because the compiler
stores the size of the array and the pointer-to-member to the
destructor of the array's content just before the pointer actually
returned. <function>delete</function> doesn't account for this and will get
confused, possibly corrupting the heap.</para>

</sect2>



<sect2 id="mc-manual.overlap"
       xreflabel="Overlapping source and destination blocks">
<title>Overlapping source and destination blocks</title>

<para>The following C library functions copy some data from one
memory block to another (or something similar):
<function>memcpy</function>,
<function>strcpy</function>,
<function>strncpy</function>,
<function>strcat</function>,
<function>strncat</function>.
The blocks pointed to by their <computeroutput>src</computeroutput> and
<computeroutput>dst</computeroutput> pointers aren't allowed to overlap.
The POSIX standards have wording along the lines of "If copying takes place
between objects that overlap, the behavior is undefined." Therefore,
Memcheck checks for this.
</para>

<para>For example:</para>
<programlisting><![CDATA[
==27492== Source and destination overlap in memcpy(0xbffff294, 0xbffff280, 21)
==27492==    at 0x40026CDC: memcpy (mc_replace_strmem.c:71)
==27492==    by 0x804865A: main (overlap.c:40)
]]></programlisting>

<para>You don't want the two blocks to overlap because one of them could
get partially overwritten by the copying.</para>
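
<para>What "overlap" means for two equal-length ranges can be stated as a
small predicate. The function below is a hypothetical sketch for
illustration and is not taken from Memcheck's sources. Applied to the report
above, the two addresses are 20 bytes apart and the length is 21, so the
ranges share one byte.</para>
<programlisting><![CDATA[
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the len-byte ranges starting at dst and src share at
   least one byte, 0 otherwise.  Addresses are compared as integers. */
int ranges_overlap(const void *dst, const void *src, size_t len)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    if (len == 0)
        return 0;
    if (d < s)
        return s - d < len;
    return d - s < len;
}
]]></programlisting>
<para>The addresses are converted to <computeroutput>uintptr_t</computeroutput>
first because comparing unrelated pointers directly is not well defined in
C.</para>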

<para>You might think that Memcheck is being overly pedantic reporting
this in the case where <computeroutput>dst</computeroutput> is less than
<computeroutput>src</computeroutput>. For example, the obvious way to
implement <function>memcpy</function> is by copying from the first
byte to the last. However, the optimisation guides of some
architectures recommend copying from the last byte down to the first.
Also, some implementations of <function>memcpy</function> zero
<computeroutput>dst</computeroutput> before copying, because zeroing the
destination's cache line(s) can improve performance.</para>

<para>The moral of the story is: if you want to write truly portable
code, don't make any assumptions about the language
implementation.</para>
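
<para>When source and destination may legitimately overlap, the portable
choice is <function>memmove</function>, which the C standard defines for
overlapping ranges. A minimal sketch, using a hypothetical helper
<function>drop_prefix</function>:</para>
<programlisting><![CDATA[
#include <string.h>

/* Hypothetical helper: removes the first n bytes of a len-byte buffer
   by sliding the remainder down to the start.  The source and
   destination ranges overlap whenever n < len - n, so memmove is used
   rather than memcpy. */
void drop_prefix(char *buf, size_t len, size_t n)
{
    if (n >= len)
        return;
    memmove(buf, buf + n, len - n);
}
]]></programlisting>
<para>For a string, pass a length that includes the terminating NUL so that
it is moved along with the remaining characters.</para>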

</sect2>


<sect2 id="mc-manual.leaks" xreflabel="Memory leak detection">
<title>Memory leak detection</title>

<para>Memcheck keeps track of all heap blocks issued in response to
calls to
<function>malloc</function>/<function>new</function> et al.
So when the program exits, it knows which blocks have not been freed.
</para>

<para>If <option>--leak-check</option> is set appropriately, for each
remaining block, Memcheck determines if the block is reachable from pointers
within the root-set. The root-set consists of (a) general purpose registers
of all threads, and (b) initialised, aligned, pointer-sized data words in
accessible client memory, including stacks.</para>

<para>There are two ways a block can be reached. The first is with a
"start-pointer", i.e. a pointer to the start of the block. The second is with
an "interior-pointer", i.e. a pointer to the middle of the block. There are
three ways we know of that an interior-pointer can occur:</para>

<itemizedlist>
  <listitem>
    <para>The pointer might have originally been a start-pointer and have been
    moved along deliberately (or not deliberately) by the program.</para>
  </listitem>

  <listitem>
    <para>It might be a random junk value in memory, entirely unrelated, just
    a coincidence.</para>
  </listitem>

  <listitem>
    <para>It might be a pointer to an array of C++ objects (which possess
    destructors) allocated with <computeroutput>new[]</computeroutput>. In
    this case, some compilers store a "magic cookie" containing the array
    length at the start of the allocated block, and return a pointer to just
    past that magic cookie, i.e. an interior-pointer.
    See <ulink url="http://theory.uwinnipeg.ca/gnu/gcc/gxxint_14.html">this
    page</ulink> for more information.</para>
  </listitem>
</itemizedlist>
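
<para>The first of these, a start-pointer moved along deliberately, is easy
to produce in plain C. The sketch below is a hypothetical length-prefixed
buffer (the <computeroutput>lp_</computeroutput> names are invented for this
illustration) in which callers only ever hold a pointer to the bytes after a
header, much like the C++ "magic cookie" layout described above.</para>
<programlisting><![CDATA[
#include <stdlib.h>

/* Hypothetical length-prefixed buffer.  The block starts with a size_t
   header; callers only ever hold a pointer to the bytes after it, which
   is an interior-pointer from the allocator's point of view. */
char *lp_alloc(size_t n)
{
    size_t *block = malloc(sizeof(size_t) + n);
    if (block == NULL)
        return NULL;
    block[0] = n;                 /* header: payload length */
    return (char *)(block + 1);   /* interior-pointer to the payload */
}

/* Reads the length stored in the header just before the payload. */
size_t lp_len(const char *payload)
{
    return ((const size_t *)payload)[-1];
}

/* Steps back to the start-pointer, which is what free() requires. */
void lp_free(char *payload)
{
    if (payload != NULL)
        free((size_t *)payload - 1);
}
]]></programlisting>
<para>If such a buffer is still live at exit, the only pointers to its block
are interior-pointers, so a leak checker cannot tell it apart from a block
that merely happens to be pointed into by junk.</para>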

<para>With that in mind, consider the nine possible cases described by the
following figure.</para>

<programlisting><![CDATA[
     Pointer chain            AAA Category    BBB Category
     -------------            ------------    ------------
(1)  RRR ------------> BBB                    DR
(2)  RRR ---> AAA ---> BBB    DR              IR
(3)  RRR               BBB                    DL
(4)  RRR      AAA ---> BBB    DL              IL
(5)  RRR ------?-----> BBB                    (y)DR, (n)DL
(6)  RRR ---> AAA -?-> BBB    DR              (y)IR, (n)DL
(7)  RRR -?-> AAA ---> BBB    (y)DR, (n)DL    (y)IR, (n)IL
(8)  RRR -?-> AAA -?-> BBB    (y)DR, (n)DL    (y,y)IR, (n,y)IL, (_,n)DL
(9)  RRR      AAA -?-> BBB    DL              (y)IL, (n)DL

Pointer chain legend:
- RRR: a root set node or DR block
- AAA, BBB: heap blocks
- --->: a start-pointer
- -?->: an interior-pointer

Category legend:
- DR: Directly reachable
- IR: Indirectly reachable
- DL: Directly lost
- IL: Indirectly lost
- (y)XY: it's XY if the interior-pointer is a real pointer
- (n)XY: it's XY if the interior-pointer is not a real pointer
- (_)XY: it's XY in either case
]]></programlisting>

<para>Every possible case can be reduced to one of the above nine. Memcheck
merges some of these cases in its output, resulting in the following four
categories.</para>


<itemizedlist>

  <listitem>
    <para>"Still reachable". This covers cases 1 and 2 (for the BBB blocks)
    above. A start-pointer or chain of start-pointers to the block is
    found. Since the block is still pointed at, the programmer could, at
    least in principle, have freed it before program exit. Because these
    are very common and arguably not a problem, Memcheck won't report such
    blocks individually unless <option>--show-reachable=yes</option> is
    specified.</para>
  </listitem>

  <listitem>
    <para>"Definitely lost". This covers case 3 (for the BBB blocks) above.
    This means that no pointer to the block can be found. The block is
    classified as "lost", because the programmer could not possibly have
    freed it at program exit, since no pointer to it exists. This is likely
    a symptom of having lost the pointer at some earlier point in the
    program. Such cases should be fixed by the programmer.</para>
  </listitem>

  <listitem>
    <para>"Indirectly lost". This covers cases 4 and 9 (for the BBB blocks)
    above. This means that the block is lost, not because there are no
    pointers to it, but rather because all the blocks that point to it are
    themselves lost. For example, if you have a binary tree and the root
    node is lost, all its children nodes will be indirectly lost. Because
    the problem will disappear if the definitely lost block that caused the
    indirect leak is fixed, Memcheck won't report such blocks individually
    unless <option>--show-reachable=yes</option> is specified.</para>
  </listitem>

  <listitem>
    <para>"Possibly lost". This covers cases 5--8 (for the BBB blocks)
    above. This means that a chain of one or more pointers to the block has
    been found, but at least one of the pointers is an interior-pointer.
    This could just be a random value in memory that happens to point into a
    block, and so you shouldn't consider this ok unless you know you have
    interior-pointers.</para>
  </listitem>

</itemizedlist>

<para>(Note: This mapping of the nine possible cases onto four categories is
not necessarily the best way that leaks could be reported; in particular,
interior-pointers are treated inconsistently. It is possible the
categorisation may be improved in the future.)</para>

<para>Furthermore, if a suppression exists for a block, it will be reported
as "suppressed" no matter which of the above four categories it belongs
to.</para>


<para>The following is an example leak summary.</para>

<programlisting><![CDATA[
LEAK SUMMARY:
   definitely lost: 48 bytes in 3 blocks.
   indirectly lost: 32 bytes in 2 blocks.
     possibly lost: 96 bytes in 6 blocks.
   still reachable: 64 bytes in 4 blocks.
        suppressed: 0 bytes in 0 blocks.
]]></programlisting>

<para>If <option>--leak-check=full</option> is specified,
Memcheck will give details for each definitely lost or possibly lost block,
including where it was allocated. (Actually, it merges results for all
blocks that have the same category and sufficiently similar stack traces
into a single "loss record". The
<option>--leak-resolution</option> option lets you control the
meaning of "sufficiently similar".) It cannot tell you when or how or why
the pointer to a leaked block was lost; you have to work that out for
yourself. In general, you should attempt to ensure your programs do not
have any definitely lost or possibly lost blocks at exit.</para>

<para>For example:</para>
<programlisting><![CDATA[
8 bytes in 1 blocks are definitely lost in loss record 1 of 14
   at 0x........: malloc (vg_replace_malloc.c:...)
   by 0x........: mk (leak-tree.c:11)
   by 0x........: main (leak-tree.c:39)

88 (8 direct, 80 indirect) bytes in 1 blocks are definitely lost in loss record 13 of 14
   at 0x........: malloc (vg_replace_malloc.c:...)
   by 0x........: mk (leak-tree.c:11)
   by 0x........: main (leak-tree.c:25)
]]></programlisting>

<para>The first message describes a simple case of a single 8 byte block
that has been definitely lost. The second case mentions another 8 byte
block that has been definitely lost; the difference is that a further 80
bytes in other blocks are indirectly lost because of this lost block.
The loss records are not presented in any notable order, so the loss record
numbers aren't particularly meaningful.</para>
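
<para>The relationship between a definitely lost block and the blocks
indirectly lost because of it can be illustrated with a small tree. The code
below is a hypothetical sketch and is not the actual
<filename>leak-tree.c</filename> test; only the name
<function>mk</function> is borrowed from the trace above.</para>
<programlisting><![CDATA[
#include <stdlib.h>

/* Hypothetical binary tree node used only for this illustration. */
struct node {
    struct node *left;
    struct node *right;
};

/* Allocates a node that owns the two given subtrees (either may be NULL). */
struct node *mk(struct node *left, struct node *right)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->left = left;
    n->right = right;
    return n;
}

/* Counts the nodes reachable from root. */
size_t count_nodes(const struct node *root)
{
    if (root == NULL)
        return 0;
    return 1 + count_nodes(root->left) + count_nodes(root->right);
}

/* Frees children before the parent.  Dropping the last pointer to root
   without calling this would leave root "definitely lost" and every
   node below it "indirectly lost". */
void free_tree(struct node *root)
{
    if (root == NULL)
        return;
    free_tree(root->left);
    free_tree(root->right);
    free(root);
}
]]></programlisting>
<para>Fixing the one definitely lost root, by calling
<function>free_tree</function> before the pointer is discarded, also removes
all the indirect losses beneath it.</para>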

<para>If you specify <option>--show-reachable=yes</option>,
reachable and indirectly lost blocks will also be shown, as the following
two examples show.</para>

<programlisting><![CDATA[
64 bytes in 4 blocks are still reachable in loss record 2 of 4
   at 0x........: malloc (vg_replace_malloc.c:177)
   by 0x........: mk (leak-cases.c:52)
   by 0x........: main (leak-cases.c:74)

32 bytes in 2 blocks are indirectly lost in loss record 1 of 4
   at 0x........: malloc (vg_replace_malloc.c:177)
   by 0x........: mk (leak-cases.c:52)
   by 0x........: main (leak-cases.c:80)
]]></programlisting>

<para>Because there are different kinds of leaks with different severities, an
interesting question is this: which leaks should be counted as true "errors"
and which should not? The answer to this question affects the numbers printed
in the <computeroutput>ERROR SUMMARY</computeroutput> line, and also the effect
of the <option>--error-exitcode</option> option. Memcheck uses the following
criteria:</para>

<itemizedlist>
  <listitem>
    <para>First, a leak is only counted as a true "error" if
    <option>--leak-check=full</option> is specified. In other words, an
    unprinted leak is not considered a true "error". If this were not the
    case, it would be possible to get a high error count but not have any
    errors printed, which would be confusing.</para>
  </listitem>

  <listitem>
    <para>After that, definitely lost and possibly lost blocks are counted as
    true "errors". Indirectly lost and still reachable blocks are not counted
    as true "errors", even if <option>--show-reachable=yes</option> is
    specified and they are printed; this is because such blocks don't need
    direct fixing by the programmer.
    </para>
  </listitem>
</itemizedlist>
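
<para>The two criteria above can be restated as a small predicate. This is
only a paraphrase of the text in code form; the enum and function names are
hypothetical and do not come from Memcheck's sources.</para>
<programlisting><![CDATA[
#include <stdbool.h>

/* The four leak categories described in this section. */
enum leak_kind {
    DEFINITELY_LOST,
    INDIRECTLY_LOST,
    POSSIBLY_LOST,
    STILL_REACHABLE
};

/* A leak counts as an error only when --leak-check=full is in effect
   and the block is definitely or possibly lost. */
bool counts_as_error(enum leak_kind kind, bool leak_check_full)
{
    if (!leak_check_full)
        return false;
    return kind == DEFINITELY_LOST || kind == POSSIBLY_LOST;
}
]]></programlisting>
<para>Note that <option>--show-reachable</option> does not appear as a
parameter: it changes what is printed, not what is counted.</para>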
596
njnab5b7142005-08-16 02:20:17 +0000597</sect2>
598
njn3e986b22004-11-30 10:43:45 +0000599</sect1>
600
601
602
njna3311642009-08-10 01:29:14 +0000603<sect1 id="mc-manual.options"
604 xreflabel="Memcheck Command-Line Options">
605<title>Memcheck Command-Line Options</title>
njnc1abdcb2009-08-05 05:11:02 +0000606
607<!-- start of xi:include in the manpage -->
608<variablelist id="mc.opts.list">
609
610 <varlistentry id="opt.leak-check" xreflabel="--leak-check">
611 <term>
612 <option><![CDATA[--leak-check=<no|summary|yes|full> [default: summary] ]]></option>
613 </term>
614 <listitem>
615 <para>When enabled, search for memory leaks when the client
616 program finishes. If set to <varname>summary</varname>, it says how
617 many leaks occurred. If set to <varname>full</varname> or
618 <varname>yes</varname>, it also gives details of each individual
619 leak.</para>
620 </listitem>
621 </varlistentry>
622
bart3cedf572010-08-26 10:56:27 +0000623 <varlistentry id="opt.show-possibly-lost" xreflabel="--show-possibly-lost">
624 <term>
625 <option><![CDATA[--show-possibly-lost=<yes|no> [default: yes] ]]></option>
626 </term>
627 <listitem>
628 <para>When disabled, the memory leak detector will not show "possibly lost" blocks.
629 </para>
630 </listitem>
631 </varlistentry>
632
njnc1abdcb2009-08-05 05:11:02 +0000633 <varlistentry id="opt.leak-resolution" xreflabel="--leak-resolution">
634 <term>
635 <option><![CDATA[--leak-resolution=<low|med|high> [default: high] ]]></option>
636 </term>
637 <listitem>
638 <para>When doing leak checking, determines how willing
639 Memcheck is to consider different backtraces to
640 be the same for the purposes of merging multiple leaks into a single
641 leak report. When set to <varname>low</varname>, only the first
642 two entries need match. When <varname>med</varname>, four entries
643 have to match. When <varname>high</varname>, all entries need to
644 match.</para>
645
646 <para>For hardcore leak debugging, you probably want to use
647 <option>--leak-resolution=high</option> together with
648 <option>--num-callers=40</option> or some such large number.
649 </para>
650
651 <para>Note that the <option>--leak-resolution</option> setting
652 does not affect Memcheck's ability to find
653 leaks. It only changes how the results are presented.</para>
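    <para>As a rough illustration of the matching rule, the following C
    sketch compares two backtraces at each resolution. It is a toy model
    and not Memcheck's implementation: the names
    <function>same_backtrace</function> and <varname>enum resolution</varname>
    are hypothetical, and a backtrace is assumed to be an array of function
    names with the innermost frame first.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; these are not Memcheck's real data structures.
   The enum value is the number of entries compared (0 = all of them). */
enum resolution { RES_LOW = 2, RES_MED = 4, RES_HIGH = 0 };

/* Return 1 if two backtraces count as "the same" at the given
   resolution, and 0 otherwise. */
int same_backtrace(const char *const *a, size_t na,
                   const char *const *b, size_t nb,
                   enum resolution res)
{
    size_t limit = (size_t)res;   /* number of entries to compare */
    size_t i;

    assert(a != NULL && b != NULL);
    if (res == RES_HIGH) {
        if (na != nb)
            return 0;
        limit = na;
    }
    for (i = 0; i < limit; i++) {
        /* A trace that ends before the limit only matches a trace
           that ends at the same depth. */
        if (i >= na || i >= nb)
            return na == nb;
        if (strcmp(a[i], b[i]) != 0)
            return 0;
    }
    return 1;
}
```
]]></programlisting>
    <para>In this model, two traces that differ only in their third entry
    are merged at <varname>low</varname> but kept apart at
    <varname>med</varname> and <varname>high</varname>.</para>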
654 </listitem>
655 </varlistentry>
656
657 <varlistentry id="opt.show-reachable" xreflabel="--show-reachable">
658 <term>
659 <option><![CDATA[--show-reachable=<yes|no> [default: no] ]]></option>
660 </term>
661 <listitem>
662 <para>When disabled, the memory leak detector only shows "definitely
663 lost" and "possibly lost" blocks. When enabled, the leak detector also
664 shows "reachable" and "indirectly lost" blocks. (In other words, it
665 shows all blocks, except suppressed ones, so
666 <option>--show-all</option> would be a better name for
667 it.)</para>
668 </listitem>
669 </varlistentry>
670
671 <varlistentry id="opt.undef-value-errors" xreflabel="--undef-value-errors">
672 <term>
673 <option><![CDATA[--undef-value-errors=<yes|no> [default: yes] ]]></option>
674 </term>
675 <listitem>
    <para>Controls whether Memcheck reports
    uses of undefined values. Set this to
    <varname>no</varname> if you don't want to see undefined value
    errors. Doing so also has the side effect of speeding up
    Memcheck somewhat.
    </para>
682 </listitem>
683 </varlistentry>
684
685 <varlistentry id="opt.track-origins" xreflabel="--track-origins">
686 <term>
687 <option><![CDATA[--track-origins=<yes|no> [default: no] ]]></option>
688 </term>
689 <listitem>
690 <para>Controls whether Memcheck tracks
691 the origin of uninitialised values. By default, it does not,
692 which means that although it can tell you that an
693 uninitialised value is being used in a dangerous way, it
694 cannot tell you where the uninitialised value came from. This
695 often makes it difficult to track down the root problem.
696 </para>
697 <para>When set
698 to <varname>yes</varname>, Memcheck keeps
699 track of the origins of all uninitialised values. Then, when
700 an uninitialised value error is
701 reported, Memcheck will try to show the
702 origin of the value. An origin can be one of the following
703 four places: a heap block, a stack allocation, a client
    request, or miscellaneous other sources (e.g. a call
    to <varname>brk</varname>).
706 </para>
707 <para>For uninitialised values originating from a heap
708 block, Memcheck shows where the block was
709 allocated. For uninitialised values originating from a stack
710 allocation, Memcheck can tell you which
711 function allocated the value, but no more than that -- typically
712 it shows you the source location of the opening brace of the
713 function. So you should carefully check that all of the
714 function's local variables are initialised properly.
715 </para>
716 <para>Performance overhead: origin tracking is expensive. It
717 halves Memcheck's speed and increases
718 memory use by a minimum of 100MB, and possibly more.
719 Nevertheless it can drastically reduce the effort required to
720 identify the root cause of uninitialised value errors, and so
721 is often a programmer productivity win, despite running
722 more slowly.
723 </para>
724 <para>Accuracy: Memcheck tracks origins
725 quite accurately. To avoid very large space and time
726 overheads, some approximations are made. It is possible,
727 although unlikely, that Memcheck will report an incorrect origin, or
728 not be able to identify any origin.
729 </para>
730 <para>Note that the combination
731 <option>--track-origins=yes</option>
732 and <option>--undef-value-errors=no</option> is
733 nonsensical. Memcheck checks for and
734 rejects this combination at startup.
735 </para>
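    <para>A small C function of the kind this option helps with is sketched
    below. The name <function>check_flag</function> is hypothetical. When
    called with <varname>initialise</varname> equal to 0, the heap block is
    the origin of the uninitialised value and the
    <computeroutput>if</computeroutput> is where it is used, so Memcheck
    would be expected to report the error at the
    <computeroutput>if</computeroutput> and, with
    <option>--track-origins=yes</option>, to point back at the
    <function>malloc</function> call. An optimising compiler may rearrange
    the code, so the exact report can differ.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdlib.h>

/* Returns 1 or 0 depending on a heap-allocated flag, or -1 if the
   allocation fails.  When initialise is 0 the flag is never written,
   so the branch below depends on an uninitialised value. */
int check_flag(int initialise)
{
    int *flag;
    int result;

    assert(initialise == 0 || initialise == 1);
    flag = malloc(sizeof *flag);   /* the origin: this heap block */
    if (flag == NULL)
        return -1;
    if (initialise)
        *flag = 1;

    if (*flag != 0)                /* the use: a control flow decision */
        result = 1;
    else
        result = 0;

    free(flag);
    return result;
}
```
]]></programlisting>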
736 </listitem>
737 </varlistentry>
738
739 <varlistentry id="opt.partial-loads-ok" xreflabel="--partial-loads-ok">
740 <term>
741 <option><![CDATA[--partial-loads-ok=<yes|no> [default: no] ]]></option>
742 </term>
743 <listitem>
744 <para>Controls how Memcheck handles word-sized,
745 word-aligned loads from addresses for which some bytes are
746 addressable and others are not. When <varname>yes</varname>, such
747 loads do not produce an address error. Instead, loaded bytes
748 originating from illegal addresses are marked as uninitialised, and
749 those corresponding to legal addresses are handled in the normal
750 way.</para>
751
752 <para>When <varname>no</varname>, loads from partially invalid
753 addresses are treated the same as loads from completely invalid
754 addresses: an illegal-address error is issued, and the resulting
755 bytes are marked as initialised.</para>
756
    <para>Note that code that behaves in this way is in violation of
    the ISO C/C++ standards, and should be considered broken. If
    at all possible, such code should be fixed. This option should be
    used only as a last resort.</para>
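    <para>The two behaviours can be summarised with a toy model of a single
    4-byte, 4-aligned load. This is only a sketch under stated assumptions:
    the names <function>load_word</function> and
    <varname>struct load_result</varname> are hypothetical, the word size is
    fixed at 4 bytes, and a V bit of 1 is taken to mean "undefined".</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Result of one simulated 4-byte, 4-aligned load. */
struct load_result {
    int      address_error;  /* 1 if an address error would be reported */
    uint32_t vbits;          /* one V bit per data bit; 1 = undefined */
};

/* abits: bit i set means byte i of the word is addressable.
   vbits: the V bits held in shadow memory for the word.
   partial_loads_ok: models --partial-loads-ok=yes (1) or =no (0). */
struct load_result load_word(unsigned abits, uint32_t vbits,
                             int partial_loads_ok)
{
    struct load_result r = { 0, 0 };
    unsigned i;

    assert(abits <= 0xFu);
    if (abits == 0xFu) {             /* fully addressable: normal load */
        r.vbits = vbits;
        return r;
    }
    if (abits == 0 || !partial_loads_ok) {
        r.address_error = 1;         /* report; result marked initialised */
        r.vbits = 0;
        return r;
    }
    /* Partially addressable and the option is on: no address error. */
    for (i = 0; i < 4; i++) {
        uint32_t mask = (uint32_t)0xFF << (8 * i);
        if (abits & (1u << i))
            r.vbits |= vbits & mask; /* legal byte: keep stored V bits */
        else
            r.vbits |= mask;         /* illegal byte: mark uninitialised */
    }
    return r;
}
```
]]></programlisting>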
761 </listitem>
762 </varlistentry>
763
764 <varlistentry id="opt.freelist-vol" xreflabel="--freelist-vol">
765 <term>
766 <option><![CDATA[--freelist-vol=<number> [default: 10000000] ]]></option>
767 </term>
768 <listitem>
769 <para>When the client program releases memory using
770 <function>free</function> (in <literal>C</literal>) or
771 <computeroutput>delete</computeroutput>
772 (<literal>C++</literal>), that memory is not immediately made
773 available for re-allocation. Instead, it is marked inaccessible
774 and placed in a queue of freed blocks. The purpose is to defer as
775 long as possible the point at which freed-up memory comes back
776 into circulation. This increases the chance that
777 Memcheck will be able to detect invalid
778 accesses to blocks for some significant period of time after they
779 have been freed.</para>
780
njna3311642009-08-10 01:29:14 +0000781 <para>This option specifies the maximum total size, in bytes, of the
njnc1abdcb2009-08-05 05:11:02 +0000782 blocks in the queue. The default value is ten million bytes.
783 Increasing this increases the total amount of memory used by
784 Memcheck but may detect invalid uses of freed
785 blocks which would otherwise go undetected.</para>
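    <para>The effect of the volume limit can be sketched as a bounded
    first-in, first-out queue. The code below is a toy model, not Memcheck's
    implementation: <varname>struct freed_queue</varname>,
    <function>queue_freed_block</function> and the fixed slot count are
    hypothetical, and releasing the oldest block first is an assumption
    based on the description of the queue above.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_SLOTS 64   /* capacity of this toy queue, an assumption */

struct freed_queue {
    size_t sizes[QUEUE_SLOTS]; /* sizes of queued blocks, oldest first */
    size_t head, count;        /* ring buffer position and length */
    size_t volume;             /* total bytes currently queued */
    size_t max_volume;         /* models the --freelist-vol value */
};

/* Queue a freed block.  Returns the number of bytes handed back for
   re-allocation in order to get the queue under max_volume again. */
size_t queue_freed_block(struct freed_queue *q, size_t size)
{
    size_t released = 0;

    assert(q != NULL && q->count < QUEUE_SLOTS);
    q->sizes[(q->head + q->count) % QUEUE_SLOTS] = size;
    q->count++;
    q->volume += size;

    /* Recycle the oldest blocks until the queue fits the limit. */
    while (q->volume > q->max_volume && q->count > 0) {
        size_t oldest = q->sizes[q->head];
        q->head = (q->head + 1) % QUEUE_SLOTS;
        q->count--;
        q->volume -= oldest;
        released += oldest;
    }
    return released;
}
```
]]></programlisting>
    <para>A larger limit keeps each block in the queue, and therefore
    inaccessible, for longer, at the cost of holding more memory.</para>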
786 </listitem>
787 </varlistentry>
788
789 <varlistentry id="opt.workaround-gcc296-bugs" xreflabel="--workaround-gcc296-bugs">
790 <term>
791 <option><![CDATA[--workaround-gcc296-bugs=<yes|no> [default: no] ]]></option>
792 </term>
793 <listitem>
    <para>When enabled, Memcheck assumes that reads and writes some small
    distance below the stack pointer are due to bugs in GCC 2.96, and
    does not report them. The "small distance" is 256 bytes by
    default. Note that GCC 2.96 is the default compiler on some ancient
    Linux distributions (Red Hat 7.X) and so you may need to use this
    option. Do not use it if you do not have to, as it can cause real
    errors to be overlooked. A better alternative is to use a more
    recent GCC in which this bug is fixed.</para>
802
njna3311642009-08-10 01:29:14 +0000803 <para>You may also need to use this option when working with
njnc1abdcb2009-08-05 05:11:02 +0000804 GCC 3.X or 4.X on 32-bit PowerPC Linux. This is because
805 GCC generates code which occasionally accesses below the
806 stack pointer, particularly for floating-point to/from integer
807 conversions. This is in violation of the 32-bit PowerPC ELF
808 specification, which makes no provision for locations below the
809 stack pointer to be accessible.</para>
810 </listitem>
811 </varlistentry>
812
813 <varlistentry id="opt.ignore-ranges" xreflabel="--ignore-ranges">
814 <term>
815 <option><![CDATA[--ignore-ranges=0xPP-0xQQ[,0xRR-0xSS] ]]></option>
816 </term>
817 <listitem>
818 <para>Any ranges listed in this option (and multiple ranges can be
819 specified, separated by commas) will be ignored by Memcheck's
820 addressability checking.</para>
821 </listitem>
822 </varlistentry>
823
824 <varlistentry id="opt.malloc-fill" xreflabel="--malloc-fill">
825 <term>
826 <option><![CDATA[--malloc-fill=<hexnumber> ]]></option>
827 </term>
828 <listitem>
829 <para>Fills blocks allocated
830 by <computeroutput>malloc</computeroutput>,
831 <computeroutput>new</computeroutput>, etc, but not
832 by <computeroutput>calloc</computeroutput>, with the specified
833 byte. This can be useful when trying to shake out obscure
834 memory corruption problems. The allocated area is still
njna3311642009-08-10 01:29:14 +0000835 regarded by Memcheck as undefined -- this option only affects its
njnc1abdcb2009-08-05 05:11:02 +0000836 contents.
837 </para>
838 </listitem>
839 </varlistentry>
840
841 <varlistentry id="opt.free-fill" xreflabel="--free-fill">
842 <term>
843 <option><![CDATA[--free-fill=<hexnumber> ]]></option>
844 </term>
845 <listitem>
846 <para>Fills blocks freed
847 by <computeroutput>free</computeroutput>,
848 <computeroutput>delete</computeroutput>, etc, with the
849 specified byte value. This can be useful when trying to shake out
850 obscure memory corruption problems. The freed area is still
njna3311642009-08-10 01:29:14 +0000851 regarded by Memcheck as not valid for access -- this option only
njnc1abdcb2009-08-05 05:11:02 +0000852 affects its contents.
853 </para>
854 </listitem>
855 </varlistentry>
856
857</variablelist>
858<!-- end of xi:include in the manpage -->
859
860</sect1>
861
862
njn62ad73d2005-08-15 04:26:13 +0000863<sect1 id="mc-manual.suppfiles" xreflabel="Writing suppression files">
864<title>Writing suppression files</title>
njn3e986b22004-11-30 10:43:45 +0000865
866<para>The basic suppression format is described in
867<xref linkend="manual-core.suppress"/>.</para>
868
<para>The suppression-type (second) line should have the form:</para>
<programlisting><![CDATA[
Memcheck:suppression_type]]></programlisting>
872
njn3e986b22004-11-30 10:43:45 +0000873<para>The Memcheck suppression types are as follows:</para>
874
875<itemizedlist>
876 <listitem>
de03e0e7c2005-12-03 23:02:33 +0000877 <para><varname>Value1</varname>,
878 <varname>Value2</varname>,
879 <varname>Value4</varname>,
880 <varname>Value8</varname>,
881 <varname>Value16</varname>,
njn3e986b22004-11-30 10:43:45 +0000882 meaning an uninitialised-value error when
883 using a value of 1, 2, 4, 8 or 16 bytes.</para>
884 </listitem>
885
886 <listitem>
sewardj08e31e22007-05-23 21:58:33 +0000887 <para><varname>Cond</varname> (or its old
de03e0e7c2005-12-03 23:02:33 +0000888 name, <varname>Value0</varname>), meaning use
njn3e986b22004-11-30 10:43:45 +0000889 of an uninitialised CPU condition code.</para>
890 </listitem>
891
892 <listitem>
sewardj08e31e22007-05-23 21:58:33 +0000893 <para><varname>Addr1</varname>,
de03e0e7c2005-12-03 23:02:33 +0000894 <varname>Addr2</varname>,
895 <varname>Addr4</varname>,
896 <varname>Addr8</varname>,
897 <varname>Addr16</varname>,
njn3e986b22004-11-30 10:43:45 +0000898 meaning an invalid address during a
899 memory access of 1, 2, 4, 8 or 16 bytes respectively.</para>
900 </listitem>
901
902 <listitem>
    <para><varname>Jump</varname>, meaning a
    jump to an unaddressable location.</para>
905 </listitem>
906
907 <listitem>
sewardj08e31e22007-05-23 21:58:33 +0000908 <para><varname>Param</varname>, meaning an
njn3e986b22004-11-30 10:43:45 +0000909 invalid system call parameter error.</para>
910 </listitem>
911
912 <listitem>
sewardj08e31e22007-05-23 21:58:33 +0000913 <para><varname>Free</varname>, meaning an
njn3e986b22004-11-30 10:43:45 +0000914 invalid or mismatching free.</para>
915 </listitem>
916
917 <listitem>
sewardj08e31e22007-05-23 21:58:33 +0000918 <para><varname>Overlap</varname>, meaning a
njn3e986b22004-11-30 10:43:45 +0000919 <computeroutput>src</computeroutput> /
920 <computeroutput>dst</computeroutput> overlap in
njn2f7eebe2009-08-05 06:34:27 +0000921 <function>memcpy</function> or a similar function.</para>
njn3e986b22004-11-30 10:43:45 +0000922 </listitem>
923
924 <listitem>
sewardj08e31e22007-05-23 21:58:33 +0000925 <para><varname>Leak</varname>, meaning
njn62ad73d2005-08-15 04:26:13 +0000926 a memory leak.</para>
njn3e986b22004-11-30 10:43:45 +0000927 </listitem>
928
929</itemizedlist>
930
sewardj08e31e22007-05-23 21:58:33 +0000931<para><computeroutput>Param</computeroutput> errors have an extra
932information line at this point, which is the name of the offending
933system call parameter. No other error kinds have this extra
de03e0e7c2005-12-03 23:02:33 +0000934line.</para>
njn3e986b22004-11-30 10:43:45 +0000935
<para>Next comes the first line of the calling context. For <varname>ValueN</varname>
and <varname>AddrN</varname> errors, it is either the name of the function
in which the error occurred, or, failing that, the full path of the
<filename>.so</filename> file
or executable containing the error location. For <varname>Free</varname> errors, it is the name
of the function doing the freeing (e.g. <function>free</function>,
<function>__builtin_vec_delete</function>, etc). For
<varname>Overlap</varname> errors, it is the name of the function with the
overlapping arguments (e.g. <function>memcpy</function>,
<function>strcpy</function>, etc).</para>
njn3e986b22004-11-30 10:43:45 +0000946
947<para>Lastly, there's the rest of the calling context.</para>
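<para>Putting the pieces together, a <varname>Param</varname> suppression
might look like the sketch below. The suppression name, the function name
and the object path are made up for illustration; only the overall layout
(name, suppression type, extra <varname>Param</varname> line, then the
calling context) follows the description above and the basic format in
<xref linkend="manual-core.suppress"/>.</para>
<programlisting><![CDATA[
```text
{
   hypothetical-write-param-suppression
   Memcheck:Param
   write(buf)
   fun:hypothetical_write_wrapper
   obj:/path/to/hypothetical/libexample.so
}
```
]]></programlisting>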
948
949</sect1>
950
951
952
953<sect1 id="mc-manual.machine"
954 xreflabel="Details of Memcheck's checking machinery">
955<title>Details of Memcheck's checking machinery</title>
956
957<para>Read this section if you want to know, in detail, exactly
958what and how Memcheck is checking.</para>
959
960
961<sect2 id="mc-manual.value" xreflabel="Valid-value (V) bit">
962<title>Valid-value (V) bits</title>
963
de03e0e7c2005-12-03 23:02:33 +0000964<para>It is simplest to think of Memcheck implementing a synthetic CPU
965which is identical to a real CPU, except for one crucial detail. Every
966bit (literally) of data processed, stored and handled by the real CPU
967has, in the synthetic CPU, an associated "valid-value" bit, which says
968whether or not the accompanying bit has a legitimate value. In the
969discussions which follow, this bit is referred to as the V (valid-value)
njn3e986b22004-11-30 10:43:45 +0000970bit.</para>
971
<para>Each byte in the system therefore has 8 V bits which follow it
973wherever it goes. For example, when the CPU loads a word-size item (4
974bytes) from memory, it also loads the corresponding 32 V bits from a
975bitmap which stores the V bits for the process' entire address space.
976If the CPU should later write the whole or some part of that value to
977memory at a different address, the relevant V bits will be stored back
978in the V-bit bitmap.</para>
njn3e986b22004-11-30 10:43:45 +0000979
njn2f7eebe2009-08-05 06:34:27 +0000980<para>In short, each bit in the system has (conceptually) an associated V
981bit, which follows it around everywhere, even inside the CPU. Yes, all the
982CPU's registers (integer, floating point, vector and condition registers)
983have their own V bit vectors. For this to work, Memcheck uses a great deal
984of compression to represent the V bits compactly.</para>
njn3e986b22004-11-30 10:43:45 +0000985
de03e0e7c2005-12-03 23:02:33 +0000986<para>Copying values around does not cause Memcheck to check for, or
987report on, errors. However, when a value is used in a way which might
njn2f7eebe2009-08-05 06:34:27 +0000988conceivably affect your program's externally-visible behaviour,
989the associated V bits are immediately checked. If any of these indicate
990that the value is undefined (even partially), an error is reported.</para>
njn3e986b22004-11-30 10:43:45 +0000991
992<para>Here's an (admittedly nonsensical) example:</para>
<programlisting><![CDATA[
int i, j;
int a[10], b[10];
for ( i = 0; i < 10; i++ ) {
  j = a[i];
  b[i] = j;
}]]></programlisting>
1000
de03e0e7c2005-12-03 23:02:33 +00001001<para>Memcheck emits no complaints about this, since it merely copies
1002uninitialised values from <varname>a[]</varname> into
sewardj08e31e22007-05-23 21:58:33 +00001003<varname>b[]</varname>, and doesn't use them in a way which could
1004affect the behaviour of the program. However, if
de03e0e7c2005-12-03 23:02:33 +00001005the loop is changed to:</para>
<programlisting><![CDATA[
for ( i = 0; i < 10; i++ ) {
  j += a[i];
}
if ( j == 77 )
  printf("hello there\n");
]]></programlisting>
1013
sewardj08e31e22007-05-23 21:58:33 +00001014<para>then Memcheck will complain, at the
de03e0e7c2005-12-03 23:02:33 +00001015<computeroutput>if</computeroutput>, that the condition depends on
1016uninitialised values. Note that it <command>doesn't</command> complain
1017at the <varname>j += a[i];</varname>, since at that point the
1018undefinedness is not "observable". It's only when a decision has to be
1019made as to whether or not to do the <function>printf</function> -- an
1020observable action of your program -- that Memcheck complains.</para>
njn3e986b22004-11-30 10:43:45 +00001021
de03e0e7c2005-12-03 23:02:33 +00001022<para>Most low level operations, such as adds, cause Memcheck to use the
1023V bits for the operands to calculate the V bits for the result. Even if
1024the result is partially or wholly undefined, it does not
njn62ad73d2005-08-15 04:26:13 +00001025complain.</para>
njn3e986b22004-11-30 10:43:45 +00001026
<para>Checks on definedness only occur in three places: when a value is
used to generate a memory address, when a control flow decision needs to
be made, and when a system call is detected, at which point Memcheck
checks the definedness of its parameters as required.</para>
1032<para>If a check should detect undefinedness, an error message is
de03e0e7c2005-12-03 23:02:33 +00001033issued. The resulting value is subsequently regarded as well-defined.
sewardj08e31e22007-05-23 21:58:33 +00001034To do otherwise would give long chains of error messages. In other
1035words, once Memcheck reports an undefined value error, it tries to
1036avoid reporting further errors derived from that same undefined
1037value.</para>
njn3e986b22004-11-30 10:43:45 +00001038
de03e0e7c2005-12-03 23:02:33 +00001039<para>This sounds overcomplicated. Why not just check all reads from
1040memory, and complain if an undefined value is loaded into a CPU
1041register? Well, that doesn't work well, because perfectly legitimate C
1042programs routinely copy uninitialised values around in memory, and we
1043don't want endless complaints about that. Here's the canonical example.
1044Consider a struct like this:</para>
<programlisting><![CDATA[
struct S { int x; char c; };
struct S s1, s2;
s1.x = 42;
s1.c = 'z';
s2 = s1;
]]></programlisting>
1052
de03e0e7c2005-12-03 23:02:33 +00001053<para>The question to ask is: how large is <varname>struct S</varname>,
1054in bytes? An <varname>int</varname> is 4 bytes and a
1055<varname>char</varname> one byte, so perhaps a <varname>struct
sewardj08e31e22007-05-23 21:58:33 +00001056S</varname> occupies 5 bytes? Wrong. All non-toy compilers we know
de03e0e7c2005-12-03 23:02:33 +00001057of will round the size of <varname>struct S</varname> up to a whole
1058number of words, in this case 8 bytes. Not doing this forces compilers
sewardj08e31e22007-05-23 21:58:33 +00001059to generate truly appalling code for accessing arrays of
1060<varname>struct S</varname>'s on some architectures.</para>
njn3e986b22004-11-30 10:43:45 +00001061
de03e0e7c2005-12-03 23:02:33 +00001062<para>So <varname>s1</varname> occupies 8 bytes, yet only 5 of them will
njn7316df22009-08-04 01:16:01 +00001063be initialised. For the assignment <varname>s2 = s1</varname>, GCC
de03e0e7c2005-12-03 23:02:33 +00001064generates code to copy all 8 bytes wholesale into <varname>s2</varname>
1065without regard for their meaning. If Memcheck simply checked values as
1066they came out of memory, it would yelp every time a structure assignment
sewardj08e31e22007-05-23 21:58:33 +00001067like this happened. So the more complicated behaviour described above
njn7316df22009-08-04 01:16:01 +00001068is necessary. This allows GCC to copy
de03e0e7c2005-12-03 23:02:33 +00001069<varname>s1</varname> into <varname>s2</varname> any way it likes, and a
1070warning will only be emitted if the uninitialised values are later
1071used.</para>
njn3e986b22004-11-30 10:43:45 +00001072
njn3e986b22004-11-30 10:43:45 +00001073</sect2>
1074
1075
1076<sect2 id="mc-manual.vaddress" xreflabel=" Valid-address (A) bits">
1077<title>Valid-address (A) bits</title>
1078
de03e0e7c2005-12-03 23:02:33 +00001079<para>Notice that the previous subsection describes how the validity of
1080values is established and maintained without having to say whether the
1081program does or does not have the right to access any particular memory
sewardj08e31e22007-05-23 21:58:33 +00001082location. We now consider the latter question.</para>
njn3e986b22004-11-30 10:43:45 +00001083
de03e0e7c2005-12-03 23:02:33 +00001084<para>As described above, every bit in memory or in the CPU has an
1085associated valid-value (V) bit. In addition, all bytes in memory, but
1086not in the CPU, have an associated valid-address (A) bit. This
1087indicates whether or not the program can legitimately read or write that
location. It does not give any indication of the validity of the data
1089at that location -- that's the job of the V bits -- only whether or not
1090the location may be accessed.</para>
njn3e986b22004-11-30 10:43:45 +00001091
de03e0e7c2005-12-03 23:02:33 +00001092<para>Every time your program reads or writes memory, Memcheck checks
1093the A bits associated with the address. If any of them indicate an
1094invalid address, an error is emitted. Note that the reads and writes
1095themselves do not change the A bits, only consult them.</para>
njn3e986b22004-11-30 10:43:45 +00001096
njn62ad73d2005-08-15 04:26:13 +00001097<para>So how do the A bits get set/cleared? Like this:</para>
njn3e986b22004-11-30 10:43:45 +00001098
1099<itemizedlist>
1100 <listitem>
1101 <para>When the program starts, all the global data areas are
1102 marked as accessible.</para>
1103 </listitem>
1104
1105 <listitem>
bartaf25f672009-06-26 19:03:53 +00001106 <para>When the program does
1107 <function>malloc</function>/<computeroutput>new</computeroutput>,
1108 the A bits for exactly the area allocated, and not a byte more,
1109 are marked as accessible. Upon freeing the area the A bits are
1110 changed to indicate inaccessibility.</para>
njn3e986b22004-11-30 10:43:45 +00001111 </listitem>
1112
1113 <listitem>
de03e0e7c2005-12-03 23:02:33 +00001114 <para>When the stack pointer register (<literal>SP</literal>) moves
1115 up or down, A bits are set. The rule is that the area from
1116 <literal>SP</literal> up to the base of the stack is marked as
1117 accessible, and below <literal>SP</literal> is inaccessible. (If
1118 that sounds illogical, bear in mind that the stack grows down, not
1119 up, on almost all Unix systems, including GNU/Linux.) Tracking
1120 <literal>SP</literal> like this has the useful side-effect that the
1121 section of stack used by a function for local variables etc is
1122 automatically marked accessible on function entry and inaccessible
1123 on exit.</para>
njn3e986b22004-11-30 10:43:45 +00001124 </listitem>
1125
1126 <listitem>
de03e0e7c2005-12-03 23:02:33 +00001127 <para>When doing system calls, A bits are changed appropriately.
sewardj08e31e22007-05-23 21:58:33 +00001128 For example, <literal>mmap</literal>
1129 magically makes files appear in the process'
1130 address space, so the A bits must be updated if <literal>mmap</literal>
de03e0e7c2005-12-03 23:02:33 +00001131 succeeds.</para>
njn3e986b22004-11-30 10:43:45 +00001132 </listitem>
1133
1134 <listitem>
sewardj08e31e22007-05-23 21:58:33 +00001135 <para>Optionally, your program can tell Memcheck about such changes
de03e0e7c2005-12-03 23:02:33 +00001136 explicitly, using the client request mechanism described
1137 above.</para>
njn3e986b22004-11-30 10:43:45 +00001138 </listitem>
1139
1140</itemizedlist>
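<para>The allocation and freeing rules above can be sketched as a map with
one A bit per byte. The code is a toy model over a small, fixed address
space, not Memcheck's shadow memory: <varname>ARENA_SIZE</varname>,
<function>mark_allocated</function>, <function>mark_freed</function> and
<function>access_ok</function> are hypothetical names.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 256   /* size of the toy address space, an assumption */

/* One A bit per byte of the toy address space: 1 = accessible. */
unsigned char abits[ARENA_SIZE];

/* Models malloc/new: exactly [start, start+len) becomes accessible. */
void mark_allocated(size_t start, size_t len)
{
    size_t i;
    assert(start <= ARENA_SIZE && len <= ARENA_SIZE - start);
    for (i = 0; i < len; i++)
        abits[start + i] = 1;
}

/* Models free/delete: the same range becomes inaccessible again. */
void mark_freed(size_t start, size_t len)
{
    size_t i;
    assert(start <= ARENA_SIZE && len <= ARENA_SIZE - start);
    for (i = 0; i < len; i++)
        abits[start + i] = 0;
}

/* Models the check made on every read or write.  It only consults the
   A bits and never changes them.  Returns 1 if every byte is valid. */
int access_ok(size_t addr, size_t len)
{
    size_t i;
    if (addr > ARENA_SIZE || len > ARENA_SIZE - addr)
        return 0;
    for (i = 0; i < len; i++)
        if (!abits[addr + i])
            return 0;
    return 1;
}
```
]]></programlisting>
<para>Because only the allocated bytes are marked, an access one byte
before or after the block fails the check, as does any access after the
block has been freed.</para>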
1141
1142</sect2>
1143
1144
1145<sect2 id="mc-manual.together" xreflabel="Putting it all together">
1146<title>Putting it all together</title>
1147
1148<para>Memcheck's checking machinery can be summarised as
1149follows:</para>
1150
1151<itemizedlist>
1152 <listitem>
de03e0e7c2005-12-03 23:02:33 +00001153 <para>Each byte in memory has 8 associated V (valid-value) bits,
1154 saying whether or not the byte has a defined value, and a single A
1155 (valid-address) bit, saying whether or not the program currently has
njn2f7eebe2009-08-05 06:34:27 +00001156 the right to read/write that address. (But, as mentioned above, heavy
1157 use of compression means the overhead is typically less than 25%.)</para>
njn3e986b22004-11-30 10:43:45 +00001158 </listitem>
1159
1160 <listitem>
de03e0e7c2005-12-03 23:02:33 +00001161 <para>When memory is read or written, the relevant A bits are
sewardj08e31e22007-05-23 21:58:33 +00001162 consulted. If they indicate an invalid address, Memcheck emits an
de03e0e7c2005-12-03 23:02:33 +00001163 Invalid read or Invalid write error.</para>
njn3e986b22004-11-30 10:43:45 +00001164 </listitem>
1165
1166 <listitem>
de03e0e7c2005-12-03 23:02:33 +00001167 <para>When memory is read into the CPU's registers, the relevant V
1168 bits are fetched from memory and stored in the simulated CPU. They
1169 are not consulted.</para>
njn3e986b22004-11-30 10:43:45 +00001170 </listitem>
1171
1172 <listitem>
de03e0e7c2005-12-03 23:02:33 +00001173 <para>When a register is written out to memory, the V bits for that
1174 register are written back to memory too.</para>
njn3e986b22004-11-30 10:43:45 +00001175 </listitem>
1176
1177 <listitem>
de03e0e7c2005-12-03 23:02:33 +00001178 <para>When values in CPU registers are used to generate a memory
1179 address, or to determine the outcome of a conditional branch, the V
1180 bits for those values are checked, and an error emitted if any of
1181 them are undefined.</para>
njn3e986b22004-11-30 10:43:45 +00001182 </listitem>
1183
1184 <listitem>
de03e0e7c2005-12-03 23:02:33 +00001185 <para>When values in CPU registers are used for any other purpose,
sewardj08e31e22007-05-23 21:58:33 +00001186 Memcheck computes the V bits for the result, but does not check
de03e0e7c2005-12-03 23:02:33 +00001187 them.</para>
njn3e986b22004-11-30 10:43:45 +00001188 </listitem>
1189
1190 <listitem>
sewardj08e31e22007-05-23 21:58:33 +00001191 <para>Once the V bits for a value in the CPU have been checked, they
de03e0e7c2005-12-03 23:02:33 +00001192 are then set to indicate validity. This avoids long chains of
1193 errors.</para>
njn3e986b22004-11-30 10:43:45 +00001194 </listitem>
1195
1196 <listitem>
sewardj08e31e22007-05-23 21:58:33 +00001197 <para>When values are loaded from memory, Memcheck checks the A bits
de03e0e7c2005-12-03 23:02:33 +00001198 for that location and issues an illegal-address warning if needed.
1199 In that case, the V bits loaded are forced to indicate Valid,
1200 despite the location being invalid.</para>
1201
1202 <para>This apparently strange choice reduces the amount of confusing
1203 information presented to the user. It avoids the unpleasant
1204 phenomenon in which memory is read from a place which is both
sewardj33878892007-11-17 09:43:25 +00001205 unaddressable and contains invalid values, and, as a result, you get
de03e0e7c2005-12-03 23:02:33 +00001206 not only an invalid-address (read/write) error, but also a
1207 potentially large set of uninitialised-value errors, one for every
1208 time the value is used.</para>
1209
    <para>There is a hazy boundary case to do with multi-byte loads from
    addresses which are partially valid and partially invalid. See
    the description of the option <option>--partial-loads-ok</option> for details.
    </para>
njn3e986b22004-11-30 10:43:45 +00001214 </listitem>
1215
1216</itemizedlist>
1217
1218
bartaf25f672009-06-26 19:03:53 +00001219<para>Memcheck intercepts calls to <function>malloc</function>,
1220<function>calloc</function>, <function>realloc</function>,
1221<function>valloc</function>, <function>memalign</function>,
1222<function>free</function>, <computeroutput>new</computeroutput>,
1223<computeroutput>new[]</computeroutput>,
1224<computeroutput>delete</computeroutput> and
1225<computeroutput>delete[]</computeroutput>. The behaviour you get
njn3e986b22004-11-30 10:43:45 +00001226is:</para>
1227
1228<itemizedlist>
1229
1230 <listitem>
    <para><function>malloc</function>/<computeroutput>new</computeroutput>/<computeroutput>new[]</computeroutput>:
1232 the returned memory is marked as addressable but not having valid
1233 values. This means you have to write to it before you can read
1234 it.</para>
njn3e986b22004-11-30 10:43:45 +00001235 </listitem>
1236
1237 <listitem>
bartaf25f672009-06-26 19:03:53 +00001238 <para><function>calloc</function>: returned memory is marked both
1239 addressable and valid, since <function>calloc</function> clears
1240 the area to zero.</para>
njn3e986b22004-11-30 10:43:45 +00001241 </listitem>
1242
1243 <listitem>
bartaf25f672009-06-26 19:03:53 +00001244 <para><function>realloc</function>: if the new size is larger than
1245 the old, the new section is addressable but invalid, as with
njn2f7eebe2009-08-05 06:34:27 +00001246 <function>malloc</function>. If the new size is smaller, the
1247 dropped-off section is marked as unaddressable. You may only pass to
bartaf25f672009-06-26 19:03:53 +00001248 <function>realloc</function> a pointer previously issued to you by
1249 <function>malloc</function>/<function>calloc</function>/<function>realloc</function>.</para>
njn3e986b22004-11-30 10:43:45 +00001250 </listitem>
1251
1252 <listitem>
bartaf25f672009-06-26 19:03:53 +00001253 <para><function>free</function>/<computeroutput>delete</computeroutput>/<computeroutput>delete[]</computeroutput>:
1254 you may only pass to these functions a pointer previously issued
1255 to you by the corresponding allocation function. Otherwise,
1256 Memcheck complains. If the pointer is indeed valid, Memcheck
1257 marks the entire area it points at as unaddressable, and places
1258 the block in the freed-blocks-queue. The aim is to defer as long
1259 as possible reallocation of this block. Until that happens, all
1260 attempts to access it will elicit an invalid-address error, as you
1261 would hope.</para>
njn3e986b22004-11-30 10:43:45 +00001262 </listitem>
1263
1264</itemizedlist>
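<para>The <function>realloc</function> rule can be sketched with a
three-state shadow per byte. This is a toy model under stated assumptions:
<varname>enum byte_state</varname> and
<function>shadow_realloc</function> are hypothetical names, and the block
is assumed to be resized in place, whereas a real
<function>realloc</function> may move it.</para>
<programlisting><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Shadow state of one byte in this toy model. */
enum byte_state { NOACCESS, UNDEFINED, DEFINED };

/* Update the shadow states of a block resized in place from old_size
   to new_size bytes.  state[] must have room for the larger size. */
void shadow_realloc(enum byte_state *state, size_t old_size, size_t new_size)
{
    size_t i;

    assert(state != NULL);
    for (i = old_size; i < new_size; i++)
        state[i] = UNDEFINED;   /* grown part: addressable, not valid */
    for (i = new_size; i < old_size; i++)
        state[i] = NOACCESS;    /* dropped part: unaddressable */
}
```
]]></programlisting>
<para>Bytes that are kept across the resize retain whatever state they
already had.</para>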
1265
1266</sect2>
1267</sect1>
1268
1269
1270
njn3e986b22004-11-30 10:43:45 +00001271<sect1 id="mc-manual.clientreqs" xreflabel="Client requests">
1272<title>Client Requests</title>
1273
1274<para>The following client requests are defined in
njn1d0825f2006-03-27 11:37:07 +00001275<filename>memcheck.h</filename>.
njn3e986b22004-11-30 10:43:45 +00001276See <filename>memcheck.h</filename> for exact details of their
1277arguments.</para>
1278
1279<itemizedlist>
1280
1281 <listitem>
njndbf7ca72006-03-31 11:57:59 +00001282 <para><varname>VALGRIND_MAKE_MEM_NOACCESS</varname>,
1283 <varname>VALGRIND_MAKE_MEM_UNDEFINED</varname> and
1284 <varname>VALGRIND_MAKE_MEM_DEFINED</varname>.
njn3e986b22004-11-30 10:43:45 +00001285 These mark address ranges as completely inaccessible,
1286 accessible but containing undefined data, and accessible and
1287 containing defined data, respectively. Subsequent errors may
1288 have their faulting addresses described in terms of these
1289 blocks. Returns a "block handle". Returns zero when not run
1290 on Valgrind.</para>
1291 </listitem>
1292
1293 <listitem>
njndbf7ca72006-03-31 11:57:59 +00001294 <para><varname>VALGRIND_MAKE_MEM_DEFINED_IF_ADDRESSABLE</varname>.
1295 This is just like <varname>VALGRIND_MAKE_MEM_DEFINED</varname> but only
1296 affects those bytes that are already addressable.</para>
1297 </listitem>
1298
1299 <listitem>
de03e0e7c2005-12-03 23:02:33 +00001300 <para><varname>VALGRIND_DISCARD</varname>: At some point you may
1301 want Valgrind to stop reporting errors in terms of the blocks
1302 defined by the previous three macros. To do this, the above macros
1303 return a small-integer "block handle". You can pass this block
1304 handle to <varname>VALGRIND_DISCARD</varname>. After doing so,
1305 Valgrind will no longer be able to relate addressing errors to the
1306 user-defined block associated with the handle. The permissions
1307 settings associated with the handle remain in place; this just
1308 affects how errors are reported, not whether they are reported.
1309 Returns 1 for an invalid handle and 0 for a valid handle (although
1310 passing invalid handles is harmless). Always returns 0 when not run
njn3e986b22004-11-30 10:43:45 +00001311 on Valgrind.</para>
1312 </listitem>
1313
1314 <listitem>
njndbf7ca72006-03-31 11:57:59 +00001315 <para><varname>VALGRIND_CHECK_MEM_IS_ADDRESSABLE</varname> and
1316 <varname>VALGRIND_CHECK_MEM_IS_DEFINED</varname>: check immediately
de03e0e7c2005-12-03 23:02:33 +00001317 whether or not the given address range has the relevant property,
1318 and if not, print an error message. Also, for the convenience of
1319 the client, returns zero if the relevant property holds; otherwise,
1320 the returned value is the address of the first byte for which the
1321 property is not true. Always returns 0 when not run on
1322 Valgrind.</para>
njn3e986b22004-11-30 10:43:45 +00001323 </listitem>
1324
1325 <listitem>
njndbf7ca72006-03-31 11:57:59 +00001326 <para><varname>VALGRIND_CHECK_VALUE_IS_DEFINED</varname>: a quick and easy
1327 way to find out whether Valgrind thinks a particular value
1328 (lvalue, to be precise) is addressable and defined. Prints an error
njn8225cc02009-03-09 22:52:24 +00001329 message if not. It has no return value.</para>
njn3e986b22004-11-30 10:43:45 +00001330 </listitem>
1331
1332 <listitem>
njn8225cc02009-03-09 22:52:24 +00001333 <para><varname>VALGRIND_DO_LEAK_CHECK</varname>: does a full memory leak
njn2f7eebe2009-08-05 06:34:27 +00001334 check (like <option>--leak-check=full</option>) right now.
njn8225cc02009-03-09 22:52:24 +00001335 This is useful for incrementally checking for leaks between arbitrary
1336 places in the program's execution. It has no return value.</para>
1337 </listitem>
1338
1339 <listitem>
1340 <para><varname>VALGRIND_DO_QUICK_LEAK_CHECK</varname>: like
1341 <varname>VALGRIND_DO_LEAK_CHECK</varname>, except it produces only a leak
njn7e5d4ed2009-07-30 02:57:52 +00001342 summary (like <option>--leak-check=summary</option>).
njn8225cc02009-03-09 22:52:24 +00001343 It has no return value.</para>
njn3e986b22004-11-30 10:43:45 +00001344 </listitem>
1345
1346 <listitem>
de03e0e7c2005-12-03 23:02:33 +00001347 <para><varname>VALGRIND_COUNT_LEAKS</varname>: fills in the four
1348 arguments with the number of bytes of memory found by the previous
njn8225cc02009-03-09 22:52:24 +00001349 leak check to be leaked (i.e. the sum of direct leaks and indirect leaks),
njn2f7eebe2009-08-05 06:34:27 +00001350 dubious, reachable and suppressed. This is useful in test harness code,
njn8225cc02009-03-09 22:52:24 +00001351 after calling <varname>VALGRIND_DO_LEAK_CHECK</varname> or
1352 <varname>VALGRIND_DO_QUICK_LEAK_CHECK</varname>.</para>
njn3e986b22004-11-30 10:43:45 +00001353 </listitem>
1354
1355 <listitem>
njn8df80b22009-03-02 05:11:06 +00001356 <para><varname>VALGRIND_COUNT_LEAK_BLOCKS</varname>: identical to
1357 <varname>VALGRIND_COUNT_LEAKS</varname> except that it returns the
1358 number of blocks rather than the number of bytes in each
1359 category.</para>
1360 </listitem>
1361
1362 <listitem>
de03e0e7c2005-12-03 23:02:33 +00001363 <para><varname>VALGRIND_GET_VBITS</varname> and
1364 <varname>VALGRIND_SET_VBITS</varname>: allow you to get and set the
1365 V (validity) bits for an address range. You should probably only
1366 set V bits that you have got with
1367 <varname>VALGRIND_GET_VBITS</varname>. Only for those who really
njn1d0825f2006-03-27 11:37:07 +00001368 know what they are doing.</para>
njn3e986b22004-11-30 10:43:45 +00001369 </listitem>
1370
1371</itemizedlist>
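
<para>As an illustration, the marking and checking requests might be
combined as in the following minimal sketch. The buffer and the helper
names (<function>scratch_acquire</function>,
<function>scratch_release</function>,
<function>check_before_send</function>) are hypothetical, and the
header is assumed to be installed as
<filename>valgrind/memcheck.h</filename>.</para>

<programlisting><![CDATA[
#include <stddef.h>
#include <string.h>
#include <valgrind/memcheck.h>

/* Hypothetical scratch buffer that should only be touched between
   scratch_acquire() and scratch_release(). */
static char scratch[64];

char *scratch_acquire(void)
{
    /* Addressable again, but its contents count as uninitialised. */
    (void)VALGRIND_MAKE_MEM_UNDEFINED(scratch, sizeof scratch);
    return scratch;
}

void scratch_release(void)
{
    /* From here on, Memcheck reports any access to scratch[]. */
    (void)VALGRIND_MAKE_MEM_NOACCESS(scratch, sizeof scratch);
}

/* Returns 0 if all n bytes at p are addressable and defined, or if the
   program is not running on Valgrind. Otherwise Memcheck prints an
   error and the address of the first offending byte is returned. */
unsigned long check_before_send(const void *p, size_t n)
{
    return (unsigned long)VALGRIND_CHECK_MEM_IS_DEFINED(p, n);
}
]]></programlisting>

<para>The sketch ignores the block handles returned by the marking
requests; a program that wants to stop errors being described in terms
of the block would keep the handle and pass it to
<varname>VALGRIND_DISCARD</varname>.</para>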
1372
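<para>The leak-checking requests can be used in test harness code
roughly as follows. This is only a sketch:
<function>leaked_bytes_during</function> and
<function>well_behaved</function> are hypothetical names, and when the
program is not run on Valgrind all counts are zero.</para>

<programlisting><![CDATA[
#include <stdlib.h>
#include <valgrind/memcheck.h>

/* Hypothetical harness helper: run fn() and report how many additional
   bytes Memcheck considers leaked afterwards. */
unsigned long leaked_bytes_during(void (*fn)(void))
{
    unsigned long leaked = 0, dubious = 0, reachable = 0, suppressed = 0;
    unsigned long before;

    VALGRIND_DO_QUICK_LEAK_CHECK;
    VALGRIND_COUNT_LEAKS(leaked, dubious, reachable, suppressed);
    before = leaked;

    fn();

    VALGRIND_DO_QUICK_LEAK_CHECK;
    VALGRIND_COUNT_LEAKS(leaked, dubious, reachable, suppressed);

    /* Guard against the count going down, e.g. if fn() freed a block
       that was previously reported as leaked. */
    return leaked > before ? leaked - before : 0;
}

/* Example function under test: allocates and frees, so nothing leaks. */
void well_behaved(void)
{
    void *p = malloc(16);
    free(p);
}
]]></programlisting>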
1373</sect1>
sewardjce10c262006-10-05 17:56:14 +00001374
1375
1376
1377
njn09f2e6c2009-08-10 04:07:54 +00001378<sect1 id="mc-manual.mempools" xreflabel="Memory Pools">
sewardjce10c262006-10-05 17:56:14 +00001379<title>Memory Pools: describing and working with custom allocators</title>
1380
1381<para>Some programs use custom memory allocators, often for performance
njna3311642009-08-10 01:29:14 +00001382reasons. Left to itself, Memcheck is unable to understand the
1383behaviour of custom allocation schemes as well as it understands the
standard allocators, and so may miss errors and leaks in your program. This
section describes a way to give Memcheck enough of a description of
your custom allocator that it can make at least some sense of what is
happening.</para>
sewardjae0e07b2006-10-06 11:47:01 +00001388
1389<para>There are many different sorts of custom allocator, so Memcheck
sewardjce10c262006-10-05 17:56:14 +00001390attempts to reason about them using a loose, abstract model. We
1391use the following terminology when describing custom allocation
1392systems:</para>
1393
1394<itemizedlist>
1395 <listitem>
1396 <para>Custom allocation involves a set of independent "memory pools".
1397 </para>
1398 </listitem>
1399 <listitem>
  <para>Memcheck's notion of a memory pool consists of a single "anchor
1401 address" and a set of non-overlapping "chunks" associated with the
1402 anchor address.</para>
1403 </listitem>
1404 <listitem>
1405 <para>Typically a pool's anchor address is the address of a
1406 book-keeping "header" structure.</para>
1407 </listitem>
1408 <listitem>
1409 <para>Typically the pool's chunks are drawn from a contiguous
bartaf25f672009-06-26 19:03:53 +00001410 "superblock" acquired through the system
njn2f7eebe2009-08-05 06:34:27 +00001411 <function>malloc</function> or
1412 <function>mmap</function>.</para>
sewardjce10c262006-10-05 17:56:14 +00001413 </listitem>
1414
1415</itemizedlist>
1416
1417<para>Keep in mind that the last two points above say "typically": the
1418Valgrind mempool client request API is intentionally vague about the
1419exact structure of a mempool. There is no specific mention made of
1420headers or superblocks. Nevertheless, the following picture may help
1421elucidate the intention of the terms in the API:</para>
1422
1423<programlisting><![CDATA[
1424 "pool"
1425 (anchor address)
1426 |
1427 v
1428 +--------+---+
1429 | header | o |
1430 +--------+-|-+
1431 |
1432 v superblock
1433 +------+---+--------------+---+------------------+
1434 | |rzB| allocation |rzB| |
1435 +------+---+--------------+---+------------------+
1436 ^ ^
1437 | |
1438 "addr" "addr"+"size"
1439]]></programlisting>
1440
1441<para>
1442Note that the header and the superblock may be contiguous or
1443discontiguous, and there may be multiple superblocks associated with a
1444single header; such variations are opaque to Memcheck. The API
1445only requires that your allocation scheme can present sensible values
1446of "pool", "addr" and "size".</para>
1447
1448<para>
1449Typically, before making client requests related to mempools, a client
program will have allocated such a header and superblock for its
1451mempool, and marked the superblock NOACCESS using the
1452<varname>VALGRIND_MAKE_MEM_NOACCESS</varname> client request.</para>
1453
1454<para>
1455When dealing with mempools, the goal is to maintain a particular
1456invariant condition: that Memcheck believes the unallocated portions
1457of the pool's superblock (including redzones) are NOACCESS. To
1458maintain this invariant, the client program must ensure that the
1459superblock starts out in that state; Memcheck cannot make it so, since
1460Memcheck never explicitly learns about the superblock of a pool, only
1461the allocated chunks within the pool.</para>
1462
1463<para>
1464Once the header and superblock for a pool are established and properly
1465marked, there are a number of client requests programs can use to
1466inform Memcheck about changes to the state of a mempool:</para>
1467
1468<itemizedlist>
1469
1470 <listitem>
1471 <para>
1472 <varname>VALGRIND_CREATE_MEMPOOL(pool, rzB, is_zeroed)</varname>:
njna3311642009-08-10 01:29:14 +00001473 This request registers the address <varname>pool</varname> as the anchor
1474 address for a memory pool. It also provides a size
1475 <varname>rzB</varname>, specifying how large the redzones placed around
1476 chunks allocated from the pool should be. Finally, it provides an
1477 <varname>is_zeroed</varname> argument that specifies whether the pool's
1478 chunks are zeroed (more precisely: defined) when allocated.
sewardjce10c262006-10-05 17:56:14 +00001479 </para>
1480 <para>
1481 Upon completion of this request, no chunks are associated with the
1482 pool. The request simply tells Memcheck that the pool exists, so that
1483 subsequent calls can refer to it as a pool.
1484 </para>
1485 </listitem>
1486
1487 <listitem>
1488 <para><varname>VALGRIND_DESTROY_MEMPOOL(pool)</varname>:
1489 This request tells Memcheck that a pool is being torn down. Memcheck
1490 then removes all records of chunks associated with the pool, as well
1491 as its record of the pool's existence. While destroying its records of
1492 a mempool, Memcheck resets the redzones of any live chunks in the pool
1493 to NOACCESS.
1494 </para>
1495 </listitem>
1496
1497 <listitem>
1498 <para><varname>VALGRIND_MEMPOOL_ALLOC(pool, addr, size)</varname>:
njna3311642009-08-10 01:29:14 +00001499 This request informs Memcheck that a <varname>size</varname>-byte chunk
1500 has been allocated at <varname>addr</varname>, and associates the chunk with the
1501 specified
1502 <varname>pool</varname>. If the pool was created with nonzero
1503 <varname>rzB</varname> redzones, Memcheck will mark the
1504 <varname>rzB</varname> bytes before and after the chunk as NOACCESS. If
1505 the pool was created with the <varname>is_zeroed</varname> argument set,
1506 Memcheck will mark the chunk as DEFINED, otherwise Memcheck will mark
1507 the chunk as UNDEFINED.
sewardjce10c262006-10-05 17:56:14 +00001508 </para>
1509 </listitem>
1510
1511 <listitem>
1512 <para><varname>VALGRIND_MEMPOOL_FREE(pool, addr)</varname>:
njna3311642009-08-10 01:29:14 +00001513 This request informs Memcheck that the chunk at <varname>addr</varname>
1514 should no longer be considered allocated. Memcheck will mark the chunk
1515 associated with <varname>addr</varname> as NOACCESS, and delete its
1516 record of the chunk's existence.
sewardjce10c262006-10-05 17:56:14 +00001517 </para>
1518 </listitem>
1519
1520 <listitem>
1521 <para><varname>VALGRIND_MEMPOOL_TRIM(pool, addr, size)</varname>:
njna3311642009-08-10 01:29:14 +00001522 This request trims the chunks associated with <varname>pool</varname>.
1523 The request only operates on chunks associated with
1524 <varname>pool</varname>. Trimming is formally defined as:</para>
sewardjce10c262006-10-05 17:56:14 +00001525 <itemizedlist>
1526 <listitem>
njna3311642009-08-10 01:29:14 +00001527 <para> All chunks entirely inside the range
1528 <varname>addr..(addr+size-1)</varname> are preserved.</para>
sewardjce10c262006-10-05 17:56:14 +00001529 </listitem>
1530 <listitem>
njna3311642009-08-10 01:29:14 +00001531 <para>All chunks entirely outside the range
1532 <varname>addr..(addr+size-1)</varname> are discarded, as though
1533 <varname>VALGRIND_MEMPOOL_FREE</varname> was called on them. </para>
sewardjce10c262006-10-05 17:56:14 +00001534 </listitem>
1535 <listitem>
1536 <para>All other chunks must intersect with the range
njna3311642009-08-10 01:29:14 +00001537 <varname>addr..(addr+size-1)</varname>; areas outside the
1538 intersection are marked as NOACCESS, as though they had been
1539 independently freed with
sewardjce10c262006-10-05 17:56:14 +00001540 <varname>VALGRIND_MEMPOOL_FREE</varname>.</para>
1541 </listitem>
1542 </itemizedlist>
1543 <para>This is a somewhat rare request, but can be useful in
1544 implementing the type of mass-free operations common in custom
1545 LIFO allocators.</para>
1546 </listitem>
1547
1548 <listitem>
bartaf25f672009-06-26 19:03:53 +00001549 <para><varname>VALGRIND_MOVE_MEMPOOL(poolA, poolB)</varname>: This
1550 request informs Memcheck that the pool previously anchored at
njna3311642009-08-10 01:29:14 +00001551 address <varname>poolA</varname> has moved to anchor address
1552 <varname>poolB</varname>. This is a rare request, typically only needed
1553 if you <function>realloc</function> the header of a mempool.</para>
sewardjce10c262006-10-05 17:56:14 +00001554 <para>No memory-status bits are altered by this request.</para>
1555 </listitem>
1556
1557 <listitem>
1558 <para>
bartaf25f672009-06-26 19:03:53 +00001559 <varname>VALGRIND_MEMPOOL_CHANGE(pool, addrA, addrB,
1560 size)</varname>: This request informs Memcheck that the chunk
njna3311642009-08-10 01:29:14 +00001561 previously allocated at address <varname>addrA</varname> within
1562 <varname>pool</varname> has been moved and/or resized, and should be
1563 changed to cover the region <varname>addrB..(addrB+size-1)</varname>. This
1564 is a rare request, typically only needed if you
1565 <function>realloc</function> a superblock or wish to extend a chunk
1566 without changing its memory-status bits.
sewardjce10c262006-10-05 17:56:14 +00001567 </para>
1568 <para>No memory-status bits are altered by this request.
1569 </para>
1570 </listitem>
1571
1572 <listitem>
1573 <para><varname>VALGRIND_MEMPOOL_EXISTS(pool)</varname>:
1574 This request informs the caller whether or not Memcheck is currently
njna3311642009-08-10 01:29:14 +00001575 tracking a mempool at anchor address <varname>pool</varname>. It
1576 evaluates to 1 when there is a mempool associated with that address, 0
1577 otherwise. This is a rare request, only useful in circumstances when
1578 client code might have lost track of the set of active mempools.
sewardjce10c262006-10-05 17:56:14 +00001579 </para>
1580 </listitem>
1581
1582</itemizedlist>
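
<para>To make the requests above concrete, here is a minimal sketch of
a bump allocator that describes itself to Memcheck. It is an
illustration under stated assumptions, not a recommended design: the
<varname>struct arena</varname> type, the function names, and the
redzone and alignment sizes are all hypothetical, and the header is
assumed to be installed as
<filename>valgrind/memcheck.h</filename>.</para>

<programlisting><![CDATA[
#include <stddef.h>
#include <stdlib.h>
#include <valgrind/memcheck.h>  /* also provides the mempool requests */

#define ARENA_RZB   16  /* redzone size in bytes (arbitrary choice) */
#define ARENA_ALIGN 16  /* chunk alignment in bytes (arbitrary choice) */

/* Hypothetical pool header; its address is used as the anchor address. */
struct arena {
    char  *base;  /* start of the superblock */
    size_t size;  /* superblock size in bytes */
    size_t used;  /* bytes consumed so far, including redzones */
};

int arena_init(struct arena *a, size_t size)
{
    a->base = malloc(size);
    if (a->base == NULL)
        return -1;
    a->size = size;
    a->used = 0;
    /* Establish the invariant: the whole superblock starts out NOACCESS. */
    (void)VALGRIND_MAKE_MEM_NOACCESS(a->base, a->size);
    VALGRIND_CREATE_MEMPOOL(a, ARENA_RZB, 0 /* chunks are not zeroed */);
    return 0;
}

void *arena_alloc(struct arena *a, size_t n)
{
    size_t padded, need;
    char *p;

    if (n > (size_t)-1 - ARENA_ALIGN - 2 * ARENA_RZB)
        return NULL;  /* size computation would overflow */
    padded = (n + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);
    need = padded + 2 * ARENA_RZB;  /* redzone + chunk + redzone */
    if (need > a->size - a->used)
        return NULL;
    p = a->base + a->used + ARENA_RZB;
    a->used += need;
    /* The n-byte chunk becomes addressable but undefined. */
    VALGRIND_MEMPOOL_ALLOC(a, p, n);
    return p;
}

/* A bump allocator cannot reuse the space, but it can still tell
   Memcheck that the chunk must no longer be accessed. */
void arena_release(struct arena *a, void *p)
{
    VALGRIND_MEMPOOL_FREE(a, p);
}

void arena_destroy(struct arena *a)
{
    VALGRIND_DESTROY_MEMPOOL(a);
    free(a->base);
    a->base = NULL;
    a->size = a->used = 0;
}
]]></programlisting>

<para>Note that <function>arena_init</function> marks the superblock
NOACCESS before registering the pool, since Memcheck never learns about
the superblock itself.</para>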
1583
sewardj778d7832007-11-22 01:21:56 +00001584</sect1>
1585
1586
1587
1588
1589
1590
1591
1592<sect1 id="mc-manual.mpiwrap" xreflabel="MPI Wrappers">
1593<title>Debugging MPI Parallel Programs with Valgrind</title>
1594
njn2f7eebe2009-08-05 06:34:27 +00001595<para>Memcheck supports debugging of distributed-memory applications
sewardj778d7832007-11-22 01:21:56 +00001596which use the MPI message passing standard. This support consists of a
1597library of wrapper functions for the
1598<computeroutput>PMPI_*</computeroutput> interface. When incorporated
1599into the application's address space, either by direct linking or by
1600<computeroutput>LD_PRELOAD</computeroutput>, the wrappers intercept
1601calls to <computeroutput>PMPI_Send</computeroutput>,
1602<computeroutput>PMPI_Recv</computeroutput>, etc. They then
njn2f7eebe2009-08-05 06:34:27 +00001603use client requests to inform Memcheck of memory state changes caused
sewardj778d7832007-11-22 01:21:56 +00001604by the function being wrapped. This reduces the number of false
1605positives that Memcheck otherwise typically reports for MPI
1606applications.</para>
1607
1608<para>The wrappers also take the opportunity to carefully check
1609size and definedness of buffers passed as arguments to MPI functions, hence
1610detecting errors such as passing undefined data to
1611<computeroutput>PMPI_Send</computeroutput>, or receiving data into a
1612buffer which is too small.</para>
1613
1614<para>Unlike most of the rest of Valgrind, the wrapper library is subject to a
1615BSD-style license, so you can link it into any code base you like.
njna437a602009-08-04 05:24:46 +00001616See the top of <computeroutput>mpi/libmpiwrap.c</computeroutput>
sewardj778d7832007-11-22 01:21:56 +00001617for license details.</para>
1618
1619
1620<sect2 id="mc-manual.mpiwrap.build" xreflabel="Building MPI Wrappers">
1621<title>Building and installing the wrappers</title>
1622
1623<para> The wrapper library will be built automatically if possible.
1624Valgrind's configure script will look for a suitable
1625<computeroutput>mpicc</computeroutput> to build it with. This must be
1626the same <computeroutput>mpicc</computeroutput> you use to build the
1627MPI application you want to debug. By default, Valgrind tries
1628<computeroutput>mpicc</computeroutput>, but you can specify a
njna3311642009-08-10 01:29:14 +00001629different one by using the configure-time option
njn7316df22009-08-04 01:16:01 +00001630<option>--with-mpicc</option>. Currently the
sewardj778d7832007-11-22 01:21:56 +00001631wrappers are only buildable with
1632<computeroutput>mpicc</computeroutput>s which are based on GNU
njn7316df22009-08-04 01:16:01 +00001633GCC or Intel's C++ Compiler.</para>
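
<para>For example, a configure invocation selecting a specific
compiler wrapper might look like the following. The path
<computeroutput>/opt/mpi/bin/mpicc</computeroutput> is purely
illustrative; substitute the <computeroutput>mpicc</computeroutput>
that you use to build your application.</para>

<programlisting><![CDATA[
./configure --prefix=$prefix --with-mpicc=/opt/mpi/bin/mpicc
]]></programlisting>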
sewardj778d7832007-11-22 01:21:56 +00001634
1635<para>Check that the configure script prints a line like this:</para>
1636
1637<programlisting><![CDATA[
1638checking for usable MPI2-compliant mpicc and mpi.h... yes, mpicc
1639]]></programlisting>
1640
1641<para>If it says <computeroutput>... no</computeroutput>, your
1642<computeroutput>mpicc</computeroutput> has failed to compile and link
1643a test MPI2 program.</para>
1644
1645<para>If the configure test succeeds, continue in the usual way with
1646<computeroutput>make</computeroutput> and <computeroutput>make
1647install</computeroutput>. The final install tree should then contain
njn2f7eebe2009-08-05 06:34:27 +00001648<computeroutput>libmpiwrap-&lt;platform&gt;.so</computeroutput>.
sewardj778d7832007-11-22 01:21:56 +00001649</para>
1650
<para>Compile a test MPI program (e.g. MPI hello-world) and try
this:</para>
1653
1654<programlisting><![CDATA[
njn6bf365c2009-02-11 00:35:45 +00001655LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-<platform>.so \
sewardj778d7832007-11-22 01:21:56 +00001656 mpirun [args] $prefix/bin/valgrind ./hello
1657]]></programlisting>
1658
<para>You should see something similar to the following:</para>
1660
1661<programlisting><![CDATA[
1662valgrind MPI wrappers 31901: Active for pid 31901
1663valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options
1664]]></programlisting>
1665
<para>repeated for every process in the group. If you do not see
these, there is a build/installation problem of some kind.</para>
1668
1669<para> The MPI functions to be wrapped are assumed to be in an ELF
1670shared object with soname matching
1671<computeroutput>libmpi.so*</computeroutput>. This is known to be
1672correct at least for Open MPI and Quadrics MPI, and can easily be
1673changed if required.</para>
1674</sect2>
1675
1676
1677<sect2 id="mc-manual.mpiwrap.gettingstarted"
1678 xreflabel="Getting started with MPI Wrappers">
1679<title>Getting started</title>
1680
1681<para>Compile your MPI application as usual, taking care to link it
1682using the same <computeroutput>mpicc</computeroutput> that your
1683Valgrind build was configured with.</para>
1684
1685<para>
1686Use the following basic scheme to run your application on Valgrind with
1687the wrappers engaged:</para>
1688
1689<programlisting><![CDATA[
1690MPIWRAP_DEBUG=[wrapper-args] \
njn6bf365c2009-02-11 00:35:45 +00001691 LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-<platform>.so \
sewardj778d7832007-11-22 01:21:56 +00001692 mpirun [mpirun-args] \
1693 $prefix/bin/valgrind [valgrind-args] \
1694 [application] [app-args]
1695]]></programlisting>
1696
1697<para>As an alternative to
1698<computeroutput>LD_PRELOAD</computeroutput>ing
njn6bf365c2009-02-11 00:35:45 +00001699<computeroutput>libmpiwrap-&lt;platform&gt;.so</computeroutput>, you can
1700simply link it to your application if desired. This should not disturb
1701native behaviour of your application in any way.</para>
sewardj778d7832007-11-22 01:21:56 +00001702</sect2>
1703
1704
1705<sect2 id="mc-manual.mpiwrap.controlling"
1706 xreflabel="Controlling the MPI Wrappers">
1707<title>Controlling the wrapper library</title>
1708
1709<para>Environment variable
1710<computeroutput>MPIWRAP_DEBUG</computeroutput> is consulted at
1711startup. The default behaviour is to print a starting banner</para>
1712
1713<programlisting><![CDATA[
1714valgrind MPI wrappers 16386: Active for pid 16386
1715valgrind MPI wrappers 16386: Try MPIWRAP_DEBUG=help for possible options
1716]]></programlisting>
1717
1718<para> and then be relatively quiet.</para>
1719
1720<para>You can give a list of comma-separated options in
1721<computeroutput>MPIWRAP_DEBUG</computeroutput>. These are</para>
1722
1723<itemizedlist>
1724 <listitem>
1725 <para><computeroutput>verbose</computeroutput>:
1726 show entries/exits of all wrappers. Also show extra
1727 debugging info, such as the status of outstanding
1728 <computeroutput>MPI_Request</computeroutput>s resulting
1729 from uncompleted <computeroutput>MPI_Irecv</computeroutput>s.</para>
1730 </listitem>
1731 <listitem>
1732 <para><computeroutput>quiet</computeroutput>:
1733 opposite of <computeroutput>verbose</computeroutput>, only print
1734 anything when the wrappers want
1735 to report a detected programming error, or in case of catastrophic
1736 failure of the wrappers.</para>
1737 </listitem>
1738 <listitem>
1739 <para><computeroutput>warn</computeroutput>:
1740 by default, functions which lack proper wrappers
1741 are not commented on, just silently
1742 ignored. This causes a warning to be printed for each unwrapped
1743 function used, up to a maximum of three warnings per function.</para>
1744 </listitem>
1745 <listitem>
1746 <para><computeroutput>strict</computeroutput>:
1747 print an error message and abort the program if
1748 a function lacking a wrapper is used.</para>
1749 </listitem>
1750</itemizedlist>
1751
1752<para> If you want to use Valgrind's XML output facility
njn7e5d4ed2009-07-30 02:57:52 +00001753(<option>--xml=yes</option>), you should pass
sewardj778d7832007-11-22 01:21:56 +00001754<computeroutput>quiet</computeroutput> in
1755<computeroutput>MPIWRAP_DEBUG</computeroutput> so as to get rid of any
1756extraneous printing from the wrappers.</para>
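
<para>A run combining the two might look like the sketch below. It
assumes a Valgrind version that supports
<option>--xml-file</option> and the <computeroutput>%p</computeroutput>
(process ID) expansion in its argument, so that each MPI process writes
its own file; the file name pattern is an arbitrary choice.</para>

<programlisting><![CDATA[
MPIWRAP_DEBUG=quiet                                          \
   LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-<platform>.so   \
   mpirun [mpirun-args]                                       \
   $prefix/bin/valgrind --xml=yes --xml-file=vg.%p.xml ./hello
]]></programlisting>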
1757
1758</sect2>
1759
1760
njn2f7eebe2009-08-05 06:34:27 +00001761<sect2 id="mc-manual.mpiwrap.limitations.functions"
1762 xreflabel="Functions: Abilities and Limitations">
sewardj778d7832007-11-22 01:21:56 +00001763<title>Functions</title>
1764
1765<para>All MPI2 functions except
1766<computeroutput>MPI_Wtick</computeroutput>,
1767<computeroutput>MPI_Wtime</computeroutput> and
1768<computeroutput>MPI_Pcontrol</computeroutput> have wrappers. The
1769first two are not wrapped because they return a
njn2f7eebe2009-08-05 06:34:27 +00001770<computeroutput>double</computeroutput>, which Valgrind's
1771function-wrap mechanism cannot handle (but it could easily be
1772extended to do so). <computeroutput>MPI_Pcontrol</computeroutput> cannot be
sewardj778d7832007-11-22 01:21:56 +00001773wrapped as it has variable arity:
1774<computeroutput>int MPI_Pcontrol(const int level, ...)</computeroutput></para>
1775
1776<para>Most functions are wrapped with a default wrapper which does
1777nothing except complain or abort if it is called, depending on
1778settings in <computeroutput>MPIWRAP_DEBUG</computeroutput> listed
1779above. The following functions have "real", do-something-useful
1780wrappers:</para>
1781
1782<programlisting><![CDATA[
1783PMPI_Send PMPI_Bsend PMPI_Ssend PMPI_Rsend
1784
1785PMPI_Recv PMPI_Get_count
1786
1787PMPI_Isend PMPI_Ibsend PMPI_Issend PMPI_Irsend
1788
1789PMPI_Irecv
1790PMPI_Wait PMPI_Waitall
1791PMPI_Test PMPI_Testall
1792
1793PMPI_Iprobe PMPI_Probe
1794
1795PMPI_Cancel
1796
1797PMPI_Sendrecv
1798
1799PMPI_Type_commit PMPI_Type_free
1800
1801PMPI_Pack PMPI_Unpack
1802
1803PMPI_Bcast PMPI_Gather PMPI_Scatter PMPI_Alltoall
1804PMPI_Reduce PMPI_Allreduce PMPI_Op_create
1805
1806PMPI_Comm_create PMPI_Comm_dup PMPI_Comm_free PMPI_Comm_rank PMPI_Comm_size
1807
1808PMPI_Error_string
1809PMPI_Init PMPI_Initialized PMPI_Finalize
1810]]></programlisting>
1811
1812<para> A few functions such as
1813<computeroutput>PMPI_Address</computeroutput> are listed as
1814<computeroutput>HAS_NO_WRAPPER</computeroutput>. They have no wrapper
1815at all as there is nothing worth checking, and giving a no-op wrapper
1816would reduce performance for no reason.</para>
1817
<para> Note that the wrapper library can itself generate large
1819numbers of calls to the MPI implementation, especially when walking
1820complex types. The most common functions called are
1821<computeroutput>PMPI_Extent</computeroutput>,
1822<computeroutput>PMPI_Type_get_envelope</computeroutput>,
1823<computeroutput>PMPI_Type_get_contents</computeroutput>, and
1824<computeroutput>PMPI_Type_free</computeroutput>. </para>
njn2f7eebe2009-08-05 06:34:27 +00001825</sect2>
sewardj778d7832007-11-22 01:21:56 +00001826
njn2f7eebe2009-08-05 06:34:27 +00001827<sect2 id="mc-manual.mpiwrap.limitations.types"
1828 xreflabel="Types: Abilities and Limitations">
sewardj778d7832007-11-22 01:21:56 +00001829<title>Types</title>
1830
1831<para> MPI-1.1 structured types are supported, and walked exactly.
1832The currently supported combiners are
1833<computeroutput>MPI_COMBINER_NAMED</computeroutput>,
1834<computeroutput>MPI_COMBINER_CONTIGUOUS</computeroutput>,
1835<computeroutput>MPI_COMBINER_VECTOR</computeroutput>,
<computeroutput>MPI_COMBINER_HVECTOR</computeroutput>,
1837<computeroutput>MPI_COMBINER_INDEXED</computeroutput>,
1838<computeroutput>MPI_COMBINER_HINDEXED</computeroutput> and
1839<computeroutput>MPI_COMBINER_STRUCT</computeroutput>. This should
1840cover all MPI-1.1 types. The mechanism (function
1841<computeroutput>walk_type</computeroutput>) should extend easily to
1842cover MPI2 combiners.</para>
1843
1844<para>MPI defines some named structured types
1845(<computeroutput>MPI_FLOAT_INT</computeroutput>,
1846<computeroutput>MPI_DOUBLE_INT</computeroutput>,
1847<computeroutput>MPI_LONG_INT</computeroutput>,
1848<computeroutput>MPI_2INT</computeroutput>,
1849<computeroutput>MPI_SHORT_INT</computeroutput>,
1850<computeroutput>MPI_LONG_DOUBLE_INT</computeroutput>) which are pairs
1851of some basic type and a C <computeroutput>int</computeroutput>.
1852Unfortunately the MPI specification makes it impossible to look inside
1853these types and see where the fields are. Therefore these wrappers
1854assume the types are laid out as <computeroutput>struct { float val;
1855int loc; }</computeroutput> (for
1856<computeroutput>MPI_FLOAT_INT</computeroutput>), etc, and act
1857accordingly. This appears to be correct at least for Open MPI 1.0.2
1858and for Quadrics MPI.</para>
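
<para>The following sketch shows what that layout assumption amounts to
for <computeroutput>MPI_FLOAT_INT</computeroutput>: each field's byte
range is derived from the assumed C struct, and any padding is not
visited. The names <varname>struct float_int</varname>,
<function>visit_float_int</function> and
<function>sum_len</function> are hypothetical and are not taken from the
wrapper library.</para>

<programlisting><![CDATA[
#include <stddef.h>

/* Assumed in-memory layout of one MPI_FLOAT_INT element. */
struct float_int { float val; int loc; };

/* Callback invoked once per field with that field's byte range. */
typedef void (*range_fn)(const char *addr, size_t len, void *ctx);

/* Visit the two fields of the element starting at base. */
void visit_float_int(const char *base, range_fn f, void *ctx)
{
    f(base + offsetof(struct float_int, val), sizeof(float), ctx);
    f(base + offsetof(struct float_int, loc), sizeof(int), ctx);
}

/* Example callback: add up the number of bytes visited. */
void sum_len(const char *addr, size_t len, void *ctx)
{
    (void)addr;
    *(size_t *)ctx += len;
}
]]></programlisting>

<para>A real callback would mark or check each range, for example with
the Memcheck client requests, instead of counting bytes.</para>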
1859
1860<para>If <computeroutput>strict</computeroutput> is an option specified
1861in <computeroutput>MPIWRAP_DEBUG</computeroutput>, the application
1862will abort if an unhandled type is encountered. Otherwise, the
1863application will print a warning message and continue.</para>
1864
1865<para>Some effort is made to mark/check memory ranges corresponding to
1866arrays of values in a single pass. This is important for performance
1867since asking Valgrind to mark/check any range, no matter how small,
1868carries quite a large constant cost. This optimisation is applied to
1869arrays of primitive types (<computeroutput>double</computeroutput>,
1870<computeroutput>float</computeroutput>,
1871<computeroutput>int</computeroutput>,
1872<computeroutput>long</computeroutput>, <computeroutput>long
1873long</computeroutput>, <computeroutput>short</computeroutput>,
1874<computeroutput>char</computeroutput>, and <computeroutput>long
1875double</computeroutput> on platforms where <computeroutput>sizeof(long
1876double) == 8</computeroutput>). For arrays of all other types, the
1877wrappers handle each element individually and so there can be a very
1878large performance cost.</para>
1879
sewardj778d7832007-11-22 01:21:56 +00001880</sect2>
1881
1882
1883<sect2 id="mc-manual.mpiwrap.writingwrappers"
1884 xreflabel="Writing new MPI Wrappers">
1885<title>Writing new wrappers</title>
1886
1887<para>
1888For the most part the wrappers are straightforward. The only
1889significant complexity arises with nonblocking receives.</para>
1890
1891<para>The issue is that <computeroutput>MPI_Irecv</computeroutput>
1892states the recv buffer and returns immediately, giving a handle
1893(<computeroutput>MPI_Request</computeroutput>) for the transaction.
1894Later the user will have to poll for completion with
1895<computeroutput>MPI_Wait</computeroutput> etc, and when the
1896transaction completes successfully, the wrappers have to paint the
1897recv buffer. But the recv buffer details are not presented to
1898<computeroutput>MPI_Wait</computeroutput> -- only the handle is. The
1899library therefore maintains a shadow table which associates
1900uncompleted <computeroutput>MPI_Request</computeroutput>s with the
1901corresponding buffer address/count/type. When an operation completes,
1902the table is searched for the associated address/count/type info, and
1903memory is marked accordingly.</para>
1904
1905<para>Access to the table is guarded by a (POSIX pthreads) lock, so as
1906to make the library thread-safe.</para>
1907
1908<para>The table is allocated with
1909<computeroutput>malloc</computeroutput> and never
1910<computeroutput>free</computeroutput>d, so it will show up in leak
1911checks.</para>
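
<para>The idea of the shadow table can be sketched as follows. This is
not the code in <computeroutput>mpi/libmpiwrap.c</computeroutput>: the
names are hypothetical, and plain <computeroutput>int</computeroutput>
stands in for <computeroutput>MPI_Request</computeroutput> and the MPI
datatype so that the example is self-contained.</para>

<programlisting><![CDATA[
#include <pthread.h>
#include <stdlib.h>

typedef int request_t;  /* stand-in for MPI_Request (assumption) */

struct shadow {
    int       in_use;
    request_t req;
    void     *buf;    /* receive buffer address */
    int       count;  /* element count */
    int       type;   /* stand-in for the MPI datatype */
};

static struct shadow  *table = NULL;
static size_t          table_len = 0;
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Record buffer details for a newly posted nonblocking receive.
   Returns 0 on success, -1 if the table could not be grown. */
int shadow_add(request_t req, void *buf, int count, int type)
{
    size_t i;
    int rc = 0;

    pthread_mutex_lock(&table_lock);
    for (i = 0; i < table_len; i++)
        if (!table[i].in_use)
            break;
    if (i == table_len) {  /* no free slot: grow the table */
        size_t new_len = table_len ? 2 * table_len : 8;
        struct shadow *t = realloc(table, new_len * sizeof *t);
        if (t == NULL) {
            rc = -1;
            goto out;
        }
        for (size_t j = table_len; j < new_len; j++)
            t[j].in_use = 0;
        table = t;
        table_len = new_len;
    }
    table[i] = (struct shadow){ 1, req, buf, count, type };
out:
    pthread_mutex_unlock(&table_lock);
    return rc;
}

/* On completion, look up and remove the entry for req. Returns 1 and
   fills *out if an entry was found, 0 otherwise. */
int shadow_take(request_t req, struct shadow *out)
{
    int found = 0;

    pthread_mutex_lock(&table_lock);
    for (size_t i = 0; i < table_len; i++) {
        if (table[i].in_use && table[i].req == req) {
            *out = table[i];
            table[i].in_use = 0;
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&table_lock);
    return found;
}
]]></programlisting>

<para>In this sketch a wrapper for a nonblocking receive would call
<function>shadow_add</function> after the real call returns, and a
wrapper for a completion function would call
<function>shadow_take</function> and then mark the recorded buffer as
defined.</para>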
1912
1913<para>Writing new wrappers should be fairly easy. The source file is
njna437a602009-08-04 05:24:46 +00001914<computeroutput>mpi/libmpiwrap.c</computeroutput>. If possible,
sewardj778d7832007-11-22 01:21:56 +00001915find an existing wrapper for a function of similar behaviour to the
1916one you want to wrap, and use it as a starting point. The wrappers
1917are organised in sections in the same order as the MPI 1.1 spec, to
1918aid navigation. When adding a wrapper, remember to comment out the
1919definition of the default wrapper in the long list of defaults at the
1920bottom of the file (do not remove it, just comment it out).</para>
1921</sect2>
1922
1923<sect2 id="mc-manual.mpiwrap.whattoexpect"
1924 xreflabel="What to expect with MPI Wrappers">
1925<title>What to expect when using the wrappers</title>
1926
1927<para>The wrappers should reduce Memcheck's false-error rate on MPI
1928applications. Because the wrapping is done at the MPI interface,
1929there will still potentially be a large number of errors reported in
1930the MPI implementation below the interface. The best you can do is
1931try to suppress them.</para>
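
<para>A suppression for such errors might look like the sketch below.
The suppression name is arbitrary, the object pattern assumes the MPI
library's soname matches
<computeroutput>libmpi.so*</computeroutput>, and the
<computeroutput>...</computeroutput> frame wildcard assumes a Valgrind
version that supports it. Running with
<option>--gen-suppressions=all</option> prints suppressions matching
the errors actually seen, which is usually a better starting
point.</para>

<programlisting><![CDATA[
{
   mpi-library-internal-cond
   Memcheck:Cond
   ...
   obj:*/libmpi.so*
}
]]></programlisting>

<para>Save the entry in a file and pass it to Valgrind with
<option>--suppressions=</option>.</para>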
1932
1933<para>You may also find that the input-side (buffer
1934length/definedness) checks find errors in your MPI use, for example
1935passing too short a buffer to
1936<computeroutput>MPI_Recv</computeroutput>.</para>
1937
1938<para>Functions which are not wrapped may increase the false
1939error rate. A possible approach is to run with
<computeroutput>MPIWRAP_DEBUG</computeroutput> containing
1941<computeroutput>warn</computeroutput>. This will show you functions
1942which lack proper wrappers but which are nevertheless used. You can
1943then write wrappers for them.
1944</para>
1945
<para>A known source of potential false errors is the
1947<computeroutput>PMPI_Reduce</computeroutput> family of functions, when
1948using a custom (user-defined) reduction function. In a reduction
1949operation, each node notionally sends data to a "central point" which
1950uses the specified reduction function to merge the data items into a
1951single item. Hence, in general, data is passed between nodes and fed
1952to the reduction function, but the wrapper library cannot mark the
1953transferred data as initialised before it is handed to the reduction
1954function, because all that happens "inside" the
1955<computeroutput>PMPI_Reduce</computeroutput> call. As a result you
1956may see false positives reported in your reduction function.</para>
1957
1958</sect2>
sewardjce10c262006-10-05 17:56:14 +00001959
1960</sect1>
sewardj778d7832007-11-22 01:21:56 +00001961
1962
1963
1964
1965
njn3e986b22004-11-30 10:43:45 +00001966</chapter>