<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[ <!ENTITY % vg-entities SYSTEM "vg-entities.xml"> %vg-entities; ]>
5
6<book id="FAQ" xreflabel="Valgrind FAQ">
7
8 <bookinfo>
9 <title>Valgrind FAQ</title>
10 </bookinfo>
11
12
13<chapter id="faq.background" xreflabel="Background">
14<title>Background</title>
15
16<qandaset id="qset.background">
17
18<qandaentry id="faq.pronounce">
19 <question>
20 <para>How do you pronounce "Valgrind"?</para>
21 </question>
22 <answer>
  <para>The "Val" as in the word "value". The "grind" is
  pronounced with a short 'i' -- ie. "grinned" (rhymes with
  "tinned") rather than "grined" (rhymes with "find").</para>
26 <para>Don't feel bad: almost everyone gets it wrong at
27 first.</para>
28 </answer>
29</qandaentry>
30
31<qandaentry id="faq.whence">
32 <question>
33 <para>Where does the name "Valgrind" come from?</para>
34 </question>
35 <answer>
  <para>From Nordic mythology. Originally (before release) the
  project was named Heimdall, after the watchman of the Nordic
  gods. He could "see a hundred miles by day or night, hear the
  grass growing, see the wool growing on a sheep's back" (etc).
  This would have been a great name, but it was already taken by
  a security package, "Heimdal".</para>
  <para>In keeping with the
  Nordic theme, Valgrind was chosen. Valgrind is the name of the
  main entrance to Valhalla (the Hall of the Chosen Slain in
  Asgard). Over this entrance there resides a wolf, over the wolf
  there is the head of a boar, and on it perches a huge eagle
  whose eyes can see to the far regions of the nine worlds. Only
  those judged worthy by the guardians are allowed to pass
  through Valgrind. All others are refused entrance.</para>
49 <para>It's not short for "value grinder", although that's not a
50 bad guess.</para>
51 </answer>
52 </qandaentry>
53
54</qandaset>
55
56</chapter>
57
58
59<chapter id="faq.installing"
60 xreflabel="Compiling, installing and configuring">
61<title>Compiling, installing and configuring</title>
62<qandaset id="qset.installing">
63
64<qandaentry id="faq.make_dies">
65 <question>
  <para>When I try to build Valgrind, 'make' dies partway with
  an assertion failure, something like this:
<screen>
% make: expand.c:489: allocated_variable_append:
  Assertion 'current_variable_set_list->next != 0' failed.
</screen>
72 </para>
73 </question>
74 <answer>
  <para>It's probably a bug in 'make'. Some, but not all,
  instances of version 3.79.1 have this bug; see
  <ulink url="http://www.mail-archive.com/bug-make@gnu.org/msg01658.html">www.mail-archive.com/bug-make@gnu.org/msg01658.html</ulink>.
  Try upgrading to a more recent version of 'make'. Alternatively,
  we have heard that unsetting the CFLAGS environment variable
  avoids the problem.</para>
81 </answer>
82</qandaentry>
83
84</qandaset>
85</chapter>
86
87
88
89<chapter id="faq.abort"
90 xreflabel="Valgrind aborts unexpectedly">
91<title>Valgrind aborts unexpectedly</title>
92<qandaset id="qset.abort">
93
94<qandaentry id="faq.exit_errors">
95 <question>
96 <para>Programs run OK on Valgrind, but at exit produce a bunch
97 of errors a bit like this:</para>
98 </question>
99 <answer><para>
<programlisting>
==20755== Invalid read of size 4
==20755==    at 0x40281C8A: _nl_unload_locale (loadlocale.c:238)
==20755==    by 0x4028179D: free_mem (findlocale.c:257)
==20755==    by 0x402E0962: __libc_freeres (set-freeres.c:34)
==20755==    by 0x40048DCC: vgPlain___libc_freeres_wrapper (vg_clientfuncs.c:585)
==20755==  Address 0x40CC304C is 8 bytes inside a block of size 380 free'd
==20755==    at 0x400484C9: free (vg_clientfuncs.c:180)
==20755==    by 0x40281CBA: _nl_unload_locale (loadlocale.c:246)
==20755==    by 0x40281218: free_mem (setlocale.c:461)
==20755==    by 0x402E0962: __libc_freeres (set-freeres.c:34)
</programlisting>
112
113 and then die with a segmentation fault.</para>
114 <para>When the program exits, Valgrind runs the procedure
115 <literal>__libc_freeres()</literal> in glibc. This is a hook
116 for memory debuggers, so they can ask glibc to free up any
117 memory it has used. Doing that is needed to ensure that
118 Valgrind doesn't incorrectly report space leaks in glibc.</para>
  <para>The problem is that running
  <literal>__libc_freeres()</literal> in older glibc versions
  causes this crash.</para>
  <para>Workaround for Valgrind 1.1.X and later: use the
  <literal>--run-libc-freeres=no</literal> flag. You may then get
  space leak reports for glibc allocations (please
  <emphasis>don't</emphasis> report
  these to the glibc people, since they are not real leaks), but
  at least the program runs.</para>
127 </answer>
128</qandaentry>
129
130<qandaentry id="faq.bugdeath">
131 <question>
132 <para>My (buggy) program dies like this:</para>
133 </question>
134 <answer>
<screen>
% valgrind: vg_malloc2.c:442 (bszW_to_pszW): Assertion 'pszW >= 0' failed.
</screen>

  <para>If Memcheck (the memory checker) shows any invalid reads,
  invalid writes or invalid frees in your program, the above may
  happen. The reason is that your program may trash Valgrind's
  low-level memory manager, which then dies with the above
  assertion, or something like it. The cure is to fix your
  program so that it doesn't do any illegal memory accesses. The
  above failure will hopefully go away after that.</para>
146 </answer>
147</qandaentry>
148
149<qandaentry id="faq.msgdeath">
150 <question>
151 <para>My program dies, printing a message like this along the
152 way:</para>
153 </question>
154 <answer>
<screen>
% disInstr: unhandled instruction bytes: 0x66 0xF 0x2E 0x5
</screen>

  <para>Older versions did not support some x86 instructions,
  particularly SSE/SSE2 instructions. Try a newer Valgrind; we
  now support almost all instructions. If it still happens with
  newer versions and the failing instruction is an SSE/SSE2
  instruction, you might be able to recompile your program
  without it by passing the
  <computeroutput>-march</computeroutput> flag to gcc. Either way,
  let us know and we'll try to fix it.</para>
167
168 <para>Another possibility is that your program has a bug and
169 erroneously jumps to a non-code address, in which case you'll
170 get a SIGILL signal. Memcheck/Addrcheck may issue a warning
171 just before this happens, but they might not if the jump
172 happens to land in addressable memory.</para>
173 </answer>
174</qandaentry>
175
176<qandaentry id="faq.defdeath">
177 <question>
178 <para>My program dies like this:</para>
179 </question>
180 <answer>
<screen>
% error: /lib/librt.so.1: symbol __pthread_clock_settime,
  version GLIBC_PRIVATE not defined in file libpthread.so.0 with link time reference
</screen>

  <para>This is a total swamp. Nevertheless there is a way out.
  The problem is not easy to fix. It arises because
  <filename>/lib/librt.so.1</filename> refers to the
  symbols <literal>__pthread_clock_settime</literal> and
  <literal>__pthread_clock_gettime</literal> in
  <filename>/lib/libpthread.so</filename>, which are not intended
  to be exported, ie. they are private.</para>

  <para>The best solution is to ensure your program does not use
  <filename>/lib/librt.so.1</filename>.</para>
196
197 <para>However ... since you're probably not using it directly,
198 or even knowingly, that's hard to do. You might instead be
199 able to fix it by playing around with
200 <filename>coregrind/vg_libpthread.vs</filename>. Things to
201 try:</para>
202
  <para>Remove this:</para>
<programlisting>
GLIBC_PRIVATE {
   __pthread_clock_gettime;
   __pthread_clock_settime;
};
</programlisting>

  <para>or maybe remove this:</para>
<programlisting>
GLIBC_2.2.3 {
   __pthread_clock_gettime;
   __pthread_clock_settime;
} GLIBC_2.2;
</programlisting>

  <para>or maybe add this:</para>
<programlisting>
GLIBC_2.2.4 {
   __pthread_clock_gettime;
   __pthread_clock_settime;
} GLIBC_2.2;

GLIBC_2.2.5 {
   __pthread_clock_gettime;
   __pthread_clock_settime;
} GLIBC_2.2;
</programlisting>
231
232 <para>or some combination of the above. After each change you
233 need to delete <filename>coregrind/libpthread.so</filename> and
234 do <computeroutput>make &amp;&amp; make
235 install</computeroutput>.</para>
236
237 <para>I just don't know if any of the above will work. If you
238 can find a solution which works, I would be interested to hear
239 it.</para>
240
241 <para>To which someone replied:</para>
<screen>
I deleted this:

GLIBC_2.2.3 {
   __pthread_clock_gettime;
   __pthread_clock_settime;
} GLIBC_2.2;

and it worked.
</screen>
252
253 </answer>
254</qandaentry>
255
256</qandaset>
257</chapter>
258
259
260<chapter id="faq.unexpected"
261 xreflabel="Valgrind behaves unexpectedly">
262<title>Valgrind behaves unexpectedly</title>
263<qandaset id="qset.unexpected">
264
265<qandaentry id="faq.no-output">
266 <question>
267 <para>I try running "valgrind my-program", but my-program runs
268 normally, and Valgrind doesn't emit any output at all.</para>
269 </question>
270 <answer>
271 <para><command>For versions prior to 2.1.1:</command></para>
272
273 <para>Valgrind doesn't work out-of-the-box with programs that
274 are entirely statically linked. It does a quick test at
275 startup, and if it detects that the program is statically
276 linked, it aborts with an explanation.</para>
277
278 <para>This test may fail in some obscure cases, eg. if you run
279 a script under Valgrind and the script interpreter is
280 statically linked.</para>
281
282 <para>If you still want static linking, you can ask gcc to link
283 certain libraries statically. Try the following options:</para>
<screen>
-Wl,-Bstatic -lmyLibrary1 -lotherLibrary -Wl,-Bdynamic
</screen>
287
288 <para>Just make sure you end with
289 <computeroutput>-Wl,-Bdynamic</computeroutput> so that libc is
290 dynamically linked.</para>
291
292 <para>If you absolutely cannot use dynamic libraries, you can
293 try statically linking together all the .o files in coregrind/,
294 all the .o files of the tool of your choice (eg. those in
295 memcheck/), and the .o files of your program. You'll end up
296 with a statically linked binary that runs permanently under
297 Valgrind's control. Note that we haven't tested this procedure
298 thoroughly.</para>
299
300 <para><command>For versions 2.1.1 and later:</command></para>
301 <para>Valgrind does now work with static binaries, although
302 beware that some of the tools won't operate as well as normal,
303 because they have access to less information about how the
304 program runs. Eg. Memcheck will miss some errors that it would
305 otherwise find. This is because Valgrind doesn't replace
306 malloc() and friends with its own versions. It's best if your
307 program is dynamically linked with glibc.</para>
308 </answer>
309</qandaentry>
310
311<qandaentry id="faq.slowthread">
312 <question>
313 <para>My threaded server process runs unbelievably slowly on
314 Valgrind. So slowly, in fact, that at first I thought it had
315 completely locked up.</para>
316 </question>
317 <answer>
318 <para>We are not completely sure about this, but one
319 possibility is that laptops with power management fool
320 Valgrind's timekeeping mechanism, which is (somewhat in error)
321 based on the x86 RDTSC instruction. A "fix" which is claimed
322 to work is to run some other cpu-intensive process at the same
323 time, so that the laptop's power-management clock-slowing does
324 not kick in. We would be interested in hearing more feedback
325 on this.</para>
326
327 <para>Another possible cause is that versions prior to 1.9.6
328 did not support threading on glibc 2.3.X systems well.
329 Hopefully the situation is much improved with 1.9.6 and later
330 versions.</para>
331 </answer>
332</qandaentry>
333
334
335<qandaentry id="faq.reports">
336 <question>
337 <para>My program uses the C++ STL and string classes. Valgrind
338 reports 'still reachable' memory leaks involving these classes
339 at the exit of the program, but there should be none.</para>
340 </question>
341 <answer>
342 <para>First of all: relax, it's probably not a bug, but a
343 feature. Many implementations of the C++ standard libraries
344 use their own memory pool allocators. Memory for quite a
345 number of destructed objects is not immediately freed and given
  back to the OS, but kept in the pool(s) for later re-use. The
  fact that the pools are not freed at the exit() of the program
  causes Valgrind to report this memory as still reachable. Not
  freeing the pools at exit() could be called a bug in the
  library, though.</para>
351
352 <para>Using gcc, you can force the STL to use malloc and to
353 free memory as soon as possible by globally disabling memory
354 caching. Beware! Doing so will probably slow down your
355 program, sometimes drastically.</para>
356 <itemizedlist>
357 <listitem>
358 <para>With gcc 2.91, 2.95, 3.0 and 3.1, compile all source
359 using the STL with <literal>-D__USE_MALLOC</literal>. Beware!
360 This is removed from gcc starting with version 3.3.</para>
361 </listitem>
362 <listitem>
    <para>With gcc 3.2.2 and later, you should export the environment
    variable <literal>GLIBCPP_FORCE_NEW</literal> before running
    your program.</para>
366 </listitem>
367 </itemizedlist>
368
369 <para>There are other ways to disable memory pooling: using the
370 <literal>malloc_alloc</literal> template with your objects (not
371 portable, but should work for gcc) or even writing your own
372 memory allocators. But all this goes beyond the scope of this
373 FAQ. Start by reading <ulink
374 url="http://gcc.gnu.org/onlinedocs/libstdc++/ext/howto.html#3">
375 http://gcc.gnu.org/onlinedocs/libstdc++/ext/howto.html#3</ulink>
376 if you absolutely want to do that. But beware:</para>
377
378 <orderedlist>
379 <listitem>
380 <para>there are currently changes underway for gcc which are
381 not totally reflected in the docs right now ("now" == 26 Apr
382 03)</para>
383 </listitem>
384 <listitem>
    <para>allocators belong to the messier parts of the STL,
    and people went to great lengths to make it portable across
    platforms. Chances are good that your solution will work on
    your platform, but not on others.</para>
389 </listitem>
390 </orderedlist>
391 </answer>
392</qandaentry>
393
394
395<qandaentry id="faq.unhelpful">
396 <question>
397 <para>The stack traces given by Memcheck (or another tool)
398 aren't helpful. How can I improve them?</para>
399 </question>
400 <answer>
401 <para>If they're not long enough, use
402 <literal>--num-callers</literal> to make them longer.</para>
403 <para>If they're not detailed enough, make sure you are
404 compiling with <literal>-g</literal> to add debug information.
405 And don't strip symbol tables (programs should be unstripped
406 unless you run 'strip' on them; some libraries ship
407 stripped).</para>
408
409 <para>Also, <literal>-fomit-frame-pointer</literal> and
410 <literal>-fstack-check</literal> can make stack traces
411 worse.</para>
412
413 <para>Some example sub-traces:</para>
414
415 <para>With debug information and unstripped (best):</para>
<programlisting>
Invalid write of size 1
   at 0x80483BF: really (malloc1.c:20)
   by 0x8048370: main (malloc1.c:9)
</programlisting>

  <para>With no debug information, unstripped:</para>
<programlisting>
Invalid write of size 1
   at 0x80483BF: really (in /auto/homes/njn25/grind/head5/a.out)
   by 0x8048370: main (in /auto/homes/njn25/grind/head5/a.out)
</programlisting>

  <para>With no debug information, stripped:</para>
<programlisting>
Invalid write of size 1
   at 0x80483BF: (within /auto/homes/njn25/grind/head5/a.out)
   by 0x8048370: (within /auto/homes/njn25/grind/head5/a.out)
   by 0x42015703: __libc_start_main (in /lib/tls/libc-2.3.2.so)
   by 0x80482CC: (within /auto/homes/njn25/grind/head5/a.out)
</programlisting>

  <para>With debug information and -fomit-frame-pointer:</para>
<programlisting>
Invalid write of size 1
   at 0x80483C4: really (malloc1.c:20)
   by 0x42015703: __libc_start_main (in /lib/tls/libc-2.3.2.so)
   by 0x80482CC: ??? (start.S:81)
</programlisting>
445
446 </answer>
447</qandaentry>
448
449</qandaset>
450</chapter>
451
452
453<chapter id="faq.notfound" xreflabel="Memcheck doesn't find my bug">
454<title>Memcheck doesn't find my bug</title>
455<qandaset id="qset.notfound">
456
457<qandaentry id="faq.hiddenbug">
458 <question>
459 <para>I try running "valgrind --tool=memcheck my_program" and
460 get Valgrind's startup message, but I don't get any errors and
461 I know my program has errors.</para>
462 </question>
463 <answer>
464 <para>By default, Valgrind only traces the top-level process.
465 So if your program spawns children, they won't be traced by
466 Valgrind by default. Also, if your program is started by a
467 shell script, Perl script, or something similar, Valgrind will
468 trace the shell, or the Perl interpreter, or equivalent.</para>
469
470 <para>To trace child processes, use the
471 <literal>--trace-children=yes</literal> option.</para>
472
473 <para>If you are tracing large trees of processes, it can be
474 less disruptive to have the output sent over the network. Give
475 Valgrind the flag
476 <literal>--log-socket=127.0.0.1:12345</literal> (if you want
477 logging output sent to <literal>port 12345</literal> on
478 <literal>localhost</literal>). You can use the
479 valgrind-listener program to listen on that port:</para>
<programlisting>
valgrind-listener 12345
</programlisting>
483
484 <para>Obviously you have to start the listener process first.
485 See the Manual: <ulink url="http://www.valgrind.org/docs/bookset/manual-core.out2file.html">Directing output to file</ulink> for more details.</para>
486 </answer>
487</qandaentry>
488
489
490<qandaentry id="faq.overruns">
491 <question>
492 <para>Why doesn't Memcheck find the array overruns in this program?</para>
493 </question>
494 <answer>
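<programlisting>
int static_array[5];    /* static storage */

int main(void)
{
  int stack_array[5];   /* automatic (stack) storage */

  static_array[5] = 0;  /* one past the end: not reported by Memcheck */
  stack_array[5]  = 0;  /* one past the end: not reported by Memcheck */

  return 0;
}
</programlisting>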
508 <para>Unfortunately, Memcheck doesn't do bounds checking on
509 static or stack arrays. We'd like to, but it's just not
510 possible to do in a reasonable way that fits with how Memcheck
511 works. Sorry.</para>
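  <para>For contrast, Memcheck does track the bounds of heap blocks
  obtained from <literal>malloc()</literal>. The following is a minimal
  sketch, not taken from the Valgrind sources: the helper name
  <literal>write_heap_slot</literal> and its parameters are hypothetical,
  and the exact wording of any report may differ between versions. Called
  with <literal>idx</literal> equal to <literal>n</literal>, it writes one
  element past the end of a heap block, which Memcheck would be expected
  to flag as an invalid write, provided the compiler has not optimised the
  store away.</para>
<programlisting><![CDATA[
```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (an assumption for illustration only).
   Allocates n ints on the heap and stores 0 at index idx.
   The store is in bounds only when idx < n; with idx == n it lands
   just past the end of the block, which is the kind of access
   Memcheck is designed to report for heap memory.
   Returns 0 on success, -1 if the allocation fails. */
int write_heap_slot(size_t n, size_t idx)
{
    int *heap = malloc(n * sizeof *heap);

    if (heap == NULL)
        return -1;

    heap[idx] = 0;   /* out of bounds whenever idx >= n */
    free(heap);
    return 0;
}
```
]]></programlisting>
  <para>Calling <literal>write_heap_slot(5, 5)</literal> from
  <literal>main()</literal> mirrors the static and stack overruns shown
  earlier, except that the array lives on the heap; building with
  <literal>-g</literal> and without optimisation should keep the store
  visible to Memcheck.</para>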
512 </answer>
513</qandaentry>
514
515
516<qandaentry id="faq.segfault">
517 <question>
518 <para>My program dies with a segmentation fault, but Memcheck
519 doesn't give any error messages before it, or none that look
520 related.</para>
521 </question>
522 <answer>
  <para>One possibility is that your program accesses memory
  with inappropriate permissions set, such as writing to
  read-only memory. Maybe your program is writing to a static
  string like this:</para>
<programlisting>
char* s = "hello";   /* string literal: may be placed in read-only memory */
s[0] = 'j';          /* write through s: undefined behaviour, often a segfault */
</programlisting>
531
532 <para>or something similar. Writing to read-only memory can
533 also apparently make LinuxThreads behave strangely.</para>
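  <para>If the string really does need to be modified, one common fix
  is to initialise a character array from the literal, which gives the
  program its own writable copy. A minimal sketch, in which the helper
  name <literal>jello</literal> and its parameters are hypothetical:</para>
<programlisting><![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration only).
   Builds the string "jello" in the caller's buffer `out`, which has
   room for `cap` bytes. Returns 0 on success, or -1 if `out` is NULL
   or too small to hold the string and its terminating NUL. */
int jello(char *out, size_t cap)
{
    char s[] = "hello";        /* array, not pointer: a writable copy of the literal */

    if (out == NULL || cap < sizeof s)
        return -1;

    s[0] = 'j';                /* legal: s is ordinary writable storage */
    memcpy(out, s, sizeof s);  /* sizeof s counts the terminating NUL */
    return 0;
}
```
]]></programlisting>
  <para>The only change from the failing snippet is the declaration:
  <literal>char s[]</literal> copies the literal into writable storage,
  whereas <literal>char* s</literal> merely points at the literal
  itself.</para>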
534 </answer>
535</qandaentry>
536
537</qandaset>
538</chapter>
539
540
541<chapter id="faq.misc"
542 xreflabel="Miscellaneous">
543<title>Miscellaneous</title>
544<qandaset id="qset.misc">
545
546<qandaentry id="faq.writesupp">
547 <question>
548 <para>I tried writing a suppression but it didn't work. Can
549 you write my suppression for me?</para>
550 </question>
551 <answer>
552 <para>Yes! Use the
553 <computeroutput>--gen-suppressions=yes</computeroutput> feature
554 to spit out suppressions automatically for you. You can then
555 edit them if you like, eg. combining similar automatically
556 generated suppressions using wildcards like
557 <literal>'*'</literal>.</para>
558
  <para>If you really want to write suppressions by hand, read
  the manual carefully. Note particularly that C++ function
  names must be <emphasis>mangled</emphasis>.</para>
562 </answer>
563</qandaentry>
564
565
566<qandaentry id="faq.deflost">
567 <question>
568 <para>With Memcheck/Addrcheck's memory leak detector, what's
569 the difference between "definitely lost", "possibly lost",
570 "still reachable", and "suppressed"?</para>
571 </question>
572 <answer>
573 <para>The details are in the Manual:
574 <ulink url="http://www.valgrind.org/docs/bookset/mc-manual.leaks.html">Memory leak detection</ulink>.</para>
575
576 <para>In short:</para>
577 <itemizedlist>
578 <listitem>
579 <para>"definitely lost" means your program is leaking memory
580 -- fix it!</para>
581 </listitem>
582 <listitem>
583 <para>"possibly lost" means your program is probably leaking
584 memory, unless you're doing funny things with
585 pointers.</para>
586 </listitem>
587 <listitem>
588 <para>"still reachable" means your program is probably ok --
589 it didn't free some memory it could have. This is quite
590 common and often reasonable. Don't use
591 <computeroutput>--show-reachable=yes</computeroutput> if you
592 don't want to see these reports.</para>
593 </listitem>
594 <listitem>
595 <para>"suppressed" means that a leak error has been
596 suppressed. There are some suppressions in the default
597 suppression files. You can ignore suppressed errors.</para>
598 </listitem>
599 </itemizedlist>
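  <para>The first three categories can be illustrated with a small
  sketch. The function name <literal>leak_demo</literal> and the variable
  names are hypothetical, and the classification noted in the comments is
  what the descriptions above would lead one to expect, not output
  captured from a real run.</para>
<programlisting><![CDATA[
```c
#include <stdlib.h>

static char *kept;      /* still points to the start of a block at exit */
static char *interior;  /* still points into the middle of a block at exit */

/* Hypothetical demo (an assumption for illustration only).
   Allocates three blocks and frees none of them on success.
   Returns 0 on success, -1 if any allocation fails. */
int leak_demo(void)
{
    char *lost = malloc(16);
    char *base = malloc(64);

    kept = malloc(32);
    if (lost == NULL || base == NULL || kept == NULL) {
        free(lost);
        free(base);
        free(kept);
        kept = NULL;
        return -1;
    }

    interior = base + 8;  /* only an interior pointer survives: expect "possibly lost" */
    lost = NULL;          /* no pointer survives: expect "definitely lost" */
    return 0;             /* kept is never freed: expect "still reachable" */
}
```
]]></programlisting>
  <para>A program that calls <literal>leak_demo()</literal> once and is
  then run under Memcheck with <literal>--leak-check=yes</literal> and
  <computeroutput>--show-reachable=yes</computeroutput> would be expected
  to list one block in each of the three categories, although the details
  depend on the Valgrind version and on how the compiler treats the dead
  store to <literal>lost</literal>.</para>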
600 </answer>
601</qandaentry>
602
603
604</qandaset>
605</chapter>
606
607
608<!-- template
609<chapter id="faq."
610 xreflabel="xx">
611<title>xx</title>
612<qandaset id="qset.">
613
614<qandaentry id="faq.deflost">
615 <question>
616 <para></para>
617 </question>
618 <answer>
619 <para></para>
620 </answer>
621</qandaentry>
622
623</qandaset>
624</chapter>
625-->
626
627
628
629<chapter id="faq.help" xreflabel="How To Get Further Assistance">
630<title>How To Get Further Assistance</title>
631
632
633<para>Please read all of this section before posting.</para>
634
635<para>If you think an answer is incomplete or inaccurate, please
636e-mail <ulink url="mailto:&vg-vemail;">&vg-vemail;</ulink>.</para>
637
638<para>Read the appropriate section(s) of the Manual(s):
639<ulink url="http://www.valgrind.org/docs/">Valgrind
640Documentation</ulink>.</para>
641
642<para>Read the <ulink url="http://www.valgrind.org/docs/">Distribution Documents</ulink>.</para>
643
644<para><ulink url="http://search.gmane.org">Search</ulink> the
645<ulink url="http://news.gmane.org/gmane.comp.debugging.valgrind">valgrind-users</ulink> mailing list archives, using the group name
646<computeroutput>gmane.comp.debugging.valgrind</computeroutput>.</para>
647
<para>Only when you have tried all of these things and are still stuck
should you post to the <ulink url="&vg-users-list;">valgrind-users
mailing list</ulink>. If you do, please read the following
651carefully. Making a complete posting will greatly increase the chances
652that an expert or fellow user reading it will have enough information
653and motivation to reply.</para>
654
655<para>Make sure you give full details of the problem,
656including the full output of <computeroutput>valgrind
657-v</computeroutput>, if applicable. Also which Linux distribution
658you're using (Red Hat, Debian, etc) and its version number.</para>
659
660<para>You are in little danger of making your posting too long
661unless you include large chunks of valgrind's (unsuppressed)
662output, so err on the side of giving too much information.</para>
663
664<para>Clearly written subject lines and message bodies are appreciated,
665too.</para>
666
667<para>Finally, remember that, despite the fact that most of the
668community are very helpful and responsive to emailed questions,
669you are probably requesting help from unpaid volunteers, so you
670have no guarantee of receiving an answer.</para>
671
672</chapter>
673
674</book>