| Valgrind FAQ, version 2.1.2 |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| Last revised 18 July 2004 |
| ~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| 1. Background |
| 2. Compiling, installing and configuring |
| 3. Valgrind aborts unexpectedly |
| 4. Valgrind behaves unexpectedly |
| 5. Memcheck doesn't find my bug |
| 6. Miscellaneous |
| |
| |
| ----------------------------------------------------------------- |
| 1. Background |
| ----------------------------------------------------------------- |
| |
| 1.1. How do you pronounce "Valgrind"? |
| |
| The "Val" as in the word "value". The "grind" is pronounced with a |
| short 'i' -- ie. "grinned" (rhymes with "tinned") rather than "grined" |
| (rhymes with "find"). |
| |
| Don't feel bad: almost everyone gets it wrong at first. |
| |
| ----------------------------------------------------------------- |
| |
| 1.2. Where does the name "Valgrind" come from? |
| |
| From Nordic mythology. Originally (before release) the project was |
| named Heimdall, after the watchman of the Nordic gods. He could "see a |
| hundred miles by day or night, hear the grass growing, see the wool |
| growing on a sheep's back" (etc). This would have been a great name, |
| but it was already taken by a security package "Heimdal". |
| |
| Keeping with the Nordic theme, Valgrind was chosen. Valgrind is the |
| name of the main entrance to Valhalla (the Hall of the Chosen Slain in |
| Asgard). Over this entrance resides a wolf; over the wolf is the head |
| of a boar, and on it perches a huge eagle whose eyes can see to the |
| far regions of the nine worlds. Only those judged worthy by the |
| guardians are allowed to pass through Valgrind. All others are refused |
| entrance. |
| |
| It's not short for "value grinder", although that's not a bad guess. |
| |
| |
| ----------------------------------------------------------------- |
| 2. Compiling, installing and configuring |
| ----------------------------------------------------------------- |
| |
| 2.1. When I try building Valgrind, 'make' dies partway with an |
| assertion failure, something like this: |
| |
| make: expand.c:489: allocated_variable_append: Assertion |
| `current_variable_set_list->next != 0' failed. |
| |
| It's probably a bug in 'make'. Some, but not all, instances of version 3.79.1 |
| have this bug; see www.mail-archive.com/bug-make@gnu.org/msg01658.html. Try |
| upgrading to a more recent version of 'make'. Alternatively, we have heard |
| that unsetting the CFLAGS environment variable avoids the problem. |
| |
| |
| ----------------------------------------------------------------- |
| 3. Valgrind aborts unexpectedly |
| ----------------------------------------------------------------- |
| |
| 3.1. Programs run OK on Valgrind, but at exit produce a bunch of errors a bit |
| like this |
| |
| ==20755== Invalid read of size 4 |
| ==20755== at 0x40281C8A: _nl_unload_locale (loadlocale.c:238) |
| ==20755== by 0x4028179D: free_mem (findlocale.c:257) |
| ==20755== by 0x402E0962: __libc_freeres (set-freeres.c:34) |
| ==20755== by 0x40048DCC: vgPlain___libc_freeres_wrapper |
| (vg_clientfuncs.c:585) |
| ==20755== Address 0x40CC304C is 8 bytes inside a block of size 380 free'd |
| ==20755== at 0x400484C9: free (vg_clientfuncs.c:180) |
| ==20755== by 0x40281CBA: _nl_unload_locale (loadlocale.c:246) |
| ==20755== by 0x40281218: free_mem (setlocale.c:461) |
| ==20755== by 0x402E0962: __libc_freeres (set-freeres.c:34) |
| |
| and then die with a segmentation fault. |
| |
| When the program exits, Valgrind runs the procedure __libc_freeres() in |
| glibc. This is a hook for memory debuggers, so they can ask glibc to |
| free up any memory it has used. Doing that is needed to ensure that |
| Valgrind doesn't incorrectly report space leaks in glibc. |
| |
| The problem is that running __libc_freeres() in older glibc versions |
| causes this crash. |
| |
| WORKAROUND FOR 1.1.X and later versions of Valgrind: use the |
| --run-libc-freeres=no flag. You may then get space leak reports for |
| glibc-allocations (please _don't_ report these to the glibc people, |
| since they are not real leaks), but at least the program runs. |
| |
| ----------------------------------------------------------------- |
| |
| 3.2. My (buggy) program dies like this: |
| valgrind: vg_malloc2.c:442 (bszW_to_pszW): |
| Assertion `pszW >= 0' failed. |
| |
| If Memcheck (the memory checker) shows any invalid reads, invalid writes |
| or invalid frees in your program, the above may happen. The reason is |
| that your program may trash Valgrind's low-level memory manager, which |
| then dies with the above assertion, or something like it. The cure is |
| to fix your program so that it doesn't do any illegal memory accesses. |
| The above failure will hopefully go away after that. |
| |
| ----------------------------------------------------------------- |
| |
| 3.3. My program dies, printing a message like this along the way: |
| |
| disInstr: unhandled instruction bytes: 0x66 0xF 0x2E 0x5 |
| |
| Older versions did not support some x86 instructions, particularly |
| SSE/SSE2 instructions. Try a newer Valgrind; we now support almost all |
| instructions. If it still happens with a newer version and the failing |
| instruction is an SSE/SSE2 instruction, you might be able to recompile |
| your program without it by passing a suitable -march flag to gcc. |
| Either way, let us know and we'll try to fix it. |
| |
| Another possibility is that your program has a bug and erroneously jumps |
| to a non-code address, in which case you'll get a SIGILL signal. |
| Memcheck/Addrcheck may issue a warning just before this happens, but they |
| might not if the jump happens to land in addressable memory. |
| |
| ----------------------------------------------------------------- |
| |
| 3.4. My program dies like this: |
| |
| error: /lib/librt.so.1: symbol __pthread_clock_settime, version |
| GLIBC_PRIVATE not defined in file libpthread.so.0 with link time |
| reference |
| |
| This is a total swamp, and not an easy problem to fix, but there may be |
| a way out. The underlying problem is that /lib/librt.so.1 refers to |
| the symbols __pthread_clock_settime and __pthread_clock_gettime in |
| /lib/libpthread.so, which are not intended to be exported, ie. they |
| are private. |
| |
| The best solution is to ensure your program does not use /lib/librt.so.1. |
| |
| However, since you're probably not using it directly, or even |
| knowingly, that's hard to do. You might instead be able to fix it by |
| playing around with coregrind/vg_libpthread.vs. Things to try: |
| |
| Remove this |
| |
| GLIBC_PRIVATE { |
| __pthread_clock_gettime; |
| __pthread_clock_settime; |
| }; |
| |
| or maybe remove this |
| |
| GLIBC_2.2.3 { |
| __pthread_clock_gettime; |
| __pthread_clock_settime; |
| } GLIBC_2.2; |
| |
| or maybe add this |
| |
| GLIBC_2.2.4 { |
| __pthread_clock_gettime; |
| __pthread_clock_settime; |
| } GLIBC_2.2; |
| |
| GLIBC_2.2.5 { |
| __pthread_clock_gettime; |
| __pthread_clock_settime; |
| } GLIBC_2.2; |
| |
| or some combination of the above. After each change you need to delete |
| coregrind/libpthread.so and do make && make install. |
| |
| I just don't know if any of the above will work. If you can find a |
| solution which works, I would be interested to hear it. |
| |
| To which someone replied: |
| |
| I deleted this: |
| |
| GLIBC_2.2.3 { |
| __pthread_clock_gettime; |
| __pthread_clock_settime; |
| } GLIBC_2.2; |
| |
| and it worked. |
| |
| |
| ----------------------------------------------------------------- |
| 4. Valgrind behaves unexpectedly |
| ----------------------------------------------------------------- |
| |
| 4.1. I try running "valgrind my_program", but my_program runs normally, |
| and Valgrind doesn't emit any output at all. |
| |
| For versions prior to 2.1.1: |
| |
| Valgrind doesn't work out-of-the-box with programs that are entirely |
| statically linked. It does a quick test at startup, and if it detects |
| that the program is statically linked, it aborts with an explanation. |
| |
| This test may fail in some obscure cases, eg. if you run a script under |
| Valgrind and the script interpreter is statically linked. |
| |
| If you still want static linking, you can ask gcc to link certain |
| libraries statically. Try the following options: |
| |
| -Wl,-Bstatic -lmyLibrary1 -lotherLibrary -Wl,-Bdynamic |
| |
| Just make sure you end with -Wl,-Bdynamic so that libc is dynamically |
| linked. |
| |
| If you absolutely cannot use dynamic libraries, you can try statically |
| linking together all the .o files in coregrind/, all the .o files of the |
| tool of your choice (eg. those in memcheck/), and the .o files of your |
| program. You'll end up with a statically linked binary that runs |
| permanently under Valgrind's control. Note that we haven't tested this |
| procedure thoroughly. |
| |
| |
| For versions 2.1.1 and later: |
| |
| Valgrind does now work with static binaries, although beware that some |
| of the tools won't operate as well as normal, because they have access |
| to less information about how the program runs. Eg. Memcheck will miss |
| some errors that it would otherwise find. This is because Valgrind |
| doesn't replace malloc() and friends with its own versions. It's best |
| if your program is dynamically linked with glibc. |
| |
| ----------------------------------------------------------------- |
| |
| 4.2. My threaded server process runs unbelievably slowly on Valgrind. |
| So slowly, in fact, that at first I thought it had completely |
| locked up. |
| |
| We are not completely sure about this, but one possibility is that |
| laptops with power management fool Valgrind's timekeeping mechanism, |
| which is (somewhat in error) based on the x86 RDTSC instruction. A |
| "fix" which is claimed to work is to run some other cpu-intensive |
| process at the same time, so that the laptop's power-management |
| clock-slowing does not kick in. We would be interested in hearing more |
| feedback on this. |
| |
| Another possible cause is that versions prior to 1.9.6 did not support |
| threading on glibc 2.3.X systems well. Hopefully the situation is much |
| improved with 1.9.6 and later versions. |
| |
| ----------------------------------------------------------------- |
| |
| 4.3. My program uses the C++ STL and string classes. Valgrind |
| reports 'still reachable' memory leaks involving these classes |
| at the exit of the program, but there should be none. |
| |
| First of all: relax, it's probably not a bug, but a feature. Many |
| implementations of the C++ standard libraries use their own memory pool |
| allocators. Memory for quite a number of destructed objects is not |
| immediately freed and given back to the OS, but kept in the pool(s) for |
| later re-use. Because the pools are not freed when the program calls |
| exit(), Valgrind reports this memory as still reachable. Not freeing |
| the pools at exit() could arguably be called a bug in the library, |
| though. |
| |
| Using gcc, you can force the STL to use malloc and to free memory as |
| soon as possible by globally disabling memory caching. Beware! Doing |
| so will probably slow down your program, sometimes drastically. |
| |
| - With gcc 2.91, 2.95, 3.0 and 3.1, compile all source using the STL |
| with -D__USE_MALLOC. Beware! This is removed from gcc starting with |
| version 3.3. |
| |
| - With gcc 3.2.2 and later, you should export the environment variable |
| GLIBCPP_FORCE_NEW before running your program. |
| |
| There are other ways to disable memory pooling: using the malloc_alloc |
| template with your objects (not portable, but should work for gcc) or |
| even writing your own memory allocators. But all this goes beyond the |
| scope of this FAQ. Start by reading |
| http://gcc.gnu.org/onlinedocs/libstdc++/ext/howto.html#3 if you |
| absolutely want to do that. But beware: |
| |
| 1) there are currently changes underway for gcc which are not totally |
| reflected in the docs right now ("now" == 26 Apr 03) |
| |
| 2) allocators belong to the messier parts of the STL, and people went |
| to great lengths to make them portable across platforms. Chances are |
| good that your solution will work on your platform, but not on |
| others. |
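
To make the "still reachable" effect concrete, here is a minimal sketch in C of a free-list pool. It is only an illustration of the pooling idea described above, not how any C++ standard library actually implements its allocators; the names Node, pool_alloc and pool_release are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size node handed out by the pool. */
typedef struct Node {
    struct Node *next;   /* links released nodes on the free list */
    int payload;
} Node;

/* Global free list: anything parked here is still pointed to at exit. */
static Node *free_list = NULL;

/* Reuse a released node if one is available, else fall back to malloc(). */
Node *pool_alloc(void)
{
    if (free_list != NULL) {
        Node *n = free_list;
        free_list = n->next;
        return n;
    }
    return malloc(sizeof(Node));
}

/* Park the node for later reuse instead of calling free(). */
void pool_release(Node *n)
{
    if (n == NULL)
        return;
    n->next = free_list;
    free_list = n;
}
```

A program that calls pool_alloc() and pool_release() and then exits never passes the parked node to free(). Since free_list still points to it, a leak checker would be expected to classify that block as still reachable rather than lost.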
| |
| ----------------------------------------------------------------- |
| |
| 4.4. The stack traces given by Memcheck (or another tool) aren't helpful. |
| How can I improve them? |
| |
| If they're not long enough, use --num-callers to make them longer. |
| |
| If they're not detailed enough, make sure you are compiling with -g to add |
| debug information. And don't strip symbol tables (programs should be |
| unstripped unless you run 'strip' on them; some libraries ship stripped). |
| |
| Also, -fomit-frame-pointer and -fstack-check can make stack traces worse. |
| |
| Some example sub-traces: |
| |
| With debug information and unstripped (best): |
| |
| Invalid write of size 1 |
| at 0x80483BF: really (malloc1.c:20) |
| by 0x8048370: main (malloc1.c:9) |
| |
| With no debug information, unstripped: |
| |
| Invalid write of size 1 |
| at 0x80483BF: really (in /auto/homes/njn25/grind/head5/a.out) |
| by 0x8048370: main (in /auto/homes/njn25/grind/head5/a.out) |
| |
| With no debug information, stripped: |
| |
| Invalid write of size 1 |
| at 0x80483BF: (within /auto/homes/njn25/grind/head5/a.out) |
| by 0x8048370: (within /auto/homes/njn25/grind/head5/a.out) |
| by 0x42015703: __libc_start_main (in /lib/tls/libc-2.3.2.so) |
| by 0x80482CC: (within /auto/homes/njn25/grind/head5/a.out) |
| |
| With debug information and -fomit-frame-pointer: |
| |
| Invalid write of size 1 |
| at 0x80483C4: really (malloc1.c:20) |
| by 0x42015703: __libc_start_main (in /lib/tls/libc-2.3.2.so) |
| by 0x80482CC: ??? (start.S:81) |
| |
| ----------------------------------------------------------------- |
| 5. Memcheck doesn't find my bug |
| ----------------------------------------------------------------- |
| |
| 5.1. I try running "valgrind --tool=memcheck my_program" and get |
| Valgrind's startup message, but I don't get any errors and I know |
| my program has errors. |
| |
| By default, Valgrind only traces the top-level process. So if your |
| program spawns children, they won't be traced by Valgrind by default. |
| Also, if your program is started by a shell script, Perl script, or |
| something similar, Valgrind will trace the shell, or the Perl |
| interpreter, or equivalent. |
| |
| To trace child processes, use the --trace-children=yes option. |
| |
| If you are tracing large trees of processes, it can be less disruptive |
| to have the output sent over the network. Give Valgrind the flag |
| --log-socket=127.0.0.1:12345 (if you want logging output sent to port |
| 12345 on localhost). You can use the valgrind-listener program to |
| listen on that port: |
| |
| valgrind-listener 12345 |
| |
| Obviously you have to start the listener process first. See the |
| documentation for more details. |
| |
| ----------------------------------------------------------------- |
| |
| 5.2. Why doesn't Memcheck find the array overruns in this program? |
| |
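| int static_arr[5]; |
| |
| int main(void) |
| { |
|    int stack_arr[5]; |
| |
|    static_arr[5] = 0;   /* one past the end of the static array */ |
|    stack_arr[5] = 0;    /* one past the end of the stack array */ |
| |
|    return 0; |
| } |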
| |
| Unfortunately, Memcheck doesn't do bounds checking on static or stack |
| arrays. We'd like to, but it's just not possible to do in a reasonable |
| way that fits with how Memcheck works. Sorry. |
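
By contrast, Memcheck does track blocks obtained from malloc(), so the same off-by-one on a heap array can be caught. The following is a minimal sketch under that assumption; store_at is a hypothetical helper, and the out-of-bounds call appears only in a comment so that the code as written has defined behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n ints on the heap, stores `value` at
   index `idx`, and returns the value read back (or -1 if malloc fails).
   The caller is responsible for keeping idx < n. Under Memcheck, a call
   with idx == n would be expected to show up as an invalid write,
   because the tool knows where the heap block ends. */
int store_at(size_t n, size_t idx, int value)
{
    int *heap = malloc(n * sizeof *heap);
    int result;

    if (heap == NULL)
        return -1;

    heap[idx] = value;      /* store_at(5, 5, 0) would overrun here */
    result = heap[idx];
    free(heap);
    return result;
}
```

Moving a suspect array from static or stack storage to the heap while debugging is one way to bring it within Memcheck's reach, at the cost of changing the program slightly.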
| |
| ----------------------------------------------------------------- |
| |
| 5.3. My program dies with a segmentation fault, but Memcheck doesn't give |
| any error messages before it, or none that look related. |
| |
| One possibility is that your program accesses memory with |
| inappropriate permissions, such as writing to read-only memory. |
| Maybe your program is writing to a string literal like this: |
| |
| char* s = "hello"; |
| s[0] = 'j'; |
| |
| or something similar. Writing to read-only memory can also apparently |
| make LinuxThreads behave strangely. |
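
If the string really needs to be modified, one common fix is to initialise a char array from the literal, which gives the program its own writable copy. A minimal sketch follows; the function name make_jello is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies "hello" into the caller's buffer with the first letter changed.
   Returns 0 on success, -1 if the buffer is too small. */
int make_jello(char *out, size_t out_size)
{
    char s[] = "hello";     /* array initialised from the literal: writable */

    s[0] = 'j';             /* fine: modifies the array, not the literal */

    if (out_size < sizeof s)
        return -1;
    memcpy(out, s, sizeof s);   /* sizeof s includes the terminating '\0' */
    return 0;
}
```

Declaring pointers to literals as const char* is a complementary safeguard, since the compiler then rejects the write instead of leaving it to crash at run time.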
| |
| |
| ----------------------------------------------------------------- |
| 6. Miscellaneous |
| ----------------------------------------------------------------- |
| |
| 6.1. I tried writing a suppression but it didn't work. Can you |
| write my suppression for me? |
| |
| Yes! Use the --gen-suppressions=yes feature to spit out suppressions |
| automatically for you. You can then edit them if you like, eg. |
| combining similar automatically generated suppressions using wildcards |
| like '*'. |
| |
| If you really want to write suppressions by hand, read the manual |
| carefully. Note particularly that C++ function names must be _mangled_. |
| |
| ----------------------------------------------------------------- |
| |
| 6.2. With Memcheck/Addrcheck's memory leak detector, what's the |
| difference between "definitely lost", "possibly lost", "still |
| reachable", and "suppressed"? |
| |
| The details are in section 3.6 of the manual. |
| |
| In short: |
| |
| - "definitely lost" means your program is leaking memory -- fix it! |
| |
| - "possibly lost" means your program is probably leaking memory, |
| unless you're doing funny things with pointers. |
| |
| - "still reachable" means your program is probably ok -- it didn't |
| free some memory it could have. This is quite common and often |
| reasonable. Don't use --show-reachable=yes if you don't want to see |
| these reports. |
| |
| - "suppressed" means that a leak error has been suppressed. There are |
| some suppressions in the default suppression files. You can ignore |
| suppressed errors. |
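
As a rough illustration of the first and third categories, here is a minimal sketch in C. The names cache, fill_cache and lose_block are hypothetical, and the classification in the comments is what the descriptions above lead one to expect, not captured tool output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A global pointer keeps its block reachable until the program exits. */
static char *cache = NULL;

/* Allocates a block and stores the only pointer to it in a global.
   If it is never freed, this is the "still reachable" case.
   Returns 1 on success, 0 if malloc fails. */
int fill_cache(size_t size)
{
    cache = malloc(size);
    return cache != NULL;
}

/* Allocates a block and then lets the only pointer to it go out of
   scope. Nothing can free it afterwards: the "definitely lost" case.
   `size` must be at least 1. Returns 1 on success, 0 if malloc fails. */
int lose_block(size_t size)
{
    char *p = malloc(size);

    if (p == NULL)
        return 0;
    p[0] = 'x';             /* use the block once */
    return 1;               /* p disappears here; the block is leaked */
}
```

Calling both functions from main() and exiting without free(cache) gives one block of each kind to look for in the leak summary.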
| |
| ----------------------------------------------------------------- |
| |
| (this is the end of the FAQ.) |