nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 1 | Valgrind FAQ, version 2.1.2 |
| 2 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
nethercote | 8deae81 | 2004-07-18 10:35:36 +0000 | [diff] [blame] | 3 | Last revised 18 July 2004 |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 4 | ~~~~~~~~~~~~~~~~~~~~~~~~~ |
njn | 4e59bd9 | 2003-04-22 20:58:47 +0000 | [diff] [blame] | 5 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 6 | 1. Background |
| 7 | 2. Compiling, installing and configuring |
| 8 | 3. Valgrind aborts unexpectedly |
| 9 | 4. Valgrind behaves unexpectedly |
| 10 | 5. Memcheck doesn't find my bug |
| 11 | 6. Miscellaneous |
| 12 | |
| 13 | |
| 14 | ----------------------------------------------------------------- |
| 15 | 1. Background |
| 16 | ----------------------------------------------------------------- |
| 17 | |
| 18 | 1.1. How do you pronounce "Valgrind"? |
| 19 | |
| 20 | The "Val" as in the word "value". The "grind" is pronounced with a |
| 21 | short 'i' -- ie. "grinned" (rhymes with "tinned") rather than "grined" |
| 22 | (rhymes with "find"). |
| 23 | |
| 24 | Don't feel bad: almost everyone gets it wrong at first. |
njn | 4e59bd9 | 2003-04-22 20:58:47 +0000 | [diff] [blame] | 25 | |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 26 | ----------------------------------------------------------------- |
| 27 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 28 | 1.2. Where does the name "Valgrind" come from? |
| 29 | |
| 30 | From Nordic mythology. Originally (before release) the project was |
| 31 | named Heimdall, after the watchman of the Nordic gods. He could "see a |
| 32 | hundred miles by day or night, hear the grass growing, see the wool |
| 33 | growing on a sheep's back" (etc). This would have been a great name, |
| 34 | but it was already taken by a security package "Heimdal". |
| 35 | |
| 36 | Keeping with the Nordic theme, Valgrind was chosen. Valgrind is the |
| 37 | name of the main entrance to Valhalla (the Hall of the Chosen Slain in |
| 38 | Asgard). Over this entrance there resides a wolf and over it there is |
| 39 | the head of a boar and on it perches a huge eagle, whose eyes can see to |
| 40 | the far regions of the nine worlds. Only those judged worthy by the |
| 41 | guardians are allowed to pass through Valgrind. All others are refused |
| 42 | entrance. |
| 43 | |
| 44 | It's not short for "value grinder", although that's not a bad guess. |
| 45 | |
| 46 | |
| 47 | ----------------------------------------------------------------- |
| 48 | 2. Compiling, installing and configuring |
| 49 | ----------------------------------------------------------------- |
| 50 | |
| 51 | 2.1. When I try building Valgrind, 'make' dies partway with an |
| 52 | assertion failure, something like this: |
| 53 | |
| 54 | make: expand.c:489: allocated_variable_append: Assertion |
| 55 | `current_variable_set_list->next != 0' failed. |
| 56 | |
| 57 | It's probably a bug in 'make'. Some, but not all, instances of version 3.79.1 |
| 58 | have this bug; see www.mail-archive.com/bug-make@gnu.org/msg01658.html. Try |
| 59 | upgrading to a more recent version of 'make'. Alternatively, we have heard |
| 60 | that unsetting the CFLAGS environment variable avoids the problem. |
| 61 | |
| 62 | |
| 63 | ----------------------------------------------------------------- |
| 64 | 3. Valgrind aborts unexpectedly |
| 65 | ----------------------------------------------------------------- |
| 66 | |
| 67 | 3.1. Programs run OK on Valgrind, but at exit produce a bunch of errors a bit |
| 68 | like this: |
njn | 4e59bd9 | 2003-04-22 20:58:47 +0000 | [diff] [blame] | 69 | |
| 70 | ==20755== Invalid read of size 4 |
| 71 | ==20755== at 0x40281C8A: _nl_unload_locale (loadlocale.c:238) |
| 72 | ==20755== by 0x4028179D: free_mem (findlocale.c:257) |
| 73 | ==20755== by 0x402E0962: __libc_freeres (set-freeres.c:34) |
| 74 | ==20755== by 0x40048DCC: vgPlain___libc_freeres_wrapper |
| 75 | (vg_clientfuncs.c:585) |
| 76 | ==20755== Address 0x40CC304C is 8 bytes inside a block of size 380 free'd |
| 77 | ==20755== at 0x400484C9: free (vg_clientfuncs.c:180) |
| 78 | ==20755== by 0x40281CBA: _nl_unload_locale (loadlocale.c:246) |
| 79 | ==20755== by 0x40281218: free_mem (setlocale.c:461) |
| 80 | ==20755== by 0x402E0962: __libc_freeres (set-freeres.c:34) |
| 81 | |
| 82 | and then die with a segmentation fault. |
| 83 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 84 | When the program exits, Valgrind runs the procedure __libc_freeres() in |
| 85 | glibc. This is a hook for memory debuggers, so they can ask glibc to |
| 86 | free up any memory it has used. Doing that is needed to ensure that |
| 87 | Valgrind doesn't incorrectly report space leaks in glibc. |
njn | 4e59bd9 | 2003-04-22 20:58:47 +0000 | [diff] [blame] | 88 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 89 | The problem is that running __libc_freeres() in older glibc versions causes |
| 90 | this crash. |
njn | 4e59bd9 | 2003-04-22 20:58:47 +0000 | [diff] [blame] | 91 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 92 | WORKAROUND FOR 1.1.X and later versions of Valgrind: use the |
| 93 | --run-libc-freeres=no flag. You may then get space leak reports for |
| 94 | glibc-allocations (please _don't_ report these to the glibc people, |
| 95 | since they are not real leaks), but at least the program runs. |
njn | 4e59bd9 | 2003-04-22 20:58:47 +0000 | [diff] [blame] | 96 | |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 97 | ----------------------------------------------------------------- |
njn | 4e59bd9 | 2003-04-22 20:58:47 +0000 | [diff] [blame] | 98 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 99 | 3.2. My (buggy) program dies like this: |
njn | 4e59bd9 | 2003-04-22 20:58:47 +0000 | [diff] [blame] | 100 | valgrind: vg_malloc2.c:442 (bszW_to_pszW): |
| 101 | Assertion `pszW >= 0' failed. |
njn | 4e59bd9 | 2003-04-22 20:58:47 +0000 | [diff] [blame] | 102 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 103 | If Memcheck (the memory checker) shows any invalid reads, invalid writes |
| 104 | or invalid frees in your program, the above may happen. The reason is that |
| 105 | your program may trash Valgrind's low-level memory manager, which then |
| 106 | dies with the above assertion, or something like it. The cure is to |
| 107 | fix your program so that it doesn't do any illegal memory accesses. The |
| 108 | above failure will hopefully go away after that. |
njn | 4e59bd9 | 2003-04-22 20:58:47 +0000 | [diff] [blame] | 109 | |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 110 | ----------------------------------------------------------------- |
njn | 4e59bd9 | 2003-04-22 20:58:47 +0000 | [diff] [blame] | 111 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 112 | 3.3. My program dies, printing a message like this along the way: |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 113 | |
nethercote | 3178887 | 2003-11-02 16:32:05 +0000 | [diff] [blame] | 114 | disInstr: unhandled instruction bytes: 0x66 0xF 0x2E 0x5 |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 115 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 116 | Older versions did not support some x86 instructions, particularly |
| 117 | SSE/SSE2 instructions. Try a newer Valgrind; we now support almost all |
| 118 | instructions. If it still happens with newer versions and the failing |
| 119 | instruction is an SSE/SSE2 instruction, you might be able to recompile |
nethercote | 8deae81 | 2004-07-18 10:35:36 +0000 | [diff] [blame] | 120 | your program without it by using the flag -march to gcc. Either way, |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 121 | let us know and we'll try to fix it. |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 122 | |
nethercote | 8deae81 | 2004-07-18 10:35:36 +0000 | [diff] [blame] | 123 | Another possibility is that your program has a bug and erroneously jumps |
| 124 | to a non-code address, in which case you'll get a SIGILL signal. |
| 125 | Memcheck/Addrcheck may issue a warning just before this happens, but they |
| 126 | might not if the jump happens to land in addressable memory. |
| 127 | |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 128 | ----------------------------------------------------------------- |
| 129 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 130 | 3.4. My program dies like this: |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 131 | |
| 132 | error: /lib/librt.so.1: symbol __pthread_clock_settime, version |
| 133 | GLIBC_PRIVATE not defined in file libpthread.so.0 with link time |
| 134 | reference |
| 135 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 136 | This is a total swamp. Nevertheless there is a way out. It's a problem |
| 137 | which is not easy to fix. Really the problem is that /lib/librt.so.1 |
| 138 | refers to some symbols __pthread_clock_settime and |
| 139 | __pthread_clock_gettime in /lib/libpthread.so which are not intended to |
| 140 | be exported, ie they are private. |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 141 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 142 | Best solution is to ensure your program does not use /lib/librt.so.1. |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 143 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 144 | However, since you're probably not using it directly, or even |
| 145 | knowingly, that's hard to do. You might instead be able to fix it by |
| 146 | playing around with coregrind/vg_libpthread.vs. Things to try: |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 147 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 148 | Remove this |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 149 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 150 | GLIBC_PRIVATE { |
| 151 | __pthread_clock_gettime; |
| 152 | __pthread_clock_settime; |
| 153 | }; |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 154 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 155 | or maybe remove this |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 156 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 157 | GLIBC_2.2.3 { |
| 158 | __pthread_clock_gettime; |
| 159 | __pthread_clock_settime; |
| 160 | } GLIBC_2.2; |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 161 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 162 | or maybe add this |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 163 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 164 | GLIBC_2.2.4 { |
| 165 | __pthread_clock_gettime; |
| 166 | __pthread_clock_settime; |
| 167 | } GLIBC_2.2; |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 168 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 169 | GLIBC_2.2.5 { |
| 170 | __pthread_clock_gettime; |
| 171 | __pthread_clock_settime; |
| 172 | } GLIBC_2.2; |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 173 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 174 | or some combination of the above. After each change you need to delete |
| 175 | coregrind/libpthread.so and do make && make install. |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 176 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 177 | I just don't know if any of the above will work. If you can find a |
| 178 | solution which works, I would be interested to hear it. |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 179 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 180 | To which someone replied: |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 181 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 182 | I deleted this: |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 183 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 184 | GLIBC_2.2.3 { |
| 185 | __pthread_clock_gettime; |
| 186 | __pthread_clock_settime; |
| 187 | } GLIBC_2.2; |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 188 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 189 | and it worked. |
| 190 | |
| 191 | |
| 192 | ----------------------------------------------------------------- |
| 193 | 4. Valgrind behaves unexpectedly |
| 194 | ----------------------------------------------------------------- |
| 195 | |
| 196 | 4.1. I try running "valgrind my_program", but my_program runs normally, |
| 197 | and Valgrind doesn't emit any output at all. |
| 198 | |
| 199 | For versions prior to 2.1.1: |
| 200 | |
| 201 | Valgrind doesn't work out-of-the-box with programs that are entirely |
| 202 | statically linked. It does a quick test at startup, and if it detects |
| 203 | that the program is statically linked, it aborts with an explanation. |
| 204 | |
| 205 | This test may fail in some obscure cases, eg. if you run a script under |
| 206 | Valgrind and the script interpreter is statically linked. |
| 207 | |
| 208 | If you still want static linking, you can ask gcc to link certain |
| 209 | libraries statically. Try the following options: |
| 210 | |
| 211 | -Wl,-Bstatic -lmyLibrary1 -lotherLibrary -Wl,-Bdynamic |
| 212 | |
| 213 | Just make sure you end with -Wl,-Bdynamic so that libc is dynamically |
| 214 | linked. |
| 215 | |
| 216 | If you absolutely cannot use dynamic libraries, you can try statically |
| 217 | linking together all the .o files in coregrind/, all the .o files of the |
| 218 | tool of your choice (eg. those in memcheck/), and the .o files of your |
| 219 | program. You'll end up with a statically linked binary that runs |
| 220 | permanently under Valgrind's control. Note that we haven't tested this |
| 221 | procedure thoroughly. |
| 222 | |
| 223 | |
| 224 | For versions 2.1.1 and later: |
| 225 | |
| 226 | Valgrind does now work with static binaries, although beware that some |
| 227 | of the tools won't operate as well as normal, because they have access |
| 228 | to less information about how the program runs. Eg. Memcheck will miss |
| 229 | some errors that it would otherwise find. This is because Valgrind |
| 230 | doesn't replace malloc() and friends with its own versions. It's best |
| 231 | if your program is dynamically linked with glibc. |
sewardj | 36a53ad | 2003-04-22 23:26:24 +0000 | [diff] [blame] | 232 | |
| 233 | ----------------------------------------------------------------- |
njn | 4e59bd9 | 2003-04-22 20:58:47 +0000 | [diff] [blame] | 234 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 235 | 4.2. My threaded server process runs unbelievably slowly on Valgrind. |
| 236 | So slowly, in fact, that at first I thought it had completely |
| 237 | locked up. |
sewardj | 03272ff | 2003-04-26 22:23:35 +0000 | [diff] [blame] | 238 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 239 | We are not completely sure about this, but one possibility is that |
| 240 | laptops with power management fool Valgrind's timekeeping mechanism, |
| 241 | which is (somewhat in error) based on the x86 RDTSC instruction. A |
| 242 | "fix" which is claimed to work is to run some other cpu-intensive |
| 243 | process at the same time, so that the laptop's power-management |
| 244 | clock-slowing does not kick in. We would be interested in hearing more |
| 245 | feedback on this. |
sewardj | 03272ff | 2003-04-26 22:23:35 +0000 | [diff] [blame] | 246 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 247 | Another possible cause is that versions prior to 1.9.6 did not support |
| 248 | threading on glibc 2.3.X systems well. Hopefully the situation is much |
| 249 | improved with 1.9.6 and later versions. |
sewardj | 03272ff | 2003-04-26 22:23:35 +0000 | [diff] [blame] | 250 | |
| 251 | ----------------------------------------------------------------- |
| 252 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 253 | 4.3. My program uses the C++ STL and string classes. Valgrind |
| 254 | reports 'still reachable' memory leaks involving these classes |
| 255 | at the exit of the program, but there should be none. |
njn | ae34aef | 2003-08-07 21:24:24 +0000 | [diff] [blame] | 256 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 257 | First of all: relax, it's probably not a bug, but a feature. Many |
| 258 | implementations of the C++ standard libraries use their own memory pool |
| 259 | allocators. Memory for quite a number of destructed objects is not |
| 260 | immediately freed and given back to the OS, but kept in the pool(s) for |
| 261 | later re-use. The fact that the pools are not freed at the exit() of |
| 262 | the program causes Valgrind to report this memory as still reachable. |
| 263 | Not freeing the pools at exit() could be called a bug in the library, |
| 264 | though. |
njn | ae34aef | 2003-08-07 21:24:24 +0000 | [diff] [blame] | 265 | |
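The pooling behaviour described above can be illustrated with a small
sketch in C. This is not how any particular C++ library implements its
allocator; the names pool_alloc, pool_free and pool_cached are made up
for illustration. Blocks released with pool_free() stay on a global free
list, so at exit they are still pointed to, and a leak checker would be
expected to classify them as still reachable rather than lost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size block pool; all names here are illustrative. */
#define BLOCK_SIZE 64

typedef union block {
    union block *next;                  /* link used while on the free list */
    unsigned char payload[BLOCK_SIZE];  /* storage handed to the caller */
} block;

static block *free_list = NULL;  /* global root: keeps cached blocks reachable */

/* Hand out a block, reusing a cached one when possible. */
void *pool_alloc(void)
{
    if (free_list != NULL) {
        block *b = free_list;
        free_list = b->next;
        return b;
    }
    return malloc(sizeof(block));
}

/* "Free" a block by caching it; nothing is returned to malloc's heap. */
void pool_free(void *p)
{
    block *b = p;
    assert(b != NULL);
    b->next = free_list;
    free_list = b;
}

/* Count the cached blocks (for demonstration only). */
size_t pool_cached(void)
{
    size_t n = 0;
    for (block *b = free_list; b != NULL; b = b->next)
        n++;
    return n;
}
```

A block passed to pool_free() is handed back by the next pool_alloc()
call, which is the re-use the pool exists for. Since free() is never
called, every cached block is still allocated when the program exits.
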
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 266 | Using gcc, you can force the STL to use malloc and to free memory as |
| 267 | soon as possible by globally disabling memory caching. Beware! Doing |
| 268 | so will probably slow down your program, sometimes drastically. |
njn | ae34aef | 2003-08-07 21:24:24 +0000 | [diff] [blame] | 269 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 270 | - With gcc 2.91, 2.95, 3.0 and 3.1, compile all source using the STL |
| 271 | with -D__USE_MALLOC. Beware! This is removed from gcc starting with |
| 272 | version 3.3. |
| 273 | |
| 274 | - With 3.2.2 and later, you should export the environment variable |
| 275 | GLIBCPP_FORCE_NEW before running your program. |
| 276 | |
| 277 | There are other ways to disable memory pooling: using the malloc_alloc |
| 278 | template with your objects (not portable, but should work for gcc) or |
| 279 | even writing your own memory allocators. But all this goes beyond the |
| 280 | scope of this FAQ. Start by reading |
| 281 | http://gcc.gnu.org/onlinedocs/libstdc++/ext/howto.html#3 if you |
| 282 | absolutely want to do that. But beware: |
| 283 | |
| 284 | 1) there are currently changes underway for gcc which are not totally |
| 285 | reflected in the docs right now ("now" == 26 Apr 03) |
| 286 | |
| 287 | 2) allocators belong to the more messy parts of the STL and people went |
| 288 | to great lengths to make it portable across platforms. Chances are |
| 289 | good that your solution will work on your platform, but not on |
| 290 | others. |
| 291 | |
| 292 | ----------------------------------------------------------------------------- |
| 293 | 4.4. The stack traces given by Memcheck (or another tool) aren't helpful. |
| 294 | How can I improve them? |
| 295 | |
| 296 | If they're not long enough, use --num-callers to make them longer. |
| 297 | |
| 298 | If they're not detailed enough, make sure you are compiling with -g to add |
| 299 | debug information. And don't strip symbol tables (programs should be |
| 300 | unstripped unless you run 'strip' on them; some libraries ship stripped). |
| 301 | |
| 302 | Also, -fomit-frame-pointer and -fstack-check can make stack traces worse. |
| 303 | |
| 304 | Some example sub-traces: |
| 305 | |
| 306 | With debug information and unstripped (best): |
| 307 | |
| 308 | Invalid write of size 1 |
| 309 | at 0x80483BF: really (malloc1.c:20) |
| 310 | by 0x8048370: main (malloc1.c:9) |
| 311 | |
| 312 | With no debug information, unstripped: |
| 313 | |
| 314 | Invalid write of size 1 |
| 315 | at 0x80483BF: really (in /auto/homes/njn25/grind/head5/a.out) |
| 316 | by 0x8048370: main (in /auto/homes/njn25/grind/head5/a.out) |
| 317 | |
| 318 | With no debug information, stripped: |
| 319 | |
| 320 | Invalid write of size 1 |
| 321 | at 0x80483BF: (within /auto/homes/njn25/grind/head5/a.out) |
| 322 | by 0x8048370: (within /auto/homes/njn25/grind/head5/a.out) |
| 323 | by 0x42015703: __libc_start_main (in /lib/tls/libc-2.3.2.so) |
| 324 | by 0x80482CC: (within /auto/homes/njn25/grind/head5/a.out) |
| 325 | |
| 326 | With debug information and -fomit-frame-pointer: |
| 327 | |
| 328 | Invalid write of size 1 |
| 329 | at 0x80483C4: really (malloc1.c:20) |
| 330 | by 0x42015703: __libc_start_main (in /lib/tls/libc-2.3.2.so) |
| 331 | by 0x80482CC: ??? (start.S:81) |
| 332 | |
| 333 | ----------------------------------------------------------------- |
| 334 | 5. Memcheck doesn't find my bug |
| 335 | ----------------------------------------------------------------- |
| 336 | |
| 337 | 5.1. I try running "valgrind --tool=memcheck my_program" and get |
| 338 | Valgrind's startup message, but I don't get any errors and I know |
| 339 | my program has errors. |
| 340 | |
| 341 | By default, Valgrind only traces the top-level process. So if your |
| 342 | program spawns children, they won't be traced by Valgrind by default. |
| 343 | Also, if your program is started by a shell script, Perl script, or |
| 344 | something similar, Valgrind will trace the shell, or the Perl |
| 345 | interpreter, or equivalent. |
| 346 | |
| 347 | To trace child processes, use the --trace-children=yes option. |
| 348 | |
| 349 | If you are tracing large trees of processes, it can be less disruptive |
| 350 | to have the output sent over the network. Give Valgrind the flag |
nethercote | f854867 | 2004-06-21 12:42:35 +0000 | [diff] [blame] | 351 | --log-socket=127.0.0.1:12345 (if you want logging output sent to port |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 352 | 12345 on localhost). You can use the valgrind-listener program to |
| 353 | listen on that port: |
| 354 | |
| 355 | valgrind-listener 12345 |
| 356 | |
| 357 | Obviously you have to start the listener process first. See the |
| 358 | documentation for more details. |
njn | ae34aef | 2003-08-07 21:24:24 +0000 | [diff] [blame] | 359 | |
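To make the situation concrete, here is a minimal sketch of a launcher,
assuming a POSIX system. The name spawn_and_wait is made up for
illustration. If a program like this is run under Valgrind without
--trace-children=yes, errors in the program started by execv() would not
be expected to show up, because only the launcher itself is traced.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at argv[0] as a child process and
   return its exit status, or -1 on failure. */
int spawn_and_wait(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace this process image with the other program. */
        execv(argv[0], argv);
        _exit(127);             /* only reached if execv() failed */
    }

    /* Parent: wait for the child to finish and collect its status. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The same reasoning applies when the launcher is a shell or Perl script:
the interpreter is the top-level process, and your program is a child
of it.
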
| 360 | ----------------------------------------------------------------- |
| 361 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 362 | 5.2. Why doesn't Memcheck find the array overruns in this program? |
| 363 | |
| 364 | int static_arr[5]; |
| 365 | |
| 366 | int main(void) |
| 367 | { |
| 368 | int stack_arr[5]; |
| 369 | |
| 370 | static_arr[5] = 0; |
| 371 | stack_arr [5] = 0; |
| 372 | |
| 373 | return 0; |
| 374 | } |
| 375 | |
| 376 | Unfortunately, Memcheck doesn't do bounds checking on static or stack |
| 377 | arrays. We'd like to, but it's just not possible to do in a reasonable |
| 378 | way that fits with how Memcheck works. Sorry. |
njn | 1aa1850 | 2003-08-15 07:35:20 +0000 | [diff] [blame] | 379 | |
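By contrast, Memcheck does track blocks obtained from malloc(). The
following is a minimal sketch of the heap equivalent; the function name
heap_write is made up for illustration. Called with a non-zero argument
it writes one element past the end of a heap block, which Memcheck would
be expected to report as an invalid write.

```c
#include <assert.h>
#include <stdlib.h>

#define N 5

/* Hypothetical demo: write to a heap array of N ints.
   With overrun == 0 the write is in bounds; with overrun != 0 it lands
   one element past the end of the block. Returns the index written. */
int heap_write(int overrun)
{
    int *heap = malloc(N * sizeof *heap);
    assert(heap != NULL);       /* demo only: treat allocation failure as fatal */

    int index = overrun ? N : N - 1;
    heap[index] = 0;            /* invalid when index == N */

    free(heap);
    return index;
}
```

Temporarily moving a suspect array to the heap like this is one way to
get Memcheck to look at it, at the cost of changing the program.
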
| 380 | ----------------------------------------------------------------- |
| 381 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 382 | 5.3. My program dies with a segmentation fault, but Memcheck doesn't give |
| 383 | any error messages before it, or none that look related. |
njn | a8fb5a3 | 2003-08-20 11:19:17 +0000 | [diff] [blame] | 384 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 385 | One possibility is that your program accesses memory with |
| 386 | inappropriate permissions set, such as writing to read-only memory. |
| 387 | Maybe your program is writing to a static string like this: |
njn | a8fb5a3 | 2003-08-20 11:19:17 +0000 | [diff] [blame] | 388 | |
nethercote | ef0abd1 | 2004-04-10 00:29:58 +0000 | [diff] [blame] | 389 | char* s = "hello"; |
| 390 | s[0] = 'j'; |
| 391 | |
| 392 | or something similar. Writing to read-only memory can also apparently |
| 393 | make LinuxThreads behave strangely. |
| 394 | |
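If that turns out to be the cause, the usual fix is to modify a writable
copy instead of the literal. A minimal sketch follows; make_jello is a
made-up name, and the caller is assumed to supply the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy "hello" into caller-owned, writable storage
   and then modify the copy. Returns 0 on success, -1 if out is too small. */
int make_jello(char *out, size_t size)
{
    static const char src[] = "hello";  /* read-only original */

    assert(out != NULL);
    if (size < sizeof src)
        return -1;

    memcpy(out, src, sizeof src);       /* includes the terminating NUL */
    out[0] = 'j';                       /* safe: out is writable */
    return 0;
}
```

Declaring the original as a char array, as in char s[] = "hello";, has
the same effect for a local variable, because the array is a writable
copy of the literal.
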
| 395 | |
| 396 | ----------------------------------------------------------------- |
| 397 | 6. Miscellaneous |
| 398 | ----------------------------------------------------------------- |
| 399 | |
| 400 | 6.1. I tried writing a suppression but it didn't work. Can you |
| 401 | write my suppression for me? |
| 402 | |
| 403 | Yes! Use the --gen-suppressions=yes feature to spit out suppressions |
| 404 | automatically for you. You can then edit them if you like, eg. |
| 405 | combining similar automatically generated suppressions using wildcards |
| 406 | like '*'. |
| 407 | |
| 408 | If you really want to write suppressions by hand, read the manual |
| 409 | carefully. Note particularly that C++ function names must be _mangled_. |
| 410 | |
| 411 | ----------------------------------------------------------------- |
| 412 | |
| 413 | 6.2. With Memcheck/Addrcheck's memory leak detector, what's the |
| 414 | difference between "definitely lost", "possibly lost", "still |
| 415 | reachable", and "suppressed"? |
| 416 | |
| 417 | The details are in section 3.6 of the manual. |
| 418 | |
| 419 | In short: |
| 420 | |
| 421 | - "definitely lost" means your program is leaking memory -- fix it! |
| 422 | |
| 423 | - "possibly lost" means your program is probably leaking memory, |
| 424 | unless you're doing funny things with pointers. |
| 425 | |
| 426 | - "still reachable" means your program is probably ok -- it didn't |
| 427 | free some memory it could have. This is quite common and often |
| 428 | reasonable. Don't use --show-reachable=yes if you don't want to see |
| 429 | these reports. |
| 430 | |
| 431 | - "suppressed" means that a leak error has been suppressed. There are |
| 432 | some suppressions in the default suppression files. You can ignore |
| 433 | suppressed errors. |
njn | a8fb5a3 | 2003-08-20 11:19:17 +0000 | [diff] [blame] | 434 | |
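The first three categories can be illustrated with a small sketch. The
function and variable names are made up, and the category named in each
comment is what the descriptions above lead one to expect, not a
guaranteed result. Compile without optimisation, otherwise the compiler
may remove an allocation whose result is never used.

```c
#include <assert.h>
#include <stdlib.h>

static char *kept;      /* pointer to the start of a block, held in a global */
static char *interior;  /* pointer into the middle of a block */

/* The only pointer to the block is dropped on return:
   expected to be reported as "definitely lost". */
void lose_block(void)
{
    char *p = malloc(32);
    assert(p != NULL);  /* demo only: treat allocation failure as fatal */
    p[0] = 0;
}

/* A global still points at the start of the block at exit:
   expected to be reported as "still reachable". */
char *keep_block(void)
{
    kept = malloc(32);
    return kept;
}

/* Only a pointer into the middle of the block survives:
   expected to be reported as "possibly lost". */
char *keep_interior(void)
{
    char *p = malloc(32);
    assert(p != NULL);  /* demo only: treat allocation failure as fatal */
    interior = p + 16;
    return interior;
}
```

Calling all three functions from main() and then returning, without
freeing anything, should produce one report in each category.
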
| 435 | ----------------------------------------------------------------- |
| 436 | |
njn | 4e59bd9 | 2003-04-22 20:58:47 +0000 | [diff] [blame] | 437 | (this is the end of the FAQ.) |