A mini-FAQ for valgrind, version 1.9.6
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Last revised 5 May 2003
~~~~~~~~~~~~~~~~~~~~~~~

-----------------------------------------------------------------

Q1. Programs run OK on valgrind, but at exit produce a bunch
    of errors a bit like this

    ==20755== Invalid read of size 4
    ==20755==    at 0x40281C8A: _nl_unload_locale (loadlocale.c:238)
    ==20755==    by 0x4028179D: free_mem (findlocale.c:257)
    ==20755==    by 0x402E0962: __libc_freeres (set-freeres.c:34)
    ==20755==    by 0x40048DCC: vgPlain___libc_freeres_wrapper
                                                  (vg_clientfuncs.c:585)
    ==20755==    Address 0x40CC304C is 8 bytes inside a block of size 380 free'd
    ==20755==    at 0x400484C9: free (vg_clientfuncs.c:180)
    ==20755==    by 0x40281CBA: _nl_unload_locale (loadlocale.c:246)
    ==20755==    by 0x40281218: free_mem (setlocale.c:461)
    ==20755==    by 0x402E0962: __libc_freeres (set-freeres.c:34)

    and then die with a segmentation fault.

A1. When the program exits, valgrind runs the procedure
    __libc_freeres() in glibc.  This is a hook for memory debuggers,
    so they can ask glibc to free up any memory it has used.  Doing
    that is needed to ensure that valgrind doesn't incorrectly
    report space leaks in glibc.

    The problem is that running __libc_freeres() in older glibc
    versions causes this crash.

    WORKAROUND FOR 1.1.X and later versions of valgrind: use the
    --run-libc-freeres=no flag.  You may then get space leak
    reports for glibc allocations (please _don't_ report these
    to the glibc people, since they are not real leaks), but at
    least the program runs.

-----------------------------------------------------------------

Q2. My program dies complaining that syscall 197 is unimplemented.

A2. 197, which is fstat64, is supported by valgrind.  The problem is
    that the /usr/include/asm/unistd.h on the machine on which your
    valgrind was built doesn't match your kernel -- or, to be more
    specific, glibc is asking your kernel to do a syscall which is
    not listed in /usr/include/asm/unistd.h.

    The fix is simple.  Somewhere near the top of
    coregrind/vg_syscalls.c, add the following line:

       #define __NR_fstat64            197

    Rebuild and try again.  The above line should appear before any
    uses of the __NR_fstat64 symbol in that file.  If you look at the
    place where __NR_fstat64 is used in vg_syscalls.c, it will be
    obvious why this fix works.
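
    A slightly more defensive variant of that line is sketched
    below.  The #ifndef guard and the small helper function are
    assumptions added for illustration and are not part of the
    valgrind sources; the guard simply avoids a redefinition if the
    system headers already provide the symbol.

```c
/* Sketch only: guarded variant of the one-line fix above.
   The #ifndef guard is an assumption added here so the definition
   is skipped when the system headers already provide the symbol. */
#ifndef __NR_fstat64
#define __NR_fstat64 197   /* syscall number quoted in Q2 */
#endif

/* Hypothetical helper, for illustration: reports the number in use. */
int fstat64_syscall_number(void)
{
    return __NR_fstat64;
}
```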

-----------------------------------------------------------------

Q3. My (buggy) program dies like this:
      valgrind: vg_malloc2.c:442 (bszW_to_pszW):
                Assertion `pszW >= 0' failed.
    And/or my (buggy) program runs OK on valgrind, but dies like
    this on cachegrind.

A3. If valgrind shows any invalid reads, invalid writes or invalid
    frees in your program, the above may happen.  The reason is that
    your program may trash valgrind's low-level memory manager, which
    then dies with the above assertion, or something like it.  The
    cure is to fix your program so that it doesn't do any illegal
    memory accesses.  The above failure will hopefully go away after
    that.

-----------------------------------------------------------------

Q4. I'm running Red Hat Advanced Server.  Valgrind always segfaults at
    startup.

A4. Known issue with RHAS 2.1, due to funny stack permissions at
    startup.  However, valgrind-1.9.4 and later automatically handle
    this correctly, and should not segfault.

-----------------------------------------------------------------

Q5. I try running "valgrind my_program", but my_program runs normally,
    and Valgrind doesn't emit any output at all.

A5. This should no longer happen, as a check for this takes place
    when Valgrind starts up.

    However, Valgrind still doesn't work with programs that are entirely
    statically linked.  If you still want static linking, you can ask
    gcc to link only certain libraries statically.  Try the following
    options:

        -Wl,-Bstatic -lmyLibrary1 -lotherLibrary -Wl,-Bdynamic

    Just make sure you end with -Wl,-Bdynamic so that libc is dynamically
    linked.

-----------------------------------------------------------------

Q6. I try running "valgrind my_program" and get Valgrind's startup message,
    but I don't get any errors and I know my program has errors.

A6. By default, Valgrind only traces the top-level process.  So if your
    program spawns children, they won't be traced by Valgrind by default.
    Also, if your program is started by a shell script, Perl script, or
    something similar, Valgrind will trace the shell, or the Perl
    interpreter, or equivalent.

    To trace child processes, use the --trace-children=yes option.

    If you are tracing large trees of processes, it can be less
    disruptive to have the output sent over the network.  Give
    valgrind the flag --logsocket=127.0.0.1:12345 (if you want
    logging output sent to port 12345 on localhost).  You can
    use the valgrind-listener program to listen on that port:

       valgrind-listener 12345

    Obviously you have to start the listener process first.
    See the documentation for more details.
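
    To make the "spawns children" case concrete, here is a minimal
    sketch of a program that does its work in a child.  It assumes
    the child is started with fork() followed by exec, which is the
    usual pattern; the function name and the shell command are
    placeholders invented for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts a child with fork() + exec and returns its exit status,
   or -1 on failure.  The command is a placeholder chosen for this
   sketch; a real program would exec its own helper. */
int spawn_child_and_wait(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* fork failed */

    if (pid == 0) {
        /* Child: replace this process image with a shell command. */
        execl("/bin/sh", "sh", "-c", "exit 3", (char *)NULL);
        _exit(127);                     /* reached only if exec fails */
    }

    /* Parent: wait for the child and report how it exited. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    If the errors you expect live in the exec'd program rather than
    in the parent, running the parent with --trace-children=yes is
    what should bring them into view.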

-----------------------------------------------------------------

Q7. My threaded server process runs unbelievably slowly on
    valgrind.  So slowly, in fact, that at first I thought it
    had completely locked up.

A7. We are not completely sure about this, but one possibility
    is that laptops with power management fool valgrind's
    timekeeping mechanism, which is (somewhat in error) based
    on the x86 RDTSC instruction.  A "fix" which is claimed to
    work is to run some other cpu-intensive process at the same
    time, so that the laptop's power-management clock-slowing
    does not kick in.  We would be interested in hearing more
    feedback on this.

    Another possible cause is that versions prior to 1.9.6
    did not support threading on glibc 2.3.X systems well.
    Hopefully the situation is much improved with 1.9.6.

-----------------------------------------------------------------

Q8. My program dies (exactly) like this:

      REPE then 0xF
      valgrind: the `impossible' happened:
         Unhandled REPE case

A8. Yeah ... that, I believe, is an SSE or SSE2 instruction.  Are you
    building your app with -march=pentium4 or -march=athlon or
    something like that?  If you can somehow dissuade gcc from
    producing SSE/SSE2 instructions, you may be able to avoid this.
    Some folks have reported that removing the flag -march=...
    works around this.

    I'd be interested to hear if you can get rid of it by changing
    your application build flags.

-----------------------------------------------------------------

Q9. My program dies complaining that __libc_current_sigrtmin
    is unimplemented.

A9. Should be fixed in 1.9.6.  I would appreciate confirmation
    of that.

-----------------------------------------------------------------

Q10. I upgraded to Red Hat 9 and threaded programs now act
     strange / deadlock when they didn't before.

A10. Thread support on glibc 2.3.2+ with NPTL is not as
     good as on older LinuxThreads-based systems.  We have
     this under consideration.  Avoid Red Hat >= 8.1 for
     the time being, if you can.

     5 May 03: 1.9.6 should be significantly improved on
     Red Hat 9, SuSE 8.2 and other glibc-2.3.2 systems.

-----------------------------------------------------------------

Q11. I really need to use the NVidia libGL.so in my app.
     Help!

A11. NVidia also noticed this, it seems, and the "latest" drivers
     (version 4349, apparently) come with this text

        DISABLING CPU SPECIFIC FEATURES

        Setting the environment variable __GL_FORCE_GENERIC_CPU to a
        non-zero value will inhibit the use of CPU specific features
        such as MMX, SSE, or 3DNOW!.  Use of this option may result in
        performance loss.  This option may be useful in conjunction with
        software such as the Valgrind memory debugger.

     Set __GL_FORCE_GENERIC_CPU=1 and Valgrind should work.  This has
     been confirmed by various people.  Thanks NVidia!

-----------------------------------------------------------------

Q12. My program dies like this (often at exit):

     VG_(mash_LD_PRELOAD_and_LD_LIBRARY_PATH): internal error:
     (loads of text)

A12. One possible cause is that your program modifies its
     environment variables, possibly including zeroing them
     all.  Valgrind relies on the LD_PRELOAD, LD_LIBRARY_PATH and
     VG_ARGS variables.  Zeroing them will break things.

     As of 1.9.6, Valgrind only uses these variables with
     --trace-children=no, when executing execve() or using the
     --stop-after=yes flag.  This should reduce the potential for
     problems.
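
     If your program has to clean up its environment, one way to
     stay out of trouble is to remove only the variables you care
     about instead of wiping everything.  The following is a
     minimal sketch of that idea; the function name and the
     MY_APP_* variable names are hypothetical placeholders, not
     anything Valgrind defines.

```c
#include <stdlib.h>

/* Hypothetical application-specific variables to drop; the names
   are placeholders for this sketch, not anything Valgrind defines. */
static const char *const vars_to_drop[] = {
    "MY_APP_SECRET",
    "MY_APP_DEBUG",
};

/* Removes only the listed variables and leaves everything else,
   including LD_PRELOAD, LD_LIBRARY_PATH and VG_ARGS, untouched.
   Returns 0 on success, -1 if any unsetenv() call fails. */
int scrub_environment(void)
{
    size_t n = sizeof vars_to_drop / sizeof vars_to_drop[0];
    for (size_t i = 0; i < n; i++) {
        if (unsetenv(vars_to_drop[i]) != 0)
            return -1;
    }
    return 0;
}
```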

-----------------------------------------------------------------

Q13. My program dies like this:

      error: /lib/librt.so.1: symbol __pthread_clock_settime, version
      GLIBC_PRIVATE not defined in file libpthread.so.0 with link time
      reference

A13. This is a total swamp.  Nevertheless there is a way out.
     It's a problem which is not easy to fix.  Really the problem is
     that /lib/librt.so.1 refers to some symbols,
     __pthread_clock_settime and __pthread_clock_gettime, in
     /lib/libpthread.so which are not intended to be exported, i.e.
     they are private.

     Best solution is to ensure your program does not use
     /lib/librt.so.1.

     However ... since you're probably not using it directly, or even
     knowingly, that's hard to do.  You might instead be able to fix
     it by playing around with coregrind/vg_libpthread.vs.  Things to
     try:

     Remove this

        GLIBC_PRIVATE {
           __pthread_clock_gettime;
           __pthread_clock_settime;
        };

     or maybe remove this

        GLIBC_2.2.3 {
           __pthread_clock_gettime;
           __pthread_clock_settime;
        } GLIBC_2.2;

     or maybe add this

        GLIBC_2.2.4 {
           __pthread_clock_gettime;
           __pthread_clock_settime;
        } GLIBC_2.2;

        GLIBC_2.2.5 {
           __pthread_clock_gettime;
           __pthread_clock_settime;
        } GLIBC_2.2;

     or some combination of the above.  After each change you need to
     delete coregrind/libpthread.so and do make && make install.

     I just don't know if any of the above will work.  If you can
     find a solution which works, I would be interested to hear it.

     To which someone replied:

        I deleted this:

           GLIBC_2.2.3 {
              __pthread_clock_gettime;
              __pthread_clock_settime;
           } GLIBC_2.2;

        and it worked.

-----------------------------------------------------------------

Q14. My program uses the C++ STL and string classes.  Valgrind
     reports 'still reachable' memory leaks involving these classes
     at the exit of the program, but there should be none.

A14. First of all: relax, it's probably not a bug, but a feature.
     Many implementations of the C++ standard libraries use their own
     memory pool allocators.  Memory for quite a number of destructed
     objects is not immediately freed and given back to the OS, but
     kept in the pool(s) for later re-use.  The fact that the pools
     are not freed at the exit() of the program causes valgrind to
     report this memory as still reachable.  Not freeing the pools
     at exit() could arguably be called a bug in the library, though.

     Using gcc, you can force the STL to use malloc and to free
     memory as soon as possible by globally disabling memory caching.
     Beware!  Doing so will probably slow down your program,
     sometimes drastically.

     - With gcc 2.91, 2.95, 3.0 and 3.1, compile all source using the
       STL with -D__USE_MALLOC.  Beware!  This is removed from gcc
       starting with version 3.3.

     - With 3.2.2 and later, you should export the environment
       variable GLIBCPP_FORCE_NEW before running your program.

     There are other ways to disable memory pooling: using the
     malloc_alloc template with your objects (not portable, but
     should work for gcc) or even writing your own memory
     allocators.  But all this goes beyond the scope of this
     FAQ.  Start by reading
     http://gcc.gnu.org/onlinedocs/libstdc++/ext/howto.html#3
     if you absolutely want to do that.  But beware:

     1) there are currently changes underway for gcc which are not
        totally reflected in the docs right now
        ("now" == 26 Apr 03)

     2) allocators belong to the more messy parts of the STL and
        people went to great lengths to make it portable across
        platforms.  Chances are good that your solution will work
        on your platform, but not on others.

-----------------------------------------------------------------

Q15. My program dies with a segmentation fault, but Valgrind doesn't give
     any error messages before it, or none that look related.

A15. The one kind of segmentation fault that Valgrind won't give any
     warnings about is writes to read-only memory.  Maybe your program is
     writing to a static string like this:

        char* s = "hello";
        s[0] = 'j';

     or something similar.  Writing to read-only memory can also apparently
     make LinuxThreads behave strangely.
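
     One way to avoid this is to modify a writable copy of the
     text rather than the literal itself.  A minimal sketch
     follows; the helper name make_jello is made up for the
     example.  In the simplest case, declaring the string as an
     array (char s[] = "hello";) has the same effect.

```c
#include <assert.h>
#include <string.h>

/* Copies a string literal into caller-provided writable storage and
   changes the first character there, instead of writing through a
   pointer to the literal itself.  Helper name is hypothetical. */
void make_jello(char *dst, size_t dst_size)
{
    static const char src[] = "hello";   /* read-only source text */
    assert(dst_size >= sizeof src);      /* room for text plus '\0' */
    memcpy(dst, src, sizeof src);
    dst[0] = 'j';                        /* safe: dst is writable */
}
```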

-----------------------------------------------------------------

Q16. When I try building Valgrind, 'make' dies partway with an
     assertion failure, something like this:

        make: expand.c:489: allocated_variable_append: Assertion
        `current_variable_set_list->next != 0' failed.

A16. It's probably a bug in 'make'.  Some, but not all, instances of
     version 3.79.1 have this bug; see
     www.mail-archive.com/bug-make@gnu.org/msg01658.html.  Try upgrading
     to a more recent version of 'make'.

-----------------------------------------------------------------

Q17. I tried writing a suppression but it didn't work.  Can you
     write my suppression for me?

A17. Yes!  Use the --gen-suppressions=yes feature to spit out
     suppressions automatically for you.  You can then edit them
     if you like, e.g. combining similar automatically generated
     suppressions using wildcards like '*'.

     If you really want to write suppressions by hand, read the
     manual carefully.  Note particularly that C++ function names
     must be _mangled_.

-----------------------------------------------------------------

(this is the end of the FAQ.)