A mini-FAQ for valgrind, version 1.9.6
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Last revised 22 Apr 2003
~~~~~~~~~~~~~~~~~~~~~~~~

-----------------------------------------------------------------

Q1. Programs run OK on valgrind, but at exit produce a bunch
    of errors a bit like this:

    ==20755== Invalid read of size 4
    ==20755==    at 0x40281C8A: _nl_unload_locale (loadlocale.c:238)
    ==20755==    by 0x4028179D: free_mem (findlocale.c:257)
    ==20755==    by 0x402E0962: __libc_freeres (set-freeres.c:34)
    ==20755==    by 0x40048DCC: vgPlain___libc_freeres_wrapper (vg_clientfuncs.c:585)
    ==20755==    Address 0x40CC304C is 8 bytes inside a block of size 380 free'd
    ==20755==    at 0x400484C9: free (vg_clientfuncs.c:180)
    ==20755==    by 0x40281CBA: _nl_unload_locale (loadlocale.c:246)
    ==20755==    by 0x40281218: free_mem (setlocale.c:461)
    ==20755==    by 0x402E0962: __libc_freeres (set-freeres.c:34)

    and then die with a segmentation fault.

A1. When the program exits, valgrind runs the procedure
    __libc_freeres() in glibc. This is a hook for memory debuggers,
    so they can ask glibc to free up any memory it has used. Doing
    that is needed to ensure that valgrind doesn't incorrectly
    report space leaks in glibc.

    The problem is that running __libc_freeres() in older glibc
    versions causes this crash.

    WORKAROUND FOR 1.1.X and later versions of valgrind: use the
    --run-libc-freeres=no flag. You may then get space leak
    reports for glibc allocations (please _don't_ report these
    to the glibc people, since they are not real leaks), but at
    least the program runs.

-----------------------------------------------------------------

Q2. My program dies complaining that syscall 197 is unimplemented.

A2. 197, which is fstat64, is supported by valgrind. The problem is
    that the /usr/include/asm/unistd.h on the machine on which your
    valgrind was built doesn't match your kernel -- or, to be more
    specific, glibc is asking your kernel to do a syscall which is
    not listed in /usr/include/asm/unistd.h.

    The fix is simple. Somewhere near the top of
    coregrind/vg_syscalls.c, add the following line:

       #define __NR_fstat64 197

    Rebuild and try again. The above line should appear before any
    uses of the __NR_fstat64 symbol in that file. If you look at the
    place where __NR_fstat64 is used in vg_syscalls.c, it will be
    obvious why this fix works.
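
    As a hedged variant of the line above, the definition can be
    wrapped in an #ifndef guard so that it does not clash with a
    header that already supplies the symbol. The guard and the
    is_handled() function are our own illustration, not code from
    vg_syscalls.c. is_handled() only sketches why the definition
    would matter, under the assumption that the real handler is
    compiled conditionally on the symbol.

```c
/* Assumption: guard the definition so that a header which already
   provides __NR_fstat64 takes precedence. */
#ifndef __NR_fstat64
#define __NR_fstat64 197
#endif

/* Hypothetical stand-in for a syscall dispatcher. The case below is
   compiled only when the symbol is defined, so without the #define the
   syscall would fall through to the "unimplemented" path.
   Returns 1 if the syscall number is handled, 0 otherwise. */
int is_handled(int syscallno)
{
    switch (syscallno) {
#if defined(__NR_fstat64)
        case __NR_fstat64:
            return 1;
#endif
        default:
            return 0;
    }
}
```

    With the guard in place the same source builds whether or not
    the system headers know about fstat64.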

-----------------------------------------------------------------

Q3. My (buggy) program dies like this:
       valgrind: vg_malloc2.c:442 (bszW_to_pszW):
                 Assertion `pszW >= 0' failed.
    And/or my (buggy) program runs OK on valgrind, but dies like
    this on cachegrind.

A3. If valgrind shows any invalid reads, invalid writes or invalid
    frees in your program, the above may happen. The reason is that
    your program may trash valgrind's low-level memory manager, which
    then dies with the above assertion, or something like it. The
    cure is to fix your program so that it doesn't do any illegal
    memory accesses. The above failure will hopefully go away after
    that.
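
    To make "illegal memory accesses" concrete, here is a minimal
    sketch of one common case, a heap block that is one byte too
    small. It is our own example, not taken from any bug report, and
    copy_string() is a hypothetical helper. The buggy allocation is
    described in a comment and the code shows the corrected form.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or NULL if the
   allocation fails. The caller must free the result exactly once. */
char *copy_string(const char *s)
{
    size_t len = strlen(s);

    /* The buggy form would be malloc(len): that leaves no room for
       the terminating NUL, so the copy below would write one byte
       past the end of the block (an invalid write of size 1). */
    char *copy = malloc(len + 1);   /* corrected: room for the NUL */
    if (copy == NULL)
        return NULL;

    memcpy(copy, s, len + 1);       /* copies the NUL as well */
    return copy;
}
```

    Freeing the returned pointer twice, or reading it after free(),
    would be examples of the other two error classes named above.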

-----------------------------------------------------------------

Q4. I'm running Red Hat Advanced Server. Valgrind always segfaults at
    startup.

A4. This is a known issue with RHAS 2.1, due to funny stack
    permissions at startup. However, valgrind-1.9.4 and later handle
    this correctly without any action on your part, and should not
    segfault.

-----------------------------------------------------------------

Q5. I try running "valgrind my_program", but my_program runs normally,
    and Valgrind doesn't emit any output at all.

A5. Is my_program statically linked? Valgrind doesn't work with
    statically linked binaries. The program must rely on at least one
    shared object. To determine if my_program is statically linked,
    run:

       ldd my_program

    It will show what shared objects my_program relies on, or say:

       not a dynamic executable

    if my_program is statically linked.

-----------------------------------------------------------------

Q6. I try running "valgrind my_program" and get Valgrind's startup
    message, but I don't get any errors and I know my program has
    errors.

A6. By default, Valgrind only traces the top-level process. So if your
    program spawns children, they won't be traced. Also, if your
    program is started by a shell script, Perl script, or something
    similar, Valgrind will trace the shell, or the Perl interpreter,
    or equivalent, rather than your program.

    To trace child processes, use the --trace-children=yes option.

    If you are tracing large trees of processes, it can be less
    disruptive to have the output sent over the network. Give
    valgrind the flag --logsocket=127.0.0.1:12345 (if you want
    logging output sent to port 12345 on localhost). You can
    use the valgrind-listener program to listen on that port:

       valgrind-listener 12345

    Obviously you have to start the listener process first.
    See the documentation for more details.
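
    For readers unsure whether their program "spawns children" in
    the sense used above, the following sketch shows the usual
    pattern. It is our own illustration: run_child() is a
    hypothetical helper, and the comment about which process is
    skipped reflects our understanding of --trace-children rather
    than anything stated in this FAQ.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] as a child process and returns its
   exit status, or -1 if the child could not be created or did not
   exit normally. */
int run_child(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: after a successful exec this is a new program image.
           As we understand it, this is the process that goes untraced
           unless --trace-children=yes is given. */
        execvp(argv[0], argv);
        _exit(127);             /* reached only if exec failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

    If the errors you expect live in the program started this way,
    the top-level trace alone will look clean.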

-----------------------------------------------------------------

Q7. My threaded server process runs unbelievably slowly on
    valgrind. So slowly, in fact, that at first I thought it
    had completely locked up.

A7. We are not completely sure about this, but one possibility
    is that laptops with power management fool valgrind's
    timekeeping mechanism, which is (somewhat in error) based
    on the x86 RDTSC instruction. A "fix" which is claimed to
    work is to run some other CPU-intensive process at the same
    time, so that the laptop's power-management clock-slowing
    does not kick in. We would be interested in hearing more
    feedback on this.

-----------------------------------------------------------------

Q8. My program dies (exactly) like this:

       REPE then 0xF
       valgrind: the `impossible' happened:
          Unhandled REPE case

A8. Yeah ... that I believe is a P4-specific instruction. Are you
    building your app with -march=pentium4 or something like that?
    Others have reported that removing the flag works around this.
    In fact this is pretty easy to fix and I do have it on my
    to-do-for-1.9.6 list.

    I'd be interested to hear if you can get rid of it by changing
    your application build flags.

-----------------------------------------------------------------

Q9. My program dies complaining that __libc_current_sigrtmin
    is unimplemented.

A9. Try the following. It is an experiment, but it might work.
    We would very much appreciate you telling us whether it does
    or does not work for you.

    In vg_libpthread.c, add the 3 functions below.

    In vg_libpthread_unimp.c, remove the stubs for the same 3
    functions.

    Let me know if it helps. Quite a lot of other valgrind users
    complain about this, but I have never been able to reproduce it,
    so fixing it isn't easy. So it's useful if you can try.

       /* Experimental stubs: each one always reports failure (-1). */
       int __libc_current_sigrtmin (void)
       {
          return -1;
       }

       int __libc_current_sigrtmax (void)
       {
          return -1;
       }

       int __libc_allocate_rtsig (int high)
       {
          return -1;
       }

-----------------------------------------------------------------

Q10. I upgraded to Red Hat 9 and threaded programs now act
     strange / deadlock when they didn't before.

A10. Thread support on glibc 2.3.2+ with NPTL is not as
     good as on older LinuxThreads-based systems. We have
     this under consideration. Avoid Red Hat >= 8.1 for
     the time being, if you can.

-----------------------------------------------------------------

Q11. I really need to use the NVidia libGL.so in my app.
     Help!

A11. NVidia also noticed this, it seems, and the "latest" drivers
     (version 4349, apparently) come with this text:

        DISABLING CPU SPECIFIC FEATURES

        Setting the environment variable __GL_FORCE_GENERIC_CPU to a
        non-zero value will inhibit the use of CPU specific features
        such as MMX, SSE, or 3DNOW!. Use of this option may result in
        performance loss. This option may be useful in conjunction with
        software such as the Valgrind memory debugger.

     Set __GL_FORCE_GENERIC_CPU=1 and Valgrind should work. This has
     been confirmed by various people. Thanks NVidia!

-----------------------------------------------------------------

Q12. My program dies like this (often at exit):

        VG_(mash_LD_PRELOAD_and_LD_LIBRARY_PATH): internal error:
        (loads of text)

A12. We're not entirely sure about this, and would appreciate
     someone sending a simple test case for us to look at.
     One possible cause is that your program modifies its
     environment variables, possibly including zeroing them
     all. Avoid this if you can.

     In any case, you may be able to work around it like this:
     comment out the call to VG_(core_panic) at
     coregrind/vg_main.c:1647 and see if that helps. The text of
     coregrind/vg_main.c:1647 is as follows:

        VG_(core_panic)("VG_(mash_LD_PRELOAD_and_LD_LIBRARY_PATH) failed\n");

     and so it's this call you want to comment out.
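
     On the "avoid zeroing them all" advice above: if your program
     really must clean its environment, one option is to remove only
     the variables it cares about and leave the rest alone. The
     sketch below is our own suggestion, not a confirmed fix, and
     scrub_environment() is a hypothetical helper. Whether leaving
     LD_PRELOAD and LD_LIBRARY_PATH untouched avoids the error is an
     assumption based on the name in the message.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>

/* Hypothetical helper: removes only the named variables instead of
   wiping the whole environment, so entries it was not asked about
   (for example LD_PRELOAD and LD_LIBRARY_PATH) stay as they were.
   Returns 0 on success, -1 if any unsetenv() call failed. */
int scrub_environment(const char *const names[], size_t count)
{
    int result = 0;

    for (size_t i = 0; i < count; i++) {
        if (unsetenv(names[i]) != 0)
            result = -1;
    }
    return result;
}
```

     A call such as scrub_environment(drop, 1), with drop holding
     the one name to remove, leaves every other variable in place.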

-----------------------------------------------------------------

Q13. My program dies like this:

        error: /lib/librt.so.1: symbol __pthread_clock_settime, version
        GLIBC_PRIVATE not defined in file libpthread.so.0 with link time
        reference

A13. This is a total swamp, and not an easy problem to fix.
     Nevertheless there is a way out. The real problem is
     that /lib/librt.so.1 refers to some symbols
     __pthread_clock_settime and __pthread_clock_gettime in
     /lib/libpthread.so which are not intended to be exported, i.e.
     they are private.

     The best solution is to ensure your program does not use
     /lib/librt.so.1.

     However .. since you're probably not using it directly, or even
     knowingly, that's hard to do. You might instead be able to fix
     it by playing around with coregrind/vg_libpthread.vs. Things to
     try:

     Remove this

        GLIBC_PRIVATE {
           __pthread_clock_gettime;
           __pthread_clock_settime;
        };

     or maybe remove this

        GLIBC_2.2.3 {
           __pthread_clock_gettime;
           __pthread_clock_settime;
        } GLIBC_2.2;

     or maybe add this

        GLIBC_2.2.4 {
           __pthread_clock_gettime;
           __pthread_clock_settime;
        } GLIBC_2.2;

        GLIBC_2.2.5 {
           __pthread_clock_gettime;
           __pthread_clock_settime;
        } GLIBC_2.2;

     or some combination of the above. After each change you need to
     delete coregrind/libpthread.so and do make && make install.

     I just don't know if any of the above will work. If you can
     find a solution which works, I would be interested to hear it.

     To which someone replied:

        I deleted this:

        GLIBC_2.2.3 {
           __pthread_clock_gettime;
           __pthread_clock_settime;
        } GLIBC_2.2;

        and it worked.

-----------------------------------------------------------------

(this is the end of the FAQ.)