| 1 | |
| 2 | A mini-FAQ for valgrind, version 1.9.5 |
| 3 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 4 | Last revised 22 Apr 2003 |
| 5 | ~~~~~~~~~~~~~~~~~~~~~~~~ |
| 6 | |
| 7 | Q1. Programs run OK on valgrind, but at exit produce a bunch |
| 8 | of errors a bit like this |
| 9 | |
| 10 | ==20755== Invalid read of size 4 |
| 11 | ==20755== at 0x40281C8A: _nl_unload_locale (loadlocale.c:238) |
| 12 | ==20755== by 0x4028179D: free_mem (findlocale.c:257) |
| 13 | ==20755== by 0x402E0962: __libc_freeres (set-freeres.c:34) |
| 14 | ==20755== by 0x40048DCC: vgPlain___libc_freeres_wrapper |
| 15 | (vg_clientfuncs.c:585) |
| 16 | ==20755== Address 0x40CC304C is 8 bytes inside a block of size 380 free'd |
| 17 | ==20755== at 0x400484C9: free (vg_clientfuncs.c:180) |
| 18 | ==20755== by 0x40281CBA: _nl_unload_locale (loadlocale.c:246) |
| 19 | ==20755== by 0x40281218: free_mem (setlocale.c:461) |
| 20 | ==20755== by 0x402E0962: __libc_freeres (set-freeres.c:34) |
| 21 | |
| 22 | and then die with a segmentation fault. |
| 23 | |
| 24 | A1. When the program exits, valgrind runs the procedure |
| 25 | __libc_freeres() in glibc. This is a hook for memory debuggers, |
| 26 | so they can ask glibc to free up any memory it has used. Doing |
| 27 | that is needed to ensure that valgrind doesn't incorrectly |
| 28 | report space leaks in glibc. |
| 29 | |
| 30 | The problem is that running __libc_freeres() in older glibc |
| 31 | versions causes this crash. |
| 32 | |
| 33 | WORKAROUND FOR 1.0.X versions of valgrind: The simple fix is to |
| 34 | find the one and only call to __libc_freeres() in valgrind's |
| 35 | sources, comment it out, and then rebuild the system. In the |
| 36 | 1.0.3 version, this call is on line 584 of vg_clientfuncs.c. |
| 37 | This may mean you get false reports of space leaks in glibc, but |
| 38 | it at least avoids the crash. |
| 39 | |
| 40 | WORKAROUND FOR 1.1.X and later versions of valgrind: use the |
| 41 | --run-libc-freeres=no flag. |
| 42 | |
| 43 | |
| 44 | Q2. My program dies complaining that syscall 197 is unimplemented. |
| 45 | |
| 46 | A2. 197, which is fstat64, is supported by valgrind. The problem is |
| 47 | that the /usr/include/asm/unistd.h on the machine on which your |
| 48 | valgrind was built doesn't match your kernel -- or, to be more |
| 49 | specific, glibc is asking your kernel to do a syscall which is |
| 50 | not listed in /usr/include/asm/unistd.h. |
| 51 | |
| 52 | The fix is simple. Somewhere near the top of vg_syscall_mem.c, |
| 53 | add the following line: |
| 54 | |
| 55 | #define __NR_fstat64 197 |
| 56 | |
| 57 | Rebuild and try again. The above line should appear before any |
| 58 | uses of the __NR_fstat64 symbol in that file. If you look at the |
| 59 | place where __NR_fstat64 is used in vg_syscall_mem.c, it will be |
| 60 | obvious why this fix works. NOTE: for valgrind versions 1.1.0 |
| 61 | and later, the relevant file is actually coregrind/vg_syscalls.c. |
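
As a rough illustration of why the #define helps, the sketch below
mimics a dispatcher whose fstat64 case is only compiled in when
__NR_fstat64 is defined. This is an assumption about the general
pattern, not a copy of valgrind's code: the function syscall_name and
its body are hypothetical. Only the symbol __NR_fstat64 and the number
197 come from the answer above.

```c
#include <assert.h>
#include <stddef.h>

/* Fallback for build machines whose headers do not list this syscall.
   197 is the number the FAQ gives for fstat64. */
#ifndef __NR_fstat64
#define __NR_fstat64 197
#endif

/* Hypothetical stand-in for a syscall dispatcher: returns a name for
   the syscalls it knows about, or NULL for "unimplemented". */
static const char *syscall_name(int syscallno)
{
    assert(syscallno >= 0);   /* syscall numbers are non-negative */

    switch (syscallno) {
#if defined(__NR_fstat64)
    case __NR_fstat64:        /* only present if the symbol is defined */
        return "fstat64";
#endif
    default:
        return NULL;
    }
}
```

Without the fallback #define, a build against old headers would drop
the fstat64 case entirely and syscall_name(197) would return NULL,
which corresponds to the "unimplemented" complaint.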
| 62 | |
| 63 | |
| 64 | Q3. My (buggy) program dies like this: |
| 65 | valgrind: vg_malloc2.c:442 (bszW_to_pszW): |
| 66 | Assertion `pszW >= 0' failed. |
| 67 | And/or my (buggy) program runs OK on valgrind, but dies like |
| 68 | this on cachegrind. |
| 69 | |
| 70 | A3. If valgrind shows any invalid reads, invalid writes or invalid |
| 71 | frees in your program, the above may happen. The reason is that your |
| 72 | program may trash valgrind's low-level memory manager, which then |
| 73 | dies with the above assertion, or something like it. The cure |
| 74 | is to fix your program so that it doesn't do any illegal memory |
| 75 | accesses. The above failure will hopefully go away after that. |
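
For illustration only, here is one common shape of such an illegal
access and its fix: a heap buffer allocated one byte too small for a
string. The function copy_string is hypothetical and is not taken from
any program discussed here. The buggy allocation is shown in a comment;
the code as written is the corrected version.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of s, or NULL if allocation fails.
   The caller is responsible for calling free() on the result. */
static char *copy_string(const char *s)
{
    size_t len;
    char *copy;

    assert(s != NULL);
    len = strlen(s);

    /* BUGGY variant: malloc(len) leaves no room for the terminating
       NUL, so the memcpy below would write one byte past the block,
       which is an invalid write of the kind described above. */
    copy = malloc(len + 1);   /* fixed: room for the NUL as well */
    if (copy == NULL)
        return NULL;

    memcpy(copy, s, len + 1); /* copies the string and its NUL */
    return copy;
}
```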
| 76 | |
| 77 | |
| 78 | Q4. I'm running Red Hat Advanced Server. Valgrind always segfaults at |
| 79 | startup. |
| 80 | |
| 81 | A4. Known issue with RHAS 2.1. The following kludge works, but |
| 82 | is too gruesome to put in the sources permanently. Try it. |
| 83 | Last verified as working on RHAS 2.1 at 20021008. |
| 84 | |
| 85 | Find the following comment in vg_main.c -- in 1.0.4 this is at |
| 86 | line 636: |
| 87 | |
| 88 | /* we locate: NEW_AUX_ENT(1, AT_PAGESZ, ELF_EXEC_PAGESIZE) in |
| 89 | the elf interpreter table */ |
| 90 | |
| 91 | Immediately _before_ this comment add the following: |
| 92 | |
| 93 | /* HACK for R H Advanced server. Ignore all the above and |
| 94 | start the search 18 pages below the "obvious" start point. |
| 95 | God knows why. Seems like we can't go into the highest 18 |
| 96 | pages of the stack. This is not good! -- the 18 pages is |
| 97 | determined just by looking for the highest proddable |
| 98 | address. It would be nice to see some kernel or libc or |
| 99 | something code to justify this. */ |
| 100 | |
| 101 | /* 0xBFFEE000 is 0xC0000000 - 18 pages */ |
| 102 | sp = 0xBFFEE000; |
| 103 | |
| 104 | /* end of HACK for R H Advanced server. */ |
| 105 | |
| 106 | Obviously the assignment to sp is the only important line. |
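
The constant in the HACK can be checked with a little arithmetic. The
sketch below assumes 4096-byte pages, which the FAQ does not state but
which is what makes 0xC0000000 minus 18 pages come out to 0xBFFEE000.
The macro and function names are made up for this example.

```c
#include <assert.h>

/* Assumptions for illustration: 4096-byte pages, and a stack whose
   "obvious" start point is 0xC0000000 as in the HACK's comment. */
#define PAGE_BYTES     4096UL
#define STACK_TOP      0xC0000000UL
#define SKIPPED_PAGES  18UL

/* Address at which the HACK starts its search. */
static unsigned long hack_start_sp(void)
{
    unsigned long sp = STACK_TOP - SKIPPED_PAGES * PAGE_BYTES;

    assert(sp % PAGE_BYTES == 0);   /* result stays page-aligned */
    return sp;
}
```

18 pages of 4096 bytes is 73728 bytes, or 0x12000, and
0xC0000000 - 0x12000 = 0xBFFEE000.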
| 107 | |
| 108 | |
| 109 | Q5. I try running "valgrind my_program", but my_program runs normally, |
| 110 | and Valgrind doesn't emit any output at all. |
| 111 | |
| 112 | A5. Is my_program statically linked? Valgrind doesn't work with |
| 113 | statically linked binaries. The program must rely on at least one |
| 114 | shared object. To determine if my_program is statically linked, run: |
| 115 | |
| 116 | ldd my_program |
| 117 | |
| 118 | It will show what shared objects my_program relies on, or say: |
| 119 | |
| 120 | not a dynamic executable |
| 121 | |
| 122 | if my_program is statically linked. |
| 123 | |
| 124 | |
| 125 | Q6. I try running "valgrind my_program" and get Valgrind's startup message, |
| 126 | but I don't get any errors and I know my program has errors. |
| 127 | |
| 128 | A6. By default, Valgrind only traces the top-level process. So if your |
| 129 | program spawns children, they won't be traced by Valgrind. |
| 130 | Also, if your program is started by a shell script, Perl script, or |
| 131 | something similar, Valgrind will trace the shell, or the Perl |
| 132 | interpreter, or equivalent. |
| 133 | |
| 134 | To trace child processes, use the --trace-children=yes option. |
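
To make "spawns children" concrete, the sketch below shows a parent
that starts another program with fork() and exec and waits for it. The
helper run_child is hypothetical. The point is only that any errors in
the program started this way belong to a child process, so by default
they would not appear in Valgrind's report for the parent.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs prog (looked up via PATH) as a child process with no extra
   arguments. Returns its exit status, or -1 on failure. */
static int run_child(const char *prog)
{
    pid_t pid;
    int status;

    assert(prog != NULL);

    pid = fork();
    if (pid < 0)
        return -1;                     /* fork failed */

    if (pid == 0) {
        /* Child: replace this process image with prog. */
        execlp(prog, prog, (char *)NULL);
        _exit(127);                    /* only reached if exec failed */
    }

    /* Parent: wait for the child and report how it exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If a program built around something like this is called my_program,
running "valgrind --trace-children=yes my_program" asks Valgrind to
follow the child as well.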
| 135 | |
| 136 | |
| 137 | (this is the end of the FAQ.) |