Blame - FAQ.txt - platform/external/valgrind

blob: 665c91967911bc76c33dfa00cf85832a0989295d

nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	1	Valgrind FAQ, version 2.1.2
				2	~~~~~~~~~~~~~~~~~~~~~~~~~~~
nethercote	8deae81	2004-07-18 10:35:36 +0000	[diff] [blame]	3	Last revised 18 July 2004
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	4	~~~~~~~~~~~~~~~~~~~~~~~~~
njn	4e59bd9	2003-04-22 20:58:47 +0000	[diff] [blame]	5
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	6	1. Background
				7	2. Compiling, installing and configuring
				8	3. Valgrind aborts unexpectedly
				9	4. Valgrind behaves unexpectedly
				10	5. Memcheck doesn't find my bug
				11	6. Miscellaneous
				12
				13
				14	-----------------------------------------------------------------
				15	1. Background
				16	-----------------------------------------------------------------
				17
				18	1.1. How do you pronounce "Valgrind"?
				19
				20	The "Val" as in the word "value". The "grind" is pronounced with a
				21	short 'i' -- i.e. "grinned" (rhymes with "tinned") rather than "grined"
				22	(rhymes with "find").
				23
				24	Don't feel bad: almost everyone gets it wrong at first.
njn	4e59bd9	2003-04-22 20:58:47 +0000	[diff] [blame]	25
sewardj	36a53ad	2003-04-22 23:26:24 +0000	[diff] [blame]	26	-----------------------------------------------------------------
				27
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	28	1.2. Where does the name "Valgrind" come from?
				29
				30	From Nordic mythology. Originally (before release) the project was
				31	named Heimdall, after the watchman of the Nordic gods. He could "see a
				32	hundred miles by day or night, hear the grass growing, see the wool
				33	growing on a sheep's back" (etc). This would have been a great name,
				34	but it was already taken by a security package "Heimdal".
				35
				36	Keeping with the Nordic theme, Valgrind was chosen. Valgrind is the
				37	name of the main entrance to Valhalla (the Hall of the Chosen Slain in
				38	Asgard). Over this entrance there resides a wolf and over it there is
				39	the head of a boar and on it perches a huge eagle, whose eyes can see to
				40	the far regions of the nine worlds. Only those judged worthy by the
				41	guardians are allowed to pass through Valgrind. All others are refused
				42	entrance.
				43
				44	It's not short for "value grinder", although that's not a bad guess.
				45
				46
				47	-----------------------------------------------------------------
				48	2. Compiling, installing and configuring
				49	-----------------------------------------------------------------
				50
				51	2.1. When I try building Valgrind, 'make' dies partway with an
				52	assertion failure, something like this:
				53
				54	make: expand.c:489: allocated_variable_append: Assertion
				55	`current_variable_set_list->next != 0' failed.
				56
				57	It's probably a bug in 'make'. Some, but not all, instances of version 3.79.1
				58	have this bug; see www.mail-archive.com/bug-make@gnu.org/msg01658.html. Try
				59	upgrading to a more recent version of 'make'. Alternatively, we have heard
				60	that unsetting the CFLAGS environment variable avoids the problem.
				61
				62
				63	-----------------------------------------------------------------
				64	3. Valgrind aborts unexpectedly
				65	-----------------------------------------------------------------
				66
				67	3.1. Programs run OK on Valgrind, but at exit produce a bunch of errors a bit
				68	like this
njn	4e59bd9	2003-04-22 20:58:47 +0000	[diff] [blame]	69
				70	==20755== Invalid read of size 4
				71	==20755== at 0x40281C8A: _nl_unload_locale (loadlocale.c:238)
				72	==20755== by 0x4028179D: free_mem (findlocale.c:257)
				73	==20755== by 0x402E0962: __libc_freeres (set-freeres.c:34)
				74	==20755== by 0x40048DCC: vgPlain___libc_freeres_wrapper
				75	(vg_clientfuncs.c:585)
				76	==20755== Address 0x40CC304C is 8 bytes inside a block of size 380 free'd
				77	==20755== at 0x400484C9: free (vg_clientfuncs.c:180)
				78	==20755== by 0x40281CBA: _nl_unload_locale (loadlocale.c:246)
				79	==20755== by 0x40281218: free_mem (setlocale.c:461)
				80	==20755== by 0x402E0962: __libc_freeres (set-freeres.c:34)
				81
				82	and then die with a segmentation fault.
				83
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	84	When the program exits, Valgrind runs the procedure __libc_freeres() in
				85	glibc. This is a hook for memory debuggers, so they can ask glibc to
				86	free up any memory it has used. Doing that is needed to ensure that
				87	Valgrind doesn't incorrectly report space leaks in glibc.
njn	4e59bd9	2003-04-22 20:58:47 +0000	[diff] [blame]	88
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	89	The problem is that running __libc_freeres() in older glibc versions causes
				90	this crash.
njn	4e59bd9	2003-04-22 20:58:47 +0000	[diff] [blame]	91
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	92	WORKAROUND FOR 1.1.X and later versions of Valgrind: use the
				93	--run-libc-freeres=no flag. You may then get space leak reports for
				94	glibc-allocations (please _don't_ report these to the glibc people,
				95	since they are not real leaks), but at least the program runs.
njn	4e59bd9	2003-04-22 20:58:47 +0000	[diff] [blame]	96
sewardj	36a53ad	2003-04-22 23:26:24 +0000	[diff] [blame]	97	-----------------------------------------------------------------
njn	4e59bd9	2003-04-22 20:58:47 +0000	[diff] [blame]	98
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	99	3.2. My (buggy) program dies like this:
njn	4e59bd9	2003-04-22 20:58:47 +0000	[diff] [blame]	100	valgrind: vg_malloc2.c:442 (bszW_to_pszW):
				101	Assertion `pszW >= 0' failed.
njn	4e59bd9	2003-04-22 20:58:47 +0000	[diff] [blame]	102
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	103	If Memcheck (the memory checker) shows any invalid reads, invalid writes
				104	or invalid frees in your program, the above may happen. The reason is that
				105	your program may trash Valgrind's low-level memory manager, which then
				106	dies with the above assertion, or something like it. The cure is to
				107	fix your program so that it doesn't do any illegal memory accesses. The
				108	above failure will hopefully go away after that.
njn	4e59bd9	2003-04-22 20:58:47 +0000	[diff] [blame]	109
sewardj	36a53ad	2003-04-22 23:26:24 +0000	[diff] [blame]	110	-----------------------------------------------------------------
njn	4e59bd9	2003-04-22 20:58:47 +0000	[diff] [blame]	111
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	112	3.3. My program dies, printing a message like this along the way:
sewardj	36a53ad	2003-04-22 23:26:24 +0000	[diff] [blame]	113
nethercote	3178887	2003-11-02 16:32:05 +0000	[diff] [blame]	114	disInstr: unhandled instruction bytes: 0x66 0xF 0x2E 0x5
sewardj	36a53ad	2003-04-22 23:26:24 +0000	[diff] [blame]	115
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	116	Older versions did not support some x86 instructions, particularly
				117	SSE/SSE2 instructions. Try a newer Valgrind; we now support almost all
				118	instructions. If it still happens with newer versions and the failing
				119	instruction is an SSE/SSE2 instruction, you might be able to recompile
nethercote	8deae81	2004-07-18 10:35:36 +0000	[diff] [blame]	120	your program without it by using the flag -march to gcc. Either way,
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	121	let us know and we'll try to fix it.
sewardj	36a53ad	2003-04-22 23:26:24 +0000	[diff] [blame]	122
nethercote	8deae81	2004-07-18 10:35:36 +0000	[diff] [blame]	123	Another possibility is that your program has a bug and erroneously jumps
				124	to a non-code address, in which case you'll get a SIGILL signal.
				125	Memcheck/Addrcheck may issue a warning just before this happens, but they
				126	might not if the jump happens to land in addressable memory.
				127
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	128
				129	-----------------------------------------------------------------
				130	4. Valgrind behaves unexpectedly
				131	-----------------------------------------------------------------
				132
				133	4.1. I try running "valgrind my_program", but my_program runs normally,
				134	and Valgrind doesn't emit any output at all.
				135
				136	For versions prior to 2.1.1:
				137
				138	Valgrind doesn't work out-of-the-box with programs that are entirely
				139	statically linked. It does a quick test at startup, and if it detects
				140	that the program is statically linked, it aborts with an explanation.
				141
				142	This test may fail in some obscure cases, eg. if you run a script under
				143	Valgrind and the script interpreter is statically linked.
				144
				145	If you still want static linking, you can ask gcc to link certain
				146	libraries statically. Try the following options:
				147
				148	-Wl,-Bstatic -lmyLibrary1 -lotherLibrary -Wl,-Bdynamic
				149
				150	Just make sure you end with -Wl,-Bdynamic so that libc is dynamically
				151	linked.
				152
				153	If you absolutely cannot use dynamic libraries, you can try statically
				154	linking together all the .o files in coregrind/, all the .o files of the
				155	tool of your choice (eg. those in memcheck/), and the .o files of your
				156	program. You'll end up with a statically linked binary that runs
				157	permanently under Valgrind's control. Note that we haven't tested this
				158	procedure thoroughly.
				159
				160
				161	For versions 2.1.1 and later:
				162
				163	Valgrind does now work with static binaries, although beware that some
				164	of the tools won't operate as well as normal, because they have access
				165	to less information about how the program runs. Eg. Memcheck will miss
				166	some errors that it would otherwise find. This is because Valgrind
				167	doesn't replace malloc() and friends with its own versions. It's best
				168	if your program is dynamically linked with glibc.
sewardj	36a53ad	2003-04-22 23:26:24 +0000	[diff] [blame]	169
				170	-----------------------------------------------------------------
njn	4e59bd9	2003-04-22 20:58:47 +0000	[diff] [blame]	171
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	172	4.2. My threaded server process runs unbelievably slowly on Valgrind.
				173	So slowly, in fact, that at first I thought it had completely
				174	locked up.
sewardj	03272ff	2003-04-26 22:23:35 +0000	[diff] [blame]	175
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	176	We are not completely sure about this, but one possibility is that
				177	laptops with power management fool Valgrind's timekeeping mechanism,
				178	which is (somewhat in error) based on the x86 RDTSC instruction. A
				179	"fix" which is claimed to work is to run some other cpu-intensive
				180	process at the same time, so that the laptop's power-management
				181	clock-slowing does not kick in. We would be interested in hearing more
				182	feedback on this.
sewardj	03272ff	2003-04-26 22:23:35 +0000	[diff] [blame]	183
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	184	Another possible cause is that versions prior to 1.9.6 did not support
				185	threading on glibc 2.3.X systems well. Hopefully the situation is much
				186	improved with 1.9.6 and later versions.
sewardj	03272ff	2003-04-26 22:23:35 +0000	[diff] [blame]	187
				188	-----------------------------------------------------------------
				189
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	190	4.3. My program uses the C++ STL and string classes. Valgrind
				191	reports 'still reachable' memory leaks involving these classes
				192	at the exit of the program, but there should be none.
njn	ae34aef	2003-08-07 21:24:24 +0000	[diff] [blame]	193
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	194	First of all: relax, it's probably not a bug, but a feature. Many
				195	implementations of the C++ standard libraries use their own memory pool
				196	allocators. Memory for quite a number of destructed objects is not
				197	immediately freed and given back to the OS, but kept in the pool(s) for
				198	later re-use. The fact that the pools are not freed at the exit() of
				199	the program causes Valgrind to report this memory as still reachable.
				200	The behaviour not to free pools at the exit() could be called a bug of
				201	the library though.
njn	ae34aef	2003-08-07 21:24:24 +0000	[diff] [blame]	202
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	203	Using gcc, you can force the STL to use malloc and to free memory as
				204	soon as possible by globally disabling memory caching. Beware! Doing
				205	so will probably slow down your program, sometimes drastically.
njn	ae34aef	2003-08-07 21:24:24 +0000	[diff] [blame]	206
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	207	- With gcc 2.91, 2.95, 3.0 and 3.1, compile all source using the STL
				208	with -D__USE_MALLOC. Beware! This is removed from gcc starting with
				209	version 3.3.
				210
				211	- With 3.2.2 and later, you should export the environment variable
				212	GLIBCPP_FORCE_NEW before running your program.
				213
				214	There are other ways to disable memory pooling: using the malloc_alloc
				215	template with your objects (not portable, but should work for gcc) or
				216	even writing your own memory allocators. But all this goes beyond the
				217	scope of this FAQ. Start by reading
				218	http://gcc.gnu.org/onlinedocs/libstdc++/ext/howto.html#3 if you
				219	absolutely want to do that. But beware:
				220
				221	1) there are currently changes underway for gcc which are not totally
				222	reflected in the docs right now ("now" == 26 Apr 03)
				223
				224	2) allocators belong to the more messy parts of the STL and people went
				225	to great lengths to make it portable across platforms. Chances are
				226	good that your solution will work on your platform, but not on
				227	others.
				228
				229	-----------------------------------------------------------------------------
				230	4.4. The stack traces given by Memcheck (or another tool) aren't helpful.
				231	How can I improve them?
				232
				233	If they're not long enough, use --num-callers to make them longer.
				234
				235	If they're not detailed enough, make sure you are compiling with -g to add
				236	debug information. And don't strip symbol tables (programs should be
				237	unstripped unless you run 'strip' on them; some libraries ship stripped).
				238
				239	Also, -fomit-frame-pointer and -fstack-check can make stack traces worse.
				240
				241	Some example sub-traces:
				242
				243	With debug information and unstripped (best):
				244
				245	Invalid write of size 1
				246	at 0x80483BF: really (malloc1.c:20)
				247	by 0x8048370: main (malloc1.c:9)
				248
				249	With no debug information, unstripped:
				250
				251	Invalid write of size 1
				252	at 0x80483BF: really (in /auto/homes/njn25/grind/head5/a.out)
				253	by 0x8048370: main (in /auto/homes/njn25/grind/head5/a.out)
				254
				255	With no debug information, stripped:
				256
				257	Invalid write of size 1
				258	at 0x80483BF: (within /auto/homes/njn25/grind/head5/a.out)
				259	by 0x8048370: (within /auto/homes/njn25/grind/head5/a.out)
				260	by 0x42015703: __libc_start_main (in /lib/tls/libc-2.3.2.so)
				261	by 0x80482CC: (within /auto/homes/njn25/grind/head5/a.out)
				262
				263	With debug information and -fomit-frame-pointer:
				264
				265	Invalid write of size 1
				266	at 0x80483C4: really (malloc1.c:20)
				267	by 0x42015703: __libc_start_main (in /lib/tls/libc-2.3.2.so)
				268	by 0x80482CC: ??? (start.S:81)
				269
				270	-----------------------------------------------------------------
				271	5. Memcheck doesn't find my bug
				272	-----------------------------------------------------------------
				273
				274	5.1. I try running "valgrind --tool=memcheck my_program" and get
				275	Valgrind's startup message, but I don't get any errors and I know
				276	my program has errors.
				277
				278	By default, Valgrind only traces the top-level process. So if your
				279	program spawns children, they won't be traced by Valgrind by default.
				280	Also, if your program is started by a shell script, Perl script, or
				281	something similar, Valgrind will trace the shell, or the Perl
				282	interpreter, or equivalent.
				283
				284	To trace child processes, use the --trace-children=yes option.
				285
				286	If you are tracing large trees of processes, it can be less disruptive
				287	to have the output sent over the network. Give Valgrind the flag
nethercote	f854867	2004-06-21 12:42:35 +0000	[diff] [blame]	288	--log-socket=127.0.0.1:12345 (if you want logging output sent to port
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	289	12345 on localhost). You can use the valgrind-listener program to
				290	listen on that port:
				291
				292	valgrind-listener 12345
				293
				294	Obviously you have to start the listener process first. See the
				295	documentation for more details.
njn	ae34aef	2003-08-07 21:24:24 +0000	[diff] [blame]	296
				297	-----------------------------------------------------------------
				298
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	299	5.2. Why doesn't Memcheck find the array overruns in this program?
				300
				301	int stat[5];
				302
				303	int main(void)
				304	{
				305	    int stack[5];
				306
				307	    stat[5]  = 0;
				308	    stack[5] = 0;
				309
				310	    return 0;
				311	}
				312
				313	Unfortunately, Memcheck doesn't do bounds checking on static or stack
				314	arrays. We'd like to, but it's just not possible to do in a reasonable
				315	way that fits with how Memcheck works. Sorry.
njn	1aa1850	2003-08-15 07:35:20 +0000	[diff] [blame]	316
nethercote	ef0abd1	2004-04-10 00:29:58 +0000	[diff] [blame]	317
				318	-----------------------------------------------------------------
				319	6. Miscellaneous
				320	-----------------------------------------------------------------
				321
				322	6.1. I tried writing a suppression but it didn't work. Can you
				323	write my suppression for me?
				324
				325	Yes! Use the --gen-suppressions=yes feature to spit out suppressions
				326	automatically for you. You can then edit them if you like, eg.
				327	combining similar automatically generated suppressions using wildcards
				328	like '*'.
				329
				330	If you really want to write suppressions by hand, read the manual
				331	carefully. Note particularly that C++ function names must be _mangled_.
				332
				333	-----------------------------------------------------------------
				334
				335	6.2. With Memcheck/Addrcheck's memory leak detector, what's the
				336	difference between "definitely lost", "possibly lost", "still
				337	reachable", and "suppressed"?
				338
				339	The details are in section 3.6 of the manual.
				340
				341	In short:
				342
				343	- "definitely lost" means your program is leaking memory -- fix it!
				344
				345	- "possibly lost" means your program is probably leaking memory,
				346	unless you're doing funny things with pointers.
				347
				348	- "still reachable" means your program is probably ok -- it didn't
				349	free some memory it could have. This is quite common and often
				350	reasonable. Don't use --show-reachable=yes if you don't want to see
				351	these reports.
				352
				353	- "suppressed" means that a leak error has been suppressed. There are
				354	some suppressions in the default suppression files. You can ignore
				355	suppressed errors.
njn	a8fb5a3	2003-08-20 11:19:17 +0000	[diff] [blame]	356
				357	-----------------------------------------------------------------
				358
njn	4e59bd9	2003-04-22 20:58:47 +0000	[diff] [blame]	359	(this is the end of the FAQ.)