This document describes some caveats about the use of Valgrind with
Python. Valgrind is used periodically by Python developers to try
to ensure there are no memory leaks or invalid memory reads/writes.

If you don't want to read about the details of using Valgrind, there
are still two things you must do to suppress the warnings. First,
you must use a suppressions file. One is supplied in
Misc/valgrind-python.supp. Second, you must do one of the following:

  * Uncomment Py_USING_MEMORY_DEBUGGER in Objects/obmalloc.c,
    then rebuild Python
  * Uncomment the lines in Misc/valgrind-python.supp that
    suppress the warnings for PyObject_Free and PyObject_Realloc

Details:
--------
Python uses its own allocation scheme on top of malloc called PyMalloc.
Valgrind may show some unexpected results when PyMalloc is used.
Starting with Python 2.3, PyMalloc is used by default. You can disable
PyMalloc when configuring Python by adding the --without-pymalloc option.
If you disable PyMalloc, most of the information in this document and
the supplied suppressions file will not be useful.

If you use Valgrind on a default build of Python, you will see
many errors like:

  ==6399== Use of uninitialised value of size 4
  ==6399==    at 0x4A9BDE7E: PyObject_Free (obmalloc.c:711)
  ==6399==    by 0x4A9B8198: dictresize (dictobject.c:477)

These are expected and not a problem. Tim Peters explains
the situation:

    PyMalloc needs to know whether an arbitrary address is one
    that's managed by it, or is managed by the system malloc.
    The current scheme allows this to be determined in constant
    time, regardless of how many memory areas are under pymalloc's
    control.

    The memory pymalloc manages itself is in one or more "arenas",
    each a large contiguous memory area obtained from malloc.
    The base address of each arena is saved by pymalloc
    in a vector, and a field at the start of each arena contains
    the index of that arena's base address in that vector.

    Given an arbitrary address, pymalloc computes the arena base
    address corresponding to it, then looks at "the index" stored
    near there. If the index read up is out of bounds for the
    vector of arena base addresses pymalloc maintains, then
    pymalloc knows for certain that this address is not under
    pymalloc's control. Otherwise the index is in bounds, and
    pymalloc compares

        the arena base address stored at that index in the vector

    to

        the computed arena address

    pymalloc controls this arena if and only if they're equal.

    It doesn't matter whether the memory pymalloc reads up ("the
    index") is initialized. If it's not initialized, then
    whatever trash gets read up will lead pymalloc to conclude
    (correctly) that the address isn't controlled by it.

    This determination has to be made on every call to one of
    pymalloc's free/realloc entry points, so its speed is critical
    (Python allocates and frees dynamic memory at a ferocious rate
    -- everything in Python, from integers to "stack frames",
    lives in the heap).
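
The check described above can be sketched in C roughly as follows. This
is an illustration only, not the code in Objects/obmalloc.c: the names
arena_header, arena_bases, arena_new and address_in_arena, and the
sizes ARENA_SIZE and MAX_ARENAS, are all hypothetical. The sketch also
assumes that arenas are aligned to their own size (via C11
aligned_alloc) and that the rounded-down address is readable memory.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sizes for this sketch; not the values CPython uses. */
#define ARENA_SIZE 4096u   /* must be a power of two */
#define MAX_ARENAS 8u

/* Field at the start of each arena: its slot in arena_bases. */
typedef struct {
    unsigned index;
} arena_header;

static uintptr_t arena_bases[MAX_ARENAS];  /* vector of arena base addresses */
static unsigned n_arenas = 0;

/* Obtain one size-aligned arena and record it; returns NULL on failure. */
static void *arena_new(void)
{
    if (n_arenas == MAX_ARENAS)
        return NULL;
    void *base = aligned_alloc(ARENA_SIZE, ARENA_SIZE);
    if (base == NULL)
        return NULL;
    ((arena_header *)base)->index = n_arenas;
    arena_bases[n_arenas++] = (uintptr_t)base;
    return base;
}

/* Constant-time ownership test for an arbitrary address. */
static int address_in_arena(const void *p)
{
    assert(p != NULL);
    /* Round the address down to the arena boundary. */
    uintptr_t base = (uintptr_t)p & ~(uintptr_t)(ARENA_SIZE - 1u);
    /* This read may return uninitialised bytes; the result is still right,
       because garbage either fails the bounds test or fails the compare. */
    unsigned index = ((const arena_header *)base)->index;
    return index < n_arenas && arena_bases[index] == base;
}
```

For a pointer inside a block returned by arena_new, address_in_arena
returns 1. For a pointer into any other readable block with the same
alignment it returns 0, whatever bytes happen to sit where the index
would be. That second read is the one Valgrind reports as a use of an
uninitialised value.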