This document describes some caveats about the use of Valgrind with
Python. Valgrind is used periodically by Python developers to try
to ensure there are no memory leaks or invalid memory reads/writes.

If you don't want to read about the details of using Valgrind, there
are still two things you must do to suppress the warnings. First,
you must use a suppressions file. One is supplied in
Misc/valgrind-python.supp. Second, you must do one of the following:

  * Uncomment Py_USING_MEMORY_DEBUGGER in Objects/obmalloc.c,
    then rebuild Python
  * Uncomment the lines in Misc/valgrind-python.supp that
    suppress the warnings for PyObject_Free and PyObject_Realloc
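As a rough sketch of the first step, the snippet below assembles a
Valgrind command line that passes the supplied suppressions file through
Valgrind's --suppressions option. The helper name valgrind_command, the
./python interpreter path and the script name myscript.py are
illustrative assumptions, not something taken from the Python source tree.

```python
import shlex

# Path given in this document, relative to the top of the Python source tree.
SUPPRESSIONS = "Misc/valgrind-python.supp"


def valgrind_command(script, python="./python", supp=SUPPRESSIONS):
    """Build (but do not run) a Valgrind command line.

    `python` and `script` are hypothetical paths chosen for illustration.
    """
    return ["valgrind", "--suppressions=" + supp, python, script]


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(valgrind_command("myscript.py")))
```

The suppressions file alone is not enough: one of the two options listed
above is still needed to silence the PyObject_Free and PyObject_Realloc
warnings.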
 
Details:
--------
Python uses its own small-object allocation scheme on top of malloc,
called PyMalloc.

Valgrind may show some unexpected results when PyMalloc is used.
Starting with Python 2.3, PyMalloc is used by default. You can disable
PyMalloc when configuring Python by adding the --without-pymalloc option.
If you disable PyMalloc, most of the information in this document and
the supplied suppressions file will not be useful.

If you use Valgrind on a default build of Python, you will see
many errors like:

        ==6399== Use of uninitialised value of size 4
        ==6399== at 0x4A9BDE7E: PyObject_Free (obmalloc.c:711)
        ==6399== by 0x4A9B8198: dictresize (dictobject.c:477)

These are expected and not a problem. Tim Peters explains
the situation:
 
        PyMalloc needs to know whether an arbitrary address is one
        that's managed by it, or is managed by the system malloc.
        The current scheme allows this to be determined in constant
        time, regardless of how many memory areas are under pymalloc's
        control.

        The memory pymalloc manages itself is in one or more "arenas",
        each a large contiguous memory area obtained from malloc.
        The base address of each arena is saved by pymalloc
        in a vector. Each arena is carved into "pools", and a field at
        the start of each pool contains the index of that pool's arena's
        base address in that vector.

        Given an arbitrary address, pymalloc computes the pool base
        address corresponding to it, then looks at "the index" stored
        near there. If the index read up is out of bounds for the
        vector of arena base addresses pymalloc maintains, then
        pymalloc knows for certain that this address is not under
        pymalloc's control. Otherwise the index is in bounds, and
        pymalloc compares

            the arena base address stored at that index in the vector

        to

            the arbitrary address pymalloc is investigating

        pymalloc controls this arbitrary address if and only if it lies
        in the arena the address's pool's index claims it lies in.

        It doesn't matter whether the memory pymalloc reads up ("the
        index") is initialized. If it's not initialized, then
        whatever trash gets read up will lead pymalloc to conclude
        (correctly) that the address isn't controlled by it, either
        because the index is out of bounds, or the index is in bounds
        but the arena it represents doesn't contain the address.

        This determination has to be made on every call to one of
        pymalloc's free/realloc entry points, so its speed is critical
        (Python allocates and frees dynamic memory at a ferocious rate
        -- everything in Python, from integers to "stack frames",
        lives in the heap).
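The check described above can be modelled with a small simulation. This
is only a sketch under stated assumptions, not the code in
Objects/obmalloc.c: the real allocator is written in C and reads raw
memory, whereas here a dictionary stands in for memory. The pool and
arena sizes, the names pool_base, is_managed and make_reader, and the
example addresses are all invented for illustration.

```python
POOL_SIZE = 4 * 1024      # assumed pool size; must be a power of two
ARENA_SIZE = 256 * 1024   # assumed arena size


def pool_base(addr):
    """Round addr down to the start of the pool that would contain it."""
    return addr & ~(POOL_SIZE - 1)


def is_managed(addr, arenas, read_index):
    """Return True if addr lies inside one of the arenas.

    arenas     -- list of arena base addresses (the "vector")
    read_index -- callable that simulates reading the index field at a
                  pool base; for foreign memory it may return any value
    """
    index = read_index(pool_base(addr))
    if not 0 <= index < len(arenas):
        return False          # out of bounds: certainly not ours
    base = arenas[index]
    return base <= addr < base + ARENA_SIZE


# Two toy arenas whose base addresses are pool-aligned.
arenas = [0x100000, 0x200000]

# Index fields the toy allocator wrote at the start of each of its pools.
headers = {
    pool: i
    for i, base in enumerate(arenas)
    for pool in range(base, base + ARENA_SIZE, POOL_SIZE)
}


def make_reader(garbage):
    """Simulate memory: real headers where they exist, garbage elsewhere."""
    return lambda pool_addr: headers.get(pool_addr, garbage)
```

With these definitions, is_managed(0x100010, arenas, make_reader(99)) is
true, while is_managed(0x500123, arenas, make_reader(g)) is false for any
integer g: either g is out of bounds for the vector, or it names an arena
that does not contain the address. That is why the uninitialised read
Valgrind reports is harmless.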