This document describes some caveats about the use of Valgrind with
Python. Valgrind is used periodically by Python developers to try
to ensure there are no memory leaks or invalid memory reads/writes.

If you want to enable Valgrind support in Python, you will need to
configure Python with the --with-valgrind option or the older
--without-pymalloc option.

UPDATE: Python 3.6 and later support the PYTHONMALLOC=malloc environment
variable setting, which can be used to force the use of the C library's
malloc() allocator.

If you don't want to read about the details of using Valgrind, there
are still two things you must do to suppress the warnings. First,
you must use a suppressions file. One is supplied in
Misc/valgrind-python.supp. Second, you must do one of the following:

  * Uncomment Py_USING_MEMORY_DEBUGGER in Objects/obmalloc.c,
    then rebuild Python
  * Uncomment the lines in Misc/valgrind-python.supp that
    suppress the warnings for PyObject_Free and PyObject_Realloc
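
As a rough illustration, the pieces mentioned so far (the suppressions
file and the PYTHONMALLOC=malloc setting) could be combined in a small
Python helper that assembles a command line and environment. This is a
sketch under assumptions, not a supported tool: the helper name
valgrind_command, the ./python interpreter path, and the choice of
running "-m test" are hypothetical. Only Valgrind's standard
--suppressions= option, the Misc/valgrind-python.supp path, and
PYTHONMALLOC=malloc come from Valgrind and from this document.

```python
import os


def valgrind_command(test_args, python="./python",
                     suppressions="Misc/valgrind-python.supp",
                     use_system_malloc=True):
    """Return (argv, env) for running Python tests under Valgrind.

    All names and defaults here are illustrative assumptions.
    """
    argv = ["valgrind", f"--suppressions={suppressions}",
            python, "-m", "test", *test_args]
    env = dict(os.environ)
    if use_system_malloc:
        # Python 3.6+: force the C library's malloc() allocator.
        env["PYTHONMALLOC"] = "malloc"
    return argv, env


if __name__ == "__main__":
    argv, env = valgrind_command(["test_dict"])
    print(" ".join(argv))
```

The returned pair could then be passed to subprocess.run(argv, env=env)
from the top of a Python source tree. The helper does not replace the
manual steps listed above when PyMalloc stays enabled.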

If you want to use Valgrind more effectively and catch even more
memory leaks, you will need to configure Python --without-pymalloc.
PyMalloc obtains memory in a few big blocks, and most object
allocations don't call malloc; they use chunks doled out by PyMalloc
from those big blocks. This means Valgrind can't detect
many allocations (and frees), except for those that are forwarded
to the system malloc. Note: configuring Python --without-pymalloc
makes Python run much slower, especially when running under Valgrind.
You may need to run the tests in batches under Valgrind to keep
memory usage low enough for the tests to complete. Running
--without-pymalloc seems to take about 5 times longer.
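
To make the point about hidden allocations concrete, here is a toy
model in Python of an allocator that takes big blocks from the system
and hands out small chunks itself. It is only a sketch: the class
ToyPoolAllocator and the 32-byte chunk size are assumptions made for
illustration, it ignores frees and PyMalloc's real pool structure, and
the 256KB block size is the figure quoted later in this document.

```python
class ToyPoolAllocator:
    """Toy model: take big blocks from the 'system', hand out small chunks."""

    def __init__(self, block_size=256 * 1024, chunk_size=32):
        self.block_size = block_size
        self.chunk_size = chunk_size
        self.system_malloc_calls = 0   # what a tool like Valgrind could observe
        self.chunks_handed_out = 0     # what the program actually requested
        self._free_chunks = 0

    def _system_malloc(self):
        # Stand-in for a real malloc(block_size) call.
        self.system_malloc_calls += 1
        self._free_chunks = self.block_size // self.chunk_size

    def alloc(self):
        # Only touch the 'system' when the current block is exhausted.
        if self._free_chunks == 0:
            self._system_malloc()
        self._free_chunks -= 1
        self.chunks_handed_out += 1


pool = ToyPoolAllocator()
for _ in range(10_000):
    pool.alloc()
print(pool.chunks_handed_out, pool.system_malloc_calls)  # prints "10000 2"
```

In this model ten thousand object allocations show up as only two
system-level allocations, which is why a tool that watches malloc and
free sees so little of what happens inside the big blocks.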

Apr 15, 2006:
  test_ctypes causes Valgrind 3.1.1 to fail (crash).
  test_socket_ssl should be skipped when running Valgrind.
    The reason is that it purposely uses uninitialized memory.
    This causes many spurious warnings, so it's easier to just skip it.


Details:
--------
Python uses its own small-object allocation scheme on top of malloc,
called PyMalloc.

Valgrind may show some unexpected results when PyMalloc is used.
Starting with Python 2.3, PyMalloc is used by default. You can disable
PyMalloc when configuring Python by adding the --without-pymalloc option.
If you disable PyMalloc, most of the information in this document and
the supplied suppressions file will not be useful. As discussed above,
disabling PyMalloc lets Valgrind catch more problems.

PyMalloc uses 256KB chunks of memory, so tools such as Valgrind can't
detect anything wrong within these blocks. For that reason, compiling
Python --without-pymalloc usually increases the usefulness of other
memory-debugging tools as well.

If you use Valgrind on a default build of Python, you will see
many errors like:

    ==6399== Use of uninitialised value of size 4
    ==6399==    at 0x4A9BDE7E: PyObject_Free (obmalloc.c:711)
    ==6399==    by 0x4A9B8198: dictresize (dictobject.c:477)

These are expected and not a problem. Tim Peters explains
the situation:

    PyMalloc needs to know whether an arbitrary address is one
    that's managed by it, or is managed by the system malloc.
    The current scheme allows this to be determined in constant
    time, regardless of how many memory areas are under pymalloc's
    control.

    The memory pymalloc manages itself is in one or more "arenas",
    each a large contiguous memory area obtained from malloc.
    The base address of each arena is saved by pymalloc
    in a vector. Each arena is carved into "pools", and a field at
    the start of each pool contains the index of that pool's arena's
    base address in that vector.

    Given an arbitrary address, pymalloc computes the pool base
    address corresponding to it, then looks at "the index" stored
    near there. If the index read up is out of bounds for the
    vector of arena base addresses pymalloc maintains, then
    pymalloc knows for certain that this address is not under
    pymalloc's control. Otherwise the index is in bounds, and
    pymalloc compares

        the arena base address stored at that index in the vector

    to

        the arbitrary address pymalloc is investigating

    pymalloc controls this arbitrary address if and only if it lies
    in the arena the address's pool's index claims it lies in.

    It doesn't matter whether the memory pymalloc reads up ("the
    index") is initialized. If it's not initialized, then
    whatever trash gets read up will lead pymalloc to conclude
    (correctly) that the address isn't controlled by it, either
    because the index is out of bounds, or the index is in bounds
    but the arena it represents doesn't contain the address.

    This determination has to be made on every call to one of
    pymalloc's free/realloc entry points, so its speed is critical
    (Python allocates and frees dynamic memory at a ferocious rate
    -- everything in Python, from integers to "stack frames",
    lives in the heap).
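
The check described above can be sketched as a toy model in Python,
with addresses as plain integers and memory as a dictionary. This is
an illustration of the idea only, not the code in Objects/obmalloc.c:
the names add_arena and address_in_range, the 4KB pool size, and the
"garbage" parameter (standing for whatever an uninitialized read
happens to return) are all assumptions, and arena base addresses are
assumed to be pool-aligned.

```python
POOL_SIZE = 4 * 1024        # assumed pool size for this toy model
ARENA_SIZE = 256 * 1024     # arena size, matching the 256KB figure above

arena_bases = []            # the "vector" of arena base addresses
memory = {}                 # toy memory: address -> stored word


def add_arena(base):
    """Register an arena and stamp its index at the start of each pool."""
    index = len(arena_bases)
    arena_bases.append(base)
    for pool_base in range(base, base + ARENA_SIZE, POOL_SIZE):
        memory[pool_base] = index


def address_in_range(addr, garbage=0):
    """Return True if addr lies in an arena this toy allocator manages."""
    pool_base = addr - (addr % POOL_SIZE)      # round down to the pool start
    index = memory.get(pool_base, garbage)     # possibly uninitialized read
    if not 0 <= index < len(arena_bases):
        return False                           # index is out of bounds
    base = arena_bases[index]
    return base <= addr < base + ARENA_SIZE    # does that arena contain addr?


add_arena(0x100000)
add_arena(0x200000)

for addr, junk in [(0x100000 + 5000, 0), (0x900000, 7), (0x900000, 1)]:
    print(hex(addr), address_in_range(addr, garbage=junk))
```

The last two calls model a read of uninitialized memory: whether the
junk value is out of bounds (7) or happens to be a valid index (1),
the answer is still False, which is why the reads Valgrind reports
are harmless.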