% Fred Drake, 3adf79e, 2001-10-12 19:01:43 +0000
\chapter{Memory Management \label{memory}}
\sectionauthor{Vladimir Marangozov}{Vladimir.Marangozov@inrialpes.fr}


\section{Overview \label{memoryOverview}}

Memory management in Python involves a private heap containing all
Python objects and data structures. The management of this private
heap is ensured internally by the \emph{Python memory manager}. The
Python memory manager has different components which deal with various
dynamic storage management aspects, like sharing, segmentation,
preallocation or caching.

At the lowest level, a raw memory allocator ensures that there is
enough room in the private heap for storing all Python-related data
by interacting with the memory manager of the operating system. On top
of the raw memory allocator, several object-specific allocators
operate on the same heap and implement distinct memory management
policies adapted to the peculiarities of every object type. For
example, integer objects are managed differently within the heap than
strings, tuples or dictionaries because integers imply different
storage requirements and speed/space tradeoffs. The Python memory
manager thus delegates some of the work to the object-specific
allocators, but ensures that the latter operate within the bounds of
the private heap.

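The layering described above can be pictured with a small sketch in
plain C. It is an illustration only and not the interpreter's actual
allocator: the names \code{pool}, \cfunction{pool_alloc()},
\cfunction{pool_free()} and \cfunction{pool_clear()} are hypothetical,
and the C library allocator stands in for the raw memory allocator.
The pool caches released fixed-size blocks and falls back to the raw
allocator only when its cache is empty, which is one possible
speed/space policy for small objects of a single type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size block pool layered on malloc()/free().
   This only illustrates the idea; it is not CPython's allocator. */
typedef union pool_block {
    union pool_block *next;  /* link while the block is cached */
    long payload;            /* storage while the block is in use */
} pool_block;

typedef struct {
    pool_block *free_list;   /* cached blocks, most recently freed first */
    size_t raw_requests;     /* number of calls made to the raw allocator */
} pool;

void *pool_alloc(pool *p)
{
    pool_block *b;

    assert(p != NULL);
    b = p->free_list;
    if (b != NULL) {         /* fast path: reuse a cached block */
        p->free_list = b->next;
        return b;
    }
    p->raw_requests++;       /* slow path: ask the raw allocator */
    return malloc(sizeof(pool_block));
}

void pool_free(pool *p, void *ptr)
{
    pool_block *b = ptr;

    assert(p != NULL);
    if (b == NULL)
        return;
    b->next = p->free_list;  /* cache the block instead of releasing it */
    p->free_list = b;
}

void pool_clear(pool *p)
{
    assert(p != NULL);
    while (p->free_list != NULL) {   /* hand cached blocks back */
        pool_block *b = p->free_list;
        p->free_list = b->next;
        free(b);
    }
}
```

In this sketch a block that is released and requested again is served
from the cache, so the raw allocator is contacted only once for it.
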
It is important to understand that the management of the Python heap
is performed by the interpreter itself and that users have no
control over it, even if they regularly manipulate object pointers to
memory blocks inside that heap. The allocation of heap space for
Python objects and other internal buffers is performed on demand by
the Python memory manager through the Python/C API functions listed in
this document.

To avoid memory corruption, extension writers should never try to
operate on Python objects with the functions exported by the C
library: \cfunction{malloc()}\ttindex{malloc()},
\cfunction{calloc()}\ttindex{calloc()},
\cfunction{realloc()}\ttindex{realloc()} and
\cfunction{free()}\ttindex{free()}. Doing so results in
mixed calls between the C allocator and the Python memory manager
with fatal consequences, because they implement different algorithms
and operate on different heaps. However, one may safely allocate and
release memory blocks with the C library allocator for individual
purposes, as shown in the following example:

\begin{verbatim}
    PyObject *res;
    char *buf = (char *) malloc(BUFSIZ); /* for I/O */

    if (buf == NULL)
        return PyErr_NoMemory();
    /* ...Do some I/O operation involving buf... */
    res = PyString_FromString(buf);
    free(buf); /* malloc'ed */
    return res;
\end{verbatim}

In this example, the memory request for the I/O buffer is handled by
the C library allocator. The Python memory manager is involved only
in the allocation of the string object returned as a result.

In most situations, however, it is recommended to allocate memory from
the Python heap, precisely because that heap is under the control of
the Python memory manager. For example, this is required when the
interpreter is extended with new object types written in C. Another
reason for using the Python heap is the desire to \emph{inform} the
Python memory manager about the memory needs of the extension module.
Even when the requested memory is used exclusively for internal,
highly-specific purposes, delegating all memory requests to the Python
memory manager gives the interpreter a more accurate picture of
its memory footprint as a whole. Consequently, under certain
circumstances, the Python memory manager is in a position to decide
whether to trigger appropriate actions, like garbage collection,
memory compaction or other preventive procedures. Note that when the
C library allocator is used as shown in the previous example, the
memory allocated for the I/O buffer completely escapes the Python
memory manager.


\section{Memory Interface \label{memoryInterface}}

The following function sets, modeled after the ANSI C standard, are
available for allocating and releasing memory from the Python heap:


\begin{cfuncdesc}{void*}{PyMem_Malloc}{size_t n}
  Allocates \var{n} bytes and returns a pointer of type \ctype{void*}
  to the allocated memory, or \NULL{} if the request fails.
  Requesting zero bytes returns a non-\NULL{} pointer.
  The memory will not have been initialized in any way.
\end{cfuncdesc}

\begin{cfuncdesc}{void*}{PyMem_Realloc}{void *p, size_t n}
  Resizes the memory block pointed to by \var{p} to \var{n} bytes.
  The contents will be unchanged up to the minimum of the old and the
  new sizes. If \var{p} is \NULL, the call is equivalent to
  \cfunction{PyMem_Malloc(\var{n})}; if \var{n} is equal to zero, the
  memory block is resized but is not freed, and the returned pointer
  is non-\NULL. Unless \var{p} is \NULL, it must have been
  returned by a previous call to \cfunction{PyMem_Malloc()} or
  \cfunction{PyMem_Realloc()}.
\end{cfuncdesc}

\begin{cfuncdesc}{void}{PyMem_Free}{void *p}
  Frees the memory block pointed to by \var{p}, which must have been
  returned by a previous call to \cfunction{PyMem_Malloc()} or
  \cfunction{PyMem_Realloc()}. Otherwise, or if
  \cfunction{PyMem_Free(p)} has been called before, undefined
  behaviour occurs. If \var{p} is \NULL, no operation is performed.
\end{cfuncdesc}

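The rules for zero-sized requests and \NULL{} arguments stated above
can be sketched with thin wrappers around the C library allocator.
This is a sketch under stated assumptions, not the interpreter's
implementation: \cfunction{sketch_malloc()},
\cfunction{sketch_realloc()} and \cfunction{grow_buffer()} are
hypothetical names, and rounding a zero-byte request up to one byte is
merely one way to obtain a non-\NULL{} result. The last function shows
a caller-side pattern that keeps the old pointer until the resize is
known to have succeeded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins that mimic the contract documented for
   PyMem_Malloc() and PyMem_Realloc() on top of the C library. */
void *sketch_malloc(size_t n)
{
    /* malloc(0) may return NULL; asking for one byte instead yields
       a non-NULL pointer whenever the request succeeds. */
    return malloc(n ? n : 1);
}

void *sketch_realloc(void *p, size_t n)
{
    if (p == NULL)
        return sketch_malloc(n);   /* equivalent to an allocation */
    return realloc(p, n ? n : 1);  /* n == 0 resizes, does not free */
}

/* Resize *buf to new_size bytes. Returns 0 on success and -1 on
   failure; on failure *buf still points to the original block. */
int grow_buffer(char **buf, size_t new_size)
{
    char *tmp;

    assert(buf != NULL);
    tmp = sketch_realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;                 /* the old block is not leaked */
    *buf = tmp;
    return 0;
}
```

Assigning the result of a resize directly to the only copy of the old
pointer would lose the original block whenever the request fails,
which is why the sketch goes through a temporary.
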
The following type-oriented macros are provided for convenience. Note
that \var{TYPE} refers to any C type.

\begin{cfuncdesc}{\var{TYPE}*}{PyMem_New}{TYPE, size_t n}
  Same as \cfunction{PyMem_Malloc()}, but allocates \code{(\var{n} *
  sizeof(\var{TYPE}))} bytes of memory. Returns a pointer cast to
  \ctype{\var{TYPE}*}. The memory will not have been initialized in
  any way.
\end{cfuncdesc}

\begin{cfuncdesc}{\var{TYPE}*}{PyMem_Resize}{void *p, TYPE, size_t n}
  Same as \cfunction{PyMem_Realloc()}, but the memory block is resized
  to \code{(\var{n} * sizeof(\var{TYPE}))} bytes. Returns a pointer
  cast to \ctype{\var{TYPE}*}.
\end{cfuncdesc}

\begin{cfuncdesc}{void}{PyMem_Del}{void *p}
  Same as \cfunction{PyMem_Free()}.
\end{cfuncdesc}

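The byte count \code{(\var{n} * sizeof(\var{TYPE}))} used by the
type-oriented macros can be illustrated with stand-ins written in
plain C. The names \cfunction{sketch_array_realloc()},
\cfunction{SKETCH_NEW()}, \cfunction{SKETCH_RESIZE()} and
\cfunction{SKETCH_DEL()} are hypothetical, and the overflow guard is
an addition of this sketch: whether the real macros perform such a
check is not stated above and should not be assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resize p to hold n elements of elem_size bytes.
   Passing p == NULL allocates a new block. */
void *sketch_array_realloc(void *p, size_t n, size_t elem_size)
{
    size_t bytes;

    assert(elem_size != 0);          /* sizeof never yields zero */
    if (n > SIZE_MAX / elem_size)
        return NULL;                 /* n * elem_size would overflow */
    bytes = n * elem_size;
    return realloc(p, bytes ? bytes : 1);
}

/* Type-oriented stand-ins: the element count is scaled by the size
   of TYPE and the result is cast to TYPE*. */
#define SKETCH_NEW(TYPE, n) \
    ((TYPE *) sketch_array_realloc(NULL, (n), sizeof(TYPE)))
#define SKETCH_RESIZE(p, TYPE, n) \
    ((TYPE *) sketch_array_realloc((p), (n), sizeof(TYPE)))
#define SKETCH_DEL(p) free(p)
```

With these stand-ins, \code{SKETCH_NEW(int, 4)} requests
\code{4 * sizeof(int)} bytes, and a request whose byte count cannot be
represented yields \NULL{} instead of a block that is too small.
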
In addition, the following macro sets are provided for calling the
Python memory allocator directly, without involving the C API functions
listed above. However, note that their use does not preserve binary
compatibility across Python versions and is therefore deprecated in
extension modules.

\cfunction{PyMem_MALLOC()}, \cfunction{PyMem_REALLOC()}, \cfunction{PyMem_FREE()}.

\cfunction{PyMem_NEW()}, \cfunction{PyMem_RESIZE()}, \cfunction{PyMem_DEL()}.


\section{Examples \label{memoryExamples}}

Here is the example from section \ref{memoryOverview}, rewritten so
that the I/O buffer is allocated from the Python heap by using the
first function set:

\begin{verbatim}
    PyObject *res;
    char *buf = (char *) PyMem_Malloc(BUFSIZ); /* for I/O */

    if (buf == NULL)
        return PyErr_NoMemory();
    /* ...Do some I/O operation involving buf... */
    res = PyString_FromString(buf);
    PyMem_Free(buf); /* allocated with PyMem_Malloc */
    return res;
\end{verbatim}

The same code using the type-oriented function set:

\begin{verbatim}
    PyObject *res;
    char *buf = PyMem_New(char, BUFSIZ); /* for I/O */

    if (buf == NULL)
        return PyErr_NoMemory();
    /* ...Do some I/O operation involving buf... */
    res = PyString_FromString(buf);
    PyMem_Del(buf); /* allocated with PyMem_New */
    return res;
\end{verbatim}

Note that in the two examples above, the buffer is always
manipulated via functions belonging to the same set. Indeed, it
is required to use the same memory API family for a given
memory block, so that the risk of mixing different allocators is
reduced to a minimum. The following code sequence contains two errors,
one of which is labeled as \emph{fatal} because it mixes two different
allocators operating on different heaps.

\begin{verbatim}
    char *buf1 = PyMem_New(char, BUFSIZ);
    char *buf2 = (char *) malloc(BUFSIZ);
    char *buf3 = (char *) PyMem_Malloc(BUFSIZ);
    /* ... */
    PyMem_Del(buf3);  /* Wrong -- should be PyMem_Free() */
    free(buf2);       /* Right -- allocated via malloc() */
    free(buf1);       /* Fatal -- should be PyMem_Del() */
\end{verbatim}

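One way to make the rule about API families checkable during
development is to record, in front of every block, which family
allocated it. The following is a minimal sketch in plain C under that
assumption; the names \code{block_header},
\cfunction{tagged_alloc()} and \cfunction{tagged_free()} are
hypothetical and are not part of the Python/C API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum family { FAMILY_RAW = 1, FAMILY_TYPED = 2 };

/* Header stored in front of every block. Only tag is used; the
   other members force suitable alignment for the payload. */
typedef union {
    int tag;
    long double ld;
    long long ll;
    void *ptr;
} block_header;

/* Allocate n payload bytes and remember the allocating family. */
void *tagged_alloc(enum family f, size_t n)
{
    block_header *h;

    assert(f == FAMILY_RAW || f == FAMILY_TYPED);
    if (n > SIZE_MAX - sizeof *h)
        return NULL;               /* header + payload would overflow */
    h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->tag = f;
    return h + 1;                  /* payload starts after the header */
}

/* Release p on behalf of family f. Returns 0 on success and -1 if
   the block was allocated by a different family; in that case the
   block is left untouched. */
int tagged_free(enum family f, void *p)
{
    block_header *h;

    if (p == NULL)
        return 0;                  /* releasing NULL is a no-op */
    h = (block_header *) p - 1;
    if (h->tag != (int) f)
        return -1;                 /* family mismatch detected */
    h->tag = 0;
    free(h);
    return 0;
}
```

A release through the wrong family is reported instead of being
carried out, so a mismatch like the ones in the sequence above would
surface as an error code rather than as heap corruption.
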
In addition to the functions aimed at handling raw memory blocks from
the Python heap, objects in Python are allocated and released with
\cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()} and
\cfunction{PyObject_Del()}, or with their corresponding macros
\cfunction{PyObject_NEW()}, \cfunction{PyObject_NEW_VAR()} and
\cfunction{PyObject_DEL()}.

These will be explained in the next chapter on defining and
implementing new object types in C.