| \chapter{Memory Management \label{memory}} |
| \sectionauthor{Vladimir Marangozov}{Vladimir.Marangozov@inrialpes.fr} |
| |
| |
| \section{Overview \label{memoryOverview}} |
| |
| Memory management in Python involves a private heap containing all |
| Python objects and data structures. The management of this private |
| heap is ensured internally by the \emph{Python memory manager}. The |
| Python memory manager has different components which deal with various |
| dynamic storage management aspects, like sharing, segmentation, |
| preallocation or caching. |
| |
| At the lowest level, a raw memory allocator ensures that there is |
| enough room in the private heap for storing all Python-related data |
| by interacting with the memory manager of the operating system. On top |
| of the raw memory allocator, several object-specific allocators |
| operate on the same heap and implement distinct memory management |
| policies adapted to the peculiarities of every object type. For |
| example, integer objects are managed differently within the heap than |
| strings, tuples or dictionaries because integers imply different |
| storage requirements and speed/space tradeoffs. The Python memory |
| manager thus delegates some of the work to the object-specific |
| allocators, but ensures that the latter operate within the bounds of |
| the private heap. |
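The layering described above can be pictured with a small sketch. It is
not taken from the interpreter sources: \code{demo_int},
\code{raw_alloc()}, \code{demo_int_new()} and \code{demo_int_del()} are
hypothetical names chosen for illustration. It assumes one
object-specific policy (a free list that caches released blocks) sitting
on top of a raw layer that is the only code talking to the system
allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fixed-size object; not a real CPython type. */
typedef struct demo_int {
    struct demo_int *next_free;  /* link used only while on the free list */
    long value;
} demo_int;

static demo_int *free_list = NULL;  /* cache of released objects */
static size_t raw_calls = 0;        /* how often the raw layer was used */

/* Raw layer: the only place that talks to the system allocator. */
static void *raw_alloc(size_t n)
{
    raw_calls++;
    return malloc(n);
}

/* Object-specific layer: reuse a cached block if one is available,
   otherwise fall back to the raw layer. */
static demo_int *demo_int_new(long value)
{
    demo_int *obj = free_list;

    if (obj != NULL)
        free_list = obj->next_free;
    else if ((obj = raw_alloc(sizeof *obj)) == NULL)
        return NULL;
    obj->next_free = NULL;
    obj->value = value;
    return obj;
}

/* Releasing pushes the block onto the free list instead of calling free(). */
static void demo_int_del(demo_int *obj)
{
    assert(obj != NULL);
    obj->next_free = free_list;
    free_list = obj;
}
```

In this sketch a released block is never handed back to the system, which
trades space for speed. A different object type could sit on the same raw
layer with a different policy.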
| |
| It is important to understand that the management of the Python heap |
| is performed by the interpreter itself and that the user has no |
| control over it, even if they regularly manipulate object pointers to |
| memory blocks inside that heap. The allocation of heap space for |
| Python objects and other internal buffers is performed on demand by |
| the Python memory manager through the Python/C API functions listed in |
| this document. |
| |
| To avoid memory corruption, extension writers should never try to |
| operate on Python objects with the functions exported by the C |
| library: \cfunction{malloc()}\ttindex{malloc()}, |
| \cfunction{calloc()}\ttindex{calloc()}, |
| \cfunction{realloc()}\ttindex{realloc()} and |
| \cfunction{free()}\ttindex{free()}. This will result in |
| mixed calls between the C allocator and the Python memory manager |
| with fatal consequences, because they implement different algorithms |
| and operate on different heaps. However, one may safely allocate and |
| release memory blocks with the C library allocator for individual |
| purposes, as shown in the following example: |
| |
| \begin{verbatim} |
| PyObject *res; |
| char *buf = (char *) malloc(BUFSIZ); /* for I/O */ |
| |
| if (buf == NULL) |
| return PyErr_NoMemory(); |
| /* ...Do some I/O operation involving buf... */ |
| res = PyString_FromString(buf); |
| free(buf); /* malloc'ed */ |
| return res; |
| \end{verbatim} |
| |
| In this example, the memory request for the I/O buffer is handled by |
| the C library allocator. The Python memory manager is involved only |
| in the allocation of the string object returned as a result. |
| |
| In most situations, however, it is recommended to allocate memory from |
| the Python heap, specifically because the latter is under the control of |
| the Python memory manager. For example, this is required when the |
| interpreter is extended with new object types written in C. Another |
| reason for using the Python heap is the desire to \emph{inform} the |
| Python memory manager about the memory needs of the extension module. |
| Even when the requested memory is used exclusively for internal, |
| highly-specific purposes, delegating all memory requests to the Python |
| memory manager causes the interpreter to have a more accurate image of |
| its memory footprint as a whole. Consequently, under certain |
| circumstances, the Python memory manager may trigger |
| appropriate actions, like garbage collection, memory compaction or |
| other preventive procedures. Note that by using the C library |
| allocator as shown in the previous example, the memory allocated for |
| the I/O buffer completely escapes the Python memory manager. |
| |
| |
| \section{Memory Interface \label{memoryInterface}} |
| |
| The following function sets, modeled after the ANSI C standard, |
| but specifying behavior when requesting zero bytes, |
| are available for allocating and releasing memory from the Python heap: |
| |
| |
| \begin{cfuncdesc}{void*}{PyMem_Malloc}{size_t n} |
| Allocates \var{n} bytes and returns a pointer of type \ctype{void*} |
| to the allocated memory, or \NULL{} if the request fails. |
| Requesting zero bytes returns a distinct non-\NULL{} pointer if |
| possible, as if \cfunction{PyMem_Malloc(1)} had been called instead. |
| The memory will not have been initialized in any way. |
| \end{cfuncdesc} |
| |
| \begin{cfuncdesc}{void*}{PyMem_Realloc}{void *p, size_t n} |
| Resizes the memory block pointed to by \var{p} to \var{n} bytes. |
| The contents will be unchanged up to the minimum of the old and the new |
| sizes. If \var{p} is \NULL, the call is equivalent to |
| \cfunction{PyMem_Malloc(\var{n})}; else if \var{n} is equal to zero, the |
| memory block is resized but is not freed, and the returned pointer |
| is non-\NULL. Unless \var{p} is \NULL, it must have been |
| returned by a previous call to \cfunction{PyMem_Malloc()} or |
| \cfunction{PyMem_Realloc()}. |
| \end{cfuncdesc} |
| |
| \begin{cfuncdesc}{void}{PyMem_Free}{void *p} |
| Frees the memory block pointed to by \var{p}, which must have been |
| returned by a previous call to \cfunction{PyMem_Malloc()} or |
| \cfunction{PyMem_Realloc()}. Otherwise, or if |
| \cfunction{PyMem_Free(p)} has been called before, undefined |
| behavior occurs. If \var{p} is \NULL, no operation is performed. |
| \end{cfuncdesc} |
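The zero-byte rules stated above are what distinguish this function set
from plain ANSI C, where the result of \cfunction{malloc(0)} is
implementation-defined. As a rough sketch only, one way to obtain such
behavior on top of the C library is shown below. The names
\code{demo_malloc()}, \code{demo_realloc()}, \code{demo_free()} and
\code{demo_zero_size_roundtrip()} are hypothetical and not part of the
Python/C API, and nothing here is claimed to match the interpreter's
actual implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrappers; the names are not part of the Python/C API. */

/* A zero-byte request is served as a one-byte request, so a successful
   call always yields a distinct non-NULL pointer. */
static void *demo_malloc(size_t n)
{
    return malloc(n != 0 ? n : 1);
}

/* Resizing to zero keeps a minimal block alive instead of freeing it. */
static void *demo_realloc(void *p, size_t n)
{
    return realloc(p, n != 0 ? n : 1);
}

/* free(NULL) is already a no-op in ANSI C. */
static void demo_free(void *p)
{
    free(p);
}

/* A zero-length buffer is still a valid block to resize and release.
   Returns 0 on success, -1 if an allocation failed. */
static int demo_zero_size_roundtrip(void)
{
    char *p = demo_malloc(0);      /* behaves like demo_malloc(1) */
    char *q;

    if (p == NULL)
        return -1;
    q = demo_realloc(p, 0);        /* still a live block, not freed */
    if (q == NULL) {
        demo_free(p);              /* p stays valid when resizing fails */
        return -1;
    }
    demo_free(q);
    return 0;
}
```

Because a successful zero-byte request never returns \NULL{} in this
sketch, callers can treat \NULL{} purely as an out-of-memory signal.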
| |
| The following type-oriented macros are provided for convenience. Note |
| that \var{TYPE} refers to any C type. |
| |
| \begin{cfuncdesc}{\var{TYPE}*}{PyMem_New}{TYPE, size_t n} |
| Same as \cfunction{PyMem_Malloc()}, but allocates \code{(\var{n} * |
| sizeof(\var{TYPE}))} bytes of memory. Returns a pointer cast to |
| \ctype{\var{TYPE}*}. The memory will not have been initialized in |
| any way. |
| \end{cfuncdesc} |
| |
| \begin{cfuncdesc}{\var{TYPE}*}{PyMem_Resize}{void *p, TYPE, size_t n} |
| Same as \cfunction{PyMem_Realloc()}, but the memory block is resized |
| to \code{(\var{n} * sizeof(\var{TYPE}))} bytes. Returns a pointer |
| cast to \ctype{\var{TYPE}*}. |
| \end{cfuncdesc} |
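To make the relationship between the type-oriented macros and the
byte-oriented functions concrete, here is a minimal sketch of how such
macros could be layered over \cfunction{malloc()} and
\cfunction{realloc()}. \code{DEMO_NEW}, \code{DEMO_RESIZE} and
\code{demo_grow()} are hypothetical names; the real definitions live in
the Python headers and may differ. The overflow check on
\code{\var{n} * sizeof(\var{TYPE})} is an assumption added for safety,
not something stated in the descriptions above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical macros layered over malloc()/realloc(); the real
   PyMem_New()/PyMem_Resize() are defined by the Python headers. */

/* Refuse requests whose byte count n * sizeof(type) would overflow size_t. */
#define DEMO_NEW(type, n) \
    (((size_t)(n) > SIZE_MAX / sizeof(type)) \
        ? NULL \
        : (type *)malloc((size_t)(n) * sizeof(type)))

#define DEMO_RESIZE(p, type, n) \
    (((size_t)(n) > SIZE_MAX / sizeof(type)) \
        ? NULL \
        : (type *)realloc((p), (size_t)(n) * sizeof(type)))

/* Grow an int array to new_count items (new_count must be nonzero).
   Returns 0 on success, -1 on failure. */
static int demo_grow(int **items, size_t new_count)
{
    int *tmp;

    assert(items != NULL && new_count > 0);
    tmp = DEMO_RESIZE(*items, int, new_count);
    if (tmp == NULL)
        return -1;   /* *items is untouched and still owned by the caller */
    *items = tmp;
    return 0;
}
```

Resizing through a temporary pointer, as \code{demo_grow()} does, keeps
the original block reachable when the request fails, so it can still be
released.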
| |
| \begin{cfuncdesc}{void}{PyMem_Del}{void *p} |
| Same as \cfunction{PyMem_Free()}. |
| \end{cfuncdesc} |
| |
| In addition, the following macro sets are provided for calling the |
| Python memory allocator directly, without involving the C API functions |
| listed above. However, note that their use does not preserve binary |
| compatibility across Python versions and is therefore deprecated in |
| extension modules. |
| |
| \cfunction{PyMem_MALLOC()}, \cfunction{PyMem_REALLOC()}, \cfunction{PyMem_FREE()}. |
| |
| \cfunction{PyMem_NEW()}, \cfunction{PyMem_RESIZE()}, \cfunction{PyMem_DEL()}. |
| |
| |
| \section{Examples \label{memoryExamples}} |
| |
| Here is the example from section \ref{memoryOverview}, rewritten so |
| that the I/O buffer is allocated from the Python heap by using the |
| first function set: |
| |
| \begin{verbatim} |
| PyObject *res; |
| char *buf = (char *) PyMem_Malloc(BUFSIZ); /* for I/O */ |
| |
| if (buf == NULL) |
| return PyErr_NoMemory(); |
| /* ...Do some I/O operation involving buf... */ |
| res = PyString_FromString(buf); |
| PyMem_Free(buf); /* allocated with PyMem_Malloc */ |
| return res; |
| \end{verbatim} |
| |
| The same code using the type-oriented function set: |
| |
| \begin{verbatim} |
| PyObject *res; |
| char *buf = PyMem_New(char, BUFSIZ); /* for I/O */ |
| |
| if (buf == NULL) |
| return PyErr_NoMemory(); |
| /* ...Do some I/O operation involving buf... */ |
| res = PyString_FromString(buf); |
| PyMem_Del(buf); /* allocated with PyMem_New */ |
| return res; |
| \end{verbatim} |
| |
| Note that in the two examples above, the buffer is always |
| manipulated via functions belonging to the same set. Indeed, it |
| is required to use the same memory API family for a given |
| memory block, so that the risk of mixing different allocators is |
| reduced to a minimum. The following code sequence contains two errors, |
| one of which is labeled as \emph{fatal} because it mixes two different |
| allocators operating on different heaps. |
| |
| \begin{verbatim} |
| char *buf1 = PyMem_New(char, BUFSIZ); |
| char *buf2 = (char *) malloc(BUFSIZ); |
| char *buf3 = (char *) PyMem_Malloc(BUFSIZ); |
| /* ... */ |
| PyMem_Del(buf3); /* Wrong -- should be PyMem_Free() */ |
| free(buf2); /* Right -- allocated via malloc() */ |
| free(buf1); /* Fatal -- should be PyMem_Del() */ |
| \end{verbatim} |
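Mismatches like the ones above are easy to make and hard to spot by
reading code. As a debugging aid, one could stamp every block with the
family that allocated it and check the stamp on release. The following is
only a sketch under that assumption. \code{demo_family},
\code{demo_header}, \code{demo_alloc()} and \code{demo_release()} are
hypothetical names, and this is not presented as what the Python memory
manager does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator families; the tags are illustrative only. */
enum demo_family { DEMO_RAW = 'r', DEMO_TYPED = 't' };

/* Header stored in front of every block.  The extra union members only
   pad the header so the user data after it stays suitably aligned. */
typedef union demo_header {
    enum demo_family family;
    long double pad_ld;
    long long pad_ll;
    void *pad_ptr;
} demo_header;

/* Allocate n user bytes and record which family made the request. */
static void *demo_alloc(enum demo_family family, size_t n)
{
    demo_header *h;

    assert(family == DEMO_RAW || family == DEMO_TYPED);
    if (n > (size_t)-1 - sizeof *h)
        return NULL;               /* header + n would overflow */
    h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->family = family;
    return h + 1;                  /* user data starts after the header */
}

/* Returns 0 on success, -1 if the block belongs to another family.
   A mismatched block is left untouched so the bug can be reported. */
static int demo_release(enum demo_family family, void *p)
{
    demo_header *h;

    if (p == NULL)
        return 0;
    h = (demo_header *)p - 1;
    if (h->family != family)
        return -1;
    free(h);
    return 0;
}
```

The check only works for blocks obtained through \code{demo_alloc()}; a
pointer coming straight from \cfunction{malloc()} has no header, so
passing it to \code{demo_release()} would still be undefined behavior.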
| |
| In addition to the functions aimed at handling raw memory blocks from |
| the Python heap, objects in Python are allocated and released with |
| \cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()} and |
| \cfunction{PyObject_Del()}. |
| |
| These will be explained in the next chapter on defining and |
| implementing new object types in C. |