| \chapter{Memory Management \label{memory}} | 
 | \sectionauthor{Vladimir Marangozov}{Vladimir.Marangozov@inrialpes.fr} | 
 |  | 
 |  | 
 | \section{Overview \label{memoryOverview}} | 
 |  | 
 | Memory management in Python involves a private heap containing all | 
 | Python objects and data structures. The management of this private | 
 | heap is ensured internally by the \emph{Python memory manager}.  The | 
 | Python memory manager has different components which deal with various | 
 | dynamic storage management aspects, like sharing, segmentation, | 
 | preallocation or caching. | 
 |  | 
 | At the lowest level, a raw memory allocator ensures that there is | 
 | enough room in the private heap for storing all Python-related data | 
 | by interacting with the memory manager of the operating system. On top | 
 | of the raw memory allocator, several object-specific allocators | 
 | operate on the same heap and implement distinct memory management | 
 | policies adapted to the peculiarities of every object type. For | 
 | example, integer objects are managed differently within the heap than | 
 | strings, tuples or dictionaries because integers imply different | 
 | storage requirements and speed/space tradeoffs. The Python memory | 
 | manager thus delegates some of the work to the object-specific | 
 | allocators, but ensures that the latter operate within the bounds of | 
 | the private heap. | 
 |  | 
 | It is important to understand that the management of the Python heap | 
 | is performed by the interpreter itself and that the user has no | 
 | control over it, even if she regularly manipulates object pointers to | 
 | memory blocks inside that heap.  The allocation of heap space for | 
 | Python objects and other internal buffers is performed on demand by | 
 | the Python memory manager through the Python/C API functions listed in | 
 | this document. | 
 |  | 
 | To avoid memory corruption, extension writers should never try to | 
 | operate on Python objects with the functions exported by the C | 
 | library: \cfunction{malloc()}\ttindex{malloc()}, | 
 | \cfunction{calloc()}\ttindex{calloc()}, | 
 | \cfunction{realloc()}\ttindex{realloc()} and | 
 | \cfunction{free()}\ttindex{free()}.  This will result in  | 
 | mixed calls between the C allocator and the Python memory manager | 
 | with fatal consequences, because they implement different algorithms | 
 | and operate on different heaps.  However, one may safely allocate and | 
 | release memory blocks with the C library allocator for individual | 
 | purposes, as shown in the following example: | 
 |  | 
 | \begin{verbatim} | 
 |     PyObject *res; | 
 |     char *buf = (char *) malloc(BUFSIZ); /* for I/O */ | 
 |  | 
 |     if (buf == NULL) | 
 |         return PyErr_NoMemory(); | 
 |     ...Do some I/O operation involving buf... | 
 |     res = PyString_FromString(buf); | 
 |     free(buf); /* malloc'ed */ | 
 |     return res; | 
 | \end{verbatim} | 
 |  | 
 | In this example, the memory request for the I/O buffer is handled by | 
 | the C library allocator. The Python memory manager is involved only | 
 | in the allocation of the string object returned as a result. | 
 |  | 
 | In most situations, however, it is recommended to allocate memory from | 
 | the Python heap specifically because the latter is under control of | 
 | the Python memory manager. For example, this is required when the | 
 | interpreter is extended with new object types written in C. Another | 
 | reason for using the Python heap is the desire to \emph{inform} the | 
 | Python memory manager about the memory needs of the extension module. | 
 | Even when the requested memory is used exclusively for internal, | 
 | highly-specific purposes, delegating all memory requests to the Python | 
 | memory manager causes the interpreter to have a more accurate image of | 
 | its memory footprint as a whole. Consequently, under certain | 
 | circumstances, the Python memory manager may or may not trigger | 
 | appropriate actions, like garbage collection, memory compaction or | 
 | other preventive procedures. Note that by using the C library | 
 | allocator as shown in the previous example, the allocated memory for | 
 | the I/O buffer escapes completely the Python memory manager. | 
 |  | 
 |  | 
 | \section{Memory Interface \label{memoryInterface}} | 
 |  | 
 | The following function sets, modeled after the ANSI C standard, | 
 | but specifying  behavior when requesting zero bytes, | 
 | are available for allocating and releasing memory from the Python heap: | 
 |  | 
 |  | 
 | \begin{cfuncdesc}{void*}{PyMem_Malloc}{size_t n} | 
 |   Allocates \var{n} bytes and returns a pointer of type \ctype{void*} | 
 |   to the allocated memory, or \NULL{} if the request fails. | 
 |   Requesting zero bytes returns a distinct non-\NULL{} pointer if | 
 |   possible, as if \cfunction{PyMem_Malloc(1)} had been called instead. | 
 |   The memory will not have been initialized in any way. | 
 | \end{cfuncdesc} | 
 |  | 
 | \begin{cfuncdesc}{void*}{PyMem_Realloc}{void *p, size_t n} | 
 |   Resizes the memory block pointed to by \var{p} to \var{n} bytes. | 
 |   The contents will be unchanged to the minimum of the old and the new | 
 |   sizes. If \var{p} is \NULL, the call is equivalent to | 
 |   \cfunction{PyMem_Malloc(\var{n})}; else if \var{n} is equal to zero, the | 
 |   memory block is resized but is not freed, and the returned pointer | 
 |   is non-\NULL.  Unless \var{p} is \NULL, it must have been | 
 |   returned by a previous call to \cfunction{PyMem_Malloc()} or | 
 |   \cfunction{PyMem_Realloc()}. | 
 | \end{cfuncdesc} | 
 |  | 
 | \begin{cfuncdesc}{void}{PyMem_Free}{void *p} | 
 |   Frees the memory block pointed to by \var{p}, which must have been | 
 |   returned by a previous call to \cfunction{PyMem_Malloc()} or | 
 |   \cfunction{PyMem_Realloc()}.  Otherwise, or if | 
 |   \cfunction{PyMem_Free(p)} has been called before, undefined | 
 |   behavior occurs. If \var{p} is \NULL, no operation is performed. | 
 | \end{cfuncdesc} | 
 |  | 
 | The following type-oriented macros are provided for convenience.  Note  | 
 | that \var{TYPE} refers to any C type. | 
 |  | 
 | \begin{cfuncdesc}{\var{TYPE}*}{PyMem_New}{TYPE, size_t n} | 
 |   Same as \cfunction{PyMem_Malloc()}, but allocates \code{(\var{n} * | 
 |   sizeof(\var{TYPE}))} bytes of memory.  Returns a pointer cast to | 
 |   \ctype{\var{TYPE}*}.  The memory will not have been initialized in | 
 |   any way. | 
 | \end{cfuncdesc} | 
 |  | 
 | \begin{cfuncdesc}{\var{TYPE}*}{PyMem_Resize}{void *p, TYPE, size_t n} | 
 |   Same as \cfunction{PyMem_Realloc()}, but the memory block is resized | 
 |   to \code{(\var{n} * sizeof(\var{TYPE}))} bytes.  Returns a pointer | 
 |   cast to \ctype{\var{TYPE}*}. | 
 | \end{cfuncdesc} | 
 |  | 
 | \begin{cfuncdesc}{void}{PyMem_Del}{void *p} | 
 |   Same as \cfunction{PyMem_Free()}. | 
 | \end{cfuncdesc} | 
 |  | 
 | In addition, the following macro sets are provided for calling the | 
 | Python memory allocator directly, without involving the C API functions | 
 | listed above. However, note that their use does not preserve binary | 
 | compatibility accross Python versions and is therefore deprecated in | 
 | extension modules. | 
 |  | 
 | \cfunction{PyMem_MALLOC()}, \cfunction{PyMem_REALLOC()}, \cfunction{PyMem_FREE()}. | 
 |  | 
 | \cfunction{PyMem_NEW()}, \cfunction{PyMem_RESIZE()}, \cfunction{PyMem_DEL()}. | 
 |  | 
 |  | 
 | \section{Examples \label{memoryExamples}} | 
 |  | 
 | Here is the example from section \ref{memoryOverview}, rewritten so | 
 | that the I/O buffer is allocated from the Python heap by using the | 
 | first function set: | 
 |  | 
 | \begin{verbatim} | 
 |     PyObject *res; | 
 |     char *buf = (char *) PyMem_Malloc(BUFSIZ); /* for I/O */ | 
 |  | 
 |     if (buf == NULL) | 
 |         return PyErr_NoMemory(); | 
 |     /* ...Do some I/O operation involving buf... */ | 
 |     res = PyString_FromString(buf); | 
 |     PyMem_Free(buf); /* allocated with PyMem_Malloc */ | 
 |     return res; | 
 | \end{verbatim} | 
 |  | 
 | The same code using the type-oriented function set: | 
 |  | 
 | \begin{verbatim} | 
 |     PyObject *res; | 
 |     char *buf = PyMem_New(char, BUFSIZ); /* for I/O */ | 
 |  | 
 |     if (buf == NULL) | 
 |         return PyErr_NoMemory(); | 
 |     /* ...Do some I/O operation involving buf... */ | 
 |     res = PyString_FromString(buf); | 
 |     PyMem_Del(buf); /* allocated with PyMem_New */ | 
 |     return res; | 
 | \end{verbatim} | 
 |  | 
 | Note that in the two examples above, the buffer is always | 
 | manipulated via functions belonging to the same set. Indeed, it | 
 | is required to use the same memory API family for a given | 
 | memory block, so that the risk of mixing different allocators is | 
 | reduced to a minimum. The following code sequence contains two errors, | 
 | one of which is labeled as \emph{fatal} because it mixes two different | 
 | allocators operating on different heaps. | 
 |  | 
 | \begin{verbatim} | 
 | char *buf1 = PyMem_New(char, BUFSIZ); | 
 | char *buf2 = (char *) malloc(BUFSIZ); | 
 | char *buf3 = (char *) PyMem_Malloc(BUFSIZ); | 
 | ... | 
 | PyMem_Del(buf3);  /* Wrong -- should be PyMem_Free() */ | 
 | free(buf2);       /* Right -- allocated via malloc() */ | 
 | free(buf1);       /* Fatal -- should be PyMem_Del()  */ | 
 | \end{verbatim} | 
 |  | 
 | In addition to the functions aimed at handling raw memory blocks from | 
 | the Python heap, objects in Python are allocated and released with | 
 | \cfunction{PyObject_New()}, \cfunction{PyObject_NewVar()} and | 
 | \cfunction{PyObject_Del()}, or with their corresponding macros | 
 | \cfunction{PyObject_NEW()}, \cfunction{PyObject_NEW_VAR()} and | 
 | \cfunction{PyObject_DEL()}. | 
 |  | 
 | These will be explained in the next chapter on defining and | 
 | implementing new object types in C. |