.. highlightlang:: c


.. _api-intro:

************
Introduction
************

The Application Programmer's Interface to Python gives C and C++ programmers
access to the Python interpreter at a variety of levels.  The API is equally
usable from C++, but for brevity it is generally referred to as the Python/C
API.  There are two fundamentally different reasons for using the Python/C API.
The first reason is to write *extension modules* for specific purposes; these
are C modules that extend the Python interpreter.  This is probably the most
common use.  The second reason is to use Python as a component in a larger
application; this technique is generally referred to as :dfn:`embedding` Python
in an application.

Writing an extension module is a relatively well-understood process, where a
"cookbook" approach works well.  There are several tools that automate the
process to some extent.  While people have embedded Python in other
applications since its early existence, the process of embedding Python is less
straightforward than writing an extension.

Many API functions are useful independent of whether you're embedding or
extending Python; moreover, most applications that embed Python will need to
provide a custom extension as well, so it's probably a good idea to become
familiar with writing an extension before attempting to embed Python in a real
application.


.. _api-includes:

Include Files
=============

All function, type and macro definitions needed to use the Python/C API are
included in your code by the following line::

   #include "Python.h"

This implies inclusion of the following standard headers: ``<stdio.h>``,
``<string.h>``, ``<errno.h>``, ``<limits.h>``, ``<assert.h>`` and ``<stdlib.h>``
(if available).

.. note::

   Since Python may define some pre-processor definitions which affect the standard
   headers on some systems, you *must* include :file:`Python.h` before any standard
   headers are included.

All user-visible names defined by :file:`Python.h` (except those defined by the
included standard headers) have one of the prefixes ``Py`` or ``_Py``.  Names
beginning with ``_Py`` are for internal use by the Python implementation and
should not be used by extension writers.  Structure member names do not have a
reserved prefix.

**Important:** user code should never define names that begin with ``Py`` or
``_Py``.  This confuses the reader, and jeopardizes the portability of the user
code to future Python versions, which may define additional names beginning with
one of these prefixes.

The header files are typically installed with Python.  On Unix, these are
located in the directories :file:`{prefix}/include/python{version}/` and
:file:`{exec_prefix}/include/python{version}/`, where :envvar:`prefix` and
:envvar:`exec_prefix` are defined by the corresponding parameters to Python's
:program:`configure` script and *version* is the major and minor version
number, ``'%d.%d' % sys.version_info[:2]``.  On Windows, the headers are
installed in :file:`{prefix}/include`, where :envvar:`prefix` is the
installation directory specified to the installer.

To include the headers, place both directories (if different) on your compiler's
search path for includes.  Do *not* place the parent directories on the search
path and then use ``#include <pythonX.Y/Python.h>``; this will break on
multi-platform builds since the platform independent headers under
:envvar:`prefix` include the platform specific headers from
:envvar:`exec_prefix`.

C++ users should note that though the API is defined entirely using C, the
header files do properly declare the entry points to be ``extern "C"``, so there
is no need to do anything special to use the API from C++.


.. _api-objects:

Objects, Types and Reference Counts
===================================

.. index:: object: type

Most Python/C API functions have one or more arguments as well as a return value
of type :c:type:`PyObject\*`.  This type is a pointer to an opaque data type
representing an arbitrary Python object.  Since all Python object types are
treated the same way by the Python language in most situations (e.g.,
assignments, scope rules, and argument passing), it is only fitting that they
should be represented by a single C type.  Almost all Python objects live on the
heap: you never declare an automatic or static variable of type
:c:type:`PyObject`; only pointer variables of type :c:type:`PyObject\*` can be
declared.  The sole exception is the type objects; since these must never be
deallocated, they are typically static :c:type:`PyTypeObject` objects.

All Python objects (even Python integers) have a :dfn:`type` and a
:dfn:`reference count`.  An object's type determines what kind of object it is
(e.g., an integer, a list, or a user-defined function; there are many more as
explained in :ref:`types`).  For each of the well-known types there is a macro
to check whether an object is of that type; for instance, ``PyList_Check(a)`` is
true if (and only if) the object pointed to by *a* is a Python list.


.. _api-refcounts:

Reference Counts
----------------

The reference count is important because today's computers have a finite (and
often severely limited) memory size; it counts how many different places there
are that have a reference to an object.  Such a place could be another object,
or a global (or static) C variable, or a local variable in some C function.
When an object's reference count becomes zero, the object is deallocated.  If
it contains references to other objects, their reference count is decremented.
Those other objects may be deallocated in turn, if this decrement makes their
reference count become zero, and so on.  (There's an obvious problem with
objects that reference each other here; for now, the solution is "don't do
that.")
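
The cascade described above can be sketched with a toy model in plain C.  This
is *not* how CPython lays out its objects: the ``toy_object`` type, its
functions and the ``toy_live_objects`` counter are all hypothetical names
invented for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: an object with a count and one optional owned reference. */
typedef struct toy_object {
    long refcnt;
    struct toy_object *child;   /* owned reference, may be NULL */
} toy_object;

static int toy_live_objects = 0;   /* objects allocated and not yet freed */

/* Create an object; it takes over the caller's reference to `child`. */
toy_object *
toy_new(toy_object *child)
{
    toy_object *op = malloc(sizeof *op);
    if (op == NULL)
        return NULL;
    op->refcnt = 1;              /* the caller owns the first reference */
    op->child = child;
    toy_live_objects++;
    return op;
}

void
toy_incref(toy_object *op)
{
    op->refcnt++;
}

void
toy_decref(toy_object *op)
{
    assert(op->refcnt > 0);
    if (--op->refcnt == 0) {
        toy_object *child = op->child;
        free(op);
        toy_live_objects--;
        if (child != NULL)
            toy_decref(child);   /* may deallocate the child in turn */
    }
}
```

In this model, dropping the last reference to a parent frees its child as well,
unless some other place still holds a reference to the child.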

.. index::
   single: Py_INCREF()
   single: Py_DECREF()

Reference counts are always manipulated explicitly.  The normal way is to use
the macro :c:func:`Py_INCREF` to increment an object's reference count by one,
and :c:func:`Py_DECREF` to decrement it by one.  The :c:func:`Py_DECREF` macro
is considerably more complex than the incref one, since it must check whether
the reference count becomes zero and then cause the object's deallocator to be
called.  The deallocator is a function pointer contained in the object's type
structure.  The type-specific deallocator takes care of decrementing the
reference counts for other objects contained in the object if this is a compound
object type, such as a list, as well as performing any additional finalization
that's needed.  There's no chance that the reference count can overflow; at
least as many bits are used to hold the reference count as there are distinct
memory locations in virtual memory (assuming ``sizeof(Py_ssize_t) >= sizeof(void*)``).
Thus, the reference count increment is a simple operation.

It is not necessary to increment an object's reference count for every local
variable that contains a pointer to an object.  In theory, the object's
reference count goes up by one when the variable is made to point to it and it
goes down by one when the variable goes out of scope.  However, these two
cancel each other out, so at the end the reference count hasn't changed.  The
only real reason to use the reference count is to prevent the object from being
deallocated as long as our variable is pointing to it.  If we know that there
is at least one other reference to the object that lives at least as long as
our variable, there is no need to increment the reference count temporarily.
An important situation where this arises is in objects that are passed as
arguments to C functions in an extension module that are called from Python;
the call mechanism guarantees to hold a reference to every argument for the
duration of the call.

However, a common pitfall is to extract an object from a list and hold on to it
for a while without incrementing its reference count.  Some other operation might
conceivably remove the object from the list, decrementing its reference count
and possibly deallocating it.  The real danger is that innocent-looking
operations may invoke arbitrary Python code which could do this; there is a code
path which allows control to flow back to the user from a :c:func:`Py_DECREF`, so
almost any operation is potentially dangerous.

A safe approach is to always use the generic operations (functions whose name
begins with ``PyObject_``, ``PyNumber_``, ``PySequence_`` or ``PyMapping_``).
These operations always increment the reference count of the object they return.
This leaves the caller with the responsibility to call :c:func:`Py_DECREF` when
they are done with the result; this soon becomes second nature.


.. _api-refcountdetails:

Reference Count Details
^^^^^^^^^^^^^^^^^^^^^^^

The reference count behavior of functions in the Python/C API is best explained
in terms of *ownership of references*.  Ownership pertains to references, never
to objects (objects are not owned: they are always shared).  "Owning a
reference" means being responsible for calling :c:func:`Py_DECREF` on it when
the reference is no longer needed.  Ownership can also be transferred, meaning
that the code that receives ownership of the reference then becomes responsible
for eventually decref'ing it by calling :c:func:`Py_DECREF` or
:c:func:`Py_XDECREF` when it's no longer needed---or passing on this
responsibility (usually to its caller).  When a function passes ownership of a
reference on to its caller, the caller is said to receive a *new* reference.
When no ownership is transferred, the caller is said to *borrow* the reference.
Nothing needs to be done for a borrowed reference.

Conversely, when a calling function passes in a reference to an object, there
are two possibilities: the function *steals* a reference to the object, or it
does not.  *Stealing a reference* means that when you pass a reference to a
function, that function assumes that it now owns that reference, and you are not
responsible for it any longer.

.. index::
   single: PyList_SetItem()
   single: PyTuple_SetItem()

Few functions steal references; the two notable exceptions are
:c:func:`PyList_SetItem` and :c:func:`PyTuple_SetItem`, which steal a reference
to the item (but not to the tuple or list into which the item is put!).  These
functions were designed to steal a reference because of a common idiom for
populating a tuple or list with newly created objects; for example, the code to
create the tuple ``(1, 2, "three")`` could look like this (forgetting about
error handling for the moment; a better way to code this is shown below)::

   PyObject *t;

   t = PyTuple_New(3);
   PyTuple_SetItem(t, 0, PyLong_FromLong(1L));
   PyTuple_SetItem(t, 1, PyLong_FromLong(2L));
   PyTuple_SetItem(t, 2, PyUnicode_FromString("three"));

Here, :c:func:`PyLong_FromLong` returns a new reference which is immediately
stolen by :c:func:`PyTuple_SetItem`.  When you want to keep using an object
although the reference to it will be stolen, use :c:func:`Py_INCREF` to grab
another reference before calling the reference-stealing function.

Incidentally, :c:func:`PyTuple_SetItem` is the *only* way to set tuple items;
:c:func:`PySequence_SetItem` and :c:func:`PyObject_SetItem` refuse to do this
since tuples are an immutable data type.  You should only use
:c:func:`PyTuple_SetItem` for tuples that you are creating yourself.

Equivalent code for populating a list can be written using :c:func:`PyList_New`
and :c:func:`PyList_SetItem`.

However, in practice, you will rarely use these ways of creating and populating
a tuple or list.  There's a generic function, :c:func:`Py_BuildValue`, that can
create most common objects from C values, directed by a :dfn:`format string`.
For example, the above two blocks of code could be replaced by the following
(which also takes care of the error checking)::

   PyObject *tuple, *list;

   tuple = Py_BuildValue("(iis)", 1, 2, "three");
   list = Py_BuildValue("[iis]", 1, 2, "three");

It is much more common to use :c:func:`PyObject_SetItem` and friends with items
whose references you are only borrowing, like arguments that were passed in to
the function you are writing.  In that case, their behaviour regarding reference
counts is much saner, since you don't have to increment a reference count so you
can give a reference away ("have it be stolen").  For example, this function
sets all items of a list (actually, any mutable sequence) to a given item::

   int
   set_all(PyObject *target, PyObject *item)
   {
       Py_ssize_t i, n;

       n = PyObject_Length(target);
       if (n < 0)
           return -1;
       for (i = 0; i < n; i++) {
           PyObject *index = PyLong_FromSsize_t(i);
           if (!index)
               return -1;
           if (PyObject_SetItem(target, index, item) < 0) {
               Py_DECREF(index); /* Don't leak the index on failure */
               return -1;
           }
           Py_DECREF(index);
       }
       return 0;
   }

.. index:: single: set_all()

The situation is slightly different for function return values.  While passing
a reference to most functions does not change your ownership responsibilities
for that reference, many functions that return a reference to an object give
you ownership of the reference.  The reason is simple: in many cases, the
returned object is created on the fly, and the reference you get is the only
reference to the object.  Therefore, the generic functions that return object
references, like :c:func:`PyObject_GetItem` and :c:func:`PySequence_GetItem`,
always return a new reference (the caller becomes the owner of the reference).

It is important to realize that whether you own a reference returned by a
function depends on which function you call only --- *the plumage* (the type of
the object passed as an argument to the function) *doesn't enter into it!*
Thus, if you extract an item from a list using :c:func:`PyList_GetItem`, you
don't own the reference --- but if you obtain the same item from the same list
using :c:func:`PySequence_GetItem` (which happens to take exactly the same
arguments), you do own a reference to the returned object.

.. index::
   single: PyList_GetItem()
   single: PySequence_GetItem()

Here is an example of how you could write a function that computes the sum of
the items in a list of integers; once using :c:func:`PyList_GetItem`, and once
using :c:func:`PySequence_GetItem`. ::

   long
   sum_list(PyObject *list)
   {
       Py_ssize_t i, n;
       long total = 0, value;
       PyObject *item;

       n = PyList_Size(list);
       if (n < 0)
           return -1; /* Not a list */
       for (i = 0; i < n; i++) {
           item = PyList_GetItem(list, i); /* Can't fail */
           if (!PyLong_Check(item)) continue; /* Skip non-integers */
           value = PyLong_AsLong(item);
           if (value == -1 && PyErr_Occurred())
               /* Integer too big to fit in a C long, bail out */
               return -1;
           total += value;
       }
       return total;
   }

.. index:: single: sum_list()

::

   long
   sum_sequence(PyObject *sequence)
   {
       Py_ssize_t i, n;
       long total = 0, value;
       PyObject *item;
       n = PySequence_Length(sequence);
       if (n < 0)
           return -1; /* Has no length */
       for (i = 0; i < n; i++) {
           item = PySequence_GetItem(sequence, i);
           if (item == NULL)
               return -1; /* Not a sequence, or other failure */
           if (PyLong_Check(item)) {
               value = PyLong_AsLong(item);
               Py_DECREF(item);
               if (value == -1 && PyErr_Occurred())
                   /* Integer too big to fit in a C long, bail out */
                   return -1;
               total += value;
           }
           else {
               Py_DECREF(item); /* Discard reference ownership */
           }
       }
       return total;
   }

.. index:: single: sum_sequence()


.. _api-types:

Types
-----

There are few other data types that play a significant role in the Python/C
API; most are simple C types such as :c:type:`int`, :c:type:`long`,
:c:type:`double` and :c:type:`char\*`.  A few structure types are used to
describe static tables used to list the functions exported by a module or the
data attributes of a new object type, and another is used to describe the value
of a complex number.  These will be discussed together with the functions that
use them.


.. _api-exceptions:

Exceptions
==========

The Python programmer only needs to deal with exceptions if specific error
handling is required; unhandled exceptions are automatically propagated to the
caller, then to the caller's caller, and so on, until they reach the top-level
interpreter, where they are reported to the user accompanied by a stack
traceback.

.. index:: single: PyErr_Occurred()

For C programmers, however, error checking always has to be explicit.  All
functions in the Python/C API can raise exceptions, unless an explicit claim is
made otherwise in a function's documentation.  In general, when a function
encounters an error, it sets an exception, discards any object references that
it owns, and returns an error indicator.  If not documented otherwise, this
indicator is either *NULL* or ``-1``, depending on the function's return type.
A few functions return a Boolean true/false result, with false indicating an
error.  Very few functions return no explicit error indicator or have an
ambiguous return value, and require explicit testing for errors with
:c:func:`PyErr_Occurred`.  These exceptions are always explicitly documented.

.. index::
   single: PyErr_SetString()
   single: PyErr_Clear()

Exception state is maintained in per-thread storage (this is equivalent to
using global storage in an unthreaded application).  A thread can be in one of
two states: an exception has occurred, or not.  The function
:c:func:`PyErr_Occurred` can be used to check for this: it returns a borrowed
reference to the exception type object when an exception has occurred, and
*NULL* otherwise.  There are a number of functions to set the exception state:
:c:func:`PyErr_SetString` is the most common (though not the most general)
function to set the exception state, and :c:func:`PyErr_Clear` clears the
exception state.

The full exception state consists of three objects (all of which can be
*NULL*): the exception type, the corresponding exception value, and the
traceback.  These have the same meanings as the Python result of
``sys.exc_info()``; however, they are not the same: the Python objects represent
the last exception being handled by a Python :keyword:`try` ...
:keyword:`except` statement, while the C level exception state only exists while
an exception is being passed on between C functions until it reaches the Python
bytecode interpreter's main loop, which takes care of transferring it to
``sys.exc_info()`` and friends.

.. index:: single: exc_info() (in module sys)

Note that starting with Python 1.5, the preferred, thread-safe way to access the
exception state from Python code is to call the function :func:`sys.exc_info`,
which returns the per-thread exception state for Python code.  Also, the
semantics of both ways to access the exception state have changed so that a
function which catches an exception will save and restore its thread's exception
state so as to preserve the exception state of its caller.  This prevents common
bugs in exception handling code caused by an innocent-looking function
overwriting the exception being handled; it also reduces the often unwanted
lifetime extension for objects that are referenced by the stack frames in the
traceback.

As a general principle, a function that calls another function to perform some
task should check whether the called function raised an exception, and if so,
pass the exception state on to its caller.  It should discard any object
references that it owns, and return an error indicator, but it should *not* set
another exception --- that would overwrite the exception that was just raised,
and lose important information about the exact cause of the error.

.. index:: single: sum_sequence()

A simple example of detecting exceptions and passing them on is shown in the
:c:func:`sum_sequence` example above.  It so happens that that example doesn't
need to clean up any owned references when it detects an error.  The following
example function shows some error cleanup.  First, to remind you why you like
Python, we show the equivalent Python code::

   def incr_item(dict, key):
       try:
           item = dict[key]
       except KeyError:
           item = 0
       dict[key] = item + 1

.. index:: single: incr_item()

Here is the corresponding C code, in all its glory::

   int
   incr_item(PyObject *dict, PyObject *key)
   {
       /* Objects all initialized to NULL for Py_XDECREF */
       PyObject *item = NULL, *const_one = NULL, *incremented_item = NULL;
       int rv = -1; /* Return value initialized to -1 (failure) */

       item = PyObject_GetItem(dict, key);
       if (item == NULL) {
           /* Handle KeyError only: */
           if (!PyErr_ExceptionMatches(PyExc_KeyError))
               goto error;

           /* Clear the error and use zero: */
           PyErr_Clear();
           item = PyLong_FromLong(0L);
           if (item == NULL)
               goto error;
       }
       const_one = PyLong_FromLong(1L);
       if (const_one == NULL)
           goto error;

       incremented_item = PyNumber_Add(item, const_one);
       if (incremented_item == NULL)
           goto error;

       if (PyObject_SetItem(dict, key, incremented_item) < 0)
           goto error;
       rv = 0; /* Success */
       /* Continue with cleanup code */

    error:
       /* Cleanup code, shared by success and failure path */

       /* Use Py_XDECREF() to ignore NULL references */
       Py_XDECREF(item);
       Py_XDECREF(const_one);
       Py_XDECREF(incremented_item);

       return rv; /* -1 for error, 0 for success */
   }

.. index:: single: incr_item()

.. index::
   single: PyErr_ExceptionMatches()
   single: PyErr_Clear()
   single: Py_XDECREF()

This example represents an endorsed use of the ``goto`` statement in C!
It illustrates the use of :c:func:`PyErr_ExceptionMatches` and
:c:func:`PyErr_Clear` to handle specific exceptions, and the use of
:c:func:`Py_XDECREF` to dispose of owned references that may be *NULL* (note the
``'X'`` in the name; :c:func:`Py_DECREF` would crash when confronted with a
*NULL* reference).  It is important that the variables used to hold owned
references are initialized to *NULL* for this to work; likewise, the proposed
return value is initialized to ``-1`` (failure) and only set to success after
the final call made is successful.


.. _api-embedding:

Embedding Python
================

The one important task that only embedders (as opposed to extension writers) of
the Python interpreter have to worry about is the initialization, and possibly
the finalization, of the Python interpreter.  Most functionality of the
interpreter can only be used after the interpreter has been initialized.

.. index::
   single: Py_Initialize()
   module: builtins
   module: __main__
   module: sys
   triple: module; search; path
   single: path (in module sys)

The basic initialization function is :c:func:`Py_Initialize`. This initializes
the table of loaded modules, and creates the fundamental modules
:mod:`builtins`, :mod:`__main__`, and :mod:`sys`.  It also
initializes the module search path (``sys.path``).

.. index:: single: PySys_SetArgvEx()

:c:func:`Py_Initialize` does not set the "script argument list" (``sys.argv``).
If this variable is needed by Python code that will be executed later, it must
be set explicitly with a call to ``PySys_SetArgvEx(argc, argv, updatepath)``
after the call to :c:func:`Py_Initialize`.

On most systems (in particular, on Unix and Windows, although the details are
slightly different), :c:func:`Py_Initialize` calculates the module search path
based upon its best guess for the location of the standard Python interpreter
executable, assuming that the Python library is found in a fixed location
relative to the Python interpreter executable.  In particular, it looks for a
directory named :file:`lib/python{X.Y}` relative to the parent directory
where the executable named :file:`python` is found on the shell command search
path (the environment variable :envvar:`PATH`).

For instance, if the Python executable is found in
:file:`/usr/local/bin/python`, it will assume that the libraries are in
:file:`/usr/local/lib/python{X.Y}`.  (In fact, this particular path is also
the "fallback" location, used when no executable file named :file:`python` is
found along :envvar:`PATH`.)  The user can override this behavior by setting the
environment variable :envvar:`PYTHONHOME`, or insert additional directories in
front of the standard path by setting :envvar:`PYTHONPATH`.
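
As a rough illustration of the rule just described, the following sketch
derives the library directory from the full path of an executable.  It is not
the algorithm used by the interpreter, which also takes :envvar:`PATH` and
:envvar:`PYTHONHOME` into account; the helper name ``guess_library_dir`` and
its parameters are hypothetical and exist only for this example::

   #include <stdio.h>
   #include <string.h>

   /* Hypothetical helper: write "<prefix>/lib/python<version>" into out,
      where <prefix> is the parent of the directory holding the executable.
      Returns 0 on success, -1 if the path has no such parent or the result
      does not fit into the buffer. */
   int
   guess_library_dir(const char *exe_path, const char *version,
                     char *out, size_t out_size)
   {
       const char *last_slash, *parent_slash;
       size_t prefix_len;
       int written;

       /* Drop the file name: "/usr/local/bin/python" -> "/usr/local/bin" */
       last_slash = strrchr(exe_path, '/');
       if (last_slash == NULL || last_slash == exe_path)
           return -1;

       /* Drop the directory of the executable: -> "/usr/local" */
       parent_slash = last_slash - 1;
       while (parent_slash > exe_path && *parent_slash != '/')
           parent_slash--;
       if (*parent_slash != '/')
           return -1;
       prefix_len = (size_t)(parent_slash - exe_path);

       /* Append the fixed relative location of the library */
       written = snprintf(out, out_size, "%.*s/lib/python%s",
                          (int)prefix_len, exe_path, version);
       if (written < 0 || (size_t)written >= out_size)
           return -1;
       return 0;
   }

Called with :file:`/usr/local/bin/python` and the version string ``3.2``, the
helper fills the buffer with :file:`/usr/local/lib/python3.2`.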

.. index::
   single: Py_SetProgramName()
   single: Py_GetPath()
   single: Py_GetPrefix()
   single: Py_GetExecPrefix()
   single: Py_GetProgramFullPath()

The embedding application can steer the search by calling
``Py_SetProgramName(file)`` *before* calling :c:func:`Py_Initialize`.  Note that
:envvar:`PYTHONHOME` still overrides this and :envvar:`PYTHONPATH` is still
inserted in front of the standard path.  An application that requires total
control has to provide its own implementation of :c:func:`Py_GetPath`,
:c:func:`Py_GetPrefix`, :c:func:`Py_GetExecPrefix`, and
:c:func:`Py_GetProgramFullPath` (all defined in :file:`Modules/getpath.c`).

.. index:: single: Py_IsInitialized()

Sometimes, it is desirable to "uninitialize" Python.  For instance, the
application may want to start over (make another call to
:c:func:`Py_Initialize`) or the application is simply done with its use of
Python and wants to free memory allocated by Python.  This can be accomplished
by calling :c:func:`Py_Finalize`.  The function :c:func:`Py_IsInitialized` returns
true if Python is currently in the initialized state.  More information about
these functions is given in a later chapter.  Notice that :c:func:`Py_Finalize`
does *not* free all memory allocated by the Python interpreter; for example,
memory allocated by extension modules currently cannot be released.
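
A minimal sketch of the initialize/finalize cycle described above, assuming the
program is compiled and linked against the Python library.  The helper name
``run_embedded_snippet`` is hypothetical, and :c:func:`PyRun_SimpleString` is
used here only as a convenient way to execute some code between the two
calls::

   #include <Python.h>

   /* Hypothetical helper: run one piece of Python source in a freshly
      initialized interpreter, then finalize it again.
      Returns 0 on success, -1 if the code raised an exception. */
   int
   run_embedded_snippet(const char *source)
   {
       int status;

       Py_Initialize();

       /* Illustrative check only: Py_Initialize() does not return an
          error code, so this merely demonstrates Py_IsInitialized(). */
       if (!Py_IsInitialized())
           return -1;

       /* Executes the source in the __main__ module;
          returns 0 on success and -1 if an exception was raised. */
       status = PyRun_SimpleString(source);

       /* Release what the interpreter is able to release. */
       Py_Finalize();
       return status;
   }

Because the helper finalizes the interpreter before returning, each call
starts from a fresh state; an application that runs many snippets would more
likely initialize once and finalize once at exit.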


.. _api-debugging:

Debugging Builds
================

Python can be built with several macros to enable extra checks of the
interpreter and extension modules.  These checks tend to add a large amount of
overhead to the runtime so they are not enabled by default.

A full list of the various types of debugging builds is in the file
:file:`Misc/SpecialBuilds.txt` in the Python source distribution. Builds are
available that support tracing of reference counts, debugging the memory
allocator, or low-level profiling of the main interpreter loop.  Only the most
frequently used builds will be described in the remainder of this section.

Compiling the interpreter with the :c:macro:`Py_DEBUG` macro defined produces
what is generally meant by "a debug build" of Python. :c:macro:`Py_DEBUG` is
enabled in the Unix build by adding :option:`--with-pydebug` to the
:file:`configure` command.  It is also implied by the presence of the
not-Python-specific :c:macro:`_DEBUG` macro.  When :c:macro:`Py_DEBUG` is enabled
in the Unix build, compiler optimization is disabled.

In addition to the reference count debugging described below, the following
extra checks are performed:

* Extra checks are added to the object allocator.

* Extra checks are added to the parser and compiler.

* Downcasts from wide types to narrow types are checked for loss of information.

* A number of assertions are added to the dictionary and set implementations.
  In addition, the set object acquires a :meth:`test_c_api` method.

* Sanity checks of the input arguments are added to frame creation.

* The storage for ints is initialized with a known invalid pattern to catch
  references to uninitialized digits.

* Low-level tracing and extra exception checking are added to the runtime
  virtual machine.

* Extra checks are added to the memory arena implementation.

* Extra debugging is added to the thread module.

There may be additional checks not mentioned here.

Defining :c:macro:`Py_TRACE_REFS` enables reference tracing.  When defined, a
circular doubly linked list of active objects is maintained by adding two extra
fields to every :c:type:`PyObject`.  Total allocations are tracked as well.  Upon
exit, all existing references are printed.  (In interactive mode this happens
after every statement run by the interpreter.)  This macro is implied by
:c:macro:`Py_DEBUG`.
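
The data structure behind reference tracing can be sketched in plain C.  This
is an illustration of a circular doubly linked list with two extra link fields
per object, not the interpreter's implementation; the type ``traced_object``
and the functions ``track``, ``untrack`` and ``count_live`` are hypothetical
names chosen for this example::

   #include <stddef.h>

   /* Hypothetical object header with the two extra link fields */
   typedef struct traced_object {
       struct traced_object *next;   /* extra field: following live object */
       struct traced_object *prev;   /* extra field: preceding live object */
       long refcnt;
   } traced_object;

   /* Sentinel node: an empty circular list points back at itself */
   static traced_object live_objects = { &live_objects, &live_objects, 1 };

   /* Link a newly created object in right after the sentinel */
   void
   track(traced_object *op)
   {
       op->next = live_objects.next;
       op->prev = &live_objects;
       live_objects.next->prev = op;
       live_objects.next = op;
   }

   /* Unlink an object that is about to be destroyed */
   void
   untrack(traced_object *op)
   {
       op->prev->next = op->next;
       op->next->prev = op->prev;
       op->next = op->prev = NULL;
   }

   /* Walk the list once; this is the kind of traversal that allows all
      remaining objects to be reported at exit */
   size_t
   count_live(void)
   {
       size_t n = 0;
       const traced_object *op;

       for (op = live_objects.next; op != &live_objects; op = op->next)
           n++;
       return n;
   }

The sentinel keeps insertion and removal free of special cases for an empty
list, at the cost of two pointers in every object.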

Please refer to :file:`Misc/SpecialBuilds.txt` in the Python source distribution
for more detailed information.
