\chapter{Introduction \label{intro}}


The Application Programmer's Interface to Python gives C and
\Cpp{} programmers access to the Python interpreter at a variety of
levels. The API is equally usable from \Cpp, but for brevity it is
generally referred to as the Python/C API. There are two
fundamentally different reasons for using the Python/C API. The first
reason is to write \emph{extension modules} for specific purposes;
these are C modules that extend the Python interpreter. This is
probably the most common use. The second reason is to use Python as a
component in a larger application; this technique is generally
referred to as \dfn{embedding} Python in an application.

Writing an extension module is a relatively well-understood process,
where a ``cookbook'' approach works well. There are several tools
that automate the process to some extent. While people have embedded
Python in other applications since its early existence, the process of
embedding Python is less straightforward than writing an extension.

Many API functions are useful independent of whether you're embedding
or extending Python; moreover, most applications that embed Python
will need to provide a custom extension as well, so it's probably a
good idea to become familiar with writing an extension before
attempting to embed Python in a real application.


\section{Include Files \label{includes}}

All function, type and macro definitions needed to use the Python/C
API are included in your code by the following line:

\begin{verbatim}
#include "Python.h"
\end{verbatim}

This implies inclusion of the following standard headers:
\code{<stdio.h>}, \code{<string.h>}, \code{<errno.h>},
\code{<limits.h>}, and \code{<stdlib.h>} (if available).

\begin{notice}[warning]
  Since Python may define some pre-processor definitions which affect
  the standard headers on some systems, you \emph{must} include
  \file{Python.h} before any standard headers are included.
\end{notice}

All user-visible names defined by \file{Python.h} (except those defined
by the included standard headers) have one of the prefixes \samp{Py} or
\samp{_Py}. Names beginning with \samp{_Py} are for internal use by
the Python implementation and should not be used by extension writers.
Structure member names do not have a reserved prefix.

\strong{Important:} user code should never define names that begin
with \samp{Py} or \samp{_Py}. This confuses the reader, and
jeopardizes the portability of the user code to future Python
versions, which may define additional names beginning with one of
these prefixes.

The header files are typically installed with Python. On \UNIX, these
are located in the directories
\file{\envvar{prefix}/include/python\var{version}/} and
\file{\envvar{exec_prefix}/include/python\var{version}/}, where
\envvar{prefix} and \envvar{exec_prefix} are defined by the
corresponding parameters to Python's \program{configure} script and
\var{version} is \code{sys.version[:3]}. On Windows, the headers are
installed in \file{\envvar{prefix}/include}, where \envvar{prefix} is
the installation directory specified to the installer.

To include the headers, place both directories (if different) on your
compiler's search path for includes. Do \emph{not} place the parent
directories on the search path and then use
\samp{\#include <python\shortversion/Python.h>}; this will break on
multi-platform builds since the platform-independent headers under
\envvar{prefix} include the platform-specific headers from
\envvar{exec_prefix}.

\Cpp{} users should note that though the API is defined entirely using
C, the header files do properly declare the entry points to be
\code{extern "C"}, so there is no need to do anything special to use
the API from \Cpp.


\section{Objects, Types and Reference Counts \label{objects}}

Most Python/C API functions have one or more arguments as well as a
return value of type \ctype{PyObject*}. This type is a pointer
to an opaque data type representing an arbitrary Python
object. Since all Python object types are treated the same way by the
Python language in most situations (e.g., assignments, scope rules,
and argument passing), it is only fitting that they should be
represented by a single C type. Almost all Python objects live on the
heap: you never declare an automatic or static variable of type
\ctype{PyObject}; only pointer variables of type \ctype{PyObject*} can
be declared. The type objects\obindex{type} are the sole exception;
since these must never be deallocated, they are typically static
\ctype{PyTypeObject} objects.

All Python objects (even Python integers) have a \dfn{type} and a
\dfn{reference count}. An object's type determines what kind of object
it is (e.g., an integer, a list, or a user-defined function; there are
many more as explained in the \citetitle[../ref/ref.html]{Python
Reference Manual}). For each of the well-known types there is a macro
to check whether an object is of that type; for instance,
\samp{PyList_Check(\var{a})} is true if (and only if) the object
pointed to by \var{a} is a Python list.


\subsection{Reference Counts \label{refcounts}}

The reference count is important because today's computers have a
finite (and often severely limited) memory size; it counts how many
different places there are that have a reference to an object. Such a
place could be another object, or a global (or static) C variable, or
a local variable in some C function. When an object's reference count
becomes zero, the object is deallocated. If it contains references to
other objects, their reference counts are decremented. Those other
objects may be deallocated in turn, if this decrement makes their
reference count become zero, and so on. (There's an obvious problem
with objects that reference each other here; for now, the solution is
``don't do that.'')

Reference counts are always manipulated explicitly. The normal way is
to use the macro \cfunction{Py_INCREF()}\ttindex{Py_INCREF()} to
increment an object's reference count by one, and
\cfunction{Py_DECREF()}\ttindex{Py_DECREF()} to decrement it by
one. The \cfunction{Py_DECREF()} macro is considerably more complex
than the incref one, since it must check whether the reference count
becomes zero and then cause the object's deallocator to be called.
The deallocator is a function pointer contained in the object's type
structure. The type-specific deallocator takes care of decrementing
the reference counts for other objects contained in the object if this
is a compound object type, such as a list, as well as performing any
additional finalization that's needed. There's no chance that the
reference count can overflow; at least as many bits are used to hold
the reference count as there are distinct memory locations in virtual
memory (assuming \code{sizeof(long) >= sizeof(char*)}). Thus, the
reference count increment is a simple operation.

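The cascade described above can be illustrated with a toy model in
plain C. This is only a sketch under stated assumptions: the
\ctype{toy_object} structure and the helpers \cfunction{toy_new()},
\cfunction{toy_incref()} and \cfunction{toy_decref()} are hypothetical
names invented for this illustration and are not part of the Python/C
API. Real objects are managed with \cfunction{Py_INCREF()} and
\cfunction{Py_DECREF()}, and the work hard-coded in
\cfunction{toy_decref()} is done by a type-specific deallocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy object: NOT the real PyObject layout, just enough to show the idea. */
typedef struct toy_object {
    long refcnt;                /* number of places holding a reference */
    struct toy_object *child;   /* one owned reference, or NULL */
} toy_object;

static int toy_live = 0;        /* objects currently allocated */

/* Create an object with one reference.  On success the new object takes
   over the caller's reference to child; on failure child is untouched. */
static toy_object *
toy_new(toy_object *child)
{
    toy_object *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refcnt = 1;
    o->child = child;
    toy_live++;
    return o;
}

/* Incrementing is a single operation. */
static void
toy_incref(toy_object *o)
{
    o->refcnt++;
}

/* Decrementing must check for zero and, if so, deallocate. */
static void
toy_decref(toy_object *o)
{
    assert(o->refcnt > 0);
    if (--o->refcnt == 0) {
        toy_object *child = o->child;
        free(o);
        toy_live--;
        if (child != NULL)
            toy_decref(child);  /* may free the child in turn */
    }
}
```

If a leaf object is owned by a parent, taking an extra reference to the
leaf before releasing the parent keeps the leaf alive; releasing that
last reference then frees it as well.
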
It is not necessary to increment an object's reference count for every
local variable that contains a pointer to an object. In theory, the
object's reference count goes up by one when the variable is made to
point to it and it goes down by one when the variable goes out of
scope. However, these two cancel each other out, so at the end the
reference count hasn't changed. The only real reason to use the
reference count is to prevent the object from being deallocated as
long as our variable is pointing to it. If we know that there is at
least one other reference to the object that lives at least as long as
our variable, there is no need to increment the reference count
temporarily. An important situation where this arises is in objects
that are passed as arguments to C functions in an extension module
that are called from Python; the call mechanism guarantees to hold a
reference to every argument for the duration of the call.

However, a common pitfall is to extract an object from a list and
hold on to it for a while without incrementing its reference count.
Some other operation might conceivably remove the object from the
list, decrementing its reference count and possibly deallocating it.
The real danger is that innocent-looking operations may invoke
arbitrary Python code which could do this; there is a code path which
allows control to flow back to the user from a \cfunction{Py_DECREF()},
so almost any operation is potentially dangerous.

A safe approach is to always use the generic operations (functions
whose name begins with \samp{PyObject_}, \samp{PyNumber_},
\samp{PySequence_} or \samp{PyMapping_}). These operations always
increment the reference count of the object they return. This leaves
the caller with the responsibility to call
\cfunction{Py_DECREF()} when they are done with the result; this soon
becomes second nature.


\subsubsection{Reference Count Details \label{refcountDetails}}

The reference count behavior of functions in the Python/C API is best
explained in terms of \emph{ownership of references}. Ownership
pertains to references, never to objects (objects are not owned: they
are always shared). ``Owning a reference'' means being responsible for
calling \cfunction{Py_DECREF()} on it when the reference is no longer
needed. Ownership can also be transferred, meaning that the code that
receives ownership of the reference then becomes responsible for
eventually decref'ing it by calling \cfunction{Py_DECREF()} or
\cfunction{Py_XDECREF()} when it's no longer needed, or for passing on
this responsibility (usually to its caller).
When a function passes ownership of a reference on to its caller, the
caller is said to receive a \emph{new} reference. When no ownership
is transferred, the caller is said to \emph{borrow} the reference.
Nothing needs to be done for a borrowed reference.

Conversely, when a calling function passes in a reference to an
object, there are two possibilities: the function \emph{steals} a
reference to the object, or it does not. Few functions steal
references; the two notable exceptions are
\cfunction{PyList_SetItem()}\ttindex{PyList_SetItem()} and
\cfunction{PyTuple_SetItem()}\ttindex{PyTuple_SetItem()}, which
steal a reference to the item (but not to the tuple or list into which
the item is put!). These functions were designed to steal a reference
because of a common idiom for populating a tuple or list with newly
created objects; for example, the code to create the tuple \code{(1,
2, "three")} could look like this (forgetting about error handling for
the moment; a better way to code this is shown below):

\begin{verbatim}
PyObject *t;

t = PyTuple_New(3);
PyTuple_SetItem(t, 0, PyInt_FromLong(1L));
PyTuple_SetItem(t, 1, PyInt_FromLong(2L));
PyTuple_SetItem(t, 2, PyString_FromString("three"));
\end{verbatim}

Incidentally, \cfunction{PyTuple_SetItem()} is the \emph{only} way to
set tuple items; \cfunction{PySequence_SetItem()} and
\cfunction{PyObject_SetItem()} refuse to do this since tuples are an
immutable data type. You should only use
\cfunction{PyTuple_SetItem()} for tuples that you are creating
yourself.

Equivalent code for populating a list can be written using
\cfunction{PyList_New()} and \cfunction{PyList_SetItem()}. Such code
can also use \cfunction{PySequence_SetItem()}; this illustrates the
difference between the two (the extra \cfunction{Py_DECREF()} calls):

\begin{verbatim}
PyObject *l, *x;

l = PyList_New(3);
x = PyInt_FromLong(1L);
PySequence_SetItem(l, 0, x); Py_DECREF(x);
x = PyInt_FromLong(2L);
PySequence_SetItem(l, 1, x); Py_DECREF(x);
x = PyString_FromString("three");
PySequence_SetItem(l, 2, x); Py_DECREF(x);
\end{verbatim}

You might find it strange that the ``recommended'' approach takes more
code. However, in practice, you will rarely use these ways of
creating and populating a tuple or list. There's a generic function,
\cfunction{Py_BuildValue()}, that can create most common objects from
C values, directed by a \dfn{format string}. For example, the
above two blocks of code could be replaced by the following (which
also takes care of the error checking):

\begin{verbatim}
PyObject *t, *l;

t = Py_BuildValue("(iis)", 1, 2, "three");
l = Py_BuildValue("[iis]", 1, 2, "three");
\end{verbatim}

It is much more common to use \cfunction{PyObject_SetItem()} and
friends with items whose references you are only borrowing, like
arguments that were passed in to the function you are writing. In
that case, their behavior regarding reference counts is much saner,
since you don't have to increment a reference count so you can give a
reference away (``have it be stolen''). For example, this function
sets all items of a list (actually, any mutable sequence) to a given
item:

\begin{verbatim}
int
set_all(PyObject *target, PyObject *item)
{
    int i, n;

    n = PyObject_Length(target);
    if (n < 0)
        return -1;
    for (i = 0; i < n; i++) {
        /* PyObject_SetItem() takes the key as an object, not a C int */
        PyObject *index = PyInt_FromLong((long)i);
        if (index == NULL)
            return -1;
        if (PyObject_SetItem(target, index, item) < 0) {
            Py_DECREF(index);
            return -1;
        }
        Py_DECREF(index);
    }
    return 0;
}
\end{verbatim}
\ttindex{set_all()}

The situation is slightly different for function return values.
While passing a reference to most functions does not change your
ownership responsibilities for that reference, many functions that
return a reference to an object give you ownership of the reference.
The reason is simple: in many cases, the returned object is created
on the fly, and the reference you get is the only reference to the
object. Therefore, the generic functions that return object
references, like \cfunction{PyObject_GetItem()} and
\cfunction{PySequence_GetItem()}, always return a new reference (the
caller becomes the owner of the reference).

It is important to realize that whether you own a reference returned
by a function depends on which function you call only: \emph{the
plumage} (the type of the object passed as an
argument to the function) \emph{doesn't enter into it!} Thus, if you
extract an item from a list using \cfunction{PyList_GetItem()}, you
don't own the reference; but if you obtain the same item from the
same list using \cfunction{PySequence_GetItem()} (which happens to
take exactly the same arguments), you do own a reference to the
returned object.

Here is an example of how you could write a function that computes the
sum of the items in a list of integers; once using
\cfunction{PyList_GetItem()}\ttindex{PyList_GetItem()}, and once using
\cfunction{PySequence_GetItem()}\ttindex{PySequence_GetItem()}.

\begin{verbatim}
long
sum_list(PyObject *list)
{
    int i, n;
    long total = 0;
    PyObject *item;

    n = PyList_Size(list);
    if (n < 0)
        return -1; /* Not a list */
    for (i = 0; i < n; i++) {
        item = PyList_GetItem(list, i); /* Can't fail */
        if (!PyInt_Check(item)) continue; /* Skip non-integers */
        total += PyInt_AsLong(item);
    }
    return total;
}
\end{verbatim}
\ttindex{sum_list()}

\begin{verbatim}
long
sum_sequence(PyObject *sequence)
{
    int i, n;
    long total = 0;
    PyObject *item;
    n = PySequence_Length(sequence);
    if (n < 0)
        return -1; /* Has no length */
    for (i = 0; i < n; i++) {
        item = PySequence_GetItem(sequence, i);
        if (item == NULL)
            return -1; /* Not a sequence, or other failure */
        if (PyInt_Check(item))
            total += PyInt_AsLong(item);
        Py_DECREF(item); /* Discard reference ownership */
    }
    return total;
}
\end{verbatim}
\ttindex{sum_sequence()}


\subsection{Types \label{types}}

There are few other data types that play a significant role in
the Python/C API; most are simple C types such as \ctype{int},
\ctype{long}, \ctype{double} and \ctype{char*}. A few structure types
are used to describe static tables used to list the functions exported
by a module or the data attributes of a new object type, and another
is used to describe the value of a complex number. These will
be discussed together with the functions that use them.


\section{Exceptions \label{exceptions}}

The Python programmer only needs to deal with exceptions if specific
error handling is required; unhandled exceptions are automatically
propagated to the caller, then to the caller's caller, and so on, until
they reach the top-level interpreter, where they are reported to the
user accompanied by a stack traceback.

For C programmers, however, error checking always has to be explicit.
All functions in the Python/C API can raise exceptions, unless an
explicit claim is made otherwise in a function's documentation. In
general, when a function encounters an error, it sets an exception,
discards any object references that it owns, and returns an
error indicator, usually \NULL{} or \code{-1}. A few functions
return a Boolean true/false result, with false indicating an error.
Very few functions return no explicit error indicator or have an
ambiguous return value, and require explicit testing for errors with
\cfunction{PyErr_Occurred()}\ttindex{PyErr_Occurred()}.

Exception state is maintained in per-thread storage (this is
equivalent to using global storage in an unthreaded application). A
thread can be in one of two states: an exception has occurred, or not.
The function \cfunction{PyErr_Occurred()} can be used to check for
this: it returns a borrowed reference to the exception type object
when an exception has occurred, and \NULL{} otherwise. There are a
number of functions to set the exception state:
\cfunction{PyErr_SetString()}\ttindex{PyErr_SetString()} is the most
common (though not the most general) function to set the exception
state, and \cfunction{PyErr_Clear()}\ttindex{PyErr_Clear()} clears the
exception state.

The full exception state consists of three objects (all of which can
be \NULL): the exception type, the corresponding exception
value, and the traceback. These have the same meanings as the Python
\withsubitem{(in module sys)}{
  \ttindex{exc_type}\ttindex{exc_value}\ttindex{exc_traceback}}
objects \code{sys.exc_type}, \code{sys.exc_value}, and
\code{sys.exc_traceback}; however, they are not the same: the Python
objects represent the last exception being handled by a Python
\keyword{try} \ldots\ \keyword{except} statement, while the C level
exception state only exists while an exception is being passed on
between C functions until it reaches the Python bytecode interpreter's
main loop, which takes care of transferring it to \code{sys.exc_type}
and friends.

Note that starting with Python 1.5, the preferred, thread-safe way to
access the exception state from Python code is to call the function
\withsubitem{(in module sys)}{\ttindex{exc_info()}}
\function{sys.exc_info()}, which returns the per-thread exception state
for Python code. Also, the semantics of both ways to access the
exception state have changed so that a function which catches an
exception will save and restore its thread's exception state so as to
preserve the exception state of its caller. This prevents common bugs
in exception handling code caused by an innocent-looking function
overwriting the exception being handled; it also reduces the often
unwanted lifetime extension for objects that are referenced by the
stack frames in the traceback.

As a general principle, a function that calls another function to
perform some task should check whether the called function raised an
exception, and if so, pass the exception state on to its caller. It
should discard any object references that it owns, and return an
error indicator, but it should \emph{not} set another exception;
that would overwrite the exception that was just raised, and lose
important information about the exact cause of the error.

A simple example of detecting exceptions and passing them on is shown
in the \cfunction{sum_sequence()}\ttindex{sum_sequence()} example
above. It so happens that that example doesn't need to clean up any
owned references when it detects an error. The following example
function shows some error cleanup. First, to remind you why you like
Python, we show the equivalent Python code:

\begin{verbatim}
def incr_item(dict, key):
    try:
        item = dict[key]
    except KeyError:
        item = 0
    dict[key] = item + 1
\end{verbatim}
\ttindex{incr_item()}

Here is the corresponding C code, in all its glory:

\begin{verbatim}
int
incr_item(PyObject *dict, PyObject *key)
{
    /* Objects all initialized to NULL for Py_XDECREF */
    PyObject *item = NULL, *const_one = NULL, *incremented_item = NULL;
    int rv = -1; /* Return value initialized to -1 (failure) */

    item = PyObject_GetItem(dict, key);
    if (item == NULL) {
        /* Handle KeyError only: */
        if (!PyErr_ExceptionMatches(PyExc_KeyError))
            goto error;

        /* Clear the error and use zero: */
        PyErr_Clear();
        item = PyInt_FromLong(0L);
        if (item == NULL)
            goto error;
    }
    const_one = PyInt_FromLong(1L);
    if (const_one == NULL)
        goto error;

    incremented_item = PyNumber_Add(item, const_one);
    if (incremented_item == NULL)
        goto error;

    if (PyObject_SetItem(dict, key, incremented_item) < 0)
        goto error;
    rv = 0; /* Success */
    /* Continue with cleanup code */

 error:
    /* Cleanup code, shared by success and failure path */

    /* Use Py_XDECREF() to ignore NULL references */
    Py_XDECREF(item);
    Py_XDECREF(const_one);
    Py_XDECREF(incremented_item);

    return rv; /* -1 for error, 0 for success */
}
\end{verbatim}
\ttindex{incr_item()}

This example represents an endorsed use of the \keyword{goto} statement
in C! It illustrates the use of
\cfunction{PyErr_ExceptionMatches()}\ttindex{PyErr_ExceptionMatches()} and
\cfunction{PyErr_Clear()}\ttindex{PyErr_Clear()} to
handle specific exceptions, and the use of
\cfunction{Py_XDECREF()}\ttindex{Py_XDECREF()} to
dispose of owned references that may be \NULL{} (note the
\character{X} in the name; \cfunction{Py_DECREF()} would crash when
confronted with a \NULL{} reference). It is important that the
variables used to hold owned references are initialized to \NULL{} for
this to work; likewise, the proposed return value is initialized to
\code{-1} (failure) and only set to success after the final call made
is successful.


\section{Embedding Python \label{embedding}}

The one important task that only embedders (as opposed to extension
writers) of the Python interpreter have to worry about is the
initialization, and possibly the finalization, of the Python
interpreter. Most functionality of the interpreter can only be used
after the interpreter has been initialized.

The basic initialization function is
\cfunction{Py_Initialize()}\ttindex{Py_Initialize()}.
This initializes the table of loaded modules, and creates the
fundamental modules \module{__builtin__}\refbimodindex{__builtin__},
\module{__main__}\refbimodindex{__main__}, \module{sys}\refbimodindex{sys},
and \module{exceptions}.\refbimodindex{exceptions} It also initializes
the module search path (\code{sys.path}).%
\indexiii{module}{search}{path}
\withsubitem{(in module sys)}{\ttindex{path}}

\cfunction{Py_Initialize()} does not set the ``script argument list''
(\code{sys.argv}). If this variable is needed by Python code that
will be executed later, it must be set explicitly with a call to
\code{PySys_SetArgv(\var{argc},
\var{argv})}\ttindex{PySys_SetArgv()} subsequent to the call to
\cfunction{Py_Initialize()}.

On most systems (in particular, on \UNIX{} and Windows, although the
details are slightly different),
\cfunction{Py_Initialize()} calculates the module search path based
upon its best guess for the location of the standard Python
interpreter executable, assuming that the Python library is found in a
fixed location relative to the Python interpreter executable. In
particular, it looks for a directory named
\file{lib/python\shortversion} relative to the parent directory where
the executable named \file{python} is found on the shell command
search path (the environment variable \envvar{PATH}).

For instance, if the Python executable is found in
\file{/usr/local/bin/python}, it will assume that the libraries are in
\file{/usr/local/lib/python\shortversion}. (In fact, this particular path
is also the ``fallback'' location, used when no executable file named
\file{python} is found along \envvar{PATH}.) The user can override
this behavior by setting the environment variable \envvar{PYTHONHOME},
or insert additional directories in front of the standard path by
setting \envvar{PYTHONPATH}.

The embedding application can steer the search by calling
\code{Py_SetProgramName(\var{file})}\ttindex{Py_SetProgramName()} \emph{before} calling
\cfunction{Py_Initialize()}. Note that \envvar{PYTHONHOME} still
overrides this and \envvar{PYTHONPATH} is still inserted in front of
the standard path. An application that requires total control has to
provide its own implementation of
\cfunction{Py_GetPath()}\ttindex{Py_GetPath()},
\cfunction{Py_GetPrefix()}\ttindex{Py_GetPrefix()},
\cfunction{Py_GetExecPrefix()}\ttindex{Py_GetExecPrefix()}, and
\cfunction{Py_GetProgramFullPath()}\ttindex{Py_GetProgramFullPath()} (all
defined in \file{Modules/getpath.c}).

Sometimes, it is desirable to ``uninitialize'' Python. For instance,
the application may want to start over (make another call to
\cfunction{Py_Initialize()}) or the application is simply done with its
use of Python and wants to free all memory allocated by Python. This
can be accomplished by calling \cfunction{Py_Finalize()}. The function
\cfunction{Py_IsInitialized()}\ttindex{Py_IsInitialized()} returns
true if Python is currently in the initialized state. More
information about these functions is given in a later chapter.