.. highlightlang:: c


.. _extending-intro:

******************************
Extending Python with C or C++
******************************

It is quite easy to add new built-in modules to Python, if you know how to
program in C. Such :dfn:`extension modules` can do two things that can't be
done directly in Python: they can implement new built-in object types, and they
can call C library functions and system calls.

To support extensions, the Python API (Application Programmer's Interface)
defines a set of functions, macros and variables that provide access to most
aspects of the Python run-time system. The Python API is incorporated in a C
source file by including the header ``"Python.h"``.

The compilation of an extension module depends on its intended use as well as on
your system setup; details are given in later chapters.


.. _extending-simpleexample:

A Simple Example
================

Let's create an extension module called ``spam`` (the favorite food of Monty
Python fans...) and let's say we want to create a Python interface to the C
library function :cfunc:`system`. [#]_ This function takes a null-terminated
character string as argument and returns an integer. We want this function to
be callable from Python as follows::

   >>> import spam
   >>> status = spam.system("ls -l")

Begin by creating a file :file:`spammodule.c`. (Historically, if a module is
called ``spam``, the C file containing its implementation is called
:file:`spammodule.c`; if the module name is very long, like ``spammify``, the
file name can be just :file:`spammify.c`.)

The first line of our file can be::

   #include <Python.h>

which pulls in the Python API (you can add a comment describing the purpose of
the module and a copyright notice if you like).

.. note::

   Since Python may define some pre-processor definitions which affect the standard
   headers on some systems, you *must* include :file:`Python.h` before any standard
   headers are included.

All user-visible symbols defined by :file:`Python.h` have a prefix of ``Py`` or
``PY``, except those defined in standard header files. For convenience, and
since they are used extensively by the Python interpreter, ``"Python.h"``
includes a few standard header files: ``<stdio.h>``, ``<string.h>``,
``<errno.h>``, and ``<stdlib.h>``. If the latter header file does not exist on
your system, it declares the functions :cfunc:`malloc`, :cfunc:`free` and
:cfunc:`realloc` directly.

The next thing we add to our module file is the C function that will be called
when the Python expression ``spam.system(string)`` is evaluated (we'll see
shortly how it ends up being called)::

   static PyObject *
   spam_system(PyObject *self, PyObject *args)
   {
       const char *command;
       int sts;

       if (!PyArg_ParseTuple(args, "s", &command))
           return NULL;
       sts = system(command);
       return Py_BuildValue("i", sts);
   }

There is a straightforward translation from the argument list in Python (for
example, the single expression ``"ls -l"``) to the arguments passed to the C
function. The C function always has two arguments, conventionally named *self*
and *args*.

The *self* argument points to the module object for module-level functions;
for a method it would point to the object instance.

The *args* argument will be a pointer to a Python tuple object containing the
arguments. Each item of the tuple corresponds to an argument in the call's
argument list. The arguments are Python objects --- in order to do anything
with them in our C function we have to convert them to C values. The function
:cfunc:`PyArg_ParseTuple` in the Python API checks the argument types and
converts them to C values. It uses a template string to determine the required
types of the arguments as well as the types of the C variables into which to
store the converted values. More about this later.

:cfunc:`PyArg_ParseTuple` returns true (nonzero) if all arguments have the right
type and its components have been stored in the variables whose addresses are
passed. It returns false (zero) if an invalid argument list was passed. In the
latter case it also raises an appropriate exception so the calling function can
return *NULL* immediately (as we saw in the example).


.. _extending-errors:

Intermezzo: Errors and Exceptions
=================================

An important convention throughout the Python interpreter is the following: when
a function fails, it should set an exception condition and return an error value
(usually a *NULL* pointer). Exceptions are stored in a static global variable
inside the interpreter; if this variable is *NULL* no exception has occurred. A
second global variable stores the "associated value" of the exception (the
second argument to :keyword:`raise`). A third variable contains the stack
traceback in case the error originated in Python code. These three variables
are the C equivalents of the result in Python of :meth:`sys.exc_info` (see the
section on module :mod:`sys` in the Python Library Reference). It is important
to know about them to understand how errors are passed around.

The Python API defines a number of functions to set various types of exceptions.

The most common one is :cfunc:`PyErr_SetString`. Its arguments are an exception
object and a C string. The exception object is usually a predefined object like
:cdata:`PyExc_ZeroDivisionError`. The C string indicates the cause of the error
and is converted to a Python string object and stored as the "associated value"
of the exception.

Another useful function is :cfunc:`PyErr_SetFromErrno`, which only takes an
exception argument and constructs the associated value by inspection of the
global variable :cdata:`errno`. The most general function is
:cfunc:`PyErr_SetObject`, which takes two object arguments, the exception and
its associated value. You don't need to :cfunc:`Py_INCREF` the objects passed
to any of these functions.

You can test non-destructively whether an exception has been set with
:cfunc:`PyErr_Occurred`. This returns the current exception object, or *NULL*
if no exception has occurred. You normally don't need to call
:cfunc:`PyErr_Occurred` to see whether an error occurred in a function call,
since you should be able to tell from the return value.

When a function *f* that calls another function *g* detects that the latter
fails, *f* should itself return an error value (usually *NULL* or ``-1``). It
should *not* call one of the :cfunc:`PyErr_\*` functions --- one has already
been called by *g*. *f*'s caller is then supposed to also return an error
indication to *its* caller, again *without* calling :cfunc:`PyErr_\*`, and so on
--- the most detailed cause of the error was already reported by the function
that first detected it. Once the error reaches the Python interpreter's main
loop, this aborts the currently executing Python code and tries to find an
exception handler specified by the Python programmer.

(There are situations where a module can actually give a more detailed error
message by calling another :cfunc:`PyErr_\*` function, and in such cases it is
fine to do so. As a general rule, however, this is not necessary, and can cause
information about the cause of the error to be lost: most operations can fail
for a variety of reasons.)

To ignore an exception set by a function call that failed, the exception
condition must be cleared explicitly by calling :cfunc:`PyErr_Clear`. The only
time C code should call :cfunc:`PyErr_Clear` is if it doesn't want to pass the
error on to the interpreter but wants to handle it completely by itself
(possibly by trying something else, or pretending nothing went wrong).

Every failing :cfunc:`malloc` call must be turned into an exception --- the
direct caller of :cfunc:`malloc` (or :cfunc:`realloc`) must call
:cfunc:`PyErr_NoMemory` and return a failure indicator itself. All the
object-creating functions (for example, :cfunc:`PyLong_FromLong`) already do
this, so this note is only relevant to those who call :cfunc:`malloc` directly.

Also note that, with the important exception of :cfunc:`PyArg_ParseTuple` and
friends, functions that return an integer status usually return a positive value
or zero for success and ``-1`` for failure, like Unix system calls.

Finally, be careful to clean up garbage (by making :cfunc:`Py_XDECREF` or
:cfunc:`Py_DECREF` calls for objects you have already created) when you return
an error indicator!

The choice of which exception to raise is entirely yours. There are predeclared
C objects corresponding to all built-in Python exceptions, such as
:cdata:`PyExc_ZeroDivisionError`, which you can use directly. Of course, you
should choose exceptions wisely --- don't use :cdata:`PyExc_TypeError` to mean
that a file couldn't be opened (that should probably be :cdata:`PyExc_IOError`).
If something's wrong with the argument list, the :cfunc:`PyArg_ParseTuple`
function usually raises :cdata:`PyExc_TypeError`. If you have an argument whose
value must be in a particular range or must satisfy other conditions,
:cdata:`PyExc_ValueError` is appropriate.

You can also define a new exception that is unique to your module. For this, you
usually declare a static object variable at the beginning of your file::

   static PyObject *SpamError;

and initialize it in your module's initialization function (:cfunc:`PyInit_spam`)
with an exception object (leaving out the error checking for now)::

   PyMODINIT_FUNC
   PyInit_spam(void)
   {
       PyObject *m;

       m = PyModule_Create(&spammodule);
       if (m == NULL)
           return NULL;

       SpamError = PyErr_NewException("spam.error", NULL, NULL);
       Py_INCREF(SpamError);
       PyModule_AddObject(m, "error", SpamError);
       return m;
   }

Note that the Python name for the exception object is :exc:`spam.error`. The
:cfunc:`PyErr_NewException` function may create a class with the base class
being :exc:`Exception` (unless another class is passed in instead of *NULL*),
described in :ref:`bltin-exceptions`.

Note also that the :cdata:`SpamError` variable retains a reference to the newly
created exception class; this is intentional! Since the exception could be
removed from the module by external code, an owned reference to the class is
needed to ensure that it will not be discarded, causing :cdata:`SpamError` to
become a dangling pointer. Should it become a dangling pointer, C code which
raises the exception could cause a core dump or other unintended side effects.

We discuss the use of ``PyMODINIT_FUNC`` as a function return type later in this
sample.

The :exc:`spam.error` exception can be raised in your extension module using a
call to :cfunc:`PyErr_SetString` as shown below::

   static PyObject *
   spam_system(PyObject *self, PyObject *args)
   {
       const char *command;
       int sts;

       if (!PyArg_ParseTuple(args, "s", &command))
           return NULL;
       sts = system(command);
       if (sts < 0) {
           PyErr_SetString(SpamError, "System command failed");
           return NULL;
       }
       return PyLong_FromLong(sts);
   }


.. _backtoexample:

Back to the Example
===================

Going back to our example function, you should now be able to understand this
statement::

   if (!PyArg_ParseTuple(args, "s", &command))
       return NULL;

It returns *NULL* (the error indicator for functions returning object pointers)
if an error is detected in the argument list, relying on the exception set by
:cfunc:`PyArg_ParseTuple`. Otherwise the string value of the argument has been
copied to the local variable :cdata:`command`. This is a pointer assignment and
you are not supposed to modify the string to which it points (so in Standard C,
the variable :cdata:`command` should properly be declared as
``const char *command``).

The next statement is a call to the Unix function :cfunc:`system`, passing it
the string we just got from :cfunc:`PyArg_ParseTuple`::

   sts = system(command);

Our :func:`spam.system` function must return the value of :cdata:`sts` as a
Python object. This is done using the function :cfunc:`Py_BuildValue`, which is
something like the inverse of :cfunc:`PyArg_ParseTuple`: it takes a format
string and an arbitrary number of C values, and returns a new Python object.
More info on :cfunc:`Py_BuildValue` is given later. ::

   return Py_BuildValue("i", sts);

In this case, it will return an integer object. (Yes, even integers are objects
on the heap in Python!)

If you have a C function that returns no useful value (a function returning
:ctype:`void`), the corresponding Python function must return ``None``. You
need this idiom to do so (which is implemented by the :cmacro:`Py_RETURN_NONE`
macro)::

   Py_INCREF(Py_None);
   return Py_None;

:cdata:`Py_None` is the C name for the special Python object ``None``. It is a
genuine Python object rather than a *NULL* pointer, which means "error" in most
contexts, as we have seen.


.. _methodtable:

The Module's Method Table and Initialization Function
=====================================================

I promised to show how :cfunc:`spam_system` is called from Python programs.
First, we need to list its name and address in a "method table"::

   static PyMethodDef SpamMethods[] = {
       ...
       {"system", spam_system, METH_VARARGS,
        "Execute a shell command."},
       ...
       {NULL, NULL, 0, NULL}        /* Sentinel */
   };

Note the third entry (``METH_VARARGS``). This is a flag telling the interpreter
the calling convention to be used for the C function. It should normally always
be ``METH_VARARGS`` or ``METH_VARARGS | METH_KEYWORDS``; a value of ``0`` means
that an obsolete variant of :cfunc:`PyArg_ParseTuple` is used.

When using only ``METH_VARARGS``, the function should expect the Python-level
parameters to be passed in as a tuple acceptable for parsing via
:cfunc:`PyArg_ParseTuple`; more information on this function is provided below.

The :const:`METH_KEYWORDS` bit may be set in the third field if keyword
arguments should be passed to the function. In this case, the C function should
accept a third ``PyObject *`` parameter which will be a dictionary of keywords.
Use :cfunc:`PyArg_ParseTupleAndKeywords` to parse the arguments to such a
function.

The method table must be referenced in the module definition structure::

   static struct PyModuleDef spammodule = {
      PyModuleDef_HEAD_INIT,
      "spam",   /* name of module */
      spam_doc, /* module documentation, may be NULL */
      -1,       /* size of per-interpreter state of the module,
                   or -1 if the module keeps state in global variables. */
      SpamMethods
   };

This structure, in turn, must be passed to the interpreter in the module's
initialization function. The initialization function must be named
:cfunc:`PyInit_name`, where *name* is the name of the module, and should be the
only non-\ ``static`` item defined in the module file::

   PyMODINIT_FUNC
   PyInit_spam(void)
   {
       return PyModule_Create(&spammodule);
   }

Note that ``PyMODINIT_FUNC`` declares the function with a ``PyObject *`` return
type, adds any special linkage declarations required by the platform, and for
C++ declares the function as ``extern "C"``.

When the Python program imports module :mod:`spam` for the first time,
:cfunc:`PyInit_spam` is called. (See below for comments about embedding Python.)
It calls :cfunc:`PyModule_Create`, which returns a module object, and
inserts built-in function objects into the newly created module based upon the
table (an array of :ctype:`PyMethodDef` structures) found in the module definition.
:cfunc:`PyModule_Create` returns a pointer to the module object
that it creates. It may abort with a fatal error for
certain errors, or return *NULL* if the module could not be initialized
satisfactorily. The init function must return the module object to its caller,
so that it then gets inserted into ``sys.modules``.

When embedding Python, the :cfunc:`PyInit_spam` function is not called
automatically unless there's an entry in the :cdata:`PyImport_Inittab` table.
To add the module to the initialization table, use :cfunc:`PyImport_AppendInittab`,
optionally followed by an import of the module::

   int
   main(int argc, char *argv[])
   {
       /* Add a builtin module, before Py_Initialize */
       PyImport_AppendInittab("spam", PyInit_spam);

       /* Pass argv[0] to the Python interpreter */
       Py_SetProgramName(argv[0]);

       /* Initialize the Python interpreter.  Required. */
       Py_Initialize();

       /* Optionally import the module; alternatively,
          import can be deferred until the embedded script
          imports it. */
       PyImport_ImportModule("spam");

       ...
   }

An example may be found in the file :file:`Demo/embed/demo.c` in the Python
source distribution.

.. note::

   Removing entries from ``sys.modules`` or importing compiled modules into
   multiple interpreters within a process (or following a :cfunc:`fork` without an
   intervening :cfunc:`exec`) can create problems for some extension modules.
   Extension module authors should exercise caution when initializing internal data
   structures.

A more substantial example module is included in the Python source distribution
as :file:`Modules/xxmodule.c`. This file may be used as a template or simply
read as an example. The :program:`modulator.py` script included in the source
distribution or Windows install provides a simple graphical user interface for
declaring the functions and objects which a module should implement, and can
generate a template which can be filled in. The script lives in the
:file:`Tools/modulator/` directory; see the :file:`README` file there for more
information.


.. _compilation:

Compilation and Linkage
=======================

There are two more things to do before you can use your new extension: compiling
and linking it with the Python system. If you use dynamic loading, the details
may depend on the style of dynamic loading your system uses; see the chapters
about building extension modules (chapter :ref:`building`) and additional
information that pertains only to building on Windows (chapter
:ref:`building-on-windows`) for more information about this.

If you can't use dynamic loading, or if you want to make your module a permanent
part of the Python interpreter, you will have to change the configuration setup
and rebuild the interpreter. Luckily, this is very simple on Unix: just place
your file (:file:`spammodule.c` for example) in the :file:`Modules/` directory
of an unpacked source distribution, add a line to the file
:file:`Modules/Setup.local` describing your file::

   spam spammodule.o

and rebuild the interpreter by running :program:`make` in the toplevel
directory. You can also run :program:`make` in the :file:`Modules/`
subdirectory, but then you must first rebuild :file:`Makefile` there by running
':program:`make` Makefile'. (This is necessary each time you change the
:file:`Setup` file.)

If your module requires additional libraries to link with, these can be listed
on the line in the configuration file as well, for instance::

   spam spammodule.o -lX11


.. _callingpython:

Calling Python Functions from C
===============================

So far we have concentrated on making C functions callable from Python. The
reverse is also useful: calling Python functions from C. This is especially the
case for libraries that support so-called "callback" functions. If a C
interface makes use of callbacks, the equivalent Python often needs to provide a
callback mechanism to the Python programmer; the implementation will require
calling the Python callback functions from a C callback. Other uses are also
imaginable.

Fortunately, the Python interpreter is easily called recursively, and there is a
standard interface to call a Python function. (I won't dwell on how to call the
Python parser with a particular string as input --- if you're interested, have a
look at the implementation of the :option:`-c` command line option in
:file:`Modules/main.c` from the Python source code.)

Calling a Python function is easy. First, the Python program must somehow pass
you the Python function object. You should provide a function (or some other
interface) to do this. When this function is called, save a pointer to the
Python function object (be careful to :cfunc:`Py_INCREF` it!) in a global
variable --- or wherever you see fit. For example, the following function might
be part of a module definition::

   static PyObject *my_callback = NULL;

   static PyObject *
   my_set_callback(PyObject *dummy, PyObject *args)
   {
       PyObject *result = NULL;
       PyObject *temp;

       if (PyArg_ParseTuple(args, "O:set_callback", &temp)) {
           if (!PyCallable_Check(temp)) {
               PyErr_SetString(PyExc_TypeError, "parameter must be callable");
               return NULL;
           }
           Py_XINCREF(temp);         /* Add a reference to new callback */
           Py_XDECREF(my_callback);  /* Dispose of previous callback */
           my_callback = temp;       /* Remember new callback */
           /* Boilerplate to return "None" */
           Py_INCREF(Py_None);
           result = Py_None;
       }
       return result;
   }

This function must be registered with the interpreter using the
:const:`METH_VARARGS` flag; this is described in section :ref:`methodtable`. The
:cfunc:`PyArg_ParseTuple` function and its arguments are documented in section
:ref:`parsetuple`.

The macros :cfunc:`Py_XINCREF` and :cfunc:`Py_XDECREF` increment/decrement the
reference count of an object and are safe in the presence of *NULL* pointers
(but note that *temp* will not be *NULL* in this context). More information on
them is given in section :ref:`refcounts`.

.. index:: single: PyObject_CallObject()

Later, when it is time to call the function, you call the C function
:cfunc:`PyObject_CallObject`. This function has two arguments, both pointers to
arbitrary Python objects: the Python function, and the argument list. The
argument list must always be a tuple object, whose length is the number of
arguments. To call the Python function with no arguments, pass in NULL, or
an empty tuple; to call it with one argument, pass a singleton tuple.
:cfunc:`Py_BuildValue` returns a tuple when its format string consists of zero
or more format codes between parentheses. For example::

   int arg;
   PyObject *arglist;
   PyObject *result;
   ...
   arg = 123;
   ...
   /* Time to call the callback */
   arglist = Py_BuildValue("(i)", arg);
   result = PyObject_CallObject(my_callback, arglist);
   Py_DECREF(arglist);

:cfunc:`PyObject_CallObject` returns a Python object pointer: this is the return
value of the Python function. :cfunc:`PyObject_CallObject` is
"reference-count-neutral" with respect to its arguments. In the example a new
tuple was created to serve as the argument list, which is
:cfunc:`Py_DECREF`\ -ed immediately after the call.

The return value of :cfunc:`PyObject_CallObject` is "new": either it is a brand
new object, or it is an existing object whose reference count has been
incremented. So, unless you want to save it in a global variable, you should
somehow :cfunc:`Py_DECREF` the result, even (especially!) if you are not
interested in its value.

Before you do this, however, it is important to check that the return value
isn't *NULL*. If it is, the Python function terminated by raising an exception.
If the C code that called :cfunc:`PyObject_CallObject` is called from Python, it
should now return an error indication to its Python caller, so the interpreter
can print a stack trace, or the calling Python code can handle the exception.
If this is not possible or desirable, the exception should be cleared by calling
:cfunc:`PyErr_Clear`. For example::

   if (result == NULL)
       return NULL; /* Pass error back */
   ...use result...
   Py_DECREF(result);

Depending on the desired interface to the Python callback function, you may also
have to provide an argument list to :cfunc:`PyObject_CallObject`. In some cases
the argument list is also provided by the Python program, through the same
interface that specified the callback function. It can then be saved and used
in the same manner as the function object. In other cases, you may have to
construct a new tuple to pass as the argument list. The simplest way to do this
is to call :cfunc:`Py_BuildValue`. For example, if you want to pass an integral
event code, you might use the following code::

   PyObject *arglist;
   ...
   arglist = Py_BuildValue("(l)", eventcode);
   result = PyObject_CallObject(my_callback, arglist);
   Py_DECREF(arglist);
   if (result == NULL)
       return NULL; /* Pass error back */
   /* Here maybe use the result */
   Py_DECREF(result);

Note the placement of ``Py_DECREF(arglist)`` immediately after the call, before
the error check! Also note that strictly speaking this code is not complete:
:cfunc:`Py_BuildValue` may run out of memory, and this should be checked.

You may also call a function with keyword arguments by using
:cfunc:`PyObject_Call`, which supports arguments and keyword arguments. As in
the above example, we use :cfunc:`Py_BuildValue` to construct the dictionary.
The positional-argument tuple passed to :cfunc:`PyObject_Call` must not be
*NULL*, so an empty tuple is used when there are no positional arguments. ::

   PyObject *noargs;
   PyObject *dict;
   ...
   noargs = PyTuple_New(0);
   dict = Py_BuildValue("{s:i}", "name", val);
   result = PyObject_Call(my_callback, noargs, dict);
   Py_DECREF(noargs);
   Py_DECREF(dict);
   if (result == NULL)
       return NULL; /* Pass error back */
   /* Here maybe use the result */
   Py_DECREF(result);


.. _parsetuple:

Extracting Parameters in Extension Functions
============================================

.. index:: single: PyArg_ParseTuple()

The :cfunc:`PyArg_ParseTuple` function is declared as follows::

   int PyArg_ParseTuple(PyObject *arg, char *format, ...);

The *arg* argument must be a tuple object containing an argument list passed
from Python to a C function. The *format* argument must be a format string,
whose syntax is explained in :ref:`arg-parsing` in the Python/C API Reference
Manual. The remaining arguments must be addresses of variables whose type is
determined by the format string.

Note that while :cfunc:`PyArg_ParseTuple` checks that the Python arguments have
the required types, it cannot check the validity of the addresses of C variables
passed to the call: if you make mistakes there, your code will probably crash or
at least overwrite random bits in memory. So be careful!

Note that any Python object references which are provided to the caller are
*borrowed* references; do not decrement their reference count!

Some example calls::

   #define PY_SSIZE_T_CLEAN  /* Make "s#" use Py_ssize_t rather than int. */
   #include <Python.h>

::

   int ok;
   int i, j;
   long k, l;
   const char *s;
   Py_ssize_t size;

   ok = PyArg_ParseTuple(args, ""); /* No arguments */
       /* Python call: f() */

::

   ok = PyArg_ParseTuple(args, "s", &s); /* A string */
       /* Possible Python call: f('whoops!') */

::

   ok = PyArg_ParseTuple(args, "lls", &k, &l, &s); /* Two longs and a string */
       /* Possible Python call: f(1, 2, 'three') */

::

   ok = PyArg_ParseTuple(args, "(ii)s#", &i, &j, &s, &size);
       /* A pair of ints and a string, whose size is also returned */
       /* Possible Python call: f((1, 2), 'three') */

::

   {
       const char *file;
       const char *mode = "r";
       int bufsize = 0;
       ok = PyArg_ParseTuple(args, "s|si", &file, &mode, &bufsize);
       /* A string, and optionally another string and an integer */
       /* Possible Python calls:
          f('spam')
          f('spam', 'w')
          f('spam', 'wb', 100000) */
   }

::

   {
       int left, top, right, bottom, h, v;
       ok = PyArg_ParseTuple(args, "((ii)(ii))(ii)",
                &left, &top, &right, &bottom, &h, &v);
       /* A rectangle and a point */
       /* Possible Python call:
          f(((0, 0), (400, 300)), (10, 10)) */
   }

::

   {
       Py_complex c;
       ok = PyArg_ParseTuple(args, "D:myfunction", &c);
       /* a complex, also providing a function name for errors */
       /* Possible Python call: myfunction(1+2j) */
   }
671

.. _parsetupleandkeywords:

Keyword Parameters for Extension Functions
==========================================

.. index:: single: PyArg_ParseTupleAndKeywords()

The :cfunc:`PyArg_ParseTupleAndKeywords` function is declared as follows::

   int PyArg_ParseTupleAndKeywords(PyObject *arg, PyObject *kwdict,
                                   const char *format, char *kwlist[], ...);

The *arg* and *format* parameters are identical to those of the
:cfunc:`PyArg_ParseTuple` function. The *kwdict* parameter is the dictionary of
keywords received as the third parameter from the Python runtime. The *kwlist*
parameter is a *NULL*-terminated list of strings which identify the parameters;
the names are matched with the type information from *format* from left to
right. On success, :cfunc:`PyArg_ParseTupleAndKeywords` returns true, otherwise
it returns false and raises an appropriate exception.

.. note::

   Nested tuples cannot be parsed when using keyword arguments! Keyword parameters
   passed in which are not present in the *kwlist* will cause :exc:`TypeError` to
   be raised.

.. index:: single: Philbrick, Geoff

Here is an example module which uses keywords, based on an example by Geoff
Philbrick (philbrick@hks.com)::

   #include "Python.h"

   static PyObject *
   keywdarg_parrot(PyObject *self, PyObject *args, PyObject *keywds)
   {
       int voltage;
       char *state = "a stiff";
       char *action = "voom";
       char *type = "Norwegian Blue";

       static char *kwlist[] = {"voltage", "state", "action", "type", NULL};

       if (!PyArg_ParseTupleAndKeywords(args, keywds, "i|sss", kwlist,
                                        &voltage, &state, &action, &type))
           return NULL;

       printf("-- This parrot wouldn't %s if you put %i Volts through it.\n",
              action, voltage);
       printf("-- Lovely plumage, the %s -- It's %s!\n", type, state);

       Py_INCREF(Py_None);

       return Py_None;
   }

   static PyMethodDef keywdarg_methods[] = {
       /* The cast of the function is necessary since PyCFunction values
        * only take two PyObject* parameters, and keywdarg_parrot() takes
        * three.
        */
       {"parrot", (PyCFunction)keywdarg_parrot, METH_VARARGS | METH_KEYWORDS,
        "Print a lovely skit to standard output."},
       {NULL, NULL, 0, NULL}   /* sentinel */
   };

::

   static struct PyModuleDef keywdargmodule = {
       PyModuleDef_HEAD_INIT,
       "keywdarg",        /* name of module */
       NULL,              /* module documentation, may be NULL */
       -1,                /* size of per-interpreter state, or -1 */
       keywdarg_methods
   };

   PyMODINIT_FUNC
   PyInit_keywdarg(void)
   {
       /* Create the module and add the functions */
       return PyModule_Create(&keywdargmodule);
   }


.. _buildvalue:

Building Arbitrary Values
=========================

The :cfunc:`Py_BuildValue` function is the counterpart to
:cfunc:`PyArg_ParseTuple`. It is declared as follows::

   PyObject *Py_BuildValue(const char *format, ...);

It recognizes a set of format units similar to the ones recognized by
:cfunc:`PyArg_ParseTuple`, but the arguments (which are input to the function,
not output) must not be pointers, just values. It returns a new Python object,
suitable for returning from a C function called from Python.

One difference with :cfunc:`PyArg_ParseTuple`: while the latter requires its
first argument to be a tuple (since Python argument lists are always represented
as tuples internally), :cfunc:`Py_BuildValue` does not always build a tuple. It
builds a tuple only if its format string contains two or more format units. If
the format string is empty, it returns ``None``; if it contains exactly one
format unit, it returns whatever object is described by that format unit. To
force it to return a tuple of size 0 or one, parenthesize the format string.

Examples (to the left the call, to the right the resulting Python value)::

   Py_BuildValue("")                        None
   Py_BuildValue("i", 123)                  123
   Py_BuildValue("iii", 123, 456, 789)      (123, 456, 789)
   Py_BuildValue("s", "hello")              'hello'
   Py_BuildValue("y", "hello")              b'hello'
   Py_BuildValue("ss", "hello", "world")    ('hello', 'world')
   Py_BuildValue("s#", "hello", 4)          'hell'
   Py_BuildValue("y#", "hello", 4)          b'hell'
   Py_BuildValue("()")                      ()
   Py_BuildValue("(i)", 123)                (123,)
   Py_BuildValue("(ii)", 123, 456)          (123, 456)
   Py_BuildValue("(i,i)", 123, 456)         (123, 456)
   Py_BuildValue("[i,i]", 123, 456)         [123, 456]
   Py_BuildValue("{s:i,s:i}",
                 "abc", 123, "def", 456)    {'abc': 123, 'def': 456}
   Py_BuildValue("((ii)(ii)) (ii)",
                 1, 2, 3, 4, 5, 6)          (((1, 2), (3, 4)), (5, 6))


.. _refcounts:

Reference Counts
================

In languages like C or C++, the programmer is responsible for dynamic allocation
and deallocation of memory on the heap. In C, this is done using the functions
:cfunc:`malloc` and :cfunc:`free`. In C++, the operators ``new`` and
``delete`` are used with essentially the same meaning and we'll restrict
the following discussion to the C case.

Every block of memory allocated with :cfunc:`malloc` should eventually be
returned to the pool of available memory by exactly one call to :cfunc:`free`.
It is important to call :cfunc:`free` at the right time. If a block's address
is forgotten but :cfunc:`free` is not called for it, the memory it occupies
cannot be reused until the program terminates. This is called a :dfn:`memory
leak`. On the other hand, if a program calls :cfunc:`free` for a block and then
continues to use the block, it creates a conflict with re-use of the block
through another :cfunc:`malloc` call. This is called :dfn:`using freed memory`.
It has the same bad consequences as referencing uninitialized data --- core
dumps, wrong results, mysterious crashes.

Common causes of memory leaks are unusual paths through the code. For instance,
a function may allocate a block of memory, do some calculation, and then free
the block again. Now a change in the requirements for the function may add a
test to the calculation that detects an error condition and can return
prematurely from the function. It's easy to forget to free the allocated memory
block when taking this premature exit, especially when it is added later to the
code. Such leaks, once introduced, often go undetected for a long time: the
error exit is taken only in a small fraction of all calls, and most modern
machines have plenty of virtual memory, so the leak only becomes apparent in a
long-running process that uses the leaking function frequently. Therefore, it's
important to prevent leaks from happening by having a coding convention or
strategy that minimizes this kind of error.

Since Python makes heavy use of :cfunc:`malloc` and :cfunc:`free`, it needs a
strategy to avoid memory leaks as well as the use of freed memory. The chosen
method is called :dfn:`reference counting`. The principle is simple: every
object contains a counter, which is incremented when a reference to the object
is stored somewhere, and which is decremented when a reference to it is deleted.
When the counter reaches zero, the last reference to the object has been deleted
and the object is freed.

An alternative strategy is called :dfn:`automatic garbage collection`.
(Sometimes, reference counting is also referred to as a garbage collection
strategy, hence my use of "automatic" to distinguish the two.) The big
advantage of automatic garbage collection is that the user doesn't need to call
:cfunc:`free` explicitly. (Another claimed advantage is an improvement in speed
or memory usage --- this is no hard fact however.) The disadvantage is that for
C, there is no truly portable automatic garbage collector, while reference
counting can be implemented portably (as long as the functions :cfunc:`malloc`
and :cfunc:`free` are available --- which the C Standard guarantees). Maybe some
day a sufficiently portable automatic garbage collector will be available for C.
Until then, we'll have to live with reference counts.

While Python uses the traditional reference counting implementation, it also
offers a cycle detector that works to detect reference cycles. This allows
applications to not worry about creating direct or indirect circular references;
these are the weakness of garbage collection implemented using only reference
counting. Reference cycles consist of objects which contain (possibly indirect)
references to themselves, so that each object in the cycle has a reference count
which is non-zero. Typical reference counting implementations are not able to
reclaim the memory belonging to any objects in a reference cycle, or referenced
from the objects in the cycle, even though there are no further references to
the cycle itself.

The cycle detector is able to detect garbage cycles and can reclaim them so long
as there are no finalizers implemented in Python (:meth:`__del__` methods).
When there are such finalizers, the detector exposes the cycles through the
:mod:`gc` module (specifically, the ``garbage`` variable in that module). The
:mod:`gc` module also exposes a way to run the detector (the :func:`collect`
function), as well as configuration interfaces and the ability to disable the
detector at runtime. The cycle detector is considered an optional component;
though it is included by default, it can be disabled at build time using the
:option:`--without-cycle-gc` option to the :program:`configure` script on Unix
platforms (including Mac OS X). If the cycle detector is disabled in this way,
the :mod:`gc` module will not be available.

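One coding convention of the kind mentioned above is to route every exit through
a single cleanup label, so that an error check added later cannot skip the
:cfunc:`free` call. The following is only a sketch in plain C; the function
``count_vowels`` and its ``max_len`` limit are invented for illustration:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Count the vowels in an upper-cased copy of `text`.
   Returns -1 on error (NULL input, allocation failure, or text too long). */
static int
count_vowels(const char *text, size_t max_len)
{
    int result = -1;
    char *copy = NULL;
    size_t i, len;

    if (text == NULL)
        goto done;
    len = strlen(text);
    copy = malloc(len + 1);
    if (copy == NULL)
        goto done;
    /* An error check added later: returning directly here would leak
       `copy`, so jump to the shared cleanup code instead. */
    if (len > max_len)
        goto done;

    result = 0;
    for (i = 0; i < len; i++) {
        copy[i] = (char)toupper((unsigned char)text[i]);
        if (strchr("AEIOU", copy[i]) != NULL)
            result++;
    }
    copy[len] = '\0';

done:
    free(copy);  /* free(NULL) is a no-op, so this is safe on every path */
    return result;
}
```

Every path, including the late-added length check, reaches the one
:cfunc:`free` call exactly once.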

.. _refcountsinpython:

Reference Counting in Python
----------------------------

There are two macros, ``Py_INCREF(x)`` and ``Py_DECREF(x)``, which handle the
incrementing and decrementing of the reference count. :cfunc:`Py_DECREF` also
frees the object when the count reaches zero. For flexibility, it doesn't call
:cfunc:`free` directly --- rather, it makes a call through a function pointer in
the object's :dfn:`type object`. For this purpose (and others), every object
also contains a pointer to its type object.

The big question now remains: when to use ``Py_INCREF(x)`` and ``Py_DECREF(x)``?
Let's first introduce some terms. Nobody "owns" an object; however, you can
:dfn:`own a reference` to an object. An object's reference count is now defined
as the number of owned references to it. The owner of a reference is
responsible for calling :cfunc:`Py_DECREF` when the reference is no longer
needed. Ownership of a reference can be transferred. There are three ways to
dispose of an owned reference: pass it on, store it, or call :cfunc:`Py_DECREF`.
Forgetting to dispose of an owned reference creates a memory leak.

It is also possible to :dfn:`borrow` [#]_ a reference to an object. The
borrower of a reference should not call :cfunc:`Py_DECREF`. The borrower must
not hold on to the object longer than the owner from which it was borrowed.
Using a borrowed reference after the owner has disposed of it risks using freed
memory and should be avoided completely. [#]_

The advantage of borrowing over owning a reference is that you don't need to
take care of disposing of the reference on all possible paths through the code
--- in other words, with a borrowed reference you don't run the risk of leaking
when a premature exit is taken. The disadvantage of borrowing over owning is
that there are some subtle situations where in seemingly correct code a borrowed
reference can be used after the owner from which it was borrowed has in fact
disposed of it.

A borrowed reference can be changed into an owned reference by calling
:cfunc:`Py_INCREF`. This does not affect the status of the owner from which the
reference was borrowed --- it creates a new owned reference, and gives full
owner responsibilities (the new owner must dispose of the reference properly, as
well as the previous owner).

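To make the mechanics concrete, here is a toy model in plain C of a
reference-counted object that is freed through a function pointer in its type
object. It is a deliberately simplified sketch: the ``toy_`` names are invented
for this example and are not CPython's actual definitions of
:cfunc:`Py_INCREF` and :cfunc:`Py_DECREF`:

```c
#include <stdlib.h>

struct toy_object;

/* A toy "type object": holds the function used to free instances. */
typedef struct {
    const char *name;
    void (*dealloc)(struct toy_object *);
} toy_type;

typedef struct toy_object {
    long refcnt;           /* number of owned references */
    const toy_type *type;  /* every object points to its type */
    int value;
} toy_object;

static int toy_live_objects = 0;  /* bookkeeping for the demonstration only */

static void
toy_int_dealloc(toy_object *op)
{
    toy_live_objects--;
    free(op);
}

static const toy_type toy_int_type = {"toy_int", toy_int_dealloc};

/* Returns a new (owned) reference, or NULL if allocation fails. */
static toy_object *
toy_int_new(int value)
{
    toy_object *op = malloc(sizeof(*op));
    if (op == NULL)
        return NULL;
    op->refcnt = 1;
    op->type = &toy_int_type;
    op->value = value;
    toy_live_objects++;
    return op;
}

/* Turn a borrowed reference into an additional owned reference. */
static void
toy_incref(toy_object *op)
{
    op->refcnt++;
}

/* Dispose of one owned reference; free through the type object
   (rather than calling free() directly) when the last one goes. */
static void
toy_decref(toy_object *op)
{
    if (--op->refcnt == 0)
        op->type->dealloc(op);
}
```

In this model each call to ``toy_int_new`` or ``toy_incref`` creates one owned
reference, and each must be balanced by exactly one ``toy_decref``.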

.. _ownershiprules:

Ownership Rules
---------------

Whenever an object reference is passed into or out of a function, it is part of
the function's interface specification whether ownership is transferred with the
reference or not.

Most functions that return a reference to an object pass on ownership with the
reference. In particular, all functions whose purpose is to create a new
object, such as :cfunc:`PyLong_FromLong` and :cfunc:`Py_BuildValue`, pass
ownership to the receiver. Even if the object is not actually new, you still
receive ownership of a new reference to that object. For instance,
:cfunc:`PyLong_FromLong` maintains a cache of popular values and can return a
reference to a cached item.

Many functions that extract objects from other objects also transfer ownership
with the reference, for instance :cfunc:`PyObject_GetAttrString`. The picture
is less clear here, however, since a few common routines are exceptions:
:cfunc:`PyTuple_GetItem`, :cfunc:`PyList_GetItem`, :cfunc:`PyDict_GetItem`, and
:cfunc:`PyDict_GetItemString` all return references that you borrow from the
tuple, list or dictionary.

The function :cfunc:`PyImport_AddModule` also returns a borrowed reference, even
though it may actually create the object it returns: this is possible because an
owned reference to the object is stored in ``sys.modules``.

When you pass an object reference into another function, in general, the
function borrows the reference from you --- if it needs to store it, it will use
:cfunc:`Py_INCREF` to become an independent owner. There are exactly two
important exceptions to this rule: :cfunc:`PyTuple_SetItem` and
:cfunc:`PyList_SetItem`. These functions take over ownership of the item passed
to them --- even if they fail! (Note that :cfunc:`PyDict_SetItem` and friends
don't take over ownership --- they are "normal.")

When a C function is called from Python, it borrows references to its arguments
from the caller. The caller owns a reference to the object, so the borrowed
reference's lifetime is guaranteed until the function returns. Only when such a
borrowed reference must be stored or passed on must it be turned into an owned
reference by calling :cfunc:`Py_INCREF`.

The object reference returned from a C function that is called from Python must
be an owned reference --- ownership is transferred from the function to its
caller.


.. _thinice:

Thin Ice
--------

There are a few situations where seemingly harmless use of a borrowed reference
can lead to problems. These all have to do with implicit invocations of the
interpreter, which can cause the owner of a reference to dispose of it.

The first and most important case to know about is using :cfunc:`Py_DECREF` on
an unrelated object while borrowing a reference to a list item. For instance::

   void
   bug(PyObject *list)
   {
       PyObject *item = PyList_GetItem(list, 0);

       PyList_SetItem(list, 1, PyLong_FromLong(0L));
       PyObject_Print(item, stdout, 0); /* BUG! */
   }

This function first borrows a reference to ``list[0]``, then replaces
``list[1]`` with the value ``0``, and finally prints the borrowed reference.
Looks harmless, right? But it's not!

Let's follow the control flow into :cfunc:`PyList_SetItem`. The list owns
references to all its items, so when item 1 is replaced, it has to dispose of
the original item 1. Now let's suppose the original item 1 was an instance of a
user-defined class, and let's further suppose that the class defined a
:meth:`__del__` method. If this class instance has a reference count of 1,
disposing of it will call its :meth:`__del__` method.

Since it is written in Python, the :meth:`__del__` method can execute arbitrary
Python code. Could it perhaps do something to invalidate the reference to
``item`` in :cfunc:`bug`? You bet! Assuming that the list passed into
:cfunc:`bug` is accessible to the :meth:`__del__` method, it could execute a
statement to the effect of ``del list[0]``, and assuming this was the last
reference to that object, it would free the memory associated with it, thereby
invalidating ``item``.

The solution, once you know the source of the problem, is easy: temporarily
increment the reference count. The correct version of the function reads::

   void
   no_bug(PyObject *list)
   {
       PyObject *item = PyList_GetItem(list, 0);

       Py_INCREF(item);
       PyList_SetItem(list, 1, PyLong_FromLong(0L));
       PyObject_Print(item, stdout, 0);
       Py_DECREF(item);
   }

This is a true story. An older version of Python contained variants of this bug
and someone spent a considerable amount of time in a C debugger to figure out
why his :meth:`__del__` methods would fail...

The second case of problems with a borrowed reference is a variant involving
threads. Normally, multiple threads in the Python interpreter can't get in each
other's way, because there is a global lock protecting Python's entire object
space. However, it is possible to temporarily release this lock using the macro
:cmacro:`Py_BEGIN_ALLOW_THREADS`, and to re-acquire it using
:cmacro:`Py_END_ALLOW_THREADS`. This is common around blocking I/O calls, to
let other threads use the processor while waiting for the I/O to complete.
Obviously, the following function has the same problem as the previous one::

   void
   bug(PyObject *list)
   {
       PyObject *item = PyList_GetItem(list, 0);
       Py_BEGIN_ALLOW_THREADS
       ...some blocking I/O call...
       Py_END_ALLOW_THREADS
       PyObject_Print(item, stdout, 0); /* BUG! */
   }


.. _nullpointers:

NULL Pointers
-------------

In general, functions that take object references as arguments do not expect you
to pass them *NULL* pointers, and will dump core (or cause later core dumps) if
you do so. Functions that return object references generally return *NULL* only
to indicate that an exception occurred. The reason for not testing for *NULL*
arguments is that functions often pass the objects they receive on to other
functions --- if each function were to test for *NULL*, there would be a lot of
redundant tests and the code would run more slowly.

It is better to test for *NULL* only at the "source": when a pointer that may be
*NULL* is received, for example, from :cfunc:`malloc` or from a function that
may raise an exception.

The macros :cfunc:`Py_INCREF` and :cfunc:`Py_DECREF` do not check for *NULL*
pointers --- however, their variants :cfunc:`Py_XINCREF` and :cfunc:`Py_XDECREF`
do.

The macros for checking for a particular object type (``Pytype_Check()``) don't
check for *NULL* pointers --- again, there is much code that calls several of
these in a row to test an object against various different expected types, and
this would generate redundant tests. There are no variants with *NULL*
checking.

The C function calling mechanism guarantees that the argument list passed to C
functions (``args`` in the examples) is never *NULL* --- in fact it guarantees
that it is always a tuple. [#]_

It is a severe error to ever let a *NULL* pointer "escape" to the Python user.

.. Frank Stajano:
   A pedagogically buggy example, along the lines of the previous listing, would
   be helpful here -- showing in more concrete terms what sort of actions could
   cause the problem. I can't very well imagine it from the description.


.. _cplusplus:

Writing Extensions in C++
=========================

It is possible to write extension modules in C++. Some restrictions apply. If
the main program (the Python interpreter) is compiled and linked by the C
compiler, global or static objects with constructors cannot be used. This is
not a problem if the main program is linked by the C++ compiler. Functions that
will be called by the Python interpreter (in particular, module initialization
functions) have to be declared using ``extern "C"``. It is unnecessary to
enclose the Python header files in ``extern "C" {...}`` --- they use this form
already if the symbol ``__cplusplus`` is defined (all recent C++ compilers
define this symbol).


Benjamin Petersonb173f782009-05-05 22:31:58 +00001095.. _using-capsules:
Georg Brandl116aa622007-08-15 14:28:22 +00001096
1097Providing a C API for an Extension Module
1098=========================================
1099
1100.. sectionauthor:: Konrad Hinsen <hinsen@cnrs-orleans.fr>
1101
1102
1103Many extension modules just provide new functions and types to be used from
1104Python, but sometimes the code in an extension module can be useful for other
1105extension modules. For example, an extension module could implement a type
1106"collection" which works like lists without order. Just like the standard Python
1107list type has a C API which permits extension modules to create and manipulate
1108lists, this new collection type should have a set of C functions for direct
1109manipulation from other extension modules.
1110
1111At first sight this seems easy: just write the functions (without declaring them
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001112``static``, of course), provide an appropriate header file, and document
Georg Brandl116aa622007-08-15 14:28:22 +00001113the C API. And in fact this would work if all extension modules were always
1114linked statically with the Python interpreter. When modules are used as shared
1115libraries, however, the symbols defined in one module may not be visible to
1116another module. The details of visibility depend on the operating system; some
1117systems use one global namespace for the Python interpreter and all extension
1118modules (Windows, for example), whereas others require an explicit list of
1119imported symbols at module link time (AIX is one example), or offer a choice of
1120different strategies (most Unices). And even if symbols are globally visible,
1121the module whose functions one wishes to call might not have been loaded yet!
1122
1123Portability therefore requires not to make any assumptions about symbol
1124visibility. This means that all symbols in extension modules should be declared
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001125``static``, except for the module's initialization function, in order to
Georg Brandl116aa622007-08-15 14:28:22 +00001126avoid name clashes with other extension modules (as discussed in section
1127:ref:`methodtable`). And it means that symbols that *should* be accessible from
1128other extension modules must be exported in a different way.
1129
1130Python provides a special mechanism to pass C-level information (pointers) from
Benjamin Petersonb173f782009-05-05 22:31:58 +00001131one extension module to another one: Capsules. A Capsule is a Python data type
1132which stores a pointer (:ctype:`void \*`). Capsules can only be created and
Georg Brandl116aa622007-08-15 14:28:22 +00001133accessed via their C API, but they can be passed around like any other Python
1134object. In particular, they can be assigned to a name in an extension module's
1135namespace. Other extension modules can then import this module, retrieve the
Benjamin Petersonb173f782009-05-05 22:31:58 +00001136value of this name, and then retrieve the pointer from the Capsule.
Georg Brandl116aa622007-08-15 14:28:22 +00001137
Benjamin Petersonb173f782009-05-05 22:31:58 +00001138There are many ways in which Capsules can be used to export the C API of an
1139extension module. Each function could get its own Capsule, or all C API pointers
1140could be stored in an array whose address is published in a Capsule. And the
Georg Brandl116aa622007-08-15 14:28:22 +00001141various tasks of storing and retrieving the pointers can be distributed in
1142different ways between the module providing the code and the client modules.
1143
Benjamin Petersonb173f782009-05-05 22:31:58 +00001144Whichever method you choose, it's important to name your Capsules properly.
1145The function :cfunc:`PyCapsule_New` takes a name parameter
1146(:ctype:`const char \*`); you're permitted to pass in a *NULL* name, but
1147we strongly encourage you to specify a name. Properly named Capsules provide
1148a degree of runtime type-safety; there is no feasible way to tell one unnamed
1149Capsule from another.
1150
1151In particular, Capsules used to expose C APIs should be given a name following
1152this convention::
1153
1154 modulename.attributename
1155
1156The convenience function :cfunc:`PyCapsule_Import` makes it easy to
1157load a C API provided via a Capsule, but only if the Capsule's name
1158matches this convention. This behavior gives C API users a high degree
1159of certainty that the Capsule they load contains the correct C API.
1160
Georg Brandl116aa622007-08-15 14:28:22 +00001161The following example demonstrates an approach that puts most of the burden on
1162the writer of the exporting module, which is appropriate for commonly used
1163library modules. It stores all C API pointers (just one in the example!) in an
Benjamin Petersonb173f782009-05-05 22:31:58 +00001164array of :ctype:`void` pointers which becomes the value of a Capsule. The header
Georg Brandl116aa622007-08-15 14:28:22 +00001165file corresponding to the module provides a macro that takes care of importing
1166the module and retrieving its C API pointers; client modules only have to call
1167this macro before accessing the C API.
1168
1169The exporting module is a modification of the :mod:`spam` module from section
1170:ref:`extending-simpleexample`. The function :func:`spam.system` does not call
1171the C library function :cfunc:`system` directly, but a function
1172:cfunc:`PySpam_System`, which would of course do something more complicated in
1173reality (such as adding "spam" to every command). This function
1174:cfunc:`PySpam_System` is also exported to other extension modules.
1175
1176The function :cfunc:`PySpam_System` is a plain C function, declared
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001177``static`` like everything else::
Georg Brandl116aa622007-08-15 14:28:22 +00001178
1179 static int
1180 PySpam_System(const char *command)
1181 {
1182 return system(command);
1183 }
1184
1185The function :cfunc:`spam_system` is modified in a trivial way::
1186
1187 static PyObject *
1188 spam_system(PyObject *self, PyObject *args)
1189 {
1190 const char *command;
1191 int sts;
1192
1193 if (!PyArg_ParseTuple(args, "s", &command))
1194 return NULL;
1195 sts = PySpam_System(command);
1196 return Py_BuildValue("i", sts);
1197 }
1198
1199In the beginning of the module, right after the line ::
1200
1201 #include "Python.h"
1202
1203two more lines must be added::
1204
1205 #define SPAM_MODULE
1206 #include "spammodule.h"
1207
1208The ``#define`` is used to tell the header file that it is being included in the
1209exporting module, not a client module. Finally, the module's initialization
1210function must take care of initializing the C API pointer array::
1211
1212 PyMODINIT_FUNC
Martin v. Löwis1a214512008-06-11 05:26:20 +00001213 PyInit_spam(void)
Georg Brandl116aa622007-08-15 14:28:22 +00001214 {
1215 PyObject *m;
1216 static void *PySpam_API[PySpam_API_pointers];
1217 PyObject *c_api_object;
1218
Martin v. Löwis1a214512008-06-11 05:26:20 +00001219 m = PyModule_Create(&spammodule);
Georg Brandl116aa622007-08-15 14:28:22 +00001220 if (m == NULL)
Martin v. Löwis1a214512008-06-11 05:26:20 +00001221 return NULL;
Georg Brandl116aa622007-08-15 14:28:22 +00001222
1223 /* Initialize the C API pointer array */
1224 PySpam_API[PySpam_System_NUM] = (void *)PySpam_System;
1225
Benjamin Petersonb173f782009-05-05 22:31:58 +00001226 /* Create a Capsule containing the API pointer array's address */
1227 c_api_object = PyCapsule_New((void *)PySpam_API, "spam._C_API", NULL);
Georg Brandl116aa622007-08-15 14:28:22 +00001228
1229 if (c_api_object != NULL)
1230 PyModule_AddObject(m, "_C_API", c_api_object);
Martin v. Löwis1a214512008-06-11 05:26:20 +00001231 return m;
Georg Brandl116aa622007-08-15 14:28:22 +00001232 }
1233
Christian Heimes5b5e81c2007-12-31 16:14:33 +00001234Note that ``PySpam_API`` is declared ``static``; otherwise the pointer
Martin v. Löwis1a214512008-06-11 05:26:20 +00001235array would disappear when :func:`PyInit_spam` terminates!
Georg Brandl116aa622007-08-15 14:28:22 +00001236
1237The bulk of the work is in the header file :file:`spammodule.h`, which looks
1238like this::
1239
   #ifndef Py_SPAMMODULE_H
   #define Py_SPAMMODULE_H
   #ifdef __cplusplus
   extern "C" {
   #endif

   /* Header file for spammodule */

   /* C API functions */
   #define PySpam_System_NUM 0
   #define PySpam_System_RETURN int
   #define PySpam_System_PROTO (const char *command)

   /* Total number of C API pointers */
   #define PySpam_API_pointers 1


   #ifdef SPAM_MODULE
   /* This section is used when compiling spammodule.c */

   static PySpam_System_RETURN PySpam_System PySpam_System_PROTO;

   #else
   /* This section is used in modules that use spammodule's API */

   static void **PySpam_API;

   #define PySpam_System \
    (*(PySpam_System_RETURN (*)PySpam_System_PROTO) PySpam_API[PySpam_System_NUM])

   /* Return -1 on error, 0 on success.
    * PyCapsule_Import will set an exception if there's an error.
    */
   static int
   import_spam(void)
   {
       PySpam_API = (void **)PyCapsule_Import("spam._C_API", 0);
       return (PySpam_API != NULL) ? 0 : -1;
   }

   #endif

   #ifdef __cplusplus
   }
   #endif

   #endif /* !defined(Py_SPAMMODULE_H) */

To gain access to :cfunc:`PySpam_System`, a client module only needs to call
the function :cfunc:`import_spam` in its initialization function::

   PyMODINIT_FUNC
   PyInit_client(void)
   {
       PyObject *m;

       m = PyModule_Create(&clientmodule);
       if (m == NULL)
           return NULL;
       if (import_spam() < 0) {
           /* Release the half-initialized module before reporting failure */
           Py_DECREF(m);
           return NULL;
       }
       /* additional initialization can happen here */
       return m;
   }

The main disadvantage of this approach is that the file :file:`spammodule.h` is
rather complicated. However, the basic structure is the same for each function
that is exported, so it has to be learned only once.

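To show how mechanical the pattern becomes once a second function is added, here is a minimal sketch in plain C with no Python involved. All ``Demo_*`` and ``demo_*`` names are hypothetical and exist only for this illustration; ``demo_export_api`` stands in for the step where a real module would wrap the table in a Capsule, and ``demo_import`` for the step where a client would call ``PyCapsule_Import``.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical API with two exported functions. Every Demo_* name is
   invented for this sketch and is not part of spammodule. */
#define Demo_Length_NUM 0
#define Demo_Length_RETURN size_t
#define Demo_Length_PROTO (const char *text)

#define Demo_Double_NUM 1
#define Demo_Double_RETURN int
#define Demo_Double_PROTO (int value)

/* Total number of API pointers */
#define Demo_API_pointers 2

/* Exporting side: the real implementations, declared via the macros. */
static Demo_Length_RETURN demo_length_impl Demo_Length_PROTO
{
    return strlen(text);
}

static Demo_Double_RETURN demo_double_impl Demo_Double_PROTO
{
    return 2 * value;
}

/* Static storage, so the table outlives the function that fills it. */
static void *demo_api_table[Demo_API_pointers];

/* Stand-in for creating a Capsule: fill the table, hand out its address. */
static void **
demo_export_api(void)
{
    demo_api_table[Demo_Length_NUM] = (void *)demo_length_impl;
    demo_api_table[Demo_Double_NUM] = (void *)demo_double_impl;
    return demo_api_table;
}

/* Client side: one pointer to the table plus one call macro per function. */
static void **Demo_API;

#define Demo_Length \
    (*(Demo_Length_RETURN (*)Demo_Length_PROTO) Demo_API[Demo_Length_NUM])
#define Demo_Double \
    (*(Demo_Double_RETURN (*)Demo_Double_PROTO) Demo_API[Demo_Double_NUM])

/* Stand-in for PyCapsule_Import: return -1 on error, 0 on success. */
static int
demo_import(void)
{
    Demo_API = demo_export_api();
    return (Demo_API != NULL) ? 0 : -1;
}
```

After ``demo_import()`` succeeds, client code calls ``Demo_Length("spam")`` or ``Demo_Double(21)`` as if they were ordinary functions. Like the header above, the sketch relies on converting between ``void *`` and function pointers, which common platforms support but ISO C does not guarantee.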
Finally, Capsules offer additional functionality that is especially useful for
managing the allocation and deallocation of the memory referenced by the pointer
stored in a Capsule. The details are described in the Python/C API Reference
Manual in the section :ref:`capsules` and in the implementation of Capsules (files
:file:`Include/pycapsule.h` and :file:`Objects/pycapsule.c` in the Python source
code distribution).

.. rubric:: Footnotes

.. [#] An interface for this function already exists in the standard module :mod:`os`
   --- it was chosen as a simple and straightforward example.

.. [#] The metaphor of "borrowing" a reference is not completely correct: the owner
   still has a copy of the reference.

.. [#] Checking that the reference count is at least 1 **does not work** --- the
   reference count itself could be in freed memory and may thus be reused for
   another object!

.. [#] These guarantees don't hold when you use the "old" style calling convention ---
   this is still found in much existing code.
