.. highlightlang:: c


.. _extending-intro:

******************************
Extending Python with C or C++
******************************

It is quite easy to add new built-in modules to Python, if you know how to
program in C. Such :dfn:`extension modules` can do two things that can't be
done directly in Python: they can implement new built-in object types, and they
can call C library functions and system calls.

To support extensions, the Python API (Application Programmers Interface)
defines a set of functions, macros and variables that provide access to most
aspects of the Python run-time system. The Python API is incorporated in a C
source file by including the header ``"Python.h"``.

The compilation of an extension module depends on its intended use as well as on
your system setup; details are given in later chapters.


.. _extending-simpleexample:

A Simple Example
================

Let's create an extension module called ``spam`` (the favorite food of Monty
Python fans...) and let's say we want to create a Python interface to the C
library function :cfunc:`system`. [#]_ This function takes a null-terminated
character string as argument and returns an integer. We want this function to
be callable from Python as follows::

   >>> import spam
   >>> status = spam.system("ls -l")

Begin by creating a file :file:`spammodule.c`. (Historically, if a module is
called ``spam``, the C file containing its implementation is called
:file:`spammodule.c`; if the module name is very long, like ``spammify``, the
file name can be just :file:`spammify.c`.)

The first line of our file can be::

   #include <Python.h>

which pulls in the Python API (you can add a comment describing the purpose of
the module and a copyright notice if you like).

.. note::

   Since Python may define some pre-processor definitions which affect the standard
   headers on some systems, you *must* include :file:`Python.h` before any standard
   headers are included.

All user-visible symbols defined by :file:`Python.h` have a prefix of ``Py`` or
``PY``, except those defined in standard header files. For convenience, and
since they are used extensively by the Python interpreter, ``"Python.h"``
includes a few standard header files: ``<stdio.h>``, ``<string.h>``,
``<errno.h>``, and ``<stdlib.h>``. If the latter header file does not exist on
your system, it declares the functions :cfunc:`malloc`, :cfunc:`free` and
:cfunc:`realloc` directly.

The next thing we add to our module file is the C function that will be called
when the Python expression ``spam.system(string)`` is evaluated (we'll see
shortly how it ends up being called)::

   static PyObject *
   spam_system(PyObject *self, PyObject *args)
   {
       const char *command;
       int sts;

       if (!PyArg_ParseTuple(args, "s", &command))
           return NULL;
       sts = system(command);
       return PyLong_FromLong(sts);
   }

There is a straightforward translation from the argument list in Python (for
example, the single expression ``"ls -l"``) to the arguments passed to the C
function. The C function always has two arguments, conventionally named *self*
and *args*.

The *self* argument points to the module object for module-level functions;
for a method it would point to the object instance.

The *args* argument will be a pointer to a Python tuple object containing the
arguments. Each item of the tuple corresponds to an argument in the call's
argument list. The arguments are Python objects --- in order to do anything
with them in our C function we have to convert them to C values. The function
:cfunc:`PyArg_ParseTuple` in the Python API checks the argument types and
converts them to C values. It uses a template string to determine the required
types of the arguments as well as the types of the C variables into which to
store the converted values. More about this later.

:cfunc:`PyArg_ParseTuple` returns true (nonzero) if all arguments have the right
type and its components have been stored in the variables whose addresses are
passed. It returns false (zero) if an invalid argument list was passed. In the
latter case it also raises an appropriate exception so the calling function can
return *NULL* immediately (as we saw in the example).


.. _extending-errors:

Intermezzo: Errors and Exceptions
=================================

An important convention throughout the Python interpreter is the following: when
a function fails, it should set an exception condition and return an error value
(usually a *NULL* pointer). Exceptions are stored in a static global variable
inside the interpreter; if this variable is *NULL* no exception has occurred. A
second global variable stores the "associated value" of the exception (the
second argument to :keyword:`raise`). A third variable contains the stack
traceback in case the error originated in Python code. These three variables
are the C equivalents of the result in Python of :meth:`sys.exc_info` (see the
section on module :mod:`sys` in the Python Library Reference). It is important
to know about them to understand how errors are passed around.

The Python API defines a number of functions to set various types of exceptions.

The most common one is :cfunc:`PyErr_SetString`. Its arguments are an exception
object and a C string. The exception object is usually a predefined object like
:cdata:`PyExc_ZeroDivisionError`. The C string indicates the cause of the error
and is converted to a Python string object and stored as the "associated value"
of the exception.

Another useful function is :cfunc:`PyErr_SetFromErrno`, which only takes an
exception argument and constructs the associated value by inspection of the
global variable :cdata:`errno`. The most general function is
:cfunc:`PyErr_SetObject`, which takes two object arguments, the exception and
its associated value. You don't need to :cfunc:`Py_INCREF` the objects passed
to any of these functions.

You can test non-destructively whether an exception has been set with
:cfunc:`PyErr_Occurred`. This returns the current exception object, or *NULL*
if no exception has occurred. You normally don't need to call
:cfunc:`PyErr_Occurred` to see whether an error occurred in a function call,
since you should be able to tell from the return value.

When a function *f* that calls another function *g* detects that the latter
fails, *f* should itself return an error value (usually *NULL* or ``-1``). It
should *not* call one of the :cfunc:`PyErr_\*` functions --- one has already
been called by *g*. *f*'s caller is then supposed to also return an error
indication to *its* caller, again *without* calling :cfunc:`PyErr_\*`, and so on
--- the most detailed cause of the error was already reported by the function
that first detected it. Once the error reaches the Python interpreter's main
loop, this aborts the currently executing Python code and tries to find an
exception handler specified by the Python programmer.

(There are situations where a module can actually give a more detailed error
message by calling another :cfunc:`PyErr_\*` function, and in such cases it is
fine to do so. As a general rule, however, this is not necessary, and can cause
information about the cause of the error to be lost: most operations can fail
for a variety of reasons.)
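
The propagation rule can be tried out away from the interpreter. The following
sketch is a toy model in plain C, not CPython code: ``toy_error``,
``toy_set_error``, ``toy_error_occurred``, ``toy_clear_error`` and the sample
functions ``f`` and ``g`` are hypothetical names invented for this illustration,
and a single static pointer stands in for the interpreter's exception state. It
shows *g* recording the error and *f* passing the failure on untouched::

   #include <stddef.h>

   /* Toy stand-in for the interpreter's error indicator (hypothetical). */
   static const char *toy_error = NULL;

   static void toy_set_error(const char *msg) { toy_error = msg; }
   static const char *toy_error_occurred(void) { return toy_error; }
   static void toy_clear_error(void) { toy_error = NULL; }

   /* g detects the failure first, so g is the one that sets the error. */
   static const int *
   g(const int *table, size_t len, size_t index)
   {
       if (index >= len) {
           toy_set_error("index out of range");
           return NULL;
       }
       return &table[index];
   }

   /* f only propagates: it returns the error value and sets nothing. */
   static const int *
   f(const int *table, size_t len, size_t index)
   {
       const int *item = g(table, len, index);

       if (item == NULL)
           return NULL;    /* error already recorded by g */
       return item;
   }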

To ignore an exception set by a function call that failed, the exception
condition must be cleared explicitly by calling :cfunc:`PyErr_Clear`. The only
time C code should call :cfunc:`PyErr_Clear` is if it doesn't want to pass the
error on to the interpreter but wants to handle it completely by itself
(possibly by trying something else, or pretending nothing went wrong).

Every failing :cfunc:`malloc` call must be turned into an exception --- the
direct caller of :cfunc:`malloc` (or :cfunc:`realloc`) must call
:cfunc:`PyErr_NoMemory` and return a failure indicator itself. All the
object-creating functions (for example, :cfunc:`PyLong_FromLong`) already do
this, so this note is only relevant to those who call :cfunc:`malloc` directly.

Also note that, with the important exception of :cfunc:`PyArg_ParseTuple` and
friends, functions that return an integer status usually return a positive value
or zero for success and ``-1`` for failure, like Unix system calls.

Finally, be careful to clean up garbage (by making :cfunc:`Py_XDECREF` or
:cfunc:`Py_DECREF` calls for objects you have already created) when you return
an error indicator!

The choice of which exception to raise is entirely yours. There are predeclared
C objects corresponding to all built-in Python exceptions, such as
:cdata:`PyExc_ZeroDivisionError`, which you can use directly. Of course, you
should choose exceptions wisely --- don't use :cdata:`PyExc_TypeError` to mean
that a file couldn't be opened (that should probably be :cdata:`PyExc_IOError`).
If something's wrong with the argument list, the :cfunc:`PyArg_ParseTuple`
function usually raises :cdata:`PyExc_TypeError`. If you have an argument whose
value must be in a particular range or must satisfy other conditions,
:cdata:`PyExc_ValueError` is appropriate.

You can also define a new exception that is unique to your module. For this, you
usually declare a static object variable at the beginning of your file::

   static PyObject *SpamError;

and initialize it in your module's initialization function (:cfunc:`PyInit_spam`)
with an exception object (leaving out the error checking for now)::

   PyMODINIT_FUNC
   PyInit_spam(void)
   {
       PyObject *m;

       m = PyModule_Create(&spammodule);
       if (m == NULL)
           return NULL;

       SpamError = PyErr_NewException("spam.error", NULL, NULL);
       Py_INCREF(SpamError);
       PyModule_AddObject(m, "error", SpamError);
       return m;
   }

Note that the Python name for the exception object is :exc:`spam.error`. The
:cfunc:`PyErr_NewException` function may create a class with the base class
being :exc:`Exception` (unless another class is passed in instead of *NULL*),
described in :ref:`bltin-exceptions`.

Note also that the :cdata:`SpamError` variable retains a reference to the newly
created exception class; this is intentional! Since the exception could be
removed from the module by external code, an owned reference to the class is
needed to ensure that it will not be discarded, causing :cdata:`SpamError` to
become a dangling pointer. Should it become a dangling pointer, C code which
raises the exception could cause a core dump or other unintended side effects.

We discuss the use of ``PyMODINIT_FUNC`` as a function return type later in this
sample.

The :exc:`spam.error` exception can be raised in your extension module using a
call to :cfunc:`PyErr_SetString` as shown below::

   static PyObject *
   spam_system(PyObject *self, PyObject *args)
   {
       const char *command;
       int sts;

       if (!PyArg_ParseTuple(args, "s", &command))
           return NULL;
       sts = system(command);
       if (sts < 0) {
           PyErr_SetString(SpamError, "System command failed");
           return NULL;
       }
       return PyLong_FromLong(sts);
   }


.. _backtoexample:

Back to the Example
===================

Going back to our example function, you should now be able to understand this
statement::

   if (!PyArg_ParseTuple(args, "s", &command))
       return NULL;

It returns *NULL* (the error indicator for functions returning object pointers)
if an error is detected in the argument list, relying on the exception set by
:cfunc:`PyArg_ParseTuple`. Otherwise the string value of the argument has been
copied to the local variable :cdata:`command`. This is a pointer assignment and
you are not supposed to modify the string to which it points (so in Standard C,
the variable :cdata:`command` should properly be declared as ``const char
*command``).

The next statement is a call to the Unix function :cfunc:`system`, passing it
the string we just got from :cfunc:`PyArg_ParseTuple`::

   sts = system(command);

Our :func:`spam.system` function must return the value of :cdata:`sts` as a
Python object. This is done using the function :cfunc:`PyLong_FromLong`. ::

   return PyLong_FromLong(sts);

In this case, it will return an integer object. (Yes, even integers are objects
on the heap in Python!)

If you have a C function that returns no useful argument (a function returning
:ctype:`void`), the corresponding Python function must return ``None``. You
need this idiom to do so (which is implemented by the :cmacro:`Py_RETURN_NONE`
macro)::

   Py_INCREF(Py_None);
   return Py_None;

:cdata:`Py_None` is the C name for the special Python object ``None``. It is a
genuine Python object rather than a *NULL* pointer, which means "error" in most
contexts, as we have seen.


.. _methodtable:

The Module's Method Table and Initialization Function
=====================================================

I promised to show how :cfunc:`spam_system` is called from Python programs.
First, we need to list its name and address in a "method table"::

   static PyMethodDef SpamMethods[] = {
       ...
       {"system",  spam_system, METH_VARARGS,
        "Execute a shell command."},
       ...
       {NULL, NULL, 0, NULL}        /* Sentinel */
   };

Note the third entry (``METH_VARARGS``). This is a flag telling the interpreter
the calling convention to be used for the C function. It should normally always
be ``METH_VARARGS`` or ``METH_VARARGS | METH_KEYWORDS``; a value of ``0`` means
that an obsolete variant of :cfunc:`PyArg_ParseTuple` is used.

When using only ``METH_VARARGS``, the function should expect the Python-level
parameters to be passed in as a tuple acceptable for parsing via
:cfunc:`PyArg_ParseTuple`; more information on this function is provided below.

The :const:`METH_KEYWORDS` bit may be set in the third field if keyword
arguments should be passed to the function. In this case, the C function should
accept a third ``PyObject *`` parameter which will be a dictionary of keywords.
Use :cfunc:`PyArg_ParseTupleAndKeywords` to parse the arguments to such a
function.
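
The sentinel entry is what lets code walk a method table without being told its
length. As a rough illustration only, here is a toy model in plain C that scans
such a table until it reaches the all-*NULL* entry. ``ToyMethodDef``,
``ToyMethods``, ``toy_find`` and the two sample functions are made-up names, and
the sketch leaves out the flags field that the real :ctype:`PyMethodDef` has::

   #include <stddef.h>
   #include <string.h>

   typedef int (*toy_func)(int);

   /* Hypothetical, simplified table entry: a name, a function, a docstring. */
   typedef struct {
       const char *name;   /* NULL marks the end of the table */
       toy_func    func;
       const char *doc;
   } ToyMethodDef;

   static int toy_double(int x) { return 2 * x; }
   static int toy_negate(int x) { return -x; }

   static ToyMethodDef ToyMethods[] = {
       {"double", toy_double, "Return twice the argument."},
       {"negate", toy_negate, "Return the negated argument."},
       {NULL, NULL, NULL}           /* Sentinel */
   };

   /* Walk the table until the sentinel; return NULL if the name is absent. */
   static toy_func
   toy_find(const ToyMethodDef *table, const char *name)
   {
       const ToyMethodDef *def;

       for (def = table; def->name != NULL; def++) {
           if (strcmp(def->name, name) == 0)
               return def->func;
       }
       return NULL;
   }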

The method table must be referenced in the module definition structure::

   static struct PyModuleDef spammodule = {
       PyModuleDef_HEAD_INIT,
       "spam",   /* name of module */
       spam_doc, /* module documentation, may be NULL */
       -1,       /* size of per-interpreter state of the module,
                    or -1 if the module keeps state in global variables. */
       SpamMethods
   };

This structure, in turn, must be passed to the interpreter in the module's
initialization function. The initialization function must be named
:cfunc:`PyInit_name`, where *name* is the name of the module, and should be the
only non-\ ``static`` item defined in the module file::

   PyMODINIT_FUNC
   PyInit_spam(void)
   {
       return PyModule_Create(&spammodule);
   }

Note that ``PyMODINIT_FUNC`` declares the function as ``PyObject *`` return type,
declares any special linkage declarations required by the platform, and for C++
declares the function as ``extern "C"``.

When the Python program imports module :mod:`spam` for the first time,
:cfunc:`PyInit_spam` is called. (See below for comments about embedding Python.)
It calls :cfunc:`PyModule_Create`, which returns a module object, and
inserts built-in function objects into the newly created module based upon the
table (an array of :ctype:`PyMethodDef` structures) found in the module definition.
:cfunc:`PyModule_Create` returns a pointer to the module object
that it creates. It may abort with a fatal error for
certain errors, or return *NULL* if the module could not be initialized
satisfactorily. The init function must return the module object to its caller,
so that it then gets inserted into ``sys.modules``.

When embedding Python, the :cfunc:`PyInit_spam` function is not called
automatically unless there's an entry in the :cdata:`PyImport_Inittab` table.
To add the module to the initialization table, use :cfunc:`PyImport_AppendInittab`,
optionally followed by an import of the module::

   int
   main(int argc, char *argv[])
   {
       /* Add a builtin module, before Py_Initialize */
       PyImport_AppendInittab("spam", PyInit_spam);

       /* Pass argv[0] to the Python interpreter */
       Py_SetProgramName(argv[0]);

       /* Initialize the Python interpreter. Required. */
       Py_Initialize();

       /* Optionally import the module; alternatively,
          import can be deferred until the embedded script
          imports it. */
       PyImport_ImportModule("spam");

       ...
   }

An example may be found in the file :file:`Demo/embed/demo.c` in the Python
source distribution.

.. note::

   Removing entries from ``sys.modules`` or importing compiled modules into
   multiple interpreters within a process (or following a :cfunc:`fork` without an
   intervening :cfunc:`exec`) can create problems for some extension modules.
   Extension module authors should exercise caution when initializing internal data
   structures.

A more substantial example module is included in the Python source distribution
as :file:`Modules/xxmodule.c`. This file may be used as a template or simply
read as an example. The :program:`modulator.py` script included in the source
distribution or Windows install provides a simple graphical user interface for
declaring the functions and objects which a module should implement, and can
generate a template which can be filled in. The script lives in the
:file:`Tools/modulator/` directory; see the :file:`README` file there for more
information.


.. _compilation:

Compilation and Linkage
=======================

There are two more things to do before you can use your new extension: compiling
and linking it with the Python system. If you use dynamic loading, the details
may depend on the style of dynamic loading your system uses; see the chapters
about building extension modules (chapter :ref:`building`) and additional
information that pertains only to building on Windows (chapter
:ref:`building-on-windows`) for more information about this.

If you can't use dynamic loading, or if you want to make your module a permanent
part of the Python interpreter, you will have to change the configuration setup
and rebuild the interpreter. Luckily, this is very simple on Unix: just place
your file (:file:`spammodule.c` for example) in the :file:`Modules/` directory
of an unpacked source distribution, add a line to the file
:file:`Modules/Setup.local` describing your file::

   spam spammodule.o

and rebuild the interpreter by running :program:`make` in the toplevel
directory. You can also run :program:`make` in the :file:`Modules/`
subdirectory, but then you must first rebuild :file:`Makefile` there by running
':program:`make` Makefile'. (This is necessary each time you change the
:file:`Setup` file.)

If your module requires additional libraries to link with, these can be listed
on the line in the configuration file as well, for instance::

   spam spammodule.o -lX11


.. _callingpython:

Calling Python Functions from C
===============================

So far we have concentrated on making C functions callable from Python. The
reverse is also useful: calling Python functions from C. This is especially the
case for libraries that support so-called "callback" functions. If a C
interface makes use of callbacks, the equivalent Python often needs to provide a
callback mechanism to the Python programmer; the implementation will require
calling the Python callback functions from a C callback. Other uses are also
imaginable.

Fortunately, the Python interpreter is easily called recursively, and there is a
standard interface to call a Python function. (I won't dwell on how to call the
Python parser with a particular string as input --- if you're interested, have a
look at the implementation of the :option:`-c` command line option in
:file:`Modules/main.c` from the Python source code.)

Calling a Python function is easy. First, the Python program must somehow pass
you the Python function object. You should provide a function (or some other
interface) to do this. When this function is called, save a pointer to the
Python function object (be careful to :cfunc:`Py_INCREF` it!) in a global
variable --- or wherever you see fit. For example, the following function might
be part of a module definition::

   static PyObject *my_callback = NULL;

   static PyObject *
   my_set_callback(PyObject *dummy, PyObject *args)
   {
       PyObject *result = NULL;
       PyObject *temp;

       if (PyArg_ParseTuple(args, "O:set_callback", &temp)) {
           if (!PyCallable_Check(temp)) {
               PyErr_SetString(PyExc_TypeError, "parameter must be callable");
               return NULL;
           }
           Py_XINCREF(temp);         /* Add a reference to new callback */
           Py_XDECREF(my_callback);  /* Dispose of previous callback */
           my_callback = temp;       /* Remember new callback */
           /* Boilerplate to return "None" */
           Py_INCREF(Py_None);
           result = Py_None;
       }
       return result;
   }

This function must be registered with the interpreter using the
:const:`METH_VARARGS` flag; this is described in section :ref:`methodtable`. The
:cfunc:`PyArg_ParseTuple` function and its arguments are documented in section
:ref:`parsetuple`.

The macros :cfunc:`Py_XINCREF` and :cfunc:`Py_XDECREF` increment/decrement the
reference count of an object and are safe in the presence of *NULL* pointers
(but note that *temp* will not be *NULL* in this context). More info on them
in section :ref:`refcounts`.
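
The order used in ``my_set_callback`` (add a reference to the new callback
first, release the old one second) matters when the same object is registered
twice. The sketch below is a toy model in plain C with hypothetical names
(``ToyObject``, ``toy_new``, ``toy_xincref``, ``toy_xdecref``,
``toy_set_callback``); it only mimics the counting behaviour and says nothing
about how CPython actually lays out objects. Releasing first would free the
object while it is still wanted::

   #include <stdlib.h>

   /* Hypothetical reference-counted object. */
   typedef struct {
       int refcnt;
       int value;
   } ToyObject;

   static int toy_live = 0;    /* number of objects not yet freed */

   static ToyObject *
   toy_new(int value)
   {
       ToyObject *o = malloc(sizeof *o);

       if (o == NULL)
           return NULL;
       o->refcnt = 1;          /* the caller owns the first reference */
       o->value = value;
       toy_live++;
       return o;
   }

   /* Like the X variants in the text, both helpers accept NULL. */
   static void
   toy_xincref(ToyObject *o)
   {
       if (o != NULL)
           o->refcnt++;
   }

   static void
   toy_xdecref(ToyObject *o)
   {
       if (o != NULL && --o->refcnt == 0) {
           toy_live--;
           free(o);
       }
   }

   static ToyObject *toy_callback = NULL;

   static void
   toy_set_callback(ToyObject *temp)
   {
       toy_xincref(temp);          /* add a reference to the new callback */
       toy_xdecref(toy_callback);  /* dispose of the previous callback */
       toy_callback = temp;        /* remember the new callback */
   }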

.. index:: single: PyObject_CallObject()

Later, when it is time to call the function, you call the C function
:cfunc:`PyObject_CallObject`. This function has two arguments, both pointers to
arbitrary Python objects: the Python function, and the argument list. The
argument list must always be a tuple object, whose length is the number of
arguments. To call the Python function with no arguments, pass in *NULL*, or
an empty tuple; to call it with one argument, pass a singleton tuple.
:cfunc:`Py_BuildValue` returns a tuple when its format string consists of zero
or more format codes between parentheses. For example::

   int arg;
   PyObject *arglist;
   PyObject *result;
   ...
   arg = 123;
   ...
   /* Time to call the callback */
   arglist = Py_BuildValue("(i)", arg);
   result = PyObject_CallObject(my_callback, arglist);
   Py_DECREF(arglist);

:cfunc:`PyObject_CallObject` returns a Python object pointer: this is the return
value of the Python function. :cfunc:`PyObject_CallObject` is
"reference-count-neutral" with respect to its arguments. In the example a new
tuple was created to serve as the argument list, which is
:cfunc:`Py_DECREF`\ -ed immediately after the call.

The return value of :cfunc:`PyObject_CallObject` is "new": either it is a brand
new object, or it is an existing object whose reference count has been
incremented. So, unless you want to save it in a global variable, you should
somehow :cfunc:`Py_DECREF` the result, even (especially!) if you are not
interested in its value.

Before you do this, however, it is important to check that the return value
isn't *NULL*. If it is, the Python function terminated by raising an exception.
If the C code that called :cfunc:`PyObject_CallObject` is called from Python, it
should now return an error indication to its Python caller, so the interpreter
can print a stack trace, or the calling Python code can handle the exception.
If this is not possible or desirable, the exception should be cleared by calling
:cfunc:`PyErr_Clear`. For example::

   if (result == NULL)
       return NULL; /* Pass error back */
   ...use result...
   Py_DECREF(result);

Depending on the desired interface to the Python callback function, you may also
have to provide an argument list to :cfunc:`PyObject_CallObject`. In some cases
the argument list is also provided by the Python program, through the same
interface that specified the callback function. It can then be saved and used
in the same manner as the function object. In other cases, you may have to
construct a new tuple to pass as the argument list. The simplest way to do this
is to call :cfunc:`Py_BuildValue`. For example, if you want to pass an integral
event code, you might use the following code::

   PyObject *arglist;
   ...
   arglist = Py_BuildValue("(l)", eventcode);
   result = PyObject_CallObject(my_callback, arglist);
   Py_DECREF(arglist);
   if (result == NULL)
       return NULL; /* Pass error back */
   /* Here maybe use the result */
   Py_DECREF(result);

Note the placement of ``Py_DECREF(arglist)`` immediately after the call, before
the error check! Also note that strictly speaking this code is not complete:
:cfunc:`Py_BuildValue` may run out of memory, and this should be checked.

You may also call a function with keyword arguments by using
:cfunc:`PyObject_Call`, which supports arguments and keyword arguments. As in
the above example, we use :cfunc:`Py_BuildValue` to construct the dictionary.
The positional argument tuple passed to :cfunc:`PyObject_Call` must not be
*NULL*, so an empty tuple is passed when there are no positional arguments. ::

   PyObject *noargs;
   PyObject *dict;
   ...
   noargs = PyTuple_New(0);
   dict = Py_BuildValue("{s:i}", "name", val);
   result = PyObject_Call(my_callback, noargs, dict);
   Py_DECREF(noargs);
   Py_DECREF(dict);
   if (result == NULL)
       return NULL; /* Pass error back */
   /* Here maybe use the result */
   Py_DECREF(result);


.. _parsetuple:

Extracting Parameters in Extension Functions
============================================

.. index:: single: PyArg_ParseTuple()

The :cfunc:`PyArg_ParseTuple` function is declared as follows::

   int PyArg_ParseTuple(PyObject *arg, char *format, ...);

The *arg* argument must be a tuple object containing an argument list passed
from Python to a C function. The *format* argument must be a format string,
whose syntax is explained in :ref:`arg-parsing` in the Python/C API Reference
Manual. The remaining arguments must be addresses of variables whose type is
determined by the format string.

Note that while :cfunc:`PyArg_ParseTuple` checks that the Python arguments have
the required types, it cannot check the validity of the addresses of C variables
passed to the call: if you make mistakes there, your code will probably crash or
at least overwrite random bits in memory. So be careful!
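
The addresses cannot be checked because a variadic C function learns the types
of its extra arguments only from the format string. A minimal sketch of that
mechanism follows, in plain C and deliberately much simpler than the real
function: ``toy_parse`` and its three format codes are assumptions made for this
example, and the "arguments" are plain C strings rather than Python objects.
Passing the address of an ``int`` where the format says ``l`` would make the
function write a ``long`` through it, and nothing could detect that::

   #include <stdarg.h>
   #include <stdlib.h>

   /* Toy parser: 'i' stores an int, 'l' a long, 's' a pointer to the text.
      Returns 1 on success and 0 on failure. */
   static int
   toy_parse(const char *const *items, int nitems, const char *format, ...)
   {
       va_list ap;
       int i = 0;
       int ok = 1;

       va_start(ap, format);
       while (ok && *format != '\0') {
           if (i >= nitems) {      /* more format codes than items */
               ok = 0;
               break;
           }
           /* The format code alone decides how the next address is used. */
           switch (*format) {
           case 'i':
               *va_arg(ap, int *) = atoi(items[i]);
               break;
           case 'l':
               *va_arg(ap, long *) = atol(items[i]);
               break;
           case 's':
               *va_arg(ap, const char **) = items[i];
               break;
           default:                /* unknown format code */
               ok = 0;
               break;
           }
           format++;
           i++;
       }
       va_end(ap);
       if (ok && i != nitems)      /* more items than format codes */
           ok = 0;
       return ok;
   }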

Note that any Python object references which are provided to the caller are
*borrowed* references; do not decrement their reference count!

Some example calls::

   #define PY_SSIZE_T_CLEAN  /* Make "s#" use Py_ssize_t rather than int. */
   #include <Python.h>

::

   int ok;
   int i, j;
   long k, l;
   const char *s;
   Py_ssize_t size;

   ok = PyArg_ParseTuple(args, ""); /* No arguments */
       /* Python call: f() */

::

   ok = PyArg_ParseTuple(args, "s", &s); /* A string */
       /* Possible Python call: f('whoops!') */

::

   ok = PyArg_ParseTuple(args, "lls", &k, &l, &s); /* Two longs and a string */
       /* Possible Python call: f(1, 2, 'three') */

::

   ok = PyArg_ParseTuple(args, "(ii)s#", &i, &j, &s, &size);
       /* A pair of ints and a string, whose size is also returned */
       /* Possible Python call: f((1, 2), 'three') */

::

   {
       const char *file;
       const char *mode = "r";
       int bufsize = 0;
       ok = PyArg_ParseTuple(args, "s|si", &file, &mode, &bufsize);
       /* A string, and optionally another string and an integer */
       /* Possible Python calls:
          f('spam')
          f('spam', 'w')
          f('spam', 'wb', 100000) */
   }

::

   {
       int left, top, right, bottom, h, v;
       ok = PyArg_ParseTuple(args, "((ii)(ii))(ii)",
                &left, &top, &right, &bottom, &h, &v);
       /* A rectangle and a point */
       /* Possible Python call:
          f(((0, 0), (400, 300)), (10, 10)) */
   }

::

   {
       Py_complex c;
       ok = PyArg_ParseTuple(args, "D:myfunction", &c);
       /* a complex, also providing a function name for errors */
       /* Possible Python call: myfunction(1+2j) */
   }


.. _parsetupleandkeywords:

Keyword Parameters for Extension Functions
==========================================

.. index:: single: PyArg_ParseTupleAndKeywords()

The :cfunc:`PyArg_ParseTupleAndKeywords` function is declared as follows::

   int PyArg_ParseTupleAndKeywords(PyObject *arg, PyObject *kwdict,
                                   const char *format, char *kwlist[], ...);

The *arg* and *format* parameters are identical to those of the
:cfunc:`PyArg_ParseTuple` function. The *kwdict* parameter is the dictionary of
keywords received as the third parameter from the Python runtime. The *kwlist*
parameter is a *NULL*-terminated list of strings which identify the parameters;
the names are matched with the type information from *format* from left to
right. On success, :cfunc:`PyArg_ParseTupleAndKeywords` returns true, otherwise
it returns false and raises an appropriate exception.

.. note::

   Nested tuples cannot be parsed when using keyword arguments! Keyword parameters
   passed in which are not present in the *kwlist* will cause :exc:`TypeError` to
   be raised.

.. index:: single: Philbrick, Geoff

Here is an example module which uses keywords, based on an example by Geoff
Philbrick (philbrick@hks.com)::

   #include "Python.h"

   static PyObject *
   keywdarg_parrot(PyObject *self, PyObject *args, PyObject *keywds)
   {
       int voltage;
       char *state = "a stiff";
       char *action = "voom";
       char *type = "Norwegian Blue";

       static char *kwlist[] = {"voltage", "state", "action", "type", NULL};

       if (!PyArg_ParseTupleAndKeywords(args, keywds, "i|sss", kwlist,
                                        &voltage, &state, &action, &type))
           return NULL;

       printf("-- This parrot wouldn't %s if you put %i Volts through it.\n",
              action, voltage);
       printf("-- Lovely plumage, the %s -- It's %s!\n", type, state);

       Py_INCREF(Py_None);

       return Py_None;
   }

   static PyMethodDef keywdarg_methods[] = {
       /* The cast of the function is necessary since PyCFunction values
        * only take two PyObject* parameters, and keywdarg_parrot() takes
        * three.
        */
       {"parrot", (PyCFunction)keywdarg_parrot, METH_VARARGS | METH_KEYWORDS,
        "Print a lovely skit to standard output."},
       {NULL, NULL, 0, NULL}   /* sentinel */
   };

::

   static struct PyModuleDef keywdargmodule = {
       PyModuleDef_HEAD_INIT, "keywdarg", NULL, -1, keywdarg_methods
   };

   PyMODINIT_FUNC
   PyInit_keywdarg(void)
   {
       /* Create the module and add the functions */
       return PyModule_Create(&keywdargmodule);
   }


.. _buildvalue:

Building Arbitrary Values
=========================

The :cfunc:`Py_BuildValue` function is the counterpart to
:cfunc:`PyArg_ParseTuple`. It is declared as follows::

   PyObject *Py_BuildValue(const char *format, ...);

It recognizes a set of format units similar to the ones recognized by
:cfunc:`PyArg_ParseTuple`, but the arguments (which are input to the function,
not output) must not be pointers, just values. It returns a new Python object,
suitable for returning from a C function called from Python.

One difference with :cfunc:`PyArg_ParseTuple`: while the latter requires its
first argument to be a tuple (since Python argument lists are always represented
as tuples internally), :cfunc:`Py_BuildValue` does not always build a tuple. It
builds a tuple only if its format string contains two or more format units. If
the format string is empty, it returns ``None``; if it contains exactly one
format unit, it returns whatever object is described by that format unit. To
force it to return a tuple of size 0 or one, parenthesize the format string.

Examples (to the left the call, to the right the resulting Python value)::

   Py_BuildValue("")                        None
   Py_BuildValue("i", 123)                  123
   Py_BuildValue("iii", 123, 456, 789)      (123, 456, 789)
   Py_BuildValue("s", "hello")              'hello'
   Py_BuildValue("y", "hello")              b'hello'
   Py_BuildValue("ss", "hello", "world")    ('hello', 'world')
   Py_BuildValue("s#", "hello", 4)          'hell'
   Py_BuildValue("y#", "hello", 4)          b'hell'
   Py_BuildValue("()")                      ()
   Py_BuildValue("(i)", 123)                (123,)
   Py_BuildValue("(ii)", 123, 456)          (123, 456)
   Py_BuildValue("(i,i)", 123, 456)         (123, 456)
   Py_BuildValue("[i,i]", 123, 456)         [123, 456]
   Py_BuildValue("{s:i,s:i}",
                 "abc", 123, "def", 456)    {'abc': 123, 'def': 456}
   Py_BuildValue("((ii)(ii)) (ii)",
                 1, 2, 3, 4, 5, 6)          (((1, 2), (3, 4)), (5, 6))


.. _refcounts:

Reference Counts
================

In languages like C or C++, the programmer is responsible for dynamic allocation
and deallocation of memory on the heap. In C, this is done using the functions
:cfunc:`malloc` and :cfunc:`free`. In C++, the operators ``new`` and
``delete`` are used with essentially the same meaning and we'll restrict
the following discussion to the C case.

Every block of memory allocated with :cfunc:`malloc` should eventually be
returned to the pool of available memory by exactly one call to :cfunc:`free`.
It is important to call :cfunc:`free` at the right time. If a block's address
is forgotten but :cfunc:`free` is not called for it, the memory it occupies
cannot be reused until the program terminates. This is called a :dfn:`memory
leak`. On the other hand, if a program calls :cfunc:`free` for a block and then
continues to use the block, it creates a conflict with re-use of the block
through another :cfunc:`malloc` call. This is called :dfn:`using freed memory`.
It has the same bad consequences as referencing uninitialized data --- core
dumps, wrong results, mysterious crashes.

Common causes of memory leaks are unusual paths through the code. For instance,
a function may allocate a block of memory, do some calculation, and then free
the block again. Now a change in the requirements for the function may add a
test to the calculation that detects an error condition and can return
prematurely from the function. It's easy to forget to free the allocated memory
block when taking this premature exit, especially when it is added later to the
code. Such leaks, once introduced, often go undetected for a long time: the
error exit is taken only in a small fraction of all calls, and most modern
machines have plenty of virtual memory, so the leak only becomes apparent in a
long-running process that uses the leaking function frequently. Therefore, it's
important to prevent leaks from happening by having a coding convention or
strategy that minimizes this kind of error.
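One such convention is to route every exit taken after an allocation through a
single cleanup label. The following is only a sketch in plain C: the names
``checked_sum``, ``tracked_alloc``, ``tracked_free`` and ``live_blocks`` are
invented for this illustration, and the counter exists solely so that the
absence of a leak can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: blocks allocated but not yet freed. */
static int live_blocks = 0;

static void *
tracked_alloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_blocks++;
    return p;
}

static void
tracked_free(void *p)
{
    if (p != NULL) {
        live_blocks--;
        free(p);
    }
}

/* Sum n values into *out, rejecting negative input.
   Returns 0 on success and -1 on error.  Every exit taken after the
   allocation passes through "done", so the scratch block is always freed. */
static int
checked_sum(const int *values, size_t n, long *out)
{
    int status = -1;
    long total = 0;
    size_t i;
    long *scratch;

    assert(out != NULL);
    scratch = tracked_alloc((n ? n : 1) * sizeof *scratch);
    if (scratch == NULL)
        return -1;              /* nothing allocated, nothing to free */

    for (i = 0; i < n; i++) {
        if (values[i] < 0)
            goto done;          /* the error test that was "added later" */
        scratch[i] = values[i];
        total += scratch[i];
    }
    *out = total;
    status = 0;

done:
    tracked_free(scratch);
    return status;
}
```

Had the error test used a bare ``return -1;`` instead of ``goto done;``, the
scratch block would be lost on that path, which is exactly the kind of leak
described above.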

Since Python makes heavy use of :cfunc:`malloc` and :cfunc:`free`, it needs a
strategy to avoid memory leaks as well as the use of freed memory. The chosen
method is called :dfn:`reference counting`. The principle is simple: every
object contains a counter, which is incremented when a reference to the object
is stored somewhere, and which is decremented when a reference to it is deleted.
When the counter reaches zero, the last reference to the object has been deleted
and the object is freed.
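The principle fits in a few lines of plain C. This is a toy model and not
CPython's implementation: the ``Counted`` type, the ``counted_*`` functions and
the ``objects_freed`` counter are all hypothetical names introduced here.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted object (hypothetical; not CPython's PyObject). */
typedef struct {
    int refcount;
    int payload;
} Counted;

static int objects_freed = 0;   /* lets a caller observe deallocation */

/* Create an object; the caller holds the single initial reference. */
static Counted *
counted_new(int payload)
{
    Counted *obj = malloc(sizeof *obj);
    if (obj == NULL)
        return NULL;
    obj->refcount = 1;
    obj->payload = payload;
    return obj;
}

/* A reference is being stored somewhere: increment the counter. */
static void
counted_incref(Counted *obj)
{
    assert(obj != NULL);
    obj->refcount++;
}

/* A reference is being deleted: decrement, and free the object at zero. */
static void
counted_decref(Counted *obj)
{
    assert(obj != NULL && obj->refcount > 0);
    if (--obj->refcount == 0) {
        objects_freed++;
        free(obj);
    }
}
```

After ``counted_new()``, one ``counted_incref()`` and two ``counted_decref()``
calls, the counter has gone 1, 2, 1, 0 and the object has been freed exactly
once.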

An alternative strategy is called :dfn:`automatic garbage collection`.
(Sometimes, reference counting is also referred to as a garbage collection
strategy, hence my use of "automatic" to distinguish the two.) The big
advantage of automatic garbage collection is that the user doesn't need to call
:cfunc:`free` explicitly. (Another claimed advantage is an improvement in speed
or memory usage --- this is no hard fact however.) The disadvantage is that for
C, there is no truly portable automatic garbage collector, while reference
counting can be implemented portably (as long as the functions :cfunc:`malloc`
and :cfunc:`free` are available --- which the C Standard guarantees). Maybe some
day a sufficiently portable automatic garbage collector will be available for C.
Until then, we'll have to live with reference counts.

While Python uses the traditional reference counting implementation, it also
offers a cycle detector that works to detect reference cycles. This allows
applications to not worry about creating direct or indirect circular references;
these are the weakness of garbage collection implemented using only reference
counting. Reference cycles consist of objects which contain (possibly indirect)
references to themselves, so that each object in the cycle has a reference count
which is non-zero. Typical reference counting implementations are not able to
reclaim the memory belonging to any objects in a reference cycle, or referenced
from the objects in the cycle, even though there are no further references to
the cycle itself.
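The weakness can be reproduced with a toy counter in plain C. The ``Node``
type and the ``node_*`` and ``build_and_drop`` functions below are hypothetical
and only model pure reference counting; they say nothing about how Python's
cycle detector works. Note that the cyclic case leaks two small blocks on
purpose.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy node that may own a reference to one other node (hypothetical). */
typedef struct Node {
    int refcount;
    struct Node *other;         /* owned reference, or NULL */
} Node;

static int live_nodes = 0;      /* nodes allocated and not yet freed */

static Node *
node_new(void)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->refcount = 1;
    n->other = NULL;
    live_nodes++;
    return n;
}

static void
node_decref(Node *n)
{
    assert(n != NULL && n->refcount > 0);
    if (--n->refcount == 0) {
        if (n->other != NULL)
            node_decref(n->other);  /* release the reference we owned */
        live_nodes--;
        free(n);
    }
}

/* Store an owned reference to target in n->other. */
static void
node_link(Node *n, Node *target)
{
    target->refcount++;
    if (n->other != NULL)
        node_decref(n->other);
    n->other = target;
}

/* Create two nodes, link them, then drop both outside references.
   Returns how many of the new nodes are still allocated afterwards,
   or -1 if allocation failed. */
static int
build_and_drop(int make_cycle)
{
    int before = live_nodes;
    Node *a = node_new();
    Node *b = node_new();

    if (a == NULL || b == NULL) {
        if (a != NULL)
            node_decref(a);
        if (b != NULL)
            node_decref(b);
        return -1;
    }
    node_link(a, b);            /* a -> b: b's count becomes 2 */
    if (make_cycle)
        node_link(b, a);        /* b -> a: a's count becomes 2 */
    node_decref(b);             /* drop our own reference to b */
    node_decref(a);             /* drop our own reference to a */
    return live_nodes - before;
}
```

Without the back link, dropping ``a`` cascades to ``b`` and both nodes are
freed. With it, each node is kept alive by the other, both counts stay at 1,
and neither node can ever be reached or freed again.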

The cycle detector is able to detect garbage cycles and can reclaim them so long
as there are no finalizers implemented in Python (:meth:`__del__` methods).
When there are such finalizers, the detector exposes the cycles through the
:mod:`gc` module (specifically, the
``garbage`` variable in that module). The :mod:`gc` module also exposes a way
to run the detector (the :func:`collect` function), as well as configuration
interfaces and the ability to disable the detector at runtime. The cycle
detector is considered an optional component; though it is included by default,
it can be disabled at build time using the :option:`--without-cycle-gc` option
to the :program:`configure` script on Unix platforms (including Mac OS X). If
the cycle detector is disabled in this way, the :mod:`gc` module will not be
available.


.. _refcountsinpython:

Reference Counting in Python
----------------------------

There are two macros, ``Py_INCREF(x)`` and ``Py_DECREF(x)``, which handle the
incrementing and decrementing of the reference count. :cfunc:`Py_DECREF` also
frees the object when the count reaches zero. For flexibility, it doesn't call
:cfunc:`free` directly --- rather, it makes a call through a function pointer in
the object's :dfn:`type object`. For this purpose (and others), every object
also contains a pointer to its type object.
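The indirection through the type object can be pictured with the following
plain C sketch. It is a simplified model, not the actual definition of
``PyObject`` or :cfunc:`Py_DECREF`; every ``Toy*`` and ``TOY_*`` name is an
assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ToyObject ToyObject;

/* Toy "type object": knows how to free instances of the type. */
typedef struct {
    const char *name;
    void (*dealloc)(ToyObject *self);
} ToyType;

/* Every object carries a reference count and a pointer to its type. */
struct ToyObject {
    int refcount;
    const ToyType *type;
};

/* A concrete type that owns an extra heap buffer. */
typedef struct {
    ToyObject base;             /* first member, so the pointers convert */
    char *buffer;
} ToyBuffer;

static int buffers_released = 0;

static void
toybuffer_dealloc(ToyObject *self)
{
    ToyBuffer *b = (ToyBuffer *)self;

    assert(self->refcount == 0);
    free(b->buffer);            /* type-specific cleanup */
    buffers_released++;
    free(b);
}

static const ToyType ToyBuffer_Type = { "ToyBuffer", toybuffer_dealloc };

static ToyObject *
toybuffer_new(size_t size)
{
    ToyBuffer *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->buffer = malloc(size ? size : 1);
    if (b->buffer == NULL) {
        free(b);
        return NULL;
    }
    b->base.refcount = 1;
    b->base.type = &ToyBuffer_Type;
    return &b->base;
}

#define TOY_INCREF(op) ((op)->refcount++)

/* Does not call free() itself; dispatches through the type object. */
#define TOY_DECREF(op)                          \
    do {                                        \
        ToyObject *toy_tmp = (op);              \
        if (--toy_tmp->refcount == 0)           \
            toy_tmp->type->dealloc(toy_tmp);    \
    } while (0)
```

Because ``TOY_DECREF`` only knows about ``ToyObject``, a new type with different
cleanup needs (say, one that closes a file) would only have to supply its own
``dealloc`` function.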

The big question now remains: when to use ``Py_INCREF(x)`` and ``Py_DECREF(x)``?
Let's first introduce some terms. Nobody "owns" an object; however, you can
:dfn:`own a reference` to an object. An object's reference count is now defined
as the number of owned references to it. The owner of a reference is
responsible for calling :cfunc:`Py_DECREF` when the reference is no longer
needed. Ownership of a reference can be transferred. There are three ways to
dispose of an owned reference: pass it on, store it, or call :cfunc:`Py_DECREF`.
Forgetting to dispose of an owned reference creates a memory leak.

It is also possible to :dfn:`borrow` [#]_ a reference to an object. The
borrower of a reference should not call :cfunc:`Py_DECREF`. The borrower must
not hold on to the object longer than the owner from which it was borrowed.
Using a borrowed reference after the owner has disposed of it risks using freed
memory and should be avoided completely. [#]_

The advantage of borrowing over owning a reference is that you don't need to
take care of disposing of the reference on all possible paths through the code
--- in other words, with a borrowed reference you don't run the risk of leaking
when a premature exit is taken. The disadvantage of borrowing over owning is
that there are some subtle situations where in seemingly correct code a borrowed
reference can be used after the owner from which it was borrowed has in fact
disposed of it.

A borrowed reference can be changed into an owned reference by calling
:cfunc:`Py_INCREF`. This does not affect the status of the owner from which the
reference was borrowed --- it creates a new owned reference, and gives full
owner responsibilities (the new owner must dispose of the reference properly, as
well as the previous owner).


.. _ownershiprules:

Ownership Rules
---------------

Whenever an object reference is passed into or out of a function, it is part of
the function's interface specification whether ownership is transferred with the
reference or not.

Most functions that return a reference to an object pass on ownership with the
reference. In particular, all functions whose purpose is to create a new
object, such as :cfunc:`PyLong_FromLong` and :cfunc:`Py_BuildValue`, pass
ownership to the receiver. Even if the object is not actually new, you still
receive ownership of a new reference to that object. For instance,
:cfunc:`PyLong_FromLong` maintains a cache of popular values and can return a
reference to a cached item.

Many functions that extract objects from other objects also transfer ownership
with the reference, for instance :cfunc:`PyObject_GetAttrString`. The picture
is less clear here, however, since a few common routines are exceptions:
:cfunc:`PyTuple_GetItem`, :cfunc:`PyList_GetItem`, :cfunc:`PyDict_GetItem`, and
:cfunc:`PyDict_GetItemString` all return references that you borrow from the
tuple, list or dictionary.

The function :cfunc:`PyImport_AddModule` also returns a borrowed reference, even
though it may actually create the object it returns: this is possible because an
owned reference to the object is stored in ``sys.modules``.

When you pass an object reference into another function, in general, the
function borrows the reference from you --- if it needs to store it, it will use
:cfunc:`Py_INCREF` to become an independent owner. There are exactly two
important exceptions to this rule: :cfunc:`PyTuple_SetItem` and
:cfunc:`PyList_SetItem`. These functions take over ownership of the item passed
to them --- even if they fail! (Note that :cfunc:`PyDict_SetItem` and friends
don't take over ownership --- they are "normal.")

When a C function is called from Python, it borrows references to its arguments
from the caller. The caller owns a reference to the object, so the borrowed
reference's lifetime is guaranteed until the function returns. Only when such a
borrowed reference must be stored or passed on must it be turned into an owned
reference by calling :cfunc:`Py_INCREF`.

The object reference returned from a C function that is called from Python must
be an owned reference --- ownership is transferred from the function to its
caller.


.. _thinice:

Thin Ice
--------

There are a few situations where seemingly harmless use of a borrowed reference
can lead to problems. These all have to do with implicit invocations of the
interpreter, which can cause the owner of a reference to dispose of it.

The first and most important case to know about is using :cfunc:`Py_DECREF` on
an unrelated object while borrowing a reference to a list item. For instance::

   void
   bug(PyObject *list)
   {
       PyObject *item = PyList_GetItem(list, 0);

       PyList_SetItem(list, 1, PyLong_FromLong(0L));
       PyObject_Print(item, stdout, 0); /* BUG! */
   }

This function first borrows a reference to ``list[0]``, then replaces
``list[1]`` with the value ``0``, and finally prints the borrowed reference.
Looks harmless, right? But it's not!

Let's follow the control flow into :cfunc:`PyList_SetItem`. The list owns
references to all its items, so when item 1 is replaced, it has to dispose of
the original item 1. Now let's suppose the original item 1 was an instance of a
user-defined class, and let's further suppose that the class defined a
:meth:`__del__` method. If this class instance has a reference count of 1,
disposing of it will call its :meth:`__del__` method.

Since it is written in Python, the :meth:`__del__` method can execute arbitrary
Python code. Could it perhaps do something to invalidate the reference to
``item`` in :cfunc:`bug`? You bet! Assuming that the list passed into
:cfunc:`bug` is accessible to the :meth:`__del__` method, it could execute a
statement to the effect of ``del list[0]``, and assuming this was the last
reference to that object, it would free the memory associated with it, thereby
invalidating ``item``.

The solution, once you know the source of the problem, is easy: temporarily
increment the reference count. The correct version of the function reads::

   void
   no_bug(PyObject *list)
   {
       PyObject *item = PyList_GetItem(list, 0);

       Py_INCREF(item);
       PyList_SetItem(list, 1, PyLong_FromLong(0L));
       PyObject_Print(item, stdout, 0);
       Py_DECREF(item);
   }

This is a true story. An older version of Python contained variants of this bug
and someone spent a considerable amount of time in a C debugger to figure out
why his :meth:`__del__` methods would fail...

The second case of problems with a borrowed reference is a variant involving
threads. Normally, multiple threads in the Python interpreter can't get in each
other's way, because there is a global lock protecting Python's entire object
space. However, it is possible to temporarily release this lock using the macro
:cmacro:`Py_BEGIN_ALLOW_THREADS`, and to re-acquire it using
:cmacro:`Py_END_ALLOW_THREADS`. This is common around blocking I/O calls, to
let other threads use the processor while waiting for the I/O to complete.
Obviously, the following function has the same problem as the previous one::

   void
   bug(PyObject *list)
   {
       PyObject *item = PyList_GetItem(list, 0);
       Py_BEGIN_ALLOW_THREADS
       ...some blocking I/O call...
       Py_END_ALLOW_THREADS
       PyObject_Print(item, stdout, 0); /* BUG! */
   }


.. _nullpointers:

NULL Pointers
-------------

In general, functions that take object references as arguments do not expect you
to pass them *NULL* pointers, and will dump core (or cause later core dumps) if
you do so. Functions that return object references generally return *NULL* only
to indicate that an exception occurred. The reason for not testing for *NULL*
arguments is that functions often pass the objects they receive on to other
functions --- if each function were to test for *NULL*, there would be a lot of
redundant tests and the code would run more slowly.

It is better to test for *NULL* only at the "source": when a pointer that may be
*NULL* is received, for example, from :cfunc:`malloc` or from a function that
may raise an exception.
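The same discipline can be shown in ordinary C, independent of the Python API.
In this sketch the helper names ``fill_with``, ``count_char`` and
``make_and_count`` are made up; the only point is that the *NULL* test appears
once, where :cfunc:`malloc` hands over the pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Helpers further down the call chain: they require a non-NULL buffer
   and do not test for NULL again. */
static void
fill_with(char *buf, size_t n, char c)
{
    assert(c != '\0');          /* count_char() stops at the first NUL */
    memset(buf, c, n);
    buf[n] = '\0';
}

static size_t
count_char(const char *buf, char c)
{
    size_t hits = 0;

    for (; *buf != '\0'; buf++)
        if (*buf == c)
            hits++;
    return hits;
}

/* The "source": malloc() may return NULL, so the test happens here, once.
   Returns the number of characters counted, or -1 if allocation failed. */
static long
make_and_count(size_t n, char c)
{
    long hits;
    char *buf = malloc(n + 1);

    if (buf == NULL)
        return -1;              /* report the failure to the caller */
    fill_with(buf, n, c);
    hits = (long)count_char(buf, c);
    free(buf);
    return hits;
}
```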

The macros :cfunc:`Py_INCREF` and :cfunc:`Py_DECREF` do not check for *NULL*
pointers --- however, their variants :cfunc:`Py_XINCREF` and :cfunc:`Py_XDECREF`
do.

The macros for checking for a particular object type (``Pytype_Check()``) don't
check for *NULL* pointers --- again, there is much code that calls several of
these in a row to test an object against various different expected types, and
this would generate redundant tests. There are no variants with *NULL*
checking.

The C function calling mechanism guarantees that the argument list passed to C
functions (``args`` in the examples) is never *NULL* --- in fact it guarantees
that it is always a tuple. [#]_

It is a severe error to ever let a *NULL* pointer "escape" to the Python user.

.. Frank Stajano:
   A pedagogically buggy example, along the lines of the previous listing, would
   be helpful here -- showing in more concrete terms what sort of actions could
   cause the problem. I can't very well imagine it from the description.


.. _cplusplus:

Writing Extensions in C++
=========================

It is possible to write extension modules in C++. Some restrictions apply. If
the main program (the Python interpreter) is compiled and linked by the C
compiler, global or static objects with constructors cannot be used. This is
not a problem if the main program is linked by the C++ compiler. Functions that
will be called by the Python interpreter (in particular, module initialization
functions) have to be declared using ``extern "C"``. It is unnecessary to
enclose the Python header files in ``extern "C" {...}`` --- they use this form
already if the symbol ``__cplusplus`` is defined (all recent C++ compilers
define this symbol).


.. _using-capsules:

Providing a C API for an Extension Module
=========================================

.. sectionauthor:: Konrad Hinsen <hinsen@cnrs-orleans.fr>


Many extension modules just provide new functions and types to be used from
Python, but sometimes the code in an extension module can be useful for other
extension modules. For example, an extension module could implement a type
"collection" which works like lists without order. Just like the standard Python
list type has a C API which permits extension modules to create and manipulate
lists, this new collection type should have a set of C functions for direct
manipulation from other extension modules.

At first sight this seems easy: just write the functions (without declaring them
``static``, of course), provide an appropriate header file, and document
the C API. And in fact this would work if all extension modules were always
linked statically with the Python interpreter. When modules are used as shared
libraries, however, the symbols defined in one module may not be visible to
another module. The details of visibility depend on the operating system; some
systems use one global namespace for the Python interpreter and all extension
modules (Windows, for example), whereas others require an explicit list of
imported symbols at module link time (AIX is one example), or offer a choice of
different strategies (most Unices). And even if symbols are globally visible,
the module whose functions one wishes to call might not have been loaded yet!

Portability therefore requires making no assumptions about symbol
visibility. This means that all symbols in extension modules should be declared
``static``, except for the module's initialization function, in order to
avoid name clashes with other extension modules (as discussed in section
:ref:`methodtable`). And it means that symbols that *should* be accessible from
other extension modules must be exported in a different way.

Python provides a special mechanism to pass C-level information (pointers) from
one extension module to another one: Capsules. A Capsule is a Python data type
which stores a pointer (:ctype:`void \*`). Capsules can only be created and
accessed via their C API, but they can be passed around like any other Python
object. In particular, they can be assigned to a name in an extension module's
namespace. Other extension modules can then import this module, retrieve the
value of this name, and then retrieve the pointer from the Capsule.

There are many ways in which Capsules can be used to export the C API of an
extension module. Each function could get its own Capsule, or all C API pointers
could be stored in an array whose address is published in a Capsule. And the
various tasks of storing and retrieving the pointers can be distributed in
different ways between the module providing the code and the client modules.

Whichever method you choose, it's important to name your Capsules properly.
The function :cfunc:`PyCapsule_New` takes a name parameter
(:ctype:`const char \*`); you're permitted to pass in a *NULL* name, but
we strongly encourage you to specify a name. Properly named Capsules provide
a degree of runtime type-safety; there is no feasible way to tell one unnamed
Capsule from another.

In particular, Capsules used to expose C APIs should be given a name following
this convention::

   modulename.attributename

The convenience function :cfunc:`PyCapsule_Import` makes it easy to
load a C API provided via a Capsule, but only if the Capsule's name
matches this convention. This behavior gives C API users a high degree
of certainty that the Capsule they load contains the correct C API.
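Why a name helps can be seen in a toy model written in plain C. ``ToyCapsule``
and ``toycapsule_get`` are hypothetical stand-ins invented for this sketch; they
are not the Capsule implementation, and the real API should be looked up in the
Python/C API Reference Manual.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a capsule: an opaque pointer plus an optional name. */
typedef struct {
    void *pointer;
    const char *name;           /* may be NULL, which defeats the check */
} ToyCapsule;

/* Return the stored pointer only if expected_name matches the capsule's
   name; otherwise return NULL.  Two NULL names count as a match. */
static void *
toycapsule_get(const ToyCapsule *capsule, const char *expected_name)
{
    assert(capsule != NULL);
    if (capsule->name == NULL || expected_name == NULL)
        return (capsule->name == expected_name) ? capsule->pointer : NULL;
    return (strcmp(capsule->name, expected_name) == 0) ? capsule->pointer
                                                        : NULL;
}
```

A client asking for ``"spam._C_API"`` is refused a capsule named
``"eggs._C_API"``, whereas two unnamed capsules are both handed out to anyone
who asks with a *NULL* name, whatever they actually contain.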

The following example demonstrates an approach that puts most of the burden on
the writer of the exporting module, which is appropriate for commonly used
library modules. It stores all C API pointers (just one in the example!) in an
array of :ctype:`void` pointers which becomes the value of a Capsule. The header
file corresponding to the module provides a function that takes care of importing
the module and retrieving its C API pointers; client modules only have to call
this function before accessing the C API.

The exporting module is a modification of the :mod:`spam` module from section
:ref:`extending-simpleexample`. The function :func:`spam.system` does not call
the C library function :cfunc:`system` directly, but a function
:cfunc:`PySpam_System`, which would of course do something more complicated in
reality (such as adding "spam" to every command). This function
:cfunc:`PySpam_System` is also exported to other extension modules.

The function :cfunc:`PySpam_System` is a plain C function, declared
``static`` like everything else::

   static int
   PySpam_System(const char *command)
   {
       return system(command);
   }

The function :cfunc:`spam_system` is modified in a trivial way::

   static PyObject *
   spam_system(PyObject *self, PyObject *args)
   {
       const char *command;
       int sts;

       if (!PyArg_ParseTuple(args, "s", &command))
           return NULL;
       sts = PySpam_System(command);
       return PyLong_FromLong(sts);
   }

In the beginning of the module, right after the line ::

   #include "Python.h"

two more lines must be added::

   #define SPAM_MODULE
   #include "spammodule.h"

The ``#define`` is used to tell the header file that it is being included in the
exporting module, not a client module. Finally, the module's initialization
function must take care of initializing the C API pointer array::

   PyMODINIT_FUNC
   PyInit_spam(void)
   {
       PyObject *m;
       static void *PySpam_API[PySpam_API_pointers];
       PyObject *c_api_object;

       m = PyModule_Create(&spammodule);
       if (m == NULL)
           return NULL;

       /* Initialize the C API pointer array */
       PySpam_API[PySpam_System_NUM] = (void *)PySpam_System;

       /* Create a Capsule containing the API pointer array's address */
       c_api_object = PyCapsule_New((void *)PySpam_API, "spam._C_API", NULL);

       if (c_api_object != NULL)
           PyModule_AddObject(m, "_C_API", c_api_object);
       return m;
   }

Note that ``PySpam_API`` is declared ``static``; otherwise the pointer
array would disappear when :func:`PyInit_spam` terminates!

The bulk of the work is in the header file :file:`spammodule.h`, which looks
like this::

   #ifndef Py_SPAMMODULE_H
   #define Py_SPAMMODULE_H
   #ifdef __cplusplus
   extern "C" {
   #endif

   /* Header file for spammodule */

   /* C API functions */
   #define PySpam_System_NUM 0
   #define PySpam_System_RETURN int
   #define PySpam_System_PROTO (const char *command)

   /* Total number of C API pointers */
   #define PySpam_API_pointers 1


   #ifdef SPAM_MODULE
   /* This section is used when compiling spammodule.c */

   static PySpam_System_RETURN PySpam_System PySpam_System_PROTO;

   #else
   /* This section is used in modules that use spammodule's API */

   static void **PySpam_API;

   #define PySpam_System \
    (*(PySpam_System_RETURN (*)PySpam_System_PROTO) PySpam_API[PySpam_System_NUM])

   /* Return -1 on error, 0 on success.
    * PyCapsule_Import will set an exception if there's an error.
    */
   static int
   import_spam(void)
   {
       PySpam_API = (void **)PyCapsule_Import("spam._C_API", 0);
       return (PySpam_API != NULL) ? 0 : -1;
   }

   #endif

   #ifdef __cplusplus
   }
   #endif

   #endif /* !defined(Py_SPAMMODULE_H) */
1284
To gain access to :cfunc:`PySpam_System` (which the client sees as a macro), a
client module only needs to call the function :cfunc:`import_spam` in its
initialization function::

   PyMODINIT_FUNC
   PyInit_client(void)
   {
       PyObject *m;

       m = PyModule_Create(&clientmodule);
       if (m == NULL)
           return NULL;
       if (import_spam() < 0) {
           /* Release the half-initialized module before reporting the error */
           Py_DECREF(m);
           return NULL;
       }
       /* additional initialization can happen here */
       return m;
   }

The main disadvantage of this approach is that the file :file:`spammodule.h` is
rather complicated. However, the basic structure is the same for each function
that is exported, so it has to be learned only once.

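To illustrate how the pattern repeats for each exported function, here is a
minimal sketch of the same pointer-table layout with two entries. It is plain C
with no Python involved: the hypothetical ``demo_export`` function stands in for
the pointer that ``PyCapsule_Import`` would return, and all ``Demo_*`` and
``demo_*`` names are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>

/* One slot number, return type and prototype per exported function */
#define Demo_Add_NUM 0
#define Demo_Add_RETURN int
#define Demo_Add_PROTO (int a, int b)

#define Demo_Negate_NUM 1
#define Demo_Negate_RETURN int
#define Demo_Negate_PROTO (int a)

/* Total number of pointers in the table */
#define Demo_API_pointers 2

/* Exporting side: the real implementations and the static table */
static Demo_Add_RETURN demo_add_impl Demo_Add_PROTO { return a + b; }
static Demo_Negate_RETURN demo_negate_impl Demo_Negate_PROTO { return -a; }

static void *demo_table[Demo_API_pointers];

/* Stand-in for publishing the table through a Capsule */
static void **
demo_export(void)
{
    demo_table[Demo_Add_NUM] = (void *)demo_add_impl;
    demo_table[Demo_Negate_NUM] = (void *)demo_negate_impl;
    return demo_table;
}

/* Client side: one accessor macro per slot, all of the same shape */
static void **Demo_API;

#define Demo_Add \
    (*(Demo_Add_RETURN (*)Demo_Add_PROTO) Demo_API[Demo_Add_NUM])
#define Demo_Negate \
    (*(Demo_Negate_RETURN (*)Demo_Negate_PROTO) Demo_API[Demo_Negate_NUM])

/* Return -1 on error, 0 on success, like import_spam() */
static int
import_demo(void)
{
    Demo_API = demo_export();
    return (Demo_API != NULL) ? 0 : -1;
}

/* Example use: calls go through the table, not to the functions directly */
static int
demo_client(void)
{
    if (import_demo() < 0)
        return -1;
    assert(Demo_Add(2, 3) == 5);
    return Demo_Negate(Demo_Add(2, 3));
}
```

Adding a third function would mean three more ``#define`` lines for its slot,
one more entry in the table, one more accessor macro, and a larger
``Demo_API_pointers``.
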
Finally it should be mentioned that Capsules offer additional functionality,
which is especially useful for memory allocation and deallocation of the pointer
stored in a Capsule. The details are described in the Python/C API Reference
Manual in the section :ref:`capsules` and in the implementation of Capsules (files
:file:`Include/pycapsule.h` and :file:`Objects/pycapsule.c` in the Python source
code distribution).

.. rubric:: Footnotes

.. [#] An interface for this function already exists in the standard module :mod:`os`
   --- it was chosen as a simple and straightforward example.

.. [#] The metaphor of "borrowing" a reference is not completely correct: the owner
   still has a copy of the reference.

.. [#] Checking that the reference count is at least 1 **does not work** --- the
   reference count itself could be in freed memory and may thus be reused for
   another object!

.. [#] These guarantees don't hold when you use the "old" style calling convention ---
   this is still found in much existing code.
