\documentclass[twoside,openright]{report}
\usepackage{myformat}

% XXX PM Modulator

\title{Extending and Embedding the Python Interpreter}

\input{boilerplate}

% Tell \index to actually write the .idx file
\makeindex

\begin{document}

\maketitle

\input{copyright}

\begin{abstract}

\noindent
Python is an interpreted, object-oriented programming language. This
document describes how to write modules in \C{} or \Cpp{} to extend the
Python interpreter with new modules. Those modules can define new
functions but also new object types and their methods. The document
also describes how to embed the Python interpreter in another
application, for use as an extension language. Finally, it shows how
to compile and link extension modules so that they can be loaded
dynamically (at run time) into the interpreter, if the underlying
operating system supports this feature.

This document assumes basic knowledge about Python. For an informal
introduction to the language, see the Python Tutorial. The Python
Reference Manual gives a more formal definition of the language. The
Python Library Reference documents the existing object types,
functions and modules (both built-in and written in Python) that give
the language its wide application range.

For a detailed description of the whole Python/\C{} API, see the separate
Python/\C{} API Reference Manual. \strong{Note:} While that manual is
still in a state of flux, it is safe to say that it is much more up to
date than the manual you're reading currently (which has been in need
of an upgrade for some time now).


\end{abstract}

\tableofcontents


\chapter{Extending Python with \C{} or \Cpp{} code}


\section{Introduction}

It is quite easy to add new built-in modules to Python, if you know
how to program in \C{}. Such \dfn{extension modules} can do two things
that can't be done directly in Python: they can implement new built-in
object types, and they can call \C{} library functions and system calls.

To support extensions, the Python API (Application Programmers
Interface) defines a set of functions, macros and variables that
provide access to most aspects of the Python run-time system. The
Python API is incorporated in a \C{} source file by including the header
\code{"Python.h"}.

The compilation of an extension module depends on its intended use as
well as on your system setup; details are given in a later section.


\section{A Simple Example}

Let's create an extension module called \samp{spam} (the favorite food
of Monty Python fans...) and let's say we want to create a Python
interface to the \C{} library function \code{system()}.\footnote{An
interface for this function already exists in the standard module
\code{os} --- it was chosen as a simple and straightforward example.}
This function takes a null-terminated character string as argument and
returns an integer. We want this function to be callable from Python
as follows:

\begin{verbatim}
>>> import spam
>>> status = spam.system("ls -l")
\end{verbatim}

Begin by creating a file \samp{spammodule.c}. (In general, if a
module is called \samp{spam}, the \C{} file containing its implementation
is called \file{spammodule.c}; if the module name is very long, like
\samp{spammify}, the file name can be just \file{spammify.c}.)

The first line of our file can be:

\begin{verbatim}
#include "Python.h"
\end{verbatim}

which pulls in the Python API (you can add a comment describing the
purpose of the module and a copyright notice if you like).

All user-visible symbols defined by \code{"Python.h"} have a prefix of
\samp{Py} or \samp{PY}, except those defined in standard header files.
For convenience, and since they are used extensively by the Python
interpreter, \code{"Python.h"} includes a few standard header files:
\code{<stdio.h>}, \code{<string.h>}, \code{<errno.h>}, and
\code{<stdlib.h>}. If the latter header file does not exist on your
system, \code{"Python.h"} declares the functions \code{malloc()},
\code{free()} and \code{realloc()} directly.

The next thing we add to our module file is the \C{} function that will
be called when the Python expression \samp{spam.system(\var{string})}
is evaluated (we'll see shortly how it ends up being called):

\begin{verbatim}
static PyObject *
spam_system(self, args)
    PyObject *self;
    PyObject *args;
{
    char *command;
    int sts;
    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
    sts = system(command);
    return Py_BuildValue("i", sts);
}
\end{verbatim}

There is a straightforward translation from the argument list in
Python (e.g.\ the single expression \code{"ls -l"}) to the arguments
passed to the \C{} function. The \C{} function always has two arguments,
conventionally named \var{self} and \var{args}.

The \var{self} argument is only used when the \C{} function implements a
built-in method. This will be discussed later. In the example,
\var{self} will always be a \NULL{} pointer, since we are defining
a function, not a method. (This is done so that the interpreter
doesn't have to understand two different types of \C{} functions.)

The \var{args} argument will be a pointer to a Python tuple object
containing the arguments. Each item of the tuple corresponds to an
argument in the call's argument list. The arguments are Python
objects --- in order to do anything with them in our \C{} function we have
to convert them to \C{} values. The function \code{PyArg_ParseTuple()}
in the Python API checks the argument types and converts them to \C{}
values. It uses a template string to determine the required types of
the arguments as well as the types of the \C{} variables into which to
store the converted values. More about this later.

\code{PyArg_ParseTuple()} returns true (nonzero) if all arguments have
the right type and their components have been stored in the variables
whose addresses are passed. It returns false (zero) if an invalid
argument list was passed. In the latter case it also raises an
appropriate exception, so the calling function can return
\NULL{} immediately (as we saw in the example).
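
As a minimal sketch of how the template string drives the conversion,
the following function parses a string and an integer in a single
call, using the format codes \samp{s} and \samp{i}. The function
\code{spam_repeat()} and its behaviour are hypothetical and are not
part of the example module; the sketch needs the Python headers to
compile.

\begin{verbatim}
#include "Python.h"

/* Hypothetical: spam.repeat(command, count) runs a command several
   times and returns the last exit status. */
static PyObject *
spam_repeat(PyObject *self, PyObject *args)
{
    char *command;
    int count, i, sts = 0;

    /* "si" demands exactly one string followed by one integer. */
    if (!PyArg_ParseTuple(args, "si", &command, &count))
        return NULL;        /* the exception is already set */
    for (i = 0; i < count; i++)
        sts = system(command);
    return Py_BuildValue("i", sts);
}
\end{verbatim}

Each format code consumes one Python argument and one address from the
variable argument list, in the same order.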


\section{Intermezzo: Errors and Exceptions}

An important convention throughout the Python interpreter is the
following: when a function fails, it should set an exception condition
and return an error value (usually a \NULL{} pointer). Exceptions
are stored in a static global variable inside the interpreter; if this
variable is \NULL{} no exception has occurred. A second global
variable stores the ``associated value'' of the exception (the second
argument to \code{raise}). A third variable contains the stack
traceback in case the error originated in Python code. These three
variables are the \C{} equivalents of the Python variables
\code{sys.exc_type}, \code{sys.exc_value} and \code{sys.exc_traceback}
(see the section on module \code{sys} in the Library Reference
Manual). It is important to know about them to understand how errors
are passed around.

The Python API defines a number of functions to set various types of
exceptions.

The most common one is \code{PyErr_SetString()}. Its arguments are an
exception object and a \C{} string. The exception object is usually a
predefined object like \code{PyExc_ZeroDivisionError}. The \C{} string
indicates the cause of the error and is converted to a Python string
object and stored as the ``associated value'' of the exception.

Another useful function is \code{PyErr_SetFromErrno()}, which only
takes an exception argument and constructs the associated value by
inspection of the (\UNIX{}) global variable \code{errno}. The most
general function is \code{PyErr_SetObject()}, which takes two object
arguments, the exception and its associated value. You don't need to
\code{Py_INCREF()} the objects passed to any of these functions.
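
The two sketches below show how \code{PyErr_SetString()} and
\code{PyErr_SetFromErrno()} might be used. The functions
\code{spam_divide()} and \code{spam_remove()} are hypothetical names
chosen for illustration, and the message text is an assumption.

\begin{verbatim}
#include "Python.h"

/* Hypothetical: spam.divide(a, b) returns a / b as a float. */
static PyObject *
spam_divide(PyObject *self, PyObject *args)
{
    double a, b;

    if (!PyArg_ParseTuple(args, "dd", &a, &b))
        return NULL;
    if (b == 0.0) {
        /* Exception object plus a C string as associated value. */
        PyErr_SetString(PyExc_ZeroDivisionError, "division by zero");
        return NULL;
    }
    return Py_BuildValue("d", a / b);
}

/* Hypothetical: spam.remove(path) wraps the C library's remove(). */
static PyObject *
spam_remove(PyObject *self, PyObject *args)
{
    char *path;

    if (!PyArg_ParseTuple(args, "s", &path))
        return NULL;
    if (remove(path) != 0)
        /* Associated value is built from errno; result is NULL. */
        return PyErr_SetFromErrno(PyExc_IOError);
    Py_INCREF(Py_None);
    return Py_None;
}
\end{verbatim}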

You can test non-destructively whether an exception has been set with
\code{PyErr_Occurred()}. This returns the current exception object,
or \NULL{} if no exception has occurred. You normally don't need
to call \code{PyErr_Occurred()} to see whether an error occurred in a
function call, since you should be able to tell from the return value.

When a function \var{f} that calls another function \var{g} detects
that the latter fails, \var{f} should itself return an error value
(e.g.\ \NULL{} or \code{-1}). It should \emph{not} call one of the
\code{PyErr_*()} functions --- one has already been called by \var{g}.
\var{f}'s caller is then supposed to also return an error indication
to \emph{its} caller, again \emph{without} calling \code{PyErr_*()},
and so on --- the most detailed cause of the error was already
reported by the function that first detected it. Once the error
reaches the Python interpreter's main loop, the main loop aborts the
currently executing Python code and tries to find an exception handler
specified by the Python programmer.
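
A sketch of this convention follows. Here the role of \var{g} is
played by the API function \code{PyObject_GetAttrString()}, and the
role of \var{f} by \code{spam_getname()}, a hypothetical function
invented for this illustration.

\begin{verbatim}
#include "Python.h"

/* Hypothetical: spam.getname(obj) returns str(obj.name). */
static PyObject *
spam_getname(PyObject *self, PyObject *args)
{
    PyObject *obj, *name, *result;

    if (!PyArg_ParseTuple(args, "O", &obj))
        return NULL;
    name = PyObject_GetAttrString(obj, "name");     /* this is g */
    if (name == NULL)
        return NULL;    /* g set the exception; just pass it on */
    result = PyObject_Str(name);
    Py_DECREF(name);
    return result;      /* NULL here propagates an error as well */
}
\end{verbatim}

Note that \code{spam_getname()} never calls a \code{PyErr_*()}
function itself; every failure it reports was set by the function
that detected it.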

(There are situations where a module can actually give a more detailed
error message by calling another \code{PyErr_*()} function, and in
such cases it is fine to do so. As a general rule, however, this is
not necessary, and can cause information about the cause of the error
to be lost: most operations can fail for a variety of reasons.)

To ignore an exception set by a function call that failed, the exception
condition must be cleared explicitly by calling \code{PyErr_Clear()}.
The only time \C{} code should call \code{PyErr_Clear()} is if it doesn't
want to pass the error on to the interpreter but wants to handle it
completely by itself (e.g.\ by trying something else or pretending
nothing happened).
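
The following sketch shows one way of ``trying something else''. The
function \code{spam_label()}, the attribute name and the default
string are all hypothetical.

\begin{verbatim}
#include "Python.h"

/* Hypothetical: spam.label(obj) returns obj.label, or a default
   string when the attribute cannot be fetched. */
static PyObject *
spam_label(PyObject *self, PyObject *args)
{
    PyObject *obj, *label;

    if (!PyArg_ParseTuple(args, "O", &obj))
        return NULL;
    label = PyObject_GetAttrString(obj, "label");
    if (label == NULL) {
        /* Handle the failure here: discard the pending exception
           and fall back to a default value. */
        PyErr_Clear();
        label = PyString_FromString("unlabeled");
    }
    return label;   /* NULL only if the fallback failed too */
}
\end{verbatim}

Without the call to \code{PyErr_Clear()}, the function would return a
valid object while an exception was still set, which confuses the
interpreter.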

Note that a failing \code{malloc()} call must be turned into an
exception --- the direct caller of \code{malloc()} (or
\code{realloc()}) must call \code{PyErr_NoMemory()} and return a
failure indicator itself. All the object-creating functions
(\code{PyInt_FromLong()} etc.) already do this, so this note matters
only if you call \code{malloc()} directly.
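
A minimal sketch of a direct \code{malloc()} caller is shown below.
The function \code{spam_upper()} is hypothetical; it relies on
\code{PyErr_NoMemory()} returning \NULL{} so that the call can be used
directly in a \code{return} statement.

\begin{verbatim}
#include "Python.h"
#include <ctype.h>

/* Hypothetical: spam.upper(s) returns an upper-case copy of s. */
static PyObject *
spam_upper(PyObject *self, PyObject *args)
{
    char *s, *buf;
    size_t i, n;
    PyObject *result;

    if (!PyArg_ParseTuple(args, "s", &s))
        return NULL;
    n = strlen(s);
    buf = (char *) malloc(n + 1);
    if (buf == NULL)
        return PyErr_NoMemory();    /* sets the exception */
    for (i = 0; i <= n; i++)        /* copies the final '\0' too */
        buf[i] = (char) toupper((unsigned char) s[i]);
    result = PyString_FromString(buf);  /* copies the buffer */
    free(buf);
    return result;
}
\end{verbatim}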

Also note that, with the important exception of
\code{PyArg_ParseTuple()} and friends, functions that return an
integer status usually return a positive value or zero for success and
\code{-1} for failure, like \UNIX{} system calls.

Finally, be careful to clean up garbage (by making \code{Py_XDECREF()}
or \code{Py_DECREF()} calls for objects you have already created) when
you return an error indicator!
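
The sketch below illustrates this cleanup for a function that creates
two objects in sequence. The function \code{spam_pair()} is
hypothetical.

\begin{verbatim}
#include "Python.h"

/* Hypothetical: spam.pair(a, b) returns the tuple (a, b). */
static PyObject *
spam_pair(PyObject *self, PyObject *args)
{
    long a, b;
    PyObject *first, *second, *result;

    if (!PyArg_ParseTuple(args, "ll", &a, &b))
        return NULL;
    first = PyInt_FromLong(a);
    if (first == NULL)
        return NULL;            /* nothing to clean up yet */
    second = PyInt_FromLong(b);
    if (second == NULL) {
        Py_DECREF(first);       /* don't leak the first object */
        return NULL;
    }
    /* "O" adds its own references, so ours are released whether
       or not the tuple could be built. */
    result = Py_BuildValue("(OO)", first, second);
    Py_DECREF(first);
    Py_DECREF(second);
    return result;
}
\end{verbatim}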

The choice of which exception to raise is entirely yours. There are
predeclared \C{} objects corresponding to all built-in Python exceptions,
e.g.\ \code{PyExc_ZeroDivisionError}, which you can use directly. Of
course, you should choose exceptions wisely --- don't use
\code{PyExc_TypeError} to mean that a file couldn't be opened (that
should probably be \code{PyExc_IOError}). If something's wrong with
the argument list, the \code{PyArg_ParseTuple()} function usually
raises \code{PyExc_TypeError}. If you have an argument whose value
must be in a particular range or must satisfy other conditions,
\code{PyExc_ValueError} is appropriate.
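
As an illustration of the difference between the two, the sketch
below lets \code{PyArg_ParseTuple()} report a wrong argument type and
raises \code{PyExc_ValueError} itself for a value out of range. The
function \code{spam_setlevel()} and the range 0 to 9 are assumptions
made for this example.

\begin{verbatim}
#include "Python.h"

/* Hypothetical: spam.setlevel(level) accepts levels 0 through 9. */
static PyObject *
spam_setlevel(PyObject *self, PyObject *args)
{
    int level;

    /* Wrong type or argument count: the parser sets the exception. */
    if (!PyArg_ParseTuple(args, "i", &level))
        return NULL;
    /* Right type, unacceptable value. */
    if (level < 0 || level > 9) {
        PyErr_SetString(PyExc_ValueError,
                        "level must be in the range 0 to 9");
        return NULL;
    }
    return Py_BuildValue("i", level);
}
\end{verbatim}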

You can also define a new exception that is unique to your module.
For this, you usually declare a static object variable at the
beginning of your file, e.g.

\begin{verbatim}
static PyObject *SpamError;
\end{verbatim}

and initialize it in your module's initialization function
(\code{initspam()}) with a string object, e.g.\ (leaving out the error
checking for now):

\begin{verbatim}
void
initspam()
{
    PyObject *m, *d;
    m = Py_InitModule("spam", SpamMethods);
    d = PyModule_GetDict(m);
    SpamError = PyString_FromString("spam.error");
    PyDict_SetItemString(d, "error", SpamError);
}
\end{verbatim}

Note that the Python name for the exception object is
\code{spam.error}. It is conventional for module and exception names
to be spelled in lower case. It is also conventional that the
\emph{value} of the exception object is the same as its name, e.g.\
the string \code{"spam.error"}.
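
Once initialized, \code{SpamError} can be raised like any predefined
exception. The sketch below is a variant of \code{spam_system()} that
does so; treating a negative status from \code{system()} as a failure,
and the message text, are assumptions made for illustration.

\begin{verbatim}
#include "Python.h"

static PyObject *SpamError;     /* initialized in initspam() */

static PyObject *
spam_system(PyObject *self, PyObject *args)
{
    char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
    sts = system(command);
    if (sts < 0) {
        /* Raise the module's own exception. */
        PyErr_SetString(SpamError, "System command failed");
        return NULL;
    }
    return Py_BuildValue("i", sts);
}
\end{verbatim}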


\section{Back to the Example}

Going back to our example function, you should now be able to
understand this statement:

\begin{verbatim}
    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
\end{verbatim}

It returns \NULL{} (the error indicator for functions returning
object pointers) if an error is detected in the argument list, relying
on the exception set by \code{PyArg_ParseTuple()}. Otherwise a pointer
to the string value of the argument has been stored in the local
variable \code{command}. This is a pointer assignment, not a copy of
the characters, and you are not supposed to modify the string to which
it points (so in Standard \C{}, the variable \code{command} should
properly be declared as \samp{const char *command}).

The next statement is a call to the \UNIX{} function \code{system()},
passing it the string we just got from \code{PyArg_ParseTuple()}:

\begin{verbatim}
    sts = system(command);
\end{verbatim}

Our \code{spam.system()} function must return the value of \code{sts}
as a Python object. This is done using the function
\code{Py_BuildValue()}, which is something like the inverse of
\code{PyArg_ParseTuple()}: it takes a format string and an arbitrary
number of \C{} values, and returns a new Python object. More info on
\code{Py_BuildValue()} is given later.

\begin{verbatim}
    return Py_BuildValue("i", sts);
\end{verbatim}

In this case, it will return an integer object. (Yes, even integers
are objects on the heap in Python!)

If you have a \C{} function that returns no useful value (a function
returning \code{void}), the corresponding Python function must return
\code{None}. You need this idiom to do so:

\begin{verbatim}
    Py_INCREF(Py_None);
    return Py_None;
\end{verbatim}

\code{Py_None} is the \C{} name for the special Python object
\code{None}. It is a genuine Python object (not a \NULL{}
pointer, which means ``error'' in most contexts, as we have seen).
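
Put into a complete function, the idiom might look like the sketch
below, which wraps the standard \C{} function \code{srand()}. The
wrapper \code{spam_seed()} is a hypothetical name.

\begin{verbatim}
#include "Python.h"

/* Hypothetical: spam.seed(n) seeds the C library's rand(). */
static PyObject *
spam_seed(PyObject *self, PyObject *args)
{
    int seed;

    if (!PyArg_ParseTuple(args, "i", &seed))
        return NULL;
    srand((unsigned int) seed);     /* returns void */
    Py_INCREF(Py_None);             /* None is an object like any other */
    return Py_None;
}
\end{verbatim}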


\section{The Module's Method Table and Initialization Function}

I promised to show how \code{spam_system()} is called from Python
programs. First, we need to list its name and address in a ``method
table'':

\begin{verbatim}
static PyMethodDef SpamMethods[] = {
    ...
    {"system",  spam_system, METH_VARARGS},
    ...
    {NULL,      NULL}        /* Sentinel */
};
\end{verbatim}

Note the third entry (\samp{METH_VARARGS}). This is a flag telling
the interpreter the calling convention to be used for the \C{}
function. It should normally always be \samp{METH_VARARGS} or
\samp{METH_VARARGS | METH_KEYWORDS}; a value of \samp{0} means that an
obsolete variant of \code{PyArg_ParseTuple()} is used.

When using only \samp{METH_VARARGS}, the function should expect
the Python-level parameters to be passed in as a tuple acceptable for
parsing via \cfunction{PyArg_ParseTuple()}; more information on this
function is provided below.

The \code{METH_KEYWORDS} bit may be set in the third field if keyword
arguments should be passed to the function. In this case, the \C{}
function should accept a third \samp{PyObject *} parameter which will
be a dictionary of keywords. Use \code{PyArg_ParseTupleAndKeywords()}
to parse the arguments to such a function.
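
A sketch of a function that accepts keyword arguments, together with
its method table entry, is given below. The function
\code{spam_order()}, its parameter names and the table name
\code{OrderMethods} are hypothetical. The cast to \code{PyCFunction}
is needed because the function takes three parameters instead of two.

\begin{verbatim}
#include "Python.h"

/* Hypothetical: spam.order(count, item="spam") */
static PyObject *
spam_order(PyObject *self, PyObject *args, PyObject *keywds)
{
    int count;
    char *item = "spam";    /* default when "item" is omitted */
    static char *kwlist[] = {"count", "item", NULL};

    /* "i|s": one required integer, then an optional string. */
    if (!PyArg_ParseTupleAndKeywords(args, keywds, "i|s", kwlist,
                                     &count, &item))
        return NULL;
    printf("%d portion(s) of %s\n", count, item);
    Py_INCREF(Py_None);
    return Py_None;
}

static PyMethodDef OrderMethods[] = {
    {"order", (PyCFunction) spam_order,
     METH_VARARGS | METH_KEYWORDS},
    {NULL, NULL}            /* Sentinel */
};
\end{verbatim}

From Python, such a function could then be called as
\code{spam.order(2)} or \code{spam.order(2, item="eggs")}.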

The method table must be passed to the interpreter in the module's
initialization function (which should be the only non-\code{static}
item defined in the module file):

\begin{verbatim}
void
initspam()
{
    (void) Py_InitModule("spam", SpamMethods);
}
\end{verbatim}

When the Python program imports module \code{spam} for the first time,
\code{initspam()} is called. It calls \code{Py_InitModule()}, which
creates a ``module object'' (which is inserted in the dictionary
\code{sys.modules} under the key \code{"spam"}), and inserts built-in
function objects into the newly created module based upon the table
(an array of \code{PyMethodDef} structures) that was passed as its
second argument. \code{Py_InitModule()} returns a pointer to the
module object that it creates (which is unused here). It aborts with
a fatal error if the module could not be initialized satisfactorily,
so the caller doesn't need to check for errors.


\section{Compilation and Linkage}

There are two more things to do before you can use your new extension:
compiling and linking it with the Python system. If you use dynamic
loading, the details depend on the style of dynamic loading your
system uses; see the chapter on Dynamic Loading for more info about
this.

If you can't use dynamic loading, or if you want to make your module a
permanent part of the Python interpreter, you will have to change the
configuration setup and rebuild the interpreter. Luckily, this is
very simple: just place your file (\file{spammodule.c} for example) in
the \file{Modules} directory, add a line to the file
\file{Modules/Setup} describing your file:

\begin{verbatim}
spam spammodule.o
\end{verbatim}

and rebuild the interpreter by running \code{make} in the toplevel
directory. You can also run \code{make} in the \file{Modules}
subdirectory, but then you must first rebuild the \file{Makefile}
there by running \code{make Makefile}. (This is necessary each time
you change the \file{Setup} file.)

If your module requires additional libraries to link with, these can
be listed on the line in the \file{Setup} file as well, for instance:

\begin{verbatim}
spam spammodule.o -lX11
\end{verbatim}

\section{Calling Python Functions From \C{}}

So far we have concentrated on making \C{} functions callable from
Python. The reverse is also useful: calling Python functions from \C{}.
This is especially the case for libraries that support so-called
``callback'' functions. If a \C{} interface makes use of callbacks, the
equivalent Python interface often needs to provide a callback mechanism
to the Python programmer; the implementation will require calling the
Python callback functions from a \C{} callback. Other uses are also
imaginable.

Fortunately, the Python interpreter is easily called recursively, and
there is a standard interface to call a Python function. (I won't
dwell on how to call the Python parser with a particular string as
input --- if you're interested, have a look at the implementation of
the \samp{-c} command line option in \file{Python/pythonmain.c}.)

Calling a Python function is easy. First, the Python program must
somehow pass you the Python function object. You should provide a
function (or some other interface) to do this. When this function is
called, save a pointer to the Python function object (be careful to
\code{Py_INCREF()} it!) in a global variable --- or wherever you see fit.
For example, the following function might be part of a module
definition:

\begin{verbatim}
static PyObject *my_callback = NULL;

static PyObject *
my_set_callback(dummy, arg)
    PyObject *dummy, *arg;
{
    /* Take the new reference first, in case arg is the callback
       that is already stored */
    Py_XINCREF(arg);         /* Add a reference to new callback */
    Py_XDECREF(my_callback); /* Dispose of previous callback */
    my_callback = arg;       /* Remember new callback */
    /* Boilerplate to return "None" */
    Py_INCREF(Py_None);
    return Py_None;
}
\end{verbatim}

The macros \code{Py_XINCREF()} and \code{Py_XDECREF()} increment/decrement
the reference count of an object and are safe in the presence of
\NULL{} pointers. More info on them in the section on Reference
Counts below.
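
The function above stores its second argument directly. If it were
registered with \samp{METH_VARARGS}, that argument would be the
argument tuple rather than the callback itself. Under that assumption,
a variant might look like the following sketch, which also rejects
objects that cannot be called; the error message is an assumption.

\begin{verbatim}
#include "Python.h"

static PyObject *my_callback = NULL;

static PyObject *
my_set_callback(PyObject *dummy, PyObject *args)
{
    PyObject *temp;

    /* "O" stores a borrowed reference to the single argument. */
    if (!PyArg_ParseTuple(args, "O", &temp))
        return NULL;
    if (!PyCallable_Check(temp)) {
        PyErr_SetString(PyExc_TypeError,
                        "parameter must be callable");
        return NULL;
    }
    Py_XINCREF(temp);           /* Add a reference to new callback */
    Py_XDECREF(my_callback);    /* Dispose of previous callback */
    my_callback = temp;         /* Remember new callback */
    Py_INCREF(Py_None);
    return Py_None;
}
\end{verbatim}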

Later, when it is time to call the function, you call the \C{} function
\code{PyEval_CallObject()}. This function has two arguments, both
pointers to arbitrary Python objects: the Python function, and the
argument list. The argument list must always be a tuple object, whose
length is the number of arguments. To call the Python function with
no arguments, pass an empty tuple; to call it with one argument, pass
a singleton tuple. \code{Py_BuildValue()} returns a tuple when its
format string consists of zero or more format codes between
parentheses. For example:

\begin{verbatim}
    int arg;
    PyObject *arglist;
    PyObject *result;
    ...
    arg = 123;
    ...
    /* Time to call the callback */
    arglist = Py_BuildValue("(i)", arg);
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
\end{verbatim}

\code{PyEval_CallObject()} returns a Python object pointer: this is
the return value of the Python function. \code{PyEval_CallObject()} is
``reference-count-neutral'' with respect to its arguments. In the
example a new tuple was created to serve as the argument list, which
is \code{Py_DECREF()}-ed immediately after the call.

The return value of \code{PyEval_CallObject()} is ``new'': either it
is a brand new object, or it is an existing object whose reference
count has been incremented. So, unless you want to save it in a
global variable, you should somehow \code{Py_DECREF()} the result,
even (especially!) if you are not interested in its value.

Before you do this, however, it is important to check that the return
value isn't \NULL{}. If it is, the Python function terminated by raising
an exception. If the \C{} code that called \code{PyEval_CallObject()} is
called from Python, it should now return an error indication to its
Python caller, so the interpreter can print a stack trace, or the
calling Python code can handle the exception. If this is not possible
or desirable, the exception should be cleared by calling
\code{PyErr_Clear()}. For example:

\begin{verbatim}
    if (result == NULL)
        return NULL; /* Pass error back */
    ...use result...
    Py_DECREF(result);
\end{verbatim}

Depending on the desired interface to the Python callback function,
you may also have to provide an argument list to \code{PyEval_CallObject()}.
In some cases the argument list is also provided by the Python
program, through the same interface that specified the callback
function. It can then be saved and used in the same manner as the
function object. In other cases, you may have to construct a new
tuple to pass as the argument list. The simplest way to do this is to
call \code{Py_BuildValue()}. For example, if you want to pass an integral
event code, you might use the following code:

\begin{verbatim}
    PyObject *arglist;
    ...
    arglist = Py_BuildValue("(l)", eventcode);
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
    if (result == NULL)
        return NULL; /* Pass error back */
    /* Here maybe use the result */
    Py_DECREF(result);
\end{verbatim}

Note the placement of \code{Py_DECREF(arglist)} immediately after the call,
before the error check! Also note that, strictly speaking, this code is
not complete: \code{Py_BuildValue()} may run out of memory, and this should
be checked.
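
A check on the result of \code{Py_BuildValue()} might look like the
following sketch. It is an illustration only: the helper name
\code{call_with_eventcode()} and its parameters are hypothetical, and
the callback is passed in explicitly instead of being read from a
variable such as \code{my_callback}.

\begin{verbatim}
#include "Python.h"

/* Hypothetical helper: call `callback' with one integer argument */
static PyObject *
call_with_eventcode(PyObject *callback, long eventcode)
{
    PyObject *arglist;
    PyObject *result;

    arglist = Py_BuildValue("(l)", eventcode);
    if (arglist == NULL)
        return NULL;        /* Out of memory; exception already set */
    result = PyEval_CallObject(callback, arglist);
    Py_DECREF(arglist);     /* Dispose of the tuple before the error check */
    return result;          /* NULL if the callback raised an exception */
}
\end{verbatim}

The caller receives a ``new'' reference and remains responsible for
checking it against \NULL{} and for calling \code{Py_DECREF()} on it.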


\section{Format Strings for \sectcode{PyArg_ParseTuple()}}

The \code{PyArg_ParseTuple()} function is declared as follows:

\begin{verbatim}
int PyArg_ParseTuple(PyObject *arg, char *format, ...);
\end{verbatim}

The \var{arg} argument must be a tuple object containing an argument
list passed from Python to a \C{} function. The \var{format} argument
must be a format string, whose syntax is explained below. The
remaining arguments must be addresses of variables whose type is
determined by the format string. For the conversion to succeed, the
\var{arg} object must match the format and the format must be
exhausted.

Note that while \code{PyArg_ParseTuple()} checks that the Python
arguments have the required types, it cannot check the validity of the
addresses of \C{} variables passed to the call: if you make mistakes
there, your code will probably crash or at least overwrite random bits
in memory. So be careful!

A format string consists of zero or more ``format units''. A format
unit describes one Python object; it is usually a single character or
a parenthesized sequence of format units. With a few exceptions, a
format unit that is not a parenthesized sequence normally corresponds
to a single address argument to \code{PyArg_ParseTuple()}. In the
following description, the quoted form is the format unit; the entry
in (round) parentheses is the Python object type that matches the
format unit; and the entry in [square] brackets is the type of the \C{}
variable(s) whose address should be passed. (Use the \samp{\&}
operator to pass a variable's address.)

\begin{description}

\item[\samp{s} (string) {[char *]}]
Convert a Python string to a \C{} pointer to a character string. You
must not provide storage for the string itself; a pointer to an
existing string is stored into the character pointer variable whose
address you pass. The \C{} string is null-terminated. The Python string
must not contain embedded null bytes; if it does, a \code{TypeError}
exception is raised.

\item[\samp{s\#} (string) {[char *, int]}]
This variant on \code{'s'} stores into two \C{} variables, the first one
a pointer to a character string, the second one its length. In this
case the Python string may contain embedded null bytes.

\item[\samp{z} (string or \code{None}) {[char *]}]
Like \samp{s}, but the Python object may also be \code{None}, in which
case the \C{} pointer is set to \NULL{}.

\item[\samp{z\#} (string or \code{None}) {[char *, int]}]
This is to \code{'s\#'} as \code{'z'} is to \code{'s'}.

\item[\samp{b} (integer) {[char]}]
Convert a Python integer to a tiny int, stored in a \C{} \code{char}.

\item[\samp{h} (integer) {[short int]}]
Convert a Python integer to a \C{} \code{short int}.

\item[\samp{i} (integer) {[int]}]
Convert a Python integer to a plain \C{} \code{int}.

\item[\samp{l} (integer) {[long int]}]
Convert a Python integer to a \C{} \code{long int}.

\item[\samp{c} (string of length 1) {[char]}]
Convert a Python character, represented as a string of length 1, to a
\C{} \code{char}.

\item[\samp{f} (float) {[float]}]
Convert a Python floating point number to a \C{} \code{float}.

\item[\samp{d} (float) {[double]}]
Convert a Python floating point number to a \C{} \code{double}.

\item[\samp{D} (complex) {[Py_complex]}]
Convert a Python complex number to a \C{} \code{Py_complex} structure.

\item[\samp{O} (object) {[PyObject *]}]
Store a Python object (without any conversion) in a \C{} object pointer.
The \C{} program thus receives the actual object that was passed. The
object's reference count is not increased. The pointer stored is not
\NULL{}.

\item[\samp{O!} (object) {[\var{typeobject}, PyObject *]}]
Store a Python object in a \C{} object pointer. This is similar to
\samp{O}, but takes two \C{} arguments: the first is the address of a
Python type object, the second is the address of the \C{} variable (of
type \code{PyObject *}) into which the object pointer is stored.
If the Python object does not have the required type, a
\code{TypeError} exception is raised.

\item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
Convert a Python object to a \C{} variable through a \var{converter}
function. This takes two arguments: the first is a function, the
second is the address of a \C{} variable (of arbitrary type), converted
to \code{void *}. The \var{converter} function in turn is called as
follows:

\code{\var{status} = \var{converter}(\var{object}, \var{address});}

where \var{object} is the Python object to be converted and
\var{address} is the \code{void *} argument that was passed to
\code{PyArg_ParseTuple()}. The returned \var{status} should be
\code{1} for a successful conversion and \code{0} if the conversion
has failed. When the conversion fails, the \var{converter} function
should raise an exception.

\item[\samp{S} (string) {[PyStringObject *]}]
Like \samp{O} but requires that the Python object is a string object.
Raises a \code{TypeError} exception if the object is not a string
object. The \C{} variable may also be declared as \code{PyObject *}.

\item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
The object must be a Python tuple whose length is the number of format
units in \var{items}. The \C{} arguments must correspond to the
individual format units in \var{items}. Format units for tuples may
be nested.

\end{description}
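
As an illustration of the \samp{O\&} format unit, here is a minimal
sketch of a converter function and of a function that uses it. The
names \code{convert_fraction()} and \code{spam_scale()} are
hypothetical and not part of the Python API; the converter accepts a
Python float between 0.0 and 1.0 and stores it in a \C{}
\code{double}.

\begin{verbatim}
#include "Python.h"

/* Hypothetical converter: accept a float in the range [0.0, 1.0] */
static int
convert_fraction(PyObject *object, void *address)
{
    double value;

    if (!PyFloat_Check(object)) {
        PyErr_SetString(PyExc_TypeError, "fraction must be a float");
        return 0;           /* Conversion failed; exception is set */
    }
    value = PyFloat_AsDouble(object);
    if (value < 0.0 || value > 1.0) {
        PyErr_SetString(PyExc_ValueError,
                        "fraction must be between 0.0 and 1.0");
        return 0;
    }
    *(double *)address = value;
    return 1;               /* Conversion succeeded */
}

/* Hypothetical extension function using the converter */
static PyObject *
spam_scale(PyObject *self, PyObject *args)
{
    double fraction;

    if (!PyArg_ParseTuple(args, "O&", convert_fraction, &fraction))
        return NULL;
    return Py_BuildValue("d", fraction * 100.0);
}
\end{verbatim}

Note that the converter itself raises the exception before returning
\code{0}, as required above.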

It is possible to pass Python long integers where integers are
requested; however no proper range checking is done --- the most
significant bits are silently truncated when the receiving field is
too small to receive the value (actually, the semantics are inherited
from downcasts in \C{} --- your mileage may vary).
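
The truncation can be reproduced in plain \C{}, without involving
Python at all. The sketch below uses a hypothetical helper,
\code{store_in_byte()}, and an unsigned target type, for which the
\C{} Standard defines the result of the conversion (reduction modulo
256 when \code{unsigned char} is 8 bits wide). For signed target types
the result is implementation-defined.

\begin{verbatim}
/* Hypothetical helper: keep only the low-order bits that fit in a byte */
static unsigned char
store_in_byte(long value)
{
    return (unsigned char) value;   /* High-order bits are dropped */
}
\end{verbatim}

With 8-bit bytes, \code{store_in_byte(0x1234L)} returns \code{0x34}:
the high-order byte \code{0x12} is lost without any diagnostic.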

A few other characters have a meaning in a format string. These may
not occur inside nested parentheses. They are:

\begin{description}

\item[\samp{|}]
Indicates that the remaining arguments in the Python argument list are
optional. The \C{} variables corresponding to optional arguments should
be initialized to their default value --- when an optional argument is
not specified, \code{PyArg_ParseTuple()} does not touch the contents
of the corresponding \C{} variable(s).

\item[\samp{:}]
The list of format units ends here; the string after the colon is used
as the function name in error messages (the ``associated value'' of
the exceptions that \code{PyArg_ParseTuple()} raises).

\item[\samp{;}]
The list of format units ends here; the string after the semicolon is
used as the error message \emph{instead} of the default error message.
Clearly, \samp{:} and \samp{;} mutually exclude each other.

\end{description}

Some example calls:

\begin{verbatim}
    int ok;
    int i, j;
    long k, l;
    char *s;
    int size;

    ok = PyArg_ParseTuple(args, ""); /* No arguments */
        /* Python call: f() */

    ok = PyArg_ParseTuple(args, "s", &s); /* A string */
        /* Possible Python call: f('whoops!') */

    ok = PyArg_ParseTuple(args, "lls", &k, &l, &s); /* Two longs and a string */
        /* Possible Python call: f(1, 2, 'three') */

    ok = PyArg_ParseTuple(args, "(ii)s#", &i, &j, &s, &size);
        /* A pair of ints and a string, whose size is also returned */
        /* Possible Python call: f((1, 2), 'three') */

    {
        char *file;
        char *mode = "r";
        int bufsize = 0;
        ok = PyArg_ParseTuple(args, "s|si", &file, &mode, &bufsize);
        /* A string, and optionally another string and an integer */
        /* Possible Python calls:
           f('spam')
           f('spam', 'w')
           f('spam', 'wb', 100000) */
    }

    {
        int left, top, right, bottom, h, v;
        ok = PyArg_ParseTuple(args, "((ii)(ii))(ii)",
                 &left, &top, &right, &bottom, &h, &v);
        /* A rectangle and a point */
        /* Possible Python call:
           f(((0, 0), (400, 300)), (10, 10)) */
    }

    {
        Py_complex c;
        ok = PyArg_ParseTuple(args, "D:myfunction", &c);
        /* a complex, also providing a function name for errors */
        /* Possible Python call: myfunction(1+2j) */
    }
\end{verbatim}


\section{Keyword Parsing with \sectcode{PyArg_ParseTupleAndKeywords()}}

The \cfunction{PyArg_ParseTupleAndKeywords()} function is declared as
follows:

\begin{verbatim}
int PyArg_ParseTupleAndKeywords(PyObject *arg, PyObject *kwdict,
                                char *format, char **kwlist, ...);
\end{verbatim}

The \var{arg} and \var{format} parameters are identical to those of the
\cfunction{PyArg_ParseTuple()} function. The \var{kwdict} parameter
is the dictionary of keywords received as the third parameter from the
Python runtime. The \var{kwlist} parameter is a \NULL{}-terminated
list of strings which identify the parameters; the names are matched
with the type information from \var{format} from left to right.

\strong{Note:} Nested tuples cannot be parsed when using keyword
arguments! Keyword parameters passed in which are not present in the
\var{kwlist} will cause a \exception{TypeError} to be raised.

Here is an example module which uses keywords, based on an example by
Geoff Philbrick (\email{philbrick@hks.com}):

\begin{verbatim}
#include <stdio.h>
#include "Python.h"

static PyObject *
keywdarg_parrot(self, args, keywds)
    PyObject *self;
    PyObject *args;
    PyObject *keywds;
{
    int voltage;
    char *state = "a stiff";
    char *action = "voom";
    char *type = "Norwegian Blue";

    static char *kwlist[] = {"voltage", "state", "action", "type", NULL};

    if (!PyArg_ParseTupleAndKeywords(args, keywds, "i|sss", kwlist,
                                     &voltage, &state, &action, &type))
        return NULL;

    printf("-- This parrot wouldn't %s if you put %i Volts through it.\n",
           action, voltage);
    printf("-- Lovely plumage, the %s -- It's %s!\n", type, state);

    Py_INCREF(Py_None);

    return Py_None;
}

static PyMethodDef keywdarg_methods[] = {
    {"parrot", (PyCFunction)keywdarg_parrot, METH_VARARGS|METH_KEYWORDS},
    {NULL,  NULL}   /* sentinel */
};

void
initkeywdarg()
{
    /* Create the module and add the functions */
    Py_InitModule("keywdarg", keywdarg_methods);

}
\end{verbatim}


\section{The \sectcode{Py_BuildValue()} Function}

This function is the counterpart to \code{PyArg_ParseTuple()}. It is
declared as follows:

\begin{verbatim}
PyObject *Py_BuildValue(char *format, ...);
\end{verbatim}

It recognizes a set of format units similar to the ones recognized by
\code{PyArg_ParseTuple()}, but the arguments (which are input to the
function, not output) must not be pointers, just values. It returns a
new Python object, suitable for returning from a \C{} function called
from Python.

One difference with \code{PyArg_ParseTuple()}: while the latter
requires its first argument to be a tuple (since Python argument lists
are always represented as tuples internally), \code{Py_BuildValue()} does
not always build a tuple. It builds a tuple only if its format string
contains two or more format units. If the format string is empty, it
returns \code{None}; if it contains exactly one format unit, it
returns whatever object is described by that format unit. To force it
to return a tuple of size zero or one, parenthesize the format string.

In the following description, the quoted form is the format unit; the
entry in (round) parentheses is the Python object type that the format
unit will return; and the entry in [square] brackets is the type of
the \C{} value(s) to be passed.

The characters space, tab, colon and comma are ignored in format
strings (but not within format units such as \samp{s\#}). This can be
used to make long format strings a tad more readable.

\begin{description}

\item[\samp{s} (string) {[char *]}]
Convert a null-terminated \C{} string to a Python object. If the \C{}
string pointer is \NULL{}, \code{None} is returned.

\item[\samp{s\#} (string) {[char *, int]}]
Convert a \C{} string and its length to a Python object. If the \C{} string
pointer is \NULL{}, the length is ignored and \code{None} is
returned.

\item[\samp{z} (string or \code{None}) {[char *]}]
Same as \samp{s}.

\item[\samp{z\#} (string or \code{None}) {[char *, int]}]
Same as \samp{s\#}.

\item[\samp{i} (integer) {[int]}]
Convert a plain \C{} \code{int} to a Python integer object.

\item[\samp{b} (integer) {[char]}]
Same as \samp{i}.

\item[\samp{h} (integer) {[short int]}]
Same as \samp{i}.

\item[\samp{l} (integer) {[long int]}]
Convert a \C{} \code{long int} to a Python integer object.

\item[\samp{c} (string of length 1) {[char]}]
Convert a \C{} \code{int} representing a character to a Python string of
length 1.

\item[\samp{d} (float) {[double]}]
Convert a \C{} \code{double} to a Python floating point number.

\item[\samp{f} (float) {[float]}]
Same as \samp{d}.

\item[\samp{O} (object) {[PyObject *]}]
Pass a Python object untouched (except for its reference count, which
is incremented by one). If the object passed in is a \NULL{}
pointer, it is assumed that this happened because the call producing
the argument found an error and set an exception. Therefore,
\code{Py_BuildValue()} will return \NULL{} but won't raise an
exception. If no exception has been raised yet,
\code{PyExc_SystemError} is set.

\item[\samp{S} (object) {[PyObject *]}]
Same as \samp{O}.

\item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
Convert \var{anything} to a Python object through a \var{converter}
function. The function is called with \var{anything} (which should be
compatible with \code{void *}) as its argument and should return a
``new'' Python object, or \NULL{} if an error occurred.

\item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
Convert a sequence of \C{} values to a Python tuple with the same number
of items.

\item[\samp{[\var{items}]} (list) {[\var{matching-items}]}]
Convert a sequence of \C{} values to a Python list with the same number
of items.

\item[\samp{\{\var{items}\}} (dictionary) {[\var{matching-items}]}]
Convert a sequence of \C{} values to a Python dictionary. Each pair of
consecutive \C{} values adds one item to the dictionary, serving as key
and value, respectively.

\end{description}

If there is an error in the format string, the
\code{PyExc_SystemError} exception is raised and \NULL{} returned.

Examples (to the left the call, to the right the resulting Python value):

\begin{verbatim}
    Py_BuildValue("")                        None
    Py_BuildValue("i", 123)                  123
    Py_BuildValue("iii", 123, 456, 789)      (123, 456, 789)
    Py_BuildValue("s", "hello")              'hello'
    Py_BuildValue("ss", "hello", "world")    ('hello', 'world')
    Py_BuildValue("s#", "hello", 4)          'hell'
    Py_BuildValue("()")                      ()
    Py_BuildValue("(i)", 123)                (123,)
    Py_BuildValue("(ii)", 123, 456)          (123, 456)
    Py_BuildValue("(i,i)", 123, 456)         (123, 456)
    Py_BuildValue("[i,i]", 123, 456)         [123, 456]
    Py_BuildValue("{s:i,s:i}",
                  "abc", 123, "def", 456)    {'abc': 123, 'def': 456}
    Py_BuildValue("((ii)(ii)) (ii)",
                  1, 2, 3, 4, 5, 6)          (((1, 2), (3, 4)), (5, 6))
\end{verbatim}

\section{Reference Counts}

\subsection{Introduction}

In languages like \C{} or \Cpp{}, the programmer is responsible for
dynamic allocation and deallocation of memory on the heap. In \C{}, this
is done using the functions \code{malloc()} and \code{free()}. In
\Cpp{}, the operators \code{new} and \code{delete} are used with
essentially the same meaning; they are often implemented using
\code{malloc()} and \code{free()}, so we'll restrict the following
discussion to the latter.

Every block of memory allocated with \code{malloc()} should eventually
be returned to the pool of available memory by exactly one call to
\code{free()}. It is important to call \code{free()} at the right
time. If a block's address is forgotten but \code{free()} is not
called for it, the memory it occupies cannot be reused until the
program terminates. This is called a \dfn{memory leak}. On the other
hand, if a program calls \code{free()} for a block and then continues
to use the block, it creates a conflict with re-use of the block
through another \code{malloc()} call. This is called \dfn{using freed
memory}. It has the same bad consequences as referencing uninitialized
data --- core dumps, wrong results, mysterious crashes.

Common causes of memory leaks are unusual paths through the code. For
instance, a function may allocate a block of memory, do some
calculation, and then free the block again. Now a change in the
requirements for the function may add a test to the calculation that
detects an error condition and can return prematurely from the
function. It's easy to forget to free the allocated memory block when
taking this premature exit, especially when it is added later to the
code. Such leaks, once introduced, often go undetected for a long
time: the error exit is taken only in a small fraction of all calls,
and most modern machines have plenty of virtual memory, so the leak
only becomes apparent in a long-running process that uses the leaking
function frequently. Therefore, it's important to prevent leaks from
happening by having a coding convention or strategy that minimizes
this kind of error.
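
One such convention, sketched here in plain \C{}, routes every exit
through a single cleanup label so that each \code{malloc()} is matched
by exactly one \code{free()}. The function \code{sum_digits()}, its
scratch buffer and its error convention (returning \code{-1}) are
hypothetical, chosen only to illustrate the pattern.

\begin{verbatim}
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: sum the decimal digits in `text'.
   Returns the sum, or -1 if `text' contains a non-digit or
   memory runs out. */
static int
sum_digits(const char *text)
{
    size_t i, n = strlen(text);
    int result = -1;
    char *copy = malloc(n + 1);     /* The block that must not leak */

    if (copy == NULL)
        return -1;                  /* Nothing allocated, safe to return */
    strcpy(copy, text);

    result = 0;
    for (i = 0; i < n; i++) {
        if (copy[i] < '0' || copy[i] > '9') {
            result = -1;
            goto done;              /* Premature exit still reaches free() */
        }
        result += copy[i] - '0';
    }

done:
    free(copy);                     /* Exactly one free() for the malloc() */
    return result;
}
\end{verbatim}

An error test added later only needs to set \code{result} and jump to
\code{done}; it cannot bypass the call to \code{free()}.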

Since Python makes heavy use of \code{malloc()} and \code{free()}, it
needs a strategy to avoid memory leaks as well as the use of freed
memory. The chosen method is called \dfn{reference counting}. The
principle is simple: every object contains a counter, which is
incremented when a reference to the object is stored somewhere, and
which is decremented when a reference to it is deleted. When the
counter reaches zero, the last reference to the object has been
deleted and the object is freed.
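
The principle can be sketched in a few lines of plain \C{}. This is a
toy model, not Python's implementation: the type \code{Counted} and
the functions \code{counted_new()}, \code{counted_incref()} and
\code{counted_decref()} are hypothetical names chosen for this
illustration.

\begin{verbatim}
#include <stdlib.h>

typedef struct {
    int refcount;       /* Number of references currently stored */
    int value;          /* The payload */
} Counted;

/* Create an object; the caller holds the first reference */
static Counted *
counted_new(int value)
{
    Counted *obj = malloc(sizeof(Counted));

    if (obj != NULL) {
        obj->refcount = 1;
        obj->value = value;
    }
    return obj;
}

/* A reference is stored somewhere: increment the counter */
static void
counted_incref(Counted *obj)
{
    obj->refcount++;
}

/* A reference is deleted: decrement, and free on reaching zero.
   Returns 1 if the object was freed, 0 otherwise. */
static int
counted_decref(Counted *obj)
{
    if (--obj->refcount == 0) {
        free(obj);
        return 1;
    }
    return 0;
}
\end{verbatim}

After \code{counted_decref()} has returned \code{1}, the pointer must
not be used again.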

An alternative strategy is called \dfn{automatic garbage collection}.
(Sometimes, reference counting is also referred to as a garbage
collection strategy, hence my use of ``automatic'' to distinguish the
two.) The big advantage of automatic garbage collection is that the
user doesn't need to call \code{free()} explicitly. (Another claimed
advantage is an improvement in speed or memory usage --- this is no
hard fact however.) The disadvantage is that for \C{}, there is no
truly portable automatic garbage collector, while reference counting
can be implemented portably (as long as the functions \code{malloc()}
and \code{free()} are available --- which the \C{} Standard guarantees).
Maybe some day a sufficiently portable automatic garbage collector
will be available for \C{}. Until then, we'll have to live with
reference counts.

\subsection{Reference Counting in Python}

There are two macros, \code{Py_INCREF(x)} and \code{Py_DECREF(x)},
which handle the incrementing and decrementing of the reference count.
\code{Py_DECREF()} also frees the object when the count reaches zero.
For flexibility, it doesn't call \code{free()} directly --- rather, it
makes a call through a function pointer in the object's \dfn{type
object}. For this purpose (and others), every object also contains a
pointer to its type object.
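
The indirection through the type object can be modelled as follows.
This is a simplified sketch and not Python's actual definitions:
\code{ToyType}, \code{ToyObject}, \code{buffer_new()},
\code{toy_decref()} and the counter \code{buffers_freed} are
hypothetical names used only for this illustration.

\begin{verbatim}
#include <stdlib.h>

struct toy_object;

/* A "type object": holds the function that frees its instances */
typedef struct {
    const char *name;
    void (*dealloc)(struct toy_object *);
} ToyType;

/* Every object has a reference count and a pointer to its type */
typedef struct toy_object {
    int refcount;
    ToyType *type;
    char *buffer;               /* Extra storage owned by the object */
} ToyObject;

static int buffers_freed = 0;   /* For demonstration only */

/* Deallocator for buffer objects: release the buffer, then the object */
static void
buffer_dealloc(ToyObject *obj)
{
    free(obj->buffer);
    free(obj);
    buffers_freed++;
}

static ToyType buffer_type = {"buffer", buffer_dealloc};

static ToyObject *
buffer_new(size_t size)
{
    ToyObject *obj = malloc(sizeof(ToyObject));

    if (obj == NULL)
        return NULL;
    obj->buffer = malloc(size);
    if (obj->buffer == NULL) {
        free(obj);
        return NULL;
    }
    obj->refcount = 1;
    obj->type = &buffer_type;
    return obj;
}

/* Decrement the count; at zero, dispatch through the type object */
static void
toy_decref(ToyObject *obj)
{
    if (--obj->refcount == 0)
        obj->type->dealloc(obj);
}
\end{verbatim}

Because \code{toy_decref()} never calls \code{free()} itself, a type
that owns additional storage can release all of it in its own
deallocator.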

The big question now remains: when to use \code{Py_INCREF(x)} and
\code{Py_DECREF(x)}? Let's first introduce some terms. Nobody
``owns'' an object; however, you can \dfn{own a reference} to an
object. An object's reference count is now defined as the number of
owned references to it. The owner of a reference is responsible for
calling \code{Py_DECREF()} when the reference is no longer needed.
Ownership of a reference can be transferred. There are three ways to
dispose of an owned reference: pass it on, store it, or call
\code{Py_DECREF()}. Forgetting to dispose of an owned reference creates
a memory leak.

It is also possible to \dfn{borrow}\footnote{The metaphor of
``borrowing'' a reference is not completely correct: the owner still
has a copy of the reference.} a reference to an object. The borrower
of a reference should not call \code{Py_DECREF()}. The borrower must
not hold on to the object longer than the owner from which it was
borrowed. Using a borrowed reference after the owner has disposed of
it risks using freed memory and should be avoided
completely.\footnote{Checking that the reference count is at least 1
\strong{does not work} --- the reference count itself could be in
freed memory and may thus be reused for another object!}
1040
1041The advantage of borrowing over owning a reference is that you don't
1042need to take care of disposing of the reference on all possible paths
1043through the code --- in other words, with a borrowed reference you
1044don't run the risk of leaking when a premature exit is taken. The
1045disadvantage of borrowing over leaking is that there are some subtle
1046situations where in seemingly correct code a borrowed reference can be
1047used after the owner from which it was borrowed has in fact disposed
1048of it.
1049
1050A borrowed reference can be changed into an owned reference by calling
1051\code{Py_INCREF()}. This does not affect the status of the owner from
1052which the reference was borrowed --- it creates a new owned reference,
1053and gives full owner responsibilities (i.e., the new owner must
1054dispose of the reference properly, as well as the previous owner).
1055
\subsection{Ownership Rules}

Whenever an object reference is passed into or out of a function, it
is part of the function's interface specification whether ownership is
transferred with the reference or not.

Most functions that return a reference to an object pass on ownership
with the reference. In particular, all functions whose purpose is
to create a new object, e.g.\ \code{PyInt_FromLong()} and
\code{Py_BuildValue()}, pass ownership to the receiver. Even if
you don't in fact receive a reference to a brand new
object, you still receive ownership of the reference. For instance,
\code{PyInt_FromLong()} maintains a cache of popular values and can
return a reference to a cached item.

Many functions that extract objects from other objects also transfer
ownership with the reference, for instance
\code{PyObject_GetAttrString()}. The picture is less clear here,
however, since a few common routines are exceptions:
\code{PyTuple_GetItem()}, \code{PyList_GetItem()} and
\code{PyDict_GetItem()} (and \code{PyDict_GetItemString()}) all return
references that you borrow from the tuple, list or dictionary.

The function \code{PyImport_AddModule()} also returns a borrowed
reference, even though it may actually create the object it returns:
this is possible because an owned reference to the object is stored in
\code{sys.modules}.

When you pass an object reference into another function, in general,
the function borrows the reference from you --- if it needs to store
it, it will use \code{Py_INCREF()} to become an independent owner.
There are exactly two important exceptions to this rule:
\code{PyTuple_SetItem()} and \code{PyList_SetItem()}. These functions
take over ownership of the item passed to them --- even if they fail!
(Note that \code{PyDict_SetItem()} and friends don't take over
ownership --- they are ``normal''.)

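The difference between the two conventions for passing a reference in
can be sketched with a toy model in plain \C{}. This is an
illustration under stated assumptions, not interpreter code: the names
\code{Obj}, \code{Slot}, \code{obj_new()}, \code{decref()},
\code{slot_set_borrowing()} and \code{slot_set_stealing()} are
invented here. The first setter follows the ``normal'' convention
described for \code{PyDict_SetItem()}: it creates its own reference,
and the caller must still dispose of the one it owns. The second
follows the convention described for \code{PyTuple_SetItem()} and
\code{PyList_SetItem()}: it takes over the caller's reference, so the
caller must not dispose of it again.

\begin{verbatim}
#include <stdlib.h>

typedef struct {
    int refcnt;
    long value;
} Obj;

typedef struct {
    Obj *item;          /* the slot owns one reference, or is NULL */
} Slot;

#define INCREF(o) ((o)->refcnt++)

static void decref(Obj *o) {
    if (--o->refcnt == 0)
        free(o);
}

/* Returns a new reference owned by the caller, or NULL on failure. */
static Obj *obj_new(long value) {
    Obj *o = (Obj *)malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refcnt = 1;
    o->value = value;
    return o;
}

/* "Normal" convention: the argument is borrowed.  The slot needs its
   own reference, so it creates one; the caller still owns the original. */
static void slot_set_borrowing(Slot *s, Obj *o) {
    INCREF(o);
    if (s->item != NULL)
        decref(s->item);
    s->item = o;
}

/* "Stealing" convention: the slot takes over the caller's reference.
   The caller must not dispose of it again afterwards. */
static void slot_set_stealing(Slot *s, Obj *o) {
    if (s->item != NULL)
        decref(s->item);
    s->item = o;
}
\end{verbatim}
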
When a \C{} function is called from Python, it borrows references to its
arguments from the caller. The caller owns a reference to the object,
so the borrowed reference's lifetime is guaranteed until the function
returns. Only when such a borrowed reference must be stored or passed
on must it be turned into an owned reference by calling
\code{Py_INCREF()}.

The object reference returned from a \C{} function that is called from
Python must be an owned reference --- ownership is transferred from the
function to its caller.

\subsection{Thin Ice}

There are a few situations where seemingly harmless use of a borrowed
reference can lead to problems. These all have to do with implicit
invocations of the interpreter, which can cause the owner of a
reference to dispose of it.

The first and most important case to know about is using
\code{Py_DECREF()} on an unrelated object while borrowing a reference
to a list item. For instance:

\begin{verbatim}
void
bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);   /* borrowed reference */
    PyList_SetItem(list, 1, PyInt_FromLong(0L));
    PyObject_Print(item, stdout, 0); /* BUG! */
}
\end{verbatim}

This function first borrows a reference to \code{list[0]}, then
replaces \code{list[1]} with the value \code{0}, and finally prints
the borrowed reference. Looks harmless, right? But it's not!

Let's follow the control flow into \code{PyList_SetItem()}. The list
owns references to all its items, so when item 1 is replaced, it has
to dispose of the original item 1. Now let's suppose the original
item 1 was an instance of a user-defined class, and let's further
suppose that the class defined a \code{__del__()} method. If this
class instance has a reference count of 1, disposing of it will call
its \code{__del__()} method.

Since it is written in Python, the \code{__del__()} method can execute
arbitrary Python code. Could it perhaps do something to invalidate
the reference to \code{item} in \code{bug()}? You bet! Assuming that
the list passed into \code{bug()} is accessible to the
\code{__del__()} method, it could execute a statement to the effect of
\code{del list[0]}, and assuming this was the last reference to that
object, it would free the memory associated with it, thereby
invalidating \code{item}.

The solution, once you know the source of the problem, is easy:
temporarily increment the reference count. The correct version of the
function reads:

\begin{verbatim}
void
no_bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);
    Py_INCREF(item);    /* keep item alive across the call below */
    PyList_SetItem(list, 1, PyInt_FromLong(0L));
    PyObject_Print(item, stdout, 0);
    Py_DECREF(item);
}
\end{verbatim}

This is a true story. An older version of Python contained variants
of this bug and someone spent a considerable amount of time in a \C{}
debugger to figure out why his \code{__del__()} methods would fail...

The second case of problems with a borrowed reference is a variant
involving threads. Normally, multiple threads in the Python
interpreter can't get in each other's way, because there is a global
lock protecting Python's entire object space. However, it is possible
to temporarily release this lock using the macro
\code{Py_BEGIN_ALLOW_THREADS}, and to re-acquire it using
\code{Py_END_ALLOW_THREADS}. This is common around blocking I/O
calls, to let other threads use the CPU while waiting for the I/O to
complete. Obviously, the following function has the same problem as
the previous one:

\begin{verbatim}
void
bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);   /* borrowed reference */
    Py_BEGIN_ALLOW_THREADS
    ...some blocking I/O call...
    Py_END_ALLOW_THREADS
    PyObject_Print(item, stdout, 0); /* BUG! */
}
\end{verbatim}

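The remedy is presumably the same as in the first case: hold an owned
reference for as long as the lock may be released. The following
sketch is written by analogy with \code{no_bug()} above and is not
taken from the interpreter sources; the blocking call is left as a
placeholder, and the name \code{no_bug()} is reused here only for
illustration.

\begin{verbatim}
void
no_bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);   /* borrowed */
    Py_INCREF(item);                            /* now owned */
    Py_BEGIN_ALLOW_THREADS
    ...some blocking I/O call...
    Py_END_ALLOW_THREADS
    PyObject_Print(item, stdout, 0);
    Py_DECREF(item);                            /* dispose of it */
}
\end{verbatim}
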
\subsection{NULL Pointers}

In general, functions that take object references as arguments don't
expect you to pass them \NULL{} pointers, and will dump core (or
cause later core dumps) if you do so. Functions that return object
references generally return \NULL{} only to indicate that an
exception occurred. The reason for not testing for \NULL{}
arguments is that functions often pass the objects they receive on to
other functions --- if each function were to test for \NULL{},
there would be a lot of redundant tests and the code would run slower.

It is better to test for \NULL{} only at the ``source'', i.e.\
when a pointer that may be \NULL{} is received, e.g.\ from
\code{malloc()} or from a function that may raise an exception.

The macros \code{Py_INCREF()} and \code{Py_DECREF()}
don't check for \NULL{} pointers --- however, their variants
\code{Py_XINCREF()} and \code{Py_XDECREF()} do.

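The division of labour described above can be sketched in plain \C{}.
This is a toy model with invented names (\code{Obj}, \code{obj_new()},
\code{obj_sum()}, \code{add_values()}, \code{decref()} and
\code{xdecref()}), not interpreter code. The only place a pointer is
tested is where it may first become \NULL{}, namely right after the
allocating calls; \code{obj_sum()} trusts its arguments, and the
cleanup uses a \NULL{}-tolerant variant in the spirit of
\code{Py_XDECREF()} because either pointer may be \NULL{} at that
point.

\begin{verbatim}
#include <stdlib.h>

typedef struct {
    int refcnt;
    long value;
} Obj;

static void decref(Obj *o) {          /* o must not be NULL */
    if (--o->refcnt == 0)
        free(o);
}

static void xdecref(Obj *o) {         /* tolerates NULL */
    if (o != NULL)
        decref(o);
}

/* A "source" of possibly-NULL pointers. */
static Obj *obj_new(long value) {
    Obj *o = (Obj *)malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refcnt = 1;
    o->value = value;
    return o;
}

/* Does not test for NULL: callers guarantee valid arguments. */
static long obj_sum(const Obj *a, const Obj *b) {
    return a->value + b->value;
}

/* Stores x + y in *result; returns 0 on success, -1 on failure. */
static int add_values(long x, long y, long *result) {
    Obj *a = obj_new(x);
    Obj *b = obj_new(y);
    int status = -1;

    if (a != NULL && b != NULL) {      /* test at the source only */
        *result = obj_sum(a, b);
        status = 0;
    }
    xdecref(a);                        /* either may be NULL here */
    xdecref(b);
    return status;
}
\end{verbatim}
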
The macros for checking for a particular object type
(\code{Py\var{type}_Check()}) don't check for \NULL{} pointers ---
again, there is much code that calls several of these in a row to test
an object against various different expected types, and this would
generate redundant tests. There are no variants with \NULL{}
checking.

The \C{} function calling mechanism guarantees that the argument list
passed to \C{} functions (\code{args} in the examples) is never
\NULL{} --- in fact it guarantees that it is always a tuple.%
\footnote{These guarantees don't hold when you use the ``old'' style
calling convention --- this is still found in much existing code.}

It is a severe error to ever let a \NULL{} pointer ``escape'' to
the Python user.


\section{Writing Extensions in \Cpp{}}

It is possible to write extension modules in \Cpp{}. Some restrictions
apply. If the main program (the Python interpreter) is compiled and
linked by the \C{} compiler, global or static objects with constructors
cannot be used. This is not a problem if the main program is linked
by the \Cpp{} compiler. Functions that will be called by the
Python interpreter (in particular, module initialization functions)
have to be declared using \code{extern "C"}.
It is unnecessary to enclose the Python header files in
\code{extern "C" \{...\}} --- they use this form already if the symbol
\samp{__cplusplus} is defined (all recent \Cpp{} compilers define this
symbol).

\chapter{Embedding Python in another application}

Embedding Python is similar to extending it, but not quite. The
difference is that when you extend Python, the main program of the
application is still the Python interpreter, while if you embed
Python, the main program may have nothing to do with Python ---
instead, some parts of the application occasionally call the Python
interpreter to run some Python code.

So if you are embedding Python, you are providing your own main
program. One of the things this main program has to do is initialize
the Python interpreter. At the very least, you have to call the
function \code{Py_Initialize()}. There are optional calls to pass command
line arguments to Python. Then later you can call the interpreter
from any part of the application.

There are several different ways to call the interpreter: you can pass
a string containing Python statements to \code{PyRun_SimpleString()},
or you can pass a stdio file pointer and a file name (for
identification in error messages only) to \code{PyRun_SimpleFile()}. You
can also call the lower-level operations described in the previous
chapters to construct and use Python objects.

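As a rough illustration, a main program along the following lines
initializes the interpreter and runs a short string of Python
statements. It is a minimal sketch rather than a complete
application: it assumes a Python version that provides
\code{Py_Finalize()} for shutting the interpreter down (a call not
discussed in this chapter), it passes no command line arguments to
Python, and the Python statements themselves are only an example. It
must be compiled against the Python header files and linked with the
Python library.

\begin{verbatim}
#include <Python.h>

int
main(void)
{
    /* Initialize the interpreter before any other Python call. */
    Py_Initialize();

    /* Run a string of Python statements; nonzero means failure. */
    if (PyRun_SimpleString("import sys\nprint(sys.version)\n") != 0) {
        Py_Finalize();
        return 1;
    }

    /* Shut the interpreter down again (assumed to be available). */
    Py_Finalize();
    return 0;
}
\end{verbatim}
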
A simple demo of embedding Python can be found in the directory
\file{Demo/embed}.


\section{Embedding Python in \Cpp{}}

It is also possible to embed Python in a \Cpp{} program; precisely how this
is done will depend on the details of the \Cpp{} system used. In general you
will need to write the main program in \Cpp{}, and use the \Cpp{} compiler
to compile and link your program. There is no need to recompile Python
itself using \Cpp{}.


\chapter{Dynamic Loading}

On most modern systems it is possible to configure Python to support
dynamic loading of extension modules implemented in \C{}. When shared
libraries are used, dynamic loading is configured automatically;
otherwise you have to select it as a build option (see below). Once
configured, dynamic loading is trivial to use: when a Python program
executes \code{import spam}, the search for modules tries to find a
file \file{spammodule.o} (\file{spammodule.so} when using shared
libraries) in the module search path, and if one is found, it is
loaded into the executing binary and executed. Once loaded, the
module acts just like a built-in extension module.

The advantages of dynamic loading are twofold: the ``core'' Python
binary gets smaller, and users can extend Python with their own
modules implemented in \C{} without having to build and maintain their
own copy of the Python interpreter. There are also disadvantages:
dynamic loading isn't available on all systems (this just means that
on some systems you have to use static loading), and dynamically
loading a module that was compiled for a different version of Python
(e.g.\ with a different representation of objects) may dump core.


\section{Configuring and Building the Interpreter for Dynamic Loading}

There are three styles of dynamic loading: one using shared libraries,
one using SGI IRIX 4 dynamic loading, and one using GNU dynamic
loading.

\subsection{Shared Libraries}

The following systems support dynamic loading using shared libraries:
SunOS 4; Solaris 2; SGI IRIX 5 (but not SGI IRIX 4!); and probably all
systems derived from SVR4, or at least those SVR4 derivatives that
support shared libraries (are there any that don't?).

You don't need to do anything to configure dynamic loading on these
systems --- the \file{configure} script detects the presence of the
\file{<dlfcn.h>} header file and automatically configures dynamic
loading.

\subsection{SGI IRIX 4 Dynamic Loading}

Only SGI IRIX 4 supports dynamic loading of modules using SGI dynamic
loading. (SGI IRIX 5 might also support it but it is inferior to
using shared libraries so there is no reason to; a small test didn't
work right away so I gave up trying to support it.)

Before you build Python, you first need to fetch and build the \code{dl}
package written by Jack Jansen. This is available by anonymous ftp
from \url{ftp://ftp.cwi.nl/pub/dynload}, file
\file{dl-1.6.tar.Z}. (The version number may change.) Follow the
instructions in the package's \file{README} file to build it.

Once you have built \code{dl}, you can configure Python to use it. To
this end, you run the \file{configure} script with the option
\code{--with-dl=\var{directory}} where \var{directory} is the absolute
pathname of the \code{dl} directory.

Now build and install Python as you normally would (see the
\file{README} file in the toplevel Python directory).

\subsection{GNU Dynamic Loading}

GNU dynamic loading supports (according to its \file{README} file) the
following hardware and software combinations: VAX (Ultrix), Sun 3
(SunOS 3.4 and 4.0), Sparc (SunOS 4.0), Sequent Symmetry (Dynix), and
Atari ST. There is no reason to use it on a Sparc; I haven't seen a
Sun 3 for years so I don't know if these have shared libraries or not.

You need to fetch and build two packages.
One is GNU DLD. All development of this code has been done with DLD
version 3.2.3, which is available by anonymous ftp from
\url{ftp://ftp.cwi.nl/pub/dynload}, file
\file{dld-3.2.3.tar.Z}. (A more recent version of DLD is available
via \url{http://www-swiss.ai.mit.edu/~jaffer/DLD.html} but this has
not been tested.)
The other package needed is an
emulation of Jack Jansen's \code{dl} package that I wrote on top of
GNU DLD 3.2.3. This is available from the same host and directory,
file \file{dl-dld-1.1.tar.Z}. (The version number may change --- but I doubt
it will.) Follow the instructions in each package's \file{README}
file to configure and build them.

Now configure Python. Run the \file{configure} script with the option
\code{--with-dl-dld=\var{dl-directory},\var{dld-directory}} where
\var{dl-directory} is the absolute pathname of the directory where you
have built the \file{dl-dld} package, and \var{dld-directory} is that
of the GNU DLD package. The Python interpreter you build hereafter
will support GNU dynamic loading.


\section{Building a Dynamically Loadable Module}

Since there are three styles of dynamic loading, there are also three
groups of instructions for building a dynamically loadable module.
Instructions common to all three styles are given first. Assuming
your module is called \code{spam}, the source filename must be
\file{spammodule.c}, so the object name is \file{spammodule.o}. The
module must be written as a normal Python extension module (as
described earlier).

Note that in all cases you will have to create your own Makefile that
compiles your module file(s). This Makefile will have to pass two
\samp{-I} arguments to the \C{} compiler which will make it find the
Python header files. If the Make variable \var{PYTHONTOP} points to
the toplevel Python directory, your \var{CFLAGS} Make variable should
contain the options \samp{-I\$(PYTHONTOP) -I\$(PYTHONTOP)/Include}.
(Most header files are in the \file{Include} subdirectory, but the
\file{config.h} header lives in the toplevel directory.)


\subsection{Shared Libraries}

You must link the \file{.o} file to produce a shared library. This is
done using a special invocation of the \UNIX{} loader/linker,
\emph{ld}(1). Unfortunately the invocation differs slightly per
system.

On SunOS 4, use
\begin{verbatim}
ld spammodule.o -o spammodule.so
\end{verbatim}

On Solaris 2, use
\begin{verbatim}
ld -G spammodule.o -o spammodule.so
\end{verbatim}

On SGI IRIX 5, use
\begin{verbatim}
ld -shared spammodule.o -o spammodule.so
\end{verbatim}

On other systems, consult the manual page for \code{ld}(1) to find what
flags, if any, must be used.

If your extension module uses system libraries that haven't already
been linked with Python (e.g.\ a windowing system), these must be
passed to the \code{ld} command as \samp{-l} options after the
\samp{.o} file.

The resulting file \file{spammodule.so} must be copied into a directory
along the Python module search path.


\subsection{SGI IRIX 4 Dynamic Loading}

\strong{IMPORTANT:} You must compile your extension module with the
additional \C{} flag \samp{-G0} (or \samp{-G 0}). This instructs the
assembler to generate position-independent code.

You don't need to link the resulting \file{spammodule.o} file; just
copy it into a directory along the Python module search path.

The first time your extension is loaded, it takes some extra time and
a few messages may be printed. This creates a file
\file{spammodule.ld} which is an image that can be loaded quickly into
the Python interpreter process. When a new Python interpreter is
installed, the \code{dl} package detects this and rebuilds
\file{spammodule.ld}. The file \file{spammodule.ld} is placed in the
directory where \file{spammodule.o} was found, unless this directory is
unwritable; in that case it is placed in a temporary
directory.\footnote{Check the manual page of the \code{dl} package for
details.}

If your extension module uses additional system libraries, you must
create a file \file{spammodule.libs} in the same directory as the
\file{spammodule.o}. This file should contain one or more lines with
whitespace-separated options that will be passed to the linker ---
normally only \samp{-l} options or absolute pathnames of libraries
(\samp{.a} files) should be used.


\subsection{GNU Dynamic Loading}

Just copy \file{spammodule.o} into a directory along the Python module
search path.

If your extension module uses additional system libraries, you must
create a file \file{spammodule.libs} in the same directory as the
\file{spammodule.o}. This file should contain one or more lines with
whitespace-separated absolute pathnames of libraries (\samp{.a}
files). No \samp{-l} options can be used.


%\input{extref}

\input{ext.ind}

\end{document}