\documentstyle[twoside,11pt,myformat]{report}

% XXX PM Modulator

\title{Extending and Embedding the Python Interpreter}

\input{boilerplate}

% Tell \index to actually write the .idx file
\makeindex

\begin{document}

\pagenumbering{roman}

\maketitle

\input{copyright}

\begin{abstract}

\noindent
Python is an interpreted, object-oriented programming language. This
document describes how to write modules in C or \Cpp{} to extend the
Python interpreter with new modules. Those modules can define new
functions but also new object types and their methods. The document
also describes how to embed the Python interpreter in another
application, for use as an extension language. Finally, it shows how
to compile and link extension modules so that they can be loaded
dynamically (at run time) into the interpreter, if the underlying
operating system supports this feature.

This document assumes basic knowledge about Python. For an informal
introduction to the language, see the Python Tutorial. The Python
Reference Manual gives a more formal definition of the language. The
Python Library Reference documents the existing object types,
functions and modules (both built-in and written in Python) that give
the language its wide application range.

For a detailed description of the whole Python/C API, see the separate
Python/C API Reference Manual. \strong{Note:} While that manual is
still in a state of flux, it is safe to say that it is much more up to
date than the manual you're reading currently (which has been in need
of an upgrade for some time now).


\end{abstract}

\pagebreak

{
\parskip = 0mm
\tableofcontents
}

\pagebreak

\pagenumbering{arabic}


\chapter{Extending Python with C or \Cpp{} code}


\section{Introduction}

It is quite easy to add new built-in modules to Python, if you know
how to program in C. Such \dfn{extension modules} can do two things
that can't be done directly in Python: they can implement new built-in
object types, and they can call C library functions and system calls.

To support extensions, the Python API (Application Programmer's
Interface) defines a set of functions, macros and variables that
provide access to most aspects of the Python run-time system. The
Python API is incorporated in a C source file by including the header
\code{"Python.h"}.

The compilation of an extension module depends on its intended use as
well as on your system setup; details are given in a later section.


\section{A Simple Example}

Let's create an extension module called \samp{spam} (the favorite food
of Monty Python fans...) and let's say we want to create a Python
interface to the C library function \code{system()}.\footnote{An
interface for this function already exists in the standard module
\code{os} --- it was chosen as a simple and straightforward example.}
This function takes a null-terminated character string as argument and
returns an integer. We want this function to be callable from Python
as follows:

\bcode\begin{verbatim}
    >>> import spam
    >>> status = spam.system("ls -l")
\end{verbatim}\ecode
%
Begin by creating a file \file{spammodule.c}. (In general, if a
module is called \samp{spam}, the C file containing its implementation
is called \file{spammodule.c}; if the module name is very long, like
\samp{spammify}, the file name can be just \file{spammify.c}.)

The first line of our file can be:

\bcode\begin{verbatim}
    #include "Python.h"
\end{verbatim}\ecode
%
which pulls in the Python API (you can add a comment describing the
purpose of the module and a copyright notice if you like).

All user-visible symbols defined by \code{"Python.h"} have a prefix of
\samp{Py} or \samp{PY}, except those defined in standard header files.
For convenience, and since they are used extensively by the Python
interpreter, \code{"Python.h"} includes a few standard header files:
\code{<stdio.h>}, \code{<string.h>}, \code{<errno.h>}, and
\code{<stdlib.h>}. If the latter header file does not exist on your
system, \code{"Python.h"} declares the functions \code{malloc()},
\code{free()} and \code{realloc()} directly.

The next thing we add to our module file is the C function that will
be called when the Python expression \samp{spam.system(\var{string})}
is evaluated (we'll see shortly how it ends up being called):

\bcode\begin{verbatim}
    static PyObject *
    spam_system(self, args)
        PyObject *self;
        PyObject *args;
    {
        char *command;
        int sts;
        if (!PyArg_ParseTuple(args, "s", &command))
            return NULL;
        sts = system(command);
        return Py_BuildValue("i", sts);
    }
\end{verbatim}\ecode
%
There is a straightforward translation from the argument list in
Python (e.g.\ the single expression \code{"ls -l"}) to the arguments
passed to the C function. The C function always has two arguments,
conventionally named \var{self} and \var{args}.

The \var{self} argument is only used when the C function implements a
built-in method. This will be discussed later. In the example,
\var{self} will always be a \code{NULL} pointer, since we are defining
a function, not a method. (This is done so that the interpreter
doesn't have to understand two different types of C functions.)

The \var{args} argument will be a pointer to a Python tuple object
containing the arguments. Each item of the tuple corresponds to an
argument in the call's argument list. The arguments are Python
objects -- in order to do anything with them in our C function we have
to convert them to C values. The function \code{PyArg_ParseTuple()}
in the Python API checks the argument types and converts them to C
values. It uses a template string to determine the required types of
the arguments as well as the types of the C variables into which to
store the converted values. More about this later.

\code{PyArg_ParseTuple()} returns true (nonzero) if all arguments have
the right type and their values have been stored in the variables
whose addresses are passed. It returns false (zero) if an invalid
argument list was passed. In the latter case it also raises an
appropriate exception, so the calling function can return
\code{NULL} immediately (as we saw in the example).
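
As a small sketch of how the template string pairs up with the C
variables, consider a hypothetical function \code{spam.add()} (not part
of the example module) that takes two integers. It assumes the
\samp{i} format code, which converts a Python integer to a C
\code{int}, and it is written with Standard C prototypes:

\bcode\begin{verbatim}
    /* Hypothetical example: spam.add(a, b) returns a + b */
    static PyObject *
    spam_add(PyObject *self, PyObject *args)
    {
        int a, b;

        /* "ii" asks for exactly two integers; each one is stored in
           the C int whose address follows the template string. */
        if (!PyArg_ParseTuple(args, "ii", &a, &b))
            return NULL;    /* exception already set for us */
        return Py_BuildValue("i", a + b);
    }
\end{verbatim}\ecode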


\section{Intermezzo: Errors and Exceptions}

An important convention throughout the Python interpreter is the
following: when a function fails, it should set an exception condition
and return an error value (usually a \code{NULL} pointer). Exceptions
are stored in a static global variable inside the interpreter; if this
variable is \code{NULL} no exception has occurred. A second global
variable stores the ``associated value'' of the exception (the second
argument to \code{raise}). A third variable contains the stack
traceback in case the error originated in Python code. These three
variables are the C equivalents of the Python variables
\code{sys.exc_type}, \code{sys.exc_value} and \code{sys.exc_traceback}
(see the section on module \code{sys} in the Library Reference
Manual). It is important to know about them to understand how errors
are passed around.

The Python API defines a number of functions to set various types of
exceptions.

The most common one is \code{PyErr_SetString()}. Its arguments are an
exception object and a C string. The exception object is usually a
predefined object like \code{PyExc_ZeroDivisionError}. The C string
indicates the cause of the error and is converted to a Python string
object and stored as the ``associated value'' of the exception.

Another useful function is \code{PyErr_SetFromErrno()}, which only
takes an exception argument and constructs the associated value by
inspection of the (\UNIX{}) global variable \code{errno}. The most
general function is \code{PyErr_SetObject()}, which takes two object
arguments, the exception and its associated value. You don't need to
\code{Py_INCREF()} the objects passed to any of these functions.
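
The following sketch shows the first two functions side by side. The
function \code{spam_touch()} is hypothetical, and the sketch assumes a
system where \code{fopen()} sets \code{errno} on failure:

\bcode\begin{verbatim}
    /* Hypothetical example: create the named file if it is missing */
    static PyObject *
    spam_touch(PyObject *self, PyObject *args)
    {
        char *filename;
        FILE *fp;

        if (!PyArg_ParseTuple(args, "s", &filename))
            return NULL;
        if (filename[0] == '\0') {
            /* Exception object plus a C string as associated value */
            PyErr_SetString(PyExc_ValueError, "empty file name");
            return NULL;
        }
        fp = fopen(filename, "a");
        if (fp == NULL) {
            /* Associated value is constructed from errno */
            PyErr_SetFromErrno(PyExc_IOError);
            return NULL;
        }
        fclose(fp);
        Py_INCREF(Py_None);     /* return None */
        return Py_None;
    }
\end{verbatim}\ecode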

You can test non-destructively whether an exception has been set with
\code{PyErr_Occurred()}. This returns the current exception object,
or \code{NULL} if no exception has occurred. You normally don't need
to call \code{PyErr_Occurred()} to see whether an error occurred in a
function call, since you should be able to tell from the return value.

When a function \var{f} that calls another function \var{g} detects
that the latter fails, \var{f} should itself return an error value
(e.g.\ \code{NULL} or \code{-1}). It should \emph{not} call one of the
\code{PyErr_*()} functions --- one has already been called by \var{g}.
\var{f}'s caller is then supposed to also return an error indication
to \emph{its} caller, again \emph{without} calling \code{PyErr_*()},
and so on --- the most detailed cause of the error was already
reported by the function that first detected it. Once the error
reaches the Python interpreter's main loop, the main loop aborts the
currently executing Python code and tries to find an exception handler
specified by the Python programmer.
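
A minimal sketch of this convention, with hypothetical functions
\code{make_ratio()} in the role of \var{g} and \code{spam_ratio()} in
the role of \var{f}. It assumes the \samp{l} format code for C
\code{long} values:

\bcode\begin{verbatim}
    /* g: detects the problem first, so it sets the exception */
    static PyObject *
    make_ratio(long num, long den)
    {
        if (den == 0) {
            PyErr_SetString(PyExc_ZeroDivisionError,
                            "ratio with zero denominator");
            return NULL;
        }
        return Py_BuildValue("(ll)", num, den);
    }

    /* f: only passes the failure on to its own caller */
    static PyObject *
    spam_ratio(PyObject *self, PyObject *args)
    {
        long num, den;
        PyObject *result;

        if (!PyArg_ParseTuple(args, "ll", &num, &den))
            return NULL;
        result = make_ratio(num, den);
        if (result == NULL)
            return NULL;    /* no PyErr_*() call: g already made one */
        return result;
    }
\end{verbatim}\ecode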

(There are situations where a module can actually give a more detailed
error message by calling another \code{PyErr_*()} function, and in
such cases it is fine to do so. As a general rule, however, this is
not necessary, and can cause information about the cause of the error
to be lost: most operations can fail for a variety of reasons.)

To ignore an exception set by a function call that failed, the exception
condition must be cleared explicitly by calling \code{PyErr_Clear()}.
The only time C code should call \code{PyErr_Clear()} is if it doesn't
want to pass the error on to the interpreter but wants to handle it
completely by itself (e.g.\ by trying something else or pretending
nothing happened).
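
As an illustration of ``trying something else'', the hypothetical
helper below falls back to a default when an object has no usable
\code{size} attribute. The helper and the attribute name are
assumptions made for this sketch. Note that it clears \emph{any}
exception raised by the lookup, which may be broader than you want:

\bcode\begin{verbatim}
    /* Return obj.size as a C long, or fallback if that fails */
    static long
    size_or_default(PyObject *obj, long fallback)
    {
        PyObject *attr;
        long size;

        attr = PyObject_GetAttrString(obj, "size");
        if (attr == NULL) {
            PyErr_Clear();      /* handled here; don't pass it on */
            return fallback;
        }
        size = PyInt_AsLong(attr);
        Py_DECREF(attr);
        if (size == -1 && PyErr_Occurred()) {
            PyErr_Clear();      /* attribute was not an integer */
            return fallback;
        }
        return size;
    }
\end{verbatim}\ecode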

Note that a failing \code{malloc()} call must be turned into an
exception --- the direct caller of \code{malloc()} (or
\code{realloc()}) must call \code{PyErr_NoMemory()} and return a
failure indicator itself. All the object-creating functions
(\code{PyInt_FromLong()} etc.) already do this, so this note only
matters if you call \code{malloc()} directly.
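
A sketch of a direct \code{malloc()} caller follows. The function
\code{spam_zeros()} is hypothetical and only serves to show where
\code{PyErr_NoMemory()} goes:

\bcode\begin{verbatim}
    /* Hypothetical example: return a string of n '0' characters */
    static PyObject *
    spam_zeros(PyObject *self, PyObject *args)
    {
        int n;
        char *buf;
        PyObject *result;

        if (!PyArg_ParseTuple(args, "i", &n))
            return NULL;
        if (n < 0) {
            PyErr_SetString(PyExc_ValueError, "negative length");
            return NULL;
        }
        buf = (char *) malloc((size_t) n + 1);
        if (buf == NULL)
            return PyErr_NoMemory();  /* sets exception, returns NULL */
        memset(buf, '0', (size_t) n);
        buf[n] = '\0';
        result = PyString_FromString(buf);  /* makes its own copy */
        free(buf);
        return result;      /* NULL if the string object creation failed */
    }
\end{verbatim}\ecode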

Also note that, with the important exception of
\code{PyArg_ParseTuple()} and friends, functions that return an
integer status usually return a positive value or zero for success and
\code{-1} for failure, like \UNIX{} system calls.

Finally, be careful to clean up garbage (by making \code{Py_XDECREF()}
or \code{Py_DECREF()} calls for objects you have already created) when
you return an error indicator!
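
For instance, a hypothetical helper that creates two objects must
release the first one if creating the second fails. The sketch assumes
the \samp{O} format code of \code{Py_BuildValue()}, which adds its own
reference to each object it is given:

\bcode\begin{verbatim}
    /* Hypothetical example: build the tuple (first, second) */
    static PyObject *
    make_pair(long first, char *second)
    {
        PyObject *a, *b, *pair;

        a = PyInt_FromLong(first);
        if (a == NULL)
            return NULL;
        b = PyString_FromString(second);
        if (b == NULL) {
            Py_DECREF(a);       /* a already exists: clean it up */
            return NULL;
        }
        pair = Py_BuildValue("(OO)", a, b);
        Py_DECREF(a);           /* the tuple holds its own references */
        Py_DECREF(b);
        return pair;            /* NULL if Py_BuildValue() failed */
    }
\end{verbatim}\ecode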

The choice of which exception to raise is entirely yours. There are
predeclared C objects corresponding to all built-in Python exceptions,
e.g.\ \code{PyExc_ZeroDivisionError}, which you can use directly. Of
course, you should choose exceptions wisely --- don't use
\code{PyExc_TypeError} to mean that a file couldn't be opened (that
should probably be \code{PyExc_IOError}). If something's wrong with
the argument list, the \code{PyArg_ParseTuple()} function usually
raises \code{PyExc_TypeError}. If you have an argument whose value
must be in a particular range or must satisfy other conditions,
\code{PyExc_ValueError} is appropriate.

You can also define a new exception that is unique to your module.
For this, you usually declare a static object variable at the
beginning of your file, e.g.

\bcode\begin{verbatim}
    static PyObject *SpamError;
\end{verbatim}\ecode
%
and initialize it in your module's initialization function
(\code{initspam()}) with a string object, e.g.\ (leaving out the error
checking for now):

\bcode\begin{verbatim}
    void
    initspam()
    {
        PyObject *m, *d;
        m = Py_InitModule("spam", SpamMethods);
        d = PyModule_GetDict(m);
        SpamError = PyString_FromString("spam.error");
        PyDict_SetItemString(d, "error", SpamError);
    }
\end{verbatim}\ecode
%
Note that the Python name for the exception object is
\code{spam.error}. It is conventional for module and exception names
to be spelled in lower case. It is also conventional that the
\emph{value} of the exception object is the same as its name, e.g.\
the string \code{"spam.error"}.
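
Once \code{SpamError} is initialized, it can be raised like any
predefined exception. The variant of \code{spam_system()} below is only
a sketch: treating a negative status as a failure is an assumption
made for illustration, not something the example module requires.

\bcode\begin{verbatim}
    static PyObject *
    spam_system(PyObject *self, PyObject *args)
    {
        char *command;
        int sts;

        if (!PyArg_ParseTuple(args, "s", &command))
            return NULL;
        sts = system(command);
        if (sts < 0) {
            /* Assumed policy: a negative status means failure */
            PyErr_SetString(SpamError, "system command failed");
            return NULL;
        }
        return Py_BuildValue("i", sts);
    }
\end{verbatim}\ecode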


\section{Back to the Example}

Going back to our example function, you should now be able to
understand this statement:

\bcode\begin{verbatim}
        if (!PyArg_ParseTuple(args, "s", &command))
            return NULL;
\end{verbatim}\ecode
%
It returns \code{NULL} (the error indicator for functions returning
object pointers) if an error is detected in the argument list, relying
on the exception set by \code{PyArg_ParseTuple()}. Otherwise a pointer
to the string value of the argument has been stored in the local
variable \code{command}. This is a pointer assignment, not a copy of
the characters, and you are not supposed to modify the string to which
it points (so in Standard C, the variable \code{command} should
properly be declared as \samp{const char *command}).

The next statement is a call to the \UNIX{} function \code{system()},
passing it the string we just got from \code{PyArg_ParseTuple()}:

\bcode\begin{verbatim}
        sts = system(command);
\end{verbatim}\ecode
%
Our \code{spam.system()} function must return the value of \code{sts}
as a Python object. This is done using the function
\code{Py_BuildValue()}, which is something like the inverse of
\code{PyArg_ParseTuple()}: it takes a format string and an arbitrary
number of C values, and returns a new Python object. More info on
\code{Py_BuildValue()} is given later.

\bcode\begin{verbatim}
        return Py_BuildValue("i", sts);
\end{verbatim}\ecode
%
In this case, it will return an integer object. (Yes, even integers
are objects on the heap in Python!)

If you have a C function that returns no useful value (a function
returning \code{void}), the corresponding Python function must return
\code{None}. You need this idiom to do so:

\bcode\begin{verbatim}
        Py_INCREF(Py_None);
        return Py_None;
\end{verbatim}\ecode
%
\code{Py_None} is the C name for the special Python object
\code{None}. It is a genuine Python object (not a \code{NULL}
pointer, which means ``error'' in most contexts, as we have seen).
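
In context, the idiom might be used as in the sketch below, which
wraps the C library function \code{srand()}. The wrapper
\code{spam_srand()} is hypothetical and not part of the example
module:

\bcode\begin{verbatim}
    /* Hypothetical example: spam.srand(seed) returns None */
    static PyObject *
    spam_srand(PyObject *self, PyObject *args)
    {
        int seed;

        if (!PyArg_ParseTuple(args, "i", &seed))
            return NULL;
        srand((unsigned int) seed);   /* returns void */
        Py_INCREF(Py_None);
        return Py_None;
    }
\end{verbatim}\ecode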


\section{The Module's Method Table and Initialization Function}

I promised to show how \code{spam_system()} is called from Python
programs. First, we need to list its name and address in a ``method
table'':

\bcode\begin{verbatim}
    static PyMethodDef SpamMethods[] = {
        ...
        {"system",  spam_system, 1},
        ...
        {NULL,      NULL}        /* Sentinel */
    };
\end{verbatim}\ecode
%
Note the third item in the entry (\samp{1}). This is a flag telling the
interpreter the calling convention to be used for the C function. It
should normally be \samp{1}; a value of \samp{0} means that an
obsolete variant of \code{PyArg_ParseTuple()} is used.

The method table must be passed to the interpreter in the module's
initialization function (which should be the only non-\code{static}
item defined in the module file):

\bcode\begin{verbatim}
    void
    initspam()
    {
        (void) Py_InitModule("spam", SpamMethods);
    }
\end{verbatim}\ecode
%
When the Python program imports module \code{spam} for the first time,
\code{initspam()} is called. It calls \code{Py_InitModule()}, which
creates a ``module object'' (which is inserted in the dictionary
\code{sys.modules} under the key \code{"spam"}), and inserts built-in
function objects into the newly created module based upon the table
(an array of \code{PyMethodDef} structures) that was passed as its
second argument. \code{Py_InitModule()} returns a pointer to the
module object that it creates (which is unused here). It aborts with
a fatal error if the module could not be initialized satisfactorily,
so the caller doesn't need to check for errors.
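
Assembled from the fragments shown so far, a complete
\file{spammodule.c} might look like the sketch below. The only
liberties taken are the Standard C prototypes in place of the
old-style function definitions, and a method table that lists just
the one function:

\bcode\begin{verbatim}
    #include "Python.h"

    static PyObject *
    spam_system(PyObject *self, PyObject *args)
    {
        char *command;
        int sts;

        if (!PyArg_ParseTuple(args, "s", &command))
            return NULL;
        sts = system(command);
        return Py_BuildValue("i", sts);
    }

    /* One entry per function, terminated by the sentinel */
    static PyMethodDef SpamMethods[] = {
        {"system",  spam_system, 1},
        {NULL,      NULL}        /* Sentinel */
    };

    /* The only non-static item in the file */
    void
    initspam(void)
    {
        (void) Py_InitModule("spam", SpamMethods);
    }
\end{verbatim}\ecode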


\section{Compilation and Linkage}

There are two more things to do before you can use your new extension:
compiling and linking it with the Python system. If you use dynamic
loading, the details depend on the style of dynamic loading your
system uses; see the chapter on Dynamic Loading for more info about
this.

If you can't use dynamic loading, or if you want to make your module a
permanent part of the Python interpreter, you will have to change the
configuration setup and rebuild the interpreter. Luckily, this is
very simple: just place your file (\file{spammodule.c} for example) in
the \file{Modules} directory, add a line to the file
\file{Modules/Setup} describing your file:

\bcode\begin{verbatim}
    spam spammodule.o
\end{verbatim}\ecode
%
and rebuild the interpreter by running \code{make} in the toplevel
directory. You can also run \code{make} in the \file{Modules}
subdirectory, but then you must first rebuild the \file{Makefile}
there by running \code{make Makefile}. (This is necessary each time
you change the \file{Setup} file.)

If your module requires additional libraries to link with, these can
be listed on the line in the \file{Setup} file as well, for instance:

\bcode\begin{verbatim}
    spam spammodule.o -lX11
\end{verbatim}\ecode
%
\section{Calling Python Functions From C}

So far we have concentrated on making C functions callable from
Python. The reverse is also useful: calling Python functions from C.
This is especially the case for libraries that support so-called
``callback'' functions. If a C interface makes use of callbacks, the
equivalent Python interface often needs to provide a callback
mechanism to the Python programmer; the implementation will require
calling the Python callback functions from a C callback. Other uses
are also imaginable.

Fortunately, the Python interpreter is easily called recursively, and
there is a standard interface to call a Python function. (I won't
dwell on how to call the Python parser with a particular string as
input --- if you're interested, have a look at the implementation of
the \samp{-c} command line option in \file{Python/pythonmain.c}.)

Calling a Python function is easy. First, the Python program must
somehow pass you the Python function object. You should provide a
function (or some other interface) to do this. When this function is
called, save a pointer to the Python function object (be careful to
\code{Py_INCREF()} it!) in a global variable --- or wherever you see fit.
For example, the following function might be part of a module
definition:

\bcode\begin{verbatim}
    static PyObject *my_callback = NULL;

    static PyObject *
    my_set_callback(dummy, arg)
        PyObject *dummy, *arg;
    {
        Py_XDECREF(my_callback); /* Dispose of previous callback */
        Py_XINCREF(arg);         /* Add a reference to new callback */
        my_callback = arg;       /* Remember new callback */
        /* Boilerplate to return "None" */
        Py_INCREF(Py_None);
        return Py_None;
    }
\end{verbatim}\ecode
%
The macros \code{Py_XINCREF()} and \code{Py_XDECREF()} increment/decrement
the reference count of an object and are safe in the presence of
\code{NULL} pointers. More information on them is given in the section
on Reference Counts below.
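
If the function is listed in the method table with calling convention
\samp{1}, its second argument is the tuple of all arguments rather
than the callback object itself. A sketch for that case follows. It
assumes the \samp{O} format code, which stores the argument object
without conversion, and uses \code{PyCallable_Check()} to reject
objects that cannot be called:

\bcode\begin{verbatim}
    static PyObject *my_callback = NULL;

    static PyObject *
    my_set_callback(PyObject *dummy, PyObject *args)
    {
        PyObject *func;

        if (!PyArg_ParseTuple(args, "O", &func))
            return NULL;
        if (!PyCallable_Check(func)) {
            PyErr_SetString(PyExc_TypeError,
                            "parameter must be callable");
            return NULL;
        }
        Py_INCREF(func);          /* Add a reference to new callback */
        Py_XDECREF(my_callback);  /* Dispose of previous callback */
        my_callback = func;       /* Remember new callback */
        Py_INCREF(Py_None);
        return Py_None;
    }
\end{verbatim}\ecode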

Later, when it is time to call the function, you call the C function
\code{PyEval_CallObject()}. This function has two arguments, both
pointers to arbitrary Python objects: the Python function, and the
argument list. The argument list must always be a tuple object, whose
length is the number of arguments. To call the Python function with
no arguments, pass an empty tuple; to call it with one argument, pass
a singleton tuple. \code{Py_BuildValue()} returns a tuple when its
format string consists of zero or more format codes between
parentheses. For example:

\bcode\begin{verbatim}
    int arg;
    PyObject *arglist;
    PyObject *result;
    ...
    arg = 123;
    ...
    /* Time to call the callback */
    arglist = Py_BuildValue("(i)", arg);
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
\end{verbatim}\ecode
486%
Guido van Rossum5049bcb1995-03-13 16:55:23 +0000487\code{PyEval_CallObject()} returns a Python object pointer: this is
488the return value of the Python function. \code{PyEval_CallObject()} is
Guido van Rossumb92112d1995-03-20 14:24:09 +0000489``reference-count-neutral'' with respect to its arguments. In the
Guido van Rossum6938f061994-08-01 12:22:53 +0000490example a new tuple was created to serve as the argument list, which
Guido van Rossum5049bcb1995-03-13 16:55:23 +0000491is \code{Py_DECREF()}-ed immediately after the call.
Guido van Rossum6938f061994-08-01 12:22:53 +0000492
Guido van Rossum5049bcb1995-03-13 16:55:23 +0000493The return value of \code{PyEval_CallObject()} is ``new'': either it
494is a brand new object, or it is an existing object whose reference
495count has been incremented. So, unless you want to save it in a
496global variable, you should somehow \code{Py_DECREF()} the result,
497even (especially!) if you are not interested in its value.
Guido van Rossum7a2dba21993-11-05 14:45:11 +0000498
Before you do this, however, it is important to check that the return
value isn't \code{NULL}. If it is, the Python function terminated by raising
an exception. If the C code that called \code{PyEval_CallObject()} is
called from Python, it should now return an error indication to its
Python caller, so the interpreter can print a stack trace, or the
calling Python code can handle the exception. If this is not possible
or desirable, the exception should be cleared by calling
\code{PyErr_Clear()}. For example:

\bcode\begin{verbatim}
    if (result == NULL)
        return NULL; /* Pass error back */
    ...use result...
    Py_DECREF(result);
\end{verbatim}\ecode
%
Depending on the desired interface to the Python callback function,
you may also have to provide an argument list to \code{PyEval_CallObject()}.
In some cases the argument list is also provided by the Python
program, through the same interface that specified the callback
function. It can then be saved and used in the same manner as the
function object. In other cases, you may have to construct a new
tuple to pass as the argument list. The simplest way to do this is to
call \code{Py_BuildValue()}. For example, if you want to pass an integral
event code, you might use the following code:

\bcode\begin{verbatim}
    PyObject *arglist;
    ...
    arglist = Py_BuildValue("(l)", eventcode);
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
    if (result == NULL)
        return NULL; /* Pass error back */
    /* Here maybe use the result */
    Py_DECREF(result);
\end{verbatim}\ecode
%
Note the placement of \code{Py_DECREF(arglist)} immediately after the call,
before the error check! Also note that, strictly speaking, this code is
not complete: \code{Py_BuildValue()} may run out of memory, and this should
be checked.


\section{Format Strings for {\tt PyArg_ParseTuple()}}

The \code{PyArg_ParseTuple()} function is declared as follows:

\bcode\begin{verbatim}
    int PyArg_ParseTuple(PyObject *arg, char *format, ...);
\end{verbatim}\ecode
%
The \var{arg} argument must be a tuple object containing an argument
list passed from Python to a C function. The \var{format} argument
must be a format string, whose syntax is explained below. The
remaining arguments must be addresses of variables whose type is
determined by the format string. For the conversion to succeed, the
\var{arg} object must match the format and the format must be
exhausted.

Note that while \code{PyArg_ParseTuple()} checks that the Python
arguments have the required types, it cannot check the validity of the
addresses of C variables passed to the call: if you make mistakes
there, your code will probably crash or at least overwrite random bits
in memory. So be careful!

A format string consists of zero or more ``format units''. A format
unit describes one Python object; it is usually a single character or
a parenthesized sequence of format units. With a few exceptions, a
format unit that is not a parenthesized sequence normally corresponds
to a single address argument to \code{PyArg_ParseTuple()}. In the
following description, the quoted form is the format unit; the entry
in (round) parentheses is the Python object type that matches the
format unit; and the entry in [square] brackets is the type of the C
variable(s) whose address should be passed. (Use the \samp{\&}
operator to pass a variable's address.)

\begin{description}

\item[\samp{s} (string) {[char *]}]
Convert a Python string to a C pointer to a character string. You
must not provide storage for the string itself; a pointer to an
existing string is stored into the character pointer variable whose
address you pass. The C string is null-terminated. The Python string
must not contain embedded null bytes; if it does, a \code{TypeError}
exception is raised.

\item[\samp{s\#} (string) {[char *, int]}]
This variant on \samp{s} stores into two C variables, the first one
a pointer to a character string, the second one its length. In this
case the Python string may contain embedded null bytes.

\item[\samp{z} (string or \code{None}) {[char *]}]
Like \samp{s}, but the Python object may also be \code{None}, in which
case the C pointer is set to \code{NULL}.

\item[\samp{z\#} (string or \code{None}) {[char *, int]}]
This is to \samp{s\#} as \samp{z} is to \samp{s}.

\item[\samp{b} (integer) {[char]}]
Convert a Python integer to a tiny int, stored in a C \code{char}.

\item[\samp{h} (integer) {[short int]}]
Convert a Python integer to a C \code{short int}.

\item[\samp{i} (integer) {[int]}]
Convert a Python integer to a plain C \code{int}.

\item[\samp{l} (integer) {[long int]}]
Convert a Python integer to a C \code{long int}.

\item[\samp{c} (string of length 1) {[char]}]
Convert a Python character, represented as a string of length 1, to a
C \code{char}.

\item[\samp{f} (float) {[float]}]
Convert a Python floating point number to a C \code{float}.

\item[\samp{d} (float) {[double]}]
Convert a Python floating point number to a C \code{double}.

\item[\samp{O} (object) {[PyObject *]}]
Store a Python object (without any conversion) in a C object pointer.
The C program thus receives the actual object that was passed. The
object's reference count is not increased. The pointer stored is not
\code{NULL}.

\item[\samp{O!} (object) {[\var{typeobject}, PyObject *]}]
Store a Python object in a C object pointer. This is similar to
\samp{O}, but takes two C arguments: the first is the address of a
Python type object, the second is the address of the C variable (of
type \code{PyObject *}) into which the object pointer is stored.
If the Python object does not have the required type, a
\code{TypeError} exception is raised.

\item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
Convert a Python object to a C variable through a \var{converter}
function. This takes two arguments: the first is a function, the
second is the address of a C variable (of arbitrary type), converted
to \code{void *}. The \var{converter} function in turn is called as
follows:

\code{\var{status} = \var{converter}(\var{object}, \var{address});}

where \var{object} is the Python object to be converted and
\var{address} is the \code{void *} argument that was passed to
\code{PyArg_ParseTuple()}. The returned \var{status} should be
\code{1} for a successful conversion and \code{0} if the conversion
has failed. When the conversion fails, the \var{converter} function
should raise an exception.

\item[\samp{S} (string) {[PyStringObject *]}]
Like \samp{O} but requires that the Python object is a string object.
A \code{TypeError} exception is raised if the object is not a string
object. The C variable may also be declared as \code{PyObject *}.

\item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
The object must be a Python tuple whose length is the number of format
units in \var{items}. The C arguments must correspond to the
individual format units in \var{items}. Format units for tuples may
be nested.

\end{description}

It is possible to pass Python long integers where integers are
requested; however no proper range checking is done -- the most
significant bits are silently truncated when the receiving field is
too small to receive the value (actually, the semantics are inherited
from downcasts in C -- your mileage may vary).
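
The truncation can be seen in plain C, independently of Python. The
following is a minimal sketch; the helper \code{narrow_to_uchar()} is a
hypothetical name used only for illustration. It uses unsigned types,
for which C defines narrowing as reduction modulo one more than the
largest value of the target type; for signed targets the result is
implementation-defined.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: store a wide value in a narrow unsigned field.
   The result is the value modulo (UCHAR_MAX + 1), so the most
   significant bits are dropped without any diagnostic. */
unsigned char narrow_to_uchar(unsigned long value)
{
    return (unsigned char)value;
}
```

A value that fits, such as 200, comes back unchanged; a larger value
keeps only its low-order bits.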

A few other characters have a meaning in a format string. These may
not occur inside nested parentheses. They are:

\begin{description}

\item[\samp{|}]
Indicates that the remaining arguments in the Python argument list are
optional. The C variables corresponding to optional arguments should
be initialized to their default value --- when an optional argument is
not specified, \code{PyArg_ParseTuple()} does not touch the contents
of the corresponding C variable(s).

\item[\samp{:}]
The list of format units ends here; the string after the colon is used
as the function name in error messages (the ``associated value'' of
the exceptions that \code{PyArg_ParseTuple()} raises).

\item[\samp{;}]
The list of format units ends here; the string after the semicolon is
used as the error message \emph{instead} of the default error message.
Clearly, \samp{:} and \samp{;} mutually exclude each other.

\end{description}

Some example calls:

\bcode\begin{verbatim}
    int ok;
    int i, j;
    long k, l;
    char *s;
    int size;

    ok = PyArg_ParseTuple(args, ""); /* No arguments */
        /* Python call: f() */

    ok = PyArg_ParseTuple(args, "s", &s); /* A string */
        /* Possible Python call: f('whoops!') */

    ok = PyArg_ParseTuple(args, "lls", &k, &l, &s); /* Two longs and a string */
        /* Possible Python call: f(1, 2, 'three') */

    ok = PyArg_ParseTuple(args, "(ii)s#", &i, &j, &s, &size);
        /* A pair of ints and a string, whose size is also returned */
        /* Possible Python call: f((1, 2), 'three') */

    {
        char *file;
        char *mode = "r";
        int bufsize = 0;
        ok = PyArg_ParseTuple(args, "s|si", &file, &mode, &bufsize);
        /* A string, and optionally another string and an integer */
        /* Possible Python calls:
           f('spam')
           f('spam', 'w')
           f('spam', 'wb', 100000) */
    }

    {
        int left, top, right, bottom, h, v;
        ok = PyArg_ParseTuple(args, "((ii)(ii))(ii)",
                 &left, &top, &right, &bottom, &h, &v);
        /* A rectangle and a point */
        /* Possible Python call:
           f(((0, 0), (400, 300)), (10, 10)) */
    }
\end{verbatim}\ecode
%
\section{The {\tt Py_BuildValue()} Function}

This function is the counterpart to \code{PyArg_ParseTuple()}. It is
declared as follows:

\bcode\begin{verbatim}
    PyObject *Py_BuildValue(char *format, ...);
\end{verbatim}\ecode
%
It recognizes a set of format units similar to the ones recognized by
\code{PyArg_ParseTuple()}, but the arguments (which are input to the
function, not output) must not be pointers, just values. It returns a
new Python object, suitable for returning from a C function called
from Python.

One difference with \code{PyArg_ParseTuple()}: while the latter
requires its first argument to be a tuple (since Python argument lists
are always represented as tuples internally), \code{Py_BuildValue()} does
not always build a tuple. It builds a tuple only if its format string
contains two or more format units. If the format string is empty, it
returns \code{None}; if it contains exactly one format unit, it
returns whatever object is described by that format unit. To force it
to return a tuple of size 0 or 1, parenthesize the format string.

In the following description, the quoted form is the format unit; the
entry in (round) parentheses is the Python object type that the format
unit will return; and the entry in [square] brackets is the type of
the C value(s) to be passed.

The characters space, tab, colon and comma are ignored in format
strings (but not within format units such as \samp{s\#}). This can be
used to make long format strings a tad more readable.

\begin{description}

\item[\samp{s} (string) {[char *]}]
Convert a null-terminated C string to a Python object. If the C
string pointer is \code{NULL}, \code{None} is returned.

\item[\samp{s\#} (string) {[char *, int]}]
Convert a C string and its length to a Python object. If the C string
pointer is \code{NULL}, the length is ignored and \code{None} is
returned.

\item[\samp{z} (string or \code{None}) {[char *]}]
Same as \samp{s}.

\item[\samp{z\#} (string or \code{None}) {[char *, int]}]
Same as \samp{s\#}.

\item[\samp{i} (integer) {[int]}]
Convert a plain C \code{int} to a Python integer object.

\item[\samp{b} (integer) {[char]}]
Same as \samp{i}.

\item[\samp{h} (integer) {[short int]}]
Same as \samp{i}.

\item[\samp{l} (integer) {[long int]}]
Convert a C \code{long int} to a Python integer object.

\item[\samp{c} (string of length 1) {[char]}]
Convert a C \code{int} representing a character to a Python string of
length 1.

\item[\samp{d} (float) {[double]}]
Convert a C \code{double} to a Python floating point number.

\item[\samp{f} (float) {[float]}]
Same as \samp{d}.

\item[\samp{O} (object) {[PyObject *]}]
Pass a Python object untouched (except for its reference count, which
is incremented by one). If the object passed in is a \code{NULL}
pointer, it is assumed that the call producing the argument found an
error and set an exception. Therefore, \code{Py_BuildValue()} will
return \code{NULL} but won't raise an exception. If no exception has
been raised yet, \code{PyExc_SystemError} is set.

\item[\samp{S} (object) {[PyObject *]}]
Same as \samp{O}.

\item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
Convert \var{anything} to a Python object through a \var{converter}
function. The function is called with \var{anything} (which should be
compatible with \code{void *}) as its argument and should return a
``new'' Python object, or \code{NULL} if an error occurred.

\item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
Convert a sequence of C values to a Python tuple with the same number
of items.

\item[\samp{[\var{items}]} (list) {[\var{matching-items}]}]
Convert a sequence of C values to a Python list with the same number
of items.

\item[\samp{\{\var{items}\}} (dictionary) {[\var{matching-items}]}]
Convert a sequence of C values to a Python dictionary. Each pair of
consecutive C values adds one item to the dictionary, serving as key
and value, respectively.

\end{description}

If there is an error in the format string, the
\code{PyExc_SystemError} exception is raised and \code{NULL} returned.

Examples (to the left the call, to the right the resulting Python value):

\bcode\begin{verbatim}
    Py_BuildValue("")                        None
    Py_BuildValue("i", 123)                  123
    Py_BuildValue("iii", 123, 456, 789)      (123, 456, 789)
    Py_BuildValue("s", "hello")              'hello'
    Py_BuildValue("ss", "hello", "world")    ('hello', 'world')
    Py_BuildValue("s#", "hello", 4)          'hell'
    Py_BuildValue("()")                      ()
    Py_BuildValue("(i)", 123)                (123,)
    Py_BuildValue("(ii)", 123, 456)          (123, 456)
    Py_BuildValue("(i,i)", 123, 456)         (123, 456)
    Py_BuildValue("[i,i]", 123, 456)         [123, 456]
    Py_BuildValue("{s:i,s:i}",
                  "abc", 123, "def", 456)    {'abc': 123, 'def': 456}
    Py_BuildValue("((ii)(ii)) (ii)",
                  1, 2, 3, 4, 5, 6)          (((1, 2), (3, 4)), (5, 6))
\end{verbatim}\ecode
%
\section{Reference Counts}

\subsection{Introduction}

In languages like C or \Cpp{}, the programmer is responsible for
dynamic allocation and deallocation of memory on the heap. In C, this
is done using the functions \code{malloc()} and \code{free()}. In
\Cpp{}, the operators \code{new} and \code{delete} are used with
essentially the same meaning; they are typically implemented using
\code{malloc()} and \code{free()}, so we'll restrict the following
discussion to the latter.

Every block of memory allocated with \code{malloc()} should eventually
be returned to the pool of available memory by exactly one call to
\code{free()}. It is important to call \code{free()} at the right
time. If a block's address is forgotten but \code{free()} is not
called for it, the memory it occupies cannot be reused until the
program terminates. This is called a \dfn{memory leak}. On the other
hand, if a program calls \code{free()} for a block and then continues
to use the block, it creates a conflict with re-use of the block
through another \code{malloc()} call. This is called \dfn{using freed
memory}. It has the same bad consequences as referencing uninitialized
data --- core dumps, wrong results, mysterious crashes.

Common causes of memory leaks are unusual paths through the code. For
instance, a function may allocate a block of memory, do some
calculation, and then free the block again. Now a change in the
requirements for the function may add a test to the calculation that
detects an error condition and can return prematurely from the
function. It's easy to forget to free the allocated memory block when
taking this premature exit, especially when it is added later to the
code. Such leaks, once introduced, often go undetected for a long
time: the error exit is taken only in a small fraction of all calls,
and most modern machines have plenty of virtual memory, so the leak
only becomes apparent in a long-running process that uses the leaking
function frequently. Therefore, it's important to prevent leaks from
happening by having a coding convention or strategy that minimizes
this kind of error.
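
One such convention, shown here only as an illustration and not as
something this manual prescribes, is to route every exit taken after a
successful allocation through a single cleanup label. The function
\code{checksum()} below is a hypothetical example in plain C; the
premature exit for an empty string stands in for an error test added
later.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: sum the bytes of a string via a scratch copy.
   Returns -1 on error.  Every path taken after the successful
   malloc() leaves through the "done" label, so the block is freed
   exactly once. */
long checksum(const char *text)
{
    long result = -1;
    size_t i, n;
    unsigned char *copy;

    if (text == NULL)
        return -1;              /* nothing allocated yet */
    n = strlen(text);
    copy = malloc(n + 1);
    if (copy == NULL)
        return -1;              /* allocation failed, nothing to free */
    memcpy(copy, text, n + 1);

    if (n == 0)
        goto done;              /* premature exit: block is still freed */

    result = 0;
    for (i = 0; i < n; i++)
        result += copy[i];

done:
    free(copy);
    return result;
}
```

Because the error exit jumps to the same label as the normal path,
adding further tests later does not require remembering a separate
\code{free()} call for each of them.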

Since Python makes heavy use of \code{malloc()} and \code{free()}, it
needs a strategy to avoid memory leaks as well as the use of freed
memory. The chosen method is called \dfn{reference counting}. The
principle is simple: every object contains a counter, which is
incremented when a reference to the object is stored somewhere, and
which is decremented when a reference to it is deleted. When the
counter reaches zero, the last reference to the object has been
deleted and the object is freed.
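
The principle can be sketched in a few lines of plain C. This is a toy
model, not Python's actual object layout: the type \code{toy_object}
and the functions \code{toy_new()}, \code{toy_incref()},
\code{toy_decref()} and \code{toy_live_count()} are hypothetical names
introduced for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model, not Python's actual object layout. */
typedef struct {
    int refcnt;     /* number of stored references */
    int value;
} toy_object;

static int toy_live = 0;    /* objects currently allocated */

toy_object *toy_new(int value)
{
    toy_object *op = malloc(sizeof(toy_object));
    if (op == NULL)
        return NULL;
    op->refcnt = 1;          /* the caller holds the first reference */
    op->value = value;
    toy_live++;
    return op;
}

/* A reference to the object is stored somewhere. */
void toy_incref(toy_object *op)
{
    op->refcnt++;
}

/* A reference to the object is deleted. */
void toy_decref(toy_object *op)
{
    assert(op->refcnt > 0);
    if (--op->refcnt == 0) { /* last reference gone: free the object */
        toy_live--;
        free(op);
    }
}

int toy_live_count(void)
{
    return toy_live;
}
```

An object created with \code{toy_new()} stays allocated through any
balanced sequence of \code{toy_incref()} and \code{toy_decref()} calls
and is freed by the call that drops the count to zero.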

An alternative strategy is called \dfn{automatic garbage collection}.
(Sometimes, reference counting is also referred to as a garbage
collection strategy, hence my use of ``automatic'' to distinguish the
two.) The big advantage of automatic garbage collection is that the
user doesn't need to call \code{free()} explicitly. (Another claimed
advantage is an improvement in speed or memory usage --- this is no
hard fact however.) The disadvantage is that for C, there is no
truly portable automatic garbage collector, while reference counting
can be implemented portably (as long as the functions \code{malloc()}
and \code{free()} are available --- which the C Standard guarantees).
Maybe some day a sufficiently portable automatic garbage collector
will be available for C. Until then, we'll have to live with
reference counts.

\subsection{Reference Counting in Python}

There are two macros, \code{Py_INCREF(x)} and \code{Py_DECREF(x)},
which handle the incrementing and decrementing of the reference count.
\code{Py_DECREF()} also frees the object when the count reaches zero.
For flexibility, it doesn't call \code{free()} directly --- rather, it
makes a call through a function pointer in the object's \dfn{type
object}. For this purpose (and others), every object also contains a
pointer to its type object.
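
The indirection through the type object can be modelled in plain C as
follows. This is a simplified sketch and not the real definition of
the macros: every name beginning with \code{demo_} or \code{DEMO_} is
hypothetical, and the real type object carries many more fields.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_object;

/* Toy "type object": holds the function that frees instances. */
typedef struct {
    const char *name;
    void (*dealloc)(struct demo_object *);
} demo_type;

/* Every object carries a reference count and a pointer to its type. */
typedef struct demo_object {
    int refcnt;
    demo_type *type;
} demo_object;

static int demo_dealloc_calls = 0;   /* deallocations so far */

static void demo_plain_dealloc(demo_object *op)
{
    demo_dealloc_calls++;
    free(op);
}

static demo_type demo_plain_type = { "plain", demo_plain_dealloc };

#define DEMO_INCREF(op) ((op)->refcnt++)
/* Free through the type's function pointer, not by calling free(). */
#define DEMO_DECREF(op) \
    do { if (--(op)->refcnt == 0) (op)->type->dealloc(op); } while (0)

demo_object *demo_new(void)
{
    demo_object *op = malloc(sizeof(demo_object));
    if (op != NULL) {
        op->refcnt = 1;
        op->type = &demo_plain_type;
    }
    return op;
}

int demo_dealloc_count(void)
{
    return demo_dealloc_calls;
}
```

Keeping the deallocation function in the type lets each kind of object
release whatever else it holds before its own memory is returned.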

The big question now remains: when to use \code{Py_INCREF(x)} and
\code{Py_DECREF(x)}? Let's first introduce some terms. Nobody
``owns'' an object; however, you can \dfn{own a reference} to an
object. An object's reference count is now defined as the number of
owned references to it. The owner of a reference is responsible for
calling \code{Py_DECREF()} when the reference is no longer needed.
Ownership of a reference can be transferred. There are three ways to
dispose of an owned reference: pass it on, store it, or call
\code{Py_DECREF()}. Forgetting to dispose of an owned reference creates
a memory leak.

It is also possible to \dfn{borrow}\footnote{The metaphor of
``borrowing'' a reference is not completely correct: the owner still
has a copy of the reference.} a reference to an object. The borrower
of a reference should not call \code{Py_DECREF()}. The borrower must
not hold on to the object longer than the owner from which it was
borrowed. Using a borrowed reference after the owner has disposed of
it risks using freed memory and should be avoided
completely.\footnote{Checking that the reference count is at least 1
\strong{does not work} --- the reference count itself could be in
freed memory and may thus be reused for another object!}

The advantage of borrowing over owning a reference is that you don't
need to take care of disposing of the reference on all possible paths
through the code --- in other words, with a borrowed reference you
don't run the risk of leaking when a premature exit is taken. The
disadvantage of borrowing over owning is that there are some subtle
situations where in seemingly correct code a borrowed reference can be
used after the owner from which it was borrowed has in fact disposed
of it.

A borrowed reference can be changed into an owned reference by calling
\code{Py_INCREF()}. This does not affect the status of the owner from
which the reference was borrowed --- it creates a new owned reference,
and gives full owner responsibilities (i.e., the new owner must
dispose of the reference properly, as well as the previous owner).
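
The difference between a borrowed and an owned reference can be
illustrated with a toy model in plain C. All names here
(\code{ref_object}, \code{ref_holder}, \code{holder_peek()} and so on)
are hypothetical and chosen for this sketch; none of them belong to
Python's API.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model with hypothetical names; not Python's real API. */
typedef struct {
    int refcnt;
    int value;
} ref_object;

static int ref_freed = 0;       /* number of objects freed so far */

ref_object *ref_new(int value)
{
    ref_object *op = malloc(sizeof(ref_object));
    if (op != NULL) {
        op->refcnt = 1;         /* caller owns the new reference */
        op->value = value;
    }
    return op;
}

void ref_incref(ref_object *op) { op->refcnt++; }

void ref_decref(ref_object *op)
{
    assert(op->refcnt > 0);
    if (--op->refcnt == 0) {
        ref_freed++;
        free(op);
    }
}

/* A container that owns exactly one reference to its item. */
typedef struct {
    ref_object *item;
} ref_holder;

/* Returns a borrowed reference: the holder remains the owner. */
ref_object *holder_peek(ref_holder *h) { return h->item; }

/* The holder disposes of its owned reference. */
void holder_clear(ref_holder *h)
{
    ref_object *old = h->item;
    h->item = NULL;
    if (old != NULL)
        ref_decref(old);
}

/* Borrow the item, turn the borrowed reference into an owned one,
   and keep using it after the original owner has let go. */
int value_after_clear(ref_holder *h)
{
    ref_object *item = holder_peek(h);  /* borrowed */
    int value;

    ref_incref(item);           /* now we own a reference too */
    holder_clear(h);            /* previous owner disposes of its own */
    value = item->value;        /* safe: our reference keeps it alive */
    ref_decref(item);           /* dispose of our owned reference */
    return value;
}

int ref_freed_count(void) { return ref_freed; }
```

Without the \code{ref_incref()} call, \code{holder_clear()} would free
the object and the read of \code{item->value} would use freed memory.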

\subsection{Ownership Rules}

Whenever an object reference is passed into or out of a function, it
is part of the function's interface specification whether ownership is
transferred with the reference or not.

Most functions that return a reference to an object pass on ownership
with the reference. In particular, all functions whose purpose is
to create a new object, e.g.\ \code{PyInt_FromLong()} and
\code{Py_BuildValue()}, pass ownership to the receiver. Even if in
fact, in some cases, you don't receive a reference to a brand new
object, you still receive ownership of the reference. For instance,
\code{PyInt_FromLong()} maintains a cache of popular values and can
return a reference to a cached item.

Many functions that extract objects from other objects also transfer
ownership with the reference, for instance
\code{PyObject_GetAttrString()}. The picture is less clear here,
however, since a few common routines are exceptions:
\code{PyTuple_GetItem()}, \code{PyList_GetItem()} and
\code{PyDict_GetItem()} (and \code{PyDict_GetItemString()}) all return
references that you borrow from the tuple, list or dictionary.

The function \code{PyImport_AddModule()} also returns a borrowed
reference, even though it may actually create the object it returns:
this is possible because an owned reference to the object is stored in
\code{sys.modules}.

When you pass an object reference into another function, in general,
the function borrows the reference from you --- if it needs to store
it, it will use \code{Py_INCREF()} to become an independent owner.
There are exactly two important exceptions to this rule:
\code{PyTuple_SetItem()} and \code{PyList_SetItem()}. These functions
take over ownership of the item passed to them --- even if they fail!
(Note that \code{PyDict_SetItem()} and friends don't take over
ownership --- they are ``normal''.)

When a C function is called from Python, it borrows references to its
arguments from the caller. The caller owns a reference to the object,
so the borrowed reference's lifetime is guaranteed until the function
returns. Only when such a borrowed reference must be stored or passed
on must it be turned into an owned reference by calling
\code{Py_INCREF()}.

The object reference returned from a C function that is called from
Python must be an owned reference --- ownership is transferred from the
function to its caller.

1022\subsection{Thin Ice}
1023
1024There are a few situations where seemingly harmless use of a borrowed
1025reference can lead to problems. These all have to do with implicit
1026invocations of the interpreter, which can cause the owner of a
1027reference to dispose of it.
1028
1029The first and most important case to know about is using
1030\code{Py_DECREF()} on an unrelated object while borrowing a reference
1031to a list item. For instance:
Guido van Rossum7a2dba21993-11-05 14:45:11 +00001032
Guido van Rossume47da0a1997-07-17 16:34:52 +00001033\bcode\begin{verbatim}
Guido van Rossum5049bcb1995-03-13 16:55:23 +00001034bug(PyObject *list) {
1035 PyObject *item = PyList_GetItem(list, 0);
1036 PyList_SetItem(list, 1, PyInt_FromLong(0L));
1037 PyObject_Print(item, stdout, 0); /* BUG! */
1038}
Guido van Rossume47da0a1997-07-17 16:34:52 +00001039\end{verbatim}\ecode
%
This function first borrows a reference to \code{list[0]}, then
replaces \code{list[1]} with the value \code{0}, and finally prints
the borrowed reference. Looks harmless, right? But it's not!

Let's follow the control flow into \code{PyList_SetItem()}. The list
owns references to all its items, so when item 1 is replaced, it has
to dispose of the original item 1. Now let's suppose the original
item 1 was an instance of a user-defined class, and let's further
suppose that the class defined a \code{__del__()} method. If this
class instance has a reference count of 1, disposing of it will call
its \code{__del__()} method.

Since it is written in Python, the \code{__del__()} method can execute
arbitrary Python code. Could it perhaps do something to invalidate
the reference to \code{item} in \code{bug()}? You bet! Assuming that
the list passed into \code{bug()} is accessible to the
\code{__del__()} method, it could execute a statement to the effect of
\code{del list[0]}, and assuming this was the last reference to that
object, it would free the memory associated with it, thereby
invalidating \code{item}.

The solution, once you know the source of the problem, is easy:
temporarily increment the reference count. The correct version of the
function reads:

\bcode\begin{verbatim}
void no_bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);
    Py_INCREF(item);
    PyList_SetItem(list, 1, PyInt_FromLong(0L));
    PyObject_Print(item, stdout, 0);
    Py_DECREF(item);
}
\end{verbatim}\ecode
%
This is a true story. An older version of Python contained variants
of this bug and someone spent a considerable amount of time in a C
debugger to figure out why his \code{__del__()} methods would fail...

The second case of problems with a borrowed reference is a variant
involving threads. Normally, multiple threads in the Python
interpreter can't get in each other's way, because there is a global
lock protecting Python's entire object space. However, it is possible
to temporarily release this lock using the macro
\code{Py_BEGIN_ALLOW_THREADS}, and to re-acquire it using
\code{Py_END_ALLOW_THREADS}. This is common around blocking I/O
calls, to let other threads use the CPU while waiting for the I/O to
complete. Obviously, the following function has the same problem as
the previous one:

\bcode\begin{verbatim}
void bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);
    Py_BEGIN_ALLOW_THREADS
    ...some blocking I/O call...
    Py_END_ALLOW_THREADS
    PyObject_Print(item, stdout, 0); /* BUG! */
}
\end{verbatim}\ecode
%
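The remedy is the same as before: hold a reference of your own while
the lock is released. A sketch of a corrected version, mirroring
\code{no_bug()} above and leaving the blocking call unspecified, might
read:

\bcode\begin{verbatim}
void no_bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);
    Py_INCREF(item);    /* own a reference while the lock is released */
    Py_BEGIN_ALLOW_THREADS
    ...some blocking I/O call...
    Py_END_ALLOW_THREADS
    PyObject_Print(item, stdout, 0);
    Py_DECREF(item);    /* give the reference up again */
}
\end{verbatim}\ecode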
\subsection{NULL Pointers}

In general, functions that take object references as arguments don't
expect you to pass them \code{NULL} pointers, and will dump core (or
cause later core dumps) if you do so. Functions that return object
references generally return \code{NULL} only to indicate that an
exception occurred. The reason for not testing for \code{NULL}
arguments is that functions often pass the objects they receive on to
other functions --- if each function were to test for \code{NULL},
there would be a lot of redundant tests and the code would run slower.

It is better to test for \code{NULL} only at the ``source'', i.e.\
when a pointer that may be \code{NULL} is received, e.g.\ from
\code{malloc()} or from a function that may raise an exception.

The macros \code{Py_INCREF()} and \code{Py_DECREF()}
don't check for \code{NULL} pointers --- however, their variants
\code{Py_XINCREF()} and \code{Py_XDECREF()} do.

The macros for checking for a particular object type
(\code{Py\var{type}_Check()}) don't check for \code{NULL} pointers ---
again, there is much code that calls several of these in a row to test
an object against various different expected types, and this would
generate redundant tests. There are no variants with \code{NULL}
checking.

The C function calling mechanism guarantees that the argument list
passed to C functions (\code{args} in the examples) is never
\code{NULL} --- in fact it guarantees that it is always a tuple.%
\footnote{These guarantees don't hold when you use the ``old'' style
calling convention --- this is still found in much existing code.}

It is a severe error to ever let a \code{NULL} pointer ``escape'' to
the Python user.
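
The rule of testing for \code{NULL} only at the ``source'' can be
sketched in plain C, independent of the Python API. In the
illustration below the function names \code{fill()}, \code{sum()} and
\code{filled_sum()} are hypothetical; only the function that calls
\code{malloc()} tests the pointer, and the helpers it calls rely on
that test:

\bcode\begin{verbatim}
#include <stdlib.h>

/* Helpers assume buf is valid; they do not test for NULL again. */
static void fill(int *buf, size_t n, int value) {
    size_t i;
    for (i = 0; i < n; i++)
        buf[i] = value;
}

static long sum(const int *buf, size_t n) {
    long total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += buf[i];
    return total;
}

/* The source: malloc() may return NULL, so test here, once.
   Returns 0 and stores the sum in *result, or -1 on failure. */
int filled_sum(size_t n, int value, long *result) {
    int *buf = malloc(n * sizeof(int));
    if (buf == NULL)
        return -1;
    fill(buf, n, value);
    *result = sum(buf, n);
    free(buf);
    return 0;
}
\end{verbatim}\ecode
%
The caller in turn tests the return value of \code{filled_sum()},
since that is the point where it receives a result that may signal
failure.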


\section{Writing Extensions in \Cpp{}}

It is possible to write extension modules in \Cpp{}. Some restrictions
apply. If the main program (the Python interpreter) is compiled and
linked by the C compiler, global or static objects with constructors
cannot be used. This is not a problem if the main program is linked
by the \Cpp{} compiler. All functions that will be called directly or
indirectly (i.e.\ via function pointers) by the Python interpreter will
have to be declared using \code{extern "C"}; this applies to all
``methods'' as well as to the module's initialization function.
It is unnecessary to enclose the Python header files in
\code{extern "C" \{...\}} --- they use this form already if the symbol
\samp{__cplusplus} is defined (all recent \Cpp{} compilers define this
symbol).
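
A guard of this kind has roughly the following shape, and the same
pattern can be used for a module's own declarations so that they are
accepted by both a C and a \Cpp{} compiler. This is only a sketch;
\code{initspam} is the assumed name of the initialization function of
a hypothetical module \code{spam}:

\bcode\begin{verbatim}
#ifdef __cplusplus
extern "C" {
#endif

void initspam(void);    /* hypothetical module initialization function */

#ifdef __cplusplus
}
#endif
\end{verbatim}\ecode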

\chapter{Embedding Python in Another Application}

Embedding Python is similar to extending it, but not quite. The
difference is that when you extend Python, the main program of the
application is still the Python interpreter, while if you embed
Python, the main program may have nothing to do with Python ---
instead, some parts of the application occasionally call the Python
interpreter to run some Python code.

So if you are embedding Python, you are providing your own main
program. One of the things this main program has to do is initialize
the Python interpreter. At the very least, you have to call the
function \code{Py_Initialize()}. There are optional calls to pass command
line arguments to Python. Then later you can call the interpreter
from any part of the application.

There are several different ways to call the interpreter: you can pass
a string containing Python statements to \code{PyRun_SimpleString()},
or you can pass a stdio file pointer and a file name (for
identification in error messages only) to \code{PyRun_SimpleFile()}. You
can also call the lower-level operations described in the previous
chapters to construct and use Python objects.

A simple demo of embedding Python can be found in the directory
\file{Demo/embed}.
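
The steps described above can be sketched as a minimal main program.
This is only an illustration, not the code found in
\file{Demo/embed}; the statement passed to
\code{PyRun_SimpleString()} is arbitrary, and the program has to be
compiled with the Python header files on the include path and linked
with the Python library, the details of which depend on your
installation:

\bcode\begin{verbatim}
#include "Python.h"

int main(int argc, char **argv) {
    /* Initialize the interpreter before any other Python call. */
    Py_Initialize();

    /* Run a statement; a nonzero result means an exception occurred. */
    if (PyRun_SimpleString("print('Hello from embedded Python')\n") != 0)
        return 1;
    return 0;
}
\end{verbatim}\ecode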


\section{Embedding Python in \Cpp{}}

It is also possible to embed Python in a \Cpp{} program; precisely how this
is done will depend on the details of the \Cpp{} system used; in general you
will need to write the main program in \Cpp{}, and use the \Cpp{} compiler
to compile and link your program. There is no need to recompile Python
itself using \Cpp{}.


\chapter{Dynamic Loading}

On most modern systems it is possible to configure Python to support
dynamic loading of extension modules implemented in C. When shared
libraries are used, dynamic loading is configured automatically;
otherwise you have to select it as a build option (see below). Once
configured, dynamic loading is trivial to use: when a Python program
executes \code{import spam}, the search for modules tries to find a
file \file{spammodule.o} (\file{spammodule.so} when using shared
libraries) in the module search path, and if one is found, it is
loaded into the executing binary and executed. Once loaded, the
module acts just like a built-in extension module.

The advantages of dynamic loading are twofold: the ``core'' Python
binary gets smaller, and users can extend Python with their own
modules implemented in C without having to build and maintain their
own copy of the Python interpreter. There are also disadvantages:
dynamic loading isn't available on all systems (this just means that
on some systems you have to use static loading), and dynamically
loading a module that was compiled for a different version of Python
(e.g.\ with a different representation of objects) may dump core.


\section{Configuring and Building the Interpreter for Dynamic Loading}

There are three styles of dynamic loading: one using shared libraries,
one using SGI IRIX 4 dynamic loading, and one using GNU dynamic
loading.

\subsection{Shared Libraries}

The following systems support dynamic loading using shared libraries:
SunOS 4; Solaris 2; SGI IRIX 5 (but not SGI IRIX 4!); and probably all
systems derived from SVR4, or at least those SVR4 derivatives that
support shared libraries (are there any that don't?).

You don't need to do anything to configure dynamic loading on these
systems --- the \file{configure} script detects the presence of the
\file{<dlfcn.h>} header file and automatically configures dynamic
loading.

\subsection{SGI IRIX 4 Dynamic Loading}

Only SGI IRIX 4 supports dynamic loading of modules using SGI dynamic
loading. (SGI IRIX 5 might also support it but it is inferior to
using shared libraries so there is no reason to; a small test didn't
work right away so I gave up trying to support it.)

Before you build Python, you first need to fetch and build the \code{dl}
package written by Jack Jansen. This is available by anonymous ftp
from host \file{ftp.cwi.nl}, directory \file{pub/dynload}, file
\file{dl-1.6.tar.Z}. (The version number may change.) Follow the
instructions in the package's \file{README} file to build it.

Once you have built \code{dl}, you can configure Python to use it. To
this end, you run the \file{configure} script with the option
\code{--with-dl=\var{directory}} where \var{directory} is the absolute
pathname of the \code{dl} directory.

Now build and install Python as you normally would (see the
\file{README} file in the toplevel Python directory).

\subsection{GNU Dynamic Loading}

GNU dynamic loading supports (according to its \file{README} file) the
following hardware and software combinations: VAX (Ultrix), Sun 3
(SunOS 3.4 and 4.0), Sparc (SunOS 4.0), Sequent Symmetry (Dynix), and
Atari ST. There is no reason to use it on a Sparc; I haven't seen a
Sun 3 for years so I don't know if these have shared libraries or not.

You need to fetch and build two packages.
One is GNU DLD. All development of this code has been done with DLD
version 3.2.3, which is available by anonymous ftp from host
\file{ftp.cwi.nl}, directory \file{pub/dynload}, file
\file{dld-3.2.3.tar.Z}. (A more recent version of DLD is available
via \file{http://www-swiss.ai.mit.edu/~jaffer/DLD.html} but this has
not been tested.)
The other package needed is an
emulation of Jack Jansen's \code{dl} package that I wrote on top of
GNU DLD 3.2.3. This is available from the same host and directory,
file \file{dl-dld-1.1.tar.Z}. (The version number may change --- but I doubt
it will.) Follow the instructions in each package's \file{README}
file to configure and build them.

Now configure Python. Run the \file{configure} script with the option
\code{--with-dl-dld=\var{dl-directory},\var{dld-directory}} where
\var{dl-directory} is the absolute pathname of the directory where you
have built the \file{dl-dld} package, and \var{dld-directory} is that
of the GNU DLD package. The Python interpreter you build hereafter
will support GNU dynamic loading.


\section{Building a Dynamically Loadable Module}

Since there are three styles of dynamic loading, there are also three
groups of instructions for building a dynamically loadable module.
Instructions common for all three styles are given first. Assuming
your module is called \code{spam}, the source filename must be
\file{spammodule.c}, so the object name is \file{spammodule.o}. The
module must be written as a normal Python extension module (as
described earlier).

Note that in all cases you will have to create your own Makefile that
compiles your module file(s). This Makefile will have to pass two
\samp{-I} arguments to the C compiler which will make it find the
Python header files. If the Make variable \var{PYTHONTOP} points to
the toplevel Python directory, your \var{CFLAGS} Make variable should
contain the options \samp{-I\$(PYTHONTOP) -I\$(PYTHONTOP)/Include}.
(Most header files are in the \file{Include} subdirectory, but the
\file{config.h} header lives in the toplevel directory.)
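
Such a Makefile might look like the following minimal sketch. The
value given to \var{PYTHONTOP} is a hypothetical location; substitute
the directory where your Python source tree lives. The command line
in the rule must start with a tab character:

\bcode\begin{verbatim}
# Hypothetical location of the toplevel Python directory
PYTHONTOP= /usr/local/src/Python
CFLAGS= -I$(PYTHONTOP) -I$(PYTHONTOP)/Include

spammodule.o: spammodule.c
	$(CC) $(CFLAGS) -c spammodule.c
\end{verbatim}\ecode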


\subsection{Shared Libraries}

You must link the \samp{.o} file to produce a shared library. This is
done using a special invocation of the \UNIX{} loader/linker, {\em
ld}(1). Unfortunately the invocation differs slightly per system.

On SunOS 4, use
\bcode\begin{verbatim}
    ld spammodule.o -o spammodule.so
\end{verbatim}\ecode
%
On Solaris 2, use
\bcode\begin{verbatim}
    ld -G spammodule.o -o spammodule.so
\end{verbatim}\ecode
%
On SGI IRIX 5, use
\bcode\begin{verbatim}
    ld -shared spammodule.o -o spammodule.so
\end{verbatim}\ecode
%
On other systems, consult the manual page for \code{ld}(1) to find what
flags, if any, must be used.

If your extension module uses system libraries that haven't already
been linked with Python (e.g.\ a windowing system), these must be
passed to the \code{ld} command as \samp{-l} options after the
\samp{.o} file.

The resulting file \file{spammodule.so} must be copied into a directory
along the Python module search path.
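
For example, on Solaris 2 a module that uses the X11 library might be
linked as sketched below. The library name is an assumption chosen
for illustration, and the flags for other systems differ as described
above:

\bcode\begin{verbatim}
    ld -G spammodule.o -lX11 -o spammodule.so
\end{verbatim}\ecode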


\subsection{SGI IRIX 4 Dynamic Loading}

{\bf IMPORTANT:} You must compile your extension module with the
additional C flag \samp{-G0} (or \samp{-G 0}). This instructs the
assembler to generate position-independent code.

You don't need to link the resulting \file{spammodule.o} file; just
copy it into a directory along the Python module search path.

The first time your extension is loaded, it takes some extra time and
a few messages may be printed. This creates a file
\file{spammodule.ld} which is an image that can be loaded quickly into
the Python interpreter process. When a new Python interpreter is
installed, the \code{dl} package detects this and rebuilds
\file{spammodule.ld}. The file \file{spammodule.ld} is placed in the
directory where \file{spammodule.o} was found, unless this directory is
unwritable; in that case it is placed in a temporary
directory.\footnote{Check the manual page of the \code{dl} package for
details.}

If your extension module uses additional system libraries, you must
create a file \file{spammodule.libs} in the same directory as the
\file{spammodule.o}. This file should contain one or more lines with
whitespace-separated options that will be passed to the linker ---
normally only \samp{-l} options or absolute pathnames of libraries
(\samp{.a} files) should be used.
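
As an illustration, a \file{spammodule.libs} file for a module that
needs the X11 and math libraries might contain the single line shown
below. The library names are assumptions chosen for the example, not
requirements of the \code{dl} package:

\bcode\begin{verbatim}
-lX11 -lm
\end{verbatim}\ecode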


\subsection{GNU Dynamic Loading}

Just copy \file{spammodule.o} into a directory along the Python module
search path.

If your extension module uses additional system libraries, you must
create a file \file{spammodule.libs} in the same directory as the
\file{spammodule.o}. This file should contain one or more lines with
whitespace-separated absolute pathnames of libraries (\samp{.a}
files). No \samp{-l} options can be used.


%\input{extref}

\input{ext.ind}

\end{document}