\documentstyle[twoside,11pt,myformat]{report}

% XXX PM Modulator

\title{Extending and Embedding the Python Interpreter}

\input{boilerplate}

% Tell \index to actually write the .idx file
\makeindex

\begin{document}

\pagenumbering{roman}

\maketitle

\input{copyright}

\begin{abstract}

\noindent
Python is an interpreted, object-oriented programming language.  This
document describes how to write modules in C or \Cpp{} that extend the
Python interpreter.  Such modules can define not only new functions
but also new object types and their methods.  The document
also describes how to embed the Python interpreter in another
application, for use as an extension language.  Finally, it shows how
to compile and link extension modules so that they can be loaded
dynamically (at run time) into the interpreter, if the underlying
operating system supports this feature.

This document assumes basic knowledge about Python.  For an informal
introduction to the language, see the Python Tutorial.  The Python
Reference Manual gives a more formal definition of the language.  The
Python Library Reference documents the existing object types,
functions and modules (both built-in and written in Python) that give
the language its wide application range.

\end{abstract}

\pagebreak

{
\parskip = 0mm
\tableofcontents
}

\pagebreak

\pagenumbering{arabic}


\chapter{Extending Python with C or \Cpp{} code}


\section{Introduction}

It is quite easy to add new built-in modules to Python, if you know
how to program in C.  Such \dfn{extension modules} can do two things
that can't be done directly in Python: they can implement new built-in
object types, and they can call C library functions and system calls.

To support extensions, the Python API (Application Programmer's
Interface) defines a set of functions, macros and variables that
provide access to most aspects of the Python run-time system.  The
Python API is incorporated in a C source file by including the header
\code{"Python.h"}.

The compilation of an extension module depends on its intended use as
well as on your system setup; details are given in a later section.


\section{A Simple Example}

Let's create an extension module called \samp{spam} (the favorite food
of Monty Python fans...) and let's say we want to create a Python
interface to the C library function \code{system()}.\footnote{An
interface for this function already exists in the standard module
\code{os} --- it was chosen as a simple and straightforward example.}
This function takes a null-terminated character string as argument and
returns an integer.  We want this function to be callable from Python
as follows:

\bcode\begin{verbatim}
    >>> import spam
    >>> status = spam.system("ls -l")
\end{verbatim}\ecode
%
Begin by creating a file \file{spammodule.c}.  (In general, if a
module is called \samp{spam}, the C file containing its implementation
is called \file{spammodule.c}; if the module name is very long, like
\samp{spammify}, the file name can be just \file{spammify.c}.)

The first line of our file can be:

\bcode\begin{verbatim}
    #include "Python.h"
\end{verbatim}\ecode
%
which pulls in the Python API (you can add a comment describing the
purpose of the module and a copyright notice if you like).

All user-visible symbols defined by \code{"Python.h"} have a prefix of
\samp{Py} or \samp{PY}, except those defined in standard header files.
For convenience, and since they are used extensively by the Python
interpreter, \code{"Python.h"} includes a few standard header files:
\code{<stdio.h>}, \code{<string.h>}, \code{<errno.h>}, and
\code{<stdlib.h>}.  If the latter header file does not exist on your
system, it declares the functions \code{malloc()}, \code{free()} and
\code{realloc()} directly.

The next thing we add to our module file is the C function that will
be called when the Python expression \samp{spam.system(\var{string})}
is evaluated (we'll see shortly how it ends up being called):

\bcode\begin{verbatim}
    static PyObject *
    spam_system(self, args)
        PyObject *self;
        PyObject *args;
    {
        char *command;
        int sts;
        if (!PyArg_ParseTuple(args, "s", &command))
            return NULL;
        sts = system(command);
        return Py_BuildValue("i", sts);
    }
\end{verbatim}\ecode
%
There is a straightforward translation from the argument list in
Python (e.g.\ the single expression \code{"ls -l"}) to the arguments
passed to the C function.  The C function always has two arguments,
conventionally named \var{self} and \var{args}.

The \var{self} argument is only used when the C function implements a
built-in method.  This will be discussed later.  In the example,
\var{self} will always be a \code{NULL} pointer, since we are defining
a function, not a method.  (This is done so that the interpreter
doesn't have to understand two different types of C functions.)

The \var{args} argument will be a pointer to a Python tuple object
containing the arguments.  Each item of the tuple corresponds to an
argument in the call's argument list.  The arguments are Python
objects; in order to do anything with them in our C function we have
to convert them to C values.  The function \code{PyArg_ParseTuple()}
in the Python API checks the argument types and converts them to C
values.  It uses a template string to determine the required types of
the arguments as well as the types of the C variables into which to
store the converted values.  More about this later.

\code{PyArg_ParseTuple()} returns true (nonzero) if all arguments have
the right type and their values have been stored in the variables
whose addresses are passed.  It returns false (zero) if an invalid
argument list was passed.  In the latter case it also raises an
appropriate exception, so the calling function can return
\code{NULL} immediately (as we saw in the example).

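To make the role of the template string more concrete, here is a small
stand-alone model of the idea.  It is only a sketch in plain C that does
not use \code{"Python.h"}: the names \code{ToyValue} and
\code{toy_parse()} are invented for this illustration, and the model
understands just the two codes \samp{i} and \samp{s}.  It shows how one
format string can drive both the type check and the choice of C variable
that receives each converted value.

```c
#include <stdarg.h>
#include <stddef.h>

/* A toy tagged value standing in for a Python object (hypothetical). */
typedef enum { TOY_INT, TOY_STR } ToyKind;

typedef struct {
    ToyKind kind;
    long ival;          /* used when kind == TOY_INT */
    const char *sval;   /* used when kind == TOY_STR */
} ToyValue;

/* Convert argv[0..argc-1] according to format: 'i' stores into an int *,
   's' stores into a const char **.  Returns 1 on success and 0 on a
   count or type mismatch.  (This model does not set any exception.) */
int
toy_parse(const ToyValue *argv, int argc, const char *format, ...)
{
    va_list ap;
    int i;
    int ok = 1;

    va_start(ap, format);
    for (i = 0; ok && format[i] != '\0'; i++) {
        if (i >= argc)
            ok = 0;                       /* too few arguments */
        else if (format[i] == 'i' && argv[i].kind == TOY_INT)
            *va_arg(ap, int *) = (int) argv[i].ival;
        else if (format[i] == 's' && argv[i].kind == TOY_STR)
            *va_arg(ap, const char **) = argv[i].sval;  /* pointer only */
        else
            ok = 0;                       /* wrong type or unknown code */
    }
    if (ok && i != argc)
        ok = 0;                           /* too many arguments */
    va_end(ap);
    return ok;
}
```

A caller would write \code{if (!toy_parse(args, 1, "s", \&command))} and
bail out on failure, which has the same shape as the test in
\code{spam_system()}.  Note that the \samp{s} case stores a pointer into
the argument rather than a copy of the characters.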

\section{Intermezzo: Errors and Exceptions}

An important convention throughout the Python interpreter is the
following: when a function fails, it should set an exception condition
and return an error value (usually a \code{NULL} pointer).  Exceptions
are stored in a static global variable inside the interpreter; if this
variable is \code{NULL} no exception has occurred.  A second global
variable stores the ``associated value'' of the exception (the second
argument to \code{raise}).  A third variable contains the stack
traceback in case the error originated in Python code.  These three
variables are the C equivalents of the Python variables
\code{sys.exc_type}, \code{sys.exc_value} and \code{sys.exc_traceback}
(see the section on module \code{sys} in the Library Reference
Manual).  It is important to know about them to understand how errors
are passed around.

The Python API defines a number of functions to set various types of
exceptions.

The most common one is \code{PyErr_SetString()}.  Its arguments are an
exception object and a C string.  The exception object is usually a
predefined object like \code{PyExc_ZeroDivisionError}.  The C string
indicates the cause of the error and is converted to a Python string
object and stored as the ``associated value'' of the exception.

Another useful function is \code{PyErr_SetFromErrno()}, which only
takes an exception argument and constructs the associated value by
inspection of the (\UNIX{}) global variable \code{errno}.  The most
general function is \code{PyErr_SetObject()}, which takes two object
arguments, the exception and its associated value.  You don't need to
\code{Py_INCREF()} the objects passed to any of these functions.

You can test non-destructively whether an exception has been set with
\code{PyErr_Occurred()}.  This returns the current exception object,
or \code{NULL} if no exception has occurred.  You normally don't need
to call \code{PyErr_Occurred()} to see whether an error occurred in a
function call, since you should be able to tell from the return value.

When a function \var{f} that calls another function \var{g} detects
that the latter fails, \var{f} should itself return an error value
(e.g.\ \code{NULL} or \code{-1}).  It should \emph{not} call one of the
\code{PyErr_*()} functions --- one has already been called by \var{g}.
\var{f}'s caller is then supposed to also return an error indication
to \emph{its} caller, again \emph{without} calling \code{PyErr_*()},
and so on --- the most detailed cause of the error was already
reported by the function that first detected it.  Once the error
reaches the Python interpreter's main loop, the loop aborts the currently
executing Python code and tries to find an exception handler specified
by the Python programmer.

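The propagation rule can be sketched with a stand-alone model in plain
C.  Everything in it is an assumption made for illustration: the two
static variables merely imitate the interpreter's error indicator, and
\code{set_error()}, \code{error_occurred()} and \code{error_value()} are
invented names, not functions of the Python API.  What matters is the
shape of \code{f()}: it notices that \code{g()} failed and returns
\code{-1} without touching the error indicator again.

```c
#include <stddef.h>

/* Toy error indicator (hypothetical; not the interpreter's real one). */
static const char *err_type = NULL;    /* NULL means no error is pending */
static const char *err_value = NULL;   /* the "associated value" */

void
set_error(const char *type, const char *value)
{
    err_type = type;
    err_value = value;
}

/* Non-destructive test: returns the pending error type, or NULL. */
const char *
error_occurred(void)
{
    return err_type;
}

const char *
error_value(void)
{
    return err_value;
}

/* g detects the failure first, so g sets the error and returns -1. */
int
g(int divisor, int *result)
{
    if (divisor == 0) {
        set_error("ZeroDivisionError", "division by zero");
        return -1;
    }
    *result = 100 / divisor;
    return 0;
}

/* f only propagates: the error was already set by g. */
int
f(int divisor, int *result)
{
    if (g(divisor, result) < 0)
        return -1;          /* do not call set_error() again here */
    *result += 1;
    return 0;
}
```

After a failing call such as \code{f(0, \&r)}, the indicator still holds
the type and message chosen by \code{g()}, which is the most detailed
description available.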
(There are situations where a module can actually give a more detailed
error message by calling another \code{PyErr_*()} function, and in
such cases it is fine to do so.  As a general rule, however, this is
not necessary, and can cause information about the cause of the error
to be lost: most operations can fail for a variety of reasons.)

To ignore an exception set by a function call that failed, the exception
condition must be cleared explicitly by calling \code{PyErr_Clear()}.
The only time C code should call \code{PyErr_Clear()} is if it doesn't
want to pass the error on to the interpreter but wants to handle it
completely by itself (e.g.\ by trying something else or pretending
nothing happened).

Note that a failing \code{malloc()} call must be turned into an
exception --- the direct caller of \code{malloc()} (or
\code{realloc()}) must call \code{PyErr_NoMemory()} and return a
failure indicator itself.  All the object-creating functions
(\code{PyInt_FromLong()} etc.) already do this, so this note matters
only if you call \code{malloc()} directly.

Also note that, with the important exception of
\code{PyArg_ParseTuple()} and friends, functions that return an
integer status usually return a positive value or zero for success and
\code{-1} for failure, like \UNIX{} system calls.

Finally, be careful to clean up garbage (by making \code{Py_XDECREF()}
or \code{Py_DECREF()} calls for objects you have already created) when
you return an error indicator!

The choice of which exception to raise is entirely yours.  There are
predeclared C objects corresponding to all built-in Python exceptions,
e.g.\ \code{PyExc_ZeroDivisionError}, which you can use directly.  Of
course, you should choose exceptions wisely --- don't use
\code{PyExc_TypeError} to mean that a file couldn't be opened (that
should probably be \code{PyExc_IOError}).  If something's wrong with
the argument list, the \code{PyArg_ParseTuple()} function usually
raises \code{PyExc_TypeError}.  If you have an argument whose value
must be in a particular range or must satisfy other conditions,
\code{PyExc_ValueError} is appropriate.

You can also define a new exception that is unique to your module.
For this, you usually declare a static object variable at the
beginning of your file, e.g.

\bcode\begin{verbatim}
    static PyObject *SpamError;
\end{verbatim}\ecode
%
and initialize it in your module's initialization function
(\code{initspam()}) with a string object, e.g.\ (leaving out the error
checking for now):

\bcode\begin{verbatim}
    void
    initspam()
    {
        PyObject *m, *d;
        m = Py_InitModule("spam", SpamMethods);
        d = PyModule_GetDict(m);
        SpamError = PyString_FromString("spam.error");
        PyDict_SetItemString(d, "error", SpamError);
    }
\end{verbatim}\ecode
%
Note that the Python name for the exception object is
\code{spam.error}.  It is conventional for module and exception names
to be spelled in lower case.  It is also conventional that the
\emph{value} of the exception object is the same as its name, e.g.\
the string \code{"spam.error"}.


\section{Back to the Example}

Going back to our example function, you should now be able to
understand this statement:

\bcode\begin{verbatim}
    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
\end{verbatim}\ecode
%
It returns \code{NULL} (the error indicator for functions returning
object pointers) if an error is detected in the argument list, relying
on the exception set by \code{PyArg_ParseTuple()}.  Otherwise the
string value of the argument has been copied to the local variable
\code{command}.  This is a pointer assignment and you are not supposed
to modify the string to which it points (so in Standard C, the variable
\code{command} should properly be declared as \samp{const char
*command}).

The next statement is a call to the \UNIX{} function \code{system()},
passing it the string we just got from \code{PyArg_ParseTuple()}:

\bcode\begin{verbatim}
    sts = system(command);
\end{verbatim}\ecode
%
Our \code{spam.system()} function must return the value of \code{sts}
as a Python object.  This is done using the function
\code{Py_BuildValue()}, which is something like the inverse of
\code{PyArg_ParseTuple()}: it takes a format string and an arbitrary
number of C values, and returns a new Python object.  More info on
\code{Py_BuildValue()} is given later.

\bcode\begin{verbatim}
    return Py_BuildValue("i", sts);
\end{verbatim}\ecode
%
In this case, it will return an integer object.  (Yes, even integers
are objects on the heap in Python!)

If you have a C function that returns no useful value (a function
returning \code{void}), the corresponding Python function must return
\code{None}.  You need this idiom to do so:

\bcode\begin{verbatim}
    Py_INCREF(Py_None);
    return Py_None;
\end{verbatim}\ecode
%
\code{Py_None} is the C name for the special Python object
\code{None}.  It is a genuine Python object (not a \code{NULL}
pointer, which means ``error'' in most contexts, as we have seen).


\section{The Module's Method Table and Initialization Function}

I promised to show how \code{spam_system()} is called from Python
programs.  First, we need to list its name and address in a ``method
table'':

\bcode\begin{verbatim}
    static PyMethodDef SpamMethods[] = {
        ...
        {"system",  spam_system, 1},
        ...
        {NULL,      NULL}        /* Sentinel */
    };
\end{verbatim}\ecode
%
Note the third entry (\samp{1}).  This is a flag telling the
interpreter the calling convention to be used for the C function.  It
should normally be \samp{1}; a value of \samp{0} means that an
obsolete variant of \code{PyArg_ParseTuple()} is used.

The method table must be passed to the interpreter in the module's
initialization function (which should be the only non-\code{static}
item defined in the module file):

\bcode\begin{verbatim}
    void
    initspam()
    {
        (void) Py_InitModule("spam", SpamMethods);
    }
\end{verbatim}\ecode
%
When the Python program imports module \code{spam} for the first time,
\code{initspam()} is called.  It calls \code{Py_InitModule()}, which
creates a ``module object'' (which is inserted in the dictionary
\code{sys.modules} under the key \code{"spam"}), and inserts built-in
function objects into the newly created module based upon the table
(an array of \code{PyMethodDef} structures) that was passed as its
second argument.  \code{Py_InitModule()} returns a pointer to the
module object that it creates (which is unused here).  It aborts with
a fatal error if the module could not be initialized satisfactorily,
so the caller doesn't need to check for errors.

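The reason for the \code{NULL} sentinel becomes clear when you picture
how such a table is consumed: no length is passed along, so the reader
walks the array until it meets the entry whose name is \code{NULL}.  The
following stand-alone sketch models this in plain C.  The type
\code{ToyMethodDef} and the functions \code{find_method()} and
\code{find_toy_method()} are invented for the illustration; this is not
how \code{Py_InitModule()} is actually written, only the general
technique of a sentinel-terminated table.

```c
#include <stddef.h>
#include <string.h>

typedef int (*ToyFunction)(int);

/* Simplified stand-in for a method table entry (hypothetical). */
typedef struct {
    const char *name;       /* name under which the function is visible */
    ToyFunction function;   /* address of the C implementation */
} ToyMethodDef;

static int toy_double(int x) { return 2 * x; }
static int toy_negate(int x) { return -x; }

static ToyMethodDef toy_methods[] = {
    {"double", toy_double},
    {"negate", toy_negate},
    {NULL,     NULL}        /* Sentinel */
};

/* Walk the table until the sentinel; return the match or NULL. */
ToyFunction
find_method(const ToyMethodDef *table, const char *name)
{
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, name) == 0)
            return table->function;
    }
    return NULL;
}

ToyFunction
find_toy_method(const char *name)
{
    return find_method(toy_methods, name);
}
```

Forgetting the sentinel in such a table means the loop runs past the end
of the array, which is why the last line of \code{SpamMethods} must not
be left out.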

\section{Compilation and Linkage}

There are two more things to do before you can use your new extension:
compiling and linking it with the Python system.  If you use dynamic
loading, the details depend on the style of dynamic loading your
system uses; see the chapter on Dynamic Loading for more info about
this.

If you can't use dynamic loading, or if you want to make your module a
permanent part of the Python interpreter, you will have to change the
configuration setup and rebuild the interpreter.  Luckily, this is
very simple: just place your file (\file{spammodule.c} for example) in
the \file{Modules} directory, add a line to the file
\file{Modules/Setup} describing your file:

\bcode\begin{verbatim}
    spam spammodule.o
\end{verbatim}\ecode
%
and rebuild the interpreter by running \code{make} in the toplevel
directory.  You can also run \code{make} in the \file{Modules}
subdirectory, but then you must first rebuild the \file{Makefile}
there by running \code{make Makefile}.  (This is necessary each time
you change the \file{Setup} file.)

If your module requires additional libraries to link with, these can
be listed on the line in the \file{Setup} file as well, for instance:

\bcode\begin{verbatim}
    spam spammodule.o -lX11
\end{verbatim}\ecode
%
\section{Calling Python Functions From C}

So far we have concentrated on making C functions callable from
Python.  The reverse is also useful: calling Python functions from C.
This is especially the case for libraries that support so-called
``callback'' functions.  If a C interface makes use of callbacks, the
equivalent Python interface often needs to provide a callback mechanism
to the Python programmer; the implementation will require calling the
Python callback functions from a C callback.  Other uses are also
imaginable.

Fortunately, the Python interpreter is easily called recursively, and
there is a standard interface to call a Python function.  (I won't
dwell on how to call the Python parser with a particular string as
input --- if you're interested, have a look at the implementation of
the \samp{-c} command line option in \file{Python/pythonmain.c}.)

Calling a Python function is easy.  First, the Python program must
somehow pass you the Python function object.  You should provide a
function (or some other interface) to do this.  When this function is
called, save a pointer to the Python function object (be careful to
\code{Py_INCREF()} it!) in a global variable --- or wherever you see fit.
For example, the following function might be part of a module
definition:

\bcode\begin{verbatim}
    static PyObject *my_callback = NULL;

    static PyObject *
    my_set_callback(dummy, arg)
        PyObject *dummy, *arg;
    {
        Py_XDECREF(my_callback); /* Dispose of previous callback */
        Py_XINCREF(arg); /* Add a reference to new callback */
        my_callback = arg; /* Remember new callback */
        /* Boilerplate to return "None" */
        Py_INCREF(Py_None);
        return Py_None;
    }
\end{verbatim}\ecode
%
The macros \code{Py_XINCREF()} and \code{Py_XDECREF()} increment/decrement
the reference count of an object and are safe in the presence of
\code{NULL} pointers.  More info on them is given in the section on
Reference Counts below.

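The bookkeeping done when a saved callback is replaced can be followed
in a stand-alone model.  This is only a sketch under stated assumptions:
\code{ToyObject} is an invented structure holding nothing but a counter,
the \code{TOY_XINCREF()} and \code{TOY_XDECREF()} macros merely adjust
that counter (the real macros also free an object whose count drops to
zero), and \code{toy_set_callback()} is a hypothetical counterpart of
the function above.

```c
#include <stddef.h>

/* Toy object carrying only a reference count (hypothetical). */
typedef struct {
    int refcount;
} ToyObject;

/* NULL-safe counter updates; unlike the real macros, nothing is freed. */
#define TOY_XINCREF(op) do { if ((op) != NULL) (op)->refcount++; } while (0)
#define TOY_XDECREF(op) do { if ((op) != NULL) (op)->refcount--; } while (0)

static ToyObject *toy_callback = NULL;

void
toy_set_callback(ToyObject *arg)
{
    ToyObject *old = toy_callback;

    TOY_XINCREF(arg);       /* Take a reference to the new callback */
    toy_callback = arg;     /* Remember the new callback */
    TOY_XDECREF(old);       /* Release the previous callback, if any */
}

ToyObject *
toy_get_callback(void)
{
    return toy_callback;
}
```

In this variant the new reference is taken before the old one is
released.  That ordering is a design choice of the sketch: when the same
object is registered twice in a row, its count never dips on the way.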
Later, when it is time to call the function, you call the C function
\code{PyEval_CallObject()}.  This function has two arguments, both
pointers to arbitrary Python objects: the Python function, and the
argument list.  The argument list must always be a tuple object, whose
length is the number of arguments.  To call the Python function with
no arguments, pass an empty tuple; to call it with one argument, pass
a singleton tuple.  \code{Py_BuildValue()} returns a tuple when its
format string consists of zero or more format codes between
parentheses.  For example:

\bcode\begin{verbatim}
    int arg;
    PyObject *arglist;
    PyObject *result;
    ...
    arg = 123;
    ...
    /* Time to call the callback */
    arglist = Py_BuildValue("(i)", arg);
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
\end{verbatim}\ecode
%
\code{PyEval_CallObject()} returns a Python object pointer: this is
the return value of the Python function.  \code{PyEval_CallObject()} is
``reference-count-neutral'' with respect to its arguments.  In the
example a new tuple was created to serve as the argument list, which
is \code{Py_DECREF()}-ed immediately after the call.

The return value of \code{PyEval_CallObject()} is ``new'': either it
is a brand new object, or it is an existing object whose reference
count has been incremented.  So, unless you want to save it in a
global variable, you should somehow \code{Py_DECREF()} the result,
even (especially!) if you are not interested in its value.

Before you do this, however, it is important to check that the return
value isn't \code{NULL}.  If it is, the Python function terminated by raising
an exception.  If the C code that called \code{PyEval_CallObject()} is
called from Python, it should now return an error indication to its
Python caller, so the interpreter can print a stack trace, or the
calling Python code can handle the exception.  If this is not possible
or desirable, the exception should be cleared by calling
\code{PyErr_Clear()}.  For example:

\bcode\begin{verbatim}
    if (result == NULL)
        return NULL; /* Pass error back */
    ...use result...
    Py_DECREF(result);
\end{verbatim}\ecode
%
Depending on the desired interface to the Python callback function,
you may also have to provide an argument list to \code{PyEval_CallObject()}.
In some cases the argument list is also provided by the Python
program, through the same interface that specified the callback
function. It can then be saved and used in the same manner as the
function object. In other cases, you may have to construct a new
tuple to pass as the argument list. The simplest way to do this is to
call \code{Py_BuildValue()}. For example, if you want to pass an integral
event code, you might use the following code:

\bcode\begin{verbatim}
    PyObject *arglist;
    ...
    arglist = Py_BuildValue("(l)", eventcode);
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
    if (result == NULL)
        return NULL; /* Pass error back */
    /* Here maybe use the result */
    Py_DECREF(result);
\end{verbatim}\ecode
%
Note the placement of \code{Py_DECREF(arglist)} immediately after the call,
before the error check! Also note that strictly speaking this code is
not complete: \code{Py_BuildValue()} may run out of memory and return
\code{NULL}, and this should be checked.


\section{Format Strings for {\tt PyArg_ParseTuple()}}

The \code{PyArg_ParseTuple()} function is declared as follows:

\bcode\begin{verbatim}
    int PyArg_ParseTuple(PyObject *arg, char *format, ...);
\end{verbatim}\ecode
%
The \var{arg} argument must be a tuple object containing an argument
list passed from Python to a C function. The \var{format} argument
must be a format string, whose syntax is explained below. The
remaining arguments must be addresses of variables whose type is
determined by the format string. For the conversion to succeed, the
\var{arg} object must match the format and the format must be
exhausted.

Note that while \code{PyArg_ParseTuple()} checks that the Python
arguments have the required types, it cannot check the validity of the
addresses of C variables passed to the call: if you make mistakes
there, your code will probably crash or at least overwrite random bits
in memory. So be careful!

A format string consists of zero or more ``format units''. A format
unit describes one Python object; it is usually a single character or
a parenthesized sequence of format units. With a few exceptions, a
format unit that is not a parenthesized sequence normally corresponds
to a single address argument to \code{PyArg_ParseTuple()}. In the
following description, the quoted form is the format unit; the entry
in (round) parentheses is the Python object type that matches the
format unit; and the entry in [square] brackets is the type of the C
variable(s) whose address should be passed. (Use the \samp{\&}
operator to pass a variable's address.)

\begin{description}

\item[\samp{s} (string) {[char *]}]
Convert a Python string to a C pointer to a character string. You
must not provide storage for the string itself; a pointer to an
existing string is stored into the character pointer variable whose
address you pass. The C string is null-terminated. The Python string
must not contain embedded null bytes; if it does, a \code{TypeError}
exception is raised.

\item[\samp{s\#} (string) {[char *, int]}]
This variant on \code{'s'} stores into two C variables, the first one
a pointer to a character string, the second one its length. In this
case the Python string may contain embedded null bytes.

\item[\samp{z} (string or \code{None}) {[char *]}]
Like \samp{s}, but the Python object may also be \code{None}, in which
case the C pointer is set to \code{NULL}.

\item[\samp{z\#} (string or \code{None}) {[char *, int]}]
This is to \code{'s\#'} as \code{'z'} is to \code{'s'}.

\item[\samp{b} (integer) {[char]}]
Convert a Python integer to a tiny int, stored in a C \code{char}.

\item[\samp{h} (integer) {[short int]}]
Convert a Python integer to a C \code{short int}.

\item[\samp{i} (integer) {[int]}]
Convert a Python integer to a plain C \code{int}.

\item[\samp{l} (integer) {[long int]}]
Convert a Python integer to a C \code{long int}.

\item[\samp{c} (string of length 1) {[char]}]
Convert a Python character, represented as a string of length 1, to a
C \code{char}.

\item[\samp{f} (float) {[float]}]
Convert a Python floating point number to a C \code{float}.

\item[\samp{d} (float) {[double]}]
Convert a Python floating point number to a C \code{double}.

\item[\samp{O} (object) {[PyObject *]}]
Store a Python object (without any conversion) in a C object pointer.
The C program thus receives the actual object that was passed. The
object's reference count is not increased. The pointer stored is not
\code{NULL}.

\item[\samp{O!} (object) {[\var{typeobject}, PyObject *]}]
Store a Python object in a C object pointer. This is similar to
\samp{O}, but takes two C arguments: the first is the address of a
Python type object, the second is the address of the C variable (of
type \code{PyObject *}) into which the object pointer is stored.
If the Python object does not have the required type, a
\code{TypeError} exception is raised.

\item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
Convert a Python object to a C variable through a \var{converter}
function. This takes two arguments: the first is a function, the
second is the address of a C variable (of arbitrary type), converted
to \code{void *}. The \var{converter} function in turn is called as
follows:

\code{\var{status} = \var{converter}(\var{object}, \var{address});}

where \var{object} is the Python object to be converted and
\var{address} is the \code{void *} argument that was passed to
\code{PyArg_ParseTuple()}. The returned \var{status} should be
\code{1} for a successful conversion and \code{0} if the conversion
has failed. When the conversion fails, the \var{converter} function
should raise an exception.

\item[\samp{S} (string) {[PyStringObject *]}]
Like \samp{O} but requires that the Python object is a string object.
A \code{TypeError} exception is raised if it is not. The C variable
may also be declared as \code{PyObject *}.

\item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
The object must be a Python tuple whose length is the number of format
units in \var{items}. The C arguments must correspond to the
individual format units in \var{items}. Format units for tuples may
be nested.

\end{description}

It is possible to pass Python long integers where integers are
requested; however, no proper range checking is done: the most
significant bits are silently truncated when the receiving field is
too small to receive the value (actually, the semantics are inherited
from downcasts in C --- your mileage may vary).

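To make the ``silently truncated'' remark concrete, the following
stand-alone C sketch (it does not use the Python API) stores a value
in a field that is too small for it. The names \code{low_byte()},
\code{low_16_bits()} and \code{truncation_demo()} are made up for this
illustration. Unsigned types are used on purpose: for an unsigned
target C defines the result as the value modulo $2^n$, whereas
narrowing to a \emph{signed} type is implementation-defined, which is
why results may differ between platforms.

\bcode\begin{verbatim}
#include <assert.h>

/* Keep only the bits that fit in an unsigned char. */
unsigned char low_byte(unsigned long value)
{
    return (unsigned char) value;
}

/* Keep only the bits that fit in an unsigned short. */
unsigned short low_16_bits(unsigned long value)
{
    return (unsigned short) value;
}

/* Assumes a machine with 8-bit char and 16-bit short. */
void truncation_demo(void)
{
    assert(low_byte(0x1234UL) == 0x34);          /* 0x12 is lost */
    assert(low_16_bits(0x12345678UL) == 0x5678); /* 0x1234 is lost */
}
\end{verbatim}\ecode
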
A few other characters have a meaning in a format string. These may
not occur inside nested parentheses. They are:

\begin{description}

\item[\samp{|}]
Indicates that the remaining arguments in the Python argument list are
optional. The C variables corresponding to optional arguments should
be initialized to their default value --- when an optional argument is
not specified, \code{PyArg_ParseTuple()} does not touch the contents
of the corresponding C variable(s).

\item[\samp{:}]
The list of format units ends here; the string after the colon is used
as the function name in error messages (the ``associated value'' of
the exceptions that \code{PyArg_ParseTuple()} raises).

\item[\samp{;}]
The list of format units ends here; the string after the semicolon is
used as the error message \emph{instead} of the default error message.
Clearly, \samp{:} and \samp{;} mutually exclude each other.

\end{description}

Some example calls:

\bcode\begin{verbatim}
    int ok;
    int i, j;
    long k, l;
    char *s;
    int size;

    ok = PyArg_ParseTuple(args, ""); /* No arguments */
        /* Python call: f() */

    ok = PyArg_ParseTuple(args, "s", &s); /* A string */
        /* Possible Python call: f('whoops!') */

    ok = PyArg_ParseTuple(args, "lls", &k, &l, &s); /* Two longs and a string */
        /* Possible Python call: f(1, 2, 'three') */

    ok = PyArg_ParseTuple(args, "(ii)s#", &i, &j, &s, &size);
        /* A pair of ints and a string, whose size is also returned */
        /* Possible Python call: f((1, 2), 'three') */

    {
        char *file;
        char *mode = "r";
        int bufsize = 0;
        ok = PyArg_ParseTuple(args, "s|si", &file, &mode, &bufsize);
        /* A string, and optionally another string and an integer */
        /* Possible Python calls:
           f('spam')
           f('spam', 'w')
           f('spam', 'wb', 100000) */
    }

    {
        int left, top, right, bottom, h, v;
        ok = PyArg_ParseTuple(args, "((ii)(ii))(ii)",
                 &left, &top, &right, &bottom, &h, &v);
        /* A rectangle and a point */
        /* Possible Python call:
           f(((0, 0), (400, 300)), (10, 10)) */
    }
\end{verbatim}\ecode
%
\section{The {\tt Py_BuildValue()} Function}

This function is the counterpart to \code{PyArg_ParseTuple()}. It is
declared as follows:

\bcode\begin{verbatim}
    PyObject *Py_BuildValue(char *format, ...);
\end{verbatim}\ecode
%
It recognizes a set of format units similar to the ones recognized by
\code{PyArg_ParseTuple()}, but the arguments (which are input to the
function, not output) must not be pointers, just values. It returns a
new Python object, suitable for returning from a C function called
from Python.

One difference with \code{PyArg_ParseTuple()}: while the latter
requires its first argument to be a tuple (since Python argument lists
are always represented as tuples internally), \code{Py_BuildValue()} does
not always build a tuple. It builds a tuple only if its format string
contains two or more format units. If the format string is empty, it
returns \code{None}; if it contains exactly one format unit, it
returns whatever object is described by that format unit. To force it
to return a tuple of size 0 or 1, parenthesize the format string.

In the following description, the quoted form is the format unit; the
entry in (round) parentheses is the Python object type that the format
unit will return; and the entry in [square] brackets is the type of
the C value(s) to be passed.

The characters space, tab, colon and comma are ignored in format
strings (but not within format units such as \samp{s\#}). This can be
used to make long format strings a tad more readable.

\begin{description}

\item[\samp{s} (string) {[char *]}]
Convert a null-terminated C string to a Python object. If the C
string pointer is \code{NULL}, \code{None} is returned.

\item[\samp{s\#} (string) {[char *, int]}]
Convert a C string and its length to a Python object. If the C string
pointer is \code{NULL}, the length is ignored and \code{None} is
returned.

\item[\samp{z} (string or \code{None}) {[char *]}]
Same as \samp{s}.

\item[\samp{z\#} (string or \code{None}) {[char *, int]}]
Same as \samp{s\#}.

\item[\samp{i} (integer) {[int]}]
Convert a plain C \code{int} to a Python integer object.

\item[\samp{b} (integer) {[char]}]
Same as \samp{i}.

\item[\samp{h} (integer) {[short int]}]
Same as \samp{i}.

\item[\samp{l} (integer) {[long int]}]
Convert a C \code{long int} to a Python integer object.

\item[\samp{c} (string of length 1) {[char]}]
Convert a C \code{int} representing a character to a Python string of
length 1.

\item[\samp{d} (float) {[double]}]
Convert a C \code{double} to a Python floating point number.

\item[\samp{f} (float) {[float]}]
Same as \samp{d}.

\item[\samp{O} (object) {[PyObject *]}]
Pass a Python object untouched (except for its reference count, which
is incremented by one). If the object passed in is a \code{NULL}
pointer, it is assumed that this was caused because the call producing
the argument found an error and set an exception. Therefore,
\code{Py_BuildValue()} will return \code{NULL} but won't raise an
exception. If no exception has been raised yet,
\code{PyExc_SystemError} is set.

\item[\samp{S} (object) {[PyObject *]}]
Same as \samp{O}.

\item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
Convert \var{anything} to a Python object through a \var{converter}
function. The function is called with \var{anything} (which should be
compatible with \code{void *}) as its argument and should return a
``new'' Python object, or \code{NULL} if an error occurred.

\item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
Convert a sequence of C values to a Python tuple with the same number
of items.

\item[\samp{[\var{items}]} (list) {[\var{matching-items}]}]
Convert a sequence of C values to a Python list with the same number
of items.

\item[\samp{\{\var{items}\}} (dictionary) {[\var{matching-items}]}]
Convert a sequence of C values to a Python dictionary. Each pair of
consecutive C values adds one item to the dictionary, serving as key
and value, respectively.

\end{description}

If there is an error in the format string, the
\code{PyExc_SystemError} exception is raised and \code{NULL} returned.

Examples (to the left the call, to the right the resulting Python value):

\bcode\begin{verbatim}
    Py_BuildValue("")                        None
    Py_BuildValue("i", 123)                  123
    Py_BuildValue("iii", 123, 456, 789)      (123, 456, 789)
    Py_BuildValue("s", "hello")              'hello'
    Py_BuildValue("ss", "hello", "world")    ('hello', 'world')
    Py_BuildValue("s#", "hello", 4)          'hell'
    Py_BuildValue("()")                      ()
    Py_BuildValue("(i)", 123)                (123,)
    Py_BuildValue("(ii)", 123, 456)          (123, 456)
    Py_BuildValue("(i,i)", 123, 456)         (123, 456)
    Py_BuildValue("[i,i]", 123, 456)         [123, 456]
    Py_BuildValue("{s:i,s:i}",
                  "abc", 123, "def", 456)    {'abc': 123, 'def': 456}
    Py_BuildValue("((ii)(ii)) (ii)",
                  1, 2, 3, 4, 5, 6)          (((1, 2), (3, 4)), (5, 6))
\end{verbatim}\ecode
%
\section{Reference Counts}

\subsection{Introduction}

In languages like C or \Cpp{}, the programmer is responsible for
dynamic allocation and deallocation of memory on the heap. In C, this
is done using the functions \code{malloc()} and \code{free()}. In
\Cpp{}, the operators \code{new} and \code{delete} are used with
essentially the same meaning; they are typically implemented using
\code{malloc()} and \code{free()}, so we'll restrict the following
discussion to the latter.

Every block of memory allocated with \code{malloc()} should eventually
be returned to the pool of available memory by exactly one call to
\code{free()}. It is important to call \code{free()} at the right
time. If a block's address is forgotten but \code{free()} is not
called for it, the memory it occupies cannot be reused until the
program terminates. This is called a \dfn{memory leak}. On the other
hand, if a program calls \code{free()} for a block and then continues
to use the block, it creates a conflict with re-use of the block
through another \code{malloc()} call. This is called \dfn{using freed
memory}. It has the same bad consequences as referencing uninitialized
data --- core dumps, wrong results, mysterious crashes.

Common causes of memory leaks are unusual paths through the code. For
instance, a function may allocate a block of memory, do some
calculation, and then free the block again. Now a change in the
requirements for the function may add a test to the calculation that
detects an error condition and can return prematurely from the
function. It's easy to forget to free the allocated memory block when
taking this premature exit, especially when it is added later to the
code. Such leaks, once introduced, often go undetected for a long
time: the error exit is taken only in a small fraction of all calls,
and most modern machines have plenty of virtual memory, so the leak
only becomes apparent in a long-running process that uses the leaking
function frequently. Therefore, it's important to prevent leaks from
happening by having a coding convention or strategy that minimizes
this kind of error.

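One such convention, sketched below in plain C, is to route every exit
taken after the allocation through a single cleanup label, so that an
error test added later cannot skip the call to \code{free()}. The
function \code{sum_of_squares()} and the counter \code{live_blocks}
are hypothetical names introduced only for this illustration; the
counter exists just to make the absence of a leak observable.

\bcode\begin{verbatim}
#include <assert.h>
#include <stdlib.h>

static int live_blocks = 0;     /* blocks allocated but not yet freed */

/* Sum the squares of n non-negative ints, using a scratch buffer.
   Returns 0 on success, -1 on error (bad input or no memory). */
int sum_of_squares(const int *values, int n, long *result)
{
    int status = -1;            /* assume failure until the end */
    long total = 0;
    long *scratch;
    int i;

    assert(values != NULL && result != NULL);
    if (n <= 0)
        return -1;              /* nothing allocated yet */
    scratch = (long *) malloc(n * sizeof(long));
    if (scratch == NULL)
        return -1;              /* nothing allocated, nothing to free */
    live_blocks++;

    for (i = 0; i < n; i++) {
        if (values[i] < 0)      /* the error test that was added later */
            goto cleanup;       /* premature exit still frees scratch */
        scratch[i] = (long) values[i] * values[i];
        total += scratch[i];
    }
    *result = total;
    status = 0;

cleanup:
    free(scratch);
    live_blocks--;
    return status;
}
\end{verbatim}\ecode
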
Since Python makes heavy use of \code{malloc()} and \code{free()}, it
needs a strategy to avoid memory leaks as well as the use of freed
memory. The chosen method is called \dfn{reference counting}. The
principle is simple: every object contains a counter, which is
incremented when a reference to the object is stored somewhere, and
which is decremented when a reference to it is deleted. When the
counter reaches zero, the last reference to the object has been
deleted and the object is freed.

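The principle fits in a few lines of C. The sketch below is a toy
model, not the interpreter's implementation: the type
\code{toy_object}, the functions \code{toy_new()}, \code{toy_incref()}
and \code{toy_decref()}, and the counter \code{toy_live} are
hypothetical names invented for this illustration.

\bcode\begin{verbatim}
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int refcnt;                 /* number of references to this object */
    long value;                 /* payload */
} toy_object;

static int toy_live = 0;        /* objects allocated and not yet freed */

/* Create an object; the caller receives the first reference. */
toy_object *toy_new(long value)
{
    toy_object *op = (toy_object *) malloc(sizeof(toy_object));
    if (op == NULL)
        return NULL;
    op->refcnt = 1;
    op->value = value;
    toy_live++;
    return op;
}

/* A reference is being stored somewhere: count it. */
void toy_incref(toy_object *op)
{
    assert(op != NULL && op->refcnt > 0);
    op->refcnt++;
}

/* A reference is being deleted: the last one frees the object. */
void toy_decref(toy_object *op)
{
    assert(op != NULL && op->refcnt > 0);
    if (--op->refcnt == 0) {
        toy_live--;
        free(op);
    }
}
\end{verbatim}\ecode
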
An alternative strategy is called \dfn{automatic garbage collection}.
(Sometimes, reference counting is also referred to as a garbage
collection strategy, hence my use of ``automatic'' to distinguish the
two.) The big advantage of automatic garbage collection is that the
user doesn't need to call \code{free()} explicitly. (Another claimed
advantage is an improvement in speed or memory usage --- this is no
hard fact however.) The disadvantage is that for C, there is no
truly portable automatic garbage collector, while reference counting
can be implemented portably (as long as the functions \code{malloc()}
and \code{free()} are available --- which the C Standard guarantees).
Maybe some day a sufficiently portable automatic garbage collector
will be available for C. Until then, we'll have to live with
reference counts.

\subsection{Reference Counting in Python}

There are two macros, \code{Py_INCREF(x)} and \code{Py_DECREF(x)},
which handle the incrementing and decrementing of the reference count.
\code{Py_DECREF()} also frees the object when the count reaches zero.
For flexibility, it doesn't call \code{free()} directly --- rather, it
makes a call through a function pointer in the object's \dfn{type
object}. For this purpose (and others), every object also contains a
pointer to its type object.

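The indirection through the type object can be pictured as follows.
This is a simplified sketch with made-up names (\code{mini_type},
\code{mini_object}, \code{MINI_INCREF()}, \code{MINI_DECREF()},
\code{plain_new()}); the real object and type structures contain many
more fields, and the real macros differ in detail.

\bcode\begin{verbatim}
#include <assert.h>
#include <stdlib.h>

struct mini_object;

/* The type object: shared by all objects of one type. */
typedef struct {
    const char *name;
    void (*dealloc)(struct mini_object *);  /* how to free an instance */
} mini_type;

/* Every object carries a reference count and a type pointer. */
typedef struct mini_object {
    int refcnt;
    mini_type *type;
} mini_object;

#define MINI_INCREF(op) ((op)->refcnt++)

/* Never call free() directly: let the object's type decide. */
#define MINI_DECREF(op) \
    do { \
        if (--(op)->refcnt == 0) \
            (op)->type->dealloc(op); \
    } while (0)

static int plain_freed = 0;     /* counts calls to plain_dealloc() */

static void plain_dealloc(mini_object *op)
{
    assert(op->refcnt == 0);
    plain_freed++;
    free(op);
}

static mini_type plain_type = { "plain", plain_dealloc };

/* Create a "plain" object; the caller receives the first reference. */
mini_object *plain_new(void)
{
    mini_object *op = (mini_object *) malloc(sizeof(mini_object));
    if (op != NULL) {
        op->refcnt = 1;
        op->type = &plain_type;
    }
    return op;
}
\end{verbatim}\ecode
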
The big question now remains: when to use \code{Py_INCREF(x)} and
\code{Py_DECREF(x)}? Let's first introduce some terms. Nobody
``owns'' an object; however, you can \dfn{own a reference} to an
object. An object's reference count is now defined as the number of
owned references to it. The owner of a reference is responsible for
calling \code{Py_DECREF()} when the reference is no longer needed.
Ownership of a reference can be transferred. There are three ways to
dispose of an owned reference: pass it on, store it, or call
\code{Py_DECREF()}. Forgetting to dispose of an owned reference creates
a memory leak.

It is also possible to \dfn{borrow}\footnote{The metaphor of
``borrowing'' a reference is not completely correct: the owner still
has a copy of the reference.} a reference to an object. The borrower
of a reference should not call \code{Py_DECREF()}. The borrower must
not hold on to the object longer than the owner from which it was
borrowed. Using a borrowed reference after the owner has disposed of
it risks using freed memory and should be avoided
completely.\footnote{Checking that the reference count is at least 1
\strong{does not work} --- the reference count itself could be in
freed memory and may thus be reused for another object!}

The advantage of borrowing over owning a reference is that you don't
need to take care of disposing of the reference on all possible paths
through the code --- in other words, with a borrowed reference you
don't run the risk of leaking when a premature exit is taken. The
disadvantage of borrowing over owning is that there are some subtle
situations where in seemingly correct code a borrowed reference can be
used after the owner from which it was borrowed has in fact disposed
of it.

A borrowed reference can be changed into an owned reference by calling
\code{Py_INCREF()}. This does not affect the status of the owner from
which the reference was borrowed --- it creates a new owned reference,
and gives full owner responsibilities (i.e., the new owner must
dispose of the reference properly, as well as the previous owner).

\subsection{Ownership Rules}

Whenever an object reference is passed into or out of a function, it
is part of the function's interface specification whether ownership is
transferred with the reference or not.

Most functions that return a reference to an object pass on ownership
with the reference. In particular, all functions whose purpose is
to create a new object, e.g.\ \code{PyInt_FromLong()} and
\code{Py_BuildValue()}, pass ownership to the receiver. Even if in
fact, in some cases, you don't receive a reference to a brand new
object, you still receive ownership of the reference. For instance,
\code{PyInt_FromLong()} maintains a cache of popular values and can
return a reference to a cached item.

Many functions that extract objects from other objects also transfer
ownership with the reference, for instance
\code{PyObject_GetAttrString()}. The picture is less clear here,
however, since a few common routines are exceptions:
\code{PyTuple_GetItem()}, \code{PyList_GetItem()} and
\code{PyDict_GetItem()} (and \code{PyDict_GetItemString()}) all return
references that you borrow from the tuple, list or dictionary.

The function \code{PyImport_AddModule()} also returns a borrowed
reference, even though it may actually create the object it returns:
this is possible because an owned reference to the object is stored in
\code{sys.modules}.

When you pass an object reference into another function, in general,
the function borrows the reference from you --- if it needs to store
it, it will use \code{Py_INCREF()} to become an independent owner.
There are exactly two important exceptions to this rule:
\code{PyTuple_SetItem()} and \code{PyList_SetItem()}. These functions
take over ownership of the item passed to them --- even if they fail!
(Note that \code{PyDict_SetItem()} and friends don't take over
ownership --- they are ``normal''.)

When a C function is called from Python, it borrows references to its
arguments from the caller. The caller owns a reference to the object,
so the borrowed reference's lifetime is guaranteed until the function
returns. Only when such a borrowed reference must be stored or passed
on must it be turned into an owned reference by calling
\code{Py_INCREF()}.

The object reference returned from a C function that is called from
Python must be an owned reference --- ownership is transferred from the
function to its caller.

\subsection{Thin Ice}

There are a few situations where seemingly harmless use of a borrowed
reference can lead to problems. These all have to do with implicit
invocations of the interpreter, which can cause the owner of a
reference to dispose of it.

The first and most important case to know about is using
\code{Py_DECREF()} on an unrelated object while borrowing a reference
to a list item. For instance:

\bcode\begin{verbatim}
void bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);
    PyList_SetItem(list, 1, PyInt_FromLong(0L));
    PyObject_Print(item, stdout, 0); /* BUG! */
}
\end{verbatim}\ecode
%
This function first borrows a reference to \code{list[0]}, then
replaces \code{list[1]} with the value \code{0}, and finally prints
the borrowed reference. Looks harmless, right? But it's not!

Let's follow the control flow into \code{PyList_SetItem()}. The list
owns references to all its items, so when item 1 is replaced, it has
to dispose of the original item 1. Now let's suppose the original
item 1 was an instance of a user-defined class, and let's further
suppose that the class defined a \code{__del__()} method. If this
class instance has a reference count of 1, disposing of it will call
its \code{__del__()} method.

Since it is written in Python, the \code{__del__()} method can execute
arbitrary Python code. Could it perhaps do something to invalidate
the reference to \code{item} in \code{bug()}? You bet! Assuming that
the list passed into \code{bug()} is accessible to the
\code{__del__()} method, it could execute a statement to the effect of
\code{del list[0]}, and assuming this was the last reference to that
object, it would free the memory associated with it, thereby
invalidating \code{item}.

The solution, once you know the source of the problem, is easy:
temporarily increment the reference count. The correct version of the
function reads:

\bcode\begin{verbatim}
void no_bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);
    Py_INCREF(item);    /* keep item alive across the next call */
    PyList_SetItem(list, 1, PyInt_FromLong(0L));
    PyObject_Print(item, stdout, 0);
    Py_DECREF(item);    /* give up the temporary reference */
}
\end{verbatim}\ecode
%
This is a true story. An older version of Python contained variants
of this bug and someone spent a considerable amount of time in a C
debugger to figure out why his \code{__del__()} methods would fail...

The second case of problems with a borrowed reference is a variant
involving threads. Normally, multiple threads in the Python
interpreter can't get in each other's way, because there is a global
lock protecting Python's entire object space. However, it is possible
to temporarily release this lock using the macro
\code{Py_BEGIN_ALLOW_THREADS}, and to re-acquire it using
\code{Py_END_ALLOW_THREADS}. This is common around blocking I/O
calls, to let other threads use the CPU while waiting for the I/O to
complete. Obviously, the following function has the same problem as
the previous one:

\bcode\begin{verbatim}
void bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);
    Py_BEGIN_ALLOW_THREADS
    /* ...some blocking I/O call... */
    Py_END_ALLOW_THREADS
    PyObject_Print(item, stdout, 0); /* BUG! */
}
\end{verbatim}\ecode
%
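As in the first case, the remedy is to hold a reference of your own
for as long as the borrowed pointer is in use. The following is only
a sketch of that fix: the name \code{no_bug_threads()} and the
\code{NULL} test are additions made for illustration, and the blocking
call is left as a placeholder comment.

\bcode\begin{verbatim}
#include "Python.h"
#include <stdio.h>

/* Hypothetical corrected variant of the threaded bug(): own a
   reference to the item for as long as the lock may be released. */
void no_bug_threads(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0); /* borrowed reference */
    if (item == NULL)
        return; /* e.g. an empty list; an exception has been set */
    Py_INCREF(item); /* now we own a reference */
    Py_BEGIN_ALLOW_THREADS
    /* ...some blocking I/O call... */
    Py_END_ALLOW_THREADS
    PyObject_Print(item, stdout, 0);
    Py_DECREF(item); /* give our reference back */
}
\end{verbatim}\ecode
%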
\subsection{NULL Pointers}

In general, functions that take object references as arguments don't
expect you to pass them \code{NULL} pointers, and will dump core (or
cause later core dumps) if you do so. Functions that return object
references generally return \code{NULL} only to indicate that an
exception occurred. The reason for not testing for \code{NULL}
arguments is that functions often pass the objects they receive on to
other functions --- if each function were to test for \code{NULL},
there would be a lot of redundant tests and the code would run slower.

It is better to test for \code{NULL} only at the ``source'', i.e.\
when a pointer that may be \code{NULL} is received, e.g.\ from
\code{malloc()} or from a function that may raise an exception.

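To illustrate testing at the source, here is a minimal sketch in plain
C. The names \code{fill()} and \code{make_filled()} are hypothetical
and not part of the Python API. The pointer returned by
\code{malloc()} is tested once, where it is received, and the helper
it is passed on to does not test it again.

\bcode\begin{verbatim}
#include <stdlib.h>
#include <string.h>

/* Helper: relies on its caller to pass a non-NULL buffer of at
   least n+1 bytes, so it does not test for NULL itself. */
static void fill(char *buf, size_t n, char c) {
    memset(buf, c, n);
    buf[n] = '\0';
}

/* The "source": malloc() may return NULL, so test here, once.
   The caller owns the result and must free() it. */
char *make_filled(size_t n, char c) {
    char *buf = malloc(n + 1);
    if (buf == NULL)
        return NULL; /* report the failure to our own caller */
    fill(buf, n, c);
    return buf;
}
\end{verbatim}\ecode
%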
The macros \code{Py_INCREF()} and \code{Py_DECREF()}
don't check for \code{NULL} pointers --- however, their variants
\code{Py_XINCREF()} and \code{Py_XDECREF()} do.

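The \code{X} variants are convenient in cleanup code, where some
variables may still be \code{NULL} when an error is detected. A
minimal sketch, assuming a hypothetical helper \code{make_pair()} that
is not part of the Python API:

\bcode\begin{verbatim}
#include "Python.h"

/* Hypothetical helper: return a new 2-tuple (a, b), or NULL with an
   exception set.  All exits go through one cleanup section. */
PyObject *make_pair(long a, long b) {
    PyObject *x = NULL, *y = NULL, *result = NULL;

    x = Py_BuildValue("l", a);
    if (x == NULL)
        goto done; /* y is still NULL here */
    y = Py_BuildValue("l", b);
    if (y == NULL)
        goto done;
    /* The "O" format takes its own reference to each object */
    result = Py_BuildValue("(OO)", x, y);

done:
    Py_XDECREF(x); /* safe even if the pointer is NULL */
    Py_XDECREF(y);
    return result;
}
\end{verbatim}\ecode
%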
The macros for checking for a particular object type
(\code{Py\var{type}_Check()}) don't check for \code{NULL} pointers ---
again, there is much code that calls several of these in a row to test
an object against various different expected types, and this would
generate redundant tests. There are no variants with \code{NULL}
checking.

The C function calling mechanism guarantees that the argument list
passed to C functions (\code{args} in the examples) is never
\code{NULL} --- in fact it guarantees that it is always a tuple.%
\footnote{These guarantees don't hold when you use the ``old'' style
calling convention --- this is still found in much existing code.}

It is a severe error to ever let a \code{NULL} pointer ``escape'' to
the Python user.


\section{Writing Extensions in \Cpp{}}

It is possible to write extension modules in \Cpp{}. Some restrictions
apply. If the main program (the Python interpreter) is compiled and
linked by the C compiler, global or static objects with constructors
cannot be used. This is not a problem if the main program is linked
by the \Cpp{} compiler. All functions that will be called directly or
indirectly (i.e.\ via function pointers) by the Python interpreter will
have to be declared using \code{extern "C"}; this applies to all
``methods'' as well as to the module's initialization function.
It is unnecessary to enclose the Python header files in
\code{extern "C" \{...\}} --- they use this form already if the symbol
\samp{__cplusplus} is defined (all recent \Cpp{} compilers define this
symbol).

\chapter{Embedding Python in Another Application}

Embedding Python is similar to extending it, but not quite. The
difference is that when you extend Python, the main program of the
application is still the Python interpreter, while if you embed
Python, the main program may have nothing to do with Python ---
instead, some parts of the application occasionally call the Python
interpreter to run some Python code.

So if you are embedding Python, you are providing your own main
program. One of the things this main program has to do is initialize
the Python interpreter. At the very least, you have to call the
function \code{Py_Initialize()}. There are optional calls to pass command
line arguments to Python. Then later you can call the interpreter
from any part of the application.

There are several different ways to call the interpreter: you can pass
a string containing Python statements to \code{PyRun_SimpleString()},
or you can pass a stdio file pointer and a file name (for
identification in error messages only) to \code{PyRun_SimpleFile()}. You
can also call the lower-level operations described in the previous
chapters to construct and use Python objects.

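A minimal sketch of the simplest of these calls follows. The function
name \code{run_embedded_demo()} is hypothetical, and the sketch assumes
that \code{PyRun_SimpleString()} returns 0 on success and a nonzero
value if the statements raised an exception. Your own main program
would call such a function wherever it wants some Python code
executed.

\bcode\begin{verbatim}
#include "Python.h"

/* Hypothetical helper: initialize the interpreter, then run two
   Python statements passed in as a C string.  Returns the status
   reported by PyRun_SimpleString() (assumed: 0 means success). */
int run_embedded_demo(void) {
    Py_Initialize();
    return PyRun_SimpleString(
        "import sys\n"
        "sys.stdout.write('Hello from embedded Python\\n')\n");
}
\end{verbatim}\ecode
%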
A simple demo of embedding Python can be found in the directory
\file{Demo/embed}.


\section{Embedding Python in \Cpp{}}

It is also possible to embed Python in a \Cpp{} program. Precisely how
this is done depends on the details of the \Cpp{} system used; in
general you will need to write the main program in \Cpp{}, and use the
\Cpp{} compiler to compile and link your program. There is no need to
recompile Python itself using \Cpp{}.


\chapter{Dynamic Loading}

On most modern systems it is possible to configure Python to support
dynamic loading of extension modules implemented in C. When shared
libraries are used, dynamic loading is configured automatically;
otherwise you have to select it as a build option (see below). Once
configured, dynamic loading is trivial to use: when a Python program
executes \code{import spam}, the search for modules tries to find a
file \file{spammodule.o} (\file{spammodule.so} when using shared
libraries) in the module search path, and if one is found, it is
loaded into the executing binary and executed. Once loaded, the
module acts just like a built-in extension module.

The advantages of dynamic loading are twofold: the ``core'' Python
binary gets smaller, and users can extend Python with their own
modules implemented in C without having to build and maintain their
own copy of the Python interpreter. There are also disadvantages:
dynamic loading isn't available on all systems (this just means that
on some systems you have to use static loading), and dynamically
loading a module that was compiled for a different version of Python
(e.g.\ with a different representation of objects) may dump core.


\section{Configuring and Building the Interpreter for Dynamic Loading}

There are three styles of dynamic loading: one using shared libraries,
one using SGI IRIX 4 dynamic loading, and one using GNU dynamic
loading.

\subsection{Shared Libraries}

The following systems support dynamic loading using shared libraries:
SunOS 4; Solaris 2; SGI IRIX 5 (but not SGI IRIX 4!); and probably all
systems derived from SVR4, or at least those SVR4 derivatives that
support shared libraries (are there any that don't?).

You don't need to do anything to configure dynamic loading on these
systems --- the \file{configure} script detects the presence of the
\file{<dlfcn.h>} header file and automatically configures dynamic
loading.

\subsection{SGI IRIX 4 Dynamic Loading}

Only SGI IRIX 4 supports dynamic loading of modules using SGI dynamic
loading. (SGI IRIX 5 might also support it but it is inferior to
using shared libraries so there is no reason to; a small test didn't
work right away so I gave up trying to support it.)

Before you build Python, you first need to fetch and build the \code{dl}
package written by Jack Jansen. This is available by anonymous ftp
from host \file{ftp.cwi.nl}, directory \file{pub/dynload}, file
\file{dl-1.6.tar.Z}. (The version number may change.) Follow the
instructions in the package's \file{README} file to build it.

Once you have built \code{dl}, you can configure Python to use it. To
this end, you run the \file{configure} script with the option
\code{--with-dl=\var{directory}} where \var{directory} is the absolute
pathname of the \code{dl} directory.

Now build and install Python as you normally would (see the
\file{README} file in the toplevel Python directory).

\subsection{GNU Dynamic Loading}

GNU dynamic loading supports (according to its \file{README} file) the
following hardware and software combinations: VAX (Ultrix), Sun 3
(SunOS 3.4 and 4.0), Sparc (SunOS 4.0), Sequent Symmetry (Dynix), and
Atari ST. There is no reason to use it on a Sparc; I haven't seen a
Sun 3 for years so I don't know if these have shared libraries or not.

You need to fetch and build two packages.
One is GNU DLD. All development of this code has been done with DLD
version 3.2.3, which is available by anonymous ftp from host
\file{ftp.cwi.nl}, directory \file{pub/dynload}, file
\file{dld-3.2.3.tar.Z}. (A more recent version of DLD is available
via \file{http://www-swiss.ai.mit.edu/~jaffer/DLD.html} but this has
not been tested.)
The other package needed is an
emulation of Jack Jansen's \code{dl} package that I wrote on top of
GNU DLD 3.2.3. This is available from the same host and directory,
file \file{dl-dld-1.1.tar.Z}. (The version number may change --- but I
doubt it will.) Follow the instructions in each package's \file{README}
file to configure and build them.

Now configure Python. Run the \file{configure} script with the option
\code{--with-dl-dld=\var{dl-directory},\var{dld-directory}} where
\var{dl-directory} is the absolute pathname of the directory where you
have built the \file{dl-dld} package, and \var{dld-directory} is that
of the GNU DLD package. The Python interpreter you build hereafter
will support GNU dynamic loading.


\section{Building a Dynamically Loadable Module}

Since there are three styles of dynamic loading, there are also three
groups of instructions for building a dynamically loadable module.
Instructions common to all three styles are given first. Assuming
your module is called \code{spam}, the source filename must be
\file{spammodule.c}, so the object name is \file{spammodule.o}. The
module must be written as a normal Python extension module (as
described earlier).

Note that in all cases you will have to create your own Makefile that
compiles your module file(s). This Makefile will have to pass two
\samp{-I} arguments to the C compiler which will make it find the
Python header files. If the Make variable \var{PYTHONTOP} points to
the toplevel Python directory, your \var{CFLAGS} Make variable should
contain the options \samp{-I\$(PYTHONTOP) -I\$(PYTHONTOP)/Include}.
(Most header files are in the \file{Include} subdirectory, but the
\file{config.h} header lives in the toplevel directory.)

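A Makefile along the following lines would do for a module called
\code{spam}. This is only a sketch: the value given to
\var{PYTHONTOP} is a placeholder for wherever your Python source tree
lives, and any extra flags your style of dynamic loading requires must
be added to \var{CFLAGS}.

\bcode\begin{verbatim}
# Placeholder: the toplevel directory of your Python source tree
PYTHONTOP=	/usr/local/src/python
CFLAGS=		-I$(PYTHONTOP) -I$(PYTHONTOP)/Include

spammodule.o:	spammodule.c
	$(CC) $(CFLAGS) -c spammodule.c
\end{verbatim}\ecode
%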

\subsection{Shared Libraries}

You must link the \samp{.o} file to produce a shared library. This is
done using a special invocation of the \UNIX{} loader/linker, {\em
ld}(1). Unfortunately the invocation differs slightly per system.

On SunOS 4, use
\bcode\begin{verbatim}
    ld spammodule.o -o spammodule.so
\end{verbatim}\ecode
%
On Solaris 2, use
\bcode\begin{verbatim}
    ld -G spammodule.o -o spammodule.so
\end{verbatim}\ecode
%
On SGI IRIX 5, use
\bcode\begin{verbatim}
    ld -shared spammodule.o -o spammodule.so
\end{verbatim}\ecode
%
On other systems, consult the manual page for \code{ld}(1) to find what
flags, if any, must be used.

If your extension module uses system libraries that haven't already
been linked with Python (e.g.\ a windowing system), these must be
passed to the \code{ld} command as \samp{-l} options after the
\samp{.o} file.

The resulting file \file{spammodule.so} must be copied into a directory
along the Python module search path.


\subsection{SGI IRIX 4 Dynamic Loading}

{\bf IMPORTANT:} You must compile your extension module with the
additional C flag \samp{-G0} (or \samp{-G 0}). This instructs the
assembler to generate position-independent code.

You don't need to link the resulting \file{spammodule.o} file; just
copy it into a directory along the Python module search path.

The first time your extension is loaded, it takes some extra time and
a few messages may be printed. This creates a file
\file{spammodule.ld} which is an image that can be loaded quickly into
the Python interpreter process. When a new Python interpreter is
installed, the \code{dl} package detects this and rebuilds
\file{spammodule.ld}. The file \file{spammodule.ld} is placed in the
directory where \file{spammodule.o} was found, unless this directory is
unwritable; in that case it is placed in a temporary
directory.\footnote{Check the manual page of the \code{dl} package for
details.}

If your extension module uses additional system libraries, you must
create a file \file{spammodule.libs} in the same directory as the
\file{spammodule.o}. This file should contain one or more lines with
whitespace-separated options that will be passed to the linker ---
normally only \samp{-l} options or absolute pathnames of libraries
(\samp{.a} files) should be used.


\subsection{GNU Dynamic Loading}

Just copy \file{spammodule.o} into a directory along the Python module
search path.

If your extension module uses additional system libraries, you must
create a file \file{spammodule.libs} in the same directory as the
\file{spammodule.o}. This file should contain one or more lines with
whitespace-separated absolute pathnames of libraries (\samp{.a}
files). No \samp{-l} options can be used.


%\input{extref}

\input{ext.ind}

\end{document}