\documentclass{manual}

% XXX PM explain how to add new types to Python

\title{Extending and Embedding the Python Interpreter}

\input{boilerplate}

% Tell \index to actually write the .idx file
\makeindex

\begin{document}

\maketitle

\ifhtml
\chapter*{Front Matter\label{front}}
\fi

\input{copyright}


\begin{abstract}

\noindent
Python is an interpreted, object-oriented programming language. This
document describes how to write modules in C or \Cpp{} to extend the
Python interpreter with new modules. Those modules can define new
functions but also new object types and their methods. The document
also describes how to embed the Python interpreter in another
application, for use as an extension language. Finally, it shows how
to compile and link extension modules so that they can be loaded
dynamically (at run time) into the interpreter, if the underlying
operating system supports this feature.

This document assumes basic knowledge about Python. For an informal
introduction to the language, see the
\citetitle[../tut/tut.html]{Python Tutorial}. The
\citetitle[../ref/ref.html]{Python Reference Manual} gives a more
formal definition of the language. The
\citetitle[../lib/lib.html]{Python Library Reference} documents the
existing object types, functions and modules (both built-in and
written in Python) that give the language its wide application range.

For a detailed description of the whole Python/C API, see the separate
\citetitle[../api/api.html]{Python/C API Reference Manual}.

\end{abstract}

\tableofcontents


\chapter{Extending Python with C or \Cpp{} \label{intro}}


It is quite easy to add new built-in modules to Python, if you know
how to program in C. Such \dfn{extension modules} can do two things
that can't be done directly in Python: they can implement new built-in
object types, and they can call C library functions and system calls.

To support extensions, the Python API (Application Programmer's
Interface) defines a set of functions, macros and variables that
provide access to most aspects of the Python run-time system. The
Python API is incorporated in a C source file by including the header
\code{"Python.h"}.

The compilation of an extension module depends on its intended use as
well as on your system setup; details are given in later chapters.


\section{A Simple Example
         \label{simpleExample}}

Let's create an extension module called \samp{spam} (the favorite food
of Monty Python fans...) and let's say we want to create a Python
interface to the C library function \cfunction{system()}.\footnote{An
interface for this function already exists in the standard module
\module{os} --- it was chosen as a simple and straightforward example.}
This function takes a null-terminated character string as argument and
returns an integer. We want this function to be callable from Python
as follows:

\begin{verbatim}
>>> import spam
>>> status = spam.system("ls -l")
\end{verbatim}
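Because the standard module \module{os} already wraps the same C
function, the intended behaviour can be tried from Python before any C
is written. The following is only a sketch: the helper name
\code{run_command} is hypothetical, and it assumes a shell in which
the command \samp{exit 0} succeeds.

```python
import os


def run_command(command):
    """Pass a command string to the C library's system() via the os module."""
    # os.system() returns an integer status, just as spam.system() is meant to.
    return os.system(command)


status = run_command("exit 0")
print("status:", status)
```

The extension module built in the rest of this chapter is meant to
offer the same call under the name \function{spam.system()}.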

Begin by creating a file \file{spammodule.c}. (Historically, if a
module is called \samp{spam}, the C file containing its implementation
is called \file{spammodule.c}; if the module name is very long, like
\samp{spammify}, the file name can be just \file{spammify.c}.)

The first line of our file can be:

\begin{verbatim}
#include <Python.h>
\end{verbatim}

which pulls in the Python API (you can add a comment describing the
purpose of the module and a copyright notice if you like).

All user-visible symbols defined by \code{"Python.h"} have a prefix of
\samp{Py} or \samp{PY}, except those defined in standard header files.
For convenience, and since they are used extensively by the Python
interpreter, \code{"Python.h"} includes a few standard header files:
\code{<stdio.h>}, \code{<string.h>}, \code{<errno.h>}, and
\code{<stdlib.h>}. If the latter header file does not exist on your
system, \code{"Python.h"} declares the functions \cfunction{malloc()},
\cfunction{free()} and \cfunction{realloc()} directly.

The next thing we add to our module file is the C function that will
be called when the Python expression \samp{spam.system(\var{string})}
is evaluated (we'll see shortly how it ends up being called):

\begin{verbatim}
static PyObject *
spam_system(self, args)
    PyObject *self;
    PyObject *args;
{
    char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
    sts = system(command);
    return Py_BuildValue("i", sts);
}
\end{verbatim}

There is a straightforward translation from the argument list in
Python (e.g.\ the single expression \code{"ls -l"}) to the arguments
passed to the C function. The C function always has two arguments,
conventionally named \var{self} and \var{args}.

The \var{self} argument is only used when the C function implements a
built-in method, not a function. In the example, \var{self} will
always be a \NULL{} pointer, since we are defining a function, not a
method. (This is done so that the interpreter doesn't have to
understand two different types of C functions.)

The \var{args} argument will be a pointer to a Python tuple object
containing the arguments. Each item of the tuple corresponds to an
argument in the call's argument list. The arguments are Python
objects --- in order to do anything with them in our C function we have
to convert them to C values. The function \cfunction{PyArg_ParseTuple()}
in the Python API checks the argument types and converts them to C
values. It uses a template string to determine the required types of
the arguments as well as the types of the C variables into which to
store the converted values. More about this later.

\cfunction{PyArg_ParseTuple()} returns true (nonzero) if all arguments have
the right type and their values have been stored in the variables
whose addresses are passed. It returns false (zero) if an invalid
argument list was passed. In the latter case it also raises an
appropriate exception so the calling function can return
\NULL{} immediately (as we saw in the example).
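Seen from Python, a rejected argument list surfaces as an ordinary
exception. The sketch below uses the existing \function{os.system()}
wrapper rather than \function{spam.system()}, which has not been built
yet. The helper name \code{call_fails} is hypothetical, and the sketch
assumes a current interpreter that reports both a wrong argument count
and a wrong argument type as \exception{TypeError}.

```python
import os


def call_fails(*args):
    """Return True if os.system() rejects the given argument list."""
    try:
        os.system(*args)
    except TypeError:
        # The C wrapper set an exception and returned its error value.
        return True
    return False


print(call_fails())          # no arguments at all
print(call_fails(42))        # an integer where a string is required
print(call_fails("exit 0"))  # a well-formed call
```

The C function never sees a malformed argument list: the parsing step
fails first and the exception travels back to the Python caller.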


\section{Intermezzo: Errors and Exceptions
         \label{errors}}

An important convention throughout the Python interpreter is the
following: when a function fails, it should set an exception condition
and return an error value (usually a \NULL{} pointer). Exceptions
are stored in a static global variable inside the interpreter; if this
variable is \NULL{}, no exception has occurred. A second global
variable stores the ``associated value'' of the exception (the second
argument to \keyword{raise}). A third variable contains the stack
traceback in case the error originated in Python code. These three
variables are the C equivalents of the Python variables
\code{sys.exc_type}, \code{sys.exc_value} and \code{sys.exc_traceback} (see
the section on module \module{sys} in the
\citetitle[../lib/lib.html]{Python Library Reference}). It is
important to know about them to understand how errors are passed
around.
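The same triple can be inspected from Python. The sketch below assumes
a current interpreter, where \function{sys.exc_info()} returns the
three values as one tuple; that function and the helper name
\code{capture} do not appear in the text above.

```python
import sys


def capture():
    """Trigger an exception and return the (type, value, traceback) triple."""
    try:
        1 / 0
    except ZeroDivisionError:
        # Inside the handler the interpreter's three exception
        # variables are set; sys.exc_info() exposes them as a tuple.
        return sys.exc_info()


exc_type, exc_value, exc_traceback = capture()
print(exc_type.__name__, "-", exc_value)
```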
Guido van Rossum7a2dba21993-11-05 14:45:11 +0000177
Guido van Rossumb92112d1995-03-20 14:24:09 +0000178The Python API defines a number of functions to set various types of
179exceptions.
180
Fred Draked7bb3031998-03-03 17:52:07 +0000181The most common one is \cfunction{PyErr_SetString()}. Its arguments
Fred Drakeec9fbe91999-02-15 16:20:25 +0000182are an exception object and a C string. The exception object is
Fred Draked7bb3031998-03-03 17:52:07 +0000183usually a predefined object like \cdata{PyExc_ZeroDivisionError}. The
Fred Drakeec9fbe91999-02-15 16:20:25 +0000184C string indicates the cause of the error and is converted to a
Fred Draked7bb3031998-03-03 17:52:07 +0000185Python string object and stored as the ``associated value'' of the
186exception.
Guido van Rossumb92112d1995-03-20 14:24:09 +0000187
Fred Draked7bb3031998-03-03 17:52:07 +0000188Another useful function is \cfunction{PyErr_SetFromErrno()}, which only
Guido van Rossumb92112d1995-03-20 14:24:09 +0000189takes an exception argument and constructs the associated value by
Fred Drake54fd8452000-04-03 04:54:28 +0000190inspection of the global variable \cdata{errno}. The most
Fred Draked7bb3031998-03-03 17:52:07 +0000191general function is \cfunction{PyErr_SetObject()}, which takes two object
Guido van Rossumb92112d1995-03-20 14:24:09 +0000192arguments, the exception and its associated value. You don't need to
Fred Draked7bb3031998-03-03 17:52:07 +0000193\cfunction{Py_INCREF()} the objects passed to any of these functions.
Guido van Rossum7a2dba21993-11-05 14:45:11 +0000194
195You can test non-destructively whether an exception has been set with
Fred Draked7bb3031998-03-03 17:52:07 +0000196\cfunction{PyErr_Occurred()}. This returns the current exception object,
Fred Drake0fd82681998-01-09 05:39:38 +0000197or \NULL{} if no exception has occurred. You normally don't need
Fred Draked7bb3031998-03-03 17:52:07 +0000198to call \cfunction{PyErr_Occurred()} to see whether an error occurred in a
Guido van Rossumb92112d1995-03-20 14:24:09 +0000199function call, since you should be able to tell from the return value.
Guido van Rossum7a2dba21993-11-05 14:45:11 +0000200
Guido van Rossumd16ddb61996-12-13 02:38:17 +0000201When a function \var{f} that calls another function \var{g} detects
Guido van Rossumb92112d1995-03-20 14:24:09 +0000202that the latter fails, \var{f} should itself return an error value
Fred Drake33698f81999-02-16 23:06:32 +0000203(e.g.\ \NULL{} or \code{-1}). It should \emph{not} call one of the
Fred Draked7bb3031998-03-03 17:52:07 +0000204\cfunction{PyErr_*()} functions --- one has already been called by \var{g}.
Guido van Rossumb92112d1995-03-20 14:24:09 +0000205\var{f}'s caller is then supposed to also return an error indication
Fred Draked7bb3031998-03-03 17:52:07 +0000206to \emph{its} caller, again \emph{without} calling \cfunction{PyErr_*()},
Guido van Rossumb92112d1995-03-20 14:24:09 +0000207and so on --- the most detailed cause of the error was already
208reported by the function that first detected it. Once the error
209reaches the Python interpreter's main loop, this aborts the currently
210executing Python code and tries to find an exception handler specified
211by the Python programmer.
Guido van Rossum6938f061994-08-01 12:22:53 +0000212
213(There are situations where a module can actually give a more detailed
Fred Draked7bb3031998-03-03 17:52:07 +0000214error message by calling another \cfunction{PyErr_*()} function, and in
Guido van Rossumb92112d1995-03-20 14:24:09 +0000215such cases it is fine to do so. As a general rule, however, this is
216not necessary, and can cause information about the cause of the error
217to be lost: most operations can fail for a variety of reasons.)
Guido van Rossum7a2dba21993-11-05 14:45:11 +0000218
Guido van Rossum5049bcb1995-03-13 16:55:23 +0000219To ignore an exception set by a function call that failed, the exception
Fred Draked7bb3031998-03-03 17:52:07 +0000220condition must be cleared explicitly by calling \cfunction{PyErr_Clear()}.
Fred Drakeec9fbe91999-02-15 16:20:25 +0000221The only time C code should call \cfunction{PyErr_Clear()} is if it doesn't
Guido van Rossum5049bcb1995-03-13 16:55:23 +0000222want to pass the error on to the interpreter but wants to handle it
Fred Drake33698f81999-02-16 23:06:32 +0000223completely by itself (e.g.\ by trying something else or pretending
Guido van Rossum5049bcb1995-03-13 16:55:23 +0000224nothing happened).
Guido van Rossum7a2dba21993-11-05 14:45:11 +0000225
Fred Drake54fd8452000-04-03 04:54:28 +0000226Every failing \cfunction{malloc()} call must be turned into an
Fred Draked7bb3031998-03-03 17:52:07 +0000227exception --- the direct caller of \cfunction{malloc()} (or
228\cfunction{realloc()}) must call \cfunction{PyErr_NoMemory()} and
229return a failure indicator itself. All the object-creating functions
Fred Drake54fd8452000-04-03 04:54:28 +0000230(for example, \cfunction{PyInt_FromLong()}) already do this, so this
231note is only relevant to those who call \cfunction{malloc()} directly.
Guido van Rossum7a2dba21993-11-05 14:45:11 +0000232
Guido van Rossum5049bcb1995-03-13 16:55:23 +0000233Also note that, with the important exception of
Fred Drake3da06a61998-02-26 18:49:12 +0000234\cfunction{PyArg_ParseTuple()} and friends, functions that return an
Guido van Rossumb92112d1995-03-20 14:24:09 +0000235integer status usually return a positive value or zero for success and
236\code{-1} for failure, like \UNIX{} system calls.
Guido van Rossum7a2dba21993-11-05 14:45:11 +0000237
Fred Draked7bb3031998-03-03 17:52:07 +0000238Finally, be careful to clean up garbage (by making
239\cfunction{Py_XDECREF()} or \cfunction{Py_DECREF()} calls for objects
240you have already created) when you return an error indicator!
Guido van Rossum6938f061994-08-01 12:22:53 +0000241
242The choice of which exception to raise is entirely yours. There are
Fred Drakeec9fbe91999-02-15 16:20:25 +0000243predeclared C objects corresponding to all built-in Python exceptions,
Fred Drakeabfd7d61999-02-16 17:34:51 +0000244e.g.\ \cdata{PyExc_ZeroDivisionError}, which you can use directly. Of
Guido van Rossumb92112d1995-03-20 14:24:09 +0000245course, you should choose exceptions wisely --- don't use
Fred Draked7bb3031998-03-03 17:52:07 +0000246\cdata{PyExc_TypeError} to mean that a file couldn't be opened (that
247should probably be \cdata{PyExc_IOError}). If something's wrong with
Fred Drake3da06a61998-02-26 18:49:12 +0000248the argument list, the \cfunction{PyArg_ParseTuple()} function usually
Fred Draked7bb3031998-03-03 17:52:07 +0000249raises \cdata{PyExc_TypeError}. If you have an argument whose value
Fred Drakedc12ec81999-03-09 18:36:55 +0000250must be in a particular range or must satisfy other conditions,
Fred Draked7bb3031998-03-03 17:52:07 +0000251\cdata{PyExc_ValueError} is appropriate.
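The built-in \module{math} module illustrates this split between the
two exceptions. This is a sketch from the Python side only: the helper
\code{classify} is hypothetical, and the behaviour of
\function{math.sqrt()} shown here is an observation about current
interpreters rather than something stated above.

```python
import math


def classify(arg):
    """Report which exception, if any, math.sqrt() raises for arg."""
    try:
        math.sqrt(arg)
    except TypeError:
        # The argument has the wrong type altogether.
        return "TypeError"
    except ValueError:
        # The type is right, but the value is outside the allowed range.
        return "ValueError"
    return "ok"


for candidate in ("four", -4.0, 4.0):
    print(repr(candidate), "->", classify(candidate))
```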

You can also define a new exception that is unique to your module.
For this, you usually declare a static object variable at the
beginning of your file, e.g.

\begin{verbatim}
static PyObject *SpamError;
\end{verbatim}

and initialize it in your module's initialization function
(\cfunction{initspam()}) with an exception object, e.g.\ (leaving out
the error checking for now):

\begin{verbatim}
void
initspam()
{
    PyObject *m, *d;

    m = Py_InitModule("spam", SpamMethods);
    d = PyModule_GetDict(m);
    SpamError = PyErr_NewException("spam.error", NULL, NULL);
    PyDict_SetItemString(d, "error", SpamError);
}
\end{verbatim}

Note that the Python name for the exception object is
\exception{spam.error}. The \cfunction{PyErr_NewException()} function
may create either a string or class, depending on whether the
\programopt{-X} flag was passed to the interpreter. If
\programopt{-X} was used, \cdata{SpamError} will be a string object,
otherwise it will be a class object with the base class being
\exception{Exception}, described in the
\citetitle[../lib/lib.html]{Python Library Reference} under ``Built-in
Exceptions.''
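Several standard modules export a module-specific exception in this
way. As an illustration, and assuming a current interpreter (where
such exceptions are always classes), \exception{struct.error} can be
examined from Python; the helper name \code{pack_fails} is
hypothetical.

```python
import struct


def pack_fails(fmt, value):
    """Return True if struct.pack() raises the module's own exception."""
    try:
        struct.pack(fmt, value)
    except struct.error:
        return True
    return False


# The exception is reachable through the module under the name "error".
print(struct.error.__name__)
print(issubclass(struct.error, Exception))
print(pack_fails("i", "not an integer"))
```

A caller can therefore catch failures specific to one module without
also catching unrelated built-in exceptions.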


\section{Back to the Example
         \label{backToExample}}

Going back to our example function, you should now be able to
understand this statement:

\begin{verbatim}
    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
\end{verbatim}

It returns \NULL{} (the error indicator for functions returning
object pointers) if an error is detected in the argument list, relying
on the exception set by \cfunction{PyArg_ParseTuple()}. Otherwise a
pointer to the string value of the argument has been stored in the
local variable \cdata{command}. This is a pointer assignment, not a
copy, and you are not supposed
to modify the string to which it points (so in Standard C, the variable
\cdata{command} should properly be declared as \samp{const char
*command}).

The next statement is a call to the \UNIX{} function
\cfunction{system()}, passing it the string we just got from
\cfunction{PyArg_ParseTuple()}:

\begin{verbatim}
    sts = system(command);
\end{verbatim}

Our \function{spam.system()} function must return the value of
\cdata{sts} as a Python object. This is done using the function
\cfunction{Py_BuildValue()}, which is something like the inverse of
\cfunction{PyArg_ParseTuple()}: it takes a format string and an
arbitrary number of C values, and returns a new Python object.
More info on \cfunction{Py_BuildValue()} is given later.

\begin{verbatim}
    return Py_BuildValue("i", sts);
\end{verbatim}

In this case, it will return an integer object. (Yes, even integers
are objects on the heap in Python!)

If you have a C function that returns no useful value (a function
returning \ctype{void}), the corresponding Python function must return
\code{None}. You need this idiom to do so:

\begin{verbatim}
    Py_INCREF(Py_None);
    return Py_None;
\end{verbatim}

\cdata{Py_None} is the C name for the special Python object
\code{None}. It is a genuine Python object rather than a \NULL{}
pointer, which means ``error'' in most contexts, as we have seen.


\section{The Module's Method Table and Initialization Function
         \label{methodTable}}

I promised to show how \cfunction{spam_system()} is called from Python
programs. First, we need to list its name and address in a ``method
table'':

\begin{verbatim}
static PyMethodDef SpamMethods[] = {
    ...
    {"system", spam_system, METH_VARARGS},
    ...
    {NULL, NULL} /* Sentinel */
};
\end{verbatim}

Note the third field (\samp{METH_VARARGS}). This is a flag telling
the interpreter the calling convention to be used for the C
function. It should normally be \samp{METH_VARARGS} or
\samp{METH_VARARGS | METH_KEYWORDS}; a value of \code{0} means that an
obsolete variant of \cfunction{PyArg_ParseTuple()} is used.

When using only \samp{METH_VARARGS}, the function should expect
the Python-level parameters to be passed in as a tuple acceptable for
parsing via \cfunction{PyArg_ParseTuple()}; more information on this
function is provided below.

The \constant{METH_KEYWORDS} bit may be set in the third field if
keyword arguments should be passed to the function. In this case, the
C function should accept a third \samp{PyObject *} parameter which
will be a dictionary of keywords. Use
\cfunction{PyArg_ParseTupleAndKeywords()} to parse the arguments to
such a function.
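The visible effect of keyword support can be seen with existing
built-ins. This sketch assumes a current interpreter, in which
\function{int()} accepts its \code{base} argument by keyword while
\function{math.sqrt()} accepts positional arguments only. Current
interpreters may use other calling conventions internally, so it only
illustrates the difference a caller sees; the helper name
\code{accepts_keyword} is hypothetical.

```python
import math


def accepts_keyword(func, **kwargs):
    """Return True if func can be called with the given keyword arguments."""
    try:
        func(**kwargs)
    except TypeError:
        return False
    return True


# A C function that parses keyword arguments.
print(int("ff", base=16))

# A positional-only C function rejects the same style of call.
print(accepts_keyword(math.sqrt, x=4.0))
```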

The method table must be passed to the interpreter in the module's
initialization function. The initialization function must be named
\cfunction{init\var{name}()}, where \var{name} is the name of the
module, and should be the only non-\keyword{static} item defined in
the module file:

\begin{verbatim}
void
initspam()
{
    (void) Py_InitModule("spam", SpamMethods);
}
\end{verbatim}

Note that for \Cpp, this function must be declared \code{extern "C"}.

When the Python program imports module \module{spam} for the first
time, \cfunction{initspam()} is called. (See below for comments about
embedding Python.) It calls
\cfunction{Py_InitModule()}, which creates a ``module object'' (which
is inserted in the dictionary \code{sys.modules} under the key
\code{"spam"}), and inserts built-in function objects into the newly
created module based upon the table (an array of \ctype{PyMethodDef}
structures) that was passed as its second argument.
\cfunction{Py_InitModule()} returns a pointer to the module object
that it creates (which is unused here). It aborts with a fatal error
if the module could not be initialized satisfactorily, so the caller
doesn't need to check for errors.
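Both effects are visible from Python once a module written in C has
been imported. The sketch uses the standard \module{time} module as a
stand-in for \module{spam}; that choice is an assumption made so that
the example runs without building anything.

```python
import sys
import time
import types

# The module object is stored in sys.modules under the module's name.
module_object = sys.modules["time"]
print(module_object is time)

# Functions listed in the method table appear as built-in function objects.
print(isinstance(time.time, types.BuiltinFunctionType))
```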
Guido van Rossum7a2dba21993-11-05 14:45:11 +0000407
Fred Drake54fd8452000-04-03 04:54:28 +0000408When embedding Python, the \cfunction{initspam()} function is not
409called automatically unless there's an entry in the
410\cdata{_PyImport_Inittab} table. The easiest way to handle this is to
411statically initialize your statically-linked modules by directly
412calling \cfunction{initspam()} after the call to
413\cfunction{Py_Initialize()} or \cfunction{PyMac_Initialize()}:
414
415\begin{verbatim}
416int main(int argc, char **argv)
417{
418 /* Pass argv[0] to the Python interpreter */
419 Py_SetProgramName(argv[0]);
420
421 /* Initialize the Python interpreter. Required. */
422 Py_Initialize();
423
424 /* Add a static module */
425 initspam();
426\end{verbatim}
427
Fred Drake4dc1a6d2000-10-02 22:38:09 +0000428An example may be found in the file \file{Demo/embed/demo.c} in the
Fred Drake54fd8452000-04-03 04:54:28 +0000429Python source distribution.
430
Fred Drakea48a0831999-06-18 19:17:28 +0000431\strong{Note:} Removing entries from \code{sys.modules} or importing
432compiled modules into multiple interpreters within a process (or
433following a \cfunction{fork()} without an intervening
434\cfunction{exec()}) can create problems for some extension modules.
435Extension module authors should exercise caution when initializing
436internal data structures.
Fred Drake4dc1a6d2000-10-02 22:38:09 +0000437Note also that the \function{reload()} function can be used with
438extension modules, and will call the module initialization function
439(\cfunction{initspam()} in the example), but will not load the module
440again if it was loaded from a dynamically loadable object file
441(\file{.so} on \UNIX, \file{.dll} on Windows).
Fred Drakea48a0831999-06-18 19:17:28 +0000442
Fred Drake54fd8452000-04-03 04:54:28 +0000443A more substantial example module is included in the Python source
444distribution as \file{Modules/xxmodule.c}. This file may be used as a
445template or simply read as an example. The \program{modulator.py}
446script included in the source distribution or Windows install provides
447a simple graphical user interface for declaring the functions and
objects which a module should implement, and can generate a template
which can be filled in. The script lives in the
\file{Tools/modulator/} directory; see the \file{README} file there
for more information.


\section{Compilation and Linkage
         \label{compilation}}

There are two more things to do before you can use your new extension:
compiling and linking it with the Python system. If you use dynamic
loading, the details depend on the style of dynamic loading your
system uses; see the chapters about building extension modules on
\UNIX{} (chapter \ref{building-on-unix}) and Windows (chapter
\ref{building-on-windows}) for more information about this.
% XXX Add information about MacOS

If you can't use dynamic loading, or if you want to make your module a
permanent part of the Python interpreter, you will have to change the
configuration setup and rebuild the interpreter. Luckily, this is
very simple: just place your file (\file{spammodule.c} for example) in
the \file{Modules/} directory of an unpacked source distribution, add
a line to the file \file{Modules/Setup.local} describing your file:

\begin{verbatim}
spam spammodule.o
\end{verbatim}

and rebuild the interpreter by running \program{make} in the toplevel
directory. You can also run \program{make} in the \file{Modules/}
subdirectory, but then you must first rebuild \file{Makefile}
there by running `\program{make} Makefile'. (This is necessary each
time you change the \file{Setup} file.)

If your module requires additional libraries to link with, these can
be listed on the line in the configuration file as well, for instance:

\begin{verbatim}
spam spammodule.o -lX11
\end{verbatim}
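
Compiler and linker search paths can usually be given on the same
line. As a sketch only, assuming X11 headers and libraries installed
under the hypothetical directories \file{/usr/X11R6/include} and
\file{/usr/X11R6/lib} (adjust these to your system), such a line might
read:

\begin{verbatim}
spam spammodule.o -I/usr/X11R6/include -L/usr/X11R6/lib -lX11
\end{verbatim}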

\section{Calling Python Functions from C
         \label{callingPython}}

So far we have concentrated on making C functions callable from
Python. The reverse is also useful: calling Python functions from C.
This is especially the case for libraries that support so-called
``callback'' functions. If a C interface makes use of callbacks, the
equivalent Python interface often needs to provide a callback mechanism
to the Python programmer; the implementation will require calling the
Python callback functions from a C callback. Other uses are also
imaginable.

Fortunately, the Python interpreter is easily called recursively, and
there is a standard interface to call a Python function. (I won't
dwell on how to call the Python parser with a particular string as
input --- if you're interested, have a look at the implementation of
the \programopt{-c} command line option in \file{Modules/main.c}
from the Python source code.)

Calling a Python function is easy. First, the Python program must
somehow pass you the Python function object. You should provide a
function (or some other interface) to do this. When this function is
called, save a pointer to the Python function object (be careful to
\cfunction{Py_INCREF()} it!) in a global variable --- or wherever you
see fit. For example, the following function might be part of a module
definition:

\begin{verbatim}
static PyObject *my_callback = NULL;

static PyObject *
my_set_callback(dummy, args)
    PyObject *dummy, *args;
{
    PyObject *result = NULL;
    PyObject *temp;

    if (PyArg_ParseTuple(args, "O:set_callback", &temp)) {
        if (!PyCallable_Check(temp)) {
            PyErr_SetString(PyExc_TypeError, "parameter must be callable");
            return NULL;
        }
        Py_XINCREF(temp);         /* Add a reference to new callback */
        Py_XDECREF(my_callback);  /* Dispose of previous callback */
        my_callback = temp;       /* Remember new callback */
        /* Boilerplate to return "None" */
        Py_INCREF(Py_None);
        result = Py_None;
    }
    return result;
}
\end{verbatim}

This function must be registered with the interpreter using the
\constant{METH_VARARGS} flag; this is described in section
\ref{methodTable}, ``The Module's Method Table and Initialization
Function.'' The \cfunction{PyArg_ParseTuple()} function and its
arguments are documented in section \ref{parseTuple}, ``Format Strings
for \cfunction{PyArg_ParseTuple()}.''

The macros \cfunction{Py_XINCREF()} and \cfunction{Py_XDECREF()}
increment/decrement the reference count of an object and are safe in
the presence of \NULL{} pointers (but note that \var{temp} will not be
\NULL{} in this context). More information on them can be found in
section \ref{refcounts}, ``Reference Counts.''

Later, when it is time to call the function, you call the C function
\cfunction{PyEval_CallObject()}. This function has two arguments, both
pointers to arbitrary Python objects: the Python function, and the
argument list. The argument list must always be a tuple object, whose
length is the number of arguments. To call the Python function with
no arguments, pass an empty tuple; to call it with one argument, pass
a singleton tuple. \cfunction{Py_BuildValue()} returns a tuple when its
format string consists of zero or more format codes between
parentheses. For example:

\begin{verbatim}
    int arg;
    PyObject *arglist;
    PyObject *result;
    ...
    arg = 123;
    ...
    /* Time to call the callback */
    arglist = Py_BuildValue("(i)", arg);
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
\end{verbatim}

\cfunction{PyEval_CallObject()} returns a Python object pointer: this is
the return value of the Python function. \cfunction{PyEval_CallObject()} is
``reference-count-neutral'' with respect to its arguments. In the
example a new tuple was created to serve as the argument list, which
is \cfunction{Py_DECREF()}-ed immediately after the call.

The return value of \cfunction{PyEval_CallObject()} is ``new'': either it
is a brand new object, or it is an existing object whose reference
count has been incremented. So, unless you want to save it in a
global variable, you should somehow \cfunction{Py_DECREF()} the result,
even (especially!) if you are not interested in its value.

Before you do this, however, it is important to check that the return
value isn't \NULL{}. If it is, the Python function terminated by
raising an exception. If the C code that called
\cfunction{PyEval_CallObject()} is called from Python, it should now
return an error indication to its Python caller, so the interpreter
can print a stack trace, or the calling Python code can handle the
exception. If this is not possible or desirable, the exception should
be cleared by calling \cfunction{PyErr_Clear()}. For example:

\begin{verbatim}
    if (result == NULL)
        return NULL; /* Pass error back */
    ...use result...
    Py_DECREF(result);
\end{verbatim}

Depending on the desired interface to the Python callback function,
you may also have to provide an argument list to
\cfunction{PyEval_CallObject()}. In some cases the argument list is
also provided by the Python program, through the same interface that
specified the callback function. It can then be saved and used in the
same manner as the function object. In other cases, you may have to
construct a new tuple to pass as the argument list. The simplest way
to do this is to call \cfunction{Py_BuildValue()}. For example, if
you want to pass an integral event code, you might use the following
code:

\begin{verbatim}
    PyObject *arglist;
    ...
    arglist = Py_BuildValue("(l)", eventcode);
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
    if (result == NULL)
        return NULL; /* Pass error back */
    /* Here maybe use the result */
    Py_DECREF(result);
\end{verbatim}

Note the placement of \samp{Py_DECREF(arglist)} immediately after the
call, before the error check! Also note that strictly speaking this
code is not complete: \cfunction{Py_BuildValue()} may run out of
memory, and this should be checked.
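
A version of the call that also checks the result of
\cfunction{Py_BuildValue()} might look like the following sketch. The
helper name \cfunction{call_callback()}, its \var{eventcode} parameter
and the choice of \exception{RuntimeError} for a missing callback are
hypothetical; \code{my_callback} is the global variable from the
earlier example.

\begin{verbatim}
/* Hypothetical helper: call the saved callback with one long argument.
   Returns a new reference, or NULL with an exception set. */
static PyObject *
call_callback(long eventcode)
{
    PyObject *arglist;
    PyObject *result;

    if (my_callback == NULL) {
        PyErr_SetString(PyExc_RuntimeError, "no callback has been set");
        return NULL;
    }
    arglist = Py_BuildValue("(l)", eventcode);
    if (arglist == NULL)
        return NULL;                /* Out of memory; exception is set */
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);             /* Done with the argument tuple */
    return result;                  /* NULL if the callback raised */
}
\end{verbatim}

The caller owns the returned reference and should
\cfunction{Py_DECREF()} it once the value is no longer needed.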


\section{Format Strings for \cfunction{PyArg_ParseTuple()}
         \label{parseTuple}}

The \cfunction{PyArg_ParseTuple()} function is declared as follows:

\begin{verbatim}
int PyArg_ParseTuple(PyObject *arg, char *format, ...);
\end{verbatim}

The \var{arg} argument must be a tuple object containing an argument
list passed from Python to a C function. The \var{format} argument
must be a format string, whose syntax is explained below. The
remaining arguments must be addresses of variables whose type is
determined by the format string. For the conversion to succeed, the
\var{arg} object must match the format and the format must be
exhausted.

Note that while \cfunction{PyArg_ParseTuple()} checks that the Python
arguments have the required types, it cannot check the validity of the
addresses of C variables passed to the call: if you make mistakes
there, your code will probably crash or at least overwrite random bits
in memory. So be careful!

A format string consists of zero or more ``format units''. A format
unit describes one Python object; it is usually a single character or
a parenthesized sequence of format units. With a few exceptions, a
format unit that is not a parenthesized sequence normally corresponds
to a single address argument to \cfunction{PyArg_ParseTuple()}. In the
following description, the quoted form is the format unit; the entry
in (round) parentheses is the Python object type that matches the
format unit; and the entry in [square] brackets is the type of the C
variable(s) whose address should be passed. (Use the \samp{\&}
operator to pass a variable's address.)

Note that any Python object references which are provided to the
caller are \emph{borrowed} references; do not decrement their
reference count!

\begin{description}

\item[\samp{s} (string or Unicode object) {[char *]}]
Convert a Python string or Unicode object to a C pointer to a
character string. You must not provide storage for the string
itself; a pointer to an existing string is stored into the character
pointer variable whose address you pass. The C string is
null-terminated. The Python string must not contain embedded null
bytes; if it does, a \exception{TypeError} exception is raised.
Unicode objects are converted to C strings using the default
encoding. If this conversion fails, an \exception{UnicodeError} is
raised.

\item[\samp{s\#} (string, Unicode or any read buffer compatible object)
{[char *, int]}]
This variant on \samp{s} stores into two C variables, the first one a
pointer to a character string, the second one its length. In this
case the Python string may contain embedded null bytes. Unicode
objects pass back a pointer to the default encoded string version of the
object if such a conversion is possible. All other read buffer
compatible objects pass back a reference to the raw internal data
representation.

\item[\samp{z} (string or \code{None}) {[char *]}]
Like \samp{s}, but the Python object may also be \code{None}, in which
case the C pointer is set to \NULL{}.

\item[\samp{z\#} (string or \code{None} or any read buffer compatible object)
{[char *, int]}]
This is to \samp{s\#} as \samp{z} is to \samp{s}.

\item[\samp{u} (Unicode object) {[Py_UNICODE *]}]
Convert a Python Unicode object to a C pointer to a null-terminated
buffer of 16-bit Unicode (UTF-16) data. As with \samp{s}, there is no need
to provide storage for the Unicode data buffer; a pointer to the
existing Unicode data is stored into the \ctype{Py_UNICODE} pointer
variable whose address you pass.

\item[\samp{u\#} (Unicode object) {[Py_UNICODE *, int]}]
This variant on \samp{u} stores into two C variables, the first one
a pointer to a Unicode data buffer, the second one its length.

\item[\samp{es} (string, Unicode object or character buffer compatible
object) {[const char *encoding, char **buffer]}]
This variant on \samp{s} is used for encoding Unicode and objects
convertible to Unicode into a character buffer. It only works for
encoded data without embedded null bytes.

This variant takes two C arguments. The first is read as input only
and is a pointer to an encoding name string (\var{encoding}); the
second is a pointer to a pointer to a character buffer (\var{**buffer},
the buffer used for storing the encoded data).

The encoding name must map to a registered codec. If set to \NULL{},
the default encoding is used.

\cfunction{PyArg_ParseTuple()} will allocate a buffer of the needed
size using \cfunction{PyMem_NEW()}, copy the encoded data into this
buffer and adjust \var{*buffer} to reference the newly allocated
storage. The caller is responsible for calling
\cfunction{PyMem_Free()} to free the allocated buffer after usage.

\item[\samp{es\#} (string, Unicode object or character buffer compatible
object) {[const char *encoding, char **buffer, int *buffer_length]}]
This variant on \samp{s\#} is used for encoding Unicode and objects
convertible to Unicode into a character buffer. It takes three C
arguments: the first is read as input only and is a pointer to
an encoding name string (\var{encoding}), the second is a pointer to a
pointer to a character buffer (\var{**buffer}, the buffer used for
storing the encoded data) and the third is a pointer to an integer
(\var{*buffer_length}, the buffer length).

The encoding name must map to a registered codec. If set to \NULL{},
the default encoding is used.

There are two modes of operation:

If \var{*buffer} points to a \NULL{} pointer,
\cfunction{PyArg_ParseTuple()} will allocate a buffer of the needed
size using \cfunction{PyMem_NEW()}, copy the encoded data into this
buffer and adjust \var{*buffer} to reference the newly allocated
storage. The caller is responsible for calling
\cfunction{PyMem_Free()} to free the allocated buffer after usage.

If \var{*buffer} points to a non-\NULL{} pointer (an already allocated
buffer), \cfunction{PyArg_ParseTuple()} will use this location as
the buffer and interpret \var{*buffer_length} as the buffer size. It
will then copy the encoded data into the buffer and 0-terminate it.
Buffer overflow is signalled with an exception.

In both cases, \var{*buffer_length} is set to the length of the
encoded data without the trailing 0-byte.

\item[\samp{b} (integer) {[char]}]
Convert a Python integer to a tiny int, stored in a C \ctype{char}.

\item[\samp{h} (integer) {[short int]}]
Convert a Python integer to a C \ctype{short int}.

\item[\samp{i} (integer) {[int]}]
Convert a Python integer to a plain C \ctype{int}.

\item[\samp{l} (integer) {[long int]}]
Convert a Python integer to a C \ctype{long int}.

\item[\samp{c} (string of length 1) {[char]}]
Convert a Python character, represented as a string of length 1, to a
C \ctype{char}.

\item[\samp{f} (float) {[float]}]
Convert a Python floating point number to a C \ctype{float}.

\item[\samp{d} (float) {[double]}]
Convert a Python floating point number to a C \ctype{double}.

\item[\samp{D} (complex) {[Py_complex]}]
Convert a Python complex number to a C \ctype{Py_complex} structure.

\item[\samp{O} (object) {[PyObject *]}]
Store a Python object (without any conversion) in a C object pointer.
The C program thus receives the actual object that was passed. The
object's reference count is not increased. The pointer stored is not
\NULL{}.

\item[\samp{O!} (object) {[\var{typeobject}, PyObject *]}]
Store a Python object in a C object pointer. This is similar to
\samp{O}, but takes two C arguments: the first is the address of a
Python type object, the second is the address of the C variable (of
type \ctype{PyObject *}) into which the object pointer is stored.
If the Python object does not have the required type,
\exception{TypeError} is raised.

\item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
Convert a Python object to a C variable through a \var{converter}
function. This takes two arguments: the first is a function, the
second is the address of a C variable (of arbitrary type), converted
to \ctype{void *}. The \var{converter} function in turn is called as
follows:

\var{status}\code{ = }\var{converter}\code{(}\var{object}, \var{address}\code{);}

where \var{object} is the Python object to be converted and
\var{address} is the \ctype{void *} argument that was passed to
\cfunction{PyArg_ParseTuple()}. The returned \var{status} should be
\code{1} for a successful conversion and \code{0} if the conversion
has failed. When the conversion fails, the \var{converter} function
should raise an exception.

\item[\samp{S} (string) {[PyStringObject *]}]
Like \samp{O} but requires that the Python object is a string object.
Raises \exception{TypeError} if the object is not a string object.
The C variable may also be declared as \ctype{PyObject *}.

\item[\samp{U} (Unicode string) {[PyUnicodeObject *]}]
Like \samp{O} but requires that the Python object is a Unicode object.
Raises \exception{TypeError} if the object is not a Unicode object.
The C variable may also be declared as \ctype{PyObject *}.

\item[\samp{t\#} (read-only character buffer) {[char *, int]}]
Like \samp{s\#}, but accepts any object which implements the read-only
buffer interface. The \ctype{char *} variable is set to point to the
first byte of the buffer, and the \ctype{int} is set to the length of
the buffer. Only single-segment buffer objects are accepted;
\exception{TypeError} is raised for all others.

\item[\samp{w} (read-write character buffer) {[char *]}]
Similar to \samp{s}, but accepts any object which implements the
read-write buffer interface. The caller must determine the length of
the buffer by other means, or use \samp{w\#} instead. Only
single-segment buffer objects are accepted; \exception{TypeError} is
raised for all others.

\item[\samp{w\#} (read-write character buffer) {[char *, int]}]
Like \samp{s\#}, but accepts any object which implements the
read-write buffer interface. The \ctype{char *} variable is set to
point to the first byte of the buffer, and the \ctype{int} is set to
the length of the buffer. Only single-segment buffer objects are
accepted; \exception{TypeError} is raised for all others.

\item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
The object must be a Python sequence whose length is the number of
format units in \var{items}. The C arguments must correspond to the
individual format units in \var{items}. Format units for sequences
may be nested.

\strong{Note:} Prior to Python version 1.5.2, this format specifier
only accepted a tuple containing the individual parameters, not an
arbitrary sequence. Code which previously caused
\exception{TypeError} to be raised here may now proceed without an
exception. This is not expected to be a problem for existing code.

\end{description}
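
As an illustration of the \samp{O\&} unit, the following sketch
converts an argument that may be \code{None} or a number into a C
\ctype{double}. The names \cfunction{convert_timeout()} and
\cfunction{spam_wait()}, and the use of \code{-1.0} as a ``no timeout''
marker, are hypothetical and only serve the example.

\begin{verbatim}
#include "Python.h"

/* Converter for "O&": returns 1 on success, 0 (with an exception set)
   on failure.  The address argument is the pointer that was passed to
   PyArg_ParseTuple() after the converter. */
static int
convert_timeout(PyObject *object, void *address)
{
    double *timeout = (double *) address;
    double value;

    if (object == Py_None) {
        *timeout = -1.0;            /* Hypothetical "no timeout" marker */
        return 1;
    }
    value = PyFloat_AsDouble(object);
    if (value == -1.0 && PyErr_Occurred())
        return 0;                   /* Not a number; exception already set */
    if (value < 0.0) {
        PyErr_SetString(PyExc_ValueError, "timeout must be non-negative");
        return 0;
    }
    *timeout = value;
    return 1;
}

static PyObject *
spam_wait(PyObject *self, PyObject *args)
{
    double timeout;

    if (!PyArg_ParseTuple(args, "O&:wait", convert_timeout, &timeout))
        return NULL;
    /* ... wait for up to timeout seconds here ... */
    Py_INCREF(Py_None);
    return Py_None;
}
\end{verbatim}

A converter is a convenient place for checks that the plain format
units do not perform, such as the sign check above.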

It is possible to pass Python long integers where integers are
requested; however no proper range checking is done --- the most
significant bits are silently truncated when the receiving field is
too small to receive the value (actually, the semantics are inherited
from downcasts in C --- your mileage may vary).

A few other characters have a meaning in a format string. These may
not occur inside nested parentheses. They are:

\begin{description}

\item[\samp{|}]
Indicates that the remaining arguments in the Python argument list are
optional. The C variables corresponding to optional arguments should
be initialized to their default value --- when an optional argument is
not specified, \cfunction{PyArg_ParseTuple()} does not touch the contents
of the corresponding C variable(s).

\item[\samp{:}]
The list of format units ends here; the string after the colon is used
as the function name in error messages (the ``associated value'' of
the exception that \cfunction{PyArg_ParseTuple()} raises).

\item[\samp{;}]
The list of format units ends here; the string after the semicolon is
used as the error message \emph{instead} of the default error message.
Clearly, \samp{:} and \samp{;} are mutually exclusive.

\end{description}

Some example calls:

\begin{verbatim}
    int ok;
    int i, j;
    long k, l;
    char *s;
    int size;

    ok = PyArg_ParseTuple(args, ""); /* No arguments */
        /* Python call: f() */
\end{verbatim}

\begin{verbatim}
    ok = PyArg_ParseTuple(args, "s", &s); /* A string */
        /* Possible Python call: f('whoops!') */
\end{verbatim}

\begin{verbatim}
    ok = PyArg_ParseTuple(args, "lls", &k, &l, &s); /* Two longs and a string */
        /* Possible Python call: f(1, 2, 'three') */
\end{verbatim}

\begin{verbatim}
    ok = PyArg_ParseTuple(args, "(ii)s#", &i, &j, &s, &size);
        /* A pair of ints and a string, whose size is also returned */
        /* Possible Python call: f((1, 2), 'three') */
\end{verbatim}

\begin{verbatim}
    {
        char *file;
        char *mode = "r";
        int bufsize = 0;
        ok = PyArg_ParseTuple(args, "s|si", &file, &mode, &bufsize);
        /* A string, and optionally another string and an integer */
        /* Possible Python calls:
           f('spam')
           f('spam', 'w')
           f('spam', 'wb', 100000) */
    }
\end{verbatim}

\begin{verbatim}
    {
        int left, top, right, bottom, h, v;
        ok = PyArg_ParseTuple(args, "((ii)(ii))(ii)",
                 &left, &top, &right, &bottom, &h, &v);
        /* A rectangle and a point */
        /* Possible Python call:
           f(((0, 0), (400, 300)), (10, 10)) */
    }
\end{verbatim}

\begin{verbatim}
    {
        Py_complex c;
        ok = PyArg_ParseTuple(args, "D:myfunction", &c);
        /* a complex, also providing a function name for errors */
        /* Possible Python call: myfunction(1+2j) */
    }
\end{verbatim}
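
The silent truncation of out-of-range integers mentioned earlier in
this section comes from ordinary C narrowing conversions, so an
extension that needs a small integer may want to parse with \samp{l}
and check the range itself. The stand-alone sketch below uses no
Python headers; the helper names \cfunction{fits_in_uchar()} and
\cfunction{narrow_to_uchar()} are hypothetical.

\begin{verbatim}
#include <limits.h>

/* Return 1 if value can be stored in an unsigned char without loss. */
static int
fits_in_uchar(long value)
{
    return value >= 0 && value <= UCHAR_MAX;
}

/* Narrow without checking: the result is value reduced modulo
   UCHAR_MAX + 1, so the most significant bits are dropped. */
static unsigned char
narrow_to_uchar(long value)
{
    return (unsigned char) value;
}
\end{verbatim}

With 8-bit characters, \code{narrow_to_uchar(257)} yields \code{1},
while \code{fits_in_uchar(257)} is false, so the caller can raise an
exception instead of continuing with a truncated value.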


\section{Keyword Parsing with \cfunction{PyArg_ParseTupleAndKeywords()}
         \label{parseTupleAndKeywords}}

The \cfunction{PyArg_ParseTupleAndKeywords()} function is declared as
follows:

\begin{verbatim}
int PyArg_ParseTupleAndKeywords(PyObject *arg, PyObject *kwdict,
                                char *format, char **kwlist, ...);
\end{verbatim}

The \var{arg} and \var{format} parameters are identical to those of the
\cfunction{PyArg_ParseTuple()} function. The \var{kwdict} parameter
is the dictionary of keywords received as the third parameter from the
Python runtime. The \var{kwlist} parameter is a \NULL{}-terminated
list of strings which identify the parameters; the names are matched
with the type information from \var{format} from left to right.

\strong{Note:} Nested tuples cannot be parsed when using keyword
arguments! Keyword parameters passed in which are not present in the
\var{kwlist} will cause \exception{TypeError} to be raised.

Here is an example module which uses keywords, based on an example by
Geoff Philbrick (\email{philbrick@hks.com}):%
\index{Philbrick, Geoff}

\begin{verbatim}
#include <stdio.h>
#include "Python.h"

static PyObject *
keywdarg_parrot(PyObject *self, PyObject *args, PyObject *keywds)
{
    int voltage;
    char *state = "a stiff";
    char *action = "voom";
    char *type = "Norwegian Blue";

    static char *kwlist[] = {"voltage", "state", "action", "type", NULL};

    if (!PyArg_ParseTupleAndKeywords(args, keywds, "i|sss", kwlist,
                                     &voltage, &state, &action, &type))
        return NULL;

    printf("-- This parrot wouldn't %s if you put %i Volts through it.\n",
           action, voltage);
    printf("-- Lovely plumage, the %s -- It's %s!\n", type, state);

    Py_INCREF(Py_None);

    return Py_None;
}

static PyMethodDef keywdarg_methods[] = {
    /* The cast of the function is necessary since PyCFunction values
     * only take two PyObject* parameters, and keywdarg_parrot() takes
     * three.
     */
    {"parrot", (PyCFunction)keywdarg_parrot, METH_VARARGS|METH_KEYWORDS},
    {NULL, NULL}   /* sentinel */
};

void
initkeywdarg(void)
{
    /* Create the module and add the functions */
    Py_InitModule("keywdarg", keywdarg_methods);
}
\end{verbatim}


\section{The \cfunction{Py_BuildValue()} Function
         \label{buildValue}}

This function is the counterpart to \cfunction{PyArg_ParseTuple()}. It is
declared as follows:

\begin{verbatim}
PyObject *Py_BuildValue(char *format, ...);
\end{verbatim}

It recognizes a set of format units similar to the ones recognized by
\cfunction{PyArg_ParseTuple()}, but the arguments (which are input to the
function, not output) must not be pointers, just values. It returns a
new Python object, suitable for returning from a C function called
from Python.

One difference from \cfunction{PyArg_ParseTuple()}: while the latter
requires its first argument to be a tuple (since Python argument lists
are always represented as tuples internally),
\cfunction{Py_BuildValue()} does not always build a tuple. It builds
a tuple only if its format string contains two or more format units.
If the format string is empty, it returns \code{None}; if it contains
exactly one format unit, it returns whatever object is described by
that format unit. To force it to return a tuple of size 0 or 1,
parenthesize the format string.

When memory buffers are passed as parameters to supply data to build
objects, as for the \samp{s} and \samp{s\#} formats, the required data
is copied. Buffers provided by the caller are never referenced by the
objects created by \cfunction{Py_BuildValue()}. In other words, if
your code invokes \cfunction{malloc()} and passes the allocated memory
to \cfunction{Py_BuildValue()}, your code is responsible for
calling \cfunction{free()} for that memory once
\cfunction{Py_BuildValue()} returns.

In the following description, the quoted form is the format unit; the
entry in (round) parentheses is the Python object type that the format
unit will return; and the entry in [square] brackets is the type of
the C value(s) to be passed.

The characters space, tab, colon and comma are ignored in format
strings (but not within format units such as \samp{s\#}). This can be
used to make long format strings a tad more readable.

\begin{description}

\item[\samp{s} (string) {[char *]}]
Convert a null-terminated C string to a Python object. If the C
string pointer is \NULL{}, \code{None} is returned.

\item[\samp{s\#} (string) {[char *, int]}]
Convert a C string and its length to a Python object. If the C string
pointer is \NULL{}, the length is ignored and \code{None} is
returned.

\item[\samp{z} (string or \code{None}) {[char *]}]
Same as \samp{s}.

\item[\samp{z\#} (string or \code{None}) {[char *, int]}]
Same as \samp{s\#}.

\item[\samp{u} (Unicode string) {[Py_UNICODE *]}]
Convert a null-terminated buffer of Unicode (UCS-2) data to a Python
Unicode object. If the Unicode buffer pointer is \NULL{},
\code{None} is returned.

\item[\samp{u\#} (Unicode string) {[Py_UNICODE *, int]}]
Convert a Unicode (UCS-2) data buffer and its length to a Python
Unicode object. If the Unicode buffer pointer is \NULL{}, the length
is ignored and \code{None} is returned.

\item[\samp{i} (integer) {[int]}]
Convert a plain C \ctype{int} to a Python integer object.

\item[\samp{b} (integer) {[char]}]
Same as \samp{i}.

\item[\samp{h} (integer) {[short int]}]
Same as \samp{i}.

\item[\samp{l} (integer) {[long int]}]
Convert a C \ctype{long int} to a Python integer object.

\item[\samp{c} (string of length 1) {[char]}]
Convert a C \ctype{int} representing a character to a Python string of
length 1.

\item[\samp{d} (float) {[double]}]
Convert a C \ctype{double} to a Python floating point number.

\item[\samp{f} (float) {[float]}]
Same as \samp{d}.

\item[\samp{O} (object) {[PyObject *]}]
Pass a Python object untouched (except for its reference count, which
is incremented by one). If the object passed in is a \NULL{}
pointer, it is assumed that the call producing the argument found an
error and set an exception. Therefore,
\cfunction{Py_BuildValue()} will return \NULL{} but won't raise an
exception. If no exception has been raised yet,
\cdata{PyExc_SystemError} is set.

\item[\samp{S} (object) {[PyObject *]}]
Same as \samp{O}.

\item[\samp{U} (object) {[PyObject *]}]
Same as \samp{O}.

\item[\samp{N} (object) {[PyObject *]}]
Same as \samp{O}, except it doesn't increment the reference count on
the object. Useful when the object is created by a call to an object
constructor in the argument list.

\item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
Convert \var{anything} to a Python object through a \var{converter}
function. The function is called with \var{anything} (which should be
compatible with \ctype{void *}) as its argument and should return a
``new'' Python object, or \NULL{} if an error occurred.

\item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
Convert a sequence of C values to a Python tuple with the same number
of items.

\item[\samp{[\var{items}]} (list) {[\var{matching-items}]}]
Convert a sequence of C values to a Python list with the same number
of items.

\item[\samp{\{\var{items}\}} (dictionary) {[\var{matching-items}]}]
Convert a sequence of C values to a Python dictionary. Each pair of
consecutive C values adds one item to the dictionary, serving as key
and value, respectively.

\end{description}

If there is an error in the format string, the
\cdata{PyExc_SystemError} exception is raised and \NULL{} returned.

Examples (to the left the call, to the right the resulting Python value):

\begin{verbatim}
    Py_BuildValue("")                        None
    Py_BuildValue("i", 123)                  123
    Py_BuildValue("iii", 123, 456, 789)      (123, 456, 789)
    Py_BuildValue("s", "hello")              'hello'
    Py_BuildValue("ss", "hello", "world")    ('hello', 'world')
    Py_BuildValue("s#", "hello", 4)          'hell'
    Py_BuildValue("()")                      ()
    Py_BuildValue("(i)", 123)                (123,)
    Py_BuildValue("(ii)", 123, 456)          (123, 456)
    Py_BuildValue("(i,i)", 123, 456)         (123, 456)
    Py_BuildValue("[i,i]", 123, 456)         [123, 456]
    Py_BuildValue("{s:i,s:i}",
                  "abc", 123, "def", 456)    {'abc': 123, 'def': 456}
    Py_BuildValue("((ii)(ii)) (ii)",
                  1, 2, 3, 4, 5, 6)          (((1, 2), (3, 4)), (5, 6))
\end{verbatim}


\section{Reference Counts
         \label{refcounts}}

In languages like C or \Cpp{}, the programmer is responsible for
dynamic allocation and deallocation of memory on the heap. In C,
this is done using the functions \cfunction{malloc()} and
\cfunction{free()}. In \Cpp{}, the operators \keyword{new} and
\keyword{delete} are used with essentially the same meaning; they are
commonly implemented using \cfunction{malloc()} and
\cfunction{free()}, so we'll restrict the following discussion to the
latter.

Every block of memory allocated with \cfunction{malloc()} should
eventually be returned to the pool of available memory by exactly one
call to \cfunction{free()}. It is important to call
\cfunction{free()} at the right time. If a block's address is
forgotten but \cfunction{free()} is not called for it, the memory it
occupies cannot be reused until the program terminates. This is
called a \dfn{memory leak}. On the other hand, if a program calls
\cfunction{free()} for a block and then continues to use the block, it
creates a conflict with re-use of the block through another
\cfunction{malloc()} call. This is called \dfn{using freed memory}.
It has the same bad consequences as referencing uninitialized data ---
core dumps, wrong results, mysterious crashes.

Common causes of memory leaks are unusual paths through the code. For
instance, a function may allocate a block of memory, do some
calculation, and then free the block again. Now a change in the
requirements for the function may add a test to the calculation that
detects an error condition and can return prematurely from the
function. It's easy to forget to free the allocated memory block when
taking this premature exit, especially when it is added later to the
code. Such leaks, once introduced, often go undetected for a long
time: the error exit is taken only in a small fraction of all calls,
and most modern machines have plenty of virtual memory, so the leak
only becomes apparent in a long-running process that uses the leaking
function frequently. Therefore, it's important to prevent leaks from
happening by having a coding convention or strategy that minimizes
this kind of error.
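
One possible convention of this kind, sketched in plain C, is to route
every exit taken after the allocation through a single cleanup label, so
that an error test added later cannot bypass \cfunction{free()}. The
function \cfunction{digit_sum()} is a hypothetical example invented for
this illustration; it is not part of Python.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: sum the decimal digits in `text`.
 * Returns the sum, or -1 if `text` contains a non-digit or if
 * memory cannot be allocated.  The scratch copy exists only to
 * have a malloc()'ed block whose lifetime must be managed. */
int
digit_sum(const char *text)
{
    int result = -1;
    size_t i, n = strlen(text);
    char *scratch = malloc(n + 1);

    if (scratch == NULL)
        return -1;                   /* nothing allocated yet */
    memcpy(scratch, text, n + 1);

    result = 0;
    for (i = 0; i < n; i++) {
        if (scratch[i] < '0' || scratch[i] > '9') {
            result = -1;
            goto done;               /* the error exit still frees */
        }
        result += scratch[i] - '0';
    }

done:
    free(scratch);                   /* the only free() for scratch */
    return result;
}
```

With this shape, \code{digit_sum("123")} and \code{digit_sum("12x")}
both release the scratch block exactly once, whichever exit is taken.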

Since Python makes heavy use of \cfunction{malloc()} and
\cfunction{free()}, it needs a strategy to avoid memory leaks as well
as the use of freed memory. The chosen method is called
\dfn{reference counting}. The principle is simple: every object
contains a counter, which is incremented when a reference to the
object is stored somewhere, and which is decremented when a reference
to it is deleted. When the counter reaches zero, the last reference
to the object has been deleted and the object is freed.

An alternative strategy is called \dfn{automatic garbage collection}.
(Sometimes, reference counting is also referred to as a garbage
collection strategy, hence my use of ``automatic'' to distinguish the
two.) The big advantage of automatic garbage collection is that the
user doesn't need to call \cfunction{free()} explicitly. (Another claimed
advantage is an improvement in speed or memory usage --- this is no
hard fact however.) The disadvantage is that for C, there is no
truly portable automatic garbage collector, while reference counting
can be implemented portably (as long as the functions \cfunction{malloc()}
and \cfunction{free()} are available --- which the C Standard guarantees).
Maybe some day a sufficiently portable automatic garbage collector
will be available for C. Until then, we'll have to live with
reference counts.

\subsection{Reference Counting in Python
            \label{refcountsInPython}}

There are two macros, \code{Py_INCREF(x)} and \code{Py_DECREF(x)},
which handle the incrementing and decrementing of the reference count.
\cfunction{Py_DECREF()} also frees the object when the count reaches zero.
For flexibility, it doesn't call \cfunction{free()} directly --- rather, it
makes a call through a function pointer in the object's \dfn{type
object}. For this purpose (and others), every object also contains a
pointer to its type object.
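
The mechanism can be illustrated with a toy model in plain C. The names
\ctype{toy_object}, \ctype{toy_type}, \code{TOY_INCREF} and
\code{TOY_DECREF} are invented for this sketch. It is not how the
interpreter defines its macros, only a minimal instance of counting
references and freeing through a function pointer held in a type
object.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: hypothetical names, not the Python API. */
struct toy_object;

typedef struct toy_type {
    const char *name;
    void (*dealloc)(struct toy_object *);   /* how to free instances */
} toy_type;

typedef struct toy_object {
    int refcnt;                 /* number of owned references */
    toy_type *type;             /* pointer to the type object */
    long value;
} toy_object;

int toy_live = 0;               /* objects currently allocated */

void
toy_int_dealloc(toy_object *op)
{
    toy_live--;
    free(op);
}

toy_type toy_int_type = {"toy_int", toy_int_dealloc};

#define TOY_INCREF(op) ((op)->refcnt++)
#define TOY_DECREF(op) \
    do { \
        if (--(op)->refcnt == 0) \
            (op)->type->dealloc(op);   /* indirect call, not free() */ \
    } while (0)

/* Returns a new object; the caller owns the single reference. */
toy_object *
toy_int_new(long value)
{
    toy_object *op = malloc(sizeof(toy_object));

    if (op == NULL)
        return NULL;
    op->refcnt = 1;
    op->type = &toy_int_type;
    op->value = value;
    toy_live++;
    return op;
}
```

Because deallocation goes through \code{type->dealloc}, a different
toy type could release extra resources without any change to
\code{TOY_DECREF}.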

The big question now remains: when to use \code{Py_INCREF(x)} and
\code{Py_DECREF(x)}? Let's first introduce some terms. Nobody
``owns'' an object; however, you can \dfn{own a reference} to an
object. An object's reference count is now defined as the number of
owned references to it. The owner of a reference is responsible for
calling \cfunction{Py_DECREF()} when the reference is no longer
needed. Ownership of a reference can be transferred. There are three
ways to dispose of an owned reference: pass it on, store it, or call
\cfunction{Py_DECREF()}. Forgetting to dispose of an owned reference
creates a memory leak.

It is also possible to \dfn{borrow}\footnote{The metaphor of
``borrowing'' a reference is not completely correct: the owner still
has a copy of the reference.} a reference to an object. The borrower
of a reference should not call \cfunction{Py_DECREF()}. The borrower must
not hold on to the object longer than the owner from which it was
borrowed. Using a borrowed reference after the owner has disposed of
it risks using freed memory and should be avoided
completely.\footnote{Checking that the reference count is at least 1
\strong{does not work} --- the reference count itself could be in
freed memory and may thus be reused for another object!}

The advantage of borrowing over owning a reference is that you don't
need to take care of disposing of the reference on all possible paths
through the code --- in other words, with a borrowed reference you
don't run the risk of leaking when a premature exit is taken. The
disadvantage of borrowing over owning is that there are some subtle
situations where in seemingly correct code a borrowed reference can be
used after the owner from which it was borrowed has in fact disposed
of it.

A borrowed reference can be changed into an owned reference by calling
\cfunction{Py_INCREF()}. This does not affect the status of the owner from
which the reference was borrowed --- it creates a new owned reference,
and gives full owner responsibilities (i.e., the new owner must
dispose of the reference properly, as well as the previous owner).


\subsection{Ownership Rules
            \label{ownershipRules}}

Whenever an object reference is passed into or out of a function, it
is part of the function's interface specification whether ownership is
transferred with the reference or not.

Most functions that return a reference to an object pass on ownership
with the reference. In particular, all functions whose purpose is
to create a new object, e.g.\ \cfunction{PyInt_FromLong()} and
\cfunction{Py_BuildValue()}, pass ownership to the receiver. Even if in
fact, in some cases, you don't receive a reference to a brand new
object, you still receive ownership of the reference. For instance,
\cfunction{PyInt_FromLong()} maintains a cache of popular values and can
return a reference to a cached item.

Many functions that extract objects from other objects also transfer
ownership with the reference, for instance
\cfunction{PyObject_GetAttrString()}. The picture is less clear, here,
however, since a few common routines are exceptions:
\cfunction{PyTuple_GetItem()}, \cfunction{PyList_GetItem()},
\cfunction{PyDict_GetItem()}, and \cfunction{PyDict_GetItemString()}
all return references that you borrow from the tuple, list or
dictionary.

The function \cfunction{PyImport_AddModule()} also returns a borrowed
reference, even though it may actually create the object it returns:
this is possible because an owned reference to the object is stored in
\code{sys.modules}.

When you pass an object reference into another function, in general,
the function borrows the reference from you --- if it needs to store
it, it will use \cfunction{Py_INCREF()} to become an independent
owner. There are exactly two important exceptions to this rule:
\cfunction{PyTuple_SetItem()} and \cfunction{PyList_SetItem()}. These
functions take over ownership of the item passed to them --- even if
they fail! (Note that \cfunction{PyDict_SetItem()} and friends don't
take over ownership --- they are ``normal.'')
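
The difference between a function that takes over ownership and one
that hands out a borrowed reference can be sketched with a toy
container in plain C. All names here (\ctype{toy_obj}, \ctype{toy_list},
\cfunction{toy_list_set_item()}, \cfunction{toy_list_get_item()}) are
hypothetical and only imitate the behaviour described above; they are
not the interpreter's implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: hypothetical names, not the Python API. */
typedef struct {
    int refcnt;
    long value;
} toy_obj;

int toy_live = 0;                    /* objects currently allocated */

toy_obj *
toy_new(long value)                  /* returns an owned reference */
{
    toy_obj *op = malloc(sizeof(toy_obj));

    if (op == NULL)
        return NULL;
    op->refcnt = 1;
    op->value = value;
    toy_live++;
    return op;
}

void
toy_incref(toy_obj *op)
{
    op->refcnt++;
}

void
toy_decref(toy_obj *op)
{
    if (--op->refcnt == 0) {
        toy_live--;
        free(op);
    }
}

#define TOY_SLOTS 4

typedef struct {
    toy_obj *items[TOY_SLOTS];       /* the list owns these references */
} toy_list;

/* Takes over the caller's reference to item, even on failure. */
int
toy_list_set_item(toy_list *list, int i, toy_obj *item)
{
    toy_obj *old;

    if (i < 0 || i >= TOY_SLOTS) {
        toy_decref(item);            /* ownership was taken, so dispose */
        return -1;
    }
    old = list->items[i];
    list->items[i] = item;           /* store first, then drop the old item */
    if (old != NULL)
        toy_decref(old);
    return 0;
}

/* Returns a borrowed reference: the count is not changed. */
toy_obj *
toy_list_get_item(toy_list *list, int i)
{
    return list->items[i];
}
```

A caller that passes \code{toy_new(8)} straight to
\cfunction{toy_list_set_item()} has nothing left to dispose of, while a
caller that wants to keep the result of
\cfunction{toy_list_get_item()} across a later replacement has to call
\cfunction{toy_incref()} first.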

When a C function is called from Python, it borrows references to its
arguments from the caller. The caller owns a reference to the object,
so the borrowed reference's lifetime is guaranteed until the function
returns. Only when such a borrowed reference must be stored or passed
on must it be turned into an owned reference by calling
\cfunction{Py_INCREF()}.

The object reference returned from a C function that is called from
Python must be an owned reference --- ownership is transferred from the
function to its caller.


\subsection{Thin Ice
            \label{thinIce}}

There are a few situations where seemingly harmless use of a borrowed
reference can lead to problems. These all have to do with implicit
invocations of the interpreter, which can cause the owner of a
reference to dispose of it.

The first and most important case to know about is using
\cfunction{Py_DECREF()} on an unrelated object while borrowing a
reference to a list item. For instance:

\begin{verbatim}
void bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);

    PyList_SetItem(list, 1, PyInt_FromLong(0L));
    PyObject_Print(item, stdout, 0); /* BUG! */
}
\end{verbatim}

This function first borrows a reference to \code{list[0]}, then
replaces \code{list[1]} with the value \code{0}, and finally prints
the borrowed reference. Looks harmless, right? But it's not!

Let's follow the control flow into \cfunction{PyList_SetItem()}. The list
owns references to all its items, so when item 1 is replaced, it has
to dispose of the original item 1. Now let's suppose the original
item 1 was an instance of a user-defined class, and let's further
suppose that the class defined a \method{__del__()} method. If this
class instance has a reference count of 1, disposing of it will call
its \method{__del__()} method.

Since it is written in Python, the \method{__del__()} method can execute
arbitrary Python code. Could it perhaps do something to invalidate
the reference to \code{item} in \cfunction{bug()}? You bet! Assuming
that the list passed into \cfunction{bug()} is accessible to the
\method{__del__()} method, it could execute a statement to the effect of
\samp{del list[0]}, and assuming this was the last reference to that
object, it would free the memory associated with it, thereby
invalidating \code{item}.

The solution, once you know the source of the problem, is easy:
temporarily increment the reference count. The correct version of the
function reads:

\begin{verbatim}
void no_bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);

    Py_INCREF(item);
    PyList_SetItem(list, 1, PyInt_FromLong(0L));
    PyObject_Print(item, stdout, 0);
    Py_DECREF(item);
}
\end{verbatim}

This is a true story. An older version of Python contained variants
of this bug and someone spent a considerable amount of time in a C
debugger to figure out why his \method{__del__()} methods would fail...

The second case of problems with a borrowed reference is a variant
involving threads. Normally, multiple threads in the Python
interpreter can't get in each other's way, because there is a global
lock protecting Python's entire object space. However, it is possible
to temporarily release this lock using the macro
\code{Py_BEGIN_ALLOW_THREADS}, and to re-acquire it using
\code{Py_END_ALLOW_THREADS}. This is common around blocking I/O
calls, to let other threads use the CPU while waiting for the I/O to
complete. Obviously, the following function has the same problem as
the previous one:

\begin{verbatim}
void bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);
    Py_BEGIN_ALLOW_THREADS
    ...some blocking I/O call...
    Py_END_ALLOW_THREADS
    PyObject_Print(item, stdout, 0); /* BUG! */
}
\end{verbatim}


\subsection{NULL Pointers
            \label{nullPointers}}

In general, functions that take object references as arguments do not
expect you to pass them \NULL{} pointers, and will dump core (or
cause later core dumps) if you do so. Functions that return object
references generally return \NULL{} only to indicate that an
exception occurred. The reason for not testing for \NULL{}
arguments is that functions often pass the objects they receive on to
other functions --- if each function were to test for \NULL{},
there would be a lot of redundant tests and the code would run more
slowly.

It is better to test for \NULL{} only at the ``source'', i.e.\ when a
pointer that may be \NULL{} is received, e.g.\ from
\cfunction{malloc()} or from a function that may raise an exception.

The macros \cfunction{Py_INCREF()} and \cfunction{Py_DECREF()}
do not check for \NULL{} pointers --- however, their variants
\cfunction{Py_XINCREF()} and \cfunction{Py_XDECREF()} do.
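
How these two rules combine can be shown with a toy model in plain C:
\NULL{} is tested once, where the pointer is first received, and a
\NULL{}-tolerant release is used only in the shared cleanup code. The
names \cfunction{toy_new()}, \cfunction{toy_decref()},
\cfunction{toy_xdecref()} and \cfunction{toy_use_pair()}, as well as the
\code{toy_fail_countdown} switch used to simulate a failure, are
assumptions made for this sketch and are not part of Python.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: hypothetical names, not the Python API. */
typedef struct {
    int refcnt;
} toy_obj;

int toy_live = 0;                    /* objects currently allocated */
int toy_fail_countdown = -1;         /* 0 makes the next toy_new() fail */

toy_obj *
toy_new(void)                        /* may return NULL, like malloc() */
{
    toy_obj *op;

    if (toy_fail_countdown == 0) {
        toy_fail_countdown = -1;     /* simulated failure, used once */
        return NULL;
    }
    if (toy_fail_countdown > 0)
        toy_fail_countdown--;
    op = malloc(sizeof(toy_obj));
    if (op == NULL)
        return NULL;
    op->refcnt = 1;
    toy_live++;
    return op;
}

/* No NULL check: the caller promises a valid pointer. */
void
toy_decref(toy_obj *op)
{
    if (--op->refcnt == 0) {
        toy_live--;
        free(op);
    }
}

/* Variant that tolerates NULL, intended for cleanup code. */
void
toy_xdecref(toy_obj *op)
{
    if (op != NULL)
        toy_decref(op);
}

/* Returns 0 on success and -1 if either object could not be created. */
int
toy_use_pair(void)
{
    toy_obj *a = NULL, *b = NULL;
    int status = -1;

    a = toy_new();
    if (a == NULL)                   /* test at the source */
        goto done;
    b = toy_new();
    if (b == NULL)
        goto done;
    status = 0;                      /* real work on a and b would go here */

done:
    toy_xdecref(a);                  /* either pointer may still be NULL */
    toy_xdecref(b);
    return status;
}
```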

The macros for checking for a particular object type
(\code{Py\var{type}_Check()}) don't check for \NULL{} pointers ---
again, there is much code that calls several of these in a row to test
an object against various different expected types, and this would
generate redundant tests. There are no variants with \NULL{}
checking.

The C function calling mechanism guarantees that the argument list
passed to C functions (\code{args} in the examples) is never
\NULL{} --- in fact it guarantees that it is always a tuple.\footnote{
These guarantees don't hold when you use the ``old'' style
calling convention --- this is still found in much existing code.}

It is a severe error to ever let a \NULL{} pointer ``escape'' to
the Python user.
1473
1474% Frank Stajano:
1475% A pedagogically buggy example, along the lines of the previous listing,
1476% would be helpful here -- showing in more concrete terms what sort of
1477% actions could cause the problem. I can't very well imagine it from the
1478% description.
Guido van Rossumdb65a6c1993-11-05 17:11:16 +00001479
Guido van Rossum7a2dba21993-11-05 14:45:11 +00001480
\section{Writing Extensions in \Cpp{}
         \label{cplusplus}}

It is possible to write extension modules in \Cpp{}. Some restrictions
apply. If the main program (the Python interpreter) is compiled and
linked by the C compiler, global or static objects with constructors
cannot be used. This is not a problem if the main program is linked
by the \Cpp{} compiler. Functions that will be called by the
Python interpreter (in particular, module initialization functions)
have to be declared using \code{extern "C"}.
It is unnecessary to enclose the Python header files in
\code{extern "C" \{...\}}; they use this form already if the symbol
\samp{__cplusplus} is defined (all recent \Cpp{} compilers define this
symbol).


\section{Providing a C API for an Extension Module
         \label{using-cobjects}}
\sectionauthor{Konrad Hinsen}{hinsen@cnrs-orleans.fr}

Many extension modules just provide new functions and types to be
used from Python, but sometimes the code in an extension module can
be useful for other extension modules. For example, an extension
module could implement a type ``collection'' which works like lists
without order. Just like the standard Python list type has a C API
which permits extension modules to create and manipulate lists, this
new collection type should have a set of C functions for direct
manipulation from other extension modules.

At first sight this seems easy: just write the functions (without
declaring them \keyword{static}, of course), provide an appropriate
header file, and document the C API. And in fact this would work if
all extension modules were always linked statically with the Python
interpreter. When modules are used as shared libraries, however, the
symbols defined in one module may not be visible to another module.
The details of visibility depend on the operating system; some systems
use one global namespace for the Python interpreter and all extension
modules (e.g.\ Windows), whereas others require an explicit list of
imported symbols at module link time (e.g.\ AIX), or offer a choice of
different strategies (most Unices). And even if symbols are globally
visible, the module whose functions one wishes to call might not have
been loaded yet!

Portability therefore requires making no assumptions about
symbol visibility. This means that all symbols in extension modules
should be declared \keyword{static}, except for the module's
initialization function, in order to avoid name clashes with other
extension modules (as discussed in section~\ref{methodTable}). And it
means that symbols that \emph{should} be accessible from other
extension modules must be exported in a different way.

Python provides a special mechanism to pass C-level information (i.e.\
pointers) from one extension module to another one: CObjects.
A CObject is a Python data type which stores a pointer (\ctype{void
*}). CObjects can only be created and accessed via their C API, but
they can be passed around like any other Python object. In particular,
they can be assigned to a name in an extension module's namespace.
Other extension modules can then import this module, retrieve the
value of this name, and then retrieve the pointer from the CObject.

There are many ways in which CObjects can be used to export the C API
of an extension module. Each name could get its own CObject, or all C
API pointers could be stored in an array whose address is published in
a CObject. And the various tasks of storing and retrieving the pointers
can be distributed in different ways between the module providing the
code and the client modules.

The following example demonstrates an approach that puts most of the
burden on the writer of the exporting module, which is appropriate
for commonly used library modules. It stores all C API pointers
(just one in the example!) in an array of \ctype{void} pointers which
becomes the value of a CObject. The header file corresponding to
the module provides a macro that takes care of importing the module
and retrieving its C API pointers; client modules only have to call
this macro before accessing the C API.

The exporting module is a modification of the \module{spam} module from
section~\ref{simpleExample}. The function \function{spam.system()}
does not call the C library function \cfunction{system()} directly,
but a function \cfunction{PySpam_System()}, which would of course do
something more complicated in reality (such as adding ``spam'' to
every command). This function \cfunction{PySpam_System()} is also
exported to other extension modules.

The function \cfunction{PySpam_System()} is a plain C function,
declared \keyword{static} like everything else:

\begin{verbatim}
static int
PySpam_System(command)
    char *command;
{
    return system(command);
}
\end{verbatim}

The function \cfunction{spam_system()} is modified in a trivial way:

\begin{verbatim}
static PyObject *
spam_system(self, args)
    PyObject *self;
    PyObject *args;
{
    char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
    sts = PySpam_System(command);
    return Py_BuildValue("i", sts);
}
\end{verbatim}

In the beginning of the module, right after the line

\begin{verbatim}
#include "Python.h"
\end{verbatim}

two more lines must be added:

\begin{verbatim}
#define SPAM_MODULE
#include "spammodule.h"
\end{verbatim}

The \code{\#define} is used to tell the header file that it is being
included in the exporting module, not a client module. Finally,
the module's initialization function must take care of initializing
the C API pointer array:

\begin{verbatim}
void
initspam()
{
    PyObject *m, *d;
    static void *PySpam_API[PySpam_API_pointers];
    PyObject *c_api_object;

    m = Py_InitModule("spam", SpamMethods);

    /* Initialize the C API pointer array */
    PySpam_API[PySpam_System_NUM] = (void *)PySpam_System;

    /* Create a CObject containing the API pointer array's address */
    c_api_object = PyCObject_FromVoidPtr((void *)PySpam_API, NULL);

    /* Create a name for this object in the module's namespace;
       skip this if the CObject could not be created */
    if (c_api_object != NULL) {
        d = PyModule_GetDict(m);
        PyDict_SetItemString(d, "_C_API", c_api_object);
        /* The dictionary now holds its own reference */
        Py_DECREF(c_api_object);
    }
}
\end{verbatim}

Note that \code{PySpam_API} is declared \keyword{static}; otherwise
the pointer array would disappear when \cfunction{initspam()} terminates!

The bulk of the work is in the header file \file{spammodule.h},
which looks like this:

\begin{verbatim}
#ifndef Py_SPAMMODULE_H
#define Py_SPAMMODULE_H
#ifdef __cplusplus
extern "C" {
#endif

/* Header file for spammodule */

/* C API functions */
#define PySpam_System_NUM 0
#define PySpam_System_RETURN int
#define PySpam_System_PROTO (char *command)

/* Total number of C API pointers */
#define PySpam_API_pointers 1


#ifdef SPAM_MODULE
/* This section is used when compiling spammodule.c */

static PySpam_System_RETURN PySpam_System PySpam_System_PROTO;

#else
/* This section is used in modules that use spammodule's API */

static void **PySpam_API;

#define PySpam_System \
 (*(PySpam_System_RETURN (*)PySpam_System_PROTO) PySpam_API[PySpam_System_NUM])

/* PyCObject_Check() does not test for NULL, so test before calling it */
#define import_spam() \
{ \
  PyObject *module = PyImport_ImportModule("spam"); \
  if (module != NULL) { \
    PyObject *module_dict = PyModule_GetDict(module); \
    PyObject *c_api_object = PyDict_GetItemString(module_dict, "_C_API"); \
    if (c_api_object != NULL && PyCObject_Check(c_api_object)) { \
      PySpam_API = (void **)PyCObject_AsVoidPtr(c_api_object); \
    } \
  } \
}

#endif

#ifdef __cplusplus
}
#endif

#endif /* !defined(Py_SPAMMODULE_H) */
\end{verbatim}

All that a client module must do in order to have access to the
function \cfunction{PySpam_System()} is to call the function (or
rather macro) \cfunction{import_spam()} in its initialization
function:

\begin{verbatim}
void
initclient()
{
    PyObject *m;

    m = Py_InitModule("client", ClientMethods);
    import_spam();
}
\end{verbatim}

The main disadvantage of this approach is that the file
\file{spammodule.h} is rather complicated. However, the
basic structure is the same for each function that is
exported, so it has to be learned only once.
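
The pointer-table mechanism can also be tried out in isolation. The
following is a minimal sketch in plain C, without Python or CObjects;
the \code{Demo_} names are hypothetical, and
\cfunction{demo_export_api()} merely stands in for what the exporting
module would publish through a CObject. Like the \module{spam}
example, it stores a function pointer in a \ctype{void *}, which
common platforms accept but ISO C does not guarantee:

\begin{verbatim}
#include <stddef.h>

/* Same layout as spammodule.h, with hypothetical Demo_ names */
#define Demo_Double_NUM 0
#define Demo_Double_RETURN int
#define Demo_Double_PROTO (int value)
#define Demo_API_pointers 1

/* "Exporting" side: the real function and the pointer table */
static Demo_Double_RETURN Demo_Double_Impl Demo_Double_PROTO;

static int
Demo_Double_Impl(int value)
{
    return 2 * value;
}

static void *Demo_API_table[Demo_API_pointers];

/* Stands in for the pointer the exporter stores in a CObject */
static void *
demo_export_api(void)
{
    Demo_API_table[Demo_Double_NUM] = (void *)Demo_Double_Impl;
    return (void *)Demo_API_table;
}

/* "Client" side: knows only a void ** and the macros */
static void **Demo_API = NULL;

#define Demo_Double \
 (*(Demo_Double_RETURN (*)Demo_Double_PROTO) Demo_API[Demo_Double_NUM])

/* Stands in for import_spam(); returns 1 on success, 0 on failure */
static int
demo_import_api(void)
{
    Demo_API = (void **)demo_export_api();
    return Demo_API != NULL;
}

/* Calls through the table, importing it first if necessary */
static int
demo_call(int value)
{
    if (Demo_API == NULL && !demo_import_api())
        return -1;
    return Demo_Double(value);
}
\end{verbatim}

The client never names \cfunction{Demo_Double_Impl()} directly; it
only indexes the table, which is why the approach does not depend on
symbol visibility between modules.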

Finally it should be mentioned that CObjects offer additional
functionality, which is especially useful for memory allocation and
deallocation of the pointer stored in a CObject. The details
are described in the \citetitle[../api/api.html]{Python/C API
Reference Manual} in the section ``CObjects'' and in the
implementation of CObjects (files \file{Include/cobject.h} and
\file{Objects/cobject.c} in the Python source code distribution).


\chapter{Building C and \Cpp{} Extensions on \UNIX{}
         \label{building-on-unix}}

\sectionauthor{Jim Fulton}{jim@Digicool.com}


%The make file make file, building C extensions on Unix


Starting in Python 1.4, Python provides a special make file whose job
is to generate the make files used to build dynamically-linked
extensions and custom interpreters. This ``make file make file''
produces a make file that reflects the system variables determined by
\program{configure} when the Python interpreter was built, so people
building modules don't have to resupply these settings. This vastly
simplifies the process of building extensions and custom interpreters
on \UNIX{} systems.

The make file make file is distributed as the file
\file{Misc/Makefile.pre.in} in the Python source distribution. The
first step in building extensions or custom interpreters is to copy
this make file to a development directory containing extension module
source.

The make file make file, \file{Makefile.pre.in}, uses metadata
provided in a file named \file{Setup}. The format of the \file{Setup}
file is the same as the \file{Setup} (or \file{Setup.dist}) file
provided in the \file{Modules/} directory of the Python source
distribution. The \file{Setup} file contains variable definitions:

\begin{verbatim}
EC=/projects/ExtensionClass
\end{verbatim}

and module description lines. It can also contain blank lines and
comment lines that start with \character{\#}.

A module description line includes a module name, source files,
options, variable references, and other input files, such
as libraries or object files. Consider a simple example:

\begin{verbatim}
ExtensionClass ExtensionClass.c
\end{verbatim}

This is the simplest form of a module description line. It defines a
module, \module{ExtensionClass}, which has a single source file,
\file{ExtensionClass.c}.

This slightly more complex example uses an \strong{-I} option to
specify an include directory:

\begin{verbatim}
EC=/projects/ExtensionClass
cPersistence cPersistence.c -I$(EC)
\end{verbatim} % $ <-- bow to font lock

This example also illustrates the format for variable references.

For systems that support dynamic linking, the \file{Setup} file should
begin:

\begin{verbatim}
*shared*
\end{verbatim}

to indicate that the modules defined in \file{Setup} are to be built
as dynamically linked modules. A line containing only \samp{*static*}
can be used to indicate that the subsequently listed modules should be
statically linked.

Here is a complete \file{Setup} file for building a
\module{cPersistence} module:

\begin{verbatim}
# Set-up file to build the cPersistence module.
# Note that the text should begin in the first column.
*shared*

# We need the path to the directory containing the ExtensionClass
# include file.
EC=/projects/ExtensionClass
cPersistence cPersistence.c -I$(EC)
\end{verbatim} % $ <-- bow to font lock

After the \file{Setup} file has been created, \file{Makefile.pre.in}
is run with the \samp{boot} target to create a make file:

\begin{verbatim}
make -f Makefile.pre.in boot
\end{verbatim}

This creates the file \file{Makefile}. To build the extensions, simply
run the created make file:

\begin{verbatim}
make
\end{verbatim}

It's not necessary to re-run \file{Makefile.pre.in} if the
\file{Setup} file is changed. The make file automatically rebuilds
itself if the \file{Setup} file changes.


\section{Building Custom Interpreters \label{custom-interps}}

The make file built by \file{Makefile.pre.in} can be run with the
\samp{static} target to build an interpreter:

\begin{verbatim}
make static
\end{verbatim}

Any modules defined in the \file{Setup} file before the
\samp{*shared*} line will be statically linked into the interpreter.
Typically, a \samp{*shared*} line is omitted from the
\file{Setup} file when a custom interpreter is desired.


\section{Module Definition Options \label{module-defn-options}}

Several compiler options are supported:

\begin{tableii}{l|l}{programopt}{Option}{Meaning}
  \lineii{-C}{Tell the C pre-processor not to discard comments}
  \lineii{-D\var{name}=\var{value}}{Define a macro}
  \lineii{-I\var{dir}}{Specify an include directory, \var{dir}}
  \lineii{-L\var{dir}}{Specify a link-time library directory, \var{dir}}
  \lineii{-R\var{dir}}{Specify a run-time library directory, \var{dir}}
  \lineii{-l\var{lib}}{Link a library, \var{lib}}
  \lineii{-U\var{name}}{Undefine a macro}
\end{tableii}

Other compiler options can be included (snuck in) by putting them
in variables.

Source files can include files with \file{.c}, \file{.C}, \file{.cc},
\file{.cpp}, \file{.cxx}, and \file{.c++} extensions.

Other input files include files with \file{.a}, \file{.o}, \file{.sl},
and \file{.so} extensions.


\section{Example \label{module-defn-example}}

Here is a more complicated example from \file{Modules/Setup.dist}:

\begin{verbatim}
GMP=/ufs/guido/src/gmp
mpz mpzmodule.c -I$(GMP) $(GMP)/libgmp.a
\end{verbatim}

which could also be written as:

\begin{verbatim}
mpz mpzmodule.c -I$(GMP) -L$(GMP) -lgmp
\end{verbatim}


\section{Distributing your extension modules
         \label{distributing}}

There are two ways to distribute extension modules for others to use.
The way that allows the easiest cross-platform support is to use the
\module{distutils}\refstmodindex{distutils} package. The manual
\citetitle[../dist/dist.html]{Distributing Python Modules} contains
information on this approach. It is recommended that all new
extensions be distributed using this approach to allow easy building
and installation across platforms. Older extensions should migrate to
this approach as well.

What follows describes the older approach; there are still many
extensions which use this.

When distributing your extension modules in source form, make sure to
include a \file{Setup} file. The \file{Setup} file should be named
\file{Setup.in} in the distribution. The make file make file,
\file{Makefile.pre.in}, will copy \file{Setup.in} to \file{Setup} if
the person installing the extension doesn't do so manually.
Distributing a \file{Setup.in} file makes it easy for people to
customize the \file{Setup} file while keeping the original in
\file{Setup.in}.

It is a good idea to include a copy of \file{Makefile.pre.in} for
people who do not have a source distribution of Python.

Do not distribute a make file. People building your modules
should use \file{Makefile.pre.in} to build their own make file. A
\file{README} file included in the package should provide simple
instructions to perform the build.


\chapter{Building C and \Cpp{} Extensions on Windows
         \label{building-on-windows}}


This chapter briefly explains how to create a Windows extension module
for Python using Microsoft Visual \Cpp{}, and follows with more
detailed background information on how it works. The explanatory
material is useful for both the Windows programmer learning to build
Python extensions and the \UNIX{} programmer interested in producing
software which can be successfully built on both \UNIX{} and Windows.


\section{A Cookbook Approach \label{win-cookbook}}

\sectionauthor{Neil Schemenauer}{neil_schemenauer@transcanada.com}

This section provides a recipe for building a Python extension on
Windows.

Grab the binary installer from \url{http://www.python.org/} and
install Python. The binary installer has all of the required header
files except for \file{config.h}.

Get the source distribution and extract it into a convenient location.
Copy the \file{config.h} from the \file{PC/} directory into the
\file{include/} directory created by the installer.

Create a \file{Setup} file for your extension module, as described in
chapter~\ref{building-on-unix}.

Get David Ascher's \file{compile.py} script from
\url{http://starship.python.net/crew/da/compile/}. Run the script to
create Microsoft Visual \Cpp{} project files.

Open the DSW file in Visual \Cpp{} and select \strong{Build}.

If your module creates a new type, you may have trouble with this line:

\begin{verbatim}
    PyObject_HEAD_INIT(&PyType_Type)
\end{verbatim}

Change it to:

\begin{verbatim}
    PyObject_HEAD_INIT(NULL)
\end{verbatim}

and add the following to the module initialization function:

\begin{verbatim}
    MyObject_Type.ob_type = &PyType_Type;
\end{verbatim}

Refer to section 3 of the Python FAQ
(\url{http://www.python.org/doc/FAQ.html}) for details on why you must
do this.


\section{Differences Between \UNIX{} and Windows
         \label{dynamic-linking}}
\sectionauthor{Chris Phoenix}{cphoenix@best.com}


\UNIX{} and Windows use completely different paradigms for run-time
loading of code. Before you try to build a module that can be
dynamically loaded, be aware of how your system works.

In \UNIX{}, a shared object (\file{.so}) file contains code to be used by the
program, and also the names of functions and data that it expects to
find in the program. When the file is joined to the program, all
references to those functions and data in the file's code are changed
to point to the actual locations in the program where the functions
and data are placed in memory. This is basically a link operation.

In Windows, a dynamic-link library (\file{.dll}) file has no dangling
references. Instead, an access to functions or data goes through a
lookup table. So the DLL code does not have to be fixed up at runtime
to refer to the program's memory; instead, the code already uses the
DLL's lookup table, and the lookup table is modified at runtime to
point to the functions and data.
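
The idea of code that always calls through a table which is filled in
at load time can be mimicked in plain C. This is only an analogy and
assumes nothing about the real DLL file format; all names in it are
hypothetical:

\begin{verbatim}
#include <stddef.h>

/* A function that the "program" provides */
static int
host_add(int a, int b)
{
    return a + b;
}

/* The library's lookup table: one slot per imported function.
   The slot is empty until load time. */
struct import_table {
    int (*add)(int, int);
};

static struct import_table imports = { NULL };

/* "Library" code: it always calls through the table, so the code
   itself never has to be patched. */
static int
lib_sum3(int a, int b, int c)
{
    return imports.add(imports.add(a, b), c);
}

/* "Loader": fills in the table at run time; returns 1 on success */
static int
load_library(void)
{
    imports.add = host_add;
    return imports.add != NULL;
}

/* Load first, then call into the library; returns -1 if loading fails */
static int
run_demo(void)
{
    if (!load_library())
        return -1;
    return lib_sum3(1, 2, 3);
}
\end{verbatim}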

In \UNIX{}, there is only one type of library file (\file{.a}) which
contains code from several object files (\file{.o}). During the link
step to create a shared object file (\file{.so}), the linker may find
that it doesn't know where an identifier is defined. The linker will
look for it in the object files in the libraries; if it finds it, it
will include all the code from that object file.

In Windows, there are two types of library, a static library and an
import library (both called \file{.lib}). A static library is like a
\UNIX{} \file{.a} file; it contains code to be included as necessary.
An import library is basically used only to reassure the linker that a
certain identifier is legal, and will be present in the program when
the DLL is loaded. So the linker uses the information from the
import library to build the lookup table for using identifiers that
are not included in the DLL. When an application or a DLL is linked,
an import library may be generated, which will need to be used for all
future DLLs that depend on the symbols in the application or DLL.

Suppose you are building two dynamic-load modules, B and C, which should
share another block of code A. On \UNIX{}, you would \emph{not} pass
\file{A.a} to the linker for \file{B.so} and \file{C.so}; that would
cause it to be included twice, so that B and C would each have their
own copy. In Windows, building \file{A.dll} will also build
\file{A.lib}. You \emph{do} pass \file{A.lib} to the linker for B and
C. \file{A.lib} does not contain code; it just contains information
which will be used at runtime to access A's code.

In Windows, using an import library is sort of like using \samp{import
spam}; it gives you access to spam's names, but does not create a
separate copy. On \UNIX{}, linking with a library is more like
\samp{from spam import *}; it does create a separate copy.


\section{Using DLLs in Practice \label{win-dlls}}
\sectionauthor{Chris Phoenix}{cphoenix@best.com}

Windows Python is built in Microsoft Visual \Cpp{}; using other
compilers may or may not work (though Borland seems to). The rest of
this section is MSV\Cpp{} specific.

When creating DLLs in Windows, you must pass \file{python15.lib} to
the linker. To build two DLLs, spam and ni (which uses C functions
found in spam), you could use these commands:

\begin{verbatim}
cl /LD /I/python/include spam.c ../libs/python15.lib
cl /LD /I/python/include ni.c spam.lib ../libs/python15.lib
\end{verbatim}

The first command creates three files: \file{spam.obj},
\file{spam.dll} and \file{spam.lib}. \file{spam.dll} does not contain
any Python functions (such as \cfunction{PyArg_ParseTuple()}), but it
does know how to find the Python code thanks to \file{python15.lib}.

The second command creates \file{ni.dll} (and \file{ni.obj} and
\file{ni.lib}), which knows how to find the necessary functions from
spam, and also from the Python executable.

Not every identifier is exported to the lookup table. If you want any
other modules (including Python) to be able to see your identifiers,
you have to say \samp{_declspec(dllexport)}, as in \samp{void
_declspec(dllexport) initspam(void)} or \samp{PyObject
_declspec(dllexport) *NiGetSpamData(void)}.
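
One common way to keep such declarations portable is to hide the
export keyword behind a macro. In the sketch below,
\code{SPAM_EXPORT} and the \code{spam_} functions are hypothetical
names, and the keyword is spelled \code{__declspec}, with two leading
underscores, as in Microsoft's documentation. On other platforms the
macro is assumed to expand to nothing:

\begin{verbatim}
/* SPAM_EXPORT is a hypothetical macro name, not part of Python */
#if defined(_WIN32)
#  define SPAM_EXPORT __declspec(dllexport)
#else
#  define SPAM_EXPORT
#endif

/* Exported: placed in the DLL's lookup table on Windows */
SPAM_EXPORT int
spam_version(void)
{
    return 15;
}

/* Not exported: a static function is private to this source file */
static int
spam_helper(int x)
{
    return x + spam_version();
}

/* Exported wrapper around the private helper */
SPAM_EXPORT int
spam_version_plus(int x)
{
    return spam_helper(x);
}
\end{verbatim}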

Developer Studio will throw in a lot of import libraries that you do
not really need, adding about 100K to your executable. To get rid of
them, use the Project Settings dialog, Link tab, to specify
\emph{ignore default libraries}. Add the correct
\file{msvcrt\var{xx}.lib} to the list of libraries.


\chapter{Embedding Python in Another Application
         \label{embedding}}

Embedding Python is similar to extending it, but not quite. The
difference is that when you extend Python, the main program of the
application is still the Python interpreter, while if you embed
Python, the main program may have nothing to do with Python;
instead, some parts of the application occasionally call the Python
interpreter to run some Python code.

So if you are embedding Python, you are providing your own main
program. One of the things this main program has to do is initialize
the Python interpreter. At the very least, you have to call the
function \cfunction{Py_Initialize()} (on MacOS, call
\cfunction{PyMac_Initialize()} instead). There are optional calls to
pass command line arguments to Python. Then later you can call the
interpreter from any part of the application.

There are several different ways to call the interpreter: you can pass
a string containing Python statements to
\cfunction{PyRun_SimpleString()}, or you can pass a stdio file pointer
and a file name (for identification in error messages only) to
\cfunction{PyRun_SimpleFile()}. You can also call the lower-level
operations described in the previous chapters to construct and use
Python objects.

A simple demo of embedding Python can be found in the directory
\file{Demo/embed/} of the source distribution.


\section{Embedding Python in \Cpp{}
         \label{embeddingInCplusplus}}

It is also possible to embed Python in a \Cpp{} program. Precisely how
this is done will depend on the details of the \Cpp{} system used. In
general you will need to write the main program in \Cpp{}, and use the
\Cpp{} compiler to compile and link your program. There is no need to
recompile Python itself using \Cpp{}.


\section{Linking Requirements
         \label{link-reqs}}

While the \program{configure} script shipped with the Python sources
will correctly build Python to export the symbols needed by
dynamically linked extensions, this is not automatically inherited by
applications which embed the Python library statically, at least on
\UNIX{}. This is an issue when the application is linked to the static
runtime library (\file{libpython.a}) and needs to load dynamic
extensions (implemented as \file{.so} files).

The problem is that some entry points are defined by the Python
runtime solely for extension modules to use. If the embedding
application does not use any of these entry points, some linkers will
not include those entries in the symbol table of the finished
executable. Some additional options are needed to inform the linker
not to remove these symbols.

Determining the right options to use for any given platform can be
quite difficult, but fortunately the Python configuration already has
those values. To retrieve them from an installed Python interpreter,
start an interactive interpreter and have a short session like this:

\begin{verbatim}
>>> import distutils.sysconfig
>>> distutils.sysconfig.get_config_var('LINKFORSHARED')
'-Xlinker -export-dynamic'
\end{verbatim}
\refstmodindex{distutils.sysconfig}

The contents of the string presented will be the options that should
be used. If the string is empty, there's no need to add any
additional options. The \constant{LINKFORSHARED} definition
corresponds to the variable of the same name in Python's top-level
\file{Makefile}.


\appendix
\chapter{Reporting Bugs}
\input{reportingbugs}

\end{document}