\documentclass{manual}

% XXX PM explain how to add new types to Python

\title{Extending and Embedding the Python Interpreter}

\input{boilerplate}

% Tell \index to actually write the .idx file
\makeindex

\begin{document}

\maketitle

\ifhtml
\chapter*{Front Matter\label{front}}
\fi

\input{copyright}


\begin{abstract}

\noindent
Python is an interpreted, object-oriented programming language. This
document describes how to write modules in C or \Cpp{} to extend the
Python interpreter with new modules. Those modules can define new
functions but also new object types and their methods. The document
also describes how to embed the Python interpreter in another
application, for use as an extension language. Finally, it shows how
to compile and link extension modules so that they can be loaded
dynamically (at run time) into the interpreter, if the underlying
operating system supports this feature.

This document assumes basic knowledge about Python. For an informal
introduction to the language, see the
\citetitle[../tut/tut.html]{Python Tutorial}. The
\citetitle[../ref/ref.html]{Python Reference Manual} gives a more
formal definition of the language. The
\citetitle[../lib/lib.html]{Python Library Reference} documents the
existing object types, functions and modules (both built-in and
written in Python) that give the language its wide application range.

For a detailed description of the whole Python/C API, see the separate
\citetitle[../api/api.html]{Python/C API Reference Manual}.

\end{abstract}

\tableofcontents


\chapter{Extending Python with C or \Cpp{} \label{intro}}


It is quite easy to add new built-in modules to Python, if you know
how to program in C. Such \dfn{extension modules} can do two things
that can't be done directly in Python: they can implement new built-in
object types, and they can call C library functions and system calls.

To support extensions, the Python API (Application Programmer's
Interface) defines a set of functions, macros and variables that
provide access to most aspects of the Python run-time system. The
Python API is incorporated in a C source file by including the header
\code{"Python.h"}.

The compilation of an extension module depends on its intended use as
well as on your system setup; details are given in later chapters.


\section{A Simple Example
         \label{simpleExample}}

Let's create an extension module called \samp{spam} (the favorite food
of Monty Python fans...) and let's say we want to create a Python
interface to the C library function \cfunction{system()}.\footnote{An
interface for this function already exists in the standard module
\module{os} --- it was chosen as a simple and straightforward example.}
This function takes a null-terminated character string as argument and
returns an integer. We want this function to be callable from Python
as follows:

\begin{verbatim}
>>> import spam
>>> status = spam.system("ls -l")
\end{verbatim}

Begin by creating a file \file{spammodule.c}. (Historically, if a
module is called \samp{spam}, the C file containing its implementation
is called \file{spammodule.c}; if the module name is very long, like
\samp{spammify}, the file name can be just \file{spammify.c}.)

The first line of our file can be:

\begin{verbatim}
#include <Python.h>
\end{verbatim}

which pulls in the Python API (you can add a comment describing the
purpose of the module and a copyright notice if you like).

All user-visible symbols defined by \code{"Python.h"} have a prefix of
\samp{Py} or \samp{PY}, except those defined in standard header files.
For convenience, and since they are used extensively by the Python
interpreter, \code{"Python.h"} includes a few standard header files:
\code{<stdio.h>}, \code{<string.h>}, \code{<errno.h>}, and
\code{<stdlib.h>}. If the latter header file does not exist on your
system, \code{"Python.h"} declares the functions \cfunction{malloc()},
\cfunction{free()} and \cfunction{realloc()} directly.

The next thing we add to our module file is the C function that will
be called when the Python expression \samp{spam.system(\var{string})}
is evaluated (we'll see shortly how it ends up being called):

\begin{verbatim}
static PyObject *
spam_system(PyObject *self, PyObject *args)
{
    char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
    sts = system(command);
    return Py_BuildValue("i", sts);
}
\end{verbatim}

There is a straightforward translation from the argument list in
Python (for example, the single expression \code{"ls -l"}) to the
arguments passed to the C function. The C function always has two
arguments, conventionally named \var{self} and \var{args}.

The \var{self} argument is only used when the C function implements a
built-in method, not a function. In the example, \var{self} will
always be a \NULL{} pointer, since we are defining a function, not a
method. (This is done so that the interpreter doesn't have to
understand two different types of C functions.)

The \var{args} argument will be a pointer to a Python tuple object
containing the arguments. Each item of the tuple corresponds to an
argument in the call's argument list. The arguments are Python
objects --- in order to do anything with them in our C function we have
to convert them to C values. The function \cfunction{PyArg_ParseTuple()}
in the Python API checks the argument types and converts them to C
values. It uses a template string to determine the required types of
the arguments as well as the types of the C variables into which to
store the converted values. More about this later.

\cfunction{PyArg_ParseTuple()} returns true (nonzero) if all arguments have
the right type and their values have been stored in the variables
whose addresses are passed. It returns false (zero) if an invalid
argument list was passed. In the latter case it also raises an
appropriate exception so the calling function can return
\NULL{} immediately (as we saw in the example).


\section{Intermezzo: Errors and Exceptions
         \label{errors}}

An important convention throughout the Python interpreter is the
following: when a function fails, it should set an exception condition
and return an error value (usually a \NULL{} pointer). Exceptions
are stored in a static global variable inside the interpreter; if this
variable is \NULL{} no exception has occurred. A second global
variable stores the ``associated value'' of the exception (the second
argument to \keyword{raise}). A third variable contains the stack
traceback in case the error originated in Python code. These three
variables are the C equivalents of the Python variables
\code{sys.exc_type}, \code{sys.exc_value} and \code{sys.exc_traceback} (see
the section on module \module{sys} in the
\citetitle[../lib/lib.html]{Python Library Reference}). It is
important to know about them to understand how errors are passed
around.

The Python API defines a number of functions to set various types of
exceptions.

The most common one is \cfunction{PyErr_SetString()}. Its arguments
are an exception object and a C string. The exception object is
usually a predefined object like \cdata{PyExc_ZeroDivisionError}. The
C string indicates the cause of the error and is converted to a
Python string object and stored as the ``associated value'' of the
exception.
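
As a minimal sketch of this pattern, the hypothetical function below
(\cfunction{spam_divide()} is an illustrative name, not part of the
example module) rejects a zero divisor by setting
\cdata{PyExc_ZeroDivisionError} and returning \NULL{}:

\begin{verbatim}
#include <Python.h>

/* Hypothetical example: spam.divide(a, b) returns a / b as a float. */
static PyObject *
spam_divide(PyObject *self, PyObject *args)
{
    double a, b;

    if (!PyArg_ParseTuple(args, "dd", &a, &b))
        return NULL;        /* the parser has already set an exception */
    if (b == 0.0) {
        /* Record the exception, then return the error value. */
        PyErr_SetString(PyExc_ZeroDivisionError,
                        "spam.divide: division by zero");
        return NULL;
    }
    return Py_BuildValue("d", a / b);
}
\end{verbatim}

Setting the exception and returning \NULL{} go together: the first
records what went wrong, the second tells the caller to look.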

Another useful function is \cfunction{PyErr_SetFromErrno()}, which only
takes an exception argument and constructs the associated value by
inspection of the global variable \cdata{errno}. The most
general function is \cfunction{PyErr_SetObject()}, which takes two object
arguments, the exception and its associated value. You don't need to
\cfunction{Py_INCREF()} the objects passed to any of these functions.
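
The following sketch shows one way \cfunction{PyErr_SetFromErrno()}
might be used. The function \cfunction{spam_remove()} is hypothetical,
and the example assumes a \UNIX{}-like system on which the C library
function \cfunction{remove()} sets \cdata{errno} when it fails:

\begin{verbatim}
#include <Python.h>
#include <stdio.h>      /* remove() */

/* Hypothetical example: spam.remove(path) deletes a file. */
static PyObject *
spam_remove(PyObject *self, PyObject *args)
{
    const char *path;

    if (!PyArg_ParseTuple(args, "s", &path))
        return NULL;
    if (remove(path) != 0) {
        /* Assumption: remove() has set errno.  The associated value
           is built from it; PyErr_SetFromErrno() always returns
           NULL, so its result can be returned directly. */
        return PyErr_SetFromErrno(PyExc_IOError);
    }
    Py_INCREF(Py_None);
    return Py_None;
}
\end{verbatim}

Because the associated value is derived from \cdata{errno}, the call
should follow the failing library call directly, before anything else
can change \cdata{errno}.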

You can test non-destructively whether an exception has been set with
\cfunction{PyErr_Occurred()}. This returns the current exception object,
or \NULL{} if no exception has occurred. You normally don't need
to call \cfunction{PyErr_Occurred()} to see whether an error occurred in a
function call, since you should be able to tell from the return value.

When a function \var{f} that calls another function \var{g} detects
that the latter has failed, \var{f} should itself return an error value
(usually \NULL{} or \code{-1}). It should \emph{not} call one of the
\cfunction{PyErr_*()} functions --- one has already been called by \var{g}.
\var{f}'s caller is then supposed to also return an error indication
to \emph{its} caller, again \emph{without} calling \cfunction{PyErr_*()},
and so on --- the most detailed cause of the error was already
reported by the function that first detected it. Once the error
reaches the Python interpreter's main loop, the loop aborts the currently
executing Python code and tries to find an exception handler specified
by the Python programmer.

(There are situations where a module can actually give a more detailed
error message by calling another \cfunction{PyErr_*()} function, and in
such cases it is fine to do so. As a general rule, however, this is
not necessary, and can cause information about the cause of the error
to be lost: most operations can fail for a variety of reasons.)
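
The propagation rule can be sketched as follows. Both functions are
hypothetical: \cfunction{get_name()} plays the role of \var{g} and
\cfunction{spam_name()} the role of \var{f}.

\begin{verbatim}
#include <Python.h>

/* Hypothetical helper (g): may fail, and sets the exception if so. */
static PyObject *
get_name(PyObject *obj)
{
    /* PyObject_GetAttrString() returns NULL with an exception set
       when the attribute does not exist. */
    return PyObject_GetAttrString(obj, "__name__");
}

/* Hypothetical function (f): detects the failure and passes it on. */
static PyObject *
spam_name(PyObject *self, PyObject *args)
{
    PyObject *obj, *name;

    if (!PyArg_ParseTuple(args, "O", &obj))
        return NULL;
    name = get_name(obj);
    if (name == NULL)
        return NULL;    /* no PyErr_*() call: g has already set it */
    return name;        /* new reference, handed on to our caller */
}
\end{verbatim}

Note that \cfunction{spam_name()} adds nothing to the error; the
exception seen by the Python programmer is the one set where the
failure was first detected.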

To ignore an exception set by a function call that failed, the exception
condition must be cleared explicitly by calling \cfunction{PyErr_Clear()}.
The only time C code should call \cfunction{PyErr_Clear()} is if it doesn't
want to pass the error on to the interpreter but wants to handle it
completely by itself (possibly by trying something else, or pretending
nothing went wrong).
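
As an illustration of ``trying something else'', the hypothetical
function below falls back to a default string when an attribute is
missing. It is only a sketch; the function name and the fallback value
are assumptions made for this example.

\begin{verbatim}
#include <Python.h>

/* Hypothetical example: return obj.__name__, or "<unnamed>" if the
   object has no such attribute. */
static PyObject *
spam_name_or_default(PyObject *self, PyObject *args)
{
    PyObject *obj, *name;

    if (!PyArg_ParseTuple(args, "O", &obj))
        return NULL;
    name = PyObject_GetAttrString(obj, "__name__");
    if (name == NULL) {
        if (!PyErr_ExceptionMatches(PyExc_AttributeError))
            return NULL;    /* some other error: pass it on */
        PyErr_Clear();      /* this one we handle ourselves */
        name = Py_BuildValue("s", "<unnamed>");
    }
    return name;            /* NULL only if Py_BuildValue() failed */
}
\end{verbatim}

Only the expected exception is cleared; anything else is still passed
on to the caller unchanged.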

Every failing \cfunction{malloc()} call must be turned into an
exception --- the direct caller of \cfunction{malloc()} (or
\cfunction{realloc()}) must call \cfunction{PyErr_NoMemory()} and
return a failure indicator itself. All the object-creating functions
(for example, \cfunction{PyInt_FromLong()}) already do this, so this
note is only relevant to those who call \cfunction{malloc()} directly.
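
A sketch of a direct \cfunction{malloc()} caller, using a hypothetical
function \cfunction{spam_dashes()} that builds a string in a temporary
buffer:

\begin{verbatim}
#include <Python.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: spam.dashes(n) returns a string of n dashes. */
static PyObject *
spam_dashes(PyObject *self, PyObject *args)
{
    int n;
    char *buf;
    PyObject *result;

    if (!PyArg_ParseTuple(args, "i", &n))
        return NULL;
    if (n < 0) {
        PyErr_SetString(PyExc_ValueError, "n must not be negative");
        return NULL;
    }
    buf = (char *) malloc((size_t) n + 1);
    if (buf == NULL)
        return PyErr_NoMemory();    /* sets the exception, returns NULL */
    memset(buf, '-', (size_t) n);
    buf[n] = '\0';
    result = Py_BuildValue("s", buf);   /* copies the C string */
    free(buf);
    return result;
}
\end{verbatim}

\cfunction{PyErr_NoMemory()} returns \NULL{}, so its result can be
returned directly from a function that returns an object pointer.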

Also note that, with the important exception of
\cfunction{PyArg_ParseTuple()} and friends, functions that return an
integer status usually return a positive value or zero for success and
\code{-1} for failure, like \UNIX{} system calls.

Finally, be careful to clean up garbage (by making
\cfunction{Py_XDECREF()} or \cfunction{Py_DECREF()} calls for objects
you have already created) when you return an error indicator!
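
The cleanup rule might look like this in practice. The helper
\cfunction{make_pair()} is hypothetical; it creates two objects and
must release the first one if creating the second fails:

\begin{verbatim}
#include <Python.h>

/* Hypothetical helper: build the tuple (number, text). */
static PyObject *
make_pair(long number, const char *text)
{
    PyObject *first, *second, *pair;

    first = Py_BuildValue("l", number);
    if (first == NULL)
        return NULL;
    second = Py_BuildValue("s", text);
    if (second == NULL) {
        Py_DECREF(first);   /* don't leak the object we already own */
        return NULL;
    }
    /* The "O" format takes its own reference to each object. */
    pair = Py_BuildValue("(OO)", first, second);
    Py_DECREF(first);
    Py_DECREF(second);
    return pair;            /* NULL if building the tuple failed */
}
\end{verbatim}

The two final \cfunction{Py_DECREF()} calls are needed on the success
path as well as on the error path, since the tuple holds its own
references to the two objects.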

The choice of which exception to raise is entirely yours. There are
predeclared C objects corresponding to all built-in Python exceptions,
such as \cdata{PyExc_ZeroDivisionError}, which you can use directly.
Of course, you should choose exceptions wisely --- don't use
\cdata{PyExc_TypeError} to mean that a file couldn't be opened (that
should probably be \cdata{PyExc_IOError}). If something's wrong with
the argument list, the \cfunction{PyArg_ParseTuple()} function usually
raises \cdata{PyExc_TypeError}. If you have an argument whose value
must be in a particular range or must satisfy other conditions,
\cdata{PyExc_ValueError} is appropriate.

You can also define a new exception that is unique to your module.
For this, you usually declare a static object variable at the
beginning of your file:

\begin{verbatim}
static PyObject *SpamError;
\end{verbatim}

and initialize it in your module's initialization function
(\cfunction{initspam()}) with an exception object (leaving out
the error checking for now):

\begin{verbatim}
void
initspam(void)
{
    PyObject *m, *d;

    m = Py_InitModule("spam", SpamMethods);
    d = PyModule_GetDict(m);
    SpamError = PyErr_NewException("spam.error", NULL, NULL);
    PyDict_SetItemString(d, "error", SpamError);
}
\end{verbatim}

Note that the Python name for the exception object is
\exception{spam.error}. The \cfunction{PyErr_NewException()} function
creates a class whose base class is \exception{Exception}
(unless another class is passed in instead of \NULL{}); \exception{Exception}
is described in the \citetitle[../lib/lib.html]{Python Library Reference}
under ``Built-in Exceptions.''

Note also that the \cdata{SpamError} variable retains a reference to
the newly created exception class; this is intentional! Since the
exception could be removed from the module by external code, an owned
reference to the class is needed to ensure that it will not be
discarded, causing \cdata{SpamError} to become a dangling pointer.
Should it become a dangling pointer, C code which raises the exception
could cause a core dump or other unintended side effects.
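
Once \cdata{SpamError} has been initialized, it can be used like any
predeclared exception object. The sketch below is a variant of the
example function that raises it; treating a negative status from
\cfunction{system()} as a failure is an assumption made for
illustration.

\begin{verbatim}
#include <Python.h>
#include <stdlib.h>     /* system() */

static PyObject *SpamError;     /* initialized in initspam() */

static PyObject *
spam_system(PyObject *self, PyObject *args)
{
    const char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
    sts = system(command);
    if (sts < 0) {
        /* Assumption: a negative status means the command could
           not be run at all. */
        PyErr_SetString(SpamError, "System command failed");
        return NULL;
    }
    return Py_BuildValue("i", sts);
}
\end{verbatim}

From Python, such a failure would then be caught with
\samp{except spam.error}.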


\section{Back to the Example
         \label{backToExample}}

Going back to our example function, you should now be able to
understand this statement:

\begin{verbatim}
    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
\end{verbatim}

It returns \NULL{} (the error indicator for functions returning
object pointers) if an error is detected in the argument list, relying
on the exception set by \cfunction{PyArg_ParseTuple()}. Otherwise the
string value of the argument has been copied to the local variable
\cdata{command}. This is a pointer assignment and you are not supposed
to modify the string to which it points (so in Standard C, the variable
\cdata{command} should properly be declared as \samp{const char
*command}).

The next statement is a call to the \UNIX{} function
\cfunction{system()}, passing it the string we just got from
\cfunction{PyArg_ParseTuple()}:

\begin{verbatim}
    sts = system(command);
\end{verbatim}

Our \function{spam.system()} function must return the value of
\cdata{sts} as a Python object. This is done using the function
\cfunction{Py_BuildValue()}, which is something like the inverse of
\cfunction{PyArg_ParseTuple()}: it takes a format string and an
arbitrary number of C values, and returns a new Python object.
More info on \cfunction{Py_BuildValue()} is given later.

\begin{verbatim}
    return Py_BuildValue("i", sts);
\end{verbatim}

In this case, it will return an integer object. (Yes, even integers
are objects on the heap in Python!)

If you have a C function that returns no useful value (a function
returning \ctype{void}), the corresponding Python function must return
\code{None}. You need this idiom to do so:

\begin{verbatim}
    Py_INCREF(Py_None);
    return Py_None;
\end{verbatim}

\cdata{Py_None} is the C name for the special Python object
\code{None}. It is a genuine Python object rather than a \NULL{}
pointer, which means ``error'' in most contexts, as we have seen.
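
For example, a wrapper around the C library function
\cfunction{srand()}, which returns \ctype{void}, could be sketched as
follows (\cfunction{spam_seed()} is a hypothetical name):

\begin{verbatim}
#include <Python.h>
#include <stdlib.h>     /* srand() */

/* Hypothetical example: spam.seed(n) wraps the void function srand(). */
static PyObject *
spam_seed(PyObject *self, PyObject *args)
{
    int seed;

    if (!PyArg_ParseTuple(args, "i", &seed))
        return NULL;
    srand((unsigned int) seed);
    Py_INCREF(Py_None);     /* the caller receives its own reference */
    return Py_None;
}
\end{verbatim}

The \cfunction{Py_INCREF()} call matters: the caller will eventually
release the reference it receives, just as it would for any other
returned object.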


\section{The Module's Method Table and Initialization Function
         \label{methodTable}}

I promised to show how \cfunction{spam_system()} is called from Python
programs. First, we need to list its name and address in a ``method
table'':

\begin{verbatim}
static PyMethodDef SpamMethods[] = {
    ...
    {"system",  spam_system, METH_VARARGS},
    ...
    {NULL,      NULL}        /* Sentinel */
};
\end{verbatim}

Note the third entry (\samp{METH_VARARGS}). This is a flag telling
the interpreter the calling convention to be used for the C
function. It should normally always be \samp{METH_VARARGS} or
\samp{METH_VARARGS | METH_KEYWORDS}; a value of \code{0} means that an
obsolete variant of \cfunction{PyArg_ParseTuple()} is used.

When using only \samp{METH_VARARGS}, the function should expect
the Python-level parameters to be passed in as a tuple acceptable for
parsing via \cfunction{PyArg_ParseTuple()}; more information on this
function is provided below.

The \constant{METH_KEYWORDS} bit may be set in the third field if
keyword arguments should be passed to the function. In this case, the
C function should accept a third \samp{PyObject *} parameter which
will be a dictionary of keywords. Use
\cfunction{PyArg_ParseTupleAndKeywords()} to parse the arguments to
such a function.
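
A keyword-accepting function might be sketched like this. The function
\cfunction{spam_greet()} and its parameter names are hypothetical,
chosen only to show the three-argument form:

\begin{verbatim}
#include <Python.h>

/* Hypothetical example: spam.greet(name, times=1) returns the tuple
   (name, times) after parsing positional and keyword arguments. */
static PyObject *
spam_greet(PyObject *self, PyObject *args, PyObject *keywds)
{
    static char *kwlist[] = {"name", "times", NULL};
    const char *name;
    int times = 1;      /* default used when "times" is omitted */

    if (!PyArg_ParseTupleAndKeywords(args, keywds, "s|i", kwlist,
                                     &name, &times))
        return NULL;
    return Py_BuildValue("(si)", name, times);
}

/* Method table entry for the function above; the cast is needed
   because it takes three arguments rather than two:

       {"greet", (PyCFunction) spam_greet, METH_VARARGS | METH_KEYWORDS},
*/
\end{verbatim}

With such an entry in the method table, both
\samp{spam.greet("Bob")} and \samp{spam.greet("Bob", times=3)} would
be accepted.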

The method table must be passed to the interpreter in the module's
initialization function. The initialization function must be named
\cfunction{init\var{name}()}, where \var{name} is the name of the
module, and should be the only non-\keyword{static} item defined in
the module file:

\begin{verbatim}
void
initspam(void)
{
    (void) Py_InitModule("spam", SpamMethods);
}
\end{verbatim}

Note that for \Cpp, this function must be declared \code{extern "C"}.
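
One common way to arrange this, sketched here on the assumption that
the module file may be compiled as either C or \Cpp{}, is to guard the
declaration with the predefined \code{__cplusplus} macro:

\begin{verbatim}
#ifdef __cplusplus
extern "C" {
#endif

void initspam(void);    /* the only non-static item in the module file */

#ifdef __cplusplus
}
#endif
\end{verbatim}

A C compiler sees only the plain declaration, while a \Cpp{} compiler
gives the function C linkage, so its name is not mangled and the
interpreter can find it.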

When the Python program imports module \module{spam} for the first
time, \cfunction{initspam()} is called. (See below for comments about
embedding Python.) It calls
\cfunction{Py_InitModule()}, which creates a ``module object'' (which
is inserted in the dictionary \code{sys.modules} under the key
\code{"spam"}), and inserts built-in function objects into the newly
created module based upon the table (an array of \ctype{PyMethodDef}
structures) that was passed as its second argument.
\cfunction{Py_InitModule()} returns a pointer to the module object
that it creates (which is unused here). It aborts with a fatal error
if the module could not be initialized satisfactorily, so the caller
doesn't need to check for errors.

When embedding Python, the \cfunction{initspam()} function is not
called automatically unless there's an entry in the
\cdata{_PyImport_Inittab} table. The easiest way to handle this is to
statically initialize your statically-linked modules by directly
calling \cfunction{initspam()} after the call to
\cfunction{Py_Initialize()} or \cfunction{PyMac_Initialize()}:

\begin{verbatim}
int main(int argc, char **argv)
{
    /* Pass argv[0] to the Python interpreter */
    Py_SetProgramName(argv[0]);

    /* Initialize the Python interpreter.  Required. */
    Py_Initialize();

    /* Add a static module */
    initspam();
\end{verbatim}

An example may be found in the file \file{Demo/embed/demo.c} in the
Python source distribution.

\strong{Note:} Removing entries from \code{sys.modules} or importing
compiled modules into multiple interpreters within a process (or
following a \cfunction{fork()} without an intervening
\cfunction{exec()}) can create problems for some extension modules.
Extension module authors should exercise caution when initializing
internal data structures.
Note also that the \function{reload()} function can be used with
extension modules, and will call the module initialization function
(\cfunction{initspam()} in the example), but will not load the module
again if it was loaded from a dynamically loadable object file
(\file{.so} on \UNIX, \file{.dll} on Windows).

A more substantial example module is included in the Python source
distribution as \file{Modules/xxmodule.c}. This file may be used as a
template or simply read as an example. The \program{modulator.py}
script included in the source distribution or Windows install provides
a simple graphical user interface for declaring the functions and
objects which a module should implement, and can generate a template
which can be filled in. The script lives in the
\file{Tools/modulator/} directory; see the \file{README} file there
for more information.
457
Guido van Rossum7a2dba21993-11-05 14:45:11 +0000458
Fred Drake5e8aa541998-11-16 18:34:07 +0000459\section{Compilation and Linkage
460 \label{compilation}}
Guido van Rossum7a2dba21993-11-05 14:45:11 +0000461
Guido van Rossumb92112d1995-03-20 14:24:09 +0000462There are two more things to do before you can use your new extension:
463compiling and linking it with the Python system. If you use dynamic
464loading, the details depend on the style of dynamic loading your
Fred Drake54fd8452000-04-03 04:54:28 +0000465system uses; see the chapters about building extension modules on
466\UNIX{} (chapter \ref{building-on-unix}) and Windows (chapter
467\ref{building-on-windows}) for more information about this.
468% XXX Add information about MacOS
Guido van Rossum6938f061994-08-01 12:22:53 +0000469
470If you can't use dynamic loading, or if you want to make your module a
471permanent part of the Python interpreter, you will have to change the
Guido van Rossum5049bcb1995-03-13 16:55:23 +0000472configuration setup and rebuild the interpreter. Luckily, this is
473very simple: just place your file (\file{spammodule.c} for example) in
Fred Drakea4a90dd1999-04-29 02:44:50 +0000474the \file{Modules/} directory of an unpacked source distribution, add
475a line to the file \file{Modules/Setup.local} describing your file:
Guido van Rossum7a2dba21993-11-05 14:45:11 +0000476
\begin{verbatim}
spam spammodule.o
\end{verbatim}

and rebuild the interpreter by running \program{make} in the toplevel
directory. You can also run \program{make} in the \file{Modules/}
subdirectory, but then you must first rebuild \file{Makefile}
there by running `\program{make} Makefile'. (This is necessary each
time you change the \file{Setup} file.)

If your module requires additional libraries to link with, these can
be listed on the line in the configuration file as well, for instance:

\begin{verbatim}
spam spammodule.o -lX11
\end{verbatim}
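
Compiler and linker options can be given on the same line. As an
illustration only (the directories shown are assumptions and will
differ between systems), an entry for a module whose X11 headers and
library live outside the default search paths might read:

\begin{verbatim}
spam spammodule.o -I/usr/X11R6/include -L/usr/X11R6/lib -lX11
\end{verbatim}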

\section{Calling Python Functions from C
         \label{callingPython}}

So far we have concentrated on making C functions callable from
Python. The reverse is also useful: calling Python functions from C.
This is especially the case for libraries that support so-called
``callback'' functions. If a C interface makes use of callbacks, the
equivalent Python interface often needs to provide a callback
mechanism to the Python programmer; the implementation will require
calling the Python callback functions from a C callback. Other uses
are also imaginable.

Fortunately, the Python interpreter is easily called recursively, and
there is a standard interface to call a Python function. (I won't
dwell on how to call the Python parser with a particular string as
input --- if you're interested, have a look at the implementation of
the \programopt{-c} command line option in \file{Modules/main.c}
from the Python source code.)

Calling a Python function is easy. First, the Python program must
somehow pass you the Python function object. You should provide a
function (or some other interface) to do this. When this function is
called, save a pointer to the Python function object (be careful to
\cfunction{Py_INCREF()} it!) in a global variable --- or wherever you
see fit. For example, the following function might be part of a module
definition:

\begin{verbatim}
static PyObject *my_callback = NULL;

static PyObject *
my_set_callback(dummy, args)
    PyObject *dummy, *args;
{
    PyObject *result = NULL;
    PyObject *temp;

    if (PyArg_ParseTuple(args, "O:set_callback", &temp)) {
        if (!PyCallable_Check(temp)) {
            PyErr_SetString(PyExc_TypeError, "parameter must be callable");
            return NULL;
        }
        Py_XINCREF(temp);         /* Add a reference to new callback */
        Py_XDECREF(my_callback);  /* Dispose of previous callback */
        my_callback = temp;       /* Remember new callback */
        /* Boilerplate to return "None" */
        Py_INCREF(Py_None);
        result = Py_None;
    }
    return result;
}
\end{verbatim}

This function must be registered with the interpreter using the
\constant{METH_VARARGS} flag; this is described in section
\ref{methodTable}, ``The Module's Method Table and Initialization
Function.'' The \cfunction{PyArg_ParseTuple()} function and its
arguments are documented in section \ref{parseTuple}, ``Extracting
Parameters in Extension Functions.''

The macros \cfunction{Py_XINCREF()} and \cfunction{Py_XDECREF()}
increment/decrement the reference count of an object and are safe in
the presence of \NULL{} pointers (but note that \var{temp} will not be
\NULL{} in this context). More information on them can be found in
section \ref{refcounts}, ``Reference Counts.''

Later, when it is time to call the function, you call the C function
\cfunction{PyEval_CallObject()}. This function has two arguments, both
pointers to arbitrary Python objects: the Python function, and the
argument list. The argument list must always be a tuple object, whose
length is the number of arguments. To call the Python function with
no arguments, pass an empty tuple; to call it with one argument, pass
a singleton tuple. \cfunction{Py_BuildValue()} returns a tuple when its
format string consists of zero or more format codes between
parentheses. For example:

\begin{verbatim}
    int arg;
    PyObject *arglist;
    PyObject *result;
    ...
    arg = 123;
    ...
    /* Time to call the callback */
    arglist = Py_BuildValue("(i)", arg);
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
\end{verbatim}

\cfunction{PyEval_CallObject()} returns a Python object pointer: this is
the return value of the Python function. \cfunction{PyEval_CallObject()} is
``reference-count-neutral'' with respect to its arguments. In the
example a new tuple was created to serve as the argument list, which
is \cfunction{Py_DECREF()}-ed immediately after the call.

The return value of \cfunction{PyEval_CallObject()} is ``new'': either it
is a brand new object, or it is an existing object whose reference
count has been incremented. So, unless you want to save it in a
global variable, you should somehow \cfunction{Py_DECREF()} the result,
even (especially!) if you are not interested in its value.

Before you do this, however, it is important to check that the return
value isn't \NULL{}. If it is, the Python function terminated by
raising an exception. If the C code that called
\cfunction{PyEval_CallObject()} is called from Python, it should now
return an error indication to its Python caller, so the interpreter
can print a stack trace, or the calling Python code can handle the
exception. If this is not possible or desirable, the exception should
be cleared by calling \cfunction{PyErr_Clear()}. For example:

\begin{verbatim}
    if (result == NULL)
        return NULL; /* Pass error back */
    ...use result...
    Py_DECREF(result);
\end{verbatim}
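
When the error cannot be passed back, for instance because the C code
was not itself called from Python, the exception can be discarded
instead. The following is a minimal sketch under that assumption; the
helper \cfunction{run_callback_ignoring_errors()} is a hypothetical
name, and \code{my_callback} stands for the variable from the earlier
example:

\begin{verbatim}
#include <Python.h>

static PyObject *my_callback = NULL;   /* Set elsewhere, e.g. by
                                          my_set_callback() */

/* Hypothetical helper: call the stored callback without arguments and
   ignore any exception it raises.  Returns 1 if the call succeeded,
   0 otherwise. */
static int
run_callback_ignoring_errors(void)
{
    PyObject *arglist;
    PyObject *result;

    if (my_callback == NULL)
        return 0;                      /* Nothing to call */
    arglist = Py_BuildValue("()");     /* Empty argument tuple */
    if (arglist == NULL) {
        PyErr_Clear();                 /* Out of memory; give up quietly */
        return 0;
    }
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
    if (result == NULL) {
        PyErr_Clear();                 /* Discard the callback's exception */
        return 0;
    }
    Py_DECREF(result);                 /* The value itself is not needed */
    return 1;
}
\end{verbatim}

Whether silently dropping an exception is acceptable depends on the
application; logging it before clearing it is often the safer choice.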

Depending on the desired interface to the Python callback function,
you may also have to provide an argument list to
\cfunction{PyEval_CallObject()}. In some cases the argument list is
also provided by the Python program, through the same interface that
specified the callback function. It can then be saved and used in the
same manner as the function object. In other cases, you may have to
construct a new tuple to pass as the argument list. The simplest way
to do this is to call \cfunction{Py_BuildValue()}. For example, if
you want to pass an integral event code, you might use the following
code:

\begin{verbatim}
    PyObject *arglist;
    ...
    arglist = Py_BuildValue("(l)", eventcode);
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);
    if (result == NULL)
        return NULL; /* Pass error back */
    /* Here maybe use the result */
    Py_DECREF(result);
\end{verbatim}

Note the placement of \samp{Py_DECREF(arglist)} immediately after the
call, before the error check! Also note that, strictly speaking, this
code is not complete: \cfunction{Py_BuildValue()} may run out of
memory, and this should be checked.
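
A version that includes this check might be sketched as follows. The
wrapper function \cfunction{call_my_callback()} and the
\exception{RuntimeError} raised when no callback has been stored are
illustrative choices, not part of the example above:

\begin{verbatim}
#include <Python.h>

static PyObject *my_callback = NULL;   /* Set elsewhere, e.g. by
                                          my_set_callback() */

/* Hypothetical helper: call the stored callback with one integer
   argument.  Returns a new reference, or NULL with an exception set. */
static PyObject *
call_my_callback(long eventcode)
{
    PyObject *arglist;
    PyObject *result;

    if (my_callback == NULL) {
        PyErr_SetString(PyExc_RuntimeError, "no callback has been set");
        return NULL;
    }
    arglist = Py_BuildValue("(l)", eventcode);
    if (arglist == NULL)
        return NULL;                   /* Exception is already set */
    result = PyEval_CallObject(my_callback, arglist);
    Py_DECREF(arglist);                /* Done with the tuple either way */
    return result;                     /* NULL signals an error */
}
\end{verbatim}

The caller remains responsible for calling \cfunction{Py_DECREF()} on
a non-\NULL{} result.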


\section{Extracting Parameters in Extension Functions
         \label{parseTuple}}

The \cfunction{PyArg_ParseTuple()} function is declared as follows:

\begin{verbatim}
int PyArg_ParseTuple(PyObject *arg, char *format, ...);
\end{verbatim}

The \var{arg} argument must be a tuple object containing an argument
list passed from Python to a C function. The \var{format} argument
must be a format string, whose syntax is explained below. The
remaining arguments must be addresses of variables whose type is
determined by the format string. For the conversion to succeed, the
\var{arg} object must match the format and the format must be
exhausted. On success, \cfunction{PyArg_ParseTuple()} returns true,
otherwise it returns false and raises an appropriate exception.

Note that while \cfunction{PyArg_ParseTuple()} checks that the Python
arguments have the required types, it cannot check the validity of the
addresses of C variables passed to the call: if you make mistakes
there, your code will probably crash or at least overwrite random bits
in memory. So be careful!

A format string consists of zero or more ``format units''. A format
unit describes one Python object; it is usually a single character or
a parenthesized sequence of format units. With a few exceptions, a
format unit that is not a parenthesized sequence normally corresponds
to a single address argument to \cfunction{PyArg_ParseTuple()}. In the
following description, the quoted form is the format unit; the entry
in (round) parentheses is the Python object type that matches the
format unit; and the entry in [square] brackets is the type of the C
variable(s) whose address should be passed. (Use the \samp{\&}
operator to pass a variable's address.)

Note that any Python object references which are provided to the
caller are \emph{borrowed} references; do not decrement their
reference count!

\begin{description}

\item[\samp{s} (string or Unicode object) {[char *]}]
Convert a Python string or Unicode object to a C pointer to a
character string. You must not provide storage for the string
itself; a pointer to an existing string is stored into the character
pointer variable whose address you pass. The C string is
null-terminated. The Python string must not contain embedded null
bytes; if it does, a \exception{TypeError} exception is raised.
Unicode objects are converted to C strings using the default
encoding. If this conversion fails, an \exception{UnicodeError} is
raised.

\item[\samp{s\#} (string, Unicode or any read buffer compatible object)
{[char *, int]}]
This variant on \samp{s} stores into two C variables, the first one a
pointer to a character string, the second one its length. In this
case the Python string may contain embedded null bytes. Unicode
objects pass back a pointer to the default encoded string version of the
object if such a conversion is possible. All other read buffer
compatible objects pass back a reference to the raw internal data
representation.

\item[\samp{z} (string or \code{None}) {[char *]}]
Like \samp{s}, but the Python object may also be \code{None}, in which
case the C pointer is set to \NULL{}.

\item[\samp{z\#} (string or \code{None} or any read buffer compatible object)
{[char *, int]}]
This is to \samp{s\#} as \samp{z} is to \samp{s}.

\item[\samp{u} (Unicode object) {[Py_UNICODE *]}]
Convert a Python Unicode object to a C pointer to a null-terminated
buffer of 16-bit Unicode (UTF-16) data. As with \samp{s}, there is no need
to provide storage for the Unicode data buffer; a pointer to the
existing Unicode data is stored into the \ctype{Py_UNICODE} pointer
variable whose address you pass.

\item[\samp{u\#} (Unicode object) {[Py_UNICODE *, int]}]
This variant on \samp{u} stores into two C variables, the first one
a pointer to a Unicode data buffer, the second one its length.

\item[\samp{es} (string, Unicode object or character buffer compatible
object) {[const char *encoding, char **buffer]}]
This variant on \samp{s} is used for encoding Unicode and objects
convertible to Unicode into a character buffer. It only works for
encoded data without embedded null bytes.

The variant takes two C arguments: the first, which is only read, is a
pointer to an encoding name string (\var{encoding}); the second is a
pointer to a pointer to a character buffer (\var{**buffer}, the buffer
used for storing the encoded data).

The encoding name must map to a registered codec. If set to \NULL{},
the default encoding is used.

\cfunction{PyArg_ParseTuple()} will allocate a buffer of the needed
size using \cfunction{PyMem_NEW()}, copy the encoded data into this
buffer and adjust \var{*buffer} to reference the newly allocated
storage. The caller is responsible for calling
\cfunction{PyMem_Free()} to free the allocated buffer after usage.

\item[\samp{et} (string, Unicode object or character buffer compatible
object) {[const char *encoding, char **buffer]}]
Same as \samp{es} except that string objects are passed through without
recoding them. Instead, the implementation assumes that the string
object uses the encoding passed in as parameter.

\item[\samp{es\#} (string, Unicode object or character buffer compatible
object) {[const char *encoding, char **buffer, int *buffer_length]}]
This variant on \samp{s\#} is used for encoding Unicode and objects
convertible to Unicode into a character buffer. It takes three C
arguments: the first, which is only read, is a pointer to an encoding
name string (\var{encoding}); the second is a pointer to a pointer to
a character buffer (\var{**buffer}, the buffer used for storing the
encoded data); and the third is a pointer to an integer
(\var{*buffer_length}, the buffer length).

The encoding name must map to a registered codec. If set to \NULL{},
the default encoding is used.

There are two modes of operation:

If \var{*buffer} points to a \NULL{} pointer,
\cfunction{PyArg_ParseTuple()} will allocate a buffer of the needed
size using \cfunction{PyMem_NEW()}, copy the encoded data into this
buffer and adjust \var{*buffer} to reference the newly allocated
storage. The caller is responsible for calling
\cfunction{PyMem_Free()} to free the allocated buffer after usage.

If \var{*buffer} points to a non-\NULL{} pointer (an already allocated
buffer), \cfunction{PyArg_ParseTuple()} will use this location as
buffer and interpret \var{*buffer_length} as buffer size. It will then
copy the encoded data into the buffer and 0-terminate it. Buffer
overflow is signalled with an exception.

In both cases, \var{*buffer_length} is set to the length of the
encoded data without the trailing 0-byte.

\item[\samp{et\#} (string, Unicode object or character buffer compatible
object) {[const char *encoding, char **buffer, int *buffer_length]}]
Same as \samp{es\#} except that string objects are passed through without
recoding them. Instead, the implementation assumes that the string
object uses the encoding passed in as parameter.

\item[\samp{b} (integer) {[char]}]
Convert a Python integer to a tiny int, stored in a C \ctype{char}.

\item[\samp{h} (integer) {[short int]}]
Convert a Python integer to a C \ctype{short int}.

\item[\samp{i} (integer) {[int]}]
Convert a Python integer to a plain C \ctype{int}.

\item[\samp{l} (integer) {[long int]}]
Convert a Python integer to a C \ctype{long int}.

\item[\samp{c} (string of length 1) {[char]}]
Convert a Python character, represented as a string of length 1, to a
C \ctype{char}.

\item[\samp{f} (float) {[float]}]
Convert a Python floating point number to a C \ctype{float}.

\item[\samp{d} (float) {[double]}]
Convert a Python floating point number to a C \ctype{double}.

\item[\samp{D} (complex) {[Py_complex]}]
Convert a Python complex number to a C \ctype{Py_complex} structure.

\item[\samp{O} (object) {[PyObject *]}]
Store a Python object (without any conversion) in a C object pointer.
The C program thus receives the actual object that was passed. The
object's reference count is not increased. The pointer stored is not
\NULL{}.

\item[\samp{O!} (object) {[\var{typeobject}, PyObject *]}]
Store a Python object in a C object pointer. This is similar to
\samp{O}, but takes two C arguments: the first is the address of a
Python type object, the second is the address of the C variable (of
type \ctype{PyObject *}) into which the object pointer is stored.
If the Python object does not have the required type,
\exception{TypeError} is raised.

\item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
Convert a Python object to a C variable through a \var{converter}
function. This takes two arguments: the first is a function, the
second is the address of a C variable (of arbitrary type), converted
to \ctype{void *}. The \var{converter} function in turn is called as
follows:

\var{status}\code{ = }\var{converter}\code{(}\var{object}, \var{address}\code{);}

where \var{object} is the Python object to be converted and
\var{address} is the \ctype{void *} argument that was passed to
\cfunction{PyArg_ParseTuple()}. The returned \var{status} should be
\code{1} for a successful conversion and \code{0} if the conversion
has failed. When the conversion fails, the \var{converter} function
should raise an exception.

\item[\samp{S} (string) {[PyStringObject *]}]
Like \samp{O} but requires that the Python object is a string object.
Raises \exception{TypeError} if the object is not a string object.
The C variable may also be declared as \ctype{PyObject *}.

\item[\samp{U} (Unicode string) {[PyUnicodeObject *]}]
Like \samp{O} but requires that the Python object is a Unicode object.
Raises \exception{TypeError} if the object is not a Unicode object.
The C variable may also be declared as \ctype{PyObject *}.

\item[\samp{t\#} (read-only character buffer) {[char *, int]}]
Like \samp{s\#}, but accepts any object which implements the read-only
buffer interface. The \ctype{char *} variable is set to point to the
first byte of the buffer, and the \ctype{int} is set to the length of
the buffer. Only single-segment buffer objects are accepted;
\exception{TypeError} is raised for all others.

\item[\samp{w} (read-write character buffer) {[char *]}]
Similar to \samp{s}, but accepts any object which implements the
read-write buffer interface. The caller must determine the length of
the buffer by other means, or use \samp{w\#} instead. Only
single-segment buffer objects are accepted; \exception{TypeError} is
raised for all others.

\item[\samp{w\#} (read-write character buffer) {[char *, int]}]
Like \samp{s\#}, but accepts any object which implements the
read-write buffer interface. The \ctype{char *} variable is set to
point to the first byte of the buffer, and the \ctype{int} is set to
the length of the buffer. Only single-segment buffer objects are
accepted; \exception{TypeError} is raised for all others.

\item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
The object must be a Python sequence whose length is the number of
format units in \var{items}. The C arguments must correspond to the
individual format units in \var{items}. Format units for sequences
may be nested.

\strong{Note:} Prior to Python version 1.5.2, this format specifier
only accepted a tuple containing the individual parameters, not an
arbitrary sequence. Code which previously caused
\exception{TypeError} to be raised here may now proceed without an
exception. This is not expected to be a problem for existing code.

\end{description}
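
As an illustration of the \samp{O\&} format unit, a converter could be
sketched as follows. Both function names are hypothetical; the
converter merely checks that the object is callable and hands back a
borrowed reference through \var{address}:

\begin{verbatim}
#include <Python.h>

/* Hypothetical converter for use with "O&": accept only callable
   objects.  `address' is the void * passed to PyArg_ParseTuple(). */
static int
convert_callable(PyObject *object, void *address)
{
    if (!PyCallable_Check(object)) {
        PyErr_SetString(PyExc_TypeError, "parameter must be callable");
        return 0;                      /* Failure; exception is set */
    }
    *(PyObject **)address = object;    /* Borrowed reference */
    return 1;                          /* Success */
}

/* Hypothetical extension function: return its argument unchanged,
   provided that it is callable. */
static PyObject *
spam_require_callable(PyObject *self, PyObject *args)
{
    PyObject *func;

    if (!PyArg_ParseTuple(args, "O&:require_callable",
                          convert_callable, &func))
        return NULL;
    Py_INCREF(func);                   /* Return a new reference */
    return func;
}
\end{verbatim}

Putting the check into a converter keeps the type test and its error
message in one place when several functions accept the same kind of
argument.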

It is possible to pass Python long integers where integers are
requested; however no proper range checking is done --- the most
significant bits are silently truncated when the receiving field is
too small to receive the value (actually, the semantics are inherited
from downcasts in C --- your mileage may vary).
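
The effect can be illustrated without the Python API. The sketch below
is plain C with hypothetical helper names: converting to an unsigned
type keeps the value modulo one more than the largest value of that
type, whereas converting an out-of-range value to a signed type gives
an implementation-defined result. A range check such as the second
helper, applied after parsing with \samp{l}, avoids relying on either
behaviour:

\begin{verbatim}
#include <limits.h>

/* Narrow to unsigned char: well defined, keeps the low-order bits,
   i.e. the value modulo (UCHAR_MAX + 1). */
static unsigned char
narrow_to_uchar(unsigned long value)
{
    return (unsigned char)value;
}

/* Report whether a long can be stored in a short without loss. */
static int
fits_in_short(long value)
{
    return value >= SHRT_MIN && value <= SHRT_MAX;
}
\end{verbatim}

On a platform with 8-bit characters, \code{narrow_to_uchar(0x1234)}
yields \code{0x34}.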

A few other characters have a meaning in a format string. These may
not occur inside nested parentheses. They are:

\begin{description}

\item[\samp{|}]
Indicates that the remaining arguments in the Python argument list are
optional. The C variables corresponding to optional arguments should
be initialized to their default value --- when an optional argument is
not specified, \cfunction{PyArg_ParseTuple()} does not touch the contents
of the corresponding C variable(s).

\item[\samp{:}]
The list of format units ends here; the string after the colon is used
as the function name in error messages (the ``associated value'' of
the exception that \cfunction{PyArg_ParseTuple()} raises).

\item[\samp{;}]
The list of format units ends here; the string after the semicolon is
used as the error message \emph{instead} of the default error message.
Clearly, \samp{:} and \samp{;} mutually exclude each other.

\end{description}

Some example calls:

\begin{verbatim}
    int ok;
    int i, j;
    long k, l;
    char *s;
    int size;

    ok = PyArg_ParseTuple(args, ""); /* No arguments */
        /* Python call: f() */
\end{verbatim}

\begin{verbatim}
    ok = PyArg_ParseTuple(args, "s", &s); /* A string */
        /* Possible Python call: f('whoops!') */
\end{verbatim}

\begin{verbatim}
    ok = PyArg_ParseTuple(args, "lls", &k, &l, &s); /* Two longs and a string */
        /* Possible Python call: f(1, 2, 'three') */
\end{verbatim}

\begin{verbatim}
    ok = PyArg_ParseTuple(args, "(ii)s#", &i, &j, &s, &size);
        /* A pair of ints and a string, whose size is also returned */
        /* Possible Python call: f((1, 2), 'three') */
\end{verbatim}

\begin{verbatim}
    {
        char *file;
        char *mode = "r";
        int bufsize = 0;
        ok = PyArg_ParseTuple(args, "s|si", &file, &mode, &bufsize);
        /* A string, and optionally another string and an integer */
        /* Possible Python calls:
           f('spam')
           f('spam', 'w')
           f('spam', 'wb', 100000) */
    }
\end{verbatim}

\begin{verbatim}
    {
        int left, top, right, bottom, h, v;
        ok = PyArg_ParseTuple(args, "((ii)(ii))(ii)",
                 &left, &top, &right, &bottom, &h, &v);
        /* A rectangle and a point */
        /* Possible Python call:
           f(((0, 0), (400, 300)), (10, 10)) */
    }
\end{verbatim}

\begin{verbatim}
    {
        Py_complex c;
        ok = PyArg_ParseTuple(args, "D:myfunction", &c);
        /* a complex, also providing a function name for errors */
        /* Possible Python call: myfunction(1+2j) */
    }
\end{verbatim}
Fred Drakeb6e50321998-02-04 20:26:31 +0000975
976
\section{Keyword Parameters for Extension Functions
         \label{parseTupleAndKeywords}}

The \cfunction{PyArg_ParseTupleAndKeywords()} function is declared as
follows:

\begin{verbatim}
int PyArg_ParseTupleAndKeywords(PyObject *arg, PyObject *kwdict,
                                char *format, char **kwlist, ...);
\end{verbatim}

The \var{arg} and \var{format} parameters are identical to those of the
\cfunction{PyArg_ParseTuple()} function. The \var{kwdict} parameter
is the dictionary of keywords received as the third parameter from the
Python runtime. The \var{kwlist} parameter is a \NULL{}-terminated
list of strings which identify the parameters; the names are matched
with the type information from \var{format} from left to right. On
success, \cfunction{PyArg_ParseTupleAndKeywords()} returns true;
otherwise it returns false and raises an appropriate exception.

\strong{Note:} Nested tuples cannot be parsed when using keyword
arguments! Keyword parameters passed in which are not present in the
\var{kwlist} will cause \exception{TypeError} to be raised.

Here is an example module which uses keywords, based on an example by
Geoff Philbrick (\email{philbrick@hks.com}):%
\index{Philbrick, Geoff}

\begin{verbatim}
#include <stdio.h>
#include "Python.h"

static PyObject *
keywdarg_parrot(PyObject *self, PyObject *args, PyObject *keywds)
{
    int voltage;
    char *state = "a stiff";
    char *action = "voom";
    char *type = "Norwegian Blue";

    static char *kwlist[] = {"voltage", "state", "action", "type", NULL};

    if (!PyArg_ParseTupleAndKeywords(args, keywds, "i|sss", kwlist,
                                     &voltage, &state, &action, &type))
        return NULL;

    printf("-- This parrot wouldn't %s if you put %i Volts through it.\n",
           action, voltage);
    printf("-- Lovely plumage, the %s -- It's %s!\n", type, state);

    Py_INCREF(Py_None);

    return Py_None;
}

static PyMethodDef keywdarg_methods[] = {
    /* The cast of the function is necessary since PyCFunction values
     * only take two PyObject* parameters, and keywdarg_parrot() takes
     * three.
     */
    {"parrot", (PyCFunction)keywdarg_parrot, METH_VARARGS|METH_KEYWORDS},
    {NULL, NULL}   /* sentinel */
};

void
initkeywdarg(void)
{
    /* Create the module and add the functions */
    Py_InitModule("keywdarg", keywdarg_methods);
}
\end{verbatim}


\section{Building Arbitrary Values
         \label{buildValue}}

\cfunction{Py_BuildValue()} is the counterpart to
\cfunction{PyArg_ParseTuple()}. It is declared as follows:

\begin{verbatim}
PyObject *Py_BuildValue(char *format, ...);
\end{verbatim}

It recognizes a set of format units similar to the ones recognized by
\cfunction{PyArg_ParseTuple()}, but the arguments (which are input to the
function, not output) must not be pointers, just values. It returns a
new Python object, suitable for returning from a C function called
from Python.

One difference from \cfunction{PyArg_ParseTuple()}: while the latter
requires its first argument to be a tuple (since Python argument lists
are always represented as tuples internally),
\cfunction{Py_BuildValue()} does not always build a tuple. It builds
a tuple only if its format string contains two or more format units.
If the format string is empty, it returns \code{None}; if it contains
exactly one format unit, it returns whatever object is described by
that format unit. To force it to return a tuple of size zero or one,
parenthesize the format string.
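
As a rough illustration of this rule, the following sketch counts the
top-level format units in a format string. The helper
\cfunction{count_top_level_units()} is hypothetical and is not part of
the Python C API. It only mirrors the counting rule given in this
section (spaces, tabs, colons and commas are skipped, a bracketed group
counts as one unit, and \samp{\#} and \samp{\&} modify the preceding
unit); it does not validate the format string the way
\cfunction{Py_BuildValue()} does.

```c
#include <assert.h>

/* Hypothetical helper, not part of the Python C API: count the
 * top-level format units in a Py_BuildValue()-style format string. */
int
count_top_level_units(const char *format)
{
    int count = 0;
    int depth = 0;
    const char *p;

    for (p = format; *p != '\0'; p++) {
        switch (*p) {
        case ' ': case '\t': case ':': case ',':
            break;                  /* ignored separators */
        case '#': case '&':
            break;                  /* modifier of the preceding unit */
        case '(': case '[': case '{':
            if (depth == 0)
                count++;            /* a bracketed group is one unit */
            depth++;
            break;
        case ')': case ']': case '}':
            assert(depth > 0);      /* unbalanced format string */
            depth--;
            break;
        default:
            if (depth == 0)
                count++;            /* a plain unit such as i, s or O */
            break;
        }
    }
    return count;
}
```

A count of zero corresponds to \code{None}, a count of one to the single
object described by that unit, and a count of two or more to a tuple.
The format \code{"(ii)"} counts as one unit whose object happens to be a
tuple, which is why parenthesizing forces a tuple result.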

When memory buffers are passed as parameters to supply data to build
objects, as for the \samp{s} and \samp{s\#} formats, the required data
is copied. Buffers provided by the caller are never referenced by the
objects created by \cfunction{Py_BuildValue()}. In other words, if
your code invokes \cfunction{malloc()} and passes the allocated memory
to \cfunction{Py_BuildValue()}, your code is responsible for
calling \cfunction{free()} for that memory once
\cfunction{Py_BuildValue()} returns.

In the following description, the quoted form is the format unit; the
entry in (round) parentheses is the Python object type that the format
unit will return; and the entry in [square] brackets is the type of
the C value(s) to be passed.

The characters space, tab, colon and comma are ignored in format
strings (but not within format units such as \samp{s\#}). This can be
used to make long format strings a tad more readable.

\begin{description}

\item[\samp{s} (string) {[char *]}]
Convert a null-terminated C string to a Python object. If the C
string pointer is \NULL{}, \code{None} is used.

\item[\samp{s\#} (string) {[char *, int]}]
Convert a C string and its length to a Python object. If the C string
pointer is \NULL{}, the length is ignored and \code{None} is
returned.

\item[\samp{z} (string or \code{None}) {[char *]}]
Same as \samp{s}.

\item[\samp{z\#} (string or \code{None}) {[char *, int]}]
Same as \samp{s\#}.

\item[\samp{u} (Unicode string) {[Py_UNICODE *]}]
Convert a null-terminated buffer of Unicode (UCS-2) data to a Python
Unicode object. If the Unicode buffer pointer is \NULL{},
\code{None} is returned.

\item[\samp{u\#} (Unicode string) {[Py_UNICODE *, int]}]
Convert a Unicode (UCS-2) data buffer and its length to a Python
Unicode object. If the Unicode buffer pointer is \NULL{}, the length
is ignored and \code{None} is returned.

\item[\samp{i} (integer) {[int]}]
Convert a plain C \ctype{int} to a Python integer object.

\item[\samp{b} (integer) {[char]}]
Same as \samp{i}.

\item[\samp{h} (integer) {[short int]}]
Same as \samp{i}.

\item[\samp{l} (integer) {[long int]}]
Convert a C \ctype{long int} to a Python integer object.

\item[\samp{c} (string of length 1) {[char]}]
Convert a C \ctype{int} representing a character to a Python string of
length 1.

\item[\samp{d} (float) {[double]}]
Convert a C \ctype{double} to a Python floating point number.

\item[\samp{f} (float) {[float]}]
Same as \samp{d}.

\item[\samp{D} (complex) {[Py_complex *]}]
Convert a C \ctype{Py_complex} structure to a Python complex number.

\item[\samp{O} (object) {[PyObject *]}]
Pass a Python object untouched (except for its reference count, which
is incremented by one). If the object passed in is a \NULL{}
pointer, it is assumed that the call producing the argument found an
error and set an exception. Therefore,
\cfunction{Py_BuildValue()} will return \NULL{} but won't raise an
exception. If no exception has been raised yet,
\cdata{PyExc_SystemError} is set.

\item[\samp{S} (object) {[PyObject *]}]
Same as \samp{O}.

\item[\samp{U} (object) {[PyObject *]}]
Same as \samp{O}.

\item[\samp{N} (object) {[PyObject *]}]
Same as \samp{O}, except it doesn't increment the reference count on
the object. Useful when the object is created by a call to an object
constructor in the argument list.

\item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
Convert \var{anything} to a Python object through a \var{converter}
function. The function is called with \var{anything} (which should be
compatible with \ctype{void *}) as its argument and should return a
``new'' Python object, or \NULL{} if an error occurred.

\item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
Convert a sequence of C values to a Python tuple with the same number
of items.

\item[\samp{[\var{items}]} (list) {[\var{matching-items}]}]
Convert a sequence of C values to a Python list with the same number
of items.

\item[\samp{\{\var{items}\}} (dictionary) {[\var{matching-items}]}]
Convert a sequence of C values to a Python dictionary. Each pair of
consecutive C values adds one item to the dictionary, serving as key
and value, respectively.

\end{description}

If there is an error in the format string, the
\cdata{PyExc_SystemError} exception is raised and \NULL{} returned.

Examples (the call on the left, the resulting Python value on the right):

\begin{verbatim}
    Py_BuildValue("")                        None
    Py_BuildValue("i", 123)                  123
    Py_BuildValue("iii", 123, 456, 789)      (123, 456, 789)
    Py_BuildValue("s", "hello")              'hello'
    Py_BuildValue("ss", "hello", "world")    ('hello', 'world')
    Py_BuildValue("s#", "hello", 4)          'hell'
    Py_BuildValue("()")                      ()
    Py_BuildValue("(i)", 123)                (123,)
    Py_BuildValue("(ii)", 123, 456)          (123, 456)
    Py_BuildValue("(i,i)", 123, 456)         (123, 456)
    Py_BuildValue("[i,i]", 123, 456)         [123, 456]
    Py_BuildValue("{s:i,s:i}",
                  "abc", 123, "def", 456)    {'abc': 123, 'def': 456}
    Py_BuildValue("((ii)(ii)) (ii)",
                  1, 2, 3, 4, 5, 6)          (((1, 2), (3, 4)), (5, 6))
\end{verbatim}


\section{Reference Counts
         \label{refcounts}}

In languages like C or \Cpp{}, the programmer is responsible for
dynamic allocation and deallocation of memory on the heap. In C,
this is done using the functions \cfunction{malloc()} and
\cfunction{free()}. In \Cpp{}, the operators \keyword{new} and
\keyword{delete} are used with essentially the same meaning; they are
typically implemented using \cfunction{malloc()} and
\cfunction{free()}, so we'll restrict the following discussion to the
latter.

Every block of memory allocated with \cfunction{malloc()} should
eventually be returned to the pool of available memory by exactly one
call to \cfunction{free()}. It is important to call
\cfunction{free()} at the right time. If a block's address is
forgotten but \cfunction{free()} is not called for it, the memory it
occupies cannot be reused until the program terminates. This is
called a \dfn{memory leak}. On the other hand, if a program calls
\cfunction{free()} for a block and then continues to use the block, it
creates a conflict with re-use of the block through another
\cfunction{malloc()} call. This is called \dfn{using freed memory}.
It has the same bad consequences as referencing uninitialized data ---
core dumps, wrong results, mysterious crashes.

Common causes of memory leaks are unusual paths through the code. For
instance, a function may allocate a block of memory, do some
calculation, and then free the block again. Now a change in the
requirements for the function may add a test to the calculation that
detects an error condition and can return prematurely from the
function. It's easy to forget to free the allocated memory block when
taking this premature exit, especially when it is added later to the
code. Such leaks, once introduced, often go undetected for a long
time: the error exit is taken only in a small fraction of all calls,
and most modern machines have plenty of virtual memory, so the leak
only becomes apparent in a long-running process that uses the leaking
function frequently. Therefore, it's important to prevent leaks from
happening by having a coding convention or strategy that minimizes
this kind of error.
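
One convention that helps is to route every exit, including error
exits, through a single cleanup point. The sketch below is only an
illustration of that idea: the function
\cfunction{checked_copy_length()} and its parameters are hypothetical,
invented for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: returns 0 on success and -1 on error.
 * Every exit path, including the early error exit, passes through
 * the single cleanup label, so the buffer is freed exactly once. */
int
checked_copy_length(const char *input, size_t limit, size_t *length)
{
    int result = -1;
    size_t n;
    char *buffer;

    assert(input != NULL && length != NULL);
    n = strlen(input);
    buffer = malloc(n + 1);
    if (buffer == NULL)
        goto done;              /* free(NULL) below is a no-op */
    memcpy(buffer, input, n + 1);

    if (n > limit)              /* an error test added later on */
        goto done;              /* premature exit, but no leak */

    *length = strlen(buffer);
    result = 0;

done:
    free(buffer);
    return result;
}
```

Because the error test jumps to the same label as the normal path, a
later change that adds another early exit cannot skip the call to
\cfunction{free()}.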

Since Python makes heavy use of \cfunction{malloc()} and
\cfunction{free()}, it needs a strategy to avoid memory leaks as well
as the use of freed memory. The chosen method is called
\dfn{reference counting}. The principle is simple: every object
contains a counter, which is incremented when a reference to the
object is stored somewhere, and which is decremented when a reference
to it is deleted. When the counter reaches zero, the last reference
to the object has been deleted and the object is freed.
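
The principle can be sketched in a few lines of plain C. This is a toy
model only: the names \ctype{toy_object}, \cfunction{toy_new()},
\cfunction{toy_incref()} and \cfunction{toy_decref()} are hypothetical,
and real Python objects are laid out and managed differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of a reference-counted object (not Python's layout). */
typedef struct {
    int refcount;
    int value;
} toy_object;

int toy_live_objects = 0;       /* objects allocated and not yet freed */

toy_object *
toy_new(int value)
{
    toy_object *op = malloc(sizeof(toy_object));
    if (op == NULL)
        return NULL;
    op->refcount = 1;           /* the caller holds the first reference */
    op->value = value;
    toy_live_objects++;
    return op;
}

void
toy_incref(toy_object *op)
{
    op->refcount++;             /* a reference was stored somewhere */
}

void
toy_decref(toy_object *op)
{
    assert(op->refcount > 0);
    if (--op->refcount == 0) {  /* the last reference was deleted */
        toy_live_objects--;
        free(op);
    }
}
```

Each \cfunction{toy_incref()} must eventually be balanced by one
\cfunction{toy_decref()}; the object is released only when the final
reference goes away.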

An alternative strategy is called \dfn{automatic garbage collection}.
(Sometimes, reference counting is also referred to as a garbage
collection strategy, hence my use of ``automatic'' to distinguish the
two.) The big advantage of automatic garbage collection is that the
user doesn't need to call \cfunction{free()} explicitly. (Another claimed
advantage is an improvement in speed or memory usage --- this is not a
hard fact, however.) The disadvantage is that for C, there is no
truly portable automatic garbage collector, while reference counting
can be implemented portably (as long as the functions \cfunction{malloc()}
and \cfunction{free()} are available --- which the C Standard guarantees).
Maybe some day a sufficiently portable automatic garbage collector
will be available for C. Until then, we'll have to live with
reference counts.

\subsection{Reference Counting in Python
            \label{refcountsInPython}}

There are two macros, \code{Py_INCREF(x)} and \code{Py_DECREF(x)},
which handle the incrementing and decrementing of the reference count.
\cfunction{Py_DECREF()} also frees the object when the count reaches zero.
For flexibility, it doesn't call \cfunction{free()} directly --- rather, it
makes a call through a function pointer in the object's \dfn{type
object}. For this purpose (and others), every object also contains a
pointer to its type object.
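
The indirection through the type object can be pictured with the
following sketch. It is a simplified stand-in, not the interpreter's
actual definitions: \ctype{mini_type}, \ctype{mini_object},
\code{MINI_INCREF} and \code{MINI_DECREF} are hypothetical names chosen
for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct mini_object;

/* Toy "type object": it knows how to dispose of its instances. */
typedef struct {
    const char *name;
    void (*dealloc)(struct mini_object *);
} mini_type;

typedef struct mini_object {
    int refcount;
    const mini_type *type;      /* every object points at its type */
} mini_object;

int mini_dealloc_calls = 0;     /* lets callers observe deallocation */

void
mini_plain_dealloc(mini_object *op)
{
    assert(op->refcount == 0);
    mini_dealloc_calls++;
    free(op);
}

const mini_type mini_plain_type = {"plain", mini_plain_dealloc};

mini_object *
mini_new(const mini_type *type)
{
    mini_object *op = malloc(sizeof(mini_object));
    if (op != NULL) {
        op->refcount = 1;
        op->type = type;
    }
    return op;
}

#define MINI_INCREF(op) ((op)->refcount++)

/* Does not call free() itself; it dispatches through the type. */
#define MINI_DECREF(op)                     \
    do {                                    \
        if (--(op)->refcount == 0)          \
            (op)->type->dealloc(op);        \
    } while (0)
```

Since the decrement macro only knows about the function pointer, each
type is free to release extra resources, or to recycle memory instead of
returning it to \cfunction{free()}.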

The big question now remains: when to use \code{Py_INCREF(x)} and
\code{Py_DECREF(x)}? Let's first introduce some terms. Nobody
``owns'' an object; however, you can \dfn{own a reference} to an
object. An object's reference count is now defined as the number of
owned references to it. The owner of a reference is responsible for
calling \cfunction{Py_DECREF()} when the reference is no longer
needed. Ownership of a reference can be transferred. There are three
ways to dispose of an owned reference: pass it on, store it, or call
\cfunction{Py_DECREF()}. Forgetting to dispose of an owned reference
creates a memory leak.

It is also possible to \dfn{borrow}\footnote{The metaphor of
``borrowing'' a reference is not completely correct: the owner still
has a copy of the reference.} a reference to an object. The borrower
of a reference should not call \cfunction{Py_DECREF()}. The borrower must
not hold on to the object longer than the owner from which it was
borrowed. Using a borrowed reference after the owner has disposed of
it risks using freed memory and should be avoided
completely.\footnote{Checking that the reference count is at least 1
\strong{does not work} --- the reference count itself could be in
freed memory and may thus be reused for another object!}

The advantage of borrowing over owning a reference is that you don't
need to take care of disposing of the reference on all possible paths
through the code --- in other words, with a borrowed reference you
don't run the risk of leaking when a premature exit is taken. The
disadvantage of borrowing over owning is that there are some subtle
situations where in seemingly correct code a borrowed reference can be
used after the owner from which it was borrowed has in fact disposed
of it.

A borrowed reference can be changed into an owned reference by calling
\cfunction{Py_INCREF()}. This does not affect the status of the owner from
which the reference was borrowed --- it creates a new owned reference,
and gives the new owner full owner responsibilities (the new owner must
dispose of the reference properly, just as the previous owner must).


\subsection{Ownership Rules
            \label{ownershipRules}}

Whenever an object reference is passed into or out of a function, it
is part of the function's interface specification whether ownership is
transferred with the reference or not.

Most functions that return a reference to an object pass on ownership
with the reference. In particular, all functions whose purpose is
to create a new object, such as \cfunction{PyInt_FromLong()} and
\cfunction{Py_BuildValue()}, pass ownership to the receiver. Even if
you don't receive a reference to a brand new object, you still receive
ownership of the reference. For instance,
\cfunction{PyInt_FromLong()} maintains a cache of popular values and can
return a reference to a cached item.

Many functions that extract objects from other objects also transfer
ownership with the reference, for instance
\cfunction{PyObject_GetAttrString()}. The picture is less clear here,
however, since a few common routines are exceptions:
\cfunction{PyTuple_GetItem()}, \cfunction{PyList_GetItem()},
\cfunction{PyDict_GetItem()}, and \cfunction{PyDict_GetItemString()}
all return references that you borrow from the tuple, list or
dictionary.

The function \cfunction{PyImport_AddModule()} also returns a borrowed
reference, even though it may actually create the object it returns:
this is possible because an owned reference to the object is stored in
\code{sys.modules}.

When you pass an object reference into another function, in general,
the function borrows the reference from you --- if it needs to store
it, it will use \cfunction{Py_INCREF()} to become an independent
owner. There are exactly two important exceptions to this rule:
\cfunction{PyTuple_SetItem()} and \cfunction{PyList_SetItem()}. These
functions take over ownership of the item passed to them --- even if
they fail! (Note that \cfunction{PyDict_SetItem()} and friends don't
take over ownership --- they are ``normal.'')

When a C function is called from Python, it borrows references to its
arguments from the caller. The caller owns a reference to the object,
so the borrowed reference's lifetime is guaranteed until the function
returns. Only when such a borrowed reference must be stored or passed
on must it be turned into an owned reference by calling
\cfunction{Py_INCREF()}.

The object reference returned from a C function that is called from
Python must be an owned reference --- ownership is transferred from the
function to its caller.


\subsection{Thin Ice
            \label{thinIce}}

There are a few situations where seemingly harmless use of a borrowed
reference can lead to problems. These all have to do with implicit
invocations of the interpreter, which can cause the owner of a
reference to dispose of it.

The first and most important case to know about is using
\cfunction{Py_DECREF()} on an unrelated object while borrowing a
reference to a list item. For instance:

\begin{verbatim}
void
bug(PyObject *list)
{
    PyObject *item = PyList_GetItem(list, 0);

    PyList_SetItem(list, 1, PyInt_FromLong(0L));
    PyObject_Print(item, stdout, 0); /* BUG! */
}
\end{verbatim}

This function first borrows a reference to \code{list[0]}, then
replaces \code{list[1]} with the value \code{0}, and finally prints
the borrowed reference. Looks harmless, right? But it's not!

Let's follow the control flow into \cfunction{PyList_SetItem()}. The list
owns references to all its items, so when item 1 is replaced, it has
to dispose of the original item 1. Now let's suppose the original
item 1 was an instance of a user-defined class, and let's further
suppose that the class defined a \method{__del__()} method. If this
class instance has a reference count of 1, disposing of it will call
its \method{__del__()} method.

Since it is written in Python, the \method{__del__()} method can execute
arbitrary Python code. Could it perhaps do something to invalidate
the reference to \code{item} in \cfunction{bug()}? You bet! Assuming
that the list passed into \cfunction{bug()} is accessible to the
\method{__del__()} method, it could execute a statement to the effect of
\samp{del list[0]}, and assuming this was the last reference to that
object, it would free the memory associated with it, thereby
invalidating \code{item}.

The solution, once you know the source of the problem, is easy:
temporarily increment the reference count. The correct version of the
function reads:

\begin{verbatim}
void
no_bug(PyObject *list)
{
    PyObject *item = PyList_GetItem(list, 0);

    Py_INCREF(item);
    PyList_SetItem(list, 1, PyInt_FromLong(0L));
    PyObject_Print(item, stdout, 0);
    Py_DECREF(item);
}
\end{verbatim}

This is a true story. An older version of Python contained variants
of this bug and someone spent a considerable amount of time in a C
debugger to figure out why his \method{__del__()} methods would fail...

The second case of problems with a borrowed reference is a variant
involving threads. Normally, multiple threads in the Python
interpreter can't get in each other's way, because there is a global
lock protecting Python's entire object space. However, it is possible
to temporarily release this lock using the macro
\code{Py_BEGIN_ALLOW_THREADS}, and to re-acquire it using
\code{Py_END_ALLOW_THREADS}. This is common around blocking I/O
calls, to let other threads use the processor while waiting for the I/O to
complete. Obviously, the following function has the same problem as
the previous one:

\begin{verbatim}
void
bug(PyObject *list)
{
    PyObject *item = PyList_GetItem(list, 0);
    Py_BEGIN_ALLOW_THREADS
    ...some blocking I/O call...
    Py_END_ALLOW_THREADS
    PyObject_Print(item, stdout, 0); /* BUG! */
}
\end{verbatim}


Fred Drake5e8aa541998-11-16 18:34:07 +00001460\subsection{NULL Pointers
1461 \label{nullPointers}}
Guido van Rossum5049bcb1995-03-13 16:55:23 +00001462
In general, functions that take object references as arguments do not
expect you to pass them \NULL{} pointers, and will dump core (or
cause later core dumps) if you do so.  Functions that return object
references generally return \NULL{} only to indicate that an
exception occurred.  The reason for not testing for \NULL{}
arguments is that functions often pass the objects they receive on to
other functions; if each function were to test for \NULL{},
there would be a lot of redundant tests and the code would run more
slowly.
Guido van Rossum5049bcb1995-03-13 16:55:23 +00001472
Fred Drake15e33d82001-07-06 06:49:32 +00001473It is better to test for \NULL{} only at the ``source:'' when a
1474pointer that may be \NULL{} is received, for example, from
Fred Draked7bb3031998-03-03 17:52:07 +00001475\cfunction{malloc()} or from a function that may raise an exception.
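
As a minimal sketch of testing only at the ``source'', consider the
following plain C fragment.  It does not use the Python API; the names
\cfunction{fill_buffer()} and \cfunction{make_filled_buffer()} are
hypothetical and exist only for this illustration.  The pointer is
tested once, where \cfunction{malloc()} returns it, and the helper that
receives it afterwards assumes it is valid.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: assumes buf is non-NULL, as most functions do. */
static void
fill_buffer(char *buf, size_t n, char value)
{
    memset(buf, value, n);
    buf[n] = '\0';
}

/* The "source": the only place where the pointer can be NULL,
   so the only place where it is tested. */
static char *
make_filled_buffer(size_t n, char value)
{
    char *buf = malloc(n + 1);

    if (buf == NULL)
        return NULL;            /* report the failure to the caller */
    fill_buffer(buf, n, value); /* no redundant NULL test needed here */
    return buf;
}
```

The caller of \cfunction{make_filled_buffer()} is itself a ``source''
for its own result and must test it once before passing it on.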
Guido van Rossum5049bcb1995-03-13 16:55:23 +00001476
Fred Draked7bb3031998-03-03 17:52:07 +00001477The macros \cfunction{Py_INCREF()} and \cfunction{Py_DECREF()}
Fred Drakea0dbddf1998-04-02 06:50:02 +00001478do not check for \NULL{} pointers --- however, their variants
Fred Draked7bb3031998-03-03 17:52:07 +00001479\cfunction{Py_XINCREF()} and \cfunction{Py_XDECREF()} do.
Guido van Rossum5049bcb1995-03-13 16:55:23 +00001480
1481The macros for checking for a particular object type
Fred Drake0fd82681998-01-09 05:39:38 +00001482(\code{Py\var{type}_Check()}) don't check for \NULL{} pointers ---
Guido van Rossum5049bcb1995-03-13 16:55:23 +00001483again, there is much code that calls several of these in a row to test
1484an object against various different expected types, and this would
Fred Drake0fd82681998-01-09 05:39:38 +00001485generate redundant tests. There are no variants with \NULL{}
Guido van Rossum5049bcb1995-03-13 16:55:23 +00001486checking.
1487
Fred Drakeec9fbe91999-02-15 16:20:25 +00001488The C function calling mechanism guarantees that the argument list
1489passed to C functions (\code{args} in the examples) is never
Fred Drake52e2d511999-04-05 21:26:37 +00001490\NULL{} --- in fact it guarantees that it is always a tuple.\footnote{
1491These guarantees don't hold when you use the ``old'' style
Guido van Rossum5049bcb1995-03-13 16:55:23 +00001492calling convention --- this is still found in much existing code.}
1493
Fred Drake0fd82681998-01-09 05:39:38 +00001494It is a severe error to ever let a \NULL{} pointer ``escape'' to
Fred Drake1739be52000-06-30 17:58:34 +00001495the Python user.
1496
1497% Frank Stajano:
1498% A pedagogically buggy example, along the lines of the previous listing,
1499% would be helpful here -- showing in more concrete terms what sort of
1500% actions could cause the problem. I can't very well imagine it from the
1501% description.
Guido van Rossumdb65a6c1993-11-05 17:11:16 +00001502
Guido van Rossum7a2dba21993-11-05 14:45:11 +00001503
Fred Drake5e8aa541998-11-16 18:34:07 +00001504\section{Writing Extensions in \Cpp{}
1505 \label{cplusplus}}
Guido van Rossumdb65a6c1993-11-05 17:11:16 +00001506
It is possible to write extension modules in \Cpp{}.  Some restrictions
apply.  If the main program (the Python interpreter) is compiled and
linked by the C compiler, global or static objects with constructors
cannot be used.  This is not a problem if the main program is linked
by the \Cpp{} compiler.  Functions that will be called by the
Python interpreter (in particular, module initialization functions)
have to be declared using \code{extern "C"}.
It is unnecessary to enclose the Python header files in
\code{extern "C" \{...\}}; they use this form already if the symbol
\samp{__cplusplus} is defined (all recent \Cpp{} compilers define this
symbol).
Guido van Rossum7a2dba21993-11-05 14:45:11 +00001518
Fred Drakee743fd01998-11-24 17:07:29 +00001519
Fred Drakeec9fbe91999-02-15 16:20:25 +00001520\section{Providing a C API for an Extension Module
1521 \label{using-cobjects}}
1522\sectionauthor{Konrad Hinsen}{hinsen@cnrs-orleans.fr}
Fred Drakee743fd01998-11-24 17:07:29 +00001523
Fred Drakeec9fbe91999-02-15 16:20:25 +00001524Many extension modules just provide new functions and types to be
1525used from Python, but sometimes the code in an extension module can
1526be useful for other extension modules. For example, an extension
1527module could implement a type ``collection'' which works like lists
1528without order. Just like the standard Python list type has a C API
1529which permits extension modules to create and manipulate lists, this
1530new collection type should have a set of C functions for direct
1531manipulation from other extension modules.
1532
1533At first sight this seems easy: just write the functions (without
1534declaring them \keyword{static}, of course), provide an appropriate
1535header file, and document the C API. And in fact this would work if
1536all extension modules were always linked statically with the Python
1537interpreter. When modules are used as shared libraries, however, the
1538symbols defined in one module may not be visible to another module.
1539The details of visibility depend on the operating system; some systems
1540use one global namespace for the Python interpreter and all extension
Fred Drake15e33d82001-07-06 06:49:32 +00001541modules (Windows, for example), whereas others require an explicit
1542list of imported symbols at module link time (AIX is one example), or
1543offer a choice of different strategies (most Unices). And even if
1544symbols are globally visible, the module whose functions one wishes to
1545call might not have been loaded yet!
Fred Drakeec9fbe91999-02-15 16:20:25 +00001546
Portability therefore requires making no assumptions about
symbol visibility.  This means that all symbols in extension modules
should be declared \keyword{static}, except for the module's
initialization function, in order to avoid name clashes with other
extension modules (as discussed in section~\ref{methodTable}).  And it
means that symbols that \emph{should} be accessible from other
extension modules must be exported in a different way.
1554
Fred Drake15e33d82001-07-06 06:49:32 +00001555Python provides a special mechanism to pass C-level information
1556(pointers) from one extension module to another one: CObjects.
Fred Drakeec9fbe91999-02-15 16:20:25 +00001557A CObject is a Python data type which stores a pointer (\ctype{void
1558*}). CObjects can only be created and accessed via their C API, but
1559they can be passed around like any other Python object. In particular,
1560they can be assigned to a name in an extension module's namespace.
1561Other extension modules can then import this module, retrieve the
1562value of this name, and then retrieve the pointer from the CObject.
1563
1564There are many ways in which CObjects can be used to export the C API
1565of an extension module. Each name could get its own CObject, or all C
1566API pointers could be stored in an array whose address is published in
1567a CObject. And the various tasks of storing and retrieving the pointers
1568can be distributed in different ways between the module providing the
1569code and the client modules.
1570
1571The following example demonstrates an approach that puts most of the
1572burden on the writer of the exporting module, which is appropriate
1573for commonly used library modules. It stores all C API pointers
1574(just one in the example!) in an array of \ctype{void} pointers which
1575becomes the value of a CObject. The header file corresponding to
1576the module provides a macro that takes care of importing the module
1577and retrieving its C API pointers; client modules only have to call
1578this macro before accessing the C API.
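
The pointer-table idea can be tried out in plain C before any Python
machinery is involved.  The following is only a sketch under stated
assumptions: all \code{demo_} and \code{Demo_} names are hypothetical,
and the table is handed over directly instead of being wrapped in a
CObject.  It shows the exporting side storing a function pointer in an
array of \ctype{void} pointers and the client side casting the slot
back to the right function type.

```c
#include <string.h>

/* Index and signature of the single exported function (illustrative) */
#define Demo_Length_NUM 0
typedef int (*demo_length_func)(const char *);

/* Exporting side: a static function and the table that publishes it */
static int
demo_length(const char *text)
{
    return (int)strlen(text);
}

static void *Demo_API[1];

static void **
demo_export_api(void)
{
    Demo_API[Demo_Length_NUM] = (void *)demo_length;
    return Demo_API;            /* this address would go into the CObject */
}

/* Client side: cast the slot back to the function type and call it */
static int
demo_call_length(void **api, const char *text)
{
    demo_length_func func = (demo_length_func)api[Demo_Length_NUM];

    return func(text);
}
```

Storing a function pointer in a \ctype{void *} is the same conversion
the \module{spam} example relies on; it works on the platforms Python
supports, although strictly conforming C does not guarantee it.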
1579
1580The exporting module is a modification of the \module{spam} module from
1581section~\ref{simpleExample}. The function \function{spam.system()}
1582does not call the C library function \cfunction{system()} directly,
1583but a function \cfunction{PySpam_System()}, which would of course do
1584something more complicated in reality (such as adding ``spam'' to
1585every command). This function \cfunction{PySpam_System()} is also
1586exported to other extension modules.
1587
1588The function \cfunction{PySpam_System()} is a plain C function,
1589declared \keyword{static} like everything else:
1590
1591\begin{verbatim}
1592static int
1593PySpam_System(command)
1594 char *command;
1595{
1596 return system(command);
1597}
1598\end{verbatim}
1599
1600The function \cfunction{spam_system()} is modified in a trivial way:
1601
1602\begin{verbatim}
1603static PyObject *
1604spam_system(self, args)
1605 PyObject *self;
1606 PyObject *args;
1607{
1608 char *command;
1609 int sts;
1610
1611 if (!PyArg_ParseTuple(args, "s", &command))
1612 return NULL;
1613 sts = PySpam_System(command);
1614 return Py_BuildValue("i", sts);
1615}
1616\end{verbatim}
1617
1618In the beginning of the module, right after the line
Fred Drake8e015171999-02-17 18:12:14 +00001619
Fred Drakeec9fbe91999-02-15 16:20:25 +00001620\begin{verbatim}
1621#include "Python.h"
1622\end{verbatim}
Fred Drake8e015171999-02-17 18:12:14 +00001623
Fred Drakeec9fbe91999-02-15 16:20:25 +00001624two more lines must be added:
Fred Drake8e015171999-02-17 18:12:14 +00001625
Fred Drakeec9fbe91999-02-15 16:20:25 +00001626\begin{verbatim}
1627#define SPAM_MODULE
1628#include "spammodule.h"
1629\end{verbatim}
1630
1631The \code{\#define} is used to tell the header file that it is being
1632included in the exporting module, not a client module. Finally,
1633the module's initialization function must take care of initializing
1634the C API pointer array:
Fred Drake8e015171999-02-17 18:12:14 +00001635
Fred Drakeec9fbe91999-02-15 16:20:25 +00001636\begin{verbatim}
1637void
1638initspam()
1639{
Fred Drake80d4c072001-03-02 19:48:06 +00001640 PyObject *m;
Fred Drakeec9fbe91999-02-15 16:20:25 +00001641 static void *PySpam_API[PySpam_API_pointers];
1642 PyObject *c_api_object;
Fred Drake80d4c072001-03-02 19:48:06 +00001643
Fred Drakeec9fbe91999-02-15 16:20:25 +00001644 m = Py_InitModule("spam", SpamMethods);
1645
1646 /* Initialize the C API pointer array */
1647 PySpam_API[PySpam_System_NUM] = (void *)PySpam_System;
1648
1649 /* Create a CObject containing the API pointer array's address */
1650 c_api_object = PyCObject_FromVoidPtr((void *)PySpam_API, NULL);
1651
Fred Drake80d4c072001-03-02 19:48:06 +00001652 if (c_api_object != NULL) {
1653 /* Create a name for this object in the module's namespace */
1654 PyObject *d = PyModule_GetDict(m);
1655
1656 PyDict_SetItemString(d, "_C_API", c_api_object);
1657 Py_DECREF(c_api_object);
1658 }
Fred Drakeec9fbe91999-02-15 16:20:25 +00001659}
1660\end{verbatim}
1661
1662Note that \code{PySpam_API} is declared \code{static}; otherwise
1663the pointer array would disappear when \code{initspam} terminates!
1664
1665The bulk of the work is in the header file \file{spammodule.h},
1666which looks like this:
1667
1668\begin{verbatim}
1669#ifndef Py_SPAMMODULE_H
1670#define Py_SPAMMODULE_H
1671#ifdef __cplusplus
1672extern "C" {
1673#endif
1674
1675/* Header file for spammodule */
1676
1677/* C API functions */
1678#define PySpam_System_NUM 0
1679#define PySpam_System_RETURN int
Greg Steinc2844af2000-07-09 16:27:33 +00001680#define PySpam_System_PROTO (char *command)
Fred Drakeec9fbe91999-02-15 16:20:25 +00001681
1682/* Total number of C API pointers */
1683#define PySpam_API_pointers 1
1684
1685
1686#ifdef SPAM_MODULE
1687/* This section is used when compiling spammodule.c */
1688
1689static PySpam_System_RETURN PySpam_System PySpam_System_PROTO;
1690
1691#else
1692/* This section is used in modules that use spammodule's API */
1693
1694static void **PySpam_API;
1695
1696#define PySpam_System \
1697 (*(PySpam_System_RETURN (*)PySpam_System_PROTO) PySpam_API[PySpam_System_NUM])
1698
#define import_spam() \
{ \
  PyObject *module = PyImport_ImportModule("spam"); \
  if (module != NULL) { \
    PyObject *module_dict = PyModule_GetDict(module); \
    PyObject *c_api_object = PyDict_GetItemString(module_dict, "_C_API"); \
    if (c_api_object != NULL && PyCObject_Check(c_api_object)) { \
      PySpam_API = (void **)PyCObject_AsVoidPtr(c_api_object); \
    } \
  } \
}
1710
1711#endif
1712
1713#ifdef __cplusplus
1714}
1715#endif
1716
#endif /* !defined(Py_SPAMMODULE_H) */
1718\end{verbatim}
1719
1720All that a client module must do in order to have access to the
1721function \cfunction{PySpam_System()} is to call the function (or
1722rather macro) \cfunction{import_spam()} in its initialization
1723function:
1724
1725\begin{verbatim}
void
initclient()
{
    PyObject *m;

    m = Py_InitModule("client", ClientMethods);
    if (m == NULL)
        return;
    import_spam();
}
1734\end{verbatim}
1735
1736The main disadvantage of this approach is that the file
1737\file{spammodule.h} is rather complicated. However, the
1738basic structure is the same for each function that is
1739exported, so it has to be learned only once.
1740
1741Finally it should be mentioned that CObjects offer additional
1742functionality, which is especially useful for memory allocation and
1743deallocation of the pointer stored in a CObject. The details
Fred Drake9fa76f11999-11-10 16:01:43 +00001744are described in the \citetitle[../api/api.html]{Python/C API
1745Reference Manual} in the section ``CObjects'' and in the
1746implementation of CObjects (files \file{Include/cobject.h} and
1747\file{Objects/cobject.c} in the Python source code distribution).
Fred Drakeec9fbe91999-02-15 16:20:25 +00001748
1749
Fred Drakef6a96172001-02-19 19:22:00 +00001750\chapter{Defining New Types
1751 \label{defining-new-types}}
1752\sectionauthor{Michael Hudson}{mwh21@cam.ac.uk}
Fred Drakece1650f2001-08-15 19:07:18 +00001753\sectionauthor{Dave Kuhlman}{dkuhlman@rexx.com}
Fred Drakef6a96172001-02-19 19:22:00 +00001754
1755As mentioned in the last chapter, Python allows the writer of an
1756extension module to define new types that can be manipulated from
1757Python code, much like strings and lists in core Python.
1758
1759This is not hard; the code for all extension types follows a pattern,
1760but there are some details that you need to understand before you can
1761get started.
1762
1763\section{The Basics
1764 \label{dnt-basics}}
1765
1766The Python runtime sees all Python objects as variables of type
1767\ctype{PyObject*}. A \ctype{PyObject} is not a very magnificent
1768object - it just contains the refcount and a pointer to the object's
1769``type object''. This is where the action is; the type object
1770determines which (C) functions get called when, for instance, an
1771attribute gets looked up on an object or it is multiplied by another
1772object. I call these C functions ``type methods'' to distinguish them
1773from things like \code{[].append} (which I will call ``object
1774methods'' when I get around to them).
1775
1776So, if you want to define a new object type, you need to create a new
1777type object.
1778
1779This sort of thing can only be explained by example, so here's a
1780minimal, but complete, module that defines a new type:
1781
1782\begin{verbatim}
1783#include <Python.h>
1784
1785staticforward PyTypeObject noddy_NoddyType;
1786
1787typedef struct {
1788 PyObject_HEAD
1789} noddy_NoddyObject;
1790
1791static PyObject*
1792noddy_new_noddy(PyObject* self, PyObject* args)
1793{
1794 noddy_NoddyObject* noddy;
1795
1796 if (!PyArg_ParseTuple(args,":new_noddy"))
1797 return NULL;
1798
1799 noddy = PyObject_New(noddy_NoddyObject, &noddy_NoddyType);
1800
1801 return (PyObject*)noddy;
1802}
1803
1804static void
1805noddy_noddy_dealloc(PyObject* self)
1806{
1807 PyObject_Del(self);
1808}
1809
1810static PyTypeObject noddy_NoddyType = {
1811 PyObject_HEAD_INIT(NULL)
1812 0,
1813 "Noddy",
1814 sizeof(noddy_NoddyObject),
1815 0,
1816 noddy_noddy_dealloc, /*tp_dealloc*/
1817 0, /*tp_print*/
1818 0, /*tp_getattr*/
1819 0, /*tp_setattr*/
1820 0, /*tp_compare*/
1821 0, /*tp_repr*/
1822 0, /*tp_as_number*/
1823 0, /*tp_as_sequence*/
1824 0, /*tp_as_mapping*/
1825 0, /*tp_hash */
1826};
1827
1828static PyMethodDef noddy_methods[] = {
1829 { "new_noddy", noddy_new_noddy, METH_VARARGS },
1830 {NULL, NULL}
1831};
1832
1833DL_EXPORT(void)
1834initnoddy(void)
1835{
1836 noddy_NoddyType.ob_type = &PyType_Type;
1837
1838 Py_InitModule("noddy", noddy_methods);
1839}
1840\end{verbatim}
1841
1842Now that's quite a bit to take in at once, but hopefully bits will
1843seem familiar from the last chapter.
1844
1845The first bit that will be new is:
1846
1847\begin{verbatim}
1848staticforward PyTypeObject noddy_NoddyType;
1849\end{verbatim}
1850
This names the type object that will be defined further down in the
file.  It can't be defined here because its definition has to refer to
functions that have not yet been defined, but we need to be able to
refer to it, hence the declaration.
1855
1856The \code{staticforward} is required to placate various brain dead
1857compilers.
1858
1859\begin{verbatim}
1860typedef struct {
1861 PyObject_HEAD
1862} noddy_NoddyObject;
1863\end{verbatim}
1864
This is what a Noddy object will contain.  In this case it contains
nothing more than every Python object contains: a refcount and a
pointer to a type object.  These are the fields the
\code{PyObject_HEAD} macro brings in.  The reason for the macro is to
standardize the layout and to allow special debugging fields to be
added in debug builds.
1870
1871For contrast
1872
1873\begin{verbatim}
1874typedef struct {
1875 PyObject_HEAD
1876 long ob_ival;
1877} PyIntObject;
1878\end{verbatim}
1879
1880is the corresponding definition for standard Python integers.
1881
1882Next up is:
1883
1884\begin{verbatim}
1885static PyObject*
1886noddy_new_noddy(PyObject* self, PyObject* args)
1887{
1888 noddy_NoddyObject* noddy;
1889
1890 if (!PyArg_ParseTuple(args,":new_noddy"))
1891 return NULL;
1892
1893 noddy = PyObject_New(noddy_NoddyObject, &noddy_NoddyType);
1894
1895 return (PyObject*)noddy;
1896}
1897\end{verbatim}
1898
1899This is in fact just a regular module function, as described in the
1900last chapter. The reason it gets special mention is that this is
1901where we create our Noddy object. Defining PyTypeObject structures is
Fred Drakef531ad62001-03-19 04:19:56 +00001902all very well, but if there's no way to actually \emph{create} one
Fred Drakef6a96172001-02-19 19:22:00 +00001903of the wretched things it is not going to do anyone much good.
1904
1905Almost always, you create objects with a call of the form:
1906
1907\begin{verbatim}
1908PyObject_New(<type>, &<type object>);
1909\end{verbatim}
1910
Fred Drake15e33d82001-07-06 06:49:32 +00001911This allocates the memory and then initializes the object (sets
Fred Drakef6a96172001-02-19 19:22:00 +00001912the reference count to one, makes the \cdata{ob_type} pointer point at
1913the right place and maybe some other stuff, depending on build options).
1914You \emph{can} do these steps separately if you have some reason to
1915--- but at this level we don't bother.
1916
1917We cast the return value to a \ctype{PyObject*} because that's what
1918the Python runtime expects. This is safe because of guarantees about
1919the layout of structures in the C standard, and is a fairly common C
1920programming trick. One could declare \cfunction{noddy_new_noddy} to
1921return a \ctype{noddy_NoddyObject*} and then put a cast in the
1922definition of \cdata{noddy_methods} further down the file --- it
1923doesn't make much difference.
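
The casting trick can be seen in isolation in the following toy
sketch.  It is not the real object layout: \ctype{demo_object} and
\ctype{demo_int_object} are hypothetical stand-ins, and the header is
nested as a structure member rather than pasted in by a macro.  For
this nested form the C standard states that a pointer to a structure,
suitably converted, points to its first member, which is what makes
both casts below well defined.

```c
#include <stddef.h>

/* Toy stand-ins for PyObject and noddy_NoddyObject (not the real types) */
typedef struct {
    int refcount;
    const char *type_name;
} demo_object;

typedef struct {
    demo_object head;           /* must be the first member */
    long value;
} demo_int_object;

/* Generic code sees only the header... */
static const char *
demo_type_name(demo_object *op)
{
    return op->type_name;
}

/* ...while type-specific code casts back to the full structure. */
static long
demo_int_value(demo_object *op)
{
    return ((demo_int_object *)op)->value;
}
```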
1924
1925Now a Noddy object doesn't do very much and so doesn't need to
1926implement many type methods. One you can't avoid is handling
1927deallocation, so we find
1928
1929\begin{verbatim}
1930static void
1931noddy_noddy_dealloc(PyObject* self)
1932{
1933 PyObject_Del(self);
1934}
1935\end{verbatim}
1936
1937This is so short as to be self explanatory. This function will be
1938called when the reference count on a Noddy object reaches \code{0} (or
1939it is found as part of an unreachable cycle by the cyclic garbage
1940collector). \cfunction{PyObject_Del()} is what you call when you want
1941an object to go away. If a Noddy object held references to other
1942Python objects, one would decref them here.
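
To make the last remark concrete, here is a toy model of an object
that owns a reference to another object.  It is only a sketch: the
\ctype{demo_node} type and its functions are hypothetical and do not
use the Python API.  The point is the order of operations on
deallocation: first release what the object owns, then release the
object itself.

```c
#include <stdlib.h>

/* Toy reference-counted object; a stand-in for PyObject, not the real thing */
typedef struct demo_node {
    int refcount;
    struct demo_node *held;     /* owned reference, may be NULL */
} demo_node;

static int demo_live_nodes = 0; /* lets callers observe deallocation */

static demo_node *
demo_node_new(demo_node *held)
{
    demo_node *node = malloc(sizeof(demo_node));

    if (node == NULL)
        return NULL;
    node->refcount = 1;
    node->held = held;          /* takes over the caller's reference */
    demo_live_nodes++;
    return node;
}

static void
demo_node_decref(demo_node *node)
{
    if (node == NULL || --node->refcount > 0)
        return;
    /* Deallocation: release what the object owns, then the object itself */
    demo_node_decref(node->held);
    free(node);
    demo_live_nodes--;
}
```

In a real extension type the corresponding deallocation function would
call \cfunction{Py_XDECREF()} on each object pointer it holds and then
\cfunction{PyObject_Del()} on the object itself.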
1943
1944Moving on, we come to the crunch --- the type object.
1945
1946\begin{verbatim}
1947static PyTypeObject noddy_NoddyType = {
1948 PyObject_HEAD_INIT(NULL)
1949 0,
1950 "Noddy",
1951 sizeof(noddy_NoddyObject),
1952 0,
1953 noddy_noddy_dealloc, /*tp_dealloc*/
1954 0, /*tp_print*/
1955 0, /*tp_getattr*/
1956 0, /*tp_setattr*/
1957 0, /*tp_compare*/
1958 0, /*tp_repr*/
1959 0, /*tp_as_number*/
1960 0, /*tp_as_sequence*/
1961 0, /*tp_as_mapping*/
1962 0, /*tp_hash */
1963};
1964\end{verbatim}
1965
Now if you go and look up the definition of \ctype{PyTypeObject} in
\file{object.h} you'll see that it has many, many more fields than the
definition above.  The remaining fields will be filled with zeros by
the C compiler, and it's common practice to not specify them
explicitly unless you need them.
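
The zero-filling rule is easy to check with a small structure.  The
following sketch uses a hypothetical \ctype{demo_type} record, not
\ctype{PyTypeObject}; the initializer supplies only the first three
fields, and the language guarantees that the remaining ones start out
as null pointers or zero.

```c
#include <stddef.h>

/* Toy type record with more fields than the initializer below supplies */
typedef struct {
    const char *name;
    size_t basicsize;
    void (*dealloc)(void *);
    int (*print)(void *);
    long flags;
} demo_type;

static void
demo_dealloc(void *op)
{
    (void)op;                   /* nothing to release in this sketch */
}

/* Only the first three fields are given; the rest are zero-filled */
static demo_type demo_Type = {
    "Demo",
    sizeof(long),
    demo_dealloc,
};
```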
1971
1972This is so important that I'm going to pick the top of it apart still
1973further:
1974
1975\begin{verbatim}
1976 PyObject_HEAD_INIT(NULL)
1977\end{verbatim}
1978
1979This line is a bit of a wart; what we'd like to write is:
1980
1981\begin{verbatim}
1982 PyObject_HEAD_INIT(&PyType_Type)
1983\end{verbatim}
1984
1985as the type of a type object is ``type'', but this isn't strictly
1986conforming C and some compilers complain. So instead we fill in the
1987\cdata{ob_type} field of \cdata{noddy_NoddyType} at the earliest
opportunity, in \cfunction{initnoddy()}.
1989
1990\begin{verbatim}
1991 0,
1992\end{verbatim}
1993
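This is the \cdata{ob_size} field.  It is present because
\ctype{PyTypeObject} begins with \code{PyObject_VAR_HEAD} rather than
\code{PyObject_HEAD}; for a statically defined type object such as
this one it is simply set to zero.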
1995
1996\begin{verbatim}
1997 "Noddy",
1998\end{verbatim}
1999
2000The name of our type. This will appear in the default textual
2001representation of our objects and in some error messages, for example:
2002
2003\begin{verbatim}
2004>>> "" + noddy.new_noddy()
2005Traceback (most recent call last):
2006 File "<stdin>", line 1, in ?
2007TypeError: cannot add type "Noddy" to string
2008\end{verbatim}
2009
2010\begin{verbatim}
2011 sizeof(noddy_NoddyObject),
2012\end{verbatim}
2013
2014This is so that Python knows how much memory to allocate when you call
2015\cfunction{PyObject_New}.
2016
2017\begin{verbatim}
2018 0,
2019\end{verbatim}
2020
2021This has to do with variable length objects like lists and strings.
2022Ignore for now...
2023
2024Now we get into the type methods, the things that make your objects
2025different from the others. Of course, the Noddy object doesn't
2026implement many of these, but as mentioned above you have to implement
2027the deallocation function.
2028
2029\begin{verbatim}
2030 noddy_noddy_dealloc, /*tp_dealloc*/
2031\end{verbatim}
2032
2033From here, all the type methods are nil so I won't go over them yet -
2034that's for the next section!
2035
2036Everything else in the file should be familiar, except for this line
2037in \cfunction{initnoddy}:
2038
2039\begin{verbatim}
2040 noddy_NoddyType.ob_type = &PyType_Type;
2041\end{verbatim}
2042
2043This was alluded to above --- the \cdata{noddy_NoddyType} object should
2044have type ``type'', but \code{\&PyType_Type} is not constant and so
2045can't be used in its initializer. To work around this, we patch it up
2046in the module initialization.
2047
2048That's it! All that remains is to build it; put the above code in a
2049file called \file{noddymodule.c} and
2050
2051\begin{verbatim}
2052from distutils.core import setup, Extension
2053setup(name = "noddy", version = "1.0",
2054 ext_modules = [Extension("noddy", ["noddymodule.c"])])
2055\end{verbatim}
2056
2057in a file called \file{setup.py}; then typing
2058
2059\begin{verbatim}
$ python setup.py build
2061\end{verbatim}
2062
2063at a shell should produce a file \file{noddy.so} in a subdirectory;
2064move to that directory and fire up Python --- you should be able to
2065\code{import noddy} and play around with Noddy objects.
2066
2067That wasn't so hard, was it?
2068
Fred Drakece1650f2001-08-15 19:07:18 +00002069
Fred Drakef6a96172001-02-19 19:22:00 +00002070\section{Type Methods
2071 \label{dnt-type-methods}}
2072
2073This section aims to give a quick fly-by on the various type methods
2074you can implement and what they do.
2075
2076Here is the definition of \ctype{PyTypeObject}, with some fields only
2077used in debug builds omitted:
2078
2079\begin{verbatim}
2080typedef struct _typeobject {
2081 PyObject_VAR_HEAD
2082 char *tp_name; /* For printing */
2083 int tp_basicsize, tp_itemsize; /* For allocation */
2084
2085 /* Methods to implement standard operations */
2086
2087 destructor tp_dealloc;
2088 printfunc tp_print;
2089 getattrfunc tp_getattr;
2090 setattrfunc tp_setattr;
2091 cmpfunc tp_compare;
2092 reprfunc tp_repr;
2093
2094 /* Method suites for standard classes */
2095
2096 PyNumberMethods *tp_as_number;
2097 PySequenceMethods *tp_as_sequence;
2098 PyMappingMethods *tp_as_mapping;
2099
2100 /* More standard operations (here for binary compatibility) */
2101
2102 hashfunc tp_hash;
2103 ternaryfunc tp_call;
2104 reprfunc tp_str;
2105 getattrofunc tp_getattro;
2106 setattrofunc tp_setattro;
2107
2108 /* Functions to access object as input/output buffer */
2109 PyBufferProcs *tp_as_buffer;
2110
2111 /* Flags to define presence of optional/expanded features */
2112 long tp_flags;
2113
2114 char *tp_doc; /* Documentation string */
2115
Fred Drakece1650f2001-08-15 19:07:18 +00002116 /* Assigned meaning in release 2.0 */
Fred Drakef6a96172001-02-19 19:22:00 +00002117 /* call function for all accessible objects */
2118 traverseproc tp_traverse;
2119
2120 /* delete references to contained objects */
2121 inquiry tp_clear;
2122
Fred Drakece1650f2001-08-15 19:07:18 +00002123 /* Assigned meaning in release 2.1 */
Fred Drakef6a96172001-02-19 19:22:00 +00002124 /* rich comparisons */
2125 richcmpfunc tp_richcompare;
2126
2127 /* weak reference enabler */
2128 long tp_weaklistoffset;
2129
Fred Drakece1650f2001-08-15 19:07:18 +00002130 /* Added in release 2.2 */
2131 /* Iterators */
2132 getiterfunc tp_iter;
2133 iternextfunc tp_iternext;
2134
2135 /* Attribute descriptor and subclassing stuff */
2136 struct PyMethodDef *tp_methods;
2137 struct memberlist *tp_members;
2138 struct getsetlist *tp_getset;
2139 struct _typeobject *tp_base;
2140 PyObject *tp_dict;
2141 descrgetfunc tp_descr_get;
2142 descrsetfunc tp_descr_set;
2143 long tp_dictoffset;
2144 initproc tp_init;
2145 allocfunc tp_alloc;
2146 newfunc tp_new;
2147 destructor tp_free; /* Low-level free-memory routine */
2148 PyObject *tp_bases;
2149 PyObject *tp_mro; /* method resolution order */
2150 PyObject *tp_defined;
2151
Fred Drakef6a96172001-02-19 19:22:00 +00002152} PyTypeObject;
2153\end{verbatim}
2154
2155Now that's a \emph{lot} of methods. Don't worry too much though - if
2156you have a type you want to define, the chances are very good that you
2157will only implement a handful of these.
2158
As you probably expect by now, we're going to go over this and give
more information about the various handlers.  We won't go in the order
they are defined in the structure, because there is a lot of
historical baggage that impacts the ordering of the fields; be sure
your type initialization keeps the fields in the right order!  It's
often easiest to find an example that includes all the fields you need
(even if they're initialized to \code{0}) and then change the values
to suit your new type.
Fred Drakef6a96172001-02-19 19:22:00 +00002167
2168\begin{verbatim}
2169 char *tp_name; /* For printing */
2170\end{verbatim}
2171
2172The name of the type - as mentioned in the last section, this will
2173appear in various places, almost entirely for diagnostic purposes.
2174Try to choose something that will be helpful in such a situation!
2175
2176\begin{verbatim}
2177 int tp_basicsize, tp_itemsize; /* For allocation */
2178\end{verbatim}
2179
These fields tell the runtime how much memory to allocate when new
objects of this type are created.  Python has some builtin support
for variable length structures (think: strings, lists) which is where
the \cdata{tp_itemsize} field comes in.  This will be dealt with
later.
2185
Fred Drakece1650f2001-08-15 19:07:18 +00002186\begin{verbatim}
2187 char *tp_doc;
2188\end{verbatim}
2189
2190Here you can put a string (or its address) that you want returned when
2191the Python script references \code{obj.__doc__} to retrieve the
2192docstring.
2193
2194Now we come to the basic type methods---the ones most extension types
Fred Drakef6a96172001-02-19 19:22:00 +00002195will implement.
2196
Fred Drakece1650f2001-08-15 19:07:18 +00002197
2198\subsection{Finalization and De-allocation}
2199
Fred Drakef6a96172001-02-19 19:22:00 +00002200\begin{verbatim}
Fred Drake0539bfa2001-03-02 18:15:11 +00002201 destructor tp_dealloc;
Fred Drakece1650f2001-08-15 19:07:18 +00002202\end{verbatim}
2203
2204This function is called when the reference count of the instance of
2205your type is reduced to zero and the Python interpreter wants to
2206reclaim it. If your type has memory to free or other clean-up to
2207perform, put it here. The object itself needs to be freed here as
2208well. Here is an example of this function:
2209
2210\begin{verbatim}
2211static void
2212newdatatype_dealloc(newdatatypeobject * obj)
2213{
2214 free(obj->obj_UnderlyingDatatypePtr);
2215 PyObject_DEL(obj);
2216}
Fred Drakef6a96172001-02-19 19:22:00 +00002217\end{verbatim}
2218
2219
Fred Drakece1650f2001-08-15 19:07:18 +00002220\subsection{Object Representation}
2221
2222In Python, there are three ways to generate a textual representation
2223of an object: the \function{repr()}\bifuncindex{repr} function (or
2224equivalent backtick syntax), the \function{str()}\bifuncindex{str}
2225function, and the \keyword{print} statement. For most objects, the
2226\keyword{print} statement is equivalent to the \function{str()}
2227function, but it is possible to special-case printing to a
2228\ctype{FILE*} if necessary; this should only be done if efficiency is
2229identified as a problem and profiling suggests that creating a
2230temporary string object to be written to a file is too expensive.
2231
These handlers are all optional, and most types need to implement at
most the \member{tp_str} and \member{tp_repr} handlers.
2234
2235\begin{verbatim}
2236 reprfunc tp_repr;
2237 reprfunc tp_str;
2238 printfunc tp_print;
2239\end{verbatim}
2240
2241The \member{tp_repr} handler should return a string object containing
2242a representation of the instance for which it is called. Here is a
2243simple example:
2244
2245\begin{verbatim}
2246static PyObject *
2247newdatatype_repr(newdatatypeobject * obj)
2248{
2249 char buf[4096];
2250 sprintf(buf, "Repr-ified_newdatatype{{size:%d}}",
2251 obj->obj_UnderlyingDatatypePtr->size);
2252 return PyString_FromString(buf);
2253}
2254\end{verbatim}
2255
2256If no \member{tp_repr} handler is specified, the interpreter will
2257supply a representation that uses the type's \member{tp_name} and a
2258uniquely-identifying value for the object.
2259
The \member{tp_str} handler is to \function{str()} what the
\member{tp_repr} handler described above is to \function{repr()}; that
is, it is called when Python code calls \function{str()} on an
instance of your object.  Its implementation is very similar to the
\member{tp_repr} function, but the resulting string is intended for
human consumption.  If \member{tp_str} is not specified, the
\member{tp_repr} handler is used instead.
2267
2268Here is a simple example:
2269
2270\begin{verbatim}
2271static PyObject *
2272newdatatype_str(newdatatypeobject * obj)
2273{
2274 PyObject *pyString;
2275 char buf[4096];
2276 sprintf(buf, "Stringified_newdatatype{{size:%d}}",
2277 obj->obj_UnderlyingDatatypePtr->size
2278 );
2279 pyString = PyString_FromString(buf);
2280 return pyString;
2281}
2282\end{verbatim}
2283
The print function will be called whenever Python needs to ``print'' an
instance of the type. For example, if \var{node} is an instance of type
\class{TreeNode}, then the print function is called when Python code calls:
2287
2288\begin{verbatim}
2289print node
2290\end{verbatim}
2291
The print function also takes a \var{flags} argument. Only one flag is
defined, \constant{Py_PRINT_RAW}; when it is set, you should print
without string quotes and possibly without interpreting escape
sequences.

The print function receives the output stream as a \ctype{FILE*}
argument. You will likely want to write to that stream.

Here is a sample print function:
2300
2301\begin{verbatim}
2302static int
2303newdatatype_print(newdatatypeobject *obj, FILE *fp, int flags)
2304{
2305 if (flags & Py_PRINT_RAW) {
2306 fprintf(fp, "<{newdatatype object--size: %d}>",
2307 obj->obj_UnderlyingDatatypePtr->size);
2308 }
2309 else {
2310 fprintf(fp, "\"<{newdatatype object--size: %d}>\"",
2311 obj->obj_UnderlyingDatatypePtr->size);
2312 }
2313 return 0;
2314}
2315\end{verbatim}
2316
2317
2318\subsection{Attribute Management Functions}
2319
2320\begin{verbatim}
2321 getattrfunc tp_getattr;
2322 setattrfunc tp_setattr;
2323\end{verbatim}
2324
The \member{tp_getattr} handler is called when the object requires an
attribute look-up. It is called in the same situations where the
\method{__getattr__()} method of a class would be called.

A common way to handle this is to (1) implement a set of functions
(such as \cfunction{newdatatype_getSize()} and
\cfunction{newdatatype_setSize()} in the example below), (2) provide a
method table listing these functions, and (3) provide a getattr
function that returns the result of a lookup in that table.
2334
2335Here is an example:
2336
2337\begin{verbatim}
2338static PyMethodDef newdatatype_methods[] = {
2339 {"getSize", (PyCFunction)newdatatype_getSize, METH_VARARGS},
2340 {"setSize", (PyCFunction)newdatatype_setSize, METH_VARARGS},
2341 {NULL, NULL} /* sentinel */
2342};
2343
2344static PyObject *
2345newdatatype_getattr(newdatatypeobject *obj, char *name)
2346{
2347 return Py_FindMethod(newdatatype_methods, (PyObject *)obj, name);
2348}
2349\end{verbatim}
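
To make the table lookup concrete, here is a minimal plain C sketch of
searching a sentinel-terminated table by name. It illustrates the idea
only and is not the implementation of \cfunction{Py_FindMethod()}; the
names \code{demo_method}, \code{demo_find()} and the two stub functions
are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a name paired with a C function. */
typedef int (*demo_func)(int);
typedef struct {
    const char *name;
    demo_func func;
} demo_method;

int demo_get_size(int size) { return size; }
int demo_double_size(int size) { return 2 * size; }

demo_method demo_methods[] = {
    {"getSize", demo_get_size},
    {"doubleSize", demo_double_size},
    {NULL, NULL}                /* sentinel */
};

/* Walk the table until the sentinel; return the matching function,
 * or NULL if no entry has the requested name. */
demo_func
demo_find(const demo_method *table, const char *name)
{
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, name) == 0)
            return table->func;
    }
    return NULL;
}
```

A real \member{tp_getattr} handler would additionally set an exception
such as \exception{AttributeError} before returning \NULL{} for an
unknown name.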
2350
2351The \member{tp_setattr} handler is called when the
2352\method{__setattr__()} or \method{__delattr__()} method of a class
2353instance would be called. When an attribute should be deleted, the
2354third parameter will be \NULL. Here is an example that simply raises
2355an exception; if this were really all you wanted, the
2356\member{tp_setattr} handler should be set to \NULL.
2357
2358\begin{verbatim}
static int
newdatatype_setattr(newdatatypeobject *obj, char *name, PyObject *v)
{
    char buf[1024];
    /* Limit the copied name so a long attribute name cannot overflow buf. */
    sprintf(buf, "Set attribute not supported for attribute %.400s", name);
    PyErr_SetString(PyExc_RuntimeError, buf);
    return -1;
}
2367\end{verbatim}
2368
2369
2370\subsection{Object Comparison}
2371
2372\begin{verbatim}
2373 cmpfunc tp_compare;
2374\end{verbatim}
2375
The \member{tp_compare} handler is called when comparisons are needed
and the object does not implement the specific rich comparison method
which matches the requested comparison. (It is always used if defined
and the \cfunction{PyObject_Compare()} or \cfunction{PyObject_Cmp()}
functions are used, or if \function{cmp()} is used from Python.)
It is analogous to the \method{__cmp__()} method. This function
should return a negative integer if \var{obj1} is less than
\var{obj2}, \code{0} if they are equal, and a positive integer if
\var{obj1} is greater than \var{obj2}.
2386
2387Here is a sample implementation:
2388
2389\begin{verbatim}
static int
newdatatype_compare(newdatatypeobject *obj1, newdatatypeobject *obj2)
{
    int result;    /* matches the declared return type */

    if (obj1->obj_UnderlyingDatatypePtr->size <
        obj2->obj_UnderlyingDatatypePtr->size) {
        result = -1;
    }
    else if (obj1->obj_UnderlyingDatatypePtr->size >
             obj2->obj_UnderlyingDatatypePtr->size) {
        result = 1;
    }
    else {
        result = 0;
    }
    return result;
}
2408\end{verbatim}
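
The three branches above can be written more compactly. As a sketch,
the helper below (hypothetical name \code{demo_three_way()}) uses a
common C idiom that yields \code{-1}, \code{0} or \code{1} and avoids
the overflow that a plain subtraction of the two sizes could cause.

```c
/* Return -1, 0 or 1 depending on how a compares to b.  Each relational
 * expression evaluates to 0 or 1, so no subtraction of the operands
 * themselves (which could overflow) is needed. */
int
demo_three_way(int a, int b)
{
    return (a > b) - (a < b);
}
```

A compare handler could return the result of such a helper applied to
the two \code{size} fields.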
2409
2410
2411\subsection{Abstract Protocol Support}
2412
2413\begin{verbatim}
    PyNumberMethods   *tp_as_number;
    PySequenceMethods *tp_as_sequence;
    PyMappingMethods  *tp_as_mapping;
2417\end{verbatim}
2418
If you wish your object to be able to act like a number, a sequence,
or a mapping object, then you place in the corresponding slot the
address of a structure of the C type \ctype{PyNumberMethods},
\ctype{PySequenceMethods}, or \ctype{PyMappingMethods}, respectively.
It is up to you to fill in this structure with appropriate values. You
can find examples of the use of each of these in the \file{Objects}
directory of the Python source distribution.
2426
2427
2428\begin{verbatim}
2429 hashfunc tp_hash;
2430\end{verbatim}
2431
2432This function, if you choose to provide it, should return a hash
2433number for an instance of your datatype. Here is a moderately
2434pointless example:
2435
2436\begin{verbatim}
2437static long
2438newdatatype_hash(newdatatypeobject *obj)
2439{
2440 long result;
2441 result = obj->obj_UnderlyingDatatypePtr->size;
2442 result = result * 3;
2443 return result;
2444}
2445\end{verbatim}
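
One detail the example does not show: a \member{tp_hash} handler
signals an error by returning \code{-1}, so a successful hash must
never be that value. The sketch below is plain C with hypothetical
names (\code{demo_fix_hash()}, \code{demo_hash()}); it keeps the same
arithmetic as the example above and is not meant as a recommended hash
function.

```c
/* -1 is reserved for signalling an error, so map it to another value. */
long
demo_fix_hash(long raw)
{
    return (raw == -1) ? -2 : raw;
}

/* Same arithmetic as the example above, done in unsigned long so that
 * large sizes wrap around instead of overflowing. */
long
demo_hash(int size)
{
    unsigned long mixed = (unsigned long)size * 3UL;
    return demo_fix_hash((long)mixed);
}
```

Objects that compare equal must also hash equal, which holds here
because both the comparison and the hash depend only on \code{size}.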
2446
2447\begin{verbatim}
2448 ternaryfunc tp_call;
2449\end{verbatim}
2450
This function is called when an instance of your datatype is ``called''.
For example, if \code{obj1} is an instance of your datatype and the Python
script contains \code{obj1('hello')}, the \member{tp_call} handler is
invoked.
2455
2456This function takes three arguments:
2457
2458\begin{enumerate}
2459 \item
2460 \var{arg1} is the instance of the datatype which is the subject of
2461 the call. If the call is \code{obj1('hello')}, then \var{arg1} is
2462 \code{obj1}.
2463
2464 \item
2465 \var{arg2} is a tuple containing the arguments to the call. You
2466 can use \cfunction{PyArg_ParseTuple()} to extract the arguments.
2467
2468 \item
2469 \var{arg3} is a dictionary of keyword arguments that were passed.
2470 If this is non-\NULL{} and you support keyword arguments, use
2471 \cfunction{PyArg_ParseTupleAndKeywords()} to extract the
2472 arguments. If you do not want to support keyword arguments and
2473 this is non-\NULL, raise a \exception{TypeError} with a message
2474 saying that keyword arguments are not supported.
2475\end{enumerate}
2476
Here is a simple example of an implementation of the call function:
2478
2479\begin{verbatim}
/* Implement the call function.
 *    obj is the instance receiving the call.
 *    args is a tuple containing the arguments to the call, in this
 *         case 3 strings.
 *    other is the dictionary of keyword arguments (ignored here).
 */
static PyObject *
newdatatype_call(newdatatypeobject *obj, PyObject *args, PyObject *other)
{
    char *arg1;
    char *arg2;
    char *arg3;
    char buf[4096];
    if (!PyArg_ParseTuple(args, "sss:call", &arg1, &arg2, &arg3)) {
        return NULL;
    }
    /* Limit each string so the formatted text cannot overflow buf. */
    sprintf(buf,
            "Returning -- value: [%d] arg1: [%.1000s] arg2: [%.1000s] arg3: [%.1000s]\n",
            obj->obj_UnderlyingDatatypePtr->size,
            arg1, arg2, arg3);
    /* Never pass data as the format string itself. */
    printf("%s", buf);
    return PyString_FromString(buf);
}
2503\end{verbatim}
2504
2505
2506\subsection{More Suggestions}
2507
2508Remember that you can omit most of these functions, in which case you
2509provide \code{0} as a value.
2510
2511In the \file{Objects} directory of the Python source distribution,
2512there is a file \file{xxobject.c}, which is intended to be used as a
2513template for the implementation of new types. One useful strategy
2514for implementing a new type is to copy and rename this file, then
2515read the instructions at the top of it.
2516
2517There are type definitions for each of the functions you must
2518provide. They are in \file{object.h} in the Python include
2519directory that comes with the source distribution of Python.
2520
In order to learn how to implement any specific method for your new
datatype, do the following: Download and unpack the Python source
distribution. Go to the \file{Objects} directory, then search the
C source files for \code{tp_} plus the function you want (for
example, \code{tp_print} or \code{tp_compare}). You will find
examples of the function you want to implement.

When you need to verify that an object is an instance of the type you
are implementing, and you used \file{xxobject.c} as a starting
template for your implementation, there is a macro defined for
this purpose. The macro definition will look something like this:
2532
2533\begin{verbatim}
2534#define is_newdatatypeobject(v) ((v)->ob_type == &Newdatatypetype)
2535\end{verbatim}
2536
A sample of its use might look like this:
2538
2539\begin{verbatim}
    if (!is_newdatatypeobject(objp1)) {
        PyErr_SetString(PyExc_TypeError, "arg #1 not a newdatatype");
        return NULL;
    }
2544\end{verbatim}
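
The pointer comparison behind such a macro can be tried out without the
interpreter. The following plain C sketch uses hypothetical stand-in
types (\code{demo_type}, \code{demo_object}) in which, as in Python
objects, every object stores a pointer to its type.

```c
/* Hypothetical stand-ins: every object points at a statically
 * allocated type descriptor. */
typedef struct { const char *name; } demo_type;
typedef struct { const demo_type *ob_type; } demo_object;

demo_type Demo_NewType = { "newdatatype" };
demo_type Demo_OtherType = { "other" };

/* An object is of the new type exactly when its type pointer is the
 * address of that type's descriptor. */
#define is_demo_newobject(v) ((v)->ob_type == &Demo_NewType)
```

Because the test compares addresses rather than names, it is cheap and
cannot be fooled by another type that happens to use the same name.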
2545
2546%For a reasonably extensive example, from which most of the snippits
2547%above were taken, see \file{newdatatype.c} and \file{newdatatype.h}.


\chapter{Building C and \Cpp{} Extensions on \UNIX{}
         \label{building-on-unix}}

\sectionauthor{Jim Fulton}{jim@Digicool.com}

2555
2556%The make file make file, building C extensions on Unix
2557
2558
Starting in Python 1.4, Python provides a special make file,
\file{Makefile.pre.in}, that generates the make files used for
building dynamically-linked extensions and custom interpreters. The
generated make file reflects various system variables determined by
\program{configure} when the Python interpreter was built, so people
building modules don't have to resupply these settings. This vastly
simplifies the process of building extensions and custom interpreters
on \UNIX{} systems.

This special make file is distributed as the file
\file{Misc/Makefile.pre.in} in the Python source distribution. The
first step in building extensions or custom interpreters is to copy
this make file to a development directory containing extension module
source.

\file{Makefile.pre.in} uses metadata provided in a file named
\file{Setup}. The format of the \file{Setup} file is the same as the
\file{Setup} (or \file{Setup.dist}) file provided in the
\file{Modules/} directory of the Python source distribution. The
\file{Setup} file contains variable definitions:

2579\begin{verbatim}
2580EC=/projects/ExtensionClass
2581\end{verbatim}
2582
and module description lines. It can also contain blank lines and
comment lines that start with \character{\#}.

A module description line includes a module name, source files,
options, variable references, and other input files, such
as libraries or object files. Consider a simple example:

2590\begin{verbatim}
2591ExtensionClass ExtensionClass.c
2592\end{verbatim}
2593
This is the simplest form of a module definition line. It defines a
module, \module{ExtensionClass}, which has a single source file,
\file{ExtensionClass.c}.

This slightly more complex example uses an \strong{-I} option to
specify an include directory:

\begin{verbatim}
EC=/projects/ExtensionClass
cPersistence cPersistence.c -I$(EC)
\end{verbatim} % $ <-- bow to font lock

2606This example also illustrates the format for variable references.
2607
2608For systems that support dynamic linking, the \file{Setup} file should
2609begin:
2610
2611\begin{verbatim}
2612*shared*
2613\end{verbatim}
2614
to indicate that the modules defined in \file{Setup} are to be built
as dynamically linked modules. A line containing only \samp{*static*}
can be used to indicate the subsequently listed modules should be
statically linked.

Here is a complete \file{Setup} file for building a
\module{cPersistence} module:

\begin{verbatim}
# Set-up file to build the cPersistence module.
# Note that the text should begin in the first column.
*shared*

# We need the path to the directory containing the ExtensionClass
# include file.
EC=/projects/ExtensionClass
cPersistence cPersistence.c -I$(EC)
\end{verbatim} % $ <-- bow to font lock

2634After the \file{Setup} file has been created, \file{Makefile.pre.in}
2635is run with the \samp{boot} target to create a make file:
2636
2637\begin{verbatim}
2638make -f Makefile.pre.in boot
2639\end{verbatim}
2640
This creates the file \file{Makefile}. To build the extensions, simply
run the created make file:
2643
2644\begin{verbatim}
2645make
2646\end{verbatim}
2647
2648It's not necessary to re-run \file{Makefile.pre.in} if the
2649\file{Setup} file is changed. The make file automatically rebuilds
2650itself if the \file{Setup} file changes.
2651

\section{Building Custom Interpreters \label{custom-interps}}

2655The make file built by \file{Makefile.pre.in} can be run with the
2656\samp{static} target to build an interpreter:
2657
2658\begin{verbatim}
2659make static
2660\end{verbatim}
2661
Any modules defined in the \file{Setup} file before the
\samp{*shared*} line will be statically linked into the interpreter.
Typically, a \samp{*shared*} line is omitted from the
\file{Setup} file when a custom interpreter is desired.


\section{Module Definition Options \label{module-defn-options}}

Several compiler options are supported:

\begin{tableii}{l|l}{programopt}{Option}{Meaning}
  \lineii{-C}{Tell the C pre-processor not to discard comments}
  \lineii{-D\var{name}=\var{value}}{Define a macro}
  \lineii{-I\var{dir}}{Specify an include directory, \var{dir}}
  \lineii{-L\var{dir}}{Specify a link-time library directory, \var{dir}}
  \lineii{-R\var{dir}}{Specify a run-time library directory, \var{dir}}
  \lineii{-l\var{lib}}{Link a library, \var{lib}}
  \lineii{-U\var{name}}{Undefine a macro}
\end{tableii}

Other compiler options can be included (snuck in) by putting them
in variables.

Source files can include files with \file{.c}, \file{.C}, \file{.cc},
\file{.cpp}, \file{.cxx}, and \file{.c++} extensions.

Other input files include files with \file{.a}, \file{.o}, \file{.sl},
and \file{.so} extensions.

2691
\section{Example \label{module-defn-example}}

Here is a more complicated example from \file{Modules/Setup.dist}:

2696\begin{verbatim}
2697GMP=/ufs/guido/src/gmp
2698mpz mpzmodule.c -I$(GMP) $(GMP)/libgmp.a
2699\end{verbatim}
2700
2701which could also be written as:
2702
2703\begin{verbatim}
2704mpz mpzmodule.c -I$(GMP) -L$(GMP) -lgmp
2705\end{verbatim}
2706
2707
\section{Distributing your extension modules
         \label{distributing}}

There are two ways to distribute extension modules for others to use.
2712The way that allows the easiest cross-platform support is to use the
2713\module{distutils}\refstmodindex{distutils} package. The manual
2714\citetitle[../dist/dist.html]{Distributing Python Modules} contains
2715information on this approach. It is recommended that all new
2716extensions be distributed using this approach to allow easy building
2717and installation across platforms. Older extensions should migrate to
2718this approach as well.
2719
2720What follows describes the older approach; there are still many
2721extensions which use this.
2722
When distributing your extension modules in source form, make sure to
include a \file{Setup} file. The \file{Setup} file should be named
\file{Setup.in} in the distribution. \file{Makefile.pre.in} will copy
\file{Setup.in} to \file{Setup} if the person installing the extension
doesn't do so manually.
Distributing a \file{Setup.in} file makes it easy for people to
customize the \file{Setup} file while keeping the original in
\file{Setup.in}.
2731
2732It is a good idea to include a copy of \file{Makefile.pre.in} for
2733people who do not have a source distribution of Python.
2734
Do not distribute a make file. People building your modules
should use \file{Makefile.pre.in} to build their own make file. A
\file{README} file included in the package should provide simple
instructions to perform the build.


\chapter{Building C and \Cpp{} Extensions on Windows
         \label{building-on-windows}}


This chapter briefly explains how to create a Windows extension module
for Python using Microsoft Visual \Cpp{}, and follows with more
detailed background information on how it works. The explanatory
material is useful for both the Windows programmer learning to build
Python extensions and the \UNIX{} programmer interested in producing
software which can be successfully built on both \UNIX{} and Windows.


\section{A Cookbook Approach \label{win-cookbook}}

\sectionauthor{Neil Schemenauer}{neil_schemenauer@transcanada.com}

This section provides a recipe for building a Python extension on
Windows.

Grab the binary installer from \url{http://www.python.org/} and
install Python. The binary installer has all of the required header
files except for \file{pyconfig.h}.

Get the source distribution and extract it into a convenient location.
Copy the \file{pyconfig.h} from the \file{PC/} directory into the
\file{include/} directory created by the installer.

Create a \file{Setup} file for your extension module, as described in
chapter \ref{building-on-unix}.

Get David Ascher's \file{compile.py} script from
\url{http://starship.python.net/crew/da/compile/}. Run the script to
create Microsoft Visual \Cpp{} project files.

Open the DSW file in Visual \Cpp{} and select \strong{Build}.

2777If your module creates a new type, you may have trouble with this line:
2778
2779\begin{verbatim}
2780 PyObject_HEAD_INIT(&PyType_Type)
2781\end{verbatim}
2782
2783Change it to:
2784
2785\begin{verbatim}
2786 PyObject_HEAD_INIT(NULL)
2787\end{verbatim}
2788
2789and add the following to the module initialization function:
2790
2791\begin{verbatim}
2792 MyObject_Type.ob_type = &PyType_Type;
2793\end{verbatim}
2794
Refer to section 3 of the
\citetitle[http://www.python.org/doc/FAQ.html]{Python FAQ} for details
on why you must do this.


\section{Differences Between \UNIX{} and Windows
         \label{dynamic-linking}}
\sectionauthor{Chris Phoenix}{cphoenix@best.com}
2803
2804
2805\UNIX{} and Windows use completely different paradigms for run-time
2806loading of code. Before you try to build a module that can be
2807dynamically loaded, be aware of how your system works.
2808
In \UNIX{}, a shared object (\file{.so}) file contains code to be used by the
program, and also the names of functions and data that it expects to
find in the program. When the file is joined to the program, all
references to those functions and data in the file's code are changed
to point to the actual locations in the program where the functions
and data are placed in memory. This is basically a link operation.
2815
2816In Windows, a dynamic-link library (\file{.dll}) file has no dangling
2817references. Instead, an access to functions or data goes through a
2818lookup table. So the DLL code does not have to be fixed up at runtime
2819to refer to the program's memory; instead, the code already uses the
2820DLL's lookup table, and the lookup table is modified at runtime to
2821point to the functions and data.
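
The indirection described above can be pictured with a table of
function pointers. The following plain C sketch is only an analogy, not
the actual layout used by the Windows loader; all names
(\code{demo_table}, \code{demo_bind()}, \code{demo_call_square()}) are
hypothetical. The calling code is written once against a table slot,
and only the slot is filled in at ``load time''.

```c
#include <stddef.h>

/* A one-slot "lookup table": callers refer to the slot, never to the
 * function's address directly. */
typedef int (*demo_fn)(int);
demo_fn demo_table[1] = { NULL };

/* Stands in for a function that lives in the program being extended. */
int demo_square(int x) { return x * x; }

/* Stands in for the loader: fill in the slot once the real address is
 * known.  The calling code below is not modified. */
void demo_bind(void) { demo_table[0] = demo_square; }

/* Stands in for code inside the DLL: every call goes through the table. */
int demo_call_square(int x) { return demo_table[0](x); }
```

In the \UNIX{} model, by contrast, the call site itself would be
rewritten to refer to \code{demo_square()} directly.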
2822
2823In \UNIX{}, there is only one type of library file (\file{.a}) which
2824contains code from several object files (\file{.o}). During the link
2825step to create a shared object file (\file{.so}), the linker may find
2826that it doesn't know where an identifier is defined. The linker will
2827look for it in the object files in the libraries; if it finds it, it
2828will include all the code from that object file.
2829
2830In Windows, there are two types of library, a static library and an
2831import library (both called \file{.lib}). A static library is like a
2832\UNIX{} \file{.a} file; it contains code to be included as necessary.
2833An import library is basically used only to reassure the linker that a
2834certain identifier is legal, and will be present in the program when
2835the DLL is loaded. So the linker uses the information from the
2836import library to build the lookup table for using identifiers that
2837are not included in the DLL. When an application or a DLL is linked,
2838an import library may be generated, which will need to be used for all
2839future DLLs that depend on the symbols in the application or DLL.
2840
2841Suppose you are building two dynamic-load modules, B and C, which should
2842share another block of code A. On \UNIX{}, you would \emph{not} pass
2843\file{A.a} to the linker for \file{B.so} and \file{C.so}; that would
2844cause it to be included twice, so that B and C would each have their
2845own copy. In Windows, building \file{A.dll} will also build
2846\file{A.lib}. You \emph{do} pass \file{A.lib} to the linker for B and
2847C. \file{A.lib} does not contain code; it just contains information
2848which will be used at runtime to access A's code.
2849
2850In Windows, using an import library is sort of like using \samp{import
2851spam}; it gives you access to spam's names, but does not create a
2852separate copy. On \UNIX{}, linking with a library is more like
2853\samp{from spam import *}; it does create a separate copy.
2854
2855
2856\section{Using DLLs in Practice \label{win-dlls}}
2857\sectionauthor{Chris Phoenix}{cphoenix@best.com}
2858
2859Windows Python is built in Microsoft Visual \Cpp{}; using other
2860compilers may or may not work (though Borland seems to). The rest of
2861this section is MSV\Cpp{} specific.
2862
2863When creating DLLs in Windows, you must pass \file{python15.lib} to
2864the linker. To build two DLLs, spam and ni (which uses C functions
2865found in spam), you could use these commands:
2866
2867\begin{verbatim}
2868cl /LD /I/python/include spam.c ../libs/python15.lib
2869cl /LD /I/python/include ni.c spam.lib ../libs/python15.lib
2870\end{verbatim}
2871
The first command created three files: \file{spam.obj},
\file{spam.dll} and \file{spam.lib}. \file{spam.dll} does not contain
any Python functions (such as \cfunction{PyArg_ParseTuple()}), but it
does know how to find the Python code thanks to \file{python15.lib}.
2876
2877The second command created \file{ni.dll} (and \file{.obj} and
2878\file{.lib}), which knows how to find the necessary functions from
2879spam, and also from the Python executable.
2880
2881Not every identifier is exported to the lookup table. If you want any
2882other modules (including Python) to be able to see your identifiers,
2883you have to say \samp{_declspec(dllexport)}, as in \samp{void
2884_declspec(dllexport) initspam(void)} or \samp{PyObject
2885_declspec(dllexport) *NiGetSpamData(void)}.
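
Sources that must also build on \UNIX{} often hide this keyword behind
a macro. The following is a sketch of that common idiom;
\code{DEMO_EXPORT} and \code{demo_answer()} are hypothetical names, and
the sketch uses the double-underscore spelling \code{__declspec}, which
is the form documented by Microsoft.

```c
/* Expand to the export keyword only when building on Windows; on other
 * platforms the macro expands to nothing. */
#if defined(_WIN32)
#  define DEMO_EXPORT __declspec(dllexport)
#else
#  define DEMO_EXPORT
#endif

/* A function that other modules should be able to see. */
DEMO_EXPORT int
demo_answer(void)
{
    return 42;
}
```

With such a macro the same source file can be compiled unchanged on
both platforms.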
2886
2887Developer Studio will throw in a lot of import libraries that you do
2888not really need, adding about 100K to your executable. To get rid of
2889them, use the Project Settings dialog, Link tab, to specify
2890\emph{ignore default libraries}. Add the correct
2891\file{msvcrt\var{xx}.lib} to the list of libraries.
2892
2893
\chapter{Embedding Python in Another Application
         \label{embedding}}

The previous chapters discussed how to extend Python, that is, how to
extend the functionality of Python by attaching a library of C
functions to it. It is also possible to do it the other way around:
enrich your C/\Cpp{} application by embedding Python in it. Embedding
provides your application with the ability to implement some of the
functionality of your application in Python rather than C or \Cpp.
This can be used for many purposes; one example would be to allow
users to tailor the application to their needs by writing some scripts
in Python. You can also use it yourself if some of the functionality
can be written in Python more easily.

Embedding Python is similar to extending it, but not quite. The
difference is that when you extend Python, the main program of the
application is still the Python interpreter, while if you embed
Python, the main program may have nothing to do with Python;
instead, some parts of the application occasionally call the Python
interpreter to run some Python code.
2914
So if you are embedding Python, you are providing your own main
program. One of the things this main program has to do is initialize
the Python interpreter. At the very least, you have to call the
function \cfunction{Py_Initialize()} (on Mac OS, call
\cfunction{PyMac_Initialize()} instead). There are optional calls to
pass command line arguments to Python. Then later you can call the
interpreter from any part of the application.

There are several different ways to call the interpreter: you can pass
a string containing Python statements to
\cfunction{PyRun_SimpleString()}, or you can pass a stdio file pointer
and a file name (for identification in error messages only) to
\cfunction{PyRun_SimpleFile()}. You can also call the lower-level
operations described in the previous chapters to construct and use
Python objects.

A simple demo of embedding Python can be found in the directory
\file{Demo/embed/} of the source distribution.


\begin{seealso}
  \seetitle[../api/api.html]{Python/C API Reference Manual}{The
            details of Python's C interface are given in this manual.
            A great deal of necessary information can be found here.}
\end{seealso}
2940
2941
2942\section{Very High Level Embedding
2943 \label{high-level-embedding}}
2944
2945The simplest form of embedding Python is the use of the very
2946high level interface. This interface is intended to execute a
2947Python script without needing to interact with the application
2948directly. This can for example be used to perform some operation
2949on a file.
2950
2951\begin{verbatim}
2952#include <Python.h>
2953
2954int main()
2955{
2956 Py_Initialize();
2957 PyRun_SimpleString("from time import time,ctime\n"
2958 "print 'Today is',ctime(time())\n");
2959 Py_Finalize();
2960 return 0;
2961}
2962\end{verbatim}
2963
The above code first initializes the Python interpreter with
\cfunction{Py_Initialize()}, followed by the execution of a hard-coded
Python script that prints the date and time. Afterwards, the
\cfunction{Py_Finalize()} call shuts the interpreter down, followed by
the end of the program. In a real program, you may want to get the
Python script from another source, perhaps a text-editor routine, a
file, or a database. Getting the Python code from a file is better
done with the \cfunction{PyRun_SimpleFile()} function, which
saves you the trouble of allocating memory space and loading the file
contents.
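
For comparison, this is roughly the work that
\cfunction{PyRun_SimpleFile()} saves you if you were to load a script
yourself and hand the text to \cfunction{PyRun_SimpleString()}. The
sketch is plain C and does not need \file{Python.h};
\code{demo_read_all()} is a hypothetical helper, not part of the Python
API.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read everything remaining in fp into a newly allocated,
 * NUL-terminated buffer.  Returns NULL on allocation or read failure;
 * the caller frees the result. */
char *
demo_read_all(FILE *fp)
{
    size_t used = 0, cap = 1024, n;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    /* Always leave room for the terminating NUL byte. */
    while ((n = fread(buf + used, 1, cap - used - 1, fp)) > 0) {
        used += n;
        if (used == cap - 1) {
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }
    if (ferror(fp)) {
        free(buf);
        return NULL;
    }
    buf[used] = '\0';
    return buf;
}
```

The buffer grows by doubling, so the file size does not need to be
known in advance.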
2974
2975
2976\section{Beyond Very High Level Embedding: An overview
2977 \label{lower-level-embedding}}
2978
2979The high level interface gives you the ability to execute
2980arbitrary pieces of Python code from your application, but
2981exchanging data values is quite cumbersome to say the least. If
2982you want that, you should use lower level calls. At the cost of
2983having to write more C code, you can achieve almost anything.
2984
It should be noted that extending Python and embedding Python
are much the same activity, despite the different intent. Most
topics discussed in the previous chapters are still valid. To
show this, consider what the extension code from Python to C
really does:

\begin{enumerate}
    \item Convert data values from Python to C,
    \item Perform a function call to a C routine using the
        converted values, and
    \item Convert the data values from the call from C to Python.
\end{enumerate}

When embedding Python, the interface code does:

\begin{enumerate}
    \item Convert data values from C to Python,
    \item Perform a function call to a Python interface routine
        using the converted values, and
    \item Convert the data values from the call from Python to C.
\end{enumerate}

As you can see, the data conversion steps are simply swapped to
accommodate the different direction of the cross-language transfer.
The only difference is the routine that you call between both
data conversions. When extending, you call a C routine; when
embedding, you call a Python routine.

This chapter will not discuss how to convert data from Python
to C and vice versa. Also, proper use of references and dealing
with errors are assumed to be understood. Since these aspects do not
differ from extending the interpreter, you can refer to earlier
chapters for the required information.


\section{Pure Embedding
         \label{pure-embedding}}

The first program aims to execute a function in a Python
script. As in the section about the very high level interface,
the Python interpreter does not directly interact with the
application (but that will change in the next section).

The code to run a function defined in a Python script is:

\verbatiminput{run-func.c}

This code loads a Python script using \code{argv[1]}, and calls the
function named in \code{argv[2]}. Its integer arguments are the other
values of the \code{argv} array. If you compile and link this
program (let's call the finished executable \program{call}), and use
it to execute a Python script, such as:

\begin{verbatim}
def multiply(a,b):
    print "Thy shall add", a, "times", b
    c = 0
    for i in range(0, a):
        c = c + b
    return c
\end{verbatim}

then, assuming the script is saved as \file{multiply.py}, the result
should be:

\begin{verbatim}
$ call multiply multiply 3 2
Thy shall add 3 times 2
Result of call: 6
\end{verbatim} % $

Although the program is quite large for its functionality, most of the
code is for data conversion between Python and C, and for error
reporting. The interesting part with respect to embedding Python
starts with

\begin{verbatim}
    Py_Initialize();
    pName = PyString_FromString(argv[1]);
    /* Error checking of pName left out */
    pModule = PyImport_Import(pName);
\end{verbatim}

After initializing the interpreter, the script is loaded using
\cfunction{PyImport_Import()}. This routine needs a Python string
as its argument, which is constructed using the
\cfunction{PyString_FromString()} data conversion routine.

\begin{verbatim}
    pDict = PyModule_GetDict(pModule);
    /* pDict is a borrowed reference */

    pFunc = PyDict_GetItemString(pDict, argv[2]);
    /* pFunc is a borrowed reference */

    if (pFunc && PyCallable_Check(pFunc)) {
        ...
    }
\end{verbatim}

Once the script is loaded, its dictionary is retrieved with
\cfunction{PyModule_GetDict()}. The dictionary is then searched using
the normal dictionary access routines for the function name. If the
name exists, and the object returned is callable, you can safely
assume that it is a function. The program then proceeds by
constructing a tuple of arguments as normal. The call to the Python
function is then made with:

\begin{verbatim}
    pValue = PyObject_CallObject(pFunc, pArgs);
\end{verbatim}

Upon return of the function, \code{pValue} is either \NULL{} or it
contains a reference to the return value of the function. Be sure to
release the reference after examining the value.
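
For comparison, the sequence of C calls above corresponds roughly to
the following Python code. This is only a sketch:
\function{call_function()} is a hypothetical helper name, the
\module{importlib} module is assumed to be available (it is part of
current Python versions), and the comments name the C API call that
each step mirrors.

```python
import importlib


def call_function(module_name, func_name, *args):
    """Hypothetical Python counterpart of the steps performed by run-func.c."""
    # PyImport_Import(): load the module by name.
    module = importlib.import_module(module_name)
    # PyModule_GetDict() + PyDict_GetItemString(): look up the name.
    func = vars(module).get(func_name)
    # PyCallable_Check(): make sure the object can be called.
    if func is None or not callable(func):
        raise LookupError("cannot find function %r" % func_name)
    # PyObject_CallObject(): call with a tuple of arguments.
    return func(*args)


if __name__ == "__main__":
    # The standard operator module stands in for a user script here.
    print("Result of call:", call_function("operator", "mul", 3, 2))
```

Unlike the C version, the Python code needs no explicit reference
counting; the interpreter releases the intermediate objects itself.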


\section{Extending Embedded Python
         \label{extending-with-embedding}}

Until now, the embedded Python interpreter had no access to
functionality from the application itself. The Python API allows this
by extending the embedded interpreter. That is, the embedded
interpreter gets extended with routines provided by the application.
While it sounds complex, it is not so bad. Simply forget for a while
that the application starts the Python interpreter. Instead, consider
the application to be a set of subroutines, and write some glue code
that gives Python access to those routines, just like you would write
a normal Python extension. For example:

\begin{verbatim}
static int numargs=0;

/* Return the number of arguments of the application command line */
static PyObject*
emb_numargs(PyObject *self, PyObject *args)
{
    if(!PyArg_ParseTuple(args, ":numargs"))
        return NULL;
    return Py_BuildValue("i", numargs);
}

static PyMethodDef EmbMethods[]={
    {"numargs", emb_numargs, METH_VARARGS},
    {NULL, NULL}
};
\end{verbatim}

Insert the above code just above the \cfunction{main()} function.
Also, insert the following two statements directly after
\cfunction{Py_Initialize()}:

\begin{verbatim}
    numargs = argc;
    Py_InitModule("emb", EmbMethods);
\end{verbatim}

These two lines initialize the \code{numargs} variable, and make the
\function{emb.numargs()} function accessible to the embedded Python
interpreter. With these extensions, the Python script can do things
like:

\begin{verbatim}
import emb
print "Number of arguments", emb.numargs()
\end{verbatim}

In a real application, the methods will expose an API of the
application to Python.
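
Because the \module{emb} module exists only inside the embedding
application, a script that imports it cannot be run by a stand-alone
interpreter. One possible workaround when testing such scripts,
sketched here under the assumption that a fixed return value is
acceptable, is to register a pure-Python stand-in before the script is
imported. The helper \function{install_fake_emb()} is hypothetical; it
is not part of the application or of Python, and the sketch is written
for a current Python interpreter.

```python
import sys
import types


def install_fake_emb(numargs_value):
    """Register a hypothetical pure-Python substitute for the emb module."""
    emb = types.ModuleType("emb")
    # Mimic emb.numargs(): takes no arguments, returns an integer.
    emb.numargs = lambda: numargs_value
    # Modules listed in sys.modules are found by a later "import emb".
    sys.modules["emb"] = emb
    return emb


if __name__ == "__main__":
    install_fake_emb(3)
    import emb
    print("Number of arguments", emb.numargs())
```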


%\section{For the future}
%
%You don't happen to have a nice library to get textual
%equivalents of numeric values do you :-) ?
%Callbacks here ? (I may be using information from that section
%?!)
%threads
%code examples do not really behave well if errors happen
% (what to watch out for)


\section{Embedding Python in \Cpp{}
         \label{embeddingInCplusplus}}

It is also possible to embed Python in a \Cpp{} program. Precisely how
this is done will depend on the details of the \Cpp{} system used. In
general, you will need to write the main program in \Cpp{} and use the
\Cpp{} compiler to compile and link your program. There is no need to
recompile Python itself using \Cpp{}.


\section{Linking Requirements
         \label{link-reqs}}

While the \program{configure} script shipped with the Python sources
will correctly build Python to export the symbols needed by
dynamically linked extensions, this is not automatically inherited by
applications which embed the Python library statically, at least on
\UNIX. This is an issue when the application is linked to the static
runtime library (\file{libpython.a}) and needs to load dynamic
extensions (implemented as \file{.so} files).

The problem is that some entry points are defined by the Python
runtime solely for extension modules to use. If the embedding
application does not use any of these entry points, some linkers will
not include those entries in the symbol table of the finished
executable. Some additional options are needed to inform the linker
not to remove these symbols.

Determining the right options to use for any given platform can be
quite difficult, but fortunately the Python configuration already has
those values. To retrieve them from an installed Python interpreter,
start an interactive interpreter and have a short session like this:

\begin{verbatim}
>>> import distutils.sysconfig
>>> distutils.sysconfig.get_config_var('LINKFORSHARED')
'-Xlinker -export-dynamic'
\end{verbatim}
\refstmodindex{distutils.sysconfig}

The contents of the string presented will be the options that should
be used. If the string is empty, there's no need to add any
additional options. The \constant{LINKFORSHARED} definition
corresponds to the variable of the same name in Python's top-level
\file{Makefile}.
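
The same lookup can also be scripted, for instance from a build
script. The sketch below assumes an interpreter that provides the
standard \module{sysconfig} module, whose \function{get_config_var()}
function reads the same build-time variables as
\module{distutils.sysconfig}. The helper name
\function{embedding_link_options()} is hypothetical.

```python
import sysconfig


def embedding_link_options():
    """Return the LINKFORSHARED options as a list (empty if none are needed)."""
    value = sysconfig.get_config_var("LINKFORSHARED")
    # get_config_var() returns None when the variable is not defined,
    # and the string may be empty when no extra options are required.
    return value.split() if value else []


if __name__ == "__main__":
    # Print the options on one line so they can be pasted into a link command.
    print(" ".join(embedding_link_options()))
```

An empty list corresponds to the empty-string case described above:
no additional linker options are required.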


\appendix
\chapter{Reporting Bugs}
\input{reportingbugs}

\chapter{History and License}
\input{license}

\end{document}
Guido van Rossum7a2dba21993-11-05 14:45:11 +00003220\end{document}