Blame - Doc/ext.tex - platform/external/python/cpython2

blob: f92d96c07a45905f1c16eb89e418f1a8b2ea1050

Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1	\documentstyle[twoside,11pt,myformat]{report}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	2
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	3	% XXX PM Modulator
				4
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	5	\title{Extending and Embedding the Python Interpreter}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	6
Guido van Rossum	16cd7f9	1994-10-06 10:29:26 +0000	[diff] [blame]	7	\input{boilerplate}
Guido van Rossum	83eb962	1993-11-23 16:28:45 +0000	[diff] [blame]	8
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	9	% Tell \index to actually write the .idx file
				10	\makeindex
				11
				12	\begin{document}
				13
				14	\pagenumbering{roman}
				15
				16	\maketitle
				17
Guido van Rossum	16cd7f9	1994-10-06 10:29:26 +0000	[diff] [blame]	18	\input{copyright}
				19
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	20	\begin{abstract}
				21
				22	\noindent
Guido van Rossum	16d6e71	1994-08-08 12:30:22 +0000	[diff] [blame]	23	This document describes how to write modules in C or \Cpp{} to extend the
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	24	Python interpreter. It also describes how to use Python as an
				25	`embedded' language, and how extension modules can be loaded
				26	dynamically (at run time) into the interpreter, if the operating
				27	system supports this feature.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	28
				29	\end{abstract}
				30
				31	\pagebreak
				32
				33	{
				34	\parskip = 0mm
				35	\tableofcontents
				36	}
				37
				38	\pagebreak
				39
				40	\pagenumbering{arabic}
				41
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	42
Guido van Rossum	16d6e71	1994-08-08 12:30:22 +0000	[diff] [blame]	43	\chapter{Extending Python with C or \Cpp{} code}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	44
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	45
				46	\section{Introduction}
				47
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	48	It is quite easy to add non-standard built-in modules to Python, if
				49	you know how to program in C. A built-in module known to the Python
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	50	programmer as \code{spam} is generally implemented by a file called
				51	\file{spammodule.c} (if the module name is very long, like
				52	\samp{spammify}, you can drop the \samp{module}, leaving a file name
				53	like \file{spammify.c}). The standard built-in modules also adhere to
				54	this convention, and in fact some of them are excellent examples of
				55	how to create an extension.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	56
				57	Extension modules can do two things that can't be done directly in
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	58	Python: they can implement new data types (which are different from
Guido van Rossum	16d6e71	1994-08-08 12:30:22 +0000	[diff] [blame]	59	classes, by the way), and they can make system calls or call C library
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	60	functions.
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	61
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	62	To support extensions, the Python API (Application Programmer's
				63	Interface) defines many functions, macros and variables that provide
				64	access to almost every aspect of the Python run-time system.
				65	Most of the Python API is imported by including the single header file
				66	\file{"Python.h"}. All user-visible symbols defined by including this
				67	file have a prefix of \samp{Py} or \samp{PY}, except those defined in
				68	standard header files --- for convenience, and since they are needed by
				69	the Python interpreter, \file{"Python.h"} includes a few standard
				70	header files: \file{<stdio.h>}, \file{<string.h>}, \file{<errno.h>},
				71	and \file{<stdlib.h>}. If the latter header file does not exist on
				72	your system, \file{"Python.h"} declares the functions \code{malloc()},
				73	\code{free()} and \code{realloc()} itself.
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	74
				75	The compilation of an extension module depends on your system setup
				76	and the intended use of the module; details are given in a later
				77	section.
				78
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	79	Note: unless otherwise mentioned, all file references in this
				80	document are relative to the Python toplevel directory
				81	(the directory that contains the \file{configure} script).
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	82
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	83
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	84	\section{A Simple Example}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	85
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	86	Let's create an extension module called \samp{spam}. Create a file
				87	\file{spammodule.c}. The first line of this file can be:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	88
				89	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	90	#include "Python.h"
				91	\end{verbatim}
				92
				93	which pulls in the Python API (you can add a comment describing the
				94	purpose of the module and a copyright notice if you like).
				95
				96	Let's create a Python interface to the C library function
				97	\code{system()}.\footnote{An interface for this function already
				98	exists in the \code{posix} module --- it was chosen as a simple and
				99	straightforward example.} This function takes a zero-terminated
				100	character string as argument and returns an integer. We will want
				101	this function to be callable from Python as follows:
				102
				103	\begin{verbatim}
				104	>>> import spam
				105	>>> status = spam.system("ls -l")
				106	\end{verbatim}
				107
				108	The next thing we add to our module file is the C function that will
				109	be called when the Python expression \samp{spam.system(\var{string})}
				110	is evaluated (we'll see shortly how it ends up being called):
				111
				112	\begin{verbatim}
				113	static PyObject *
				114	spam_system(self, args)
				115	    PyObject *self;
				116	    PyObject *args;
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	117	{
				118	    char *command;
				119	    int sts;
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	120	    if (!PyArg_ParseTuple(args, "s", &command))
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	121	        return NULL;
				122	    sts = system(command);
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	123	    return Py_BuildValue("i", sts);
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	124	}
				125	\end{verbatim}
				126
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	127	There is a straightforward translation from the argument list in
				128	Python (here the single expression \code{"ls -l"}) to the arguments
				129	that are passed to the C function. The C function always has two
				130	arguments, conventionally named \var{self} and \var{args}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	131
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	132	The \var{self} argument is only used when the C function implements a
				133	built-in method --- this will be discussed later. In the example,
				134	\var{self} will always be a \code{NULL} pointer, since we are defining
				135	a function, not a method. (This is done so that the interpreter
				136	doesn't have to understand two different types of C functions.)
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	137
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	138	The \var{args} argument will be a pointer to a Python tuple object
				139	containing the arguments --- the length of the tuple will be the
				140	number of arguments. It is necessary to do full argument type
				141	checking in each call, since otherwise the Python user would be able
				142	to cause the Python interpreter to crash (rather than raising an
				143	exception) by passing invalid arguments to a function in an extension
				144	module. Because argument checking and converting arguments to C are
				145	such common tasks, there's a general function in the Python
				146	interpreter that combines them: \code{PyArg_ParseTuple()}. It uses a
				147	template string to determine the types of the Python arguments and the
				148	types of the C variables into which it should store the converted
				149	values (more about this later).
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	150
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	151	\code{PyArg_ParseTuple()} returns nonzero if all arguments have the
				152	right type and their values have been stored in the variables whose
				153	addresses are passed. It returns zero if an invalid argument was
				154	passed. In the latter case it also raises an appropriate exception,
				155	so the calling function can return \code{NULL} immediately. Here's
				156	why:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	157
				158
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	159	\section{Intermezzo: Errors and Exceptions}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	160
				161	An important convention throughout the Python interpreter is the
				162	following: when a function fails, it should set an exception condition
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	163	and return an error value (usually a \code{NULL} pointer). Exceptions
				164	are stored in a static global variable inside the interpreter; if
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	165	this variable is \code{NULL}, no exception has occurred. A second
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	166	global variable stores the `associated value' of the exception
				167	--- the second argument to \code{raise}. A third variable contains
				168	the stack traceback in case the error originated in Python code.
				169	These three variables are the C equivalents of the Python variables
				170	\code{sys.exc_type}, \code{sys.exc_value} and \code{sys.exc_traceback}
				171	--- see the section on module \code{sys} in the Library Reference
				172	Manual. It is important to know about them to understand how errors
				173	are passed around.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	174
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	175	The Python API defines a host of functions to set various types of
				176	exceptions. The most common one is \code{PyErr_SetString()} --- its
				177	arguments are an exception object (e.g. \code{PyExc_RuntimeError} ---
				178	actually it can be any object that is a legal exception indicator),
				179	and a C string indicating the cause of the error (this is converted to
				180	a string object and stored as the `associated value' of the
				181	exception). Another useful function is \code{PyErr_SetFromErrno()},
				182	which only takes an exception argument and constructs the associated
				183	value by inspection of the (\UNIX{}) global variable \code{errno}. The
				184	most general function is \code{PyErr_SetObject()}, which takes two
				185	object arguments, the exception and its associated value. You don't
				186	need to \code{Py_INCREF()} the objects passed to any of these
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	187	functions.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	188
				189	You can test non-destructively whether an exception has been set with
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	190	\code{PyErr_Occurred()} --- this returns the current exception object,
				191	or \code{NULL} if no exception has occurred. Most code never needs to
				192	call \code{PyErr_Occurred()} to see whether an error occurred or not,
				193	but relies on error return values from the functions it calls instead.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	194
				195	When a function that calls another function detects that the called
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	196	function fails, it should return an error value (e.g. \code{NULL} or
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	197	\code{-1}). It shouldn't call one of the \code{PyErr_*} functions ---
				198	one has already been called. The caller is then supposed to also
				199	return an error indication to {\em its} caller, again {\em without}
				200	calling \code{PyErr_*()}, and so on --- the most detailed cause of the
				201	error was already reported by the function that first detected it.
				202	Once the error reaches the Python interpreter's main loop, the loop aborts
				203	the currently executing Python code and tries to find an exception
				204	handler specified by the Python programmer.
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	205
				206	(There are situations where a module can actually give a more detailed
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	207	error message by calling another \code{PyErr_*} function, and in such
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	208	cases it is fine to do so. As a general rule, however, this is not
				209	necessary, and can cause information about the cause of the error to
				210	be lost: most operations can fail for a variety of reasons.)
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	211
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	212	To ignore an exception set by a function call that failed, the exception
				213	condition must be cleared explicitly by calling \code{PyErr_Clear()}.
				214	The only time C code should call \code{PyErr_Clear()} is if it doesn't
				215	want to pass the error on to the interpreter but wants to handle it
				216	completely by itself (e.g. by trying something else or pretending
				217	nothing happened).
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	218
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	219	Note that a failing \code{malloc()} call must also be turned into an
				220	exception --- the direct caller of \code{malloc()} (or
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	221	\code{realloc()}) must call \code{PyErr_NoMemory()} and return a
				222	failure indicator itself. All the object-creating functions
				223	(\code{PyInt_FromLong()} etc.) already do this, so this note matters
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	224	only if you call \code{malloc()} directly.
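
As a minimal sketch of this rule, the function below copies its string
argument into a buffer obtained from \code{malloc()} and turns an
allocation failure into an exception. The name \code{spam_copy} is
hypothetical, and the function is written with ANSI C prototypes rather
than the older declaration style used in the other examples:

\begin{verbatim}
static PyObject *
spam_copy(PyObject *self, PyObject *args)   /* hypothetical name */
{
    char *text, *buf;
    PyObject *result;

    if (!PyArg_ParseTuple(args, "s", &text))
        return NULL;              /* exception already set */
    buf = (char *) malloc(strlen(text) + 1);
    if (buf == NULL) {
        PyErr_NoMemory();         /* turn the failure into an exception */
        return NULL;              /* and return a failure indicator */
    }
    strcpy(buf, text);
    result = PyString_FromString(buf);  /* sets its own exception on failure */
    free(buf);
    return result;
}
\end{verbatim}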
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	225
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	226	Also note that, with the important exception of
				227	\code{PyArg_ParseTuple()}, functions that return an integer status
				228	usually return \code{0} or a positive value for success and \code{-1}
				229	for failure (like \UNIX{} system calls).
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	230
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	231	Finally, be careful about cleaning up garbage (making \code{Py_XDECREF()}
				232	or \code{Py_DECREF()} calls for objects you have already created) when
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	233	you return an error!
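
The following sketch illustrates both conventions: an error from a
called function is passed on without calling another \code{PyErr_*()}
function, and objects created so far are released before returning.
The helper \code{make_pair} is hypothetical and uses ANSI C prototypes:

\begin{verbatim}
static PyObject *
make_pair(long a, long b)         /* hypothetical helper */
{
    PyObject *first, *second, *pair;

    first = PyInt_FromLong(a);
    if (first == NULL)
        return NULL;              /* exception set by PyInt_FromLong() */
    second = PyInt_FromLong(b);
    if (second == NULL) {
        Py_DECREF(first);         /* release what was created so far */
        return NULL;
    }
    /* The "O" format adds its own reference to each object */
    pair = Py_BuildValue("(OO)", first, second);
    Py_DECREF(first);
    Py_DECREF(second);
    return pair;                  /* NULL here passes the error on as well */
}
\end{verbatim}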
				234
				235	The choice of which exception to raise is entirely yours. There are
				236	predeclared C objects corresponding to all built-in Python exceptions,
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	237	e.g. \code{PyExc_ZeroDivisionError}, which you can use directly. Of
				238	course, you should choose exceptions wisely --- don't use
				239	\code{PyExc_TypeError} to mean that a file couldn't be opened (that
				240	should probably be \code{PyExc_IOError}). If something's wrong with
				241	the argument list, the \code{PyArg_ParseTuple()} function usually
				242	raises \code{PyExc_TypeError}. If you have an argument whose value
				243	must be in a particular range or must satisfy other conditions,
				244	\code{PyExc_ValueError} is appropriate.
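
For instance, a function that accepts only a percentage might check the
range itself after \code{PyArg_ParseTuple()} has checked the type. This
is a sketch only; the function name \code{spam_checked}, the accepted
range and the message text are assumptions made for illustration:

\begin{verbatim}
static PyObject *
spam_checked(PyObject *self, PyObject *args)   /* hypothetical name */
{
    int percent;

    if (!PyArg_ParseTuple(args, "i", &percent))
        return NULL;      /* wrong type: exception already set */
    if (percent < 0 || percent > 100) {
        /* right type, unacceptable value */
        PyErr_SetString(PyExc_ValueError,
                        "percent must be between 0 and 100");
        return NULL;
    }
    return Py_BuildValue("i", percent);
}
\end{verbatim}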
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	245
				246	You can also define a new exception that is unique to your module.
				247	For this, you usually declare a static object variable at the
				248	beginning of your file, e.g.
				249
				250	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	251	static PyObject *SpamError;
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	252	\end{verbatim}
				253
				254	and initialize it in your module's initialization function
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	255	(\code{initspam()}) with a string object, e.g. (leaving out the error
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	256	checking for simplicity):
				257
				258	\begin{verbatim}
				259	void
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	260	initspam()
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	261	{
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	262	    PyObject *m, *d;
				263	    m = Py_InitModule("spam", spam_methods);
				264	    d = PyModule_GetDict(m);
				265	    SpamError = PyString_FromString("spam.error");
				266	    PyDict_SetItemString(d, "error", SpamError);
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	267	}
				268	\end{verbatim}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	269
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	270	Note that the Python name for the exception object is \code{spam.error}
				271	--- it is conventional for module and exception names to be spelled in
				272	lower case. It is also conventional that the {\em value} of the
				273	exception object is the same as its name, e.g.\ the string
				274	\code{"spam.error"}.
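
Once \code{SpamError} has been initialized this way, the module's
functions can raise it like any predeclared exception. As a sketch,
and assuming that a negative return value from \code{system()} is the
condition worth reporting (an assumption of this example, as is the
message text), \code{spam_system()} could be extended like this, here
written with ANSI C prototypes:

\begin{verbatim}
static PyObject *
spam_system(PyObject *self, PyObject *args)
{
    char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
    sts = system(command);
    if (sts < 0) {
        /* Raise the module's own exception */
        PyErr_SetString(SpamError, "system command failed");
        return NULL;
    }
    return Py_BuildValue("i", sts);
}
\end{verbatim}

A Python program can then catch the error with
\samp{except spam.error}.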
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	275
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	276
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	277	\section{Back to the Example}
				278
				279	Going back to our example function, you should now be able to
				280	understand this statement:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	281
				282	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	283	if (!PyArg_ParseTuple(args, "s", &command))
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	284	    return NULL;
				285	\end{verbatim}
				286
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	287	It returns \code{NULL} (the error indicator for functions returning
				288	object pointers) if an error is detected in the argument list, relying
				289	on the exception set by \code{PyArg_ParseTuple()}. Otherwise the
				290	string value of the argument has been copied to the local variable
				291	\code{command}. This is a pointer assignment and you are not supposed
				292	to modify the string to which it points (so in ANSI C, the variable
				293	\code{command} should properly be declared as \code{const char
				294	*command}).
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	295
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	296	The next statement is a call to the \UNIX{} function \code{system()},
				297	passing it the string we just got from \code{PyArg_ParseTuple()}:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	298
				299	\begin{verbatim}
				300	sts = system(command);
				301	\end{verbatim}
				302
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	303	Our \code{spam.system()} function must return a value: the integer
				304	\code{sts} which contains the return value of the \UNIX{}
				305	\code{system()} function. This is done using the function
				306	\code{Py_BuildValue()}, which is something like the inverse of
				307	\code{PyArg_ParseTuple()}: it takes a format string and an arbitrary
				308	number of C values, and returns a new Python object. More info on
				309	\code{Py_BuildValue()} is given later.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	310
				311	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	312	return Py_BuildValue("i", sts);
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	313	\end{verbatim}
				314
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	315	In this case, it will return an integer object. (Yes, even integers
				316	are objects on the heap in Python!)
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	317
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	318	If you have a C function that returns no useful value (a function
				319	returning \code{void}), the corresponding Python function must return
				320	\code{None}. You need this idiom to do so:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	321
				322	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	323	Py_INCREF(Py_None);
				324	return Py_None;
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	325	\end{verbatim}
				326
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	327	\code{Py_None} is the C name for the special Python object
				328	\code{None}. It is a genuine Python object (not a \code{NULL}
				329	pointer, which means `error' in most contexts, as we have seen).
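
As a sketch of this idiom, the following function wraps the C library
function \code{srand()}, which returns \code{void}. The wrapper name
\code{spam_srand} is hypothetical, and ANSI C prototypes are used:

\begin{verbatim}
static PyObject *
spam_srand(PyObject *self, PyObject *args)   /* hypothetical name */
{
    int seed;

    if (!PyArg_ParseTuple(args, "i", &seed))
        return NULL;
    srand((unsigned int) seed);   /* nothing useful to return */
    Py_INCREF(Py_None);           /* None is an object: own a reference */
    return Py_None;
}
\end{verbatim}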
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	330
				331
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	332	\section{The Module's Method Table and Initialization Function}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	333
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	334	I promised to show how \code{spam_system()} is called from Python
				335	programs. First, we need to list its name and address in a ``method
				336	table'':
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	337
				338	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	339	static PyMethodDef spam_methods[] = {
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	340	    ...
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	341	    {"system", spam_system, 1},
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	342	    ...
				343	    {NULL, NULL} /* Sentinel */
				344	};
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	345	\end{verbatim}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	346
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	347	Note the third entry (\samp{1}). This is a flag telling the
				348	interpreter the calling convention to be used for the C function. It
				349	should normally be \samp{1}; a value of \samp{0} means that an
				350	obsolete variant of \code{PyArg_ParseTuple()} is used.
				351
				352	The method table must be passed to the interpreter in the module's
				353	initialization function (which should be the only non-\code{static}
				354	item defined in the module file):
				355
				356	\begin{verbatim}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	357	void
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	358	initspam()
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	359	{
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	360	    (void) Py_InitModule("spam", spam_methods);
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	361	}
				362	\end{verbatim}
				363
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	364	When the Python program imports module \code{spam} for the first time,
				365	\code{initspam()} is called. It calls \code{Py_InitModule()}, which
				366	creates a ``module object'' (which is inserted in the dictionary
				367	\code{sys.modules} under the key \code{"spam"}), and inserts built-in
				368	function objects into the newly created module based upon the table
				369	(an array of \code{PyMethodDef} structures) that was passed as its
				370	second argument. \code{Py_InitModule()} returns a pointer to the
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	371	module object that it creates (which is unused here). It aborts with
				372	a fatal error if the module could not be initialized satisfactorily,
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	373	so the caller doesn't need to check for errors.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	374
				375
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	376	\section{Compilation and Linkage}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	377
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	378	There are two more things to do before you can use your new extension
				379	module: compiling and linking it with the Python system. If you use
				380	dynamic loading, the details depend on the style of dynamic loading
				381	your system uses; see the chapter on Dynamic Loading for more info
				382	about this.
				383
				384	If you can't use dynamic loading, or if you want to make your module a
				385	permanent part of the Python interpreter, you will have to change the
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	386	configuration setup and rebuild the interpreter. Luckily, this is
				387	very simple: just place your file (\file{spammodule.c} for example) in
				388	the \file{Modules} directory, add a line to the file
				389	\file{Modules/Setup} describing your file:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	390
				391	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	392	spam spammodule.o
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	393	\end{verbatim}
				394
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	395	and rebuild the interpreter by running \code{make} in the toplevel
				396	directory. You can also run \code{make} in the \file{Modules}
				397	subdirectory, but then you must first rebuild the \file{Makefile}
				398	there by running \code{make Makefile}. (This is necessary each time
				399	you change the \file{Setup} file.)
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	400
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	401	If your module requires additional libraries to link with, these can
				402	be listed on the line in the \file{Setup} file as well, for instance:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	403
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	404	\begin{verbatim}
				405	spam spammodule.o -lX11
				406	\end{verbatim}
				407
				408
				409	\section{Calling Python Functions From C}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	410
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	411	So far we have concentrated on making C functions callable from
				412	Python. The reverse is also useful: calling Python functions from C.
				413	This is especially the case for libraries that support so-called
				414	`callback' functions. If a C interface makes use of callbacks, the
				415	equivalent Python interface often needs to provide a callback mechanism to the
				416	Python programmer; the implementation will require calling the Python
				417	callback functions from a C callback. Other uses are also imaginable.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	418
				419	Fortunately, the Python interpreter is easily called recursively, and
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	420	there is a standard interface to call a Python function. (I won't
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	421	dwell on how to call the Python parser with a particular string as
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	422	input --- if you're interested, have a look at the implementation of
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	423	the \samp{-c} command line option in \file{Python/pythonmain.c}.)
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	424
				425	Calling a Python function is easy. First, the Python program must
				426	somehow pass you the Python function object. You should provide a
				427	function (or some other interface) to do this. When this function is
				428	called, save a pointer to the Python function object (be careful to
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	429	\code{Py_INCREF()} it!) in a global variable --- or wherever you see fit.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	430	For example, the following function might be part of a module
				431	definition:
				432
				433	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	434	static PyObject *my_callback = NULL;
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	435
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	436	static PyObject *
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	437	my_set_callback(dummy, arg)
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	438	    PyObject *dummy, *arg;
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	439	{
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	440	    Py_XDECREF(my_callback); /* Dispose of previous callback */
				441	    Py_XINCREF(arg); /* Add a reference to new callback */
				442	    my_callback = arg; /* Remember new callback */
				443	    /* Boilerplate to return "None" */
				444	    Py_INCREF(Py_None);
				445	    return Py_None;
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	446	}
				447	\end{verbatim}
				448
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	449	The macros \code{Py_XINCREF()} and \code{Py_XDECREF()} increment/decrement
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	450	the reference count of an object and are safe in the presence of
				451	\code{NULL} pointers. More information on them is given in the section
				452	on Reference Counts below.
				453
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	454	Later, when it is time to call the function, you call the C function
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	455	\code{PyEval_CallObject()}. This function has two arguments, both
				456	pointers to arbitrary Python objects: the Python function, and the
				457	argument list. The argument list must always be a tuple object, whose
				458	length is the number of arguments. To call the Python function with
				459	no arguments, pass an empty tuple; to call it with one argument, pass
				460	a singleton tuple. \code{Py_BuildValue()} returns a tuple when its
				461	format string consists of zero or more format codes between
				462	parentheses. For example:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	463
				464	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	465	int arg;
				466	PyObject *arglist;
				467	PyObject *result;
				468	...
				469	arg = 123;
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	470	...
				471	/* Time to call the callback */
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	472	arglist = Py_BuildValue("(i)", arg);
				473	result = PyEval_CallObject(my_callback, arglist);
				474	Py_DECREF(arglist);
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	475	\end{verbatim}
				476
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	477	\code{PyEval_CallObject()} returns a Python object pointer: this is
				478	the return value of the Python function. \code{PyEval_CallObject()} is
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	479	`reference-count-neutral' with respect to its arguments. In the
				480	example a new tuple was created to serve as the argument list, which
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	481	is \code{Py_DECREF()}-ed immediately after the call.
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	482
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	483	The return value of \code{PyEval_CallObject()} is ``new'': either it
				484	is a brand new object, or it is an existing object whose reference
				485	count has been incremented. So, unless you want to save it in a
				486	global variable, you should somehow \code{Py_DECREF()} the result,
				487	even (especially!) if you are not interested in its value.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	488
				489	Before you do this, however, it is important to check that the return
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	490	value isn't \code{NULL}. If it is, the Python function terminated by raising
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	491	an exception. If the C code that called \code{PyEval_CallObject()} is
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	492	called from Python, it should now return an error indication to its
				493	Python caller, so the interpreter can print a stack trace, or the
				494	calling Python code can handle the exception. If this is not possible
				495	or desirable, the exception should be cleared by calling
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	496	\code{PyErr_Clear()}. For example:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	497
				498	\begin{verbatim}
if (result == NULL)
    return NULL; /* Pass error back */
...use result...
Py_DECREF(result);
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	503	\end{verbatim}
				504
				505	Depending on the desired interface to the Python callback function,
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	506	you may also have to provide an argument list to \code{PyEval_CallObject()}.
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	507	In some cases the argument list is also provided by the Python
				508	program, through the same interface that specified the callback
				509	function. It can then be saved and used in the same manner as the
				510	function object. In other cases, you may have to construct a new
				511	tuple to pass as the argument list. The simplest way to do this is to
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	512	call \code{Py_BuildValue()}. For example, if you want to pass an integral
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	513	event code, you might use the following code:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	514
				515	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	516	PyObject *arglist;
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	517	...
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	518	arglist = Py_BuildValue("(l)", eventcode);
				519	result = PyEval_CallObject(my_callback, arglist);
				520	Py_DECREF(arglist);
if (result == NULL)
    return NULL; /* Pass error back */
				523	/* Here maybe use the result */
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	524	Py_DECREF(result);
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	525	\end{verbatim}
				526
Note the placement of \code{Py_DECREF(arglist)} immediately after the call,
before the error check! Also note that, strictly speaking, this code is
not complete: \code{Py_BuildValue()} may run out of memory, and this should
be checked.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	531
				532
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	533	\section{Format Strings for {\tt PyArg_ParseTuple()}}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	534
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	535	The \code{PyArg_ParseTuple()} function is declared as follows:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	536
				537	\begin{verbatim}
int PyArg_ParseTuple(PyObject *arg, char *format, ...);
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	539	\end{verbatim}
				540
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	541	The \var{arg} argument must be a tuple object containing an argument
				542	list passed from Python to a C function. The \var{format} argument
				543	must be a format string, whose syntax is explained below. The
				544	remaining arguments must be addresses of variables whose type is
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	545	determined by the format string. For the conversion to succeed, the
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	546	\var{arg} object must match the format and the format must be
				547	exhausted.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	548
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	549	Note that while \code{PyArg_ParseTuple()} checks that the Python
				550	arguments have the required types, it cannot check the validity of the
				551	addresses of C variables passed to the call: if you make mistakes
				552	there, your code will probably crash or at least overwrite random bits
				553	in memory. So be careful!
				554
				555	A format string consists of zero or more ``format units''. A format
				556	unit describes one Python object; it is usually a single character or
				557	a parenthesized sequence of format units. With a few exceptions, a
				558	format unit that is not a parenthesized sequence normally corresponds
				559	to a single address argument to \code{PyArg_ParseTuple()}. In the
				560	following description, the quoted form is the format unit; the entry
				561	in (round) parentheses is the Python object type that matches the
				562	format unit; and the entry in [square] brackets is the type of the C
				563	variable(s) whose address should be passed. (Use the \samp{\&}
				564	operator to pass a variable's address.)
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	565
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	566	\begin{description}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	567
\item[\samp{s} (string) {[char *]}]
				569	Convert a Python string to a C pointer to a character string. You
				570	must not provide storage for the string itself; a pointer to an
				571	existing string is stored into the character pointer variable whose
				572	address you pass. The C string is null-terminated. The Python string
				573	must not contain embedded null bytes; if it does, a \code{TypeError}
				574	exception is raised.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	575
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	576	\item[\samp{s\#} (string) {[char *, int]}]
				577	This variant on \code{'s'} stores into two C variables, the first one
				578	a pointer to a character string, the second one its length. In this
				579	case the Python string may contain embedded null bytes.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	580
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	581	\item[\samp{z} (string or \code{None}) {[char *]}]
				582	Like \samp{s}, but the Python object may also be \code{None}, in which
				583	case the C pointer is set to \code{NULL}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	584
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	585	\item[\samp{z\#} (string or \code{None}) {[char *, int]}]
				586	This is to \code{'s\#'} as \code{'z'} is to \code{'s'}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	587
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	588	\item[\samp{b} (integer) {[char]}]
				589	Convert a Python integer to a tiny int, stored in a C \code{char}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	590
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	591	\item[\samp{h} (integer) {[short int]}]
				592	Convert a Python integer to a C \code{short int}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	593
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	594	\item[\samp{i} (integer) {[int]}]
				595	Convert a Python integer to a plain C \code{int}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	596
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	597	\item[\samp{l} (integer) {[long int]}]
				598	Convert a Python integer to a C \code{long int}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	599
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	600	\item[\samp{c} (string of length 1) {[char]}]
				601	Convert a Python character, represented as a string of length 1, to a
				602	C \code{char}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	603
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	604	\item[\samp{f} (float) {[float]}]
				605	Convert a Python floating point number to a C \code{float}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	606
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	607	\item[\samp{d} (float) {[double]}]
				608	Convert a Python floating point number to a C \code{double}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	609
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	610	\item[\samp{O} (object) {[PyObject *]}]
				611	Store a Python object (without any conversion) in a C object pointer.
				612	The C program thus receives the actual object that was passed. The
				613	object's reference count is not increased. The pointer stored is not
				614	\code{NULL}.
				615
				616	\item[\samp{O!} (object) {[\var{typeobject}, PyObject *]}]
				617	Store a Python object in a C object pointer. This is similar to
				618	\samp{O}, but takes two C arguments: the first is the address of a
				619	Python type object, the second is the address of the C variable (of
				620	type \code{PyObject *}) into which the object pointer is stored.
				621	If the Python object does not have the required type, a
				622	\code{TypeError} exception is raised.
				623
				624	\item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
				625	Convert a Python object to a C variable through a \var{converter}
				626	function. This takes two arguments: the first is a function, the
				627	second is the address of a C variable (of arbitrary type), converted
				628	to \code{void *}. The \var{converter} function in turn is called as
				629	follows:
				630
				631	\code{\var{status} = \var{converter}(\var{object}, \var{address});}
				632
				633	where \var{object} is the Python object to be converted and
				634	\var{address} is the \code{void *} argument that was passed to
\code{PyArg_ParseTuple()}. The returned \var{status} should be
				636	\code{1} for a successful conversion and \code{0} if the conversion
				637	has failed. When the conversion fails, the \var{converter} function
				638	should raise an exception.
				639
\item[\samp{S} (string) {[PyStringObject *]}]
Like \samp{O} but requires that the Python object is a string object.
A \code{TypeError} exception is raised if the object is not a string
object. The C variable may also be declared as \code{PyObject *}.
				644
				645	\item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
				646	The object must be a Python tuple whose length is the number of format
				647	units in \var{items}. The C arguments must correspond to the
				648	individual format units in \var{items}. Format units for tuples may
				649	be nested.
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	650
				651	\end{description}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	652
It is possible to pass Python long integers where integers are
requested; however, no proper range checking is done. The most
significant bits are silently truncated when the receiving field is
too small to receive the value (actually, the semantics are inherited
from downcasts in C, so your mileage may vary).
				658
				659	A few other characters have a meaning in a format string. These may
				660	not occur inside nested parentheses. They are:
				661
				662	\begin{description}
				663
\item[\samp{|}]
Indicates that the remaining arguments in the Python argument list are
optional. The C variables corresponding to optional arguments should
be initialized to their default value: when an optional argument is
not specified, \code{PyArg_ParseTuple()} does not touch the contents
of the corresponding C variable(s).
				670
				671	\item[\samp{:}]
				672	The list of format units ends here; the string after the colon is used
				673	as the function name in error messages (the ``associated value'' of
				674	the exceptions that \code{PyArg_ParseTuple} raises).
				675
				676	\item[\samp{;}]
The list of format units ends here; the string after the semicolon is used
				678	as the error message \emph{instead} of the default error message.
				679	Clearly, \samp{:} and \samp{;} mutually exclude each other.
				680
				681	\end{description}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	682
				683	Some example calls:
				684
				685	\begin{verbatim}
				686	int ok;
				687	int i, j;
				688	long k, l;
				689	char *s;
				690	int size;
				691
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	692	ok = PyArg_ParseTuple(args, ""); /* No arguments */
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	693	/* Python call: f() */
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	694
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	695	ok = PyArg_ParseTuple(args, "s", &s); /* A string */
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	696	/* Possible Python call: f('whoops!') */
				697
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	698	ok = PyArg_ParseTuple(args, "lls", &k, &l, &s); /* Two longs and a string */
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	699	/* Possible Python call: f(1, 2, 'three') */
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	700
ok = PyArg_ParseTuple(args, "(ii)s#", &i, &j, &s, &size);
/* A pair of ints and a string, whose size is also returned */
/* Possible Python call: f((1, 2), 'three') */
				704
{
    char *file;
    char *mode = "r";
    int bufsize = 0;
    ok = PyArg_ParseTuple(args, "s|si", &file, &mode, &bufsize);
    /* A string, and optionally another string and an integer */
    /* Possible Python calls:
       f('spam')
       f('spam', 'w')
       f('spam', 'wb', 100000) */
}
				716
{
    int left, top, right, bottom, h, v;
    ok = PyArg_ParseTuple(args, "((ii)(ii))(ii)",
             &left, &top, &right, &bottom, &h, &v);
    /* A rectangle and a point */
    /* Possible Python call:
       f(((0, 0), (400, 300)), (10, 10)) */
}
				725	\end{verbatim}
				726
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	727
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	728	\section{The {\tt Py_BuildValue()} Function}
				729
				730	This function is the counterpart to \code{PyArg_ParseTuple()}. It is
				731	declared as follows:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	732
				733	\begin{verbatim}
PyObject *Py_BuildValue(char *format, ...);
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	735	\end{verbatim}
				736
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	737	It recognizes a set of format units similar to the ones recognized by
				738	\code{PyArg_ParseTuple()}, but the arguments (which are input to the
				739	function, not output) must not be pointers, just values. It returns a
				740	new Python object, suitable for returning from a C function called
				741	from Python.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	742
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	743	One difference with \code{PyArg_ParseTuple()}: while the latter
				744	requires its first argument to be a tuple (since Python argument lists
are always represented as tuples internally), \code{Py_BuildValue()} does
				746	not always build a tuple. It builds a tuple only if its format string
				747	contains two or more format units. If the format string is empty, it
				748	returns \code{None}; if it contains exactly one format unit, it
				749	returns whatever object is described by that format unit. To force it
				750	to return a tuple of size 0 or one, parenthesize the format string.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	751
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	752	In the following description, the quoted form is the format unit; the
				753	entry in (round) parentheses is the Python object type that the format
				754	unit will return; and the entry in [square] brackets is the type of
				755	the C value(s) to be passed.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	756
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	757	The characters space, tab, colon and comma are ignored in format
				758	strings (but not within format units such as \samp{s\#}). This can be
				759	used to make long format strings a tad more readable.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	760
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	761	\begin{description}
				762
				763	\item[\samp{s} (string) {[char *]}]
				764	Convert a null-terminated C string to a Python object. If the C
				765	string pointer is \code{NULL}, \code{None} is returned.
				766
				767	\item[\samp{s\#} (string) {[char *, int]}]
				768	Convert a C string and its length to a Python object. If the C string
				769	pointer is \code{NULL}, the length is ignored and \code{None} is
				770	returned.
				771
				772	\item[\samp{z} (string or \code{None}) {[char *]}]
				773	Same as \samp{s}.
				774
				775	\item[\samp{z\#} (string or \code{None}) {[char *, int]}]
				776	Same as \samp{s\#}.
				777
				778	\item[\samp{i} (integer) {[int]}]
				779	Convert a plain C \code{int} to a Python integer object.
				780
				781	\item[\samp{b} (integer) {[char]}]
				782	Same as \samp{i}.
				783
				784	\item[\samp{h} (integer) {[short int]}]
				785	Same as \samp{i}.
				786
				787	\item[\samp{l} (integer) {[long int]}]
				788	Convert a C \code{long int} to a Python integer object.
				789
				790	\item[\samp{c} (string of length 1) {[char]}]
				791	Convert a C \code{int} representing a character to a Python string of
				792	length 1.
				793
				794	\item[\samp{d} (float) {[double]}]
				795	Convert a C \code{double} to a Python floating point number.
				796
				797	\item[\samp{f} (float) {[float]}]
				798	Same as \samp{d}.
				799
				800	\item[\samp{O} (object) {[PyObject *]}]
				801	Pass a Python object untouched (except for its reference count, which
				802	is incremented by one). If the object passed in is a \code{NULL}
				803	pointer, it is assumed that this was caused because the call producing
				804	the argument found an error and set an exception. Therefore,
				805	\code{Py_BuildValue()} will return \code{NULL} but won't raise an
				806	exception. If no exception has been raised yet,
				807	\code{PyExc_SystemError} is set.
				808
				809	\item[\samp{S} (object) {[PyObject *]}]
				810	Same as \samp{O}.
				811
				812	\item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
				813	Convert \var{anything} to a Python object through a \var{converter}
				814	function. The function is called with \var{anything} (which should be
				815	compatible with \code{void *}) as its argument and should return a
				816	``new'' Python object, or \code{NULL} if an error occurred.
				817
				818	\item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
				819	Convert a sequence of C values to a Python tuple with the same number
				820	of items.
				821
				822	\item[\samp{[\var{items}]} (list) {[\var{matching-items}]}]
				823	Convert a sequence of C values to a Python list with the same number
				824	of items.
				825
				826	\item[\samp{\{\var{items}\}} (dictionary) {[\var{matching-items}]}]
				827	Convert a sequence of C values to a Python dictionary. Each pair of
				828	consecutive C values adds one item to the dictionary, serving as key
				829	and value, respectively.
				830
				831	\end{description}
				832
				833	If there is an error in the format string, the
				834	\code{PyExc_SystemError} exception is raised and \code{NULL} returned.
				835
				836	Examples (to the left the call, to the right the resulting Python value):
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	837
				838	\begin{verbatim}
    Py_BuildValue("")                        None
    Py_BuildValue("i", 123)                  123
    Py_BuildValue("ii", 123, 456)            (123, 456)
    Py_BuildValue("s", "hello")              'hello'
    Py_BuildValue("ss", "hello", "world")    ('hello', 'world')
    Py_BuildValue("s#", "hello", 4)          'hell'
    Py_BuildValue("()")                      ()
    Py_BuildValue("(i)", 123)                (123,)
    Py_BuildValue("(ii)", 123, 456)          (123, 456)
    Py_BuildValue("(i,i)", 123, 456)         (123, 456)
    Py_BuildValue("[i,i]", 123, 456)         [123, 456]
    Py_BuildValue("{s:i,s:i}",
                  "abc", 123, "def", 456)    {'abc': 123, 'def': 456}
    Py_BuildValue("((ii)(ii)) (ii)",
                  1, 2, 3, 4, 5, 6)          (((1, 2), (3, 4)), (5, 6))
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	854	\end{verbatim}
				855
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	856
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	857	\section{Reference Counts}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	858
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	859	\subsection{Introduction}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	860
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	861	In languages like C or \Cpp{}, the programmer is responsible for
				862	dynamic allocation and deallocation of memory on the heap. In C, this
				863	is done using the functions \code{malloc()} and \code{free()}. In
				864	\Cpp{}, the operators \code{new} and \code{delete} are used with
				865	essentially the same meaning; they are actually implemented using
				866	\code{malloc()} and \code{free()}, so we'll restrict the following
				867	discussion to the latter.
				868
				869	Every block of memory allocated with \code{malloc()} should eventually
				870	be returned to the pool of available memory by exactly one call to
				871	\code{free()}. It is important to call \code{free()} at the right
				872	time. If a block's address is forgotten but \code{free()} is not
				873	called for it, the memory it occupies cannot be reused until the
				874	program terminates. This is called a \dfn{memory leak}. On the other
				875	hand, if a program calls \code{free()} for a block and then continues
				876	to use the block, it creates a conflict with re-use of the block
through another \code{malloc()} call. This is called \dfn{using freed
memory}. It has the same bad consequences as referencing uninitialized
data: core dumps, wrong results, mysterious crashes.
				880
				881	Common causes of memory leaks are unusual paths through the code. For
				882	instance, a function may allocate a block of memory, do some
				883	calculation, and then free the block again. Now a change in the
				884	requirements for the function may add a test to the calculation that
				885	detects an error condition and can return prematurely from the
				886	function. It's easy to forget to free the allocated memory block when
				887	taking this premature exit, especially when it is added later to the
				888	code. Such leaks, once introduced, often go undetected for a long
				889	time: the error exit is taken only in a small fraction of all calls,
				890	and most modern machines have plenty of virtual memory, so the leak
				891	only becomes apparent in a long-running process that uses the leaking
				892	function frequently. Therefore, it's important to prevent leaks from
				893	happening by having a coding convention or strategy that minimizes
this kind of error.
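
One such convention is to route every exit from a function, including
error exits added later, through a single cleanup label. The following
is a minimal sketch in plain C, not code taken from the Python sources;
the function \code{sum_squares()} and its size limit are invented purely
for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the sum of the squares of 0..n-1,
   or -1 on error.  Every exit path goes through the "done" label,
   so the scratch buffer is freed exactly once. */
long
sum_squares(int n)
{
    long result = -1;           /* pessimistic default */
    long *buf = NULL;
    int i;

    if (n < 0)
        goto done;              /* nothing allocated yet */

    buf = malloc((n > 0 ? (size_t)n : 1) * sizeof(long));
    if (buf == NULL)
        goto done;              /* out of memory */

    for (i = 0; i < n; i++) {
        if (i > 1000)
            goto done;          /* premature exit: input too large */
        buf[i] = (long)i * i;
    }

    result = 0;
    for (i = 0; i < n; i++)
        result += buf[i];

done:
    free(buf);                  /* free(NULL) is a no-op */
    return result;
}
```

Because the premature exit jumps to the same label as the normal exit,
adding another error test later does not require remembering to free
the block.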
				895
				896	Since Python makes heavy use of \code{malloc()} and \code{free()}, it
				897	needs a strategy to avoid memory leaks as well as the use of freed
				898	memory. The chosen method is called \dfn{reference counting}. The
				899	principle is simple: every object contains a counter, which is
				900	incremented when a reference to the object is stored somewhere, and
				901	which is decremented when a reference to it is deleted. When the
				902	counter reaches zero, the last reference to the object has been
				903	deleted and the object is freed.
				904
				905	An alternative strategy is called \dfn{automatic garbage collection}.
				906	(Sometimes, reference counting is also referred to as a garbage
				907	collection strategy, hence my use of ``automatic'' to distinguish the
				908	two.) The big advantage of automatic garbage collection is that the
				909	user doesn't need to call \code{free()} explicitly. (Another claimed
				910	advantage is an improvement in speed or memory usage --- this is no
				911	hard fact however.) The disadvantage is that for C, there is no
				912	truly portable automatic garbage collector, while reference counting
				913	can be implemented portably (as long as the functions \code{malloc()}
				914	and \code{free()} are available --- which the C Standard guarantees).
				915	Maybe some day a sufficiently portable automatic garbage collector
				916	will be available for C. Until then, we'll have to live with
				917	reference counts.
				918
				919	\subsection{Reference Counting in Python}
				920
				921	There are two macros, \code{Py_INCREF(x)} and \code{Py_DECREF(x)},
				922	which handle the incrementing and decrementing of the reference count.
				923	\code{Py_DECREF()} also frees the object when the count reaches zero.
				924	For flexibility, it doesn't call \code{free()} directly --- rather, it
				925	makes a call through a function pointer in the object's \dfn{type
				926	object}. For this purpose (and others), every object also contains a
				927	pointer to its type object.
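
The mechanism can be modelled in a few lines of plain C. This is only a
toy sketch under stated assumptions: the names \code{toy_object},
\code{toy_type}, \code{TOY_INCREF()} and \code{TOY_DECREF()} are
hypothetical and are not the definitions used by the interpreter; they
merely mimic the idea of a counter plus a deallocation function reached
through the type object.

```c
#include <stdlib.h>

/* Toy model, NOT the real Python headers: all names are hypothetical. */
struct toy_object;

typedef struct toy_type {
    const char *name;
    void (*dealloc)(struct toy_object *);   /* how to free instances */
} toy_type;

typedef struct toy_object {
    int refcnt;             /* number of owned references */
    toy_type *type;         /* pointer to the type object */
    long value;
} toy_object;

static int toy_live = 0;    /* objects currently allocated */

#define TOY_INCREF(op) ((op)->refcnt++)
#define TOY_DECREF(op) \
    do { \
        if (--(op)->refcnt == 0) \
            (op)->type->dealloc(op); \
    } while (0)

static void
toy_int_dealloc(toy_object *op)
{
    toy_live--;
    free(op);
}

static toy_type toy_int_type = { "toy_int", toy_int_dealloc };

/* Returns a new object with reference count 1, owned by the caller */
toy_object *
toy_int_new(long value)
{
    toy_object *op = malloc(sizeof(toy_object));
    if (op == NULL)
        return NULL;
    op->refcnt = 1;
    op->type = &toy_int_type;
    op->value = value;
    toy_live++;
    return op;
}

/* Exercise the macros; returns the number of live objects at the end */
int
toy_demo(void)
{
    toy_object *a = toy_int_new(42);
    toy_object *b;

    if (a == NULL)
        return -1;
    b = a;              /* store a second reference ... */
    TOY_INCREF(b);      /* ... and count it: refcnt is now 2 */
    TOY_DECREF(a);      /* refcnt 1: object still alive */
    TOY_DECREF(b);      /* refcnt 0: dealloc called through the type */
    return toy_live;
}
```

Going through the type's function pointer, rather than calling
\code{free()} directly, is what lets each type decide how its instances
are released.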
				928
				929	The big question now remains: when to use \code{Py_INCREF(x)} and
				930	\code{Py_DECREF(x)}? Let's first introduce some terms. Nobody
				931	``owns'' an object; however, you can \dfn{own a reference} to an
				932	object. An object's reference count is now defined as the number of
				933	owned references to it. The owner of a reference is responsible for
				934	calling \code{Py_DECREF()} when the reference is no longer needed.
				935	Ownership of a reference can be transferred. There are three ways to
				936	dispose of an owned reference: pass it on, store it, or call
				937	\code{Py_DECREF()}. Forgetting to dispose of an owned reference creates
				938	a memory leak.
				939
				940	It is also possible to \dfn{borrow}\footnote{The metaphor of
				941	``borrowing'' a reference is not completely correct: the owner still
				942	has a copy of the reference.} a reference to an object. The borrower
				943	of a reference should not call \code{Py_DECREF()}. The borrower must
				944	not hold on to the object longer than the owner from which it was
				945	borrowed. Using a borrowed reference after the owner has disposed of
				946	it risks using freed memory and should be avoided
				947	completely.\footnote{Checking that the reference count is at least 1
				948	\strong{does not work} --- the reference count itself could be in
				949	freed memory and may thus be reused for another object!}
				950
				951	The advantage of borrowing over owning a reference is that you don't
				952	need to take care of disposing of the reference on all possible paths
				953	through the code --- in other words, with a borrowed reference you
				954	don't run the risk of leaking when a premature exit is taken. The
disadvantage of borrowing over owning is that there are some subtle
				956	situations where in seemingly correct code a borrowed reference can be
				957	used after the owner from which it was borrowed has in fact disposed
				958	of it.
				959
				960	A borrowed reference can be changed into an owned reference by calling
				961	\code{Py_INCREF()}. This does not affect the status of the owner from
				962	which the reference was borrowed --- it creates a new owned reference,
				963	and gives full owner responsibilities (i.e., the new owner must
				964	dispose of the reference properly, as well as the previous owner).
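
The effect of promoting a borrowed reference can be illustrated with a
small model in plain C. It is a sketch only: the \code{box} type and the
functions \code{box_new()}, \code{box_incref()} and \code{box_decref()}
are hypothetical stand-ins for Python objects and the
\code{Py_INCREF()}/\code{Py_DECREF()} macros.

```c
#include <stdlib.h>

/* Toy model with hypothetical names; not the real Python API. */
typedef struct {
    int refcnt;
    long value;
} box;

static int boxes_freed = 0;

static box *
box_new(long value)             /* caller owns the returned reference */
{
    box *b = malloc(sizeof(box));
    if (b != NULL) {
        b->refcnt = 1;
        b->value = value;
    }
    return b;
}

static void
box_incref(box *b)
{
    b->refcnt++;
}

static void
box_decref(box *b)
{
    if (--b->refcnt == 0) {
        boxes_freed++;
        free(b);
    }
}

static box *slot = NULL;        /* a container that owns one reference */

int
box_freed_count(void)
{
    return boxes_freed;
}

/* Returns the value read through the promoted reference, or -1 */
long
borrow_then_own(void)
{
    box *mine;
    long v;

    slot = box_new(99);         /* the container is the owner */
    if (slot == NULL)
        return -1;

    mine = slot;                /* borrowed: no count change */
    box_incref(mine);           /* now an owned reference: refcnt 2 */

    box_decref(slot);           /* the container disposes of its reference */
    slot = NULL;

    v = mine->value;            /* safe: our own reference keeps it alive */
    box_decref(mine);           /* our responsibility as an owner */
    return v;
}
```

Without the call to \code{box_incref()}, the read of
\code{mine->value} would touch freed memory, because the container's
reference was the only one.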
				965
				966	\subsection{Ownership Rules}
				967
				968	Whenever an object reference is passed into or out of a function, it
				969	is part of the function's interface specification whether ownership is
				970	transferred with the reference or not.
				971
				972	Most functions that return a reference to an object pass on ownership
				973	with the reference. In particular, all functions whose purpose is
				974	to create a new object, e.g.\ \code{PyInt_FromLong()} and
				975	\code{Py_BuildValue()}, pass ownership to the receiver. Even if in
				976	fact, in some cases, you don't receive a reference to a brand new
				977	object, you still receive ownership of the reference. For instance,
				978	\code{PyInt_FromLong()} maintains a cache of popular values and can
				979	return a reference to a cached item.
				980
				981	Many functions that extract objects from other objects also transfer
				982	ownership with the reference, for instance
				983	\code{PyObject_GetAttrString()}. The picture is less clear here,
				984	however, since a few common routines are exceptions:
				985	\code{PyTuple_GetItem()}, \code{PyList_GetItem()} and
				986	\code{PyDict_GetItem()} (and \code{PyDict_GetItemString()}) all return
				987	references that you borrow from the tuple, list or dictionary.
				988
				989	The function \code{PyImport_AddModule()} also returns a borrowed
				990	reference, even though it may actually create the object it returns:
				991	this is possible because an owned reference to the object is stored in
				992	\code{sys.modules}.
				993
				994	When you pass an object reference into another function, in general,
				995	the function borrows the reference from you --- if it needs to store
				996	it, it will use \code{Py_INCREF()} to become an independent owner.
				997	There are exactly two important exceptions to this rule:
				998	\code{PyTuple_SetItem()} and \code{PyList_SetItem()}. These functions
				999	take over ownership of the item passed to them --- even if they fail!
				1000	(Note that \code{PyDict_SetItem()} and friends don't take over
				1001	ownership --- they are ``normal''.)
				1002
				1003	When a C function is called from Python, it borrows references to its
				1004	arguments from the caller. The caller owns a reference to the object,
				1005	so the borrowed reference's lifetime is guaranteed until the function
				1006	returns. Only when such a borrowed reference must be stored or passed
				1007	on must it be turned into an owned reference by calling
				1008	\code{Py_INCREF()}.
				1009
				1010	The object reference returned from a C function that is called from
				1011	Python must be an owned reference --- ownership is transferred from the
				1012	function to its caller.
				1013
				1014	\subsection{Thin Ice}
				1015
				1016	There are a few situations where seemingly harmless use of a borrowed
				1017	reference can lead to problems. These all have to do with implicit
				1018	invocations of the interpreter, which can cause the owner of a
				1019	reference to dispose of it.
				1020
				1021	The first and most important case to know about is using
				1022	\code{Py_DECREF()} on an unrelated object while borrowing a reference
				1023	to a list item. For instance:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	1024
				1025	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1026	void bug(PyObject *list) {
				1027	    PyObject *item = PyList_GetItem(list, 0);
				1028	    PyList_SetItem(list, 1, PyInt_FromLong(0L));
				1029	    PyObject_Print(item, stdout, 0); /* BUG! */
				1030	}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	1031	\end{verbatim}
				1032
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1033	This function first borrows a reference to \code{list[0]}, then
				1034	replaces \code{list[1]} with the value \code{0}, and finally prints
				1035	the borrowed reference. Looks harmless, right? But it's not!
				1036
				1037	Let's follow the control flow into \code{PyList_SetItem()}. The list
				1038	owns references to all its items, so when item 1 is replaced, it has
				1039	to dispose of the original item 1. Now let's suppose the original
				1040	item 1 was an instance of a user-defined class, and let's further
				1041	suppose that the class defined a \code{__del__()} method. If this
				1042	class instance has a reference count of 1, disposing of it will call
				1043	its \code{__del__()} method.
				1044
				1045	Since it is written in Python, the \code{__del__()} method can execute
				1046	arbitrary Python code. Could it perhaps do something to invalidate
				1047	the reference to \code{item} in \code{bug()}? You bet! Assuming that
				1048	the list passed into \code{bug()} is accessible to the
				1049	\code{__del__()} method, it could execute a statement to the effect of
				1050	\code{del list[0]}, and assuming this was the last reference to that
				1051	object, it would free the memory associated with it, thereby
				1052	invalidating \code{item}.
				1053
				1054	The solution, once you know the source of the problem, is easy:
				1055	temporarily increment the reference count. The correct version of the
				1056	function reads:
				1057
				1058	\begin{verbatim}
				1059	void no_bug(PyObject *list) {
				1060	    PyObject *item = PyList_GetItem(list, 0);
				1061	    Py_INCREF(item);
				1062	    PyList_SetItem(list, 1, PyInt_FromLong(0L));
				1063	    PyObject_Print(item, stdout, 0);
				1064	    Py_DECREF(item);
				1065	}
				1066	\end{verbatim}
				1067
				1068	This is a true story. An older version of Python contained variants
				1069	of this bug and someone spent a considerable amount of time in a C
				1070	debugger to figure out why his \code{__del__()} methods would fail...
				1071
				1072	The second case of problems with a borrowed reference is a variant
				1073	involving threads. Normally, multiple threads in the Python
				1074	interpreter can't get in each other's way, because there is a global
				1075	lock protecting Python's entire object space. However, it is possible
				1076	to temporarily release this lock using the macro
				1077	\code{Py_BEGIN_ALLOW_THREADS}, and to re-acquire it using
				1078	\code{Py_END_ALLOW_THREADS}. This is common around blocking I/O
				1079	calls, to let other threads use the CPU while waiting for the I/O to
				1080	complete. Obviously, the following function has the same problem as
				1081	the previous one:
				1082
				1083	\begin{verbatim}
				1084	void bug(PyObject *list) {
				1085	    PyObject *item = PyList_GetItem(list, 0);
				1086	    Py_BEGIN_ALLOW_THREADS
				1087	    ...some blocking I/O call...
				1088	    Py_END_ALLOW_THREADS
				1089	    PyObject_Print(item, stdout, 0); /* BUG! */
				1090	}
				1091	\end{verbatim}
				1091	\end{verbatim}
				1092
				1093	\subsection{NULL Pointers}
				1094
				1095	In general, functions that take object references as arguments don't
				1096	expect you to pass them \code{NULL} pointers, and will dump core (or
				1097	cause later core dumps) if you do so. Functions that return object
				1098	references generally return \code{NULL} only to indicate that an
				1099	exception occurred. The reason for not testing for \code{NULL}
				1100	arguments is that functions often pass the objects they receive on to
				1101	other functions --- if each function were to test for \code{NULL},
				1102	there would be a lot of redundant tests and the code would run slower.
				1103
				1104	It is better to test for \code{NULL} only at the ``source'', i.e.\
				1105	when a pointer that may be \code{NULL} is received, e.g.\ from
				1106	\code{malloc()} or from a function that may raise an exception.
				1107
				1108	The macros \code{Py_INCREF()} and \code{Py_DECREF()}
				1109	don't check for \code{NULL} pointers --- however, their variants
				1110	\code{Py_XINCREF()} and \code{Py_XDECREF()} do.
				1111
				1112	The macros for checking for a particular object type
				1113	(\code{Py\var{type}_Check()}) don't check for \code{NULL} pointers ---
				1114	again, there is much code that calls several of these in a row to test
				1115	an object against various different expected types, and this would
				1116	generate redundant tests. There are no variants with \code{NULL}
				1117	checking.
				1118
				1119	The C function calling mechanism guarantees that the argument list
				1120	passed to C functions (\code{args} in the examples) is never
				1121	\code{NULL} --- in fact it guarantees that it is always a tuple.%
				1122	\footnote{These guarantees don't hold when you use the ``old'' style
				1123	calling convention --- this is still found in much existing code.}
				1124
				1125	It is a severe error to ever let a \code{NULL} pointer ``escape'' to
				1126	the Python user.
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	1127
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	1128
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1129	\section{Writing Extensions in \Cpp{}}
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	1130
Guido van Rossum	16d6e71	1994-08-08 12:30:22 +0000	[diff] [blame]	1131	It is possible to write extension modules in \Cpp{}. Some restrictions
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	1132	apply: since the main program (the Python interpreter) is compiled and
				1133	linked by the C compiler, global or static objects with constructors
				1134	cannot be used. All functions that will be called directly or
				1135	indirectly (i.e.\ via function pointers) by the Python interpreter will
				1136	have to be declared using \code{extern "C"}; this applies to all
				1137	`methods' as well as to the module's initialization function.
				1138	It is unnecessary to enclose the Python header files in
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1139	\code{extern "C" \{...\}} --- they use this form already if the symbol
				1140	\samp{__cplusplus} is defined (all recent \Cpp{} compilers define this
				1141	symbol).
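
The guarded form referred to here looks roughly like the sketch below,
which compiles both as C and as \Cpp{}. The function \code{spam_add()}
is a hypothetical stand-in for any function the interpreter would call;
it is not part of Python.

```c
/* Declarations inside this guard get C linkage when the file is
   compiled by a C++ compiler and are unaffected when compiled as C. */
#ifdef __cplusplus
extern "C" {
#endif

int spam_add(int a, int b);   /* hypothetical function called from C code */

#ifdef __cplusplus
}
#endif

/* The definition inherits the linkage of the declaration above. */
int
spam_add(int a, int b)
{
    return a + b;
}
```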
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	1142
				1143	\chapter{Embedding Python in another application}
				1144
				1145	Embedding Python is similar to extending it, but not quite. The
				1146	difference is that when you extend Python, the main program of the
Guido van Rossum	16d6e71	1994-08-08 12:30:22 +0000	[diff] [blame]	1147	application is still the Python interpreter, while if you embed
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	1148	Python, the main program may have nothing to do with Python ---
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	1149	instead, some parts of the application occasionally call the Python
				1150	interpreter to run some Python code.
				1151
				1152	So if you are embedding Python, you are providing your own main
				1153	program. One of the things this main program has to do is initialize
				1154	the Python interpreter. At the very least, you have to call the
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1155	function \code{Py_Initialize()}. There are optional calls to pass command
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	1156	line arguments to Python. Then later you can call the interpreter
				1157	from any part of the application.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	1158
				1159	There are several different ways to call the interpreter: you can pass
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1160	a string containing Python statements to \code{PyRun_SimpleString()},
				1161	or you can pass a stdio file pointer and a file name (for
				1162	identification in error messages only) to \code{PyRun_SimpleFile()}. You
				1163	can also call the lower-level operations described in the previous
				1164	chapters to construct and use Python objects.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	1165
				1166	A simple demo of embedding Python can be found in the directory
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1167	\file{Demo/embed}.
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	1168
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	1169
Guido van Rossum	16d6e71	1994-08-08 12:30:22 +0000	[diff] [blame]	1170	\section{Embedding Python in \Cpp{}}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	1171
Guido van Rossum	16d6e71	1994-08-08 12:30:22 +0000	[diff] [blame]	1172	It is also possible to embed Python in a \Cpp{} program; precisely how this
				1173	is done will depend on the details of the \Cpp{} system used; in general you
				1174	will need to write the main program in \Cpp{}, and use the \Cpp{} compiler
				1175	to compile and link your program. There is no need to recompile Python
				1176	itself using \Cpp{}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	1177
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1178
				1179	\chapter{Dynamic Loading}
				1180
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1181	On most modern systems it is possible to configure Python to support
				1182	dynamic loading of extension modules implemented in C. When shared
				1183	libraries are used, dynamic loading is configured automatically;
				1184	otherwise you have to select it as a build option (see below). Once
				1185	configured, dynamic loading is trivial to use: when a Python program
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1186	executes \code{import spam}, the search for modules tries to find a
				1187	file \file{spammodule.o} (\file{spammodule.so} when using shared
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1188	libraries) in the module search path, and if one is found, it is
				1189	loaded into the executing binary and executed. Once loaded, the
				1190	module acts just like a built-in extension module.
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1191
				1192	The advantages of dynamic loading are twofold: the `core' Python
				1193	binary gets smaller, and users can extend Python with their own
				1194	modules implemented in C without having to build and maintain their
				1195	own copy of the Python interpreter. There are also disadvantages:
				1196	dynamic loading isn't available on all systems (this just means that
				1197	on some systems you have to use static loading), and dynamically
				1198	loading a module that was compiled for a different version of Python
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1199	(e.g.\ with a different representation of objects) may dump core.
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1200
				1201
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1202	\section{Configuring and Building the Interpreter for Dynamic Loading}
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1203
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1204	There are three styles of dynamic loading: one using shared libraries,
				1205	one using SGI IRIX 4 dynamic loading, and one using GNU dynamic
				1206	loading.
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1207
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1208	\subsection{Shared Libraries}
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1209
Guido van Rossum	16d6e71	1994-08-08 12:30:22 +0000	[diff] [blame]	1210	The following systems support dynamic loading using shared libraries:
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1211	SunOS 4; Solaris 2; SGI IRIX 5 (but not SGI IRIX 4!); and probably all
				1212	systems derived from SVR4, or at least those SVR4 derivatives that
				1213	support shared libraries (are there any that don't?).
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1214
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1215	You don't need to do anything to configure dynamic loading on these
				1216	systems --- the \file{configure} script detects the presence of the
				1217	\file{<dlfcn.h>} header file and automatically configures dynamic
				1218	loading.
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1219
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1220	\subsection{SGI IRIX 4 Dynamic Loading}
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1221
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1222	Only SGI IRIX 4 supports dynamic loading of modules using SGI dynamic
				1223	loading. (SGI IRIX 5 might also support it but it is inferior to
				1224	using shared libraries so there is no reason to; a small test didn't
				1225	work right away so I gave up trying to support it.)
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1226
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1227	Before you build Python, you first need to fetch and build the \code{dl}
				1228	package written by Jack Jansen. This is available by anonymous ftp
				1229	from host \file{ftp.cwi.nl}, directory \file{pub/dynload}, file
				1230	\file{dl-1.6.tar.Z}. (The version number may change.) Follow the
				1231	instructions in the package's \file{README} file to build it.
				1232
				1233	Once you have built \code{dl}, you can configure Python to use it. To
				1234	this end, you run the \file{configure} script with the option
				1235	\code{--with-dl=\var{directory}} where \var{directory} is the absolute
				1236	pathname of the \code{dl} directory.
				1237
				1238	Now build and install Python as you normally would (see the
				1239	\file{README} file in the toplevel Python directory).
				1240
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1241	\subsection{GNU Dynamic Loading}
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1242
				1243	GNU dynamic loading supports (according to its \file{README} file) the
				1244	following hardware and software combinations: VAX (Ultrix), Sun 3
				1245	(SunOS 3.4 and 4.0), Sparc (SunOS 4.0), Sequent Symmetry (Dynix), and
				1246	Atari ST. There is no reason to use it on a Sparc; I haven't seen a
				1247	Sun 3 for years so I don't know if these have shared libraries or not.
				1248
				1249	You need to fetch and build two packages. One is GNU DLD 3.2.3,
				1250	available by anonymous ftp from host \file{ftp.cwi.nl}, directory
				1251	\file{pub/dynload}, file \file{dld-3.2.3.tar.Z}. (As far as I know,
				1252	no further development on GNU DLD is being done.) The other is an
				1253	emulation of Jack Jansen's \code{dl} package that I wrote on top of
				1254	GNU DLD 3.2.3. This is available from the same host and directory,
				1255	file \file{dl-dld-1.1.tar.Z}. (The version number may change --- but I doubt
				1256	it will.) Follow the instructions in each package's \file{README}
				1257	file to configure and build them.
				1258
				1259	Now configure Python. Run the \file{configure} script with the option
				1260	\code{--with-dl-dld=\var{dl-directory},\var{dld-directory}} where
				1261	\var{dl-directory} is the absolute pathname of the directory where you
				1262	have built the \file{dl-dld} package, and \var{dld-directory} is that
				1263	of the GNU DLD package. The Python interpreter you build hereafter
				1264	will support GNU dynamic loading.
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1265
				1266
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1267	\section{Building a Dynamically Loadable Module}
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1268
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1269	Since there are three styles of dynamic loading, there are also three
				1270	groups of instructions for building a dynamically loadable module.
				1271	Instructions common to all three styles are given first. Assuming
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1272	your module is called \code{spam}, the source filename must be
				1273	\file{spammodule.c}, so the object name is \file{spammodule.o}. The
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1274	module must be written as a normal Python extension module (as
				1275	described earlier).
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1276
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1277	Note that in all cases you will have to create your own Makefile that
				1278	compiles your module file(s). This Makefile will have to pass two
				1279	\samp{-I} arguments to the C compiler which will make it find the
				1280	Python header files. If the Make variable \var{PYTHONTOP} points to
				1281	the toplevel Python directory, your \var{CFLAGS} Make variable should
				1282	contain the options \samp{-I\$(PYTHONTOP) -I\$(PYTHONTOP)/Include}.
				1283	(Most header files are in the \file{Include} subdirectory, but the
				1284	\file{config.h} header lives in the toplevel directory.) You must
				1285	also add \samp{-DHAVE_CONFIG_H} to the definition of \var{CFLAGS} to
				1286	direct the Python headers to include \file{config.h}.
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1287
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1288
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1289	\subsection{Shared Libraries}
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1290
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1291	You must link the \samp{.o} file to produce a shared library. This is
				1292	done using a special invocation of the \UNIX{} loader/linker, {\em
				1293	ld}(1). Unfortunately the invocation differs slightly per system.
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1294
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1295	On SunOS 4, use
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1296	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1297	ld spammodule.o -o spammodule.so
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1298	\end{verbatim}
				1299
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1300	On Solaris 2, use
				1301	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1302	ld -G spammodule.o -o spammodule.so
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1303	\end{verbatim}
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1304
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1305	On SGI IRIX 5, use
				1306	\begin{verbatim}
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1307	ld -shared spammodule.o -o spammodule.so
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1308	\end{verbatim}
				1309
				1310	On other systems, consult the manual page for {\em ld}(1) to find what
				1311	flags, if any, must be used.
				1312
				1313	If your extension module uses system libraries that haven't already
				1314	been linked with Python (e.g.\ a windowing system), these must be
				1315	passed to the {\em ld} command as \samp{-l} options after the
				1316	\samp{.o} file.
				1317
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1318	The resulting file \file{spammodule.so} must be copied into a directory
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1319	along the Python module search path.
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1320
				1321
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1322	\subsection{SGI IRIX 4 Dynamic Loading}
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1323
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1324	{\bf IMPORTANT:} You must compile your extension module with the
				1325	additional C flag \samp{-G0} (or \samp{-G 0}). This instructs the
				1326	assembler to generate position-independent code.
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1327
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1328	You don't need to link the resulting \file{spammodule.o} file; just
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1329	copy it into a directory along the Python module search path.
				1330
				1331	The first time your extension is loaded, it takes some extra time and
				1332	a few messages may be printed. This creates a file
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1333	\file{spammodule.ld} which is an image that can be loaded quickly into
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1334	the Python interpreter process. When a new Python interpreter is
				1335	installed, the \code{dl} package detects this and rebuilds
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1336	\file{spammodule.ld}. The file \file{spammodule.ld} is placed in the
				1337	directory where \file{spammodule.o} was found, unless this directory is
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1338	unwritable; in that case it is placed in a temporary
				1339	directory.\footnote{Check the manual page of the \code{dl} package for
				1340	details.}
				1341
				1342	If your extension module uses additional system libraries, you must
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1343	create a file \file{spammodule.libs} in the same directory as the
				1344	\file{spammodule.o}. This file should contain one or more lines with
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1345	whitespace-separated options that will be passed to the linker ---
				1346	normally only \samp{-l} options or absolute pathnames of libraries
				1347	(\samp{.a} files) should be used.
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1348
				1349
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1350	\subsection{GNU Dynamic Loading}
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1351
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1352	Just copy \file{spammodule.o} into a directory along the Python module
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1353	search path.
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1354
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1355	If your extension module uses additional system libraries, you must
Guido van Rossum	5049bcb	1995-03-13 16:55:23 +0000	[diff] [blame^]	1356	create a file \file{spammodule.libs} in the same directory as the
				1357	\file{spammodule.o}. This file should contain one or more lines with
Guido van Rossum	6938f06	1994-08-01 12:22:53 +0000	[diff] [blame]	1358	whitespace-separated absolute pathnames of libraries (\samp{.a}
				1359	files). No \samp{-l} options can be used.
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	1360
				1361
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	1362	\input{ext.ind}
				1363
				1364	\end{document}