\documentstyle[twoside,11pt,myformat,times]{report}

\title{\bf Extending and Embedding the Python Interpreter}

\author{
    Guido van Rossum \\
    Dept. CST, CWI, P.O. Box 94079 \\
    1090 GB Amsterdam, The Netherlands \\
    E-mail: {\tt guido@cwi.nl}
}

\date{19 November 1993 \\ Release 0.9.9.++} % XXX update before release!

% Tell \index to actually write the .idx file
\makeindex

\begin{document}

\pagenumbering{roman}

\maketitle

\begin{abstract}

\noindent
This document describes how to write modules in C or C++ to extend the
Python interpreter. It also describes how to use Python as an
`embedded' language, and how extension modules can be loaded
dynamically (at run time) into the interpreter, if the operating
system supports this feature.

\end{abstract}

\pagebreak

{
\parskip = 0mm
\tableofcontents
}

\pagebreak

\pagenumbering{arabic}


\chapter{Extending Python with C or C++ code}


\section{Introduction}

It is quite easy to add non-standard built-in modules to Python, if
you know how to program in C. A built-in module known to the Python
programmer as \code{foo} is generally implemented by a file called
\file{foomodule.c}. All but the most essential standard built-in
modules also adhere to this convention, and in fact some of them form
excellent examples of how to create an extension.

Extension modules can do two things that can't be done directly in
Python: they can implement new data types, and they can make system
calls or call C library functions. Since the latter is usually the
most important reason for adding an extension, I'll concentrate on
adding `wrappers' around C library functions; the concrete example
uses the wrapper for \code{system()} in module \code{posix}, found in
(of course) the file \file{posixmodule.c}.

It is important not to be impressed by the size and complexity of
the average extension module; much of this is straightforward
`boilerplate' code (starting right with the copyright notice)!

Let's skip the boilerplate and have a look at an interesting function
in \file{posixmodule.c} first:

\begin{verbatim}
    static object *
    posix_system(self, args)
        object *self;
        object *args;
    {
        char *command;
        int sts;
        if (!getargs(args, "s", &command))
            return NULL;
        sts = system(command);
        return mkvalue("i", sts);
    }
\end{verbatim}

This is the prototypical top-level function in an extension module.
It will be called (we'll see later how this is made possible) when the
Python program executes statements like

\begin{verbatim}
    >>> import posix
    >>> sts = posix.system('ls -l')
\end{verbatim}

There is a straightforward translation from the arguments to the call
in Python (here the single value \code{'ls -l'}) to the arguments that
are passed to the C function. The C function always has two
parameters, conventionally named \var{self} and \var{args}. In this
example, \var{self} will always be a \code{NULL} pointer, since this is a
function, not a method (this is done so that the interpreter doesn't
have to understand two different types of C functions).

The \var{args} parameter will be a pointer to a Python object, or
\code{NULL} if the Python function/method was called without
arguments. It is necessary to do full argument type checking on each
call, since otherwise the Python user would be able to cause the
Python interpreter to `dump core' by passing the wrong arguments to a
function in an extension module (or no arguments at all). Because
argument checking and converting arguments to C is such a common task,
there's a general function in the Python interpreter which combines
these tasks: \code{getargs()}. It uses a template string to determine
both the types of the Python argument and the types of the C variables
into which it should store the converted values. (More about this
later.)\footnote{
There are convenience macros \code{getstrarg()},
\code{getintarg()}, etc., for many common forms of \code{getargs()}
templates. These are relics from the past; it's better to call
\code{getargs()} directly.}

If \code{getargs()} returns nonzero, the argument list has the right
type and its components have been stored in the variables whose
addresses are passed. If it returns zero, an error has occurred. In
the latter case it has already raised an appropriate exception by
calling \code{err_setstr()}, so the calling function can just return
\code{NULL}.


\section{Intermezzo: errors and exceptions}

An important convention throughout the Python interpreter is the
following: when a function fails, it should set an exception condition
and return an error value (often a \code{NULL} pointer). Exceptions are
set in a global variable in the file \file{errors.c}; if this variable
is \code{NULL} no exception has occurred. A second variable is the
`associated value' of the exception.

The file \file{errors.h} declares a host of \code{err_*} functions to
set various types of exceptions. The most common one is
\code{err_setstr()} --- its arguments are an exception object (e.g.
\code{RuntimeError} --- actually it can be any string object) and a C
string indicating the cause of the error (this is converted to a string
object and stored as the `associated value' of the exception). Another
useful function is \code{err_errno()}, which only takes an exception
argument and constructs the associated value by inspection of the
(UNIX) global variable \code{errno}.

You can test non-destructively whether an exception has been set with
\code{err_occurred()}. However, most code never calls
\code{err_occurred()} to see whether an error occurred or not, but
relies on error return values from the functions it calls instead.

When a function that calls another function detects that the called
function fails, it should return an error value but not set an
exception condition --- one is already set. The caller is then supposed
to also return an error indication to its caller, again without calling
\code{err_setstr()}, and so on --- the most detailed cause of the error
was already reported by the function that detected it in the first
place. Once the error reaches the interpreter's main loop, the loop
aborts the currently executing Python code and tries to find an
exception handler specified by the Python programmer.

To ignore an exception set by a function call that failed, the
exception condition must be cleared explicitly by calling
\code{err_clear()}. The only time C code should call
\code{err_clear()} is if it doesn't want to pass the error on to the
interpreter but wants to handle it completely by itself (e.g. by
trying something else or pretending nothing happened).

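The propagation rules above can be tried out in a small standalone
model. The sketch below is not the interpreter's code: it keeps the
exception and its associated value in two C strings rather than in
Python objects, and its tiny \code{err_setstr()}, \code{err_occurred()}
and \code{err_clear()} are simplified stand-ins so that the example
compiles on its own. The helpers \code{lookup_home()}, \code{home_of()}
and \code{home_or_default()} are hypothetical. The first detects a
failure and sets the exception, the second merely passes the failure
on, and the third handles it completely and therefore clears it.

\begin{verbatim}
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    /* Stand-ins for the interpreter's error state (assumption: the
       real variables hold Python objects, not C strings). */
    static const char *err_type = NULL;   /* NULL: no exception set */
    static const char *err_value = NULL;  /* the associated value */

    static void err_setstr(const char *exception, const char *message)
    {
        assert(exception != NULL);  /* NULL is reserved for "not set" */
        err_type = exception;
        err_value = message;
    }

    static int err_occurred(void) { return err_type != NULL; }

    static void err_clear(void) { err_type = NULL; err_value = NULL; }

    /* Detects the failure itself: sets the exception, returns NULL. */
    static const char *lookup_home(const char *name)
    {
        if (strcmp(name, "guido") == 0)
            return "/home/guido";
        err_setstr("KeyError", "unknown user");
        return NULL;
    }

    /* Only relays the failure: returns NULL, sets nothing new. */
    static const char *home_of(const char *name)
    {
        const char *home = lookup_home(name);
        if (home == NULL)
            return NULL;    /* exception already set by lookup_home() */
        return home;
    }

    /* Handles the failure completely by itself, so it clears it. */
    static const char *home_or_default(const char *name)
    {
        const char *home = home_of(name);
        if (home == NULL) {
            err_clear();
            return "/tmp";
        }
        return home;
    }
\end{verbatim}
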
The function \code{err_get()} gives you both error variables
and clears them. Note that even if an error occurred the second one
may be \code{NULL}. I doubt you will need to use this function.

Note that a failing \code{malloc()} call must also be turned into an
exception --- the direct caller of \code{malloc()} (or
\code{realloc()}) must call \code{err_nomem()} and return a failure
indicator itself. All the object-creating functions
(\code{newintobject()} etc.) already do this, so this note only
matters if you call \code{malloc()} directly.

Also note that, with the important exception of \code{getargs()}, functions
that return an integer status usually use 0 for success and -1 for
failure.

Finally, be careful about cleaning up garbage (making appropriate
\code{DECREF()} or \code{XDECREF()} calls) when you return an error!


\section{Back to the example}

Going back to \code{posix_system()}, you should now be able to
understand this bit:

\begin{verbatim}
    if (!getargs(args, "s", &command))
        return NULL;
\end{verbatim}

It returns \code{NULL} (the error indicator for functions of this kind) if an
error is detected in the argument list, relying on the exception set
by \code{getargs()}. Otherwise, a pointer to the string value of the
argument has been stored in the local variable \code{command}.

If a Python function is called with multiple arguments, the argument
list is turned into a tuple. Python programs can use this feature, for
instance, to explicitly create the tuple containing the arguments
first and make the call later.

The next statement in \code{posix_system()} is a call to the C library
function \code{system()}, passing it the string we just got from
\code{getargs()}:

\begin{verbatim}
    sts = system(command);
\end{verbatim}

Python strings may contain internal null bytes; but if these occur in
this example the rest of the string will be ignored by \code{system()}.

Finally, \code{posix.system()} must return a value: the integer status
returned by the C library \code{system()} function. This is done by the
function \code{mkvalue()}, which builds a Python object from a format
string and C values; here the format \code{"i"} turns the C int
\code{sts} into a Python integer object. (The lower-level function
\code{newintobject()}, which takes a (long) integer as parameter, could
be used to the same effect.)

\begin{verbatim}
    return mkvalue("i", sts);
\end{verbatim}

(Yes, even integers are represented as objects on the heap in Python!)
If you had a function that returned no useful value, you would need
this idiom:

\begin{verbatim}
    INCREF(None);
    return None;
\end{verbatim}

\code{None} is a unique Python object representing `no value'. It differs
from \code{NULL}, which means `error' in most contexts (except when passed as
a function argument --- there it means `no arguments').


\section{The module's function table}

I promised to show how I made the function \code{posix_system()}
available to Python programs. This is shown later in
\file{posixmodule.c}:

\begin{verbatim}
    static struct methodlist posix_methods[] = {
        ...
        {"system",  posix_system},
        ...
        {NULL,      NULL}        /* Sentinel */
    };

    void
    initposix()
    {
        (void) initmodule("posix", posix_methods);
    }
\end{verbatim}

(The actual \code{initposix()} is somewhat more complicated, but most
extension modules are indeed as simple as that.) When the Python
program first imports module \code{posix}, \code{initposix()} is called,
which calls \code{initmodule()} with specific parameters. This
creates a module object (which is inserted in the table
\code{sys.modules} under the key \code{'posix'}), and adds
built-in-function objects to the newly created module based upon the
table (of type \code{struct methodlist}) that was passed as its second
parameter. The function \code{initmodule()} returns a pointer to the
module object that it creates, but this is unused here. It aborts with
a fatal error if the module could not be initialized satisfactorily.

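To see why the table ends in a sentinel, here is a standalone sketch of
how code such as \code{initmodule()} could walk it. This is a model
under stated assumptions, not the interpreter's implementation: the
field names \code{name} and \code{meth}, the simplified function type
and the helper \code{find_method()} are made up for illustration, and
\code{demo_system()} and \code{demo_getpid()} merely stand in for real
wrappers such as \code{posix_system()}. Because the last entry has a
\code{NULL} name, the loop needs no separate element count.

\begin{verbatim}
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    /* Simplified stand-in: real methods take and return objects. */
    typedef int (*method)(const char *arg);

    struct methodlist {
        const char *name;   /* name seen by Python; NULL ends the table */
        method meth;        /* C function implementing it */
    };

    static int demo_system(const char *arg) { return (int)strlen(arg); }
    static int demo_getpid(const char *arg) { (void)arg; return 42; }

    static struct methodlist demo_methods[] = {
        {"system",  demo_system},
        {"getpid",  demo_getpid},
        {NULL,      NULL}        /* Sentinel */
    };

    /* Walk the table until the sentinel is reached. */
    static method find_method(const struct methodlist *table,
                              const char *name)
    {
        assert(table != NULL && name != NULL);
        for (; table->name != NULL; table++) {
            if (strcmp(table->name, name) == 0)
                return table->meth;
        }
        return NULL;             /* not in the table */
    }
\end{verbatim}
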

\section{Calling the module initialization function}

There is one more thing to do: telling the Python interpreter to call
the \code{initfoo()} function when it encounters an \code{import foo}
statement. This is done in the file \file{config.c}. This file
contains a table mapping module names to parameterless void function
pointers. You need to add a declaration of \code{initfoo()} somewhere
early in the file, and a line saying

\begin{verbatim}
    {"foo",         initfoo},
\end{verbatim}

to the initializer for \code{inittab[]}. It is conventional to include both
the declaration and the initializer line in preprocessor commands
\code{\#ifdef USE_FOO} / \code{\#endif}, to make it easy to turn the
foo extension on or off. Note that the Macintosh version uses a
different configuration file, distributed as \file{configmac.c}. This
strategy may be extended to other operating system versions, although
usually the standard \file{config.c} file gives a pretty useful starting
point for a new \file{config*.c} file.

And, of course, I forgot the Makefile. This is actually not too hard:
just follow the examples for, say, AMOEBA. Find all occurrences
of the string AMOEBA in the Makefile and do the same for FOO that's
done for AMOEBA...

(Note: if you are using dynamic loading for your extension, you don't
need to edit \file{config.c} and the Makefile. See \file{./DYNLOAD} for more
info about this.)


\section{Calling Python functions from C}

The above concentrates on making C functions accessible to the Python
programmer. The reverse is also often useful: calling Python
functions from C. This is especially the case for libraries that
support so-called `callback' functions. If a C interface makes heavy
use of callbacks, the equivalent Python interface often needs to
provide a callback mechanism to the Python programmer; the
implementation may require calling the Python callback functions from
a C callback. Other uses are also possible.

Fortunately, the Python interpreter is easily called recursively, and
there is a standard interface to call a Python function. I won't
dwell on how to call the Python parser with a particular string as
input --- if you're interested, have a look at the implementation of
the \samp{-c} command line option in \file{pythonmain.c}.

Calling a Python function is easy. First, the Python program must
somehow pass you the Python function object. You should provide a
function (or some other interface) to do this. When this function is
called, save a pointer to the Python function object (be careful to
\code{INCREF} it!) in a global variable --- or wherever you see fit.
For example, the following function might be part of a module
definition:

\begin{verbatim}
    static object *my_callback;

    static object *
    my_set_callback(dummy, arg)
        object *dummy, *arg;
    {
        XDECREF(my_callback); /* Dispose of previous callback */
        my_callback = arg;
        XINCREF(my_callback); /* Remember new callback */
        /* Boilerplate for "void" return */
        INCREF(None);
        return None;
    }
\end{verbatim}

Later, when it is time to call the function, you call the C function
\code{call_object()}. This function has two arguments, both pointers
to arbitrary Python objects: the Python function, and the argument.
The argument can be \code{NULL} to call the function without arguments. For
example:

\begin{verbatim}
    object *result;
    ...
    /* Time to call the callback */
    result = call_object(my_callback, (object *)NULL);
\end{verbatim}

\code{call_object()} returns a Python object pointer: this is
the return value of the Python function. \code{call_object()} is
`reference-count-neutral' with respect to its arguments, but the
return value is `new': either it is a brand new object, or it is an
existing object whose reference count has been incremented. So, you
should somehow apply \code{DECREF} to the result, even (especially!) if you
are not interested in its value.

Before you do this, however, it is important to check that the return
value isn't \code{NULL}. If it is, the Python function terminated by raising
an exception. If the C code that called \code{call_object()} is
called from Python, it should now return an error indication to its
Python caller, so the interpreter can print a stack trace, or the
calling Python code can handle the exception. If this is not possible
or desirable, the exception should be cleared by calling
\code{err_clear()}. For example:

\begin{verbatim}
    if (result == NULL)
        return NULL; /* Pass error back */
    /* Here maybe use the result */
    DECREF(result);
\end{verbatim}

Depending on the desired interface to the Python callback function,
you may also have to provide an argument to \code{call_object()}. In
some cases the argument is also provided by the Python program,
through the same interface that specified the callback function. It
can then be saved and used in the same manner as the function object.
In other cases, you may have to construct a new object to pass as
argument. In this case you must dispose of it as well. For example,
if you want to pass an integral event code, you might use the
following code:

\begin{verbatim}
    object *argument;
    ...
    argument = newintobject((long)eventcode);
    result = call_object(my_callback, argument);
    DECREF(argument);
    if (result == NULL)
        return NULL; /* Pass error back */
    /* Here maybe use the result */
    DECREF(result);
\end{verbatim}

Note the placement of \code{DECREF(argument)} immediately after the call,
before the error check! Also note that strictly speaking this code is
not complete: \code{newintobject()} may run out of memory, and this
should be checked.

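For completeness, here is a sketch of the event-code example with the
missing check added. So that it compiles on its own, it defines
simplified stand-ins for \code{object}, \code{newintobject()},
\code{DECREF()} and \code{call_object()}; these are assumptions made
for illustration (the real ones live in the interpreter, and the real
\code{call_object()} calls a Python function, not a C function
pointer). The function \code{report_event()} and the counter
\code{live_objects} are hypothetical; the counter only serves to show
that every reference is released on both the success and the failure
path.

\begin{verbatim}
    #include <assert.h>
    #include <stdlib.h>

    /* Stand-in object model: a reference count and one long. */
    typedef struct object {
        int refcnt;
        long ival;
    } object;

    static int live_objects = 0;        /* lets a caller detect leaks */

    static object *newintobject(long v) /* returns a new reference */
    {
        object *op = malloc(sizeof *op);
        if (op == NULL)
            return NULL;    /* the real one also sets an exception */
        op->refcnt = 1;
        op->ival = v;
        live_objects++;
        return op;
    }

    static void DECREF(object *op)
    {
        assert(op != NULL && op->refcnt > 0);
        if (--op->refcnt == 0) {
            live_objects--;
            free(op);
        }
    }

    /* Stand-in for call_object(): neutral towards its arguments, and
       the result is a new reference.  A NULL callback models a Python
       function that raised an exception. */
    typedef object *(*callback)(object *);

    static object *double_it(object *arg)
    {
        return newintobject(2 * arg->ival);
    }

    static object *call_object(callback func, object *arg)
    {
        if (func == NULL)
            return NULL;
        return func(arg);
    }

    /* The complete pattern: returns 0 on success, -1 on failure. */
    static int report_event(callback func, int eventcode, long *answer)
    {
        object *argument, *result;

        argument = newintobject((long)eventcode);
        if (argument == NULL)
            return -1;              /* Out of memory */
        result = call_object(func, argument);
        DECREF(argument);           /* Ours to dispose of, always */
        if (result == NULL)
            return -1;              /* Pass error back */
        *answer = result->ival;     /* Use the result ... */
        DECREF(result);             /* ... then drop our reference */
        return 0;
    }
\end{verbatim}
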
In even more complicated cases you may want to pass the callback
function multiple arguments. To this end you have to construct (and
dispose of!) a tuple object. Details (mostly concerned with the
error checks and reference count manipulation) are left as an
exercise for the reader; most of this is also needed when returning
multiple values from a function.

XXX TO DO: explain objects.

XXX TO DO: defining new object types.


\section{Format strings for {\tt getargs()}}

The \code{getargs()} function is declared in \file{modsupport.h} as
follows:

\begin{verbatim}
    int getargs(object *arg, char *format, ...);
\end{verbatim}

The remaining arguments must be addresses of variables whose type is
determined by the format string. For the conversion to succeed, the
\var{arg} object must match the format and the format must be exhausted.
Note that while \code{getargs()} checks that the Python object really
is of the specified type, it cannot check that the addresses provided
in the call match: if you make mistakes there, your code will probably
dump core.

A format string consists of a single `format unit'. A format unit
describes one Python object; it is usually a single character or a
parenthesized string. The type of a format unit is determined from
its first character, the `format letter':

\begin{description}

\item[\samp{s} (string)]
The Python object must be a string object. The C argument must be a
char** (i.e. the address of a character pointer), and a pointer to
the C string contained in the Python object is stored into it. If the
next character in the format string is \samp{\#}, another C argument
of type int* must be present, and the length of the Python string (not
counting the trailing zero byte) is stored into it.

\item[\samp{z} (string or zero, i.e. \code{NULL})]
Like \samp{s}, but the object may also be \code{None}. In this case the
string pointer is set to \code{NULL} and if a \samp{\#} is present the size
is set to 0.

\item[\samp{b} (byte, i.e. char interpreted as tiny int)]
The object must be a Python integer. The C argument must be a char*.

\item[\samp{h} (half, i.e. short)]
The object must be a Python integer. The C argument must be a short*.

\item[\samp{i} (int)]
The object must be a Python integer. The C argument must be an int*.

\item[\samp{l} (long)]
The object must be a (plain!) Python integer. The C argument must be
a long*.

\item[\samp{c} (char)]
The Python object must be a string of length 1. The C argument must
be a char*. (Don't pass an int*!)

\item[\samp{f} (float)]
The object must be a Python int or float. The C argument must be a
float*.

\item[\samp{d} (double)]
The object must be a Python int or float. The C argument must be a
double*.

\item[\samp{S} (string object)]
The object must be a Python string. The C argument must be an
object** (i.e. the address of an object pointer). The C program thus
gets back the actual string object that was passed, not just a pointer
to its array of characters and its size as for format character
\samp{s}.

\item[\samp{O} (object)]
The object can be any Python object, including \code{None}, but not
\code{NULL}. The C argument must be an object**. This can be used if
an argument list must contain objects of a type for which no format
letter exists: the caller must then check that it has the right type.

\item[\samp{(} (tuple)]
The object must be a Python tuple. Following the \samp{(} character
in the format string must come a number of format units describing the
elements of the tuple, followed by a \samp{)} character. Tuple
format units may be nested. (There are no exceptions for empty and
singleton tuples; \samp{()} specifies an empty tuple and \samp{(i)} a
singleton of one integer. Normally you don't want to use the latter,
since it is hard for the user to specify.)

\end{description}

More format characters will probably be added as the need arises. It
should also become possible to use Python long integers wherever
integers are expected, with a range check. (A range check is in fact
always necessary for the \samp{b}, \samp{h} and \samp{i} format
letters, but this is currently not implemented.)

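The missing range check could look roughly as follows. This is only a
sketch of the idea, not code from the interpreter: the helper names
\code{long_to_byte()}, \code{long_to_short()} and \code{long_to_int()}
are hypothetical, and I assume that a Python integer arrives as a C
long and that \samp{b} uses the range of a plain C char. Like
\code{getargs()}, each helper returns nonzero for success and zero for
failure; on failure the destination is left untouched.

\begin{verbatim}
    #include <assert.h>
    #include <limits.h>
    #include <stddef.h>

    /* Format letter `b': store v in a char if it fits. */
    static int long_to_byte(long v, char *out)
    {
        assert(out != NULL);
        if (v < CHAR_MIN || v > CHAR_MAX)
            return 0;
        *out = (char)v;
        return 1;
    }

    /* Format letter `h': store v in a short if it fits. */
    static int long_to_short(long v, short *out)
    {
        assert(out != NULL);
        if (v < SHRT_MIN || v > SHRT_MAX)
            return 0;
        *out = (short)v;
        return 1;
    }

    /* Format letter `i': store v in an int if it fits. */
    static int long_to_int(long v, int *out)
    {
        assert(out != NULL);
        if (v < INT_MIN || v > INT_MAX)
            return 0;
        *out = (int)v;
        return 1;
    }
\end{verbatim}
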
				517	Some example calls:
				518
				519	\begin{verbatim}
				520	int ok;
				521	int i, j;
				522	long k, l;
				523	char *s;
				524	int size;
				525
				526	ok = getargs(args, "(lls)", &k, &l, &s); /* Two longs and a string */
				527	/* Possible Python call: f(1, 2, 'three') */
				528
				529	ok = getargs(args, "s", &s); /* A string */
				530	/* Possible Python call: f('whoops!') */
				531
				532	ok = getargs(args, ""); /* No arguments */
				533	/* Python call: f() */
				534
ok = getargs(args, "((ii)s#)", &i, &j, &s, &size);
/* A pair of ints and a string, whose size is also returned */
/* Possible Python call: f((1, 2), 'three') */

{
    int left, top, right, bottom, h, v;
    ok = getargs(args, "(((ii)(ii))(ii))",
                 &left, &top, &right, &bottom, &h, &v);
    /* A rectangle and a point */
    /* Possible Python call:
       f( ((0, 0), (400, 300)), (10, 10)) */
}
				547	\end{verbatim}
				548
				549	Note that a format string must consist of a single unit; strings like
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	550	\samp{is} and \samp{(ii)s\#} are not valid format strings. (But
				551	\samp{s\#} is.)
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	552
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	553	The \code{getargs()} function does not support variable-length
				554	argument lists. In simple cases you can fake these by trying several
				555	calls to
				556	\code{getargs()} until one succeeds, but you must take care to call
				557	\code{err_clear()} before each retry. For example:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	558
				559	\begin{verbatim}
static object *
my_method(self, args)
    object *self, *args;
{
    int i, j, k;

    if (getargs(args, "(ii)", &i, &j)) {
        k = 0; /* Use default third argument */
    }
    else {
        err_clear();
        if (!getargs(args, "(iii)", &i, &j, &k))
            return NULL;
    }
    /* ... use i, j and k here ... */
    INCREF(None);
    return None;
}
				575	\end{verbatim}
				576
				577	(It is possible to think of an extension to the definition of format
strings to accommodate this directly, e.g., placing a \samp{\|} in a
				579	tuple might specify that the remaining arguments are optional.
				580	\code{getargs()} should then return one more than the number of
				581	variables stored into.)
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	582
				583	Advanced users note: If you set the `varargs' flag in the method list
				584	for a function, the argument will always be a tuple (the `raw argument
				585	list'). In this case you must enclose single and empty argument lists
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	586	in parentheses, e.g., \samp{(s)} and \samp{()}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	587
				588
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	589	\section{The {\tt mkvalue()} function}
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	590
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	591	This function is the counterpart to \code{getargs()}. It is declared
				592	in \file{modsupport.h} as follows:
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	593
				594	\begin{verbatim}
object *mkvalue(char *format, ...);
				596	\end{verbatim}
				597
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	598	It supports exactly the same format letters as \code{getargs()}, but
				599	the arguments (which are input to the function, not output) must not
				600	be pointers, just values. If a byte, short or float is passed to a
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	601	varargs function, it is widened by the compiler to int or double, so
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	602	\samp{b} and \samp{h} are treated as \samp{i} and \samp{f} is
				603	treated as \samp{d}. \samp{S} is treated as \samp{O}, \samp{s} is
				604	treated as \samp{z}. \samp{z\#} and \samp{s\#} are supported: a
				605	second argument specifies the length of the data (negative means use
				606	\code{strlen()}). \samp{S} and \samp{O} add a reference to their
				607	argument (so you should \code{DECREF()} it if you've just created it
				608	and aren't going to use it again).
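
The widening rule is a property of C itself and can be seen in a small
function that is independent of Python. In the sketch below,
\code{sum_units()} is a hypothetical function (it is not part of
Python) that reads its variable arguments the way a
\code{mkvalue()}-like function has to: every integer argument is
fetched as an int and every floating point argument as a double,
whatever type the caller wrote.

\begin{verbatim}
#include <stdarg.h>

/* Hypothetical helper, not part of Python: adds up its arguments.
   'i' reads an int, 'd' reads a double; any other letter stops
   the scan. */
double sum_units(const char *format, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, format);
    for (; *format != '\0'; format++) {
        if (*format == 'i')
            total += va_arg(ap, int);    /* char and short arrive as int */
        else if (*format == 'd')
            total += va_arg(ap, double); /* float arrives as double */
        else
            break;
    }
    va_end(ap);
    return total;
}
\end{verbatim}

A call such as \code{sum_units("id", (short)3, 1.5f)} passes a short
and a float, yet the function must read them as int and double; this
is why separate handling of \samp{b}, \samp{h} and \samp{f} would
serve no purpose in \code{mkvalue()}.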
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	609
If the argument for \samp{O} or \samp{S} is a NULL pointer, it is
assumed that the call producing the argument found an error and set an
exception. Therefore, \code{mkvalue()} will return \code{NULL} but
won't set an exception if one is already set.
If no exception is set, \code{SystemError} is set.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	615
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	616	If there is an error in the format string, the \code{SystemError}
				617	exception is set, since it is the calling C code's fault, not that of
				618	the Python user who sees the exception.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	619
				620	Example:
				621
				622	\begin{verbatim}
				623	return mkvalue("(ii)", 0, 0);
				624	\end{verbatim}
				625
				626	returns a tuple containing two zeros. (Outer parentheses in the
				627	format string are actually superfluous, but you can use them for
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	628	compatibility with \code{getargs()}, which requires them if more than
				629	one argument is expected.)
				630
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	631
				632	\section{Reference counts}
				633
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	634	Here's a useful explanation of \code{INCREF()} and \code{DECREF()}
				635	(after an original by Sjoerd Mullender).
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	636
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	637	Use \code{XINCREF()} or \code{XDECREF()} instead of \code{INCREF()} /
				638	\code{DECREF()} when the argument may be \code{NULL}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	639
The basic idea is: if you create an extra reference to an object, you
must \code{INCREF()} it; if you throw away a reference to an object,
				642	you must \code{DECREF()} it. Functions such as
				643	\code{newstringobject()}, \code{newsizedstringobject()},
				644	\code{newintobject()}, etc. create a reference to an object. If you
				645	want to throw away the object thus created, you must use
				646	\code{DECREF()}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	647
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	648	If you put an object into a tuple or list using \code{settupleitem()}
				649	or \code{setlistitem()}, the idea is that you usually don't want to
				650	keep a reference of your own around, so Python does not
				651	\code{INCREF()} the elements. It does \code{DECREF()} the old value.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	652	This means that if you put something into such an object using the
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	653	functions Python provides for this, you must \code{INCREF()} the
				654	object if you also want to keep a separate reference to the object around.
				655	Also, if you replace an element, you should \code{INCREF()} the old
				656	element first if you want to keep it. If you didn't \code{INCREF()}
				657	it before you replaced it, you are not allowed to look at it anymore,
				658	since it may have been freed.
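
The rule for replacing an element can be tried out on a toy model.
Everything in the sketch below (\code{toy_object}, \code{TOY_INCREF()},
\code{TOY_DECREF()}, \code{toy_set_slot()}) is hypothetical and merely
imitates the behavior described above; it does not use the real Python
headers, and a flag stands in for actually freeing the object.

\begin{verbatim}
#include <stddef.h>

/* Toy stand-in for a Python object; only the count matters here. */
typedef struct {
    int refcnt;
    int freed;   /* set where the real DECREF() would free the object */
} toy_object;

#define TOY_INCREF(op) ((op)->refcnt++)
#define TOY_DECREF(op) \
    do { if (--(op)->refcnt == 0) (op)->freed = 1; } while (0)

/* Imitates settupleitem() as described above: takes over the
   caller's reference to item (no INCREF) and drops the reference
   to the old value. */
void toy_set_slot(toy_object **slot, toy_object *item)
{
    toy_object *old = *slot;

    *slot = item;
    if (old != NULL)
        TOY_DECREF(old);
}

/* Puts a in a slot, then replaces it by b.  Returns 1 if a is
   still alive afterwards, 0 if it was freed. */
int toy_survives_replacement(int keep_own_reference)
{
    toy_object a = {1, 0}, b = {1, 0};
    toy_object *slot = NULL;

    if (keep_own_reference)
        TOY_INCREF(&a);       /* a separate reference for ourselves */
    toy_set_slot(&slot, &a);  /* the slot takes over one reference */
    toy_set_slot(&slot, &b);  /* replacing drops the slot's reference to a */
    return !a.freed;
}
\end{verbatim}

Without the extra \code{TOY_INCREF()} the old element is gone as soon
as it has been replaced, which is exactly why you may not look at it
anymore.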
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	659
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	660	Returning an object to Python (i.e. when your C function returns)
				661	creates a reference to an object, but it does not change the reference
				662	count. When your code does not keep another reference to the object,
				663	you should not \code{INCREF()} or \code{DECREF()} it (assuming it is a
				664	newly created object). When you do keep a reference around, you
				665	should \code{INCREF()} the object. Also, when you return a global
				666	object such as \code{None}, you should \code{INCREF()} it.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	667
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	668	If you want to return a tuple, you should consider using
				669	\code{mkvalue()}. This function creates a new tuple with a reference
				670	count of 1 which you can return. If any of the elements you put into
				671	the tuple are objects (format codes \samp{O} or \samp{S}), they
				672	are \code{INCREF()}'ed by \code{mkvalue()}. If you don't want to keep
				673	references to those elements around, you should \code{DECREF()} them
				674	after having called \code{mkvalue()}.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	675
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	676	Usually you don't have to worry about arguments. They are
				677	\code{INCREF()}'ed before your function is called and
				678	\code{DECREF()}'ed after your function returns. When you keep a
				679	reference to an argument, you should \code{INCREF()} it and
				680	\code{DECREF()} when you throw it away. Also, when you return an
				681	argument, you should \code{INCREF()} it, because returning the
				682	argument creates an extra reference to it.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	683
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	684	If you use \code{getargs()} to parse the arguments, you can get a
				685	reference to an object (by using \samp{O} in the format string). This
				686	object was not \code{INCREF()}'ed, so you should not \code{DECREF()}
				687	it. If you want to keep the object, you must \code{INCREF()} it
				688	yourself.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	689
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	690	If you create your own type of objects, you should use \code{NEWOBJ()}
				691	to create the object. This sets the reference count to 1. If you
				692	want to throw away the object, you should use \code{DECREF()}. When
				693	the reference count reaches zero, your type's \code{dealloc()}
function is called. In it, you should \code{DECREF()} all objects to
				695	which you keep references in your object, but you should not use
				696	\code{DECREF()} on your object. You should use \code{DEL()} instead.
				697
				698
				699	\section{Using C++}
				700
				701	It is possible to write extension modules in C++. Some restrictions
				702	apply: since the main program (the Python interpreter) is compiled and
				703	linked by the C compiler, global or static objects with constructors
				704	cannot be used. All functions that will be called directly or
				705	indirectly (i.e. via function pointers) by the Python interpreter will
				706	have to be declared using \code{extern "C"}; this applies to all
				707	`methods' as well as to the module's initialization function.
				708	It is unnecessary to enclose the Python header files in
				709	\code{extern "C" \{...\}} --- they do this already.
				710
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	711
				712	\chapter{Embedding Python in another application}
				713
				714	Embedding Python is similar to extending it, but not quite. The
				715	difference is that when you extend Python, the main program of the
application is still the Python interpreter, while if you embed
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	717	Python, the main program may have nothing to do with Python ---
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	718	instead, some parts of the application occasionally call the Python
				719	interpreter to run some Python code.
				720
				721	So if you are embedding Python, you are providing your own main
				722	program. One of the things this main program has to do is initialize
				723	the Python interpreter. At the very least, you have to call the
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	724	function \code{initall()}. There are optional calls to pass command
				725	line arguments to Python. Then later you can call the interpreter
				726	from any part of the application.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	727
				728	There are several different ways to call the interpreter: you can pass
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	729	a string containing Python statements to \code{run_command()}, or you
				730	can pass a stdio file pointer and a file name (for identification in
				731	error messages only) to \code{run_script()}. You can also call the
				732	lower-level operations described in the previous chapters to construct
				733	and use Python objects.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	734
				735	A simple demo of embedding Python can be found in the directory
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	736	\file{<pythonroot>/embed}.
				737
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	738
				739	\section{Using C++}
				740
				741	It is also possible to embed Python in a C++ program; how this is done
				742	exactly will depend on the details of the C++ system used; in general
Guido van Rossum	db65a6c	1993-11-05 17:11:16 +0000	[diff] [blame]	743	you will need to write the main program in C++, and use the C++
				744	compiler to compile and link your program. There is no need to
				745	recompile Python itself with C++.
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	746
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	747
				748	\chapter{Dynamic Loading}
				749
				750	On some systems (e.g., SunOS, SGI Irix) it is possible to configure
				751	Python to support dynamic loading of modules implemented in C. Once
				752	configured and installed it's trivial to use: if a Python program
				753	executes \code{import foo}, the search for modules tries to find a
				754	file \file{foomodule.o} in the module search path, and if one is
				755	found, it is linked with the executing binary and executed. Once
				756	linked, the module acts just like a built-in module.
				757
				758	The advantages of dynamic loading are twofold: the `core' Python
				759	binary gets smaller, and users can extend Python with their own
				760	modules implemented in C without having to build and maintain their
				761	own copy of the Python interpreter. There are also disadvantages:
				762	dynamic loading isn't available on all systems (this just means that
				763	on some systems you have to use static loading), and dynamically
				764	loading a module that was compiled for a different version of Python
				765	(e.g., with a different representation of objects) may dump core.
				766
Guido van Rossum	fbee23e	1994-01-01 17:32:24 +0000	[diff] [blame]	767	{\bf NEW:} Under SunOS (all versions) and IRIX 5.x, dynamic loading
now uses shared libraries and is always configured. See the end of
this chapter for how to create a dynamically loadable module.
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	770
				771
				772	\section{Configuring and building the interpreter for dynamic loading}
				773
Guido van Rossum	fbee23e	1994-01-01 17:32:24 +0000	[diff] [blame]	774	(Ignore this section for SunOS and IRIX 5.x --- on these systems
				775	dynamic loading is always configured.)
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	776
				777	Dynamic loading is a little complicated to configure, since its
				778	implementation is extremely system dependent, and there are no
				779	really standard libraries or interfaces for it. I'm using an
				780	extremely simple interface, which basically needs only one function:
				781
				782	\begin{verbatim}
				783	funcptr = dl_loadmod(binary, object, function)
				784	\end{verbatim}
				785
				786	where \code{binary} is the pathname of the currently executing program
				787	(not just \code{argv[0]}!), \code{object} is the name of the \samp{.o}
				788	file to be dynamically loaded, and \code{function} is the name of a
				789	function in the module. If the dynamic loading succeeds,
				790	\code{dl_loadmod()} returns a pointer to the named function; if not, it
				791	returns \code{NULL}.
				792
				793	I provide two implementations of \code{dl_loadmod()}: one for SGI machines
				794	running Irix 4.0 (written by my colleague Jack Jansen), and one that
				795	is a thin interface layer for Wilson Ho's (GNU) dynamic loading
				796	package \dfn{dld} (version 3.2.3). Dld implements a much more powerful
				797	version of dynamic loading than needed (including unlinking), but it
				798	does not support System V's COFF object file format. It currently
				799	supports only VAX (Ultrix), Sun 3 (SunOS 3.4 and 4.0), SPARCstation
				800	(SunOS 4.0), Sequent Symmetry (Dynix), and Atari ST (from the dld
				801	3.2.3 README file). Dld is part of the standard Python distribution;
if you didn't get it, many ftp archive sites carry dld these days, so
				803	it won't be hard to get hold of it if you need it (using archie).
				804
				805	(If you don't know where to get dld, try anonymous ftp to
\file{wuarchive.wustl.edu:/mirrors2/gnu/dld-3.2.3.tar.Z}. Jack's dl
				807	can be found at \file{ftp.cwi.nl:/pub/python/dl.tar.Z}.)
				808
				809	To build a Python interpreter capable of dynamic loading, you need to
				810	edit the Makefile. Basically you must uncomment the lines starting
				811	with \samp{\#DL_}, but you must also edit some of the lines to choose
which version of \code{dl_loadmod()} to use, and fill in the pathname of the dld
library if you use it. And, of course, you must first build
\code{dl_loadmod()} and dld, if used. (This is now done through the Configure
Guido van Rossum	fbee23e	1994-01-01 17:32:24 +0000	[diff] [blame]	815	script. For SunOS and IRIX 5.x, everything is now automatic.)
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	816
				817
				818	\section{Building a dynamically loadable module}
				819
				820	Building an object file usable by dynamic loading is easy, if you
				821	follow these rules (substitute your module name for \code{foo}
				822	everywhere):
				823
				824	\begin{itemize}
				825
				826	\item
				827	The source filename must be \file{foomodule.c}, so the object
				828	name is \file{foomodule.o}.
				829
				830	\item
				831	The module must be written as a (statically linked) Python extension
module (described in an earlier chapter), except that no line for it
may be added to \file{config.c} and it mustn't be linked with the
				834	main Python interpreter.
				835
				836	\item
				837	The module's initialization function must be called \code{initfoo}; it
				838	must install the module in \code{sys.modules} (generally by calling
\code{initmodule()}, as explained earlier).
				840
				841	\item
				842	The module must be compiled with \samp{-c}. The resulting .o file must
				843	not be stripped.
				844
				845	\item
				846	Since the module must include many standard Python include files, it
				847	must be compiled with a \samp{-I} option pointing to the Python source
				848	directory (unless it resides there itself).
				849
				850	\item
				851	On SGI Irix, the compiler flag \samp{-G0} (or \samp{-G 0}) must be passed.
				852	IF THIS IS NOT DONE THE RESULTING CODE WILL NOT WORK.
				853
				854	\item
Guido van Rossum	fbee23e	1994-01-01 17:32:24 +0000	[diff] [blame]	855	{\bf NEW:} On SunOS and IRIX 5.x, you must create a shared library
				856	from your \samp{.o} file using the following command (assuming your
				857	module is called \code{foo}):
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	858
				859	\begin{verbatim}
				860	ld -o foomodule.so foomodule.o <any other libraries needed>
				861	\end{verbatim}
				862
				863	and place the resulting \samp{.so} file in the Python search path (not
				864	the \samp{.o} file). Note: on Solaris, you need to pass \samp{-G} to
Guido van Rossum	fbee23e	1994-01-01 17:32:24 +0000	[diff] [blame]	865	the loader; on IRIX 5.x, you need to pass \samp{-shared}. Sigh...
Guido van Rossum	6f0132f	1993-11-19 13:13:22 +0000	[diff] [blame]	866
				867	\end{itemize}
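
The naming rules above are mechanical, as the following sketch shows.
The function \code{module_names()} is a hypothetical helper, not part of
Python; it derives the object file name and the initialization function
name from a module name, and only illustrates the convention (on
systems using shared libraries the file suffix would be \samp{.so}
instead of \samp{.o}).

\begin{verbatim}
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of Python: fills in the object file
   name and the initialization function name for module `name'.
   Returns 0 on success, -1 if either buffer is too small. */
int module_names(const char *name, char *objfile, size_t objsize,
                 char *initfunc, size_t initsize)
{
    /* sizeof counts the terminating null byte of each literal */
    if (strlen(name) + sizeof("module.o") > objsize)
        return -1;
    if (strlen(name) + sizeof("init") > initsize)
        return -1;
    sprintf(objfile, "%smodule.o", name);
    sprintf(initfunc, "init%s", name);
    return 0;
}
\end{verbatim}

For the module \code{foo} this yields \file{foomodule.o} and
\code{initfoo}, the two names the dynamic loader looks for.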
				868
				869
				870	\section{Using libraries}
				871
				872	If your dynamically loadable module needs to be linked with one or
				873	more libraries that aren't linked with Python (or if it needs a
				874	routine that isn't used by Python from one of the libraries with which
				875	Python is linked), you must specify a list of libraries to search
				876	after loading the module in a file with extension \samp{.libs} (and
				877	otherwise the same as your \samp{.o} file). This file should contain
				878	one or more lines containing whitespace-separated absolute library
				879	pathnames. When using the dl interface, \samp{-l...} flags may also
				880	be used (it is in fact passed as an option list to the system linker
				881	ld(1)), but the dl-dld interface requires absolute pathnames. I
				882	believe it is possible to specify shared libraries here.
				883
				884	(On SunOS, any extra libraries must be specified on the \code{ld}
				885	command that creates the \samp{.so} file.)
				886
				887
				888	\section{Caveats}
				889
				890	Dynamic loading requires that \code{main}'s \code{argv[0]} contains
				891	the pathname or at least filename of the Python interpreter.
				892	Unfortunately, when executing a directly executable Python script (an
				893	executable file with \samp{\#!...} on the first line), the kernel
				894	overwrites \code{argv[0]} with the name of the script. There is no
				895	easy way around this, so executable Python scripts cannot use
				896	dynamically loaded modules. (You can always write a simple shell
				897	script that calls the Python interpreter with the script as its
				898	input.)
				899
When using dl, the \samp{.o} file is first converted into an `overlay' for
				901	the current process by the system linker (\code{ld}). The overlay is
				902	saved as a file with extension \samp{.ld}, either in the directory
				903	where the \samp{.o} file lives or (if that can't be written) in a
				904	temporary directory. An existing \samp{.ld} file resulting from a
				905	previous run (not from a temporary directory) is used, bypassing the
				906	(costly) linking phase, provided its version matches the \samp{.o}
				907	file and the current binary. (See the \code{dl} man page for more
				908	details.)
				909
				910
Guido van Rossum	7a2dba2	1993-11-05 14:45:11 +0000	[diff] [blame]	911	\input{ext.ind}
				912
				913	\end{document}