\documentstyle[twoside,11pt,myformat]{report}
| 2 | |
| 3 | \title{\bf Extending and Embedding the Python Interpreter} |
| 4 | |
| 5 | \author{ |
| 6 | Guido van Rossum \\ |
| 7 | Dept. CST, CWI, Kruislaan 413 \\ |
| 8 | 1098 SJ Amsterdam, The Netherlands \\ |
| 9 | E-mail: {\tt guido@cwi.nl} |
| 10 | } |
| 11 | |
| 12 | % Tell \index to actually write the .idx file |
| 13 | \makeindex |
| 14 | |
| 15 | \begin{document} |
| 16 | |
| 17 | \pagenumbering{roman} |
| 18 | |
| 19 | \maketitle |
| 20 | |
| 21 | \begin{abstract} |
| 22 | |
| 23 | \noindent |
| 24 | This document describes how you can extend the Python interpreter with |
| 25 | new modules written in C or C++. It also describes how to use the |
| 26 | interpreter as a library package from applications using Python as an |
| 27 | ``embedded'' language. |
| 28 | |
| 29 | \end{abstract} |
| 30 | |
| 31 | \pagebreak |
| 32 | |
| 33 | { |
| 34 | \parskip = 0mm |
| 35 | \tableofcontents |
| 36 | } |
| 37 | |
| 38 | \pagebreak |
| 39 | |
| 40 | \pagenumbering{arabic} |
| 41 | |
| 42 | \chapter{Extending Python with C or C++ code} |
| 43 | |
| 44 | It is quite easy to add non-standard built-in modules to Python, if |
| 45 | you know how to program in C. A built-in module known to the Python |
| 46 | programmer as foo is generally implemented in a file called |
| 47 | foomodule.c. The standard built-in modules also adhere to this |
| 48 | convention, and in fact some of them form excellent examples of how to |
| 49 | create an extension. |
| 50 | |
Extension modules can do two things that can't be done directly in
Python: implement new data types and provide access to system calls or
C library functions.  Since the latter is usually the most important
reason for adding an extension, I'll concentrate on adding ``wrappers''
around C library functions; the concrete example uses the wrapper for
system() in module posix, found in (of course) the file posixmodule.c.
| 57 | |
Don't be intimidated by the size and complexity of the average
extension module; much of this is straightforward ``boilerplate'' code
(starting right with the copyright notice!).
| 61 | |
| 62 | Let's skip the boilerplate and jump right to an interesting function: |
| 63 | |
\begin{verbatim}
    static object *
    posix_system(self, args)
        object *self;
        object *args;
    {
        char *command;
        int sts;
        if (!getargs(args, "s", &command))
            return NULL;
        sts = system(command);
        return newintobject((long)sts);
    }
\end{verbatim}
| 78 | |
| 79 | This is the prototypical top-level function in an extension module. |
| 80 | It will be called (we'll see later how this is made possible) when the |
| 81 | Python program executes statements like |
| 82 | |
\begin{verbatim}
    >>> import posix
    >>> sts = posix.system('ls -l')
\end{verbatim}
| 87 | |
| 88 | There is a straightforward translation from the arguments to the call |
| 89 | in Python (here the single value 'ls -l') to the arguments that are |
| 90 | passed to the C function. The C function always has two parameters, |
| 91 | conventionally named 'self' and 'args'. In this example, 'self' will |
| 92 | always be a NULL pointer, since this is a function, not a method (this |
| 93 | is done so that the interpreter doesn't have to understand two |
| 94 | different types of C functions). |
| 95 | |
| 96 | The 'args' parameter will be a pointer to a Python object, or NULL if |
| 97 | the Python function/method was called without arguments. It is |
| 98 | necessary to do full argument type checking on each call, since |
| 99 | otherwise the Python user could cause a core dump by passing the wrong |
| 100 | arguments (or no arguments at all). Because argument checking and |
| 101 | converting arguments to C is such a common task, there's a general |
| 102 | function in the Python interpreter which combines these tasks: |
| 103 | getargs(). It uses a template string to determine both the types of |
| 104 | the Python argument and the types of the C variables into which it |
| 105 | should store the converted values. |
| 106 | |
| 107 | When getargs returns nonzero, the argument list has the right type and |
| 108 | its components have been stored in the variables whose addresses are |
| 109 | passed. When it returns zero, an error has occurred. In the latter |
| 110 | case it has already raised an appropriate exception by calling |
| 111 | err_setstr(), so the calling function can just return NULL. |
| 112 | |
The form of the format string is described in the section on format
strings for getargs() below.  (There are convenience macros
getstrarg(), getintarg(), etc., for many common forms of argument
lists.  These are relics from the past; it's better to call getargs()
directly.)
| 117 | |
| 118 | |
| 119 | \section{Intermezzo: errors and exceptions} |
| 120 | |
| 121 | An important convention throughout the Python interpreter is the |
| 122 | following: when a function fails, it should set an exception condition |
| 123 | and return an error value (often a NULL pointer). Exceptions are set |
| 124 | in a global variable in the file errors.c; if this variable is NULL no |
| 125 | exception has occurred. A second variable is the "associated value" |
| 126 | of the exception. |
| 127 | |
| 128 | The file errors.h declares a host of err_* functions to set various |
| 129 | types of exceptions. The most common one is err_setstr() -- its |
| 130 | arguments are an exception object (e.g. RuntimeError -- actually it |
| 131 | can be any string object) and a C string indicating the cause of the |
| 132 | error (this is converted to a string object and stored as the |
| 133 | "associated value" of the exception). Another useful function is |
| 134 | err_errno(), which only takes an exception argument and constructs the |
| 135 | associated value by inspection of the (UNIX) global variable errno. |
| 136 | |
| 137 | You can test non-destructively whether an exception has been set with |
| 138 | err_occurred(). However, most code never calls err_occurred() to see |
| 139 | whether an error occurred or not, but relies on error return values |
| 140 | from the functions it calls instead: |
| 141 | |
When a function that calls another function detects that the called
function has failed, it should return an error value but not set an
exception condition -- one is already set.  The caller is then
supposed to also return an error indication to *its* caller, again
*without* calling err_setstr(), and so on -- the most detailed cause of
the error was already reported by the function that detected it in the
first place.  Once the error has reached Python's interpreter main
loop, this aborts the currently executing Python code and tries to
find an exception handler specified by the Python programmer.
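
The propagation rule can be tried out in isolation.  The following is
a minimal sketch in plain ANSI C, not the interpreter's own code: the
toy_* names and the functions parse_digit(), sum_digits() and
sum_digits_or_zero() are made up for illustration, and the two static
variables merely play the role of the error variables in errors.c.

\begin{verbatim}
    #include <stddef.h>

    /* Hypothetical stand-ins for the interpreter's error state */
    static const char *toy_exception = NULL; /* NULL: no exception set */
    static const char *toy_value = NULL;     /* the "associated value" */

    void toy_err_setstr(const char *exception, const char *value)
    {
        toy_exception = exception;
        toy_value = value;
    }

    int toy_err_occurred(void) { return toy_exception != NULL; }

    const char *toy_err_value(void) { return toy_value; }

    void toy_err_clear(void)
    {
        toy_exception = NULL;
        toy_value = NULL;
    }

    /* Detects the failure: sets the exception and returns -1 */
    int parse_digit(char c)
    {
        if (c < '0' || c > '9') {
            toy_err_setstr("RuntimeError", "not a digit");
            return -1;
        }
        return c - '0';
    }

    /* Intermediate caller: passes the failure on, sets nothing */
    int sum_digits(const char *s)
    {
        int total = 0;
        for (; *s != '\0'; s++) {
            int d = parse_digit(*s);
            if (d < 0)
                return -1; /* exception already set by parse_digit() */
            total += d;
        }
        return total;
    }

    /* A caller that handles the error itself must clear it */
    int sum_digits_or_zero(const char *s)
    {
        int total = sum_digits(s);
        if (total < 0) {
            toy_err_clear();
            return 0;
        }
        return total;
    }
\end{verbatim}

Only parse_digit() ever sets the exception; sum_digits() just returns
-1, and sum_digits_or_zero() is the one place where the error is
deliberately swallowed.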
| 151 | |
| 152 | To ignore an exception set by a function call that failed, the |
| 153 | exception condition must be cleared explicitly by calling err_clear(). |
| 154 | The only time C code should call err_clear() is if it doesn't want to |
| 155 | pass the error on to the interpreter but wants to handle it completely |
| 156 | by itself (e.g. by trying something else or pretending nothing |
| 157 | happened). |
| 158 | |
| 159 | Finally, the function err_get() gives you both error variables |
| 160 | *and clears them*. Note that even if an error occurred the second one |
| 161 | may be NULL. I doubt you will need to use this function. |
| 162 | |
Note that a failing malloc() call must also be turned into an
exception -- the direct caller of malloc() (or realloc()) must call
err_nomem() and return a failure indicator itself.  All the
object-creating functions (newintobject() etc.) already do this, so
this note matters only if you call malloc() directly.
| 168 | |
| 169 | Also note that, with the important exception of getargs(), functions |
| 170 | that return an integer status usually use 0 for success and -1 for |
| 171 | failure. |
| 172 | |
| 173 | Finally, be careful about cleaning up garbage (making appropriate |
| 174 | [X]DECREF() calls) when you return an error! |
| 175 | |
| 176 | |
| 177 | \section{Back to the example} |
| 178 | |
| 179 | Going back to posix_system, you should now be able to understand this |
| 180 | bit: |
| 181 | |
\begin{verbatim}
        if (!getargs(args, "s", &command))
            return NULL;
\end{verbatim}
| 186 | |
It returns NULL (the error indicator for functions of this kind) if an
error is detected in the argument list, relying on the exception set
by getargs().  Otherwise, the local variable 'command' now points to
the string value of the argument.
| 191 | |
If a Python function is called with multiple arguments, the argument
list is turned into a tuple.  Python programs can use this feature, for
instance, to explicitly create the tuple containing the arguments
first and make the call later.
| 196 | |
The next statement in posix_system is a call to the C library function
system(), passing it the string we just got from getargs():
| 199 | |
| 200 | \begin{verbatim} |
| 201 | sts = system(command); |
| 202 | \end{verbatim} |
| 203 | |
| 204 | Python strings may contain internal null bytes; but if these occur in |
| 205 | this example the rest of the string will be ignored by system(). |
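
The effect of an embedded null byte can be seen in plain C,
independent of the interpreter.  A minimal sketch; the function name
visible_length() is made up for illustration:

\begin{verbatim}
    #include <stddef.h>

    /* Number of bytes a C string function such as system() would
       see in a buffer of the given size: everything before the
       first null byte. */
    size_t visible_length(const char *data, size_t size)
    {
        size_t n = 0;
        while (n < size && data[n] != '\0')
            n++;
        return n;
    }
\end{verbatim}

For the five-byte buffer "ls\0-l" this yields 2, so only "ls" would
reach system().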
| 206 | |
| 207 | Finally, posix.system() must return a value: the integer status |
| 208 | returned by the C library system() function. This is done by the |
| 209 | function newintobject(), which takes a (long) integer as parameter. |
| 210 | |
\begin{verbatim}
        return newintobject((long)sts);
\end{verbatim}
| 214 | |
(Yes, even integers are represented as objects on the heap in Python!)
If you had a function that returned no useful value, you would need
this idiom:
| 218 | |
\begin{verbatim}
        INCREF(None);
        return None;
\end{verbatim}
| 223 | |
| 224 | 'None' is a unique Python object representing 'no value'. It differs |
| 225 | from NULL, which means 'error' in most contexts (except when passed as |
| 226 | a function argument -- there it means 'no arguments'). |
| 227 | |
| 228 | |
| 229 | \section{The module's function table} |
| 230 | |
| 231 | I promised to show how I made the function posix_system() available to |
| 232 | Python programs. This is shown later in posixmodule.c: |
| 233 | |
\begin{verbatim}
    static struct methodlist posix_methods[] = {
        ...
        {"system",  posix_system},
        ...
        {NULL,      NULL}        /* Sentinel */
    };

    void
    initposix()
    {
        (void) initmodule("posix", posix_methods);
    }
\end{verbatim}
| 248 | |
| 249 | (The actual initposix() is somewhat more complicated, but most |
| 250 | extension modules are indeed as simple as that.) When the Python |
| 251 | program first imports module 'posix', initposix() is called, which |
| 252 | calls initmodule() with specific parameters. This creates a module |
| 253 | object (which is inserted in the table sys.modules under the key |
| 254 | 'posix'), and adds built-in-function objects to the newly created |
| 255 | module based upon the table (of type struct methodlist) that was |
| 256 | passed as its second parameter. The function initmodule() returns a |
| 257 | pointer to the module object that it creates, but this is unused here. |
| 258 | It aborts with a fatal error if the module could not be initialized |
| 259 | satisfactorily. |
| 260 | |
| 261 | |
| 262 | \section{Calling the module initialization function} |
| 263 | |
There is one more thing to do: telling the Python interpreter to call
the initfoo() function when it encounters an 'import foo' statement.
This is done in the file config.c.  This file contains a table mapping
module names to parameterless void function pointers.  You need to add
a declaration of initfoo() somewhere early in the file, and a line
saying
| 270 | |
\begin{verbatim}
        {"foo",     initfoo},
\end{verbatim}
| 274 | |
| 275 | to the initializer for inittab[]. It is conventional to include both |
| 276 | the declaration and the initializer line in preprocessor commands |
| 277 | \verb\#ifdef USE_FOO\ / \verb\#endif\, to make it easy to turn the foo |
| 278 | extension on or off. Note that the Macintosh version uses a different |
| 279 | configuration file, distributed as configmac.c. This strategy may be |
| 280 | extended to other operating system versions, although usually the |
| 281 | standard config.c file gives a pretty useful starting point for a new |
| 282 | config*.c file. |
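
The table lookup itself is simple.  The sketch below, in plain ANSI C,
shows the idea with a made-up table; struct toy_inittab, toy_table and
toy_import() are hypothetical names, and the real import machinery in
the interpreter does considerably more (for instance, it initializes a
module only on the first import).

\begin{verbatim}
    #include <stddef.h>
    #include <string.h>

    static int foo_initialized = 0;

    /* Stand-in for a module initialization function */
    void initfoo(void) { foo_initialized = 1; }

    /* Hypothetical table entry: module name, init function */
    struct toy_inittab {
        const char *name;
        void (*initfunc)(void);
    };

    static struct toy_inittab toy_table[] = {
        {"foo",     initfoo},
        {NULL,      NULL}        /* Sentinel */
    };

    /* Look up a module by name and call its init function.
       Returns 1 if the name was found, 0 otherwise. */
    int toy_import(const char *name)
    {
        struct toy_inittab *p;
        for (p = toy_table; p->name != NULL; p++) {
            if (strcmp(p->name, name) == 0) {
                p->initfunc();
                return 1;
            }
        }
        return 0;
    }
\end{verbatim}

The sentinel entry plays the same role as in the method table shown
earlier: it marks the end of the array, so no separate count is needed.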
| 283 | |
| 284 | And, of course, I forgot the Makefile. This is actually not too hard, |
| 285 | just follow the examples for, say, AMOEBA. Just find all occurrences |
| 286 | of the string AMOEBA in the Makefile and do the same for FOO that's |
| 287 | done for AMOEBA... |
| 288 | |
| 289 | (Note: if you are using dynamic loading for your extension, you don't |
| 290 | need to edit config.c and the Makefile. See "./DYNLOAD" for more info |
| 291 | about this.) |
| 292 | |
| 293 | |
| 294 | \section{Calling Python functions from C} |
| 295 | |
The above concentrates on making C functions accessible to the Python
programmer.  The reverse is also often useful: calling Python
functions from C.  This is especially the case for libraries that
support so-called ``callback'' functions.  If a C interface makes heavy
use of callbacks, the equivalent Python interface often needs to
provide a callback mechanism to the Python programmer; the
implementation may require calling the Python callback functions from
a C callback.  Other uses are also possible.
| 304 | |
| 305 | Fortunately, the Python interpreter is easily called recursively, and |
| 306 | there is a standard interface to call a Python function. I won't |
| 307 | dwell on how to call the Python parser with a particular string as |
| 308 | input -- if you're interested, have a look at the implementation of |
| 309 | the "-c" command line option in pythonmain.c. |
| 310 | |
Calling a Python function is easy.  First, the Python program must
somehow pass you the Python function object.  You should provide a
function (or some other interface) to do this.  When this function is
called, save a pointer to the Python function object (be careful to
INCREF it!) in a global variable -- or wherever you see fit.
For example, the following function might be part of a module
definition:
| 318 | |
\begin{verbatim}
    static object *my_callback;

    static object *
    my_set_callback(dummy, arg)
        object *dummy, *arg;
    {
        XDECREF(my_callback); /* Dispose of previous callback */
        my_callback = arg;
        XINCREF(my_callback); /* Remember new callback */
        /* Boilerplate for "void" return */
        INCREF(None);
        return None;
    }
\end{verbatim}
| 334 | |
| 335 | Later, when it is time to call the function, you call the C function |
| 336 | call_object(). This function has two arguments, both pointers to |
| 337 | arbitrary Python objects: the Python function, and the argument. The |
| 338 | argument can be NULL to call the function without arguments. For |
| 339 | example: |
| 340 | |
\begin{verbatim}
    object *result;
    ...
    /* Time to call the callback */
    result = call_object(my_callback, (object *)NULL);
\end{verbatim}
| 347 | |
| 348 | call_object() returns a Python object pointer: this is |
| 349 | the return value of the Python function. call_object() is |
| 350 | "reference-count-neutral" with respect to its arguments, but the |
| 351 | return value is "new": either it is a brand new object, or it is an |
| 352 | existing object whose reference count has been incremented. So, you |
| 353 | should somehow apply DECREF to the result, even (especially!) if you |
| 354 | are not interested in its value. |
| 355 | |
| 356 | Before you do this, however, it is important to check that the return |
| 357 | value isn't NULL. If it is, the Python function terminated by raising |
| 358 | an exception. If the C code that called call_object() is called from |
| 359 | Python, it should now return an error indication to its Python caller, |
| 360 | so the interpreter can print a stack trace, or the calling Python code |
| 361 | can handle the exception. If this is not possible or desirable, the |
| 362 | exception should be cleared by calling err_clear(). For example: |
| 363 | |
\begin{verbatim}
    if (result == NULL)
        return NULL; /* Pass error back */
    /* Here maybe use the result */
    DECREF(result);
\end{verbatim}
| 370 | |
| 371 | Depending on the desired interface to the Python callback function, |
| 372 | you may also have to provide an argument to call_object(). In some |
| 373 | cases the argument is also provided by the Python program, through the |
| 374 | same interface that specified the callback function. It can then be |
| 375 | saved and used in the same manner as the function object. In other |
| 376 | cases, you may have to construct a new object to pass as argument. In |
| 377 | this case you must dispose of it as well. For example, if you want to |
| 378 | pass an integral event code, you might use the following code: |
| 379 | |
\begin{verbatim}
    object *argument;
    ...
    argument = newintobject((long)eventcode);
    result = call_object(my_callback, argument);
    DECREF(argument);
    if (result == NULL)
        return NULL; /* Pass error back */
    /* Here maybe use the result */
    DECREF(result);
\end{verbatim}
| 391 | |
Note the placement of DECREF(argument) immediately after the call,
before the error check!  Also note that, strictly speaking, this code
is not complete: newintobject() may run out of memory, and this should
be checked.
| 396 | |
In even more complicated cases you may want to pass the callback
function multiple arguments.  To this end you have to construct (and
dispose of!) a tuple object.  Details (mostly concerned with the
error checks and reference count manipulation) are left as an
exercise for the reader; most of this is also needed when returning
multiple values from a function.
| 403 | |
| 404 | XXX TO DO: explain objects and reference counting. |
| 405 | XXX TO DO: defining new object types. |
| 406 | |
| 407 | |
| 408 | \section{Format strings for getargs()} |
| 409 | |
| 410 | The getargs() function is declared in "modsupport.h" as follows: |
| 411 | |
\begin{verbatim}
    int getargs(object *arg, char *format, ...);
\end{verbatim}
| 415 | |
| 416 | The remaining arguments must be addresses of variables whose type is |
| 417 | determined by the format string. For the conversion to succeed, the |
| 418 | `arg' object must match the format and the format must be exhausted. |
| 419 | Note that while getargs() checks that the Python object really is of |
| 420 | the specified type, it cannot check that the addresses provided in the |
| 421 | call match: if you make mistakes there, your code will probably dump |
| 422 | core. |
| 423 | |
A format string consists of a single `format unit'.  A format unit
describes one Python object; it is usually a single character or a
parenthesized string.  The type of a format unit is determined from
its first character, the `format letter':
| 428 | |
| 429 | 's' (string) |
| 430 | The Python object must be a string object. The C argument |
| 431 | must be a char** (i.e., the address of a character pointer), |
| 432 | and a pointer to the C string contained in the Python object |
| 433 | is stored into it. If the next character in the format string |
| 434 | is \verb\'#'\, another C argument of type int* must be present, and |
| 435 | the length of the Python string (not counting the trailing |
| 436 | zero byte) is stored into it. |
| 437 | |
'z' (string or zero, i.e., NULL)
    Like 's', but the object may also be None.  In this case the
    string pointer is set to NULL and if a \verb\'#'\ is present the size
    is set to 0.
| 442 | |
| 443 | 'b' (byte, i.e., char interpreted as tiny int) |
| 444 | The object must be a Python integer. The C argument must be a |
| 445 | char*. |
| 446 | |
| 447 | 'h' (half, i.e., short) |
| 448 | The object must be a Python integer. The C argument must be a |
| 449 | short*. |
| 450 | |
| 451 | 'i' (int) |
| 452 | The object must be a Python integer. The C argument must be |
| 453 | an int*. |
| 454 | |
| 455 | 'l' (long) |
| 456 | The object must be a (plain!) Python integer. The C argument |
| 457 | must be a long*. |
| 458 | |
| 459 | 'c' (char) |
| 460 | The Python object must be a string of length 1. The C |
| 461 | argument must be a char*. (Don't pass an int*!) |
| 462 | |
| 463 | 'f' (float) |
| 464 | The object must be a Python int or float. The C argument must |
| 465 | be a float*. |
| 466 | |
| 467 | 'd' (double) |
| 468 | The object must be a Python int or float. The C argument must |
| 469 | be a double*. |
| 470 | |
| 471 | 'S' (string object) |
| 472 | The object must be a Python string. The C argument must be an |
| 473 | object** (i.e., the address of an object pointer). The C |
| 474 | program thus gets back the actual string object that was |
| 475 | passed, not just a pointer to its array of characters and its |
| 476 | size as for format character 's'. |
| 477 | |
'O' (object)
    The object can be any Python object, including None, but not
    NULL.  The C argument must be an object**.  This can be used
    if an argument list must contain objects of a type for which
    no format letter exists: the caller must then check that it has
    the right type.
| 484 | |
'(' (tuple)
    The object must be a Python tuple.  Following the '('
    character in the format string must come a number of format
    units describing the elements of the tuple, followed by a ')'
    character.  Tuple format units may be nested.  (There are no
    exceptions for empty and singleton tuples; "()" specifies an
    empty tuple and "(i)" a singleton of one integer.  Normally
    you don't want to use the latter, since it is hard for the
    user to specify.)
| 494 | |
| 495 | |
More format characters will probably be added as the need arises.  It
should be allowed to use Python long integers wherever integers are
expected, and perform a range check.  (A range check is in fact always
necessary for the 'b', 'h' and 'i' format letters, but this is
currently not implemented.)
| 501 | |
| 502 | |
| 503 | Some example calls: |
| 504 | |
\begin{verbatim}
    int ok;
    int i, j;
    long k, l;
    char *s;
    int size;

    ok = getargs(args, "(lls)", &k, &l, &s); /* Two longs and a string */
        /* Possible Python call: f(1, 2, 'three') */

    ok = getargs(args, "s", &s); /* A string */
        /* Possible Python call: f('whoops!') */

    ok = getargs(args, ""); /* No arguments */
        /* Python call: f() */

    ok = getargs(args, "((ii)s#)", &i, &j, &s, &size);
        /* A pair of ints and a string, whose size is also returned */
        /* Possible Python call: f((1, 2), 'three') */

    {
        int left, top, right, bottom, h, v;
        ok = getargs(args, "(((ii)(ii))(ii))",
                 &left, &top, &right, &bottom, &h, &v);
                 /* A rectangle and a point */
                 /* Possible Python call:
                    f( ((0, 0), (400, 300)), (10, 10)) */
    }
\end{verbatim}
| 534 | |
| 535 | Note that a format string must consist of a single unit; strings like |
| 536 | \verb\'is'\ and \verb\'(ii)s#'\ are not valid format strings. (But |
| 537 | \verb\'s#'\ is.) |
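
The single-unit rule is easy to check mechanically.  The following
sketch in plain ANSI C accepts exactly the format letters listed
above; it is not how getargs() itself is implemented, and the names
skip_unit() and is_single_unit() are made up for illustration.  I
assume here that the empty string is also acceptable, since it is used
above for a function without arguments.

\begin{verbatim}
    #include <string.h>

    /* Skip one format unit starting at *p and advance *p past it.
       Returns 1 on success, 0 if the unit is malformed. */
    static int skip_unit(const char **p)
    {
        char c = **p;
        if (c == '\0')
            return 0;
        (*p)++;
        if (c == '(') {
            while (**p != ')') {
                if (!skip_unit(p))
                    return 0; /* also catches a missing ')' */
            }
            (*p)++; /* consume the ')' */
            return 1;
        }
        if (strchr("bhilcfdSO", c) != NULL)
            return 1;
        if (c == 's' || c == 'z') {
            if (**p == '#')
                (*p)++; /* optional length modifier */
            return 1;
        }
        return 0; /* unknown format letter */
    }

    /* Returns 1 if format is empty or exactly one format unit */
    int is_single_unit(const char *format)
    {
        const char *p = format;
        if (*p == '\0')
            return 1; /* assumption: "" means no arguments */
        return skip_unit(&p) && *p == '\0';
    }
\end{verbatim}

A tuple unit is handled by recursion, which is why nesting to any
depth works without extra bookkeeping.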
| 538 | |
| 539 | |
| 540 | The getargs() function does not support variable-length argument |
| 541 | lists. In simple cases you can fake these by trying several calls to |
| 542 | getargs() until one succeeds, but you must take care to call |
| 543 | err_clear() before each retry. For example: |
| 544 | |
\begin{verbatim}
    static object *my_method(self, args) object *self, *args; {
        int i, j, k;

        if (getargs(args, "(ii)", &i, &j)) {
            k = 0; /* Use default third argument */
        }
        else {
            err_clear();
            if (!getargs(args, "(iii)", &i, &j, &k))
                return NULL;
        }
        /* ... use i, j and k here ... */
        INCREF(None);
        return None;
    }
\end{verbatim}
| 562 | |
(It is possible to think of an extension to the definition of format
strings to accommodate this directly, e.g., placing a '|' in a tuple
might specify that the remaining arguments are optional.  getargs()
should then return 1 + the number of variables stored into.)
| 567 | |
| 568 | |
| 569 | Advanced users note: If you set the `varargs' flag in the method list |
| 570 | for a function, the argument will always be a tuple (the `raw argument |
| 571 | list'). In this case you must enclose single and empty argument lists |
| 572 | in parentheses, e.g., "(s)" and "()". |
| 573 | |
| 574 | |
| 575 | \section{The mkvalue() function} |
| 576 | |
| 577 | This function is the counterpart to getargs(). It is declared in |
| 578 | "modsupport.h" as follows: |
| 579 | |
\begin{verbatim}
    object *mkvalue(char *format, ...);
\end{verbatim}
| 583 | |
| 584 | It supports exactly the same format letters as getargs(), but the |
| 585 | arguments (which are input to the function, not output) must not be |
| 586 | pointers, just values. If a byte, short or float is passed to a |
| 587 | varargs function, it is widened by the compiler to int or double, so |
| 588 | 'b' and 'h' are treated as 'i' and 'f' is treated as 'd'. 'S' is |
| 589 | treated as 'O', 's' is treated as 'z'. \verb\'z#'\ and \verb\'s#'\ |
| 590 | are supported: a second argument specifies the length of the data |
| 591 | (negative means use strlen()). 'S' and 'O' add a reference to their |
| 592 | argument (so you should DECREF it if you've just created it and aren't |
| 593 | going to use it again). |
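
The widening is a property of C itself and can be observed without
Python.  In the sketch below (plain ANSI C using stdarg.h; sum_values()
is a made-up helper, not part of modsupport.h) the letters 'b', 'h'
and 'i' all fetch an int, and 'f' and 'd' both fetch a double:

\begin{verbatim}
    #include <stdarg.h>

    /* Hypothetical helper: adds up the variable arguments, whose
       types are described by one letter each in format. */
    double sum_values(const char *format, ...)
    {
        va_list ap;
        double total = 0.0;

        va_start(ap, format);
        for (; *format != '\0'; format++) {
            switch (*format) {
            case 'b': case 'h': case 'i':
                /* char and short arrive widened to int */
                total += va_arg(ap, int);
                break;
            case 'l':
                total += va_arg(ap, long);
                break;
            case 'f': case 'd':
                /* float arrives widened to double */
                total += va_arg(ap, double);
                break;
            default:
                va_end(ap);
                return total; /* unknown letter: stop */
            }
        }
        va_end(ap);
        return total;
    }
\end{verbatim}

Fetching a short or a float directly with va_arg() would be wrong,
which is why a function of this kind cannot tell 'h' from 'i' or 'f'
from 'd'.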
| 594 | |
| 595 | If the argument for 'O' or 'S' is a NULL pointer, it is assumed that |
| 596 | this was caused because the call producing the argument found an error |
| 597 | and set an exception. Therefore, mkvalue() will return NULL but won't |
| 598 | set an exception if one is already set. If no exception is set, |
| 599 | SystemError is set. |
| 600 | |
| 601 | If there is an error in the format string, the SystemError exception |
| 602 | is set, since it is the calling C code's fault, not that of the Python |
| 603 | user who sees the exception. |
| 604 | |
| 605 | Example: |
| 606 | |
\begin{verbatim}
        return mkvalue("(ii)", 0, 0);
\end{verbatim}
| 610 | |
| 611 | returns a tuple containing two zeros. (Outer parentheses in the |
| 612 | format string are actually superfluous, but you can use them for |
| 613 | compatibility with getargs(), which requires them if more than one |
| 614 | argument is expected.) |
| 615 | |
| 616 | \section{Reference counts} |
| 617 | |
| 618 | Here's a useful explanation of INCREF and DECREF by Sjoerd Mullender. |
| 619 | |
| 620 | Use XINCREF or XDECREF instead of INCREF/DECREF when the argument may |
| 621 | be NULL. |
| 622 | |
| 623 | The basic idea is, if you create an extra reference to an object, you |
| 624 | must INCREF it, if you throw away a reference to an object, you must |
| 625 | DECREF it. Functions such as newstringobject, newsizedstringobject, |
| 626 | newintobject, etc. create a reference to an object. If you want to |
| 627 | throw away the object thus created, you must use DECREF. |
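
The bookkeeping can be illustrated with a miniature model in plain
ANSI C.  This is only a sketch: toy_object and the toy_* functions are
hypothetical stand-ins, not the interpreter's object header or its
INCREF and DECREF macros.

\begin{verbatim}
    #include <stdlib.h>

    /* Hypothetical miniature object; the real header differs */
    typedef struct toy_object {
        int refcnt;
        int value;
    } toy_object;

    static int toy_live = 0; /* objects created and not yet freed */

    /* Create an object; the caller owns the one reference */
    toy_object *toy_new(int value)
    {
        toy_object *op = malloc(sizeof *op);
        if (op == NULL)
            return NULL;
        op->refcnt = 1;
        op->value = value;
        toy_live++;
        return op;
    }

    void toy_incref(toy_object *op) { op->refcnt++; }

    /* Drop a reference; free the object when none are left */
    void toy_decref(toy_object *op)
    {
        if (--op->refcnt == 0) {
            toy_live--;
            free(op);
        }
    }

    /* A global that keeps its own reference to an object */
    static toy_object *toy_slot = NULL;

    void toy_set_slot(toy_object *op)
    {
        if (op != NULL)
            toy_incref(op);       /* extra reference: INCREF */
        if (toy_slot != NULL)
            toy_decref(toy_slot); /* old reference thrown away */
        toy_slot = op;
    }

    int toy_live_count(void) { return toy_live; }
\end{verbatim}

After toy_new() and toy_set_slot() the count is 2; the creator can
then drop its own reference and the object stays alive until the slot
is overwritten.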
| 628 | |
| 629 | If you put an object into a tuple, list, or dictionary, the idea is |
| 630 | that you usually don't want to keep a reference of your own around, so |
| 631 | Python does not INCREF the elements. It does DECREF the old value. |
| 632 | This means that if you put something into such an object using the |
| 633 | functions Python provides for this, you must INCREF the object if you |
| 634 | want to keep a separate reference to the object around. Also, if you |
| 635 | replace an element, you should INCREF the old element first if you |
| 636 | want to keep it. If you didn't INCREF it before you replaced it, you |
| 637 | are not allowed to look at it anymore, since it may have been freed. |
| 638 | |
| 639 | Returning an object to Python (i.e., when your module function |
| 640 | returns) creates a reference to an object, but it does not change the |
| 641 | reference count. When your module does not keep another reference to |
| 642 | the object, you should not INCREF or DECREF it. When you do keep a |
| 643 | reference around, you should INCREF the object. Also, when you return |
| 644 | a global object such as None, you should INCREF it. |
| 645 | |
| 646 | If you want to return a tuple, you should consider using mkvalue. |
| 647 | Mkvalue creates a new tuple with a reference count of 1 which you can |
| 648 | return. If any of the elements you put into the tuple are objects, |
| 649 | they are INCREFfed by mkvalue. If you don't want to keep references |
| 650 | to those elements around, you should DECREF them after having called |
| 651 | mkvalue. |
| 652 | |
| 653 | Usually you don't have to worry about arguments. They are INCREFfed |
| 654 | before your function is called and DECREFfed after your function |
| 655 | returns. When you keep a reference to an argument, you should INCREF |
| 656 | it and DECREF when you throw it away. Also, when you return an |
| 657 | argument, you should INCREF it, because returning the argument creates |
| 658 | an extra reference to it. |
| 659 | |
| 660 | If you use getargs() to parse the arguments, you can get a reference |
| 661 | to an object (by using "O" in the format string). This object was not |
| 662 | INCREFfed, so you should not DECREF it. If you want to keep the |
| 663 | object, you must INCREF it yourself. |
| 664 | |
If you create your own type of objects, you should use NEWOBJ to
create the object.  This sets the reference count to 1.  If you want
to throw away the object, you should use DECREF.  When the reference
count reaches 0, the dealloc function is called.  In it, you should
DECREF all objects to which you keep references in your object, but you
should not use DECREF on your object.  You should use DEL instead.
| 671 | |
| 672 | \chapter{Embedding Python in another application} |
| 673 | |
Embedding Python is similar to extending it, but not quite.  The
difference is that when you extend Python, the main program of the
application is still the Python interpreter, while if you embed
Python, the main program may have nothing to do with Python --
instead, some parts of the application occasionally call the Python
interpreter to run some Python code.
| 680 | |
| 681 | So if you are embedding Python, you are providing your own main |
| 682 | program. One of the things this main program has to do is initialize |
| 683 | the Python interpreter. At the very least, you have to call the |
| 684 | function initall(). There are optional calls to pass command line |
| 685 | arguments to Python. Then later you can call the interpreter from any |
| 686 | part of the application. |
| 687 | |
| 688 | There are several different ways to call the interpreter: you can pass |
| 689 | a string containing Python statements to run_command(), or you can |
| 690 | pass a stdio file pointer and a file name (for identification in error |
| 691 | messages only) to run_script(). You can also call the lower-level |
| 692 | operations described (partly) in the file \verb\<pythonroot>/misc/EXTENDING\ |
| 693 | to construct and use Python objects. |
| 694 | |
| 695 | A simple demo of embedding Python can be found in the directory |
| 696 | \verb\<pythonroot>/embed/\. |
| 697 | |
| 698 | \section{Using C++} |
| 699 | |
| 700 | It is also possible to embed Python in a C++ program; how this is done |
| 701 | exactly will depend on the details of the C++ system used; in general |
| 702 | you will need to write the main program in C++, enclosing the include |
| 703 | files in \verb\"extern "C" { ... }"\, and compile and link this with |
| 704 | the C++ compiler. (There is no need to recompile Python itself with |
| 705 | C++.) |
| 706 | |
| 707 | \input{ext.ind} |
| 708 | |
| 709 | \end{document} |