\chapter{Initialization, Finalization, and Threads
         \label{initialization}}

\begin{cfuncdesc}{void}{Py_Initialize}{}
  Initialize the Python interpreter. In an application embedding
  Python, this should be called before using any other Python/C API
  functions, with the exception of
  \cfunction{Py_SetProgramName()}\ttindex{Py_SetProgramName()},
  \cfunction{PyEval_InitThreads()}\ttindex{PyEval_InitThreads()},
  \cfunction{PyEval_ReleaseLock()}\ttindex{PyEval_ReleaseLock()},
  and \cfunction{PyEval_AcquireLock()}\ttindex{PyEval_AcquireLock()}.
  This initializes the table of loaded modules (\code{sys.modules}),
  and\withsubitem{(in module sys)}{\ttindex{modules}\ttindex{path}}
  creates the fundamental modules
  \module{__builtin__}\refbimodindex{__builtin__},
  \module{__main__}\refbimodindex{__main__} and
  \module{sys}\refbimodindex{sys}. It also initializes the module
  search\indexiii{module}{search}{path} path (\code{sys.path}).
  It does not set \code{sys.argv}; use
  \cfunction{PySys_SetArgv()}\ttindex{PySys_SetArgv()} for that. This
  is a no-op when called for a second time (without calling
  \cfunction{Py_Finalize()}\ttindex{Py_Finalize()} first). There is
  no return value; it is a fatal error if the initialization fails.
\end{cfuncdesc}

\begin{cfuncdesc}{int}{Py_IsInitialized}{}
  Return true (nonzero) when the Python interpreter has been
  initialized, false (zero) if not. After \cfunction{Py_Finalize()}
  is called, this returns false until \cfunction{Py_Initialize()} is
  called again.
\end{cfuncdesc}

\begin{cfuncdesc}{void}{Py_Finalize}{}
  Undo all initializations made by \cfunction{Py_Initialize()} and
  subsequent use of Python/C API functions, and destroy all
  sub-interpreters (see \cfunction{Py_NewInterpreter()} below) that
  were created and not yet destroyed since the last call to
  \cfunction{Py_Initialize()}. Ideally, this frees all memory
  allocated by the Python interpreter. This is a no-op when called
  for a second time (without calling \cfunction{Py_Initialize()} again
  first). There is no return value; errors during finalization are
  ignored.

  This function is provided for a number of reasons. An embedding
  application might want to restart Python without having to restart
  the application itself. An application that has loaded the Python
  interpreter from a dynamically loadable library (or DLL) might want
  to free all memory allocated by Python before unloading the
  DLL. During a hunt for memory leaks in an application, a developer
  might want to free all memory allocated by Python before exiting
  from the application.

  \strong{Bugs and caveats:} The destruction of modules and objects in
  modules is done in arbitrary order; this may cause destructors
  (\method{__del__()} methods) to fail when they depend on other
  objects (even functions) or modules. Dynamically loaded extension
  modules loaded by Python are not unloaded. Small amounts of memory
  allocated by the Python interpreter may not be freed (if you find a
  leak, please report it). Memory tied up in circular references
  between objects is not freed. Some memory allocated by extension
  modules may not be freed. Some extensions may not work properly if
  their initialization routine is called more than once; this can
  happen if an application calls \cfunction{Py_Initialize()} and
  \cfunction{Py_Finalize()} more than once.
\end{cfuncdesc}

\begin{cfuncdesc}{PyThreadState*}{Py_NewInterpreter}{}
  Create a new sub-interpreter. This is an (almost) totally separate
  environment for the execution of Python code. In particular, the
  new interpreter has separate, independent versions of all imported
  modules, including the fundamental modules
  \module{__builtin__}\refbimodindex{__builtin__},
  \module{__main__}\refbimodindex{__main__} and
  \module{sys}\refbimodindex{sys}. The table of loaded modules
  (\code{sys.modules}) and the module search path (\code{sys.path})
  are also separate. The new environment has no \code{sys.argv}
  variable. It has new standard I/O stream file objects
  \code{sys.stdin}, \code{sys.stdout} and \code{sys.stderr} (however
  these refer to the same underlying \ctype{FILE} structures in the C
  library).
  \withsubitem{(in module sys)}{
    \ttindex{stdout}\ttindex{stderr}\ttindex{stdin}}

  The return value points to the first thread state created in the new
  sub-interpreter. This thread state is made the current thread
  state. Note that no actual thread is created; see the discussion of
  thread states below. If creation of the new interpreter is
  unsuccessful, \NULL{} is returned; no exception is set since the
  exception state is stored in the current thread state and there may
  not be a current thread state. (Like all other Python/C API
  functions, the global interpreter lock must be held before calling
  this function and is still held when it returns; however, unlike
  most other Python/C API functions, there needn't be a current thread
  state on entry.)

  Extension modules are shared between (sub-)interpreters as follows:
  the first time a particular extension is imported, it is initialized
  normally, and a (shallow) copy of its module's dictionary is
  squirreled away. When the same extension is imported by another
  (sub-)interpreter, a new module is initialized and filled with the
  contents of this copy; the extension's \code{init\var{module}}
  function is not called. Note that this is different from what
  happens when an extension is imported after the interpreter has been
  completely re-initialized by calling
  \cfunction{Py_Finalize()}\ttindex{Py_Finalize()} and
  \cfunction{Py_Initialize()}\ttindex{Py_Initialize()}; in that case,
  the extension's \code{init\var{module}} function \emph{is} called
  again.

  \strong{Bugs and caveats:} Because sub-interpreters (and the main
  interpreter) are part of the same process, the insulation between
  them isn't perfect --- for example, using low-level file operations
  like \withsubitem{(in module os)}{\ttindex{close()}}
  \function{os.close()} they can (accidentally or maliciously) affect
  each other's open files. Because of the way extensions are shared
  between (sub-)interpreters, some extensions may not work properly;
  this is especially likely when the extension makes use of (static)
  global variables, or when the extension manipulates its module's
  dictionary after its initialization. It is possible to insert
  objects created in one sub-interpreter into a namespace of another
  sub-interpreter; this should be done with great care to avoid
  sharing user-defined functions, methods, instances or classes
  between sub-interpreters, since import operations executed by such
  objects may affect the wrong (sub-)interpreter's dictionary of
  loaded modules. (XXX This is a hard-to-fix bug that will be
  addressed in a future release.)
\end{cfuncdesc}

\begin{cfuncdesc}{void}{Py_EndInterpreter}{PyThreadState *tstate}
  Destroy the (sub-)interpreter represented by the given thread state.
  The given thread state must be the current thread state. See the
  discussion of thread states below. When the call returns, the
  current thread state is \NULL. All thread states associated with
  this interpreter are destroyed. (The global interpreter lock must
  be held before calling this function and is still held when it
  returns.) \cfunction{Py_Finalize()}\ttindex{Py_Finalize()} will
  destroy all sub-interpreters that haven't been explicitly destroyed
  at that point.
\end{cfuncdesc}

\begin{cfuncdesc}{void}{Py_SetProgramName}{char *name}
  This function should be called before
  \cfunction{Py_Initialize()}\ttindex{Py_Initialize()} is called
  for the first time, if it is called at all. It tells the
  interpreter the value of the \code{argv[0]} argument to the
  \cfunction{main()}\ttindex{main()} function of the program. This is
  used by \cfunction{Py_GetPath()}\ttindex{Py_GetPath()} and some
  other functions below to find the Python run-time libraries relative
  to the interpreter executable. The default value is
  \code{'python'}. The argument should point to a zero-terminated
  character string in static storage whose contents will not change
  for the duration of the program's execution. No code in the Python
  interpreter will change the contents of this storage.
\end{cfuncdesc}

\begin{cfuncdesc}{char*}{Py_GetProgramName}{}
  Return the program name set with
  \cfunction{Py_SetProgramName()}\ttindex{Py_SetProgramName()}, or the
  default. The returned string points into static storage; the caller
  should not modify its value.
\end{cfuncdesc}

\begin{cfuncdesc}{char*}{Py_GetPrefix}{}
  Return the \emph{prefix} for installed platform-independent files.
  This is derived through a number of complicated rules from the
  program name set with \cfunction{Py_SetProgramName()} and some
  environment variables; for example, if the program name is
  \code{'/usr/local/bin/python'}, the prefix is \code{'/usr/local'}.
  The returned string points into static storage; the caller should
  not modify its value. This corresponds to the \makevar{prefix}
  variable in the top-level \file{Makefile} and the
  \longprogramopt{prefix} argument to the \program{configure} script
  at build time. The value is available to Python code as
  \code{sys.prefix}. It is only useful on \UNIX. See also the next
  function.
\end{cfuncdesc}
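
The single example given above (\code{'/usr/local/bin/python'} yields
\code{'/usr/local'}) can be mimicked by dropping the last two path
components. The following is only a rough sketch of that one case:
the helper \cfunction{guess_prefix()} is hypothetical and not part of
the Python/C API, and the interpreter's actual rules also consult
environment variables and are considerably more involved.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the Python/C API): copy `program`
   into `out` and strip its final two path components, e.g. "/bin" and
   "/python".  Returns 0 on success, -1 if the path does not fit in
   `out` or cannot be reduced to a non-empty prefix. */
int guess_prefix(const char *program, char *out, size_t outsz)
{
    size_t len = strlen(program);
    int i;

    if (len + 1 > outsz)
        return -1;
    memcpy(out, program, len + 1);

    for (i = 0; i < 2; i++) {
        char *slash = strrchr(out, '/');
        if (slash == NULL || slash == out)
            return -1;
        *slash = '\0';          /* cut off the last component */
    }
    return 0;
}
```

The sketch copies its input rather than editing it in place, in keeping
with the rule that strings returned by these functions must not be
modified.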

\begin{cfuncdesc}{char*}{Py_GetExecPrefix}{}
  Return the \emph{exec-prefix} for installed
  platform-\emph{de}pendent files. This is derived through a number
  of complicated rules from the program name set with
  \cfunction{Py_SetProgramName()} and some environment variables; for
  example, if the program name is \code{'/usr/local/bin/python'}, the
  exec-prefix is \code{'/usr/local'}. The returned string points into
  static storage; the caller should not modify its value. This
  corresponds to the \makevar{exec_prefix} variable in the top-level
  \file{Makefile} and the \longprogramopt{exec-prefix} argument to the
  \program{configure} script at build time. The value is available
  to Python code as \code{sys.exec_prefix}. It is only useful on
  \UNIX.

  Background: The exec-prefix differs from the prefix when platform
  dependent files (such as executables and shared libraries) are
  installed in a different directory tree. In a typical installation,
  platform dependent files may be installed in the
  \file{/usr/local/plat} subtree while platform independent files may
  be installed in \file{/usr/local}.

  Generally speaking, a platform is a combination of hardware and
  software families, e.g. Sparc machines running the Solaris 2.x
  operating system are considered the same platform, but Intel
  machines running Solaris 2.x are another platform, and Intel
  machines running Linux are yet another platform. Different major
  revisions of the same operating system generally also form different
  platforms. Non-\UNIX{} operating systems are a different story; the
  installation strategies on those systems are so different that the
  prefix and exec-prefix are meaningless, and set to the empty string.
  Note that compiled Python bytecode files are platform independent
  (but not independent from the Python version by which they were
  compiled!).

  System administrators will know how to configure the \program{mount}
  or \program{automount} programs to share \file{/usr/local} between
  platforms while having \file{/usr/local/plat} be a different
  filesystem for each platform.
\end{cfuncdesc}

\begin{cfuncdesc}{char*}{Py_GetProgramFullPath}{}
  Return the full program name of the Python executable; this is
  computed as a side-effect of deriving the default module search path
  from the program name (set by
  \cfunction{Py_SetProgramName()}\ttindex{Py_SetProgramName()} above).
  The returned string points into static storage; the caller should
  not modify its value. The value is available to Python code as
  \code{sys.executable}.
  \withsubitem{(in module sys)}{\ttindex{executable}}
\end{cfuncdesc}

\begin{cfuncdesc}{char*}{Py_GetPath}{}
  \indexiii{module}{search}{path}
  Return the default module search path; this is computed from the
  program name (set by \cfunction{Py_SetProgramName()} above) and some
  environment variables. The returned string consists of a series of
  directory names separated by a platform dependent delimiter
  character. The delimiter character is \character{:} on \UNIX,
  \character{;} on DOS/Windows, and \character{\e n} (the \ASCII{}
  newline character) on Macintosh. The returned string points into
  static storage; the caller should not modify its value. The value
  is available to Python code as the list
  \code{sys.path}\withsubitem{(in module sys)}{\ttindex{path}}, which
  may be modified to change the future search path for loaded
  modules.

  % XXX should give the exact rules
\end{cfuncdesc}
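
Because the returned string must not be modified, code that wants the
individual directories has to copy them out rather than split the
string in place. A minimal sketch of one way to do that follows; the
helper \cfunction{path_entry()} and the sample paths in the usage note
are hypothetical, and the caller is assumed to pass the delimiter that
applies to its platform.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the Python/C API): copy the
   index-th directory of a delimiter-separated search path into `out`.
   `delim` must not be '\0'.  Returns 0 on success, -1 if there is no
   such entry or it does not fit in `out`.  `path` is never modified. */
int path_entry(const char *path, char delim, size_t index,
               char *out, size_t outsz)
{
    const char *start = path;

    for (;;) {
        const char *end = strchr(start, delim);
        size_t len = end ? (size_t)(end - start) : strlen(start);

        if (index == 0) {
            if (len + 1 > outsz)
                return -1;
            memcpy(out, start, len);
            out[len] = '\0';
            return 0;
        }
        if (end == NULL)        /* ran out of entries */
            return -1;
        index--;
        start = end + 1;        /* continue after the delimiter */
    }
}
```

For instance, with the made-up path
\code{"/usr/lib/python:/usr/lib/python/plat"} and the \UNIX{} delimiter,
index 0 selects \code{"/usr/lib/python"} and index 2 reports that no
such entry exists.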

\begin{cfuncdesc}{const char*}{Py_GetVersion}{}
  Return the version of this Python interpreter. This is a string
  that looks something like

\begin{verbatim}
"1.5 (#67, Dec 31 1997, 22:34:28) [GCC 2.7.2.2]"
\end{verbatim}

  The first word (up to the first space character) is the current
  Python version; the first three characters are the major and minor
  version separated by a period. The returned string points into
  static storage; the caller should not modify its value. The value
  is available to Python code as \code{sys.version}.
  \withsubitem{(in module sys)}{\ttindex{version}}
\end{cfuncdesc}
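
The rule for the first word can be applied with a few lines of C. The
sketch below copies everything before the first space into a
caller-supplied buffer; the helper name \cfunction{version_word()} is
hypothetical and not part of the Python/C API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the Python/C API): copy the first
   word of a version string (everything before the first space) into
   `out`.  Returns 0 on success, -1 if the word does not fit. */
int version_word(const char *version, char *out, size_t outsz)
{
    size_t len = strcspn(version, " ");   /* length up to first space */

    if (len + 1 > outsz)
        return -1;
    memcpy(out, version, len);
    out[len] = '\0';
    return 0;
}
```

Applied to the sample string shown above, the helper yields
\code{"1.5"}.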

\begin{cfuncdesc}{const char*}{Py_GetPlatform}{}
  Return the platform identifier for the current platform. On \UNIX,
  this is formed from the ``official'' name of the operating system,
  converted to lower case, followed by the major revision number;
  e.g., for Solaris 2.x, which is also known as SunOS 5.x, the value
  is \code{'sunos5'}. On Macintosh, it is \code{'mac'}. On Windows,
  it is \code{'win'}. The returned string points into static storage;
  the caller should not modify its value. The value is available to
  Python code as \code{sys.platform}.
  \withsubitem{(in module sys)}{\ttindex{platform}}
\end{cfuncdesc}
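
The \UNIX{} naming rule described above (lower-cased system name
followed by the major revision number) can be illustrated as follows.
This is only a sketch of the rule as stated in the text, not the code
the build process uses; the helper \cfunction{make_platform_id()} is
hypothetical, and the release string \code{"5.8"} in the usage note is
an assumed input.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not part of the Python/C API): build an
   identifier such as "sunos5" from an operating system name ("SunOS")
   and a release string ("5.8").  Returns 0 on success, -1 if the
   result does not fit in `out`. */
int make_platform_id(const char *sysname, const char *release,
                     char *out, size_t outsz)
{
    size_t n = 0;

    /* Lower-case copy of the system name. */
    for (; *sysname != '\0'; sysname++) {
        if (n + 1 >= outsz)
            return -1;
        out[n++] = (char)tolower((unsigned char)*sysname);
    }
    /* Append the leading digits of the release: the major revision. */
    for (; isdigit((unsigned char)*release); release++) {
        if (n + 1 >= outsz)
            return -1;
        out[n++] = *release;
    }
    if (n >= outsz)
        return -1;
    out[n] = '\0';
    return 0;
}
```

Given \code{"SunOS"} and \code{"5.8"}, the helper produces
\code{"sunos5"}, matching the Solaris example in the text.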

\begin{cfuncdesc}{const char*}{Py_GetCopyright}{}
  Return the official copyright string for the current Python version,
  for example

  \code{'Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam'}

  The returned string points into static storage; the caller should
  not modify its value. The value is available to Python code as
  \code{sys.copyright}.
  \withsubitem{(in module sys)}{\ttindex{copyright}}
\end{cfuncdesc}

\begin{cfuncdesc}{const char*}{Py_GetCompiler}{}
  Return an indication of the compiler used to build the current
  Python version, in square brackets, for example:

\begin{verbatim}
"[GCC 2.7.2.2]"
\end{verbatim}

  The returned string points into static storage; the caller should
  not modify its value. The value is available to Python code as part
  of the variable \code{sys.version}.
  \withsubitem{(in module sys)}{\ttindex{version}}
\end{cfuncdesc}

\begin{cfuncdesc}{const char*}{Py_GetBuildInfo}{}
  Return information about the sequence number and build date and time
  of the current Python interpreter instance, for example

\begin{verbatim}
"#67, Aug 1 1997, 22:34:28"
\end{verbatim}

  The returned string points into static storage; the caller should
  not modify its value. The value is available to Python code as part
  of the variable \code{sys.version}.
  \withsubitem{(in module sys)}{\ttindex{version}}
\end{cfuncdesc}
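
Judging from the sample strings in this chapter, the value of
\code{sys.version} is the version number, the build information in
parentheses, and the compiler tag, separated by single spaces. The
sketch below joins three such pieces in that layout. The layout is
inferred from the samples only, and the helper
\cfunction{compose_version()} is hypothetical rather than part of the
Python/C API.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of the Python/C API): join a version
   number, build information and a compiler tag in the layout of the
   sample strings.  Returns 0 on success, -1 if the result was
   truncated or formatting failed. */
int compose_version(const char *number, const char *build_info,
                    const char *compiler, char *out, size_t outsz)
{
    int n = snprintf(out, outsz, "%s (%s) %s",
                     number, build_info, compiler);

    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```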

\begin{cfuncdesc}{void}{PySys_SetArgv}{int argc, char **argv}
  Set \code{sys.argv} based on \var{argc} and \var{argv}. These
  parameters are similar to those passed to the program's
  \cfunction{main()}\ttindex{main()} function with the difference that
  the first entry should refer to the script file to be executed
  rather than the executable hosting the Python interpreter. If there
  isn't a script that will be run, the first entry in \var{argv} can
  be an empty string. If this function fails to initialize
  \code{sys.argv}, a fatal condition is signalled using
  \cfunction{Py_FatalError()}\ttindex{Py_FatalError()}.
  \withsubitem{(in module sys)}{\ttindex{argv}}
  % XXX impl. doesn't seem consistent in allowing 0/NULL for the params;
  % check w/ Guido.
\end{cfuncdesc}
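
An embedding program therefore usually cannot pass its own
\var{argc} and \var{argv} through unchanged. The sketch below selects
the slice that starts at the script name, under the assumption that
the host program is invoked as \samp{host script args...} with no
options of its own; the helper \cfunction{script_args()} is
hypothetical and not part of the Python/C API.

```c
#include <stddef.h>

/* Fallback vector used when no script was named: one empty string. */
static char *no_script[] = { "", NULL };

/* Hypothetical helper (not part of the Python/C API): given main()'s
   argc/argv, return the vector that starts at the script name and
   store its length in *out_argc.  The result aliases argv; nothing is
   copied. */
char **script_args(int argc, char **argv, int *out_argc)
{
    if (argc > 1) {
        *out_argc = argc - 1;   /* skip the host executable's name */
        return argv + 1;
    }
    *out_argc = 1;
    return no_script;
}
```

The returned pair could then be handed to
\cfunction{PySys_SetArgv()} after the interpreter has been
initialized.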

% XXX Other PySys thingies (doesn't really belong in this chapter)

\section{Thread State and the Global Interpreter Lock
         \label{threads}}

\index{global interpreter lock}
\index{interpreter lock}
\index{lock, interpreter}

The Python interpreter is not fully thread safe. In order to support
multi-threaded Python programs, there's a global lock that must be
held by the current thread before it can safely access Python objects.
Without the lock, even the simplest operations could cause problems in
a multi-threaded program: for example, when two threads simultaneously
increment the reference count of the same object, the reference count
could end up being incremented only once instead of twice.
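
The lost increment can be reproduced deterministically by spelling out
the steps of an increment (load the value, add one, store the result)
and interleaving them by hand. The following is a single-threaded toy
model under that assumption, not interpreter code; all names in it are
made up for the illustration.

```c
/* Each simulated "thread" keeps a private copy of the value it loaded. */
struct toy_thread {
    long loaded;
};

static void step_load(struct toy_thread *t, const long *refcnt)
{
    t->loaded = *refcnt;
}

static void step_store(const struct toy_thread *t, long *refcnt)
{
    *refcnt = t->loaded + 1;
}

/* Unprotected schedule: both threads load before either one stores. */
long interleaved_increments(long refcnt)
{
    struct toy_thread a, b;

    step_load(&a, &refcnt);
    step_load(&b, &refcnt);
    step_store(&a, &refcnt);
    step_store(&b, &refcnt);    /* overwrites a's update */
    return refcnt;
}

/* Serialized schedule, as enforced when each increment runs entirely
   while a lock is held. */
long serialized_increments(long refcnt)
{
    struct toy_thread a, b;

    step_load(&a, &refcnt);
    step_store(&a, &refcnt);
    step_load(&b, &refcnt);
    step_store(&b, &refcnt);
    return refcnt;
}
```

Starting from a count of 1, the interleaved schedule ends at 2 while
the serialized one ends at 3.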

Therefore, the rule exists that only the thread that has acquired the
global interpreter lock may operate on Python objects or call Python/C
API functions. So that other threads get a chance to run, the
interpreter regularly releases and reacquires the lock --- by
default, every ten bytecode instructions (this can be changed with
\withsubitem{(in module sys)}{\ttindex{setcheckinterval()}}
\function{sys.setcheckinterval()}). The lock is also released and
reacquired around potentially blocking I/O operations like reading or
writing a file, so that other threads can run while the thread that
requests the I/O is waiting for the I/O operation to complete.

The Python interpreter needs to keep some bookkeeping information
separate per thread --- for this it uses a data structure called
\ctype{PyThreadState}\ttindex{PyThreadState}. This is new in Python
1.5; in earlier versions, such state was stored in global variables,
and switching threads could cause problems. In particular, exception
handling is now thread safe, when the application uses
\withsubitem{(in module sys)}{\ttindex{exc_info()}}
\function{sys.exc_info()} to access the exception last raised in the
current thread.

There's one global variable left, however: the pointer to the current
\ctype{PyThreadState}\ttindex{PyThreadState} structure. While most
thread packages have a way to store ``per-thread global data,''
Python's internal platform independent thread abstraction doesn't
support this yet. Therefore, the current thread state must be
manipulated explicitly.

This is easy enough in most cases. Most code manipulating the global
interpreter lock has the following simple structure:

\begin{verbatim}
Save the thread state in a local variable.
Release the interpreter lock.
...Do some blocking I/O operation...
Reacquire the interpreter lock.
Restore the thread state from the local variable.
\end{verbatim}

This is so common that a pair of macros exists to simplify it:

\begin{verbatim}
Py_BEGIN_ALLOW_THREADS
...Do some blocking I/O operation...
Py_END_ALLOW_THREADS
\end{verbatim}

The \code{Py_BEGIN_ALLOW_THREADS}\ttindex{Py_BEGIN_ALLOW_THREADS} macro
opens a new block and declares a hidden local variable; the
\code{Py_END_ALLOW_THREADS}\ttindex{Py_END_ALLOW_THREADS} macro closes
the block. Another advantage of using these two macros is that when
Python is compiled without thread support, they are defined empty,
thus saving the thread state and lock manipulations.

When thread support is enabled, the block above expands to the
following code:

\begin{verbatim}
    PyThreadState *_save;

    _save = PyEval_SaveThread();
    ...Do some blocking I/O operation...
    PyEval_RestoreThread(_save);
\end{verbatim}

Using even lower level primitives, we can get roughly the same effect
as follows:

\begin{verbatim}
    PyThreadState *_save;

    _save = PyThreadState_Swap(NULL);
    PyEval_ReleaseLock();
    ...Do some blocking I/O operation...
    PyEval_AcquireLock();
    PyThreadState_Swap(_save);
\end{verbatim}

There are some subtle differences; in particular,
\cfunction{PyEval_RestoreThread()}\ttindex{PyEval_RestoreThread()} saves
and restores the value of the global variable
\cdata{errno}\ttindex{errno}, since the lock manipulation does not
guarantee that \cdata{errno} is left alone. Also, when thread support
is disabled,
\cfunction{PyEval_SaveThread()}\ttindex{PyEval_SaveThread()} and
\cfunction{PyEval_RestoreThread()} don't manipulate the lock; in this
case, \cfunction{PyEval_ReleaseLock()}\ttindex{PyEval_ReleaseLock()} and
\cfunction{PyEval_AcquireLock()}\ttindex{PyEval_AcquireLock()} are not
available. This is done so that dynamically loaded extensions
compiled with thread support enabled can be loaded by an interpreter
that was compiled without thread support.

The global interpreter lock is used to protect the pointer to the
current thread state. When releasing the lock and saving the thread
state, the current thread state pointer must be retrieved before the
lock is released (since another thread could immediately acquire the
lock and store its own thread state in the global variable).
Conversely, when acquiring the lock and restoring the thread state,
the lock must be acquired before storing the thread state pointer.
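
The ordering rule in the preceding paragraph can be made explicit with
a small model in which every access to the current-state pointer
asserts that the lock is held. This is a toy sketch with stand-in
names (everything prefixed \code{toy_} is made up here); it models the
ordering only and uses a plain flag in place of a real lock.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, not the interpreter's real data structures. */
struct toy_tstate { int id; };

static int toy_lock_held = 0;                  /* models the global lock   */
static struct toy_tstate *toy_current = NULL;  /* models the current state */

/* The pointer may only be touched while the lock is held. */
static struct toy_tstate *toy_swap(struct toy_tstate *next)
{
    struct toy_tstate *prev = toy_current;

    assert(toy_lock_held);
    toy_current = next;
    return prev;
}

static void toy_acquire(void) { assert(!toy_lock_held); toy_lock_held = 1; }
static void toy_release(void) { assert(toy_lock_held);  toy_lock_held = 0; }

/* Save: fetch and clear the pointer first, then release the lock. */
struct toy_tstate *toy_save_thread(void)
{
    struct toy_tstate *saved = toy_swap(NULL);

    toy_release();
    return saved;
}

/* Restore: acquire the lock first, then store the pointer. */
void toy_restore_thread(struct toy_tstate *saved)
{
    toy_acquire();
    toy_swap(saved);
}
```

Swapping the two statements in either function would trip the
assertion in \cfunction{toy_swap()}, which is the model's way of
flagging an access to the pointer without the lock.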

Why am I going on with so much detail about this? Because when
threads are created from C, they don't have the global interpreter
lock, nor is there a thread state data structure for them. Such
threads must bootstrap themselves into existence, by first creating a
thread state data structure, then acquiring the lock, and finally
storing their thread state pointer, before they can start using the
Python/C API. When they are done, they should reset the thread state
pointer, release the lock, and finally free their thread state data
structure.

When creating a thread state data structure, you need to provide an
interpreter state data structure. The interpreter state data
structure holds global data that is shared by all threads in an
interpreter, for example the module administration
(\code{sys.modules}). Depending on your needs, you can either create
a new interpreter state data structure, or share the interpreter state
data structure used by the Python main thread (to access the latter,
you must obtain the thread state and access its \member{interp} member;
this must be done by a thread that is created by Python or by the main
thread after Python is initialized).


\begin{ctypedesc}{PyInterpreterState}
  This data structure represents the state shared by a number of
  cooperating threads. Threads belonging to the same interpreter
  share their module administration and a few other internal items.
  There are no public members in this structure.

  Threads belonging to different interpreters initially share nothing,
  except process state like available memory, open file descriptors
  and such. The global interpreter lock is also shared by all
  threads, regardless of to which interpreter they belong.
\end{ctypedesc}

\begin{ctypedesc}{PyThreadState}
  This data structure represents the state of a single thread. The
  only public data member is \ctype{PyInterpreterState
  *}\member{interp}, which points to this thread's interpreter state.
\end{ctypedesc}

\begin{cfuncdesc}{void}{PyEval_InitThreads}{}
  Initialize and acquire the global interpreter lock. It should be
  called in the main thread before creating a second thread or
  engaging in any other thread operations such as
  \cfunction{PyEval_ReleaseLock()}\ttindex{PyEval_ReleaseLock()} or
  \code{PyEval_ReleaseThread(\var{tstate})}\ttindex{PyEval_ReleaseThread()}.
  It is not needed before calling
  \cfunction{PyEval_SaveThread()}\ttindex{PyEval_SaveThread()} or
  \cfunction{PyEval_RestoreThread()}\ttindex{PyEval_RestoreThread()}.

  This is a no-op when called for a second time. It is safe to call
  this function before calling
  \cfunction{Py_Initialize()}\ttindex{Py_Initialize()}.

  When only the main thread exists, no lock operations are needed.
  This is a common situation (most Python programs do not use
  threads), and the lock operations slow the interpreter down a bit.
  Therefore, the lock is not created initially. This situation is
  equivalent to having acquired the lock: when there is only a single
  thread, all object accesses are safe. Therefore, when this function
  initializes the lock, it also acquires it. Before the Python
  \module{thread}\refbimodindex{thread} module creates a new thread,
  knowing that either it has the lock or the lock hasn't been created
  yet, it calls \cfunction{PyEval_InitThreads()}. When this call
  returns, it is guaranteed that the lock has been created and that it
  has acquired it.

  It is \strong{not} safe to call this function when it is unknown
  which thread (if any) currently has the global interpreter lock.

  This function is not available when thread support is disabled at
  compile time.
\end{cfuncdesc}

\begin{cfuncdesc}{void}{PyEval_AcquireLock}{}
  Acquire the global interpreter lock. The lock must have been
  created earlier. If this thread already has the lock, a deadlock
  ensues. This function is not available when thread support is
  disabled at compile time.
\end{cfuncdesc}

\begin{cfuncdesc}{void}{PyEval_ReleaseLock}{}
  Release the global interpreter lock. The lock must have been
  created earlier. This function is not available when thread support
  is disabled at compile time.
\end{cfuncdesc}

\begin{cfuncdesc}{void}{PyEval_AcquireThread}{PyThreadState *tstate}
  Acquire the global interpreter lock and then set the current thread
  state to \var{tstate}, which should not be \NULL. The lock must
  have been created earlier. If this thread already has the lock,
  deadlock ensues. This function is not available when thread support
  is disabled at compile time.
\end{cfuncdesc}

\begin{cfuncdesc}{void}{PyEval_ReleaseThread}{PyThreadState *tstate}
  Reset the current thread state to \NULL{} and release the global
  interpreter lock. The lock must have been created earlier and must
  be held by the current thread. The \var{tstate} argument, which
  must not be \NULL, is only used to check that it represents the
  current thread state --- if it isn't, a fatal error is reported.
  This function is not available when thread support is disabled at
  compile time.
\end{cfuncdesc}

\begin{cfuncdesc}{PyThreadState*}{PyEval_SaveThread}{}
  Release the interpreter lock (if it has been created and thread
  support is enabled) and reset the thread state to \NULL, returning
  the previous thread state (which is not \NULL). If the lock has
  been created, the current thread must have acquired it. (This
  function is available even when thread support is disabled at
  compile time.)
\end{cfuncdesc}

\begin{cfuncdesc}{void}{PyEval_RestoreThread}{PyThreadState *tstate}
  Acquire the interpreter lock (if it has been created and thread
  support is enabled) and set the thread state to \var{tstate}, which
  must not be \NULL. If the lock has been created, the current thread
  must not have acquired it, otherwise deadlock ensues. (This
  function is available even when thread support is disabled at
  compile time.)
\end{cfuncdesc}
| 569 | |
| 570 | The following macros are normally used without a trailing semicolon; |
| 571 | look for example usage in the Python source distribution. |
| 572 | |
| 573 | \begin{csimplemacrodesc}{Py_BEGIN_ALLOW_THREADS} |
| 574 | This macro expands to |
| 575 | \samp{\{ PyThreadState *_save; _save = PyEval_SaveThread();}. |
| 576 | Note that it contains an opening brace; it must be matched with a |
| 577 | following \code{Py_END_ALLOW_THREADS} macro. See above for further |
| 578 | discussion of this macro. It is a no-op when thread support is |
| 579 | disabled at compile time. |
| 580 | \end{csimplemacrodesc} |
| 581 | |
| 582 | \begin{csimplemacrodesc}{Py_END_ALLOW_THREADS} |
| 583 | This macro expands to \samp{PyEval_RestoreThread(_save); \}}. |
| 584 | Note that it contains a closing brace; it must be matched with an |
| 585 | earlier \code{Py_BEGIN_ALLOW_THREADS} macro. See above for further |
| 586 | discussion of this macro. It is a no-op when thread support is |
| 587 | disabled at compile time. |
| 588 | \end{csimplemacrodesc} |
| 589 | |
| 590 | \begin{csimplemacrodesc}{Py_BLOCK_THREADS} |
| 591 | This macro expands to \samp{PyEval_RestoreThread(_save);}: it is |
| 592 | equivalent to \code{Py_END_ALLOW_THREADS} without the closing brace. |
| 593 | It is a no-op when thread support is disabled at compile time. |
| 594 | \end{csimplemacrodesc} |
| 595 | |
| 596 | \begin{csimplemacrodesc}{Py_UNBLOCK_THREADS} |
| 597 | This macro expands to \samp{_save = PyEval_SaveThread();}: it is |
| 598 | equivalent to \code{Py_BEGIN_ALLOW_THREADS} without the opening |
| 599 | brace and variable declaration. It is a no-op when thread support |
| 600 | is disabled at compile time. |
| 601 | \end{csimplemacrodesc} |
| 602 | |
| 603 | All of the following functions are only available when thread support |
| 604 | is enabled at compile time, and must be called only when the |
| 605 | interpreter lock has been created. |
| 606 | |
| 607 | \begin{cfuncdesc}{PyInterpreterState*}{PyInterpreterState_New}{} |
| 608 | Create a new interpreter state object. The interpreter lock need |
| 609 | not be held, but may be held if it is necessary to serialize calls |
| 610 | to this function. |
| 611 | \end{cfuncdesc} |
| 612 | |
| 613 | \begin{cfuncdesc}{void}{PyInterpreterState_Clear}{PyInterpreterState *interp} |
| 614 | Reset all information in an interpreter state object. The |
| 615 | interpreter lock must be held. |
| 616 | \end{cfuncdesc} |
| 617 | |
| 618 | \begin{cfuncdesc}{void}{PyInterpreterState_Delete}{PyInterpreterState *interp} |
| 619 | Destroy an interpreter state object. The interpreter lock need not |
| 620 | be held. The interpreter state must have been reset with a previous |
| 621 | call to \cfunction{PyInterpreterState_Clear()}. |
| 622 | \end{cfuncdesc} |
| 623 | |
| 624 | \begin{cfuncdesc}{PyThreadState*}{PyThreadState_New}{PyInterpreterState *interp} |
| 625 | Create a new thread state object belonging to the given interpreter |
| 626 | object. The interpreter lock need not be held, but may be held if |
| 627 | it is necessary to serialize calls to this function. |
| 628 | \end{cfuncdesc} |
| 629 | |
| 630 | \begin{cfuncdesc}{void}{PyThreadState_Clear}{PyThreadState *tstate} |
| 631 | Reset all information in a thread state object. The interpreter lock |
| 632 | must be held. |
| 633 | \end{cfuncdesc} |
| 634 | |
| 635 | \begin{cfuncdesc}{void}{PyThreadState_Delete}{PyThreadState *tstate} |
| 636 | Destroy a thread state object. The interpreter lock need not be |
| 637 | held. The thread state must have been reset with a previous call to |
| 638 | \cfunction{PyThreadState_Clear()}. |
| 639 | \end{cfuncdesc} |
| 640 | |
| 641 | \begin{cfuncdesc}{PyThreadState*}{PyThreadState_Get}{} |
| 642 | Return the current thread state. The interpreter lock must be |
| 643 | held. When the current thread state is \NULL, this issues a fatal |
| 644 | error (so that the caller needn't check for \NULL). |
| 645 | \end{cfuncdesc} |
| 646 | |
| 647 | \begin{cfuncdesc}{PyThreadState*}{PyThreadState_Swap}{PyThreadState *tstate} |
| 648 | Swap the current thread state with the thread state given by the |
| 649 | argument \var{tstate}, which may be \NULL. The interpreter lock |
| 650 | must be held. |
| 651 | \end{cfuncdesc} |
| 652 | |
| 653 | \begin{cfuncdesc}{PyObject*}{PyThreadState_GetDict}{} |
| 654 | Return a dictionary in which extensions can store thread-specific |
| 655 | state information. Each extension should use a unique key to |
| 656 | store state in the dictionary. If this function returns \NULL, an |
| 657 | exception has been raised and the caller should allow it to |
| 658 | propagate. |
| 659 | \end{cfuncdesc} |
| 660 | |
| 661 | |
| 662 | \section{Profiling and Tracing \label{profiling}} |
| 663 | |
| 664 | \sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org} |
| 665 | |
| 666 | The Python interpreter provides some low-level support for attaching |
| 667 | profiling and execution tracing facilities. These are used for |
| 668 | profiling, debugging, and coverage analysis tools. |
| 669 | |
| 670 | Starting with Python 2.2, the implementation of this facility was |
| 671 | substantially revised, and an interface from C was added. This C |
| 672 | interface allows the profiling or tracing code to avoid the overhead |
| 673 | of calling through Python-level callable objects, making a direct C |
| 674 | function call instead. The essential attributes of the facility have |
| 675 | not changed; the interface allows trace functions to be installed |
| 676 | per-thread, and the basic events reported to the trace function are |
| 677 | the same as had been reported to the Python-level trace functions in |
| 678 | previous versions. |
| 679 | |
| 680 | \begin{ctypedesc}[Py_tracefunc]{int (*Py_tracefunc)(PyObject *obj, |
| 681 | PyFrameObject *frame, int what, |
| 682 | PyObject *arg)} |
| 683 | The type of the trace function registered using |
| 684 | \cfunction{PyEval_SetProfile()} and \cfunction{PyEval_SetTrace()}. |
| 685 | The first parameter is the object passed to the registration |
| 686 | function as \var{obj}, \var{frame} is the frame object to which the |
| 687 | event pertains, \var{what} is one of the constants |
| 688 | \constant{PyTrace_CALL}, \constant{PyTrace_EXCEPT}, |
| 689 | \constant{PyTrace_LINE} or \constant{PyTrace_RETURN}, and \var{arg} |
| 690 | depends on the value of \var{what}: |
| 691 | |
| 692 | \begin{tableii}{l|l}{constant}{Value of \var{what}}{Meaning of \var{arg}} |
| 693 | \lineii{PyTrace_CALL}{Always \NULL.} |
| 694 | \lineii{PyTrace_EXCEPT}{Exception information as returned by |
| 695 | \function{sys.exc_info()}.} |
| 696 | \lineii{PyTrace_LINE}{Always \NULL.} |
| 697 | \lineii{PyTrace_RETURN}{Value being returned to the caller.} |
| 698 | \end{tableii} |
| 699 | \end{ctypedesc} |
| 700 | |
| 701 | \begin{cvardesc}{int}{PyTrace_CALL} |
| 702 | The value of the \var{what} parameter to a \ctype{Py_tracefunc} |
| 703 | function when a new call to a function or method is being reported, |
| 704 | or a new entry into a generator. Note that the creation of the |
| 705 | iterator for a generator function is not reported as there is no |
| 706 | control transfer to the Python bytecode in the corresponding frame. |
| 707 | \end{cvardesc} |
| 708 | |
| 709 | \begin{cvardesc}{int}{PyTrace_EXCEPT} |
| 710 | The value of the \var{what} parameter to a \ctype{Py_tracefunc} |
| 711 | function when an exception has been raised by Python code as the |
| 712 | result of an operation. The operation may have explicitly intended |
| 713 | to raise the exception (as with a \keyword{raise} statement), or may |
| 714 | have triggered an exception in the runtime as a result of the |
| 715 | specific operation. |
| 716 | \end{cvardesc} |
| 717 | |
| 718 | \begin{cvardesc}{int}{PyTrace_LINE} |
| 719 | The value passed as the \var{what} parameter to a trace function |
| 720 | (but not a profiling function) when a line-number event is being |
| 721 | reported. |
| 722 | \end{cvardesc} |
| 723 | |
| 724 | \begin{cvardesc}{int}{PyTrace_RETURN} |
| 725 | The value for the \var{what} parameter to \ctype{Py_tracefunc} |
| 726 | functions when a call is returning without propagating an exception. |
| 727 | \end{cvardesc} |
| 728 | |
| 729 | \begin{cfuncdesc}{void}{PyEval_SetProfile}{Py_tracefunc func, PyObject *obj} |
| 730 | Set the profiler function to \var{func}. The \var{obj} parameter is |
| 731 | passed to the function as its first parameter, and may be any Python |
| 732 | object, or \NULL. If the profile function needs to maintain state, |
| 733 | using a different value for \var{obj} for each thread provides a |
| 734 | convenient and thread-safe place to store it. The profile function |
| 735 | is called for all monitored events except the line-number events. |
| 736 | \end{cfuncdesc} |
| 737 | |
| 738 | \begin{cfuncdesc}{void}{PyEval_SetTrace}{Py_tracefunc func, PyObject *obj} |
| 739 | Set the tracing function to \var{func}. This is similar to |
| 740 | \cfunction{PyEval_SetProfile()}, except the tracing function does |
| 741 | receive line-number events. |
| 742 | \end{cfuncdesc} |
| 743 | |
| 744 | |
| 745 | \section{Advanced Debugger Support \label{advanced-debugging}} |
| 746 | \sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org} |
| 747 | |
| 748 | These functions are only intended to be used by advanced debugging |
| 749 | tools. |
| 750 | |
| 751 | \begin{cfuncdesc}{PyInterpreterState*}{PyInterpreterState_Head}{} |
| 752 | Return the interpreter state object at the head of the list of all |
| 753 | such objects. |
| 754 | \versionadded{2.2} |
| 755 | \end{cfuncdesc} |
| 756 | |
| 757 | \begin{cfuncdesc}{PyInterpreterState*}{PyInterpreterState_Next}{PyInterpreterState *interp} |
| 758 | Return the next interpreter state object after \var{interp} from the |
| 759 | list of all such objects. |
| 760 | \versionadded{2.2} |
| 761 | \end{cfuncdesc} |
| 762 | |
| 763 | \begin{cfuncdesc}{PyThreadState *}{PyInterpreterState_ThreadHead}{PyInterpreterState *interp} |
| 764 | Return a pointer to the first \ctype{PyThreadState} object in |
| 765 | the list of threads associated with the interpreter \var{interp}. |
| 766 | \versionadded{2.2} |
| 767 | \end{cfuncdesc} |
| 768 | |
| 769 | \begin{cfuncdesc}{PyThreadState*}{PyThreadState_Next}{PyThreadState *tstate} |
| 770 | Return the next thread state object after \var{tstate} from the list |
| 771 | of all such objects belonging to the same \ctype{PyInterpreterState} |
| 772 | object. |
| 773 | \versionadded{2.2} |
| 774 | \end{cfuncdesc} |