:tocdepth: 2

=========================
Library and Extension FAQ
=========================

.. contents::

General Library Questions
=========================

How do I find a module or application to perform task X?
--------------------------------------------------------

Check :ref:`the Library Reference <library-index>` to see if there's a relevant
standard library module. (Eventually you'll learn what's in the standard
library and will be able to skip this step.)

For third-party packages, search the `Python Package Index
<http://pypi.python.org/pypi>`_ or try `Google <http://www.google.com>`_ or
another Web search engine. Searching for "Python" plus a keyword or two for
your topic of interest will usually find something helpful.


Where is the math.py (socket.py, regex.py, etc.) source file?
-------------------------------------------------------------

If you can't find a source file for a module, it may be a built-in or
dynamically loaded module implemented in C, C++ or another compiled language.
In this case you may not have the source file, or it may be something like
mathmodule.c, somewhere in a C source directory (not on the Python path).

There are (at least) three kinds of modules in Python:

1) modules written in Python (.py);
2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc);
3) modules written in C and linked with the interpreter; to get a list of these,
   type::

      import sys
      print(sys.builtin_module_names)
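
As a rough illustration of the three kinds of modules, the following sketch
reports where a given module comes from. The helper name ``describe_module``
is hypothetical and not part of the standard library; it relies only on
``sys.builtin_module_names`` and on the module's ``__file__`` attribute, which
built-in modules do not have::

   import importlib
   import sys

   def describe_module(name):
       """Return a short description of where the named module lives."""
       if name in sys.builtin_module_names:
           return "built-in (linked with the interpreter)"
       module = importlib.import_module(name)
       path = getattr(module, "__file__", None)
       if path is None:
           return "no source file recorded"
       if path.endswith((".py", ".pyc")):
           return "Python source: " + path
       # Anything else is treated as a dynamically loaded extension.
       return "extension module: " + path

   print(describe_module("sys"))
   print(describe_module("json"))

The exact paths printed depend on how and where Python was installed.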


How do I make a Python script executable on Unix?
-------------------------------------------------

You need to do two things: the script file's mode must be executable and the
first line must begin with ``#!`` followed by the path of the Python
interpreter.

The first is done by executing ``chmod +x scriptfile`` or perhaps ``chmod 755
scriptfile``.

The second can be done in a number of ways. The most straightforward way is to
write ::

   #!/usr/local/bin/python

as the very first line of your file, using the pathname for where the Python
interpreter is installed on your platform.

If you would like the script to be independent of where the Python interpreter
lives, you can use the "env" program. Almost all Unix variants support the
following, assuming the Python interpreter is in a directory on the user's
$PATH::

   #!/usr/bin/env python

*Don't* do this for CGI scripts. The $PATH variable for CGI scripts is often
very minimal, so you need to use the actual absolute pathname of the
interpreter.

Occasionally, a user's environment is so full that the /usr/bin/env program
fails; or there's no env program at all. In that case, you can try the
following hack (due to Alex Rezinsky)::

   #! /bin/sh
   """:"
   exec python $0 ${1+"$@"}
   """

The minor disadvantage is that this defines the script's __doc__ string.
However, you can fix that by adding ::

   __doc__ = """...Whatever..."""


Is there a curses/termcap package for Python?
---------------------------------------------

.. XXX curses *is* built by default, isn't it?

For Unix variants: The standard Python source distribution comes with a curses
module in the ``Modules/`` subdirectory, though it's not compiled by default
(note that this is not available in the Windows distribution -- there is no
curses module for Windows).

The curses module supports basic curses features as well as many additional
functions from ncurses and SYSV curses such as colour, alternative character set
support, pads, and mouse support. This means the module isn't compatible with
operating systems that only have BSD curses, but there don't seem to be any
currently maintained OSes that fall into this category.

For Windows: use `the consolelib module
<http://effbot.org/zone/console-index.htm>`_.


Is there an equivalent to C's onexit() in Python?
-------------------------------------------------

The :mod:`atexit` module provides a register function that is similar to C's
onexit.
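
A minimal sketch of registering an exit handler follows. The function name
``goodbye`` and its argument are illustrative only. :func:`atexit.register`
returns the function it was given, so it can also be used as a decorator::

   import atexit

   def goodbye(name):
       # Runs when the interpreter exits normally.
       print("Goodbye,", name)

   # Extra positional arguments are passed to the handler at exit time.
   registered = atexit.register(goodbye, "world")

Handlers registered this way are not called when the program is killed by a
signal that Python does not handle, or when :func:`os._exit` is called.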


Why don't my signal handlers work?
----------------------------------

The most common problem is that the signal handler is declared with the wrong
argument list. It is called as ::

   handler(signum, frame)

so it should be declared with two arguments::

   def handler(signum, frame):
       ...
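
The following sketch installs a two-argument handler and then triggers it. It
assumes Python 3.8 or later (for :func:`signal.raise_signal`) and that the code
runs in the main thread; the name ``received`` is illustrative::

   import signal

   received = []

   def handler(signum, frame):
       # signum is the signal number; frame is the interrupted stack frame.
       received.append(signum)

   # Install the handler, remembering the previous one so it can be restored.
   previous = signal.signal(signal.SIGINT, handler)
   signal.raise_signal(signal.SIGINT)
   signal.signal(signal.SIGINT, previous)

   print(received)

Restoring the previous handler keeps Ctrl-C working as usual for the rest of
the program.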


Common tasks
============

How do I test a Python program or component?
--------------------------------------------

Python comes with two testing frameworks. The :mod:`doctest` module finds
examples in the docstrings for a module and runs them, comparing the output with
the expected output given in the docstring.

The :mod:`unittest` module is a fancier testing framework modelled on Java and
Smalltalk testing frameworks.

For testing, it helps to write the program so that it may be easily tested by
using good modular design. Your program should have almost all functionality
encapsulated in either functions or class methods -- and this sometimes has the
surprising and delightful effect of making the program run faster (because local
variable accesses are faster than global accesses). Furthermore the program
should avoid depending on mutating global variables, since this makes testing
much more difficult to do.

The "global main logic" of your program may be as simple as ::

   if __name__ == "__main__":
       main_logic()

at the bottom of the main module of your program.

Once your program is organized as a tractable collection of functions and class
behaviours you should write test functions that exercise the behaviours. A test
suite can be associated with each module which automates a sequence of tests.
This sounds like a lot of work, but since Python is so terse and flexible it's
surprisingly easy. You can make coding much more pleasant and fun by writing
your test functions in parallel with the "production code", since this makes it
easy to find bugs and even design flaws earlier.

"Support modules" that are not intended to be the main module of a program may
include a self-test of the module. ::

   if __name__ == "__main__":
       self_test()

Even programs that interact with complex external interfaces may be tested when
the external interfaces are unavailable by using "fake" interfaces implemented
in Python.
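
As one possible shape for such a self-test, the sketch below pairs a small
function with a :mod:`unittest` test case. The names ``average``,
``AverageTests`` and ``self_test`` are hypothetical and chosen only for
illustration::

   import unittest

   def average(values):
       """Return the arithmetic mean of a non-empty sequence of numbers."""
       return sum(values) / len(values)

   class AverageTests(unittest.TestCase):
       def test_typical_values(self):
           self.assertEqual(average([1, 2, 3]), 2.0)

       def test_empty_sequence(self):
           # Dividing by len([]) raises ZeroDivisionError.
           with self.assertRaises(ZeroDivisionError):
               average([])

   def self_test():
       """Run the test case and report whether every test passed."""
       suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageTests)
       result = unittest.TextTestRunner(verbosity=0).run(suite)
       return result.wasSuccessful()

   if __name__ == "__main__":
       self_test()

Because ``average`` takes its input as an argument instead of reading a global
variable, each test can call it in isolation.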


How do I create documentation from doc strings?
-----------------------------------------------

The :mod:`pydoc` module can create HTML from the doc strings in your Python
source code. An alternative for creating API documentation purely from
docstrings is `epydoc <http://epydoc.sf.net/>`_. `Sphinx
<http://sphinx.pocoo.org>`_ can also include docstring content.


How do I get a single keypress at a time?
-----------------------------------------

For Unix variants: There are several solutions. It's straightforward to do this
using curses, but curses is a fairly large module to learn.

.. XXX this doesn't work out of the box, some IO expert needs to check why

   Here's a solution without curses::

      import termios, fcntl, sys, os
      fd = sys.stdin.fileno()

      oldterm = termios.tcgetattr(fd)
      newattr = termios.tcgetattr(fd)
      newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
      termios.tcsetattr(fd, termios.TCSANOW, newattr)

      oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
      fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)

      try:
          while True:
              try:
                  c = sys.stdin.read(1)
                  print("Got character", repr(c))
              except IOError:
                  pass
      finally:
          termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
          fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)

   You need the :mod:`termios` and the :mod:`fcntl` module for any of this to
   work, and I've only tried it on Linux, though it should work elsewhere. In
   this code, characters are read and printed one at a time.

   :func:`termios.tcsetattr` turns off stdin's echoing and disables canonical
   mode. :func:`fcntl.fcntl` is used to obtain stdin's file descriptor flags
   and modify them for non-blocking mode. Since reading stdin when it is empty
   results in an :exc:`IOError`, this error is caught and ignored.


Threads
=======

How do I program using threads?
-------------------------------

Be sure to use the :mod:`threading` module and not the :mod:`_thread` module.
The :mod:`threading` module builds convenient abstractions on top of the
low-level primitives provided by the :mod:`_thread` module.

Aahz has a set of slides from his threading tutorial that are helpful; see
http://www.pythoncraft.com/OSCON2001/.


None of my threads seem to run: why?
------------------------------------

As soon as the main thread exits, all threads are killed. Your main thread is
running too quickly, giving the threads no time to do any work.

A simple fix is to add a sleep to the end of the program that's long enough for
all the threads to finish::

   import threading, time

   def thread_task(name, n):
       for i in range(n):
           print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)  # <---------------------------!

But now (on many platforms) the threads don't run in parallel, but appear to run
sequentially, one at a time! The reason is that the OS thread scheduler doesn't
start a new thread until the previous thread is blocked.

A simple fix is to add a tiny sleep to the start of the run function::

   def thread_task(name, n):
       time.sleep(0.001)  # <--------------------!
       for i in range(n):
           print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)

Instead of trying to guess a good delay value for :func:`time.sleep`,
it's better to use some kind of semaphore mechanism. One idea is to use the
:mod:`queue` module to create a queue object, let each thread append a token to
the queue when it finishes, and let the main thread read as many tokens from the
queue as there are threads.
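
The queue-based idea can be sketched as follows. The parameter name ``done``
and the use of the thread's name as the token are arbitrary choices made for
this example::

   import queue
   import threading

   def thread_task(name, n, done):
       for i in range(n):
           print(name, i)
       # Put a token on the queue to signal that this thread has finished.
       done.put(name)

   done = queue.Queue()
   threads = [threading.Thread(target=thread_task, args=(str(i), i, done))
              for i in range(10)]
   for t in threads:
       t.start()

   # Block until one token per thread has arrived.
   finished = [done.get() for _ in threads]

When no information needs to come back from the threads, calling
``t.join()`` on each thread is a simpler way to wait for them.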


How do I parcel out work among a bunch of worker threads?
---------------------------------------------------------

The easiest way is to use the new :mod:`concurrent.futures` module,
especially the :class:`~concurrent.futures.ThreadPoolExecutor` class.
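
A minimal sketch of the executor approach, assuming a trivial job function
named ``square`` (hypothetical) and a pool of five worker threads::

   from concurrent.futures import ThreadPoolExecutor

   def square(n):
       return n * n

   # The executor hands each argument to one of the worker threads and
   # map() yields the results in the order of the inputs.
   with ThreadPoolExecutor(max_workers=5) as executor:
       results = list(executor.map(square, range(50)))

   print(results[:5])

Leaving the ``with`` block waits for all submitted jobs to finish, so no sleep
is needed in the main thread.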

Or, if you want fine control over the dispatching algorithm, you can write
your own logic manually. Use the :mod:`queue` module to create a queue
containing a list of jobs. The :class:`~queue.Queue` class maintains a
list of objects and has a ``.put(obj)`` method that adds items to the queue and
a ``.get()`` method to return them. The class will take care of the locking
necessary to ensure that each job is handed out exactly once.

Here's a trivial example::

   import threading, queue, time

   # The worker thread gets jobs off the queue. When the queue is empty, it
   # assumes there will be no more work and exits.
   # (Realistically workers will run until terminated.)
   def worker():
       print('Running worker')
       time.sleep(0.1)
       while True:
           try:
               arg = q.get(block=False)
           except queue.Empty:
               print('Worker', threading.current_thread(), end=' ')
               print('queue empty')
               break
           else:
               print('Worker', threading.current_thread(), end=' ')
               print('running with argument', arg)
               time.sleep(0.5)

   # Create queue
   q = queue.Queue()

   # Start a pool of 5 workers
   for i in range(5):
       t = threading.Thread(target=worker, name='worker %i' % (i+1))
       t.start()

   # Begin adding work to the queue
   for i in range(50):
       q.put(i)

   # Give threads time to run
   print('Main thread sleeping')
   time.sleep(5)

When run, this will produce output like the following (thread identifiers and
interleaving vary from run to run)::

   Running worker
   Running worker
   Running worker
   Running worker
   Running worker
   Main thread sleeping
   Worker <Thread(worker 1, started 130283832797456)> running with argument 0
   Worker <Thread(worker 2, started 130283824404752)> running with argument 1
   Worker <Thread(worker 3, started 130283816012048)> running with argument 2
   Worker <Thread(worker 4, started 130283807619344)> running with argument 3
   Worker <Thread(worker 5, started 130283799226640)> running with argument 4
   Worker <Thread(worker 1, started 130283832797456)> running with argument 5
   ...

Consult the module's documentation for more details; the ``Queue`` class
provides a featureful interface.


What kinds of global value mutation are thread-safe?
----------------------------------------------------

A :term:`global interpreter lock` (GIL) is used internally to ensure that only one
thread runs in the Python VM at a time. In general, Python offers to switch
among threads only between bytecode instructions; how frequently it switches can
be set via :func:`sys.setswitchinterval`. Each bytecode instruction, and
therefore all the C implementation code reached from each instruction, is
atomic from the point of view of a Python program.

In theory, this means an exact accounting requires an exact understanding of the
PVM bytecode implementation. In practice, it means that operations on shared
variables of built-in data types (ints, lists, dicts, etc) that "look atomic"
really are.

For example, the following operations are all atomic (L, L1, L2 are lists, D,
D1, D2 are dicts, x, y are objects, i, j are ints)::

   L.append(x)
   L1.extend(L2)
   x = L[i]
   x = L.pop()
   L1[i:j] = L2
   L.sort()
   x = y
   x.field = y
   D[x] = y
   D1.update(D2)
   D.keys()

These aren't::

   i = i+1
   L.append(L[-1])
   L[i] = L[j]
   D[x] = D[x] + 1

Operations that replace other objects may invoke those other objects'
:meth:`__del__` method when their reference count reaches zero, and that can
affect things. This is especially true for the mass updates to dictionaries and
lists. When in doubt, use a mutex!
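
As a sketch of the mutex advice, the example below guards a non-atomic
``D[x] = D[x] + 1`` style update with a :class:`threading.Lock`. The names
``hits``, ``hits_lock`` and ``record_hits`` are illustrative::

   import threading

   hits = {"count": 0}
   hits_lock = threading.Lock()

   def record_hits(times):
       for _ in range(times):
           # The read-modify-write below is not atomic, so guard it.
           with hits_lock:
               hits["count"] = hits["count"] + 1

   threads = [threading.Thread(target=record_hits, args=(10000,))
              for _ in range(4)]
   for t in threads:
       t.start()
   for t in threads:
       t.join()

   print(hits["count"])

With the lock held around every update, no increment can be lost to a thread
switch between the read and the write.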


Can't we get rid of the Global Interpreter Lock?
------------------------------------------------

.. XXX link to dbeazley's talk about GIL?

The :term:`global interpreter lock` (GIL) is often seen as a hindrance to Python's
deployment on high-end multiprocessor server machines, because a multi-threaded
Python program effectively only uses one CPU, due to the insistence that
(almost) all Python code can only run while the GIL is held.

Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive
patch set (the "free threading" patches) that removed the GIL and replaced it
with fine-grained locking. Adam Olsen recently did a similar experiment
in his `python-safethread <http://code.google.com/p/python-safethread/>`_
project. Unfortunately, both experiments exhibited a sharp drop in single-thread
performance (at least 30% slower), due to the amount of fine-grained locking
necessary to compensate for the removal of the GIL.

This doesn't mean that you can't make good use of Python on multi-CPU machines!
You just have to be creative with dividing the work up between multiple
*processes* rather than multiple *threads*. The
:class:`~concurrent.futures.ProcessPoolExecutor` class in the new
:mod:`concurrent.futures` module provides an easy way of doing so; the
:mod:`multiprocessing` module provides a lower-level API in case you want
more control over dispatching of tasks.

Judicious use of C extensions will also help; if you use a C extension to
perform a time-consuming task, the extension can release the GIL while the
thread of execution is in the C code and allow other threads to get some work
done. Some standard library modules such as :mod:`zlib` and :mod:`hashlib`
already do this.
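
The process-based approach can be sketched as follows, assuming a deliberately
CPU-bound function named ``count_primes`` (hypothetical). The job function must
be defined at module level so that worker processes can import it, and the
``__main__`` guard keeps workers from re-running the dispatch code on platforms
that start them with a fresh interpreter::

   from concurrent.futures import ProcessPoolExecutor

   def count_primes(limit):
       """Count the primes below limit by trial division."""
       count = 0
       for n in range(2, limit):
           if all(n % d for d in range(2, int(n ** 0.5) + 1)):
               count += 1
       return count

   if __name__ == "__main__":
       limits = [10000, 20000, 30000, 40000]
       # Each limit is handled by a separate worker process.
       with ProcessPoolExecutor() as executor:
           for limit, total in zip(limits, executor.map(count_primes, limits)):
               print(limit, total)

Arguments and results are pickled to cross the process boundary, so this pays
off mainly when each job does a substantial amount of computation.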

It has been suggested that the GIL should be a per-interpreter-state lock rather
than truly global; interpreters then wouldn't be able to share objects.
Unfortunately, this isn't likely to happen either. It would be a tremendous
amount of work, because many object implementations currently have global state.
For example, small integers and short strings are cached; these caches would
have to be moved to the interpreter state. Other object types have their own
free list; these free lists would have to be moved to the interpreter state.
And so on.

And I doubt that it can even be done in finite time, because the same problem
exists for 3rd party extensions. It is likely that 3rd party extensions are
being written at a faster rate than you can convert them to store all their
global state in the interpreter state.

And finally, once you have multiple interpreters not sharing any state, what
have you gained over running each interpreter in a separate process?


Input and Output
================

How do I delete a file? (And other file questions...)
-----------------------------------------------------

Use ``os.remove(filename)`` or ``os.unlink(filename)``; for documentation, see
the :mod:`os` module. The two functions are identical; :func:`~os.unlink` is simply
the name of the Unix system call for this function.

To remove a directory, use :func:`os.rmdir`; use :func:`os.mkdir` to create one.
``os.makedirs(path)`` will create any intermediate directories in ``path`` that
don't exist. ``os.removedirs(path)`` will remove intermediate directories as
long as they're empty; if you want to delete an entire directory tree and its
contents, use :func:`shutil.rmtree`.

To rename a file, use ``os.rename(old_path, new_path)``.

To truncate a file, open it using ``f = open(filename, "rb+")``, and use
``f.truncate(offset)``; offset defaults to the current seek position. There's
also ``os.ftruncate(fd, offset)`` for files opened with :func:`os.open`, where
``fd`` is the file descriptor (a small integer).

The :mod:`shutil` module also contains a number of functions to work on files
including :func:`~shutil.copyfile`, :func:`~shutil.copytree`, and
:func:`~shutil.rmtree`.
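
The calls above can be tried out safely inside a scratch directory. The
following sketch uses :func:`tempfile.mkdtemp` for that purpose; the file and
directory names are made up for the example::

   import os
   import shutil
   import tempfile

   base = tempfile.mkdtemp()
   nested = os.path.join(base, "a", "b")
   os.makedirs(nested)                    # creates base/a and base/a/b

   old_path = os.path.join(nested, "old.txt")
   new_path = os.path.join(nested, "new.txt")
   with open(old_path, "wb") as f:
       f.write(b"0123456789")

   os.rename(old_path, new_path)

   with open(new_path, "rb+") as f:
       f.truncate(4)                      # keep only the first 4 bytes
   size_after_truncate = os.path.getsize(new_path)

   os.remove(new_path)
   file_exists = os.path.exists(new_path)

   shutil.rmtree(base)                    # removes the whole tree
   tree_exists = os.path.exists(base)

Note that :func:`shutil.rmtree` deletes everything below the given path without
asking, so it is worth double-checking the argument before calling it.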


How do I copy a file?
---------------------

The :mod:`shutil` module contains a :func:`~shutil.copyfile` function. Note
that on MacOS 9 it doesn't copy the resource fork and Finder info.


How do I read (or write) binary data?
-------------------------------------

To read or write complex binary data formats, it's best to use the :mod:`struct`
module. It allows you to take a :class:`bytes` object containing binary data
(usually numbers) and convert it to Python objects; and vice versa.

For example, the following code reads two 2-byte integers and one 4-byte integer
in big-endian format from a file::

   import struct

   with open(filename, "rb") as f:
       s = f.read(8)
       x, y, z = struct.unpack(">hhl", s)

The '>' in the format string forces big-endian data; the letter 'h' reads one
"short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the
data.
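
The same format string works in the other direction. The sketch below packs
three example values with :func:`struct.pack` and unpacks them again, without
touching a file; the values 1, 2 and 3 are arbitrary::

   import struct

   # Pack two 2-byte integers and one 4-byte integer, big-endian.
   packed = struct.pack(">hhl", 1, 2, 3)

   # struct.calcsize() reports how many bytes the format occupies.
   size = struct.calcsize(">hhl")

   x, y, z = struct.unpack(">hhl", packed)
   print(packed, size, (x, y, z))

Checking :func:`struct.calcsize` against the number of bytes read is a cheap
way to catch a truncated record before unpacking it.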

For data that is more regular (e.g. a homogeneous list of ints or floats),
you can also use the :mod:`array` module.

.. note::
   To read and write binary data, it is mandatory to open the file in
   binary mode (here, passing ``"rb"`` to :func:`open`). If you use
   ``"r"`` instead (the default), the file will be opened in text mode
   and ``f.read()`` will return :class:`str` objects rather than
   :class:`bytes` objects.


I can't seem to use os.read() on a pipe created with os.popen(); why?
---------------------------------------------------------------------

:func:`os.read` is a low-level function which takes a file descriptor, a small
integer representing the opened file. :func:`os.popen` creates a high-level
file object, the same type returned by the built-in :func:`open` function.
Thus, to read n bytes from a pipe p created with :func:`os.popen`, you need to
use ``p.read(n)``.
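
The difference can be seen side by side in the sketch below. It assumes that
an ``echo`` command is available through the system shell, and it uses
:func:`os.pipe` only to obtain raw file descriptors for the :func:`os.read`
half of the comparison::

   import os

   # os.popen() returns a file object, so use its read() method.
   with os.popen("echo hello") as p:
       first = p.read(5)

   # os.read() works on file descriptors such as those from os.pipe().
   read_fd, write_fd = os.pipe()
   os.write(write_fd, b"hello")
   os.close(write_fd)
   second = os.read(read_fd, 5)
   os.close(read_fd)

   print(repr(first), repr(second))

The file object returns :class:`str`, while :func:`os.read` returns
:class:`bytes`.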


.. XXX update to use subprocess. See the :ref:`subprocess-replacements` section.

   How do I run a subprocess with pipes connected to both input and output?
   ------------------------------------------------------------------------

   Use the :mod:`popen2` module. For example::

      import popen2
      fromchild, tochild = popen2.popen2("command")
      tochild.write("input\n")
      tochild.flush()
      output = fromchild.readline()

   Warning: in general it is unwise to do this because you can easily cause a
   deadlock where your process is blocked waiting for output from the child
   while the child is blocked waiting for input from you. This can be caused
   because the parent expects the child to output more text than it does, or it
   can be caused by data being stuck in stdio buffers due to lack of flushing.
   The Python parent can of course explicitly flush the data it sends to the
   child before it reads any output, but if the child is a naive C program it
   may have been written to never explicitly flush its output, even if it is
   interactive, since flushing is normally automatic.

   Note that a deadlock is also possible if you use :func:`popen3` to read
   stdout and stderr. If one of the two is too large for the internal buffer
   (increasing the buffer size does not help) and you ``read()`` the other one
   first, there is a deadlock, too.

   Note on a bug in popen2: unless your program calls ``wait()`` or
   ``waitpid()``, finished child processes are never removed, and eventually
   calls to popen2 will fail because of a limit on the number of child
   processes. Calling :func:`os.waitpid` with the :data:`os.WNOHANG` option can
   prevent this; a good place to insert such a call would be before calling
   ``popen2`` again.

   In many cases, all you really need is to run some data through a command and
   get the result back. Unless the amount of data is very large, the easiest
   way to do this is to write it to a temporary file and run the command with
   that temporary file as input. The standard module :mod:`tempfile` exports a
   ``mktemp()`` function to generate unique temporary file names. ::

      import tempfile
      import os

      class Popen3:
          """
          This is a deadlock-safe version of popen that returns
          an object with errorlevel, out (a string) and err (a string).
          (capturestderr may not work under windows.)
          Example: print(Popen3('grep spam','\n\nhere spam\n\n').out)
          """
          def __init__(self,command,input=None,capturestderr=None):
              outfile=tempfile.mktemp()
              command="( %s ) > %s" % (command,outfile)
              if input:
                  infile=tempfile.mktemp()
                  open(infile,"w").write(input)
                  command=command+" <"+infile
              if capturestderr:
                  errfile=tempfile.mktemp()
                  command=command+" 2>"+errfile
              self.errorlevel=os.system(command) >> 8
              self.out=open(outfile,"r").read()
              os.remove(outfile)
              if input:
                  os.remove(infile)
              if capturestderr:
                  self.err=open(errfile,"r").read()
                  os.remove(errfile)

   Note that many interactive programs (e.g. vi) don't work well with pipes
   substituted for standard input and output. You will have to use pseudo ttys
   ("ptys") instead of pipes. Or you can use a Python interface to Don Libes'
   "expect" library. A Python extension that interfaces to expect is called
   "expy" and available from http://expectpy.sourceforge.net. A pure Python
   solution that works like expect is `pexpect
   <http://pypi.python.org/pypi/pexpect/>`_.


How do I access the serial (RS232) port?
----------------------------------------

For Win32, POSIX (Linux, BSD, etc.), Jython:

   http://pyserial.sourceforge.net

For Unix, see a Usenet post by Mitch Chapman:

   http://groups.google.com/groups?selm=34A04430.CF9@ohioee.com


Why doesn't closing sys.stdout (stdin, stderr) really close it?
---------------------------------------------------------------

Python :term:`file objects <file object>` are a high-level layer of
abstraction on low-level C file descriptors.

For most file objects you create in Python via the built-in :func:`open`
function, ``f.close()`` marks the Python file object as being closed from
Python's point of view, and also arranges to close the underlying C file
descriptor. This also happens automatically in ``f``'s destructor, when
``f`` becomes garbage.

But stdin, stdout and stderr are treated specially by Python, because of the
special status also given to them by C. Running ``sys.stdout.close()`` marks
the Python-level file object as being closed, but does *not* close the
associated C file descriptor.

To close the underlying C file descriptor for one of these three, you should
first be sure that's what you really want to do (e.g., you may confuse
extension modules trying to do I/O). If it is, use :func:`os.close`::

   os.close(sys.stdin.fileno())
   os.close(sys.stdout.fileno())
   os.close(sys.stderr.fileno())

Or you can use the numeric constants 0, 1 and 2, respectively.


Network/Internet Programming
============================

What WWW tools are there for Python?
------------------------------------

See the chapters titled :ref:`internet` and :ref:`netdata` in the Library
Reference Manual. Python has many modules that will help you build server-side
and client-side web systems.

.. XXX check if wiki page is still up to date

A summary of available frameworks is maintained by Paul Boddie at
http://wiki.python.org/moin/WebProgramming .

Cameron Laird maintains a useful set of pages about Python web technologies at
http://phaseit.net/claird/comp.lang.python/web_python.


How can I mimic CGI form submission (METHOD=POST)?
--------------------------------------------------

I would like to retrieve web pages that are the result of POSTing a form. Is
there existing code that would let me do this easily?

Yes. Here's a simple example that uses urllib.request::

   #!/usr/local/bin/python

   import urllib.request

   # build the query string; POST data must be bytes in Python 3
   qs = "First=Josephine&MI=Q&Last=Public".encode("ascii")

   # connect and send the server a path
   req = urllib.request.urlopen('http://www.some-server.out-there'
                                '/cgi-bin/some-cgi-script', data=qs)
   msg, hdrs = req.read(), req.info()

Note that in general for percent-encoded POST operations, query strings must be
quoted using :func:`urllib.parse.urlencode`. For example, to send
``name=Guy Steele, Jr.``::

   >>> import urllib.parse
   >>> urllib.parse.urlencode({'name': 'Guy Steele, Jr.'})
   'name=Guy+Steele%2C+Jr.'
690
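As a minimal sketch combining the two steps above (not a definitive recipe),
the form fields can be encoded with :func:`urllib.parse.urlencode`, converted
to bytes, and wrapped in a :class:`urllib.request.Request`. The URL is the
placeholder used earlier and the helper name ``build_post_request`` is
hypothetical. Nothing is sent over the network until the request is passed to
``urlopen()``::

   import urllib.parse
   import urllib.request

   def build_post_request(url, fields):
       """Return a Request that will POST *fields* as form data."""
       # urlencode() percent-encodes the values; the body must be bytes.
       body = urllib.parse.urlencode(fields).encode('ascii')
       return urllib.request.Request(url, data=body)

   req = build_post_request(
       'http://www.some-server.out-there/cgi-bin/some-cgi-script',
       {'name': 'Guy Steele, Jr.'})

   # Supplying data turns the request into a POST.
   print(req.get_method(), req.data)
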
.. seealso:: :ref:`urllib-howto` for extensive examples.


What module should I use to help with generating HTML?
------------------------------------------------------

.. XXX add modern template languages

There are many different modules available:

* HTMLgen is a class library of objects corresponding to all the HTML 3.2 markup
  tags. It's used when you are writing in Python and wish to synthesize HTML
  pages for a website, for CGI forms, etc.

* DocumentTemplate and Zope Page Templates are two different systems that are
  part of Zope.

* Quixote's PTL uses Python syntax to assemble strings of text.

Consult the `Web Programming wiki pages
<http://wiki.python.org/moin/WebProgramming>`_ for more links.


How do I send mail from a Python script?
----------------------------------------

Use the standard library module :mod:`smtplib`.

Here's a very simple interactive mail sender that uses it. This method will
work on any host that supports an SMTP listener. ::

   import sys, smtplib

   fromaddr = input("From: ")
   toaddrs = input("To: ").split(',')
   print("Enter message, end with ^D:")
   msg = ''
   while True:
       line = sys.stdin.readline()
       if not line:
           break
       msg += line

   # The actual mail send
   server = smtplib.SMTP('localhost')
   server.sendmail(fromaddr, toaddrs, msg)
   server.quit()

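The script above sends whatever text is typed, so any headers have to be
entered by hand. As a hedged sketch of one way to build a message with proper
headers, the standard :mod:`email.message` module can be used. The addresses
are placeholders and the helper name ``build_message`` is hypothetical. The
sending step is left commented out because it assumes an SMTP listener on
``localhost``::

   from email.message import EmailMessage

   def build_message(fromaddr, toaddrs, subject, body):
       """Return an EmailMessage with the basic headers filled in."""
       msg = EmailMessage()
       msg['From'] = fromaddr
       msg['To'] = ', '.join(toaddrs)
       msg['Subject'] = subject
       msg.set_content(body)
       return msg

   msg = build_message('sender@example.com', ['receiver@example.com'],
                       'test', 'Some text\n')

   # To send it (assumes an SMTP listener on localhost):
   # import smtplib
   # with smtplib.SMTP('localhost') as server:
   #     server.send_message(msg)
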
A Unix-only alternative uses sendmail. The location of the sendmail program
varies between systems; sometimes it is ``/usr/lib/sendmail``, sometimes
``/usr/sbin/sendmail``. The sendmail manual page will help you out. Here's
some sample code::

   import os

   SENDMAIL = "/usr/sbin/sendmail"  # sendmail location
   p = os.popen("%s -t -i" % SENDMAIL, "w")
   p.write("To: receiver@example.com\n")
   p.write("Subject: test\n")
   p.write("\n")  # blank line separating headers from body
   p.write("Some text\n")
   p.write("some more text\n")
   sts = p.close()  # None on success, otherwise the exit status
   if sts is not None:
       print("Sendmail exit status", sts)


How do I avoid blocking in the connect() method of a socket?
------------------------------------------------------------

The :mod:`select` module is commonly used to help with asynchronous I/O on
sockets.

To prevent the TCP connect from blocking, you can set the socket to non-blocking
mode. Then when you do the ``connect()``, you will either connect immediately
(unlikely) or get an exception that contains the error number as ``.errno``.
``errno.EINPROGRESS`` indicates that the connection is in progress, but hasn't
finished yet. Different OSes will return different values, so you're going to
have to check what's returned on your system.

You can use the ``connect_ex()`` method to avoid creating an exception. It will
just return the errno value. To poll, you can call ``connect_ex()`` again later
(``0`` or ``errno.EISCONN`` indicates that you're connected), or you can pass
the socket to :func:`select.select` to check whether it's writable.

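The approach described above can be sketched roughly as follows. This is an
illustration under assumptions rather than a portable recipe: the helper name
``nonblocking_connect`` and the timeout value are hypothetical, and the set of
"in progress" error codes may need adjusting for your platform. The
demonstration connects to a throwaway listener on the loopback interface::

   import errno
   import os
   import select
   import socket

   def nonblocking_connect(address, timeout=5.0):
       """Connect to *address* without blocking in connect()."""
       sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
       sock.setblocking(False)
       # connect_ex() returns an errno value instead of raising.
       err = sock.connect_ex(address)
       in_progress = (0, errno.EISCONN, errno.EINPROGRESS, errno.EWOULDBLOCK)
       if err not in in_progress:
           sock.close()
           raise OSError(err, os.strerror(err))
       # Wait until the socket is writable, or flagged as exceptional,
       # which is how some platforms report a failed attempt.
       _, writable, errored = select.select([], [sock], [sock], timeout)
       if not (writable or errored):
           sock.close()
           raise TimeoutError("connect() did not finish in time")
       # SO_ERROR holds the final outcome of the connection attempt.
       err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
       if err:
           sock.close()
           raise OSError(err, os.strerror(err))
       return sock

   # Demonstration against a local listener (an assumption of this sketch).
   listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
   listener.bind(("127.0.0.1", 0))
   listener.listen(1)
   address = listener.getsockname()

   sock = nonblocking_connect(address)
   peer = sock.getpeername()
   sock.close()
   listener.close()
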
.. note::
   The :mod:`asyncore` module presents a framework-like approach to the problem
   of writing non-blocking networking code.
   The third-party `Twisted <http://twistedmatrix.com/>`_ library is
   a popular and feature-rich alternative.


Databases
=========

Are there any interfaces to database packages in Python?
--------------------------------------------------------

Yes.

Interfaces to disk-based hashes such as :mod:`DBM <dbm.ndbm>` and :mod:`GDBM
<dbm.gnu>` are included with standard Python. There is also the
:mod:`sqlite3` module, which provides a lightweight disk-based relational
database.

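As a small illustration of the :mod:`sqlite3` module, the sketch below uses an
in-memory database so that no file is created. The table and column names are
made up for the example::

   import sqlite3

   conn = sqlite3.connect(":memory:")
   conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
   conn.executemany("INSERT INTO people VALUES (?, ?)",
                    [("Josephine", 34), ("Guy", 52)])
   conn.commit()

   # Use ? placeholders rather than string formatting for parameters.
   rows = conn.execute(
       "SELECT name FROM people WHERE age > ? ORDER BY name", (40,)
   ).fetchall()
   print(rows)
   conn.close()
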
Support for most relational databases is available. See the
`DatabaseProgramming wiki page
<http://wiki.python.org/moin/DatabaseProgramming>`_ for details.


How do you implement persistent objects in Python?
--------------------------------------------------

The :mod:`pickle` library module solves this in a very general way (though you
still can't store things like open files, sockets or windows), and the
:mod:`shelve` library module uses pickle and (g)dbm to create persistent
mappings containing arbitrary Python objects.

A more awkward way of doing things is to use pickle's little sister, marshal.
The :mod:`marshal` module provides very fast ways to store noncircular basic
Python types to files and strings, and back again. Although marshal does not do
fancy things like store instances or handle shared references properly, it does
run extremely fast. For example, loading a half megabyte of data may take less
than a third of a second. This often beats doing something more complex and
general such as using gdbm with pickle/shelve.


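A minimal sketch of the :mod:`pickle` and :mod:`shelve` approaches follows. The
data and the file name are arbitrary, and a temporary directory is used so
that the example cleans up after itself::

   import os
   import pickle
   import shelve
   import tempfile

   shared = [1, 2, 3]
   data = {"first": shared, "second": shared, "name": "example"}

   # pickle round-trips the structure, including the shared list.
   restored = pickle.loads(pickle.dumps(data))

   # shelve stores pickled values in a dbm file under string keys.
   with tempfile.TemporaryDirectory() as tmp:
       path = os.path.join(tmp, "objects")
       with shelve.open(path) as db:
           db["data"] = data
       with shelve.open(path) as db:
           reloaded = db["data"]

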
If my program crashes with a bsddb (or anydbm) database open, it gets corrupted. How come?
------------------------------------------------------------------------------------------

.. XXX move this FAQ entry elsewhere?

.. note::

   The bsddb module is now available as a standalone package `pybsddb
   <http://www.jcea.es/programacion/pybsddb.htm>`_.

Databases opened for write access with the bsddb module (and often by the anydbm
module, since it will preferentially use bsddb) must explicitly be closed using
the ``.close()`` method of the database. The underlying library caches database
contents which need to be converted to on-disk form and written.

If you have initialized a new bsddb database but not written anything to it
before the program crashes, you will often wind up with a zero-length file and
encounter an exception the next time the file is opened.


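One way to make sure ``.close()`` is always called, even when an exception is
raised, is :func:`contextlib.closing`. The sketch below uses the standard
:mod:`dbm.dumb` module purely as a stand-in, since bsddb is not part of the
standard library; the same pattern should apply to any database object that
has a ``close()`` method. It cannot help if the interpreter is killed
outright::

   import contextlib
   import dbm.dumb
   import os
   import tempfile

   with tempfile.TemporaryDirectory() as tmp:
       path = os.path.join(tmp, "example")

       # closing() calls db.close() on normal exit and on exceptions.
       with contextlib.closing(dbm.dumb.open(path, "c")) as db:
           db[b"key"] = b"value"

       # Reopen read-only to confirm the data reached the disk.
       with contextlib.closing(dbm.dumb.open(path, "r")) as db:
           value = db[b"key"]

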
I tried to open Berkeley DB file, but bsddb produces bsddb.error: (22, 'Invalid argument'). Help! How can I restore my data?
----------------------------------------------------------------------------------------------------------------------------

.. XXX move this FAQ entry elsewhere?

.. note::

   The bsddb module is now available as a standalone package `pybsddb
   <http://www.jcea.es/programacion/pybsddb.htm>`_.

Don't panic! Your data is probably intact. The most frequent cause for the error
is that you tried to open an earlier Berkeley DB file with a later version of
the Berkeley DB library.

Many Linux systems now have all three versions of Berkeley DB available. If you
are migrating from version 1 to a newer version, use ``db_dump185`` to dump a
plain text version of the database. If you are migrating from version 2 to
version 3, use ``db2_dump`` to create a plain text version of the database. In
either case, use ``db_load`` to create a new native database for the latest
version installed on your computer. If you have version 3 of Berkeley DB
installed, you should be able to use ``db2_load`` to create a native version 2
database.

You should move away from Berkeley DB version 1 files because the hash file code
contains known bugs that can corrupt your data.


Mathematics and Numerics
========================

How do I generate random numbers in Python?
-------------------------------------------

The standard module :mod:`random` implements a random number generator. Usage
is simple::

   import random
   random.random()

This returns a random floating point number in the range [0, 1).

There are also many other specialized generators in this module, such as:

* ``randrange(a, b)`` chooses an integer in the range [a, b).
* ``uniform(a, b)`` chooses a floating point number in the range [a, b).
* ``normalvariate(mean, sdev)`` samples the normal (Gaussian) distribution.

Some higher-level functions operate on sequences directly, such as:

* ``choice(S)`` chooses a random element from a given sequence.
* ``shuffle(L)`` shuffles a list in-place, i.e. permutes it randomly.

There's also a ``Random`` class you can instantiate to create multiple
independent random number generators.
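
The functions listed above can be tried out with a short sketch such as the
following. The seed value is arbitrary; it is given only so that two
independent ``Random`` instances can be shown to produce the same sequence::

   import random

   rng = random.Random(12345)          # independent, seeded generator

   r = rng.random()                    # float in [0, 1)
   n = rng.randrange(1, 7)             # integer from 1 to 6
   u = rng.uniform(-1.0, 1.0)          # float between -1.0 and 1.0
   g = rng.normalvariate(0.0, 1.0)     # sample from a Gaussian

   deck = list(range(10))
   rng.shuffle(deck)                   # permutes the list in place
   card = rng.choice(deck)             # one element of the list

   # A second generator with the same seed repeats the same values.
   other = random.Random(12345)
   first = other.random()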