:tocdepth: 2

=========================
Library and Extension FAQ
=========================

.. only:: html

   .. contents::

General Library Questions
=========================

How do I find a module or application to perform task X?
--------------------------------------------------------

Check :ref:`the Library Reference <library-index>` to see if there's a relevant
standard library module. (Eventually you'll learn what's in the standard
library and will be able to skip this step.)

For third-party packages, search the `Python Package Index
<http://pypi.python.org/pypi>`_ or try `Google <http://www.google.com>`_ or
another Web search engine. Searching for "Python" plus a keyword or two for
your topic of interest will usually find something helpful.


Where is the math.py (socket.py, regex.py, etc.) source file?
-------------------------------------------------------------

If you can't find a source file for a module, it may be a built-in or
dynamically loaded module implemented in C, C++ or another compiled language.
In this case you may not have the source file, or it may be something like
:file:`mathmodule.c`, somewhere in a C source directory (not on the Python path).

There are (at least) three kinds of modules in Python:

1) modules written in Python (.py);
2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc.);
3) modules written in C and linked with the interpreter; to get a list of these,
   type::

      import sys
      print sys.builtin_module_names


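As a rough sketch of how to tell these kinds apart from within Python, you can
combine ``sys.builtin_module_names`` with the module's ``__file__`` attribute.
The helper name ``describe_module`` is made up for this illustration, and the
code is written so that it should run on both Python 2 and Python 3::

   import sys

   def describe_module(name):
       """Return a short description of where module *name* comes from."""
       if name in sys.builtin_module_names:
           return "built into the interpreter"
       module = __import__(name)
       # Modules loaded from disk usually have a __file__ attribute naming
       # a .py/.pyc file or a shared library such as a .so or .pyd file.
       return getattr(module, "__file__", "no __file__ attribute")

   print(describe_module("sys"))    # built into the interpreter
   print(describe_module("json"))   # path of the json package's source file
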
How do I make a Python script executable on Unix?
-------------------------------------------------

You need to do two things: the script file's mode must be executable and the
first line must begin with ``#!`` followed by the path of the Python
interpreter.

The first is done by executing ``chmod +x scriptfile`` or perhaps ``chmod 755
scriptfile``.

The second can be done in a number of ways. The most straightforward way is to
write ::

   #!/usr/local/bin/python

as the very first line of your file, using the pathname for where the Python
interpreter is installed on your platform.

If you would like the script to be independent of where the Python interpreter
lives, you can use the :program:`env` program. Almost all Unix variants support
the following, assuming the Python interpreter is in a directory on the user's
:envvar:`PATH`::

   #!/usr/bin/env python

*Don't* do this for CGI scripts. The :envvar:`PATH` variable for CGI scripts is
often very minimal, so you need to use the actual absolute pathname of the
interpreter.

Occasionally, a user's environment is so full that the :program:`/usr/bin/env`
program fails; or there's no env program at all. In that case, you can try the
following hack (due to Alex Rezinsky)::

   #! /bin/sh
   """:"
   exec python $0 ${1+"$@"}
   """

The minor disadvantage is that this defines the script's ``__doc__`` string.
However, you can fix that by adding ::

   __doc__ = """...Whatever..."""



Is there a curses/termcap package for Python?
---------------------------------------------

.. XXX curses *is* built by default, isn't it?

For Unix variants the standard Python source distribution comes with a curses
module in the :source:`Modules` subdirectory, though it's not compiled by default.
(Note that this is not available in the Windows distribution -- there is no
curses module for Windows.)

The :mod:`curses` module supports basic curses features as well as many additional
functions from ncurses and SYSV curses such as colour, alternative character set
support, pads, and mouse support. This means the module isn't compatible with
operating systems that only have BSD curses, but there don't seem to be any
currently maintained OSes that fall into this category.

For Windows: use `the consolelib module
<http://effbot.org/zone/console-index.htm>`_.


Is there an equivalent to C's onexit() in Python?
-------------------------------------------------

The :mod:`atexit` module provides a :func:`~atexit.register` function that is
similar to C's :c:func:`onexit`.


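A minimal sketch of registering an exit handler follows; the function name
``goodbye`` and its argument are only placeholders for this illustration::

   import atexit

   def goodbye(name):
       # Called when the interpreter shuts down normally.
       print("Goodbye, %s" % name)

   # Extra arguments given to register() are passed on to the function.
   atexit.register(goodbye, "world")

Registered functions are not run if the process is killed by a signal that
Python does not handle or if :func:`os._exit` is called.
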
Why don't my signal handlers work?
----------------------------------

The most common problem is that the signal handler is declared with the wrong
argument list. It is called as ::

   handler(signum, frame)

so it should be declared with two arguments::

   def handler(signum, frame):
       ...


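For completeness, here is a small sketch of installing such a handler with
:func:`signal.signal`. The choice of ``SIGINT`` and the message printed are
arbitrary for this illustration; note that handlers can only be installed from
the main thread::

   import signal

   def handler(signum, frame):
       # signum is the signal number; frame is the interrupted stack frame.
       print("Received signal %d" % signum)

   # Install the handler for SIGINT (Ctrl-C); the previous handler is
   # returned so that it can be restored later if needed.
   previous = signal.signal(signal.SIGINT, handler)
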
Common tasks
============

How do I test a Python program or component?
--------------------------------------------

Python comes with two testing frameworks. The :mod:`doctest` module finds
examples in the docstrings for a module and runs them, comparing the output with
the expected output given in the docstring.

The :mod:`unittest` module is a fancier testing framework modelled on Java and
Smalltalk testing frameworks.

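The following sketch shows both frameworks applied to the same function. The
function ``double`` and the class ``DoubleTest`` are invented for this
illustration, and the tests are run explicitly instead of through
``unittest.main()`` so that the snippet can be pasted into any script::

   import doctest
   import unittest

   def double(x):
       """Return twice *x*.

       >>> double(2)
       4
       """
       return 2 * x

   class DoubleTest(unittest.TestCase):
       def test_double(self):
           self.assertEqual(double(3), 6)

   # Run the docstring example; nothing is printed when it passes.
   doctest.run_docstring_examples(double, {"double": double})

   # Collect and run the TestCase methods.
   suite = unittest.defaultTestLoader.loadTestsFromTestCase(DoubleTest)
   result = unittest.TextTestRunner().run(suite)
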
To make testing easier, you should use good modular design in your program.
Your program should have almost all functionality encapsulated in either
functions or class methods -- and this sometimes has the surprising and
delightful effect of making the program run faster (because local variable
accesses are faster than global accesses). Furthermore the program should avoid
depending on mutating global variables, since this makes testing much more
difficult to do.

The "global main logic" of your program may be as simple as ::

   if __name__ == "__main__":
       main_logic()

at the bottom of the main module of your program.

Once your program is organized as a tractable collection of functions and class
behaviours you should write test functions that exercise the behaviours. A test
suite that automates a sequence of tests can be associated with each module.
This sounds like a lot of work, but since Python is so terse and flexible it's
surprisingly easy. You can make coding much more pleasant and fun by writing
your test functions in parallel with the "production code", since this makes it
easy to find bugs and even design flaws earlier.

"Support modules" that are not intended to be the main module of a program may
include a self-test of the module. ::

   if __name__ == "__main__":
       self_test()

Even programs that interact with complex external interfaces may be tested when
the external interfaces are unavailable by using "fake" interfaces implemented
in Python.


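As an illustration of such a "fake" interface, the sketch below replaces a mail
server connection with an object that merely records what it was asked to send.
The names ``FakeMailer`` and ``notify`` and the addresses are hypothetical; the
only assumption is that the production code calls a ``sendmail()`` method like
the one on :class:`smtplib.SMTP`::

   class FakeMailer(object):
       """Stand-in for a real mail connection; records calls instead."""
       def __init__(self):
           self.sent = []

       def sendmail(self, fromaddr, toaddrs, msg):
           self.sent.append((fromaddr, toaddrs, msg))

   def notify(mailer, user):
       # In production, *mailer* would be an smtplib.SMTP instance.
       mailer.sendmail("noreply@example.com", [user], "Your job finished.")

   fake = FakeMailer()
   notify(fake, "user@example.com")
   assert fake.sent == [("noreply@example.com", ["user@example.com"],
                         "Your job finished.")]

Because ``notify()`` receives the mailer as an argument, the test needs neither
a network connection nor a running SMTP server.
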
How do I create documentation from doc strings?
-----------------------------------------------

The :mod:`pydoc` module can create HTML from the doc strings in your Python
source code. An alternative for creating API documentation purely from
docstrings is `epydoc <http://epydoc.sf.net/>`_. `Sphinx
<http://sphinx.pocoo.org>`_ can also include docstring content.


How do I get a single keypress at a time?
-----------------------------------------

For Unix variants there are several solutions. It's straightforward to do this
using curses, but curses is a fairly large module to learn. Here's a solution
without curses::

   import termios, fcntl, sys, os
   fd = sys.stdin.fileno()

   oldterm = termios.tcgetattr(fd)
   newattr = termios.tcgetattr(fd)
   newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
   termios.tcsetattr(fd, termios.TCSANOW, newattr)

   oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
   fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)

   try:
       while 1:
           try:
               c = sys.stdin.read(1)
               print "Got character", repr(c)
           except IOError: pass
   finally:
       termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
       fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)

You need the :mod:`termios` and the :mod:`fcntl` module for any of this to work,
and I've only tried it on Linux, though it should work elsewhere. In this code,
characters are read and printed one at a time.

:func:`termios.tcsetattr` turns off stdin's echoing and disables canonical mode.
:func:`fcntl.fcntl` is used to obtain stdin's file descriptor flags and modify
them for non-blocking mode. Since reading stdin when it is empty results in an
:exc:`IOError`, this error is caught and ignored.


Threads
=======

How do I program using threads?
-------------------------------

.. XXX it's _thread in py3k

Be sure to use the :mod:`threading` module and not the :mod:`thread` module.
The :mod:`threading` module builds convenient abstractions on top of the
low-level primitives provided by the :mod:`thread` module.

Aahz has a set of slides from his threading tutorial that are helpful; see
http://www.pythoncraft.com/OSCON2001/.


None of my threads seem to run: why?
------------------------------------

As soon as the main thread exits, all threads are killed. Your main thread is
running too quickly, giving the threads no time to do any work.

A simple fix is to add a sleep to the end of the program that's long enough for
all the threads to finish::

   import threading, time

   def thread_task(name, n):
       for i in range(n): print name, i

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10) # <----------------------------!

But now (on many platforms) the threads don't run in parallel, but appear to run
sequentially, one at a time! The reason is that the OS thread scheduler doesn't
start a new thread until the previous thread is blocked.

A simple fix is to add a tiny sleep to the start of the run function::

   def thread_task(name, n):
       time.sleep(0.001) # <---------------------!
       for i in range(n): print name, i

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)

Instead of trying to guess a good delay value for :func:`time.sleep`,
it's better to use some kind of semaphore mechanism. One idea is to use the
:mod:`Queue` module to create a queue object, let each thread append a token to
the queue when it finishes, and let the main thread read as many tokens from the
queue as there are threads.


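The queue idea can be sketched as follows. This is only one possible
arrangement: the ``done`` queue, the extra third argument to ``thread_task``
and the stand-in computation are assumptions made for this illustration, and
the import is written so that the code should run on both Python 2 and
Python 3::

   import threading
   try:
       import Queue as queue    # Python 2 name
   except ImportError:
       import queue             # Python 3 name

   def thread_task(name, n, done):
       total = sum(range(n))    # stand-in for real work
       done.put((name, total))  # token announcing that this thread finished

   done = queue.Queue()
   num_threads = 10
   for i in range(num_threads):
       t = threading.Thread(target=thread_task, args=(str(i), i, done))
       t.start()

   # get() blocks until a token arrives, so no sleep() is needed: the main
   # thread continues once every worker has reported in.
   results = [done.get() for _ in range(num_threads)]
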
How do I parcel out work among a bunch of worker threads?
---------------------------------------------------------

Use the :mod:`Queue` module to create a queue containing a list of jobs. The
:class:`~Queue.Queue` class maintains a list of objects and has a ``.put(obj)``
method that adds items to the queue and a ``.get()`` method to return them.
The class will take care of the locking necessary to ensure that each job is
handed out exactly once.

Here's a trivial example::

   import threading, Queue, time

   # The worker thread gets jobs off the queue. When the queue is empty, it
   # assumes there will be no more work and exits.
   # (Realistically workers will run until terminated.)
   def worker():
       print 'Running worker'
       time.sleep(0.1)
       while True:
           try:
               arg = q.get(block=False)
           except Queue.Empty:
               print 'Worker', threading.currentThread(),
               print 'queue empty'
               break
           else:
               print 'Worker', threading.currentThread(),
               print 'running with argument', arg
               time.sleep(0.5)

   # Create queue
   q = Queue.Queue()

   # Start a pool of 5 workers
   for i in range(5):
       t = threading.Thread(target=worker, name='worker %i' % (i+1))
       t.start()

   # Begin adding work to the queue
   for i in range(50):
       q.put(i)

   # Give threads time to run
   print 'Main thread sleeping'
   time.sleep(5)

When run, this will produce the following output:

.. code-block:: none

   Running worker
   Running worker
   Running worker
   Running worker
   Running worker
   Main thread sleeping
   Worker <Thread(worker 1, started)> running with argument 0
   Worker <Thread(worker 2, started)> running with argument 1
   Worker <Thread(worker 3, started)> running with argument 2
   Worker <Thread(worker 4, started)> running with argument 3
   Worker <Thread(worker 5, started)> running with argument 4
   Worker <Thread(worker 1, started)> running with argument 5
   ...

Consult the module's documentation for more details; the :class:`~Queue.Queue`
class provides a featureful interface.


What kinds of global value mutation are thread-safe?
----------------------------------------------------

A :term:`global interpreter lock` (GIL) is used internally to ensure that only
one thread runs in the Python VM at a time. In general, Python offers to switch
among threads only between bytecode instructions; how frequently it switches can
be set via :func:`sys.setcheckinterval`. Each bytecode instruction, and
therefore all the C implementation code reached from each instruction, is
atomic from the point of view of a Python program.

In theory, this means an exact accounting requires an exact understanding of the
PVM bytecode implementation. In practice, it means that operations on shared
variables of built-in data types (ints, lists, dicts, etc.) that "look atomic"
really are.

For example, the following operations are all atomic (L, L1, L2 are lists, D,
D1, D2 are dicts, x, y are objects, i, j are ints)::

   L.append(x)
   L1.extend(L2)
   x = L[i]
   x = L.pop()
   L1[i:j] = L2
   L.sort()
   x = y
   x.field = y
   D[x] = y
   D1.update(D2)
   D.keys()

These aren't::

   i = i+1
   L.append(L[-1])
   L[i] = L[j]
   D[x] = D[x] + 1

Operations that replace other objects may invoke those other objects'
:meth:`__del__` method when their reference count reaches zero, and that can
affect things. This is especially true for the mass updates to dictionaries and
lists. When in doubt, use a mutex!


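As a sketch of the mutex advice, the non-atomic ``i = i+1`` pattern can be
protected with a :class:`threading.Lock`. The names ``counter``,
``counter_lock`` and ``increment``, and the thread and iteration counts, are
chosen only for this illustration::

   import threading

   counter = 0
   counter_lock = threading.Lock()

   def increment(times):
       global counter
       for _ in range(times):
           # "counter = counter + 1" is a read followed by a write, so the
           # lock is held across both steps.
           with counter_lock:
               counter = counter + 1

   threads = [threading.Thread(target=increment, args=(10000,))
              for _ in range(4)]
   for t in threads:
       t.start()
   for t in threads:
       t.join()

   print(counter)   # 40000
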
Can't we get rid of the Global Interpreter Lock?
------------------------------------------------

.. XXX mention multiprocessing
.. XXX link to dbeazley's talk about GIL?

The :term:`global interpreter lock` (GIL) is often seen as a hindrance to Python's
deployment on high-end multiprocessor server machines, because a multi-threaded
Python program effectively only uses one CPU, due to the insistence that
(almost) all Python code can only run while the GIL is held.

Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive
patch set (the "free threading" patches) that removed the GIL and replaced it
with fine-grained locking. Unfortunately, even on Windows (where locks are very
efficient) this ran ordinary Python code about twice as slow as the interpreter
using the GIL. On Linux the performance loss was even worse because pthread
locks aren't as efficient.

Since then, the idea of getting rid of the GIL has occasionally come up but
nobody has found a way to deal with the expected slowdown, and users who don't
use threads would not be happy if their code ran at half the speed. Greg's
free threading patch set has not been kept up-to-date for later Python versions.

This doesn't mean that you can't make good use of Python on multi-CPU machines!
You just have to be creative with dividing the work up between multiple
*processes* rather than multiple *threads*. Judicious use of C extensions will
also help; if you use a C extension to perform a time-consuming task, the
extension can release the GIL while the thread of execution is in the C code and
allow other threads to get some work done.

It has been suggested that the GIL should be a per-interpreter-state lock rather
than truly global; interpreters then wouldn't be able to share objects.
Unfortunately, this isn't likely to happen either. It would be a tremendous
amount of work, because many object implementations currently have global state.
For example, small integers and short strings are cached; these caches would
have to be moved to the interpreter state. Other object types have their own
free list; these free lists would have to be moved to the interpreter state.
And so on.

And I doubt that it can even be done in finite time, because the same problem
exists for 3rd party extensions. It is likely that 3rd party extensions are
being written at a faster rate than you can convert them to store all their
global state in the interpreter state.

And finally, once you have multiple interpreters not sharing any state, what
have you gained over running each interpreter in a separate process?


Input and Output
================

How do I delete a file? (And other file questions...)
-----------------------------------------------------

Use ``os.remove(filename)`` or ``os.unlink(filename)``; for documentation, see
the :mod:`os` module. The two functions are identical; :func:`unlink` is simply
the name of the Unix system call for this function.

To remove a directory, use :func:`os.rmdir`; use :func:`os.mkdir` to create one.
``os.makedirs(path)`` will create any intermediate directories in ``path`` that
don't exist. ``os.removedirs(path)`` will remove intermediate directories as
long as they're empty; if you want to delete an entire directory tree and its
contents, use :func:`shutil.rmtree`.

To rename a file, use ``os.rename(old_path, new_path)``.

To truncate a file, open it using ``f = open(filename, "r+")``, and use
``f.truncate(offset)``; offset defaults to the current seek position. There's
also ``os.ftruncate(fd, offset)`` for files opened with :func:`os.open`, where
*fd* is the file descriptor (a small integer).

The :mod:`shutil` module also contains a number of functions to work on files
including :func:`~shutil.copyfile`, :func:`~shutil.copytree`, and
:func:`~shutil.rmtree`.


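The calls above can be tried out safely inside a scratch directory. In this
sketch the directory layout and the file names ``old.txt`` and ``new.txt`` are
arbitrary; everything is created under a fresh :func:`tempfile.mkdtemp`
directory and removed again at the end::

   import os, shutil, tempfile

   base = tempfile.mkdtemp()                # scratch directory
   nested = os.path.join(base, "a", "b")
   os.makedirs(nested)                      # creates "a" and "a/b"

   old_path = os.path.join(nested, "old.txt")
   new_path = os.path.join(nested, "new.txt")
   f = open(old_path, "w")
   f.write("0123456789")
   f.close()

   os.rename(old_path, new_path)

   f = open(new_path, "r+")
   f.truncate(4)                            # keep only the first 4 bytes
   f.close()
   size = os.path.getsize(new_path)

   os.remove(new_path)                      # delete the file itself
   shutil.rmtree(base)                      # delete the whole scratch tree
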
How do I copy a file?
---------------------

The :mod:`shutil` module contains a :func:`~shutil.copyfile` function. Note
that on MacOS 9 it doesn't copy the resource fork and Finder info.


How do I read (or write) binary data?
-------------------------------------

To read or write complex binary data formats, it's best to use the :mod:`struct`
module. It allows you to take a string containing binary data (usually numbers)
and convert it to Python objects; and vice versa.

For example, the following code reads two 2-byte integers and one 4-byte integer
in big-endian format from a file::

   import struct

   f = open(filename, "rb")  # Open in binary mode for portability
   s = f.read(8)
   x, y, z = struct.unpack(">hhl", s)

The '>' in the format string forces big-endian data; the letter 'h' reads one
"short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the
string.

For data that is more regular (e.g. a homogeneous list of ints or floats),
you can also use the :mod:`array` module.


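The reverse direction works the same way. As a small sketch using the same
``">hhl"`` format, with the values 1, 2 and 3 picked arbitrarily,
:func:`struct.pack` produces the 8 bytes that :func:`struct.unpack` turns back
into numbers::

   import struct

   # Two 2-byte integers and one 4-byte integer, big-endian: 8 bytes total.
   data = struct.pack(">hhl", 1, 2, 3)
   assert struct.calcsize(">hhl") == len(data) == 8
   assert data == b"\x00\x01\x00\x02\x00\x00\x00\x03"

   # Unpacking the packed bytes gives the original values back.
   x, y, z = struct.unpack(">hhl", data)
   assert (x, y, z) == (1, 2, 3)
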
I can't seem to use os.read() on a pipe created with os.popen(); why?
---------------------------------------------------------------------

:func:`os.read` is a low-level function which takes a file descriptor, a small
integer representing the opened file. :func:`os.popen` creates a high-level
file object, the same type returned by the built-in :func:`open` function.
Thus, to read *n* bytes from a pipe *p* created with :func:`os.popen`, you need to
use ``p.read(n)``.


How do I run a subprocess with pipes connected to both input and output?
------------------------------------------------------------------------

.. XXX update to use subprocess

Use the :mod:`popen2` module. For example::

   import popen2
   fromchild, tochild = popen2.popen2("command")
   tochild.write("input\n")
   tochild.flush()
   output = fromchild.readline()

Warning: in general it is unwise to do this because you can easily cause a
deadlock where your process is blocked waiting for output from the child while
the child is blocked waiting for input from you. This can be caused by the
parent expecting the child to output more text than it does or by data being
stuck in stdio buffers due to lack of flushing. The Python parent
can of course explicitly flush the data it sends to the child before it reads
any output, but if the child is a naive C program it may have been written to
never explicitly flush its output, even if it is interactive, since flushing is
normally automatic.

Note that a deadlock is also possible if you use :func:`popen3` to read stdout
and stderr. If one of the two is too large for the internal buffer (increasing
the buffer size does not help) and you ``read()`` the other one first, there is
a deadlock, too.

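The :mod:`subprocess` module (available since Python 2.4) offers one way around
both kinds of deadlock when the whole input is known in advance. In the sketch
below, the child command is an assumption made for the sake of a self-contained
example: a one-line Python program that upper-cases its input. Any command that
reads stdin and writes stdout could take its place::

   import subprocess, sys

   child = [sys.executable, "-c",
            "import sys; sys.stdout.write(sys.stdin.read().upper())"]
   p = subprocess.Popen(child, stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)

   # communicate() sends the input, closes the child's stdin, and drains
   # stdout and stderr together, so neither pipe can fill up while the
   # other one is being read. It also waits for the child to exit.
   out, err = p.communicate(b"input\n")
   status = p.returncode

``communicate()`` keeps all output in memory, so it is not suitable for a
genuinely interactive exchange with the child or for very large outputs.
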
Note on a bug in popen2: unless your program calls ``wait()`` or ``waitpid()``,
finished child processes are never removed, and eventually calls to popen2 will
fail because of a limit on the number of child processes. Calling
:func:`os.waitpid` with the :data:`os.WNOHANG` option can prevent this; a good
place to insert such a call would be before calling ``popen2`` again.

In many cases, all you really need is to run some data through a command and get
the result back. Unless the amount of data is very large, the easiest way to do
this is to write it to a temporary file and run the command with that temporary
file as input. The standard module :mod:`tempfile` exports a
:func:`~tempfile.mktemp` function to generate unique temporary file names. ::

   import tempfile
   import os

   class Popen3:
       """
       This is a deadlock-safe version of popen that returns
       an object with errorlevel, out (a string) and err (a string).
       (capturestderr may not work under Windows.)
       Example: print Popen3('grep spam','\n\nhere spam\n\n').out
       """
       def __init__(self, command, input=None, capturestderr=None):
           # Redirect the command's output (and optionally its input and
           # stderr) to temporary files instead of pipes.
           outfile = tempfile.mktemp()
           command = "( %s ) > %s" % (command, outfile)
           if input:
               infile = tempfile.mktemp()
               open(infile, "w").write(input)
               command = command + " <" + infile
           if capturestderr:
               errfile = tempfile.mktemp()
               command = command + " 2>" + errfile
           self.errorlevel = os.system(command) >> 8
           self.out = open(outfile, "r").read()
           os.remove(outfile)
           if input:
               os.remove(infile)
           if capturestderr:
               self.err = open(errfile, "r").read()
               os.remove(errfile)

Note that many interactive programs (e.g. vi) don't work well with pipes
substituted for standard input and output. You will have to use pseudo ttys
("ptys") instead of pipes. Or you can use a Python interface to Don Libes'
"expect" library. A Python extension that interfaces to expect is called "expy"
and available from http://expectpy.sourceforge.net. A pure Python solution that
works like expect is `pexpect <http://pypi.python.org/pypi/pexpect/>`_.


How do I access the serial (RS232) port?
----------------------------------------

For Win32, POSIX (Linux, BSD, etc.), Jython:

   http://pyserial.sourceforge.net

For Unix, see a Usenet post by Mitch Chapman:

   http://groups.google.com/groups?selm=34A04430.CF9@ohioee.com


Why doesn't closing sys.stdout (stdin, stderr) really close it?
---------------------------------------------------------------

Python file objects are a high-level layer of abstraction on top of C streams,
which in turn are a medium-level layer of abstraction on top of (among other
things) low-level C file descriptors.

For most file objects you create in Python via the built-in ``file``
constructor, ``f.close()`` marks the Python file object as being closed from
Python's point of view, and also arranges to close the underlying C stream.
This also happens automatically in ``f``'s destructor, when ``f`` becomes
garbage.

But stdin, stdout and stderr are treated specially by Python, because of the
special status also given to them by C. Running ``sys.stdout.close()`` marks
the Python-level file object as being closed, but does *not* close the
associated C stream.

To close the underlying C stream for one of these three, you should first be
sure that's what you really want to do (e.g., you may confuse extension modules
trying to do I/O). If it is, use :func:`os.close`::

   os.close(0)   # close C's stdin stream
   os.close(1)   # close C's stdout stream
   os.close(2)   # close C's stderr stream


Network/Internet Programming
============================

What WWW tools are there for Python?
------------------------------------

See the chapters titled :ref:`internet` and :ref:`netdata` in the Library
Reference Manual. Python has many modules that will help you build server-side
and client-side web systems.

.. XXX check if wiki page is still up to date

A summary of available frameworks is maintained by Paul Boddie at
http://wiki.python.org/moin/WebProgramming .

Cameron Laird maintains a useful set of pages about Python web technologies at
http://phaseit.net/claird/comp.lang.python/web_python.


How can I mimic CGI form submission (METHOD=POST)?
--------------------------------------------------

I would like to retrieve web pages that are the result of POSTing a form. Is
there existing code that would let me do this easily?

Yes. Here's a simple example that uses httplib::

   #!/usr/local/bin/python

   import httplib, sys, time

   ### build the query string
   qs = "First=Josephine&MI=Q&Last=Public"

   ### connect and send the server a path
   httpobj = httplib.HTTP('www.some-server.out-there', 80)
   httpobj.putrequest('POST', '/cgi-bin/some-cgi-script')
   ### now generate the rest of the HTTP headers...
   httpobj.putheader('Accept', '*/*')
   httpobj.putheader('Connection', 'Keep-Alive')
   httpobj.putheader('Content-type', 'application/x-www-form-urlencoded')
   httpobj.putheader('Content-length', '%d' % len(qs))
   httpobj.endheaders()
   httpobj.send(qs)
   ### find out what the server said in response...
   reply, msg, hdrs = httpobj.getreply()
   if reply != 200:
       sys.stdout.write(httpobj.getfile().read())

Note that in general for percent-encoded POST operations, query strings must be
quoted using :func:`urllib.urlencode`. For example, to send
``name=Guy Steele, Jr.``::

   >>> import urllib
   >>> urllib.urlencode({'name': 'Guy Steele, Jr.'})
   'name=Guy+Steele%2C+Jr.'


What module should I use to help with generating HTML?
------------------------------------------------------

.. XXX add modern template languages

You can find a collection of useful links on the `Web Programming wiki page
<http://wiki.python.org/moin/WebProgramming>`_.


697How do I send mail from a Python script?
698----------------------------------------
699
700Use the standard library module :mod:`smtplib`.
701
702Here's a very simple interactive mail sender that uses it. This method will
703work on any host that supports an SMTP listener. ::
704
705 import sys, smtplib
706
707 fromaddr = raw_input("From: ")
708 toaddrs = raw_input("To: ").split(',')
709 print "Enter message, end with ^D:"
710 msg = ''
711 while True:
712 line = sys.stdin.readline()
713 if not line:
714 break
715 msg += line
716
717 # The actual mail send
718 server = smtplib.SMTP('localhost')
719 server.sendmail(fromaddr, toaddrs, msg)
720 server.quit()

A Unix-only alternative uses sendmail.  The location of the sendmail program
varies between systems; sometimes it is ``/usr/lib/sendmail``, sometimes
``/usr/sbin/sendmail``.  The sendmail manual page will help you out.  Here's
some sample code::

   import os

   SENDMAIL = "/usr/sbin/sendmail"  # sendmail location
   p = os.popen("%s -t -i" % SENDMAIL, "w")
   p.write("To: receiver@example.com\n")
   p.write("Subject: test\n")
   p.write("\n")  # blank line separating headers from body
   p.write("Some text\n")
   p.write("some more text\n")
   sts = p.close()
   # close() returns None when the command exits with status 0
   if sts is not None:
       print "Sendmail exit status", sts


How do I avoid blocking in the connect() method of a socket?
------------------------------------------------------------

The :mod:`select` module is commonly used to help with asynchronous I/O on
sockets.

To prevent the TCP connect from blocking, you can set the socket to non-blocking
mode.  Then when you do the ``connect()``, you will either connect immediately
(unlikely) or get an exception that contains the error number as ``.errno``.
``errno.EINPROGRESS`` indicates that the connection is in progress, but hasn't
finished yet.  Different OSes will return different values, so you're going to
have to check what's returned on your system.

You can use the ``connect_ex()`` method to avoid creating an exception.  It will
just return the errno value.  To poll, you can call ``connect_ex()`` again later
(0 or ``errno.EISCONN`` indicates that you're connected), or you can pass this
socket to :func:`select.select` to check if it's writable.
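
As a rough sketch of the approach described above (assuming a Unix-like system,
where an unfinished connect reports ``errno.EINPROGRESS``), the following
function starts a connect in non-blocking mode, waits with
:func:`select.select` until the socket is writable, and then reads ``SO_ERROR``
to see whether the attempt succeeded.  The function name
``nonblocking_connect`` and its ``timeout`` parameter are illustrative only and
are not part of any library::

   import errno
   import os
   import select
   import socket


   def nonblocking_connect(address, timeout=5.0):
       """Connect to address without blocking inside connect()."""
       sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
       sock.setblocking(False)
       # connect_ex() returns an errno value instead of raising an exception
       err = sock.connect_ex(address)
       if err not in (0, errno.EINPROGRESS, errno.EISCONN):
           sock.close()
           raise socket.error(err, os.strerror(err))
       # The socket becomes writable once the connect attempt has finished
       _, writable, _ = select.select([], [sock], [], timeout)
       if not writable:
           sock.close()
           raise socket.timeout("connect timed out")
       # SO_ERROR reports whether the finished attempt succeeded (0) or failed
       err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
       if err != 0:
           sock.close()
           raise socket.error(err, os.strerror(err))
       return sock

Calling ``nonblocking_connect(('127.0.0.1', 8000))`` (the address is only a
placeholder) returns a connected socket that is still in non-blocking mode, so
later ``send()`` and ``recv()`` calls need the same kind of care.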


Databases
=========

Are there any interfaces to database packages in Python?
--------------------------------------------------------

Yes.

.. XXX remove bsddb in py3k, fix other module names

Python 2.3 includes the :mod:`bsddb` package which provides an interface to the
BerkeleyDB library.  Interfaces to disk-based hashes such as :mod:`DBM <dbm>`
and :mod:`GDBM <gdbm>` are also included with standard Python.

Support for most relational databases is available.  See the
`DatabaseProgramming wiki page
<http://wiki.python.org/moin/DatabaseProgramming>`_ for details.


How do you implement persistent objects in Python?
--------------------------------------------------

The :mod:`pickle` library module solves this in a very general way (though you
still can't store things like open files, sockets or windows), and the
:mod:`shelve` library module uses pickle and (g)dbm to create persistent
mappings containing arbitrary Python objects.  For better performance, you can
use the :mod:`cPickle` module.

A more awkward way of doing things is to use pickle's little sister, marshal.
The :mod:`marshal` module provides very fast ways to store noncircular basic
Python types to files and strings, and back again.  Although marshal does not do
fancy things like store instances or handle shared references properly, it does
run extremely fast.  For example, loading a half megabyte of data may take less
than a third of a second.  This often beats doing something more complex and
general such as using gdbm with pickle/shelve.
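
As a minimal sketch of the two approaches mentioned first, the helpers below
save one object to a file with pickle and keep keyed objects in a shelf.  The
helper names (``save_object``, ``load_object``, ``remember``, ``recall``) are
made up for this illustration, and the import fallback is only there so the
same code runs where :mod:`cPickle` is not available::

   import shelve

   try:
       import cPickle as pickle   # faster C implementation
   except ImportError:
       import pickle              # fall back where cPickle does not exist


   def save_object(obj, path):
       """Write a single picklable object to the file at path."""
       with open(path, "wb") as f:
           pickle.dump(obj, f, pickle.HIGHEST_PROTOCOL)


   def load_object(path):
       """Read back an object written by save_object()."""
       with open(path, "rb") as f:
           return pickle.load(f)


   def remember(shelf_path, key, value):
       """Store value under a string key in a persistent shelf."""
       shelf = shelve.open(shelf_path)
       try:
           shelf[key] = value
       finally:
           shelf.close()


   def recall(shelf_path, key):
       """Fetch the value stored under key in the shelf."""
       shelf = shelve.open(shelf_path)
       try:
           return shelf[key]
       finally:
           shelf.close()

A shelf behaves like a dictionary whose keys are strings, so it suits many small
objects that are looked up individually, while a single pickle file suits one
object that is always loaded whole.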


Why is cPickle so slow?
-----------------------

.. XXX update this, default protocol is 2/3

By default :mod:`pickle` uses a relatively old and slow format for backward
compatibility.  You can however specify other protocol versions that are
faster::

   import cPickle

   largeString = 'z' * (100 * 1024)
   # Protocol 1 is a binary format, unlike the default text protocol 0
   myPickle = cPickle.dumps(largeString, protocol=1)


If my program crashes with a bsddb (or anydbm) database open, it gets corrupted. How come?
------------------------------------------------------------------------------------------

Databases opened for write access with the bsddb module (and often by the anydbm
module, since it will preferentially use bsddb) must explicitly be closed using
the ``.close()`` method of the database.  The underlying library caches database
contents which need to be converted to on-disk form and written.

If you have initialized a new bsddb database but not written anything to it
before the program crashes, you will often wind up with a zero-length file and
encounter an exception the next time the file is opened.
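
One way to make sure ``close()`` runs even when an exception propagates is a
``try``/``finally`` block.  The sketch below uses :mod:`anydbm` and falls back
to the ``dbm`` package on interpreters where that name is used instead (an
assumption made so the example runs in both places).  The function name
``store_items`` is made up for illustration.  This cannot help if the process
is killed outright or the interpreter itself crashes::

   try:
       import anydbm as dbm    # Python 2 name
   except ImportError:
       import dbm              # assumed equivalent on Python 3


   def store_items(path, items):
       """Write (key, value) byte-string pairs and always close the database."""
       db = dbm.open(path, "c")    # "c": create the database if needed
       try:
           for key, value in items:
               db[key] = value
       finally:
           # Flushes cached contents to disk even if the loop raised
           db.close()

For example, ``store_items("cache", [(b"spam", b"eggs")])`` leaves a properly
closed database behind whether or not one of the assignments fails.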


I tried to open Berkeley DB file, but bsddb produces bsddb.error: (22, 'Invalid argument'). Help! How can I restore my data?
----------------------------------------------------------------------------------------------------------------------------

Don't panic! Your data is probably intact.  The most frequent cause for the error
is that you tried to open an earlier Berkeley DB file with a later version of
the Berkeley DB library.

Many Linux systems now have all three versions of Berkeley DB available.  If you
are migrating from version 1 to a newer version use ``db_dump185`` to dump a
plain text version of the database.  If you are migrating from version 2 to
version 3 use ``db2_dump`` to create a plain text version of the database.  In
either case, use ``db_load`` to create a new native database for the latest
version installed on your computer.  If you have version 3 of Berkeley DB
installed, you should be able to use ``db2_load`` to create a native version 2
database.

You should move away from Berkeley DB version 1 files because the hash file code
contains known bugs that can corrupt your data.


Mathematics and Numerics
========================

How do I generate random numbers in Python?
-------------------------------------------

The standard module :mod:`random` implements a random number generator.  Usage
is simple::

   import random
   random.random()

This returns a random floating point number in the range [0, 1).

There are also many other specialized generators in this module, such as:

* ``randrange(a, b)`` chooses an integer in the range [a, b).
* ``uniform(a, b)`` chooses a floating point number in the range [a, b).
* ``normalvariate(mean, sdev)`` samples the normal (Gaussian) distribution.

Some higher-level functions operate on sequences directly, such as:

* ``choice(S)`` chooses a random element from a given sequence.
* ``shuffle(L)`` shuffles a list in-place, i.e. permutes it randomly.

There's also a ``Random`` class you can instantiate to create multiple
independent random number generators.
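
The following sketch exercises the functions listed above through a private
``Random`` instance, so that the draws do not disturb the module-level
generator.  The function name ``sample_draws`` and the dictionary keys are
invented for this example::

   import random


   def sample_draws(seed):
       """Return one draw from each generator, using an independent Random."""
       rng = random.Random(seed)
       deck = list(range(10))
       rng.shuffle(deck)                           # permutes deck in place
       return {
           "float": rng.random(),                  # float in [0, 1)
           "die": rng.randrange(1, 7),             # integer from 1 to 6
           "uniform": rng.uniform(2.0, 5.0),       # float between 2.0 and 5.0
           "normal": rng.normalvariate(0.0, 1.0),  # mean 0.0, sdev 1.0
           "pick": rng.choice("abc"),              # one element of the sequence
           "deck": deck,
       }

Two calls with the same seed, such as ``sample_draws(42)``, return identical
results, which is useful for reproducible tests and simulations.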