:tocdepth: 2

=========================
Library and Extension FAQ
=========================

.. contents::

General Library Questions
=========================

How do I find a module or application to perform task X?
--------------------------------------------------------

Check :ref:`the Library Reference <library-index>` to see if there's a relevant
standard library module. (Eventually you'll learn what's in the standard
library and will be able to skip this step.)

For third-party packages, search the `Python Package Index
<http://pypi.python.org/pypi>`_ or try `Google <http://www.google.com>`_ or
another Web search engine. Searching for "Python" plus a keyword or two for
your topic of interest will usually find something helpful.


Where is the math.py (socket.py, regex.py, etc.) source file?
-------------------------------------------------------------

If you can't find a source file for a module, it may be a built-in or
dynamically loaded module implemented in C, C++ or another compiled language.
In this case you may not have the source file or it may be something like
:file:`mathmodule.c`, somewhere in a C source directory (not on the Python Path).

There are (at least) three kinds of modules in Python:

1) modules written in Python (.py);
2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc);
3) modules written in C and linked with the interpreter; to get a list of these,
   type::

      import sys
      print sys.builtin_module_names
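
To check which of these kinds a particular module is, one option is to combine
``sys.builtin_module_names`` with the module's ``__file__`` attribute. The
following is only a sketch: the helper name ``module_kind`` and the labels it
returns are made up for illustration, and the code sticks to syntax that is
valid in both Python 2.7 and Python 3.

```python
import importlib
import sys


def module_kind(name):
    """Classify a module by how it is implemented.

    Hypothetical helper for illustration; the labels are arbitrary.
    """
    if name in sys.builtin_module_names:
        return "built-in"              # linked into the interpreter
    module = importlib.import_module(name)
    filename = getattr(module, "__file__", None)
    if filename is None:
        return "unknown"               # no file recorded for this module
    if filename.endswith((".py", ".pyc", ".pyo")):
        return "python"                # written in Python
    return "extension"                 # dynamically loaded (.so, .pyd, ...)


print(module_kind("sys"))
print(module_kind("json"))
```

Importing the module is needed to look at ``__file__``, so this approach only
suits modules that are safe to import.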


How do I make a Python script executable on Unix?
-------------------------------------------------

You need to do two things: the script file's mode must be executable and the
first line must begin with ``#!`` followed by the path of the Python
interpreter.

The first is done by executing ``chmod +x scriptfile`` or perhaps ``chmod 755
scriptfile``.

The second can be done in a number of ways. The most straightforward way is to
write ::

   #!/usr/local/bin/python

as the very first line of your file, using the pathname for where the Python
interpreter is installed on your platform.

If you would like the script to be independent of where the Python interpreter
lives, you can use the :program:`env` program. Almost all Unix variants support
the following, assuming the Python interpreter is in a directory on the user's
:envvar:`PATH`::

   #!/usr/bin/env python

*Don't* do this for CGI scripts. The :envvar:`PATH` variable for CGI scripts is
often very minimal, so you need to use the actual absolute pathname of the
interpreter.

Occasionally, a user's environment is so full that the :program:`/usr/bin/env`
program fails; or there's no env program at all. In that case, you can try the
following hack (due to Alex Rezinsky)::

   #! /bin/sh
   """:"
   exec python $0 ${1+"$@"}
   """

The minor disadvantage is that this defines the script's ``__doc__`` string.
However, you can fix that by adding ::

   __doc__ = """...Whatever..."""


Is there a curses/termcap package for Python?
---------------------------------------------

.. XXX curses *is* built by default, isn't it?

For Unix variants the standard Python source distribution comes with a curses
module in the :source:`Modules` subdirectory, though it's not compiled by default.
(Note that this is not available in the Windows distribution -- there is no
curses module for Windows.)

The :mod:`curses` module supports basic curses features as well as many additional
functions from ncurses and SYSV curses such as colour, alternative character set
support, pads, and mouse support. This means the module isn't compatible with
operating systems that only have BSD curses, but there don't seem to be any
currently maintained OSes that fall into this category.

For Windows: use `the consolelib module
<http://effbot.org/zone/console-index.htm>`_.


Is there an equivalent to C's onexit() in Python?
-------------------------------------------------

The :mod:`atexit` module provides a :func:`~atexit.register` function that is
similar to C's :c:func:`onexit`.
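
As a minimal sketch of how registration looks in practice (the function
``goodbye`` and its argument are invented for this example),
``atexit.register`` takes the callable plus any arguments to pass to it, and
returns the callable unchanged:

```python
import atexit


def goodbye(name):
    # Called when the interpreter exits normally.
    print("Goodbye, %s" % name)


# Extra positional arguments are stored and passed to the function at exit.
registered = atexit.register(goodbye, "world")
```

Registered functions are not called when the process is killed by a signal
that Python does not handle, or when ``os._exit()`` is used.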


Why don't my signal handlers work?
----------------------------------

The most common problem is that the signal handler is declared with the wrong
argument list. It is called as ::

   handler(signum, frame)

so it should be declared with two arguments::

   def handler(signum, frame):
       ...


Common tasks
============

How do I test a Python program or component?
--------------------------------------------

Python comes with two testing frameworks. The :mod:`doctest` module finds
examples in the docstrings for a module and runs them, comparing the output with
the expected output given in the docstring.

The :mod:`unittest` module is a fancier testing framework modelled on Java and
Smalltalk testing frameworks.

To make testing easier, you should use good modular design in your program.
Your program should have almost all functionality
encapsulated in either functions or class methods -- and this sometimes has the
surprising and delightful effect of making the program run faster (because local
variable accesses are faster than global accesses). Furthermore, the program
should avoid depending on mutating global variables, since this makes testing
much more difficult to do.

The "global main logic" of your program may be as simple as ::

   if __name__ == "__main__":
       main_logic()

at the bottom of the main module of your program.

Once your program is organized as a tractable collection of functions and class
behaviours, you should write test functions that exercise the behaviours. A test
suite that automates a sequence of tests can be associated with each module.
This sounds like a lot of work, but since Python is so terse and flexible it's
surprisingly easy. You can make coding much more pleasant and fun by writing
your test functions in parallel with the "production code", since this makes it
easy to find bugs and even design flaws earlier.

"Support modules" that are not intended to be the main module of a program may
include a self-test of the module. ::

   if __name__ == "__main__":
       self_test()

Even programs that interact with complex external interfaces may be tested when
the external interfaces are unavailable by using "fake" interfaces implemented
in Python.
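
To make the two frameworks concrete, here is a small sketch of a support module
with a docstring example for :mod:`doctest`, a :mod:`unittest` test case, and a
``self_test()`` function of the kind mentioned above. The names ``average``,
``AverageTest`` and the test methods are invented for illustration.

```python
import unittest


def average(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> average([1, 2, 3])
    2.0
    """
    return sum(values) / float(len(values))


class AverageTest(unittest.TestCase):
    def test_single_value(self):
        self.assertEqual(average([4]), 4.0)

    def test_empty_sequence(self):
        # len([]) is 0, so the division raises ZeroDivisionError.
        self.assertRaises(ZeroDivisionError, average, [])


def self_test():
    """Run the unittest cases and return True if they all pass."""
    suite = unittest.TestLoader().loadTestsFromTestCase(AverageTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    self_test()
```

If this were saved as a file, running ``python -m doctest -v`` on it would
additionally check the example embedded in the docstring of ``average``.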


How do I create documentation from doc strings?
-----------------------------------------------

The :mod:`pydoc` module can create HTML from the doc strings in your Python
source code. An alternative for creating API documentation purely from
docstrings is `epydoc <http://epydoc.sf.net/>`_. `Sphinx
<http://sphinx.pocoo.org>`_ can also include docstring content.
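
As a rough illustration of what :mod:`pydoc` extracts, the sketch below renders
the documentation of a single function as plain text. The function ``add`` is
made up for the example; ``pydoc.render_doc`` and ``pydoc.plain`` are the
helpers that produce the text shown by ``help()``.

```python
import pydoc


def add(a, b):
    """Return the sum of a and b."""
    return a + b


# render_doc() builds the text that help() would display; plain() strips
# the overstrike sequences pydoc uses to mark bold text.
text = pydoc.plain(pydoc.render_doc(add))
print(text)
```

For HTML output, running ``python -m pydoc -w`` followed by a module name
writes an HTML file for that module into the current directory.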


How do I get a single keypress at a time?
-----------------------------------------

For Unix variants there are several solutions. It's straightforward to do this
using curses, but curses is a fairly large module to learn. Here's a solution
without curses::

   import termios, fcntl, sys, os
   fd = sys.stdin.fileno()

   # Turn off canonical (line-at-a-time) mode and echoing
   oldterm = termios.tcgetattr(fd)
   newattr = termios.tcgetattr(fd)
   newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
   termios.tcsetattr(fd, termios.TCSANOW, newattr)

   # Make reads from stdin non-blocking
   oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
   fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)

   try:
       while 1:
           try:
               c = sys.stdin.read(1)
               print "Got character", repr(c)
           except IOError: pass
   finally:
       # Restore the original terminal settings and file flags
       termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
       fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)

You need the :mod:`termios` and the :mod:`fcntl` modules for any of this to work,
and I've only tried it on Linux, though it should work elsewhere. In this code,
characters are read and printed one at a time.

:func:`termios.tcsetattr` turns off stdin's echoing and disables canonical mode.
:func:`fcntl.fcntl` is used to obtain stdin's file descriptor flags and modify
them for non-blocking mode. Since reading stdin when it is empty results in an
:exc:`IOError`, this error is caught and ignored.


Threads
=======

How do I program using threads?
-------------------------------

.. XXX it's _thread in py3k

Be sure to use the :mod:`threading` module and not the :mod:`thread` module.
The :mod:`threading` module builds convenient abstractions on top of the
low-level primitives provided by the :mod:`thread` module.

Aahz has a set of slides from his threading tutorial that are helpful; see
http://www.pythoncraft.com/OSCON2001/.


None of my threads seem to run: why?
------------------------------------

As soon as the main thread exits, all threads are killed. Your main thread is
running too quickly, giving the threads no time to do any work.

A simple fix is to add a sleep to the end of the program that's long enough for
all the threads to finish::

   import threading, time

   def thread_task(name, n):
       for i in range(n): print name, i

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10) # <----------------------------!

But now (on many platforms) the threads don't run in parallel, but appear to run
sequentially, one at a time! The reason is that the OS thread scheduler doesn't
start a new thread until the previous thread is blocked.

A simple fix is to add a tiny sleep to the start of the run function::

   def thread_task(name, n):
       time.sleep(0.001) # <---------------------!
       for i in range(n): print name, i

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)

Instead of trying to guess a good delay value for :func:`time.sleep`,
it's better to use some kind of semaphore mechanism. One idea is to use the
:mod:`Queue` module to create a queue object, let each thread append a token to
the queue when it finishes, and let the main thread read as many tokens from the
queue as there are threads.
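
The queue-of-tokens idea might look like the sketch below. It is an
illustration only: the ``done`` queue, the extra ``done`` parameter and the
summing "work" are assumptions made for the example, and the ``try``/``except``
import lets the same code run where the module is spelled ``queue`` instead of
``Queue``.

```python
import threading

try:
    import Queue as queue   # Python 2 module name
except ImportError:
    import queue            # Python 3 module name


def thread_task(name, n, done):
    total = sum(range(n))    # stand-in for real work
    done.put((name, total))  # token announcing that this thread finished


done = queue.Queue()
num_threads = 10
for i in range(num_threads):
    t = threading.Thread(target=thread_task, args=(str(i), i, done))
    t.start()

# Block until one token per thread has arrived; no sleep is needed.
results = dict(done.get() for _ in range(num_threads))
print(sorted(results.items()))
```

Because ``done.get()`` blocks until a token is available, the main thread waits
exactly as long as the slowest worker and no longer.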


How do I parcel out work among a bunch of worker threads?
---------------------------------------------------------

Use the :mod:`Queue` module to create a queue containing a list of jobs. The
:class:`~Queue.Queue` class maintains a list of objects and has a ``.put(obj)``
method that adds items to the queue and a ``.get()`` method to return them.
The class will take care of the locking necessary to ensure that each job is
handed out exactly once.

Here's a trivial example::

   import threading, Queue, time

   # The worker thread gets jobs off the queue.  When the queue is empty, it
   # assumes there will be no more work and exits.
   # (Realistically workers will run until terminated.)
   def worker():
       print 'Running worker'
       time.sleep(0.1)
       while True:
           try:
               arg = q.get(block=False)
           except Queue.Empty:
               print 'Worker', threading.currentThread(),
               print 'queue empty'
               break
           else:
               print 'Worker', threading.currentThread(),
               print 'running with argument', arg
               time.sleep(0.5)

   # Create queue
   q = Queue.Queue()

   # Start a pool of 5 workers
   for i in range(5):
       t = threading.Thread(target=worker, name='worker %i' % (i+1))
       t.start()

   # Begin adding work to the queue
   for i in range(50):
       q.put(i)

   # Give threads time to run
   print 'Main thread sleeping'
   time.sleep(5)

When run, this will produce the following output:

.. code-block:: none

   Running worker
   Running worker
   Running worker
   Running worker
   Running worker
   Main thread sleeping
   Worker <Thread(worker 1, started)> running with argument 0
   Worker <Thread(worker 2, started)> running with argument 1
   Worker <Thread(worker 3, started)> running with argument 2
   Worker <Thread(worker 4, started)> running with argument 3
   Worker <Thread(worker 5, started)> running with argument 4
   Worker <Thread(worker 1, started)> running with argument 5
   ...

Consult the module's documentation for more details; the :class:`~Queue.Queue`
class provides a featureful interface.


What kinds of global value mutation are thread-safe?
----------------------------------------------------

A :term:`global interpreter lock` (GIL) is used internally to ensure that only
one thread runs in the Python VM at a time. In general, Python offers to switch
among threads only between bytecode instructions; how frequently it switches can
be set via :func:`sys.setcheckinterval`. Each bytecode instruction, and
therefore all the C implementation code reached from that instruction, is
atomic from the point of view of a Python program.

In theory, this means an exact accounting requires an exact understanding of the
PVM bytecode implementation. In practice, it means that operations on shared
variables of built-in data types (ints, lists, dicts, etc) that "look atomic"
really are.

For example, the following operations are all atomic (L, L1, L2 are lists, D,
D1, D2 are dicts, x, y are objects, i, j are ints)::

   L.append(x)
   L1.extend(L2)
   x = L[i]
   x = L.pop()
   L1[i:j] = L2
   L.sort()
   x = y
   x.field = y
   D[x] = y
   D1.update(D2)
   D.keys()

These aren't::

   i = i+1
   L.append(L[-1])
   L[i] = L[j]
   D[x] = D[x] + 1

Operations that replace other objects may invoke those other objects'
:meth:`__del__` method when their reference count reaches zero, and that can
affect things. This is especially true for the mass updates to dictionaries and
lists. When in doubt, use a mutex!
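
As a sketch of the "use a mutex" advice, the example below guards a
read-modify-write of the ``D[x] = D[x] + 1`` kind with a
``threading.Lock``. The names ``counts``, ``counts_lock`` and ``record_hits``
are made up for illustration.

```python
import threading

counts = {"hits": 0}
counts_lock = threading.Lock()


def record_hits(times):
    for _ in range(times):
        # Reading the old value and storing the new one are separate steps,
        # so the lock is held across both of them.
        with counts_lock:
            counts["hits"] = counts["hits"] + 1


threads = [threading.Thread(target=record_hits, args=(10000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counts["hits"])
```

With the lock in place, every increment is applied, so the final count equals
the number of threads times the number of increments per thread.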


Can't we get rid of the Global Interpreter Lock?
------------------------------------------------

.. XXX mention multiprocessing
.. XXX link to dbeazley's talk about GIL?

The :term:`global interpreter lock` (GIL) is often seen as a hindrance to Python's
deployment on high-end multiprocessor server machines, because a multi-threaded
Python program effectively only uses one CPU, due to the insistence that
(almost) all Python code can only run while the GIL is held.

Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive
patch set (the "free threading" patches) that removed the GIL and replaced it
with fine-grained locking. Unfortunately, even on Windows (where locks are very
efficient) this ran ordinary Python code about twice as slow as the interpreter
using the GIL. On Linux the performance loss was even worse because pthread
locks aren't as efficient.

Since then, the idea of getting rid of the GIL has occasionally come up but
nobody has found a way to deal with the expected slowdown, and users who don't
use threads would not be happy if their code ran at half the speed. Greg's
free threading patch set has not been kept up-to-date for later Python versions.

This doesn't mean that you can't make good use of Python on multi-CPU machines!
You just have to be creative with dividing the work up between multiple
*processes* rather than multiple *threads*. Judicious use of C extensions will
also help; if you use a C extension to perform a time-consuming task, the
extension can release the GIL while the thread of execution is in the C code and
allow other threads to get some work done.

It has been suggested that the GIL should be a per-interpreter-state lock rather
than truly global; interpreters then wouldn't be able to share objects.
Unfortunately, this isn't likely to happen either. It would be a tremendous
amount of work, because many object implementations currently have global state.
For example, small integers and short strings are cached; these caches would
have to be moved to the interpreter state. Other object types have their own
free list; these free lists would have to be moved to the interpreter state.
And so on.

And I doubt that it can even be done in finite time, because the same problem
exists for 3rd party extensions. It is likely that 3rd party extensions are
being written at a faster rate than you can convert them to store all their
global state in the interpreter state.

And finally, once you have multiple interpreters not sharing any state, what
have you gained over running each interpreter in a separate process?


Input and Output
================

How do I delete a file? (And other file questions...)
-----------------------------------------------------

Use ``os.remove(filename)`` or ``os.unlink(filename)``; for documentation, see
the :mod:`os` module. The two functions are identical; :func:`~os.unlink` is simply
the name of the Unix system call for this function.

To remove a directory, use :func:`os.rmdir`; use :func:`os.mkdir` to create one.
``os.makedirs(path)`` will create any intermediate directories in ``path`` that
don't exist. ``os.removedirs(path)`` will remove intermediate directories as
long as they're empty; if you want to delete an entire directory tree and its
contents, use :func:`shutil.rmtree`.

To rename a file, use ``os.rename(old_path, new_path)``.

To truncate a file, open it using ``f = open(filename, "r+")``, and use
``f.truncate(offset)``; offset defaults to the current seek position. There's
also ``os.ftruncate(fd, offset)`` for files opened with :func:`os.open`, where
*fd* is the file descriptor (a small integer).

The :mod:`shutil` module also contains a number of functions to work on files
including :func:`~shutil.copyfile`, :func:`~shutil.copytree`, and
:func:`~shutil.rmtree`.
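
The calls above can be tried out safely inside a scratch directory. The sketch
below is illustrative only; the directory layout and the file names
``draft.txt`` and ``final.txt`` are invented for the example.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                  # scratch directory for the demo
nested = os.path.join(base, "a", "b")
os.makedirs(nested)                        # creates "a" and then "a/b"

old_path = os.path.join(nested, "draft.txt")
new_path = os.path.join(nested, "final.txt")
with open(old_path, "w") as f:
    f.write("0123456789")

os.rename(old_path, new_path)              # rename the file

with open(new_path, "r+") as f:
    f.truncate(4)                          # keep only the first 4 bytes
size_after_truncate = os.path.getsize(new_path)

os.remove(new_path)                        # delete the file itself
shutil.rmtree(base)                        # remove the whole directory tree
tree_exists = os.path.exists(base)
```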


How do I copy a file?
---------------------

The :mod:`shutil` module contains a :func:`~shutil.copyfile` function. Note
that on MacOS 9 it doesn't copy the resource fork and Finder info.


How do I read (or write) binary data?
-------------------------------------

To read or write complex binary data formats, it's best to use the :mod:`struct`
module. It allows you to take a string containing binary data (usually numbers)
and convert it to Python objects; and vice versa.

For example, the following code reads two 2-byte integers and one 4-byte integer
in big-endian format from a file::

   import struct

   f = open(filename, "rb")  # Open in binary mode for portability
   s = f.read(8)
   x, y, z = struct.unpack(">hhl", s)

The '>' in the format string forces big-endian data; the letter 'h' reads one
"short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the
string.

For data that is more regular (e.g. a homogeneous list of ints or floats),
you can also use the :mod:`array` module.
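
The reverse direction uses ``struct.pack`` with the same format string. The
short sketch below packs three example values (chosen arbitrarily), unpacks
them again, and stores a few floats in an ``array`` for comparison.

```python
import array
import struct

# Pack two 2-byte integers and one 4-byte integer in big-endian order,
# giving the same 8-byte layout that the ">hhl" format reads.
packed = struct.pack(">hhl", 1, 2, 3)
x, y, z = struct.unpack(">hhl", packed)

# With ">" the field sizes are fixed, independent of the platform.
record_size = struct.calcsize(">hhl")

# A homogeneous sequence of C doubles stored compactly.
samples = array.array("d", [1.0, 2.5, 4.0])
```

``struct.calcsize`` is a convenient way to find out how many bytes to read for
one record before calling ``struct.unpack``.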


I can't seem to use os.read() on a pipe created with os.popen(); why?
---------------------------------------------------------------------

:func:`os.read` is a low-level function which takes a file descriptor, a small
integer representing the opened file. :func:`os.popen` creates a high-level
file object, the same type returned by the built-in :func:`open` function.
Thus, to read *n* bytes from a pipe *p* created with :func:`os.popen`, you need to
use ``p.read(n)``.
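
The difference between the two levels can be seen side by side in the sketch
below. It assumes a shell in which ``echo hello`` prints ``hello``; the
variable names are made up for the example.

```python
import os

# os.pipe() returns two raw file descriptors (small integers), which is
# what os.read() and os.write() operate on.
read_fd, write_fd = os.pipe()
os.write(write_fd, b"hello")
os.close(write_fd)
raw = os.read(read_fd, 5)
os.close(read_fd)

# os.popen() returns a file object instead, so its read() method is used.
p = os.popen("echo hello")
text = p.read(5)
p.close()
```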


How do I run a subprocess with pipes connected to both input and output?
------------------------------------------------------------------------

.. XXX update to use subprocess

Use the :mod:`popen2` module. For example::

   import popen2
   fromchild, tochild = popen2.popen2("command")
   tochild.write("input\n")
   tochild.flush()
   output = fromchild.readline()

Warning: in general it is unwise to do this because you can easily cause a
deadlock where your process is blocked waiting for output from the child while
the child is blocked waiting for input from you. This can be caused by the
parent expecting the child to output more text than it does or by data being
stuck in stdio buffers due to lack of flushing. The Python parent
can of course explicitly flush the data it sends to the child before it reads
any output, but if the child is a naive C program it may have been written to
never explicitly flush its output, even if it is interactive, since flushing is
normally automatic.

Note that a deadlock is also possible if you use :func:`popen3` to read stdout
and stderr. If one of the two is too large for the internal buffer (increasing
the buffer size does not help) and you ``read()`` the other one first, there is
a deadlock, too.

Note on a bug in popen2: unless your program calls ``wait()`` or ``waitpid()``,
finished child processes are never removed, and eventually calls to popen2 will
fail because of a limit on the number of child processes. Calling
:func:`os.waitpid` with the :data:`os.WNOHANG` option can prevent this; a good
place to insert such a call would be before calling ``popen2`` again.

In many cases, all you really need is to run some data through a command and get
the result back. Unless the amount of data is very large, the easiest way to do
this is to write it to a temporary file and run the command with that temporary
file as input. The standard module :mod:`tempfile` exports a
:func:`~tempfile.mktemp` function to generate unique temporary file names. ::

   import tempfile
   import os

   class Popen3:
       """
       This is a deadlock-safe version of popen that returns
       an object with errorlevel, out (a string) and err (a string).
       (capturestderr may not work under windows.)
       Example: print Popen3('grep spam','\n\nhere spam\n\n').out
       """
       def __init__(self,command,input=None,capturestderr=None):
           outfile=tempfile.mktemp()
           command="( %s ) > %s" % (command,outfile)
           if input:
               infile=tempfile.mktemp()
               open(infile,"w").write(input)
               command=command+" <"+infile
           if capturestderr:
               errfile=tempfile.mktemp()
               command=command+" 2>"+errfile
           self.errorlevel=os.system(command) >> 8
           self.out=open(outfile,"r").read()
           os.remove(outfile)
           if input:
               os.remove(infile)
           if capturestderr:
               self.err=open(errfile,"r").read()
               os.remove(errfile)
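
The "run some data through a command and get the result back" task can also be
sketched with the :mod:`subprocess` module, which is a different technique from
the temporary-file class above. In this illustration the child command is an
assumption: a Python one-liner that upper-cases its input, started through
``sys.executable`` so that the example does not depend on any external program.

```python
import subprocess
import sys

# Stand-in child process: a Python one-liner that upper-cases its input.
# Any other command list could be used here instead.
command = [sys.executable, "-c",
           "import sys; sys.stdout.write(sys.stdin.read().upper())"]

child = subprocess.Popen(command,
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)

# communicate() sends the input, reads both output streams to the end and
# waits for the child to exit, so neither side blocks on a full pipe.
out, err = child.communicate(b"here spam\n")
status = child.returncode
```

Because ``communicate()`` collects all output in memory, this approach is only
suitable when the amount of data is not very large.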

Note that many interactive programs (e.g. vi) don't work well with pipes
substituted for standard input and output. You will have to use pseudo ttys
("ptys") instead of pipes. Or you can use a Python interface to Don Libes'
"expect" library. A Python extension that interfaces to expect is called "expy"
and available from http://expectpy.sourceforge.net. A pure Python solution that
works like expect is `pexpect <http://pypi.python.org/pypi/pexpect/>`_.


How do I access the serial (RS232) port?
----------------------------------------

For Win32, POSIX (Linux, BSD, etc.), Jython:

   http://pyserial.sourceforge.net

For Unix, see a Usenet post by Mitch Chapman:

   http://groups.google.com/groups?selm=34A04430.CF9@ohioee.com


Why doesn't closing sys.stdout (stdin, stderr) really close it?
---------------------------------------------------------------

Python file objects are a high-level layer of abstraction on top of C streams,
which in turn are a medium-level layer of abstraction on top of (among other
things) low-level C file descriptors.

For most file objects you create in Python via the built-in ``file``
constructor, ``f.close()`` marks the Python file object as being closed from
Python's point of view, and also arranges to close the underlying C stream.
This also happens automatically in ``f``'s destructor, when ``f`` becomes
garbage.

But stdin, stdout and stderr are treated specially by Python, because of the
special status also given to them by C. Running ``sys.stdout.close()`` marks
the Python-level file object as being closed, but does *not* close the
associated C stream.

To close the underlying C stream for one of these three, you should first be
sure that's what you really want to do (e.g., you may confuse extension modules
trying to do I/O). If it is, use os.close::

   os.close(0)   # close C's stdin stream
   os.close(1)   # close C's stdout stream
   os.close(2)   # close C's stderr stream


Network/Internet Programming
============================

What WWW tools are there for Python?
------------------------------------

See the chapters titled :ref:`internet` and :ref:`netdata` in the Library
Reference Manual. Python has many modules that will help you build server-side
and client-side web systems.

.. XXX check if wiki page is still up to date

A summary of available frameworks is maintained by Paul Boddie at
http://wiki.python.org/moin/WebProgramming .

Cameron Laird maintains a useful set of pages about Python web technologies at
http://phaseit.net/claird/comp.lang.python/web_python.


How can I mimic CGI form submission (METHOD=POST)?
--------------------------------------------------

I would like to retrieve web pages that are the result of POSTing a form. Is
there existing code that would let me do this easily?

Yes. Here's a simple example that uses httplib::

   #!/usr/local/bin/python

   import httplib, sys, time

   ### build the query string
   qs = "First=Josephine&MI=Q&Last=Public"

   ### connect and send the server a path
   httpobj = httplib.HTTP('www.some-server.out-there', 80)
   httpobj.putrequest('POST', '/cgi-bin/some-cgi-script')
   ### now generate the rest of the HTTP headers...
   httpobj.putheader('Accept', '*/*')
   httpobj.putheader('Connection', 'Keep-Alive')
   httpobj.putheader('Content-type', 'application/x-www-form-urlencoded')
   httpobj.putheader('Content-length', '%d' % len(qs))
   httpobj.endheaders()
   httpobj.send(qs)
   ### find out what the server said in response...
   reply, msg, hdrs = httpobj.getreply()
   if reply != 200:
       sys.stdout.write(httpobj.getfile().read())

Note that in general for percent-encoded POST operations, query strings must be
quoted using :func:`urllib.urlencode`. For example, to send
``name=Guy Steele, Jr.``::

   >>> import urllib
   >>> urllib.urlencode({'name': 'Guy Steele, Jr.'})
   'name=Guy+Steele%2C+Jr.'


What module should I use to help with generating HTML?
------------------------------------------------------

.. XXX add modern template languages

You can find a collection of useful links on the `Web Programming wiki page
<http://wiki.python.org/moin/WebProgramming>`_.


How do I send mail from a Python script?
----------------------------------------

Use the standard library module :mod:`smtplib`.

Here's a very simple interactive mail sender that uses it. This method will
work on any host that supports an SMTP listener. ::

   import sys, smtplib

   fromaddr = raw_input("From: ")
   toaddrs = raw_input("To: ").split(',')
   print "Enter message, end with ^D:"
   msg = ''
   while True:
       line = sys.stdin.readline()
       if not line:
           break
       msg += line

   # The actual mail send
   server = smtplib.SMTP('localhost')
   server.sendmail(fromaddr, toaddrs, msg)
   server.quit()

A Unix-only alternative uses sendmail. The location of the sendmail program
varies between systems; sometimes it is ``/usr/lib/sendmail``, sometimes
``/usr/sbin/sendmail``. The sendmail manual page will help you out. Here's
some sample code::

   SENDMAIL = "/usr/sbin/sendmail"  # sendmail location
   import os
   p = os.popen("%s -t -i" % SENDMAIL, "w")
   p.write("To: receiver@example.com\n")
   p.write("Subject: test\n")
   p.write("\n")  # blank line separating headers from body
   p.write("Some text\n")
   p.write("some more text\n")
   sts = p.close()
   if sts != 0:
       print "Sendmail exit status", sts
736
737
738How do I avoid blocking in the connect() method of a socket?
739------------------------------------------------------------
740
741The select module is commonly used to help with asynchronous I/O on sockets.
742
743To prevent the TCP connect from blocking, you can set the socket to non-blocking
744mode. Then when you do the ``connect()``, you will either connect immediately
745(unlikely) or get an exception that contains the error number as ``.errno``.
746``errno.EINPROGRESS`` indicates that the connection is in progress, but hasn't
747finished yet. Different OSes will return different values, so you're going to
748have to check what's returned on your system.
749
750You can use the ``connect_ex()`` method to avoid creating an exception. It will
751just return the errno value. To poll, you can call ``connect_ex()`` again later
752-- 0 or ``errno.EISCONN`` indicate that you're connected -- or you can pass this
753socket to select to check if it's writable.
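
The approach above can be sketched as follows. This assumes a Unix-like system: the set of "in progress" error values is a guess that should be checked on your platform, and ``nonblocking_connect`` is a hypothetical helper name. The demonstration connects to a throwaway listener on the local machine so that it does not depend on any outside host.

```python
import errno
import select
import socket


def nonblocking_connect(address, timeout=5.0):
    """Connect without blocking inside connect(); return the socket.

    Hypothetical helper for illustration. The returned socket is still in
    non-blocking mode.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    err = sock.connect_ex(address)  # returns an errno value, never raises it
    # Assumed set of "still connecting" values; these differ between OSes.
    in_progress = (errno.EINPROGRESS, errno.EWOULDBLOCK, errno.EALREADY)
    if err in in_progress:
        # The socket becomes writable once the connection attempt finishes.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            sock.close()
            raise socket.timeout('connect timed out')
        # SO_ERROR reports whether the attempt succeeded (0) or failed.
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err not in (0, errno.EISCONN):
        sock.close()
        raise socket.error(err, errno.errorcode.get(err, 'connect failed'))
    return sock


# Demonstration against a listener bound to an arbitrary free local port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
listener_address = listener.getsockname()

client = nonblocking_connect(listener_address)
peer_address = client.getpeername()
client.close()
listener.close()
```

In a real program the ``select()`` call would normally be part of the main event loop rather than a dedicated wait, so that other sockets can be serviced while the connection is being established.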


Databases
=========

Are there any interfaces to database packages in Python?
--------------------------------------------------------

Yes.

.. XXX remove bsddb in py3k, fix other module names

Python 2.3 includes the :mod:`bsddb` package which provides an interface to the
BerkeleyDB library. Interfaces to disk-based hashes such as :mod:`DBM <dbm>`
and :mod:`GDBM <gdbm>` are also included with standard Python.

Support for most relational databases is available. See the
`DatabaseProgramming wiki page
<http://wiki.python.org/moin/DatabaseProgramming>`_ for details.


How do you implement persistent objects in Python?
--------------------------------------------------

The :mod:`pickle` library module solves this in a very general way (though you
still can't store things like open files, sockets or windows), and the
:mod:`shelve` library module uses pickle and (g)dbm to create persistent
mappings containing arbitrary Python objects. For better performance, you can
use the :mod:`cPickle` module.
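
A minimal sketch of both modules follows. The file names and the sample data are made up for the example, and the files are written to a temporary directory so that nothing else is touched.

```python
import os
import pickle
import shelve
import tempfile

workdir = tempfile.mkdtemp()  # scratch directory for the example files
settings = {'name': 'spam', 'sizes': [1, 2, 3], 'origin': (0, 0)}

# pickle: write one object to a file, then read it back.
pickle_path = os.path.join(workdir, 'settings.pickle')
with open(pickle_path, 'wb') as f:
    pickle.dump(settings, f, pickle.HIGHEST_PROTOCOL)
with open(pickle_path, 'rb') as f:
    restored = pickle.load(f)

# shelve: a persistent mapping from string keys to pickled values.
shelf_path = os.path.join(workdir, 'settings_shelf')
shelf = shelve.open(shelf_path)
try:
    shelf['current'] = settings
finally:
    shelf.close()  # flush the data to disk

shelf = shelve.open(shelf_path)
try:
    from_shelf = shelf['current']
finally:
    shelf.close()
```

Pickle files should always be opened in binary mode, as above, because the binary protocols are not plain text.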

A more awkward way of doing things is to use pickle's little sister, marshal.
The :mod:`marshal` module provides very fast ways to store noncircular basic
Python types to files and strings, and back again. Although marshal does not do
fancy things like store instances or handle shared references properly, it does
run extremely fast. For example, loading a half megabyte of data may take less
than a third of a second. This often beats doing something more complex and
general such as using gdbm with pickle/shelve.
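
As a rough illustration, with made-up sample data, marshal round-trips nested basic types but refuses anything it does not know how to store:

```python
import marshal

data = {'name': 'spam', 'values': [1, 2.5, 'three'], 'pair': (4, 5)}

blob = marshal.dumps(data)      # serialize to a byte string
restored = marshal.loads(blob)  # and back again

# Arbitrary instances are rejected rather than stored.
try:
    marshal.dumps(object())
    instance_supported = True
except ValueError:
    instance_supported = False
```

Keep in mind that the marshal format is specific to the Python version that wrote it, so it is a poor choice for data that has to outlive an interpreter upgrade.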


Why is cPickle so slow?
-----------------------

.. XXX update this, default protocol is 2/3

By default :mod:`pickle` uses a relatively old and slow format for backward
compatibility. You can however specify other protocol versions that are
faster::

   import cPickle

   largeString = 'z' * (100 * 1024)
   myPickle = cPickle.dumps(largeString, protocol=1)
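
One visible difference between the protocols is the size of the output. The sketch below, using arbitrary sample data, pickles the same list with every protocol the running interpreter supports and records the resulting lengths. It measures size only; use :mod:`timeit` on your own data to compare speed.

```python
try:
    import cPickle as pickle  # Python 2: the C implementation
except ImportError:
    import pickle             # Python 3: there is no separate cPickle module

data = list(range(1000))

# Map each available protocol number to the size of its output in bytes.
sizes = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    sizes[protocol] = len(pickle.dumps(data, protocol))

# The data survives a round trip whichever protocol is used.
round_trip = pickle.loads(pickle.dumps(data, pickle.HIGHEST_PROTOCOL))
```

For this data the text-based protocol 0 output is noticeably larger than the binary protocol 2 output.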


If my program crashes with a bsddb (or anydbm) database open, it gets corrupted. How come?
------------------------------------------------------------------------------------------

Databases opened for write access with the bsddb module (and often by the anydbm
module, since it will preferentially use bsddb) must explicitly be closed using
the ``.close()`` method of the database. The underlying library caches database
contents which need to be converted to on-disk form and written.
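
A ``try``/``finally`` block is one way to make sure ``close()`` runs even if an exception is raised while the database is open. The sketch below uses the generic dbm interface (``anydbm`` in Python 2, ``dbm`` in Python 3) with a made-up file name in a temporary directory. It is an illustration of the pattern, not a bsddb-specific recipe.

```python
import os
import tempfile

try:
    import anydbm as dbm  # Python 2 name
except ImportError:
    import dbm            # Python 3 name for the same generic interface

path = os.path.join(tempfile.mkdtemp(), 'example')

db = dbm.open(path, 'c')  # 'c': open for writing, create if missing
try:
    db[b'spam'] = b'eggs'
finally:
    db.close()            # cached contents are written out here

db = dbm.open(path, 'r')  # reopen read-only to check the stored value
try:
    value = db[b'spam']
finally:
    db.close()
```

This only helps when the failure is a Python exception. If the interpreter is killed or crashes outright, the ``finally`` clause never runs and the file can still be left in an inconsistent state.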

If you have initialized a new bsddb database but not written anything to it
before the program crashes, you will often wind up with a zero-length file and
encounter an exception the next time the file is opened.


I tried to open a Berkeley DB file, but bsddb produces bsddb.error: (22, 'Invalid argument'). Help! How can I restore my data?
------------------------------------------------------------------------------------------------------------------------------

Don't panic! Your data is probably intact. The most frequent cause for the error
is that you tried to open an earlier Berkeley DB file with a later version of
the Berkeley DB library.

Many Linux systems now have all three versions of Berkeley DB available. If you
are migrating from version 1 to a newer version use ``db_dump185`` to dump a
plain text version of the database. If you are migrating from version 2 to
version 3 use ``db2_dump`` to create a plain text version of the database. In
either case, use ``db_load`` to create a new native database for the latest
version installed on your computer. If you have version 3 of Berkeley DB
installed, you should be able to use ``db2_load`` to create a native version 2
database.

You should move away from Berkeley DB version 1 files because the hash file code
contains known bugs that can corrupt your data.


Mathematics and Numerics
========================

How do I generate random numbers in Python?
-------------------------------------------

The standard module :mod:`random` implements a random number generator. Usage
is simple::

   import random
   random.random()

This returns a random floating point number in the range [0, 1).

There are also many other specialized generators in this module, such as:

* ``randrange(a, b)`` chooses an integer in the range [a, b).
* ``uniform(a, b)`` chooses a floating point number in the range [a, b).
* ``normalvariate(mean, sdev)`` samples the normal (Gaussian) distribution.

Some higher-level functions operate on sequences directly, such as:

* ``choice(S)`` chooses a random element from a given sequence.
* ``shuffle(L)`` shuffles a list in-place, i.e. permutes it randomly.

There's also a ``Random`` class you can instantiate to create multiple
independent random number generators.
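
The functions listed above are also available as methods on ``Random`` instances. The sketch below uses an arbitrary seed and sample values to create two generators from the same seed and call each function once:

```python
import random

# Two private generators seeded identically produce the same sequence.
rng = random.Random(12345)
twin = random.Random(12345)

x = rng.random()                  # float in [0, 1)
n = rng.randrange(10, 20)         # integer in [10, 20)
u = rng.uniform(1.0, 2.0)         # float between 1.0 and 2.0
g = rng.normalvariate(0.0, 1.0)   # sample with mean 0 and std. deviation 1

colors = ['red', 'green', 'blue']
pick = rng.choice(colors)         # one element of the sequence

deck = list(range(10))
rng.shuffle(deck)                 # reorders the list in place

same_start = (twin.random() == x)
```

Because each ``Random`` instance keeps its own state, drawing numbers from ``rng`` does not disturb ``twin`` or the module-level functions such as ``random.random()``.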