:tocdepth: 2

=========================
Library and Extension FAQ
=========================

.. contents::

General Library Questions
=========================

How do I find a module or application to perform task X?
--------------------------------------------------------

Check :ref:`the Library Reference <library-index>` to see if there's a relevant
standard library module. (Eventually you'll learn what's in the standard
library and will be able to skip this step.)

For third-party packages, search the `Python Package Index
<http://pypi.python.org/pypi>`_ or try `Google <http://www.google.com>`_ or
another Web search engine. Searching for "Python" plus a keyword or two for
your topic of interest will usually find something helpful.


Where is the math.py (socket.py, regex.py, etc.) source file?
-------------------------------------------------------------

If you can't find a source file for a module, it may be a builtin or dynamically
loaded module implemented in C, C++ or another compiled language. In this case
you may not have the source file, or it may be something like ``mathmodule.c``,
somewhere in a C source directory (not on the Python path).

There are (at least) three kinds of modules in Python:

1) modules written in Python (.py);
2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc);
3) modules written in C and linked with the interpreter; to get a list of these,
   type::

      import sys
      print(sys.builtin_module_names)

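
To find out which of these kinds a particular module belongs to, something
along the following lines may help. This is only a sketch for Python 3: the
helper name ``module_kind`` is made up for illustration, and the
classification by file suffix is a simplifying assumption.

```python
import importlib.util
import sys

def module_kind(name):
    """Classify a top-level module (hypothetical helper, not a stdlib API)."""
    if name in sys.builtin_module_names:
        return "built-in"                  # linked into the interpreter
    spec = importlib.util.find_spec(name)  # None if the module cannot be found
    if spec is None:
        return "not found"
    origin = spec.origin or ""
    if origin.endswith(".py"):
        return "python source"
    return "extension or other"            # e.g. .so / .pyd, or a frozen module

for name in ("sys", "json", "no_such_module_xyz"):
    print(name, "->", module_kind(name))
```

For a module written in Python, ``spec.origin`` is the path of the ``.py`` file
you were looking for.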

How do I make a Python script executable on Unix?
-------------------------------------------------

You need to do two things: the script file's mode must be executable and the
first line must begin with ``#!`` followed by the path of the Python
interpreter.

The first is done by executing ``chmod +x scriptfile`` or perhaps ``chmod 755
scriptfile``.

The second can be done in a number of ways. The most straightforward way is to
write ::

   #!/usr/local/bin/python

as the very first line of your file, using the pathname for where the Python
interpreter is installed on your platform.

If you would like the script to be independent of where the Python interpreter
lives, you can use the "env" program. Almost all Unix variants support the
following, assuming the Python interpreter is in a directory on the user's
$PATH::

   #!/usr/bin/env python

*Don't* do this for CGI scripts. The $PATH variable for CGI scripts is often
very minimal, so you need to use the actual absolute pathname of the
interpreter.

Occasionally, a user's environment is so full that the /usr/bin/env program
fails; or there's no env program at all. In that case, you can try the
following hack (due to Alex Rezinsky)::

   #! /bin/sh
   """:"
   exec python $0 ${1+"$@"}
   """

The minor disadvantage is that this defines the script's __doc__ string.
However, you can fix that by adding ::

   __doc__ = """...Whatever..."""


Is there a curses/termcap package for Python?
---------------------------------------------

.. XXX curses *is* built by default, isn't it?

For Unix variants: The standard Python source distribution comes with a curses
module in the ``Modules/`` subdirectory, though it's not compiled by default
(note that this is not available in the Windows distribution -- there is no
curses module for Windows).

The curses module supports basic curses features as well as many additional
functions from ncurses and SYSV curses such as colour, alternative character set
support, pads, and mouse support. This means the module isn't compatible with
operating systems that only have BSD curses, but there don't seem to be any
currently maintained OSes that fall into this category.

For Windows: use `the consolelib module
<http://effbot.org/zone/console-index.htm>`_.


Is there an equivalent to C's onexit() in Python?
-------------------------------------------------

The :mod:`atexit` module provides a register function that is similar to C's
onexit.

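
A minimal sketch of registering an exit handler is shown here. The function
name ``goodbye`` and its argument are invented for the example; only
``atexit.register`` comes from the standard library.

```python
import atexit

def goodbye(name):
    # Runs when the interpreter shuts down normally
    print("Goodbye,", name)

# register() returns the function it was given, so it can also be used
# as a decorator for handlers that take no arguments.
registered = atexit.register(goodbye, "world")
```

Handlers are not run when the process is killed by an unhandled signal or when
``os._exit()`` is called.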

Why don't my signal handlers work?
----------------------------------

The most common problem is that the signal handler is declared with the wrong
argument list. It is called as ::

   handler(signum, frame)

so it should be declared with two arguments::

   def handler(signum, frame):
       ...

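
As a small illustration, the following sketch installs a two-argument handler
for ``SIGINT`` and then sends that signal to the current process. It assumes
Python 3.8 or later (for ``signal.raise_signal``) and that it runs in the main
thread; the ``received`` list is just a name chosen for the example.

```python
import signal

received = []

def handler(signum, frame):
    # Two parameters: the signal number and the current stack frame
    received.append(signum)

previous = signal.signal(signal.SIGINT, handler)  # returns the old handler
signal.raise_signal(signal.SIGINT)                # deliver SIGINT to ourselves
print("signals seen:", received)

signal.signal(signal.SIGINT, previous)            # restore the old handler
```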

Common tasks
============

How do I test a Python program or component?
--------------------------------------------

Python comes with two testing frameworks. The :mod:`doctest` module finds
examples in the docstrings for a module and runs them, comparing the output with
the expected output given in the docstring.

The :mod:`unittest` module is a fancier testing framework modelled on Java and
Smalltalk testing frameworks.

To make testing easier, write the program with a good modular design. Your
program should have almost all functionality
encapsulated in either functions or class methods -- and this sometimes has the
surprising and delightful effect of making the program run faster (because local
variable accesses are faster than global accesses). Furthermore, the program
should avoid depending on mutating global variables, since this makes testing
much more difficult to do.

The "global main logic" of your program may be as simple as ::

   if __name__ == "__main__":
       main_logic()

at the bottom of the main module of your program.

Once your program is organized as a tractable collection of functions and class
behaviours, you should write test functions that exercise the behaviours. A test
suite that automates a sequence of tests can be associated with each module.
This sounds like a lot of work, but since Python is so terse and flexible it's
surprisingly easy. You can make coding much more pleasant and fun by writing
your test functions in parallel with the "production code", since this makes it
easy to find bugs and even design flaws earlier.

"Support modules" that are not intended to be the main module of a program may
include a self-test of the module. ::

   if __name__ == "__main__":
       self_test()

Even programs that interact with complex external interfaces may be tested when
the external interfaces are unavailable by using "fake" interfaces implemented
in Python.

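
The following sketch shows both frameworks applied to one small function. The
names ``average``, ``AverageTests`` and ``self_test`` are invented for the
example, and running the docstring through ``doctest.DocTestParser`` directly
is just one of several ways to drive doctest.

```python
import doctest
import unittest

def average(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> average([1, 2, 3])
    2.0
    """
    return sum(values) / len(values)

class AverageTests(unittest.TestCase):
    def test_simple(self):
        self.assertEqual(average([2, 4]), 3.0)

    def test_empty(self):
        with self.assertRaises(ZeroDivisionError):
            average([])

def self_test():
    """Run the docstring example and the unittest cases; True if all pass."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(average.__doc__, {"average": average},
                              "average", None, 0)
    doc_result = doctest.DocTestRunner(verbose=False).run(test)

    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageTests)
    unit_result = unittest.TextTestRunner(verbosity=0).run(suite)
    return doc_result.failed == 0 and unit_result.wasSuccessful()

if __name__ == "__main__":
    print("all tests passed:", self_test())
```

In a real module, ``doctest.testmod()`` and ``unittest.main()`` are the usual
entry points; the explicit runner objects are used here only so that
``self_test()`` can return a result instead of exiting.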

How do I create documentation from doc strings?
-----------------------------------------------

The :mod:`pydoc` module can create HTML from the doc strings in your Python
source code. An alternative for creating API documentation purely from
docstrings is `epydoc <http://epydoc.sf.net/>`_. `Sphinx
<http://sphinx.pocoo.org>`_ can also include docstring content.


How do I get a single keypress at a time?
-----------------------------------------

For Unix variants: There are several solutions. It's straightforward to do this
using curses, but curses is a fairly large module to learn. Here's a solution
without curses::

   import termios, fcntl, sys, os
   fd = sys.stdin.fileno()

   # Save the terminal settings, then turn off canonical mode and echo
   oldterm = termios.tcgetattr(fd)
   newattr = termios.tcgetattr(fd)
   newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
   termios.tcsetattr(fd, termios.TCSANOW, newattr)

   # Put stdin into non-blocking mode
   oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
   fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)

   try:
       while True:
           try:
               c = sys.stdin.read(1)
               print("Got character", repr(c))
           except IOError:
               pass
   finally:
       # Always restore the original terminal settings and flags
       termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
       fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)

You need the :mod:`termios` and the :mod:`fcntl` module for any of this to work,
and I've only tried it on Linux, though it should work elsewhere. In this code,
characters are read and printed one at a time.

:func:`termios.tcsetattr` turns off stdin's echoing and disables canonical mode.
:func:`fcntl.fcntl` is used to obtain stdin's file descriptor flags and modify
them for non-blocking mode. Since reading stdin when it is empty results in an
:exc:`IOError`, this error is caught and ignored.


Threads
=======

How do I program using threads?
-------------------------------

Be sure to use the :mod:`threading` module and not the :mod:`_thread` module.
The :mod:`threading` module builds convenient abstractions on top of the
low-level primitives provided by the :mod:`_thread` module.

Aahz has a set of slides from his threading tutorial that are helpful; see
http://www.pythoncraft.com/OSCON2001/.


None of my threads seem to run: why?
------------------------------------

As soon as the main thread exits, all threads are killed. Your main thread is
running too quickly, giving the threads no time to do any work.

A simple fix is to add a sleep to the end of the program that's long enough for
all the threads to finish::

   import threading, time

   def thread_task(name, n):
       for i in range(n): print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10) # <----------------------------!

But now (on many platforms) the threads don't run in parallel, but appear to run
sequentially, one at a time! The reason is that the OS thread scheduler doesn't
start a new thread until the previous thread is blocked.

A simple fix is to add a tiny sleep to the start of the run function::

   def thread_task(name, n):
       time.sleep(0.001) # <---------------------!
       for i in range(n): print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)

Instead of trying to guess a good delay value for :func:`time.sleep`,
it's better to use some kind of semaphore mechanism. One idea is to use the
:mod:`queue` module to create a queue object, let each thread append a token to
the queue when it finishes, and let the main thread read as many tokens from the
queue as there are threads.

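
The token idea can be sketched as follows. The ``done`` queue and the choice of
the thread's name as the token are assumptions made for the example; any object
would do as a token.

```python
import queue
import threading

def thread_task(name, n, done):
    for i in range(n):
        pass                      # real work would go here
    done.put(name)                # token: this thread has finished

done = queue.Queue()
threads = [threading.Thread(target=thread_task, args=(str(i), i, done))
           for i in range(10)]
for t in threads:
    t.start()

# get() blocks until a token is available, so this loop returns only after
# every thread has reported in; no sleep() guesswork is needed.
finished = [done.get() for _ in threads]
print("finished:", sorted(finished))
```

Calling ``join()`` on each thread achieves the same effect when you only need
to wait and don't need a value back from the thread.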

How do I parcel out work among a bunch of worker threads?
---------------------------------------------------------

Use the :mod:`queue` module to create a queue containing a list of jobs. The
:class:`~queue.Queue` class maintains a list of objects and has a ``.put(obj)``
method that adds items to the queue and a ``.get()`` method to return them.
The class will take
care of the locking necessary to ensure that each job is handed out exactly
once.

Here's a trivial example::

   import threading, queue, time

   # The worker thread gets jobs off the queue. When the queue is empty, it
   # assumes there will be no more work and exits.
   # (Realistically workers will run until terminated.)
   def worker():
       print('Running worker')
       time.sleep(0.1)
       while True:
           try:
               arg = q.get(block=False)
           except queue.Empty:
               print('Worker', threading.current_thread(), end=' ')
               print('queue empty')
               break
           else:
               print('Worker', threading.current_thread(), end=' ')
               print('running with argument', arg)
               time.sleep(0.5)

   # Create queue
   q = queue.Queue()

   # Start a pool of 5 workers
   for i in range(5):
       t = threading.Thread(target=worker, name='worker %i' % (i+1))
       t.start()

   # Begin adding work to the queue
   for i in range(50):
       q.put(i)

   # Give threads time to run
   print('Main thread sleeping')
   time.sleep(5)

When run, this will produce output like the following::

   Running worker
   Running worker
   Running worker
   Running worker
   Running worker
   Main thread sleeping
   Worker <Thread(worker 1, started)> running with argument 0
   Worker <Thread(worker 2, started)> running with argument 1
   Worker <Thread(worker 3, started)> running with argument 2
   Worker <Thread(worker 4, started)> running with argument 3
   Worker <Thread(worker 5, started)> running with argument 4
   Worker <Thread(worker 1, started)> running with argument 5
   ...

Consult the module's documentation for more details; the
:class:`~queue.Queue` class provides a featureful interface.

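
As one illustration of that richer interface, the variant sketched here avoids
both the polling and the final sleep by using ``task_done()``, ``join()`` and a
``None`` sentinel to tell workers to stop. The sentinel convention and the
``results`` queue are assumptions of this sketch, not something the
:mod:`queue` module requires.

```python
import queue
import threading

def worker(jobs, results):
    while True:
        item = jobs.get()            # blocks until a job is available
        if item is None:             # sentinel: no more work for this worker
            jobs.task_done()
            break
        results.put(item * item)     # the "work": square the number
        jobs.task_done()

jobs = queue.Queue()
results = queue.Queue()
workers = [threading.Thread(target=worker, args=(jobs, results))
           for _ in range(5)]
for t in workers:
    t.start()

for i in range(50):
    jobs.put(i)
for _ in workers:
    jobs.put(None)                   # one sentinel per worker

jobs.join()                          # wait until every item was processed
for t in workers:
    t.join()

squares = sorted(results.get() for _ in range(50))
print(squares[:5], "...")
```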

What kinds of global value mutation are thread-safe?
----------------------------------------------------

A global interpreter lock (GIL) is used internally to ensure that only one
thread runs in the Python VM at a time. In general, Python offers to switch
among threads only between bytecode instructions; how frequently it switches can
be set via :func:`sys.setswitchinterval`. Each bytecode instruction, and
therefore all the C implementation code reached from each instruction, is
atomic from the point of view of a Python program.

In theory, this means an exact accounting requires an exact understanding of the
PVM bytecode implementation. In practice, it means that operations on shared
variables of builtin data types (ints, lists, dicts, etc) that "look atomic"
really are.

For example, the following operations are all atomic (L, L1, L2 are lists, D,
D1, D2 are dicts, x, y are objects, i, j are ints)::

   L.append(x)
   L1.extend(L2)
   x = L[i]
   x = L.pop()
   L1[i:j] = L2
   L.sort()
   x = y
   x.field = y
   D[x] = y
   D1.update(D2)
   D.keys()

These aren't::

   i = i+1
   L.append(L[-1])
   L[i] = L[j]
   D[x] = D[x] + 1

Operations that replace other objects may invoke those other objects'
:meth:`__del__` method when their reference count reaches zero, and that can
affect things. This is especially true for the mass updates to dictionaries and
lists. When in doubt, use a mutex!

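
For the non-atomic cases, a :class:`threading.Lock` is the usual mutex. The
sketch below protects a read-modify-write on a shared counter; the names
``counter`` and ``increment`` are invented for the example.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        with lock:                # only one thread at a time runs this block
            counter += 1          # read-modify-write: not atomic on its own

threads = [threading.Thread(target=increment, args=(10000,))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

Without the lock, two threads could read the same old value and one of the
updates would be lost.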

Can't we get rid of the Global Interpreter Lock?
------------------------------------------------

.. XXX mention multiprocessing
.. XXX link to dbeazley's talk about GIL?

The Global Interpreter Lock (GIL) is often seen as a hindrance to Python's
deployment on high-end multiprocessor server machines, because a multi-threaded
Python program effectively only uses one CPU, due to the insistence that
(almost) all Python code can only run while the GIL is held.

Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive
patch set (the "free threading" patches) that removed the GIL and replaced it
with fine-grained locking. Unfortunately, even on Windows (where locks are very
efficient) this ran ordinary Python code about twice as slow as the interpreter
using the GIL. On Linux the performance loss was even worse because pthread
locks aren't as efficient.

Since then, the idea of getting rid of the GIL has occasionally come up but
nobody has found a way to deal with the expected slowdown, and users who don't
use threads would not be happy if their code ran at half the speed. Greg's
free threading patch set has not been kept up-to-date for later Python versions.

This doesn't mean that you can't make good use of Python on multi-CPU machines!
You just have to be creative with dividing the work up between multiple
*processes* rather than multiple *threads*. Judicious use of C extensions will
also help; if you use a C extension to perform a time-consuming task, the
extension can release the GIL while the thread of execution is in the C code and
allow other threads to get some work done.

It has been suggested that the GIL should be a per-interpreter-state lock rather
than truly global; interpreters then wouldn't be able to share objects.
Unfortunately, this isn't likely to happen either. It would be a tremendous
amount of work, because many object implementations currently have global state.
For example, small integers and short strings are cached; these caches would
have to be moved to the interpreter state. Other object types have their own
free list; these free lists would have to be moved to the interpreter state.
And so on.

And I doubt that it can even be done in finite time, because the same problem
exists for 3rd party extensions. It is likely that 3rd party extensions are
being written at a faster rate than you can convert them to store all their
global state in the interpreter state.

And finally, once you have multiple interpreters not sharing any state, what
have you gained over running each interpreter in a separate process?

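
One way to divide work between processes is the standard
:mod:`multiprocessing` module. The sketch below is only an illustration: the
``count_primes`` function is a made-up CPU-bound task, and the list of limits
is arbitrary.

```python
import multiprocessing

def count_primes(limit):
    """CPU-bound toy task: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    limits = [10000, 20000, 30000, 40000]
    # Each call runs in a separate worker process with its own interpreter
    # (and its own GIL), so the calls can use several CPUs at once.
    with multiprocessing.Pool() as pool:
        print(pool.map(count_primes, limits))
```

The ``if __name__ == "__main__"`` guard matters here: on platforms that start
worker processes by importing the main module, it keeps the workers from
creating pools of their own.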

Input and Output
================

How do I delete a file? (And other file questions...)
-----------------------------------------------------

Use ``os.remove(filename)`` or ``os.unlink(filename)``; for documentation, see
the :mod:`os` module. The two functions are identical; :func:`~os.unlink` is
simply the name of the Unix system call for this function.

To remove a directory, use :func:`os.rmdir`; use :func:`os.mkdir` to create one.
``os.makedirs(path)`` will create any intermediate directories in ``path`` that
don't exist. ``os.removedirs(path)`` will remove intermediate directories as
long as they're empty; if you want to delete an entire directory tree and its
contents, use :func:`shutil.rmtree`.

To rename a file, use ``os.rename(old_path, new_path)``.

To truncate a file, open it using ``f = open(filename, "r+")``, and use
``f.truncate(offset)``; offset defaults to the current seek position. There's
also ``os.ftruncate(fd, offset)`` for files opened with :func:`os.open`, where
``fd`` is the file descriptor (a small integer).

The :mod:`shutil` module also contains a number of functions to work on files
including :func:`~shutil.copyfile`, :func:`~shutil.copytree`, and
:func:`~shutil.rmtree`.

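
The calls above can be tried out safely inside a scratch directory, as in this
sketch. All file and directory names are invented for the example.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                      # scratch directory for the demo
open(os.path.join(base, "keep.txt"), "w").close()

nested = os.path.join(base, "a", "b")
os.makedirs(nested)                            # creates base/a and base/a/b

path = os.path.join(nested, "data.txt")
with open(path, "w") as f:
    f.write("0123456789")
with open(path, "r+") as f:
    f.truncate(4)                              # keep only the first 4 bytes
size_after_truncate = os.path.getsize(path)

renamed = os.path.join(nested, "renamed.txt")
os.rename(path, renamed)
os.remove(renamed)                             # same effect as os.unlink()

os.removedirs(nested)                          # removes b, then a; stops at
                                               # base because it is not empty
a_removed = not os.path.exists(os.path.join(base, "a"))

shutil.rmtree(base)                            # removes base and its contents
```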

How do I copy a file?
---------------------

The :mod:`shutil` module contains a :func:`~shutil.copyfile` function. Note
that on MacOS 9 it doesn't copy the resource fork and Finder info.


How do I read (or write) binary data?
-------------------------------------

To read or write complex binary data formats, it's best to use the :mod:`struct`
module. It allows you to take a string containing binary data (usually numbers)
and convert it to Python objects; and vice versa.

For example, the following code reads two 2-byte integers and one 4-byte integer
in big-endian format from a file::

   import struct

   with open(filename, "rb") as f:  # Open in binary mode for portability
       s = f.read(8)
       x, y, z = struct.unpack(">hhl", s)

The '>' in the format string forces big-endian data; the letter 'h' reads one
"short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the
string.

For data that is more regular (e.g. a homogeneous list of ints or floats),
you can also use the :mod:`array` module.

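
A self-contained round trip, with no file involved, may make the format string
easier to follow. The values packed here are arbitrary, and the sketch assumes
Python 3, where the packed data is a ``bytes`` object.

```python
import array
import struct

# Pack two 2-byte integers and one 4-byte integer, big-endian: 8 bytes total
packed = struct.pack(">hhl", 1, -2, 300000)
x, y, z = struct.unpack(">hhl", packed)
print(len(packed), (x, y, z))

# For homogeneous data, an array converts to and from raw bytes directly
values = array.array("d", [1.0, 2.5, 4.0])     # "d" = C double
raw = values.tobytes()
restored = array.array("d")
restored.frombytes(raw)
```

Note that :mod:`array` always uses the machine's native byte order, whereas
:mod:`struct` lets you choose it in the format string.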

I can't seem to use os.read() on a pipe created with os.popen(); why?
---------------------------------------------------------------------

:func:`os.read` is a low-level function which takes a file descriptor, a small
integer representing the opened file. :func:`os.popen` creates a high-level
file object, the same type returned by the builtin :func:`open` function. Thus,
to read n bytes from a pipe p created with :func:`os.popen`, you need to use
``p.read(n)``.


How do I run a subprocess with pipes connected to both input and output?
------------------------------------------------------------------------

.. XXX update to use subprocess

Use the :mod:`popen2` module. For example::

   import popen2
   fromchild, tochild = popen2.popen2("command")
   tochild.write("input\n")
   tochild.flush()
   output = fromchild.readline()

Warning: in general it is unwise to do this because you can easily cause a
deadlock where your process is blocked waiting for output from the child while
the child is blocked waiting for input from you. This can be caused because the
parent expects the child to output more text than it does, or it can be caused
by data being stuck in stdio buffers due to lack of flushing. The Python parent
can of course explicitly flush the data it sends to the child before it reads
any output, but if the child is a naive C program it may have been written to
never explicitly flush its output, even if it is interactive, since flushing is
normally automatic.

Note that a deadlock is also possible if you use :func:`popen3` to read stdout
and stderr. If one of the two is too large for the internal buffer (increasing
the buffer size does not help) and you ``read()`` the other one first, there is
a deadlock, too.

Note on a bug in popen2: unless your program calls ``wait()`` or ``waitpid()``,
finished child processes are never removed, and eventually calls to popen2 will
fail because of a limit on the number of child processes. Calling
:func:`os.waitpid` with the :data:`os.WNOHANG` option can prevent this; a good
place to insert such a call would be before calling ``popen2`` again.

In many cases, all you really need is to run some data through a command and get
the result back. Unless the amount of data is very large, the easiest way to do
this is to write it to a temporary file and run the command with that temporary
file as input. The standard module :mod:`tempfile` exports a ``mktemp()``
function to generate unique temporary file names. ::

   import tempfile
   import os

   class Popen3:
       """
       This is a deadlock-safe version of popen that returns
       an object with errorlevel, out (a string) and err (a string).
       (capturestderr may not work under windows.)
       Example: print Popen3('grep spam','\n\nhere spam\n\n').out
       """
       def __init__(self,command,input=None,capturestderr=None):
           outfile=tempfile.mktemp()
           command="( %s ) > %s" % (command,outfile)
           if input:
               infile=tempfile.mktemp()
               open(infile,"w").write(input)
               command=command+" <"+infile
           if capturestderr:
               errfile=tempfile.mktemp()
               command=command+" 2>"+errfile
           self.errorlevel=os.system(command) >> 8
           self.out=open(outfile,"r").read()
           os.remove(outfile)
           if input:
               os.remove(infile)
           if capturestderr:
               self.err=open(errfile,"r").read()
               os.remove(errfile)

Note that many interactive programs (e.g. vi) don't work well with pipes
substituted for standard input and output. You will have to use pseudo ttys
("ptys") instead of pipes. Or you can use a Python interface to Don Libes'
"expect" library. A Python extension that interfaces to expect is called "expy"
and available from http://expectpy.sourceforge.net. A pure Python solution that
works like expect is `pexpect <http://pypi.python.org/pypi/pexpect/>`_.

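
On Python 3, where :mod:`popen2` is no longer available, the "run some data
through a command and get the result back" case can be handled with the
:mod:`subprocess` module. The sketch below assumes Python 3.7 or later; the
child command is a tiny Python one-liner chosen only so that the example is
self-contained.

```python
import subprocess
import sys

# The child is a small Python program that upper-cases whatever it reads.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]

# run() feeds the input, collects stdout and stderr, and waits for the child,
# handling the pipes in a way that avoids the deadlocks described above.
result = subprocess.run(child, input="here spam\n",
                        capture_output=True, text=True)
print(result.returncode, repr(result.stdout), repr(result.stderr))
```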

How do I access the serial (RS232) port?
----------------------------------------

For Win32, POSIX (Linux, BSD, etc.), Jython:

   http://pyserial.sourceforge.net

For Unix, see a Usenet post by Mitch Chapman:

   http://groups.google.com/groups?selm=34A04430.CF9@ohioee.com


Why doesn't closing sys.stdout (stdin, stderr) really close it?
---------------------------------------------------------------

Python file objects are a high-level layer of abstraction on top of C streams,
which in turn are a medium-level layer of abstraction on top of (among other
things) low-level C file descriptors.

For most file objects you create in Python via the builtin ``file`` constructor,
``f.close()`` marks the Python file object as being closed from Python's point
of view, and also arranges to close the underlying C stream. This also happens
automatically in f's destructor, when f becomes garbage.

But stdin, stdout and stderr are treated specially by Python, because of the
special status also given to them by C. Running ``sys.stdout.close()`` marks
the Python-level file object as being closed, but does *not* close the
associated C stream.

To close the underlying C stream for one of these three, you should first be
sure that's what you really want to do (e.g., you may confuse extension modules
trying to do I/O). If it is, use os.close::

   os.close(0) # close C's stdin stream
   os.close(1) # close C's stdout stream
   os.close(2) # close C's stderr stream


Network/Internet Programming
============================

What WWW tools are there for Python?
------------------------------------

See the chapters titled :ref:`internet` and :ref:`netdata` in the Library
Reference Manual. Python has many modules that will help you build server-side
and client-side web systems.

.. XXX check if wiki page is still up to date

A summary of available frameworks is maintained by Paul Boddie at
http://wiki.python.org/moin/WebProgramming .

Cameron Laird maintains a useful set of pages about Python web technologies at
http://phaseit.net/claird/comp.lang.python/web_python.


How can I mimic CGI form submission (METHOD=POST)?
--------------------------------------------------

I would like to retrieve web pages that are the result of POSTing a form. Is
there existing code that would let me do this easily?

Yes. Here's a simple example that uses httplib::

   #!/usr/local/bin/python

   import httplib, sys, time

   ### build the query string
   qs = "First=Josephine&MI=Q&Last=Public"

   ### connect and send the server a path
   httpobj = httplib.HTTP('www.some-server.out-there', 80)
   httpobj.putrequest('POST', '/cgi-bin/some-cgi-script')
   ### now generate the rest of the HTTP headers...
   httpobj.putheader('Accept', '*/*')
   httpobj.putheader('Connection', 'Keep-Alive')
   httpobj.putheader('Content-type', 'application/x-www-form-urlencoded')
   httpobj.putheader('Content-length', '%d' % len(qs))
   httpobj.endheaders()
   httpobj.send(qs)
   ### find out what the server said in response...
   reply, msg, hdrs = httpobj.getreply()
   if reply != 200:
       sys.stdout.write(httpobj.getfile().read())

Note that in general for URL-encoded POST operations, query strings must be
quoted by using :func:`urllib.quote`. For example, to send name="Guy Steele,
Jr."::

   >>> from urllib import quote
   >>> x = quote("Guy Steele, Jr.")
   >>> x
   'Guy%20Steele%2C%20Jr.'
   >>> query_string = "name="+x
   >>> query_string
   'name=Guy%20Steele%2C%20Jr.'

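
On Python 3 the quoting functions live in :mod:`urllib.parse`. The sketch below
builds the same kind of query string there; the field names are taken from the
example above, and using ``urlencode`` for the whole form is one convenient
option rather than the only way to do it.

```python
from urllib.parse import quote, urlencode

# Quote a single value by hand...
single = "name=" + quote("Guy Steele, Jr.")
print(single)

# ...or let urlencode() quote and join a whole form at once.  It uses
# quote_plus(), so spaces become "+" instead of "%20".
fields = {"First": "Josephine", "MI": "Q", "Last": "Public",
          "name": "Guy Steele, Jr."}
body = urlencode(fields)
print(body)
```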

What module should I use to help with generating HTML?
------------------------------------------------------

.. XXX add modern template languages

There are many different modules available:

* HTMLgen is a class library of objects corresponding to all the HTML 3.2 markup
  tags. It's used when you are writing in Python and wish to synthesize HTML
  pages for generating a website or for CGI forms, etc.

* DocumentTemplate and Zope Page Templates are two different systems that are
  part of Zope.

* Quixote's PTL uses Python syntax to assemble strings of text.

Consult the `Web Programming wiki pages
<http://wiki.python.org/moin/WebProgramming>`_ for more links.


How do I send mail from a Python script?
----------------------------------------

Use the standard library module :mod:`smtplib`.

Here's a very simple interactive mail sender that uses it. This method will
work on any host that supports an SMTP listener. ::

   import sys, smtplib

   fromaddr = input("From: ")
   toaddrs = input("To: ").split(',')
   print("Enter message, end with ^D:")
   msg = ''
   while True:
       line = sys.stdin.readline()
       if not line:
           break
       msg += line

   # The actual mail send
   server = smtplib.SMTP('localhost')
   server.sendmail(fromaddr, toaddrs, msg)
   server.quit()

A Unix-only alternative uses sendmail. The location of the sendmail program
varies between systems; sometimes it is ``/usr/lib/sendmail``, sometimes
``/usr/sbin/sendmail``. The sendmail manual page will help you out. Here's
some sample code::

   import os

   SENDMAIL = "/usr/sbin/sendmail" # sendmail location
   p = os.popen("%s -t -i" % SENDMAIL, "w")
   p.write("To: receiver@example.com\n")
   p.write("Subject: test\n")
   p.write("\n") # blank line separating headers from body
   p.write("Some text\n")
   p.write("some more text\n")
   sts = p.close()
   if sts is not None: # close() returns None when the exit status is zero
       print("Sendmail exit status", sts)


How do I avoid blocking in the connect() method of a socket?
------------------------------------------------------------

The :mod:`select` module is commonly used to help with asynchronous I/O on
sockets.

To prevent the TCP connect from blocking, you can set the socket to non-blocking
mode. Then when you do the ``connect()``, you will either connect immediately
(unlikely) or get an exception that contains the error number as ``.errno``.
``errno.EINPROGRESS`` indicates that the connection is in progress, but hasn't
finished yet. Different OSes will return different values, so you're going to
have to check what's returned on your system.

You can use the ``connect_ex()`` method to avoid creating an exception. It will
just return the errno value. To poll, you can call ``connect_ex()`` again later
-- 0 or ``errno.EISCONN`` indicate that you're connected -- or you can pass this
socket to :func:`select.select` to check if it's writable.

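
The ``connect_ex()`` approach can be sketched as follows. To keep the example
self-contained, it connects to a throwaway listening socket on the loopback
interface; the five-second timeout and the set of "in progress" error codes
checked are assumptions that may need adjusting on your system.

```python
import errno
import select
import socket

# A throwaway listening socket on the loopback interface plays the server.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))            # port 0: let the OS pick a free port
server.listen(1)

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.setblocking(False)
err = client.connect_ex(server.getsockname())   # returns an errno, no exception

# 0 means connected already; the other two mean "still in progress".
if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
    raise OSError(err, "connect failed")

# Wait (up to 5 seconds) until the socket is writable, then check SO_ERROR
# to find out whether the connection attempt actually succeeded.
_, writable, _ = select.select([], [client], [], 5.0)
connected = (bool(writable) and
             client.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0)
print("connected:", connected)

client.close()
server.close()
```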

Databases
=========

Are there any interfaces to database packages in Python?
--------------------------------------------------------

Yes.

Interfaces to disk-based hashes such as :mod:`DBM <dbm.ndbm>` and :mod:`GDBM
<dbm.gnu>` are included with standard Python. There is also the
:mod:`sqlite3` module, which provides a lightweight disk-based relational
database.

Support for most relational databases is available. See the
`DatabaseProgramming wiki page
<http://wiki.python.org/moin/DatabaseProgramming>`_ for details.

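
As a quick taste of :mod:`sqlite3`, the sketch below creates an in-memory
database, so nothing is written to disk. The table and column names are made
up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")       # use a filename for a database on disk
conn.execute("CREATE TABLE faq (question TEXT, votes INTEGER)")
conn.executemany("INSERT INTO faq VALUES (?, ?)",
                 [("threads", 3), ("pickle", 5)])
conn.commit()

rows = conn.execute(
    "SELECT question FROM faq ORDER BY votes DESC").fetchall()
print(rows)
conn.close()
```

The ``?`` placeholders let the module do the quoting of values, which is safer
than building SQL strings by hand.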

How do you implement persistent objects in Python?
--------------------------------------------------

The :mod:`pickle` library module solves this in a very general way (though you
still can't store things like open files, sockets or windows), and the
:mod:`shelve` library module uses pickle and (g)dbm to create persistent
mappings containing arbitrary Python objects.

A more awkward way of doing things is to use pickle's little sister, marshal.
The :mod:`marshal` module provides very fast ways to store noncircular basic
Python types to files and strings, and back again. Although marshal does not do
fancy things like store instances or handle shared references properly, it does
run extremely fast. For example loading a half megabyte of data may take less
than a third of a second. This often beats doing something more complex and
general such as using gdbm with pickle/shelve.

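
A short sketch of both modules follows, assuming Python 3.4 or later (for the
``with`` form of ``shelve.open``). The ``record`` dictionary and the file name
``store`` are invented for the example.

```python
import os
import pickle
import shelve
import tempfile

record = {"name": "spam", "tags": ["a", "b"]}

# pickle: turn an object into bytes and back again
restored = pickle.loads(pickle.dumps(record))

# shelve: a persistent, dictionary-like store whose values are pickled
with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, "store")
    with shelve.open(filename) as db:
        db["record"] = record            # written to disk when db is closed
    with shelve.open(filename) as db:
        from_shelf = db["record"]        # read back in a "later session"

print(restored == record, from_shelf == record)
```

Only unpickle data that you trust: loading a pickle can execute arbitrary code.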

Why is cPickle so slow?
-----------------------

.. XXX update this, default protocol is 2/3

The default format used by the pickle module is a slow one that results in
readable pickles. The binary format is much faster; it would be a better
default, but changing the default would break backward compatibility, so
request it explicitly::

   import cPickle

   largeString = 'z' * (100 * 1024)
   myPickle = cPickle.dumps(largeString, protocol=1)


If my program crashes with a bsddb (or anydbm) database open, it gets corrupted. How come?
------------------------------------------------------------------------------------------

Databases opened for write access with the bsddb module (and often by the anydbm
module, since it will preferentially use bsddb) must explicitly be closed using
the ``.close()`` method of the database. The underlying library caches database
contents which need to be converted to on-disk form and written.

If you have initialized a new bsddb database but not written anything to it
before the program crashes, you will often wind up with a zero-length file and
encounter an exception the next time the file is opened.


I tried to open Berkeley DB file, but bsddb produces bsddb.error: (22, 'Invalid argument'). Help! How can I restore my data?
----------------------------------------------------------------------------------------------------------------------------

Don't panic! Your data is probably intact. The most frequent cause for the error
is that you tried to open an earlier Berkeley DB file with a later version of
the Berkeley DB library.

Many Linux systems now have all three versions of Berkeley DB available. If you
are migrating from version 1 to a newer version use db_dump185 to dump a plain
text version of the database. If you are migrating from version 2 to version 3
use db2_dump to create a plain text version of the database. In either case,
use db_load to create a new native database for the latest version installed on
your computer. If you have version 3 of Berkeley DB installed, you should be
able to use db2_load to create a native version 2 database.

You should move away from Berkeley DB version 1 files because the hash file code
contains known bugs that can corrupt your data.


Mathematics and Numerics
========================

How do I generate random numbers in Python?
-------------------------------------------

The standard module :mod:`random` implements a random number generator. Usage
is simple::

   import random
   random.random()

This returns a random floating point number in the range [0, 1).

There are also many other specialized generators in this module, such as:

* ``randrange(a, b)`` chooses an integer in the range [a, b).
* ``uniform(a, b)`` chooses a floating point number in the range [a, b).
* ``normalvariate(mean, sdev)`` samples the normal (Gaussian) distribution.

Some higher-level functions operate on sequences directly, such as:

* ``choice(S)`` chooses a random element from a given sequence
* ``shuffle(L)`` shuffles a list in-place, i.e. permutes it randomly

There's also a ``Random`` class you can instantiate to create multiple
independent random number generators.
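
The sketch below shows independent ``Random`` instances in use. The seed value
and the variable names are arbitrary choices for the example; seeding is
optional and is done here only to show that two generators with the same seed
produce the same stream.

```python
import random

rng_a = random.Random(12345)             # two generators with the same seed...
rng_b = random.Random(12345)
same_stream = ([rng_a.random() for _ in range(3)] ==
               [rng_b.random() for _ in range(3)])   # ...yield the same values

die = rng_a.randrange(1, 7)              # integer from 1 to 6
x = rng_a.uniform(0.0, 10.0)             # float between 0.0 and 10.0
deck = list(range(10))
rng_a.shuffle(deck)                      # permutes the list in place
card = rng_a.choice(deck)                # one element picked from the list
print(same_stream, die, x, deck, card)
```

Each instance keeps its own state, so drawing numbers from ``rng_a`` does not
disturb ``rng_b`` or the module-level functions.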