:tocdepth: 2

=========================
Library and Extension FAQ
=========================

.. contents::

General Library Questions
=========================

How do I find a module or application to perform task X?
--------------------------------------------------------

Check :ref:`the Library Reference <library-index>` to see if there's a relevant
standard library module. (Eventually you'll learn what's in the standard
library and will be able to skip this step.)

For third-party packages, search the `Python Package Index
<http://pypi.python.org/pypi>`_ or try `Google <http://www.google.com>`_ or
another Web search engine. Searching for "Python" plus a keyword or two for
your topic of interest will usually find something helpful.


Where is the math.py (socket.py, regex.py, etc.) source file?
-------------------------------------------------------------

If you can't find a source file for a module, it may be a builtin or dynamically
loaded module implemented in C, C++ or another compiled language. In this case
you may not have the source file, or it may be something like mathmodule.c,
somewhere in a C source directory (not on the Python path).

There are (at least) three kinds of modules in Python:

1) modules written in Python (.py);
2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc);
3) modules written in C and linked with the interpreter; to get a list of these,
   type::

      import sys
      print(sys.builtin_module_names)


How do I make a Python script executable on Unix?
-------------------------------------------------

You need to do two things: the script file's mode must be executable and the
first line must begin with ``#!`` followed by the path of the Python
interpreter.

The first is done by executing ``chmod +x scriptfile`` or perhaps ``chmod 755
scriptfile``.

The second can be done in a number of ways. The most straightforward way is to
write ::

   #!/usr/local/bin/python

as the very first line of your file, using the pathname for where the Python
interpreter is installed on your platform.

If you would like the script to be independent of where the Python interpreter
lives, you can use the "env" program. Almost all Unix variants support the
following, assuming the Python interpreter is in a directory on the user's
$PATH::

   #!/usr/bin/env python

*Don't* do this for CGI scripts. The $PATH variable for CGI scripts is often
very minimal, so you need to use the actual absolute pathname of the
interpreter.

Occasionally, a user's environment is so full that the /usr/bin/env program
fails; or there's no env program at all. In that case, you can try the
following hack (due to Alex Rezinsky)::

   #! /bin/sh
   """:"
   exec python $0 ${1+"$@"}
   """

The minor disadvantage is that this defines the script's __doc__ string.
However, you can fix that by adding ::

   __doc__ = """...Whatever..."""



Is there a curses/termcap package for Python?
---------------------------------------------

.. XXX curses *is* built by default, isn't it?

For Unix variants: The standard Python source distribution comes with a curses
module in the ``Modules/`` subdirectory, though it's not compiled by default
(note that this is not available in the Windows distribution -- there is no
curses module for Windows).

The curses module supports basic curses features as well as many additional
functions from ncurses and SYSV curses such as colour, alternative character set
support, pads, and mouse support. This means the module isn't compatible with
operating systems that only have BSD curses, but there don't seem to be any
currently maintained OSes that fall into this category.

For Windows: use `the consolelib module
<http://effbot.org/zone/console-index.htm>`_.


Is there an equivalent to C's onexit() in Python?
-------------------------------------------------

The :mod:`atexit` module provides a :func:`~atexit.register` function that is
similar to C's ``onexit()``.

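As a minimal sketch of what such a registration might look like (the
``goodbye`` function and its argument are made up for illustration):

```python
import atexit


def goodbye(name):
    # Called automatically at normal interpreter termination.
    print("Goodbye, %s." % name)


# Extra arguments given to register() are passed on to the callback.
atexit.register(goodbye, "world")
```

Registered functions run on normal interpreter exit; they are not called if
the process is killed by an unhandled signal or exits through ``os._exit()``.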

Why don't my signal handlers work?
----------------------------------

The most common problem is that the signal handler is declared with the wrong
argument list. It is called as ::

   handler(signum, frame)

so it should be declared with two arguments::

   def handler(signum, frame):
       ...

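The following sketch shows one way a two-argument handler could be installed
and exercised. It assumes Python 3.8 or later (for ``signal.raise_signal``)
and that it runs in the main thread; the ``received`` list is only there for
illustration.

```python
import signal

received = []


def handler(signum, frame):
    # signum is the signal number; frame is the interrupted stack frame.
    received.append(signum)


# Install the handler, remembering the previous one so it can be restored.
previous = signal.signal(signal.SIGINT, handler)
try:
    # Deliver SIGINT to this process instead of waiting for Ctrl-C.
    signal.raise_signal(signal.SIGINT)
finally:
    signal.signal(signal.SIGINT, previous)
```

A handler declared with a different number of parameters would fail with a
:exc:`TypeError` when the signal arrives, not when it is installed.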

Common tasks
============

How do I test a Python program or component?
--------------------------------------------

Python comes with two testing frameworks. The :mod:`doctest` module finds
examples in the docstrings for a module and runs them, comparing the output with
the expected output given in the docstring.

The :mod:`unittest` module is a fancier testing framework modelled on Java and
Smalltalk testing frameworks.

For testing, it helps to write the program so that it may be easily tested by
using good modular design. Your program should have almost all functionality
encapsulated in either functions or class methods -- and this sometimes has the
surprising and delightful effect of making the program run faster (because local
variable accesses are faster than global accesses). Furthermore, the program
should avoid depending on mutating global variables, since this makes testing
much more difficult to do.

The "global main logic" of your program may be as simple as ::

   if __name__ == "__main__":
       main_logic()

at the bottom of the main module of your program.

Once your program is organized as a tractable collection of functions and class
behaviours, you should write test functions that exercise the behaviours. A test
suite that automates a sequence of tests can be associated with each module.
This sounds like a lot of work, but since Python is so terse and flexible it's
surprisingly easy. You can make coding much more pleasant and fun by writing
your test functions in parallel with the "production code", since this makes it
easy to find bugs and even design flaws earlier.

"Support modules" that are not intended to be the main module of a program may
include a self-test of the module. ::

   if __name__ == "__main__":
       self_test()

Even programs that interact with complex external interfaces may be tested when
the external interfaces are unavailable by using "fake" interfaces implemented
in Python.

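To make the self-test idea concrete, here is a small sketch of a support
module whose self-test is driven by :mod:`doctest`. The ``average`` function
is a made-up example; ``self_test`` is just the name used above.

```python
import doctest


def average(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> average([1, 2, 3])
    2.0
    >>> average([10])
    10.0
    """
    return sum(values) / len(values)


def self_test():
    # Run every docstring example found in the main module.
    return doctest.testmod()


if __name__ == "__main__":
    print(self_test())
```

Running the file directly executes the examples in the docstring; importing it
from another program does not.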

How do I create documentation from doc strings?
-----------------------------------------------

The :mod:`pydoc` module can create HTML from the doc strings in your Python
source code. An alternative for creating API documentation purely from
docstrings is `epydoc <http://epydoc.sf.net/>`_. `Sphinx
<http://sphinx.pocoo.org>`_ can also include docstring content.


How do I get a single keypress at a time?
-----------------------------------------

For Unix variants: There are several solutions. It's straightforward to do this
using curses, but curses is a fairly large module to learn. Here's a solution
without curses::

   import termios, fcntl, sys, os
   fd = sys.stdin.fileno()

   oldterm = termios.tcgetattr(fd)
   newattr = termios.tcgetattr(fd)
   newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
   termios.tcsetattr(fd, termios.TCSANOW, newattr)

   oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
   fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)

   try:
       while True:
           try:
               c = sys.stdin.read(1)
               print("Got character", repr(c))
           except IOError:
               pass
   finally:
       termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
       fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)

You need the :mod:`termios` and the :mod:`fcntl` module for any of this to work,
and I've only tried it on Linux, though it should work elsewhere. In this code,
characters are read and printed one at a time.

:func:`termios.tcsetattr` turns off stdin's echoing and disables canonical mode.
:func:`fcntl.fcntl` is used to obtain stdin's file descriptor flags and modify
them for non-blocking mode. Since reading stdin when it is empty results in an
:exc:`IOError`, this error is caught and ignored.


Threads
=======

How do I program using threads?
-------------------------------

Be sure to use the :mod:`threading` module and not the :mod:`_thread` module.
The :mod:`threading` module builds convenient abstractions on top of the
low-level primitives provided by the :mod:`_thread` module.

Aahz has a set of slides from his threading tutorial that are helpful; see
http://www.pythoncraft.com/OSCON2001/.


None of my threads seem to run: why?
------------------------------------

As soon as the main thread exits, all threads are killed. Your main thread is
running too quickly, giving the threads no time to do any work.

A simple fix is to add a sleep to the end of the program that's long enough for
all the threads to finish::

   import threading, time

   def thread_task(name, n):
       for i in range(n):
           print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)  # <---------------------------!

But now (on many platforms) the threads don't run in parallel, but appear to run
sequentially, one at a time! The reason is that the OS thread scheduler doesn't
start a new thread until the previous thread is blocked.

A simple fix is to add a tiny sleep to the start of the run function::

   def thread_task(name, n):
       time.sleep(0.001)  # <--------------------!
       for i in range(n):
           print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)

Instead of trying to guess how long a :func:`time.sleep` delay needs to be,
it's better to use some kind of semaphore mechanism. One idea is to use the
:mod:`queue` module to create a queue object, let each thread append a token to
the queue when it finishes, and let the main thread read as many tokens from the
queue as there are threads.

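The token idea could be sketched as follows. The ``done`` queue and the
``run_all`` helper are illustrative names, not part of any library:

```python
import queue
import threading


def thread_task(name, n, done):
    for i in range(n):
        print(name, i)
    # Put a token on the queue to signal that this thread has finished.
    done.put(name)


def run_all(count=10):
    done = queue.Queue()
    for i in range(count):
        t = threading.Thread(target=thread_task, args=(str(i), i, done))
        t.start()
    # Block until one token per thread has arrived; no sleep is needed.
    return sorted(done.get() for _ in range(count))
```

Calling ``run_all()`` returns only after every thread has reported in. Keeping
the :class:`threading.Thread` objects and calling ``join()`` on each of them
is another way to wait without guessing a delay.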

How do I parcel out work among a bunch of worker threads?
---------------------------------------------------------

Use the :mod:`queue` module to create a queue containing a list of jobs. The
:class:`~queue.Queue` class maintains a list of objects with ``.put(obj)`` to
add an item to the queue and ``.get()`` to return an item. The class will take
care of the locking necessary to ensure that each job is handed out exactly
once.

Here's a trivial example::

   import threading, queue, time

   # The worker thread gets jobs off the queue. When the queue is empty, it
   # assumes there will be no more work and exits.
   # (Realistically workers will run until terminated.)
   def worker():
       print('Running worker')
       time.sleep(0.1)
       while True:
           try:
               arg = q.get(block=False)
           except queue.Empty:
               print('Worker', threading.current_thread(), end=' ')
               print('queue empty')
               break
           else:
               print('Worker', threading.current_thread(), end=' ')
               print('running with argument', arg)
               time.sleep(0.5)

   # Create queue
   q = queue.Queue()

   # Start a pool of 5 workers
   for i in range(5):
       t = threading.Thread(target=worker, name='worker %i' % (i+1))
       t.start()

   # Begin adding work to the queue
   for i in range(50):
       q.put(i)

   # Give threads time to run
   print('Main thread sleeping')
   time.sleep(5)

When run, this will produce output similar to the following::

   Running worker
   Running worker
   Running worker
   Running worker
   Running worker
   Main thread sleeping
   Worker <Thread(worker 1, started)> running with argument 0
   Worker <Thread(worker 2, started)> running with argument 1
   Worker <Thread(worker 3, started)> running with argument 2
   Worker <Thread(worker 4, started)> running with argument 3
   Worker <Thread(worker 5, started)> running with argument 4
   Worker <Thread(worker 1, started)> running with argument 5
   ...

Consult the module's documentation for more details; the ``Queue`` class
provides a featureful interface.


What kinds of global value mutation are thread-safe?
----------------------------------------------------

A global interpreter lock (GIL) is used internally to ensure that only one
thread runs in the Python VM at a time. In general, Python offers to switch
among threads only between bytecode instructions; how frequently it switches can
be set via :func:`sys.setswitchinterval`. Each bytecode instruction, and
therefore all the C implementation code reached from each instruction, is
atomic from the point of view of a Python program.

In theory, this means an exact accounting requires an exact understanding of the
PVM bytecode implementation. In practice, it means that operations on shared
variables of builtin data types (ints, lists, dicts, etc) that "look atomic"
really are.

For example, the following operations are all atomic (L, L1, L2 are lists, D,
D1, D2 are dicts, x, y are objects, i, j are ints)::

   L.append(x)
   L1.extend(L2)
   x = L[i]
   x = L.pop()
   L1[i:j] = L2
   L.sort()
   x = y
   x.field = y
   D[x] = y
   D1.update(D2)
   D.keys()

These aren't::

   i = i+1
   L.append(L[-1])
   L[i] = L[j]
   D[x] = D[x] + 1

Operations that replace other objects may invoke those other objects'
:meth:`__del__` method when their reference count reaches zero, and that can
affect things. This is especially true for the mass updates to dictionaries and
lists. When in doubt, use a mutex!

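As a sketch of the "use a mutex" advice, the non-atomic ``i = i+1`` pattern
can be guarded with a :class:`threading.Lock`. The names ``counter``,
``increment`` and ``run`` are invented for this example:

```python
import threading

counter = 0
counter_lock = threading.Lock()


def increment(times):
    global counter
    for _ in range(times):
        # The read-modify-write below is not atomic, so hold the lock.
        with counter_lock:
            counter += 1


def run(threads=4, times=10000):
    global counter
    counter = 0
    workers = [threading.Thread(target=increment, args=(times,))
               for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter
```

With the lock in place, ``run(4, 10000)`` always returns ``40000``; without it
the result may come out lower, because increments from different threads can
overwrite each other.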

Can't we get rid of the Global Interpreter Lock?
------------------------------------------------

.. XXX mention multiprocessing
.. XXX link to dbeazley's talk about GIL?

The Global Interpreter Lock (GIL) is often seen as a hindrance to Python's
deployment on high-end multiprocessor server machines, because a multi-threaded
Python program effectively only uses one CPU, due to the insistence that
(almost) all Python code can only run while the GIL is held.

Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive
patch set (the "free threading" patches) that removed the GIL and replaced it
with fine-grained locking. Unfortunately, even on Windows (where locks are very
efficient) this ran ordinary Python code about twice as slow as the interpreter
using the GIL. On Linux the performance loss was even worse because pthread
locks aren't as efficient.

Since then, the idea of getting rid of the GIL has occasionally come up but
nobody has found a way to deal with the expected slowdown, and users who don't
use threads would not be happy if their code ran at half the speed. Greg's
free threading patch set has not been kept up-to-date for later Python versions.

This doesn't mean that you can't make good use of Python on multi-CPU machines!
You just have to be creative with dividing the work up between multiple
*processes* rather than multiple *threads*. Judicious use of C extensions will
also help; if you use a C extension to perform a time-consuming task, the
extension can release the GIL while the thread of execution is in the C code and
allow other threads to get some work done.

It has been suggested that the GIL should be a per-interpreter-state lock rather
than truly global; interpreters then wouldn't be able to share objects.
Unfortunately, this isn't likely to happen either. It would be a tremendous
amount of work, because many object implementations currently have global state.
For example, small integers and short strings are cached; these caches would
have to be moved to the interpreter state. Other object types have their own
free list; these free lists would have to be moved to the interpreter state.
And so on.

And I doubt that it can even be done in finite time, because the same problem
exists for 3rd party extensions. It is likely that 3rd party extensions are
being written at a faster rate than you can convert them to store all their
global state in the interpreter state.

And finally, once you have multiple interpreters not sharing any state, what
have you gained over running each interpreter in a separate process?


Input and Output
================

How do I delete a file? (And other file questions...)
-----------------------------------------------------

Use ``os.remove(filename)`` or ``os.unlink(filename)``; for documentation, see
the :mod:`os` module. The two functions are identical; :func:`~os.unlink` is
simply the name of the Unix system call for this function.

To remove a directory, use :func:`os.rmdir`; use :func:`os.mkdir` to create one.
``os.makedirs(path)`` will create any intermediate directories in ``path`` that
don't exist. ``os.removedirs(path)`` will remove intermediate directories as
long as they're empty; if you want to delete an entire directory tree and its
contents, use :func:`shutil.rmtree`.

To rename a file, use ``os.rename(old_path, new_path)``.

To truncate a file, open it using ``f = open(filename, "rb+")``, and use
``f.truncate(offset)``; offset defaults to the current seek position. There's
also ``os.ftruncate(fd, offset)`` for files opened with :func:`os.open`, where
``fd`` is the file descriptor (a small integer).

The :mod:`shutil` module also contains a number of functions to work on files
including :func:`~shutil.copyfile`, :func:`~shutil.copytree`, and
:func:`~shutil.rmtree`.

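The calls above can be tried out safely inside a scratch directory. This is
only a sketch; the ``demo`` function and all file names in it are made up:

```python
import os
import shutil
import tempfile


def demo():
    root = tempfile.mkdtemp()
    try:
        # makedirs() creates the intermediate directory "a" as well as "b".
        nested = os.path.join(root, "a", "b")
        os.makedirs(nested)

        old_path = os.path.join(nested, "old.txt")
        new_path = os.path.join(nested, "new.txt")
        with open(old_path, "wb") as f:
            f.write(b"0123456789")

        os.rename(old_path, new_path)

        # Keep only the first four bytes of the file.
        with open(new_path, "rb+") as f:
            f.truncate(4)
        size = os.path.getsize(new_path)

        copy_path = os.path.join(root, "copy.txt")
        shutil.copyfile(new_path, copy_path)
        os.remove(copy_path)
        return size
    finally:
        # rmtree() removes the whole tree, including non-empty directories.
        shutil.rmtree(root)
```

Here ``demo()`` returns the size of the truncated file, which is 4.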

How do I copy a file?
---------------------

The :mod:`shutil` module contains a :func:`~shutil.copyfile` function. Note
that on MacOS 9 it doesn't copy the resource fork and Finder info.


How do I read (or write) binary data?
-------------------------------------

To read or write complex binary data formats, it's best to use the :mod:`struct`
module. It allows you to take a string containing binary data (usually numbers)
and convert it to Python objects; and vice versa.

For example, the following code reads two 2-byte integers and one 4-byte integer
in big-endian format from a file::

   import struct

   with open(filename, "rb") as f:
       s = f.read(8)
       x, y, z = struct.unpack(">hhl", s)

The '>' in the format string forces big-endian data; the letter 'h' reads one
"short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the
string.

For data that is more regular (e.g. a homogeneous list of ints or floats),
you can also use the :mod:`array` module.

.. note::

   To read and write binary data, it is mandatory to open the file in
   binary mode (here, passing ``"rb"`` to :func:`open`). If you use
   ``"r"`` instead (the default), the file will be opened in text mode
   and ``f.read()`` will return :class:`str` objects rather than
   :class:`bytes` objects.

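For the homogeneous case, a sketch with the :mod:`array` module might look
like this. The sample values are arbitrary, and the data is stored in the
machine's native byte order:

```python
import array

# 'h' is the type code for signed short integers.
values = array.array("h", [1, -2, 300])

# Convert to raw bytes, e.g. for writing to a file opened in "wb" mode.
raw = values.tobytes()

# Rebuild an array from the raw bytes, e.g. after reading them back.
restored = array.array("h")
restored.frombytes(raw)
```

Unlike :mod:`struct`, :mod:`array` has no byte-order prefix; calling
``byteswap()`` on the array is one way to convert data written on a machine
with the opposite byte order.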

I can't seem to use os.read() on a pipe created with os.popen(); why?
---------------------------------------------------------------------

:func:`os.read` is a low-level function which takes a file descriptor, a small
integer representing the opened file. :func:`os.popen` creates a high-level
file object, the same type returned by the builtin :func:`open` function. Thus,
to read n bytes from a pipe p created with :func:`os.popen`, you need to use
``p.read(n)``.


How do I run a subprocess with pipes connected to both input and output?
------------------------------------------------------------------------

.. XXX update to use subprocess

Use the :mod:`popen2` module. For example::

   import popen2
   fromchild, tochild = popen2.popen2("command")
   tochild.write("input\n")
   tochild.flush()
   output = fromchild.readline()

Warning: in general it is unwise to do this because you can easily cause a
deadlock where your process is blocked waiting for output from the child while
the child is blocked waiting for input from you. This can be caused because the
parent expects the child to output more text than it does, or it can be caused
by data being stuck in stdio buffers due to lack of flushing. The Python parent
can of course explicitly flush the data it sends to the child before it reads
any output, but if the child is a naive C program it may have been written to
never explicitly flush its output, even if it is interactive, since flushing is
normally automatic.

Note that a deadlock is also possible if you use :func:`popen3` to read stdout
and stderr. If one of the two is too large for the internal buffer (increasing
the buffer size does not help) and you ``read()`` the other one first, there is
a deadlock, too.

Note on a bug in popen2: unless your program calls ``wait()`` or ``waitpid()``,
finished child processes are never removed, and eventually calls to popen2 will
fail because of a limit on the number of child processes. Calling
:func:`os.waitpid` with the :data:`os.WNOHANG` option can prevent this; a good
place to insert such a call would be before calling ``popen2`` again.

In many cases, all you really need is to run some data through a command and get
the result back. Unless the amount of data is very large, the easiest way to do
this is to write it to a temporary file and run the command with that temporary
file as input. The standard module :mod:`tempfile` exports a ``mktemp()``
function to generate unique temporary file names. ::

   import tempfile
   import os

   class Popen3:
       """
       This is a deadlock-safe version of popen that returns
       an object with errorlevel, out (a string) and err (a string).
       (capturestderr may not work under windows.)
       Example: print(Popen3('grep spam','\n\nhere spam\n\n').out)
       """
       def __init__(self, command, input=None, capturestderr=None):
           outfile = tempfile.mktemp()
           command = "( %s ) > %s" % (command, outfile)
           if input:
               infile = tempfile.mktemp()
               open(infile, "w").write(input)
               command = command + " <" + infile
           if capturestderr:
               errfile = tempfile.mktemp()
               command = command + " 2>" + errfile
           self.errorlevel = os.system(command) >> 8
           self.out = open(outfile, "r").read()
           os.remove(outfile)
           if input:
               os.remove(infile)
           if capturestderr:
               self.err = open(errfile, "r").read()
               os.remove(errfile)

Note that many interactive programs (e.g. vi) don't work well with pipes
substituted for standard input and output. You will have to use pseudo ttys
("ptys") instead of pipes. Or you can use a Python interface to Don Libes'
"expect" library. A Python extension that interfaces to expect is called "expy"
and available from http://expectpy.sourceforge.net. A pure Python solution that
works like expect is `pexpect <http://pypi.python.org/pypi/pexpect/>`_.

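The :mod:`popen2` module is not available in Python 3. For the common "run
some data through a command and get the result back" case, a sketch based on
:mod:`subprocess` could look like the following. It assumes Python 3.7 or
later (for ``capture_output``); the ``run_filter`` helper and the child
command are invented for illustration.

```python
import subprocess
import sys


def run_filter(command, text):
    # run() feeds the input and collects both output streams for us,
    # which avoids the pipe deadlocks described above.
    result = subprocess.run(command, input=text,
                            capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


# Example child: the current interpreter, upper-casing whatever it reads.
command = [sys.executable, "-c",
           "import sys; sys.stdout.write(sys.stdin.read().upper())"]
```

For instance, ``run_filter(command, "spam\n")`` returns the exit status
together with the captured standard output and standard error as strings.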

How do I access the serial (RS232) port?
----------------------------------------

For Win32, POSIX (Linux, BSD, etc.), Jython:

   http://pyserial.sourceforge.net

For Unix, see a Usenet post by Mitch Chapman:

   http://groups.google.com/groups?selm=34A04430.CF9@ohioee.com


Why doesn't closing sys.stdout (stdin, stderr) really close it?
---------------------------------------------------------------

Python :term:`file objects <file object>` are a high-level layer of
abstraction on low-level C file descriptors.

For most file objects you create in Python via the built-in :func:`open`
function, ``f.close()`` marks the Python file object as being closed from
Python's point of view, and also arranges to close the underlying C file
descriptor. This also happens automatically in ``f``'s destructor, when
``f`` becomes garbage.

But stdin, stdout and stderr are treated specially by Python, because of the
special status also given to them by C. Running ``sys.stdout.close()`` marks
the Python-level file object as being closed, but does *not* close the
associated C file descriptor.

To close the underlying C file descriptor for one of these three, you should
first be sure that's what you really want to do (e.g., you may confuse
extension modules trying to do I/O). If it is, use :func:`os.close`::

   os.close(sys.stdin.fileno())
   os.close(sys.stdout.fileno())
   os.close(sys.stderr.fileno())

Or you can use the numeric constants 0, 1 and 2, respectively.


Network/Internet Programming
============================

What WWW tools are there for Python?
------------------------------------

See the chapters titled :ref:`internet` and :ref:`netdata` in the Library
Reference Manual. Python has many modules that will help you build server-side
and client-side web systems.

.. XXX check if wiki page is still up to date

A summary of available frameworks is maintained by Paul Boddie at
http://wiki.python.org/moin/WebProgramming.

Cameron Laird maintains a useful set of pages about Python web technologies at
http://phaseit.net/claird/comp.lang.python/web_python.


How can I mimic CGI form submission (METHOD=POST)?
--------------------------------------------------

I would like to retrieve web pages that are the result of POSTing a form. Is
there existing code that would let me do this easily?

Yes. Here's a simple example that uses :mod:`http.client`::

   #!/usr/local/bin/python

   import http.client, sys

   ### build the query string
   qs = "First=Josephine&MI=Q&Last=Public"

   ### connect and send the server a path
   conn = http.client.HTTPConnection('www.some-server.out-there', 80)
   conn.putrequest('POST', '/cgi-bin/some-cgi-script')
   ### now generate the rest of the HTTP headers...
   conn.putheader('Accept', '*/*')
   conn.putheader('Connection', 'Keep-Alive')
   conn.putheader('Content-type', 'application/x-www-form-urlencoded')
   conn.putheader('Content-length', '%d' % len(qs))
   conn.endheaders()
   conn.send(qs.encode('ascii'))
   ### find out what the server said in response...
   response = conn.getresponse()
   if response.status != 200:
       sys.stdout.buffer.write(response.read())

Note that in general for percent-encoded POST operations, query strings must be
quoted by using :func:`urllib.parse.quote`. For example to send name="Guy Steele,
Jr."::

   >>> from urllib.parse import quote
   >>> x = quote("Guy Steele, Jr.")
   >>> x
   'Guy%20Steele%2C%20Jr.'
   >>> query_string = "name=" + x
   >>> query_string
   'name=Guy%20Steele%2C%20Jr.'

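When there are several form fields, quoting each value by hand gets tedious.
As a sketch, ``urllib.parse.urlencode`` can build the whole body from a
mapping, and ``urllib.request.Request`` turns it into a POST request. The URL
below is a placeholder, and nothing is sent over the network here:

```python
from urllib.parse import urlencode
from urllib.request import Request

fields = {"name": "Guy Steele, Jr."}

# urlencode() quotes each value and joins the fields with "&".
body = urlencode(fields)

# Supplying data makes this a POST request; the data must be bytes.
request = Request("http://www.example.com/cgi-bin/some-cgi-script",
                  data=body.encode("ascii"))
```

Note that ``urlencode`` encodes spaces as ``+`` rather than ``%20``, which is
the usual convention for form data. Passing ``request`` to
``urllib.request.urlopen`` would actually submit the form.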

What module should I use to help with generating HTML?
------------------------------------------------------

.. XXX add modern template languages

There are many different modules available:

* HTMLgen is a class library of objects corresponding to all the HTML 3.2 markup
  tags. It's used when you are writing in Python and wish to synthesize HTML
  pages for generating a web site or for CGI forms, etc.

* DocumentTemplate and Zope Page Templates are two different systems that are
  part of Zope.

* Quixote's PTL uses Python syntax to assemble strings of text.

Consult the `Web Programming wiki pages
<http://wiki.python.org/moin/WebProgramming>`_ for more links.


How do I send mail from a Python script?
----------------------------------------

Use the standard library module :mod:`smtplib`.

Here's a very simple interactive mail sender that uses it. This method will
work on any host that supports an SMTP listener. ::

   import sys, smtplib

   fromaddr = input("From: ")
   toaddrs = input("To: ").split(',')
   print("Enter message, end with ^D:")
   msg = ''
   while True:
       line = sys.stdin.readline()
       if not line:
           break
       msg += line

   # The actual mail send
   server = smtplib.SMTP('localhost')
   server.sendmail(fromaddr, toaddrs, msg)
   server.quit()

A Unix-only alternative uses sendmail. The location of the sendmail program
varies between systems; sometimes it is ``/usr/lib/sendmail``, sometimes
``/usr/sbin/sendmail``. The sendmail manual page will help you out. Here's
some sample code::

   SENDMAIL = "/usr/sbin/sendmail"  # sendmail location
   import os
   p = os.popen("%s -t -i" % SENDMAIL, "w")
   p.write("To: receiver@example.com\n")
   p.write("Subject: test\n")
   p.write("\n")  # blank line separating headers from body
   p.write("Some text\n")
   p.write("some more text\n")
   sts = p.close()  # None means the command exited successfully
   if sts is not None:
       print("Sendmail exit status", sts)

756
757How do I avoid blocking in the connect() method of a socket?
758------------------------------------------------------------
759
760The select module is commonly used to help with asynchronous I/O on sockets.
761
762To prevent the TCP connect from blocking, you can set the socket to non-blocking
763mode. Then when you do the ``connect()``, you will either connect immediately
764(unlikely) or get an exception that contains the error number as ``.errno``.
765``errno.EINPROGRESS`` indicates that the connection is in progress, but hasn't
766finished yet. Different OSes will return different values, so you're going to
767have to check what's returned on your system.
768
769You can use the ``connect_ex()`` method to avoid creating an exception. It will
770just return the errno value. To poll, you can call ``connect_ex()`` again later
771-- 0 or ``errno.EISCONN`` indicate that you're connected -- or you can pass this
772socket to select to check if it's writable.
773
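The approach described above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the helper name ``nonblocking_connect`` and the five-second timeout are made up for the example, and it assumes a Unix-like system where a finished connect attempt shows up as a writable socket.

```python
import errno
import os
import select
import socket


def nonblocking_connect(host, port, timeout=5.0):
    """Start a TCP connect without blocking, then wait for it with select()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)

    # connect_ex() returns an errno value instead of raising an exception.
    err = sock.connect_ex((host, port))
    in_progress = (errno.EINPROGRESS, errno.EWOULDBLOCK, errno.EALREADY)
    if err not in (0, errno.EISCONN) and err not in in_progress:
        sock.close()
        raise OSError(err, os.strerror(err))

    # The socket becomes writable once the connect attempt has finished.
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        sock.close()
        raise TimeoutError("connect timed out")

    # SO_ERROR reports whether the attempt succeeded (0) or failed.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, os.strerror(err))
    return sock
```

In a real program you would normally do other work between ``connect_ex()`` and ``select()`` rather than waiting straight away.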
774
Databases
=========

Are there any interfaces to database packages in Python?
--------------------------------------------------------

Yes.

Interfaces to disk-based hashes such as :mod:`DBM <dbm.ndbm>` and :mod:`GDBM
<dbm.gnu>` are included with standard Python. There is also the
:mod:`sqlite3` module, which provides a lightweight disk-based relational
database.

Support for most relational databases is available. See the
`DatabaseProgramming wiki page
<http://wiki.python.org/moin/DatabaseProgramming>`_ for details.
791
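As a small illustration of the :mod:`sqlite3` module mentioned above, the following sketch creates a table, inserts two rows and reads them back. The table and column names are hypothetical, and an in-memory database is used so the example leaves no file behind.

```python
import sqlite3

# ":memory:" keeps the database in RAM; pass a file name for a disk-based one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE langs (name TEXT, year INTEGER)")

# "?" placeholders let the module handle quoting of the values.
conn.executemany("INSERT INTO langs VALUES (?, ?)",
                 [("Python", 1991), ("C", 1972)])
conn.commit()

rows = conn.execute("SELECT name FROM langs ORDER BY year").fetchall()
conn.close()
```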
792
How do you implement persistent objects in Python?
--------------------------------------------------

The :mod:`pickle` library module solves this in a very general way (though you
still can't store things like open files, sockets or windows), and the
:mod:`shelve` library module uses pickle and (g)dbm to create persistent
mappings containing arbitrary Python objects.

A more awkward way of doing things is to use pickle's little sister, marshal.
The :mod:`marshal` module provides very fast ways to store noncircular basic
Python types to files and strings, and back again. Although marshal does not do
fancy things like store instances or handle shared references properly, it does
run extremely fast. For example, loading a half megabyte of data may take less
than a third of a second. This often beats doing something more complex and
general such as using gdbm with pickle/shelve.
808
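The three modules discussed above can be compared in a short sketch, assuming Python 3. The record contents and the shelf file name are made up for illustration, and the shelf lives in a temporary directory that is removed afterwards.

```python
import datetime
import marshal
import os
import pickle
import shelve
import tempfile

record = {"name": "example", "created": datetime.date(2009, 10, 27)}

# pickle: serialize an arbitrary object to bytes and back.
restored = pickle.loads(pickle.dumps(record))

# shelve: a persistent, dictionary-like mapping backed by a dbm file.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "records")
    with shelve.open(path) as shelf:
        shelf["first"] = record
    with shelve.open(path) as shelf:  # reopen to show the data persisted
        from_shelf = shelf["first"]

# marshal: fast, but limited to basic types such as numbers, strings,
# lists and dicts.
basic = marshal.loads(marshal.dumps({"values": [1, 2, 3]}))
```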
809
Why is cPickle so slow?
-----------------------

.. XXX update this, default protocol is 2/3

The default format used by the pickle module is a slow one that results in
readable pickles. A binary protocol is faster. Making it the default would be
preferable, but it would break backward compatibility, so select it
explicitly::

   import cPickle

   largeString = 'z' * (100 * 1024)
   myPickle = cPickle.dumps(largeString, protocol=1)
821
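For comparison, here is a sketch of the same idea on Python 3, where the C implementation is part of the :mod:`pickle` module itself and the protocol is chosen in the same way. Protocol 0 is the text-based format and ``pickle.HIGHEST_PROTOCOL`` selects the newest binary one. The variable names are illustrative only.

```python
import pickle

large_string = 'z' * (100 * 1024)

# Protocol 0 is the original text-based format.
text_pickle = pickle.dumps(large_string, protocol=0)

# HIGHEST_PROTOCOL selects the most recent binary format available.
binary_pickle = pickle.dumps(large_string, protocol=pickle.HIGHEST_PROTOCOL)

# Both formats round-trip to an equal object.
from_text = pickle.loads(text_pickle)
from_binary = pickle.loads(binary_pickle)
```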
822
If my program crashes with a bsddb (or anydbm) database open, it gets corrupted. How come?
------------------------------------------------------------------------------------------

Databases opened for write access with the bsddb module (and often by the anydbm
module, since it will preferentially use bsddb) must explicitly be closed using
the ``.close()`` method of the database. The underlying library caches database
contents which need to be converted to on-disk form and written.

If you have initialized a new bsddb database but not written anything to it
before the program crashes, you will often wind up with a zero-length file and
encounter an exception the next time the file is opened.
834
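One way to make sure ``.close()`` always runs is a ``try``/``finally`` block. The sketch below assumes Python 3 and uses :mod:`dbm.dumb` as a stand-in, since bsddb is not available there; the file name and the key are hypothetical. The same pattern applies to any database object with a ``close()`` method.

```python
import dbm.dumb
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example")

    db = dbm.dumb.open(path, "c")  # "c": create the database if needed
    try:
        db["key"] = "value"
    finally:
        db.close()  # runs even if the code above raises an exception

    db = dbm.dumb.open(path, "r")  # reopen read-only to check the contents
    try:
        value = db["key"]  # values come back as bytes
    finally:
        db.close()
```

This does not protect against a hard crash of the interpreter, but it does cover ordinary exceptions.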
835
I tried to open Berkeley DB file, but bsddb produces bsddb.error: (22, 'Invalid argument'). Help! How can I restore my data?
----------------------------------------------------------------------------------------------------------------------------

Don't panic! Your data is probably intact. The most frequent cause for the error
is that you tried to open an earlier Berkeley DB file with a later version of
the Berkeley DB library.

Many Linux systems now have all three versions of Berkeley DB available. If you
are migrating from version 1 to a newer version, use ``db_dump185`` to dump a
plain text version of the database. If you are migrating from version 2 to
version 3, use ``db2_dump`` to create a plain text version of the database. In
either case, use ``db_load`` to create a new native database for the latest
version installed on your computer. If you have version 3 of Berkeley DB
installed, you should be able to use ``db2_load`` to create a native version 2
database.

You should move away from Berkeley DB version 1 files because the hash file code
contains known bugs that can corrupt your data.
853
854
Mathematics and Numerics
========================

How do I generate random numbers in Python?
-------------------------------------------

The standard module :mod:`random` implements a random number generator. Usage
is simple::

   import random
   random.random()

This returns a random floating point number in the range [0, 1).

There are also many other specialized generators in this module, such as:

* ``randrange(a, b)`` chooses an integer in the range [a, b).
* ``uniform(a, b)`` chooses a floating point number in the range [a, b).
* ``normalvariate(mean, sdev)`` samples the normal (Gaussian) distribution.

Some higher-level functions operate on sequences directly, such as:

* ``choice(S)`` chooses a random element from a given sequence.
* ``shuffle(L)`` shuffles a list in-place, i.e. permutes it randomly.

There's also a ``Random`` class you can instantiate to create multiple
independent random number generators.
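
The functions listed above can be tried out with a short sketch, assuming Python 3. The argument values and the seed 42 are arbitrary choices for illustration. Two ``Random`` instances created with the same seed produce the same sequence, independently of the module-level generator.

```python
import random

x = random.random()                 # float in [0, 1)
n = random.randrange(1, 7)          # integer from 1 to 6, like a die roll
u = random.uniform(2.5, 10.0)       # float between 2.5 and 10.0
g = random.normalvariate(0.0, 1.0)  # sample with mean 0 and sdev 1

deck = list(range(10))
card = random.choice(deck)          # one element picked at random
random.shuffle(deck)                # reorders the list in place

# Independent generators: the same seed gives the same sequence.
gen_a = random.Random(42)
gen_b = random.Random(42)
seq_a = [gen_a.random() for _ in range(3)]
seq_b = [gen_b.random() for _ in range(3)]
```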