:tocdepth: 2

=========================
Library and Extension FAQ
=========================

.. contents::

General Library Questions
=========================

How do I find a module or application to perform task X?
--------------------------------------------------------

Check :ref:`the Library Reference <library-index>` to see if there's a relevant
standard library module. (Eventually you'll learn what's in the standard
library and will be able to skip this step.)

For third-party packages, search the `Python Package Index
<http://pypi.python.org/pypi>`_ or try `Google <http://www.google.com>`_ or
another Web search engine. Searching for "Python" plus a keyword or two for
your topic of interest will usually find something helpful.


Where is the math.py (socket.py, regex.py, etc.) source file?
-------------------------------------------------------------

If you can't find a source file for a module, it may be a built-in or
dynamically loaded module implemented in C, C++ or another compiled language.
In this case you may not have the source file, or it may be something like
:file:`mathmodule.c`, somewhere in a C source directory (not on the Python path).

There are (at least) three kinds of modules in Python:

1) modules written in Python (.py);
2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc.);
3) modules written in C and linked with the interpreter; to get a list of these,
   type::

      import sys
      print(sys.builtin_module_names)

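To see which of these kinds a particular module is, one option is to ask the
import system where it would load the module from. The following is a minimal
sketch, not part of the original answer: ``describe_module`` is a hypothetical
helper name, and the exact ``origin`` values vary between platforms and builds::

   import importlib.util

   def describe_module(name):
       """Return the origin reported by the import system for *name*."""
       spec = importlib.util.find_spec(name)
       if spec is None:
           return None
       # 'built-in' for modules linked into the interpreter; otherwise
       # usually the path of a .py file or of a shared library.
       return spec.origin

   print(describe_module("sys"))   # linked into the interpreter
   print(describe_module("json"))  # a package written in Python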

How do I make a Python script executable on Unix?
-------------------------------------------------

You need to do two things: the script file's mode must be executable and the
first line must begin with ``#!`` followed by the path of the Python
interpreter.

The first is done by executing ``chmod +x scriptfile`` or perhaps ``chmod 755
scriptfile``.

The second can be done in a number of ways. The most straightforward way is to
write ::

   #!/usr/local/bin/python

as the very first line of your file, using the pathname for where the Python
interpreter is installed on your platform.

If you would like the script to be independent of where the Python interpreter
lives, you can use the :program:`env` program. Almost all Unix variants support
the following, assuming the Python interpreter is in a directory on the user's
:envvar:`PATH`::

   #!/usr/bin/env python

*Don't* do this for CGI scripts. The :envvar:`PATH` variable for CGI scripts is
often very minimal, so you need to use the actual absolute pathname of the
interpreter.

Occasionally, a user's environment is so full that the :program:`/usr/bin/env`
program fails; or there's no env program at all. In that case, you can try the
following hack (due to Alex Rezinsky)::

   #! /bin/sh
   """:"
   exec python $0 ${1+"$@"}
   """

The minor disadvantage is that this defines the script's ``__doc__`` string.
However, you can fix that by adding ::

   __doc__ = """...Whatever..."""



Is there a curses/termcap package for Python?
---------------------------------------------

.. XXX curses *is* built by default, isn't it?

For Unix variants: The standard Python source distribution comes with a curses
module in the :source:`Modules` subdirectory, though it's not compiled by default.
(Note that this is not available in the Windows distribution -- there is no
curses module for Windows.)

The :mod:`curses` module supports basic curses features as well as many additional
functions from ncurses and SYSV curses such as colour, alternative character set
support, pads, and mouse support. This means the module isn't compatible with
operating systems that only have BSD curses, but there don't seem to be any
currently maintained OSes that fall into this category.

For Windows: use `the consolelib module
<http://effbot.org/zone/console-index.htm>`_.


Is there an equivalent to C's onexit() in Python?
-------------------------------------------------

The :mod:`atexit` module provides a register function that is similar to C's
:c:func:`onexit`.

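As a rough illustration (a sketch rather than part of the original answer;
``goodbye`` is a made-up function name), a cleanup function can be registered
like this::

   import atexit

   def goodbye(name):
       # Runs when the interpreter exits normally.
       print("Goodbye,", name)

   # Extra arguments given to register() are passed on to the function.
   atexit.register(goodbye, "world")

Functions registered this way are not called if the process is killed by a
signal that Python does not handle, or if it exits via :func:`os._exit`.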

Why don't my signal handlers work?
----------------------------------

The most common problem is that the signal handler is declared with the wrong
argument list. It is called as ::

   handler(signum, frame)

so it should be declared with two arguments::

   def handler(signum, frame):
       ...

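A fuller sketch, assuming a Unix-like platform where ``SIGUSR1`` exists (the
names ``received`` and ``handler`` are illustrative only)::

   import os
   import signal

   received = []

   def handler(signum, frame):
       # signum is the signal number; frame is the interrupted stack frame.
       received.append(signum)

   # Install the handler, then send the signal to this very process.
   signal.signal(signal.SIGUSR1, handler)
   os.kill(os.getpid(), signal.SIGUSR1)

Python-level handlers always run in the main thread, so the handler should be
installed from there.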

Common tasks
============

How do I test a Python program or component?
--------------------------------------------

Python comes with two testing frameworks. The :mod:`doctest` module finds
examples in the docstrings for a module and runs them, comparing the output with
the expected output given in the docstring.

The :mod:`unittest` module is a fancier testing framework modelled on Java and
Smalltalk testing frameworks.

To make testing easier, you should use good modular design in your program.
Your program should have almost all functionality
encapsulated in either functions or class methods -- and this sometimes has the
surprising and delightful effect of making the program run faster (because local
variable accesses are faster than global accesses). Furthermore the program
should avoid depending on mutating global variables, since this makes testing
much more difficult to do.

The "global main logic" of your program may be as simple as ::

   if __name__ == "__main__":
       main_logic()

at the bottom of the main module of your program.

Once your program is organized as a tractable collection of functions and class
behaviours you should write test functions that exercise the behaviours. A test
suite that automates a sequence of tests can be associated with each module.
This sounds like a lot of work, but since Python is so terse and flexible it's
surprisingly easy. You can make coding much more pleasant and fun by writing
your test functions in parallel with the "production code", since this makes it
easy to find bugs and even design flaws earlier.

"Support modules" that are not intended to be the main module of a program may
include a self-test of the module. ::

   if __name__ == "__main__":
       self_test()

Even programs that interact with complex external interfaces may be tested when
the external interfaces are unavailable by using "fake" interfaces implemented
in Python.

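The last point can be sketched with :mod:`unittest`. Everything below is
illustrative: ``FakeThermometer`` stands in for some hardware or network
interface, and ``is_freezing`` is a hypothetical function under test::

   import unittest

   class FakeThermometer:
       """Fake replacement for a device driver; returns a canned reading."""
       def __init__(self, reading):
           self.reading = reading

       def read_celsius(self):
           return self.reading

   def is_freezing(thermometer):
       # The production code relies only on the read_celsius() method.
       return thermometer.read_celsius() <= 0

   class FreezingTest(unittest.TestCase):
       def test_below_zero(self):
           self.assertTrue(is_freezing(FakeThermometer(-5)))

       def test_above_zero(self):
           self.assertFalse(is_freezing(FakeThermometer(20)))

Saved in a module, tests like these can be run with ``python -m unittest``.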

How do I create documentation from doc strings?
-----------------------------------------------

The :mod:`pydoc` module can create HTML from the doc strings in your Python
source code. An alternative for creating API documentation purely from
docstrings is `epydoc <http://epydoc.sf.net/>`_. `Sphinx
<http://sphinx.pocoo.org>`_ can also include docstring content.


How do I get a single keypress at a time?
-----------------------------------------

For Unix variants there are several solutions. It's straightforward to do this
using curses, but curses is a fairly large module to learn.

.. XXX this doesn't work out of the box, some IO expert needs to check why

   Here's a solution without curses::

      import termios, fcntl, sys, os
      fd = sys.stdin.fileno()

      oldterm = termios.tcgetattr(fd)
      newattr = termios.tcgetattr(fd)
      newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
      termios.tcsetattr(fd, termios.TCSANOW, newattr)

      oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
      fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)

      try:
          while True:
              try:
                  c = sys.stdin.read(1)
                  print("Got character", repr(c))
              except IOError:
                  pass
      finally:
          termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
          fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)

   You need the :mod:`termios` and the :mod:`fcntl` module for any of this to
   work, and I've only tried it on Linux, though it should work elsewhere. In
   this code, characters are read and printed one at a time.

   :func:`termios.tcsetattr` turns off stdin's echoing and disables canonical
   mode. :func:`fcntl.fcntl` is used to obtain stdin's file descriptor flags
   and modify them for non-blocking mode. Since reading stdin when it is empty
   results in an :exc:`IOError`, this error is caught and ignored.


Threads
=======

How do I program using threads?
-------------------------------

Be sure to use the :mod:`threading` module and not the :mod:`_thread` module.
The :mod:`threading` module builds convenient abstractions on top of the
low-level primitives provided by the :mod:`_thread` module.

Aahz has a set of slides from his threading tutorial that are helpful; see
http://www.pythoncraft.com/OSCON2001/.


None of my threads seem to run: why?
------------------------------------

As soon as the main thread exits, all threads are killed. Your main thread is
running too quickly, giving the threads no time to do any work.

A simple fix is to add a sleep to the end of the program that's long enough for
all the threads to finish::

   import threading, time

   def thread_task(name, n):
       for i in range(n): print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)  # <---------------------------!

But now (on many platforms) the threads don't run in parallel, but appear to run
sequentially, one at a time! The reason is that the OS thread scheduler doesn't
start a new thread until the previous thread is blocked.

A simple fix is to add a tiny sleep to the start of the run function::

   def thread_task(name, n):
       time.sleep(0.001)  # <--------------------!
       for i in range(n): print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)

Instead of trying to guess a good delay value for :func:`time.sleep`,
it's better to use some kind of semaphore mechanism. One idea is to use the
:mod:`queue` module to create a queue object, let each thread append a token to
the queue when it finishes, and let the main thread read as many tokens from the
queue as there are threads.

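A minimal sketch of that idea follows. The names ``done`` and ``NUM_THREADS``
are illustrative and do not come from the original answer::

   import queue
   import threading

   NUM_THREADS = 10
   done = queue.Queue()

   def thread_task(name, n):
       for i in range(n):
           print(name, i)
       # Signal completion by putting a token on the queue.
       done.put(name)

   for i in range(NUM_THREADS):
       threading.Thread(target=thread_task, args=(str(i), i)).start()

   # Block until every thread has reported in; no sleep() guesswork needed.
   finished = [done.get() for _ in range(NUM_THREADS)]

Each ``done.get()`` call blocks until a token is available, so the main thread
resumes only after all ten threads have finished their loops.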

How do I parcel out work among a bunch of worker threads?
---------------------------------------------------------

The easiest way is to use the new :mod:`concurrent.futures` module,
especially the :class:`~concurrent.futures.ThreadPoolExecutor` class.

Or, if you want fine control over the dispatching algorithm, you can write
your own logic manually. Use the :mod:`queue` module to create a queue
containing a list of jobs. The :class:`~queue.Queue` class maintains a
list of objects and has a ``.put(obj)`` method that adds items to the queue and
a ``.get()`` method to return them. The class will take care of the locking
necessary to ensure that each job is handed out exactly once.

Here's a trivial example::

   import threading, queue, time

   # The worker thread gets jobs off the queue. When the queue is empty, it
   # assumes there will be no more work and exits.
   # (Realistically workers will run until terminated.)
   def worker():
       print('Running worker')
       time.sleep(0.1)
       while True:
           try:
               arg = q.get(block=False)
           except queue.Empty:
               print('Worker', threading.current_thread(), end=' ')
               print('queue empty')
               break
           else:
               print('Worker', threading.current_thread(), end=' ')
               print('running with argument', arg)
               time.sleep(0.5)

   # Create queue
   q = queue.Queue()

   # Start a pool of 5 workers
   for i in range(5):
       t = threading.Thread(target=worker, name='worker %i' % (i+1))
       t.start()

   # Begin adding work to the queue
   for i in range(50):
       q.put(i)

   # Give threads time to run
   print('Main thread sleeping')
   time.sleep(5)

When run, this will produce output like the following (the thread identifiers
will differ from run to run):

.. code-block:: none

   Running worker
   Running worker
   Running worker
   Running worker
   Running worker
   Main thread sleeping
   Worker <Thread(worker 1, started 130283832797456)> running with argument 0
   Worker <Thread(worker 2, started 130283824404752)> running with argument 1
   Worker <Thread(worker 3, started 130283816012048)> running with argument 2
   Worker <Thread(worker 4, started 130283807619344)> running with argument 3
   Worker <Thread(worker 5, started 130283799226640)> running with argument 4
   Worker <Thread(worker 1, started 130283832797456)> running with argument 5
   ...

Consult the module's documentation for more details; the :class:`~queue.Queue`
class provides a featureful interface.

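For comparison, the :class:`~concurrent.futures.ThreadPoolExecutor` approach
mentioned at the start of this answer might look like the sketch below. The
``process`` function is a stand-in for real work::

   from concurrent.futures import ThreadPoolExecutor

   def process(arg):
       # Placeholder job: a real worker would do I/O or other blocking work.
       return arg * arg

   # The pool hands each job to exactly one of the 5 worker threads;
   # map() returns the results in the order the jobs were submitted.
   with ThreadPoolExecutor(max_workers=5) as executor:
       results = list(executor.map(process, range(50)))

Leaving the ``with`` block waits for all pending jobs, so no sleep is needed.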

What kinds of global value mutation are thread-safe?
----------------------------------------------------

A :term:`global interpreter lock` (GIL) is used internally to ensure that only one
thread runs in the Python VM at a time. In general, Python offers to switch
among threads only between bytecode instructions; how frequently it switches can
be set via :func:`sys.setswitchinterval`. Each bytecode instruction, and
therefore all the C implementation code reached from each instruction, is
atomic from the point of view of a Python program.

In theory, this means an exact accounting requires an exact understanding of the
PVM bytecode implementation. In practice, it means that operations on shared
variables of built-in data types (ints, lists, dicts, etc.) that "look atomic"
really are.

For example, the following operations are all atomic (L, L1, L2 are lists, D,
D1, D2 are dicts, x, y are objects, i, j are ints)::

   L.append(x)
   L1.extend(L2)
   x = L[i]
   x = L.pop()
   L1[i:j] = L2
   L.sort()
   x = y
   x.field = y
   D[x] = y
   D1.update(D2)
   D.keys()

These aren't::

   i = i+1
   L.append(L[-1])
   L[i] = L[j]
   D[x] = D[x] + 1

Operations that replace other objects may invoke those other objects'
:meth:`__del__` method when their reference count reaches zero, and that can
affect things. This is especially true for the mass updates to dictionaries and
lists. When in doubt, use a mutex!

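As a sketch of the mutex approach (all names are illustrative), a
:class:`threading.Lock` can guard a read-modify-write such as
``D[x] = D[x] + 1``::

   import threading

   counts = {"hits": 0}
   counts_lock = threading.Lock()

   def record_hits(n):
       for _ in range(n):
           # The lock makes the read, add and store one critical section.
           with counts_lock:
               counts["hits"] = counts["hits"] + 1

   threads = [threading.Thread(target=record_hits, args=(10000,))
              for _ in range(4)]
   for t in threads:
       t.start()
   for t in threads:
       t.join()

   print(counts["hits"])  # 40000

Without the lock, two threads could read the same old value and one of the
increments would be lost.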

Can't we get rid of the Global Interpreter Lock?
------------------------------------------------

.. XXX link to dbeazley's talk about GIL?

The :term:`global interpreter lock` (GIL) is often seen as a hindrance to Python's
deployment on high-end multiprocessor server machines, because a multi-threaded
Python program effectively only uses one CPU, due to the insistence that
(almost) all Python code can only run while the GIL is held.

Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive
patch set (the "free threading" patches) that removed the GIL and replaced it
with fine-grained locking. Adam Olsen recently did a similar experiment
in his `python-safethread <http://code.google.com/p/python-safethread/>`_
project. Unfortunately, both experiments exhibited a sharp drop in single-thread
performance (at least 30% slower), due to the amount of fine-grained locking
necessary to compensate for the removal of the GIL.

This doesn't mean that you can't make good use of Python on multi-CPU machines!
You just have to be creative with dividing the work up between multiple
*processes* rather than multiple *threads*. The
:class:`~concurrent.futures.ProcessPoolExecutor` class in the new
:mod:`concurrent.futures` module provides an easy way of doing so; the
:mod:`multiprocessing` module provides a lower-level API in case you want
more control over dispatching of tasks.

Judicious use of C extensions will also help; if you use a C extension to
perform a time-consuming task, the extension can release the GIL while the
thread of execution is in the C code and allow other threads to get some work
done. Some standard library modules such as :mod:`zlib` and :mod:`hashlib`
already do this.

It has been suggested that the GIL should be a per-interpreter-state lock rather
than truly global; interpreters then wouldn't be able to share objects.
Unfortunately, this isn't likely to happen either. It would be a tremendous
amount of work, because many object implementations currently have global state.
For example, small integers and short strings are cached; these caches would
have to be moved to the interpreter state. Other object types have their own
free list; these free lists would have to be moved to the interpreter state.
And so on.

And I doubt that it can even be done in finite time, because the same problem
exists for 3rd party extensions. It is likely that 3rd party extensions are
being written at a faster rate than you can convert them to store all their
global state in the interpreter state.

And finally, once you have multiple interpreters not sharing any state, what
have you gained over running each interpreter in a separate process?


Input and Output
================

How do I delete a file? (And other file questions...)
-----------------------------------------------------

Use ``os.remove(filename)`` or ``os.unlink(filename)``; for documentation, see
the :mod:`os` module. The two functions are identical; :func:`~os.unlink` is simply
the name of the Unix system call for this function.

To remove a directory, use :func:`os.rmdir`; use :func:`os.mkdir` to create one.
``os.makedirs(path)`` will create any intermediate directories in ``path`` that
don't exist. ``os.removedirs(path)`` will remove intermediate directories as
long as they're empty; if you want to delete an entire directory tree and its
contents, use :func:`shutil.rmtree`.

To rename a file, use ``os.rename(old_path, new_path)``.

To truncate a file, open it using ``f = open(filename, "rb+")``, and use
``f.truncate(offset)``; offset defaults to the current seek position. There's
also ``os.ftruncate(fd, offset)`` for files opened with :func:`os.open`, where
*fd* is the file descriptor (a small integer).

The :mod:`shutil` module also contains a number of functions to work on files
including :func:`~shutil.copyfile`, :func:`~shutil.copytree`, and
:func:`~shutil.rmtree`.

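The calls above can be tried out safely inside a scratch directory. This is a
sketch only; the directory and file names are arbitrary::

   import os
   import shutil
   import tempfile

   base = tempfile.mkdtemp()                 # scratch directory
   nested = os.path.join(base, "a", "b")
   os.makedirs(nested)                       # creates both "a" and "a/b"

   old_path = os.path.join(nested, "old.txt")
   new_path = os.path.join(nested, "new.txt")
   with open(old_path, "wb") as f:
       f.write(b"0123456789")

   os.rename(old_path, new_path)

   with open(new_path, "rb+") as f:
       f.truncate(4)                         # keep only the first 4 bytes
   size_after_truncate = os.path.getsize(new_path)

   os.remove(new_path)
   shutil.rmtree(base)                       # remove the whole tree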

How do I copy a file?
---------------------

The :mod:`shutil` module contains a :func:`~shutil.copyfile` function. Note
that on MacOS 9 it doesn't copy the resource fork and Finder info.


How do I read (or write) binary data?
-------------------------------------

To read or write complex binary data formats, it's best to use the :mod:`struct`
module. It allows you to take a bytes object containing binary data (usually
numbers) and convert it to Python objects; and vice versa.

For example, the following code reads two 2-byte integers and one 4-byte integer
in big-endian format from a file::

   import struct

   with open(filename, "rb") as f:
       s = f.read(8)
       x, y, z = struct.unpack(">hhl", s)

The '>' in the format string forces big-endian data; the letter 'h' reads one
"short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the
data.

For data that is more regular (e.g. a homogeneous list of ints or floats),
you can also use the :mod:`array` module.

.. note::
   To read and write binary data, it is mandatory to open the file in
   binary mode (here, passing ``"rb"`` to :func:`open`). If you use
   ``"r"`` instead (the default), the file will be opened in text mode
   and ``f.read()`` will return :class:`str` objects rather than
   :class:`bytes` objects.

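Writing works the same way in reverse with :func:`struct.pack`. The sketch
below, which uses made-up values, packs the same layout and unpacks it again::

   import struct

   # Two 2-byte integers and one 4-byte integer, big-endian: 8 bytes in all.
   data = struct.pack(">hhl", 1, -2, 300000)
   assert len(data) == struct.calcsize(">hhl") == 8

   x, y, z = struct.unpack(">hhl", data)
   print(x, y, z)  # 1 -2 300000

The resulting ``data`` is a :class:`bytes` object that can be written to a file
opened in ``"wb"`` mode.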

I can't seem to use os.read() on a pipe created with os.popen(); why?
---------------------------------------------------------------------

:func:`os.read` is a low-level function which takes a file descriptor, a small
integer representing the opened file. :func:`os.popen` creates a high-level
file object, the same type returned by the built-in :func:`open` function.
Thus, to read *n* bytes from a pipe *p* created with :func:`os.popen`, you need to
use ``p.read(n)``.


.. XXX update to use subprocess. See the :ref:`subprocess-replacements` section.

   How do I run a subprocess with pipes connected to both input and output?
   ------------------------------------------------------------------------

   Use the :mod:`popen2` module. For example::

      import popen2
      fromchild, tochild = popen2.popen2("command")
      tochild.write("input\n")
      tochild.flush()
      output = fromchild.readline()

   Warning: in general it is unwise to do this because you can easily cause a
   deadlock where your process is blocked waiting for output from the child
   while the child is blocked waiting for input from you. This can be caused
   by the parent expecting the child to output more text than it does or
   by data being stuck in stdio buffers due to lack of flushing.
   The Python parent can of course explicitly flush the data it sends to the
   child before it reads any output, but if the child is a naive C program it
   may have been written to never explicitly flush its output, even if it is
   interactive, since flushing is normally automatic.

   Note that a deadlock is also possible if you use :func:`popen3` to read
   stdout and stderr. If one of the two is too large for the internal buffer
   (increasing the buffer size does not help) and you ``read()`` the other one
   first, there is a deadlock, too.

   Note on a bug in popen2: unless your program calls ``wait()`` or
   ``waitpid()``, finished child processes are never removed, and eventually
   calls to popen2 will fail because of a limit on the number of child
   processes. Calling :func:`os.waitpid` with the :data:`os.WNOHANG` option can
   prevent this; a good place to insert such a call would be before calling
   ``popen2`` again.

   In many cases, all you really need is to run some data through a command and
   get the result back. Unless the amount of data is very large, the easiest
   way to do this is to write it to a temporary file and run the command with
   that temporary file as input. The standard module :mod:`tempfile` exports a
   :func:`~tempfile.mktemp` function to generate unique temporary file names. ::

      import tempfile
      import os

      class Popen3:
          """
          This is a deadlock-safe version of popen that returns
          an object with errorlevel, out (a string) and err (a string).
          (capturestderr may not work under windows.)
          Example: print(Popen3('grep spam','\n\nhere spam\n\n').out)
          """
          def __init__(self,command,input=None,capturestderr=None):
              outfile=tempfile.mktemp()
              command="( %s ) > %s" % (command,outfile)
              if input:
                  infile=tempfile.mktemp()
                  open(infile,"w").write(input)
                  command=command+" <"+infile
              if capturestderr:
                  errfile=tempfile.mktemp()
                  command=command+" 2>"+errfile
              self.errorlevel=os.system(command) >> 8
              self.out=open(outfile,"r").read()
              os.remove(outfile)
              if input:
                  os.remove(infile)
              if capturestderr:
                  self.err=open(errfile,"r").read()
                  os.remove(errfile)

   Note that many interactive programs (e.g. vi) don't work well with pipes
   substituted for standard input and output. You will have to use pseudo ttys
   ("ptys") instead of pipes. Or you can use a Python interface to Don Libes'
   "expect" library. A Python extension that interfaces to expect is called
   "expy" and available from http://expectpy.sourceforge.net. A pure Python
   solution that works like expect is `pexpect
   <http://pypi.python.org/pypi/pexpect/>`_.


How do I access the serial (RS232) port?
----------------------------------------

For Win32, POSIX (Linux, BSD, etc.), Jython:

   http://pyserial.sourceforge.net

For Unix, see a Usenet post by Mitch Chapman:

   http://groups.google.com/groups?selm=34A04430.CF9@ohioee.com

617
618Why doesn't closing sys.stdout (stdin, stderr) really close it?
619---------------------------------------------------------------
620
Antoine Pitrou6a11a982010-09-15 10:08:31 +0000621Python :term:`file objects <file object>` are a high-level layer of
622abstraction on low-level C file descriptors.
Georg Brandld7413152009-10-11 21:25:26 +0000623
Antoine Pitrou6a11a982010-09-15 10:08:31 +0000624For most file objects you create in Python via the built-in :func:`open`
625function, ``f.close()`` marks the Python file object as being closed from
626Python's point of view, and also arranges to close the underlying C file
627descriptor. This also happens automatically in ``f``'s destructor, when
628``f`` becomes garbage.
Georg Brandld7413152009-10-11 21:25:26 +0000629
630But stdin, stdout and stderr are treated specially by Python, because of the
631special status also given to them by C. Running ``sys.stdout.close()`` marks
632the Python-level file object as being closed, but does *not* close the
Antoine Pitrou6a11a982010-09-15 10:08:31 +0000633associated C file descriptor.
Georg Brandld7413152009-10-11 21:25:26 +0000634
Antoine Pitrou6a11a982010-09-15 10:08:31 +0000635To close the underlying C file descriptor for one of these three, you should
636first be sure that's what you really want to do (e.g., you may confuse
637extension modules trying to do I/O). If it is, use :func:`os.close`::
Georg Brandld7413152009-10-11 21:25:26 +0000638
Antoine Pitrou6a11a982010-09-15 10:08:31 +0000639 os.close(stdin.fileno())
640 os.close(stdout.fileno())
641 os.close(stderr.fileno())
642
643Or you can use the numeric constants 0, 1 and 2, respectively.
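
The difference between closing the Python-level object and closing the C-level
descriptor can be observed without touching the real standard streams. The
following is a minimal sketch, assuming that a file object opened with
``closefd=False`` is a fair stand-in for the way the standard streams behave;
the names ``fd`` and ``wrapper`` are illustrative only.

```python
import os

# Open a descriptor by hand, then wrap it without taking ownership of it.
fd = os.open(os.devnull, os.O_WRONLY)
wrapper = open(fd, "w", closefd=False)

wrapper.close()                      # closes only the Python-level object
python_level_closed = wrapper.closed

# The descriptor is still valid: os.fstat() would raise OSError otherwise.
try:
    os.fstat(fd)
    descriptor_still_open = True
except OSError:
    descriptor_still_open = False

os.close(fd)                         # now the descriptor itself is closed
```

After ``wrapper.close()`` the file object reports itself as closed, yet the
descriptor stays usable until ``os.close(fd)`` is called.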


Network/Internet Programming
============================

What WWW tools are there for Python?
------------------------------------

See the chapters titled :ref:`internet` and :ref:`netdata` in the Library
Reference Manual. Python has many modules that will help you build server-side
and client-side web systems.

.. XXX check if wiki page is still up to date

A summary of available frameworks is maintained by Paul Boddie at
http://wiki.python.org/moin/WebProgramming .

Cameron Laird maintains a useful set of pages about Python web technologies at
http://phaseit.net/claird/comp.lang.python/web_python.


How can I mimic CGI form submission (METHOD=POST)?
--------------------------------------------------

I would like to retrieve web pages that are the result of POSTing a form. Is
there existing code that would let me do this easily?

Yes. Here's a simple example that uses :mod:`urllib.request`::

   #!/usr/local/bin/python

   import urllib.request

   # build the query string; urlopen() requires POST data as bytes
   qs = "First=Josephine&MI=Q&Last=Public".encode('ascii')

   # connect and send the server a path
   req = urllib.request.urlopen('http://www.some-server.out-there'
                                '/cgi-bin/some-cgi-script', data=qs)
   with req:
       msg, hdrs = req.read(), req.info()

Note that in general for percent-encoded POST operations, query strings must be
quoted using :func:`urllib.parse.urlencode`. For example, to send
``name=Guy Steele, Jr.``::

   >>> import urllib.parse
   >>> urllib.parse.urlencode({'name': 'Guy Steele, Jr.'})
   'name=Guy+Steele%2C+Jr.'
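
The two steps described above, quoting the fields and posting them, can be
combined. The sketch below only builds the request object and does not contact
any server; the URL is a placeholder and the names ``fields``, ``body`` and
``request`` are illustrative assumptions, not part of the original example.

```python
import urllib.parse
import urllib.request

# Placeholder endpoint, used only to build the request object.
URL = "http://www.example.com/cgi-bin/some-cgi-script"

fields = {"First": "Josephine", "MI": "Q", "Last": "Public"}

# urlencode() percent-encodes the fields; POST data must then be bytes.
body = urllib.parse.urlencode(fields).encode("ascii")

# Supplying data turns this into a POST request. Nothing is sent yet.
request = urllib.request.Request(URL, data=body)

# To actually submit the form (requires a reachable server):
# with urllib.request.urlopen(request) as response:
#     page = response.read()
```

Building the ``Request`` separately makes it possible to inspect the method and
body before anything goes over the network.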

.. seealso:: :ref:`urllib-howto` for extensive examples.


What module should I use to help with generating HTML?
------------------------------------------------------

.. XXX add modern template languages

You can find a collection of useful links on the `Web Programming wiki page
<http://wiki.python.org/moin/WebProgramming>`_.


How do I send mail from a Python script?
----------------------------------------

Use the standard library module :mod:`smtplib`.

Here's a very simple interactive mail sender that uses it. This method will
work on any host that supports an SMTP listener. ::

   import sys, smtplib

   fromaddr = input("From: ")
   toaddrs = input("To: ").split(',')
   print("Enter message, end with ^D:")
   msg = ''
   while True:
       line = sys.stdin.readline()
       if not line:
           break
       msg += line

   # The actual mail send
   server = smtplib.SMTP('localhost')
   server.sendmail(fromaddr, toaddrs, msg)
   server.quit()
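
The interactive sender above passes the typed text through unchanged, so the
message carries only the headers the user enters by hand. As a hedged
alternative, the standard :mod:`email` package can build the headers. This is
a sketch assuming Python 3.6 or later and an SMTP listener on ``localhost``;
the helper names ``build_message`` and ``send`` and the addresses are
hypothetical.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipients, subject, body):
    """Return an EmailMessage with the usual headers filled in."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg, host="localhost"):
    """Hand the message to the SMTP listener on host (assumed to exist)."""
    with smtplib.SMTP(host) as server:
        server.send_message(msg)


# Placeholder addresses; nothing is sent until send() is called.
message = build_message("sender@example.com", ["receiver@example.com"],
                        "test", "Some text\n")
```

Calling ``send(message)`` would deliver it; the sketch stops short of that so
it can be run without a mail server.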

A Unix-only alternative uses sendmail. The location of the sendmail program
varies between systems; sometimes it is ``/usr/lib/sendmail``, sometimes
``/usr/sbin/sendmail``. The sendmail manual page will help you out. Here's
some sample code::

   import os

   SENDMAIL = "/usr/sbin/sendmail"  # sendmail location
   p = os.popen("%s -t -i" % SENDMAIL, "w")
   p.write("To: receiver@example.com\n")
   p.write("Subject: test\n")
   p.write("\n")  # blank line separating headers from body
   p.write("Some text\n")
   p.write("some more text\n")
   sts = p.close()  # None on success, otherwise the encoded exit status
   if sts is not None:
       print("Sendmail exit status", sts)
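
The same pipe-to-sendmail idea can also be written with :mod:`subprocess`,
which reports the exit status as a plain integer. This is a sketch rather than
a replacement for the sample above: the helper name ``pipe_to_command`` is
hypothetical, and the default command assumes sendmail lives at
``/usr/sbin/sendmail``.

```python
import subprocess


def pipe_to_command(message, command=("/usr/sbin/sendmail", "-t", "-i")):
    """Feed message to command on standard input; return its exit status."""
    result = subprocess.run(list(command), input=message.encode("utf-8"))
    return result.returncode


MESSAGE = (
    "To: receiver@example.com\n"
    "Subject: test\n"
    "\n"                      # blank line separating headers from body
    "Some text\n"
    "some more text\n"
)

# exit_status = pipe_to_command(MESSAGE)   # needs a working sendmail
```

Passing the command as a sequence avoids going through the shell, so the
sendmail path needs no quoting.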


How do I avoid blocking in the connect() method of a socket?
------------------------------------------------------------

The :mod:`select` module is commonly used to help with asynchronous I/O on
sockets.

To prevent the TCP connect from blocking, you can set the socket to non-blocking
mode. Then when you do the ``connect()``, you will either connect immediately
(unlikely) or get an exception that contains the error number as ``.errno``.
``errno.EINPROGRESS`` indicates that the connection is in progress, but hasn't
finished yet. Different OSes will return different values, so you're going to
have to check what's returned on your system.

You can use the ``connect_ex()`` method to avoid creating an exception. It will
just return the errno value. To poll, you can call ``connect_ex()`` again later
-- ``0`` or ``errno.EISCONN`` indicate that you're connected -- or you can pass
this socket to :func:`select.select` to check if it's writable.
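
The ``connect_ex()`` plus ``select`` approach can be sketched as follows. To
stay self-contained the example connects to a throwaway listener on the
loopback interface; the listener, the five-second timeout and the variable
names are assumptions made for illustration, and the errno value returned by
``connect_ex()`` may differ between platforms.

```python
import errno
import select
import socket

# A throwaway listener on the loopback interface keeps the sketch local.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
listener.listen(1)
address = listener.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.setblocking(False)

# connect_ex() returns an errno value instead of raising an exception.
# Typically this is errno.EINPROGRESS, or 0 if the connect finished at once.
status = client.connect_ex(address)

# Wait (up to 5 seconds) for the socket to become writable.
_, writable, _ = select.select([], [client], [], 5.0)

# Once writable, SO_ERROR tells whether the connect succeeded (0) or failed.
if writable:
    error = client.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
else:
    error = errno.ETIMEDOUT
connected = (error == 0)

client.close()
listener.close()
```

Checking ``SO_ERROR`` after the socket becomes writable distinguishes a
completed connection from a failed one, since both make the socket writable.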

.. note::
   The :mod:`asyncore` module presents a framework-like approach to the problem
   of writing non-blocking networking code.
   The third-party `Twisted <http://twistedmatrix.com/>`_ library is
   a popular and feature-rich alternative.


Databases
=========

Are there any interfaces to database packages in Python?
--------------------------------------------------------

Yes.

Interfaces to disk-based hashes such as :mod:`DBM <dbm.ndbm>` and :mod:`GDBM
<dbm.gnu>` are included with standard Python. There is also the
:mod:`sqlite3` module, which provides a lightweight disk-based relational
database.

Support for most relational databases is available. See the
`DatabaseProgramming wiki page
<http://wiki.python.org/moin/DatabaseProgramming>`_ for details.
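
As a brief illustration of the :mod:`sqlite3` module mentioned above, the
following sketch creates a table, inserts two rows and reads them back. It
uses an in-memory database so that no file is created; the table and column
names are invented for the example.

```python
import sqlite3

# ":memory:" keeps the database in RAM; pass a filename to store it on disk.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE people (name TEXT, age INTEGER)")

# Placeholders (?) let the module quote the values safely.
connection.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Josephine", 42), ("Guy", 58)],
)
connection.commit()

rows = connection.execute(
    "SELECT name, age FROM people ORDER BY age"
).fetchall()
connection.close()
```

Each row comes back as a tuple, so ``rows`` is a list of ``(name, age)``
pairs ordered by age.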


How do you implement persistent objects in Python?
--------------------------------------------------

The :mod:`pickle` library module solves this in a very general way (though you
still can't store things like open files, sockets or windows), and the
:mod:`shelve` library module uses pickle and (g)dbm to create persistent
mappings containing arbitrary Python objects.
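
A short sketch of both modules follows. It round-trips a dictionary through
:mod:`pickle` and then stores it in a shelf inside a temporary directory; the
file name ``example_shelf`` and the sample record are made up for the
illustration.

```python
import os
import pickle
import shelve
import tempfile

record = {"name": "Josephine", "scores": [1, 2, 3]}

# pickle: turn an object into bytes and back again.
restored = pickle.loads(pickle.dumps(record))

# shelve: a dictionary-like object whose contents live on disk.
with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "example_shelf")   # hypothetical file name
    with shelve.open(path) as shelf:
        shelf["record"] = record
    # Reopen the shelf to show that the data survived being closed.
    with shelve.open(path) as shelf:
        from_shelf = shelf["record"]
```

Note that unpickling data from an untrusted source can execute arbitrary code,
so both modules are best kept to data you created yourself.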


Mathematics and Numerics
========================

How do I generate random numbers in Python?
-------------------------------------------

The standard module :mod:`random` implements a random number generator. Usage
is simple::

   import random
   random.random()

This returns a random floating point number in the range [0, 1).

There are also many other specialized generators in this module, such as:

* ``randrange(a, b)`` chooses an integer in the range [a, b).
* ``uniform(a, b)`` chooses a floating point number in the range [a, b).
* ``normalvariate(mean, sdev)`` samples the normal (Gaussian) distribution.

Some higher-level functions operate on sequences directly, such as:

* ``choice(S)`` chooses a random element from a given sequence.
* ``shuffle(L)`` shuffles a list in-place, i.e. permutes it randomly.

There's also a ``Random`` class you can instantiate to create multiple
independent random number generators.
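
The functions listed above are also available as methods of ``Random``
instances. The sketch below uses an arbitrary seed (``12345``, chosen only
for illustration) to show that separately instantiated generators keep their
own state.

```python
import random

# Two generators seeded identically produce the same sequence,
# independently of each other and of the module-level functions.
first = random.Random(12345)
second = random.Random(12345)
same_sequence = ([first.random() for _ in range(3)] ==
                 [second.random() for _ in range(3)])

rng = random.Random(12345)
x = rng.random()                  # float in [0, 1)
n = rng.randrange(10, 20)         # integer in [10, 20)
u = rng.uniform(1.5, 2.5)         # float between 1.5 and 2.5
g = rng.normalvariate(0.0, 1.0)   # Gaussian sample, mean 0 and sdev 1

deck = list(range(10))
rng.shuffle(deck)                 # permutes the list in place
pick = rng.choice(deck)           # one element of the sequence
```

Seeding is useful for reproducible tests; omit the seed to have the generator
initialised from an operating-system source of randomness.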