:tocdepth: 2

=========================
Library and Extension FAQ
=========================

.. only:: html

   .. contents::

General Library Questions
=========================

How do I find a module or application to perform task X?
--------------------------------------------------------

Check :ref:`the Library Reference <library-index>` to see if there's a relevant
standard library module. (Eventually you'll learn what's in the standard
library and will be able to skip this step.)

For third-party packages, search the `Python Package Index
<https://pypi.org>`_ or try `Google <https://www.google.com>`_ or
another Web search engine. Searching for "Python" plus a keyword or two for
your topic of interest will usually find something helpful.


Where is the math.py (socket.py, regex.py, etc.) source file?
-------------------------------------------------------------

If you can't find a source file for a module, it may be a built-in or
dynamically loaded module implemented in C, C++ or another compiled language.
In this case you may not have the source file, or it may be something like
:file:`mathmodule.c`, somewhere in a C source directory (not on the Python path).

There are (at least) three kinds of modules in Python:

1) modules written in Python (.py);
2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc.);
3) modules written in C and linked with the interpreter; to get a list of these,
   type::

      import sys
      print(sys.builtin_module_names)

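
As a rough way to tell which of the three kinds a given module is, you can
inspect its import spec. This is only a sketch: ``module_origin`` is a
hypothetical helper name, and the exact origin strings and paths depend on the
platform and on how Python was built::

   import importlib.util

   def module_origin(name):
       """Return where the import system would load *name* from, or None."""
       spec = importlib.util.find_spec(name)
       return None if spec is None else spec.origin

   # 'sys' is linked into the interpreter, so no file path is reported.
   print(module_origin("sys"))
   # A module written in Python reports the path of its source file.
   print(module_origin("json"))
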

How do I make a Python script executable on Unix?
-------------------------------------------------

You need to do two things: the script file's mode must be executable and the
first line must begin with ``#!`` followed by the path of the Python
interpreter.

The first is done by executing ``chmod +x scriptfile`` or perhaps ``chmod 755
scriptfile``.

The second can be done in a number of ways. The most straightforward way is to
write ::

   #!/usr/local/bin/python

as the very first line of your file, using the pathname for where the Python
interpreter is installed on your platform.

If you would like the script to be independent of where the Python interpreter
lives, you can use the :program:`env` program. Almost all Unix variants support
the following, assuming the Python interpreter is in a directory on the user's
:envvar:`PATH`::

   #!/usr/bin/env python

*Don't* do this for CGI scripts. The :envvar:`PATH` variable for CGI scripts is
often very minimal, so you need to use the actual absolute pathname of the
interpreter.

Occasionally, a user's environment is so full that the :program:`/usr/bin/env`
program fails; or there's no env program at all. In that case, you can try the
following hack (due to Alex Rezinsky):

.. code-block:: sh

   #! /bin/sh
   """:"
   exec python $0 ${1+"$@"}
   """

The minor disadvantage is that this defines the script's ``__doc__`` string.
However, you can fix that by adding ::

   __doc__ = """...Whatever..."""


Is there a curses/termcap package for Python?
---------------------------------------------

.. XXX curses *is* built by default, isn't it?

For Unix variants: The standard Python source distribution comes with a curses
module in the :source:`Modules` subdirectory, though it's not compiled by default.
(Note that this is not available in the Windows distribution -- there is no
curses module for Windows.)

The :mod:`curses` module supports basic curses features as well as many additional
functions from ncurses and SYSV curses such as colour, alternative character set
support, pads, and mouse support. This means the module isn't compatible with
operating systems that only have BSD curses, but there don't seem to be any
currently maintained OSes that fall into this category.

For Windows: use `the consolelib module
<http://effbot.org/zone/console-index.htm>`_.


Is there an equivalent to C's onexit() in Python?
-------------------------------------------------

The :mod:`atexit` module provides a :func:`~atexit.register` function that is
similar to C's :c:func:`onexit`.

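
A minimal sketch of registering an exit handler follows; the ``goodbye``
function and its argument are made up for illustration::

   import atexit

   def goodbye(name):
       print("Goodbye,", name)

   # Extra arguments are stored and passed to the function at exit time.
   atexit.register(goodbye, "world")

The handler runs at normal interpreter termination. It is not called if the
process is killed by an unhandled signal or exits through :func:`os._exit`.
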

Why don't my signal handlers work?
----------------------------------

The most common problem is that the signal handler is declared with the wrong
argument list. It is called as ::

   handler(signum, frame)

so it should be declared with two parameters::

   def handler(signum, frame):
       ...

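
The following sketch installs such a two-parameter handler and triggers it.
It assumes Python 3.8 or later (for :func:`signal.raise_signal`) and that the
code runs in the main thread, since handlers can only be installed there; the
``received`` list is just an illustrative way to record the call::

   import signal

   received = []

   def handler(signum, frame):
       # Called with the signal number and the interrupted stack frame.
       received.append(signum)

   previous = signal.signal(signal.SIGINT, handler)  # install the handler
   signal.raise_signal(signal.SIGINT)                # send SIGINT to ourselves
   signal.signal(signal.SIGINT, previous)            # restore the old handler

   print(received)
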

Common tasks
============

How do I test a Python program or component?
--------------------------------------------

Python comes with two testing frameworks. The :mod:`doctest` module finds
examples in the docstrings for a module and runs them, comparing the output with
the expected output given in the docstring.

The :mod:`unittest` module is a fancier testing framework modelled on Java and
Smalltalk testing frameworks.

To make testing easier, you should use good modular design in your program.
Your program should have almost all functionality
encapsulated in either functions or class methods -- and this sometimes has the
surprising and delightful effect of making the program run faster (because local
variable accesses are faster than global accesses). Furthermore the program
should avoid depending on mutating global variables, since this makes testing
much more difficult to do.

The "global main logic" of your program may be as simple as ::

   if __name__ == "__main__":
       main_logic()

at the bottom of the main module of your program.

Once your program is organized as a tractable collection of function and class
behaviours, you should write test functions that exercise the behaviours. A
test suite that automates a sequence of tests can be associated with each module.
This sounds like a lot of work, but since Python is so terse and flexible it's
surprisingly easy. You can make coding much more pleasant and fun by writing
your test functions in parallel with the "production code", since this makes it
easy to find bugs and even design flaws earlier.

"Support modules" that are not intended to be the main module of a program may
include a self-test of the module. ::

   if __name__ == "__main__":
       self_test()

Even programs that interact with complex external interfaces may be tested when
the external interfaces are unavailable by using "fake" interfaces implemented
in Python.

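
To illustrate the two frameworks side by side, here is a small sketch that
tests one function with both. The ``average`` function and the
``AverageTests`` class are invented for this example::

   import doctest
   import unittest

   def average(values):
       """Return the arithmetic mean of a non-empty sequence.

       >>> average([1, 2, 3])
       2.0
       """
       return sum(values) / len(values)

   class AverageTests(unittest.TestCase):
       def test_single_value(self):
           self.assertEqual(average([4]), 4.0)

       def test_empty_sequence(self):
           with self.assertRaises(ZeroDivisionError):
               average([])

   # Run the docstring example; nothing is printed when it passes.
   doctest.run_docstring_examples(average, {"average": average})

   # Run the unittest cases and report whether they all passed.
   suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageTests)
   result = unittest.TextTestRunner().run(suite)
   print(result.wasSuccessful())
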

How do I create documentation from doc strings?
-----------------------------------------------

The :mod:`pydoc` module can create HTML from the doc strings in your Python
source code. An alternative for creating API documentation purely from
docstrings is `epydoc <http://epydoc.sourceforge.net/>`_. `Sphinx
<http://sphinx-doc.org>`_ can also include docstring content.

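
Besides writing HTML (for example with ``python -m pydoc -w modulename``),
:mod:`pydoc` can be driven from code. The sketch below renders the docstring
of a made-up ``average`` function as plain text::

   import pydoc

   def average(values):
       """Return the arithmetic mean of a non-empty sequence of numbers."""
       return sum(values) / len(values)

   # Render the documentation as plain text instead of paging it.
   text = pydoc.render_doc(average, renderer=pydoc.plaintext)
   print(text)
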

How do I get a single keypress at a time?
-----------------------------------------

For Unix variants there are several solutions. It's straightforward to do this
using curses, but curses is a fairly large module to learn.

.. XXX this doesn't work out of the box, some IO expert needs to check why

   Here's a solution without curses::

      import termios, fcntl, sys, os
      fd = sys.stdin.fileno()

      oldterm = termios.tcgetattr(fd)
      newattr = termios.tcgetattr(fd)
      newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
      termios.tcsetattr(fd, termios.TCSANOW, newattr)

      oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
      fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)

      try:
          while True:
              try:
                  c = sys.stdin.read(1)
                  print("Got character", repr(c))
              except OSError:
                  pass
      finally:
          termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
          fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)

   You need the :mod:`termios` and the :mod:`fcntl` module for any of this to
   work, and I've only tried it on Linux, though it should work elsewhere. In
   this code, characters are read and printed one at a time.

   :func:`termios.tcsetattr` turns off stdin's echoing and disables canonical
   mode. :func:`fcntl.fcntl` is used to obtain stdin's file descriptor flags
   and modify them for non-blocking mode. Since reading stdin when it is empty
   results in an :exc:`OSError`, this error is caught and ignored.

   .. versionchanged:: 3.3
      *sys.stdin.read* used to raise :exc:`IOError`. Starting from Python 3.3
      :exc:`IOError` is an alias for :exc:`OSError`.


Threads
=======

How do I program using threads?
-------------------------------

Be sure to use the :mod:`threading` module and not the :mod:`_thread` module.
The :mod:`threading` module builds convenient abstractions on top of the
low-level primitives provided by the :mod:`_thread` module.

Aahz has a set of slides from his threading tutorial that are helpful; see
http://www.pythoncraft.com/OSCON2001/.


None of my threads seem to run: why?
------------------------------------

As soon as the main thread exits, all threads are killed. Your main thread is
running too quickly, giving the threads no time to do any work.

A simple fix is to add a sleep to the end of the program that's long enough for
all the threads to finish::

   import threading, time

   def thread_task(name, n):
       for i in range(n):
           print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)  # <---------------------------!

But now (on many platforms) the threads don't run in parallel, but appear to run
sequentially, one at a time! The reason is that the OS thread scheduler doesn't
start a new thread until the previous thread is blocked.

A simple fix is to add a tiny sleep to the start of the run function::

   def thread_task(name, n):
       time.sleep(0.001)  # <--------------------!
       for i in range(n):
           print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)

Instead of trying to guess a good delay value for :func:`time.sleep`,
it's better to use some kind of semaphore mechanism. One idea is to use the
:mod:`queue` module to create a queue object, let each thread append a token to
the queue when it finishes, and let the main thread read as many tokens from the
queue as there are threads.

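
The queue-based idea can be sketched as follows. The extra ``done`` parameter
is an addition made for this example; each thread puts its name on the queue as
its completion token::

   import queue
   import threading

   def thread_task(name, n, done):
       for i in range(n):
           print(name, i)
       done.put(name)  # signal completion with a token

   done = queue.Queue()
   threads = [threading.Thread(target=thread_task, args=(str(i), i, done))
              for i in range(10)]
   for t in threads:
       t.start()

   # Block until one token per thread has arrived.
   finished = [done.get() for _ in threads]
   print(sorted(finished))

When all you need is to wait for the threads to end, calling ``t.join()`` on
each thread is a simpler alternative.
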

How do I parcel out work among a bunch of worker threads?
---------------------------------------------------------

The easiest way is to use the :mod:`concurrent.futures` module,
especially the :class:`~concurrent.futures.ThreadPoolExecutor` class.

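
A minimal sketch of the executor approach is shown here; the ``square``
function and the pool size of five are arbitrary choices for illustration::

   from concurrent.futures import ThreadPoolExecutor

   def square(n):
       return n * n

   # The pool hands each item to whichever worker thread is free.
   with ThreadPoolExecutor(max_workers=5) as executor:
       results = list(executor.map(square, range(10)))

   print(results)

``executor.map`` returns results in the order of the inputs, regardless of
which worker finished first.
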
Or, if you want fine control over the dispatching algorithm, you can write
your own logic manually. Use the :mod:`queue` module to create a queue
containing a list of jobs. The :class:`~queue.Queue` class maintains a
list of objects and has a ``.put(obj)`` method that adds items to the queue and
a ``.get()`` method to return them. The class will take care of the locking
necessary to ensure that each job is handed out exactly once.

Here's a trivial example::

   import threading, queue, time

   # The worker thread gets jobs off the queue.  When the queue is empty, it
   # assumes there will be no more work and exits.
   # (Realistically workers will run until terminated.)
   def worker():
       print('Running worker')
       time.sleep(0.1)
       while True:
           try:
               arg = q.get(block=False)
           except queue.Empty:
               print('Worker', threading.current_thread(), end=' ')
               print('queue empty')
               break
           else:
               print('Worker', threading.current_thread(), end=' ')
               print('running with argument', arg)
               time.sleep(0.5)

   # Create queue
   q = queue.Queue()

   # Start a pool of 5 workers
   for i in range(5):
       t = threading.Thread(target=worker, name='worker %i' % (i+1))
       t.start()

   # Begin adding work to the queue
   for i in range(50):
       q.put(i)

   # Give threads time to run
   print('Main thread sleeping')
   time.sleep(5)

When run, this will produce the following output:

.. code-block:: none

   Running worker
   Running worker
   Running worker
   Running worker
   Running worker
   Main thread sleeping
   Worker <Thread(worker 1, started 130283832797456)> running with argument 0
   Worker <Thread(worker 2, started 130283824404752)> running with argument 1
   Worker <Thread(worker 3, started 130283816012048)> running with argument 2
   Worker <Thread(worker 4, started 130283807619344)> running with argument 3
   Worker <Thread(worker 5, started 130283799226640)> running with argument 4
   Worker <Thread(worker 1, started 130283832797456)> running with argument 5
   ...

Consult the module's documentation for more details; the :class:`~queue.Queue`
class provides a featureful interface.


What kinds of global value mutation are thread-safe?
----------------------------------------------------

A :term:`global interpreter lock` (GIL) is used internally to ensure that only one
thread runs in the Python VM at a time. In general, Python offers to switch
among threads only between bytecode instructions; how frequently it switches can
be set via :func:`sys.setswitchinterval`. Each bytecode instruction, and
therefore all the C implementation code reached from each instruction, is
atomic from the point of view of a Python program.

In theory, this means an exact accounting requires an exact understanding of the
PVM bytecode implementation. In practice, it means that operations on shared
variables of built-in data types (ints, lists, dicts, etc.) that "look atomic"
really are.

For example, the following operations are all atomic (L, L1, L2 are lists, D,
D1, D2 are dicts, x, y are objects, i, j are ints)::

   L.append(x)
   L1.extend(L2)
   x = L[i]
   x = L.pop()
   L1[i:j] = L2
   L.sort()
   x = y
   x.field = y
   D[x] = y
   D1.update(D2)
   D.keys()

These aren't::

   i = i+1
   L.append(L[-1])
   L[i] = L[j]
   D[x] = D[x] + 1

Operations that replace other objects may invoke those other objects'
:meth:`__del__` method when their reference count reaches zero, and that can
affect things. This is especially true for the mass updates to dictionaries and
lists. When in doubt, use a mutex!

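
As a sketch of the "use a mutex" advice, the non-atomic increment can be
guarded with a :class:`threading.Lock`. The names ``counter``,
``counter_lock`` and ``increment`` and the thread and iteration counts are
illustrative::

   import threading

   counter = 0
   counter_lock = threading.Lock()

   def increment(times):
       global counter
       for _ in range(times):
           # The lock makes the read-modify-write sequence indivisible.
           with counter_lock:
               counter += 1

   threads = [threading.Thread(target=increment, args=(10000,))
              for _ in range(4)]
   for t in threads:
       t.start()
   for t in threads:
       t.join()

   print(counter)
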

Can't we get rid of the Global Interpreter Lock?
------------------------------------------------

.. XXX link to dbeazley's talk about GIL?

The :term:`global interpreter lock` (GIL) is often seen as a hindrance to Python's
deployment on high-end multiprocessor server machines, because a multi-threaded
Python program effectively only uses one CPU, due to the insistence that
(almost) all Python code can only run while the GIL is held.

Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive
patch set (the "free threading" patches) that removed the GIL and replaced it
with fine-grained locking. Adam Olsen recently did a similar experiment
in his `python-safethread <https://code.google.com/archive/p/python-safethread>`_
project. Unfortunately, both experiments exhibited a sharp drop in single-thread
performance (at least 30% slower), due to the amount of fine-grained locking
necessary to compensate for the removal of the GIL.

This doesn't mean that you can't make good use of Python on multi-CPU machines!
You just have to be creative with dividing the work up between multiple
*processes* rather than multiple *threads*. The
:class:`~concurrent.futures.ProcessPoolExecutor` class in the
:mod:`concurrent.futures` module provides an easy way of doing so; the
:mod:`multiprocessing` module provides a lower-level API in case you want
more control over dispatching of tasks.

Judicious use of C extensions will also help; if you use a C extension to
perform a time-consuming task, the extension can release the GIL while the
thread of execution is in the C code and allow other threads to get some work
done. Some standard library modules such as :mod:`zlib` and :mod:`hashlib`
already do this.

It has been suggested that the GIL should be a per-interpreter-state lock rather
than truly global; interpreters then wouldn't be able to share objects.
Unfortunately, this isn't likely to happen either. It would be a tremendous
amount of work, because many object implementations currently have global state.
For example, small integers and short strings are cached; these caches would
have to be moved to the interpreter state. Other object types have their own
free list; these free lists would have to be moved to the interpreter state.
And so on.

And I doubt that it can even be done in finite time, because the same problem
exists for 3rd party extensions. It is likely that 3rd party extensions are
being written at a faster rate than you can convert them to store all their
global state in the interpreter state.

And finally, once you have multiple interpreters not sharing any state, what
have you gained over running each interpreter in a separate process?


Input and Output
================

How do I delete a file? (And other file questions...)
-----------------------------------------------------

Use ``os.remove(filename)`` or ``os.unlink(filename)``; for documentation, see
the :mod:`os` module. The two functions are identical; :func:`~os.unlink` is simply
the name of the Unix system call for this function.

To remove a directory, use :func:`os.rmdir`; use :func:`os.mkdir` to create one.
``os.makedirs(path)`` will create any intermediate directories in ``path`` that
don't exist. ``os.removedirs(path)`` will remove intermediate directories as
long as they're empty; if you want to delete an entire directory tree and its
contents, use :func:`shutil.rmtree`.

To rename a file, use ``os.rename(old_path, new_path)``.

To truncate a file, open it using ``f = open(filename, "rb+")``, and use
``f.truncate(offset)``; offset defaults to the current seek position. There's
also ``os.ftruncate(fd, offset)`` for files opened with :func:`os.open`, where
*fd* is the file descriptor (a small integer).

The :mod:`shutil` module also contains a number of functions to work on files
including :func:`~shutil.copyfile`, :func:`~shutil.copytree`, and
:func:`~shutil.rmtree`.

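
The calls above can be strung together as in the following sketch. It works in
a scratch directory created with :func:`tempfile.mkdtemp`, and the file and
directory names are arbitrary::

   import os
   import shutil
   import tempfile

   base = tempfile.mkdtemp()              # scratch directory for the demo
   nested = os.path.join(base, "a", "b")
   os.makedirs(nested)                    # creates "a" and then "a/b"

   old_path = os.path.join(nested, "old.txt")
   new_path = os.path.join(nested, "new.txt")
   with open(old_path, "w") as f:
       f.write("hello world")
   os.rename(old_path, new_path)

   with open(new_path, "rb+") as f:
       f.truncate(5)                      # keep only the first 5 bytes
   size_after_truncate = os.path.getsize(new_path)

   os.remove(new_path)                    # delete the single file
   shutil.rmtree(base)                    # delete the whole tree
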

How do I copy a file?
---------------------

The :mod:`shutil` module contains a :func:`~shutil.copyfile` function. Note
that on MacOS 9 it doesn't copy the resource fork and Finder info.

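
A short sketch of copying a file, using throwaway file names in a temporary
directory. :func:`shutil.copy2` is shown as a related option that also
attempts to preserve file metadata::

   import os
   import shutil
   import tempfile

   with tempfile.TemporaryDirectory() as tmp:
       src = os.path.join(tmp, "source.bin")
       with open(src, "wb") as f:
           f.write(b"\x00\x01\x02")

       # copyfile() copies only the contents and returns the destination.
       dst = shutil.copyfile(src, os.path.join(tmp, "copy.bin"))
       # copy2() copies the contents and tries to keep the metadata too.
       dst2 = shutil.copy2(src, os.path.join(tmp, "copy2.bin"))

       with open(dst, "rb") as f:
           copied = f.read()

   print(copied)
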

How do I read (or write) binary data?
-------------------------------------

To read or write complex binary data formats, it's best to use the :mod:`struct`
module. It allows you to take a :class:`bytes` object containing binary data
(usually numbers) and convert it to Python objects; and vice versa.

For example, the following code reads two 2-byte integers and one 4-byte integer
in big-endian format from a file::

   import struct

   with open(filename, "rb") as f:
       s = f.read(8)
       x, y, z = struct.unpack(">hhl", s)

The '>' in the format string forces big-endian data; the letter 'h' reads one
"short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the
bytes object.

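
The same format string works in the other direction. This sketch packs three
arbitrary sample values and unpacks them again, without needing a file::

   import struct

   # Pack two 2-byte integers and one 4-byte integer, big-endian.
   data = struct.pack(">hhl", 1, 2, 3)
   print(len(data) == struct.calcsize(">hhl"))

   # Unpacking the same bytes gives the original values back.
   x, y, z = struct.unpack(">hhl", data)
   print(x, y, z)
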
For data that is more regular (e.g. a homogeneous list of ints or floats),
you can also use the :mod:`array` module.

.. note::

   To read and write binary data, it is mandatory to open the file in
   binary mode (here, passing ``"rb"`` to :func:`open`). If you use
   ``"r"`` instead (the default), the file will be opened in text mode
   and ``f.read()`` will return :class:`str` objects rather than
   :class:`bytes` objects.


I can't seem to use os.read() on a pipe created with os.popen(); why?
---------------------------------------------------------------------

:func:`os.read` is a low-level function which takes a file descriptor, a small
integer representing the opened file. :func:`os.popen` creates a high-level
file object, the same type returned by the built-in :func:`open` function.
Thus, to read *n* bytes from a pipe *p* created with :func:`os.popen`, you need to
use ``p.read(n)``.

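
For instance, in the sketch below ``echo hello`` is only a stand-in command.
Note that the object returned by :func:`os.popen` is opened in text mode, so
``p.read(n)`` counts characters::

   import os

   with os.popen("echo hello") as p:
       first = p.read(3)   # high-level read on the file object
       rest = p.read()     # read whatever is left

   print(first + rest)
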

.. XXX update to use subprocess. See the :ref:`subprocess-replacements` section.

   How do I run a subprocess with pipes connected to both input and output?
   ------------------------------------------------------------------------

   Use the :mod:`popen2` module. For example::

      import popen2
      fromchild, tochild = popen2.popen2("command")
      tochild.write("input\n")
      tochild.flush()
      output = fromchild.readline()

   Warning: in general it is unwise to do this because you can easily cause a
   deadlock where your process is blocked waiting for output from the child
   while the child is blocked waiting for input from you. This can be caused
   by the parent expecting the child to output more text than it does or
   by data being stuck in stdio buffers due to lack of flushing.
   The Python parent can of course explicitly flush the data it sends to the
   child before it reads any output, but if the child is a naive C program it
   may have been written to never explicitly flush its output, even if it is
   interactive, since flushing is normally automatic.

   Note that a deadlock is also possible if you use :func:`popen3` to read
   stdout and stderr. If one of the two is too large for the internal buffer
   (increasing the buffer size does not help) and you ``read()`` the other one
   first, there is a deadlock, too.

   Note on a bug in popen2: unless your program calls ``wait()`` or
   ``waitpid()``, finished child processes are never removed, and eventually
   calls to popen2 will fail because of a limit on the number of child
   processes. Calling :func:`os.waitpid` with the :data:`os.WNOHANG` option can
   prevent this; a good place to insert such a call would be before calling
   ``popen2`` again.

   In many cases, all you really need is to run some data through a command and
   get the result back. Unless the amount of data is very large, the easiest
   way to do this is to write it to a temporary file and run the command with
   that temporary file as input. The standard module :mod:`tempfile` exports a
   :func:`~tempfile.mktemp` function to generate unique temporary file names. ::

      import tempfile
      import os

      class Popen3:
          """
          This is a deadlock-safe version of popen that returns
          an object with errorlevel, out (a string) and err (a string).
          (capturestderr may not work under windows.)
          Example: print(Popen3('grep spam','\n\nhere spam\n\n').out)
          """
          def __init__(self,command,input=None,capturestderr=None):
              outfile=tempfile.mktemp()
              command="( %s ) > %s" % (command,outfile)
              if input:
                  infile=tempfile.mktemp()
                  open(infile,"w").write(input)
                  command=command+" <"+infile
              if capturestderr:
                  errfile=tempfile.mktemp()
                  command=command+" 2>"+errfile
              self.errorlevel=os.system(command) >> 8
              self.out=open(outfile,"r").read()
              os.remove(outfile)
              if input:
                  os.remove(infile)
              if capturestderr:
                  self.err=open(errfile,"r").read()
                  os.remove(errfile)

   Note that many interactive programs (e.g. vi) don't work well with pipes
   substituted for standard input and output. You will have to use pseudo ttys
   ("ptys") instead of pipes. Or you can use a Python interface to Don Libes'
   "expect" library. A Python extension that interfaces to expect is called
   "expy" and available from http://expectpy.sourceforge.net. A pure Python
   solution that works like expect is `pexpect
   <https://pypi.org/project/pexpect/>`_.


How do I access the serial (RS232) port?
----------------------------------------

For Win32, POSIX (Linux, BSD, etc.), Jython:

   http://pyserial.sourceforge.net

For Unix, see a Usenet post by Mitch Chapman:

   https://groups.google.com/groups?selm=34A04430.CF9@ohioee.com

628
629Why doesn't closing sys.stdout (stdin, stderr) really close it?
630---------------------------------------------------------------
631
Antoine Pitrou6a11a982010-09-15 10:08:31 +0000632Python :term:`file objects <file object>` are a high-level layer of
633abstraction on low-level C file descriptors.
Georg Brandld7413152009-10-11 21:25:26 +0000634
Antoine Pitrou6a11a982010-09-15 10:08:31 +0000635For most file objects you create in Python via the built-in :func:`open`
636function, ``f.close()`` marks the Python file object as being closed from
637Python's point of view, and also arranges to close the underlying C file
638descriptor. This also happens automatically in ``f``'s destructor, when
639``f`` becomes garbage.
Georg Brandld7413152009-10-11 21:25:26 +0000640
641But stdin, stdout and stderr are treated specially by Python, because of the
642special status also given to them by C. Running ``sys.stdout.close()`` marks
643the Python-level file object as being closed, but does *not* close the
Antoine Pitrou6a11a982010-09-15 10:08:31 +0000644associated C file descriptor.
Georg Brandld7413152009-10-11 21:25:26 +0000645
Antoine Pitrou6a11a982010-09-15 10:08:31 +0000646To close the underlying C file descriptor for one of these three, you should
647first be sure that's what you really want to do (e.g., you may confuse
648extension modules trying to do I/O). If it is, use :func:`os.close`::
Georg Brandld7413152009-10-11 21:25:26 +0000649
Antoine Pitrou6a11a982010-09-15 10:08:31 +0000650 os.close(stdin.fileno())
651 os.close(stdout.fileno())
652 os.close(stderr.fileno())
653
654Or you can use the numeric constants 0, 1 and 2, respectively.
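
The difference described above can be observed with a small experiment. This
is a sketch under stated assumptions rather than a definitive test: the
ordinary file object is the write end of a pipe, and ``sys.stdout`` is closed
in a child process (a hypothetical three-line script) so that the calling
process keeps its own standard output::

   import os
   import subprocess
   import sys

   # An ordinary file object: close() also closes the file descriptor.
   r, w = os.pipe()
   f = os.fdopen(w, "w")
   fd = f.fileno()
   f.close()
   try:
       os.fstat(fd)
       descriptor_closed = False
   except OSError:              # the descriptor is no longer valid
       descriptor_closed = True
   os.close(r)

   # sys.stdout: the Python object is closed, but descriptor 1 stays usable.
   child = (
       "import os, sys\n"
       "sys.stdout.close()\n"
       "os.write(1, b'still open')\n"
   )
   result = subprocess.run([sys.executable, "-c", child], capture_output=True)

   print(descriptor_closed)
   print(result.stdout)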


Network/Internet Programming
============================

What WWW tools are there for Python?
------------------------------------

See the chapters titled :ref:`internet` and :ref:`netdata` in the Library
Reference Manual. Python has many modules that will help you build server-side
and client-side web systems.

.. XXX check if wiki page is still up to date

A summary of available frameworks is maintained by Paul Boddie at
https://wiki.python.org/moin/WebProgramming\ .

Cameron Laird maintains a useful set of pages about Python web technologies at
http://phaseit.net/claird/comp.lang.python/web_python.


How can I mimic CGI form submission (METHOD=POST)?
--------------------------------------------------

I would like to retrieve web pages that are the result of POSTing a form. Is
there existing code that would let me do this easily?

Yes. Here's a simple example that uses :mod:`urllib.request`::

   #!/usr/local/bin/python

   import urllib.request

   # build the query string
   qs = "First=Josephine&MI=Q&Last=Public"

   # connect and send the server a path; the POST data must be bytes
   req = urllib.request.urlopen('http://www.some-server.out-there'
                                '/cgi-bin/some-cgi-script',
                                data=qs.encode('ascii'))
   with req:
       msg, hdrs = req.read(), req.info()

Note that in general for percent-encoded POST operations, query strings must be
quoted using :func:`urllib.parse.urlencode`. For example, to send
``name=Guy Steele, Jr.``::

   >>> import urllib.parse
   >>> urllib.parse.urlencode({'name': 'Guy Steele, Jr.'})
   'name=Guy+Steele%2C+Jr.'

.. seealso:: :ref:`urllib-howto` for extensive examples.
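
The two steps above (quoting the fields, then sending them as the request body)
can be combined. The following is a minimal sketch that only builds the request
and does not contact any server; ``build_post_request`` and the example URL are
hypothetical names used for illustration::

   import urllib.parse
   import urllib.request

   def build_post_request(url, fields):
       """Return a Request whose body is the urlencoded form fields."""
       body = urllib.parse.urlencode(fields).encode("ascii")  # must be bytes
       return urllib.request.Request(url, data=body)

   req = build_post_request("http://www.example.com/cgi-bin/some-cgi-script",
                            {"name": "Guy Steele, Jr."})
   print(req.get_method())   # a request that carries data defaults to POST
   print(req.data)

Passing ``req`` to :func:`urllib.request.urlopen` would then perform the actual
POST.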


What module should I use to help with generating HTML?
------------------------------------------------------

.. XXX add modern template languages

You can find a collection of useful links on the `Web Programming wiki page
<https://wiki.python.org/moin/WebProgramming>`_.


How do I send mail from a Python script?
----------------------------------------

Use the standard library module :mod:`smtplib`.

Here's a very simple interactive mail sender that uses it. This method will
work on any host that supports an SMTP listener. ::

   import sys, smtplib

   fromaddr = input("From: ")
   toaddrs = input("To: ").split(',')
   print("Enter message, end with ^D:")
   msg = ''
   while True:
       line = sys.stdin.readline()
       if not line:
           break
       msg += line

   # The actual mail send
   server = smtplib.SMTP('localhost')
   server.sendmail(fromaddr, toaddrs, msg)
   server.quit()

A Unix-only alternative uses sendmail. The location of the sendmail program
varies between systems; sometimes it is ``/usr/lib/sendmail``, sometimes
``/usr/sbin/sendmail``. The sendmail manual page will help you out. Here's
some sample code::

   import os

   SENDMAIL = "/usr/sbin/sendmail"  # sendmail location
   p = os.popen("%s -t -i" % SENDMAIL, "w")
   p.write("To: receiver@example.com\n")
   p.write("Subject: test\n")
   p.write("\n")  # blank line separating headers from body
   p.write("Some text\n")
   p.write("some more text\n")
   sts = p.close()
   if sts is not None:  # close() returns None when the command succeeded
       print("Sendmail exit status", sts)
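
Both examples above leave it to the caller to write the message headers by
hand. As a hedged sketch, the headers can instead be built with
:class:`email.message.EmailMessage` and handed to
:meth:`smtplib.SMTP.send_message`. The helper names ``build_message`` and
``send`` and the addresses are assumptions made for illustration, and ``send``
is deliberately not called here because it needs a reachable SMTP server::

   import smtplib
   from email.message import EmailMessage

   def build_message(fromaddr, toaddrs, subject, body):
       """Assemble a message with proper From, To and Subject headers."""
       msg = EmailMessage()
       msg["From"] = fromaddr
       msg["To"] = ", ".join(toaddrs)
       msg["Subject"] = subject
       msg.set_content(body)
       return msg

   def send(msg, host="localhost"):
       """Send msg through the SMTP listener on host (assumed to exist)."""
       with smtplib.SMTP(host) as server:
           server.send_message(msg)

   msg = build_message("sender@example.com", ["receiver@example.com"],
                       "test", "Some text\nsome more text\n")
   print(msg["To"], msg["Subject"])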


How do I avoid blocking in the connect() method of a socket?
------------------------------------------------------------

The :mod:`select` module is commonly used to help with asynchronous I/O on
sockets.

To prevent the TCP connect from blocking, you can set the socket to non-blocking
mode. Then when you call :meth:`socket.connect`, you will either connect
immediately (unlikely) or get an exception that contains the error number as
``.errno``. ``errno.EINPROGRESS`` indicates that the connection is in progress,
but hasn't finished yet. Different OSes will return different values, so you're
going to have to check what's returned on your system.

You can use the :meth:`socket.connect_ex` method to avoid creating an exception.
It will just return the errno value. To poll, you can call
:meth:`socket.connect_ex` again later (``0`` or ``errno.EISCONN`` indicates that
you're connected), or you can pass this socket to :func:`select.select` to check
if it's writable.

.. note::
   The :mod:`asyncio` module provides a general purpose single-threaded and
   concurrent asynchronous library, which can be used for writing non-blocking
   network code.
   The third-party `Twisted <https://twistedmatrix.com/trac/>`_ library is
   a popular and feature-rich alternative.
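
The :meth:`socket.connect_ex` and :func:`select.select` approach described
above can be sketched as follows. This is an illustration under assumptions,
not a complete implementation: ``connect_nonblocking`` is a hypothetical helper,
the set of "in progress" error codes is a guess that may need adjusting for
your OS, and the listener on the loopback interface exists only so that the
example has something to connect to::

   import errno
   import os
   import select
   import socket

   def connect_nonblocking(address, timeout=5.0):
       """Start a non-blocking connect and wait for it with select()."""
       sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
       sock.setblocking(False)
       err = sock.connect_ex(address)       # returns an errno, never raises it
       # Assumption: these codes mean "still connecting" on common platforms.
       if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
           sock.close()
           raise OSError(err, os.strerror(err))
       # The socket becomes writable once the connection attempt has finished.
       _, writable, _ = select.select([], [sock], [], timeout)
       if not writable:
           sock.close()
           raise TimeoutError("connect did not finish in time")
       # SO_ERROR reports whether the finished attempt succeeded (0) or failed.
       err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
       if err:
           sock.close()
           raise OSError(err, os.strerror(err))
       return sock

   # Demonstration against a throwaway listener on the loopback interface.
   listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
   listener.bind(("127.0.0.1", 0))          # port 0: let the OS pick a port
   listener.listen(1)
   sock = connect_nonblocking(listener.getsockname())
   connected = sock.getpeername() == listener.getsockname()
   print(connected)
   sock.close()
   listener.close()

In a real program the ``select`` call would normally be part of an event loop
that watches many sockets at once, rather than a wait on a single socket.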


Databases
=========

Are there any interfaces to database packages in Python?
--------------------------------------------------------

Yes.

Interfaces to disk-based hashes such as :mod:`DBM <dbm.ndbm>` and :mod:`GDBM
<dbm.gnu>` are included with standard Python. There is also the
:mod:`sqlite3` module, which provides a lightweight disk-based relational
database.

Support for most relational databases is available. See the
`DatabaseProgramming wiki page
<https://wiki.python.org/moin/DatabaseProgramming>`_ for details.
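
As a brief illustration of the :mod:`sqlite3` module mentioned above, the
following sketch creates a table, inserts two rows and reads them back. The
table name and its contents are made up for the example, and an in-memory
database is used so that no file is written::

   import sqlite3

   conn = sqlite3.connect(":memory:")   # use a filename for a disk-based database
   conn.execute("CREATE TABLE langs (name TEXT, year INTEGER)")
   conn.executemany("INSERT INTO langs VALUES (?, ?)",
                    [("Python", 1991), ("C", 1972)])
   conn.commit()

   rows = conn.execute("SELECT name FROM langs ORDER BY year").fetchall()
   print(rows)
   conn.close()

The ``?`` placeholders let the module quote the values, which is safer than
building SQL strings by hand.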


How do you implement persistent objects in Python?
--------------------------------------------------

The :mod:`pickle` library module solves this in a very general way (though you
still can't store things like open files, sockets or windows), and the
:mod:`shelve` library module uses pickle and (g)dbm to create persistent
mappings containing arbitrary Python objects.
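
A minimal sketch of both modules follows. The ``record`` dictionary and the
shelf name ``demo_shelf`` are made up for illustration, and the shelf is created
in a temporary directory so that the example cleans up after itself::

   import os
   import pickle
   import shelve
   import tempfile

   record = {"name": "spam", "sizes": [1, 2, 3]}

   # pickle: turn an object into bytes and back again.
   restored = pickle.loads(pickle.dumps(record))

   # shelve: a persistent, dictionary-like mapping stored on disk.
   with tempfile.TemporaryDirectory() as tmpdir:
       path = os.path.join(tmpdir, "demo_shelf")
       with shelve.open(path) as shelf:
           shelf["record"] = record
       with shelve.open(path) as shelf:   # reopen to show that the data persisted
           from_shelf = shelf["record"]

   print(restored == record, from_shelf == record)

Because unpickling can execute arbitrary code, only load pickles and shelves
that come from a source you trust.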


Mathematics and Numerics
========================

How do I generate random numbers in Python?
-------------------------------------------

The standard module :mod:`random` implements a random number generator. Usage
is simple::

   import random
   random.random()

This returns a random floating point number in the range [0, 1).

There are also many other specialized generators in this module, such as:

* ``randrange(a, b)`` chooses an integer in the range [a, b).
* ``uniform(a, b)`` chooses a floating point number in the range [a, b]
  (whether ``b`` itself can occur depends on floating-point rounding).
* ``normalvariate(mean, sdev)`` samples the normal (Gaussian) distribution.

Some higher-level functions operate on sequences directly, such as:

* ``choice(S)`` chooses a random element from a given sequence.
* ``shuffle(L)`` shuffles a list in-place, i.e. permutes it randomly.

There's also a ``Random`` class you can instantiate to create multiple
independent random number generators.
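
The functions listed above are also available as methods of a ``Random``
instance. The sketch below uses two instances with the same seed to show that
each generator keeps its own state; the seed value ``42`` and the variable names
are arbitrary choices made for the example::

   import random

   rng = random.Random(42)      # independent generator with its own state
   other = random.Random(42)    # same seed, so it produces the same sequence

   x = rng.random()                   # float in [0, 1)
   n = rng.randrange(1, 7)            # integer in [1, 7), like a die roll
   u = rng.uniform(2.0, 5.0)          # float between 2.0 and 5.0
   g = rng.normalvariate(0.0, 1.0)    # sample with mean 0 and std. deviation 1

   deck = list(range(10))
   rng.shuffle(deck)                  # permutes the list in place
   card = rng.choice(deck)            # one element of the sequence

   same = other.random() == x         # first draws agree because seeds match
   print(same)

Seeding is useful for reproducible tests and simulations. For security-sensitive
values such as tokens or passwords, use the :mod:`secrets` module instead.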