:tocdepth: 2

=========================
Library and Extension FAQ
=========================

.. only:: html

   .. contents::

General Library Questions
=========================

How do I find a module or application to perform task X?
--------------------------------------------------------

Check :ref:`the Library Reference <library-index>` to see if there's a relevant
standard library module. (Eventually you'll learn what's in the standard
library and will be able to skip this step.)

For third-party packages, search the `Python Package Index
<http://pypi.python.org/pypi>`_ or try `Google <http://www.google.com>`_ or
another Web search engine. Searching for "Python" plus a keyword or two for
your topic of interest will usually find something helpful.


Where is the math.py (socket.py, regex.py, etc.) source file?
-------------------------------------------------------------

If you can't find a source file for a module, it may be a built-in or
dynamically loaded module implemented in C, C++ or another compiled language.
In this case you may not have the source file, or it may be something like
:file:`mathmodule.c`, somewhere in a C source directory (not on the Python path).

There are (at least) three kinds of modules in Python:

1) modules written in Python (.py);
2) modules written in C and dynamically loaded (.dll, .pyd, .so, .sl, etc.);
3) modules written in C and linked with the interpreter; to get a list of these,
   type::

      import sys
      print(sys.builtin_module_names)
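
The three kinds of modules listed above can be told apart at run time. The following is a rough sketch, not an official API: ``module_kind`` is a hypothetical helper, and the file-suffix checks are a simplifying assumption (frozen or namespace modules fall into the "other" bucket).

```python
import importlib.util
import sys

def module_kind(name):
    """Classify a top-level module by how it is implemented (rough heuristic)."""
    if name in sys.builtin_module_names:
        return "built-in"            # linked into the interpreter
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    origin = spec.origin or ""
    if origin.endswith((".py", ".pyw")):
        return "python source"
    if origin.endswith((".so", ".pyd", ".dll", ".sl")):
        return "extension module"    # dynamically loaded C module
    return "other"                   # e.g. frozen or namespace modules

for name in ("sys", "json"):
    print(name, "->", module_kind(name))
```

For a module written in Python, ``spec.origin`` is the path of the source file, which answers the "where is the source" question directly.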


How do I make a Python script executable on Unix?
-------------------------------------------------

You need to do two things: the script file's mode must be executable and the
first line must begin with ``#!`` followed by the path of the Python
interpreter.

The first is done by executing ``chmod +x scriptfile`` or perhaps ``chmod 755
scriptfile``.

The second can be done in a number of ways. The most straightforward way is to
write ::

   #!/usr/local/bin/python

as the very first line of your file, using the pathname for where the Python
interpreter is installed on your platform.

If you would like the script to be independent of where the Python interpreter
lives, you can use the :program:`env` program. Almost all Unix variants support
the following, assuming the Python interpreter is in a directory on the user's
:envvar:`PATH`::

   #!/usr/bin/env python

Don't do this for CGI scripts. The :envvar:`PATH` variable for CGI scripts is
often very minimal, so you need to use the actual absolute pathname of the
interpreter.

Occasionally, a user's environment is so full that the :program:`/usr/bin/env`
program fails; or there's no env program at all. In that case, you can try the
following hack (due to Alex Rezinsky)::

   #! /bin/sh
   """:"
   exec python $0 ${1+"$@"}
   """

The minor disadvantage is that this defines the script's ``__doc__`` string.
However, you can fix that by adding ::

   __doc__ = """...Whatever..."""


Is there a curses/termcap package for Python?
---------------------------------------------

.. XXX curses is built by default, isn't it?

For Unix variants: The standard Python source distribution comes with a curses
module in the :source:`Modules` subdirectory, though it's not compiled by default.
(Note that this is not available in the Windows distribution -- there is no
curses module for Windows.)

The :mod:`curses` module supports basic curses features as well as many additional
functions from ncurses and SYSV curses such as colour, alternative character set
support, pads, and mouse support. This means the module isn't compatible with
operating systems that only have BSD curses, but there don't seem to be any
currently maintained OSes that fall into this category.

For Windows: use `the consolelib module
<http://effbot.org/zone/console-index.htm>`_.


Is there an equivalent to C's onexit() in Python?
-------------------------------------------------

The :mod:`atexit` module provides a register function that is similar to C's
:c:func:`onexit`.
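
As a minimal sketch of the register function in use (the ``goodbye`` function and the ``messages`` list are made up for illustration), a callback and its arguments can be registered like this:

```python
import atexit

messages = []

def goodbye(name):
    # Runs when the interpreter exits normally; records and prints a farewell.
    messages.append(f"Goodbye, {name}")
    print(messages[-1])

# Positional arguments after the function are passed to it at exit time.
atexit.register(goodbye, "world")
```

Registered functions are not called if the process is killed by an unhandled signal or if :func:`os._exit` is used.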


Why don't my signal handlers work?
----------------------------------

The most common problem is that the signal handler is declared with the wrong
argument list. It is called as ::

   handler(signum, frame)

so it should be declared with two arguments::

   def handler(signum, frame):
       ...
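
A small self-contained sketch of a correctly declared handler follows. It assumes Python 3.8 or later (for :func:`signal.raise_signal`) and that the code runs in the main thread; the ``received`` list is only there for illustration.

```python
import signal

received = []

def handler(signum, frame):
    # signum is the signal number; frame is the interrupted stack frame (or None).
    received.append(signum)

# Install the handler for SIGINT (this must be done in the main thread).
previous = signal.signal(signal.SIGINT, handler)

# Deliver SIGINT to this process so that the handler runs.
signal.raise_signal(signal.SIGINT)

# Restore whatever handler was installed before.
signal.signal(signal.SIGINT, previous)
print("signals received:", received)
```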


Common tasks
============

How do I test a Python program or component?
--------------------------------------------

Python comes with two testing frameworks. The :mod:`doctest` module finds
examples in the docstrings for a module and runs them, comparing the output with
the expected output given in the docstring.

The :mod:`unittest` module is a fancier testing framework modelled on Java and
Smalltalk testing frameworks.
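
To make the two frameworks concrete, here is a hedged sketch that tests one hypothetical function, ``average``, both ways. The helper names ``run_doctests`` and ``run_unittests`` are invented for this example; in everyday use you would more likely call ``doctest.testmod()`` or ``unittest.main()``.

```python
import doctest
import unittest

def average(values):
    """Return the arithmetic mean of a non-empty sequence.

    >>> average([1, 2, 3])
    2.0
    """
    return sum(values) / len(values)

class AverageTests(unittest.TestCase):
    def test_single_value(self):
        self.assertEqual(average([5]), 5.0)

    def test_empty_sequence(self):
        with self.assertRaises(ZeroDivisionError):
            average([])

def run_doctests():
    """Run the examples in average()'s docstring; return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    for test in finder.find(average, "average", globs={"average": average}):
        runner.run(test)
    return runner.summarize(verbose=False)

def run_unittests():
    """Run the TestCase above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```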

To make testing easier, you should use good modular design in your program.
Your program should have almost all functionality encapsulated in either
functions or class methods -- and this sometimes has the surprising and
delightful effect of making the program run faster (because local variable
accesses are faster than global accesses). Furthermore, the program should
avoid depending on mutating global variables, since this makes testing much
more difficult to do.

The "global main logic" of your program may be as simple as ::

   if __name__ == "__main__":
       main_logic()

at the bottom of the main module of your program.

Once your program is organized as a tractable collection of functions and class
behaviours, you should write test functions that exercise the behaviours. A test
suite that automates a sequence of tests can be associated with each module.
This sounds like a lot of work, but since Python is so terse and flexible it's
surprisingly easy. You can make coding much more pleasant and fun by writing
your test functions in parallel with the "production code", since this makes it
easy to find bugs and even design flaws earlier.

"Support modules" that are not intended to be the main module of a program may
include a self-test of the module. ::

   if __name__ == "__main__":
       self_test()

Even programs that interact with complex external interfaces may be tested when
the external interfaces are unavailable by using "fake" interfaces implemented
in Python.
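
One possible shape for such a "fake" interface is sketched below. ``FakeMailer`` and ``notify_all`` are hypothetical names; the point is only that the code under test talks to any object with a ``send()`` method, so a recording stand-in can replace the real service.

```python
class FakeMailer:
    """Stand-in for a real mail gateway; records messages instead of sending."""

    def __init__(self):
        self.sent = []

    def send(self, to, body):
        self.sent.append((to, body))

def notify_all(mailer, users, message):
    """Send `message` to each user through any object with a send() method."""
    for user in users:
        mailer.send(user, message)
    return len(users)

fake = FakeMailer()
count = notify_all(fake, ["ann", "bob"], "build finished")
print(count, fake.sent)
```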


How do I create documentation from doc strings?
-----------------------------------------------

The :mod:`pydoc` module can create HTML from the doc strings in your Python
source code. An alternative for creating API documentation purely from
docstrings is `epydoc <http://epydoc.sf.net/>`_. `Sphinx
<http://sphinx.pocoo.org>`_ can also include docstring content.


How do I get a single keypress at a time?
-----------------------------------------

For Unix variants there are several solutions. It's straightforward to do this
using curses, but curses is a fairly large module to learn.

.. XXX this doesn't work out of the box, some IO expert needs to check why

Here's a solution without curses::

   import termios, fcntl, sys, os
   fd = sys.stdin.fileno()

   oldterm = termios.tcgetattr(fd)
   newattr = termios.tcgetattr(fd)
   newattr[3] = newattr[3] & ~termios.ICANON & ~termios.ECHO
   termios.tcsetattr(fd, termios.TCSANOW, newattr)

   oldflags = fcntl.fcntl(fd, fcntl.F_GETFL)
   fcntl.fcntl(fd, fcntl.F_SETFL, oldflags | os.O_NONBLOCK)

   try:
       while True:
           try:
               c = sys.stdin.read(1)
               print("Got character", repr(c))
           except IOError:
               pass
   finally:
       termios.tcsetattr(fd, termios.TCSAFLUSH, oldterm)
       fcntl.fcntl(fd, fcntl.F_SETFL, oldflags)

You need the :mod:`termios` and the :mod:`fcntl` module for any of this to
work, and I've only tried it on Linux, though it should work elsewhere. In
this code, characters are read and printed one at a time.

:func:`termios.tcsetattr` turns off stdin's echoing and disables canonical
mode. :func:`fcntl.fcntl` is used to obtain stdin's file descriptor flags
and modify them for non-blocking mode. Since reading stdin when it is empty
results in an :exc:`IOError`, this error is caught and ignored.


Threads
=======

How do I program using threads?
-------------------------------

Be sure to use the :mod:`threading` module and not the :mod:`_thread` module.
The :mod:`threading` module builds convenient abstractions on top of the
low-level primitives provided by the :mod:`_thread` module.

Aahz has a set of slides from his threading tutorial that are helpful; see
http://www.pythoncraft.com/OSCON2001/.


None of my threads seem to run: why?
------------------------------------

As soon as the main thread exits, all threads are killed. Your main thread is
running too quickly, giving the threads no time to do any work.

A simple fix is to add a sleep to the end of the program that's long enough for
all the threads to finish::

   import threading, time

   def thread_task(name, n):
       for i in range(n):
           print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)  # <---------------------------!

But now (on many platforms) the threads don't run in parallel, but appear to run
sequentially, one at a time! The reason is that the OS thread scheduler doesn't
start a new thread until the previous thread is blocked.

A simple fix is to add a tiny sleep to the start of the run function::

   def thread_task(name, n):
       time.sleep(0.001)  # <--------------------!
       for i in range(n):
           print(name, i)

   for i in range(10):
       T = threading.Thread(target=thread_task, args=(str(i), i))
       T.start()

   time.sleep(10)

Instead of trying to guess a good delay value for :func:`time.sleep`,
it's better to use some kind of semaphore mechanism. One idea is to use the
:mod:`queue` module to create a queue object, let each thread append a token to
the queue when it finishes, and let the main thread read as many tokens from the
queue as there are threads.
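
The queue-of-tokens idea might look like the following sketch. The extra ``done`` parameter and the choice of the thread's name as the token are assumptions made for this example.

```python
import queue
import threading

def thread_task(name, n, done):
    for i in range(n):
        print(name, i)
    done.put(name)          # token: "this thread has finished"

done = queue.Queue()
threads = [
    threading.Thread(target=thread_task, args=(str(i), i, done))
    for i in range(10)
]
for t in threads:
    t.start()

# Block until one token per thread has arrived; no sleep() guesswork needed.
finished = [done.get() for _ in threads]
print("all threads finished:", sorted(finished))
```

Calling ``join()`` on each thread is another way for the main thread to wait, if all it needs to know is that the threads have ended.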


How do I parcel out work among a bunch of worker threads?
---------------------------------------------------------

The easiest way is to use the :mod:`concurrent.futures` module,
especially the :class:`~concurrent.futures.ThreadPoolExecutor` class.
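
A minimal sketch of the executor approach, assuming a hypothetical ``job`` function that stands in for real work:

```python
from concurrent.futures import ThreadPoolExecutor

def job(n):
    """Hypothetical unit of work: here, just square the argument."""
    return n * n

# Five worker threads share the ten jobs; map() yields results in input order.
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(job, range(10)))

print(results)
```

Leaving the ``with`` block waits for all submitted jobs to finish, so no explicit sleep or join is needed.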

Or, if you want fine control over the dispatching algorithm, you can write
your own logic manually. Use the :mod:`queue` module to create a queue
containing a list of jobs. The :class:`~queue.Queue` class maintains a
list of objects and has a ``.put(obj)`` method that adds items to the queue and
a ``.get()`` method to return them. The class will take care of the locking
necessary to ensure that each job is handed out exactly once.

Here's a trivial example::

   import threading, queue, time

   # The worker thread gets jobs off the queue. When the queue is empty, it
   # assumes there will be no more work and exits.
   # (Realistically workers will run until terminated.)
   def worker():
       print('Running worker')
       time.sleep(0.1)
       while True:
           try:
               arg = q.get(block=False)
           except queue.Empty:
               print('Worker', threading.current_thread(), end=' ')
               print('queue empty')
               break
           else:
               print('Worker', threading.current_thread(), end=' ')
               print('running with argument', arg)
               time.sleep(0.5)

   # Create queue
   q = queue.Queue()

   # Start a pool of 5 workers
   for i in range(5):
       t = threading.Thread(target=worker, name='worker %i' % (i+1))
       t.start()

   # Begin adding work to the queue
   for i in range(50):
       q.put(i)

   # Give threads time to run
   print('Main thread sleeping')
   time.sleep(5)

When run, this will produce the following output:

.. code-block:: none

   Running worker
   Running worker
   Running worker
   Running worker
   Running worker
   Main thread sleeping
   Worker <Thread(worker 1, started 130283832797456)> running with argument 0
   Worker <Thread(worker 2, started 130283824404752)> running with argument 1
   Worker <Thread(worker 3, started 130283816012048)> running with argument 2
   Worker <Thread(worker 4, started 130283807619344)> running with argument 3
   Worker <Thread(worker 5, started 130283799226640)> running with argument 4
   Worker <Thread(worker 1, started 130283832797456)> running with argument 5
   ...

Consult the module's documentation for more details; the :class:`~queue.Queue`
class provides a featureful interface.


What kinds of global value mutation are thread-safe?
----------------------------------------------------

A :term:`global interpreter lock` (GIL) is used internally to ensure that only one
thread runs in the Python VM at a time. In general, Python offers to switch
among threads only between bytecode instructions; how frequently it switches can
be set via :func:`sys.setswitchinterval`. Each bytecode instruction, and
therefore all the C implementation code reached from each instruction, is
atomic from the point of view of a Python program.

In theory, this means an exact accounting requires an exact understanding of the
PVM bytecode implementation. In practice, it means that operations on shared
variables of built-in data types (ints, lists, dicts, etc.) that "look atomic"
really are.

For example, the following operations are all atomic (L, L1, L2 are lists, D,
D1, D2 are dicts, x, y are objects, i, j are ints)::

   L.append(x)
   L1.extend(L2)
   x = L[i]
   x = L.pop()
   L1[i:j] = L2
   L.sort()
   x = y
   x.field = y
   D[x] = y
   D1.update(D2)
   D.keys()

These aren't::

   i = i+1
   L.append(L[-1])
   L[i] = L[j]
   D[x] = D[x] + 1

Operations that replace other objects may invoke those other objects'
:meth:`__del__` method when their reference count reaches zero, and that can
affect things. This is especially true for the mass updates to dictionaries and
lists. When in doubt, use a mutex!
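
As an illustration of the "use a mutex" advice, the sketch below guards a non-atomic increment with a :class:`threading.Lock`. The names ``counter``, ``counter_lock`` and ``add_many`` are made up for this example.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(times):
    global counter
    for _ in range(times):
        # "counter = counter + 1" is a read-modify-write, so guard it.
        with counter_lock:
            counter = counter + 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

With the lock held around each update, every increment is applied and the final value equals the total number of iterations across all threads.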


Can't we get rid of the Global Interpreter Lock?
------------------------------------------------

.. XXX link to dbeazley's talk about GIL?

The :term:`global interpreter lock` (GIL) is often seen as a hindrance to Python's
deployment on high-end multiprocessor server machines, because a multi-threaded
Python program effectively only uses one CPU, due to the insistence that
(almost) all Python code can only run while the GIL is held.

Back in the days of Python 1.5, Greg Stein actually implemented a comprehensive
patch set (the "free threading" patches) that removed the GIL and replaced it
with fine-grained locking. Adam Olsen recently did a similar experiment
in his `python-safethread <http://code.google.com/p/python-safethread/>`_
project. Unfortunately, both experiments exhibited a sharp drop in single-thread
performance (at least 30% slower), due to the amount of fine-grained locking
necessary to compensate for the removal of the GIL.

This doesn't mean that you can't make good use of Python on multi-CPU machines!
You just have to be creative with dividing the work up between multiple
processes rather than multiple threads. The
:class:`~concurrent.futures.ProcessPoolExecutor` class in the
:mod:`concurrent.futures` module provides an easy way of doing so; the
:mod:`multiprocessing` module provides a lower-level API in case you want
more control over dispatching of tasks.

Judicious use of C extensions will also help; if you use a C extension to
perform a time-consuming task, the extension can release the GIL while the
thread of execution is in the C code and allow other threads to get some work
done. Some standard library modules such as :mod:`zlib` and :mod:`hashlib`
already do this.

It has been suggested that the GIL should be a per-interpreter-state lock rather
than truly global; interpreters then wouldn't be able to share objects.
Unfortunately, this isn't likely to happen either. It would be a tremendous
amount of work, because many object implementations currently have global state.
For example, small integers and short strings are cached; these caches would
have to be moved to the interpreter state. Other object types have their own
free list; these free lists would have to be moved to the interpreter state.
And so on.

And I doubt that it can even be done in finite time, because the same problem
exists for 3rd party extensions. It is likely that 3rd party extensions are
being written at a faster rate than you can convert them to store all their
global state in the interpreter state.

And finally, once you have multiple interpreters not sharing any state, what
have you gained over running each interpreter in a separate process?


Input and Output
================

How do I delete a file? (And other file questions...)
-----------------------------------------------------

Use ``os.remove(filename)`` or ``os.unlink(filename)``; for documentation, see
the :mod:`os` module. The two functions are identical; :func:`~os.unlink` is simply
the name of the Unix system call for this function.

To remove a directory, use :func:`os.rmdir`; use :func:`os.mkdir` to create one.
``os.makedirs(path)`` will create any intermediate directories in ``path`` that
don't exist. ``os.removedirs(path)`` will remove intermediate directories as
long as they're empty; if you want to delete an entire directory tree and its
contents, use :func:`shutil.rmtree`.

To rename a file, use ``os.rename(old_path, new_path)``.

To truncate a file, open it using ``f = open(filename, "rb+")``, and use
``f.truncate(offset)``; offset defaults to the current seek position. There's
also ``os.ftruncate(fd, offset)`` for files opened with :func:`os.open`, where
fd is the file descriptor (a small integer).

The :mod:`shutil` module also contains a number of functions to work on files
including :func:`~shutil.copyfile`, :func:`~shutil.copytree`, and
:func:`~shutil.rmtree`.
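
The calls mentioned in this answer can be tried out safely inside a temporary directory. The sketch below is only an illustration; the file and directory names are arbitrary.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as root:
    # makedirs() creates the intermediate "a" and "a/b" directories as needed.
    nested = os.path.join(root, "a", "b")
    os.makedirs(nested)

    old_path = os.path.join(nested, "old.txt")
    new_path = os.path.join(nested, "new.txt")
    with open(old_path, "wb") as f:
        f.write(b"0123456789")

    os.rename(old_path, new_path)

    # Truncate the file to its first four bytes.
    with open(new_path, "rb+") as f:
        f.truncate(4)
    size_after_truncate = os.path.getsize(new_path)

    copy_path = os.path.join(root, "copy.txt")
    shutil.copyfile(new_path, copy_path)

    os.remove(copy_path)
    shutil.rmtree(os.path.join(root, "a"))   # removes a/, a/b/ and new.txt
    remaining = os.listdir(root)

print(size_after_truncate, remaining)
```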


How do I copy a file?
---------------------

The :mod:`shutil` module contains a :func:`~shutil.copyfile` function. Note
that on MacOS 9 it doesn't copy the resource fork and Finder info.


How do I read (or write) binary data?
-------------------------------------

To read or write complex binary data formats, it's best to use the :mod:`struct`
module. It allows you to take a :class:`bytes` object containing binary data
(usually numbers) and convert it to Python objects; and vice versa.

For example, the following code reads two 2-byte integers and one 4-byte integer
in big-endian format from a file::

   import struct

   with open(filename, "rb") as f:
       s = f.read(8)
       x, y, z = struct.unpack(">hhl", s)

The '>' in the format string forces big-endian data; the letter 'h' reads one
"short integer" (2 bytes), and 'l' reads one "long integer" (4 bytes) from the
data.
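
The same format string works in the other direction. As a small sketch with arbitrary sample values, packing and then unpacking gives back the original numbers:

```python
import struct

# Pack two 2-byte integers and one 4-byte integer, big-endian, no padding.
data = struct.pack(">hhl", 1, -2, 70000)
print(len(data), data.hex())

# Unpacking with the same format string recovers the three values.
x, y, z = struct.unpack(">hhl", data)
print(x, y, z)
```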

For data that is more regular (e.g. a homogeneous list of ints or floats),
you can also use the :mod:`array` module.

.. note::

   To read and write binary data, it is mandatory to open the file in
   binary mode (here, passing ``"rb"`` to :func:`open`). If you use
   ``"r"`` instead (the default), the file will be opened in text mode
   and ``f.read()`` will return :class:`str` objects rather than
   :class:`bytes` objects.


I can't seem to use os.read() on a pipe created with os.popen(); why?
---------------------------------------------------------------------

:func:`os.read` is a low-level function which takes a file descriptor, a small
integer representing the opened file. :func:`os.popen` creates a high-level
file object, the same type returned by the built-in :func:`open` function.
Thus, to read n bytes from a pipe p created with :func:`os.popen`, you need to
use ``p.read(n)``.
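
For instance, assuming a shell ``echo`` command is available on the system, reading from the pipe object looks like this. Note that in Python 3 the pipe is opened in text mode, so ``read(n)`` counts characters.

```python
import os

# os.popen() returns a file-like object wrapping the pipe, not a descriptor.
with os.popen("echo hello") as p:
    first = p.read(5)      # use the object's own read(), not os.read()

print(repr(first))
```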
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	527
				528
Georg Brandl	9e4ff75	2009-12-19 17:57:51 +0000	[diff] [blame]	529	.. XXX update to use subprocess. See the :ref:`subprocess-replacements` section.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	530
Georg Brandl	9e4ff75	2009-12-19 17:57:51 +0000	[diff] [blame]	531	How do I run a subprocess with pipes connected to both input and output?
				532	------------------------------------------------------------------------
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	533
Georg Brandl	9e4ff75	2009-12-19 17:57:51 +0000	[diff] [blame]	534	Use the :mod:`popen2` module. For example::
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	535
Georg Brandl	9e4ff75	2009-12-19 17:57:51 +0000	[diff] [blame]	536	import popen2
				537	fromchild, tochild = popen2.popen2("command")
				538	tochild.write("input\n")
				539	tochild.flush()
				540	output = fromchild.readline()
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	541
Georg Brandl	9e4ff75	2009-12-19 17:57:51 +0000	[diff] [blame]	542	Warning: in general it is unwise to do this because you can easily cause a
				543	deadlock where your process is blocked waiting for output from the child
				544	while the child is blocked waiting for input from you. This can be caused
Ezio Melotti	b35480e	2012-05-13 20:14:04 +0300	[diff] [blame]	545	by the parent expecting the child to output more text than it does or
				546	by data being stuck in stdio buffers due to lack of flushing.
Georg Brandl	9e4ff75	2009-12-19 17:57:51 +0000	[diff] [blame]	547	The Python parent can of course explicitly flush the data it sends to the
				548	child before it reads any output, but if the child is a naive C program it
				549	may have been written to never explicitly flush its output, even if it is
				550	interactive, since flushing is normally automatic.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	551
Georg Brandl	9e4ff75	2009-12-19 17:57:51 +0000	[diff] [blame]	552	Note that a deadlock is also possible if you use :func:`popen3` to read
				553	stdout and stderr. If one of the two is too large for the internal buffer
				554	(increasing the buffer size does not help) and you ``read()`` the other one
				555	first, there is a deadlock, too.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	556
Note on finished child processes: unless your program calls ``wait()`` or
``waitpid()``, finished child processes can linger as zombie processes, and
eventually attempts to start new children may fail because of a limit on the
number of child processes. Calling :func:`os.waitpid` with the
:data:`os.WNOHANG` option can prevent this; a good place to insert such a call
is just before starting another child.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	563
In many cases, all you really need is to run some data through a command and
get the result back. Unless the amount of data is very large, the easiest
way to do this is to write it to a temporary file and run the command with
that temporary file as input. The standard module :mod:`tempfile` exports a
:func:`~tempfile.mkstemp` function that securely creates a uniquely named
temporary file. ::

   import os
   import tempfile

   def _tempname():
       # mkstemp() creates the file itself, avoiding the race in mktemp()
       fd, name = tempfile.mkstemp()
       os.close(fd)
       return name

   class Popen3:
       """
       This is a deadlock-safe version of popen that returns
       an object with errorlevel, out (a string) and err (a string,
       or None if stderr was not captured).
       (capturestderr may not work under windows.)
       Example: print(Popen3('grep spam','\n\nhere spam\n\n').out)
       """
       def __init__(self, command, input=None, capturestderr=None):
           outfile = _tempname()
           command = "( %s ) > %s" % (command, outfile)
           if input:
               infile = _tempname()
               with open(infile, "w") as f:
                   f.write(input)
               command = command + " <" + infile
           if capturestderr:
               errfile = _tempname()
               command = command + " 2>" + errfile
           self.errorlevel = os.system(command) >> 8   # exit code on POSIX
           with open(outfile, "r") as f:
               self.out = f.read()
           os.remove(outfile)
           if input:
               os.remove(infile)
           self.err = None
           if capturestderr:
               with open(errfile, "r") as f:
                   self.err = f.read()
               os.remove(errfile)
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	598
Note that many interactive programs (e.g. vi) don't work well with pipes
substituted for standard input and output. For those you will have to use
pseudo ttys ("ptys") instead of pipes, or a Python interface to Don Libes'
"expect" library. A Python extension that interfaces to expect is called
"expy" and is available from http://expectpy.sourceforge.net. A pure Python
solution that works like expect is `pexpect
<http://pypi.python.org/pypi/pexpect/>`_.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	606
				607
				608	How do I access the serial (RS232) port?
				609	----------------------------------------
				610
				611	For Win32, POSIX (Linux, BSD, etc.), Jython:
				612
				613	http://pyserial.sourceforge.net
				614
				615	For Unix, see a Usenet post by Mitch Chapman:
				616
				617	http://groups.google.com/groups?selm=34A04430.CF9@ohioee.com
				618
				619
				620	Why doesn't closing sys.stdout (stdin, stderr) really close it?
				621	---------------------------------------------------------------
				622
Antoine Pitrou	6a11a98	2010-09-15 10:08:31 +0000	[diff] [blame]	623	Python :term:`file objects <file object>` are a high-level layer of
				624	abstraction on low-level C file descriptors.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	625
Antoine Pitrou	6a11a98	2010-09-15 10:08:31 +0000	[diff] [blame]	626	For most file objects you create in Python via the built-in :func:`open`
				627	function, ``f.close()`` marks the Python file object as being closed from
				628	Python's point of view, and also arranges to close the underlying C file
				629	descriptor. This also happens automatically in ``f``'s destructor, when
				630	``f`` becomes garbage.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	631
				632	But stdin, stdout and stderr are treated specially by Python, because of the
				633	special status also given to them by C. Running ``sys.stdout.close()`` marks
				634	the Python-level file object as being closed, but does not close the
Antoine Pitrou	6a11a98	2010-09-15 10:08:31 +0000	[diff] [blame]	635	associated C file descriptor.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	636
Antoine Pitrou	6a11a98	2010-09-15 10:08:31 +0000	[diff] [blame]	637	To close the underlying C file descriptor for one of these three, you should
				638	first be sure that's what you really want to do (e.g., you may confuse
				639	extension modules trying to do I/O). If it is, use :func:`os.close`::
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	640
   import os
   import sys

   os.close(sys.stdin.fileno())
   os.close(sys.stdout.fileno())
   os.close(sys.stderr.fileno())
				644
				645	Or you can use the numeric constants 0, 1 and 2, respectively.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	646
				647
				648	Network/Internet Programming
				649	============================
				650
				651	What WWW tools are there for Python?
				652	------------------------------------
				653
				654	See the chapters titled :ref:`internet` and :ref:`netdata` in the Library
				655	Reference Manual. Python has many modules that will help you build server-side
				656	and client-side web systems.
				657
				658	.. XXX check if wiki page is still up to date
				659
				660	A summary of available frameworks is maintained by Paul Boddie at
				661	http://wiki.python.org/moin/WebProgramming .
				662
				663	Cameron Laird maintains a useful set of pages about Python web technologies at
				664	http://phaseit.net/claird/comp.lang.python/web_python.
				665
				666
				667	How can I mimic CGI form submission (METHOD=POST)?
				668	--------------------------------------------------
				669
				670	I would like to retrieve web pages that are the result of POSTing a form. Is
				671	there existing code that would let me do this easily?
				672
Yes. Here's a simple example that uses :mod:`urllib.request`::

   #!/usr/local/bin/python

   import urllib.request

   # build the query string; POST data must be bytes, not str
   qs = b"First=Josephine&MI=Q&Last=Public"

   # connect and send the server a path
   req = urllib.request.urlopen('http://www.some-server.out-there'
                                '/cgi-bin/some-cgi-script', data=qs)
   with req:
       msg, hdrs = req.read(), req.info()
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	686
Georg Brandl	54ebb78	2010-08-14 15:48:49 +0000	[diff] [blame]	687	Note that in general for percent-encoded POST operations, query strings must be
Ezio Melotti	b35480e	2012-05-13 20:14:04 +0300	[diff] [blame]	688	quoted using :func:`urllib.parse.urlencode`. For example, to send
				689	``name=Guy Steele, Jr.``::
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	690
Georg Brandl	9e4ff75	2009-12-19 17:57:51 +0000	[diff] [blame]	691	>>> import urllib.parse
				692	>>> urllib.parse.urlencode({'name': 'Guy Steele, Jr.'})
				693	'name=Guy+Steele%2C+Jr.'
				694
				695	.. seealso:: :ref:`urllib-howto` for extensive examples.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	696
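Putting the two steps together, the sketch below encodes a set of form fields
and attaches them to a request object. The field names and the URL are
placeholders invented for illustration, and nothing is actually sent over the
network.

```python
import urllib.parse
import urllib.request

# Hypothetical form fields and URL, chosen only for illustration.
fields = {"name": "Guy Steele, Jr.", "lang": "Python"}
url = "http://www.example.com/cgi-bin/some-cgi-script"

# urlencode() percent-encodes the values; encode() turns the resulting
# str into the bytes object that urllib.request expects as POST data.
body = urllib.parse.urlencode(fields).encode("ascii")

# Supplying data makes this a POST request rather than a GET.
request = urllib.request.Request(url, data=body)
print(request.get_method(), body)

# Sending it would look like this (left commented out because the URL
# above is a placeholder):
# with urllib.request.urlopen(request) as response:
#     page = response.read()
```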
				697
				698	What module should I use to help with generating HTML?
				699	------------------------------------------------------
				700
				701	.. XXX add modern template languages
				702
Ezio Melotti	b35480e	2012-05-13 20:14:04 +0300	[diff] [blame]	703	You can find a collection of useful links on the `Web Programming wiki page
				704	<http://wiki.python.org/moin/WebProgramming>`_.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	705
				706
				707	How do I send mail from a Python script?
				708	----------------------------------------
				709
				710	Use the standard library module :mod:`smtplib`.
				711
				712	Here's a very simple interactive mail sender that uses it. This method will
				713	work on any host that supports an SMTP listener. ::
				714
				715	import sys, smtplib
				716
Georg Brandl	9e4ff75	2009-12-19 17:57:51 +0000	[diff] [blame]	717	fromaddr = input("From: ")
				718	toaddrs = input("To: ").split(',')
				719	print("Enter message, end with ^D:")
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	720	msg = ''
				721	while True:
				722	line = sys.stdin.readline()
				723	if not line:
				724	break
				725	msg += line
				726
				727	# The actual mail send
				728	server = smtplib.SMTP('localhost')
				729	server.sendmail(fromaddr, toaddrs, msg)
				730	server.quit()
				731
				732	A Unix-only alternative uses sendmail. The location of the sendmail program
Ezio Melotti	b35480e	2012-05-13 20:14:04 +0300	[diff] [blame]	733	varies between systems; sometimes it is ``/usr/lib/sendmail``, sometimes
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	734	``/usr/sbin/sendmail``. The sendmail manual page will help you out. Here's
				735	some sample code::
				736
   import os

   SENDMAIL = "/usr/sbin/sendmail"  # sendmail location
   p = os.popen("%s -t -i" % SENDMAIL, "w")
   p.write("To: receiver@example.com\n")
   p.write("Subject: test\n")
   p.write("\n")  # blank line separating headers from body
   p.write("Some text\n")
   p.write("some more text\n")
   sts = p.close()  # None means the command exited with status 0
   if sts is not None:
       print("Sendmail exit status", sts)
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	748
				749
				750	How do I avoid blocking in the connect() method of a socket?
				751	------------------------------------------------------------
				752
Antoine Pitrou	7095721	2011-02-05 11:24:15 +0000	[diff] [blame]	753	The :mod:`select` module is commonly used to help with asynchronous I/O on
				754	sockets.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	755
				756	To prevent the TCP connect from blocking, you can set the socket to non-blocking
				757	mode. Then when you do the ``connect()``, you will either connect immediately
				758	(unlikely) or get an exception that contains the error number as ``.errno``.
				759	``errno.EINPROGRESS`` indicates that the connection is in progress, but hasn't
				760	finished yet. Different OSes will return different values, so you're going to
				761	have to check what's returned on your system.
				762
				763	You can use the ``connect_ex()`` method to avoid creating an exception. It will
				764	just return the errno value. To poll, you can call ``connect_ex()`` again later
Georg Brandl	9e4ff75	2009-12-19 17:57:51 +0000	[diff] [blame]	765	-- ``0`` or ``errno.EISCONN`` indicate that you're connected -- or you can pass this
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	766	socket to select to check if it's writable.
				767
.. note::
   The :mod:`asyncore` module presents a framework-like approach to the problem
   of writing non-blocking networking code; it has since been deprecated in
   favor of :mod:`asyncio` and was removed in Python 3.12.
   The third-party `Twisted <http://twistedmatrix.com/>`_ library is
   a popular and feature-rich alternative.
				773
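The ``connect_ex()`` and ``select`` approach described above might look
roughly like the following sketch. To stay self-contained it connects to a
throwaway listener on the local machine; the five-second timeout and the set
of accepted "in progress" codes are assumptions that you may need to adjust
for your platform.

```python
import errno
import select
import socket

# A throwaway local listener so that the sketch has something to connect to.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
listener.listen(1)
address = listener.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.setblocking(False)

# connect_ex() returns an errno value instead of raising an exception.
# The "in progress" code differs between platforms, so accept several.
code = client.connect_ex(address)
in_progress = (0, errno.EINPROGRESS, errno.EWOULDBLOCK, errno.EALREADY)
if code not in in_progress:
    raise OSError(code, "connect failed immediately")

# The socket becomes writable once the connection attempt has finished.
_, writable, _ = select.select([], [client], [], 5.0)
if not writable:
    raise TimeoutError("connect timed out")

# SO_ERROR reports whether the finished attempt succeeded (0) or failed.
error = client.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
connected = (error == 0)
print("connected:", connected)

client.close()
listener.close()
```

Checking ``SO_ERROR`` after ``select`` reports the socket as writable is what
distinguishes a completed connection from a failed attempt.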
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	774
				775	Databases
				776	=========
				777
				778	Are there any interfaces to database packages in Python?
				779	--------------------------------------------------------
				780
				781	Yes.
				782
Interfaces to disk-based hashes such as :mod:`DBM <dbm.ndbm>` and :mod:`GDBM
<dbm.gnu>` are included with standard Python. There is also the
:mod:`sqlite3` module, which provides a lightweight disk-based relational
database.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	787
				788	Support for most relational databases is available. See the
				789	`DatabaseProgramming wiki page
				790	<http://wiki.python.org/moin/DatabaseProgramming>`_ for details.
				791
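As a small illustration of the :mod:`sqlite3` module mentioned above, the
sketch below creates a table, inserts two rows and queries them. The table
and column names are invented for the example, and an in-memory database is
used so that no file is left behind.

```python
import sqlite3

# ":memory:" creates a temporary database in RAM; pass a file name
# instead to keep the data on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE language (name TEXT, first_release INTEGER)")

# The "?" placeholders let the module quote the values safely.
conn.executemany(
    "INSERT INTO language VALUES (?, ?)",
    [("Python", 1991), ("C", 1972)],
)
conn.commit()

rows = conn.execute(
    "SELECT name FROM language WHERE first_release < ? ORDER BY name", (1980,)
).fetchall()
print(rows)
conn.close()
```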
				792
				793	How do you implement persistent objects in Python?
				794	--------------------------------------------------
				795
				796	The :mod:`pickle` library module solves this in a very general way (though you
				797	still can't store things like open files, sockets or windows), and the
				798	:mod:`shelve` library module uses pickle and (g)dbm to create persistent
Georg Brandl	d404fa6	2009-10-13 16:55:12 +0000	[diff] [blame]	799	mappings containing arbitrary Python objects.
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	800
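A minimal sketch of a persistent mapping built with :mod:`shelve` follows.
The file location and the keys are made up for illustration; any writable
path would serve.

```python
import os
import shelve
import tempfile

# Hypothetical file location; any writable path will do.
path = os.path.join(tempfile.mkdtemp(), "settings")

# A shelf behaves like a dict whose values are pickled to disk.
with shelve.open(path) as shelf:
    shelf["recent"] = ["spam.txt", "eggs.txt"]
    shelf["window"] = {"width": 800, "height": 600}

# Reopening the shelf, possibly in a later run, restores the objects.
with shelve.open(path) as shelf:
    restored = shelf["window"]
    recent = shelf["recent"]

print(restored, recent)
```

Keys must be strings, while the values can be any objects that :mod:`pickle`
is able to handle.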
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	801
Georg Brandl	d741315	2009-10-11 21:25:26 +0000	[diff] [blame]	802	Mathematics and Numerics
				803	========================
				804
				805	How do I generate random numbers in Python?
				806	-------------------------------------------
				807
				808	The standard module :mod:`random` implements a random number generator. Usage
				809	is simple::
				810
				811	import random
				812	random.random()
				813
				814	This returns a random floating point number in the range [0, 1).
				815
				816	There are also many other specialized generators in this module, such as:
				817
				818	* ``randrange(a, b)`` chooses an integer in the range [a, b).
				819	* ``uniform(a, b)`` chooses a floating point number in the range [a, b).
				820	* ``normalvariate(mean, sdev)`` samples the normal (Gaussian) distribution.
				821
				822	Some higher-level functions operate on sequences directly, such as:
				823
* ``choice(S)`` chooses a random element from a given sequence.
* ``shuffle(L)`` shuffles a list in-place, i.e. permutes it randomly.
				826
There's also a ``Random`` class you can instantiate to create multiple
independent random number generators.
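The functions listed above are also available as methods of a ``Random``
instance, as the sketch below shows. The seed value and the distribution
parameters are arbitrary choices made for illustration.

```python
import random

# Seeding a private Random instance makes its sequence repeatable
# without affecting the module-level generator.
rng = random.Random(12345)      # arbitrary seed chosen for illustration

die = rng.randrange(1, 7)       # integer from 1 to 6 inclusive
fraction = rng.uniform(0.0, 10.0)
height = rng.normalvariate(170.0, 10.0)

deck = list(range(10))
rng.shuffle(deck)               # reorders the list in place
card = rng.choice(deck)

# A second generator with the same seed replays the same values.
replay = random.Random(12345)
same_first_value = (replay.randrange(1, 7) == die)
print(die, fraction, height, deck, card, same_first_value)
```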