********************************
  Functional Programming HOWTO
********************************

:Author: A. M. Kuchling
:Release: 0.31
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	7
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	8	In this document, we'll take a tour of Python's features suitable for
				9	implementing programs in a functional style. After an introduction to the
				10	concepts of functional programming, we'll look at language features such as
Georg Brandl	9afde1c	2007-11-01 20:32:30 +0000	[diff] [blame]	11	:term:`iterator`\s and :term:`generator`\s and relevant library modules such as
				12	:mod:`itertools` and :mod:`functools`.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	13
				14
				15	Introduction
				16	============
				17
				18	This section explains the basic concept of functional programming; if you're
				19	just interested in learning about Python language features, skip to the next
				20	section.
				21
				22	Programming languages support decomposing problems in several different ways:
				23
				24	* Most programming languages are procedural: programs are lists of
				25	instructions that tell the computer what to do with the program's input. C,
				26	Pascal, and even Unix shells are procedural languages.
				27
				28	* In declarative languages, you write a specification that describes the
				29	problem to be solved, and the language implementation figures out how to
				30	perform the computation efficiently. SQL is the declarative language you're
				31	most likely to be familiar with; a SQL query describes the data set you want
				32	to retrieve, and the SQL engine decides whether to scan tables or use indexes,
				33	which subclauses should be performed first, etc.
				34
				35	* Object-oriented programs manipulate collections of objects. Objects have
				36	internal state and support methods that query or modify this internal state in
				37	some way. Smalltalk and Java are object-oriented languages. C++ and Python
				38	are languages that support object-oriented programming, but don't force the
				39	use of object-oriented features.
				40
				41	* Functional programming decomposes a problem into a set of functions.
				42	Ideally, functions only take inputs and produce outputs, and don't have any
				43	internal state that affects the output produced for a given input. Well-known
				44	functional languages include the ML family (Standard ML, OCaml, and other
				45	variants) and Haskell.
				46
Christian Heimes	0449f63	2007-12-15 01:27:15 +0000	[diff] [blame]	47	The designers of some computer languages choose to emphasize one
				48	particular approach to programming. This often makes it difficult to
				49	write programs that use a different approach. Other languages are
				50	multi-paradigm languages that support several different approaches.
				51	Lisp, C++, and Python are multi-paradigm; you can write programs or
				52	libraries that are largely procedural, object-oriented, or functional
				53	in all of these languages. In a large program, different sections
				54	might be written using different approaches; the GUI might be
				55	object-oriented while the processing logic is procedural or
				56	functional, for example.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	57
				58	In a functional program, input flows through a set of functions. Each function
Christian Heimes	0449f63	2007-12-15 01:27:15 +0000	[diff] [blame]	59	operates on its input and produces some output. Functional style discourages
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	60	functions with side effects that modify internal state or make other changes
				61	that aren't visible in the function's return value. Functions that have no side
				62	effects at all are called purely functional. Avoiding side effects means
				63	not using data structures that get updated as a program runs; every function's
				64	output must only depend on its input.
				65
Some languages are very strict about purity and don't even have assignment
statements such as ``a=3`` or ``c = a + b``, but it's difficult to avoid all
side effects. Printing to the screen or writing to a disk file are side
effects, for example. In Python, calls to the :func:`print` or
:func:`time.sleep` functions return no useful value; they're only called for
their side effects of sending some text to the screen or pausing execution for a
second.
				73
				74	Python programs written in functional style usually won't go to the extreme of
				75	avoiding all I/O or all assignments; instead, they'll provide a
				76	functional-appearing interface but will use non-functional features internally.
				77	For example, the implementation of a function will still use assignments to
				78	local variables, but won't modify global variables or have other side effects.
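
As a minimal sketch of this distinction (the function names here are
hypothetical, chosen only for illustration), the first function below changes
module-level state, while the second uses local assignments internally but
presents a functional interface:

```python
# Hypothetical names, used only to illustrate the distinction.
totals = []  # module-level state


def add_to_totals(x):
    """Not functional: called only for its side effect on ``totals``."""
    totals.append(x)


def running_totals(values):
    """Functional interface: the result depends only on the argument.

    Local variables are assigned freely, but nothing outside the
    function is modified, including the list passed in.
    """
    result = []
    total = 0
    for v in values:
        total += v
        result.append(total)
    return result


data = [1, 2, 3]
print(running_totals(data))  # [1, 3, 6]
print(running_totals(data))  # same input, same output: [1, 3, 6]
print(data)                  # [1, 2, 3]
```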
				79
				80	Functional programming can be considered the opposite of object-oriented
				81	programming. Objects are little capsules containing some internal state along
				82	with a collection of method calls that let you modify this state, and programs
				83	consist of making the right set of state changes. Functional programming wants
				84	to avoid state changes as much as possible and works with data flowing between
				85	functions. In Python you might combine the two approaches by writing functions
				86	that take and return instances representing objects in your application (e-mail
				87	messages, transactions, etc.).
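
One possible way to combine the two approaches is sketched below. The
``Transaction`` class and ``apply_fee()`` function are invented for this
example, and the sketch assumes Python 3.7 or later for the :mod:`dataclasses`
module:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Transaction:
    # Hypothetical application object; frozen so instances can't be mutated.
    account: str
    amount: int  # in cents


def apply_fee(tx, fee):
    """Return a new Transaction; the one passed in is left untouched."""
    return replace(tx, amount=tx.amount - fee)


original = Transaction("alice", 1000)
charged = apply_fee(original, 25)
print(original)  # Transaction(account='alice', amount=1000)
print(charged)   # Transaction(account='alice', amount=975)
```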
				88
				89	Functional design may seem like an odd constraint to work under. Why should you
				90	avoid objects and side effects? There are theoretical and practical advantages
				91	to the functional style:
				92
				93	* Formal provability.
				94	* Modularity.
				95	* Composability.
				96	* Ease of debugging and testing.
				97
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	98
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	99	Formal provability
				100	------------------
				101
				102	A theoretical benefit is that it's easier to construct a mathematical proof that
				103	a functional program is correct.
				104
				105	For a long time researchers have been interested in finding ways to
				106	mathematically prove programs correct. This is different from testing a program
				107	on numerous inputs and concluding that its output is usually correct, or reading
				108	a program's source code and concluding that the code looks right; the goal is
				109	instead a rigorous proof that a program produces the right result for all
				110	possible inputs.
				111
				112	The technique used to prove programs correct is to write down invariants,
				113	properties of the input data and of the program's variables that are always
				114	true. For each line of code, you then show that if invariants X and Y are true
				115	before the line is executed, the slightly different invariants X' and Y' are
				116	true after the line is executed. This continues until you reach the end of
				117	the program, at which point the invariants should match the desired conditions
				118	on the program's output.
				119
				120	Functional programming's avoidance of assignments arose because assignments are
				121	difficult to handle with this technique; assignments can break invariants that
				122	were true before the assignment without producing any new invariants that can be
				123	propagated onward.
				124
				125	Unfortunately, proving programs correct is largely impractical and not relevant
				126	to Python software. Even trivial programs require proofs that are several pages
				127	long; the proof of correctness for a moderately complicated program would be
				128	enormous, and few or none of the programs you use daily (the Python interpreter,
				129	your XML parser, your web browser) could be proven correct. Even if you wrote
				130	down or generated a proof, there would then be the question of verifying the
				131	proof; maybe there's an error in it, and you wrongly believe you've proved the
				132	program correct.
				133
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	134
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	135	Modularity
				136	----------
				137
				138	A more practical benefit of functional programming is that it forces you to
				139	break apart your problem into small pieces. Programs are more modular as a
				140	result. It's easier to specify and write a small function that does one thing
				141	than a large function that performs a complicated transformation. Small
				142	functions are also easier to read and to check for errors.
				143
				144
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	145	Ease of debugging and testing
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	146	-----------------------------
				147
				148	Testing and debugging a functional-style program is easier.
				149
				150	Debugging is simplified because functions are generally small and clearly
				151	specified. When a program doesn't work, each function is an interface point
				152	where you can check that the data are correct. You can look at the intermediate
				153	inputs and outputs to quickly isolate the function that's responsible for a bug.
				154
				155	Testing is easier because each function is a potential subject for a unit test.
				156	Functions don't depend on system state that needs to be replicated before
				157	running a test; instead you only have to synthesize the right input and then
				158	check that the output matches expectations.
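
For instance, a test for a small function might look like the following
sketch. ``word_lengths()`` is a made-up example function; the point is only
that the test needs no setup beyond constructing the input:

```python
def word_lengths(words):
    """Return the length of each string in ``words``."""
    return [len(w) for w in words]


def test_word_lengths():
    # No files, globals or other system state to prepare:
    # synthesize the input, then check the output.
    assert word_lengths(["a", "bcd", ""]) == [1, 3, 0]
    assert word_lengths([]) == []


test_word_lengths()
```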
				159
				160
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	161	Composability
				162	-------------
				163
				164	As you work on a functional-style program, you'll write a number of functions
				165	with varying inputs and outputs. Some of these functions will be unavoidably
				166	specialized to a particular application, but others will be useful in a wide
				167	variety of programs. For example, a function that takes a directory path and
				168	returns all the XML files in the directory, or a function that takes a filename
				169	and returns its contents, can be applied to many different situations.
				170
				171	Over time you'll form a personal library of utilities. Often you'll assemble
				172	new programs by arranging existing functions in a new configuration and writing
				173	a few functions specialized for the current task.
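
The two utilities mentioned above could be sketched roughly as follows. The
function names and the use of :mod:`pathlib` are my own choices for
illustration; the last function is written purely by arranging the other two:

```python
import tempfile
from pathlib import Path


def xml_files(directory):
    """Return the XML files directly inside ``directory``, sorted by name."""
    return sorted(Path(directory).glob("*.xml"))


def contents(filename):
    """Return the text contents of ``filename``."""
    return Path(filename).read_text(encoding="utf-8")


def xml_contents(directory):
    """A task-specific function assembled from the two general ones."""
    return [contents(path) for path in xml_files(directory)]


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.xml").write_text("<a/>", encoding="utf-8")
    Path(tmp, "notes.txt").write_text("ignored", encoding="utf-8")
    result = xml_contents(tmp)

print(result)  # ['<a/>']
```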
				174
				175
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	176	Iterators
				177	=========
				178
				179	I'll start by looking at a Python language feature that's an important
				180	foundation for writing functional-style programs: iterators.
				181
				182	An iterator is an object representing a stream of data; this object returns the
				183	data one element at a time. A Python iterator must support a method called
Benjamin Peterson	e7c78b2	2008-07-03 20:28:26 +0000	[diff] [blame]	184	``__next__()`` that takes no arguments and always returns the next element of
				185	the stream. If there are no more elements in the stream, ``__next__()`` must
				186	raise the ``StopIteration`` exception. Iterators don't have to be finite,
				187	though; it's perfectly reasonable to write an iterator that produces an infinite
				188	stream of data.
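
As a rough sketch, here are two hand-written iterators, one finite and one
infinite. The class names are hypothetical. Besides ``__next__()``, each class
defines ``__iter__()`` returning ``self``, which is what lets :func:`list` and
the ``for`` statement accept the object directly:

```python
class Countdown:
    """Finite iterator: yields start, start-1, ..., 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # no more elements in the stream
        value = self.current
        self.current -= 1
        return value


class Naturals:
    """Infinite iterator: yields 0, 1, 2, ... and never raises StopIteration."""

    def __init__(self):
        self.n = 0

    def __iter__(self):
        return self

    def __next__(self):
        value = self.n
        self.n += 1
        return value


print(list(Countdown(3)))                # [3, 2, 1]
nat = Naturals()
print(next(nat), next(nat), next(nat))   # 0 1 2
```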
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	189
				190	The built-in :func:`iter` function takes an arbitrary object and tries to return
				191	an iterator that will return the object's contents or elements, raising
				192	:exc:`TypeError` if the object doesn't support iteration. Several of Python's
				193	built-in data types support iteration, the most common being lists and
				194	dictionaries. An object is called an iterable object if you can get an
				195	iterator for it.
				196
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	197	You can experiment with the iteration interface manually:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	198
    >>> L = [1,2,3]
    >>> it = iter(L)
    >>> it
    <...iterator object at ...>
    >>> it.__next__()
    1
    >>> next(it)
    2
    >>> next(it)
    3
    >>> next(it)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
    >>>
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	214
				215	Python expects iterable objects in several different contexts, the most
				216	important being the ``for`` statement. In the statement ``for X in Y``, Y must
				217	be an iterator or some object for which ``iter()`` can create an iterator.
				218	These two statements are equivalent::
				219
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	220
    for i in iter(obj):
        print(i)

    for i in obj:
        print(i)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	226
				227	Iterators can be materialized as lists or tuples by using the :func:`list` or
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	228	:func:`tuple` constructor functions:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	229
				230	>>> L = [1,2,3]
				231	>>> iterator = iter(L)
				232	>>> t = tuple(iterator)
				233	>>> t
				234	(1, 2, 3)
				235
				236	Sequence unpacking also supports iterators: if you know an iterator will return
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	237	N elements, you can unpack them into an N-tuple:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	238
				239	>>> L = [1,2,3]
				240	>>> iterator = iter(L)
				241	>>> a,b,c = iterator
				242	>>> a,b,c
				243	(1, 2, 3)
				244
				245	Built-in functions such as :func:`max` and :func:`min` can take a single
				246	iterator argument and will return the largest or smallest element. The ``"in"``
				247	and ``"not in"`` operators also support iterators: ``X in iterator`` is true if
				248	X is found in the stream returned by the iterator. You'll run into obvious
Sandro Tosi	dd7c552	2012-08-15 21:37:35 +0200	[diff] [blame^]	249	problems if the iterator is infinite; ``max()``, ``min()``
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	250	will never return, and if the element X never appears in the stream, the
Sandro Tosi	dd7c552	2012-08-15 21:37:35 +0200	[diff] [blame^]	251	``"in"`` and ``"not in"`` operators won't return either.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	252
Note that you can only go forward in an iterator; there's no way to get the
previous element, reset the iterator, or make a copy of it. Iterator objects
can optionally provide these additional capabilities, but the iterator protocol
only specifies the ``__next__()`` method. Functions may therefore consume all of
the iterator's output, and if you need to do something different with the same
stream, you'll have to create a new iterator.
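
The following short sketch shows the forward-only behaviour: a membership test
consumes elements up to the match, and an exhausted iterator stays empty until
you create a fresh one:

```python
numbers = [1, 2, 3, 4]

it = iter(numbers)
found = 3 in it            # consumes 1, 2 and 3 while searching
remaining = list(it)       # only what is left after the match
again = list(it)           # the iterator is now exhausted

print(found)      # True
print(remaining)  # [4]
print(again)      # []

# To process the same data again, create a new iterator.
fresh = iter(numbers)
print(max(fresh))  # 4
```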
				259
				260
				261
				262	Data Types That Support Iterators
				263	---------------------------------
				264
				265	We've already seen how lists and tuples support iterators. In fact, any Python
				266	sequence type, such as strings, will automatically support creation of an
				267	iterator.
				268
				269	Calling :func:`iter` on a dictionary returns an iterator that will loop over the
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	270	dictionary's keys:
				271
				272	.. not a doctest since dict ordering varies across Pythons
				273
				274	::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	275
    >>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
    ...      'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
    >>> for key in m:
    ...     print(key, m[key])
    Mar 3
    Feb 2
    Aug 8
    Sep 9
    Apr 4
    Jun 6
    Jul 7
    Jan 1
    May 5
    Nov 11
    Dec 12
    Oct 10
				292
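Note that the order shown here is essentially arbitrary: before Python 3.7 the
iteration order was unspecified and was based on the hash ordering of the
objects in the dictionary. Starting with Python 3.7, dictionary iteration order
is guaranteed to be the same as the insertion order, so the same loop would
print ``Jan 1`` first and ``Dec 12`` last.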
				295
Fred Drake	2e74878	2007-09-04 17:33:11 +0000	[diff] [blame]	296	Applying :func:`iter` to a dictionary always loops over the keys, but
				297	dictionaries have methods that return other iterators. If you want to iterate
				298	over values or key/value pairs, you can explicitly call the
				299	:meth:`values` or :meth:`items` methods to get an appropriate iterator.
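
A brief sketch of the three kinds of iteration follows; :func:`sorted` is used
only so that the printed results don't depend on dictionary ordering:

```python
m = {'Jan': 1, 'Feb': 2, 'Mar': 3}

keys = sorted(m)             # iterating over the dict itself gives keys
values = sorted(m.values())  # values only
pairs = sorted(m.items())    # (key, value) tuples

print(keys)    # ['Feb', 'Jan', 'Mar']
print(values)  # [1, 2, 3]
print(pairs)   # [('Feb', 2), ('Jan', 1), ('Mar', 3)]
```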
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	300
				301	The :func:`dict` constructor can accept an iterator that returns a finite stream
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	302	of ``(key, value)`` tuples:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	303
				304	>>> L = [('Italy', 'Rome'), ('France', 'Paris'), ('US', 'Washington DC')]
				305	>>> dict(iter(L))
				306	{'Italy': 'Rome', 'US': 'Washington DC', 'France': 'Paris'}
				307
				308	Files also support iteration by calling the ``readline()`` method until there
				309	are no more lines in the file. This means you can read each line of a file like
				310	this::
				311
    for line in file:
        # do something for each line
        ...
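
To try this without creating a file on disk, the sketch below uses
:class:`io.StringIO` as a stand-in for an open text file; it supports the same
line-by-line iteration:

```python
import io

# StringIO stands in for a real file object opened in text mode.
f = io.StringIO("first\nsecond\nthird\n")

lengths = []
for line in f:
    # Each line still carries its trailing newline.
    lengths.append(len(line.rstrip("\n")))

print(lengths)  # [5, 6, 5]
```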
				315
				316	Sets can take their contents from an iterable and let you iterate over the set's
				317	elements::
				318
    S = {2, 3, 5, 7, 11, 13}
    for i in S:
        print(i)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	322
				323
				324
				325	Generator expressions and list comprehensions
				326	=============================================
				327
				328	Two common operations on an iterator's output are 1) performing some operation
				329	for every element, 2) selecting a subset of elements that meet some condition.
				330	For example, given a list of strings, you might want to strip off trailing
				331	whitespace from each line or extract all the strings containing a given
				332	substring.
				333
List comprehensions and generator expressions (short form: "listcomps" and
"genexps") are a concise notation for such operations, borrowed from the
functional programming language Haskell (http://www.haskell.org/). You can strip
leading and trailing whitespace from each string in a stream with the following
code::
				338
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	339	line_list = [' line 1\n', 'line 2 \n', ...]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	340
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	341	# Generator expression -- returns iterator
				342	stripped_iter = (line.strip() for line in line_list)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	343
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	344	# List comprehension -- returns list
				345	stripped_list = [line.strip() for line in line_list]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	346
				347	You can select only certain elements by adding an ``"if"`` condition::
				348
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	349	stripped_list = [line.strip() for line in line_list
				350	if line != ""]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	351
				352	With a list comprehension, you get back a Python list; ``stripped_list`` is a
				353	list containing the resulting lines, not an iterator. Generator expressions
				354	return an iterator that computes the values as necessary, not needing to
				355	materialize all the values at once. This means that list comprehensions aren't
				356	useful if you're working with iterators that return an infinite stream or a very
				357	large amount of data. Generator expressions are preferable in these situations.
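
As an illustration, the generator expression below draws from an infinite
stream, so the equivalent list comprehension would never finish.
:func:`itertools.count` and :func:`itertools.islice` are used here only to
supply the stream and to take a finite prefix of it:

```python
from itertools import count, islice

# count(1) yields 1, 2, 3, ... forever; the genexp computes squares lazily.
squares = (n * n for n in count(1))

# Only the five requested values are ever computed.
first_five = list(islice(squares, 5))
print(first_five)  # [1, 4, 9, 16, 25]

# The generator remembers its position; the next value continues the stream.
print(next(squares))  # 36
```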
				358
				359	Generator expressions are surrounded by parentheses ("()") and list
				360	comprehensions are surrounded by square brackets ("[]"). Generator expressions
				361	have the form::
				362
    ( expression for expr1 in sequence1
                 if condition1
                 for expr2 in sequence2
                 if condition2
                 for expr3 in sequence3 ...
                 if condition3
                 for exprN in sequenceN
                 if conditionN )
				371
				372	Again, for a list comprehension only the outside brackets are different (square
				373	brackets instead of parentheses).
				374
				375	The elements of the generated output will be the successive values of
				376	``expression``. The ``if`` clauses are all optional; if present, ``expression``
				377	is only evaluated and added to the result when ``condition`` is true.
				378
				379	Generator expressions always have to be written inside parentheses, but the
				380	parentheses signalling a function call also count. If you want to create an
				381	iterator that will be immediately passed to a function you can write::
				382
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	383	obj_total = sum(obj.count for obj in list_all_objects())
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	384
				385	The ``for...in`` clauses contain the sequences to be iterated over. The
				386	sequences do not have to be the same length, because they are iterated over from
				387	left to right, not in parallel. For each element in ``sequence1``,
				388	``sequence2`` is looped over from the beginning. ``sequence3`` is then looped
				389	over for each resulting pair of elements from ``sequence1`` and ``sequence2``.
				390
				391	To put it another way, a list comprehension or generator expression is
				392	equivalent to the following Python code::
				393
    for expr1 in sequence1:
        if not (condition1):
            continue   # Skip this element
        for expr2 in sequence2:
            if not (condition2):
                continue   # Skip this element
            ...
            for exprN in sequenceN:
                if not (conditionN):
                    continue   # Skip this element

                # Output the value of
                # the expression.
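
Here is a concrete instance of that equivalence, with two ``for`` clauses and
two ``if`` clauses. The particular sequences and conditions are arbitrary
choices for the example:

```python
seq1 = 'abc'
seq2 = (1, 2, 3)

# Comprehension form.
comp = [(x, y) for x in seq1 if x != 'b'
               for y in seq2 if y % 2 == 1]

# Equivalent nested loops, following the pattern above.
loop = []
for x in seq1:
    if not (x != 'b'):
        continue   # Skip this element
    for y in seq2:
        if not (y % 2 == 1):
            continue   # Skip this element
        loop.append((x, y))

print(comp)          # [('a', 1), ('a', 3), ('c', 1), ('c', 3)]
print(comp == loop)  # True
```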
				407
				408	This means that when there are multiple ``for...in`` clauses but no ``if``
				409	clauses, the length of the resulting output will be equal to the product of the
				410	lengths of all the sequences. If you have two lists of length 3, the output
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	411	list is 9 elements long:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	412
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	413	.. doctest::
				414	:options: +NORMALIZE_WHITESPACE
				415
				416	>>> seq1 = 'abc'
				417	>>> seq2 = (1,2,3)
				418	>>> [(x,y) for x in seq1 for y in seq2]
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	419	[('a', 1), ('a', 2), ('a', 3),
				420	('b', 1), ('b', 2), ('b', 3),
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	421	('c', 1), ('c', 2), ('c', 3)]
				422
				423	To avoid introducing an ambiguity into Python's grammar, if ``expression`` is
				424	creating a tuple, it must be surrounded with parentheses. The first list
				425	comprehension below is a syntax error, while the second one is correct::
				426
				427	# Syntax error
				428	[ x,y for x in seq1 for y in seq2]
				429	# Correct
				430	[ (x,y) for x in seq1 for y in seq2]
				431
				432
				433	Generators
				434	==========
				435
				436	Generators are a special class of functions that simplify the task of writing
				437	iterators. Regular functions compute a value and return it, but generators
				438	return an iterator that returns a stream of values.
				439
				440	You're doubtless familiar with how regular function calls work in Python or C.
				441	When you call a function, it gets a private namespace where its local variables
				442	are created. When the function reaches a ``return`` statement, the local
				443	variables are destroyed and the value is returned to the caller. A later call
				444	to the same function creates a new private namespace and a fresh set of local
				445	variables. But, what if the local variables weren't thrown away on exiting a
				446	function? What if you could later resume the function where it left off? This
				447	is what generators provide; they can be thought of as resumable functions.
				448
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	449	Here's the simplest example of a generator function:
				450
				451	.. testcode::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	452
    def generate_ints(N):
        for i in range(N):
            yield i
				456
				457	Any function containing a ``yield`` keyword is a generator function; this is
Georg Brandl	9afde1c	2007-11-01 20:32:30 +0000	[diff] [blame]	458	detected by Python's :term:`bytecode` compiler which compiles the function
				459	specially as a result.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	460
				461	When you call a generator function, it doesn't return a single value; instead it
				462	returns a generator object that supports the iterator protocol. On executing
				463	the ``yield`` expression, the generator outputs the value of ``i``, similar to a
				464	``return`` statement. The big difference between ``yield`` and a ``return``
				465	statement is that on reaching a ``yield`` the generator's state of execution is
				466	suspended and local variables are preserved. On the next call to the
Benjamin Peterson	e7c78b2	2008-07-03 20:28:26 +0000	[diff] [blame]	467	generator's ``.__next__()`` method, the function will resume executing.
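
To make the suspend-and-resume behaviour visible, the sketch below records
what the generator body is doing in a list passed in by the caller.
``traced_ints()`` and its ``log`` parameter are hypothetical, added only for
this demonstration:

```python
def traced_ints(n, log):
    """Like generate_ints(), but notes each step in ``log``."""
    log.append("started")
    for i in range(n):
        log.append(f"yielding {i}")
        yield i
    log.append("finished")


log = []
gen = traced_ints(2, log)
print(log)          # []  -- calling the function runs none of its body

first = next(gen)
print(first, log)   # 0 ['started', 'yielding 0']

second = next(gen)  # resumes right after the first yield
rest = list(gen)    # runs the body to its end
print(second, rest) # 1 []
print(log)          # ['started', 'yielding 0', 'yielding 1', 'finished']
```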
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	468
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	469	Here's a sample usage of the ``generate_ints()`` generator:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	470
    >>> gen = generate_ints(3)
    >>> gen
    <generator object generate_ints at ...>
    >>> next(gen)
    0
    >>> next(gen)
    1
    >>> next(gen)
    2
    >>> next(gen)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
				485
				486	You could equally write ``for i in generate_ints(5)``, or ``a,b,c =
				487	generate_ints(3)``.
				488
Inside a generator function, the ``return`` statement signals the end of the
procession of values; after executing a ``return`` the generator cannot yield
any further values. In Python 3.2 and earlier, ``return`` with a value, such as
``return 5``, is a syntax error inside a generator function; since Python 3.3,
``return value`` is allowed and causes ``StopIteration(value)`` to be raised
from the ``__next__()`` method. The end of the generator's results can also be
indicated by just letting the flow of execution fall off the bottom of the
function. (Older versions also allowed raising ``StopIteration`` manually
inside the generator, but since Python 3.7 this is converted into a
:exc:`RuntimeError`.)
				496
				497	You could achieve the effect of generators manually by writing your own class
				498	and storing all the local variables of the generator as instance variables. For
				499	example, returning a list of integers could be done by setting ``self.count`` to
Benjamin Peterson	e7c78b2	2008-07-03 20:28:26 +0000	[diff] [blame]	500	0, and having the ``__next__()`` method increment ``self.count`` and return it.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	501	However, for a moderately complicated generator, writing a corresponding class
				502	can be much messier.
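
For comparison, here is a sketch of a hand-written class that behaves like the
``generate_ints()`` generator. The class name is my own, and this version
returns the count before incrementing it so that the values match the
generator's:

```python
class GenerateInts:
    """Class-based equivalent of generate_ints(N): yields 0 .. N-1."""

    def __init__(self, n):
        self.n = n
        self.count = 0      # the generator's local state, stored by hand

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.n:
            raise StopIteration
        value = self.count
        self.count += 1
        return value


def generate_ints(n):
    for i in range(n):
        yield i


print(list(GenerateInts(3)))   # [0, 1, 2]
print(list(generate_ints(3)))  # [0, 1, 2]
```

Even for this trivial case, the class needs explicit bookkeeping that the
generator gets for free from its suspended local variables.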
				503
				504	The test suite included with Python's library, ``test_generators.py``, contains
				505	a number of more interesting examples. Here's one generator that implements an
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	506	in-order traversal of a tree using generators recursively. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	507
    # A recursive generator that generates Tree leaves in in-order.
    def inorder(t):
        if t:
            for x in inorder(t.left):
                yield x

            yield t.label

            for x in inorder(t.right):
                yield x
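
To run this traversal you need some tree type with ``left``, ``label`` and
``right`` attributes. The ``Tree`` class below is a minimal stand-in that I've
assumed for the example (it is not the one from ``test_generators.py``), and
the traversal is written with ``yield from``, available since Python 3.3, as a
shorter spelling of the same loops:

```python
class Tree:
    # Hypothetical minimal node type; a missing child is None.
    def __init__(self, label, left=None, right=None):
        self.label = label
        self.left = left
        self.right = right


def inorder(t):
    """Yield the labels of tree ``t`` in in-order."""
    if t:
        yield from inorder(t.left)
        yield t.label
        yield from inorder(t.right)


#       4
#      / \
#     2   5
#    / \
#   1   3
tree = Tree(4, Tree(2, Tree(1), Tree(3)), Tree(5))
print(list(inorder(tree)))  # [1, 2, 3, 4, 5]
```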
				518
				519	Two other examples in ``test_generators.py`` produce solutions for the N-Queens
				520	problem (placing N queens on an NxN chess board so that no queen threatens
				521	another) and the Knight's Tour (finding a route that takes a knight to every
				522	square of an NxN chessboard without visiting any square twice).
				523
				524
				525
				526	Passing values into a generator
				527	-------------------------------
				528
In Python 2.4 and earlier, generators only produced output. Once a generator's
code was invoked to create an iterator, there was no way to pass any new
information into the function when its execution was resumed. You could hack
together this ability by making the generator look at a global variable or by
passing in some mutable object that callers then modified, but these approaches
were messy.
				535
				536	In Python 2.5 there's a simple way to pass values into a generator.
				537	:keyword:`yield` became an expression, returning a value that can be assigned to
				538	a variable or otherwise operated on::
				539
				540	val = (yield i)
				541
				542	I recommend that you always put parentheses around a ``yield`` expression
				543	when you're doing something with the returned value, as in the above example.
				544	The parentheses aren't always necessary, but it's easier to always add them
				545	instead of having to remember when they're needed.
				546
(PEP 342 explains the exact rules, which are that a ``yield``-expression must
always be parenthesized except when it occurs as the top-level expression on the
right-hand side of an assignment. This means you can write ``val = yield i``
but have to use parentheses when there's an operation, as in
``val = (yield i) + 12``.)
				552
				553	Values are sent into a generator by calling its ``send(value)`` method. This
				554	method resumes the generator's code and the ``yield`` expression returns the
Benjamin Peterson	e7c78b2	2008-07-03 20:28:26 +0000	[diff] [blame]	555	specified value. If the regular ``__next__()`` method is called, the ``yield``
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	556	returns ``None``.
				557
				558	Here's a simple counter that increments by 1 and allows changing the value of
				559	the internal counter.
				560
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	561	.. testcode::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	562
    def counter(maximum):
        i = 0
        while i < maximum:
            val = (yield i)
            # If value provided, change counter
            if val is not None:
                i = val
            else:
                i += 1
				572
				573	And here's an example of changing the counter:
				574
    >>> it = counter(10)
    >>> next(it)
    0
    >>> next(it)
    1
    >>> it.send(8)
    8
    >>> next(it)
    9
    >>> next(it)
    Traceback (most recent call last):
      File "t.py", line 15, in <module>
        next(it)
    StopIteration
				589
Because ``yield`` will often be returning ``None``, you should always check for
this case. Don't just use its value in expressions unless you're sure that the
``send()`` method will be the only method used to resume your generator function.
				593
				594	In addition to ``send()``, there are two other new methods on generators:
				595
* ``throw(type, value=None, traceback=None)`` is used to raise an exception
  inside the generator; the exception is raised by the ``yield`` expression
  where the generator's execution is paused.

* ``close()`` raises a :exc:`GeneratorExit` exception inside the generator to
  terminate the iteration. On receiving this exception, the generator's code
  must either raise :exc:`GeneratorExit` or :exc:`StopIteration`; catching the
  exception and doing anything else is illegal and will trigger a
  :exc:`RuntimeError`. ``close()`` will also be called by Python's garbage
  collector when the generator is garbage-collected.
				606
				607	If you need to run cleanup code when a :exc:`GeneratorExit` occurs, I suggest
				608	using a ``try: ... finally:`` suite instead of catching :exc:`GeneratorExit`.
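
Here is a small sketch of how ``throw()``, ``close()`` and a ``try: ... finally:`` suite interact. The ``reader`` generator and its ``log`` list are hypothetical names chosen for the illustration.

```python
def reader(log):
    """Yield 0, 1, 2, ... forever, recording events in ``log`` (a list)."""
    n = 0
    try:
        while True:
            try:
                yield n
                n += 1
            except ValueError:
                # Raised at the paused ``yield`` by gen.throw(...).
                log.append('reset')
                n = 0
    finally:
        # Runs when close() raises GeneratorExit at the paused ``yield``.
        log.append('cleanup')


log = []
gen = reader(log)
first = next(gen)
second = next(gen)
# The handler resets the counter, and the generator runs on to its next yield.
after_throw = gen.throw(ValueError('reset requested'))
gen.close()
```

Note that the generator never catches :exc:`GeneratorExit` itself; the ``finally`` clause does the cleanup and lets the exception propagate, which is what ``close()`` expects.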
				609
				610	The cumulative effect of these changes is to turn generators from one-way
				611	producers of information into both producers and consumers.
				612
				613	Generators also become coroutines, a more generalized form of subroutines.
				614	Subroutines are entered at one point and exited at another point (the top of the
				615	function, and a ``return`` statement), but coroutines can be entered, exited,
				616	and resumed at many different points (the ``yield`` statements).
				617
				618
				619	Built-in functions
				620	==================
				621
				622	Let's look in more detail at built-in functions often used with iterators.
				623
Two of Python's built-in functions, :func:`map` and :func:`filter`, duplicate the
features of generator expressions:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	626
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	627	``map(f, iterA, iterB, ...)`` returns an iterator over the sequence
Georg Brandl	f694518	2008-02-01 11:56:49 +0000	[diff] [blame]	628	``f(iterA[0], iterB[0]), f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ...``.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	629
    >>> def upper(s):
    ...     return s.upper()

    >>> list(map(upper, ['sentence', 'fragment']))
    ['SENTENCE', 'FRAGMENT']
    >>> [upper(s) for s in ['sentence', 'fragment']]
    ['SENTENCE', 'FRAGMENT']
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	638
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	639	You can of course achieve the same effect with a list comprehension.
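
When :func:`map` is given several iterables, the function receives one element from each per call. A minimal sketch, with made-up ``prices`` and ``quantities`` data:

```python
import operator

prices = [10, 20, 30]
quantities = [1, 2, 3]

# map() calls operator.mul(prices[0], quantities[0]), then
# operator.mul(prices[1], quantities[1]), and so on.
totals = list(map(operator.mul, prices, quantities))

# The equivalent generator expression needs zip() to pair the elements up.
totals_genexp = list(p * q for p, q in zip(prices, quantities))
```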
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	640
Georg Brandl	f694518	2008-02-01 11:56:49 +0000	[diff] [blame]	641	``filter(predicate, iter)`` returns an iterator over all the sequence elements
				642	that meet a certain condition, and is similarly duplicated by list
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	643	comprehensions. A predicate is a function that returns the truth value of
				644	some condition; for use with :func:`filter`, the predicate must take a single
				645	value.
				646
    >>> def is_even(x):
    ...     return (x % 2) == 0

    >>> list(filter(is_even, range(10)))
    [0, 2, 4, 6, 8]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	652
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	653
This can also be written as a list comprehension:

    >>> [x for x in range(10) if is_even(x)]
    [0, 2, 4, 6, 8]
				658
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	659
				660	``enumerate(iter)`` counts off the elements in the iterable, returning 2-tuples
Georg Brandl	f694518	2008-02-01 11:56:49 +0000	[diff] [blame]	661	containing the count and each element. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	662
    >>> for item in enumerate(['subject', 'verb', 'object']):
    ...     print(item)
    (0, 'subject')
    (1, 'verb')
    (2, 'object')
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	668
				669	:func:`enumerate` is often used when looping through a list and recording the
				670	indexes at which certain conditions are met::
				671
    f = open('data.txt', 'r')
    for i, line in enumerate(f):
        if line.strip() == '':
            print('Blank line at line #%i' % i)
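
Because the count starts at 0 by default, the index is one less than the line number an editor would display. A variant using the optional ``start`` argument might look like this; ``io.StringIO`` stands in for a real file so that the sketch is self-contained, and the sample text is invented.

```python
import io

# io.StringIO stands in for an open text file.
f = io.StringIO("first\n\nthird\n   \nfifth\n")

blank_lines = []
# start=1 makes the count match the line numbers a text editor would show.
for lineno, line in enumerate(f, start=1):
    if line.strip() == '':
        blank_lines.append(lineno)
```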
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	676
``sorted(iterable, [key=None], [reverse=False])`` collects all the elements of
the iterable into a list, sorts the list, and returns the sorted result. The
``key`` and ``reverse`` arguments are passed through to the constructed list's
``.sort()`` method. ::
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	681
				682	>>> import random
				683	>>> # Generate 8 random numbers between [0, 10000)
				684	>>> rand_list = random.sample(range(10000), 8)
				685	>>> rand_list
				686	[769, 7953, 9828, 6431, 8442, 9878, 6213, 2207]
				687	>>> sorted(rand_list)
				688	[769, 2207, 6213, 6431, 7953, 8442, 9828, 9878]
				689	>>> sorted(rand_list, reverse=True)
				690	[9878, 9828, 8442, 7953, 6431, 6213, 2207, 769]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	691
				692	(For a more detailed discussion of sorting, see the Sorting mini-HOWTO in the
				693	Python wiki at http://wiki.python.org/moin/HowTo/Sorting.)
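
The ``key`` argument isn't exercised above, so here is a short sketch of it using an invented word list:

```python
words = ['banana', 'Cherry', 'apple', 'fig']

# key=str.lower compares the lowercased strings: a case-insensitive sort.
by_name = sorted(words, key=str.lower)

# key=len sorts by length; reverse=True puts the longest words first.
# The sort is stable, so 'banana' stays ahead of 'Cherry' (both length 6).
by_length = sorted(words, key=len, reverse=True)
```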
				694
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	695
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	696	The ``any(iter)`` and ``all(iter)`` built-ins look at the truth values of an
				697	iterable's contents. :func:`any` returns True if any element in the iterable is
				698	a true value, and :func:`all` returns True if all of the elements are true
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	699	values:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	700
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	701	>>> any([0,1,0])
				702	True
				703	>>> any([0,0,0])
				704	False
				705	>>> any([1,1,1])
				706	True
				707	>>> all([0,1,0])
				708	False
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	709	>>> all([0,0,0])
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	710	False
				711	>>> all([1,1,1])
				712	True
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	713
				714
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	715	``zip(iterA, iterB, ...)`` takes one element from each iterable and
				716	returns them in a tuple::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	717
    zip(['a', 'b', 'c'], (1, 2, 3)) =>
      ('a', 1), ('b', 2), ('c', 3)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	720
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	721	It doesn't construct an in-memory list and exhaust all the input iterators
				722	before returning; instead tuples are constructed and returned only if they're
				723	requested. (The technical term for this behaviour is `lazy evaluation
				724	<http://en.wikipedia.org/wiki/Lazy_evaluation>`__.)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	725
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	726	This iterator is intended to be used with iterables that are all of the same
				727	length. If the iterables are of different lengths, the resulting stream will be
				728	the same length as the shortest iterable. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	729
    zip(['a', 'b'], (1, 2, 3)) =>
      ('a', 1), ('b', 2)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	732
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	733	You should avoid doing this, though, because an element may be taken from the
				734	longer iterators and discarded. This means you can't go on to use the iterators
				735	further because you risk skipping a discarded element.
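
The discarded element can be seen in a small experiment. This sketch relies on :func:`zip` pulling from its arguments left to right, which is how I understand CPython to behave; the variable names are invented.

```python
letters = iter(['a', 'b'])
numbers = iter([1, 2, 3])

# With the longer iterator first, zip() fetches 3 from ``numbers`` before
# discovering that ``letters`` is exhausted, and that 3 is thrown away.
pairs = list(zip(numbers, letters))
leftover = next(numbers, None)

# With the shorter iterator first, zip() stops before touching ``numbers``
# again, so the 3 is still available afterwards.
letters2 = iter(['a', 'b'])
numbers2 = iter([1, 2, 3])
pairs2 = list(zip(letters2, numbers2))
leftover2 = next(numbers2, None)
```

Whether an element is lost depends on the argument order, which is exactly why relying on the leftover contents is fragile.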
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	736
				737
				738	The itertools module
				739	====================
				740
				741	The :mod:`itertools` module contains a number of commonly-used iterators as well
				742	as functions for combining several iterators. This section will introduce the
				743	module's contents by showing small examples.
				744
				745	The module's functions fall into a few broad classes:
				746
				747	* Functions that create a new iterator based on an existing iterator.
				748	* Functions for treating an iterator's elements as function arguments.
				749	* Functions for selecting portions of an iterator's output.
				750	* A function for grouping an iterator's output.
				751
				752	Creating new iterators
				753	----------------------
				754
				755	``itertools.count(n)`` returns an infinite stream of integers, increasing by 1
				756	each time. You can optionally supply the starting number, which defaults to 0::
				757
    itertools.count() =>
      0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
    itertools.count(10) =>
      10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	762
				763	``itertools.cycle(iter)`` saves a copy of the contents of a provided iterable
				764	and returns a new iterator that returns its elements from first to last. The
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	765	new iterator will repeat these elements infinitely. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	766
    itertools.cycle([1,2,3,4,5]) =>
      1, 2, 3, 4, 5, 1, 2, 3, 4, 5, ...
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	769
				770	``itertools.repeat(elem, [n])`` returns the provided element ``n`` times, or
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	771	returns the element endlessly if ``n`` is not provided. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	772
    itertools.repeat('abc') =>
      abc, abc, abc, abc, abc, abc, abc, abc, abc, abc, ...
    itertools.repeat('abc', 5) =>
      abc, abc, abc, abc, abc
				777
				778	``itertools.chain(iterA, iterB, ...)`` takes an arbitrary number of iterables as
				779	input, and returns all the elements of the first iterator, then all the elements
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	780	of the second, and so on, until all of the iterables have been exhausted. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	781
    itertools.chain(['a', 'b', 'c'], (1, 2, 3)) =>
      a, b, c, 1, 2, 3
				784
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	785	``itertools.islice(iter, [start], stop, [step])`` returns a stream that's a
				786	slice of the iterator. With a single ``stop`` argument, it will return the
				787	first ``stop`` elements. If you supply a starting index, you'll get
				788	``stop-start`` elements, and if you supply a value for ``step``, elements will
				789	be skipped accordingly. Unlike Python's string and list slicing, you can't use
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	790	negative values for ``start``, ``stop``, or ``step``. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	791
    itertools.islice(range(10), 8) =>
      0, 1, 2, 3, 4, 5, 6, 7
    itertools.islice(range(10), 2, 8) =>
      2, 3, 4, 5, 6, 7
    itertools.islice(range(10), 2, 8, 2) =>
      2, 4, 6
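
The ``=>`` notation used in this section is not real Python. As a runnable sketch of the functions covered so far, ``islice()`` can be used to take a finite prefix of the infinite streams (the variable names are mine):

```python
import itertools

# islice() cuts the infinite streams from count() and cycle() down to size.
first_counts = list(itertools.islice(itertools.count(10), 5))
cycled = list(itertools.islice(itertools.cycle([1, 2, 3]), 7))

# repeat() with a count and chain() are already finite.
repeated = list(itertools.repeat('abc', 3))
chained = list(itertools.chain(['a', 'b', 'c'], (1, 2, 3)))

# start=2, stop=8, step=2
sliced = list(itertools.islice(range(10), 2, 8, 2))
```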
				798
				799	``itertools.tee(iter, [n])`` replicates an iterator; it returns ``n``
				800	independent iterators that will all return the contents of the source iterator.
				801	If you don't supply a value for ``n``, the default is 2. Replicating iterators
				802	requires saving some of the contents of the source iterator, so this can consume
				803	significant memory if the iterator is large and one of the new iterators is
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	804	consumed more than the others. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	805
    itertools.tee( itertools.count() ) =>
       iterA, iterB

    where iterA ->
       0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...

    and   iterB ->
       0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
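
A finite version of the same idea, as a sketch, shows that the two iterators advance independently:

```python
import itertools

source = iter(range(5))
iter_a, iter_b = itertools.tee(source)

# Consuming iter_a does not advance iter_b; tee() buffers the
# items that iter_b has not yet seen.
from_a = [next(iter_a), next(iter_a), next(iter_a)]
from_b = list(iter_b)
rest_of_a = list(iter_a)
```

After calling ``tee()``, it is safest not to use the original ``source`` iterator directly, since advancing it would hide elements from both copies.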
				814
				815
				816	Calling functions on elements
				817	-----------------------------
				818
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	819	The ``operator`` module contains a set of functions corresponding to Python's
				820	operators. Some examples are ``operator.add(a, b)`` (adds two values),
				821	``operator.ne(a, b)`` (same as ``a!=b``), and ``operator.attrgetter('id')``
				822	(returns a callable that fetches the ``"id"`` attribute).
				823
``itertools.starmap(func, iter)`` assumes that the iterable will return a stream
of tuples, and calls ``func()`` using these tuples as the arguments::
				826
    itertools.starmap(os.path.join,
                      [('/usr', 'bin', 'java'), ('/bin', 'python'),
                       ('/usr', 'bin', 'perl'), ('/usr', 'bin', 'ruby')])
    =>
      /usr/bin/java, /bin/python, /usr/bin/perl, /usr/bin/ruby
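
A runnable version of this example might look like the following. I've swapped in ``posixpath.join`` for ``os.path.join`` so that the result uses forward slashes on every platform; that substitution is my choice, not part of the example above.

```python
import itertools
import posixpath

path_parts = [('/usr', 'bin', 'java'), ('/bin', 'python'),
              ('/usr', 'bin', 'perl'), ('/usr', 'bin', 'ruby')]

# Each tuple is unpacked into positional arguments:
# posixpath.join('/usr', 'bin', 'java'), and so on.
paths = list(itertools.starmap(posixpath.join, path_parts))
```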
				832
				833
				834	Selecting elements
				835	------------------
				836
Another group of functions chooses a subset of an iterator's elements based on a
predicate.

``itertools.filterfalse(predicate, iter)`` is the opposite of :func:`filter`,
returning all elements for which the predicate returns false::

    itertools.filterfalse(is_even, itertools.count()) =>
      1, 3, 5, 7, 9, 11, 13, 15, ...
				845
				846	``itertools.takewhile(predicate, iter)`` returns elements for as long as the
				847	predicate returns true. Once the predicate returns false, the iterator will
				848	signal the end of its results.
				849
				850	::
				851
    def less_than_10(x):
        return (x < 10)

    itertools.takewhile(less_than_10, itertools.count()) =>
      0, 1, 2, 3, 4, 5, 6, 7, 8, 9

    itertools.takewhile(is_even, itertools.count()) =>
      0
				860
				861	``itertools.dropwhile(predicate, iter)`` discards elements while the predicate
				862	returns true, and then returns the rest of the iterable's results.
				863
				864	::
				865
    itertools.dropwhile(less_than_10, itertools.count()) =>
      10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...

    itertools.dropwhile(is_even, itertools.count()) =>
      1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
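
The three selecting functions can be tried out as follows. This is a sketch in which ``islice()`` bounds the infinite results; the variable names are invented.

```python
import itertools

def is_even(x):
    return (x % 2) == 0

def less_than_10(x):
    return x < 10

odds = list(itertools.islice(
    itertools.filterfalse(is_even, itertools.count()), 5))
taken = list(itertools.takewhile(less_than_10, itertools.count()))
dropped = list(itertools.islice(
    itertools.dropwhile(less_than_10, itertools.count()), 3))

# dropwhile() stops testing after the first false result, so later
# even numbers are passed through unchanged.
dropped_even = list(itertools.dropwhile(is_even, [0, 2, 3, 4, 5]))
```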
				871
				872
				873	Grouping elements
				874	-----------------
				875
				876	The last function I'll discuss, ``itertools.groupby(iter, key_func=None)``, is
				877	the most complicated. ``key_func(elem)`` is a function that can compute a key
				878	value for each element returned by the iterable. If you don't supply a key
				879	function, the key is simply each element itself.
				880
				881	``groupby()`` collects all the consecutive elements from the underlying iterable
				882	that have the same key value, and returns a stream of 2-tuples containing a key
				883	value and an iterator for the elements with that key.
				884
				885	::
				886
    city_list = [('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL'),
                 ('Anchorage', 'AK'), ('Nome', 'AK'),
                 ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ'),
                 ...
                ]

    def get_state(city_state):
        return city_state[1]

    itertools.groupby(city_list, get_state) =>
      ('AL', iterator-1),
      ('AK', iterator-2),
      ('AZ', iterator-3), ...

    where
    iterator-1 =>
      ('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL')
    iterator-2 =>
      ('Anchorage', 'AK'), ('Nome', 'AK')
    iterator-3 =>
      ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ')
				908
				909	``groupby()`` assumes that the underlying iterable's contents will already be
				910	sorted based on the key. Note that the returned iterators also use the
				911	underlying iterable, so you have to consume the results of iterator-1 before
				912	requesting iterator-2 and its corresponding key.
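
Both caveats show up in practice. The sketch below starts from a deliberately unsorted list (shortened and reordered by me for the illustration), sorts it by the grouping key, and consumes each group before moving on to the next one.

```python
import itertools

city_list = [('Decatur', 'AL'), ('Anchorage', 'AK'), ('Huntsville', 'AL'),
             ('Nome', 'AK'), ('Selma', 'AL')]

def get_state(city_state):
    return city_state[1]

# The input is not grouped by state, so sort it by the same key first.
city_list.sort(key=get_state)

cities_by_state = {}
for state, group in itertools.groupby(city_list, get_state):
    # Consume each group before advancing to the next key.
    cities_by_state[state] = [city for city, _ in group]
```

Without the ``sort()`` call, ``groupby()`` would start a new group every time the state changed, and the later ``'AL'`` group would overwrite the earlier one in the dictionary.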
				913
				914
				915	The functools module
				916	====================
				917
				918	The :mod:`functools` module in Python 2.5 contains some higher-order functions.
				919	A higher-order function takes one or more functions as input and returns a
				920	new function. The most useful tool in this module is the
				921	:func:`functools.partial` function.
				922
				923	For programs written in a functional style, you'll sometimes want to construct
				924	variants of existing functions that have some of the parameters filled in.
				925	Consider a Python function ``f(a, b, c)``; you may wish to create a new function
				926	``g(b, c)`` that's equivalent to ``f(1, b, c)``; you're filling in a value for
				927	one of ``f()``'s parameters. This is called "partial function application".
				928
The constructor for ``partial`` takes the arguments ``(function, arg1, arg2,
..., kwarg1=value1, kwarg2=value2)``. The resulting object is callable, so you
can just call it to invoke ``function`` with the filled-in arguments.
				932
				933	Here's a small but realistic example::
				934
    import functools

    def log(message, subsystem):
        """Write the contents of 'message' to the specified subsystem."""
        print('%s: %s' % (subsystem, message))
        ...

    server_log = functools.partial(log, subsystem='server')
    server_log('Unable to open socket')
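
Arguments can be filled in positionally as well as by keyword. A minimal sketch, using a hypothetical ``power`` function:

```python
import functools

def power(base, exponent):
    return base ** exponent

# Fill in a keyword argument: square(x) is power(x, exponent=2).
square = functools.partial(power, exponent=2)

# Fill in the first positional argument: two_to_the(n) is power(2, n).
two_to_the = functools.partial(power, 2)

# partial objects record what was filled in.
filled_in = (two_to_the.func, two_to_the.args, square.keywords)
```

Positional arguments are always filled in from the left, so to fix a later parameter such as ``exponent`` you have to pass it by keyword.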
				944
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	945	``functools.reduce(func, iter, [initial_value])`` cumulatively performs an
				946	operation on all the iterable's elements and, therefore, can't be applied to
				947	infinite iterables. (Note it is not in :mod:`builtins`, but in the
				948	:mod:`functools` module.) ``func`` must be a function that takes two elements
				949	and returns a single value. :func:`functools.reduce` takes the first two
				950	elements A and B returned by the iterator and calculates ``func(A, B)``. It
				951	then requests the third element, C, calculates ``func(func(A, B), C)``, combines
				952	this result with the fourth element returned, and continues until the iterable
				953	is exhausted. If the iterable returns no values at all, a :exc:`TypeError`
				954	exception is raised. If the initial value is supplied, it's used as a starting
				955	point and ``func(initial_value, A)`` is the first calculation. ::
				956
				957	>>> import operator, functools
				958	>>> functools.reduce(operator.concat, ['A', 'BB', 'C'])
				959	'ABBC'
				960	>>> functools.reduce(operator.concat, [])
				961	Traceback (most recent call last):
				962	...
				963	TypeError: reduce() of empty sequence with no initial value
				964	>>> functools.reduce(operator.mul, [1,2,3], 1)
				965	6
				966	>>> functools.reduce(operator.mul, [], 1)
				967	1
				968
				969	If you use :func:`operator.add` with :func:`functools.reduce`, you'll add up all the
				970	elements of the iterable. This case is so common that there's a special
				971	built-in called :func:`sum` to compute it:
				972
    >>> import functools, operator
    >>> functools.reduce(operator.add, [1,2,3,4], 0)
    10
    >>> sum([1,2,3,4])
    10
    >>> sum([])
    0
				980
				981	For many uses of :func:`functools.reduce`, though, it can be clearer to just write the
				982	obvious :keyword:`for` loop::
				983
    import functools
    import operator

    # Instead of:
    product = functools.reduce(operator.mul, [1,2,3], 1)

    # You can write:
    product = 1
    for i in [1,2,3]:
        product *= i
				992
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	993
				994	The operator module
				995	-------------------
				996
				997	The :mod:`operator` module was mentioned earlier. It contains a set of
				998	functions corresponding to Python's operators. These functions are often useful
				999	in functional-style code because they save you from writing trivial functions
				1000	that perform a single operation.
				1001
				1002	Some of the functions in this module are:
				1003
Georg Brandl	f694518	2008-02-01 11:56:49 +0000	[diff] [blame]	1004	* Math operations: ``add()``, ``sub()``, ``mul()``, ``floordiv()``, ``abs()``, ...
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1005	* Logical operations: ``not_()``, ``truth()``.
				1006	* Bitwise operations: ``and_()``, ``or_()``, ``invert()``.
				1007	* Comparisons: ``eq()``, ``ne()``, ``lt()``, ``le()``, ``gt()``, and ``ge()``.
				1008	* Object identity: ``is_()``, ``is_not()``.
				1009
				1010	Consult the operator module's documentation for a complete list.
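
As a quick sketch of how these functions stand in for trivial hand-written ones (the sample data is invented, and ``itemgetter()`` is an :mod:`operator` function not listed above):

```python
import operator

pairs = [('pear', 3), ('apple', 7), ('fig', 1)]

# itemgetter(1) replaces a hand-written ``lambda pair: pair[1]``.
by_count = sorted(pairs, key=operator.itemgetter(1))

# Arithmetic and comparison functions can be passed straight to map().
sums = list(map(operator.add, [1, 2, 3], [10, 20, 30]))
is_less = list(map(operator.lt, [1, 5, 3], [2, 4, 3]))
```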
				1011
				1012
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1013	Small functions and the lambda expression
				1014	=========================================
				1015
				1016	When writing functional-style programs, you'll often need little functions that
				1017	act as predicates or that combine elements in some way.
				1018
				1019	If there's a Python built-in or a module function that's suitable, you don't
				1020	need to define a new function at all::
				1021
				1022	stripped_lines = [line.strip() for line in lines]
				1023	existing_files = filter(os.path.exists, file_list)
				1024
If the function you need doesn't exist, you need to write it. One way to write
small functions is to use the ``lambda`` expression. ``lambda`` takes a number
of parameters and an expression combining these parameters, and creates a small
function that returns the value of the expression::

    lowercase = lambda x: x.lower()

    print_assign = lambda name, value: name + '=' + str(value)

    adder = lambda x, y: x+y
				1035
				1036	An alternative is to just use the ``def`` statement and define a function in the
				1037	usual way::
				1038
    def lowercase(x):
        return x.lower()

    def print_assign(name, value):
        return name + '=' + str(value)

    def adder(x, y):
        return x + y
				1047
				1048	Which alternative is preferable? That's a style question; my usual course is to
				1049	avoid using ``lambda``.
				1050
One reason for my preference is that ``lambda`` is quite limited in the
functions it can define. The result has to be computable as a single
expression, which means you can't have multiway ``if... elif... else``
comparisons or ``try... except`` statements. If you try to do too much in a
``lambda`` expression, you'll end up with an overly complicated expression that's
hard to read. Quick, what's the following code doing?

::

    import functools
    total = functools.reduce(lambda a, b: (0, a[1] + b[1]), items)[1]
				1062
You can figure it out, but it takes time to disentangle the expression to figure
out what's going on. Using a short nested ``def`` statement makes things a
little bit better::

    import functools
    def combine(a, b):
        return 0, a[1] + b[1]

    total = functools.reduce(combine, items)[1]
				1072
				1073	But it would be best of all if I had simply used a ``for`` loop::
				1074
    total = 0
    for a, b in items:
        total += b

Or the :func:`sum` built-in and a generator expression::

    total = sum(b for a, b in items)
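
To check that these versions really do agree, here is a sketch that runs three of them on some invented ``items`` (pairs of a name and an amount):

```python
import functools

# Hypothetical data: (name, amount) pairs.
items = [('a', 1), ('b', 2), ('c', 3)]

# The reduce() version carries the running total in slot 1 of a tuple.
total_reduce = functools.reduce(lambda a, b: (0, a[1] + b[1]), items)[1]

total_loop = 0
for name, amount in items:
    total_loop += amount

total_sum = sum(amount for name, amount in items)
```

The ``reduce()`` version also raises :exc:`TypeError` when ``items`` is empty, while the loop and ``sum()`` versions simply give 0.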
				1082
				1083	Many uses of :func:`functools.reduce` are clearer when written as ``for`` loops.
				1084
				1085	Fredrik Lundh once suggested the following set of rules for refactoring uses of
				1086	``lambda``:
				1087
				1088	1) Write a lambda function.
				1089	2) Write a comment explaining what the heck that lambda does.
				1090	3) Study the comment for a while, and think of a name that captures the essence
				1091	of the comment.
				1092	4) Convert the lambda to a def statement, using that name.
				1093	5) Remove the comment.
				1094
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	1095	I really like these rules, but you're free to disagree
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1096	about whether this lambda-free style is better.
				1097
				1098
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1099	Revision History and Acknowledgements
				1100	=====================================
				1101
				1102	The author would like to thank the following people for offering suggestions,
				1103	corrections and assistance with various drafts of this article: Ian Bicking,
				1104	Nick Coghlan, Nick Efford, Raymond Hettinger, Jim Jewett, Mike Krell, Leandro
				1105	Lameiro, Jussi Salmela, Collin Winter, Blake Winton.
				1106
				1107	Version 0.1: posted June 30 2006.
				1108
				1109	Version 0.11: posted July 1 2006. Typo fixes.
				1110
				1111	Version 0.2: posted July 10 2006. Merged genexp and listcomp sections into one.
				1112	Typo fixes.
				1113
				1114	Version 0.21: Added more references suggested on the tutor mailing list.
				1115
				1116	Version 0.30: Adds a section on the ``functional`` module written by Collin
				1117	Winter; adds short section on the operator module; a few other edits.
				1118
				1119
References
==========

General
-------

Structure and Interpretation of Computer Programs, by Harold Abelson and
Gerald Jay Sussman with Julie Sussman. Full text at
http://mitpress.mit.edu/sicp/. In this classic textbook of computer science,
chapters 2 and 3 discuss the use of sequences and streams to organize the data
flow inside a program. The book uses Scheme for its examples, but many of the
design approaches described in these chapters are applicable to functional-style
Python code.

http://www.defmacro.org/ramblings/fp.html: A general introduction to functional
programming that uses Java examples and has a lengthy historical introduction.

http://en.wikipedia.org/wiki/Functional_programming: General Wikipedia entry
describing functional programming.

http://en.wikipedia.org/wiki/Coroutine: Entry for coroutines.

http://en.wikipedia.org/wiki/Currying: Entry for the concept of currying.

Python-specific
---------------

http://gnosis.cx/TPiP/: The first chapter of David Mertz's book
:title-reference:`Text Processing in Python` discusses functional programming
for text processing, in the section titled "Utilizing Higher-Order Functions in
Text Processing".

Mertz also wrote a three-part series of articles on functional programming
for IBM's DeveloperWorks site; see
`part 1 <http://www.ibm.com/developerworks/linux/library/l-prog/index.html>`__,
`part 2 <http://www.ibm.com/developerworks/linux/library/l-prog2/index.html>`__, and
`part 3 <http://www.ibm.com/developerworks/linux/library/l-prog3/index.html>`__.


Python documentation
--------------------

Documentation for the :mod:`itertools` module.

Documentation for the :mod:`operator` module.

:pep:`289`: "Generator Expressions"

:pep:`342`: "Coroutines via Enhanced Generators" describes the generator
features added in Python 2.5.

.. comment

    Topics to place
    -----------------------------

    XXX os.walk()

    XXX Need a large example.

    But will an example add much? I'll post a first draft and see
    what the comments say.

.. comment

    Original outline:
    Introduction
        Idea of FP
            Programs built out of functions
            Functions are strictly input-output, no internal state
            Opposed to OO programming, where objects have state

        Why FP?
            Formal provability
                Assignment is difficult to reason about
                Not very relevant to Python
            Modularity
                Small functions that do one thing
            Debuggability:
                Easy to test due to lack of state
                Easy to verify output from intermediate steps
            Composability
                You assemble a toolbox of functions that can be mixed

    Tackling a problem
        Need a significant example

    Iterators
    Generators
    The itertools module
    List comprehensions
    Small functions and the lambda statement
    Built-in functions
        map
        filter

.. comment

    Handy little function for printing part of an iterator -- used
    while writing this document.

    import itertools

    def print_iter(it):
        # Print at most the first 10 elements, separated by commas.
        # islice() returns an iterator, which can't be indexed or sliced,
        # so the elements are collected into a list of strings first.
        elems = [str(elem) for elem in itertools.islice(it, 10)]
        print(', '.join(elems))

