Blame - Doc/howto/functional.rst - platform/external/python/cpython3

blob: f8f2aac70f9b063aad79ebc3f35d31dd09c1184a [file] [log] [blame]

Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1	********************************
				2	Functional Programming HOWTO
				3	********************************
				4
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	5	:Author: A. M. Kuchling
Andrew Kuchling	2a9c8e8	2013-05-20 10:14:53 -0400	[diff] [blame]	6	:Release: 0.32
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	7
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	8	In this document, we'll take a tour of Python's features suitable for
				9	implementing programs in a functional style. After an introduction to the
				10	concepts of functional programming, we'll look at language features such as
Georg Brandl	9afde1c	2007-11-01 20:32:30 +0000	[diff] [blame]	11	:term:`iterator`\s and :term:`generator`\s and relevant library modules such as
				12	:mod:`itertools` and :mod:`functools`.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	13
				14
				15	Introduction
				16	============
				17
Andrew Kuchling	2a9c8e8	2013-05-20 10:14:53 -0400	[diff] [blame]	18	This section explains the basic concept of functional programming; if
				19	you're just interested in learning about Python language features,
				20	skip to the next section on :ref:`functional-howto-iterators`.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	21
				22	Programming languages support decomposing problems in several different ways:
				23
				24	* Most programming languages are procedural: programs are lists of
				25	instructions that tell the computer what to do with the program's input. C,
				26	Pascal, and even Unix shells are procedural languages.
				27
				28	* In declarative languages, you write a specification that describes the
				29	problem to be solved, and the language implementation figures out how to
				30	perform the computation efficiently. SQL is the declarative language you're
				31	most likely to be familiar with; a SQL query describes the data set you want
				32	to retrieve, and the SQL engine decides whether to scan tables or use indexes,
				33	which subclauses should be performed first, etc.
				34
				35	* Object-oriented programs manipulate collections of objects. Objects have
				36	internal state and support methods that query or modify this internal state in
				37	some way. Smalltalk and Java are object-oriented languages. C++ and Python
				38	are languages that support object-oriented programming, but don't force the
				39	use of object-oriented features.
				40
				41	* Functional programming decomposes a problem into a set of functions.
				42	Ideally, functions only take inputs and produce outputs, and don't have any
				43	internal state that affects the output produced for a given input. Well-known
				44	functional languages include the ML family (Standard ML, OCaml, and other
				45	variants) and Haskell.
				46
Christian Heimes	0449f63	2007-12-15 01:27:15 +0000	[diff] [blame]	47	The designers of some computer languages choose to emphasize one
				48	particular approach to programming. This often makes it difficult to
				49	write programs that use a different approach. Other languages are
				50	multi-paradigm languages that support several different approaches.
				51	Lisp, C++, and Python are multi-paradigm; you can write programs or
				52	libraries that are largely procedural, object-oriented, or functional
				53	in all of these languages. In a large program, different sections
				54	might be written using different approaches; the GUI might be
				55	object-oriented while the processing logic is procedural or
				56	functional, for example.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	57
				58	In a functional program, input flows through a set of functions. Each function
Christian Heimes	0449f63	2007-12-15 01:27:15 +0000	[diff] [blame]	59	operates on its input and produces some output. Functional style discourages
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	60	functions with side effects that modify internal state or make other changes
				61	that aren't visible in the function's return value. Functions that have no side
				62	effects at all are called purely functional. Avoiding side effects means
				63	not using data structures that get updated as a program runs; every function's
				64	output must only depend on its input.
				65
				66	Some languages are very strict about purity and don't even have assignment
				67	statements such as ``a=3`` or ``c = a + b``, but it's difficult to avoid all
				68	side effects. Printing to the screen or writing to a disk file are side
Georg Brandl	0df7979	2008-10-04 18:33:26 +0000	[diff] [blame]	69	effects, for example. For example, in Python a call to the :func:`print` or
				70	:func:`time.sleep` function both return no useful value; they're only called for
				71	their side effects of sending some text to the screen or pausing execution for a
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	72	second.
				73
				74	Python programs written in functional style usually won't go to the extreme of
				75	avoiding all I/O or all assignments; instead, they'll provide a
				76	functional-appearing interface but will use non-functional features internally.
				77	For example, the implementation of a function will still use assignments to
				78	local variables, but won't modify global variables or have other side effects.
				79
				80	Functional programming can be considered the opposite of object-oriented
				81	programming. Objects are little capsules containing some internal state along
				82	with a collection of method calls that let you modify this state, and programs
				83	consist of making the right set of state changes. Functional programming wants
				84	to avoid state changes as much as possible and works with data flowing between
				85	functions. In Python you might combine the two approaches by writing functions
				86	that take and return instances representing objects in your application (e-mail
				87	messages, transactions, etc.).
				88
				89	Functional design may seem like an odd constraint to work under. Why should you
				90	avoid objects and side effects? There are theoretical and practical advantages
				91	to the functional style:
				92
				93	* Formal provability.
				94	* Modularity.
				95	* Composability.
				96	* Ease of debugging and testing.
				97
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	98
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	99	Formal provability
				100	------------------
				101
				102	A theoretical benefit is that it's easier to construct a mathematical proof that
				103	a functional program is correct.
				104
				105	For a long time researchers have been interested in finding ways to
				106	mathematically prove programs correct. This is different from testing a program
				107	on numerous inputs and concluding that its output is usually correct, or reading
				108	a program's source code and concluding that the code looks right; the goal is
				109	instead a rigorous proof that a program produces the right result for all
				110	possible inputs.
				111
				112	The technique used to prove programs correct is to write down invariants,
				113	properties of the input data and of the program's variables that are always
				114	true. For each line of code, you then show that if invariants X and Y are true
				115	before the line is executed, the slightly different invariants X' and Y' are
				116	true after the line is executed. This continues until you reach the end of
				117	the program, at which point the invariants should match the desired conditions
				118	on the program's output.
				119
				120	Functional programming's avoidance of assignments arose because assignments are
				121	difficult to handle with this technique; assignments can break invariants that
				122	were true before the assignment without producing any new invariants that can be
				123	propagated onward.
				124
				125	Unfortunately, proving programs correct is largely impractical and not relevant
				126	to Python software. Even trivial programs require proofs that are several pages
				127	long; the proof of correctness for a moderately complicated program would be
				128	enormous, and few or none of the programs you use daily (the Python interpreter,
				129	your XML parser, your web browser) could be proven correct. Even if you wrote
				130	down or generated a proof, there would then be the question of verifying the
				131	proof; maybe there's an error in it, and you wrongly believe you've proved the
				132	program correct.
				133
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	134
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	135	Modularity
				136	----------
				137
				138	A more practical benefit of functional programming is that it forces you to
				139	break apart your problem into small pieces. Programs are more modular as a
				140	result. It's easier to specify and write a small function that does one thing
				141	than a large function that performs a complicated transformation. Small
				142	functions are also easier to read and to check for errors.
				143
				144
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	145	Ease of debugging and testing
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	146	-----------------------------
				147
				148	Testing and debugging a functional-style program is easier.
				149
				150	Debugging is simplified because functions are generally small and clearly
				151	specified. When a program doesn't work, each function is an interface point
				152	where you can check that the data are correct. You can look at the intermediate
				153	inputs and outputs to quickly isolate the function that's responsible for a bug.
				154
				155	Testing is easier because each function is a potential subject for a unit test.
				156	Functions don't depend on system state that needs to be replicated before
				157	running a test; instead you only have to synthesize the right input and then
				158	check that the output matches expectations.
				159
				160
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	161	Composability
				162	-------------
				163
				164	As you work on a functional-style program, you'll write a number of functions
				165	with varying inputs and outputs. Some of these functions will be unavoidably
				166	specialized to a particular application, but others will be useful in a wide
				167	variety of programs. For example, a function that takes a directory path and
				168	returns all the XML files in the directory, or a function that takes a filename
				169	and returns its contents, can be applied to many different situations.
				170
				171	Over time you'll form a personal library of utilities. Often you'll assemble
				172	new programs by arranging existing functions in a new configuration and writing
				173	a few functions specialized for the current task.
				174
				175
Andrew Kuchling	2a9c8e8	2013-05-20 10:14:53 -0400	[diff] [blame]	176	.. _functional-howto-iterators:
				177
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	178	Iterators
				179	=========
				180
				181	I'll start by looking at a Python language feature that's an important
				182	foundation for writing functional-style programs: iterators.
				183
				184	An iterator is an object representing a stream of data; this object returns the
				185	data one element at a time. A Python iterator must support a method called
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	186	:meth:`~iterator.__next__` that takes no arguments and always returns the next
				187	element of the stream. If there are no more elements in the stream,
				188	:meth:`~iterator.__next__` must raise the :exc:`StopIteration` exception.
				189	Iterators don't have to be finite, though; it's perfectly reasonable to write
				190	an iterator that produces an infinite stream of data.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	191
				192	The built-in :func:`iter` function takes an arbitrary object and tries to return
				193	an iterator that will return the object's contents or elements, raising
				194	:exc:`TypeError` if the object doesn't support iteration. Several of Python's
				195	built-in data types support iteration, the most common being lists and
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	196	dictionaries. An object is called :term:`iterable` if you can get an iterator
				197	for it.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	198
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	199	You can experiment with the iteration interface manually:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	200
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	201	>>> L = [1, 2, 3]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	202	>>> it = iter(L)
Ezio Melotti	35cbf16	2012-10-12 13:24:19 +0300	[diff] [blame]	203	>>> it #doctest: +ELLIPSIS
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	204	<...iterator object at ...>
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	205	>>> it.__next__() # same as next(it)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	206	1
Benjamin Peterson	e7c78b2	2008-07-03 20:28:26 +0000	[diff] [blame]	207	>>> next(it)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	208	2
Benjamin Peterson	e7c78b2	2008-07-03 20:28:26 +0000	[diff] [blame]	209	>>> next(it)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	210	3
Benjamin Peterson	e7c78b2	2008-07-03 20:28:26 +0000	[diff] [blame]	211	>>> next(it)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	212	Traceback (most recent call last):
UltimateCoder	8856940	2017-05-03 22:16:45 +0530	[diff] [blame]	213	File "<stdin>", line 1, in <module>
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	214	StopIteration
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	215	>>>
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	216
				217	Python expects iterable objects in several different contexts, the most
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	218	important being the :keyword:`for` statement. In the statement ``for X in Y``,
				219	Y must be an iterator or some object for which :func:`iter` can create an
				220	iterator. These two statements are equivalent::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	221
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	222
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	223	for i in iter(obj):
Neal Norwitz	752abd0	2008-05-13 04:55:24 +0000	[diff] [blame]	224	print(i)
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	225
				226	for i in obj:
Neal Norwitz	752abd0	2008-05-13 04:55:24 +0000	[diff] [blame]	227	print(i)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	228
				229	Iterators can be materialized as lists or tuples by using the :func:`list` or
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	230	:func:`tuple` constructor functions:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	231
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	232	>>> L = [1, 2, 3]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	233	>>> iterator = iter(L)
				234	>>> t = tuple(iterator)
				235	>>> t
				236	(1, 2, 3)
				237
				238	Sequence unpacking also supports iterators: if you know an iterator will return
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	239	N elements, you can unpack them into an N-tuple:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	240
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	241	>>> L = [1, 2, 3]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	242	>>> iterator = iter(L)
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	243	>>> a, b, c = iterator
				244	>>> a, b, c
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	245	(1, 2, 3)
				246
				247	Built-in functions such as :func:`max` and :func:`min` can take a single
				248	iterator argument and will return the largest or smallest element. The ``"in"``
				249	and ``"not in"`` operators also support iterators: ``X in iterator`` is true if
				250	X is found in the stream returned by the iterator. You'll run into obvious
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	251	problems if the iterator is infinite; :func:`max`, :func:`min`
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	252	will never return, and if the element X never appears in the stream, the
Sandro Tosi	dd7c552	2012-08-15 21:37:35 +0200	[diff] [blame]	253	``"in"`` and ``"not in"`` operators won't return either.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	254
				255	Note that you can only go forward in an iterator; there's no way to get the
				256	previous element, reset the iterator, or make a copy of it. Iterator objects
				257	can optionally provide these additional capabilities, but the iterator protocol
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	258	only specifies the :meth:`~iterator.__next__` method. Functions may therefore
				259	consume all of the iterator's output, and if you need to do something different
				260	with the same stream, you'll have to create a new iterator.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	261
				262
				263
				264	Data Types That Support Iterators
				265	---------------------------------
				266
				267	We've already seen how lists and tuples support iterators. In fact, any Python
				268	sequence type, such as strings, will automatically support creation of an
				269	iterator.
				270
				271	Calling :func:`iter` on a dictionary returns an iterator that will loop over the
Ezio Melotti	35cbf16	2012-10-12 13:24:19 +0300	[diff] [blame]	272	dictionary's keys::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	273
				274	>>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
				275	... 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
Stig Johan Berggren	5e5bbbe	2018-07-11 11:53:07 +0200	[diff] [blame]	276	>>> for key in m:
Georg Brandl	6911e3c	2007-09-04 07:15:32 +0000	[diff] [blame]	277	... print(key, m[key])
Stig Johan Berggren	5e5bbbe	2018-07-11 11:53:07 +0200	[diff] [blame]	278	Jan 1
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	279	Feb 2
Stig Johan Berggren	5e5bbbe	2018-07-11 11:53:07 +0200	[diff] [blame]	280	Mar 3
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	281	Apr 4
Stig Johan Berggren	5e5bbbe	2018-07-11 11:53:07 +0200	[diff] [blame]	282	May 5
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	283	Jun 6
				284	Jul 7
Stig Johan Berggren	5e5bbbe	2018-07-11 11:53:07 +0200	[diff] [blame]	285	Aug 8
				286	Sep 9
				287	Oct 10
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	288	Nov 11
				289	Dec 12
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	290
Stig Johan Berggren	5e5bbbe	2018-07-11 11:53:07 +0200	[diff] [blame]	291	Note that starting with Python 3.7, dictionary iteration order is guaranteed
				292	to be the same as the insertion order. In earlier versions, the behaviour was
				293	unspecified and could vary between implementations.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	294
Fred Drake	2e74878	2007-09-04 17:33:11 +0000	[diff] [blame]	295	Applying :func:`iter` to a dictionary always loops over the keys, but
				296	dictionaries have methods that return other iterators. If you want to iterate
				297	over values or key/value pairs, you can explicitly call the
Chris Jerdonek	006d907	2012-10-12 20:28:26 -0700	[diff] [blame]	298	:meth:`~dict.values` or :meth:`~dict.items` methods to get an appropriate
				299	iterator.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	300
				301	The :func:`dict` constructor can accept an iterator that returns a finite stream
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	302	of ``(key, value)`` tuples:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	303
				304	>>> L = [('Italy', 'Rome'), ('France', 'Paris'), ('US', 'Washington DC')]
Stig Johan Berggren	5e5bbbe	2018-07-11 11:53:07 +0200	[diff] [blame]	305	>>> dict(iter(L))
				306	{'Italy': 'Rome', 'France': 'Paris', 'US': 'Washington DC'}
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	307
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	308	Files also support iteration by calling the :meth:`~io.TextIOBase.readline`
				309	method until there are no more lines in the file. This means you can read each
				310	line of a file like this::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	311
				312	for line in file:
				313	# do something for each line
				314	...
				315
				316	Sets can take their contents from an iterable and let you iterate over the set's
				317	elements::
				318
Georg Brandl	f694518	2008-02-01 11:56:49 +0000	[diff] [blame]	319	S = {2, 3, 5, 7, 11, 13}
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	320	for i in S:
Georg Brandl	6911e3c	2007-09-04 07:15:32 +0000	[diff] [blame]	321	print(i)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	322
				323
				324
				325	Generator expressions and list comprehensions
				326	=============================================
				327
				328	Two common operations on an iterator's output are 1) performing some operation
				329	for every element, 2) selecting a subset of elements that meet some condition.
				330	For example, given a list of strings, you might want to strip off trailing
				331	whitespace from each line or extract all the strings containing a given
				332	substring.
				333
				334	List comprehensions and generator expressions (short form: "listcomps" and
				335	"genexps") are a concise notation for such operations, borrowed from the
Georg Brandl	5d94134	2016-02-26 19:37:12 +0100	[diff] [blame]	336	functional programming language Haskell (https://www.haskell.org/). You can strip
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	337	all the whitespace from a stream of strings with the following code::
				338
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	339	line_list = [' line 1\n', 'line 2 \n', ...]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	340
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	341	# Generator expression -- returns iterator
				342	stripped_iter = (line.strip() for line in line_list)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	343
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	344	# List comprehension -- returns list
				345	stripped_list = [line.strip() for line in line_list]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	346
				347	You can select only certain elements by adding an ``"if"`` condition::
				348
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	349	stripped_list = [line.strip() for line in line_list
				350	if line != ""]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	351
				352	With a list comprehension, you get back a Python list; ``stripped_list`` is a
				353	list containing the resulting lines, not an iterator. Generator expressions
				354	return an iterator that computes the values as necessary, not needing to
				355	materialize all the values at once. This means that list comprehensions aren't
				356	useful if you're working with iterators that return an infinite stream or a very
				357	large amount of data. Generator expressions are preferable in these situations.
				358
				359	Generator expressions are surrounded by parentheses ("()") and list
				360	comprehensions are surrounded by square brackets ("[]"). Generator expressions
				361	have the form::
				362
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	363	( expression for expr in sequence1
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	364	if condition1
				365	for expr2 in sequence2
				366	if condition2
				367	for expr3 in sequence3 ...
				368	if condition3
				369	for exprN in sequenceN
				370	if conditionN )
				371
				372	Again, for a list comprehension only the outside brackets are different (square
				373	brackets instead of parentheses).
				374
				375	The elements of the generated output will be the successive values of
				376	``expression``. The ``if`` clauses are all optional; if present, ``expression``
				377	is only evaluated and added to the result when ``condition`` is true.
				378
				379	Generator expressions always have to be written inside parentheses, but the
				380	parentheses signalling a function call also count. If you want to create an
				381	iterator that will be immediately passed to a function you can write::
				382
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	383	obj_total = sum(obj.count for obj in list_all_objects())
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	384
				385	The ``for...in`` clauses contain the sequences to be iterated over. The
				386	sequences do not have to be the same length, because they are iterated over from
				387	left to right, not in parallel. For each element in ``sequence1``,
				388	``sequence2`` is looped over from the beginning. ``sequence3`` is then looped
				389	over for each resulting pair of elements from ``sequence1`` and ``sequence2``.
				390
				391	To put it another way, a list comprehension or generator expression is
				392	equivalent to the following Python code::
				393
				394	for expr1 in sequence1:
				395	if not (condition1):
				396	continue # Skip this element
				397	for expr2 in sequence2:
				398	if not (condition2):
Serhiy Storchaka	dba9039	2016-05-10 12:01:23 +0300	[diff] [blame]	399	continue # Skip this element
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	400	...
				401	for exprN in sequenceN:
Serhiy Storchaka	dba9039	2016-05-10 12:01:23 +0300	[diff] [blame]	402	if not (conditionN):
				403	continue # Skip this element
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	404
Serhiy Storchaka	dba9039	2016-05-10 12:01:23 +0300	[diff] [blame]	405	# Output the value of
				406	# the expression.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	407
				408	This means that when there are multiple ``for...in`` clauses but no ``if``
				409	clauses, the length of the resulting output will be equal to the product of the
				410	lengths of all the sequences. If you have two lists of length 3, the output
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	411	list is 9 elements long:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	412
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	413	>>> seq1 = 'abc'
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	414	>>> seq2 = (1, 2, 3)
Ezio Melotti	35cbf16	2012-10-12 13:24:19 +0300	[diff] [blame]	415	>>> [(x, y) for x in seq1 for y in seq2] #doctest: +NORMALIZE_WHITESPACE
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	416	[('a', 1), ('a', 2), ('a', 3),
				417	('b', 1), ('b', 2), ('b', 3),
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	418	('c', 1), ('c', 2), ('c', 3)]
				419
				420	To avoid introducing an ambiguity into Python's grammar, if ``expression`` is
				421	creating a tuple, it must be surrounded with parentheses. The first list
				422	comprehension below is a syntax error, while the second one is correct::
				423
				424	# Syntax error
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	425	[x, y for x in seq1 for y in seq2]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	426	# Correct
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	427	[(x, y) for x in seq1 for y in seq2]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	428
				429
				430	Generators
				431	==========
				432
				433	Generators are a special class of functions that simplify the task of writing
				434	iterators. Regular functions compute a value and return it, but generators
				435	return an iterator that returns a stream of values.
				436
				437	You're doubtless familiar with how regular function calls work in Python or C.
				438	When you call a function, it gets a private namespace where its local variables
				439	are created. When the function reaches a ``return`` statement, the local
				440	variables are destroyed and the value is returned to the caller. A later call
				441	to the same function creates a new private namespace and a fresh set of local
				442	variables. But, what if the local variables weren't thrown away on exiting a
				443	function? What if you could later resume the function where it left off? This
				444	is what generators provide; they can be thought of as resumable functions.
				445
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	446	Here's the simplest example of a generator function:
				447
Ezio Melotti	35cbf16	2012-10-12 13:24:19 +0300	[diff] [blame]	448	>>> def generate_ints(N):
				449	... for i in range(N):
				450	... yield i
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	451
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	452	Any function containing a :keyword:`yield` keyword is a generator function;
				453	this is detected by Python's :term:`bytecode` compiler which compiles the
				454	function specially as a result.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	455
				456	When you call a generator function, it doesn't return a single value; instead it
				457	returns a generator object that supports the iterator protocol. On executing
				458	the ``yield`` expression, the generator outputs the value of ``i``, similar to a
				459	``return`` statement. The big difference between ``yield`` and a ``return``
				460	statement is that on reaching a ``yield`` the generator's state of execution is
				461	suspended and local variables are preserved. On the next call to the
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	462	generator's :meth:`~generator.__next__` method, the function will resume
				463	executing.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	464
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	465	Here's a sample usage of the ``generate_ints()`` generator:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	466
				467	>>> gen = generate_ints(3)
Ezio Melotti	35cbf16	2012-10-12 13:24:19 +0300	[diff] [blame]	468	>>> gen #doctest: +ELLIPSIS
Benjamin Peterson	25c95f1	2009-05-08 20:42:26 +0000	[diff] [blame]	469	<generator object generate_ints at ...>
Benjamin Peterson	e7c78b2	2008-07-03 20:28:26 +0000	[diff] [blame]	470	>>> next(gen)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	471	0
Benjamin Peterson	e7c78b2	2008-07-03 20:28:26 +0000	[diff] [blame]	472	>>> next(gen)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	473	1
Benjamin Peterson	e7c78b2	2008-07-03 20:28:26 +0000	[diff] [blame]	474	>>> next(gen)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	475	2
Benjamin Peterson	e7c78b2	2008-07-03 20:28:26 +0000	[diff] [blame]	476	>>> next(gen)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	477	Traceback (most recent call last):
UltimateCoder	8856940	2017-05-03 22:16:45 +0530	[diff] [blame]	478	File "stdin", line 1, in <module>
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	479	File "stdin", line 2, in generate_ints
				480	StopIteration
				481
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	482	You could equally write ``for i in generate_ints(5)``, or ``a, b, c =
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	483	generate_ints(3)``.
				484
Yury Selivanov	8170e8c	2015-05-09 11:44:30 -0400	[diff] [blame]	485	Inside a generator function, ``return value`` causes ``StopIteration(value)``
				486	to be raised from the :meth:`~generator.__next__` method. Once this happens, or
				487	the bottom of the function is reached, the procession of values ends and the
				488	generator cannot yield any further values.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	489
				490	You could achieve the effect of generators manually by writing your own class
				491	and storing all the local variables of the generator as instance variables. For
				492	example, returning a list of integers could be done by setting ``self.count`` to
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	493	0, and having the :meth:`~iterator.__next__` method increment ``self.count`` and
				494	return it.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	495	However, for a moderately complicated generator, writing a corresponding class
				496	can be much messier.
				497
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	498	The test suite included with Python's library,
				499	:source:`Lib/test/test_generators.py`, contains
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	500	a number of more interesting examples. Here's one generator that implements an
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	501	in-order traversal of a tree using generators recursively. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	502
				503	# A recursive generator that generates Tree leaves in in-order.
				504	def inorder(t):
				505	if t:
				506	for x in inorder(t.left):
				507	yield x
				508
				509	yield t.label
				510
				511	for x in inorder(t.right):
				512	yield x
				513
				514	Two other examples in ``test_generators.py`` produce solutions for the N-Queens
				515	problem (placing N queens on an NxN chess board so that no queen threatens
				516	another) and the Knight's Tour (finding a route that takes a knight to every
				517	square of an NxN chessboard without visiting any square twice).
				518
				519
				520
				521	Passing values into a generator
				522	-------------------------------
				523
				524	In Python 2.4 and earlier, generators only produced output. Once a generator's
				525	code was invoked to create an iterator, there was no way to pass any new
				526	information into the function when its execution is resumed. You could hack
				527	together this ability by making the generator look at a global variable or by
				528	passing in some mutable object that callers then modify, but these approaches
				529	are messy.
				530
				531	In Python 2.5 there's a simple way to pass values into a generator.
				532	:keyword:`yield` became an expression, returning a value that can be assigned to
				533	a variable or otherwise operated on::
				534
				535	val = (yield i)
				536
				537	I recommend that you always put parentheses around a ``yield`` expression
				538	when you're doing something with the returned value, as in the above example.
				539	The parentheses aren't always necessary, but it's easier to always add them
				540	instead of having to remember when they're needed.
				541
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	542	(:pep:`342` explains the exact rules, which are that a ``yield``-expression must
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	543	always be parenthesized except when it occurs at the top-level expression on the
				544	right-hand side of an assignment. This means you can write ``val = yield i``
				545	but have to use parentheses when there's an operation, as in ``val = (yield i)
				546	+ 12``.)
				547
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	548	Values are sent into a generator by calling its :meth:`send(value)
				549	<generator.send>` method. This method resumes the generator's code and the
				550	``yield`` expression returns the specified value. If the regular
				551	:meth:`~generator.__next__` method is called, the ``yield`` returns ``None``.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	552
				553	Here's a simple counter that increments by 1 and allows changing the value of
				554	the internal counter.
				555
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	556	.. testcode::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	557
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	558	def counter(maximum):
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	559	i = 0
				560	while i < maximum:
				561	val = (yield i)
				562	# If value provided, change counter
				563	if val is not None:
				564	i = val
				565	else:
				566	i += 1
				567
				568	And here's an example of changing the counter:
				569
Ezio Melotti	35cbf16	2012-10-12 13:24:19 +0300	[diff] [blame]	570	>>> it = counter(10) #doctest: +SKIP
				571	>>> next(it) #doctest: +SKIP
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	572	0
Ezio Melotti	35cbf16	2012-10-12 13:24:19 +0300	[diff] [blame]	573	>>> next(it) #doctest: +SKIP
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	574	1
Ezio Melotti	35cbf16	2012-10-12 13:24:19 +0300	[diff] [blame]	575	>>> it.send(8) #doctest: +SKIP
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	576	8
Ezio Melotti	35cbf16	2012-10-12 13:24:19 +0300	[diff] [blame]	577	>>> next(it) #doctest: +SKIP
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	578	9
Ezio Melotti	35cbf16	2012-10-12 13:24:19 +0300	[diff] [blame]	579	>>> next(it) #doctest: +SKIP
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	580	Traceback (most recent call last):
UltimateCoder	8856940	2017-05-03 22:16:45 +0530	[diff] [blame]	581	File "t.py", line 15, in <module>
Georg Brandl	6911e3c	2007-09-04 07:15:32 +0000	[diff] [blame]	582	it.next()
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	583	StopIteration
				584
				585	Because ``yield`` will often be returning ``None``, you should always check for
				586	this case. Don't just use its value in expressions unless you're sure that the
Zachary Ware	0aecc18	2014-06-16 11:13:01 -0500	[diff] [blame]	587	:meth:`~generator.send` method will be the only method used to resume your
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	588	generator function.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	589
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	590	In addition to :meth:`~generator.send`, there are two other methods on
				591	generators:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	592
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	593	* :meth:`throw(type, value=None, traceback=None) <generator.throw>` is used to
				594	raise an exception inside the generator; the exception is raised by the
				595	``yield`` expression where the generator's execution is paused.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	596
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	597	* :meth:`~generator.close` raises a :exc:`GeneratorExit` exception inside the
				598	generator to terminate the iteration. On receiving this exception, the
				599	generator's code must either raise :exc:`GeneratorExit` or
				600	:exc:`StopIteration`; catching the exception and doing anything else is
				601	illegal and will trigger a :exc:`RuntimeError`. :meth:`~generator.close`
				602	will also be called by Python's garbage collector when the generator is
				603	garbage-collected.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	604
				605	If you need to run cleanup code when a :exc:`GeneratorExit` occurs, I suggest
				606	using a ``try: ... finally:`` suite instead of catching :exc:`GeneratorExit`.
				607
				608	The cumulative effect of these changes is to turn generators from one-way
				609	producers of information into both producers and consumers.
				610
				611	Generators also become coroutines, a more generalized form of subroutines.
				612	Subroutines are entered at one point and exited at another point (the top of the
				613	function, and a ``return`` statement), but coroutines can be entered, exited,
				614	and resumed at many different points (the ``yield`` statements).
				615
				616
				617	Built-in functions
				618	==================
				619
				620	Let's look in more detail at built-in functions often used with iterators.
				621
Georg Brandl	f694518	2008-02-01 11:56:49 +0000	[diff] [blame]	622	Two of Python's built-in functions, :func:`map` and :func:`filter` duplicate the
				623	features of generator expressions:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	624
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	625	:func:`map(f, iterA, iterB, ...) <map>` returns an iterator over the sequence
Georg Brandl	f694518	2008-02-01 11:56:49 +0000	[diff] [blame]	626	``f(iterA[0], iterB[0]), f(iterA[1], iterB[1]), f(iterA[2], iterB[2]), ...``.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	627
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	628	>>> def upper(s):
				629	... return s.upper()
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	630
Georg Brandl	a3deea1	2008-12-15 08:29:32 +0000	[diff] [blame]	631	>>> list(map(upper, ['sentence', 'fragment']))
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	632	['SENTENCE', 'FRAGMENT']
				633	>>> [upper(s) for s in ['sentence', 'fragment']]
				634	['SENTENCE', 'FRAGMENT']
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	635
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	636	You can of course achieve the same effect with a list comprehension.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	637
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	638	:func:`filter(predicate, iter) <filter>` returns an iterator over all the
				639	sequence elements that meet a certain condition, and is similarly duplicated by
				640	list comprehensions. A predicate is a function that returns the truth
				641	value of some condition; for use with :func:`filter`, the predicate must take a
				642	single value.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	643
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	644	>>> def is_even(x):
				645	... return (x % 2) == 0
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	646
Georg Brandl	a3deea1	2008-12-15 08:29:32 +0000	[diff] [blame]	647	>>> list(filter(is_even, range(10)))
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	648	[0, 2, 4, 6, 8]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	649
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	650
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	651	This can also be written as a list comprehension:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	652
Georg Brandl	f694518	2008-02-01 11:56:49 +0000	[diff] [blame]	653	>>> list(x for x in range(10) if is_even(x))
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	654	[0, 2, 4, 6, 8]
				655
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	656
csabella	9be4ff3	2017-06-04 13:39:21 -0400	[diff] [blame]	657	:func:`enumerate(iter, start=0) <enumerate>` counts off the elements in the
				658	iterable returning 2-tuples containing the count (from start) and
				659	each element. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	660
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	661	>>> for item in enumerate(['subject', 'verb', 'object']):
Neal Norwitz	752abd0	2008-05-13 04:55:24 +0000	[diff] [blame]	662	... print(item)
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	663	(0, 'subject')
				664	(1, 'verb')
				665	(2, 'object')
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	666
				667	:func:`enumerate` is often used when looping through a list and recording the
				668	indexes at which certain conditions are met::
				669
				670	f = open('data.txt', 'r')
				671	for i, line in enumerate(f):
				672	if line.strip() == '':
Georg Brandl	6911e3c	2007-09-04 07:15:32 +0000	[diff] [blame]	673	print('Blank line at line #%i' % i)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	674
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	675	:func:`sorted(iterable, key=None, reverse=False) <sorted>` collects all the
				676	elements of the iterable into a list, sorts the list, and returns the sorted
Andrew Kuchling	2a9c8e8	2013-05-20 10:14:53 -0400	[diff] [blame]	677	result. The key and reverse arguments are passed through to the
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	678	constructed list's :meth:`~list.sort` method. ::
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	679
				680	>>> import random
				681	>>> # Generate 8 random numbers between [0, 10000)
				682	>>> rand_list = random.sample(range(10000), 8)
Ezio Melotti	35cbf16	2012-10-12 13:24:19 +0300	[diff] [blame]	683	>>> rand_list #doctest: +SKIP
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	684	[769, 7953, 9828, 6431, 8442, 9878, 6213, 2207]
Ezio Melotti	35cbf16	2012-10-12 13:24:19 +0300	[diff] [blame]	685	>>> sorted(rand_list) #doctest: +SKIP
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	686	[769, 2207, 6213, 6431, 7953, 8442, 9828, 9878]
Ezio Melotti	35cbf16	2012-10-12 13:24:19 +0300	[diff] [blame]	687	>>> sorted(rand_list, reverse=True) #doctest: +SKIP
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	688	[9878, 9828, 8442, 7953, 6431, 6213, 2207, 769]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	689
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	690	(For a more detailed discussion of sorting, see the :ref:`sortinghowto`.)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	691
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	692
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	693	The :func:`any(iter) <any>` and :func:`all(iter) <all>` built-ins look at the
Serhiy Storchaka	fbc1c26	2013-11-29 12:17:13 +0200	[diff] [blame]	694	truth values of an iterable's contents. :func:`any` returns ``True`` if any element
				695	in the iterable is a true value, and :func:`all` returns ``True`` if all of the
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	696	elements are true values:
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	697
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	698	>>> any([0, 1, 0])
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	699	True
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	700	>>> any([0, 0, 0])
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	701	False
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	702	>>> any([1, 1, 1])
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	703	True
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	704	>>> all([0, 1, 0])
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	705	False
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	706	>>> all([0, 0, 0])
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	707	False
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	708	>>> all([1, 1, 1])
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	709	True
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	710
				711
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	712	:func:`zip(iterA, iterB, ...) <zip>` takes one element from each iterable and
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	713	returns them in a tuple::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	714
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	715	zip(['a', 'b', 'c'], (1, 2, 3)) =>
				716	('a', 1), ('b', 2), ('c', 3)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	717
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	718	It doesn't construct an in-memory list and exhaust all the input iterators
				719	before returning; instead tuples are constructed and returned only if they're
				720	requested. (The technical term for this behaviour is `lazy evaluation
Georg Brandl	5d94134	2016-02-26 19:37:12 +0100	[diff] [blame]	721	<https://en.wikipedia.org/wiki/Lazy_evaluation>`__.)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	722
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	723	This iterator is intended to be used with iterables that are all of the same
				724	length. If the iterables are of different lengths, the resulting stream will be
				725	the same length as the shortest iterable. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	726
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	727	zip(['a', 'b'], (1, 2, 3)) =>
				728	('a', 1), ('b', 2)
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	729
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	730	You should avoid doing this, though, because an element may be taken from the
				731	longer iterators and discarded. This means you can't go on to use the iterators
				732	further because you risk skipping a discarded element.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	733
				734
				735	The itertools module
				736	====================
				737
				738	The :mod:`itertools` module contains a number of commonly-used iterators as well
				739	as functions for combining several iterators. This section will introduce the
				740	module's contents by showing small examples.
				741
				742	The module's functions fall into a few broad classes:
				743
				744	* Functions that create a new iterator based on an existing iterator.
				745	* Functions for treating an iterator's elements as function arguments.
				746	* Functions for selecting portions of an iterator's output.
				747	* A function for grouping an iterator's output.
				748
				749	Creating new iterators
				750	----------------------
				751
csabella	9be4ff3	2017-06-04 13:39:21 -0400	[diff] [blame]	752	:func:`itertools.count(start, step) <itertools.count>` returns an infinite
				753	stream of evenly spaced values. You can optionally supply the starting number,
				754	which defaults to 0, and the interval between numbers, which defaults to 1::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	755
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	756	itertools.count() =>
				757	0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
				758	itertools.count(10) =>
				759	10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
csabella	9be4ff3	2017-06-04 13:39:21 -0400	[diff] [blame]	760	itertools.count(10, 5) =>
				761	10, 15, 20, 25, 30, 35, 40, 45, 50, 55, ...
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	762
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	763	:func:`itertools.cycle(iter) <itertools.cycle>` saves a copy of the contents of
				764	a provided iterable and returns a new iterator that returns its elements from
				765	first to last. The new iterator will repeat these elements infinitely. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	766
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	767	itertools.cycle([1, 2, 3, 4, 5]) =>
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	768	1, 2, 3, 4, 5, 1, 2, 3, 4, 5, ...
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	769
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	770	:func:`itertools.repeat(elem, [n]) <itertools.repeat>` returns the provided
				771	element n times, or returns the element endlessly if n is not provided. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	772
				773	itertools.repeat('abc') =>
				774	abc, abc, abc, abc, abc, abc, abc, abc, abc, abc, ...
				775	itertools.repeat('abc', 5) =>
				776	abc, abc, abc, abc, abc
				777
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	778	:func:`itertools.chain(iterA, iterB, ...) <itertools.chain>` takes an arbitrary
				779	number of iterables as input, and returns all the elements of the first
				780	iterator, then all the elements of the second, and so on, until all of the
				781	iterables have been exhausted. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	782
				783	itertools.chain(['a', 'b', 'c'], (1, 2, 3)) =>
				784	a, b, c, 1, 2, 3
				785
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	786	:func:`itertools.islice(iter, [start], stop, [step]) <itertools.islice>` returns
				787	a stream that's a slice of the iterator. With a single stop argument, it
				788	will return the first stop elements. If you supply a starting index, you'll
				789	get stop-start elements, and if you supply a value for step, elements
				790	will be skipped accordingly. Unlike Python's string and list slicing, you can't
				791	use negative values for start, stop, or step. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	792
				793	itertools.islice(range(10), 8) =>
				794	0, 1, 2, 3, 4, 5, 6, 7
				795	itertools.islice(range(10), 2, 8) =>
				796	2, 3, 4, 5, 6, 7
				797	itertools.islice(range(10), 2, 8, 2) =>
				798	2, 4, 6
				799
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	800	:func:`itertools.tee(iter, [n]) <itertools.tee>` replicates an iterator; it
				801	returns n independent iterators that will all return the contents of the
				802	source iterator.
				803	If you don't supply a value for n, the default is 2. Replicating iterators
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	804	requires saving some of the contents of the source iterator, so this can consume
				805	significant memory if the iterator is large and one of the new iterators is
Christian Heimes	fe337bf	2008-03-23 21:54:12 +0000	[diff] [blame]	806	consumed more than the others. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	807
				808	itertools.tee( itertools.count() ) =>
				809	iterA, iterB
				810
				811	where iterA ->
				812	0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
				813
				814	and iterB ->
				815	0, 1, 2, 3, 4, 5, 6, 7, 8, 9, ...
				816
				817
				818	Calling functions on elements
				819	-----------------------------
				820
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	821	The :mod:`operator` module contains a set of functions corresponding to Python's
				822	operators. Some examples are :func:`operator.add(a, b) <operator.add>` (adds
				823	two values), :func:`operator.ne(a, b) <operator.ne>` (same as ``a != b``), and
				824	:func:`operator.attrgetter('id') <operator.attrgetter>`
				825	(returns a callable that fetches the ``.id`` attribute).
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	826
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	827	:func:`itertools.starmap(func, iter) <itertools.starmap>` assumes that the
				828	iterable will return a stream of tuples, and calls func using these tuples as
				829	the arguments::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	830
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	831	itertools.starmap(os.path.join,
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	832	[('/bin', 'python'), ('/usr', 'bin', 'java'),
				833	('/usr', 'bin', 'perl'), ('/usr', 'bin', 'ruby')])
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	834	=>
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	835	/bin/python, /usr/bin/java, /usr/bin/perl, /usr/bin/ruby
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	836
				837
				838	Selecting elements
				839	------------------
				840
				841	Another group of functions chooses a subset of an iterator's elements based on a
				842	predicate.
				843
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	844	:func:`itertools.filterfalse(predicate, iter) <itertools.filterfalse>` is the
Andrew Kuchling	2a9c8e8	2013-05-20 10:14:53 -0400	[diff] [blame]	845	opposite of :func:`filter`, returning all elements for which the predicate
				846	returns false::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	847
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	848	itertools.filterfalse(is_even, itertools.count()) =>
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	849	1, 3, 5, 7, 9, 11, 13, 15, ...
				850
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	851	:func:`itertools.takewhile(predicate, iter) <itertools.takewhile>` returns
				852	elements for as long as the predicate returns true. Once the predicate returns
				853	false, the iterator will signal the end of its results. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	854
				855	def less_than_10(x):
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	856	return x < 10
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	857
				858	itertools.takewhile(less_than_10, itertools.count()) =>
				859	0, 1, 2, 3, 4, 5, 6, 7, 8, 9
				860
				861	itertools.takewhile(is_even, itertools.count()) =>
				862	0
				863
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	864	:func:`itertools.dropwhile(predicate, iter) <itertools.dropwhile>` discards
				865	elements while the predicate returns true, and then returns the rest of the
				866	iterable's results. ::
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	867
				868	itertools.dropwhile(less_than_10, itertools.count()) =>
				869	10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ...
				870
				871	itertools.dropwhile(is_even, itertools.count()) =>
				872	1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
				873
Andrew Kuchling	2a9c8e8	2013-05-20 10:14:53 -0400	[diff] [blame]	874	:func:`itertools.compress(data, selectors) <itertools.compress>` takes two
				875	iterators and returns only those elements of data for which the corresponding
				876	element of selectors is true, stopping whenever either one is exhausted::
				877
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	878	itertools.compress([1, 2, 3, 4, 5], [True, True, False, False, True]) =>
Andrew Kuchling	2a9c8e8	2013-05-20 10:14:53 -0400	[diff] [blame]	879	1, 2, 5
				880
				881
				882	Combinatoric functions
				883	----------------------
				884
				885	The :func:`itertools.combinations(iterable, r) <itertools.combinations>`
				886	returns an iterator giving all possible r-tuple combinations of the
				887	elements contained in iterable. ::
				888
				889	itertools.combinations([1, 2, 3, 4, 5], 2) =>
				890	(1, 2), (1, 3), (1, 4), (1, 5),
				891	(2, 3), (2, 4), (2, 5),
				892	(3, 4), (3, 5),
				893	(4, 5)
				894
				895	itertools.combinations([1, 2, 3, 4, 5], 3) =>
				896	(1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 4, 5),
				897	(2, 3, 4), (2, 3, 5), (2, 4, 5),
				898	(3, 4, 5)
				899
				900	The elements within each tuple remain in the same order as
				901	iterable returned them. For example, the number 1 is always before
				902	2, 3, 4, or 5 in the examples above. A similar function,
				903	:func:`itertools.permutations(iterable, r=None) <itertools.permutations>`,
				904	removes this constraint on the order, returning all possible
				905	arrangements of length r::
				906
				907	itertools.permutations([1, 2, 3, 4, 5], 2) =>
				908	(1, 2), (1, 3), (1, 4), (1, 5),
				909	(2, 1), (2, 3), (2, 4), (2, 5),
				910	(3, 1), (3, 2), (3, 4), (3, 5),
				911	(4, 1), (4, 2), (4, 3), (4, 5),
				912	(5, 1), (5, 2), (5, 3), (5, 4)
				913
				914	itertools.permutations([1, 2, 3, 4, 5]) =>
				915	(1, 2, 3, 4, 5), (1, 2, 3, 5, 4), (1, 2, 4, 3, 5),
				916	...
				917	(5, 4, 3, 2, 1)
				918
				919	If you don't supply a value for r the length of the iterable is used,
				920	meaning that all the elements are permuted.
				921
				922	Note that these functions produce all of the possible combinations by
				923	position and don't require that the contents of iterable are unique::
				924
				925	itertools.permutations('aba', 3) =>
				926	('a', 'b', 'a'), ('a', 'a', 'b'), ('b', 'a', 'a'),
				927	('b', 'a', 'a'), ('a', 'a', 'b'), ('a', 'b', 'a')
				928
				929	The identical tuple ``('a', 'a', 'b')`` occurs twice, but the two 'a'
				930	strings came from different positions.
				931
				932	The :func:`itertools.combinations_with_replacement(iterable, r) <itertools.combinations_with_replacement>`
				933	function relaxes a different constraint: elements can be repeated
				934	within a single tuple. Conceptually an element is selected for the
				935	first position of each tuple and then is replaced before the second
				936	element is selected. ::
				937
				938	itertools.combinations_with_replacement([1, 2, 3, 4, 5], 2) =>
				939	(1, 1), (1, 2), (1, 3), (1, 4), (1, 5),
				940	(2, 2), (2, 3), (2, 4), (2, 5),
				941	(3, 3), (3, 4), (3, 5),
				942	(4, 4), (4, 5),
				943	(5, 5)
				944
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	945
				946	Grouping elements
				947	-----------------
				948
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	949	The last function I'll discuss, :func:`itertools.groupby(iter, key_func=None)
				950	<itertools.groupby>`, is the most complicated. ``key_func(elem)`` is a function
				951	that can compute a key value for each element returned by the iterable. If you
				952	don't supply a key function, the key is simply each element itself.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	953
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	954	:func:`~itertools.groupby` collects all the consecutive elements from the
				955	underlying iterable that have the same key value, and returns a stream of
				956	2-tuples containing a key value and an iterator for the elements with that key.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	957
				958	::
				959
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	960	city_list = [('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL'),
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	961	('Anchorage', 'AK'), ('Nome', 'AK'),
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	962	('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ'),
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	963	...
				964	]
				965
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	966	def get_state(city_state):
Georg Brandl	0df7979	2008-10-04 18:33:26 +0000	[diff] [blame]	967	return city_state[1]
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	968
				969	itertools.groupby(city_list, get_state) =>
				970	('AL', iterator-1),
				971	('AK', iterator-2),
				972	('AZ', iterator-3), ...
				973
				974	where
				975	iterator-1 =>
				976	('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL')
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	977	iterator-2 =>
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	978	('Anchorage', 'AK'), ('Nome', 'AK')
				979	iterator-3 =>
				980	('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ')
				981
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	982	:func:`~itertools.groupby` assumes that the underlying iterable's contents will
				983	already be sorted based on the key. Note that the returned iterators also use
				984	the underlying iterable, so you have to consume the results of iterator-1 before
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	985	requesting iterator-2 and its corresponding key.
				986
				987
				988	The functools module
				989	====================
				990
				991	The :mod:`functools` module in Python 2.5 contains some higher-order functions.
				992	A higher-order function takes one or more functions as input and returns a
				993	new function. The most useful tool in this module is the
				994	:func:`functools.partial` function.
				995
				996	For programs written in a functional style, you'll sometimes want to construct
				997	variants of existing functions that have some of the parameters filled in.
				998	Consider a Python function ``f(a, b, c)``; you may wish to create a new function
				999	``g(b, c)`` that's equivalent to ``f(1, b, c)``; you're filling in a value for
				1000	one of ``f()``'s parameters. This is called "partial function application".
				1001
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	1002	The constructor for :func:`~functools.partial` takes the arguments
				1003	``(function, arg1, arg2, ..., kwarg1=value1, kwarg2=value2)``. The resulting
				1004	object is callable, so you can just call it to invoke ``function`` with the
				1005	filled-in arguments.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1006
				1007	Here's a small but realistic example::
				1008
				1009	import functools
				1010
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	1011	def log(message, subsystem):
				1012	"""Write the contents of 'message' to the specified subsystem."""
Georg Brandl	6911e3c	2007-09-04 07:15:32 +0000	[diff] [blame]	1013	print('%s: %s' % (subsystem, message))
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1014	...
				1015
				1016	server_log = functools.partial(log, subsystem='server')
				1017	server_log('Unable to open socket')
				1018
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	1019	:func:`functools.reduce(func, iter, [initial_value]) <functools.reduce>`
				1020	cumulatively performs an operation on all the iterable's elements and,
				1021	therefore, can't be applied to infinite iterables. func must be a function
				1022	that takes two elements and returns a single value. :func:`functools.reduce`
				1023	takes the first two elements A and B returned by the iterator and calculates
				1024	``func(A, B)``. It then requests the third element, C, calculates
				1025	``func(func(A, B), C)``, combines this result with the fourth element returned,
				1026	and continues until the iterable is exhausted. If the iterable returns no
				1027	values at all, a :exc:`TypeError` exception is raised. If the initial value is
				1028	supplied, it's used as a starting point and ``func(initial_value, A)`` is the
				1029	first calculation. ::
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1030
				1031	>>> import operator, functools
				1032	>>> functools.reduce(operator.concat, ['A', 'BB', 'C'])
				1033	'ABBC'
				1034	>>> functools.reduce(operator.concat, [])
				1035	Traceback (most recent call last):
				1036	...
				1037	TypeError: reduce() of empty sequence with no initial value
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	1038	>>> functools.reduce(operator.mul, [1, 2, 3], 1)
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1039	6
				1040	>>> functools.reduce(operator.mul, [], 1)
				1041	1
				1042
				1043	If you use :func:`operator.add` with :func:`functools.reduce`, you'll add up all the
				1044	elements of the iterable. This case is so common that there's a special
				1045	built-in called :func:`sum` to compute it:
				1046
Zachary Ware	378a1d7	2016-08-09 16:47:04 -0500	[diff] [blame]	1047	>>> import functools, operator
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	1048	>>> functools.reduce(operator.add, [1, 2, 3, 4], 0)
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1049	10
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	1050	>>> sum([1, 2, 3, 4])
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1051	10
				1052	>>> sum([])
				1053	0
				1054
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	1055	For many uses of :func:`functools.reduce`, though, it can be clearer to just
				1056	write the obvious :keyword:`for` loop::
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1057
				1058	import functools
				1059	# Instead of:
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	1060	product = functools.reduce(operator.mul, [1, 2, 3], 1)
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1061
				1062	# You can write:
				1063	product = 1
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	1064	for i in [1, 2, 3]:
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1065	product *= i
				1066
csabella	9be4ff3	2017-06-04 13:39:21 -0400	[diff] [blame]	1067	A related function is :func:`itertools.accumulate(iterable, func=operator.add)
				1068	<itertools.accumulate>`. It performs the same calculation, but instead of
				1069	returning only the final result, :func:`accumulate` returns an iterator that
				1070	also yields each partial result::
Andrew Kuchling	2a9c8e8	2013-05-20 10:14:53 -0400	[diff] [blame]	1071
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	1072	itertools.accumulate([1, 2, 3, 4, 5]) =>
Andrew Kuchling	2a9c8e8	2013-05-20 10:14:53 -0400	[diff] [blame]	1073	1, 3, 6, 10, 15
				1074
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	1075	itertools.accumulate([1, 2, 3, 4, 5], operator.mul) =>
Andrew Kuchling	2a9c8e8	2013-05-20 10:14:53 -0400	[diff] [blame]	1076	1, 2, 6, 24, 120
				1077
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1078
				1079	The operator module
				1080	-------------------
				1081
				1082	The :mod:`operator` module was mentioned earlier. It contains a set of
				1083	functions corresponding to Python's operators. These functions are often useful
				1084	in functional-style code because they save you from writing trivial functions
				1085	that perform a single operation.
				1086
				1087	Some of the functions in this module are:
				1088
Georg Brandl	f694518	2008-02-01 11:56:49 +0000	[diff] [blame]	1089	* Math operations: ``add()``, ``sub()``, ``mul()``, ``floordiv()``, ``abs()``, ...
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1090	* Logical operations: ``not_()``, ``truth()``.
				1091	* Bitwise operations: ``and_()``, ``or_()``, ``invert()``.
				1092	* Comparisons: ``eq()``, ``ne()``, ``lt()``, ``le()``, ``gt()``, and ``ge()``.
				1093	* Object identity: ``is_()``, ``is_not()``.
				1094
				1095	Consult the operator module's documentation for a complete list.
				1096
				1097
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1098	Small functions and the lambda expression
				1099	=========================================
				1100
				1101	When writing functional-style programs, you'll often need little functions that
				1102	act as predicates or that combine elements in some way.
				1103
				1104	If there's a Python built-in or a module function that's suitable, you don't
				1105	need to define a new function at all::
				1106
				1107	stripped_lines = [line.strip() for line in lines]
				1108	existing_files = filter(os.path.exists, file_list)
				1109
				1110	If the function you need doesn't exist, you need to write it. One way to write
Serhiy Storchaka	2b57c43	2018-12-19 08:09:46 +0200	[diff] [blame]	1111	small functions is to use the :keyword:`lambda` expression. ``lambda`` takes a
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	1112	number of parameters and an expression combining these parameters, and creates
				1113	an anonymous function that returns the value of the expression::
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1114
				1115	adder = lambda x, y: x+y
				1116
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	1117	print_assign = lambda name, value: name + '=' + str(value)
				1118
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1119	An alternative is to just use the ``def`` statement and define a function in the
				1120	usual way::
				1121
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	1122	def adder(x, y):
				1123	return x + y
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1124
				1125	def print_assign(name, value):
				1126	return name + '=' + str(value)
				1127
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1128	Which alternative is preferable? That's a style question; my usual course is to
				1129	avoid using ``lambda``.
				1130
				1131	One reason for my preference is that ``lambda`` is quite limited in the
				1132	functions it can define. The result has to be computable as a single
				1133	expression, which means you can't have multiway ``if... elif... else``
				1134	comparisons or ``try... except`` statements. If you try to do too much in a
				1135	``lambda`` statement, you'll end up with an overly complicated expression that's
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	1136	hard to read. Quick, what's the following code doing? ::
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1137
				1138	import functools
				1139	total = functools.reduce(lambda a, b: (0, a[1] + b[1]), items)[1]
				1140
				1141	You can figure it out, but it takes time to disentangle the expression to figure
				1142	out what's going on. Using a short nested ``def`` statements makes things a
				1143	little bit better::
				1144
				1145	import functools
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	1146	def combine(a, b):
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1147	return 0, a[1] + b[1]
				1148
				1149	total = functools.reduce(combine, items)[1]
				1150
				1151	But it would be best of all if I had simply used a ``for`` loop::
				1152
				1153	total = 0
				1154	for a, b in items:
				1155	total += b
				1156
				1157	Or the :func:`sum` built-in and a generator expression::
				1158
Sergey Fedoseev	db8707c	2018-08-08 02:38:00 +0500	[diff] [blame]	1159	total = sum(b for a, b in items)
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1160
				1161	Many uses of :func:`functools.reduce` are clearer when written as ``for`` loops.
				1162
				1163	Fredrik Lundh once suggested the following set of rules for refactoring uses of
				1164	``lambda``:
				1165
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	1166	1. Write a lambda function.
				1167	2. Write a comment explaining what the heck that lambda does.
				1168	3. Study the comment for a while, and think of a name that captures the essence
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1169	of the comment.
Ezio Melotti	45a101d	2012-10-12 12:42:51 +0300	[diff] [blame]	1170	4. Convert the lambda to a def statement, using that name.
				1171	5. Remove the comment.
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1172
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	1173	I really like these rules, but you're free to disagree
Georg Brandl	4216d2d	2008-11-22 08:27:24 +0000	[diff] [blame]	1174	about whether this lambda-free style is better.
				1175
				1176
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1177	Revision History and Acknowledgements
				1178	=====================================
				1179
				1180	The author would like to thank the following people for offering suggestions,
				1181	corrections and assistance with various drafts of this article: Ian Bicking,
				1182	Nick Coghlan, Nick Efford, Raymond Hettinger, Jim Jewett, Mike Krell, Leandro
				1183	Lameiro, Jussi Salmela, Collin Winter, Blake Winton.
				1184
				1185	Version 0.1: posted June 30 2006.
				1186
				1187	Version 0.11: posted July 1 2006. Typo fixes.
				1188
				1189	Version 0.2: posted July 10 2006. Merged genexp and listcomp sections into one.
				1190	Typo fixes.
				1191
				1192	Version 0.21: Added more references suggested on the tutor mailing list.
				1193
				1194	Version 0.30: Adds a section on the ``functional`` module written by Collin
				1195	Winter; adds short section on the operator module; a few other edits.
				1196
				1197
				1198	References
				1199	==========
				1200
				1201	General
				1202	-------
				1203
				1204	Structure and Interpretation of Computer Programs, by Harold Abelson and
				1205	Gerald Jay Sussman with Julie Sussman. Full text at
Georg Brandl	5d94134	2016-02-26 19:37:12 +0100	[diff] [blame]	1206	https://mitpress.mit.edu/sicp/. In this classic textbook of computer science,
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1207	chapters 2 and 3 discuss the use of sequences and streams to organize the data
				1208	flow inside a program. The book uses Scheme for its examples, but many of the
				1209	design approaches described in these chapters are applicable to functional-style
				1210	Python code.
				1211
				1212	http://www.defmacro.org/ramblings/fp.html: A general introduction to functional
				1213	programming that uses Java examples and has a lengthy historical introduction.
				1214
Georg Brandl	5d94134	2016-02-26 19:37:12 +0100	[diff] [blame]	1215	https://en.wikipedia.org/wiki/Functional_programming: General Wikipedia entry
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1216	describing functional programming.
				1217
Georg Brandl	5d94134	2016-02-26 19:37:12 +0100	[diff] [blame]	1218	https://en.wikipedia.org/wiki/Coroutine: Entry for coroutines.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1219
Georg Brandl	5d94134	2016-02-26 19:37:12 +0100	[diff] [blame]	1220	https://en.wikipedia.org/wiki/Currying: Entry for the concept of currying.
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1221
				1222	Python-specific
				1223	---------------
				1224
				1225	http://gnosis.cx/TPiP/: The first chapter of David Mertz's book
				1226	:title-reference:`Text Processing in Python` discusses functional programming
				1227	for text processing, in the section titled "Utilizing Higher-Order Functions in
				1228	Text Processing".
				1229
				1230	Mertz also wrote a 3-part series of articles on functional programming
Georg Brandl	48310cd	2009-01-03 21:18:54 +0000	[diff] [blame]	1231	for IBM's DeveloperWorks site; see
Serhiy Storchaka	6dff020	2016-05-07 10:49:07 +0300	[diff] [blame]	1232	`part 1 <https://www.ibm.com/developerworks/linux/library/l-prog/index.html>`__,
				1233	`part 2 <https://www.ibm.com/developerworks/linux/library/l-prog2/index.html>`__, and
				1234	`part 3 <https://www.ibm.com/developerworks/linux/library/l-prog3/index.html>`__,
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1235
				1236
				1237	Python documentation
				1238	--------------------
				1239
				1240	Documentation for the :mod:`itertools` module.
				1241
csabella	9be4ff3	2017-06-04 13:39:21 -0400	[diff] [blame]	1242	Documentation for the :mod:`functools` module.
				1243
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1244	Documentation for the :mod:`operator` module.
				1245
				1246	:pep:`289`: "Generator Expressions"
				1247
				1248	:pep:`342`: "Coroutines via Enhanced Generators" describes the new generator
				1249	features in Python 2.5.
				1250
				1251	.. comment
				1252
Georg Brandl	116aa62	2007-08-15 14:28:22 +0000	[diff] [blame]	1253	Handy little function for printing part of an iterator -- used
				1254	while writing this document.
				1255
				1256	import itertools
				1257	def print_iter(it):
				1258	slice = itertools.islice(it, 10)
				1259	for elem in slice[:-1]:
				1260	sys.stdout.write(str(elem))
				1261	sys.stdout.write(', ')
Georg Brandl	6911e3c	2007-09-04 07:15:32 +0000	[diff] [blame]	1262	print(elem[-1])