blob: 5c6c2b728dea39b6cf440300c1f0543f4b4641f5 [file] [log] [blame]
Raymond Hettinger96ef8112003-02-01 00:10:11 +00001\section{\module{itertools} ---
2 Functions creating iterators for efficient looping}
3
4\declaremodule{standard}{itertools}
5\modulesynopsis{Functions creating iterators for efficient looping.}
6\moduleauthor{Raymond Hettinger}{python@rcn.com}
7\sectionauthor{Raymond Hettinger}{python@rcn.com}
8\versionadded{2.3}
9
10
11This module implements a number of iterator building blocks inspired
12by constructs from the Haskell and SML programming languages. Each
13has been recast in a form suitable for Python.
14
Raymond Hettinger60eca932003-02-09 06:40:58 +000015The module standardizes a core set of fast, memory efficient tools
16that are useful by themselves or in combination. Standardization helps
17avoid the readability and reliability problems which arise when many
18different individuals create their own slightly varying implementations,
19each with their own quirks and naming conventions.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000020
Raymond Hettinger1b18ba42003-02-21 01:45:34 +000021The tools are designed to combine readily with one another. This makes
Raymond Hettinger60eca932003-02-09 06:40:58 +000022it easy to construct more specialized tools succinctly and efficiently
23in pure Python.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000024
Raymond Hettinger1b18ba42003-02-21 01:45:34 +000025For instance, SML provides a tabulation tool: \code{tabulate(f)}
Raymond Hettinger60eca932003-02-09 06:40:58 +000026which produces a sequence \code{f(0), f(1), ...}. This toolbox
27provides \function{imap()} and \function{count()} which can be combined
Raymond Hettinger1b18ba42003-02-21 01:45:34 +000028to form \code{imap(f, count())} and produce an equivalent result.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000029
Raymond Hettinger60eca932003-02-09 06:40:58 +000030Whether cast in pure python form or C code, tools that use iterators
31are more memory efficient (and faster) than their list based counterparts.
32Adopting the principles of just-in-time manufacturing, they create
33data when and where needed instead of consuming memory with the
34computer equivalent of ``inventory''.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000035
Raymond Hettinger60eca932003-02-09 06:40:58 +000036Some tools were omitted from the module because they offered no
37advantage over their pure python counterparts or because their behavior
38was too surprising.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000039
Raymond Hettinger60eca932003-02-09 06:40:58 +000040For instance, SML provides a tool: \code{cycle(\var{seq})} which
41loops over the sequence elements and then starts again when the
42sequence is exhausted. The surprising behavior is the need for
43significant auxiliary storage (which is unusual for an iterator).
44If needed, the tool is readily constructible using pure Python.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000045
Raymond Hettinger60eca932003-02-09 06:40:58 +000046Other tools are being considered for inclusion in future versions of the
47module. For instance, the function
Michael W. Hudsoneb189932003-02-11 14:24:13 +000048\function{chain(\var{it0}, \var{it1}, ...)} would return elements from
Raymond Hettinger60eca932003-02-09 06:40:58 +000049the first iterator until it was exhausted and then move on to each
50successive iterator. The module author welcomes suggestions for other
51basic building blocks.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000052
53\begin{seealso}
54 \seetext{The Standard ML Basis Library,
55 \citetitle[http://www.standardml.org/Basis/]
56 {The Standard ML Basis Library}.}
57
58 \seetext{Haskell, A Purely Functional Language,
59 \citetitle[http://www.haskell.org/definition/]
60 {Definition of Haskell and the Standard Libraries}.}
61\end{seealso}
62
63
64\subsection{Itertool functions \label{itertools-functions}}
65
66The following module functions all construct and return iterators.
67Some provide streams of infinite length, so they should only be accessed
68by functions or loops that truncate the stream.
69
70\begin{funcdesc}{count}{\optional{n}}
71 Make an iterator that returns consecutive integers starting with \var{n}.
72 Does not currently support python long integers. Often used as an
73 argument to \function{imap()} to generate consecutive data points.
74 Also, used in \function{izip()} to add sequence numbers. Equivalent to:
75
76 \begin{verbatim}
77 def count(n=0):
Raymond Hettinger96ef8112003-02-01 00:10:11 +000078 while True:
Raymond Hettinger1b18ba42003-02-21 01:45:34 +000079 yield n
80 n += 1
Raymond Hettinger96ef8112003-02-01 00:10:11 +000081 \end{verbatim}
Raymond Hettinger2012f172003-02-07 05:32:58 +000082
83 Note, \function{count()} does not check for overflow and will return
84 negative numbers after exceeding \code{sys.maxint}. This behavior
85 may change in the future.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000086\end{funcdesc}
87
88\begin{funcdesc}{dropwhile}{predicate, iterable}
89 Make an iterator that drops elements from the iterable as long as
90 the predicate is true; afterwards, returns every element. Note,
91 the iterator does not produce \emph{any} output until the predicate
92 is true, so it may have a lengthy start-up time. Equivalent to:
93
94 \begin{verbatim}
95 def dropwhile(predicate, iterable):
96 iterable = iter(iterable)
97 while True:
98 x = iterable.next()
99 if predicate(x): continue # drop when predicate is true
100 yield x
101 break
102 while True:
103 yield iterable.next()
104 \end{verbatim}
105\end{funcdesc}
106
Raymond Hettinger60eca932003-02-09 06:40:58 +0000107\begin{funcdesc}{ifilter}{predicate, iterable}
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000108 Make an iterator that filters elements from iterable returning only
Raymond Hettinger60eca932003-02-09 06:40:58 +0000109 those for which the predicate is \code{True}.
110 If \var{predicate} is \code{None}, return the items that are true.
111 Equivalent to:
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000112
113 \begin{verbatim}
Raymond Hettinger60eca932003-02-09 06:40:58 +0000114 def ifilter(predicate, iterable):
115 if predicate is None:
116 def predicate(x):
117 return x
118 for x in iterable:
119 if predicate(x):
120 yield x
121 \end{verbatim}
122\end{funcdesc}
123
124\begin{funcdesc}{ifilterfalse}{predicate, iterable}
125 Make an iterator that filters elements from iterable returning only
126 those for which the predicate is \code{False}.
127 If \var{predicate} is \code{None}, return the items that are false.
128 Equivalent to:
129
130 \begin{verbatim}
131 def ifilterfalse(predicate, iterable):
132 if predicate is None:
133 def predicate(x):
134 return x
135 for x in iterable:
136 if not predicate(x):
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000137 yield x
138 \end{verbatim}
139\end{funcdesc}
140
141\begin{funcdesc}{imap}{function, *iterables}
142 Make an iterator that computes the function using arguments from
143 each of the iterables. If \var{function} is set to \code{None}, then
144 \function{imap()} returns the arguments as a tuple. Like
145 \function{map()} but stops when the shortest iterable is exhausted
146 instead of filling in \code{None} for shorter iterables. The reason
147 for the difference is that infinite iterator arguments are typically
148 an error for \function{map()} (because the output is fully evaluated)
149 but represent a common and useful way of supplying arguments to
150 \function{imap()}.
151 Equivalent to:
152
153 \begin{verbatim}
154 def imap(function, *iterables):
155 iterables = map(iter, iterables)
156 while True:
157 args = [i.next() for i in iterables]
158 if function is None:
159 yield tuple(args)
160 else:
161 yield function(*args)
162 \end{verbatim}
163\end{funcdesc}
164
165\begin{funcdesc}{islice}{iterable, \optional{start,} stop \optional{, step}}
166 Make an iterator that returns selected elements from the iterable.
167 If \var{start} is non-zero, then elements from the iterable are skipped
168 until start is reached. Afterward, elements are returned consecutively
169 unless \var{step} is set higher than one which results in items being
170 skipped. If \var{stop} is specified, then iteration stops at the
171 specified element position; otherwise, it continues indefinitely or
172 until the iterable is exhausted. Unlike regular slicing,
173 \function{islice()} does not support negative values for \var{start},
174 \var{stop}, or \var{step}. Can be used to extract related fields
175 from data where the internal structure has been flattened (for
176 example, a multi-line report may list a name field on every
177 third line). Equivalent to:
178
179 \begin{verbatim}
180 def islice(iterable, *args):
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000181 s = slice(*args)
182 next = s.start or 0
183 stop = s.stop
184 step = s.step or 1
Raymond Hettinger60eca932003-02-09 06:40:58 +0000185 for cnt, element in enumerate(iterable):
186 if cnt < next:
187 continue
188 if cnt >= stop:
189 break
190 yield element
191 next += step
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000192 \end{verbatim}
193\end{funcdesc}
194
195\begin{funcdesc}{izip}{*iterables}
196 Make an iterator that aggregates elements from each of the iterables.
197 Like \function{zip()} except that it returns an iterator instead of
198 a list. Used for lock-step iteration over several iterables at a
199 time. Equivalent to:
200
201 \begin{verbatim}
202 def izip(*iterables):
203 iterables = map(iter, iterables)
204 while True:
205 result = [i.next() for i in iterables]
206 yield tuple(result)
207 \end{verbatim}
208\end{funcdesc}
209
Raymond Hettinger1b18ba42003-02-21 01:45:34 +0000210\begin{funcdesc}{repeat}{object}
211 Make an iterator that returns \var{object} over and over again.
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000212 Used as argument to \function{imap()} for invariant parameters
Raymond Hettinger1b18ba42003-02-21 01:45:34 +0000213 to the called function. Also used with \function{izip()} to create
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000214 an invariant part of a tuple record. Equivalent to:
215
216 \begin{verbatim}
Raymond Hettinger1b18ba42003-02-21 01:45:34 +0000217 def repeat(object):
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000218 while True:
Raymond Hettinger1b18ba42003-02-21 01:45:34 +0000219 yield object
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000220 \end{verbatim}
221\end{funcdesc}
222
223\begin{funcdesc}{starmap}{function, iterable}
224 Make an iterator that computes the function using arguments tuples
225 obtained from the iterable. Used instead of \function{imap()} when
226 argument parameters are already grouped in tuples from a single iterable
227 (the data has been ``pre-zipped''). The difference between
Raymond Hettinger1b18ba42003-02-21 01:45:34 +0000228 \function{imap()} and \function{starmap()} parallels the distinction
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000229 between \code{function(a,b)} and \code{function(*c)}.
230 Equivalent to:
231
232 \begin{verbatim}
233 def starmap(function, iterable):
234 iterable = iter(iterable)
235 while True:
236 yield function(*iterable.next())
237 \end{verbatim}
238\end{funcdesc}
239
240\begin{funcdesc}{takewhile}{predicate, iterable}
241 Make an iterator that returns elements from the iterable as long as
242 the predicate is true. Equivalent to:
243
244 \begin{verbatim}
245 def takewhile(predicate, iterable):
246 iterable = iter(iterable)
247 while True:
248 x = iterable.next()
249 if predicate(x):
250 yield x
251 else:
252 break
253 \end{verbatim}
254\end{funcdesc}
255
256\begin{funcdesc}{times}{n, \optional{object}}
257 Make an iterator that returns \var{object} \var{n} times.
258 \var{object} defaults to \code{None}. Used for looping a specific
259 number of times without creating a number object on each pass.
260 Equivalent to:
261
262 \begin{verbatim}
263 def times(n, object=None):
264 if n<0 : raise ValueError
265 for i in xrange(n):
266 yield object
267 \end{verbatim}
268\end{funcdesc}
269
270
271\subsection{Examples \label{itertools-example}}
272
273The following examples show common uses for each tool and
274demonstrate ways they can be combined.
275
276\begin{verbatim}
277>>> for i in times(3):
278... print "Hello"
279...
280Hello
281Hello
282Hello
283
284>>> amounts = [120.15, 764.05, 823.14]
285>>> for checknum, amount in izip(count(1200), amounts):
286... print 'Check %d is for $%.2f' % (checknum, amount)
287...
288Check 1200 is for $120.15
289Check 1201 is for $764.05
290Check 1202 is for $823.14
291
292>>> import operator
293>>> for cube in imap(operator.pow, xrange(1,4), repeat(3)):
294... print cube
295...
2961
2978
29827
299
300>>> reportlines = ['EuroPython', 'Roster', '', 'alex', '', 'laura',
301 '', 'martin', '', 'walter', '', 'samuele']
302>>> for name in islice(reportlines, 3, len(reportlines), 2):
303... print name.title()
304...
305Alex
306Laura
307Martin
308Walter
309Samuele
310
311\end{verbatim}
312
313This section has further examples of how itertools can be combined.
314Note that \function{enumerate()} and \method{iteritems()} already
315have highly efficient implementations in Python. They are only
316included here to illustrate how higher level tools can be created
317from building blocks.
318
319\begin{verbatim}
320>>> def enumerate(iterable):
321... return izip(count(), iterable)
322
323>>> def tabulate(function):
324... "Return function(0), function(1), ..."
325... return imap(function, count())
326
327>>> def iteritems(mapping):
328... return izip(mapping.iterkeys(), mapping.itervalues())
329
330>>> def nth(iterable, n):
331... "Returns the nth item"
Raymond Hettinger60eca932003-02-09 06:40:58 +0000332... return list(islice(iterable, n, n+1))
333
334>>> def all(pred, seq):
335... "Returns True if pred(x) is True for every element in the iterable"
336... return not nth(ifilterfalse(pred, seq), 0)
337
338>>> def some(pred, seq):
339... "Returns True if pred(x) is True at least one element in the iterable"
340... return bool(nth(ifilter(pred, seq), 0))
341
342>>> def no(pred, seq):
343... "Returns True if pred(x) is False for every element in the iterable"
344... return not nth(ifilter(pred, seq), 0)
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000345
346\end{verbatim}