blob: fafc48b6bc13c993d905714ca578212c050ad739 [file] [log] [blame]
Raymond Hettinger96ef8112003-02-01 00:10:11 +00001\section{\module{itertools} ---
2 Functions creating iterators for efficient looping}
3
4\declaremodule{standard}{itertools}
5\modulesynopsis{Functions creating iterators for efficient looping.}
6\moduleauthor{Raymond Hettinger}{python@rcn.com}
7\sectionauthor{Raymond Hettinger}{python@rcn.com}
8\versionadded{2.3}
9
10
11This module implements a number of iterator building blocks inspired
12by constructs from the Haskell and SML programming languages. Each
13has been recast in a form suitable for Python.
14
Raymond Hettinger60eca932003-02-09 06:40:58 +000015The module standardizes a core set of fast, memory efficient tools
16that are useful by themselves or in combination. Standardization helps
17avoid the readability and reliability problems which arise when many
18different individuals create their own slightly varying implementations,
19each with their own quirks and naming conventions.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000020
Raymond Hettinger1b18ba42003-02-21 01:45:34 +000021The tools are designed to combine readily with one another. This makes
Raymond Hettinger60eca932003-02-09 06:40:58 +000022it easy to construct more specialized tools succinctly and efficiently
23in pure Python.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000024
Raymond Hettinger1b18ba42003-02-21 01:45:34 +000025For instance, SML provides a tabulation tool: \code{tabulate(f)}
Raymond Hettinger60eca932003-02-09 06:40:58 +000026which produces a sequence \code{f(0), f(1), ...}. This toolbox
27provides \function{imap()} and \function{count()} which can be combined
Raymond Hettinger1b18ba42003-02-21 01:45:34 +000028to form \code{imap(f, count())} and produce an equivalent result.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000029
Raymond Hettinger60eca932003-02-09 06:40:58 +000030Whether cast in pure python form or C code, tools that use iterators
31are more memory efficient (and faster) than their list based counterparts.
32Adopting the principles of just-in-time manufacturing, they create
33data when and where needed instead of consuming memory with the
34computer equivalent of ``inventory''.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000035
Raymond Hettinger61fe64d2003-02-23 04:40:07 +000036The module author welcomes suggestions for other basic building blocks
37to be added to future versions of the module.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000038
39\begin{seealso}
40 \seetext{The Standard ML Basis Library,
41 \citetitle[http://www.standardml.org/Basis/]
42 {The Standard ML Basis Library}.}
43
44 \seetext{Haskell, A Purely Functional Language,
45 \citetitle[http://www.haskell.org/definition/]
46 {Definition of Haskell and the Standard Libraries}.}
47\end{seealso}
48
49
50\subsection{Itertool functions \label{itertools-functions}}
51
52The following module functions all construct and return iterators.
53Some provide streams of infinite length, so they should only be accessed
54by functions or loops that truncate the stream.
55
Raymond Hettinger61fe64d2003-02-23 04:40:07 +000056\begin{funcdesc}{chain}{*iterables}
57 Make an iterator that returns elements from the first iterable until
58 it is exhausted, then proceeds to the next iterable, until all of the
59 iterables are exhausted. Used for treating consecutive sequences as
60 a single sequence. Equivalent to:
61
62 \begin{verbatim}
63 def chain(*iterables):
64 for it in iterables:
65 for element in it:
66 yield element
67 \end{verbatim}
68\end{funcdesc}
69
Raymond Hettinger96ef8112003-02-01 00:10:11 +000070\begin{funcdesc}{count}{\optional{n}}
71 Make an iterator that returns consecutive integers starting with \var{n}.
72 Does not currently support python long integers. Often used as an
73 argument to \function{imap()} to generate consecutive data points.
74 Also, used in \function{izip()} to add sequence numbers. Equivalent to:
75
76 \begin{verbatim}
77 def count(n=0):
Raymond Hettinger96ef8112003-02-01 00:10:11 +000078 while True:
Raymond Hettinger1b18ba42003-02-21 01:45:34 +000079 yield n
80 n += 1
Raymond Hettinger96ef8112003-02-01 00:10:11 +000081 \end{verbatim}
Raymond Hettinger2012f172003-02-07 05:32:58 +000082
83 Note, \function{count()} does not check for overflow and will return
84 negative numbers after exceeding \code{sys.maxint}. This behavior
85 may change in the future.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000086\end{funcdesc}
87
Raymond Hettinger61fe64d2003-02-23 04:40:07 +000088\begin{funcdesc}{cycle}{iterable}
89 Make an iterator returning elements from the iterable and saving a
90 copy of each. When the iterable is exhausted, return elements from
91 the saved copy. Repeats indefinitely. Equivalent to:
92
93 \begin{verbatim}
94 def cycle(iterable):
95 saved = []
96 for element in iterable:
97 yield element
98 saved.append(element)
99 if len(saved) == 0:
100 return
101 while True:
102 for element in saved:
103 yield element
104 \end{verbatim}
105
106 Note, this is the only member of the toolkit that may require
107 significant auxiliary storage (depending on the length of the
108 iterable.
109\end{funcdesc}
110
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000111\begin{funcdesc}{dropwhile}{predicate, iterable}
112 Make an iterator that drops elements from the iterable as long as
113 the predicate is true; afterwards, returns every element. Note,
114 the iterator does not produce \emph{any} output until the predicate
115 is true, so it may have a lengthy start-up time. Equivalent to:
116
117 \begin{verbatim}
118 def dropwhile(predicate, iterable):
119 iterable = iter(iterable)
120 while True:
121 x = iterable.next()
122 if predicate(x): continue # drop when predicate is true
123 yield x
124 break
125 while True:
126 yield iterable.next()
127 \end{verbatim}
128\end{funcdesc}
129
Raymond Hettinger60eca932003-02-09 06:40:58 +0000130\begin{funcdesc}{ifilter}{predicate, iterable}
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000131 Make an iterator that filters elements from iterable returning only
Raymond Hettinger60eca932003-02-09 06:40:58 +0000132 those for which the predicate is \code{True}.
133 If \var{predicate} is \code{None}, return the items that are true.
134 Equivalent to:
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000135
136 \begin{verbatim}
Raymond Hettinger60eca932003-02-09 06:40:58 +0000137 def ifilter(predicate, iterable):
138 if predicate is None:
139 def predicate(x):
140 return x
141 for x in iterable:
142 if predicate(x):
143 yield x
144 \end{verbatim}
145\end{funcdesc}
146
147\begin{funcdesc}{ifilterfalse}{predicate, iterable}
148 Make an iterator that filters elements from iterable returning only
149 those for which the predicate is \code{False}.
150 If \var{predicate} is \code{None}, return the items that are false.
151 Equivalent to:
152
153 \begin{verbatim}
154 def ifilterfalse(predicate, iterable):
155 if predicate is None:
156 def predicate(x):
157 return x
158 for x in iterable:
159 if not predicate(x):
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000160 yield x
161 \end{verbatim}
162\end{funcdesc}
163
164\begin{funcdesc}{imap}{function, *iterables}
165 Make an iterator that computes the function using arguments from
166 each of the iterables. If \var{function} is set to \code{None}, then
167 \function{imap()} returns the arguments as a tuple. Like
168 \function{map()} but stops when the shortest iterable is exhausted
169 instead of filling in \code{None} for shorter iterables. The reason
170 for the difference is that infinite iterator arguments are typically
171 an error for \function{map()} (because the output is fully evaluated)
172 but represent a common and useful way of supplying arguments to
173 \function{imap()}.
174 Equivalent to:
175
176 \begin{verbatim}
177 def imap(function, *iterables):
178 iterables = map(iter, iterables)
179 while True:
180 args = [i.next() for i in iterables]
181 if function is None:
182 yield tuple(args)
183 else:
184 yield function(*args)
185 \end{verbatim}
186\end{funcdesc}
187
188\begin{funcdesc}{islice}{iterable, \optional{start,} stop \optional{, step}}
189 Make an iterator that returns selected elements from the iterable.
190 If \var{start} is non-zero, then elements from the iterable are skipped
191 until start is reached. Afterward, elements are returned consecutively
192 unless \var{step} is set higher than one which results in items being
193 skipped. If \var{stop} is specified, then iteration stops at the
194 specified element position; otherwise, it continues indefinitely or
195 until the iterable is exhausted. Unlike regular slicing,
196 \function{islice()} does not support negative values for \var{start},
197 \var{stop}, or \var{step}. Can be used to extract related fields
198 from data where the internal structure has been flattened (for
199 example, a multi-line report may list a name field on every
200 third line). Equivalent to:
201
202 \begin{verbatim}
203 def islice(iterable, *args):
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000204 s = slice(*args)
205 next = s.start or 0
206 stop = s.stop
207 step = s.step or 1
Raymond Hettinger60eca932003-02-09 06:40:58 +0000208 for cnt, element in enumerate(iterable):
209 if cnt < next:
210 continue
211 if cnt >= stop:
212 break
213 yield element
214 next += step
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000215 \end{verbatim}
216\end{funcdesc}
217
218\begin{funcdesc}{izip}{*iterables}
219 Make an iterator that aggregates elements from each of the iterables.
220 Like \function{zip()} except that it returns an iterator instead of
221 a list. Used for lock-step iteration over several iterables at a
222 time. Equivalent to:
223
224 \begin{verbatim}
225 def izip(*iterables):
226 iterables = map(iter, iterables)
227 while True:
228 result = [i.next() for i in iterables]
229 yield tuple(result)
230 \end{verbatim}
231\end{funcdesc}
232
Raymond Hettinger61fe64d2003-02-23 04:40:07 +0000233\begin{funcdesc}{repeat}{object\optional{, times}}
Raymond Hettinger1b18ba42003-02-21 01:45:34 +0000234 Make an iterator that returns \var{object} over and over again.
Raymond Hettinger61fe64d2003-02-23 04:40:07 +0000235 Runs indefinitely unless the \var{times} argument is specified.
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000236 Used as argument to \function{imap()} for invariant parameters
Raymond Hettinger1b18ba42003-02-21 01:45:34 +0000237 to the called function. Also used with \function{izip()} to create
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000238 an invariant part of a tuple record. Equivalent to:
239
240 \begin{verbatim}
Raymond Hettinger61fe64d2003-02-23 04:40:07 +0000241 def repeat(object, times=None):
242 if times is None:
243 while True:
244 yield object
245 else:
246 for i in xrange(times):
247 yield object
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000248 \end{verbatim}
249\end{funcdesc}
250
251\begin{funcdesc}{starmap}{function, iterable}
252 Make an iterator that computes the function using arguments tuples
253 obtained from the iterable. Used instead of \function{imap()} when
254 argument parameters are already grouped in tuples from a single iterable
255 (the data has been ``pre-zipped''). The difference between
Raymond Hettinger1b18ba42003-02-21 01:45:34 +0000256 \function{imap()} and \function{starmap()} parallels the distinction
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000257 between \code{function(a,b)} and \code{function(*c)}.
258 Equivalent to:
259
260 \begin{verbatim}
261 def starmap(function, iterable):
262 iterable = iter(iterable)
263 while True:
264 yield function(*iterable.next())
265 \end{verbatim}
266\end{funcdesc}
267
268\begin{funcdesc}{takewhile}{predicate, iterable}
269 Make an iterator that returns elements from the iterable as long as
270 the predicate is true. Equivalent to:
271
272 \begin{verbatim}
273 def takewhile(predicate, iterable):
274 iterable = iter(iterable)
275 while True:
276 x = iterable.next()
277 if predicate(x):
278 yield x
279 else:
280 break
281 \end{verbatim}
282\end{funcdesc}
283
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000284
285\subsection{Examples \label{itertools-example}}
286
287The following examples show common uses for each tool and
288demonstrate ways they can be combined.
289
290\begin{verbatim}
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000291
292>>> amounts = [120.15, 764.05, 823.14]
293>>> for checknum, amount in izip(count(1200), amounts):
294... print 'Check %d is for $%.2f' % (checknum, amount)
295...
296Check 1200 is for $120.15
297Check 1201 is for $764.05
298Check 1202 is for $823.14
299
300>>> import operator
301>>> for cube in imap(operator.pow, xrange(1,4), repeat(3)):
302... print cube
303...
3041
3058
30627
307
308>>> reportlines = ['EuroPython', 'Roster', '', 'alex', '', 'laura',
309 '', 'martin', '', 'walter', '', 'samuele']
310>>> for name in islice(reportlines, 3, len(reportlines), 2):
311... print name.title()
312...
313Alex
314Laura
315Martin
316Walter
317Samuele
318
319\end{verbatim}
320
321This section has further examples of how itertools can be combined.
322Note that \function{enumerate()} and \method{iteritems()} already
323have highly efficient implementations in Python. They are only
324included here to illustrate how higher level tools can be created
325from building blocks.
326
327\begin{verbatim}
328>>> def enumerate(iterable):
329... return izip(count(), iterable)
330
331>>> def tabulate(function):
332... "Return function(0), function(1), ..."
333... return imap(function, count())
334
335>>> def iteritems(mapping):
336... return izip(mapping.iterkeys(), mapping.itervalues())
337
338>>> def nth(iterable, n):
339... "Returns the nth item"
Raymond Hettinger60eca932003-02-09 06:40:58 +0000340... return list(islice(iterable, n, n+1))
341
342>>> def all(pred, seq):
343... "Returns True if pred(x) is True for every element in the iterable"
344... return not nth(ifilterfalse(pred, seq), 0)
345
346>>> def some(pred, seq):
347... "Returns True if pred(x) is True at least one element in the iterable"
348... return bool(nth(ifilter(pred, seq), 0))
349
350>>> def no(pred, seq):
351... "Returns True if pred(x) is False for every element in the iterable"
352... return not nth(ifilter(pred, seq), 0)
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000353
Raymond Hettinger61fe64d2003-02-23 04:40:07 +0000354>>> def pairwise(seq):
355... "s -> (s0,s1), (s1,s2), (s2, s3), ..."
356... return izip(seq, islice(seq,1,len(seq)))
357
Raymond Hettinger96ef8112003-02-01 00:10:11 +0000358\end{verbatim}