blob: 8f6c655506814a50f85001ab64ded0fc953fe645 [file] [log] [blame]
Raymond Hettinger96ef8112003-02-01 00:10:11 +00001\section{\module{itertools} ---
2 Functions creating iterators for efficient looping}
3
4\declaremodule{standard}{itertools}
5\modulesynopsis{Functions creating iterators for efficient looping.}
6\moduleauthor{Raymond Hettinger}{python@rcn.com}
7\sectionauthor{Raymond Hettinger}{python@rcn.com}
8\versionadded{2.3}
9
10
11This module implements a number of iterator building blocks inspired
12by constructs from the Haskell and SML programming languages. Each
13has been recast in a form suitable for Python.
14
15With the advent of iterators and generators in Python 2.3, each of
16these tools can be expressed easily and succinctly in pure python.
17Rather duplicating what can already be done, this module emphasizes
18providing value in other ways:
19
20\begin{itemize}
21
22 \item Instead of constructing an over-specialized toolset, this module
23 provides basic building blocks that can be readily combined.
24
25 For instance, SML provides a tabulation tool: \code{tabulate(\var{f})}
26 which produces a sequence \code{f(0), f(1), ...}. This toolbox
27 takes a different approach of providing \function{imap()} and
28 \function{count()} which can be combined to form
29 \code{imap(\var{f}, count())} and produce an equivalent result.
30
31 \item Some tools were dropped because they offer no advantage over their
32 pure python counterparts or because their behavior was too
33 surprising.
34
35 For instance, SML provides a tool: \code{cycle(\var{seq})} which
36 loops over the sequence elements and then starts again when the
37 sequence is exhausted. The surprising behavior is the need for
38 significant auxiliary storage (unusual for iterators). Also, it
39 is trivially implemented in python with almost no performance
40 penalty.
41
42 \item Another source of value comes from standardizing a core set of tools
43 to avoid the readability and reliability problems that arise when many
44 different individuals create their own slightly varying implementations
45 each with their own quirks and naming conventions.
46
47 \item Whether cast in pure python form or C code, tools that use iterators
48 are more memory efficient (and faster) than their list based counterparts.
49 Adopting the principles of just-in-time manufacturing, they create
50 data when and where needed instead of consuming memory with the
51 computer equivalent of ``inventory''.
52
53\end{itemize}
54
55\begin{seealso}
56 \seetext{The Standard ML Basis Library,
57 \citetitle[http://www.standardml.org/Basis/]
58 {The Standard ML Basis Library}.}
59
60 \seetext{Haskell, A Purely Functional Language,
61 \citetitle[http://www.haskell.org/definition/]
62 {Definition of Haskell and the Standard Libraries}.}
63\end{seealso}
64
65
66\subsection{Itertool functions \label{itertools-functions}}
67
68The following module functions all construct and return iterators.
69Some provide streams of infinite length, so they should only be accessed
70by functions or loops that truncate the stream.
71
72\begin{funcdesc}{count}{\optional{n}}
73 Make an iterator that returns consecutive integers starting with \var{n}.
74 Does not currently support python long integers. Often used as an
75 argument to \function{imap()} to generate consecutive data points.
76 Also, used in \function{izip()} to add sequence numbers. Equivalent to:
77
78 \begin{verbatim}
79 def count(n=0):
80 cnt = n
81 while True:
82 yield cnt
83 cnt += 1
84 \end{verbatim}
Raymond Hettinger2012f172003-02-07 05:32:58 +000085
86 Note, \function{count()} does not check for overflow and will return
87 negative numbers after exceeding \code{sys.maxint}. This behavior
88 may change in the future.
Raymond Hettinger96ef8112003-02-01 00:10:11 +000089\end{funcdesc}
90
91\begin{funcdesc}{dropwhile}{predicate, iterable}
92 Make an iterator that drops elements from the iterable as long as
93 the predicate is true; afterwards, returns every element. Note,
94 the iterator does not produce \emph{any} output until the predicate
95 is true, so it may have a lengthy start-up time. Equivalent to:
96
97 \begin{verbatim}
98 def dropwhile(predicate, iterable):
99 iterable = iter(iterable)
100 while True:
101 x = iterable.next()
102 if predicate(x): continue # drop when predicate is true
103 yield x
104 break
105 while True:
106 yield iterable.next()
107 \end{verbatim}
108\end{funcdesc}
109
110\begin{funcdesc}{ifilter}{predicate, iterable \optional{, invert}}
111 Make an iterator that filters elements from iterable returning only
112 those for which the predicate is \code{True}. If
113 \var{invert} is \code{True}, then reverse the process and pass through
114 only those elements for which the predicate is \code{False}.
115 If \var{predicate} is \code{None}, return the items that are true
116 (or false if \var{invert} has been set). Equivalent to:
117
118 \begin{verbatim}
119 def ifilter(predicate, iterable, invert=False):
120 iterable = iter(iterable)
121 while True:
122 x = iterable.next()
123 if predicate is None:
124 b = bool(x)
125 else:
126 b = bool(predicate(x))
127 if not invert and b or invert and not b:
128 yield x
129 \end{verbatim}
130\end{funcdesc}
131
132\begin{funcdesc}{imap}{function, *iterables}
133 Make an iterator that computes the function using arguments from
134 each of the iterables. If \var{function} is set to \code{None}, then
135 \function{imap()} returns the arguments as a tuple. Like
136 \function{map()} but stops when the shortest iterable is exhausted
137 instead of filling in \code{None} for shorter iterables. The reason
138 for the difference is that infinite iterator arguments are typically
139 an error for \function{map()} (because the output is fully evaluated)
140 but represent a common and useful way of supplying arguments to
141 \function{imap()}.
142 Equivalent to:
143
144 \begin{verbatim}
145 def imap(function, *iterables):
146 iterables = map(iter, iterables)
147 while True:
148 args = [i.next() for i in iterables]
149 if function is None:
150 yield tuple(args)
151 else:
152 yield function(*args)
153 \end{verbatim}
154\end{funcdesc}
155
156\begin{funcdesc}{islice}{iterable, \optional{start,} stop \optional{, step}}
157 Make an iterator that returns selected elements from the iterable.
158 If \var{start} is non-zero, then elements from the iterable are skipped
159 until start is reached. Afterward, elements are returned consecutively
160 unless \var{step} is set higher than one which results in items being
161 skipped. If \var{stop} is specified, then iteration stops at the
162 specified element position; otherwise, it continues indefinitely or
163 until the iterable is exhausted. Unlike regular slicing,
164 \function{islice()} does not support negative values for \var{start},
165 \var{stop}, or \var{step}. Can be used to extract related fields
166 from data where the internal structure has been flattened (for
167 example, a multi-line report may list a name field on every
168 third line). Equivalent to:
169
170 \begin{verbatim}
171 def islice(iterable, *args):
172 iterable = iter(iterable)
173 s = slice(*args)
174 next = s.start or 0
175 stop = s.stop
176 step = s.step or 1
177 cnt = 0
178 while True:
179 while cnt < next:
180 dummy = iterable.next()
181 cnt += 1
182 if cnt >= stop:
183 break
184 yield iterable.next()
185 cnt += 1
186 next += step
187 \end{verbatim}
188\end{funcdesc}
189
190\begin{funcdesc}{izip}{*iterables}
191 Make an iterator that aggregates elements from each of the iterables.
192 Like \function{zip()} except that it returns an iterator instead of
193 a list. Used for lock-step iteration over several iterables at a
194 time. Equivalent to:
195
196 \begin{verbatim}
197 def izip(*iterables):
198 iterables = map(iter, iterables)
199 while True:
200 result = [i.next() for i in iterables]
201 yield tuple(result)
202 \end{verbatim}
203\end{funcdesc}
204
205\begin{funcdesc}{repeat}{obj}
206 Make an iterator that returns \var{obj} over and over again.
207 Used as argument to \function{imap()} for invariant parameters
208 to the called function. Also used with function{izip()} to create
209 an invariant part of a tuple record. Equivalent to:
210
211 \begin{verbatim}
212 def repeat(x):
213 while True:
214 yield x
215 \end{verbatim}
216\end{funcdesc}
217
218\begin{funcdesc}{starmap}{function, iterable}
219 Make an iterator that computes the function using arguments tuples
220 obtained from the iterable. Used instead of \function{imap()} when
221 argument parameters are already grouped in tuples from a single iterable
222 (the data has been ``pre-zipped''). The difference between
223 \function{imap()} and \function{starmap} parallels the distinction
224 between \code{function(a,b)} and \code{function(*c)}.
225 Equivalent to:
226
227 \begin{verbatim}
228 def starmap(function, iterable):
229 iterable = iter(iterable)
230 while True:
231 yield function(*iterable.next())
232 \end{verbatim}
233\end{funcdesc}
234
235\begin{funcdesc}{takewhile}{predicate, iterable}
236 Make an iterator that returns elements from the iterable as long as
237 the predicate is true. Equivalent to:
238
239 \begin{verbatim}
240 def takewhile(predicate, iterable):
241 iterable = iter(iterable)
242 while True:
243 x = iterable.next()
244 if predicate(x):
245 yield x
246 else:
247 break
248 \end{verbatim}
249\end{funcdesc}
250
251\begin{funcdesc}{times}{n, \optional{object}}
252 Make an iterator that returns \var{object} \var{n} times.
253 \var{object} defaults to \code{None}. Used for looping a specific
254 number of times without creating a number object on each pass.
255 Equivalent to:
256
257 \begin{verbatim}
258 def times(n, object=None):
259 if n<0 : raise ValueError
260 for i in xrange(n):
261 yield object
262 \end{verbatim}
263\end{funcdesc}
264
265
266\subsection{Examples \label{itertools-example}}
267
268The following examples show common uses for each tool and
269demonstrate ways they can be combined.
270
271\begin{verbatim}
272>>> for i in times(3):
273... print "Hello"
274...
275Hello
276Hello
277Hello
278
279>>> amounts = [120.15, 764.05, 823.14]
280>>> for checknum, amount in izip(count(1200), amounts):
281... print 'Check %d is for $%.2f' % (checknum, amount)
282...
283Check 1200 is for $120.15
284Check 1201 is for $764.05
285Check 1202 is for $823.14
286
287>>> import operator
288>>> for cube in imap(operator.pow, xrange(1,4), repeat(3)):
289... print cube
290...
2911
2928
29327
294
295>>> reportlines = ['EuroPython', 'Roster', '', 'alex', '', 'laura',
296 '', 'martin', '', 'walter', '', 'samuele']
297>>> for name in islice(reportlines, 3, len(reportlines), 2):
298... print name.title()
299...
300Alex
301Laura
302Martin
303Walter
304Samuele
305
306\end{verbatim}
307
308This section has further examples of how itertools can be combined.
309Note that \function{enumerate()} and \method{iteritems()} already
310have highly efficient implementations in Python. They are only
311included here to illustrate how higher level tools can be created
312from building blocks.
313
314\begin{verbatim}
315>>> def enumerate(iterable):
316... return izip(count(), iterable)
317
318>>> def tabulate(function):
319... "Return function(0), function(1), ..."
320... return imap(function, count())
321
322>>> def iteritems(mapping):
323... return izip(mapping.iterkeys(), mapping.itervalues())
324
325>>> def nth(iterable, n):
326... "Returns the nth item"
327... return islice(iterable, n, n+1).next()
328
329\end{verbatim}