blob: 0366fa200021cc00aa82f456fb85129cb7a26768 [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Raymond Hettinger0aee9422009-02-17 11:00:27 +000017This module implements a number of :term:`iterator` building blocks inspired
18by constructs from APL, Haskell, and SML. Each has been recast in a form
19suitable for Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000020
21The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettinger0aee9422009-02-17 11:00:27 +000022useful by themselves or in combination. Together, they form an "iterator
23algebra" making it possible to construct specialized tools succinctly and
24efficiently in pure Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000025
26For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Ezio Melotti77a64e72010-01-21 20:50:57 +000027sequence ``f(0), f(1), ...``. The same effect can be achieved in Python
28by combining :func:`imap` and :func:`count` to form ``imap(f, count())``.
Georg Brandl8ec7f652007-08-15 14:28:01 +000029
Raymond Hettingerefa7c132009-03-12 00:31:58 +000030These tools and their built-in counterparts also work well with the high-speed
31functions in the :mod:`operator` module. For example, the multiplication
32operator can be mapped across two vectors to form an efficient dot-product:
33``sum(imap(operator.mul, vector1, vector2))``.
Georg Brandl8ec7f652007-08-15 14:28:01 +000034
Georg Brandl8ec7f652007-08-15 14:28:01 +000035
Raymond Hettinger0aee9422009-02-17 11:00:27 +000036**Infinite Iterators:**
Georg Brandl8ec7f652007-08-15 14:28:01 +000037
Raymond Hettingerf0f475d2009-04-10 13:16:50 +000038================== ================= ================================================= =========================================
39Iterator Arguments Results Example
40================== ================= ================================================= =========================================
41:func:`count` start, [step] start, start+step, start+2*step, ... ``count(10) --> 10 11 12 13 14 ...``
42:func:`cycle` p p0, p1, ... plast, p0, p1, ... ``cycle('ABCD') --> A B C D A B C D ...``
43:func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times ``repeat(10, 3) --> 10 10 10``
44================== ================= ================================================= =========================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000045
Raymond Hettinger0aee9422009-02-17 11:00:27 +000046**Iterators terminating on the shortest input sequence:**
47
Raymond Hettingerf0f475d2009-04-10 13:16:50 +000048==================== ============================ ================================================= =============================================================
49Iterator Arguments Results Example
50==================== ============================ ================================================= =============================================================
51:func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ... ``chain('ABC', 'DEF') --> A B C D E F``
52:func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ... ``compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F``
53:func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails ``dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1``
54:func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
55:func:`ifilter` pred, seq elements of seq where pred(elem) is True ``ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9``
56:func:`ifilterfalse` pred, seq elements of seq where pred(elem) is False ``ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8``
57:func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step] ``islice('ABCDEFG', 2, None) --> C D E F G``
58:func:`imap` func, p, q, ... func(p0, q0), func(p1, q1), ... ``imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000``
59:func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ... ``starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000``
60:func:`tee` it, n it1, it2 , ... itn splits one iterator into n
61:func:`takewhile` pred, seq seq[0], seq[1], until pred fails ``takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4``
62:func:`izip` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``izip('ABCD', 'xy') --> Ax By``
63:func:`izip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-``
64==================== ============================ ================================================= =============================================================
Raymond Hettinger0aee9422009-02-17 11:00:27 +000065
66**Combinatoric generators:**
67
Raymond Hettingerf0f475d2009-04-10 13:16:50 +000068============================================== ==================== =============================================================
69Iterator Arguments Results
70============================================== ==================== =============================================================
71:func:`product` p, q, ... [repeat=1] cartesian product, equivalent to a nested for-loop
72:func:`permutations` p[, r] r-length tuples, all possible orderings, no repeated elements
Raymond Hettinger9eac1192009-11-19 01:22:04 +000073:func:`combinations` p, r r-length tuples, in sorted order, no repeated elements
74:func:`combinations_with_replacement` p, r r-length tuples, in sorted order, with repeated elements
Raymond Hettingerf0f475d2009-04-10 13:16:50 +000075|
76``product('ABCD', repeat=2)`` ``AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD``
77``permutations('ABCD', 2)`` ``AB AC AD BA BC BD CA CB CD DA DB DC``
78``combinations('ABCD', 2)`` ``AB AC AD BC BD CD``
79``combinations_with_replacement('ABCD', 2)`` ``AA AB AC AD BB BC BD CC CD DD``
80============================================== ==================== =============================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000081
82
83.. _itertools-functions:
84
85Itertool functions
86------------------
87
88The following module functions all construct and return iterators. Some provide
89streams of infinite length, so they should only be accessed by functions or
90loops that truncate the stream.
91
92
93.. function:: chain(*iterables)
94
95 Make an iterator that returns elements from the first iterable until it is
96 exhausted, then proceeds to the next iterable, until all of the iterables are
97 exhausted. Used for treating consecutive sequences as a single sequence.
98 Equivalent to::
99
100 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000101 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +0000102 for it in iterables:
103 for element in it:
104 yield element
105
106
Raymond Hettinger330958e2008-02-28 19:41:24 +0000107.. function:: itertools.chain.from_iterable(iterable)
108
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000109 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e2008-02-28 19:41:24 +0000110 single iterable argument that is evaluated lazily. Equivalent to::
111
112 @classmethod
113 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000114 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +0000115 for it in iterables:
116 for element in it:
117 yield element
118
119 .. versionadded:: 2.6
120
Raymond Hettingerd553d852008-03-04 04:17:08 +0000121
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000122.. function:: combinations(iterable, r)
123
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000124 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000125
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000126 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000127 input *iterable* is sorted, the combination tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000128 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000129
130 Elements are treated as unique based on their position, not on their
131 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000132 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000133
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000134 Equivalent to::
135
136 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000137 # combinations('ABCD', 2) --> AB AC AD BC BD CD
138 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000139 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000140 n = len(pool)
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000141 if r > n:
142 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000143 indices = range(r)
144 yield tuple(pool[i] for i in indices)
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000145 while True:
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000146 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000147 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000148 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000149 else:
150 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000151 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000152 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000153 indices[j] = indices[j-1] + 1
154 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000155
Raymond Hettingerd553d852008-03-04 04:17:08 +0000156 The code for :func:`combinations` can be also expressed as a subsequence
157 of :func:`permutations` after filtering entries where the elements are not
158 in sorted order (according to their position in the input pool)::
159
160 def combinations(iterable, r):
161 pool = tuple(iterable)
162 n = len(pool)
163 for indices in permutations(range(n), r):
164 if sorted(indices) == list(indices):
165 yield tuple(pool[i] for i in indices)
166
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000167 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
168 or zero when ``r > n``.
169
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000170 .. versionadded:: 2.6
171
Raymond Hettingerd081abc2009-01-27 02:58:49 +0000172.. function:: combinations_with_replacement(iterable, r)
173
174 Return *r* length subsequences of elements from the input *iterable*
175 allowing individual elements to be repeated more than once.
176
177 Combinations are emitted in lexicographic sort order. So, if the
178 input *iterable* is sorted, the combination tuples will be produced
179 in sorted order.
180
181 Elements are treated as unique based on their position, not on their
182 value. So if the input elements are unique, the generated combinations
183 will also be unique.
184
185 Equivalent to::
186
187 def combinations_with_replacement(iterable, r):
188 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
189 pool = tuple(iterable)
190 n = len(pool)
191 if not n and r:
192 return
193 indices = [0] * r
194 yield tuple(pool[i] for i in indices)
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000195 while True:
Raymond Hettingerd081abc2009-01-27 02:58:49 +0000196 for i in reversed(range(r)):
197 if indices[i] != n - 1:
198 break
199 else:
200 return
201 indices[i:] = [indices[i] + 1] * (r - i)
202 yield tuple(pool[i] for i in indices)
203
204 The code for :func:`combinations_with_replacement` can be also expressed as
205 a subsequence of :func:`product` after filtering entries where the elements
206 are not in sorted order (according to their position in the input pool)::
207
208 def combinations_with_replacement(iterable, r):
209 pool = tuple(iterable)
210 n = len(pool)
211 for indices in product(range(n), repeat=r):
212 if sorted(indices) == list(indices):
213 yield tuple(pool[i] for i in indices)
214
215 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
216
217 .. versionadded:: 2.7
218
Raymond Hettinger2bcb8e92009-01-25 21:04:14 +0000219.. function:: compress(data, selectors)
220
221 Make an iterator that filters elements from *data* returning only those that
222 have a corresponding element in *selectors* that evaluates to ``True``.
Andrew M. Kuchlingefa97712009-03-30 23:08:24 +0000223 Stops when either the *data* or *selectors* iterables has been exhausted.
Raymond Hettinger2bcb8e92009-01-25 21:04:14 +0000224 Equivalent to::
225
226 def compress(data, selectors):
227 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
228 return (d for d, s in izip(data, selectors) if s)
229
230 .. versionadded:: 2.7
231
232
Raymond Hettingera4038032009-02-14 00:25:51 +0000233.. function:: count(start=0, step=1)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000234
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000235 Make an iterator that returns evenly spaced values starting with *n*. Often
236 used as an argument to :func:`imap` to generate consecutive data points.
237 Also, used with :func:`izip` to add sequence numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000238
Raymond Hettingera4038032009-02-14 00:25:51 +0000239 def count(start=0, step=1):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000240 # count(10) --> 10 11 12 13 14 ...
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000241 # count(2.5, 0.5) -> 3.5 3.0 4.5 ...
Raymond Hettingera4038032009-02-14 00:25:51 +0000242 n = start
Georg Brandl8ec7f652007-08-15 14:28:01 +0000243 while True:
244 yield n
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000245 n += step
Georg Brandl8ec7f652007-08-15 14:28:01 +0000246
Raymond Hettinger3a026242009-06-17 01:43:47 +0000247 When counting with floating point numbers, better accuracy can sometimes be
248 achieved by substituting multiplicative code such as: ``(start + step * i
249 for i in count())``.
250
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000251 .. versionchanged:: 2.7
252 added *step* argument and allowed non-integer arguments.
253
Georg Brandl8ec7f652007-08-15 14:28:01 +0000254.. function:: cycle(iterable)
255
256 Make an iterator returning elements from the iterable and saving a copy of each.
257 When the iterable is exhausted, return elements from the saved copy. Repeats
258 indefinitely. Equivalent to::
259
260 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000261 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000262 saved = []
263 for element in iterable:
264 yield element
265 saved.append(element)
266 while saved:
267 for element in saved:
268 yield element
269
270 Note, this member of the toolkit may require significant auxiliary storage
271 (depending on the length of the iterable).
272
273
274.. function:: dropwhile(predicate, iterable)
275
276 Make an iterator that drops elements from the iterable as long as the predicate
277 is true; afterwards, returns every element. Note, the iterator does not produce
278 *any* output until the predicate first becomes false, so it may have a lengthy
279 start-up time. Equivalent to::
280
281 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000282 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000283 iterable = iter(iterable)
284 for x in iterable:
285 if not predicate(x):
286 yield x
287 break
288 for x in iterable:
289 yield x
290
291
292.. function:: groupby(iterable[, key])
293
294 Make an iterator that returns consecutive keys and groups from the *iterable*.
295 The *key* is a function computing a key value for each element. If not
296 specified or is ``None``, *key* defaults to an identity function and returns
297 the element unchanged. Generally, the iterable needs to already be sorted on
298 the same key function.
299
300 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
301 generates a break or new group every time the value of the key function changes
302 (which is why it is usually necessary to have sorted the data using the same key
303 function). That behavior differs from SQL's GROUP BY which aggregates common
304 elements regardless of their input order.
305
306 The returned group is itself an iterator that shares the underlying iterable
307 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
308 object is advanced, the previous group is no longer visible. So, if that data
309 is needed later, it should be stored as a list::
310
311 groups = []
312 uniquekeys = []
313 data = sorted(data, key=keyfunc)
314 for k, g in groupby(data, keyfunc):
315 groups.append(list(g)) # Store group iterator as a list
316 uniquekeys.append(k)
317
318 :func:`groupby` is equivalent to::
319
320 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000321 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettingerd507afd2009-02-04 10:52:32 +0000322 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000323 def __init__(self, iterable, key=None):
324 if key is None:
325 key = lambda x: x
326 self.keyfunc = key
327 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000328 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000329 def __iter__(self):
330 return self
331 def next(self):
332 while self.currkey == self.tgtkey:
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000333 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000334 self.currkey = self.keyfunc(self.currvalue)
335 self.tgtkey = self.currkey
336 return (self.currkey, self._grouper(self.tgtkey))
337 def _grouper(self, tgtkey):
338 while self.currkey == tgtkey:
339 yield self.currvalue
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000340 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000341 self.currkey = self.keyfunc(self.currvalue)
342
343 .. versionadded:: 2.4
344
345
346.. function:: ifilter(predicate, iterable)
347
348 Make an iterator that filters elements from iterable returning only those for
349 which the predicate is ``True``. If *predicate* is ``None``, return the items
350 that are true. Equivalent to::
351
352 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000353 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000354 if predicate is None:
355 predicate = bool
356 for x in iterable:
357 if predicate(x):
358 yield x
359
360
361.. function:: ifilterfalse(predicate, iterable)
362
363 Make an iterator that filters elements from iterable returning only those for
364 which the predicate is ``False``. If *predicate* is ``None``, return the items
365 that are false. Equivalent to::
366
367 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000368 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000369 if predicate is None:
370 predicate = bool
371 for x in iterable:
372 if not predicate(x):
373 yield x
374
375
376.. function:: imap(function, *iterables)
377
378 Make an iterator that computes the function using arguments from each of the
379 iterables. If *function* is set to ``None``, then :func:`imap` returns the
380 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
381 exhausted instead of filling in ``None`` for shorter iterables. The reason for
382 the difference is that infinite iterator arguments are typically an error for
383 :func:`map` (because the output is fully evaluated) but represent a common and
384 useful way of supplying arguments to :func:`imap`. Equivalent to::
385
386 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000387 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000388 iterables = map(iter, iterables)
389 while True:
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000390 args = [next(it) for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000391 if function is None:
392 yield tuple(args)
393 else:
394 yield function(*args)
395
396
397.. function:: islice(iterable, [start,] stop [, step])
398
399 Make an iterator that returns selected elements from the iterable. If *start* is
400 non-zero, then elements from the iterable are skipped until start is reached.
401 Afterward, elements are returned consecutively unless *step* is set higher than
402 one which results in items being skipped. If *stop* is ``None``, then iteration
403 continues until the iterator is exhausted, if at all; otherwise, it stops at the
404 specified position. Unlike regular slicing, :func:`islice` does not support
405 negative values for *start*, *stop*, or *step*. Can be used to extract related
406 fields from data where the internal structure has been flattened (for example, a
407 multi-line report may list a name field on every third line). Equivalent to::
408
409 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000410 # islice('ABCDEFG', 2) --> A B
411 # islice('ABCDEFG', 2, 4) --> C D
412 # islice('ABCDEFG', 2, None) --> C D E F G
413 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000414 s = slice(*args)
415 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000416 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000417 for i, element in enumerate(iterable):
418 if i == nexti:
419 yield element
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000420 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000421
422 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
423 then the step defaults to one.
424
425 .. versionchanged:: 2.5
426 accept ``None`` values for default *start* and *step*.
427
428
429.. function:: izip(*iterables)
430
431 Make an iterator that aggregates elements from each of the iterables. Like
432 :func:`zip` except that it returns an iterator instead of a list. Used for
433 lock-step iteration over several iterables at a time. Equivalent to::
434
435 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000436 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000437 iterables = map(iter, iterables)
438 while iterables:
Raymond Hettingercccfc822009-04-20 18:23:57 +0000439 yield tuple(map(next, iterables))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000440
441 .. versionchanged:: 2.4
442 When no iterables are specified, returns a zero length iterator instead of
443 raising a :exc:`TypeError` exception.
444
Raymond Hettinger48c62932008-01-22 19:51:41 +0000445 The left-to-right evaluation order of the iterables is guaranteed. This
446 makes possible an idiom for clustering a data series into n-length groups
447 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000448
Raymond Hettinger48c62932008-01-22 19:51:41 +0000449 :func:`izip` should only be used with unequal length inputs when you don't
450 care about trailing, unmatched values from the longer iterables. If those
451 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000452
453
454.. function:: izip_longest(*iterables[, fillvalue])
455
456 Make an iterator that aggregates elements from each of the iterables. If the
457 iterables are of uneven length, missing values are filled-in with *fillvalue*.
458 Iteration continues until the longest iterable is exhausted. Equivalent to::
459
460 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000461 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000462 fillvalue = kwds.get('fillvalue')
463 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
464 yield counter() # yields the fillvalue, or raises IndexError
465 fillers = repeat(fillvalue)
466 iters = [chain(it, sentinel(), fillers) for it in args]
467 try:
468 for tup in izip(*iters):
469 yield tup
470 except IndexError:
471 pass
472
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000473 If one of the iterables is potentially infinite, then the
474 :func:`izip_longest` function should be wrapped with something that limits
475 the number of calls (for example :func:`islice` or :func:`takewhile`). If
476 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000477
478 .. versionadded:: 2.6
479
Raymond Hettinger330958e2008-02-28 19:41:24 +0000480.. function:: permutations(iterable[, r])
481
482 Return successive *r* length permutations of elements in the *iterable*.
483
484 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000485 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e2008-02-28 19:41:24 +0000486 are generated.
487
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000488 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e2008-02-28 19:41:24 +0000489 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000490 in sorted order.
Raymond Hettinger330958e2008-02-28 19:41:24 +0000491
492 Elements are treated as unique based on their position, not on their
493 value. So if the input elements are unique, there will be no repeat
494 values in each permutation.
495
Raymond Hettingerf287f172008-03-02 10:59:31 +0000496 Equivalent to::
497
498 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000499 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
500 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000501 pool = tuple(iterable)
502 n = len(pool)
503 r = n if r is None else r
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000504 if r > n:
505 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000506 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000507 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000508 yield tuple(pool[i] for i in indices[:r])
509 while n:
510 for i in reversed(range(r)):
511 cycles[i] -= 1
512 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000513 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000514 cycles[i] = n - i
515 else:
516 j = cycles[i]
517 indices[i], indices[-j] = indices[-j], indices[i]
518 yield tuple(pool[i] for i in indices[:r])
519 break
520 else:
521 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000522
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000523 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000524 :func:`product`, filtered to exclude entries with repeated elements (those
525 from the same position in the input pool)::
526
527 def permutations(iterable, r=None):
528 pool = tuple(iterable)
529 n = len(pool)
530 r = n if r is None else r
531 for indices in product(range(n), repeat=r):
532 if len(set(indices)) == r:
533 yield tuple(pool[i] for i in indices)
534
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000535 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
536 or zero when ``r > n``.
537
Raymond Hettinger330958e2008-02-28 19:41:24 +0000538 .. versionadded:: 2.6
539
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000540.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000541
542 Cartesian product of input iterables.
543
544 Equivalent to nested for-loops in a generator expression. For example,
545 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
546
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000547 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000548 on every iteration. This pattern creates a lexicographic ordering so that if
549 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000550 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000551
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000552 To compute the product of an iterable with itself, specify the number of
553 repetitions with the optional *repeat* keyword argument. For example,
554 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
555
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000556 This function is equivalent to the following code, except that the
557 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000558
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000559 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000560 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
561 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000562 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000563 result = [[]]
564 for pool in pools:
565 result = [x+[y] for x in result for y in pool]
566 for prod in result:
567 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000568
569 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000570
571.. function:: repeat(object[, times])
572
573 Make an iterator that returns *object* over and over again. Runs indefinitely
574 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000575 invariant function parameters. Also used with :func:`izip` to create constant
576 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000577
578 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000579 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000580 if times is None:
581 while True:
582 yield object
583 else:
584 for i in xrange(times):
585 yield object
586
587
588.. function:: starmap(function, iterable)
589
Raymond Hettinger47317092008-01-17 03:02:14 +0000590 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000591 the iterable. Used instead of :func:`imap` when argument parameters are already
592 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
593 difference between :func:`imap` and :func:`starmap` parallels the distinction
594 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
595
596 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000597 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000598 for args in iterable:
599 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000600
Raymond Hettinger47317092008-01-17 03:02:14 +0000601 .. versionchanged:: 2.6
602 Previously, :func:`starmap` required the function arguments to be tuples.
603 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000604
605.. function:: takewhile(predicate, iterable)
606
607 Make an iterator that returns elements from the iterable as long as the
608 predicate is true. Equivalent to::
609
610 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000611 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000612 for x in iterable:
613 if predicate(x):
614 yield x
615 else:
616 break
617
618
619.. function:: tee(iterable[, n=2])
620
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000621 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000622
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000623 def tee(iterable, n=2):
624 it = iter(iterable)
625 deques = [collections.deque() for i in range(n)]
626 def gen(mydeque):
627 while True:
628 if not mydeque: # when the local deque is empty
629 newval = next(it) # fetch a new value and
630 for d in deques: # load it to all the deques
631 d.append(newval)
632 yield mydeque.popleft()
633 return tuple(gen(d) for d in deques)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000634
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000635 Once :func:`tee` has made a split, the original *iterable* should not be
636 used anywhere else; otherwise, the *iterable* could get advanced without
637 the tee objects being informed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000638
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000639 This itertool may require significant auxiliary storage (depending on how
640 much temporary data needs to be stored). In general, if one iterator uses
641 most or all of the data before another iterator starts, it is faster to use
642 :func:`list` instead of :func:`tee`.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000643
644 .. versionadded:: 2.4
645
646
Georg Brandl8ec7f652007-08-15 14:28:01 +0000647.. _itertools-recipes:
648
649Recipes
650-------
651
652This section shows recipes for creating an extended toolset using the existing
653itertools as building blocks.
654
655The extended tools offer the same high performance as the underlying toolset.
656The superior memory performance is kept by processing elements one at a time
657rather than bringing the whole iterable into memory all at once. Code volume is
658kept small by linking the tools together in a functional style which helps
659eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000660"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000661which incur interpreter overhead.
662
663.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000664
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000665 def take(n, iterable):
666 "Return first n items of the iterable as a list"
667 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000668
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000669 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000670 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000671 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000672
Raymond Hettinger3496a892009-03-09 11:57:29 +0000673 def consume(iterator, n):
674 "Advance the iterator n-steps ahead. If n is none, consume entirely."
Raymond Hettingerb8d688c2010-03-28 18:25:01 +0000675 # Use functions that consume iterators at C speed.
676 if n is None:
677 # feed the entire iterator into a zero-length deque
678 collections.deque(iterator, maxlen=0)
679 else:
680 # advance to the emtpy slice starting at position n
681 next(islice(iterator, n, n), None)
Raymond Hettinger3496a892009-03-09 11:57:29 +0000682
Raymond Hettingerf9bce832009-02-19 05:34:35 +0000683 def nth(iterable, n, default=None):
684 "Returns the nth item or a default value"
685 return next(islice(iterable, n, None), default)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000686
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000687 def quantify(iterable, pred=bool):
688 "Count how many times the predicate is true"
689 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000690
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000691 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000692 """Returns the sequence elements and then returns None indefinitely.
693
694 Useful for emulating the behavior of the built-in map() function.
695 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000696 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000697
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000698 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000699 "Returns the sequence elements n times"
Raymond Hettingerf28dd0d2010-04-02 06:23:12 +0000700 return chain.from_iterable(repeat(tuple(iterable), n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000701
702 def dotproduct(vec1, vec2):
703 return sum(imap(operator.mul, vec1, vec2))
704
705 def flatten(listOfLists):
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000706 "Flatten one level of nesting"
707 return chain.from_iterable(listOfLists)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000708
709 def repeatfunc(func, times=None, *args):
710 """Repeat calls to func with specified arguments.
711
712 Example: repeatfunc(random.random)
713 """
714 if times is None:
715 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000716 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000717
718 def pairwise(iterable):
719 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
720 a, b = tee(iterable)
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000721 next(b, None)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000722 return izip(a, b)
723
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000724 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000725 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000726 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000727 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000728
Raymond Hettingera44327a2008-01-30 22:17:31 +0000729 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000730 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000731 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000732 pending = len(iterables)
733 nexts = cycle(iter(it).next for it in iterables)
734 while pending:
735 try:
736 for next in nexts:
737 yield next()
738 except StopIteration:
739 pending -= 1
740 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000741
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000742 def powerset(iterable):
Raymond Hettinger68d919e2009-01-25 21:31:47 +0000743 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
744 s = list(iterable)
745 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000746
Benjamin Peterson48291362009-01-31 20:01:48 +0000747 def unique_everseen(iterable, key=None):
748 "List unique elements, preserving order. Remember all elements ever seen."
749 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
750 # unique_everseen('ABBCcAD', str.lower) --> A B C D
751 seen = set()
752 seen_add = seen.add
753 if key is None:
Raymond Hettinger5b027f82010-03-28 18:02:41 +0000754 for element in ifilterfalse(seen.__contains__, iterable):
755 seen_add(element)
756 yield element
Benjamin Peterson48291362009-01-31 20:01:48 +0000757 else:
758 for element in iterable:
759 k = key(element)
760 if k not in seen:
761 seen_add(k)
762 yield element
Raymond Hettinger44e15812009-01-02 21:26:45 +0000763
Benjamin Peterson48291362009-01-31 20:01:48 +0000764 def unique_justseen(iterable, key=None):
765 "List unique elements, preserving order. Remember only the element just seen."
766 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
767 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
768 return imap(next, imap(itemgetter(1), groupby(iterable, key)))
Raymond Hettinger5b027f82010-03-28 18:02:41 +0000769
770 def iter_except(func, exception, first=None):
771 """ Call a function repeatedly until an exception is raised.
772
773 Converts a call-until-exception interface to an iterator interface.
774 Like __builtin__.iter(func, sentinel) but uses an exception instead
775 of a sentinel to end the loop.
776
777 Examples:
778 bsddbiter = iter_except(db.next, bsddb.error, db.first)
779 heapiter = iter_except(functools.partial(heappop, h), IndexError)
780 dictiter = iter_except(d.popitem, KeyError)
781 dequeiter = iter_except(d.popleft, IndexError)
782 queueiter = iter_except(q.get_nowait, Queue.Empty)
783 setiter = iter_except(s.pop, KeyError)
784
785 """
786 try:
787 if first is not None:
788 yield first()
789 while 1:
790 yield func()
791 except exception:
792 pass
793
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000794 def random_product(*args, **kwds):
795 "Random selection from itertools.product(*args, **kwds)"
796 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerf28dd0d2010-04-02 06:23:12 +0000797 return tuple(random.choice(pool) for pool in pools)
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000798
Raymond Hettingera1d61d02010-04-10 07:01:32 +0000799 def random_permutation(iterable, r=None):
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000800 "Random selection from itertools.permutations(iterable, r)"
Raymond Hettingerf28dd0d2010-04-02 06:23:12 +0000801 pool = tuple(iterable)
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000802 r = len(pool) if r is None else r
Raymond Hettingerf28dd0d2010-04-02 06:23:12 +0000803 return tuple(random.sample(pool, r))
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000804
805 def random_combination(iterable, r):
806 "Random selection from itertools.combinations(iterable, r)"
Raymond Hettingerf28dd0d2010-04-02 06:23:12 +0000807 pool = tuple(iterable)
Raymond Hettingera1d61d02010-04-10 07:01:32 +0000808 n = len(pool)
809 indices = sorted(random.sample(xrange(n), r))
810 return tuple(pool[i] for i in indices)
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000811
812 def random_combination_with_replacement(iterable, r):
813 "Random selection from itertools.combinations_with_replacement(iterable, r)"
Raymond Hettingerf28dd0d2010-04-02 06:23:12 +0000814 pool = tuple(iterable)
Raymond Hettingera1d61d02010-04-10 07:01:32 +0000815 n = len(pool)
816 indices = sorted(random.randrange(n) for i in xrange(r))
817 return tuple(pool[i] for i in indices)
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000818
Raymond Hettingerd282b932010-03-28 18:08:15 +0000819Note, many of the above recipes can be optimized by replacing global lookups
820with local variables defined as default values. For example, the
821*dotproduct* recipe can be written as::
822
823 def dotproduct(vec1, vec2, sum=sum, imap=imap, mul=operator.mul):
824 return sum(imap(mul, vec1, vec2))