blob: 918918cc949f314e02eccc509ab1d05b747ad538 [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Raymond Hettinger0aee9422009-02-17 11:00:27 +000017This module implements a number of :term:`iterator` building blocks inspired
18by constructs from APL, Haskell, and SML. Each has been recast in a form
19suitable for Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000020
21The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettinger0aee9422009-02-17 11:00:27 +000022useful by themselves or in combination. Together, they form an "iterator
23algebra" making it possible to construct specialized tools succinctly and
24efficiently in pure Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000025
26For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
27sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
Raymond Hettinger0aee9422009-02-17 11:00:27 +000028:func:`count` which can be combined to form ``imap(f, count())`` to produce an
Georg Brandl8ec7f652007-08-15 14:28:01 +000029equivalent result.
30
Raymond Hettingerefa7c132009-03-12 00:31:58 +000031These tools and their built-in counterparts also work well with the high-speed
32functions in the :mod:`operator` module. For example, the multiplication
33operator can be mapped across two vectors to form an efficient dot-product:
34``sum(imap(operator.mul, vector1, vector2))``.
Georg Brandl8ec7f652007-08-15 14:28:01 +000035
Georg Brandl8ec7f652007-08-15 14:28:01 +000036
Raymond Hettinger0aee9422009-02-17 11:00:27 +000037**Infinite Iterators:**
Georg Brandl8ec7f652007-08-15 14:28:01 +000038
Raymond Hettingerf0f475d2009-04-10 13:16:50 +000039================== ================= ================================================= =========================================
40Iterator Arguments Results Example
41================== ================= ================================================= =========================================
42:func:`count` start, [step] start, start+step, start+2*step, ... ``count(10) --> 10 11 12 13 14 ...``
43:func:`cycle` p p0, p1, ... plast, p0, p1, ... ``cycle('ABCD') --> A B C D A B C D ...``
44:func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times ``repeat(10, 3) --> 10 10 10``
45================== ================= ================================================= =========================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000046
Raymond Hettinger0aee9422009-02-17 11:00:27 +000047**Iterators terminating on the shortest input sequence:**
48
Raymond Hettingerf0f475d2009-04-10 13:16:50 +000049==================== ============================ ================================================= =============================================================
50Iterator Arguments Results Example
51==================== ============================ ================================================= =============================================================
52:func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ... ``chain('ABC', 'DEF') --> A B C D E F``
53:func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ... ``compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F``
54:func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails ``dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1``
55:func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
56:func:`ifilter` pred, seq elements of seq where pred(elem) is True ``ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9``
57:func:`ifilterfalse` pred, seq elements of seq where pred(elem) is False ``ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8``
58:func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step] ``islice('ABCDEFG', 2, None) --> C D E F G``
59:func:`imap` func, p, q, ... func(p0, q0), func(p1, q1), ... ``imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000``
60:func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ... ``starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000``
61:func:`tee` it, n it1, it2 , ... itn splits one iterator into n
62:func:`takewhile` pred, seq seq[0], seq[1], until pred fails ``takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4``
63:func:`izip` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``izip('ABCD', 'xy') --> Ax By``
64:func:`izip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-``
65==================== ============================ ================================================= =============================================================
Raymond Hettinger0aee9422009-02-17 11:00:27 +000066
67**Combinatoric generators:**
68
Raymond Hettingerf0f475d2009-04-10 13:16:50 +000069============================================== ==================== =============================================================
70Iterator Arguments Results
71============================================== ==================== =============================================================
72:func:`product` p, q, ... [repeat=1] cartesian product, equivalent to a nested for-loop
73:func:`permutations` p[, r] r-length tuples, all possible orderings, no repeated elements
74:func:`combinations` p[, r] r-length tuples, in sorted order, no repeated elements
75:func:`combinations_with_replacement` p[, r] r-length tuples, in sorted order, with repeated elements
76|
77``product('ABCD', repeat=2)`` ``AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD``
78``permutations('ABCD', 2)`` ``AB AC AD BA BC BD CA CB CD DA DB DC``
79``combinations('ABCD', 2)`` ``AB AC AD BC BD CD``
80``combinations_with_replacement('ABCD', 2)`` ``AA AB AC AD BB BC BD CC CD DD``
81============================================== ==================== =============================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000082
83
84.. _itertools-functions:
85
86Itertool functions
87------------------
88
89The following module functions all construct and return iterators. Some provide
90streams of infinite length, so they should only be accessed by functions or
91loops that truncate the stream.
92
93
94.. function:: chain(*iterables)
95
96 Make an iterator that returns elements from the first iterable until it is
97 exhausted, then proceeds to the next iterable, until all of the iterables are
98 exhausted. Used for treating consecutive sequences as a single sequence.
99 Equivalent to::
100
101 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000102 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +0000103 for it in iterables:
104 for element in it:
105 yield element
106
107
Raymond Hettinger330958e2008-02-28 19:41:24 +0000108.. function:: itertools.chain.from_iterable(iterable)
109
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000110 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e2008-02-28 19:41:24 +0000111 single iterable argument that is evaluated lazily. Equivalent to::
112
113 @classmethod
114 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000115 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +0000116 for it in iterables:
117 for element in it:
118 yield element
119
120 .. versionadded:: 2.6
121
Raymond Hettingerd553d852008-03-04 04:17:08 +0000122
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000123.. function:: combinations(iterable, r)
124
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000125 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000126
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000127 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000128 input *iterable* is sorted, the combination tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000129 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000130
131 Elements are treated as unique based on their position, not on their
132 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000133 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000134
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000135 Equivalent to::
136
137 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000138 # combinations('ABCD', 2) --> AB AC AD BC BD CD
139 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000140 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000141 n = len(pool)
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000142 if r > n:
143 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000144 indices = range(r)
145 yield tuple(pool[i] for i in indices)
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000146 while True:
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000147 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000148 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000149 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000150 else:
151 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000152 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000153 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000154 indices[j] = indices[j-1] + 1
155 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000156
Raymond Hettingerd553d852008-03-04 04:17:08 +0000157 The code for :func:`combinations` can be also expressed as a subsequence
158 of :func:`permutations` after filtering entries where the elements are not
159 in sorted order (according to their position in the input pool)::
160
161 def combinations(iterable, r):
162 pool = tuple(iterable)
163 n = len(pool)
164 for indices in permutations(range(n), r):
165 if sorted(indices) == list(indices):
166 yield tuple(pool[i] for i in indices)
167
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000168 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
169 or zero when ``r > n``.
170
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000171 .. versionadded:: 2.6
172
Raymond Hettingerd081abc2009-01-27 02:58:49 +0000173.. function:: combinations_with_replacement(iterable, r)
174
175 Return *r* length subsequences of elements from the input *iterable*
176 allowing individual elements to be repeated more than once.
177
178 Combinations are emitted in lexicographic sort order. So, if the
179 input *iterable* is sorted, the combination tuples will be produced
180 in sorted order.
181
182 Elements are treated as unique based on their position, not on their
183 value. So if the input elements are unique, the generated combinations
184 will also be unique.
185
186 Equivalent to::
187
188 def combinations_with_replacement(iterable, r):
189 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
190 pool = tuple(iterable)
191 n = len(pool)
192 if not n and r:
193 return
194 indices = [0] * r
195 yield tuple(pool[i] for i in indices)
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000196 while True:
Raymond Hettingerd081abc2009-01-27 02:58:49 +0000197 for i in reversed(range(r)):
198 if indices[i] != n - 1:
199 break
200 else:
201 return
202 indices[i:] = [indices[i] + 1] * (r - i)
203 yield tuple(pool[i] for i in indices)
204
205 The code for :func:`combinations_with_replacement` can be also expressed as
206 a subsequence of :func:`product` after filtering entries where the elements
207 are not in sorted order (according to their position in the input pool)::
208
209 def combinations_with_replacement(iterable, r):
210 pool = tuple(iterable)
211 n = len(pool)
212 for indices in product(range(n), repeat=r):
213 if sorted(indices) == list(indices):
214 yield tuple(pool[i] for i in indices)
215
216 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
217
218 .. versionadded:: 2.7
219
Raymond Hettinger2bcb8e92009-01-25 21:04:14 +0000220.. function:: compress(data, selectors)
221
222 Make an iterator that filters elements from *data* returning only those that
223 have a corresponding element in *selectors* that evaluates to ``True``.
Andrew M. Kuchlingefa97712009-03-30 23:08:24 +0000224 Stops when either the *data* or *selectors* iterables has been exhausted.
Raymond Hettinger2bcb8e92009-01-25 21:04:14 +0000225 Equivalent to::
226
227 def compress(data, selectors):
228 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
229 return (d for d, s in izip(data, selectors) if s)
230
231 .. versionadded:: 2.7
232
233
Raymond Hettingera4038032009-02-14 00:25:51 +0000234.. function:: count(start=0, step=1)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000235
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000236 Make an iterator that returns evenly spaced values starting with *n*. Often
237 used as an argument to :func:`imap` to generate consecutive data points.
238 Also, used with :func:`izip` to add sequence numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000239
Raymond Hettingera4038032009-02-14 00:25:51 +0000240 def count(start=0, step=1):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000241 # count(10) --> 10 11 12 13 14 ...
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000242 # count(2.5, 0.5) -> 3.5 3.0 4.5 ...
Raymond Hettingera4038032009-02-14 00:25:51 +0000243 n = start
Georg Brandl8ec7f652007-08-15 14:28:01 +0000244 while True:
245 yield n
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000246 n += step
Georg Brandl8ec7f652007-08-15 14:28:01 +0000247
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000248 .. versionchanged:: 2.7
249 added *step* argument and allowed non-integer arguments.
250
Georg Brandl8ec7f652007-08-15 14:28:01 +0000251.. function:: cycle(iterable)
252
253 Make an iterator returning elements from the iterable and saving a copy of each.
254 When the iterable is exhausted, return elements from the saved copy. Repeats
255 indefinitely. Equivalent to::
256
257 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000258 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000259 saved = []
260 for element in iterable:
261 yield element
262 saved.append(element)
263 while saved:
264 for element in saved:
265 yield element
266
267 Note, this member of the toolkit may require significant auxiliary storage
268 (depending on the length of the iterable).
269
270
271.. function:: dropwhile(predicate, iterable)
272
273 Make an iterator that drops elements from the iterable as long as the predicate
274 is true; afterwards, returns every element. Note, the iterator does not produce
275 *any* output until the predicate first becomes false, so it may have a lengthy
276 start-up time. Equivalent to::
277
278 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000279 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000280 iterable = iter(iterable)
281 for x in iterable:
282 if not predicate(x):
283 yield x
284 break
285 for x in iterable:
286 yield x
287
288
289.. function:: groupby(iterable[, key])
290
291 Make an iterator that returns consecutive keys and groups from the *iterable*.
292 The *key* is a function computing a key value for each element. If not
293 specified or is ``None``, *key* defaults to an identity function and returns
294 the element unchanged. Generally, the iterable needs to already be sorted on
295 the same key function.
296
297 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
298 generates a break or new group every time the value of the key function changes
299 (which is why it is usually necessary to have sorted the data using the same key
300 function). That behavior differs from SQL's GROUP BY which aggregates common
301 elements regardless of their input order.
302
303 The returned group is itself an iterator that shares the underlying iterable
304 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
305 object is advanced, the previous group is no longer visible. So, if that data
306 is needed later, it should be stored as a list::
307
308 groups = []
309 uniquekeys = []
310 data = sorted(data, key=keyfunc)
311 for k, g in groupby(data, keyfunc):
312 groups.append(list(g)) # Store group iterator as a list
313 uniquekeys.append(k)
314
315 :func:`groupby` is equivalent to::
316
317 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000318 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettingerd507afd2009-02-04 10:52:32 +0000319 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000320 def __init__(self, iterable, key=None):
321 if key is None:
322 key = lambda x: x
323 self.keyfunc = key
324 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000325 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000326 def __iter__(self):
327 return self
328 def next(self):
329 while self.currkey == self.tgtkey:
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000330 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000331 self.currkey = self.keyfunc(self.currvalue)
332 self.tgtkey = self.currkey
333 return (self.currkey, self._grouper(self.tgtkey))
334 def _grouper(self, tgtkey):
335 while self.currkey == tgtkey:
336 yield self.currvalue
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000337 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000338 self.currkey = self.keyfunc(self.currvalue)
339
340 .. versionadded:: 2.4
341
342
343.. function:: ifilter(predicate, iterable)
344
345 Make an iterator that filters elements from iterable returning only those for
346 which the predicate is ``True``. If *predicate* is ``None``, return the items
347 that are true. Equivalent to::
348
349 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000350 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000351 if predicate is None:
352 predicate = bool
353 for x in iterable:
354 if predicate(x):
355 yield x
356
357
358.. function:: ifilterfalse(predicate, iterable)
359
360 Make an iterator that filters elements from iterable returning only those for
361 which the predicate is ``False``. If *predicate* is ``None``, return the items
362 that are false. Equivalent to::
363
364 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000365 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000366 if predicate is None:
367 predicate = bool
368 for x in iterable:
369 if not predicate(x):
370 yield x
371
372
373.. function:: imap(function, *iterables)
374
375 Make an iterator that computes the function using arguments from each of the
376 iterables. If *function* is set to ``None``, then :func:`imap` returns the
377 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
378 exhausted instead of filling in ``None`` for shorter iterables. The reason for
379 the difference is that infinite iterator arguments are typically an error for
380 :func:`map` (because the output is fully evaluated) but represent a common and
381 useful way of supplying arguments to :func:`imap`. Equivalent to::
382
383 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000384 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000385 iterables = map(iter, iterables)
386 while True:
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000387 args = [next(it) for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000388 if function is None:
389 yield tuple(args)
390 else:
391 yield function(*args)
392
393
394.. function:: islice(iterable, [start,] stop [, step])
395
396 Make an iterator that returns selected elements from the iterable. If *start* is
397 non-zero, then elements from the iterable are skipped until start is reached.
398 Afterward, elements are returned consecutively unless *step* is set higher than
399 one which results in items being skipped. If *stop* is ``None``, then iteration
400 continues until the iterator is exhausted, if at all; otherwise, it stops at the
401 specified position. Unlike regular slicing, :func:`islice` does not support
402 negative values for *start*, *stop*, or *step*. Can be used to extract related
403 fields from data where the internal structure has been flattened (for example, a
404 multi-line report may list a name field on every third line). Equivalent to::
405
406 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000407 # islice('ABCDEFG', 2) --> A B
408 # islice('ABCDEFG', 2, 4) --> C D
409 # islice('ABCDEFG', 2, None) --> C D E F G
410 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000411 s = slice(*args)
412 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000413 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000414 for i, element in enumerate(iterable):
415 if i == nexti:
416 yield element
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000417 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000418
419 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
420 then the step defaults to one.
421
422 .. versionchanged:: 2.5
423 accept ``None`` values for default *start* and *step*.
424
425
426.. function:: izip(*iterables)
427
428 Make an iterator that aggregates elements from each of the iterables. Like
429 :func:`zip` except that it returns an iterator instead of a list. Used for
430 lock-step iteration over several iterables at a time. Equivalent to::
431
432 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000433 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000434 iterables = map(iter, iterables)
435 while iterables:
Raymond Hettingercccfc822009-04-20 18:23:57 +0000436 yield tuple(map(next, iterables))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000437
438 .. versionchanged:: 2.4
439 When no iterables are specified, returns a zero length iterator instead of
440 raising a :exc:`TypeError` exception.
441
Raymond Hettinger48c62932008-01-22 19:51:41 +0000442 The left-to-right evaluation order of the iterables is guaranteed. This
443 makes possible an idiom for clustering a data series into n-length groups
444 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000445
Raymond Hettinger48c62932008-01-22 19:51:41 +0000446 :func:`izip` should only be used with unequal length inputs when you don't
447 care about trailing, unmatched values from the longer iterables. If those
448 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000449
450
451.. function:: izip_longest(*iterables[, fillvalue])
452
453 Make an iterator that aggregates elements from each of the iterables. If the
454 iterables are of uneven length, missing values are filled-in with *fillvalue*.
455 Iteration continues until the longest iterable is exhausted. Equivalent to::
456
457 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000458 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000459 fillvalue = kwds.get('fillvalue')
460 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
461 yield counter() # yields the fillvalue, or raises IndexError
462 fillers = repeat(fillvalue)
463 iters = [chain(it, sentinel(), fillers) for it in args]
464 try:
465 for tup in izip(*iters):
466 yield tup
467 except IndexError:
468 pass
469
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000470 If one of the iterables is potentially infinite, then the
471 :func:`izip_longest` function should be wrapped with something that limits
472 the number of calls (for example :func:`islice` or :func:`takewhile`). If
473 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000474
475 .. versionadded:: 2.6
476
Raymond Hettinger330958e2008-02-28 19:41:24 +0000477.. function:: permutations(iterable[, r])
478
479 Return successive *r* length permutations of elements in the *iterable*.
480
481 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000482 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e2008-02-28 19:41:24 +0000483 are generated.
484
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000485 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e2008-02-28 19:41:24 +0000486 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000487 in sorted order.
Raymond Hettinger330958e2008-02-28 19:41:24 +0000488
489 Elements are treated as unique based on their position, not on their
490 value. So if the input elements are unique, there will be no repeat
491 values in each permutation.
492
Raymond Hettingerf287f172008-03-02 10:59:31 +0000493 Equivalent to::
494
495 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000496 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
497 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000498 pool = tuple(iterable)
499 n = len(pool)
500 r = n if r is None else r
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000501 if r > n:
502 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000503 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000504 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000505 yield tuple(pool[i] for i in indices[:r])
506 while n:
507 for i in reversed(range(r)):
508 cycles[i] -= 1
509 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000510 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000511 cycles[i] = n - i
512 else:
513 j = cycles[i]
514 indices[i], indices[-j] = indices[-j], indices[i]
515 yield tuple(pool[i] for i in indices[:r])
516 break
517 else:
518 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000519
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000520 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000521 :func:`product`, filtered to exclude entries with repeated elements (those
522 from the same position in the input pool)::
523
524 def permutations(iterable, r=None):
525 pool = tuple(iterable)
526 n = len(pool)
527 r = n if r is None else r
528 for indices in product(range(n), repeat=r):
529 if len(set(indices)) == r:
530 yield tuple(pool[i] for i in indices)
531
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000532 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
533 or zero when ``r > n``.
534
Raymond Hettinger330958e2008-02-28 19:41:24 +0000535 .. versionadded:: 2.6
536
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000537.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000538
539 Cartesian product of input iterables.
540
541 Equivalent to nested for-loops in a generator expression. For example,
542 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
543
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000544 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000545 on every iteration. This pattern creates a lexicographic ordering so that if
546 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000547 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000548
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000549 To compute the product of an iterable with itself, specify the number of
550 repetitions with the optional *repeat* keyword argument. For example,
551 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
552
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000553 This function is equivalent to the following code, except that the
554 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000555
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000556 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000557 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
558 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000559 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000560 result = [[]]
561 for pool in pools:
562 result = [x+[y] for x in result for y in pool]
563 for prod in result:
564 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000565
566 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000567
568.. function:: repeat(object[, times])
569
570 Make an iterator that returns *object* over and over again. Runs indefinitely
571 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000572 invariant function parameters. Also used with :func:`izip` to create constant
573 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000574
575 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000576 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000577 if times is None:
578 while True:
579 yield object
580 else:
581 for i in xrange(times):
582 yield object
583
584
585.. function:: starmap(function, iterable)
586
Raymond Hettinger47317092008-01-17 03:02:14 +0000587 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000588 the iterable. Used instead of :func:`imap` when argument parameters are already
589 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
590 difference between :func:`imap` and :func:`starmap` parallels the distinction
591 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
592
593 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000594 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000595 for args in iterable:
596 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000597
Raymond Hettinger47317092008-01-17 03:02:14 +0000598 .. versionchanged:: 2.6
599 Previously, :func:`starmap` required the function arguments to be tuples.
600 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000601
602.. function:: takewhile(predicate, iterable)
603
604 Make an iterator that returns elements from the iterable as long as the
605 predicate is true. Equivalent to::
606
607 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000608 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000609 for x in iterable:
610 if predicate(x):
611 yield x
612 else:
613 break
614
615
616.. function:: tee(iterable[, n=2])
617
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000618 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000619
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000620 def tee(iterable, n=2):
621 it = iter(iterable)
622 deques = [collections.deque() for i in range(n)]
623 def gen(mydeque):
624 while True:
625 if not mydeque: # when the local deque is empty
626 newval = next(it) # fetch a new value and
627 for d in deques: # load it to all the deques
628 d.append(newval)
629 yield mydeque.popleft()
630 return tuple(gen(d) for d in deques)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000631
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000632 Once :func:`tee` has made a split, the original *iterable* should not be
633 used anywhere else; otherwise, the *iterable* could get advanced without
634 the tee objects being informed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000635
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000636 This itertool may require significant auxiliary storage (depending on how
637 much temporary data needs to be stored). In general, if one iterator uses
638 most or all of the data before another iterator starts, it is faster to use
639 :func:`list` instead of :func:`tee`.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000640
641 .. versionadded:: 2.4
642
643
Georg Brandl8ec7f652007-08-15 14:28:01 +0000644.. _itertools-recipes:
645
646Recipes
647-------
648
649This section shows recipes for creating an extended toolset using the existing
650itertools as building blocks.
651
652The extended tools offer the same high performance as the underlying toolset.
653The superior memory performance is kept by processing elements one at a time
654rather than bringing the whole iterable into memory all at once. Code volume is
655kept small by linking the tools together in a functional style which helps
656eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000657"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000658which incur interpreter overhead.
659
660.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000661
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000662 def take(n, iterable):
663 "Return first n items of the iterable as a list"
664 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000665
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000666 def enumerate(iterable, start=0):
667 return izip(count(start), iterable)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000668
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000669 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000670 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000671 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000672
Raymond Hettinger3496a892009-03-09 11:57:29 +0000673 def consume(iterator, n):
674 "Advance the iterator n-steps ahead. If n is none, consume entirely."
675 collections.deque(islice(iterator, n), maxlen=0)
676
Raymond Hettingerf9bce832009-02-19 05:34:35 +0000677 def nth(iterable, n, default=None):
678 "Returns the nth item or a default value"
679 return next(islice(iterable, n, None), default)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000680
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000681 def quantify(iterable, pred=bool):
682 "Count how many times the predicate is true"
683 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000684
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000685 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000686 """Returns the sequence elements and then returns None indefinitely.
687
688 Useful for emulating the behavior of the built-in map() function.
689 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000690 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000691
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000692 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000693 "Returns the sequence elements n times"
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000694 return chain.from_iterable(repeat(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000695
696 def dotproduct(vec1, vec2):
697 return sum(imap(operator.mul, vec1, vec2))
698
699 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000700 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000701
702 def repeatfunc(func, times=None, *args):
703 """Repeat calls to func with specified arguments.
704
705 Example: repeatfunc(random.random)
706 """
707 if times is None:
708 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000709 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000710
711 def pairwise(iterable):
712 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
713 a, b = tee(iterable)
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000714 next(b, None)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000715 return izip(a, b)
716
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000717 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000718 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000719 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000720 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000721
Raymond Hettingera44327a2008-01-30 22:17:31 +0000722 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000723 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000724 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000725 pending = len(iterables)
726 nexts = cycle(iter(it).next for it in iterables)
727 while pending:
728 try:
729 for next in nexts:
730 yield next()
731 except StopIteration:
732 pending -= 1
733 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000734
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000735 def powerset(iterable):
Raymond Hettinger68d919e2009-01-25 21:31:47 +0000736 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
737 s = list(iterable)
738 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000739
Benjamin Peterson48291362009-01-31 20:01:48 +0000740 def unique_everseen(iterable, key=None):
741 "List unique elements, preserving order. Remember all elements ever seen."
742 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
743 # unique_everseen('ABBCcAD', str.lower) --> A B C D
744 seen = set()
745 seen_add = seen.add
746 if key is None:
747 for element in iterable:
748 if element not in seen:
749 seen_add(element)
750 yield element
751 else:
752 for element in iterable:
753 k = key(element)
754 if k not in seen:
755 seen_add(k)
756 yield element
Raymond Hettinger44e15812009-01-02 21:26:45 +0000757
Benjamin Peterson48291362009-01-31 20:01:48 +0000758 def unique_justseen(iterable, key=None):
759 "List unique elements, preserving order. Remember only the element just seen."
760 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
761 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
762 return imap(next, imap(itemgetter(1), groupby(iterable, key)))