blob: 2c79b06c9dffda6e36a97fde261dcc527245e23d [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Raymond Hettingerf76b9202009-02-17 20:00:59 +000016This module implements a number of :term:`iterator` building blocks inspired
17by constructs from APL, Haskell, and SML. Each has been recast in a form
18suitable for Python.
Georg Brandl116aa622007-08-15 14:28:22 +000019
20The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettingerf76b9202009-02-17 20:00:59 +000021useful by themselves or in combination. Together, they form an "iterator
22algebra" making it possible to construct specialized tools succinctly and
23efficiently in pure Python.
Georg Brandl116aa622007-08-15 14:28:22 +000024
25For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000026sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
27by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000028
Raymond Hettingerf76b9202009-02-17 20:00:59 +000029The tools also work well with the high-speed functions in the :mod:`operator`
30module. For example, the plus-operator can be mapped across two vectors to
31form an efficient dot-product: ``sum(map(operator.add, vector1, vector2))``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
Georg Brandl116aa622007-08-15 14:28:22 +000033
Raymond Hettingerf76b9202009-02-17 20:00:59 +000034**Infinite Iterators:**
Georg Brandl116aa622007-08-15 14:28:22 +000035
Raymond Hettingerf76b9202009-02-17 20:00:59 +000036 ================== ================= =================================================
37 Iterator Arguments Results
38 ================== ================= =================================================
39 :func:`count` start, [step] start, start+step, start+2*step, ...
40 :func:`cycle` p p0, p1, ... plast, p0, p1, ...
41 :func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times
42 ================== ================= =================================================
Georg Brandl116aa622007-08-15 14:28:22 +000043
Raymond Hettingerf76b9202009-02-17 20:00:59 +000044**Iterators terminating on the shortest input sequence:**
45
46 ==================== ============================ =================================================
47 Iterator Arguments Results
48 ==================== ============================ =================================================
49 :func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ...
50 :func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ...
51 :func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails
52 :func:`filterfalse` pred, seq elements of seq where pred(elem) is False
53 :func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
54 :func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step]
Raymond Hettinger5fa5d4f2009-03-09 11:37:57 +000055 :func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ...
Raymond Hettingerf76b9202009-02-17 20:00:59 +000056 :func:`tee` it, n it1, it2 , ... itn splits one iterator into n
57 :func:`takewhile` pred, seq seq[0], seq[1], until pred fails
58 :func:`zip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ...
59 ==================== ============================ =================================================
60
61**Combinatoric generators:**
62
63 ===================================== ==================== =================================================
64 Iterator Arguments Results
65 ===================================== ==================== =================================================
66 :func:`product` p, q, ... [repeat=1] cartesian product
67 :func:`permutations` p[, r] r-length permutations (without repeated elements)
68 :func:`combinations` p[, r] r-length combinations (sorted and no repeats)
69 :func:`combinations_with_replacement` p[, r] r-length combinations (sorted but with repeats)
70 ===================================== ==================== =================================================
Georg Brandl116aa622007-08-15 14:28:22 +000071
72
73.. _itertools-functions:
74
75Itertool functions
76------------------
77
78The following module functions all construct and return iterators. Some provide
79streams of infinite length, so they should only be accessed by functions or
80loops that truncate the stream.
81
82
83.. function:: chain(*iterables)
84
85 Make an iterator that returns elements from the first iterable until it is
86 exhausted, then proceeds to the next iterable, until all of the iterables are
87 exhausted. Used for treating consecutive sequences as a single sequence.
88 Equivalent to::
89
90 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000091 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000092 for it in iterables:
93 for element in it:
94 yield element
95
96
Christian Heimes70e7ea22008-02-28 20:02:27 +000097.. function:: itertools.chain.from_iterable(iterable)
98
Georg Brandl48310cd2009-01-03 21:18:54 +000099 Alternate constructor for :func:`chain`. Gets chained inputs from a
Christian Heimes70e7ea22008-02-28 20:02:27 +0000100 single iterable argument that is evaluated lazily. Equivalent to::
101
102 @classmethod
103 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000104 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +0000105 for it in iterables:
106 for element in it:
107 yield element
108
Christian Heimes78644762008-03-04 23:39:23 +0000109
Christian Heimes836baa52008-02-26 08:18:30 +0000110.. function:: combinations(iterable, r)
111
Christian Heimesdae2a892008-04-19 00:55:37 +0000112 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +0000113
Georg Brandl48310cd2009-01-03 21:18:54 +0000114 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +0000115 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000116 in sorted order.
Christian Heimes836baa52008-02-26 08:18:30 +0000117
118 Elements are treated as unique based on their position, not on their
119 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +0000120 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +0000121
Christian Heimes836baa52008-02-26 08:18:30 +0000122 Equivalent to::
123
124 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000125 # combinations('ABCD', 2) --> AB AC AD BC BD CD
126 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000127 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000128 n = len(pool)
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000129 if r > n:
130 return
131 indices = list(range(r))
Christian Heimesb558a2e2008-03-02 22:46:37 +0000132 yield tuple(pool[i] for i in indices)
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000133 while True:
Christian Heimes380f7f22008-02-28 11:19:05 +0000134 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000135 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000136 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000137 else:
138 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000139 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000140 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000141 indices[j] = indices[j-1] + 1
142 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000143
Christian Heimes78644762008-03-04 23:39:23 +0000144 The code for :func:`combinations` can be also expressed as a subsequence
145 of :func:`permutations` after filtering entries where the elements are not
146 in sorted order (according to their position in the input pool)::
147
148 def combinations(iterable, r):
149 pool = tuple(iterable)
150 n = len(pool)
151 for indices in permutations(range(n), r):
152 if sorted(indices) == list(indices):
153 yield tuple(pool[i] for i in indices)
154
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000155 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
156 or zero when ``r > n``.
Christian Heimes836baa52008-02-26 08:18:30 +0000157
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000158.. function:: combinations_with_replacement(iterable, r)
159
160 Return *r* length subsequences of elements from the input *iterable*
161 allowing individual elements to be repeated more than once.
162
163 Combinations are emitted in lexicographic sort order. So, if the
164 input *iterable* is sorted, the combination tuples will be produced
165 in sorted order.
166
167 Elements are treated as unique based on their position, not on their
168 value. So if the input elements are unique, the generated combinations
169 will also be unique.
170
171 Equivalent to::
172
173 def combinations_with_replacement(iterable, r):
174 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
175 pool = tuple(iterable)
176 n = len(pool)
177 if not n and r:
178 return
179 indices = [0] * r
180 yield tuple(pool[i] for i in indices)
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000181 while True:
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000182 for i in reversed(range(r)):
183 if indices[i] != n - 1:
184 break
185 else:
186 return
187 indices[i:] = [indices[i] + 1] * (r - i)
188 yield tuple(pool[i] for i in indices)
189
190 The code for :func:`combinations_with_replacement` can be also expressed as
191 a subsequence of :func:`product` after filtering entries where the elements
192 are not in sorted order (according to their position in the input pool)::
193
194 def combinations_with_replacement(iterable, r):
195 pool = tuple(iterable)
196 n = len(pool)
197 for indices in product(range(n), repeat=r):
198 if sorted(indices) == list(indices):
199 yield tuple(pool[i] for i in indices)
200
201 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
202
Raymond Hettinger30729212009-02-12 06:28:27 +0000203 .. versionadded:: 3.1
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000204
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000205.. function:: compress(data, selectors)
206
207 Make an iterator that filters elements from *data* returning only those that
208 have a corresponding element in *selectors* that evaluates to ``True``.
209 Stops when either the *data* or *selectors* iterables have been exhausted.
210 Equivalent to::
211
212 def compress(data, selectors):
213 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
214 return (d for d, s in zip(data, selectors) if s)
215
Raymond Hettinger30729212009-02-12 06:28:27 +0000216 .. versionadded:: 3.1
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000217
218
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000219.. function:: count(start=0, step=1)
Georg Brandl116aa622007-08-15 14:28:22 +0000220
Raymond Hettinger30729212009-02-12 06:28:27 +0000221 Make an iterator that returns evenly spaced values starting with *n*. Often
222 used as an argument to :func:`map` to generate consecutive data points.
223 Also, used with :func:`zip` to add sequence numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000224
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000225 def count(start=0, step=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000226 # count(10) --> 10 11 12 13 14 ...
Raymond Hettinger30729212009-02-12 06:28:27 +0000227 # count(2.5, 0.5) -> 3.5 3.0 4.5 ...
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000228 n = start
Georg Brandl116aa622007-08-15 14:28:22 +0000229 while True:
230 yield n
Raymond Hettinger30729212009-02-12 06:28:27 +0000231 n += step
Georg Brandl116aa622007-08-15 14:28:22 +0000232
Raymond Hettinger30729212009-02-12 06:28:27 +0000233 .. versionchanged:: 3.1
234 added *step* argument and allowed non-integer arguments.
Georg Brandl116aa622007-08-15 14:28:22 +0000235
236.. function:: cycle(iterable)
237
238 Make an iterator returning elements from the iterable and saving a copy of each.
239 When the iterable is exhausted, return elements from the saved copy. Repeats
240 indefinitely. Equivalent to::
241
242 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000243 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000244 saved = []
245 for element in iterable:
246 yield element
247 saved.append(element)
248 while saved:
249 for element in saved:
250 yield element
251
252 Note, this member of the toolkit may require significant auxiliary storage
253 (depending on the length of the iterable).
254
255
256.. function:: dropwhile(predicate, iterable)
257
258 Make an iterator that drops elements from the iterable as long as the predicate
259 is true; afterwards, returns every element. Note, the iterator does not produce
260 *any* output until the predicate first becomes false, so it may have a lengthy
261 start-up time. Equivalent to::
262
263 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000264 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000265 iterable = iter(iterable)
266 for x in iterable:
267 if not predicate(x):
268 yield x
269 break
270 for x in iterable:
271 yield x
272
Raymond Hettinger749761e2009-01-27 04:42:48 +0000273.. function:: filterfalse(predicate, iterable)
274
275 Make an iterator that filters elements from iterable returning only those for
276 which the predicate is ``False``. If *predicate* is ``None``, return the items
277 that are false. Equivalent to::
278
279 def filterfalse(predicate, iterable):
280 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
281 if predicate is None:
282 predicate = bool
283 for x in iterable:
284 if not predicate(x):
285 yield x
286
Georg Brandl116aa622007-08-15 14:28:22 +0000287
288.. function:: groupby(iterable[, key])
289
290 Make an iterator that returns consecutive keys and groups from the *iterable*.
291 The *key* is a function computing a key value for each element. If not
292 specified or is ``None``, *key* defaults to an identity function and returns
293 the element unchanged. Generally, the iterable needs to already be sorted on
294 the same key function.
295
296 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
297 generates a break or new group every time the value of the key function changes
298 (which is why it is usually necessary to have sorted the data using the same key
299 function). That behavior differs from SQL's GROUP BY which aggregates common
300 elements regardless of their input order.
301
302 The returned group is itself an iterator that shares the underlying iterable
303 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
304 object is advanced, the previous group is no longer visible. So, if that data
305 is needed later, it should be stored as a list::
306
307 groups = []
308 uniquekeys = []
309 data = sorted(data, key=keyfunc)
310 for k, g in groupby(data, keyfunc):
311 groups.append(list(g)) # Store group iterator as a list
312 uniquekeys.append(k)
313
314 :func:`groupby` is equivalent to::
315
316 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000317 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettingerd04fa312009-02-04 19:45:13 +0000318 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000319 def __init__(self, iterable, key=None):
320 if key is None:
321 key = lambda x: x
322 self.keyfunc = key
323 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000324 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000325 def __iter__(self):
326 return self
327 def __next__(self):
328 while self.currkey == self.tgtkey:
Raymond Hettinger21315ba2009-02-23 19:38:09 +0000329 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl116aa622007-08-15 14:28:22 +0000330 self.currkey = self.keyfunc(self.currvalue)
331 self.tgtkey = self.currkey
332 return (self.currkey, self._grouper(self.tgtkey))
333 def _grouper(self, tgtkey):
334 while self.currkey == tgtkey:
335 yield self.currvalue
Raymond Hettinger21315ba2009-02-23 19:38:09 +0000336 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl116aa622007-08-15 14:28:22 +0000337 self.currkey = self.keyfunc(self.currvalue)
338
Georg Brandl116aa622007-08-15 14:28:22 +0000339
Georg Brandl116aa622007-08-15 14:28:22 +0000340.. function:: islice(iterable, [start,] stop [, step])
341
342 Make an iterator that returns selected elements from the iterable. If *start* is
343 non-zero, then elements from the iterable are skipped until start is reached.
344 Afterward, elements are returned consecutively unless *step* is set higher than
345 one which results in items being skipped. If *stop* is ``None``, then iteration
346 continues until the iterator is exhausted, if at all; otherwise, it stops at the
347 specified position. Unlike regular slicing, :func:`islice` does not support
348 negative values for *start*, *stop*, or *step*. Can be used to extract related
349 fields from data where the internal structure has been flattened (for example, a
350 multi-line report may list a name field on every third line). Equivalent to::
351
352 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000353 # islice('ABCDEFG', 2) --> A B
354 # islice('ABCDEFG', 2, 4) --> C D
355 # islice('ABCDEFG', 2, None) --> C D E F G
356 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000357 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000358 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000359 nexti = next(it)
360 for i, element in enumerate(iterable):
361 if i == nexti:
362 yield element
363 nexti = next(it)
364
365 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
366 then the step defaults to one.
367
Georg Brandl116aa622007-08-15 14:28:22 +0000368
Christian Heimes70e7ea22008-02-28 20:02:27 +0000369.. function:: permutations(iterable[, r])
370
371 Return successive *r* length permutations of elements in the *iterable*.
372
373 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl48310cd2009-01-03 21:18:54 +0000374 of the *iterable* and all possible full-length permutations
Christian Heimes70e7ea22008-02-28 20:02:27 +0000375 are generated.
376
Georg Brandl48310cd2009-01-03 21:18:54 +0000377 Permutations are emitted in lexicographic sort order. So, if the
Christian Heimes70e7ea22008-02-28 20:02:27 +0000378 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000379 in sorted order.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000380
381 Elements are treated as unique based on their position, not on their
382 value. So if the input elements are unique, there will be no repeat
383 values in each permutation.
384
Christian Heimesb558a2e2008-03-02 22:46:37 +0000385 Equivalent to::
386
387 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000388 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
389 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000390 pool = tuple(iterable)
391 n = len(pool)
392 r = n if r is None else r
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000393 if r > n:
394 return
395 indices = list(range(n))
Christian Heimesfe337bf2008-03-23 21:54:12 +0000396 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000397 yield tuple(pool[i] for i in indices[:r])
398 while n:
399 for i in reversed(range(r)):
400 cycles[i] -= 1
401 if cycles[i] == 0:
402 indices[i:] = indices[i+1:] + indices[i:i+1]
403 cycles[i] = n - i
404 else:
405 j = cycles[i]
406 indices[i], indices[-j] = indices[-j], indices[i]
407 yield tuple(pool[i] for i in indices[:r])
408 break
409 else:
410 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000411
Georg Brandl48310cd2009-01-03 21:18:54 +0000412 The code for :func:`permutations` can be also expressed as a subsequence of
Christian Heimes78644762008-03-04 23:39:23 +0000413 :func:`product`, filtered to exclude entries with repeated elements (those
414 from the same position in the input pool)::
415
416 def permutations(iterable, r=None):
417 pool = tuple(iterable)
418 n = len(pool)
419 r = n if r is None else r
420 for indices in product(range(n), repeat=r):
421 if len(set(indices)) == r:
422 yield tuple(pool[i] for i in indices)
423
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000424 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
425 or zero when ``r > n``.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000426
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000427.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000428
429 Cartesian product of input iterables.
430
431 Equivalent to nested for-loops in a generator expression. For example,
432 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
433
Christian Heimesdae2a892008-04-19 00:55:37 +0000434 The nested loops cycle like an odometer with the rightmost element advancing
435 on every iteration. This pattern creates a lexicographic ordering so that if
436 the input's iterables are sorted, the product tuples are emitted in sorted
437 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000438
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000439 To compute the product of an iterable with itself, specify the number of
440 repetitions with the optional *repeat* keyword argument. For example,
441 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
442
Christian Heimes78644762008-03-04 23:39:23 +0000443 This function is equivalent to the following code, except that the
444 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000445
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000446 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000447 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
448 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000449 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000450 result = [[]]
451 for pool in pools:
452 result = [x+[y] for x in result for y in pool]
453 for prod in result:
454 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000455
456
Georg Brandl116aa622007-08-15 14:28:22 +0000457.. function:: repeat(object[, times])
458
459 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000460 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000461 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000462 create an invariant part of a tuple record. Equivalent to::
463
464 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000465 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000466 if times is None:
467 while True:
468 yield object
469 else:
470 for i in range(times):
471 yield object
472
473
474.. function:: starmap(function, iterable)
475
Christian Heimes679db4a2008-01-18 09:56:22 +0000476 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000477 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000478 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000479 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000480 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
481
482 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000483 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000484 for args in iterable:
485 yield function(*args)
486
Georg Brandl116aa622007-08-15 14:28:22 +0000487
488.. function:: takewhile(predicate, iterable)
489
490 Make an iterator that returns elements from the iterable as long as the
491 predicate is true. Equivalent to::
492
493 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000494 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000495 for x in iterable:
496 if predicate(x):
497 yield x
498 else:
499 break
500
501
502.. function:: tee(iterable[, n=2])
503
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000504 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000505
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000506 def tee(iterable, n=2):
507 it = iter(iterable)
508 deques = [collections.deque() for i in range(n)]
509 def gen(mydeque):
510 while True:
511 if not mydeque: # when the local deque is empty
512 newval = next(it) # fetch a new value and
513 for d in deques: # load it to all the deques
514 d.append(newval)
515 yield mydeque.popleft()
516 return tuple(gen(d) for d in deques)
Georg Brandl116aa622007-08-15 14:28:22 +0000517
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000518 Once :func:`tee` has made a split, the original *iterable* should not be
519 used anywhere else; otherwise, the *iterable* could get advanced without
520 the tee objects being informed.
Georg Brandl116aa622007-08-15 14:28:22 +0000521
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000522 This itertool may require significant auxiliary storage (depending on how
523 much temporary data needs to be stored). In general, if one iterator uses
524 most or all of the data before another iterator starts, it is faster to use
525 :func:`list` instead of :func:`tee`.
Georg Brandl116aa622007-08-15 14:28:22 +0000526
Georg Brandl116aa622007-08-15 14:28:22 +0000527
Raymond Hettinger749761e2009-01-27 04:42:48 +0000528.. function:: zip_longest(*iterables[, fillvalue])
529
530 Make an iterator that aggregates elements from each of the iterables. If the
531 iterables are of uneven length, missing values are filled-in with *fillvalue*.
532 Iteration continues until the longest iterable is exhausted. Equivalent to::
533
534 def zip_longest(*args, fillvalue=None):
535 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
536 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
537 yield counter() # yields the fillvalue, or raises IndexError
538 fillers = repeat(fillvalue)
539 iters = [chain(it, sentinel(), fillers) for it in args]
540 try:
541 for tup in zip(*iters):
542 yield tup
543 except IndexError:
544 pass
545
546 If one of the iterables is potentially infinite, then the :func:`zip_longest`
547 function should be wrapped with something that limits the number of calls
548 (for example :func:`islice` or :func:`takewhile`). If not specified,
549 *fillvalue* defaults to ``None``.
550
551
Georg Brandl116aa622007-08-15 14:28:22 +0000552.. _itertools-example:
553
554Examples
555--------
556
557The following examples show common uses for each tool and demonstrate ways they
Christian Heimesfe337bf2008-03-23 21:54:12 +0000558can be combined.
559
560.. doctest::
Georg Brandl116aa622007-08-15 14:28:22 +0000561
Georg Brandlb1441c72009-01-03 22:33:39 +0000562 >>> # Show a dictionary sorted and grouped by value
Georg Brandl116aa622007-08-15 14:28:22 +0000563 >>> from operator import itemgetter
564 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000565 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000566 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000567 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000568 ...
569 1 ['a', 'c', 'e']
570 2 ['b', 'd', 'f']
571 3 ['g']
572
Georg Brandlb1441c72009-01-03 22:33:39 +0000573 >>> # Find runs of consecutive numbers using groupby. The key to the solution
574 >>> # is differencing with a range so that consecutive numbers all appear in
575 >>> # same group.
Georg Brandl116aa622007-08-15 14:28:22 +0000576 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
577 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000578 ... print(map(operator.itemgetter(1), g))
Georg Brandl48310cd2009-01-03 21:18:54 +0000579 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000580 [1]
581 [4, 5, 6]
582 [10]
583 [15, 16, 17, 18]
584 [22]
585 [25, 26, 27, 28]
586
587
588
589.. _itertools-recipes:
590
591Recipes
592-------
593
594This section shows recipes for creating an extended toolset using the existing
595itertools as building blocks.
596
597The extended tools offer the same high performance as the underlying toolset.
598The superior memory performance is kept by processing elements one at a time
599rather than bringing the whole iterable into memory all at once. Code volume is
600kept small by linking the tools together in a functional style which helps
601eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000602"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000603which incur interpreter overhead.
604
605.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000606
Georg Brandl3dbca812008-07-23 16:10:53 +0000607 def take(n, iterable):
608 "Return first n items of the iterable as a list"
609 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000610
Georg Brandl3dbca812008-07-23 16:10:53 +0000611 def enumerate(iterable, start=0):
612 return zip(count(start), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000613
Georg Brandl3dbca812008-07-23 16:10:53 +0000614 def tabulate(function, start=0):
Georg Brandl116aa622007-08-15 14:28:22 +0000615 "Return function(0), function(1), ..."
Georg Brandl3dbca812008-07-23 16:10:53 +0000616 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000617
Raymond Hettingerfa007962009-03-09 11:55:25 +0000618 def consume(iterator, n):
619 "Advance the iterator n-steps ahead. If n is none, consume entirely."
620 collections.deque(islice(iterator, n), maxlen=0)
621
Raymond Hettingercdf8ba32009-02-19 04:45:07 +0000622 def nth(iterable, n, default=None):
623 "Returns the nth item or a default value"
624 return next(islice(iterable, n, None), default)
Georg Brandl116aa622007-08-15 14:28:22 +0000625
Georg Brandl3dbca812008-07-23 16:10:53 +0000626 def quantify(iterable, pred=bool):
627 "Count how many times the predicate is true"
628 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000629
Georg Brandl3dbca812008-07-23 16:10:53 +0000630 def padnone(iterable):
Georg Brandl116aa622007-08-15 14:28:22 +0000631 """Returns the sequence elements and then returns None indefinitely.
632
633 Useful for emulating the behavior of the built-in map() function.
634 """
Georg Brandl3dbca812008-07-23 16:10:53 +0000635 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000636
Georg Brandl3dbca812008-07-23 16:10:53 +0000637 def ncycles(iterable, n):
Georg Brandl116aa622007-08-15 14:28:22 +0000638 "Returns the sequence elements n times"
Georg Brandl3dbca812008-07-23 16:10:53 +0000639 return chain.from_iterable(repeat(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000640
641 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000642 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000643
644 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000645 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000646
647 def repeatfunc(func, times=None, *args):
648 """Repeat calls to func with specified arguments.
649
650 Example: repeatfunc(random.random)
651 """
652 if times is None:
653 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000654 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000655
656 def pairwise(iterable):
657 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
658 a, b = tee(iterable)
Raymond Hettinger21315ba2009-02-23 19:38:09 +0000659 next(b, None)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000660 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000661
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000662 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000663 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000664 args = [iter(iterable)] * n
Raymond Hettinger883d2762009-01-27 04:57:51 +0000665 return zip_longest(*args, fillvalue=fillvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000666
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000667 def roundrobin(*iterables):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000668 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000669 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000670 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000671 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000672 while pending:
673 try:
674 for next in nexts:
675 yield next()
676 except StopIteration:
677 pending -= 1
678 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000679
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000680 def powerset(iterable):
Raymond Hettingerace67332009-01-26 02:23:50 +0000681 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
682 s = list(iterable)
683 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000684
Benjamin Peterson063ff652009-02-06 03:01:24 +0000685 def unique_everseen(iterable, key=None):
686 "List unique elements, preserving order. Remember all elements ever seen."
687 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
688 # unique_everseen('ABBCcAD', str.lower) --> A B C D
689 seen = set()
690 seen_add = seen.add
691 if key is None:
692 for element in iterable:
693 if element not in seen:
694 seen_add(element)
695 yield element
696 else:
697 for element in iterable:
698 k = key(element)
699 if k not in seen:
700 seen_add(k)
701 yield element
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000702
Benjamin Peterson063ff652009-02-06 03:01:24 +0000703 def unique_justseen(iterable, key=None):
704 "List unique elements, preserving order. Remember only the element just seen."
705 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
706 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
707 return map(next, imap(itemgetter(1), groupby(iterable, key)))