blob: fa4484d085ff576cbd7f95ad66a2c239cb78991b [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Raymond Hettingerf76b9202009-02-17 20:00:59 +000016This module implements a number of :term:`iterator` building blocks inspired
17by constructs from APL, Haskell, and SML. Each has been recast in a form
18suitable for Python.
Georg Brandl116aa622007-08-15 14:28:22 +000019
20The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettingerf76b9202009-02-17 20:00:59 +000021useful by themselves or in combination. Together, they form an "iterator
22algebra" making it possible to construct specialized tools succinctly and
23efficiently in pure Python.
Georg Brandl116aa622007-08-15 14:28:22 +000024
25For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000026sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
27by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000028
Raymond Hettinger2c109ab2009-03-12 00:29:44 +000029These tools and their built-in counterparts also work well with the high-speed
30functions in the :mod:`operator` module. For example, the multiplication
31operator can be mapped across two vectors to form an efficient dot-product:
32``sum(map(operator.mul, vector1, vector2))``.
Georg Brandl116aa622007-08-15 14:28:22 +000033
Georg Brandl116aa622007-08-15 14:28:22 +000034
Raymond Hettingerf76b9202009-02-17 20:00:59 +000035**Infinite Iterators:**
Georg Brandl116aa622007-08-15 14:28:22 +000036
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000037================== ================= ================================================= =========================================
38Iterator Arguments Results Example
39================== ================= ================================================= =========================================
40:func:`count` start, [step] start, start+step, start+2*step, ... ``count(10) --> 10 11 12 13 14 ...``
41:func:`cycle` p p0, p1, ... plast, p0, p1, ... ``cycle('ABCD') --> A B C D A B C D ...``
42:func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times ``repeat(10, 3) --> 10 10 10``
43================== ================= ================================================= =========================================
Georg Brandl116aa622007-08-15 14:28:22 +000044
Raymond Hettingerf76b9202009-02-17 20:00:59 +000045**Iterators terminating on the shortest input sequence:**
46
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000047==================== ============================ ================================================= =============================================================
48Iterator Arguments Results Example
49==================== ============================ ================================================= =============================================================
50:func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ... ``chain('ABC', 'DEF') --> A B C D E F``
51:func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ... ``compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F``
52:func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails ``dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1``
53:func:`filterfalse` pred, seq elements of seq where pred(elem) is False ``ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8``
54:func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
55:func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step] ``islice('ABCDEFG', 2, None) --> C D E F G``
56:func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ... ``starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000``
57:func:`takewhile` pred, seq seq[0], seq[1], until pred fails ``takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4``
58:func:`tee` it, n it1, it2 , ... itn splits one iterator into n
59:func:`zip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-``
60==================== ============================ ================================================= =============================================================
Raymond Hettingerf76b9202009-02-17 20:00:59 +000061
62**Combinatoric generators:**
63
Raymond Hettinger7f587cd2009-04-10 19:43:50 +000064============================================== ==================== =============================================================
65Iterator Arguments Results
66============================================== ==================== =============================================================
67:func:`product` p, q, ... [repeat=1] cartesian product, equivalent to a nested for-loop
68:func:`permutations` p[, r] r-length tuples, all possible orderings, no repeated elements
69:func:`combinations` p[, r] r-length tuples, in sorted order, no repeated elements
70:func:`combinations_with_replacement` p[, r] r-length tuples, in sorted order, with repeated elements
71|
72``product('ABCD', repeat=2)`` ``AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD``
73``permutations('ABCD', 2)`` ``AB AC AD BA BC BD CA CB CD DA DB DC``
74``combinations('ABCD', 2)`` ``AB AC AD BC BD CD``
75``combinations_with_replacement('ABCD', 2)`` ``AA AB AC AD BB BC BD CC CD DD``
76============================================== ==================== =============================================================
Georg Brandl116aa622007-08-15 14:28:22 +000077
78
79.. _itertools-functions:
80
81Itertool functions
82------------------
83
84The following module functions all construct and return iterators. Some provide
85streams of infinite length, so they should only be accessed by functions or
86loops that truncate the stream.
87
88
89.. function:: chain(*iterables)
90
91 Make an iterator that returns elements from the first iterable until it is
92 exhausted, then proceeds to the next iterable, until all of the iterables are
93 exhausted. Used for treating consecutive sequences as a single sequence.
94 Equivalent to::
95
96 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000097 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000098 for it in iterables:
99 for element in it:
100 yield element
101
102
Christian Heimes70e7ea22008-02-28 20:02:27 +0000103.. function:: itertools.chain.from_iterable(iterable)
104
Georg Brandl48310cd2009-01-03 21:18:54 +0000105 Alternate constructor for :func:`chain`. Gets chained inputs from a
Christian Heimes70e7ea22008-02-28 20:02:27 +0000106 single iterable argument that is evaluated lazily. Equivalent to::
107
108 @classmethod
109 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000110 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +0000111 for it in iterables:
112 for element in it:
113 yield element
114
Christian Heimes78644762008-03-04 23:39:23 +0000115
Christian Heimes836baa52008-02-26 08:18:30 +0000116.. function:: combinations(iterable, r)
117
Christian Heimesdae2a892008-04-19 00:55:37 +0000118 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +0000119
Georg Brandl48310cd2009-01-03 21:18:54 +0000120 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +0000121 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000122 in sorted order.
Christian Heimes836baa52008-02-26 08:18:30 +0000123
124 Elements are treated as unique based on their position, not on their
125 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +0000126 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +0000127
Christian Heimes836baa52008-02-26 08:18:30 +0000128 Equivalent to::
129
130 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000131 # combinations('ABCD', 2) --> AB AC AD BC BD CD
132 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000133 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000134 n = len(pool)
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000135 if r > n:
136 return
137 indices = list(range(r))
Christian Heimesb558a2e2008-03-02 22:46:37 +0000138 yield tuple(pool[i] for i in indices)
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000139 while True:
Christian Heimes380f7f22008-02-28 11:19:05 +0000140 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000141 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000142 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000143 else:
144 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000145 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000146 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000147 indices[j] = indices[j-1] + 1
148 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000149
Christian Heimes78644762008-03-04 23:39:23 +0000150 The code for :func:`combinations` can be also expressed as a subsequence
151 of :func:`permutations` after filtering entries where the elements are not
152 in sorted order (according to their position in the input pool)::
153
154 def combinations(iterable, r):
155 pool = tuple(iterable)
156 n = len(pool)
157 for indices in permutations(range(n), r):
158 if sorted(indices) == list(indices):
159 yield tuple(pool[i] for i in indices)
160
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000161 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
162 or zero when ``r > n``.
Christian Heimes836baa52008-02-26 08:18:30 +0000163
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000164.. function:: combinations_with_replacement(iterable, r)
165
166 Return *r* length subsequences of elements from the input *iterable*
167 allowing individual elements to be repeated more than once.
168
169 Combinations are emitted in lexicographic sort order. So, if the
170 input *iterable* is sorted, the combination tuples will be produced
171 in sorted order.
172
173 Elements are treated as unique based on their position, not on their
174 value. So if the input elements are unique, the generated combinations
175 will also be unique.
176
177 Equivalent to::
178
179 def combinations_with_replacement(iterable, r):
180 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
181 pool = tuple(iterable)
182 n = len(pool)
183 if not n and r:
184 return
185 indices = [0] * r
186 yield tuple(pool[i] for i in indices)
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000187 while True:
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000188 for i in reversed(range(r)):
189 if indices[i] != n - 1:
190 break
191 else:
192 return
193 indices[i:] = [indices[i] + 1] * (r - i)
194 yield tuple(pool[i] for i in indices)
195
196 The code for :func:`combinations_with_replacement` can be also expressed as
197 a subsequence of :func:`product` after filtering entries where the elements
198 are not in sorted order (according to their position in the input pool)::
199
200 def combinations_with_replacement(iterable, r):
201 pool = tuple(iterable)
202 n = len(pool)
203 for indices in product(range(n), repeat=r):
204 if sorted(indices) == list(indices):
205 yield tuple(pool[i] for i in indices)
206
207 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
208
Raymond Hettinger30729212009-02-12 06:28:27 +0000209 .. versionadded:: 3.1
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000210
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000211.. function:: compress(data, selectors)
212
213 Make an iterator that filters elements from *data* returning only those that
214 have a corresponding element in *selectors* that evaluates to ``True``.
Benjamin Petersond23f8222009-04-05 19:13:16 +0000215 Stops when either the *data* or *selectors* iterables has been exhausted.
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000216 Equivalent to::
217
218 def compress(data, selectors):
219 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
220 return (d for d, s in zip(data, selectors) if s)
221
Raymond Hettinger30729212009-02-12 06:28:27 +0000222 .. versionadded:: 3.1
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000223
224
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000225.. function:: count(start=0, step=1)
Georg Brandl116aa622007-08-15 14:28:22 +0000226
Raymond Hettinger30729212009-02-12 06:28:27 +0000227 Make an iterator that returns evenly spaced values starting with *n*. Often
228 used as an argument to :func:`map` to generate consecutive data points.
229 Also, used with :func:`zip` to add sequence numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000230
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000231 def count(start=0, step=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000232 # count(10) --> 10 11 12 13 14 ...
Raymond Hettinger30729212009-02-12 06:28:27 +0000233 # count(2.5, 0.5) -> 3.5 3.0 4.5 ...
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000234 n = start
Georg Brandl116aa622007-08-15 14:28:22 +0000235 while True:
236 yield n
Raymond Hettinger30729212009-02-12 06:28:27 +0000237 n += step
Georg Brandl116aa622007-08-15 14:28:22 +0000238
Raymond Hettinger30729212009-02-12 06:28:27 +0000239 .. versionchanged:: 3.1
240 added *step* argument and allowed non-integer arguments.
Georg Brandl116aa622007-08-15 14:28:22 +0000241
242.. function:: cycle(iterable)
243
244 Make an iterator returning elements from the iterable and saving a copy of each.
245 When the iterable is exhausted, return elements from the saved copy. Repeats
246 indefinitely. Equivalent to::
247
248 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000249 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000250 saved = []
251 for element in iterable:
252 yield element
253 saved.append(element)
254 while saved:
255 for element in saved:
256 yield element
257
258 Note, this member of the toolkit may require significant auxiliary storage
259 (depending on the length of the iterable).
260
261
262.. function:: dropwhile(predicate, iterable)
263
264 Make an iterator that drops elements from the iterable as long as the predicate
265 is true; afterwards, returns every element. Note, the iterator does not produce
266 *any* output until the predicate first becomes false, so it may have a lengthy
267 start-up time. Equivalent to::
268
269 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000270 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000271 iterable = iter(iterable)
272 for x in iterable:
273 if not predicate(x):
274 yield x
275 break
276 for x in iterable:
277 yield x
278
Raymond Hettinger749761e2009-01-27 04:42:48 +0000279.. function:: filterfalse(predicate, iterable)
280
281 Make an iterator that filters elements from iterable returning only those for
282 which the predicate is ``False``. If *predicate* is ``None``, return the items
283 that are false. Equivalent to::
284
285 def filterfalse(predicate, iterable):
286 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
287 if predicate is None:
288 predicate = bool
289 for x in iterable:
290 if not predicate(x):
291 yield x
292
Georg Brandl116aa622007-08-15 14:28:22 +0000293
294.. function:: groupby(iterable[, key])
295
296 Make an iterator that returns consecutive keys and groups from the *iterable*.
297 The *key* is a function computing a key value for each element. If not
298 specified or is ``None``, *key* defaults to an identity function and returns
299 the element unchanged. Generally, the iterable needs to already be sorted on
300 the same key function.
301
302 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
303 generates a break or new group every time the value of the key function changes
304 (which is why it is usually necessary to have sorted the data using the same key
305 function). That behavior differs from SQL's GROUP BY which aggregates common
306 elements regardless of their input order.
307
308 The returned group is itself an iterator that shares the underlying iterable
309 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
310 object is advanced, the previous group is no longer visible. So, if that data
311 is needed later, it should be stored as a list::
312
313 groups = []
314 uniquekeys = []
315 data = sorted(data, key=keyfunc)
316 for k, g in groupby(data, keyfunc):
317 groups.append(list(g)) # Store group iterator as a list
318 uniquekeys.append(k)
319
320 :func:`groupby` is equivalent to::
321
322 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000323 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettingerd04fa312009-02-04 19:45:13 +0000324 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000325 def __init__(self, iterable, key=None):
326 if key is None:
327 key = lambda x: x
328 self.keyfunc = key
329 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000330 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000331 def __iter__(self):
332 return self
333 def __next__(self):
334 while self.currkey == self.tgtkey:
Raymond Hettinger21315ba2009-02-23 19:38:09 +0000335 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl116aa622007-08-15 14:28:22 +0000336 self.currkey = self.keyfunc(self.currvalue)
337 self.tgtkey = self.currkey
338 return (self.currkey, self._grouper(self.tgtkey))
339 def _grouper(self, tgtkey):
340 while self.currkey == tgtkey:
341 yield self.currvalue
Raymond Hettinger21315ba2009-02-23 19:38:09 +0000342 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl116aa622007-08-15 14:28:22 +0000343 self.currkey = self.keyfunc(self.currvalue)
344
Georg Brandl116aa622007-08-15 14:28:22 +0000345
Georg Brandl116aa622007-08-15 14:28:22 +0000346.. function:: islice(iterable, [start,] stop [, step])
347
348 Make an iterator that returns selected elements from the iterable. If *start* is
349 non-zero, then elements from the iterable are skipped until start is reached.
350 Afterward, elements are returned consecutively unless *step* is set higher than
351 one which results in items being skipped. If *stop* is ``None``, then iteration
352 continues until the iterator is exhausted, if at all; otherwise, it stops at the
353 specified position. Unlike regular slicing, :func:`islice` does not support
354 negative values for *start*, *stop*, or *step*. Can be used to extract related
355 fields from data where the internal structure has been flattened (for example, a
356 multi-line report may list a name field on every third line). Equivalent to::
357
358 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000359 # islice('ABCDEFG', 2) --> A B
360 # islice('ABCDEFG', 2, 4) --> C D
361 # islice('ABCDEFG', 2, None) --> C D E F G
362 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000363 s = slice(*args)
Raymond Hettinger7c8494b2009-05-14 21:48:02 +0000364 it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
Georg Brandl116aa622007-08-15 14:28:22 +0000365 nexti = next(it)
366 for i, element in enumerate(iterable):
367 if i == nexti:
368 yield element
369 nexti = next(it)
370
371 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
372 then the step defaults to one.
373
Georg Brandl116aa622007-08-15 14:28:22 +0000374
Christian Heimes70e7ea22008-02-28 20:02:27 +0000375.. function:: permutations(iterable[, r])
376
377 Return successive *r* length permutations of elements in the *iterable*.
378
379 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl48310cd2009-01-03 21:18:54 +0000380 of the *iterable* and all possible full-length permutations
Christian Heimes70e7ea22008-02-28 20:02:27 +0000381 are generated.
382
Georg Brandl48310cd2009-01-03 21:18:54 +0000383 Permutations are emitted in lexicographic sort order. So, if the
Christian Heimes70e7ea22008-02-28 20:02:27 +0000384 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000385 in sorted order.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000386
387 Elements are treated as unique based on their position, not on their
388 value. So if the input elements are unique, there will be no repeat
389 values in each permutation.
390
Christian Heimesb558a2e2008-03-02 22:46:37 +0000391 Equivalent to::
392
393 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000394 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
395 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000396 pool = tuple(iterable)
397 n = len(pool)
398 r = n if r is None else r
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000399 if r > n:
400 return
401 indices = list(range(n))
Christian Heimesfe337bf2008-03-23 21:54:12 +0000402 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000403 yield tuple(pool[i] for i in indices[:r])
404 while n:
405 for i in reversed(range(r)):
406 cycles[i] -= 1
407 if cycles[i] == 0:
408 indices[i:] = indices[i+1:] + indices[i:i+1]
409 cycles[i] = n - i
410 else:
411 j = cycles[i]
412 indices[i], indices[-j] = indices[-j], indices[i]
413 yield tuple(pool[i] for i in indices[:r])
414 break
415 else:
416 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000417
Georg Brandl48310cd2009-01-03 21:18:54 +0000418 The code for :func:`permutations` can be also expressed as a subsequence of
Christian Heimes78644762008-03-04 23:39:23 +0000419 :func:`product`, filtered to exclude entries with repeated elements (those
420 from the same position in the input pool)::
421
422 def permutations(iterable, r=None):
423 pool = tuple(iterable)
424 n = len(pool)
425 r = n if r is None else r
426 for indices in product(range(n), repeat=r):
427 if len(set(indices)) == r:
428 yield tuple(pool[i] for i in indices)
429
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000430 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
431 or zero when ``r > n``.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000432
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000433.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000434
435 Cartesian product of input iterables.
436
437 Equivalent to nested for-loops in a generator expression. For example,
438 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
439
Christian Heimesdae2a892008-04-19 00:55:37 +0000440 The nested loops cycle like an odometer with the rightmost element advancing
441 on every iteration. This pattern creates a lexicographic ordering so that if
442 the input's iterables are sorted, the product tuples are emitted in sorted
443 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000444
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000445 To compute the product of an iterable with itself, specify the number of
446 repetitions with the optional *repeat* keyword argument. For example,
447 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
448
Christian Heimes78644762008-03-04 23:39:23 +0000449 This function is equivalent to the following code, except that the
450 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000451
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000452 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000453 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
454 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000455 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000456 result = [[]]
457 for pool in pools:
458 result = [x+[y] for x in result for y in pool]
459 for prod in result:
460 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000461
462
Georg Brandl116aa622007-08-15 14:28:22 +0000463.. function:: repeat(object[, times])
464
465 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000466 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000467 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000468 create an invariant part of a tuple record. Equivalent to::
469
470 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000471 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000472 if times is None:
473 while True:
474 yield object
475 else:
476 for i in range(times):
477 yield object
478
479
480.. function:: starmap(function, iterable)
481
Christian Heimes679db4a2008-01-18 09:56:22 +0000482 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000483 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000484 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000485 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000486 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
487
488 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000489 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000490 for args in iterable:
491 yield function(*args)
492
Georg Brandl116aa622007-08-15 14:28:22 +0000493
494.. function:: takewhile(predicate, iterable)
495
496 Make an iterator that returns elements from the iterable as long as the
497 predicate is true. Equivalent to::
498
499 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000500 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000501 for x in iterable:
502 if predicate(x):
503 yield x
504 else:
505 break
506
507
508.. function:: tee(iterable[, n=2])
509
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000510 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000511
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000512 def tee(iterable, n=2):
513 it = iter(iterable)
514 deques = [collections.deque() for i in range(n)]
515 def gen(mydeque):
516 while True:
517 if not mydeque: # when the local deque is empty
518 newval = next(it) # fetch a new value and
519 for d in deques: # load it to all the deques
520 d.append(newval)
521 yield mydeque.popleft()
522 return tuple(gen(d) for d in deques)
Georg Brandl116aa622007-08-15 14:28:22 +0000523
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000524 Once :func:`tee` has made a split, the original *iterable* should not be
525 used anywhere else; otherwise, the *iterable* could get advanced without
526 the tee objects being informed.
Georg Brandl116aa622007-08-15 14:28:22 +0000527
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000528 This itertool may require significant auxiliary storage (depending on how
529 much temporary data needs to be stored). In general, if one iterator uses
530 most or all of the data before another iterator starts, it is faster to use
531 :func:`list` instead of :func:`tee`.
Georg Brandl116aa622007-08-15 14:28:22 +0000532
Georg Brandl116aa622007-08-15 14:28:22 +0000533
Raymond Hettinger749761e2009-01-27 04:42:48 +0000534.. function:: zip_longest(*iterables[, fillvalue])
535
536 Make an iterator that aggregates elements from each of the iterables. If the
537 iterables are of uneven length, missing values are filled-in with *fillvalue*.
538 Iteration continues until the longest iterable is exhausted. Equivalent to::
539
540 def zip_longest(*args, fillvalue=None):
541 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
542 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
543 yield counter() # yields the fillvalue, or raises IndexError
544 fillers = repeat(fillvalue)
545 iters = [chain(it, sentinel(), fillers) for it in args]
546 try:
547 for tup in zip(*iters):
548 yield tup
549 except IndexError:
550 pass
551
552 If one of the iterables is potentially infinite, then the :func:`zip_longest`
553 function should be wrapped with something that limits the number of calls
554 (for example :func:`islice` or :func:`takewhile`). If not specified,
555 *fillvalue* defaults to ``None``.
556
557
Georg Brandl116aa622007-08-15 14:28:22 +0000558.. _itertools-recipes:
559
560Recipes
561-------
562
563This section shows recipes for creating an extended toolset using the existing
564itertools as building blocks.
565
566The extended tools offer the same high performance as the underlying toolset.
567The superior memory performance is kept by processing elements one at a time
568rather than bringing the whole iterable into memory all at once. Code volume is
569kept small by linking the tools together in a functional style which helps
570eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000571"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000572which incur interpreter overhead.
573
574.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000575
Georg Brandl3dbca812008-07-23 16:10:53 +0000576 def take(n, iterable):
577 "Return first n items of the iterable as a list"
578 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000579
Georg Brandl3dbca812008-07-23 16:10:53 +0000580 def enumerate(iterable, start=0):
581 return zip(count(start), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000582
Georg Brandl3dbca812008-07-23 16:10:53 +0000583 def tabulate(function, start=0):
Georg Brandl116aa622007-08-15 14:28:22 +0000584 "Return function(0), function(1), ..."
Georg Brandl3dbca812008-07-23 16:10:53 +0000585 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000586
Raymond Hettingerfa007962009-03-09 11:55:25 +0000587 def consume(iterator, n):
588 "Advance the iterator n-steps ahead. If n is none, consume entirely."
589 collections.deque(islice(iterator, n), maxlen=0)
590
Raymond Hettingercdf8ba32009-02-19 04:45:07 +0000591 def nth(iterable, n, default=None):
592 "Returns the nth item or a default value"
593 return next(islice(iterable, n, None), default)
Georg Brandl116aa622007-08-15 14:28:22 +0000594
Georg Brandl3dbca812008-07-23 16:10:53 +0000595 def quantify(iterable, pred=bool):
596 "Count how many times the predicate is true"
597 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000598
Georg Brandl3dbca812008-07-23 16:10:53 +0000599 def padnone(iterable):
Georg Brandl116aa622007-08-15 14:28:22 +0000600 """Returns the sequence elements and then returns None indefinitely.
601
602 Useful for emulating the behavior of the built-in map() function.
603 """
Georg Brandl3dbca812008-07-23 16:10:53 +0000604 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000605
Georg Brandl3dbca812008-07-23 16:10:53 +0000606 def ncycles(iterable, n):
Georg Brandl116aa622007-08-15 14:28:22 +0000607 "Returns the sequence elements n times"
Georg Brandl3dbca812008-07-23 16:10:53 +0000608 return chain.from_iterable(repeat(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000609
610 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000611 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000612
613 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000614 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000615
616 def repeatfunc(func, times=None, *args):
617 """Repeat calls to func with specified arguments.
618
619 Example: repeatfunc(random.random)
620 """
621 if times is None:
622 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000623 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000624
625 def pairwise(iterable):
626 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
627 a, b = tee(iterable)
Raymond Hettinger21315ba2009-02-23 19:38:09 +0000628 next(b, None)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000629 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000630
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000631 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000632 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000633 args = [iter(iterable)] * n
Raymond Hettinger883d2762009-01-27 04:57:51 +0000634 return zip_longest(*args, fillvalue=fillvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000635
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000636 def roundrobin(*iterables):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000637 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000638 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000639 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000640 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000641 while pending:
642 try:
643 for next in nexts:
644 yield next()
645 except StopIteration:
646 pending -= 1
647 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000648
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000649 def powerset(iterable):
Raymond Hettingerace67332009-01-26 02:23:50 +0000650 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
651 s = list(iterable)
652 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000653
Raymond Hettingerfd88ea72009-03-23 22:42:28 +0000654 def unique_everseen(iterable, key=None):
655 "List unique elements, preserving order. Remember all elements ever seen."
656 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
657 # unique_everseen('ABBCcAD', str.lower) --> A B C D
658 seen = set()
659 seen_add = seen.add
660 if key is None:
661 for element in iterable:
662 if element not in seen:
663 seen_add(element)
664 yield element
665 else:
666 for element in iterable:
667 k = key(element)
668 if k not in seen:
669 seen_add(k)
670 yield element
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000671
Raymond Hettingerfd88ea72009-03-23 22:42:28 +0000672 def unique_justseen(iterable, key=None):
673 "List unique elements, preserving order. Remember only the element just seen."
674 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
675 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
676 return map(next, map(itemgetter(1), groupby(iterable, key)))