blob: 2cfc779ae5961df2279143f44609fd99a5c746c8 [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Raymond Hettingerf76b9202009-02-17 20:00:59 +000016This module implements a number of :term:`iterator` building blocks inspired
17by constructs from APL, Haskell, and SML. Each has been recast in a form
18suitable for Python.
Georg Brandl116aa622007-08-15 14:28:22 +000019
20The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettingerf76b9202009-02-17 20:00:59 +000021useful by themselves or in combination. Together, they form an "iterator
22algebra" making it possible to construct specialized tools succinctly and
23efficiently in pure Python.
Georg Brandl116aa622007-08-15 14:28:22 +000024
25For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000026sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
27by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000028
Raymond Hettinger2c109ab2009-03-12 00:29:44 +000029These tools and their built-in counterparts also work well with the high-speed
30functions in the :mod:`operator` module. For example, the multiplication
31operator can be mapped across two vectors to form an efficient dot-product:
32``sum(map(operator.mul, vector1, vector2))``.
Georg Brandl116aa622007-08-15 14:28:22 +000033
Georg Brandl116aa622007-08-15 14:28:22 +000034
Raymond Hettingerf76b9202009-02-17 20:00:59 +000035**Infinite Iterators:**
Georg Brandl116aa622007-08-15 14:28:22 +000036
Raymond Hettingerf76b9202009-02-17 20:00:59 +000037 ================== ================= =================================================
38 Iterator Arguments Results
39 ================== ================= =================================================
40 :func:`count` start, [step] start, start+step, start+2*step, ...
41 :func:`cycle` p p0, p1, ... plast, p0, p1, ...
42 :func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times
43 ================== ================= =================================================
Georg Brandl116aa622007-08-15 14:28:22 +000044
Raymond Hettingerf76b9202009-02-17 20:00:59 +000045**Iterators terminating on the shortest input sequence:**
46
47 ==================== ============================ =================================================
48 Iterator Arguments Results
49 ==================== ============================ =================================================
50 :func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ...
51 :func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ...
52 :func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails
53 :func:`filterfalse` pred, seq elements of seq where pred(elem) is False
54 :func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
55 :func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step]
Raymond Hettinger5fa5d4f2009-03-09 11:37:57 +000056 :func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ...
Raymond Hettingerf76b9202009-02-17 20:00:59 +000057 :func:`tee` it, n it1, it2 , ... itn splits one iterator into n
58 :func:`takewhile` pred, seq seq[0], seq[1], until pred fails
59 :func:`zip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ...
60 ==================== ============================ =================================================
61
62**Combinatoric generators:**
63
64 ===================================== ==================== =================================================
65 Iterator Arguments Results
66 ===================================== ==================== =================================================
67 :func:`product` p, q, ... [repeat=1] cartesian product
68 :func:`permutations` p[, r] r-length permutations (without repeated elements)
69 :func:`combinations` p[, r] r-length combinations (sorted and no repeats)
70 :func:`combinations_with_replacement` p[, r] r-length combinations (sorted but with repeats)
71 ===================================== ==================== =================================================
Georg Brandl116aa622007-08-15 14:28:22 +000072
73
74.. _itertools-functions:
75
76Itertool functions
77------------------
78
79The following module functions all construct and return iterators. Some provide
80streams of infinite length, so they should only be accessed by functions or
81loops that truncate the stream.
82
83
84.. function:: chain(*iterables)
85
86 Make an iterator that returns elements from the first iterable until it is
87 exhausted, then proceeds to the next iterable, until all of the iterables are
88 exhausted. Used for treating consecutive sequences as a single sequence.
89 Equivalent to::
90
91 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000092 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000093 for it in iterables:
94 for element in it:
95 yield element
96
97
Christian Heimes70e7ea22008-02-28 20:02:27 +000098.. function:: itertools.chain.from_iterable(iterable)
99
Georg Brandl48310cd2009-01-03 21:18:54 +0000100 Alternate constructor for :func:`chain`. Gets chained inputs from a
Christian Heimes70e7ea22008-02-28 20:02:27 +0000101 single iterable argument that is evaluated lazily. Equivalent to::
102
103 @classmethod
104 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000105 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +0000106 for it in iterables:
107 for element in it:
108 yield element
109
Christian Heimes78644762008-03-04 23:39:23 +0000110
Christian Heimes836baa52008-02-26 08:18:30 +0000111.. function:: combinations(iterable, r)
112
Christian Heimesdae2a892008-04-19 00:55:37 +0000113 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +0000114
Georg Brandl48310cd2009-01-03 21:18:54 +0000115 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +0000116 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000117 in sorted order.
Christian Heimes836baa52008-02-26 08:18:30 +0000118
119 Elements are treated as unique based on their position, not on their
120 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +0000121 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +0000122
Christian Heimes836baa52008-02-26 08:18:30 +0000123 Equivalent to::
124
125 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000126 # combinations('ABCD', 2) --> AB AC AD BC BD CD
127 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000128 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000129 n = len(pool)
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000130 if r > n:
131 return
132 indices = list(range(r))
Christian Heimesb558a2e2008-03-02 22:46:37 +0000133 yield tuple(pool[i] for i in indices)
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000134 while True:
Christian Heimes380f7f22008-02-28 11:19:05 +0000135 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000136 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000137 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000138 else:
139 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000140 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000141 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000142 indices[j] = indices[j-1] + 1
143 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000144
Christian Heimes78644762008-03-04 23:39:23 +0000145 The code for :func:`combinations` can be also expressed as a subsequence
146 of :func:`permutations` after filtering entries where the elements are not
147 in sorted order (according to their position in the input pool)::
148
149 def combinations(iterable, r):
150 pool = tuple(iterable)
151 n = len(pool)
152 for indices in permutations(range(n), r):
153 if sorted(indices) == list(indices):
154 yield tuple(pool[i] for i in indices)
155
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000156 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
157 or zero when ``r > n``.
Christian Heimes836baa52008-02-26 08:18:30 +0000158
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000159.. function:: combinations_with_replacement(iterable, r)
160
161 Return *r* length subsequences of elements from the input *iterable*
162 allowing individual elements to be repeated more than once.
163
164 Combinations are emitted in lexicographic sort order. So, if the
165 input *iterable* is sorted, the combination tuples will be produced
166 in sorted order.
167
168 Elements are treated as unique based on their position, not on their
169 value. So if the input elements are unique, the generated combinations
170 will also be unique.
171
172 Equivalent to::
173
174 def combinations_with_replacement(iterable, r):
175 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
176 pool = tuple(iterable)
177 n = len(pool)
178 if not n and r:
179 return
180 indices = [0] * r
181 yield tuple(pool[i] for i in indices)
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000182 while True:
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000183 for i in reversed(range(r)):
184 if indices[i] != n - 1:
185 break
186 else:
187 return
188 indices[i:] = [indices[i] + 1] * (r - i)
189 yield tuple(pool[i] for i in indices)
190
191 The code for :func:`combinations_with_replacement` can be also expressed as
192 a subsequence of :func:`product` after filtering entries where the elements
193 are not in sorted order (according to their position in the input pool)::
194
195 def combinations_with_replacement(iterable, r):
196 pool = tuple(iterable)
197 n = len(pool)
198 for indices in product(range(n), repeat=r):
199 if sorted(indices) == list(indices):
200 yield tuple(pool[i] for i in indices)
201
202 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
203
Raymond Hettinger30729212009-02-12 06:28:27 +0000204 .. versionadded:: 3.1
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000205
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000206.. function:: compress(data, selectors)
207
208 Make an iterator that filters elements from *data* returning only those that
209 have a corresponding element in *selectors* that evaluates to ``True``.
Benjamin Petersond23f8222009-04-05 19:13:16 +0000210 Stops when either the *data* or *selectors* iterables has been exhausted.
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000211 Equivalent to::
212
213 def compress(data, selectors):
214 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
215 return (d for d, s in zip(data, selectors) if s)
216
Raymond Hettinger30729212009-02-12 06:28:27 +0000217 .. versionadded:: 3.1
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000218
219
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000220.. function:: count(start=0, step=1)
Georg Brandl116aa622007-08-15 14:28:22 +0000221
Raymond Hettinger30729212009-02-12 06:28:27 +0000222 Make an iterator that returns evenly spaced values starting with *n*. Often
223 used as an argument to :func:`map` to generate consecutive data points.
224 Also, used with :func:`zip` to add sequence numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000225
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000226 def count(start=0, step=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000227 # count(10) --> 10 11 12 13 14 ...
Raymond Hettinger30729212009-02-12 06:28:27 +0000228 # count(2.5, 0.5) -> 3.5 3.0 4.5 ...
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000229 n = start
Georg Brandl116aa622007-08-15 14:28:22 +0000230 while True:
231 yield n
Raymond Hettinger30729212009-02-12 06:28:27 +0000232 n += step
Georg Brandl116aa622007-08-15 14:28:22 +0000233
Raymond Hettinger30729212009-02-12 06:28:27 +0000234 .. versionchanged:: 3.1
235 added *step* argument and allowed non-integer arguments.
Georg Brandl116aa622007-08-15 14:28:22 +0000236
237.. function:: cycle(iterable)
238
239 Make an iterator returning elements from the iterable and saving a copy of each.
240 When the iterable is exhausted, return elements from the saved copy. Repeats
241 indefinitely. Equivalent to::
242
243 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000244 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000245 saved = []
246 for element in iterable:
247 yield element
248 saved.append(element)
249 while saved:
250 for element in saved:
251 yield element
252
253 Note, this member of the toolkit may require significant auxiliary storage
254 (depending on the length of the iterable).
255
256
257.. function:: dropwhile(predicate, iterable)
258
259 Make an iterator that drops elements from the iterable as long as the predicate
260 is true; afterwards, returns every element. Note, the iterator does not produce
261 *any* output until the predicate first becomes false, so it may have a lengthy
262 start-up time. Equivalent to::
263
264 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000265 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000266 iterable = iter(iterable)
267 for x in iterable:
268 if not predicate(x):
269 yield x
270 break
271 for x in iterable:
272 yield x
273
Raymond Hettinger749761e2009-01-27 04:42:48 +0000274.. function:: filterfalse(predicate, iterable)
275
276 Make an iterator that filters elements from iterable returning only those for
277 which the predicate is ``False``. If *predicate* is ``None``, return the items
278 that are false. Equivalent to::
279
280 def filterfalse(predicate, iterable):
281 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
282 if predicate is None:
283 predicate = bool
284 for x in iterable:
285 if not predicate(x):
286 yield x
287
Georg Brandl116aa622007-08-15 14:28:22 +0000288
289.. function:: groupby(iterable[, key])
290
291 Make an iterator that returns consecutive keys and groups from the *iterable*.
292 The *key* is a function computing a key value for each element. If not
293 specified or is ``None``, *key* defaults to an identity function and returns
294 the element unchanged. Generally, the iterable needs to already be sorted on
295 the same key function.
296
297 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
298 generates a break or new group every time the value of the key function changes
299 (which is why it is usually necessary to have sorted the data using the same key
300 function). That behavior differs from SQL's GROUP BY which aggregates common
301 elements regardless of their input order.
302
303 The returned group is itself an iterator that shares the underlying iterable
304 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
305 object is advanced, the previous group is no longer visible. So, if that data
306 is needed later, it should be stored as a list::
307
308 groups = []
309 uniquekeys = []
310 data = sorted(data, key=keyfunc)
311 for k, g in groupby(data, keyfunc):
312 groups.append(list(g)) # Store group iterator as a list
313 uniquekeys.append(k)
314
315 :func:`groupby` is equivalent to::
316
317 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000318 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettingerd04fa312009-02-04 19:45:13 +0000319 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000320 def __init__(self, iterable, key=None):
321 if key is None:
322 key = lambda x: x
323 self.keyfunc = key
324 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000325 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000326 def __iter__(self):
327 return self
328 def __next__(self):
329 while self.currkey == self.tgtkey:
Raymond Hettinger21315ba2009-02-23 19:38:09 +0000330 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl116aa622007-08-15 14:28:22 +0000331 self.currkey = self.keyfunc(self.currvalue)
332 self.tgtkey = self.currkey
333 return (self.currkey, self._grouper(self.tgtkey))
334 def _grouper(self, tgtkey):
335 while self.currkey == tgtkey:
336 yield self.currvalue
Raymond Hettinger21315ba2009-02-23 19:38:09 +0000337 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl116aa622007-08-15 14:28:22 +0000338 self.currkey = self.keyfunc(self.currvalue)
339
Georg Brandl116aa622007-08-15 14:28:22 +0000340
Georg Brandl116aa622007-08-15 14:28:22 +0000341.. function:: islice(iterable, [start,] stop [, step])
342
343 Make an iterator that returns selected elements from the iterable. If *start* is
344 non-zero, then elements from the iterable are skipped until start is reached.
345 Afterward, elements are returned consecutively unless *step* is set higher than
346 one which results in items being skipped. If *stop* is ``None``, then iteration
347 continues until the iterator is exhausted, if at all; otherwise, it stops at the
348 specified position. Unlike regular slicing, :func:`islice` does not support
349 negative values for *start*, *stop*, or *step*. Can be used to extract related
350 fields from data where the internal structure has been flattened (for example, a
351 multi-line report may list a name field on every third line). Equivalent to::
352
353 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000354 # islice('ABCDEFG', 2) --> A B
355 # islice('ABCDEFG', 2, 4) --> C D
356 # islice('ABCDEFG', 2, None) --> C D E F G
357 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000358 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000359 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000360 nexti = next(it)
361 for i, element in enumerate(iterable):
362 if i == nexti:
363 yield element
364 nexti = next(it)
365
366 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
367 then the step defaults to one.
368
Georg Brandl116aa622007-08-15 14:28:22 +0000369
Christian Heimes70e7ea22008-02-28 20:02:27 +0000370.. function:: permutations(iterable[, r])
371
372 Return successive *r* length permutations of elements in the *iterable*.
373
374 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl48310cd2009-01-03 21:18:54 +0000375 of the *iterable* and all possible full-length permutations
Christian Heimes70e7ea22008-02-28 20:02:27 +0000376 are generated.
377
Georg Brandl48310cd2009-01-03 21:18:54 +0000378 Permutations are emitted in lexicographic sort order. So, if the
Christian Heimes70e7ea22008-02-28 20:02:27 +0000379 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000380 in sorted order.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000381
382 Elements are treated as unique based on their position, not on their
383 value. So if the input elements are unique, there will be no repeat
384 values in each permutation.
385
Christian Heimesb558a2e2008-03-02 22:46:37 +0000386 Equivalent to::
387
388 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000389 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
390 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000391 pool = tuple(iterable)
392 n = len(pool)
393 r = n if r is None else r
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000394 if r > n:
395 return
396 indices = list(range(n))
Christian Heimesfe337bf2008-03-23 21:54:12 +0000397 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000398 yield tuple(pool[i] for i in indices[:r])
399 while n:
400 for i in reversed(range(r)):
401 cycles[i] -= 1
402 if cycles[i] == 0:
403 indices[i:] = indices[i+1:] + indices[i:i+1]
404 cycles[i] = n - i
405 else:
406 j = cycles[i]
407 indices[i], indices[-j] = indices[-j], indices[i]
408 yield tuple(pool[i] for i in indices[:r])
409 break
410 else:
411 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000412
Georg Brandl48310cd2009-01-03 21:18:54 +0000413 The code for :func:`permutations` can be also expressed as a subsequence of
Christian Heimes78644762008-03-04 23:39:23 +0000414 :func:`product`, filtered to exclude entries with repeated elements (those
415 from the same position in the input pool)::
416
417 def permutations(iterable, r=None):
418 pool = tuple(iterable)
419 n = len(pool)
420 r = n if r is None else r
421 for indices in product(range(n), repeat=r):
422 if len(set(indices)) == r:
423 yield tuple(pool[i] for i in indices)
424
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000425 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
426 or zero when ``r > n``.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000427
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000428.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000429
430 Cartesian product of input iterables.
431
432 Equivalent to nested for-loops in a generator expression. For example,
433 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
434
Christian Heimesdae2a892008-04-19 00:55:37 +0000435 The nested loops cycle like an odometer with the rightmost element advancing
436 on every iteration. This pattern creates a lexicographic ordering so that if
437 the input's iterables are sorted, the product tuples are emitted in sorted
438 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000439
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000440 To compute the product of an iterable with itself, specify the number of
441 repetitions with the optional *repeat* keyword argument. For example,
442 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
443
Christian Heimes78644762008-03-04 23:39:23 +0000444 This function is equivalent to the following code, except that the
445 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000446
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000447 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000448 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
449 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000450 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000451 result = [[]]
452 for pool in pools:
453 result = [x+[y] for x in result for y in pool]
454 for prod in result:
455 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000456
457
Georg Brandl116aa622007-08-15 14:28:22 +0000458.. function:: repeat(object[, times])
459
460 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000461 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000462 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000463 create an invariant part of a tuple record. Equivalent to::
464
465 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000466 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000467 if times is None:
468 while True:
469 yield object
470 else:
471 for i in range(times):
472 yield object
473
474
475.. function:: starmap(function, iterable)
476
Christian Heimes679db4a2008-01-18 09:56:22 +0000477 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000478 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000479 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000480 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000481 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
482
483 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000484 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000485 for args in iterable:
486 yield function(*args)
487
Georg Brandl116aa622007-08-15 14:28:22 +0000488
489.. function:: takewhile(predicate, iterable)
490
491 Make an iterator that returns elements from the iterable as long as the
492 predicate is true. Equivalent to::
493
494 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000495 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000496 for x in iterable:
497 if predicate(x):
498 yield x
499 else:
500 break
501
502
503.. function:: tee(iterable[, n=2])
504
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000505 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000506
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000507 def tee(iterable, n=2):
508 it = iter(iterable)
509 deques = [collections.deque() for i in range(n)]
510 def gen(mydeque):
511 while True:
512 if not mydeque: # when the local deque is empty
513 newval = next(it) # fetch a new value and
514 for d in deques: # load it to all the deques
515 d.append(newval)
516 yield mydeque.popleft()
517 return tuple(gen(d) for d in deques)
Georg Brandl116aa622007-08-15 14:28:22 +0000518
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000519 Once :func:`tee` has made a split, the original *iterable* should not be
520 used anywhere else; otherwise, the *iterable* could get advanced without
521 the tee objects being informed.
Georg Brandl116aa622007-08-15 14:28:22 +0000522
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000523 This itertool may require significant auxiliary storage (depending on how
524 much temporary data needs to be stored). In general, if one iterator uses
525 most or all of the data before another iterator starts, it is faster to use
526 :func:`list` instead of :func:`tee`.
Georg Brandl116aa622007-08-15 14:28:22 +0000527
Georg Brandl116aa622007-08-15 14:28:22 +0000528
Raymond Hettinger749761e2009-01-27 04:42:48 +0000529.. function:: zip_longest(*iterables[, fillvalue])
530
531 Make an iterator that aggregates elements from each of the iterables. If the
532 iterables are of uneven length, missing values are filled-in with *fillvalue*.
533 Iteration continues until the longest iterable is exhausted. Equivalent to::
534
535 def zip_longest(*args, fillvalue=None):
536 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
537 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
538 yield counter() # yields the fillvalue, or raises IndexError
539 fillers = repeat(fillvalue)
540 iters = [chain(it, sentinel(), fillers) for it in args]
541 try:
542 for tup in zip(*iters):
543 yield tup
544 except IndexError:
545 pass
546
547 If one of the iterables is potentially infinite, then the :func:`zip_longest`
548 function should be wrapped with something that limits the number of calls
549 (for example :func:`islice` or :func:`takewhile`). If not specified,
550 *fillvalue* defaults to ``None``.
551
552
Georg Brandl116aa622007-08-15 14:28:22 +0000553.. _itertools-example:
554
555Examples
556--------
557
558The following examples show common uses for each tool and demonstrate ways they
Christian Heimesfe337bf2008-03-23 21:54:12 +0000559can be combined.
560
561.. doctest::
Georg Brandl116aa622007-08-15 14:28:22 +0000562
Georg Brandlb1441c72009-01-03 22:33:39 +0000563 >>> # Show a dictionary sorted and grouped by value
Georg Brandl116aa622007-08-15 14:28:22 +0000564 >>> from operator import itemgetter
565 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000566 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000567 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000568 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000569 ...
570 1 ['a', 'c', 'e']
571 2 ['b', 'd', 'f']
572 3 ['g']
573
Georg Brandlb1441c72009-01-03 22:33:39 +0000574 >>> # Find runs of consecutive numbers using groupby. The key to the solution
575 >>> # is differencing with a range so that consecutive numbers all appear in
576 >>> # same group.
Georg Brandl116aa622007-08-15 14:28:22 +0000577 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
578 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000579 ... print(map(operator.itemgetter(1), g))
Georg Brandl48310cd2009-01-03 21:18:54 +0000580 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000581 [1]
582 [4, 5, 6]
583 [10]
584 [15, 16, 17, 18]
585 [22]
586 [25, 26, 27, 28]
587
588
589
590.. _itertools-recipes:
591
592Recipes
593-------
594
595This section shows recipes for creating an extended toolset using the existing
596itertools as building blocks.
597
598The extended tools offer the same high performance as the underlying toolset.
599The superior memory performance is kept by processing elements one at a time
600rather than bringing the whole iterable into memory all at once. Code volume is
601kept small by linking the tools together in a functional style which helps
602eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000603"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000604which incur interpreter overhead.
605
606.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000607
Georg Brandl3dbca812008-07-23 16:10:53 +0000608 def take(n, iterable):
609 "Return first n items of the iterable as a list"
610 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000611
Georg Brandl3dbca812008-07-23 16:10:53 +0000612 def enumerate(iterable, start=0):
613 return zip(count(start), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000614
Georg Brandl3dbca812008-07-23 16:10:53 +0000615 def tabulate(function, start=0):
Georg Brandl116aa622007-08-15 14:28:22 +0000616 "Return function(0), function(1), ..."
Georg Brandl3dbca812008-07-23 16:10:53 +0000617 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000618
Raymond Hettingerfa007962009-03-09 11:55:25 +0000619 def consume(iterator, n):
620 "Advance the iterator n-steps ahead. If n is none, consume entirely."
621 collections.deque(islice(iterator, n), maxlen=0)
622
Raymond Hettingercdf8ba32009-02-19 04:45:07 +0000623 def nth(iterable, n, default=None):
624 "Returns the nth item or a default value"
625 return next(islice(iterable, n, None), default)
Georg Brandl116aa622007-08-15 14:28:22 +0000626
Georg Brandl3dbca812008-07-23 16:10:53 +0000627 def quantify(iterable, pred=bool):
628 "Count how many times the predicate is true"
629 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000630
Georg Brandl3dbca812008-07-23 16:10:53 +0000631 def padnone(iterable):
Georg Brandl116aa622007-08-15 14:28:22 +0000632 """Returns the sequence elements and then returns None indefinitely.
633
634 Useful for emulating the behavior of the built-in map() function.
635 """
Georg Brandl3dbca812008-07-23 16:10:53 +0000636 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000637
Georg Brandl3dbca812008-07-23 16:10:53 +0000638 def ncycles(iterable, n):
Georg Brandl116aa622007-08-15 14:28:22 +0000639 "Returns the sequence elements n times"
Georg Brandl3dbca812008-07-23 16:10:53 +0000640 return chain.from_iterable(repeat(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000641
642 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000643 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000644
645 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000646 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000647
648 def repeatfunc(func, times=None, *args):
649 """Repeat calls to func with specified arguments.
650
651 Example: repeatfunc(random.random)
652 """
653 if times is None:
654 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000655 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000656
657 def pairwise(iterable):
658 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
659 a, b = tee(iterable)
Raymond Hettinger21315ba2009-02-23 19:38:09 +0000660 next(b, None)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000661 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000662
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000663 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000664 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000665 args = [iter(iterable)] * n
Raymond Hettinger883d2762009-01-27 04:57:51 +0000666 return zip_longest(*args, fillvalue=fillvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000667
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000668 def roundrobin(*iterables):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000669 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000670 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000671 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000672 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000673 while pending:
674 try:
675 for next in nexts:
676 yield next()
677 except StopIteration:
678 pending -= 1
679 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000680
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000681 def powerset(iterable):
Raymond Hettingerace67332009-01-26 02:23:50 +0000682 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
683 s = list(iterable)
684 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000685
Raymond Hettingerfd88ea72009-03-23 22:42:28 +0000686 def unique_everseen(iterable, key=None):
687 "List unique elements, preserving order. Remember all elements ever seen."
688 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
689 # unique_everseen('ABBCcAD', str.lower) --> A B C D
690 seen = set()
691 seen_add = seen.add
692 if key is None:
693 for element in iterable:
694 if element not in seen:
695 seen_add(element)
696 yield element
697 else:
698 for element in iterable:
699 k = key(element)
700 if k not in seen:
701 seen_add(k)
702 yield element
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000703
Raymond Hettingerfd88ea72009-03-23 22:42:28 +0000704 def unique_justseen(iterable, key=None):
705 "List unique elements, preserving order. Remember only the element just seen."
706 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
707 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
708 return map(next, map(itemgetter(1), groupby(iterable, key)))