blob: 147072146588c4e8b6b9b0af52a53521c6eb651c [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Georg Brandl9afde1c2007-11-01 20:32:30 +000016This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl116aa622007-08-15 14:28:22 +000017constructs from the Haskell and SML programming languages. Each has been recast
18in a form suitable for Python.
19
20The module standardizes a core set of fast, memory efficient tools that are
21useful by themselves or in combination. Standardization helps avoid the
22readability and reliability problems which arise when many different individuals
23create their own slightly varying implementations, each with their own quirks
24and naming conventions.
25
26The tools are designed to combine readily with one another. This makes it easy
27to construct more specialized tools succinctly and efficiently in pure Python.
28
29For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000030sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
31by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
33Likewise, the functional tools are designed to work well with the high-speed
34functions provided by the :mod:`operator` module.
35
Georg Brandl116aa622007-08-15 14:28:22 +000036Whether cast in pure python form or compiled code, tools that use iterators are
Georg Brandl3dbca812008-07-23 16:10:53 +000037more memory efficient (and often faster) than their list based counterparts. Adopting
Georg Brandl116aa622007-08-15 14:28:22 +000038the principles of just-in-time manufacturing, they create data when and where
39needed instead of consuming memory with the computer equivalent of "inventory".
40
Georg Brandl116aa622007-08-15 14:28:22 +000041
42.. seealso::
43
44 The Standard ML Basis Library, `The Standard ML Basis Library
45 <http://www.standardml.org/Basis/>`_.
46
47 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
48 Libraries <http://www.haskell.org/definition/>`_.
49
50
51.. _itertools-functions:
52
53Itertool functions
54------------------
55
56The following module functions all construct and return iterators. Some provide
57streams of infinite length, so they should only be accessed by functions or
58loops that truncate the stream.
59
60
61.. function:: chain(*iterables)
62
63 Make an iterator that returns elements from the first iterable until it is
64 exhausted, then proceeds to the next iterable, until all of the iterables are
65 exhausted. Used for treating consecutive sequences as a single sequence.
66 Equivalent to::
67
68 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000069 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000070 for it in iterables:
71 for element in it:
72 yield element
73
74
Christian Heimes70e7ea22008-02-28 20:02:27 +000075.. function:: itertools.chain.from_iterable(iterable)
76
Georg Brandl48310cd2009-01-03 21:18:54 +000077 Alternate constructor for :func:`chain`. Gets chained inputs from a
Christian Heimes70e7ea22008-02-28 20:02:27 +000078 single iterable argument that is evaluated lazily. Equivalent to::
79
80 @classmethod
81 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000082 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +000083 for it in iterables:
84 for element in it:
85 yield element
86
Christian Heimes78644762008-03-04 23:39:23 +000087
Christian Heimes836baa52008-02-26 08:18:30 +000088.. function:: combinations(iterable, r)
89
Christian Heimesdae2a892008-04-19 00:55:37 +000090 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +000091
Georg Brandl48310cd2009-01-03 21:18:54 +000092 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +000093 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +000094 in sorted order.
Christian Heimes836baa52008-02-26 08:18:30 +000095
96 Elements are treated as unique based on their position, not on their
97 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +000098 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +000099
Christian Heimes836baa52008-02-26 08:18:30 +0000100 Equivalent to::
101
102 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000103 # combinations('ABCD', 2) --> AB AC AD BC BD CD
104 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000105 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000106 n = len(pool)
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000107 if r > n:
108 return
109 indices = list(range(r))
Christian Heimesb558a2e2008-03-02 22:46:37 +0000110 yield tuple(pool[i] for i in indices)
Christian Heimes380f7f22008-02-28 11:19:05 +0000111 while 1:
112 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000113 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000114 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000115 else:
116 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000117 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000118 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000119 indices[j] = indices[j-1] + 1
120 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000121
Christian Heimes78644762008-03-04 23:39:23 +0000122 The code for :func:`combinations` can be also expressed as a subsequence
123 of :func:`permutations` after filtering entries where the elements are not
124 in sorted order (according to their position in the input pool)::
125
126 def combinations(iterable, r):
127 pool = tuple(iterable)
128 n = len(pool)
129 for indices in permutations(range(n), r):
130 if sorted(indices) == list(indices):
131 yield tuple(pool[i] for i in indices)
132
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000133 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
134 or zero when ``r > n``.
Christian Heimes836baa52008-02-26 08:18:30 +0000135
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000136.. function:: combinations_with_replacement(iterable, r)
137
138 Return *r* length subsequences of elements from the input *iterable*
139 allowing individual elements to be repeated more than once.
140
141 Combinations are emitted in lexicographic sort order. So, if the
142 input *iterable* is sorted, the combination tuples will be produced
143 in sorted order.
144
145 Elements are treated as unique based on their position, not on their
146 value. So if the input elements are unique, the generated combinations
147 will also be unique.
148
149 Equivalent to::
150
151 def combinations_with_replacement(iterable, r):
152 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
153 pool = tuple(iterable)
154 n = len(pool)
155 if not n and r:
156 return
157 indices = [0] * r
158 yield tuple(pool[i] for i in indices)
159 while 1:
160 for i in reversed(range(r)):
161 if indices[i] != n - 1:
162 break
163 else:
164 return
165 indices[i:] = [indices[i] + 1] * (r - i)
166 yield tuple(pool[i] for i in indices)
167
168 The code for :func:`combinations_with_replacement` can be also expressed as
169 a subsequence of :func:`product` after filtering entries where the elements
170 are not in sorted order (according to their position in the input pool)::
171
172 def combinations_with_replacement(iterable, r):
173 pool = tuple(iterable)
174 n = len(pool)
175 for indices in product(range(n), repeat=r):
176 if sorted(indices) == list(indices):
177 yield tuple(pool[i] for i in indices)
178
179 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
180
181 .. versionadded:: 2.7
182
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000183.. function:: compress(data, selectors)
184
185 Make an iterator that filters elements from *data* returning only those that
186 have a corresponding element in *selectors* that evaluates to ``True``.
187 Stops when either the *data* or *selectors* iterables have been exhausted.
188 Equivalent to::
189
190 def compress(data, selectors):
191 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
192 return (d for d, s in zip(data, selectors) if s)
193
194 .. versionadded:: 2.7
195
196
Georg Brandl116aa622007-08-15 14:28:22 +0000197.. function:: count([n])
198
199 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettingera6c60372008-03-13 01:26:19 +0000200 specified *n* defaults to zero. Often used as an argument to :func:`map` to
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000201 generate consecutive data points. Also, used with :func:`zip` to add sequence
Georg Brandl9afde1c2007-11-01 20:32:30 +0000202 numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000203
204 def count(n=0):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000205 # count(10) --> 10 11 12 13 14 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000206 while True:
207 yield n
208 n += 1
209
Georg Brandl116aa622007-08-15 14:28:22 +0000210
211.. function:: cycle(iterable)
212
213 Make an iterator returning elements from the iterable and saving a copy of each.
214 When the iterable is exhausted, return elements from the saved copy. Repeats
215 indefinitely. Equivalent to::
216
217 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000218 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000219 saved = []
220 for element in iterable:
221 yield element
222 saved.append(element)
223 while saved:
224 for element in saved:
225 yield element
226
227 Note, this member of the toolkit may require significant auxiliary storage
228 (depending on the length of the iterable).
229
230
231.. function:: dropwhile(predicate, iterable)
232
233 Make an iterator that drops elements from the iterable as long as the predicate
234 is true; afterwards, returns every element. Note, the iterator does not produce
235 *any* output until the predicate first becomes false, so it may have a lengthy
236 start-up time. Equivalent to::
237
238 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000239 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000240 iterable = iter(iterable)
241 for x in iterable:
242 if not predicate(x):
243 yield x
244 break
245 for x in iterable:
246 yield x
247
Raymond Hettinger749761e2009-01-27 04:42:48 +0000248.. function:: filterfalse(predicate, iterable)
249
250 Make an iterator that filters elements from iterable returning only those for
251 which the predicate is ``False``. If *predicate* is ``None``, return the items
252 that are false. Equivalent to::
253
254 def filterfalse(predicate, iterable):
255 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
256 if predicate is None:
257 predicate = bool
258 for x in iterable:
259 if not predicate(x):
260 yield x
261
Georg Brandl116aa622007-08-15 14:28:22 +0000262
263.. function:: groupby(iterable[, key])
264
265 Make an iterator that returns consecutive keys and groups from the *iterable*.
266 The *key* is a function computing a key value for each element. If not
267 specified or is ``None``, *key* defaults to an identity function and returns
268 the element unchanged. Generally, the iterable needs to already be sorted on
269 the same key function.
270
271 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
272 generates a break or new group every time the value of the key function changes
273 (which is why it is usually necessary to have sorted the data using the same key
274 function). That behavior differs from SQL's GROUP BY which aggregates common
275 elements regardless of their input order.
276
277 The returned group is itself an iterator that shares the underlying iterable
278 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
279 object is advanced, the previous group is no longer visible. So, if that data
280 is needed later, it should be stored as a list::
281
282 groups = []
283 uniquekeys = []
284 data = sorted(data, key=keyfunc)
285 for k, g in groupby(data, keyfunc):
286 groups.append(list(g)) # Store group iterator as a list
287 uniquekeys.append(k)
288
289 :func:`groupby` is equivalent to::
290
291 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000292 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
293 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000294 def __init__(self, iterable, key=None):
295 if key is None:
296 key = lambda x: x
297 self.keyfunc = key
298 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000299 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000300 def __iter__(self):
301 return self
302 def __next__(self):
303 while self.currkey == self.tgtkey:
304 self.currvalue = next(self.it) # Exit on StopIteration
305 self.currkey = self.keyfunc(self.currvalue)
306 self.tgtkey = self.currkey
307 return (self.currkey, self._grouper(self.tgtkey))
308 def _grouper(self, tgtkey):
309 while self.currkey == tgtkey:
310 yield self.currvalue
311 self.currvalue = next(self.it) # Exit on StopIteration
312 self.currkey = self.keyfunc(self.currvalue)
313
Georg Brandl116aa622007-08-15 14:28:22 +0000314
Georg Brandl116aa622007-08-15 14:28:22 +0000315.. function:: islice(iterable, [start,] stop [, step])
316
317 Make an iterator that returns selected elements from the iterable. If *start* is
318 non-zero, then elements from the iterable are skipped until start is reached.
319 Afterward, elements are returned consecutively unless *step* is set higher than
320 one which results in items being skipped. If *stop* is ``None``, then iteration
321 continues until the iterator is exhausted, if at all; otherwise, it stops at the
322 specified position. Unlike regular slicing, :func:`islice` does not support
323 negative values for *start*, *stop*, or *step*. Can be used to extract related
324 fields from data where the internal structure has been flattened (for example, a
325 multi-line report may list a name field on every third line). Equivalent to::
326
327 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000328 # islice('ABCDEFG', 2) --> A B
329 # islice('ABCDEFG', 2, 4) --> C D
330 # islice('ABCDEFG', 2, None) --> C D E F G
331 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000332 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000333 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000334 nexti = next(it)
335 for i, element in enumerate(iterable):
336 if i == nexti:
337 yield element
338 nexti = next(it)
339
340 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
341 then the step defaults to one.
342
Georg Brandl116aa622007-08-15 14:28:22 +0000343
Christian Heimes70e7ea22008-02-28 20:02:27 +0000344.. function:: permutations(iterable[, r])
345
346 Return successive *r* length permutations of elements in the *iterable*.
347
348 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl48310cd2009-01-03 21:18:54 +0000349 of the *iterable* and all possible full-length permutations
Christian Heimes70e7ea22008-02-28 20:02:27 +0000350 are generated.
351
Georg Brandl48310cd2009-01-03 21:18:54 +0000352 Permutations are emitted in lexicographic sort order. So, if the
Christian Heimes70e7ea22008-02-28 20:02:27 +0000353 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000354 in sorted order.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000355
356 Elements are treated as unique based on their position, not on their
357 value. So if the input elements are unique, there will be no repeat
358 values in each permutation.
359
Christian Heimesb558a2e2008-03-02 22:46:37 +0000360 Equivalent to::
361
362 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000363 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
364 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000365 pool = tuple(iterable)
366 n = len(pool)
367 r = n if r is None else r
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000368 if r > n:
369 return
370 indices = list(range(n))
Christian Heimesfe337bf2008-03-23 21:54:12 +0000371 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000372 yield tuple(pool[i] for i in indices[:r])
373 while n:
374 for i in reversed(range(r)):
375 cycles[i] -= 1
376 if cycles[i] == 0:
377 indices[i:] = indices[i+1:] + indices[i:i+1]
378 cycles[i] = n - i
379 else:
380 j = cycles[i]
381 indices[i], indices[-j] = indices[-j], indices[i]
382 yield tuple(pool[i] for i in indices[:r])
383 break
384 else:
385 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000386
Georg Brandl48310cd2009-01-03 21:18:54 +0000387 The code for :func:`permutations` can be also expressed as a subsequence of
Christian Heimes78644762008-03-04 23:39:23 +0000388 :func:`product`, filtered to exclude entries with repeated elements (those
389 from the same position in the input pool)::
390
391 def permutations(iterable, r=None):
392 pool = tuple(iterable)
393 n = len(pool)
394 r = n if r is None else r
395 for indices in product(range(n), repeat=r):
396 if len(set(indices)) == r:
397 yield tuple(pool[i] for i in indices)
398
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000399 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
400 or zero when ``r > n``.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000401
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000402.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000403
404 Cartesian product of input iterables.
405
406 Equivalent to nested for-loops in a generator expression. For example,
407 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
408
Christian Heimesdae2a892008-04-19 00:55:37 +0000409 The nested loops cycle like an odometer with the rightmost element advancing
410 on every iteration. This pattern creates a lexicographic ordering so that if
411 the input's iterables are sorted, the product tuples are emitted in sorted
412 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000413
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000414 To compute the product of an iterable with itself, specify the number of
415 repetitions with the optional *repeat* keyword argument. For example,
416 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
417
Christian Heimes78644762008-03-04 23:39:23 +0000418 This function is equivalent to the following code, except that the
419 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000420
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000421 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000422 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
423 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000424 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000425 result = [[]]
426 for pool in pools:
427 result = [x+[y] for x in result for y in pool]
428 for prod in result:
429 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000430
431
Georg Brandl116aa622007-08-15 14:28:22 +0000432.. function:: repeat(object[, times])
433
434 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000435 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000436 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000437 create an invariant part of a tuple record. Equivalent to::
438
439 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000440 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000441 if times is None:
442 while True:
443 yield object
444 else:
445 for i in range(times):
446 yield object
447
448
449.. function:: starmap(function, iterable)
450
Christian Heimes679db4a2008-01-18 09:56:22 +0000451 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000452 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000453 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000454 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000455 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
456
457 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000458 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000459 for args in iterable:
460 yield function(*args)
461
Georg Brandl116aa622007-08-15 14:28:22 +0000462
463.. function:: takewhile(predicate, iterable)
464
465 Make an iterator that returns elements from the iterable as long as the
466 predicate is true. Equivalent to::
467
468 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000469 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000470 for x in iterable:
471 if predicate(x):
472 yield x
473 else:
474 break
475
476
477.. function:: tee(iterable[, n=2])
478
479 Return *n* independent iterators from a single iterable. The case where ``n==2``
480 is equivalent to::
481
482 def tee(iterable):
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000483 def gen(next, data={}):
Georg Brandl116aa622007-08-15 14:28:22 +0000484 for i in count():
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000485 if i in data:
486 yield data.pop(i)
Georg Brandl116aa622007-08-15 14:28:22 +0000487 else:
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000488 data[i] = next()
489 yield data[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000490 it = iter(iterable)
491 return (gen(it.__next__), gen(it.__next__))
492
493 Note, once :func:`tee` has made a split, the original *iterable* should not be
494 used anywhere else; otherwise, the *iterable* could get advanced without the tee
495 objects being informed.
496
497 Note, this member of the toolkit may require significant auxiliary storage
498 (depending on how much temporary data needs to be stored). In general, if one
499 iterator is going to use most or all of the data before the other iterator, it
500 is faster to use :func:`list` instead of :func:`tee`.
501
Georg Brandl116aa622007-08-15 14:28:22 +0000502
Raymond Hettinger749761e2009-01-27 04:42:48 +0000503.. function:: zip_longest(*iterables[, fillvalue])
504
505 Make an iterator that aggregates elements from each of the iterables. If the
506 iterables are of uneven length, missing values are filled-in with *fillvalue*.
507 Iteration continues until the longest iterable is exhausted. Equivalent to::
508
509 def zip_longest(*args, fillvalue=None):
510 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
511 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
512 yield counter() # yields the fillvalue, or raises IndexError
513 fillers = repeat(fillvalue)
514 iters = [chain(it, sentinel(), fillers) for it in args]
515 try:
516 for tup in zip(*iters):
517 yield tup
518 except IndexError:
519 pass
520
521 If one of the iterables is potentially infinite, then the :func:`zip_longest`
522 function should be wrapped with something that limits the number of calls
523 (for example :func:`islice` or :func:`takewhile`). If not specified,
524 *fillvalue* defaults to ``None``.
525
526
Georg Brandl116aa622007-08-15 14:28:22 +0000527.. _itertools-example:
528
529Examples
530--------
531
532The following examples show common uses for each tool and demonstrate ways they
Christian Heimesfe337bf2008-03-23 21:54:12 +0000533can be combined.
534
535.. doctest::
Georg Brandl116aa622007-08-15 14:28:22 +0000536
Georg Brandlb1441c72009-01-03 22:33:39 +0000537 >>> # Show a dictionary sorted and grouped by value
Georg Brandl116aa622007-08-15 14:28:22 +0000538 >>> from operator import itemgetter
539 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000540 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000541 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000542 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000543 ...
544 1 ['a', 'c', 'e']
545 2 ['b', 'd', 'f']
546 3 ['g']
547
Georg Brandlb1441c72009-01-03 22:33:39 +0000548 >>> # Find runs of consecutive numbers using groupby. The key to the solution
549 >>> # is differencing with a range so that consecutive numbers all appear in
550 >>> # same group.
Georg Brandl116aa622007-08-15 14:28:22 +0000551 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
552 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000553 ... print(map(operator.itemgetter(1), g))
Georg Brandl48310cd2009-01-03 21:18:54 +0000554 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000555 [1]
556 [4, 5, 6]
557 [10]
558 [15, 16, 17, 18]
559 [22]
560 [25, 26, 27, 28]
561
562
563
564.. _itertools-recipes:
565
566Recipes
567-------
568
569This section shows recipes for creating an extended toolset using the existing
570itertools as building blocks.
571
572The extended tools offer the same high performance as the underlying toolset.
573The superior memory performance is kept by processing elements one at a time
574rather than bringing the whole iterable into memory all at once. Code volume is
575kept small by linking the tools together in a functional style which helps
576eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000577"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000578which incur interpreter overhead.
579
580.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000581
Georg Brandl3dbca812008-07-23 16:10:53 +0000582 def take(n, iterable):
583 "Return first n items of the iterable as a list"
584 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000585
Georg Brandl3dbca812008-07-23 16:10:53 +0000586 def enumerate(iterable, start=0):
587 return zip(count(start), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000588
Georg Brandl3dbca812008-07-23 16:10:53 +0000589 def tabulate(function, start=0):
Georg Brandl116aa622007-08-15 14:28:22 +0000590 "Return function(0), function(1), ..."
Georg Brandl3dbca812008-07-23 16:10:53 +0000591 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000592
Georg Brandl116aa622007-08-15 14:28:22 +0000593 def nth(iterable, n):
Georg Brandl3dbca812008-07-23 16:10:53 +0000594 "Returns the nth item or empty list"
595 return list(islice(iterable, n, n+1))
Georg Brandl116aa622007-08-15 14:28:22 +0000596
Georg Brandl3dbca812008-07-23 16:10:53 +0000597 def quantify(iterable, pred=bool):
598 "Count how many times the predicate is true"
599 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000600
Georg Brandl3dbca812008-07-23 16:10:53 +0000601 def padnone(iterable):
Georg Brandl116aa622007-08-15 14:28:22 +0000602 """Returns the sequence elements and then returns None indefinitely.
603
604 Useful for emulating the behavior of the built-in map() function.
605 """
Georg Brandl3dbca812008-07-23 16:10:53 +0000606 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000607
Georg Brandl3dbca812008-07-23 16:10:53 +0000608 def ncycles(iterable, n):
Georg Brandl116aa622007-08-15 14:28:22 +0000609 "Returns the sequence elements n times"
Georg Brandl3dbca812008-07-23 16:10:53 +0000610 return chain.from_iterable(repeat(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000611
612 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000613 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000614
615 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000616 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000617
618 def repeatfunc(func, times=None, *args):
619 """Repeat calls to func with specified arguments.
620
621 Example: repeatfunc(random.random)
622 """
623 if times is None:
624 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000625 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000626
627 def pairwise(iterable):
628 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
629 a, b = tee(iterable)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000630 for elem in b:
631 break
632 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000633
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000634 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000635 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000636 args = [iter(iterable)] * n
Benjamin Petersond18de0e2008-07-31 20:21:46 +0000637 return zip_longest(fillvalue=fillvalue, *args)
Georg Brandl116aa622007-08-15 14:28:22 +0000638
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000639 def roundrobin(*iterables):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000640 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000641 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000642 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000643 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000644 while pending:
645 try:
646 for next in nexts:
647 yield next()
648 except StopIteration:
649 pending -= 1
650 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000651
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000652 def powerset(iterable):
Raymond Hettingerace67332009-01-26 02:23:50 +0000653 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
654 s = list(iterable)
655 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000656
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000657 def unique_everseen(iterable, key=None):
658 "List unique elements, preserving order. Remember all elements ever seen."
659 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
Georg Brandl48310cd2009-01-03 21:18:54 +0000660 # unique_everseen('ABBCcAD', str.lower) --> A B C D
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000661 seen = set()
662 seen_add = seen.add
663 if key is None:
664 for element in iterable:
665 if element not in seen:
666 seen_add(element)
667 yield element
668 else:
669 for element in iterable:
670 k = key(element)
671 if k not in seen:
672 seen_add(k)
673 yield element
674
675 def unique_justseen(iterable, key=None):
676 "List unique elements, preserving order. Remember only the element just seen."
677 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
678 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
679 return map(next, map(itemgetter(1), groupby(iterable, key)))