blob: 36254cd8f8d38bcfe99133be7385dfeca9ad9813 [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Georg Brandl9afde1c2007-11-01 20:32:30 +000016This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl116aa622007-08-15 14:28:22 +000017constructs from the Haskell and SML programming languages. Each has been recast
18in a form suitable for Python.
19
20The module standardizes a core set of fast, memory efficient tools that are
21useful by themselves or in combination. Standardization helps avoid the
22readability and reliability problems which arise when many different individuals
23create their own slightly varying implementations, each with their own quirks
24and naming conventions.
25
26The tools are designed to combine readily with one another. This makes it easy
27to construct more specialized tools succinctly and efficiently in pure Python.
28
29For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000030sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
31by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
33Likewise, the functional tools are designed to work well with the high-speed
34functions provided by the :mod:`operator` module.
35
Georg Brandl116aa622007-08-15 14:28:22 +000036Whether cast in pure python form or compiled code, tools that use iterators are
Georg Brandl3dbca812008-07-23 16:10:53 +000037more memory efficient (and often faster) than their list based counterparts. Adopting
Georg Brandl116aa622007-08-15 14:28:22 +000038the principles of just-in-time manufacturing, they create data when and where
39needed instead of consuming memory with the computer equivalent of "inventory".
40
Georg Brandl116aa622007-08-15 14:28:22 +000041
42.. seealso::
43
44 The Standard ML Basis Library, `The Standard ML Basis Library
45 <http://www.standardml.org/Basis/>`_.
46
47 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
48 Libraries <http://www.haskell.org/definition/>`_.
49
50
51.. _itertools-functions:
52
53Itertool functions
54------------------
55
56The following module functions all construct and return iterators. Some provide
57streams of infinite length, so they should only be accessed by functions or
58loops that truncate the stream.
59
60
61.. function:: chain(*iterables)
62
63 Make an iterator that returns elements from the first iterable until it is
64 exhausted, then proceeds to the next iterable, until all of the iterables are
65 exhausted. Used for treating consecutive sequences as a single sequence.
66 Equivalent to::
67
68 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000069 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000070 for it in iterables:
71 for element in it:
72 yield element
73
74
Christian Heimes70e7ea22008-02-28 20:02:27 +000075.. function:: itertools.chain.from_iterable(iterable)
76
Georg Brandl48310cd2009-01-03 21:18:54 +000077 Alternate constructor for :func:`chain`. Gets chained inputs from a
Christian Heimes70e7ea22008-02-28 20:02:27 +000078 single iterable argument that is evaluated lazily. Equivalent to::
79
80 @classmethod
81 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000082 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +000083 for it in iterables:
84 for element in it:
85 yield element
86
Christian Heimes78644762008-03-04 23:39:23 +000087
Christian Heimes836baa52008-02-26 08:18:30 +000088.. function:: combinations(iterable, r)
89
Christian Heimesdae2a892008-04-19 00:55:37 +000090 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +000091
Georg Brandl48310cd2009-01-03 21:18:54 +000092 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +000093 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +000094 in sorted order.
Christian Heimes836baa52008-02-26 08:18:30 +000095
96 Elements are treated as unique based on their position, not on their
97 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +000098 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +000099
Christian Heimes836baa52008-02-26 08:18:30 +0000100 Equivalent to::
101
102 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000103 # combinations('ABCD', 2) --> AB AC AD BC BD CD
104 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000105 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000106 n = len(pool)
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000107 if r > n:
108 return
109 indices = list(range(r))
Christian Heimesb558a2e2008-03-02 22:46:37 +0000110 yield tuple(pool[i] for i in indices)
Christian Heimes380f7f22008-02-28 11:19:05 +0000111 while 1:
112 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000113 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000114 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000115 else:
116 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000117 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000118 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000119 indices[j] = indices[j-1] + 1
120 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000121
Christian Heimes78644762008-03-04 23:39:23 +0000122 The code for :func:`combinations` can be also expressed as a subsequence
123 of :func:`permutations` after filtering entries where the elements are not
124 in sorted order (according to their position in the input pool)::
125
126 def combinations(iterable, r):
127 pool = tuple(iterable)
128 n = len(pool)
129 for indices in permutations(range(n), r):
130 if sorted(indices) == list(indices):
131 yield tuple(pool[i] for i in indices)
132
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000133 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
134 or zero when ``r > n``.
Christian Heimes836baa52008-02-26 08:18:30 +0000135
Georg Brandl116aa622007-08-15 14:28:22 +0000136.. function:: count([n])
137
138 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettingera6c60372008-03-13 01:26:19 +0000139 specified *n* defaults to zero. Often used as an argument to :func:`map` to
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000140 generate consecutive data points. Also, used with :func:`zip` to add sequence
Georg Brandl9afde1c2007-11-01 20:32:30 +0000141 numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000142
143 def count(n=0):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000144 # count(10) --> 10 11 12 13 14 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000145 while True:
146 yield n
147 n += 1
148
Georg Brandl116aa622007-08-15 14:28:22 +0000149
150.. function:: cycle(iterable)
151
152 Make an iterator returning elements from the iterable and saving a copy of each.
153 When the iterable is exhausted, return elements from the saved copy. Repeats
154 indefinitely. Equivalent to::
155
156 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000157 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000158 saved = []
159 for element in iterable:
160 yield element
161 saved.append(element)
162 while saved:
163 for element in saved:
164 yield element
165
166 Note, this member of the toolkit may require significant auxiliary storage
167 (depending on the length of the iterable).
168
169
170.. function:: dropwhile(predicate, iterable)
171
172 Make an iterator that drops elements from the iterable as long as the predicate
173 is true; afterwards, returns every element. Note, the iterator does not produce
174 *any* output until the predicate first becomes false, so it may have a lengthy
175 start-up time. Equivalent to::
176
177 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000178 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000179 iterable = iter(iterable)
180 for x in iterable:
181 if not predicate(x):
182 yield x
183 break
184 for x in iterable:
185 yield x
186
187
188.. function:: groupby(iterable[, key])
189
190 Make an iterator that returns consecutive keys and groups from the *iterable*.
191 The *key* is a function computing a key value for each element. If not
192 specified or is ``None``, *key* defaults to an identity function and returns
193 the element unchanged. Generally, the iterable needs to already be sorted on
194 the same key function.
195
196 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
197 generates a break or new group every time the value of the key function changes
198 (which is why it is usually necessary to have sorted the data using the same key
199 function). That behavior differs from SQL's GROUP BY which aggregates common
200 elements regardless of their input order.
201
202 The returned group is itself an iterator that shares the underlying iterable
203 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
204 object is advanced, the previous group is no longer visible. So, if that data
205 is needed later, it should be stored as a list::
206
207 groups = []
208 uniquekeys = []
209 data = sorted(data, key=keyfunc)
210 for k, g in groupby(data, keyfunc):
211 groups.append(list(g)) # Store group iterator as a list
212 uniquekeys.append(k)
213
214 :func:`groupby` is equivalent to::
215
216 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000217 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
218 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000219 def __init__(self, iterable, key=None):
220 if key is None:
221 key = lambda x: x
222 self.keyfunc = key
223 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000224 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000225 def __iter__(self):
226 return self
227 def __next__(self):
228 while self.currkey == self.tgtkey:
229 self.currvalue = next(self.it) # Exit on StopIteration
230 self.currkey = self.keyfunc(self.currvalue)
231 self.tgtkey = self.currkey
232 return (self.currkey, self._grouper(self.tgtkey))
233 def _grouper(self, tgtkey):
234 while self.currkey == tgtkey:
235 yield self.currvalue
236 self.currvalue = next(self.it) # Exit on StopIteration
237 self.currkey = self.keyfunc(self.currvalue)
238
Georg Brandl116aa622007-08-15 14:28:22 +0000239
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000240.. function:: filterfalse(predicate, iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000241
242 Make an iterator that filters elements from iterable returning only those for
243 which the predicate is ``False``. If *predicate* is ``None``, return the items
244 that are false. Equivalent to::
245
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000246 def filterfalse(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000247 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl116aa622007-08-15 14:28:22 +0000248 if predicate is None:
249 predicate = bool
250 for x in iterable:
251 if not predicate(x):
252 yield x
253
254
Georg Brandl116aa622007-08-15 14:28:22 +0000255.. function:: islice(iterable, [start,] stop [, step])
256
257 Make an iterator that returns selected elements from the iterable. If *start* is
258 non-zero, then elements from the iterable are skipped until start is reached.
259 Afterward, elements are returned consecutively unless *step* is set higher than
260 one which results in items being skipped. If *stop* is ``None``, then iteration
261 continues until the iterator is exhausted, if at all; otherwise, it stops at the
262 specified position. Unlike regular slicing, :func:`islice` does not support
263 negative values for *start*, *stop*, or *step*. Can be used to extract related
264 fields from data where the internal structure has been flattened (for example, a
265 multi-line report may list a name field on every third line). Equivalent to::
266
267 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000268 # islice('ABCDEFG', 2) --> A B
269 # islice('ABCDEFG', 2, 4) --> C D
270 # islice('ABCDEFG', 2, None) --> C D E F G
271 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000272 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000273 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000274 nexti = next(it)
275 for i, element in enumerate(iterable):
276 if i == nexti:
277 yield element
278 nexti = next(it)
279
280 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
281 then the step defaults to one.
282
Georg Brandl116aa622007-08-15 14:28:22 +0000283
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000284.. function:: zip_longest(*iterables[, fillvalue])
Georg Brandl116aa622007-08-15 14:28:22 +0000285
286 Make an iterator that aggregates elements from each of the iterables. If the
287 iterables are of uneven length, missing values are filled-in with *fillvalue*.
288 Iteration continues until the longest iterable is exhausted. Equivalent to::
289
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000290 def zip_longest(*args, fillvalue=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000291 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl116aa622007-08-15 14:28:22 +0000292 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
293 yield counter() # yields the fillvalue, or raises IndexError
294 fillers = repeat(fillvalue)
295 iters = [chain(it, sentinel(), fillers) for it in args]
296 try:
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000297 for tup in zip(*iters):
Georg Brandl116aa622007-08-15 14:28:22 +0000298 yield tup
299 except IndexError:
300 pass
301
Benjamin Peterson37d2fe02008-10-24 22:28:58 +0000302 If one of the iterables is potentially infinite, then the :func:`zip_longest`
303 function should be wrapped with something that limits the number of calls
304 (for example :func:`islice` or :func:`takewhile`). If not specified,
305 *fillvalue* defaults to ``None``.
Georg Brandl116aa622007-08-15 14:28:22 +0000306
Georg Brandl116aa622007-08-15 14:28:22 +0000307
Christian Heimes70e7ea22008-02-28 20:02:27 +0000308.. function:: permutations(iterable[, r])
309
310 Return successive *r* length permutations of elements in the *iterable*.
311
312 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl48310cd2009-01-03 21:18:54 +0000313 of the *iterable* and all possible full-length permutations
Christian Heimes70e7ea22008-02-28 20:02:27 +0000314 are generated.
315
Georg Brandl48310cd2009-01-03 21:18:54 +0000316 Permutations are emitted in lexicographic sort order. So, if the
Christian Heimes70e7ea22008-02-28 20:02:27 +0000317 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000318 in sorted order.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000319
320 Elements are treated as unique based on their position, not on their
321 value. So if the input elements are unique, there will be no repeat
322 values in each permutation.
323
Christian Heimesb558a2e2008-03-02 22:46:37 +0000324 Equivalent to::
325
326 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000327 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
328 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000329 pool = tuple(iterable)
330 n = len(pool)
331 r = n if r is None else r
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000332 if r > n:
333 return
334 indices = list(range(n))
Christian Heimesfe337bf2008-03-23 21:54:12 +0000335 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000336 yield tuple(pool[i] for i in indices[:r])
337 while n:
338 for i in reversed(range(r)):
339 cycles[i] -= 1
340 if cycles[i] == 0:
341 indices[i:] = indices[i+1:] + indices[i:i+1]
342 cycles[i] = n - i
343 else:
344 j = cycles[i]
345 indices[i], indices[-j] = indices[-j], indices[i]
346 yield tuple(pool[i] for i in indices[:r])
347 break
348 else:
349 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000350
Georg Brandl48310cd2009-01-03 21:18:54 +0000351 The code for :func:`permutations` can be also expressed as a subsequence of
Christian Heimes78644762008-03-04 23:39:23 +0000352 :func:`product`, filtered to exclude entries with repeated elements (those
353 from the same position in the input pool)::
354
355 def permutations(iterable, r=None):
356 pool = tuple(iterable)
357 n = len(pool)
358 r = n if r is None else r
359 for indices in product(range(n), repeat=r):
360 if len(set(indices)) == r:
361 yield tuple(pool[i] for i in indices)
362
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000363 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
364 or zero when ``r > n``.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000365
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000366.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000367
368 Cartesian product of input iterables.
369
370 Equivalent to nested for-loops in a generator expression. For example,
371 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
372
Christian Heimesdae2a892008-04-19 00:55:37 +0000373 The nested loops cycle like an odometer with the rightmost element advancing
374 on every iteration. This pattern creates a lexicographic ordering so that if
375 the input's iterables are sorted, the product tuples are emitted in sorted
376 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000377
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000378 To compute the product of an iterable with itself, specify the number of
379 repetitions with the optional *repeat* keyword argument. For example,
380 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
381
Christian Heimes78644762008-03-04 23:39:23 +0000382 This function is equivalent to the following code, except that the
383 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000384
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000385 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000386 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
387 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000388 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000389 result = [[]]
390 for pool in pools:
391 result = [x+[y] for x in result for y in pool]
392 for prod in result:
393 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000394
395
Georg Brandl116aa622007-08-15 14:28:22 +0000396.. function:: repeat(object[, times])
397
398 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000399 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000400 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000401 create an invariant part of a tuple record. Equivalent to::
402
403 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000404 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000405 if times is None:
406 while True:
407 yield object
408 else:
409 for i in range(times):
410 yield object
411
412
413.. function:: starmap(function, iterable)
414
Christian Heimes679db4a2008-01-18 09:56:22 +0000415 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000416 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000417 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000418 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000419 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
420
421 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000422 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000423 for args in iterable:
424 yield function(*args)
425
Georg Brandl116aa622007-08-15 14:28:22 +0000426
427.. function:: takewhile(predicate, iterable)
428
429 Make an iterator that returns elements from the iterable as long as the
430 predicate is true. Equivalent to::
431
432 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000433 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000434 for x in iterable:
435 if predicate(x):
436 yield x
437 else:
438 break
439
440
441.. function:: tee(iterable[, n=2])
442
443 Return *n* independent iterators from a single iterable. The case where ``n==2``
444 is equivalent to::
445
446 def tee(iterable):
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000447 def gen(next, data={}):
Georg Brandl116aa622007-08-15 14:28:22 +0000448 for i in count():
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000449 if i in data:
450 yield data.pop(i)
Georg Brandl116aa622007-08-15 14:28:22 +0000451 else:
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000452 data[i] = next()
453 yield data[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000454 it = iter(iterable)
455 return (gen(it.__next__), gen(it.__next__))
456
457 Note, once :func:`tee` has made a split, the original *iterable* should not be
458 used anywhere else; otherwise, the *iterable* could get advanced without the tee
459 objects being informed.
460
461 Note, this member of the toolkit may require significant auxiliary storage
462 (depending on how much temporary data needs to be stored). In general, if one
463 iterator is going to use most or all of the data before the other iterator, it
464 is faster to use :func:`list` instead of :func:`tee`.
465
Georg Brandl116aa622007-08-15 14:28:22 +0000466
467.. _itertools-example:
468
469Examples
470--------
471
472The following examples show common uses for each tool and demonstrate ways they
Christian Heimesfe337bf2008-03-23 21:54:12 +0000473can be combined.
474
475.. doctest::
Georg Brandl116aa622007-08-15 14:28:22 +0000476
Georg Brandlb1441c72009-01-03 22:33:39 +0000477 >>> # Show a dictionary sorted and grouped by value
Georg Brandl116aa622007-08-15 14:28:22 +0000478 >>> from operator import itemgetter
479 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000480 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000481 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000482 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000483 ...
484 1 ['a', 'c', 'e']
485 2 ['b', 'd', 'f']
486 3 ['g']
487
Georg Brandlb1441c72009-01-03 22:33:39 +0000488 >>> # Find runs of consecutive numbers using groupby. The key to the solution
489 >>> # is differencing with a range so that consecutive numbers all appear in
490 >>> # same group.
Georg Brandl116aa622007-08-15 14:28:22 +0000491 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
492 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000493 ... print(map(operator.itemgetter(1), g))
Georg Brandl48310cd2009-01-03 21:18:54 +0000494 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000495 [1]
496 [4, 5, 6]
497 [10]
498 [15, 16, 17, 18]
499 [22]
500 [25, 26, 27, 28]
501
502
503
504.. _itertools-recipes:
505
506Recipes
507-------
508
509This section shows recipes for creating an extended toolset using the existing
510itertools as building blocks.
511
512The extended tools offer the same high performance as the underlying toolset.
513The superior memory performance is kept by processing elements one at a time
514rather than bringing the whole iterable into memory all at once. Code volume is
515kept small by linking the tools together in a functional style which helps
516eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000517"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000518which incur interpreter overhead.
519
520.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000521
Georg Brandl3dbca812008-07-23 16:10:53 +0000522 def take(n, iterable):
523 "Return first n items of the iterable as a list"
524 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000525
Georg Brandl3dbca812008-07-23 16:10:53 +0000526 def enumerate(iterable, start=0):
527 return zip(count(start), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000528
Georg Brandl3dbca812008-07-23 16:10:53 +0000529 def tabulate(function, start=0):
Georg Brandl116aa622007-08-15 14:28:22 +0000530 "Return function(0), function(1), ..."
Georg Brandl3dbca812008-07-23 16:10:53 +0000531 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000532
Georg Brandl116aa622007-08-15 14:28:22 +0000533 def nth(iterable, n):
Georg Brandl3dbca812008-07-23 16:10:53 +0000534 "Returns the nth item or empty list"
535 return list(islice(iterable, n, n+1))
Georg Brandl116aa622007-08-15 14:28:22 +0000536
Georg Brandl3dbca812008-07-23 16:10:53 +0000537 def quantify(iterable, pred=bool):
538 "Count how many times the predicate is true"
539 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000540
Georg Brandl3dbca812008-07-23 16:10:53 +0000541 def padnone(iterable):
Georg Brandl116aa622007-08-15 14:28:22 +0000542 """Returns the sequence elements and then returns None indefinitely.
543
544 Useful for emulating the behavior of the built-in map() function.
545 """
Georg Brandl3dbca812008-07-23 16:10:53 +0000546 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000547
Georg Brandl3dbca812008-07-23 16:10:53 +0000548 def ncycles(iterable, n):
Georg Brandl116aa622007-08-15 14:28:22 +0000549 "Returns the sequence elements n times"
Georg Brandl3dbca812008-07-23 16:10:53 +0000550 return chain.from_iterable(repeat(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000551
552 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000553 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000554
555 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000556 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000557
558 def repeatfunc(func, times=None, *args):
559 """Repeat calls to func with specified arguments.
560
561 Example: repeatfunc(random.random)
562 """
563 if times is None:
564 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000565 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000566
567 def pairwise(iterable):
568 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
569 a, b = tee(iterable)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000570 for elem in b:
571 break
572 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000573
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000574 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000575 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000576 args = [iter(iterable)] * n
Benjamin Petersond18de0e2008-07-31 20:21:46 +0000577 return zip_longest(fillvalue=fillvalue, *args)
Georg Brandl116aa622007-08-15 14:28:22 +0000578
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000579 def roundrobin(*iterables):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000580 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000581 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000582 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000583 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000584 while pending:
585 try:
586 for next in nexts:
587 yield next()
588 except StopIteration:
589 pending -= 1
590 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000591
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000592 def powerset(iterable):
Raymond Hettingerace67332009-01-26 02:23:50 +0000593 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
594 s = list(iterable)
595 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000596
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000597 def compress(data, selectors):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000598 "compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F"
Benjamin Peterson37d2fe02008-10-24 22:28:58 +0000599 return (d for d, s in zip(data, selectors) if s)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000600
Georg Brandl3dbca812008-07-23 16:10:53 +0000601 def combinations_with_replacement(iterable, r):
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000602 "combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC"
603 # number items returned: (n+r-1)! / r! / (n-1)!
Georg Brandl3dbca812008-07-23 16:10:53 +0000604 pool = tuple(iterable)
605 n = len(pool)
606 indices = [0] * r
607 yield tuple(pool[i] for i in indices)
608 while True:
609 for i in reversed(range(r)):
610 if indices[i] != n - 1:
611 break
612 else:
613 return
614 indices[i:] = [indices[i] + 1] * (r - i)
615 yield tuple(pool[i] for i in indices)
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000616
617 def unique_everseen(iterable, key=None):
618 "List unique elements, preserving order. Remember all elements ever seen."
619 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
Georg Brandl48310cd2009-01-03 21:18:54 +0000620 # unique_everseen('ABBCcAD', str.lower) --> A B C D
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000621 seen = set()
622 seen_add = seen.add
623 if key is None:
624 for element in iterable:
625 if element not in seen:
626 seen_add(element)
627 yield element
628 else:
629 for element in iterable:
630 k = key(element)
631 if k not in seen:
632 seen_add(k)
633 yield element
634
635 def unique_justseen(iterable, key=None):
636 "List unique elements, preserving order. Remember only the element just seen."
637 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
638 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
639 return map(next, map(itemgetter(1), groupby(iterable, key)))