blob: d28127801f5fbac71e53523983785a7da96d6a8f [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Georg Brandl9afde1c2007-11-01 20:32:30 +000016This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl116aa622007-08-15 14:28:22 +000017constructs from the Haskell and SML programming languages. Each has been recast
18in a form suitable for Python.
19
20The module standardizes a core set of fast, memory efficient tools that are
21useful by themselves or in combination. Standardization helps avoid the
22readability and reliability problems which arise when many different individuals
23create their own slightly varying implementations, each with their own quirks
24and naming conventions.
25
26The tools are designed to combine readily with one another. This makes it easy
27to construct more specialized tools succinctly and efficiently in pure Python.
28
29For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000030sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
31by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
33Likewise, the functional tools are designed to work well with the high-speed
34functions provided by the :mod:`operator` module.
35
Georg Brandl116aa622007-08-15 14:28:22 +000036Whether cast in pure python form or compiled code, tools that use iterators are
Georg Brandl3dbca812008-07-23 16:10:53 +000037more memory efficient (and often faster) than their list based counterparts. Adopting
Georg Brandl116aa622007-08-15 14:28:22 +000038the principles of just-in-time manufacturing, they create data when and where
39needed instead of consuming memory with the computer equivalent of "inventory".
40
Georg Brandl116aa622007-08-15 14:28:22 +000041
42.. seealso::
43
44 The Standard ML Basis Library, `The Standard ML Basis Library
45 <http://www.standardml.org/Basis/>`_.
46
47 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
48 Libraries <http://www.haskell.org/definition/>`_.
49
50
51.. _itertools-functions:
52
53Itertool functions
54------------------
55
56The following module functions all construct and return iterators. Some provide
57streams of infinite length, so they should only be accessed by functions or
58loops that truncate the stream.
59
60
61.. function:: chain(*iterables)
62
63 Make an iterator that returns elements from the first iterable until it is
64 exhausted, then proceeds to the next iterable, until all of the iterables are
65 exhausted. Used for treating consecutive sequences as a single sequence.
66 Equivalent to::
67
68 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000069 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000070 for it in iterables:
71 for element in it:
72 yield element
73
74
Christian Heimes70e7ea22008-02-28 20:02:27 +000075.. function:: itertools.chain.from_iterable(iterable)
76
Georg Brandl48310cd2009-01-03 21:18:54 +000077 Alternate constructor for :func:`chain`. Gets chained inputs from a
Christian Heimes70e7ea22008-02-28 20:02:27 +000078 single iterable argument that is evaluated lazily. Equivalent to::
79
80 @classmethod
81 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000082 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +000083 for it in iterables:
84 for element in it:
85 yield element
86
Christian Heimes78644762008-03-04 23:39:23 +000087
Christian Heimes836baa52008-02-26 08:18:30 +000088.. function:: combinations(iterable, r)
89
Christian Heimesdae2a892008-04-19 00:55:37 +000090 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +000091
Georg Brandl48310cd2009-01-03 21:18:54 +000092 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +000093 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +000094 in sorted order.
Christian Heimes836baa52008-02-26 08:18:30 +000095
96 Elements are treated as unique based on their position, not on their
97 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +000098 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +000099
Christian Heimes836baa52008-02-26 08:18:30 +0000100 Equivalent to::
101
102 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000103 # combinations('ABCD', 2) --> AB AC AD BC BD CD
104 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000105 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000106 n = len(pool)
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000107 if r > n:
108 return
109 indices = list(range(r))
Christian Heimesb558a2e2008-03-02 22:46:37 +0000110 yield tuple(pool[i] for i in indices)
Christian Heimes380f7f22008-02-28 11:19:05 +0000111 while 1:
112 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000113 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000114 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000115 else:
116 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000117 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000118 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000119 indices[j] = indices[j-1] + 1
120 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000121
Christian Heimes78644762008-03-04 23:39:23 +0000122 The code for :func:`combinations` can be also expressed as a subsequence
123 of :func:`permutations` after filtering entries where the elements are not
124 in sorted order (according to their position in the input pool)::
125
126 def combinations(iterable, r):
127 pool = tuple(iterable)
128 n = len(pool)
129 for indices in permutations(range(n), r):
130 if sorted(indices) == list(indices):
131 yield tuple(pool[i] for i in indices)
132
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000133 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
134 or zero when ``r > n``.
Christian Heimes836baa52008-02-26 08:18:30 +0000135
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000136.. function:: compress(data, selectors)
137
138 Make an iterator that filters elements from *data* returning only those that
139 have a corresponding element in *selectors* that evaluates to ``True``.
140 Stops when either the *data* or *selectors* iterables have been exhausted.
141 Equivalent to::
142
143 def compress(data, selectors):
144 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
145 return (d for d, s in zip(data, selectors) if s)
146
147 .. versionadded:: 2.7
148
149
Georg Brandl116aa622007-08-15 14:28:22 +0000150.. function:: count([n])
151
152 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettingera6c60372008-03-13 01:26:19 +0000153 specified *n* defaults to zero. Often used as an argument to :func:`map` to
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000154 generate consecutive data points. Also, used with :func:`zip` to add sequence
Georg Brandl9afde1c2007-11-01 20:32:30 +0000155 numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000156
157 def count(n=0):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000158 # count(10) --> 10 11 12 13 14 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000159 while True:
160 yield n
161 n += 1
162
Georg Brandl116aa622007-08-15 14:28:22 +0000163
164.. function:: cycle(iterable)
165
166 Make an iterator returning elements from the iterable and saving a copy of each.
167 When the iterable is exhausted, return elements from the saved copy. Repeats
168 indefinitely. Equivalent to::
169
170 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000171 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000172 saved = []
173 for element in iterable:
174 yield element
175 saved.append(element)
176 while saved:
177 for element in saved:
178 yield element
179
180 Note, this member of the toolkit may require significant auxiliary storage
181 (depending on the length of the iterable).
182
183
184.. function:: dropwhile(predicate, iterable)
185
186 Make an iterator that drops elements from the iterable as long as the predicate
187 is true; afterwards, returns every element. Note, the iterator does not produce
188 *any* output until the predicate first becomes false, so it may have a lengthy
189 start-up time. Equivalent to::
190
191 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000192 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000193 iterable = iter(iterable)
194 for x in iterable:
195 if not predicate(x):
196 yield x
197 break
198 for x in iterable:
199 yield x
200
201
202.. function:: groupby(iterable[, key])
203
204 Make an iterator that returns consecutive keys and groups from the *iterable*.
205 The *key* is a function computing a key value for each element. If not
206 specified or is ``None``, *key* defaults to an identity function and returns
207 the element unchanged. Generally, the iterable needs to already be sorted on
208 the same key function.
209
210 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
211 generates a break or new group every time the value of the key function changes
212 (which is why it is usually necessary to have sorted the data using the same key
213 function). That behavior differs from SQL's GROUP BY which aggregates common
214 elements regardless of their input order.
215
216 The returned group is itself an iterator that shares the underlying iterable
217 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
218 object is advanced, the previous group is no longer visible. So, if that data
219 is needed later, it should be stored as a list::
220
221 groups = []
222 uniquekeys = []
223 data = sorted(data, key=keyfunc)
224 for k, g in groupby(data, keyfunc):
225 groups.append(list(g)) # Store group iterator as a list
226 uniquekeys.append(k)
227
228 :func:`groupby` is equivalent to::
229
230 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000231 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
232 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000233 def __init__(self, iterable, key=None):
234 if key is None:
235 key = lambda x: x
236 self.keyfunc = key
237 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000238 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000239 def __iter__(self):
240 return self
241 def __next__(self):
242 while self.currkey == self.tgtkey:
243 self.currvalue = next(self.it) # Exit on StopIteration
244 self.currkey = self.keyfunc(self.currvalue)
245 self.tgtkey = self.currkey
246 return (self.currkey, self._grouper(self.tgtkey))
247 def _grouper(self, tgtkey):
248 while self.currkey == tgtkey:
249 yield self.currvalue
250 self.currvalue = next(self.it) # Exit on StopIteration
251 self.currkey = self.keyfunc(self.currvalue)
252
Georg Brandl116aa622007-08-15 14:28:22 +0000253
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000254.. function:: filterfalse(predicate, iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000255
256 Make an iterator that filters elements from iterable returning only those for
257 which the predicate is ``False``. If *predicate* is ``None``, return the items
258 that are false. Equivalent to::
259
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000260 def filterfalse(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000261 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl116aa622007-08-15 14:28:22 +0000262 if predicate is None:
263 predicate = bool
264 for x in iterable:
265 if not predicate(x):
266 yield x
267
268
Georg Brandl116aa622007-08-15 14:28:22 +0000269.. function:: islice(iterable, [start,] stop [, step])
270
271 Make an iterator that returns selected elements from the iterable. If *start* is
272 non-zero, then elements from the iterable are skipped until start is reached.
273 Afterward, elements are returned consecutively unless *step* is set higher than
274 one which results in items being skipped. If *stop* is ``None``, then iteration
275 continues until the iterator is exhausted, if at all; otherwise, it stops at the
276 specified position. Unlike regular slicing, :func:`islice` does not support
277 negative values for *start*, *stop*, or *step*. Can be used to extract related
278 fields from data where the internal structure has been flattened (for example, a
279 multi-line report may list a name field on every third line). Equivalent to::
280
281 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000282 # islice('ABCDEFG', 2) --> A B
283 # islice('ABCDEFG', 2, 4) --> C D
284 # islice('ABCDEFG', 2, None) --> C D E F G
285 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000286 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000287 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000288 nexti = next(it)
289 for i, element in enumerate(iterable):
290 if i == nexti:
291 yield element
292 nexti = next(it)
293
294 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
295 then the step defaults to one.
296
Georg Brandl116aa622007-08-15 14:28:22 +0000297
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000298.. function:: zip_longest(*iterables[, fillvalue])
Georg Brandl116aa622007-08-15 14:28:22 +0000299
300 Make an iterator that aggregates elements from each of the iterables. If the
301 iterables are of uneven length, missing values are filled-in with *fillvalue*.
302 Iteration continues until the longest iterable is exhausted. Equivalent to::
303
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000304 def zip_longest(*args, fillvalue=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000305 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl116aa622007-08-15 14:28:22 +0000306 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
307 yield counter() # yields the fillvalue, or raises IndexError
308 fillers = repeat(fillvalue)
309 iters = [chain(it, sentinel(), fillers) for it in args]
310 try:
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000311 for tup in zip(*iters):
Georg Brandl116aa622007-08-15 14:28:22 +0000312 yield tup
313 except IndexError:
314 pass
315
Benjamin Peterson37d2fe02008-10-24 22:28:58 +0000316 If one of the iterables is potentially infinite, then the :func:`zip_longest`
317 function should be wrapped with something that limits the number of calls
318 (for example :func:`islice` or :func:`takewhile`). If not specified,
319 *fillvalue* defaults to ``None``.
Georg Brandl116aa622007-08-15 14:28:22 +0000320
Georg Brandl116aa622007-08-15 14:28:22 +0000321
Christian Heimes70e7ea22008-02-28 20:02:27 +0000322.. function:: permutations(iterable[, r])
323
324 Return successive *r* length permutations of elements in the *iterable*.
325
326 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl48310cd2009-01-03 21:18:54 +0000327 of the *iterable* and all possible full-length permutations
Christian Heimes70e7ea22008-02-28 20:02:27 +0000328 are generated.
329
Georg Brandl48310cd2009-01-03 21:18:54 +0000330 Permutations are emitted in lexicographic sort order. So, if the
Christian Heimes70e7ea22008-02-28 20:02:27 +0000331 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000332 in sorted order.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000333
334 Elements are treated as unique based on their position, not on their
335 value. So if the input elements are unique, there will be no repeat
336 values in each permutation.
337
Christian Heimesb558a2e2008-03-02 22:46:37 +0000338 Equivalent to::
339
340 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000341 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
342 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000343 pool = tuple(iterable)
344 n = len(pool)
345 r = n if r is None else r
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000346 if r > n:
347 return
348 indices = list(range(n))
Christian Heimesfe337bf2008-03-23 21:54:12 +0000349 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000350 yield tuple(pool[i] for i in indices[:r])
351 while n:
352 for i in reversed(range(r)):
353 cycles[i] -= 1
354 if cycles[i] == 0:
355 indices[i:] = indices[i+1:] + indices[i:i+1]
356 cycles[i] = n - i
357 else:
358 j = cycles[i]
359 indices[i], indices[-j] = indices[-j], indices[i]
360 yield tuple(pool[i] for i in indices[:r])
361 break
362 else:
363 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000364
Georg Brandl48310cd2009-01-03 21:18:54 +0000365 The code for :func:`permutations` can be also expressed as a subsequence of
Christian Heimes78644762008-03-04 23:39:23 +0000366 :func:`product`, filtered to exclude entries with repeated elements (those
367 from the same position in the input pool)::
368
369 def permutations(iterable, r=None):
370 pool = tuple(iterable)
371 n = len(pool)
372 r = n if r is None else r
373 for indices in product(range(n), repeat=r):
374 if len(set(indices)) == r:
375 yield tuple(pool[i] for i in indices)
376
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000377 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
378 or zero when ``r > n``.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000379
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000380.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000381
382 Cartesian product of input iterables.
383
384 Equivalent to nested for-loops in a generator expression. For example,
385 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
386
Christian Heimesdae2a892008-04-19 00:55:37 +0000387 The nested loops cycle like an odometer with the rightmost element advancing
388 on every iteration. This pattern creates a lexicographic ordering so that if
389 the input's iterables are sorted, the product tuples are emitted in sorted
390 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000391
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000392 To compute the product of an iterable with itself, specify the number of
393 repetitions with the optional *repeat* keyword argument. For example,
394 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
395
Christian Heimes78644762008-03-04 23:39:23 +0000396 This function is equivalent to the following code, except that the
397 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000398
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000399 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000400 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
401 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000402 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000403 result = [[]]
404 for pool in pools:
405 result = [x+[y] for x in result for y in pool]
406 for prod in result:
407 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000408
409
Georg Brandl116aa622007-08-15 14:28:22 +0000410.. function:: repeat(object[, times])
411
412 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000413 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000414 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000415 create an invariant part of a tuple record. Equivalent to::
416
417 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000418 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000419 if times is None:
420 while True:
421 yield object
422 else:
423 for i in range(times):
424 yield object
425
426
427.. function:: starmap(function, iterable)
428
Christian Heimes679db4a2008-01-18 09:56:22 +0000429 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000430 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000431 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000432 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000433 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
434
435 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000436 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000437 for args in iterable:
438 yield function(*args)
439
Georg Brandl116aa622007-08-15 14:28:22 +0000440
441.. function:: takewhile(predicate, iterable)
442
443 Make an iterator that returns elements from the iterable as long as the
444 predicate is true. Equivalent to::
445
446 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000447 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000448 for x in iterable:
449 if predicate(x):
450 yield x
451 else:
452 break
453
454
455.. function:: tee(iterable[, n=2])
456
457 Return *n* independent iterators from a single iterable. The case where ``n==2``
458 is equivalent to::
459
460 def tee(iterable):
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000461 def gen(next, data={}):
Georg Brandl116aa622007-08-15 14:28:22 +0000462 for i in count():
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000463 if i in data:
464 yield data.pop(i)
Georg Brandl116aa622007-08-15 14:28:22 +0000465 else:
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000466 data[i] = next()
467 yield data[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000468 it = iter(iterable)
469 return (gen(it.__next__), gen(it.__next__))
470
471 Note, once :func:`tee` has made a split, the original *iterable* should not be
472 used anywhere else; otherwise, the *iterable* could get advanced without the tee
473 objects being informed.
474
475 Note, this member of the toolkit may require significant auxiliary storage
476 (depending on how much temporary data needs to be stored). In general, if one
477 iterator is going to use most or all of the data before the other iterator, it
478 is faster to use :func:`list` instead of :func:`tee`.
479
Georg Brandl116aa622007-08-15 14:28:22 +0000480
481.. _itertools-example:
482
483Examples
484--------
485
486The following examples show common uses for each tool and demonstrate ways they
Christian Heimesfe337bf2008-03-23 21:54:12 +0000487can be combined.
488
489.. doctest::
Georg Brandl116aa622007-08-15 14:28:22 +0000490
Georg Brandlb1441c72009-01-03 22:33:39 +0000491 >>> # Show a dictionary sorted and grouped by value
Georg Brandl116aa622007-08-15 14:28:22 +0000492 >>> from operator import itemgetter
493 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000494 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000495 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000496 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000497 ...
498 1 ['a', 'c', 'e']
499 2 ['b', 'd', 'f']
500 3 ['g']
501
Georg Brandlb1441c72009-01-03 22:33:39 +0000502 >>> # Find runs of consecutive numbers using groupby. The key to the solution
503 >>> # is differencing with a range so that consecutive numbers all appear in
504 >>> # same group.
Georg Brandl116aa622007-08-15 14:28:22 +0000505 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
506 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000507 ... print(map(operator.itemgetter(1), g))
Georg Brandl48310cd2009-01-03 21:18:54 +0000508 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000509 [1]
510 [4, 5, 6]
511 [10]
512 [15, 16, 17, 18]
513 [22]
514 [25, 26, 27, 28]
515
516
517
518.. _itertools-recipes:
519
520Recipes
521-------
522
523This section shows recipes for creating an extended toolset using the existing
524itertools as building blocks.
525
526The extended tools offer the same high performance as the underlying toolset.
527The superior memory performance is kept by processing elements one at a time
528rather than bringing the whole iterable into memory all at once. Code volume is
529kept small by linking the tools together in a functional style which helps
530eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000531"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000532which incur interpreter overhead.
533
534.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000535
Georg Brandl3dbca812008-07-23 16:10:53 +0000536 def take(n, iterable):
537 "Return first n items of the iterable as a list"
538 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000539
Georg Brandl3dbca812008-07-23 16:10:53 +0000540 def enumerate(iterable, start=0):
541 return zip(count(start), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000542
Georg Brandl3dbca812008-07-23 16:10:53 +0000543 def tabulate(function, start=0):
Georg Brandl116aa622007-08-15 14:28:22 +0000544 "Return function(0), function(1), ..."
Georg Brandl3dbca812008-07-23 16:10:53 +0000545 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000546
Georg Brandl116aa622007-08-15 14:28:22 +0000547 def nth(iterable, n):
Georg Brandl3dbca812008-07-23 16:10:53 +0000548 "Returns the nth item or empty list"
549 return list(islice(iterable, n, n+1))
Georg Brandl116aa622007-08-15 14:28:22 +0000550
Georg Brandl3dbca812008-07-23 16:10:53 +0000551 def quantify(iterable, pred=bool):
552 "Count how many times the predicate is true"
553 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000554
Georg Brandl3dbca812008-07-23 16:10:53 +0000555 def padnone(iterable):
Georg Brandl116aa622007-08-15 14:28:22 +0000556 """Returns the sequence elements and then returns None indefinitely.
557
558 Useful for emulating the behavior of the built-in map() function.
559 """
Georg Brandl3dbca812008-07-23 16:10:53 +0000560 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000561
Georg Brandl3dbca812008-07-23 16:10:53 +0000562 def ncycles(iterable, n):
Georg Brandl116aa622007-08-15 14:28:22 +0000563 "Returns the sequence elements n times"
Georg Brandl3dbca812008-07-23 16:10:53 +0000564 return chain.from_iterable(repeat(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000565
566 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000567 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000568
569 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000570 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000571
572 def repeatfunc(func, times=None, *args):
573 """Repeat calls to func with specified arguments.
574
575 Example: repeatfunc(random.random)
576 """
577 if times is None:
578 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000579 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000580
581 def pairwise(iterable):
582 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
583 a, b = tee(iterable)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000584 for elem in b:
585 break
586 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000587
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000588 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000589 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000590 args = [iter(iterable)] * n
Benjamin Petersond18de0e2008-07-31 20:21:46 +0000591 return zip_longest(fillvalue=fillvalue, *args)
Georg Brandl116aa622007-08-15 14:28:22 +0000592
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000593 def roundrobin(*iterables):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000594 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000595 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000596 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000597 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000598 while pending:
599 try:
600 for next in nexts:
601 yield next()
602 except StopIteration:
603 pending -= 1
604 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000605
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000606 def powerset(iterable):
Raymond Hettingerace67332009-01-26 02:23:50 +0000607 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
608 s = list(iterable)
609 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000610
Georg Brandl3dbca812008-07-23 16:10:53 +0000611 def combinations_with_replacement(iterable, r):
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000612 "combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC"
613 # number items returned: (n+r-1)! / r! / (n-1)!
Georg Brandl3dbca812008-07-23 16:10:53 +0000614 pool = tuple(iterable)
615 n = len(pool)
616 indices = [0] * r
617 yield tuple(pool[i] for i in indices)
618 while True:
619 for i in reversed(range(r)):
620 if indices[i] != n - 1:
621 break
622 else:
623 return
624 indices[i:] = [indices[i] + 1] * (r - i)
625 yield tuple(pool[i] for i in indices)
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000626
627 def unique_everseen(iterable, key=None):
628 "List unique elements, preserving order. Remember all elements ever seen."
629 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
Georg Brandl48310cd2009-01-03 21:18:54 +0000630 # unique_everseen('ABBCcAD', str.lower) --> A B C D
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000631 seen = set()
632 seen_add = seen.add
633 if key is None:
634 for element in iterable:
635 if element not in seen:
636 seen_add(element)
637 yield element
638 else:
639 for element in iterable:
640 k = key(element)
641 if k not in seen:
642 seen_add(k)
643 yield element
644
645 def unique_justseen(iterable, key=None):
646 "List unique elements, preserving order. Remember only the element just seen."
647 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
648 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
649 return map(next, map(itemgetter(1), groupby(iterable, key)))