blob: aba1e25f091678a445ddc33acf40fbe0f8c65abe [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Georg Brandl9afde1c2007-11-01 20:32:30 +000016This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl116aa622007-08-15 14:28:22 +000017constructs from the Haskell and SML programming languages. Each has been recast
18in a form suitable for Python.
19
20The module standardizes a core set of fast, memory efficient tools that are
21useful by themselves or in combination. Standardization helps avoid the
22readability and reliability problems which arise when many different individuals
23create their own slightly varying implementations, each with their own quirks
24and naming conventions.
25
26The tools are designed to combine readily with one another. This makes it easy
27to construct more specialized tools succinctly and efficiently in pure Python.
28
29For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000030sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
31by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
33Likewise, the functional tools are designed to work well with the high-speed
34functions provided by the :mod:`operator` module.
35
Georg Brandl116aa622007-08-15 14:28:22 +000036Whether cast in pure python form or compiled code, tools that use iterators are
Georg Brandl3dbca812008-07-23 16:10:53 +000037more memory efficient (and often faster) than their list based counterparts. Adopting
Georg Brandl116aa622007-08-15 14:28:22 +000038the principles of just-in-time manufacturing, they create data when and where
39needed instead of consuming memory with the computer equivalent of "inventory".
40
Georg Brandl116aa622007-08-15 14:28:22 +000041
42.. seealso::
43
44 The Standard ML Basis Library, `The Standard ML Basis Library
45 <http://www.standardml.org/Basis/>`_.
46
47 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
48 Libraries <http://www.haskell.org/definition/>`_.
49
50
51.. _itertools-functions:
52
53Itertool functions
54------------------
55
56The following module functions all construct and return iterators. Some provide
57streams of infinite length, so they should only be accessed by functions or
58loops that truncate the stream.
59
60
61.. function:: chain(*iterables)
62
63 Make an iterator that returns elements from the first iterable until it is
64 exhausted, then proceeds to the next iterable, until all of the iterables are
65 exhausted. Used for treating consecutive sequences as a single sequence.
66 Equivalent to::
67
68 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000069 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000070 for it in iterables:
71 for element in it:
72 yield element
73
74
Christian Heimes70e7ea22008-02-28 20:02:27 +000075.. function:: itertools.chain.from_iterable(iterable)
76
Georg Brandl48310cd2009-01-03 21:18:54 +000077 Alternate constructor for :func:`chain`. Gets chained inputs from a
Christian Heimes70e7ea22008-02-28 20:02:27 +000078 single iterable argument that is evaluated lazily. Equivalent to::
79
80 @classmethod
81 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000082 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +000083 for it in iterables:
84 for element in it:
85 yield element
86
Christian Heimes78644762008-03-04 23:39:23 +000087
Christian Heimes836baa52008-02-26 08:18:30 +000088.. function:: combinations(iterable, r)
89
Christian Heimesdae2a892008-04-19 00:55:37 +000090 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +000091
Georg Brandl48310cd2009-01-03 21:18:54 +000092 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +000093 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +000094 in sorted order.
Christian Heimes836baa52008-02-26 08:18:30 +000095
96 Elements are treated as unique based on their position, not on their
97 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +000098 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +000099
Christian Heimes836baa52008-02-26 08:18:30 +0000100 Equivalent to::
101
102 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000103 # combinations('ABCD', 2) --> AB AC AD BC BD CD
104 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000105 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000106 n = len(pool)
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000107 if r > n:
108 return
109 indices = list(range(r))
Christian Heimesb558a2e2008-03-02 22:46:37 +0000110 yield tuple(pool[i] for i in indices)
Christian Heimes380f7f22008-02-28 11:19:05 +0000111 while 1:
112 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000113 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000114 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000115 else:
116 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000117 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000118 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000119 indices[j] = indices[j-1] + 1
120 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000121
Christian Heimes78644762008-03-04 23:39:23 +0000122 The code for :func:`combinations` can be also expressed as a subsequence
123 of :func:`permutations` after filtering entries where the elements are not
124 in sorted order (according to their position in the input pool)::
125
126 def combinations(iterable, r):
127 pool = tuple(iterable)
128 n = len(pool)
129 for indices in permutations(range(n), r):
130 if sorted(indices) == list(indices):
131 yield tuple(pool[i] for i in indices)
132
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000133 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
134 or zero when ``r > n``.
Christian Heimes836baa52008-02-26 08:18:30 +0000135
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000136.. function:: combinations_with_replacement(iterable, r)
137
138 Return *r* length subsequences of elements from the input *iterable*
139 allowing individual elements to be repeated more than once.
140
141 Combinations are emitted in lexicographic sort order. So, if the
142 input *iterable* is sorted, the combination tuples will be produced
143 in sorted order.
144
145 Elements are treated as unique based on their position, not on their
146 value. So if the input elements are unique, the generated combinations
147 will also be unique.
148
149 Equivalent to::
150
151 def combinations_with_replacement(iterable, r):
152 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
153 pool = tuple(iterable)
154 n = len(pool)
155 if not n and r:
156 return
157 indices = [0] * r
158 yield tuple(pool[i] for i in indices)
159 while 1:
160 for i in reversed(range(r)):
161 if indices[i] != n - 1:
162 break
163 else:
164 return
165 indices[i:] = [indices[i] + 1] * (r - i)
166 yield tuple(pool[i] for i in indices)
167
168 The code for :func:`combinations_with_replacement` can be also expressed as
169 a subsequence of :func:`product` after filtering entries where the elements
170 are not in sorted order (according to their position in the input pool)::
171
172 def combinations_with_replacement(iterable, r):
173 pool = tuple(iterable)
174 n = len(pool)
175 for indices in product(range(n), repeat=r):
176 if sorted(indices) == list(indices):
177 yield tuple(pool[i] for i in indices)
178
179 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
180
181 .. versionadded:: 2.7
182
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000183.. function:: compress(data, selectors)
184
185 Make an iterator that filters elements from *data* returning only those that
186 have a corresponding element in *selectors* that evaluates to ``True``.
187 Stops when either the *data* or *selectors* iterables have been exhausted.
188 Equivalent to::
189
190 def compress(data, selectors):
191 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
192 return (d for d, s in zip(data, selectors) if s)
193
194 .. versionadded:: 2.7
195
196
Georg Brandl116aa622007-08-15 14:28:22 +0000197.. function:: count([n])
198
199 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettingera6c60372008-03-13 01:26:19 +0000200 specified *n* defaults to zero. Often used as an argument to :func:`map` to
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000201 generate consecutive data points. Also, used with :func:`zip` to add sequence
Georg Brandl9afde1c2007-11-01 20:32:30 +0000202 numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000203
204 def count(n=0):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000205 # count(10) --> 10 11 12 13 14 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000206 while True:
207 yield n
208 n += 1
209
Georg Brandl116aa622007-08-15 14:28:22 +0000210
211.. function:: cycle(iterable)
212
213 Make an iterator returning elements from the iterable and saving a copy of each.
214 When the iterable is exhausted, return elements from the saved copy. Repeats
215 indefinitely. Equivalent to::
216
217 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000218 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000219 saved = []
220 for element in iterable:
221 yield element
222 saved.append(element)
223 while saved:
224 for element in saved:
225 yield element
226
227 Note, this member of the toolkit may require significant auxiliary storage
228 (depending on the length of the iterable).
229
230
231.. function:: dropwhile(predicate, iterable)
232
233 Make an iterator that drops elements from the iterable as long as the predicate
234 is true; afterwards, returns every element. Note, the iterator does not produce
235 *any* output until the predicate first becomes false, so it may have a lengthy
236 start-up time. Equivalent to::
237
238 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000239 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000240 iterable = iter(iterable)
241 for x in iterable:
242 if not predicate(x):
243 yield x
244 break
245 for x in iterable:
246 yield x
247
248
249.. function:: groupby(iterable[, key])
250
251 Make an iterator that returns consecutive keys and groups from the *iterable*.
252 The *key* is a function computing a key value for each element. If not
253 specified or is ``None``, *key* defaults to an identity function and returns
254 the element unchanged. Generally, the iterable needs to already be sorted on
255 the same key function.
256
257 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
258 generates a break or new group every time the value of the key function changes
259 (which is why it is usually necessary to have sorted the data using the same key
260 function). That behavior differs from SQL's GROUP BY which aggregates common
261 elements regardless of their input order.
262
263 The returned group is itself an iterator that shares the underlying iterable
264 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
265 object is advanced, the previous group is no longer visible. So, if that data
266 is needed later, it should be stored as a list::
267
268 groups = []
269 uniquekeys = []
270 data = sorted(data, key=keyfunc)
271 for k, g in groupby(data, keyfunc):
272 groups.append(list(g)) # Store group iterator as a list
273 uniquekeys.append(k)
274
275 :func:`groupby` is equivalent to::
276
277 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000278 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
279 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000280 def __init__(self, iterable, key=None):
281 if key is None:
282 key = lambda x: x
283 self.keyfunc = key
284 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000285 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000286 def __iter__(self):
287 return self
288 def __next__(self):
289 while self.currkey == self.tgtkey:
290 self.currvalue = next(self.it) # Exit on StopIteration
291 self.currkey = self.keyfunc(self.currvalue)
292 self.tgtkey = self.currkey
293 return (self.currkey, self._grouper(self.tgtkey))
294 def _grouper(self, tgtkey):
295 while self.currkey == tgtkey:
296 yield self.currvalue
297 self.currvalue = next(self.it) # Exit on StopIteration
298 self.currkey = self.keyfunc(self.currvalue)
299
Georg Brandl116aa622007-08-15 14:28:22 +0000300
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000301.. function:: filterfalse(predicate, iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000302
303 Make an iterator that filters elements from iterable returning only those for
304 which the predicate is ``False``. If *predicate* is ``None``, return the items
305 that are false. Equivalent to::
306
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000307 def filterfalse(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000308 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl116aa622007-08-15 14:28:22 +0000309 if predicate is None:
310 predicate = bool
311 for x in iterable:
312 if not predicate(x):
313 yield x
314
315
Georg Brandl116aa622007-08-15 14:28:22 +0000316.. function:: islice(iterable, [start,] stop [, step])
317
318 Make an iterator that returns selected elements from the iterable. If *start* is
319 non-zero, then elements from the iterable are skipped until start is reached.
320 Afterward, elements are returned consecutively unless *step* is set higher than
321 one which results in items being skipped. If *stop* is ``None``, then iteration
322 continues until the iterator is exhausted, if at all; otherwise, it stops at the
323 specified position. Unlike regular slicing, :func:`islice` does not support
324 negative values for *start*, *stop*, or *step*. Can be used to extract related
325 fields from data where the internal structure has been flattened (for example, a
326 multi-line report may list a name field on every third line). Equivalent to::
327
328 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000329 # islice('ABCDEFG', 2) --> A B
330 # islice('ABCDEFG', 2, 4) --> C D
331 # islice('ABCDEFG', 2, None) --> C D E F G
332 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000333 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000334 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000335 nexti = next(it)
336 for i, element in enumerate(iterable):
337 if i == nexti:
338 yield element
339 nexti = next(it)
340
341 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
342 then the step defaults to one.
343
Georg Brandl116aa622007-08-15 14:28:22 +0000344
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000345.. function:: zip_longest(*iterables[, fillvalue])
Georg Brandl116aa622007-08-15 14:28:22 +0000346
347 Make an iterator that aggregates elements from each of the iterables. If the
348 iterables are of uneven length, missing values are filled-in with *fillvalue*.
349 Iteration continues until the longest iterable is exhausted. Equivalent to::
350
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000351 def zip_longest(*args, fillvalue=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000352 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl116aa622007-08-15 14:28:22 +0000353 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
354 yield counter() # yields the fillvalue, or raises IndexError
355 fillers = repeat(fillvalue)
356 iters = [chain(it, sentinel(), fillers) for it in args]
357 try:
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000358 for tup in zip(*iters):
Georg Brandl116aa622007-08-15 14:28:22 +0000359 yield tup
360 except IndexError:
361 pass
362
Benjamin Peterson37d2fe02008-10-24 22:28:58 +0000363 If one of the iterables is potentially infinite, then the :func:`zip_longest`
364 function should be wrapped with something that limits the number of calls
365 (for example :func:`islice` or :func:`takewhile`). If not specified,
366 *fillvalue* defaults to ``None``.
Georg Brandl116aa622007-08-15 14:28:22 +0000367
Georg Brandl116aa622007-08-15 14:28:22 +0000368
Christian Heimes70e7ea22008-02-28 20:02:27 +0000369.. function:: permutations(iterable[, r])
370
371 Return successive *r* length permutations of elements in the *iterable*.
372
373 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl48310cd2009-01-03 21:18:54 +0000374 of the *iterable* and all possible full-length permutations
Christian Heimes70e7ea22008-02-28 20:02:27 +0000375 are generated.
376
Georg Brandl48310cd2009-01-03 21:18:54 +0000377 Permutations are emitted in lexicographic sort order. So, if the
Christian Heimes70e7ea22008-02-28 20:02:27 +0000378 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000379 in sorted order.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000380
381 Elements are treated as unique based on their position, not on their
382 value. So if the input elements are unique, there will be no repeat
383 values in each permutation.
384
Christian Heimesb558a2e2008-03-02 22:46:37 +0000385 Equivalent to::
386
387 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000388 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
389 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000390 pool = tuple(iterable)
391 n = len(pool)
392 r = n if r is None else r
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000393 if r > n:
394 return
395 indices = list(range(n))
Christian Heimesfe337bf2008-03-23 21:54:12 +0000396 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000397 yield tuple(pool[i] for i in indices[:r])
398 while n:
399 for i in reversed(range(r)):
400 cycles[i] -= 1
401 if cycles[i] == 0:
402 indices[i:] = indices[i+1:] + indices[i:i+1]
403 cycles[i] = n - i
404 else:
405 j = cycles[i]
406 indices[i], indices[-j] = indices[-j], indices[i]
407 yield tuple(pool[i] for i in indices[:r])
408 break
409 else:
410 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000411
Georg Brandl48310cd2009-01-03 21:18:54 +0000412 The code for :func:`permutations` can be also expressed as a subsequence of
Christian Heimes78644762008-03-04 23:39:23 +0000413 :func:`product`, filtered to exclude entries with repeated elements (those
414 from the same position in the input pool)::
415
416 def permutations(iterable, r=None):
417 pool = tuple(iterable)
418 n = len(pool)
419 r = n if r is None else r
420 for indices in product(range(n), repeat=r):
421 if len(set(indices)) == r:
422 yield tuple(pool[i] for i in indices)
423
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000424 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
425 or zero when ``r > n``.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000426
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000427.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000428
429 Cartesian product of input iterables.
430
431 Equivalent to nested for-loops in a generator expression. For example,
432 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
433
Christian Heimesdae2a892008-04-19 00:55:37 +0000434 The nested loops cycle like an odometer with the rightmost element advancing
435 on every iteration. This pattern creates a lexicographic ordering so that if
436 the input's iterables are sorted, the product tuples are emitted in sorted
437 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000438
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000439 To compute the product of an iterable with itself, specify the number of
440 repetitions with the optional *repeat* keyword argument. For example,
441 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
442
Christian Heimes78644762008-03-04 23:39:23 +0000443 This function is equivalent to the following code, except that the
444 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000445
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000446 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000447 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
448 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000449 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000450 result = [[]]
451 for pool in pools:
452 result = [x+[y] for x in result for y in pool]
453 for prod in result:
454 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000455
456
Georg Brandl116aa622007-08-15 14:28:22 +0000457.. function:: repeat(object[, times])
458
459 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000460 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000461 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000462 create an invariant part of a tuple record. Equivalent to::
463
464 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000465 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000466 if times is None:
467 while True:
468 yield object
469 else:
470 for i in range(times):
471 yield object
472
473
474.. function:: starmap(function, iterable)
475
Christian Heimes679db4a2008-01-18 09:56:22 +0000476 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000477 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000478 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000479 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000480 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
481
482 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000483 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000484 for args in iterable:
485 yield function(*args)
486
Georg Brandl116aa622007-08-15 14:28:22 +0000487
488.. function:: takewhile(predicate, iterable)
489
490 Make an iterator that returns elements from the iterable as long as the
491 predicate is true. Equivalent to::
492
493 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000494 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000495 for x in iterable:
496 if predicate(x):
497 yield x
498 else:
499 break
500
501
502.. function:: tee(iterable[, n=2])
503
504 Return *n* independent iterators from a single iterable. The case where ``n==2``
505 is equivalent to::
506
507 def tee(iterable):
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000508 def gen(next, data={}):
Georg Brandl116aa622007-08-15 14:28:22 +0000509 for i in count():
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000510 if i in data:
511 yield data.pop(i)
Georg Brandl116aa622007-08-15 14:28:22 +0000512 else:
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000513 data[i] = next()
514 yield data[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000515 it = iter(iterable)
516 return (gen(it.__next__), gen(it.__next__))
517
518 Note, once :func:`tee` has made a split, the original *iterable* should not be
519 used anywhere else; otherwise, the *iterable* could get advanced without the tee
520 objects being informed.
521
522 Note, this member of the toolkit may require significant auxiliary storage
523 (depending on how much temporary data needs to be stored). In general, if one
524 iterator is going to use most or all of the data before the other iterator, it
525 is faster to use :func:`list` instead of :func:`tee`.
526
Georg Brandl116aa622007-08-15 14:28:22 +0000527
528.. _itertools-example:
529
530Examples
531--------
532
533The following examples show common uses for each tool and demonstrate ways they
Christian Heimesfe337bf2008-03-23 21:54:12 +0000534can be combined.
535
536.. doctest::
Georg Brandl116aa622007-08-15 14:28:22 +0000537
Georg Brandlb1441c72009-01-03 22:33:39 +0000538 >>> # Show a dictionary sorted and grouped by value
Georg Brandl116aa622007-08-15 14:28:22 +0000539 >>> from operator import itemgetter
540 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000541 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000542 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000543 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000544 ...
545 1 ['a', 'c', 'e']
546 2 ['b', 'd', 'f']
547 3 ['g']
548
Georg Brandlb1441c72009-01-03 22:33:39 +0000549 >>> # Find runs of consecutive numbers using groupby. The key to the solution
550 >>> # is differencing with a range so that consecutive numbers all appear in
551 >>> # same group.
Georg Brandl116aa622007-08-15 14:28:22 +0000552 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
553 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000554 ... print(map(operator.itemgetter(1), g))
Georg Brandl48310cd2009-01-03 21:18:54 +0000555 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000556 [1]
557 [4, 5, 6]
558 [10]
559 [15, 16, 17, 18]
560 [22]
561 [25, 26, 27, 28]
562
563
564
565.. _itertools-recipes:
566
567Recipes
568-------
569
570This section shows recipes for creating an extended toolset using the existing
571itertools as building blocks.
572
573The extended tools offer the same high performance as the underlying toolset.
574The superior memory performance is kept by processing elements one at a time
575rather than bringing the whole iterable into memory all at once. Code volume is
576kept small by linking the tools together in a functional style which helps
577eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000578"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000579which incur interpreter overhead.
580
581.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000582
Georg Brandl3dbca812008-07-23 16:10:53 +0000583 def take(n, iterable):
584 "Return first n items of the iterable as a list"
585 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000586
Georg Brandl3dbca812008-07-23 16:10:53 +0000587 def enumerate(iterable, start=0):
588 return zip(count(start), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000589
Georg Brandl3dbca812008-07-23 16:10:53 +0000590 def tabulate(function, start=0):
Georg Brandl116aa622007-08-15 14:28:22 +0000591 "Return function(0), function(1), ..."
Georg Brandl3dbca812008-07-23 16:10:53 +0000592 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000593
Georg Brandl116aa622007-08-15 14:28:22 +0000594 def nth(iterable, n):
Georg Brandl3dbca812008-07-23 16:10:53 +0000595 "Returns the nth item or empty list"
596 return list(islice(iterable, n, n+1))
Georg Brandl116aa622007-08-15 14:28:22 +0000597
Georg Brandl3dbca812008-07-23 16:10:53 +0000598 def quantify(iterable, pred=bool):
599 "Count how many times the predicate is true"
600 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000601
Georg Brandl3dbca812008-07-23 16:10:53 +0000602 def padnone(iterable):
Georg Brandl116aa622007-08-15 14:28:22 +0000603 """Returns the sequence elements and then returns None indefinitely.
604
605 Useful for emulating the behavior of the built-in map() function.
606 """
Georg Brandl3dbca812008-07-23 16:10:53 +0000607 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000608
Georg Brandl3dbca812008-07-23 16:10:53 +0000609 def ncycles(iterable, n):
Georg Brandl116aa622007-08-15 14:28:22 +0000610 "Returns the sequence elements n times"
Georg Brandl3dbca812008-07-23 16:10:53 +0000611 return chain.from_iterable(repeat(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000612
613 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000614 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000615
616 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000617 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000618
619 def repeatfunc(func, times=None, *args):
620 """Repeat calls to func with specified arguments.
621
622 Example: repeatfunc(random.random)
623 """
624 if times is None:
625 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000626 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000627
628 def pairwise(iterable):
629 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
630 a, b = tee(iterable)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000631 for elem in b:
632 break
633 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000634
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000635 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000636 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000637 args = [iter(iterable)] * n
Benjamin Petersond18de0e2008-07-31 20:21:46 +0000638 return zip_longest(fillvalue=fillvalue, *args)
Georg Brandl116aa622007-08-15 14:28:22 +0000639
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000640 def roundrobin(*iterables):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000641 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000642 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000643 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000644 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000645 while pending:
646 try:
647 for next in nexts:
648 yield next()
649 except StopIteration:
650 pending -= 1
651 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000652
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000653 def powerset(iterable):
Raymond Hettingerace67332009-01-26 02:23:50 +0000654 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
655 s = list(iterable)
656 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000657
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000658 def unique_everseen(iterable, key=None):
659 "List unique elements, preserving order. Remember all elements ever seen."
660 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
Georg Brandl48310cd2009-01-03 21:18:54 +0000661 # unique_everseen('ABBCcAD', str.lower) --> A B C D
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000662 seen = set()
663 seen_add = seen.add
664 if key is None:
665 for element in iterable:
666 if element not in seen:
667 seen_add(element)
668 yield element
669 else:
670 for element in iterable:
671 k = key(element)
672 if k not in seen:
673 seen_add(k)
674 yield element
675
676 def unique_justseen(iterable, key=None):
677 "List unique elements, preserving order. Remember only the element just seen."
678 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
679 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
680 return map(next, map(itemgetter(1), groupby(iterable, key)))