blob: ad5a23bec55c92e2b281cdb9d030cb21f961bb62 [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Georg Brandl9afde1c2007-11-01 20:32:30 +000016This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl116aa622007-08-15 14:28:22 +000017constructs from the Haskell and SML programming languages. Each has been recast
18in a form suitable for Python.
19
20The module standardizes a core set of fast, memory efficient tools that are
21useful by themselves or in combination. Standardization helps avoid the
22readability and reliability problems which arise when many different individuals
23create their own slightly varying implementations, each with their own quirks
24and naming conventions.
25
26The tools are designed to combine readily with one another. This makes it easy
27to construct more specialized tools succinctly and efficiently in pure Python.
28
29For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000030sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
31by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
33Likewise, the functional tools are designed to work well with the high-speed
34functions provided by the :mod:`operator` module.
35
Georg Brandl116aa622007-08-15 14:28:22 +000036Whether cast in pure python form or compiled code, tools that use iterators are
Georg Brandl3dbca812008-07-23 16:10:53 +000037more memory efficient (and often faster) than their list based counterparts. Adopting
Georg Brandl116aa622007-08-15 14:28:22 +000038the principles of just-in-time manufacturing, they create data when and where
39needed instead of consuming memory with the computer equivalent of "inventory".
40
Georg Brandl116aa622007-08-15 14:28:22 +000041
42.. seealso::
43
44 The Standard ML Basis Library, `The Standard ML Basis Library
45 <http://www.standardml.org/Basis/>`_.
46
47 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
48 Libraries <http://www.haskell.org/definition/>`_.
49
50
51.. _itertools-functions:
52
53Itertool functions
54------------------
55
56The following module functions all construct and return iterators. Some provide
57streams of infinite length, so they should only be accessed by functions or
58loops that truncate the stream.
59
60
61.. function:: chain(*iterables)
62
63 Make an iterator that returns elements from the first iterable until it is
64 exhausted, then proceeds to the next iterable, until all of the iterables are
65 exhausted. Used for treating consecutive sequences as a single sequence.
66 Equivalent to::
67
68 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000069 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000070 for it in iterables:
71 for element in it:
72 yield element
73
74
Christian Heimes70e7ea22008-02-28 20:02:27 +000075.. function:: itertools.chain.from_iterable(iterable)
76
Georg Brandl48310cd2009-01-03 21:18:54 +000077 Alternate constructor for :func:`chain`. Gets chained inputs from a
Christian Heimes70e7ea22008-02-28 20:02:27 +000078 single iterable argument that is evaluated lazily. Equivalent to::
79
80 @classmethod
81 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000082 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +000083 for it in iterables:
84 for element in it:
85 yield element
86
Christian Heimes78644762008-03-04 23:39:23 +000087
Christian Heimes836baa52008-02-26 08:18:30 +000088.. function:: combinations(iterable, r)
89
Christian Heimesdae2a892008-04-19 00:55:37 +000090 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +000091
Georg Brandl48310cd2009-01-03 21:18:54 +000092 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +000093 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +000094 in sorted order.
Christian Heimes836baa52008-02-26 08:18:30 +000095
96 Elements are treated as unique based on their position, not on their
97 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +000098 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +000099
Christian Heimes836baa52008-02-26 08:18:30 +0000100 Equivalent to::
101
102 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000103 # combinations('ABCD', 2) --> AB AC AD BC BD CD
104 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000105 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000106 n = len(pool)
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000107 if r > n:
108 return
109 indices = list(range(r))
Christian Heimesb558a2e2008-03-02 22:46:37 +0000110 yield tuple(pool[i] for i in indices)
Christian Heimes380f7f22008-02-28 11:19:05 +0000111 while 1:
112 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000113 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000114 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000115 else:
116 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000117 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000118 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000119 indices[j] = indices[j-1] + 1
120 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000121
Christian Heimes78644762008-03-04 23:39:23 +0000122 The code for :func:`combinations` can be also expressed as a subsequence
123 of :func:`permutations` after filtering entries where the elements are not
124 in sorted order (according to their position in the input pool)::
125
126 def combinations(iterable, r):
127 pool = tuple(iterable)
128 n = len(pool)
129 for indices in permutations(range(n), r):
130 if sorted(indices) == list(indices):
131 yield tuple(pool[i] for i in indices)
132
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000133 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
134 or zero when ``r > n``.
Christian Heimes836baa52008-02-26 08:18:30 +0000135
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000136.. function:: combinations_with_replacement(iterable, r)
137
138 Return *r* length subsequences of elements from the input *iterable*
139 allowing individual elements to be repeated more than once.
140
141 Combinations are emitted in lexicographic sort order. So, if the
142 input *iterable* is sorted, the combination tuples will be produced
143 in sorted order.
144
145 Elements are treated as unique based on their position, not on their
146 value. So if the input elements are unique, the generated combinations
147 will also be unique.
148
149 Equivalent to::
150
151 def combinations_with_replacement(iterable, r):
152 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
153 pool = tuple(iterable)
154 n = len(pool)
155 if not n and r:
156 return
157 indices = [0] * r
158 yield tuple(pool[i] for i in indices)
159 while 1:
160 for i in reversed(range(r)):
161 if indices[i] != n - 1:
162 break
163 else:
164 return
165 indices[i:] = [indices[i] + 1] * (r - i)
166 yield tuple(pool[i] for i in indices)
167
168 The code for :func:`combinations_with_replacement` can be also expressed as
169 a subsequence of :func:`product` after filtering entries where the elements
170 are not in sorted order (according to their position in the input pool)::
171
172 def combinations_with_replacement(iterable, r):
173 pool = tuple(iterable)
174 n = len(pool)
175 for indices in product(range(n), repeat=r):
176 if sorted(indices) == list(indices):
177 yield tuple(pool[i] for i in indices)
178
179 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
180
Raymond Hettinger30729212009-02-12 06:28:27 +0000181 .. versionadded:: 3.1
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000182
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000183.. function:: compress(data, selectors)
184
185 Make an iterator that filters elements from *data* returning only those that
186 have a corresponding element in *selectors* that evaluates to ``True``.
187 Stops when either the *data* or *selectors* iterables have been exhausted.
188 Equivalent to::
189
190 def compress(data, selectors):
191 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
192 return (d for d, s in zip(data, selectors) if s)
193
Raymond Hettinger30729212009-02-12 06:28:27 +0000194 .. versionadded:: 3.1
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000195
196
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000197.. function:: count(start=0, step=1)
Georg Brandl116aa622007-08-15 14:28:22 +0000198
Raymond Hettinger30729212009-02-12 06:28:27 +0000199 Make an iterator that returns evenly spaced values starting with *n*. Often
200 used as an argument to :func:`map` to generate consecutive data points.
201 Also, used with :func:`zip` to add sequence numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000202
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000203 def count(start=0, step=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000204 # count(10) --> 10 11 12 13 14 ...
Raymond Hettinger30729212009-02-12 06:28:27 +0000205 # count(2.5, 0.5) -> 3.5 3.0 4.5 ...
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000206 n = start
Georg Brandl116aa622007-08-15 14:28:22 +0000207 while True:
208 yield n
Raymond Hettinger30729212009-02-12 06:28:27 +0000209 n += step
Georg Brandl116aa622007-08-15 14:28:22 +0000210
Raymond Hettinger30729212009-02-12 06:28:27 +0000211 .. versionchanged:: 3.1
212 added *step* argument and allowed non-integer arguments.
Georg Brandl116aa622007-08-15 14:28:22 +0000213
214.. function:: cycle(iterable)
215
216 Make an iterator returning elements from the iterable and saving a copy of each.
217 When the iterable is exhausted, return elements from the saved copy. Repeats
218 indefinitely. Equivalent to::
219
220 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000221 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000222 saved = []
223 for element in iterable:
224 yield element
225 saved.append(element)
226 while saved:
227 for element in saved:
228 yield element
229
230 Note, this member of the toolkit may require significant auxiliary storage
231 (depending on the length of the iterable).
232
233
234.. function:: dropwhile(predicate, iterable)
235
236 Make an iterator that drops elements from the iterable as long as the predicate
237 is true; afterwards, returns every element. Note, the iterator does not produce
238 *any* output until the predicate first becomes false, so it may have a lengthy
239 start-up time. Equivalent to::
240
241 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000242 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000243 iterable = iter(iterable)
244 for x in iterable:
245 if not predicate(x):
246 yield x
247 break
248 for x in iterable:
249 yield x
250
Raymond Hettinger749761e2009-01-27 04:42:48 +0000251.. function:: filterfalse(predicate, iterable)
252
253 Make an iterator that filters elements from iterable returning only those for
254 which the predicate is ``False``. If *predicate* is ``None``, return the items
255 that are false. Equivalent to::
256
257 def filterfalse(predicate, iterable):
258 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
259 if predicate is None:
260 predicate = bool
261 for x in iterable:
262 if not predicate(x):
263 yield x
264
Georg Brandl116aa622007-08-15 14:28:22 +0000265
266.. function:: groupby(iterable[, key])
267
268 Make an iterator that returns consecutive keys and groups from the *iterable*.
269 The *key* is a function computing a key value for each element. If not
270 specified or is ``None``, *key* defaults to an identity function and returns
271 the element unchanged. Generally, the iterable needs to already be sorted on
272 the same key function.
273
274 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
275 generates a break or new group every time the value of the key function changes
276 (which is why it is usually necessary to have sorted the data using the same key
277 function). That behavior differs from SQL's GROUP BY which aggregates common
278 elements regardless of their input order.
279
280 The returned group is itself an iterator that shares the underlying iterable
281 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
282 object is advanced, the previous group is no longer visible. So, if that data
283 is needed later, it should be stored as a list::
284
285 groups = []
286 uniquekeys = []
287 data = sorted(data, key=keyfunc)
288 for k, g in groupby(data, keyfunc):
289 groups.append(list(g)) # Store group iterator as a list
290 uniquekeys.append(k)
291
292 :func:`groupby` is equivalent to::
293
294 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000295 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettingerd04fa312009-02-04 19:45:13 +0000296 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000297 def __init__(self, iterable, key=None):
298 if key is None:
299 key = lambda x: x
300 self.keyfunc = key
301 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000302 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000303 def __iter__(self):
304 return self
305 def __next__(self):
306 while self.currkey == self.tgtkey:
307 self.currvalue = next(self.it) # Exit on StopIteration
308 self.currkey = self.keyfunc(self.currvalue)
309 self.tgtkey = self.currkey
310 return (self.currkey, self._grouper(self.tgtkey))
311 def _grouper(self, tgtkey):
312 while self.currkey == tgtkey:
313 yield self.currvalue
314 self.currvalue = next(self.it) # Exit on StopIteration
315 self.currkey = self.keyfunc(self.currvalue)
316
Georg Brandl116aa622007-08-15 14:28:22 +0000317
Georg Brandl116aa622007-08-15 14:28:22 +0000318.. function:: islice(iterable, [start,] stop [, step])
319
320 Make an iterator that returns selected elements from the iterable. If *start* is
321 non-zero, then elements from the iterable are skipped until start is reached.
322 Afterward, elements are returned consecutively unless *step* is set higher than
323 one which results in items being skipped. If *stop* is ``None``, then iteration
324 continues until the iterator is exhausted, if at all; otherwise, it stops at the
325 specified position. Unlike regular slicing, :func:`islice` does not support
326 negative values for *start*, *stop*, or *step*. Can be used to extract related
327 fields from data where the internal structure has been flattened (for example, a
328 multi-line report may list a name field on every third line). Equivalent to::
329
330 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000331 # islice('ABCDEFG', 2) --> A B
332 # islice('ABCDEFG', 2, 4) --> C D
333 # islice('ABCDEFG', 2, None) --> C D E F G
334 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000335 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000336 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000337 nexti = next(it)
338 for i, element in enumerate(iterable):
339 if i == nexti:
340 yield element
341 nexti = next(it)
342
343 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
344 then the step defaults to one.
345
Georg Brandl116aa622007-08-15 14:28:22 +0000346
Christian Heimes70e7ea22008-02-28 20:02:27 +0000347.. function:: permutations(iterable[, r])
348
349 Return successive *r* length permutations of elements in the *iterable*.
350
351 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl48310cd2009-01-03 21:18:54 +0000352 of the *iterable* and all possible full-length permutations
Christian Heimes70e7ea22008-02-28 20:02:27 +0000353 are generated.
354
Georg Brandl48310cd2009-01-03 21:18:54 +0000355 Permutations are emitted in lexicographic sort order. So, if the
Christian Heimes70e7ea22008-02-28 20:02:27 +0000356 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000357 in sorted order.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000358
359 Elements are treated as unique based on their position, not on their
360 value. So if the input elements are unique, there will be no repeat
361 values in each permutation.
362
Christian Heimesb558a2e2008-03-02 22:46:37 +0000363 Equivalent to::
364
365 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000366 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
367 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000368 pool = tuple(iterable)
369 n = len(pool)
370 r = n if r is None else r
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000371 if r > n:
372 return
373 indices = list(range(n))
Christian Heimesfe337bf2008-03-23 21:54:12 +0000374 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000375 yield tuple(pool[i] for i in indices[:r])
376 while n:
377 for i in reversed(range(r)):
378 cycles[i] -= 1
379 if cycles[i] == 0:
380 indices[i:] = indices[i+1:] + indices[i:i+1]
381 cycles[i] = n - i
382 else:
383 j = cycles[i]
384 indices[i], indices[-j] = indices[-j], indices[i]
385 yield tuple(pool[i] for i in indices[:r])
386 break
387 else:
388 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000389
Georg Brandl48310cd2009-01-03 21:18:54 +0000390 The code for :func:`permutations` can be also expressed as a subsequence of
Christian Heimes78644762008-03-04 23:39:23 +0000391 :func:`product`, filtered to exclude entries with repeated elements (those
392 from the same position in the input pool)::
393
394 def permutations(iterable, r=None):
395 pool = tuple(iterable)
396 n = len(pool)
397 r = n if r is None else r
398 for indices in product(range(n), repeat=r):
399 if len(set(indices)) == r:
400 yield tuple(pool[i] for i in indices)
401
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000402 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
403 or zero when ``r > n``.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000404
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000405.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000406
407 Cartesian product of input iterables.
408
409 Equivalent to nested for-loops in a generator expression. For example,
410 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
411
Christian Heimesdae2a892008-04-19 00:55:37 +0000412 The nested loops cycle like an odometer with the rightmost element advancing
413 on every iteration. This pattern creates a lexicographic ordering so that if
414 the input's iterables are sorted, the product tuples are emitted in sorted
415 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000416
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000417 To compute the product of an iterable with itself, specify the number of
418 repetitions with the optional *repeat* keyword argument. For example,
419 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
420
Christian Heimes78644762008-03-04 23:39:23 +0000421 This function is equivalent to the following code, except that the
422 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000423
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000424 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000425 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
426 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000427 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000428 result = [[]]
429 for pool in pools:
430 result = [x+[y] for x in result for y in pool]
431 for prod in result:
432 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000433
434
Georg Brandl116aa622007-08-15 14:28:22 +0000435.. function:: repeat(object[, times])
436
437 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000438 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000439 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000440 create an invariant part of a tuple record. Equivalent to::
441
442 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000443 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000444 if times is None:
445 while True:
446 yield object
447 else:
448 for i in range(times):
449 yield object
450
451
452.. function:: starmap(function, iterable)
453
Christian Heimes679db4a2008-01-18 09:56:22 +0000454 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000455 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000456 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000457 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000458 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
459
460 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000461 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000462 for args in iterable:
463 yield function(*args)
464
Georg Brandl116aa622007-08-15 14:28:22 +0000465
466.. function:: takewhile(predicate, iterable)
467
468 Make an iterator that returns elements from the iterable as long as the
469 predicate is true. Equivalent to::
470
471 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000472 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000473 for x in iterable:
474 if predicate(x):
475 yield x
476 else:
477 break
478
479
480.. function:: tee(iterable[, n=2])
481
482 Return *n* independent iterators from a single iterable. The case where ``n==2``
483 is equivalent to::
484
485 def tee(iterable):
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000486 def gen(next, data={}):
Georg Brandl116aa622007-08-15 14:28:22 +0000487 for i in count():
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000488 if i in data:
489 yield data.pop(i)
Georg Brandl116aa622007-08-15 14:28:22 +0000490 else:
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000491 data[i] = next()
492 yield data[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000493 it = iter(iterable)
494 return (gen(it.__next__), gen(it.__next__))
495
496 Note, once :func:`tee` has made a split, the original *iterable* should not be
497 used anywhere else; otherwise, the *iterable* could get advanced without the tee
498 objects being informed.
499
500 Note, this member of the toolkit may require significant auxiliary storage
501 (depending on how much temporary data needs to be stored). In general, if one
502 iterator is going to use most or all of the data before the other iterator, it
503 is faster to use :func:`list` instead of :func:`tee`.
504
Georg Brandl116aa622007-08-15 14:28:22 +0000505
Raymond Hettinger749761e2009-01-27 04:42:48 +0000506.. function:: zip_longest(*iterables[, fillvalue])
507
508 Make an iterator that aggregates elements from each of the iterables. If the
509 iterables are of uneven length, missing values are filled-in with *fillvalue*.
510 Iteration continues until the longest iterable is exhausted. Equivalent to::
511
512 def zip_longest(*args, fillvalue=None):
513 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
514 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
515 yield counter() # yields the fillvalue, or raises IndexError
516 fillers = repeat(fillvalue)
517 iters = [chain(it, sentinel(), fillers) for it in args]
518 try:
519 for tup in zip(*iters):
520 yield tup
521 except IndexError:
522 pass
523
524 If one of the iterables is potentially infinite, then the :func:`zip_longest`
525 function should be wrapped with something that limits the number of calls
526 (for example :func:`islice` or :func:`takewhile`). If not specified,
527 *fillvalue* defaults to ``None``.
528
529
Georg Brandl116aa622007-08-15 14:28:22 +0000530.. _itertools-example:
531
532Examples
533--------
534
535The following examples show common uses for each tool and demonstrate ways they
Christian Heimesfe337bf2008-03-23 21:54:12 +0000536can be combined.
537
538.. doctest::
Georg Brandl116aa622007-08-15 14:28:22 +0000539
Georg Brandlb1441c72009-01-03 22:33:39 +0000540 >>> # Show a dictionary sorted and grouped by value
Georg Brandl116aa622007-08-15 14:28:22 +0000541 >>> from operator import itemgetter
542 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000543 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000544 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000545 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000546 ...
547 1 ['a', 'c', 'e']
548 2 ['b', 'd', 'f']
549 3 ['g']
550
Georg Brandlb1441c72009-01-03 22:33:39 +0000551 >>> # Find runs of consecutive numbers using groupby. The key to the solution
552 >>> # is differencing with a range so that consecutive numbers all appear in
553 >>> # same group.
Georg Brandl116aa622007-08-15 14:28:22 +0000554 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
555 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000556 ... print(map(operator.itemgetter(1), g))
Georg Brandl48310cd2009-01-03 21:18:54 +0000557 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000558 [1]
559 [4, 5, 6]
560 [10]
561 [15, 16, 17, 18]
562 [22]
563 [25, 26, 27, 28]
564
565
566
567.. _itertools-recipes:
568
569Recipes
570-------
571
572This section shows recipes for creating an extended toolset using the existing
573itertools as building blocks.
574
575The extended tools offer the same high performance as the underlying toolset.
576The superior memory performance is kept by processing elements one at a time
577rather than bringing the whole iterable into memory all at once. Code volume is
578kept small by linking the tools together in a functional style which helps
579eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000580"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000581which incur interpreter overhead.
582
583.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000584
Georg Brandl3dbca812008-07-23 16:10:53 +0000585 def take(n, iterable):
586 "Return first n items of the iterable as a list"
587 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000588
Georg Brandl3dbca812008-07-23 16:10:53 +0000589 def enumerate(iterable, start=0):
590 return zip(count(start), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000591
Georg Brandl3dbca812008-07-23 16:10:53 +0000592 def tabulate(function, start=0):
Georg Brandl116aa622007-08-15 14:28:22 +0000593 "Return function(0), function(1), ..."
Georg Brandl3dbca812008-07-23 16:10:53 +0000594 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000595
Georg Brandl116aa622007-08-15 14:28:22 +0000596 def nth(iterable, n):
Raymond Hettingerd04fa312009-02-04 19:45:13 +0000597 "Returns the nth item or None"
598 return next(islice(iterable, n, None), None)
Georg Brandl116aa622007-08-15 14:28:22 +0000599
Georg Brandl3dbca812008-07-23 16:10:53 +0000600 def quantify(iterable, pred=bool):
601 "Count how many times the predicate is true"
602 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000603
Georg Brandl3dbca812008-07-23 16:10:53 +0000604 def padnone(iterable):
Georg Brandl116aa622007-08-15 14:28:22 +0000605 """Returns the sequence elements and then returns None indefinitely.
606
607 Useful for emulating the behavior of the built-in map() function.
608 """
Georg Brandl3dbca812008-07-23 16:10:53 +0000609 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000610
Georg Brandl3dbca812008-07-23 16:10:53 +0000611 def ncycles(iterable, n):
Georg Brandl116aa622007-08-15 14:28:22 +0000612 "Returns the sequence elements n times"
Georg Brandl3dbca812008-07-23 16:10:53 +0000613 return chain.from_iterable(repeat(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000614
615 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000616 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000617
618 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000619 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000620
621 def repeatfunc(func, times=None, *args):
622 """Repeat calls to func with specified arguments.
623
624 Example: repeatfunc(random.random)
625 """
626 if times is None:
627 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000628 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000629
630 def pairwise(iterable):
631 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
632 a, b = tee(iterable)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000633 for elem in b:
634 break
635 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000636
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000637 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000638 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000639 args = [iter(iterable)] * n
Raymond Hettinger883d2762009-01-27 04:57:51 +0000640 return zip_longest(*args, fillvalue=fillvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000641
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000642 def roundrobin(*iterables):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000643 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000644 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000645 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000646 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000647 while pending:
648 try:
649 for next in nexts:
650 yield next()
651 except StopIteration:
652 pending -= 1
653 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000654
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000655 def powerset(iterable):
Raymond Hettingerace67332009-01-26 02:23:50 +0000656 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
657 s = list(iterable)
658 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000659
Benjamin Peterson063ff652009-02-06 03:01:24 +0000660 def unique_everseen(iterable, key=None):
661 "List unique elements, preserving order. Remember all elements ever seen."
662 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
663 # unique_everseen('ABBCcAD', str.lower) --> A B C D
664 seen = set()
665 seen_add = seen.add
666 if key is None:
667 for element in iterable:
668 if element not in seen:
669 seen_add(element)
670 yield element
671 else:
672 for element in iterable:
673 k = key(element)
674 if k not in seen:
675 seen_add(k)
676 yield element
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000677
Benjamin Peterson063ff652009-02-06 03:01:24 +0000678 def unique_justseen(iterable, key=None):
679 "List unique elements, preserving order. Remember only the element just seen."
680 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
681 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
682 return map(next, imap(itemgetter(1), groupby(iterable, key)))