blob: fda3beb67066806cd642006b66fa0aed90c8eb1a [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Georg Brandl9afde1c2007-11-01 20:32:30 +000016This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl116aa622007-08-15 14:28:22 +000017constructs from the Haskell and SML programming languages. Each has been recast
18in a form suitable for Python.
19
20The module standardizes a core set of fast, memory efficient tools that are
21useful by themselves or in combination. Standardization helps avoid the
22readability and reliability problems which arise when many different individuals
23create their own slightly varying implementations, each with their own quirks
24and naming conventions.
25
26The tools are designed to combine readily with one another. This makes it easy
27to construct more specialized tools succinctly and efficiently in pure Python.
28
29For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000030sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
31by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
33Likewise, the functional tools are designed to work well with the high-speed
34functions provided by the :mod:`operator` module.
35
Georg Brandl116aa622007-08-15 14:28:22 +000036Whether cast in pure python form or compiled code, tools that use iterators are
Georg Brandl3dbca812008-07-23 16:10:53 +000037more memory efficient (and often faster) than their list based counterparts. Adopting
Georg Brandl116aa622007-08-15 14:28:22 +000038the principles of just-in-time manufacturing, they create data when and where
39needed instead of consuming memory with the computer equivalent of "inventory".
40
Georg Brandl116aa622007-08-15 14:28:22 +000041
42.. seealso::
43
44 The Standard ML Basis Library, `The Standard ML Basis Library
45 <http://www.standardml.org/Basis/>`_.
46
47 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
48 Libraries <http://www.haskell.org/definition/>`_.
49
50
51.. _itertools-functions:
52
53Itertool functions
54------------------
55
56The following module functions all construct and return iterators. Some provide
57streams of infinite length, so they should only be accessed by functions or
58loops that truncate the stream.
59
60
61.. function:: chain(*iterables)
62
63 Make an iterator that returns elements from the first iterable until it is
64 exhausted, then proceeds to the next iterable, until all of the iterables are
65 exhausted. Used for treating consecutive sequences as a single sequence.
66 Equivalent to::
67
68 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000069 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000070 for it in iterables:
71 for element in it:
72 yield element
73
74
Christian Heimes70e7ea22008-02-28 20:02:27 +000075.. function:: itertools.chain.from_iterable(iterable)
76
Georg Brandl48310cd2009-01-03 21:18:54 +000077 Alternate constructor for :func:`chain`. Gets chained inputs from a
Christian Heimes70e7ea22008-02-28 20:02:27 +000078 single iterable argument that is evaluated lazily. Equivalent to::
79
80 @classmethod
81 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000082 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +000083 for it in iterables:
84 for element in it:
85 yield element
86
Christian Heimes78644762008-03-04 23:39:23 +000087
Christian Heimes836baa52008-02-26 08:18:30 +000088.. function:: combinations(iterable, r)
89
Christian Heimesdae2a892008-04-19 00:55:37 +000090 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +000091
Georg Brandl48310cd2009-01-03 21:18:54 +000092 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +000093 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +000094 in sorted order.
Christian Heimes836baa52008-02-26 08:18:30 +000095
96 Elements are treated as unique based on their position, not on their
97 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +000098 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +000099
Christian Heimes836baa52008-02-26 08:18:30 +0000100 Equivalent to::
101
102 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000103 # combinations('ABCD', 2) --> AB AC AD BC BD CD
104 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000105 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000106 n = len(pool)
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000107 if r > n:
108 return
109 indices = list(range(r))
Christian Heimesb558a2e2008-03-02 22:46:37 +0000110 yield tuple(pool[i] for i in indices)
Christian Heimes380f7f22008-02-28 11:19:05 +0000111 while 1:
112 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000113 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000114 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000115 else:
116 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000117 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000118 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000119 indices[j] = indices[j-1] + 1
120 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000121
Christian Heimes78644762008-03-04 23:39:23 +0000122 The code for :func:`combinations` can be also expressed as a subsequence
123 of :func:`permutations` after filtering entries where the elements are not
124 in sorted order (according to their position in the input pool)::
125
126 def combinations(iterable, r):
127 pool = tuple(iterable)
128 n = len(pool)
129 for indices in permutations(range(n), r):
130 if sorted(indices) == list(indices):
131 yield tuple(pool[i] for i in indices)
132
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000133 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
134 or zero when ``r > n``.
Christian Heimes836baa52008-02-26 08:18:30 +0000135
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000136.. function:: combinations_with_replacement(iterable, r)
137
138 Return *r* length subsequences of elements from the input *iterable*
139 allowing individual elements to be repeated more than once.
140
141 Combinations are emitted in lexicographic sort order. So, if the
142 input *iterable* is sorted, the combination tuples will be produced
143 in sorted order.
144
145 Elements are treated as unique based on their position, not on their
146 value. So if the input elements are unique, the generated combinations
147 will also be unique.
148
149 Equivalent to::
150
151 def combinations_with_replacement(iterable, r):
152 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
153 pool = tuple(iterable)
154 n = len(pool)
155 if not n and r:
156 return
157 indices = [0] * r
158 yield tuple(pool[i] for i in indices)
159 while 1:
160 for i in reversed(range(r)):
161 if indices[i] != n - 1:
162 break
163 else:
164 return
165 indices[i:] = [indices[i] + 1] * (r - i)
166 yield tuple(pool[i] for i in indices)
167
168 The code for :func:`combinations_with_replacement` can be also expressed as
169 a subsequence of :func:`product` after filtering entries where the elements
170 are not in sorted order (according to their position in the input pool)::
171
172 def combinations_with_replacement(iterable, r):
173 pool = tuple(iterable)
174 n = len(pool)
175 for indices in product(range(n), repeat=r):
176 if sorted(indices) == list(indices):
177 yield tuple(pool[i] for i in indices)
178
179 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
180
Raymond Hettinger30729212009-02-12 06:28:27 +0000181 .. versionadded:: 3.1
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000182
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000183.. function:: compress(data, selectors)
184
185 Make an iterator that filters elements from *data* returning only those that
186 have a corresponding element in *selectors* that evaluates to ``True``.
187 Stops when either the *data* or *selectors* iterables have been exhausted.
188 Equivalent to::
189
190 def compress(data, selectors):
191 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
192 return (d for d, s in zip(data, selectors) if s)
193
Raymond Hettinger30729212009-02-12 06:28:27 +0000194 .. versionadded:: 3.1
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000195
196
Raymond Hettinger30729212009-02-12 06:28:27 +0000197.. function:: count(n=0, step=1)
Georg Brandl116aa622007-08-15 14:28:22 +0000198
Raymond Hettinger30729212009-02-12 06:28:27 +0000199 Make an iterator that returns evenly spaced values starting with *n*. Often
200 used as an argument to :func:`map` to generate consecutive data points.
201 Also, used with :func:`zip` to add sequence numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000202
Raymond Hettinger30729212009-02-12 06:28:27 +0000203 def count(n=0, step=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000204 # count(10) --> 10 11 12 13 14 ...
Raymond Hettinger30729212009-02-12 06:28:27 +0000205 # count(2.5, 0.5) -> 3.5 3.0 4.5 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000206 while True:
207 yield n
Raymond Hettinger30729212009-02-12 06:28:27 +0000208 n += step
Georg Brandl116aa622007-08-15 14:28:22 +0000209
Raymond Hettinger30729212009-02-12 06:28:27 +0000210 .. versionchanged:: 3.1
211 added *step* argument and allowed non-integer arguments.
Georg Brandl116aa622007-08-15 14:28:22 +0000212
213.. function:: cycle(iterable)
214
215 Make an iterator returning elements from the iterable and saving a copy of each.
216 When the iterable is exhausted, return elements from the saved copy. Repeats
217 indefinitely. Equivalent to::
218
219 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000220 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000221 saved = []
222 for element in iterable:
223 yield element
224 saved.append(element)
225 while saved:
226 for element in saved:
227 yield element
228
229 Note, this member of the toolkit may require significant auxiliary storage
230 (depending on the length of the iterable).
231
232
233.. function:: dropwhile(predicate, iterable)
234
235 Make an iterator that drops elements from the iterable as long as the predicate
236 is true; afterwards, returns every element. Note, the iterator does not produce
237 *any* output until the predicate first becomes false, so it may have a lengthy
238 start-up time. Equivalent to::
239
240 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000241 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000242 iterable = iter(iterable)
243 for x in iterable:
244 if not predicate(x):
245 yield x
246 break
247 for x in iterable:
248 yield x
249
Raymond Hettinger749761e2009-01-27 04:42:48 +0000250.. function:: filterfalse(predicate, iterable)
251
252 Make an iterator that filters elements from iterable returning only those for
253 which the predicate is ``False``. If *predicate* is ``None``, return the items
254 that are false. Equivalent to::
255
256 def filterfalse(predicate, iterable):
257 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
258 if predicate is None:
259 predicate = bool
260 for x in iterable:
261 if not predicate(x):
262 yield x
263
Georg Brandl116aa622007-08-15 14:28:22 +0000264
265.. function:: groupby(iterable[, key])
266
267 Make an iterator that returns consecutive keys and groups from the *iterable*.
268 The *key* is a function computing a key value for each element. If not
269 specified or is ``None``, *key* defaults to an identity function and returns
270 the element unchanged. Generally, the iterable needs to already be sorted on
271 the same key function.
272
273 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
274 generates a break or new group every time the value of the key function changes
275 (which is why it is usually necessary to have sorted the data using the same key
276 function). That behavior differs from SQL's GROUP BY which aggregates common
277 elements regardless of their input order.
278
279 The returned group is itself an iterator that shares the underlying iterable
280 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
281 object is advanced, the previous group is no longer visible. So, if that data
282 is needed later, it should be stored as a list::
283
284 groups = []
285 uniquekeys = []
286 data = sorted(data, key=keyfunc)
287 for k, g in groupby(data, keyfunc):
288 groups.append(list(g)) # Store group iterator as a list
289 uniquekeys.append(k)
290
291 :func:`groupby` is equivalent to::
292
293 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000294 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettingerd04fa312009-02-04 19:45:13 +0000295 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000296 def __init__(self, iterable, key=None):
297 if key is None:
298 key = lambda x: x
299 self.keyfunc = key
300 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000301 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000302 def __iter__(self):
303 return self
304 def __next__(self):
305 while self.currkey == self.tgtkey:
306 self.currvalue = next(self.it) # Exit on StopIteration
307 self.currkey = self.keyfunc(self.currvalue)
308 self.tgtkey = self.currkey
309 return (self.currkey, self._grouper(self.tgtkey))
310 def _grouper(self, tgtkey):
311 while self.currkey == tgtkey:
312 yield self.currvalue
313 self.currvalue = next(self.it) # Exit on StopIteration
314 self.currkey = self.keyfunc(self.currvalue)
315
Georg Brandl116aa622007-08-15 14:28:22 +0000316
Georg Brandl116aa622007-08-15 14:28:22 +0000317.. function:: islice(iterable, [start,] stop [, step])
318
319 Make an iterator that returns selected elements from the iterable. If *start* is
320 non-zero, then elements from the iterable are skipped until start is reached.
321 Afterward, elements are returned consecutively unless *step* is set higher than
322 one which results in items being skipped. If *stop* is ``None``, then iteration
323 continues until the iterator is exhausted, if at all; otherwise, it stops at the
324 specified position. Unlike regular slicing, :func:`islice` does not support
325 negative values for *start*, *stop*, or *step*. Can be used to extract related
326 fields from data where the internal structure has been flattened (for example, a
327 multi-line report may list a name field on every third line). Equivalent to::
328
329 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000330 # islice('ABCDEFG', 2) --> A B
331 # islice('ABCDEFG', 2, 4) --> C D
332 # islice('ABCDEFG', 2, None) --> C D E F G
333 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000334 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000335 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000336 nexti = next(it)
337 for i, element in enumerate(iterable):
338 if i == nexti:
339 yield element
340 nexti = next(it)
341
342 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
343 then the step defaults to one.
344
Georg Brandl116aa622007-08-15 14:28:22 +0000345
Christian Heimes70e7ea22008-02-28 20:02:27 +0000346.. function:: permutations(iterable[, r])
347
348 Return successive *r* length permutations of elements in the *iterable*.
349
350 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl48310cd2009-01-03 21:18:54 +0000351 of the *iterable* and all possible full-length permutations
Christian Heimes70e7ea22008-02-28 20:02:27 +0000352 are generated.
353
Georg Brandl48310cd2009-01-03 21:18:54 +0000354 Permutations are emitted in lexicographic sort order. So, if the
Christian Heimes70e7ea22008-02-28 20:02:27 +0000355 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000356 in sorted order.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000357
358 Elements are treated as unique based on their position, not on their
359 value. So if the input elements are unique, there will be no repeat
360 values in each permutation.
361
Christian Heimesb558a2e2008-03-02 22:46:37 +0000362 Equivalent to::
363
364 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000365 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
366 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000367 pool = tuple(iterable)
368 n = len(pool)
369 r = n if r is None else r
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000370 if r > n:
371 return
372 indices = list(range(n))
Christian Heimesfe337bf2008-03-23 21:54:12 +0000373 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000374 yield tuple(pool[i] for i in indices[:r])
375 while n:
376 for i in reversed(range(r)):
377 cycles[i] -= 1
378 if cycles[i] == 0:
379 indices[i:] = indices[i+1:] + indices[i:i+1]
380 cycles[i] = n - i
381 else:
382 j = cycles[i]
383 indices[i], indices[-j] = indices[-j], indices[i]
384 yield tuple(pool[i] for i in indices[:r])
385 break
386 else:
387 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000388
Georg Brandl48310cd2009-01-03 21:18:54 +0000389 The code for :func:`permutations` can be also expressed as a subsequence of
Christian Heimes78644762008-03-04 23:39:23 +0000390 :func:`product`, filtered to exclude entries with repeated elements (those
391 from the same position in the input pool)::
392
393 def permutations(iterable, r=None):
394 pool = tuple(iterable)
395 n = len(pool)
396 r = n if r is None else r
397 for indices in product(range(n), repeat=r):
398 if len(set(indices)) == r:
399 yield tuple(pool[i] for i in indices)
400
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000401 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
402 or zero when ``r > n``.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000403
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000404.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000405
406 Cartesian product of input iterables.
407
408 Equivalent to nested for-loops in a generator expression. For example,
409 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
410
Christian Heimesdae2a892008-04-19 00:55:37 +0000411 The nested loops cycle like an odometer with the rightmost element advancing
412 on every iteration. This pattern creates a lexicographic ordering so that if
413 the input's iterables are sorted, the product tuples are emitted in sorted
414 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000415
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000416 To compute the product of an iterable with itself, specify the number of
417 repetitions with the optional *repeat* keyword argument. For example,
418 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
419
Christian Heimes78644762008-03-04 23:39:23 +0000420 This function is equivalent to the following code, except that the
421 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000422
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000423 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000424 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
425 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000426 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000427 result = [[]]
428 for pool in pools:
429 result = [x+[y] for x in result for y in pool]
430 for prod in result:
431 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000432
433
Georg Brandl116aa622007-08-15 14:28:22 +0000434.. function:: repeat(object[, times])
435
436 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000437 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000438 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000439 create an invariant part of a tuple record. Equivalent to::
440
441 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000442 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000443 if times is None:
444 while True:
445 yield object
446 else:
447 for i in range(times):
448 yield object
449
450
451.. function:: starmap(function, iterable)
452
Christian Heimes679db4a2008-01-18 09:56:22 +0000453 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000454 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000455 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000456 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000457 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
458
459 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000460 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000461 for args in iterable:
462 yield function(*args)
463
Georg Brandl116aa622007-08-15 14:28:22 +0000464
465.. function:: takewhile(predicate, iterable)
466
467 Make an iterator that returns elements from the iterable as long as the
468 predicate is true. Equivalent to::
469
470 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000471 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000472 for x in iterable:
473 if predicate(x):
474 yield x
475 else:
476 break
477
478
479.. function:: tee(iterable[, n=2])
480
481 Return *n* independent iterators from a single iterable. The case where ``n==2``
482 is equivalent to::
483
484 def tee(iterable):
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000485 def gen(next, data={}):
Georg Brandl116aa622007-08-15 14:28:22 +0000486 for i in count():
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000487 if i in data:
488 yield data.pop(i)
Georg Brandl116aa622007-08-15 14:28:22 +0000489 else:
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000490 data[i] = next()
491 yield data[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000492 it = iter(iterable)
493 return (gen(it.__next__), gen(it.__next__))
494
495 Note, once :func:`tee` has made a split, the original *iterable* should not be
496 used anywhere else; otherwise, the *iterable* could get advanced without the tee
497 objects being informed.
498
499 Note, this member of the toolkit may require significant auxiliary storage
500 (depending on how much temporary data needs to be stored). In general, if one
501 iterator is going to use most or all of the data before the other iterator, it
502 is faster to use :func:`list` instead of :func:`tee`.
503
Georg Brandl116aa622007-08-15 14:28:22 +0000504
Raymond Hettinger749761e2009-01-27 04:42:48 +0000505.. function:: zip_longest(*iterables[, fillvalue])
506
507 Make an iterator that aggregates elements from each of the iterables. If the
508 iterables are of uneven length, missing values are filled-in with *fillvalue*.
509 Iteration continues until the longest iterable is exhausted. Equivalent to::
510
511 def zip_longest(*args, fillvalue=None):
512 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
513 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
514 yield counter() # yields the fillvalue, or raises IndexError
515 fillers = repeat(fillvalue)
516 iters = [chain(it, sentinel(), fillers) for it in args]
517 try:
518 for tup in zip(*iters):
519 yield tup
520 except IndexError:
521 pass
522
523 If one of the iterables is potentially infinite, then the :func:`zip_longest`
524 function should be wrapped with something that limits the number of calls
525 (for example :func:`islice` or :func:`takewhile`). If not specified,
526 *fillvalue* defaults to ``None``.
527
528
Georg Brandl116aa622007-08-15 14:28:22 +0000529.. _itertools-example:
530
531Examples
532--------
533
534The following examples show common uses for each tool and demonstrate ways they
Christian Heimesfe337bf2008-03-23 21:54:12 +0000535can be combined.
536
537.. doctest::
Georg Brandl116aa622007-08-15 14:28:22 +0000538
Georg Brandlb1441c72009-01-03 22:33:39 +0000539 >>> # Show a dictionary sorted and grouped by value
Georg Brandl116aa622007-08-15 14:28:22 +0000540 >>> from operator import itemgetter
541 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000542 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000543 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000544 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000545 ...
546 1 ['a', 'c', 'e']
547 2 ['b', 'd', 'f']
548 3 ['g']
549
Georg Brandlb1441c72009-01-03 22:33:39 +0000550 >>> # Find runs of consecutive numbers using groupby. The key to the solution
551 >>> # is differencing with a range so that consecutive numbers all appear in
552 >>> # same group.
Georg Brandl116aa622007-08-15 14:28:22 +0000553 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
554 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000555 ... print(map(operator.itemgetter(1), g))
Georg Brandl48310cd2009-01-03 21:18:54 +0000556 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000557 [1]
558 [4, 5, 6]
559 [10]
560 [15, 16, 17, 18]
561 [22]
562 [25, 26, 27, 28]
563
564
565
566.. _itertools-recipes:
567
568Recipes
569-------
570
571This section shows recipes for creating an extended toolset using the existing
572itertools as building blocks.
573
574The extended tools offer the same high performance as the underlying toolset.
575The superior memory performance is kept by processing elements one at a time
576rather than bringing the whole iterable into memory all at once. Code volume is
577kept small by linking the tools together in a functional style which helps
578eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000579"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000580which incur interpreter overhead.
581
582.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000583
Georg Brandl3dbca812008-07-23 16:10:53 +0000584 def take(n, iterable):
585 "Return first n items of the iterable as a list"
586 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000587
Georg Brandl3dbca812008-07-23 16:10:53 +0000588 def enumerate(iterable, start=0):
589 return zip(count(start), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000590
Georg Brandl3dbca812008-07-23 16:10:53 +0000591 def tabulate(function, start=0):
Georg Brandl116aa622007-08-15 14:28:22 +0000592 "Return function(0), function(1), ..."
Georg Brandl3dbca812008-07-23 16:10:53 +0000593 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000594
Georg Brandl116aa622007-08-15 14:28:22 +0000595 def nth(iterable, n):
Raymond Hettingerd04fa312009-02-04 19:45:13 +0000596 "Returns the nth item or None"
597 return next(islice(iterable, n, None), None)
Georg Brandl116aa622007-08-15 14:28:22 +0000598
Georg Brandl3dbca812008-07-23 16:10:53 +0000599 def quantify(iterable, pred=bool):
600 "Count how many times the predicate is true"
601 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000602
Georg Brandl3dbca812008-07-23 16:10:53 +0000603 def padnone(iterable):
Georg Brandl116aa622007-08-15 14:28:22 +0000604 """Returns the sequence elements and then returns None indefinitely.
605
606 Useful for emulating the behavior of the built-in map() function.
607 """
Georg Brandl3dbca812008-07-23 16:10:53 +0000608 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000609
Georg Brandl3dbca812008-07-23 16:10:53 +0000610 def ncycles(iterable, n):
Georg Brandl116aa622007-08-15 14:28:22 +0000611 "Returns the sequence elements n times"
Georg Brandl3dbca812008-07-23 16:10:53 +0000612 return chain.from_iterable(repeat(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000613
614 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000615 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000616
617 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000618 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000619
620 def repeatfunc(func, times=None, *args):
621 """Repeat calls to func with specified arguments.
622
623 Example: repeatfunc(random.random)
624 """
625 if times is None:
626 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000627 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000628
629 def pairwise(iterable):
630 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
631 a, b = tee(iterable)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000632 for elem in b:
633 break
634 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000635
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000636 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000637 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000638 args = [iter(iterable)] * n
Raymond Hettinger883d2762009-01-27 04:57:51 +0000639 return zip_longest(*args, fillvalue=fillvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000640
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000641 def roundrobin(*iterables):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000642 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000643 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000644 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000645 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000646 while pending:
647 try:
648 for next in nexts:
649 yield next()
650 except StopIteration:
651 pending -= 1
652 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000653
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000654 def powerset(iterable):
Raymond Hettingerace67332009-01-26 02:23:50 +0000655 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
656 s = list(iterable)
657 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000658
Benjamin Peterson063ff652009-02-06 03:01:24 +0000659 def unique_everseen(iterable, key=None):
660 "List unique elements, preserving order. Remember all elements ever seen."
661 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
662 # unique_everseen('ABBCcAD', str.lower) --> A B C D
663 seen = set()
664 seen_add = seen.add
665 if key is None:
666 for element in iterable:
667 if element not in seen:
668 seen_add(element)
669 yield element
670 else:
671 for element in iterable:
672 k = key(element)
673 if k not in seen:
674 seen_add(k)
675 yield element
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000676
Benjamin Peterson063ff652009-02-06 03:01:24 +0000677 def unique_justseen(iterable, key=None):
678 "List unique elements, preserving order. Remember only the element just seen."
679 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
680 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
681 return map(next, imap(itemgetter(1), groupby(iterable, key)))