blob: 9a3626f98cd4ffc68e371d09a366e422c74a2b79 [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Georg Brandle7a09902007-10-21 12:10:28 +000017This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl8ec7f652007-08-15 14:28:01 +000018constructs from the Haskell and SML programming languages. Each has been recast
19in a form suitable for Python.
20
21The module standardizes a core set of fast, memory efficient tools that are
22useful by themselves or in combination. Standardization helps avoid the
23readability and reliability problems which arise when many different individuals
24create their own slightly varying implementations, each with their own quirks
25and naming conventions.
26
27The tools are designed to combine readily with one another. This makes it easy
28to construct more specialized tools succinctly and efficiently in pure Python.
29
30For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
31sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
32:func:`count` which can be combined to form ``imap(f, count())`` and produce an
33equivalent result.
34
35Likewise, the functional tools are designed to work well with the high-speed
36functions provided by the :mod:`operator` module.
37
38The module author welcomes suggestions for other basic building blocks to be
39added to future versions of the module.
40
41Whether cast in pure python form or compiled code, tools that use iterators are
42more memory efficient (and faster) than their list based counterparts. Adopting
43the principles of just-in-time manufacturing, they create data when and where
44needed instead of consuming memory with the computer equivalent of "inventory".
45
46The performance advantage of iterators becomes more acute as the number of
47elements increases -- at some point, lists grow large enough to severely impact
48memory cache performance and start running slowly.
49
50
51.. seealso::
52
53 The Standard ML Basis Library, `The Standard ML Basis Library
54 <http://www.standardml.org/Basis/>`_.
55
56 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
57 Libraries <http://www.haskell.org/definition/>`_.
58
59
60.. _itertools-functions:
61
62Itertool functions
63------------------
64
65The following module functions all construct and return iterators. Some provide
66streams of infinite length, so they should only be accessed by functions or
67loops that truncate the stream.
68
69
70.. function:: chain(*iterables)
71
72 Make an iterator that returns elements from the first iterable until it is
73 exhausted, then proceeds to the next iterable, until all of the iterables are
74 exhausted. Used for treating consecutive sequences as a single sequence.
75 Equivalent to::
76
77 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000078 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000079 for it in iterables:
80 for element in it:
81 yield element
82
83
Raymond Hettinger330958e2008-02-28 19:41:24 +000084.. function:: itertools.chain.from_iterable(iterable)
85
86 Alternate constructor for :func:`chain`. Gets chained inputs from a
87 single iterable argument that is evaluated lazily. Equivalent to::
88
89 @classmethod
90 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000091 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +000092 for it in iterables:
93 for element in it:
94 yield element
95
96 .. versionadded:: 2.6
97
Raymond Hettingerd553d852008-03-04 04:17:08 +000098
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000099.. function:: combinations(iterable, r)
100
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000101 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000102
Raymond Hettinger330958e2008-02-28 19:41:24 +0000103 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000104 input *iterable* is sorted, the combination tuples will be produced
105 in sorted order.
106
107 Elements are treated as unique based on their position, not on their
108 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000109 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000110
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000111 Equivalent to::
112
113 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000114 # combinations('ABCD', 2) --> AB AC AD BC BD CD
115 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000116 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000117 n = len(pool)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000118 indices = range(r)
119 yield tuple(pool[i] for i in indices)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000120 while 1:
121 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000122 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000123 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000124 else:
125 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000126 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000127 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000128 indices[j] = indices[j-1] + 1
129 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000130
Raymond Hettingerd553d852008-03-04 04:17:08 +0000131 The code for :func:`combinations` can be also expressed as a subsequence
132 of :func:`permutations` after filtering entries where the elements are not
133 in sorted order (according to their position in the input pool)::
134
135 def combinations(iterable, r):
136 pool = tuple(iterable)
137 n = len(pool)
138 for indices in permutations(range(n), r):
139 if sorted(indices) == list(indices):
140 yield tuple(pool[i] for i in indices)
141
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000142 .. versionadded:: 2.6
143
Georg Brandl8ec7f652007-08-15 14:28:01 +0000144.. function:: count([n])
145
146 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettinger50e90e22007-10-04 00:20:27 +0000147 specified *n* defaults to zero. Often used as an argument to :func:`imap` to
148 generate consecutive data points. Also, used with :func:`izip` to add sequence
149 numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000150
151 def count(n=0):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000152 # count(10) --> 10 11 12 13 14 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000153 while True:
154 yield n
155 n += 1
156
Georg Brandl8ec7f652007-08-15 14:28:01 +0000157
158.. function:: cycle(iterable)
159
160 Make an iterator returning elements from the iterable and saving a copy of each.
161 When the iterable is exhausted, return elements from the saved copy. Repeats
162 indefinitely. Equivalent to::
163
164 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000165 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000166 saved = []
167 for element in iterable:
168 yield element
169 saved.append(element)
170 while saved:
171 for element in saved:
172 yield element
173
174 Note, this member of the toolkit may require significant auxiliary storage
175 (depending on the length of the iterable).
176
177
178.. function:: dropwhile(predicate, iterable)
179
180 Make an iterator that drops elements from the iterable as long as the predicate
181 is true; afterwards, returns every element. Note, the iterator does not produce
182 *any* output until the predicate first becomes false, so it may have a lengthy
183 start-up time. Equivalent to::
184
185 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000186 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000187 iterable = iter(iterable)
188 for x in iterable:
189 if not predicate(x):
190 yield x
191 break
192 for x in iterable:
193 yield x
194
195
196.. function:: groupby(iterable[, key])
197
198 Make an iterator that returns consecutive keys and groups from the *iterable*.
199 The *key* is a function computing a key value for each element. If not
200 specified or is ``None``, *key* defaults to an identity function and returns
201 the element unchanged. Generally, the iterable needs to already be sorted on
202 the same key function.
203
204 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
205 generates a break or new group every time the value of the key function changes
206 (which is why it is usually necessary to have sorted the data using the same key
207 function). That behavior differs from SQL's GROUP BY which aggregates common
208 elements regardless of their input order.
209
210 The returned group is itself an iterator that shares the underlying iterable
211 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
212 object is advanced, the previous group is no longer visible. So, if that data
213 is needed later, it should be stored as a list::
214
215 groups = []
216 uniquekeys = []
217 data = sorted(data, key=keyfunc)
218 for k, g in groupby(data, keyfunc):
219 groups.append(list(g)) # Store group iterator as a list
220 uniquekeys.append(k)
221
222 :func:`groupby` is equivalent to::
223
224 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000225 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
226 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000227 def __init__(self, iterable, key=None):
228 if key is None:
229 key = lambda x: x
230 self.keyfunc = key
231 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000232 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000233 def __iter__(self):
234 return self
235 def next(self):
236 while self.currkey == self.tgtkey:
237 self.currvalue = self.it.next() # Exit on StopIteration
238 self.currkey = self.keyfunc(self.currvalue)
239 self.tgtkey = self.currkey
240 return (self.currkey, self._grouper(self.tgtkey))
241 def _grouper(self, tgtkey):
242 while self.currkey == tgtkey:
243 yield self.currvalue
244 self.currvalue = self.it.next() # Exit on StopIteration
245 self.currkey = self.keyfunc(self.currvalue)
246
247 .. versionadded:: 2.4
248
249
250.. function:: ifilter(predicate, iterable)
251
252 Make an iterator that filters elements from iterable returning only those for
253 which the predicate is ``True``. If *predicate* is ``None``, return the items
254 that are true. Equivalent to::
255
256 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000257 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000258 if predicate is None:
259 predicate = bool
260 for x in iterable:
261 if predicate(x):
262 yield x
263
264
265.. function:: ifilterfalse(predicate, iterable)
266
267 Make an iterator that filters elements from iterable returning only those for
268 which the predicate is ``False``. If *predicate* is ``None``, return the items
269 that are false. Equivalent to::
270
271 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000272 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000273 if predicate is None:
274 predicate = bool
275 for x in iterable:
276 if not predicate(x):
277 yield x
278
279
280.. function:: imap(function, *iterables)
281
282 Make an iterator that computes the function using arguments from each of the
283 iterables. If *function* is set to ``None``, then :func:`imap` returns the
284 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
285 exhausted instead of filling in ``None`` for shorter iterables. The reason for
286 the difference is that infinite iterator arguments are typically an error for
287 :func:`map` (because the output is fully evaluated) but represent a common and
288 useful way of supplying arguments to :func:`imap`. Equivalent to::
289
290 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000291 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000292 iterables = map(iter, iterables)
293 while True:
Raymond Hettinger2dec48d2008-01-22 22:09:26 +0000294 args = [it.next() for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000295 if function is None:
296 yield tuple(args)
297 else:
298 yield function(*args)
299
300
301.. function:: islice(iterable, [start,] stop [, step])
302
303 Make an iterator that returns selected elements from the iterable. If *start* is
304 non-zero, then elements from the iterable are skipped until start is reached.
305 Afterward, elements are returned consecutively unless *step* is set higher than
306 one which results in items being skipped. If *stop* is ``None``, then iteration
307 continues until the iterator is exhausted, if at all; otherwise, it stops at the
308 specified position. Unlike regular slicing, :func:`islice` does not support
309 negative values for *start*, *stop*, or *step*. Can be used to extract related
310 fields from data where the internal structure has been flattened (for example, a
311 multi-line report may list a name field on every third line). Equivalent to::
312
313 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000314 # islice('ABCDEFG', 2) --> A B
315 # islice('ABCDEFG', 2, 4) --> C D
316 # islice('ABCDEFG', 2, None) --> C D E F G
317 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000318 s = slice(*args)
319 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
320 nexti = it.next()
321 for i, element in enumerate(iterable):
322 if i == nexti:
323 yield element
324 nexti = it.next()
325
326 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
327 then the step defaults to one.
328
329 .. versionchanged:: 2.5
330 accept ``None`` values for default *start* and *step*.
331
332
333.. function:: izip(*iterables)
334
335 Make an iterator that aggregates elements from each of the iterables. Like
336 :func:`zip` except that it returns an iterator instead of a list. Used for
337 lock-step iteration over several iterables at a time. Equivalent to::
338
339 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000340 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000341 iterables = map(iter, iterables)
342 while iterables:
343 result = [it.next() for it in iterables]
344 yield tuple(result)
345
346 .. versionchanged:: 2.4
347 When no iterables are specified, returns a zero length iterator instead of
348 raising a :exc:`TypeError` exception.
349
Raymond Hettinger48c62932008-01-22 19:51:41 +0000350 The left-to-right evaluation order of the iterables is guaranteed. This
351 makes possible an idiom for clustering a data series into n-length groups
352 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000353
Raymond Hettinger48c62932008-01-22 19:51:41 +0000354 :func:`izip` should only be used with unequal length inputs when you don't
355 care about trailing, unmatched values from the longer iterables. If those
356 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000357
358
359.. function:: izip_longest(*iterables[, fillvalue])
360
361 Make an iterator that aggregates elements from each of the iterables. If the
362 iterables are of uneven length, missing values are filled-in with *fillvalue*.
363 Iteration continues until the longest iterable is exhausted. Equivalent to::
364
365 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000366 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000367 fillvalue = kwds.get('fillvalue')
368 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
369 yield counter() # yields the fillvalue, or raises IndexError
370 fillers = repeat(fillvalue)
371 iters = [chain(it, sentinel(), fillers) for it in args]
372 try:
373 for tup in izip(*iters):
374 yield tup
375 except IndexError:
376 pass
377
378 If one of the iterables is potentially infinite, then the :func:`izip_longest`
379 function should be wrapped with something that limits the number of calls (for
380 example :func:`islice` or :func:`takewhile`).
381
382 .. versionadded:: 2.6
383
Raymond Hettinger330958e2008-02-28 19:41:24 +0000384.. function:: permutations(iterable[, r])
385
386 Return successive *r* length permutations of elements in the *iterable*.
387
388 If *r* is not specified or is ``None``, then *r* defaults to the length
389 of the *iterable* and all possible full-length permutations
390 are generated.
391
392 Permutations are emitted in lexicographic sort order. So, if the
393 input *iterable* is sorted, the permutation tuples will be produced
394 in sorted order.
395
396 Elements are treated as unique based on their position, not on their
397 value. So if the input elements are unique, there will be no repeat
398 values in each permutation.
399
Raymond Hettingerf287f172008-03-02 10:59:31 +0000400 Equivalent to::
401
402 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000403 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
404 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000405 pool = tuple(iterable)
406 n = len(pool)
407 r = n if r is None else r
408 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000409 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000410 yield tuple(pool[i] for i in indices[:r])
411 while n:
412 for i in reversed(range(r)):
413 cycles[i] -= 1
414 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000415 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000416 cycles[i] = n - i
417 else:
418 j = cycles[i]
419 indices[i], indices[-j] = indices[-j], indices[i]
420 yield tuple(pool[i] for i in indices[:r])
421 break
422 else:
423 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000424
Raymond Hettingerd553d852008-03-04 04:17:08 +0000425 The code for :func:`permutations` can be also expressed as a subsequence of
426 :func:`product`, filtered to exclude entries with repeated elements (those
427 from the same position in the input pool)::
428
429 def permutations(iterable, r=None):
430 pool = tuple(iterable)
431 n = len(pool)
432 r = n if r is None else r
433 for indices in product(range(n), repeat=r):
434 if len(set(indices)) == r:
435 yield tuple(pool[i] for i in indices)
436
Raymond Hettinger330958e2008-02-28 19:41:24 +0000437 .. versionadded:: 2.6
438
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000439.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000440
441 Cartesian product of input iterables.
442
443 Equivalent to nested for-loops in a generator expression. For example,
444 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
445
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000446 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000447 on every iteration. This pattern creates a lexicographic ordering so that if
448 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000449 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000450
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000451 To compute the product of an iterable with itself, specify the number of
452 repetitions with the optional *repeat* keyword argument. For example,
453 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
454
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000455 This function is equivalent to the following code, except that the
456 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000457
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000458 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000459 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
460 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000461 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000462 result = [[]]
463 for pool in pools:
464 result = [x+[y] for x in result for y in pool]
465 for prod in result:
466 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000467
468 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000469
470.. function:: repeat(object[, times])
471
472 Make an iterator that returns *object* over and over again. Runs indefinitely
473 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000474 invariant function parameters. Also used with :func:`izip` to create constant
475 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000476
477 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000478 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000479 if times is None:
480 while True:
481 yield object
482 else:
483 for i in xrange(times):
484 yield object
485
486
487.. function:: starmap(function, iterable)
488
Raymond Hettinger47317092008-01-17 03:02:14 +0000489 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000490 the iterable. Used instead of :func:`imap` when argument parameters are already
491 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
492 difference between :func:`imap` and :func:`starmap` parallels the distinction
493 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
494
495 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000496 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000497 for args in iterable:
498 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000499
Raymond Hettinger47317092008-01-17 03:02:14 +0000500 .. versionchanged:: 2.6
501 Previously, :func:`starmap` required the function arguments to be tuples.
502 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000503
504.. function:: takewhile(predicate, iterable)
505
506 Make an iterator that returns elements from the iterable as long as the
507 predicate is true. Equivalent to::
508
509 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000510 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000511 for x in iterable:
512 if predicate(x):
513 yield x
514 else:
515 break
516
517
518.. function:: tee(iterable[, n=2])
519
520 Return *n* independent iterators from a single iterable. The case where ``n==2``
521 is equivalent to::
522
523 def tee(iterable):
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000524 def gen(next, data={}):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000525 for i in count():
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000526 if i in data:
527 yield data.pop(i)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000528 else:
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000529 data[i] = next()
530 yield data[i]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000531 it = iter(iterable)
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000532 return gen(it.next), gen(it.next)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000533
534 Note, once :func:`tee` has made a split, the original *iterable* should not be
535 used anywhere else; otherwise, the *iterable* could get advanced without the tee
536 objects being informed.
537
538 Note, this member of the toolkit may require significant auxiliary storage
539 (depending on how much temporary data needs to be stored). In general, if one
540 iterator is going to use most or all of the data before the other iterator, it
541 is faster to use :func:`list` instead of :func:`tee`.
542
543 .. versionadded:: 2.4
544
545
546.. _itertools-example:
547
548Examples
549--------
550
551The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000552can be combined.
553
554.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000555
Georg Brandl8ec7f652007-08-15 14:28:01 +0000556 # Show a dictionary sorted and grouped by value
557 >>> from operator import itemgetter
558 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
559 >>> di = sorted(d.iteritems(), key=itemgetter(1))
560 >>> for k, g in groupby(di, key=itemgetter(1)):
561 ... print k, map(itemgetter(0), g)
562 ...
563 1 ['a', 'c', 'e']
564 2 ['b', 'd', 'f']
565 3 ['g']
566
567 # Find runs of consecutive numbers using groupby. The key to the solution
568 # is differencing with a range so that consecutive numbers all appear in
569 # same group.
570 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
571 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000572 ... print map(itemgetter(1), g)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000573 ...
574 [1]
575 [4, 5, 6]
576 [10]
577 [15, 16, 17, 18]
578 [22]
579 [25, 26, 27, 28]
580
581
582
583.. _itertools-recipes:
584
585Recipes
586-------
587
588This section shows recipes for creating an extended toolset using the existing
589itertools as building blocks.
590
591The extended tools offer the same high performance as the underlying toolset.
592The superior memory performance is kept by processing elements one at a time
593rather than bringing the whole iterable into memory all at once. Code volume is
594kept small by linking the tools together in a functional style which helps
595eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000596"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000597which incur interpreter overhead.
598
599.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000600
601 def take(n, seq):
602 return list(islice(seq, n))
603
604 def enumerate(iterable):
605 return izip(count(), iterable)
606
607 def tabulate(function):
608 "Return function(0), function(1), ..."
609 return imap(function, count())
610
611 def iteritems(mapping):
612 return izip(mapping.iterkeys(), mapping.itervalues())
613
614 def nth(iterable, n):
615 "Returns the nth item or raise StopIteration"
616 return islice(iterable, n, None).next()
617
618 def all(seq, pred=None):
619 "Returns True if pred(x) is true for every element in the iterable"
620 for elem in ifilterfalse(pred, seq):
621 return False
622 return True
623
624 def any(seq, pred=None):
625 "Returns True if pred(x) is true for at least one element in the iterable"
626 for elem in ifilter(pred, seq):
627 return True
628 return False
629
630 def no(seq, pred=None):
631 "Returns True if pred(x) is false for every element in the iterable"
632 for elem in ifilter(pred, seq):
633 return False
634 return True
635
636 def quantify(seq, pred=None):
637 "Count how many times the predicate is true in the sequence"
638 return sum(imap(pred, seq))
639
640 def padnone(seq):
641 """Returns the sequence elements and then returns None indefinitely.
642
643 Useful for emulating the behavior of the built-in map() function.
644 """
645 return chain(seq, repeat(None))
646
647 def ncycles(seq, n):
648 "Returns the sequence elements n times"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000649 return chain.from_iterable(repeat(seq, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000650
651 def dotproduct(vec1, vec2):
652 return sum(imap(operator.mul, vec1, vec2))
653
654 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000655 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000656
657 def repeatfunc(func, times=None, *args):
658 """Repeat calls to func with specified arguments.
659
660 Example: repeatfunc(random.random)
661 """
662 if times is None:
663 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000664 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000665
666 def pairwise(iterable):
667 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
668 a, b = tee(iterable)
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000669 for elem in b:
670 break
Georg Brandl8ec7f652007-08-15 14:28:01 +0000671 return izip(a, b)
672
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000673 def grouper(n, iterable, fillvalue=None):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000674 "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000675 args = [iter(iterable)] * n
676 kwds = dict(fillvalue=fillvalue)
677 return izip_longest(*args, **kwds)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000678
Raymond Hettingera44327a2008-01-30 22:17:31 +0000679 def roundrobin(*iterables):
680 "roundrobin('abc', 'd', 'ef') --> 'a', 'd', 'e', 'b', 'f', 'c'"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000681 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000682 pending = len(iterables)
683 nexts = cycle(iter(it).next for it in iterables)
684 while pending:
685 try:
686 for next in nexts:
687 yield next()
688 except StopIteration:
689 pending -= 1
690 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000691
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000692 def powerset(iterable):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000693 "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])"
694 # Recipe credited to Eric Raymond
695 pairs = [(2**i, x) for i, x in enumerate(iterable)]
696 for n in xrange(2**len(pairs)):
697 yield set(x for m, x in pairs if m&n)
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000698
Raymond Hettingere8b4b602008-03-11 00:19:07 +0000699 def compress(data, selectors):
700 "compress('abcdef', [1,0,1,0,1,1]) --> a c e f"
701 for d, s in izip(data, selectors):
702 if s:
703 yield d
Raymond Hettinger33691672008-07-19 00:43:00 +0000704
705 def combinations_with_replacement(iterable, r):
706 "combinations_with_replacement('ABC', 3) --> AA AB AC BB BC CC"
707 pool = tuple(iterable)
708 n = len(pool)
709 indices = [0] * r
710 yield tuple(pool[i] for i in indices)
711 while 1:
712 for i in reversed(range(r)):
713 if indices[i] != n - 1:
714 break
715 else:
716 return
717 indices[i:] = [indices[i] + 1] * (r - i)
718 yield tuple(pool[i] for i in indices)