blob: 42f648ab045ed08644d6f30f1a3f04c5aac90f48 [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Georg Brandle7a09902007-10-21 12:10:28 +000017This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl8ec7f652007-08-15 14:28:01 +000018constructs from the Haskell and SML programming languages. Each has been recast
19in a form suitable for Python.
20
21The module standardizes a core set of fast, memory efficient tools that are
22useful by themselves or in combination. Standardization helps avoid the
23readability and reliability problems which arise when many different individuals
24create their own slightly varying implementations, each with their own quirks
25and naming conventions.
26
27The tools are designed to combine readily with one another. This makes it easy
28to construct more specialized tools succinctly and efficiently in pure Python.
29
30For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
31sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
32:func:`count` which can be combined to form ``imap(f, count())`` and produce an
33equivalent result.
34
35Likewise, the functional tools are designed to work well with the high-speed
36functions provided by the :mod:`operator` module.
37
38The module author welcomes suggestions for other basic building blocks to be
39added to future versions of the module.
40
41Whether cast in pure python form or compiled code, tools that use iterators are
42more memory efficient (and faster) than their list based counterparts. Adopting
43the principles of just-in-time manufacturing, they create data when and where
44needed instead of consuming memory with the computer equivalent of "inventory".
45
46The performance advantage of iterators becomes more acute as the number of
47elements increases -- at some point, lists grow large enough to severely impact
48memory cache performance and start running slowly.
49
50
51.. seealso::
52
53 The Standard ML Basis Library, `The Standard ML Basis Library
54 <http://www.standardml.org/Basis/>`_.
55
56 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
57 Libraries <http://www.haskell.org/definition/>`_.
58
59
60.. _itertools-functions:
61
62Itertool functions
63------------------
64
65The following module functions all construct and return iterators. Some provide
66streams of infinite length, so they should only be accessed by functions or
67loops that truncate the stream.
68
69
70.. function:: chain(*iterables)
71
72 Make an iterator that returns elements from the first iterable until it is
73 exhausted, then proceeds to the next iterable, until all of the iterables are
74 exhausted. Used for treating consecutive sequences as a single sequence.
75 Equivalent to::
76
77 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000078 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000079 for it in iterables:
80 for element in it:
81 yield element
82
83
Raymond Hettinger330958e2008-02-28 19:41:24 +000084.. function:: itertools.chain.from_iterable(iterable)
85
86 Alternate constructor for :func:`chain`. Gets chained inputs from a
87 single iterable argument that is evaluated lazily. Equivalent to::
88
89 @classmethod
90 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000091 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +000092 for it in iterables:
93 for element in it:
94 yield element
95
96 .. versionadded:: 2.6
97
Raymond Hettingerd553d852008-03-04 04:17:08 +000098
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000099.. function:: combinations(iterable, r)
100
101 Return successive *r* length combinations of elements in the *iterable*.
102
Raymond Hettinger330958e2008-02-28 19:41:24 +0000103 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000104 input *iterable* is sorted, the combination tuples will be produced
105 in sorted order.
106
107 Elements are treated as unique based on their position, not on their
108 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000109 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000110
111 Each result tuple is ordered to match the input order. So, every
112 combination is a subsequence of the input *iterable*.
113
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000114 Equivalent to::
115
116 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000117 # combinations('ABCD', 2) --> AB AC AD BC BD CD
118 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000119 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000120 n = len(pool)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000121 indices = range(r)
122 yield tuple(pool[i] for i in indices)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000123 while 1:
124 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000125 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000126 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000127 else:
128 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000129 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000130 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000131 indices[j] = indices[j-1] + 1
132 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000133
Raymond Hettingerd553d852008-03-04 04:17:08 +0000134 The code for :func:`combinations` can be also expressed as a subsequence
135 of :func:`permutations` after filtering entries where the elements are not
136 in sorted order (according to their position in the input pool)::
137
138 def combinations(iterable, r):
139 pool = tuple(iterable)
140 n = len(pool)
141 for indices in permutations(range(n), r):
142 if sorted(indices) == list(indices):
143 yield tuple(pool[i] for i in indices)
144
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000145 .. versionadded:: 2.6
146
Georg Brandl8ec7f652007-08-15 14:28:01 +0000147.. function:: count([n])
148
149 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettinger50e90e22007-10-04 00:20:27 +0000150 specified *n* defaults to zero. Often used as an argument to :func:`imap` to
151 generate consecutive data points. Also, used with :func:`izip` to add sequence
152 numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000153
154 def count(n=0):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000155 # count(10) --> 10 11 12 13 14 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000156 while True:
157 yield n
158 n += 1
159
Georg Brandl8ec7f652007-08-15 14:28:01 +0000160
161.. function:: cycle(iterable)
162
163 Make an iterator returning elements from the iterable and saving a copy of each.
164 When the iterable is exhausted, return elements from the saved copy. Repeats
165 indefinitely. Equivalent to::
166
167 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000168 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000169 saved = []
170 for element in iterable:
171 yield element
172 saved.append(element)
173 while saved:
174 for element in saved:
175 yield element
176
177 Note, this member of the toolkit may require significant auxiliary storage
178 (depending on the length of the iterable).
179
180
181.. function:: dropwhile(predicate, iterable)
182
183 Make an iterator that drops elements from the iterable as long as the predicate
184 is true; afterwards, returns every element. Note, the iterator does not produce
185 *any* output until the predicate first becomes false, so it may have a lengthy
186 start-up time. Equivalent to::
187
188 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000189 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000190 iterable = iter(iterable)
191 for x in iterable:
192 if not predicate(x):
193 yield x
194 break
195 for x in iterable:
196 yield x
197
198
199.. function:: groupby(iterable[, key])
200
201 Make an iterator that returns consecutive keys and groups from the *iterable*.
202 The *key* is a function computing a key value for each element. If not
203 specified or is ``None``, *key* defaults to an identity function and returns
204 the element unchanged. Generally, the iterable needs to already be sorted on
205 the same key function.
206
207 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
208 generates a break or new group every time the value of the key function changes
209 (which is why it is usually necessary to have sorted the data using the same key
210 function). That behavior differs from SQL's GROUP BY which aggregates common
211 elements regardless of their input order.
212
213 The returned group is itself an iterator that shares the underlying iterable
214 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
215 object is advanced, the previous group is no longer visible. So, if that data
216 is needed later, it should be stored as a list::
217
218 groups = []
219 uniquekeys = []
220 data = sorted(data, key=keyfunc)
221 for k, g in groupby(data, keyfunc):
222 groups.append(list(g)) # Store group iterator as a list
223 uniquekeys.append(k)
224
225 :func:`groupby` is equivalent to::
226
227 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000228 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
229 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000230 def __init__(self, iterable, key=None):
231 if key is None:
232 key = lambda x: x
233 self.keyfunc = key
234 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000235 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000236 def __iter__(self):
237 return self
238 def next(self):
239 while self.currkey == self.tgtkey:
240 self.currvalue = self.it.next() # Exit on StopIteration
241 self.currkey = self.keyfunc(self.currvalue)
242 self.tgtkey = self.currkey
243 return (self.currkey, self._grouper(self.tgtkey))
244 def _grouper(self, tgtkey):
245 while self.currkey == tgtkey:
246 yield self.currvalue
247 self.currvalue = self.it.next() # Exit on StopIteration
248 self.currkey = self.keyfunc(self.currvalue)
249
250 .. versionadded:: 2.4
251
252
253.. function:: ifilter(predicate, iterable)
254
255 Make an iterator that filters elements from iterable returning only those for
256 which the predicate is ``True``. If *predicate* is ``None``, return the items
257 that are true. Equivalent to::
258
259 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000260 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000261 if predicate is None:
262 predicate = bool
263 for x in iterable:
264 if predicate(x):
265 yield x
266
267
268.. function:: ifilterfalse(predicate, iterable)
269
270 Make an iterator that filters elements from iterable returning only those for
271 which the predicate is ``False``. If *predicate* is ``None``, return the items
272 that are false. Equivalent to::
273
274 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000275 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000276 if predicate is None:
277 predicate = bool
278 for x in iterable:
279 if not predicate(x):
280 yield x
281
282
283.. function:: imap(function, *iterables)
284
285 Make an iterator that computes the function using arguments from each of the
286 iterables. If *function* is set to ``None``, then :func:`imap` returns the
287 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
288 exhausted instead of filling in ``None`` for shorter iterables. The reason for
289 the difference is that infinite iterator arguments are typically an error for
290 :func:`map` (because the output is fully evaluated) but represent a common and
291 useful way of supplying arguments to :func:`imap`. Equivalent to::
292
293 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000294 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000295 iterables = map(iter, iterables)
296 while True:
Raymond Hettinger2dec48d2008-01-22 22:09:26 +0000297 args = [it.next() for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000298 if function is None:
299 yield tuple(args)
300 else:
301 yield function(*args)
302
303
304.. function:: islice(iterable, [start,] stop [, step])
305
306 Make an iterator that returns selected elements from the iterable. If *start* is
307 non-zero, then elements from the iterable are skipped until start is reached.
308 Afterward, elements are returned consecutively unless *step* is set higher than
309 one which results in items being skipped. If *stop* is ``None``, then iteration
310 continues until the iterator is exhausted, if at all; otherwise, it stops at the
311 specified position. Unlike regular slicing, :func:`islice` does not support
312 negative values for *start*, *stop*, or *step*. Can be used to extract related
313 fields from data where the internal structure has been flattened (for example, a
314 multi-line report may list a name field on every third line). Equivalent to::
315
316 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000317 # islice('ABCDEFG', 2) --> A B
318 # islice('ABCDEFG', 2, 4) --> C D
319 # islice('ABCDEFG', 2, None) --> C D E F G
320 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000321 s = slice(*args)
322 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
323 nexti = it.next()
324 for i, element in enumerate(iterable):
325 if i == nexti:
326 yield element
327 nexti = it.next()
328
329 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
330 then the step defaults to one.
331
332 .. versionchanged:: 2.5
333 accept ``None`` values for default *start* and *step*.
334
335
336.. function:: izip(*iterables)
337
338 Make an iterator that aggregates elements from each of the iterables. Like
339 :func:`zip` except that it returns an iterator instead of a list. Used for
340 lock-step iteration over several iterables at a time. Equivalent to::
341
342 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000343 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000344 iterables = map(iter, iterables)
345 while iterables:
346 result = [it.next() for it in iterables]
347 yield tuple(result)
348
349 .. versionchanged:: 2.4
350 When no iterables are specified, returns a zero length iterator instead of
351 raising a :exc:`TypeError` exception.
352
Raymond Hettinger48c62932008-01-22 19:51:41 +0000353 The left-to-right evaluation order of the iterables is guaranteed. This
354 makes possible an idiom for clustering a data series into n-length groups
355 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000356
Raymond Hettinger48c62932008-01-22 19:51:41 +0000357 :func:`izip` should only be used with unequal length inputs when you don't
358 care about trailing, unmatched values from the longer iterables. If those
359 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000360
361
362.. function:: izip_longest(*iterables[, fillvalue])
363
364 Make an iterator that aggregates elements from each of the iterables. If the
365 iterables are of uneven length, missing values are filled-in with *fillvalue*.
366 Iteration continues until the longest iterable is exhausted. Equivalent to::
367
368 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000369 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000370 fillvalue = kwds.get('fillvalue')
371 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
372 yield counter() # yields the fillvalue, or raises IndexError
373 fillers = repeat(fillvalue)
374 iters = [chain(it, sentinel(), fillers) for it in args]
375 try:
376 for tup in izip(*iters):
377 yield tup
378 except IndexError:
379 pass
380
381 If one of the iterables is potentially infinite, then the :func:`izip_longest`
382 function should be wrapped with something that limits the number of calls (for
383 example :func:`islice` or :func:`takewhile`).
384
385 .. versionadded:: 2.6
386
Raymond Hettinger330958e2008-02-28 19:41:24 +0000387.. function:: permutations(iterable[, r])
388
389 Return successive *r* length permutations of elements in the *iterable*.
390
391 If *r* is not specified or is ``None``, then *r* defaults to the length
392 of the *iterable* and all possible full-length permutations
393 are generated.
394
395 Permutations are emitted in lexicographic sort order. So, if the
396 input *iterable* is sorted, the permutation tuples will be produced
397 in sorted order.
398
399 Elements are treated as unique based on their position, not on their
400 value. So if the input elements are unique, there will be no repeat
401 values in each permutation.
402
Raymond Hettingerf287f172008-03-02 10:59:31 +0000403 Equivalent to::
404
405 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000406 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
407 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000408 pool = tuple(iterable)
409 n = len(pool)
410 r = n if r is None else r
411 indices = range(n)
412 cycles = range(n-r+1, n+1)[::-1]
413 yield tuple(pool[i] for i in indices[:r])
414 while n:
415 for i in reversed(range(r)):
416 cycles[i] -= 1
417 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000418 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000419 cycles[i] = n - i
420 else:
421 j = cycles[i]
422 indices[i], indices[-j] = indices[-j], indices[i]
423 yield tuple(pool[i] for i in indices[:r])
424 break
425 else:
426 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000427
Raymond Hettingerd553d852008-03-04 04:17:08 +0000428 The code for :func:`permutations` can be also expressed as a subsequence of
429 :func:`product`, filtered to exclude entries with repeated elements (those
430 from the same position in the input pool)::
431
432 def permutations(iterable, r=None):
433 pool = tuple(iterable)
434 n = len(pool)
435 r = n if r is None else r
436 for indices in product(range(n), repeat=r):
437 if len(set(indices)) == r:
438 yield tuple(pool[i] for i in indices)
439
Raymond Hettinger330958e2008-02-28 19:41:24 +0000440 .. versionadded:: 2.6
441
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000442.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000443
444 Cartesian product of input iterables.
445
446 Equivalent to nested for-loops in a generator expression. For example,
447 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
448
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000449 The leftmost iterators correspond to the outermost for-loop, so the output
450 tuples cycle like an odometer (with the rightmost element changing on every
Raymond Hettingerd553d852008-03-04 04:17:08 +0000451 iteration). This results in a lexicographic ordering so that if the
452 inputs iterables are sorted, the product tuples are emitted
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000453 in sorted order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000454
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000455 To compute the product of an iterable with itself, specify the number of
456 repetitions with the optional *repeat* keyword argument. For example,
457 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
458
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000459 This function is equivalent to the following code, except that the
460 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000461
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000462 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000463 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
464 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000465 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000466 result = [[]]
467 for pool in pools:
468 result = [x+[y] for x in result for y in pool]
469 for prod in result:
470 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000471
472 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000473
474.. function:: repeat(object[, times])
475
476 Make an iterator that returns *object* over and over again. Runs indefinitely
477 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000478 invariant function parameters. Also used with :func:`izip` to create constant
479 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000480
481 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000482 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000483 if times is None:
484 while True:
485 yield object
486 else:
487 for i in xrange(times):
488 yield object
489
490
491.. function:: starmap(function, iterable)
492
Raymond Hettinger47317092008-01-17 03:02:14 +0000493 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000494 the iterable. Used instead of :func:`imap` when argument parameters are already
495 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
496 difference between :func:`imap` and :func:`starmap` parallels the distinction
497 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
498
499 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000500 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000501 for args in iterable:
502 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000503
Raymond Hettinger47317092008-01-17 03:02:14 +0000504 .. versionchanged:: 2.6
505 Previously, :func:`starmap` required the function arguments to be tuples.
506 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000507
508.. function:: takewhile(predicate, iterable)
509
510 Make an iterator that returns elements from the iterable as long as the
511 predicate is true. Equivalent to::
512
513 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000514 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000515 for x in iterable:
516 if predicate(x):
517 yield x
518 else:
519 break
520
521
522.. function:: tee(iterable[, n=2])
523
524 Return *n* independent iterators from a single iterable. The case where ``n==2``
525 is equivalent to::
526
527 def tee(iterable):
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000528 def gen(next, data={}):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000529 for i in count():
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000530 if i in data:
531 yield data.pop(i)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000532 else:
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000533 data[i] = next()
534 yield data[i]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000535 it = iter(iterable)
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000536 return gen(it.next), gen(it.next)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000537
538 Note, once :func:`tee` has made a split, the original *iterable* should not be
539 used anywhere else; otherwise, the *iterable* could get advanced without the tee
540 objects being informed.
541
542 Note, this member of the toolkit may require significant auxiliary storage
543 (depending on how much temporary data needs to be stored). In general, if one
544 iterator is going to use most or all of the data before the other iterator, it
545 is faster to use :func:`list` instead of :func:`tee`.
546
547 .. versionadded:: 2.4
548
549
550.. _itertools-example:
551
552Examples
553--------
554
555The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000556can be combined.
557
558.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000559
Georg Brandl8ec7f652007-08-15 14:28:01 +0000560 # Show a dictionary sorted and grouped by value
561 >>> from operator import itemgetter
562 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
563 >>> di = sorted(d.iteritems(), key=itemgetter(1))
564 >>> for k, g in groupby(di, key=itemgetter(1)):
565 ... print k, map(itemgetter(0), g)
566 ...
567 1 ['a', 'c', 'e']
568 2 ['b', 'd', 'f']
569 3 ['g']
570
571 # Find runs of consecutive numbers using groupby. The key to the solution
572 # is differencing with a range so that consecutive numbers all appear in
573 # same group.
574 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
575 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000576 ... print map(itemgetter(1), g)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000577 ...
578 [1]
579 [4, 5, 6]
580 [10]
581 [15, 16, 17, 18]
582 [22]
583 [25, 26, 27, 28]
584
585
586
587.. _itertools-recipes:
588
589Recipes
590-------
591
592This section shows recipes for creating an extended toolset using the existing
593itertools as building blocks.
594
595The extended tools offer the same high performance as the underlying toolset.
596The superior memory performance is kept by processing elements one at a time
597rather than bringing the whole iterable into memory all at once. Code volume is
598kept small by linking the tools together in a functional style which helps
599eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000600"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000601which incur interpreter overhead.
602
603.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000604
605 def take(n, seq):
606 return list(islice(seq, n))
607
608 def enumerate(iterable):
609 return izip(count(), iterable)
610
611 def tabulate(function):
612 "Return function(0), function(1), ..."
613 return imap(function, count())
614
615 def iteritems(mapping):
616 return izip(mapping.iterkeys(), mapping.itervalues())
617
618 def nth(iterable, n):
619 "Returns the nth item or raise StopIteration"
620 return islice(iterable, n, None).next()
621
622 def all(seq, pred=None):
623 "Returns True if pred(x) is true for every element in the iterable"
624 for elem in ifilterfalse(pred, seq):
625 return False
626 return True
627
628 def any(seq, pred=None):
629 "Returns True if pred(x) is true for at least one element in the iterable"
630 for elem in ifilter(pred, seq):
631 return True
632 return False
633
634 def no(seq, pred=None):
635 "Returns True if pred(x) is false for every element in the iterable"
636 for elem in ifilter(pred, seq):
637 return False
638 return True
639
640 def quantify(seq, pred=None):
641 "Count how many times the predicate is true in the sequence"
642 return sum(imap(pred, seq))
643
644 def padnone(seq):
645 """Returns the sequence elements and then returns None indefinitely.
646
647 Useful for emulating the behavior of the built-in map() function.
648 """
649 return chain(seq, repeat(None))
650
651 def ncycles(seq, n):
652 "Returns the sequence elements n times"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000653 return chain.from_iterable(repeat(seq, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000654
655 def dotproduct(vec1, vec2):
656 return sum(imap(operator.mul, vec1, vec2))
657
658 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000659 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000660
661 def repeatfunc(func, times=None, *args):
662 """Repeat calls to func with specified arguments.
663
664 Example: repeatfunc(random.random)
665 """
666 if times is None:
667 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000668 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000669
670 def pairwise(iterable):
671 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
672 a, b = tee(iterable)
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000673 for elem in b:
674 break
Georg Brandl8ec7f652007-08-15 14:28:01 +0000675 return izip(a, b)
676
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000677 def grouper(n, iterable, fillvalue=None):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000678 "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000679 args = [iter(iterable)] * n
680 kwds = dict(fillvalue=fillvalue)
681 return izip_longest(*args, **kwds)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000682
Raymond Hettingera44327a2008-01-30 22:17:31 +0000683 def roundrobin(*iterables):
684 "roundrobin('abc', 'd', 'ef') --> 'a', 'd', 'e', 'b', 'f', 'c'"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000685 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000686 pending = len(iterables)
687 nexts = cycle(iter(it).next for it in iterables)
688 while pending:
689 try:
690 for next in nexts:
691 yield next()
692 except StopIteration:
693 pending -= 1
694 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000695
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000696 def powerset(iterable):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000697 "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])"
698 # Recipe credited to Eric Raymond
699 pairs = [(2**i, x) for i, x in enumerate(iterable)]
700 for n in xrange(2**len(pairs)):
701 yield set(x for m, x in pairs if m&n)
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000702
Raymond Hettingere8b4b602008-03-11 00:19:07 +0000703 def compress(data, selectors):
704 "compress('abcdef', [1,0,1,0,1,1]) --> a c e f"
705 for d, s in izip(data, selectors):
706 if s:
707 yield d