blob: f546fe16ee2bf55ac5c5f725891722f9bc41979a [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
11.. versionadded:: 2.3
12
Georg Brandle7a09902007-10-21 12:10:28 +000013This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl8ec7f652007-08-15 14:28:01 +000014constructs from the Haskell and SML programming languages. Each has been recast
15in a form suitable for Python.
16
17The module standardizes a core set of fast, memory efficient tools that are
18useful by themselves or in combination. Standardization helps avoid the
19readability and reliability problems which arise when many different individuals
20create their own slightly varying implementations, each with their own quirks
21and naming conventions.
22
23The tools are designed to combine readily with one another. This makes it easy
24to construct more specialized tools succinctly and efficiently in pure Python.
25
26For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
27sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
28:func:`count` which can be combined to form ``imap(f, count())`` and produce an
29equivalent result.
30
31Likewise, the functional tools are designed to work well with the high-speed
32functions provided by the :mod:`operator` module.
33
34The module author welcomes suggestions for other basic building blocks to be
35added to future versions of the module.
36
37Whether cast in pure python form or compiled code, tools that use iterators are
38more memory efficient (and faster) than their list based counterparts. Adopting
39the principles of just-in-time manufacturing, they create data when and where
40needed instead of consuming memory with the computer equivalent of "inventory".
41
42The performance advantage of iterators becomes more acute as the number of
43elements increases -- at some point, lists grow large enough to severely impact
44memory cache performance and start running slowly.
45
46
47.. seealso::
48
49 The Standard ML Basis Library, `The Standard ML Basis Library
50 <http://www.standardml.org/Basis/>`_.
51
52 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
53 Libraries <http://www.haskell.org/definition/>`_.
54
55
56.. _itertools-functions:
57
58Itertool functions
59------------------
60
61The following module functions all construct and return iterators. Some provide
62streams of infinite length, so they should only be accessed by functions or
63loops that truncate the stream.
64
65
66.. function:: chain(*iterables)
67
68 Make an iterator that returns elements from the first iterable until it is
69 exhausted, then proceeds to the next iterable, until all of the iterables are
70 exhausted. Used for treating consecutive sequences as a single sequence.
71 Equivalent to::
72
73 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000074 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000075 for it in iterables:
76 for element in it:
77 yield element
78
79
Raymond Hettinger330958e2008-02-28 19:41:24 +000080.. function:: itertools.chain.from_iterable(iterable)
81
82 Alternate constructor for :func:`chain`. Gets chained inputs from a
83 single iterable argument that is evaluated lazily. Equivalent to::
84
85 @classmethod
86 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000087 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +000088 for it in iterables:
89 for element in it:
90 yield element
91
92 .. versionadded:: 2.6
93
Raymond Hettingerd553d852008-03-04 04:17:08 +000094
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000095.. function:: combinations(iterable, r)
96
97 Return successive *r* length combinations of elements in the *iterable*.
98
Raymond Hettinger330958e2008-02-28 19:41:24 +000099 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000100 input *iterable* is sorted, the combination tuples will be produced
101 in sorted order.
102
103 Elements are treated as unique based on their position, not on their
104 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000105 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000106
107 Each result tuple is ordered to match the input order. So, every
108 combination is a subsequence of the input *iterable*.
109
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000110 Equivalent to::
111
112 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000113 # combinations('ABCD', 2) --> AB AC AD BC BD CD
114 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000115 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000116 n = len(pool)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000117 indices = range(r)
118 yield tuple(pool[i] for i in indices)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000119 while 1:
120 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000121 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000122 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000123 else:
124 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000125 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000126 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000127 indices[j] = indices[j-1] + 1
128 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000129
Raymond Hettingerd553d852008-03-04 04:17:08 +0000130 The code for :func:`combinations` can be also expressed as a subsequence
131 of :func:`permutations` after filtering entries where the elements are not
132 in sorted order (according to their position in the input pool)::
133
134 def combinations(iterable, r):
135 pool = tuple(iterable)
136 n = len(pool)
137 for indices in permutations(range(n), r):
138 if sorted(indices) == list(indices):
139 yield tuple(pool[i] for i in indices)
140
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000141 .. versionadded:: 2.6
142
Georg Brandl8ec7f652007-08-15 14:28:01 +0000143.. function:: count([n])
144
145 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettinger50e90e22007-10-04 00:20:27 +0000146 specified *n* defaults to zero. Often used as an argument to :func:`imap` to
147 generate consecutive data points. Also, used with :func:`izip` to add sequence
148 numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000149
150 def count(n=0):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000151 # count(10) --> 10 11 12 13 14 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000152 while True:
153 yield n
154 n += 1
155
Georg Brandl8ec7f652007-08-15 14:28:01 +0000156
157.. function:: cycle(iterable)
158
159 Make an iterator returning elements from the iterable and saving a copy of each.
160 When the iterable is exhausted, return elements from the saved copy. Repeats
161 indefinitely. Equivalent to::
162
163 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000164 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000165 saved = []
166 for element in iterable:
167 yield element
168 saved.append(element)
169 while saved:
170 for element in saved:
171 yield element
172
173 Note, this member of the toolkit may require significant auxiliary storage
174 (depending on the length of the iterable).
175
176
177.. function:: dropwhile(predicate, iterable)
178
179 Make an iterator that drops elements from the iterable as long as the predicate
180 is true; afterwards, returns every element. Note, the iterator does not produce
181 *any* output until the predicate first becomes false, so it may have a lengthy
182 start-up time. Equivalent to::
183
184 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000185 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000186 iterable = iter(iterable)
187 for x in iterable:
188 if not predicate(x):
189 yield x
190 break
191 for x in iterable:
192 yield x
193
194
195.. function:: groupby(iterable[, key])
196
197 Make an iterator that returns consecutive keys and groups from the *iterable*.
198 The *key* is a function computing a key value for each element. If not
199 specified or is ``None``, *key* defaults to an identity function and returns
200 the element unchanged. Generally, the iterable needs to already be sorted on
201 the same key function.
202
203 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
204 generates a break or new group every time the value of the key function changes
205 (which is why it is usually necessary to have sorted the data using the same key
206 function). That behavior differs from SQL's GROUP BY which aggregates common
207 elements regardless of their input order.
208
209 The returned group is itself an iterator that shares the underlying iterable
210 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
211 object is advanced, the previous group is no longer visible. So, if that data
212 is needed later, it should be stored as a list::
213
214 groups = []
215 uniquekeys = []
216 data = sorted(data, key=keyfunc)
217 for k, g in groupby(data, keyfunc):
218 groups.append(list(g)) # Store group iterator as a list
219 uniquekeys.append(k)
220
221 :func:`groupby` is equivalent to::
222
223 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000224 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
225 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000226 def __init__(self, iterable, key=None):
227 if key is None:
228 key = lambda x: x
229 self.keyfunc = key
230 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000231 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000232 def __iter__(self):
233 return self
234 def next(self):
235 while self.currkey == self.tgtkey:
236 self.currvalue = self.it.next() # Exit on StopIteration
237 self.currkey = self.keyfunc(self.currvalue)
238 self.tgtkey = self.currkey
239 return (self.currkey, self._grouper(self.tgtkey))
240 def _grouper(self, tgtkey):
241 while self.currkey == tgtkey:
242 yield self.currvalue
243 self.currvalue = self.it.next() # Exit on StopIteration
244 self.currkey = self.keyfunc(self.currvalue)
245
246 .. versionadded:: 2.4
247
248
249.. function:: ifilter(predicate, iterable)
250
251 Make an iterator that filters elements from iterable returning only those for
252 which the predicate is ``True``. If *predicate* is ``None``, return the items
253 that are true. Equivalent to::
254
255 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000256 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000257 if predicate is None:
258 predicate = bool
259 for x in iterable:
260 if predicate(x):
261 yield x
262
263
264.. function:: ifilterfalse(predicate, iterable)
265
266 Make an iterator that filters elements from iterable returning only those for
267 which the predicate is ``False``. If *predicate* is ``None``, return the items
268 that are false. Equivalent to::
269
270 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000271 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000272 if predicate is None:
273 predicate = bool
274 for x in iterable:
275 if not predicate(x):
276 yield x
277
278
279.. function:: imap(function, *iterables)
280
281 Make an iterator that computes the function using arguments from each of the
282 iterables. If *function* is set to ``None``, then :func:`imap` returns the
283 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
284 exhausted instead of filling in ``None`` for shorter iterables. The reason for
285 the difference is that infinite iterator arguments are typically an error for
286 :func:`map` (because the output is fully evaluated) but represent a common and
287 useful way of supplying arguments to :func:`imap`. Equivalent to::
288
289 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000290 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000291 iterables = map(iter, iterables)
292 while True:
Raymond Hettinger2dec48d2008-01-22 22:09:26 +0000293 args = [it.next() for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000294 if function is None:
295 yield tuple(args)
296 else:
297 yield function(*args)
298
299
300.. function:: islice(iterable, [start,] stop [, step])
301
302 Make an iterator that returns selected elements from the iterable. If *start* is
303 non-zero, then elements from the iterable are skipped until start is reached.
304 Afterward, elements are returned consecutively unless *step* is set higher than
305 one which results in items being skipped. If *stop* is ``None``, then iteration
306 continues until the iterator is exhausted, if at all; otherwise, it stops at the
307 specified position. Unlike regular slicing, :func:`islice` does not support
308 negative values for *start*, *stop*, or *step*. Can be used to extract related
309 fields from data where the internal structure has been flattened (for example, a
310 multi-line report may list a name field on every third line). Equivalent to::
311
312 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000313 # islice('ABCDEFG', 2) --> A B
314 # islice('ABCDEFG', 2, 4) --> C D
315 # islice('ABCDEFG', 2, None) --> C D E F G
316 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000317 s = slice(*args)
318 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
319 nexti = it.next()
320 for i, element in enumerate(iterable):
321 if i == nexti:
322 yield element
323 nexti = it.next()
324
325 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
326 then the step defaults to one.
327
328 .. versionchanged:: 2.5
329 accept ``None`` values for default *start* and *step*.
330
331
332.. function:: izip(*iterables)
333
334 Make an iterator that aggregates elements from each of the iterables. Like
335 :func:`zip` except that it returns an iterator instead of a list. Used for
336 lock-step iteration over several iterables at a time. Equivalent to::
337
338 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000339 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000340 iterables = map(iter, iterables)
341 while iterables:
342 result = [it.next() for it in iterables]
343 yield tuple(result)
344
345 .. versionchanged:: 2.4
346 When no iterables are specified, returns a zero length iterator instead of
347 raising a :exc:`TypeError` exception.
348
Raymond Hettinger48c62932008-01-22 19:51:41 +0000349 The left-to-right evaluation order of the iterables is guaranteed. This
350 makes possible an idiom for clustering a data series into n-length groups
351 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000352
Raymond Hettinger48c62932008-01-22 19:51:41 +0000353 :func:`izip` should only be used with unequal length inputs when you don't
354 care about trailing, unmatched values from the longer iterables. If those
355 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000356
357
358.. function:: izip_longest(*iterables[, fillvalue])
359
360 Make an iterator that aggregates elements from each of the iterables. If the
361 iterables are of uneven length, missing values are filled-in with *fillvalue*.
362 Iteration continues until the longest iterable is exhausted. Equivalent to::
363
364 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000365 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000366 fillvalue = kwds.get('fillvalue')
367 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
368 yield counter() # yields the fillvalue, or raises IndexError
369 fillers = repeat(fillvalue)
370 iters = [chain(it, sentinel(), fillers) for it in args]
371 try:
372 for tup in izip(*iters):
373 yield tup
374 except IndexError:
375 pass
376
377 If one of the iterables is potentially infinite, then the :func:`izip_longest`
378 function should be wrapped with something that limits the number of calls (for
379 example :func:`islice` or :func:`takewhile`).
380
381 .. versionadded:: 2.6
382
Raymond Hettinger330958e2008-02-28 19:41:24 +0000383.. function:: permutations(iterable[, r])
384
385 Return successive *r* length permutations of elements in the *iterable*.
386
387 If *r* is not specified or is ``None``, then *r* defaults to the length
388 of the *iterable* and all possible full-length permutations
389 are generated.
390
391 Permutations are emitted in lexicographic sort order. So, if the
392 input *iterable* is sorted, the permutation tuples will be produced
393 in sorted order.
394
395 Elements are treated as unique based on their position, not on their
396 value. So if the input elements are unique, there will be no repeat
397 values in each permutation.
398
Raymond Hettingerf287f172008-03-02 10:59:31 +0000399 Equivalent to::
400
401 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000402 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
403 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000404 pool = tuple(iterable)
405 n = len(pool)
406 r = n if r is None else r
407 indices = range(n)
408 cycles = range(n-r+1, n+1)[::-1]
409 yield tuple(pool[i] for i in indices[:r])
410 while n:
411 for i in reversed(range(r)):
412 cycles[i] -= 1
413 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000414 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000415 cycles[i] = n - i
416 else:
417 j = cycles[i]
418 indices[i], indices[-j] = indices[-j], indices[i]
419 yield tuple(pool[i] for i in indices[:r])
420 break
421 else:
422 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000423
Raymond Hettingerd553d852008-03-04 04:17:08 +0000424 The code for :func:`permutations` can be also expressed as a subsequence of
425 :func:`product`, filtered to exclude entries with repeated elements (those
426 from the same position in the input pool)::
427
428 def permutations(iterable, r=None):
429 pool = tuple(iterable)
430 n = len(pool)
431 r = n if r is None else r
432 for indices in product(range(n), repeat=r):
433 if len(set(indices)) == r:
434 yield tuple(pool[i] for i in indices)
435
Raymond Hettinger330958e2008-02-28 19:41:24 +0000436 .. versionadded:: 2.6
437
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000438.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000439
440 Cartesian product of input iterables.
441
442 Equivalent to nested for-loops in a generator expression. For example,
443 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
444
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000445 The leftmost iterators correspond to the outermost for-loop, so the output
446 tuples cycle like an odometer (with the rightmost element changing on every
Raymond Hettingerd553d852008-03-04 04:17:08 +0000447 iteration). This results in a lexicographic ordering so that if the
448 inputs iterables are sorted, the product tuples are emitted
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000449 in sorted order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000450
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000451 To compute the product of an iterable with itself, specify the number of
452 repetitions with the optional *repeat* keyword argument. For example,
453 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
454
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000455 This function is equivalent to the following code, except that the
456 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000457
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000458 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000459 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
460 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000461 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000462 result = [[]]
463 for pool in pools:
464 result = [x+[y] for x in result for y in pool]
465 for prod in result:
466 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000467
468 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000469
470.. function:: repeat(object[, times])
471
472 Make an iterator that returns *object* over and over again. Runs indefinitely
473 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000474 invariant function parameters. Also used with :func:`izip` to create constant
475 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000476
477 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000478 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000479 if times is None:
480 while True:
481 yield object
482 else:
483 for i in xrange(times):
484 yield object
485
486
487.. function:: starmap(function, iterable)
488
Raymond Hettinger47317092008-01-17 03:02:14 +0000489 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000490 the iterable. Used instead of :func:`imap` when argument parameters are already
491 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
492 difference between :func:`imap` and :func:`starmap` parallels the distinction
493 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
494
495 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000496 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000497 for args in iterable:
498 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000499
Raymond Hettinger47317092008-01-17 03:02:14 +0000500 .. versionchanged:: 2.6
501 Previously, :func:`starmap` required the function arguments to be tuples.
502 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000503
504.. function:: takewhile(predicate, iterable)
505
506 Make an iterator that returns elements from the iterable as long as the
507 predicate is true. Equivalent to::
508
509 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000510 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000511 for x in iterable:
512 if predicate(x):
513 yield x
514 else:
515 break
516
517
518.. function:: tee(iterable[, n=2])
519
520 Return *n* independent iterators from a single iterable. The case where ``n==2``
521 is equivalent to::
522
523 def tee(iterable):
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000524 def gen(next, data={}):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000525 for i in count():
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000526 if i in data:
527 yield data.pop(i)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000528 else:
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000529 data[i] = next()
530 yield data[i]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000531 it = iter(iterable)
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000532 return gen(it.next), gen(it.next)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000533
534 Note, once :func:`tee` has made a split, the original *iterable* should not be
535 used anywhere else; otherwise, the *iterable* could get advanced without the tee
536 objects being informed.
537
538 Note, this member of the toolkit may require significant auxiliary storage
539 (depending on how much temporary data needs to be stored). In general, if one
540 iterator is going to use most or all of the data before the other iterator, it
541 is faster to use :func:`list` instead of :func:`tee`.
542
543 .. versionadded:: 2.4
544
545
546.. _itertools-example:
547
548Examples
549--------
550
551The following examples show common uses for each tool and demonstrate ways they
552can be combined. ::
553
Georg Brandl8ec7f652007-08-15 14:28:01 +0000554 # Show a dictionary sorted and grouped by value
555 >>> from operator import itemgetter
556 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
557 >>> di = sorted(d.iteritems(), key=itemgetter(1))
558 >>> for k, g in groupby(di, key=itemgetter(1)):
559 ... print k, map(itemgetter(0), g)
560 ...
561 1 ['a', 'c', 'e']
562 2 ['b', 'd', 'f']
563 3 ['g']
564
565 # Find runs of consecutive numbers using groupby. The key to the solution
566 # is differencing with a range so that consecutive numbers all appear in
567 # same group.
568 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
569 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
570 ... print map(operator.itemgetter(1), g)
571 ...
572 [1]
573 [4, 5, 6]
574 [10]
575 [15, 16, 17, 18]
576 [22]
577 [25, 26, 27, 28]
578
579
580
581.. _itertools-recipes:
582
583Recipes
584-------
585
586This section shows recipes for creating an extended toolset using the existing
587itertools as building blocks.
588
589The extended tools offer the same high performance as the underlying toolset.
590The superior memory performance is kept by processing elements one at a time
591rather than bringing the whole iterable into memory all at once. Code volume is
592kept small by linking the tools together in a functional style which helps
593eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000594"vectorized" building blocks over the use of for-loops and :term:`generator`\s
595which incur interpreter overhead. ::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000596
597 def take(n, seq):
598 return list(islice(seq, n))
599
600 def enumerate(iterable):
601 return izip(count(), iterable)
602
603 def tabulate(function):
604 "Return function(0), function(1), ..."
605 return imap(function, count())
606
607 def iteritems(mapping):
608 return izip(mapping.iterkeys(), mapping.itervalues())
609
610 def nth(iterable, n):
611 "Returns the nth item or raise StopIteration"
612 return islice(iterable, n, None).next()
613
614 def all(seq, pred=None):
615 "Returns True if pred(x) is true for every element in the iterable"
616 for elem in ifilterfalse(pred, seq):
617 return False
618 return True
619
620 def any(seq, pred=None):
621 "Returns True if pred(x) is true for at least one element in the iterable"
622 for elem in ifilter(pred, seq):
623 return True
624 return False
625
626 def no(seq, pred=None):
627 "Returns True if pred(x) is false for every element in the iterable"
628 for elem in ifilter(pred, seq):
629 return False
630 return True
631
632 def quantify(seq, pred=None):
633 "Count how many times the predicate is true in the sequence"
634 return sum(imap(pred, seq))
635
636 def padnone(seq):
637 """Returns the sequence elements and then returns None indefinitely.
638
639 Useful for emulating the behavior of the built-in map() function.
640 """
641 return chain(seq, repeat(None))
642
643 def ncycles(seq, n):
644 "Returns the sequence elements n times"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000645 return chain.from_iterable(repeat(seq, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000646
647 def dotproduct(vec1, vec2):
648 return sum(imap(operator.mul, vec1, vec2))
649
650 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000651 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000652
653 def repeatfunc(func, times=None, *args):
654 """Repeat calls to func with specified arguments.
655
656 Example: repeatfunc(random.random)
657 """
658 if times is None:
659 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000660 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000661
662 def pairwise(iterable):
663 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
664 a, b = tee(iterable)
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000665 for elem in b:
666 break
Georg Brandl8ec7f652007-08-15 14:28:01 +0000667 return izip(a, b)
668
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000669 def grouper(n, iterable, fillvalue=None):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000670 "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000671 args = [iter(iterable)] * n
672 kwds = dict(fillvalue=fillvalue)
673 return izip_longest(*args, **kwds)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000674
Raymond Hettingera44327a2008-01-30 22:17:31 +0000675 def roundrobin(*iterables):
676 "roundrobin('abc', 'd', 'ef') --> 'a', 'd', 'e', 'b', 'f', 'c'"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000677 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000678 pending = len(iterables)
679 nexts = cycle(iter(it).next for it in iterables)
680 while pending:
681 try:
682 for next in nexts:
683 yield next()
684 except StopIteration:
685 pending -= 1
686 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000687
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000688 def powerset(iterable):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000689 "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])"
690 # Recipe credited to Eric Raymond
691 pairs = [(2**i, x) for i, x in enumerate(iterable)]
692 for n in xrange(2**len(pairs)):
693 yield set(x for m, x in pairs if m&n)
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000694