blob: db38e697d7111cb07817da340dbc7aa3dc46843a [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Georg Brandle7a09902007-10-21 12:10:28 +000017This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl8ec7f652007-08-15 14:28:01 +000018constructs from the Haskell and SML programming languages. Each has been recast
19in a form suitable for Python.
20
21The module standardizes a core set of fast, memory efficient tools that are
22useful by themselves or in combination. Standardization helps avoid the
23readability and reliability problems which arise when many different individuals
24create their own slightly varying implementations, each with their own quirks
25and naming conventions.
26
27The tools are designed to combine readily with one another. This makes it easy
28to construct more specialized tools succinctly and efficiently in pure Python.
29
30For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
31sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
32:func:`count` which can be combined to form ``imap(f, count())`` and produce an
33equivalent result.
34
35Likewise, the functional tools are designed to work well with the high-speed
36functions provided by the :mod:`operator` module.
37
Georg Brandl8ec7f652007-08-15 14:28:01 +000038Whether cast in pure python form or compiled code, tools that use iterators are
Raymond Hettingerf1f46f02008-07-19 23:58:47 +000039more memory efficient (and often faster) than their list based counterparts. Adopting
Georg Brandl8ec7f652007-08-15 14:28:01 +000040the principles of just-in-time manufacturing, they create data when and where
41needed instead of consuming memory with the computer equivalent of "inventory".
42
Georg Brandl8ec7f652007-08-15 14:28:01 +000043
44.. seealso::
45
46 The Standard ML Basis Library, `The Standard ML Basis Library
47 <http://www.standardml.org/Basis/>`_.
48
49 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
50 Libraries <http://www.haskell.org/definition/>`_.
51
52
53.. _itertools-functions:
54
55Itertool functions
56------------------
57
58The following module functions all construct and return iterators. Some provide
59streams of infinite length, so they should only be accessed by functions or
60loops that truncate the stream.
61
62
63.. function:: chain(*iterables)
64
65 Make an iterator that returns elements from the first iterable until it is
66 exhausted, then proceeds to the next iterable, until all of the iterables are
67 exhausted. Used for treating consecutive sequences as a single sequence.
68 Equivalent to::
69
70 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000071 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000072 for it in iterables:
73 for element in it:
74 yield element
75
76
Raymond Hettinger330958e2008-02-28 19:41:24 +000077.. function:: itertools.chain.from_iterable(iterable)
78
79 Alternate constructor for :func:`chain`. Gets chained inputs from a
80 single iterable argument that is evaluated lazily. Equivalent to::
81
82 @classmethod
83 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000084 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +000085 for it in iterables:
86 for element in it:
87 yield element
88
89 .. versionadded:: 2.6
90
Raymond Hettingerd553d852008-03-04 04:17:08 +000091
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000092.. function:: combinations(iterable, r)
93
Raymond Hettinger5eaffc42008-04-17 10:48:31 +000094 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000095
Raymond Hettinger330958e2008-02-28 19:41:24 +000096 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000097 input *iterable* is sorted, the combination tuples will be produced
98 in sorted order.
99
100 Elements are treated as unique based on their position, not on their
101 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000102 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000103
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000104 Equivalent to::
105
106 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000107 # combinations('ABCD', 2) --> AB AC AD BC BD CD
108 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000109 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000110 n = len(pool)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000111 indices = range(r)
112 yield tuple(pool[i] for i in indices)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000113 while 1:
114 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000115 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000116 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000117 else:
118 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000119 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000120 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000121 indices[j] = indices[j-1] + 1
122 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000123
Raymond Hettingerd553d852008-03-04 04:17:08 +0000124 The code for :func:`combinations` can be also expressed as a subsequence
125 of :func:`permutations` after filtering entries where the elements are not
126 in sorted order (according to their position in the input pool)::
127
128 def combinations(iterable, r):
129 pool = tuple(iterable)
130 n = len(pool)
131 for indices in permutations(range(n), r):
132 if sorted(indices) == list(indices):
133 yield tuple(pool[i] for i in indices)
134
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000135 .. versionadded:: 2.6
136
Georg Brandl8ec7f652007-08-15 14:28:01 +0000137.. function:: count([n])
138
139 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettinger50e90e22007-10-04 00:20:27 +0000140 specified *n* defaults to zero. Often used as an argument to :func:`imap` to
141 generate consecutive data points. Also, used with :func:`izip` to add sequence
142 numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000143
144 def count(n=0):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000145 # count(10) --> 10 11 12 13 14 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000146 while True:
147 yield n
148 n += 1
149
Georg Brandl8ec7f652007-08-15 14:28:01 +0000150
151.. function:: cycle(iterable)
152
153 Make an iterator returning elements from the iterable and saving a copy of each.
154 When the iterable is exhausted, return elements from the saved copy. Repeats
155 indefinitely. Equivalent to::
156
157 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000158 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000159 saved = []
160 for element in iterable:
161 yield element
162 saved.append(element)
163 while saved:
164 for element in saved:
165 yield element
166
167 Note, this member of the toolkit may require significant auxiliary storage
168 (depending on the length of the iterable).
169
170
171.. function:: dropwhile(predicate, iterable)
172
173 Make an iterator that drops elements from the iterable as long as the predicate
174 is true; afterwards, returns every element. Note, the iterator does not produce
175 *any* output until the predicate first becomes false, so it may have a lengthy
176 start-up time. Equivalent to::
177
178 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000179 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000180 iterable = iter(iterable)
181 for x in iterable:
182 if not predicate(x):
183 yield x
184 break
185 for x in iterable:
186 yield x
187
188
189.. function:: groupby(iterable[, key])
190
191 Make an iterator that returns consecutive keys and groups from the *iterable*.
192 The *key* is a function computing a key value for each element. If not
193 specified or is ``None``, *key* defaults to an identity function and returns
194 the element unchanged. Generally, the iterable needs to already be sorted on
195 the same key function.
196
197 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
198 generates a break or new group every time the value of the key function changes
199 (which is why it is usually necessary to have sorted the data using the same key
200 function). That behavior differs from SQL's GROUP BY which aggregates common
201 elements regardless of their input order.
202
203 The returned group is itself an iterator that shares the underlying iterable
204 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
205 object is advanced, the previous group is no longer visible. So, if that data
206 is needed later, it should be stored as a list::
207
208 groups = []
209 uniquekeys = []
210 data = sorted(data, key=keyfunc)
211 for k, g in groupby(data, keyfunc):
212 groups.append(list(g)) # Store group iterator as a list
213 uniquekeys.append(k)
214
215 :func:`groupby` is equivalent to::
216
217 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000218 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
219 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000220 def __init__(self, iterable, key=None):
221 if key is None:
222 key = lambda x: x
223 self.keyfunc = key
224 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000225 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000226 def __iter__(self):
227 return self
228 def next(self):
229 while self.currkey == self.tgtkey:
230 self.currvalue = self.it.next() # Exit on StopIteration
231 self.currkey = self.keyfunc(self.currvalue)
232 self.tgtkey = self.currkey
233 return (self.currkey, self._grouper(self.tgtkey))
234 def _grouper(self, tgtkey):
235 while self.currkey == tgtkey:
236 yield self.currvalue
237 self.currvalue = self.it.next() # Exit on StopIteration
238 self.currkey = self.keyfunc(self.currvalue)
239
240 .. versionadded:: 2.4
241
242
243.. function:: ifilter(predicate, iterable)
244
245 Make an iterator that filters elements from iterable returning only those for
246 which the predicate is ``True``. If *predicate* is ``None``, return the items
247 that are true. Equivalent to::
248
249 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000250 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000251 if predicate is None:
252 predicate = bool
253 for x in iterable:
254 if predicate(x):
255 yield x
256
257
258.. function:: ifilterfalse(predicate, iterable)
259
260 Make an iterator that filters elements from iterable returning only those for
261 which the predicate is ``False``. If *predicate* is ``None``, return the items
262 that are false. Equivalent to::
263
264 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000265 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000266 if predicate is None:
267 predicate = bool
268 for x in iterable:
269 if not predicate(x):
270 yield x
271
272
273.. function:: imap(function, *iterables)
274
275 Make an iterator that computes the function using arguments from each of the
276 iterables. If *function* is set to ``None``, then :func:`imap` returns the
277 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
278 exhausted instead of filling in ``None`` for shorter iterables. The reason for
279 the difference is that infinite iterator arguments are typically an error for
280 :func:`map` (because the output is fully evaluated) but represent a common and
281 useful way of supplying arguments to :func:`imap`. Equivalent to::
282
283 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000284 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000285 iterables = map(iter, iterables)
286 while True:
Raymond Hettinger2dec48d2008-01-22 22:09:26 +0000287 args = [it.next() for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000288 if function is None:
289 yield tuple(args)
290 else:
291 yield function(*args)
292
293
294.. function:: islice(iterable, [start,] stop [, step])
295
296 Make an iterator that returns selected elements from the iterable. If *start* is
297 non-zero, then elements from the iterable are skipped until start is reached.
298 Afterward, elements are returned consecutively unless *step* is set higher than
299 one which results in items being skipped. If *stop* is ``None``, then iteration
300 continues until the iterator is exhausted, if at all; otherwise, it stops at the
301 specified position. Unlike regular slicing, :func:`islice` does not support
302 negative values for *start*, *stop*, or *step*. Can be used to extract related
303 fields from data where the internal structure has been flattened (for example, a
304 multi-line report may list a name field on every third line). Equivalent to::
305
306 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000307 # islice('ABCDEFG', 2) --> A B
308 # islice('ABCDEFG', 2, 4) --> C D
309 # islice('ABCDEFG', 2, None) --> C D E F G
310 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000311 s = slice(*args)
312 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
313 nexti = it.next()
314 for i, element in enumerate(iterable):
315 if i == nexti:
316 yield element
317 nexti = it.next()
318
319 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
320 then the step defaults to one.
321
322 .. versionchanged:: 2.5
323 accept ``None`` values for default *start* and *step*.
324
325
326.. function:: izip(*iterables)
327
328 Make an iterator that aggregates elements from each of the iterables. Like
329 :func:`zip` except that it returns an iterator instead of a list. Used for
330 lock-step iteration over several iterables at a time. Equivalent to::
331
332 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000333 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000334 iterables = map(iter, iterables)
335 while iterables:
336 result = [it.next() for it in iterables]
337 yield tuple(result)
338
339 .. versionchanged:: 2.4
340 When no iterables are specified, returns a zero length iterator instead of
341 raising a :exc:`TypeError` exception.
342
Raymond Hettinger48c62932008-01-22 19:51:41 +0000343 The left-to-right evaluation order of the iterables is guaranteed. This
344 makes possible an idiom for clustering a data series into n-length groups
345 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000346
Raymond Hettinger48c62932008-01-22 19:51:41 +0000347 :func:`izip` should only be used with unequal length inputs when you don't
348 care about trailing, unmatched values from the longer iterables. If those
349 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000350
351
352.. function:: izip_longest(*iterables[, fillvalue])
353
354 Make an iterator that aggregates elements from each of the iterables. If the
355 iterables are of uneven length, missing values are filled-in with *fillvalue*.
356 Iteration continues until the longest iterable is exhausted. Equivalent to::
357
358 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000359 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000360 fillvalue = kwds.get('fillvalue')
361 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
362 yield counter() # yields the fillvalue, or raises IndexError
363 fillers = repeat(fillvalue)
364 iters = [chain(it, sentinel(), fillers) for it in args]
365 try:
366 for tup in izip(*iters):
367 yield tup
368 except IndexError:
369 pass
370
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000371 If one of the iterables is potentially infinite, then the
372 :func:`izip_longest` function should be wrapped with something that limits
373 the number of calls (for example :func:`islice` or :func:`takewhile`). If
374 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000375
376 .. versionadded:: 2.6
377
Raymond Hettinger330958e2008-02-28 19:41:24 +0000378.. function:: permutations(iterable[, r])
379
380 Return successive *r* length permutations of elements in the *iterable*.
381
382 If *r* is not specified or is ``None``, then *r* defaults to the length
383 of the *iterable* and all possible full-length permutations
384 are generated.
385
386 Permutations are emitted in lexicographic sort order. So, if the
387 input *iterable* is sorted, the permutation tuples will be produced
388 in sorted order.
389
390 Elements are treated as unique based on their position, not on their
391 value. So if the input elements are unique, there will be no repeat
392 values in each permutation.
393
Raymond Hettingerf287f172008-03-02 10:59:31 +0000394 Equivalent to::
395
396 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000397 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
398 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000399 pool = tuple(iterable)
400 n = len(pool)
401 r = n if r is None else r
402 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000403 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000404 yield tuple(pool[i] for i in indices[:r])
405 while n:
406 for i in reversed(range(r)):
407 cycles[i] -= 1
408 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000409 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000410 cycles[i] = n - i
411 else:
412 j = cycles[i]
413 indices[i], indices[-j] = indices[-j], indices[i]
414 yield tuple(pool[i] for i in indices[:r])
415 break
416 else:
417 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000418
Raymond Hettingerd553d852008-03-04 04:17:08 +0000419 The code for :func:`permutations` can be also expressed as a subsequence of
420 :func:`product`, filtered to exclude entries with repeated elements (those
421 from the same position in the input pool)::
422
423 def permutations(iterable, r=None):
424 pool = tuple(iterable)
425 n = len(pool)
426 r = n if r is None else r
427 for indices in product(range(n), repeat=r):
428 if len(set(indices)) == r:
429 yield tuple(pool[i] for i in indices)
430
Raymond Hettinger330958e2008-02-28 19:41:24 +0000431 .. versionadded:: 2.6
432
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000433.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000434
435 Cartesian product of input iterables.
436
437 Equivalent to nested for-loops in a generator expression. For example,
438 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
439
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000440 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000441 on every iteration. This pattern creates a lexicographic ordering so that if
442 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000443 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000444
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000445 To compute the product of an iterable with itself, specify the number of
446 repetitions with the optional *repeat* keyword argument. For example,
447 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
448
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000449 This function is equivalent to the following code, except that the
450 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000451
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000452 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000453 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
454 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000455 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000456 result = [[]]
457 for pool in pools:
458 result = [x+[y] for x in result for y in pool]
459 for prod in result:
460 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000461
462 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000463
464.. function:: repeat(object[, times])
465
466 Make an iterator that returns *object* over and over again. Runs indefinitely
467 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000468 invariant function parameters. Also used with :func:`izip` to create constant
469 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000470
471 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000472 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000473 if times is None:
474 while True:
475 yield object
476 else:
477 for i in xrange(times):
478 yield object
479
480
481.. function:: starmap(function, iterable)
482
Raymond Hettinger47317092008-01-17 03:02:14 +0000483 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000484 the iterable. Used instead of :func:`imap` when argument parameters are already
485 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
486 difference between :func:`imap` and :func:`starmap` parallels the distinction
487 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
488
489 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000490 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000491 for args in iterable:
492 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000493
Raymond Hettinger47317092008-01-17 03:02:14 +0000494 .. versionchanged:: 2.6
495 Previously, :func:`starmap` required the function arguments to be tuples.
496 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000497
498.. function:: takewhile(predicate, iterable)
499
500 Make an iterator that returns elements from the iterable as long as the
501 predicate is true. Equivalent to::
502
503 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000504 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000505 for x in iterable:
506 if predicate(x):
507 yield x
508 else:
509 break
510
511
512.. function:: tee(iterable[, n=2])
513
514 Return *n* independent iterators from a single iterable. The case where ``n==2``
515 is equivalent to::
516
517 def tee(iterable):
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000518 def gen(next, data={}):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000519 for i in count():
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000520 if i in data:
521 yield data.pop(i)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000522 else:
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000523 data[i] = next()
524 yield data[i]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000525 it = iter(iterable)
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000526 return gen(it.next), gen(it.next)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000527
528 Note, once :func:`tee` has made a split, the original *iterable* should not be
529 used anywhere else; otherwise, the *iterable* could get advanced without the tee
530 objects being informed.
531
532 Note, this member of the toolkit may require significant auxiliary storage
533 (depending on how much temporary data needs to be stored). In general, if one
534 iterator is going to use most or all of the data before the other iterator, it
535 is faster to use :func:`list` instead of :func:`tee`.
536
537 .. versionadded:: 2.4
538
539
540.. _itertools-example:
541
542Examples
543--------
544
545The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000546can be combined.
547
548.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000549
Georg Brandl8ec7f652007-08-15 14:28:01 +0000550 # Show a dictionary sorted and grouped by value
551 >>> from operator import itemgetter
552 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
553 >>> di = sorted(d.iteritems(), key=itemgetter(1))
554 >>> for k, g in groupby(di, key=itemgetter(1)):
555 ... print k, map(itemgetter(0), g)
556 ...
557 1 ['a', 'c', 'e']
558 2 ['b', 'd', 'f']
559 3 ['g']
560
561 # Find runs of consecutive numbers using groupby. The key to the solution
562 # is differencing with a range so that consecutive numbers all appear in
563 # same group.
564 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
565 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000566 ... print map(itemgetter(1), g)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000567 ...
568 [1]
569 [4, 5, 6]
570 [10]
571 [15, 16, 17, 18]
572 [22]
573 [25, 26, 27, 28]
574
575
576
577.. _itertools-recipes:
578
579Recipes
580-------
581
582This section shows recipes for creating an extended toolset using the existing
583itertools as building blocks.
584
585The extended tools offer the same high performance as the underlying toolset.
586The superior memory performance is kept by processing elements one at a time
587rather than bringing the whole iterable into memory all at once. Code volume is
588kept small by linking the tools together in a functional style which helps
589eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000590"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000591which incur interpreter overhead.
592
593.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000594
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000595 def take(n, iterable):
596 "Return first n items of the iterable as a list"
597 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000598
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000599 def enumerate(iterable, start=0):
600 return izip(count(start), iterable)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000601
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000602 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000603 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000604 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000605
606 def nth(iterable, n):
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000607 "Returns the nth item or empty list"
608 return list(islice(iterable, n, n+1))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000609
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000610 def quantify(iterable, pred=bool):
611 "Count how many times the predicate is true"
612 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000613
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000614 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000615 """Returns the sequence elements and then returns None indefinitely.
616
617 Useful for emulating the behavior of the built-in map() function.
618 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000619 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000620
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000621 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000622 "Returns the sequence elements n times"
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000623 return chain.from_iterable(repeat(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000624
625 def dotproduct(vec1, vec2):
626 return sum(imap(operator.mul, vec1, vec2))
627
628 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000629 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000630
631 def repeatfunc(func, times=None, *args):
632 """Repeat calls to func with specified arguments.
633
634 Example: repeatfunc(random.random)
635 """
636 if times is None:
637 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000638 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000639
640 def pairwise(iterable):
641 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
642 a, b = tee(iterable)
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000643 for elem in b:
644 break
Georg Brandl8ec7f652007-08-15 14:28:01 +0000645 return izip(a, b)
646
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000647 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000648 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000649 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000650 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000651
Raymond Hettingera44327a2008-01-30 22:17:31 +0000652 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000653 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000654 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000655 pending = len(iterables)
656 nexts = cycle(iter(it).next for it in iterables)
657 while pending:
658 try:
659 for next in nexts:
660 yield next()
661 except StopIteration:
662 pending -= 1
663 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000664
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000665 def powerset(iterable):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000666 "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])"
667 # Recipe credited to Eric Raymond
668 pairs = [(2**i, x) for i, x in enumerate(iterable)]
669 for n in xrange(2**len(pairs)):
670 yield set(x for m, x in pairs if m&n)
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000671
Raymond Hettingere8b4b602008-03-11 00:19:07 +0000672 def compress(data, selectors):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000673 "compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F"
674 return (d for d, s in izip(data, selectors) if s)
Raymond Hettinger33691672008-07-19 00:43:00 +0000675
Georg Brandl8c81fda2008-07-23 16:00:44 +0000676 def combinations_with_replacement(iterable, r):
677 "combinations_with_replacement('ABC', 3) --> AA AB AC BB BC CC"
678 pool = tuple(iterable)
679 n = len(pool)
680 indices = [0] * r
681 yield tuple(pool[i] for i in indices)
682 while 1:
683 for i in reversed(range(r)):
684 if indices[i] != n - 1:
685 break
686 else:
687 return
688 indices[i:] = [indices[i] + 1] * (r - i)
689 yield tuple(pool[i] for i in indices)