blob: 9aff478c65fcd069831c265d3b426c653aedcf03 [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Georg Brandle7a09902007-10-21 12:10:28 +000017This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl8ec7f652007-08-15 14:28:01 +000018constructs from the Haskell and SML programming languages. Each has been recast
19in a form suitable for Python.
20
21The module standardizes a core set of fast, memory efficient tools that are
22useful by themselves or in combination. Standardization helps avoid the
23readability and reliability problems which arise when many different individuals
24create their own slightly varying implementations, each with their own quirks
25and naming conventions.
26
27The tools are designed to combine readily with one another. This makes it easy
28to construct more specialized tools succinctly and efficiently in pure Python.
29
30For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
31sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
32:func:`count` which can be combined to form ``imap(f, count())`` and produce an
33equivalent result.
34
35Likewise, the functional tools are designed to work well with the high-speed
36functions provided by the :mod:`operator` module.
37
Georg Brandl8ec7f652007-08-15 14:28:01 +000038Whether cast in pure python form or compiled code, tools that use iterators are
Raymond Hettingerf1f46f02008-07-19 23:58:47 +000039more memory efficient (and often faster) than their list based counterparts. Adopting
Georg Brandl8ec7f652007-08-15 14:28:01 +000040the principles of just-in-time manufacturing, they create data when and where
41needed instead of consuming memory with the computer equivalent of "inventory".
42
Georg Brandl8ec7f652007-08-15 14:28:01 +000043
44.. seealso::
45
46 The Standard ML Basis Library, `The Standard ML Basis Library
47 <http://www.standardml.org/Basis/>`_.
48
49 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
50 Libraries <http://www.haskell.org/definition/>`_.
51
52
53.. _itertools-functions:
54
55Itertool functions
56------------------
57
58The following module functions all construct and return iterators. Some provide
59streams of infinite length, so they should only be accessed by functions or
60loops that truncate the stream.
61
62
63.. function:: chain(*iterables)
64
65 Make an iterator that returns elements from the first iterable until it is
66 exhausted, then proceeds to the next iterable, until all of the iterables are
67 exhausted. Used for treating consecutive sequences as a single sequence.
68 Equivalent to::
69
70 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000071 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000072 for it in iterables:
73 for element in it:
74 yield element
75
76
Raymond Hettinger330958e2008-02-28 19:41:24 +000077.. function:: itertools.chain.from_iterable(iterable)
78
Georg Brandlc62ef8b2009-01-03 20:55:06 +000079 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e2008-02-28 19:41:24 +000080 single iterable argument that is evaluated lazily. Equivalent to::
81
82 @classmethod
83 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000084 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +000085 for it in iterables:
86 for element in it:
87 yield element
88
89 .. versionadded:: 2.6
90
Raymond Hettingerd553d852008-03-04 04:17:08 +000091
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000092.. function:: combinations(iterable, r)
93
Raymond Hettinger5eaffc42008-04-17 10:48:31 +000094 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000095
Georg Brandlc62ef8b2009-01-03 20:55:06 +000096 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000097 input *iterable* is sorted, the combination tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +000098 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000099
100 Elements are treated as unique based on their position, not on their
101 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000102 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000103
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000104 Equivalent to::
105
106 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000107 # combinations('ABCD', 2) --> AB AC AD BC BD CD
108 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000109 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000110 n = len(pool)
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000111 if r > n:
112 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000113 indices = range(r)
114 yield tuple(pool[i] for i in indices)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000115 while 1:
116 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000117 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000118 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000119 else:
120 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000121 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000122 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000123 indices[j] = indices[j-1] + 1
124 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000125
Raymond Hettingerd553d852008-03-04 04:17:08 +0000126 The code for :func:`combinations` can be also expressed as a subsequence
127 of :func:`permutations` after filtering entries where the elements are not
128 in sorted order (according to their position in the input pool)::
129
130 def combinations(iterable, r):
131 pool = tuple(iterable)
132 n = len(pool)
133 for indices in permutations(range(n), r):
134 if sorted(indices) == list(indices):
135 yield tuple(pool[i] for i in indices)
136
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000137 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
138 or zero when ``r > n``.
139
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000140 .. versionadded:: 2.6
141
Raymond Hettingerd081abc2009-01-27 02:58:49 +0000142.. function:: combinations_with_replacement(iterable, r)
143
144 Return *r* length subsequences of elements from the input *iterable*
145 allowing individual elements to be repeated more than once.
146
147 Combinations are emitted in lexicographic sort order. So, if the
148 input *iterable* is sorted, the combination tuples will be produced
149 in sorted order.
150
151 Elements are treated as unique based on their position, not on their
152 value. So if the input elements are unique, the generated combinations
153 will also be unique.
154
155 Equivalent to::
156
157 def combinations_with_replacement(iterable, r):
158 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
159 pool = tuple(iterable)
160 n = len(pool)
161 if not n and r:
162 return
163 indices = [0] * r
164 yield tuple(pool[i] for i in indices)
165 while 1:
166 for i in reversed(range(r)):
167 if indices[i] != n - 1:
168 break
169 else:
170 return
171 indices[i:] = [indices[i] + 1] * (r - i)
172 yield tuple(pool[i] for i in indices)
173
174 The code for :func:`combinations_with_replacement` can be also expressed as
175 a subsequence of :func:`product` after filtering entries where the elements
176 are not in sorted order (according to their position in the input pool)::
177
178 def combinations_with_replacement(iterable, r):
179 pool = tuple(iterable)
180 n = len(pool)
181 for indices in product(range(n), repeat=r):
182 if sorted(indices) == list(indices):
183 yield tuple(pool[i] for i in indices)
184
185 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
186
187 .. versionadded:: 2.7
188
Raymond Hettinger2bcb8e92009-01-25 21:04:14 +0000189.. function:: compress(data, selectors)
190
191 Make an iterator that filters elements from *data* returning only those that
192 have a corresponding element in *selectors* that evaluates to ``True``.
193 Stops when either the *data* or *selectors* iterables have been exhausted.
194 Equivalent to::
195
196 def compress(data, selectors):
197 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
198 return (d for d, s in izip(data, selectors) if s)
199
200 .. versionadded:: 2.7
201
202
Georg Brandl8ec7f652007-08-15 14:28:01 +0000203.. function:: count([n])
204
205 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettinger50e90e22007-10-04 00:20:27 +0000206 specified *n* defaults to zero. Often used as an argument to :func:`imap` to
207 generate consecutive data points. Also, used with :func:`izip` to add sequence
208 numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000209
210 def count(n=0):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000211 # count(10) --> 10 11 12 13 14 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000212 while True:
213 yield n
214 n += 1
215
Georg Brandl8ec7f652007-08-15 14:28:01 +0000216
217.. function:: cycle(iterable)
218
219 Make an iterator returning elements from the iterable and saving a copy of each.
220 When the iterable is exhausted, return elements from the saved copy. Repeats
221 indefinitely. Equivalent to::
222
223 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000224 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000225 saved = []
226 for element in iterable:
227 yield element
228 saved.append(element)
229 while saved:
230 for element in saved:
231 yield element
232
233 Note, this member of the toolkit may require significant auxiliary storage
234 (depending on the length of the iterable).
235
236
237.. function:: dropwhile(predicate, iterable)
238
239 Make an iterator that drops elements from the iterable as long as the predicate
240 is true; afterwards, returns every element. Note, the iterator does not produce
241 *any* output until the predicate first becomes false, so it may have a lengthy
242 start-up time. Equivalent to::
243
244 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000245 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000246 iterable = iter(iterable)
247 for x in iterable:
248 if not predicate(x):
249 yield x
250 break
251 for x in iterable:
252 yield x
253
254
255.. function:: groupby(iterable[, key])
256
257 Make an iterator that returns consecutive keys and groups from the *iterable*.
258 The *key* is a function computing a key value for each element. If not
259 specified or is ``None``, *key* defaults to an identity function and returns
260 the element unchanged. Generally, the iterable needs to already be sorted on
261 the same key function.
262
263 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
264 generates a break or new group every time the value of the key function changes
265 (which is why it is usually necessary to have sorted the data using the same key
266 function). That behavior differs from SQL's GROUP BY which aggregates common
267 elements regardless of their input order.
268
269 The returned group is itself an iterator that shares the underlying iterable
270 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
271 object is advanced, the previous group is no longer visible. So, if that data
272 is needed later, it should be stored as a list::
273
274 groups = []
275 uniquekeys = []
276 data = sorted(data, key=keyfunc)
277 for k, g in groupby(data, keyfunc):
278 groups.append(list(g)) # Store group iterator as a list
279 uniquekeys.append(k)
280
281 :func:`groupby` is equivalent to::
282
283 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000284 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
285 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000286 def __init__(self, iterable, key=None):
287 if key is None:
288 key = lambda x: x
289 self.keyfunc = key
290 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000291 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000292 def __iter__(self):
293 return self
294 def next(self):
295 while self.currkey == self.tgtkey:
296 self.currvalue = self.it.next() # Exit on StopIteration
297 self.currkey = self.keyfunc(self.currvalue)
298 self.tgtkey = self.currkey
299 return (self.currkey, self._grouper(self.tgtkey))
300 def _grouper(self, tgtkey):
301 while self.currkey == tgtkey:
302 yield self.currvalue
303 self.currvalue = self.it.next() # Exit on StopIteration
304 self.currkey = self.keyfunc(self.currvalue)
305
306 .. versionadded:: 2.4
307
308
309.. function:: ifilter(predicate, iterable)
310
311 Make an iterator that filters elements from iterable returning only those for
312 which the predicate is ``True``. If *predicate* is ``None``, return the items
313 that are true. Equivalent to::
314
315 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000316 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000317 if predicate is None:
318 predicate = bool
319 for x in iterable:
320 if predicate(x):
321 yield x
322
323
324.. function:: ifilterfalse(predicate, iterable)
325
326 Make an iterator that filters elements from iterable returning only those for
327 which the predicate is ``False``. If *predicate* is ``None``, return the items
328 that are false. Equivalent to::
329
330 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000331 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000332 if predicate is None:
333 predicate = bool
334 for x in iterable:
335 if not predicate(x):
336 yield x
337
338
339.. function:: imap(function, *iterables)
340
341 Make an iterator that computes the function using arguments from each of the
342 iterables. If *function* is set to ``None``, then :func:`imap` returns the
343 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
344 exhausted instead of filling in ``None`` for shorter iterables. The reason for
345 the difference is that infinite iterator arguments are typically an error for
346 :func:`map` (because the output is fully evaluated) but represent a common and
347 useful way of supplying arguments to :func:`imap`. Equivalent to::
348
349 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000350 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000351 iterables = map(iter, iterables)
352 while True:
Raymond Hettinger2dec48d2008-01-22 22:09:26 +0000353 args = [it.next() for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000354 if function is None:
355 yield tuple(args)
356 else:
357 yield function(*args)
358
359
360.. function:: islice(iterable, [start,] stop [, step])
361
362 Make an iterator that returns selected elements from the iterable. If *start* is
363 non-zero, then elements from the iterable are skipped until start is reached.
364 Afterward, elements are returned consecutively unless *step* is set higher than
365 one which results in items being skipped. If *stop* is ``None``, then iteration
366 continues until the iterator is exhausted, if at all; otherwise, it stops at the
367 specified position. Unlike regular slicing, :func:`islice` does not support
368 negative values for *start*, *stop*, or *step*. Can be used to extract related
369 fields from data where the internal structure has been flattened (for example, a
370 multi-line report may list a name field on every third line). Equivalent to::
371
372 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000373 # islice('ABCDEFG', 2) --> A B
374 # islice('ABCDEFG', 2, 4) --> C D
375 # islice('ABCDEFG', 2, None) --> C D E F G
376 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000377 s = slice(*args)
378 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
379 nexti = it.next()
380 for i, element in enumerate(iterable):
381 if i == nexti:
382 yield element
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000383 nexti = it.next()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000384
385 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
386 then the step defaults to one.
387
388 .. versionchanged:: 2.5
389 accept ``None`` values for default *start* and *step*.
390
391
392.. function:: izip(*iterables)
393
394 Make an iterator that aggregates elements from each of the iterables. Like
395 :func:`zip` except that it returns an iterator instead of a list. Used for
396 lock-step iteration over several iterables at a time. Equivalent to::
397
398 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000399 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000400 iterables = map(iter, iterables)
401 while iterables:
402 result = [it.next() for it in iterables]
403 yield tuple(result)
404
405 .. versionchanged:: 2.4
406 When no iterables are specified, returns a zero length iterator instead of
407 raising a :exc:`TypeError` exception.
408
Raymond Hettinger48c62932008-01-22 19:51:41 +0000409 The left-to-right evaluation order of the iterables is guaranteed. This
410 makes possible an idiom for clustering a data series into n-length groups
411 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000412
Raymond Hettinger48c62932008-01-22 19:51:41 +0000413 :func:`izip` should only be used with unequal length inputs when you don't
414 care about trailing, unmatched values from the longer iterables. If those
415 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000416
417
418.. function:: izip_longest(*iterables[, fillvalue])
419
420 Make an iterator that aggregates elements from each of the iterables. If the
421 iterables are of uneven length, missing values are filled-in with *fillvalue*.
422 Iteration continues until the longest iterable is exhausted. Equivalent to::
423
424 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000425 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000426 fillvalue = kwds.get('fillvalue')
427 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
428 yield counter() # yields the fillvalue, or raises IndexError
429 fillers = repeat(fillvalue)
430 iters = [chain(it, sentinel(), fillers) for it in args]
431 try:
432 for tup in izip(*iters):
433 yield tup
434 except IndexError:
435 pass
436
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000437 If one of the iterables is potentially infinite, then the
438 :func:`izip_longest` function should be wrapped with something that limits
439 the number of calls (for example :func:`islice` or :func:`takewhile`). If
440 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000441
442 .. versionadded:: 2.6
443
Raymond Hettinger330958e2008-02-28 19:41:24 +0000444.. function:: permutations(iterable[, r])
445
446 Return successive *r* length permutations of elements in the *iterable*.
447
448 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000449 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e2008-02-28 19:41:24 +0000450 are generated.
451
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000452 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e2008-02-28 19:41:24 +0000453 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000454 in sorted order.
Raymond Hettinger330958e2008-02-28 19:41:24 +0000455
456 Elements are treated as unique based on their position, not on their
457 value. So if the input elements are unique, there will be no repeat
458 values in each permutation.
459
Raymond Hettingerf287f172008-03-02 10:59:31 +0000460 Equivalent to::
461
462 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000463 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
464 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000465 pool = tuple(iterable)
466 n = len(pool)
467 r = n if r is None else r
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000468 if r > n:
469 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000470 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000471 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000472 yield tuple(pool[i] for i in indices[:r])
473 while n:
474 for i in reversed(range(r)):
475 cycles[i] -= 1
476 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000477 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000478 cycles[i] = n - i
479 else:
480 j = cycles[i]
481 indices[i], indices[-j] = indices[-j], indices[i]
482 yield tuple(pool[i] for i in indices[:r])
483 break
484 else:
485 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000486
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000487 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000488 :func:`product`, filtered to exclude entries with repeated elements (those
489 from the same position in the input pool)::
490
491 def permutations(iterable, r=None):
492 pool = tuple(iterable)
493 n = len(pool)
494 r = n if r is None else r
495 for indices in product(range(n), repeat=r):
496 if len(set(indices)) == r:
497 yield tuple(pool[i] for i in indices)
498
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000499 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
500 or zero when ``r > n``.
501
Raymond Hettinger330958e2008-02-28 19:41:24 +0000502 .. versionadded:: 2.6
503
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000504.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000505
506 Cartesian product of input iterables.
507
508 Equivalent to nested for-loops in a generator expression. For example,
509 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
510
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000511 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000512 on every iteration. This pattern creates a lexicographic ordering so that if
513 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000514 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000515
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000516 To compute the product of an iterable with itself, specify the number of
517 repetitions with the optional *repeat* keyword argument. For example,
518 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
519
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000520 This function is equivalent to the following code, except that the
521 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000522
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000523 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000524 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
525 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000526 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000527 result = [[]]
528 for pool in pools:
529 result = [x+[y] for x in result for y in pool]
530 for prod in result:
531 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000532
533 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000534
535.. function:: repeat(object[, times])
536
537 Make an iterator that returns *object* over and over again. Runs indefinitely
538 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000539 invariant function parameters. Also used with :func:`izip` to create constant
540 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000541
542 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000543 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000544 if times is None:
545 while True:
546 yield object
547 else:
548 for i in xrange(times):
549 yield object
550
551
552.. function:: starmap(function, iterable)
553
Raymond Hettinger47317092008-01-17 03:02:14 +0000554 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000555 the iterable. Used instead of :func:`imap` when argument parameters are already
556 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
557 difference between :func:`imap` and :func:`starmap` parallels the distinction
558 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
559
560 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000561 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000562 for args in iterable:
563 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000564
Raymond Hettinger47317092008-01-17 03:02:14 +0000565 .. versionchanged:: 2.6
566 Previously, :func:`starmap` required the function arguments to be tuples.
567 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000568
569.. function:: takewhile(predicate, iterable)
570
571 Make an iterator that returns elements from the iterable as long as the
572 predicate is true. Equivalent to::
573
574 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000575 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000576 for x in iterable:
577 if predicate(x):
578 yield x
579 else:
580 break
581
582
583.. function:: tee(iterable[, n=2])
584
585 Return *n* independent iterators from a single iterable. The case where ``n==2``
586 is equivalent to::
587
588 def tee(iterable):
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000589 def gen(next, data={}):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000590 for i in count():
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000591 if i in data:
592 yield data.pop(i)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000593 else:
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000594 data[i] = next()
595 yield data[i]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000596 it = iter(iterable)
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000597 return gen(it.next), gen(it.next)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000598
599 Note, once :func:`tee` has made a split, the original *iterable* should not be
600 used anywhere else; otherwise, the *iterable* could get advanced without the tee
601 objects being informed.
602
603 Note, this member of the toolkit may require significant auxiliary storage
604 (depending on how much temporary data needs to be stored). In general, if one
605 iterator is going to use most or all of the data before the other iterator, it
606 is faster to use :func:`list` instead of :func:`tee`.
607
608 .. versionadded:: 2.4
609
610
611.. _itertools-example:
612
613Examples
614--------
615
616The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000617can be combined.
618
619.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000620
Benjamin Peterson8ea99992009-01-01 16:43:12 +0000621 >>> # Show a dictionary sorted and grouped by value
Georg Brandl8ec7f652007-08-15 14:28:01 +0000622 >>> from operator import itemgetter
623 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
624 >>> di = sorted(d.iteritems(), key=itemgetter(1))
625 >>> for k, g in groupby(di, key=itemgetter(1)):
626 ... print k, map(itemgetter(0), g)
627 ...
628 1 ['a', 'c', 'e']
629 2 ['b', 'd', 'f']
630 3 ['g']
631
Benjamin Peterson8ea99992009-01-01 16:43:12 +0000632 >>> # Find runs of consecutive numbers using groupby. The key to the solution
633 >>> # is differencing with a range so that consecutive numbers all appear in
634 >>> # same group.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000635 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
636 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000637 ... print map(itemgetter(1), g)
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000638 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000639 [1]
640 [4, 5, 6]
641 [10]
642 [15, 16, 17, 18]
643 [22]
644 [25, 26, 27, 28]
645
646
647
648.. _itertools-recipes:
649
650Recipes
651-------
652
653This section shows recipes for creating an extended toolset using the existing
654itertools as building blocks.
655
656The extended tools offer the same high performance as the underlying toolset.
657The superior memory performance is kept by processing elements one at a time
658rather than bringing the whole iterable into memory all at once. Code volume is
659kept small by linking the tools together in a functional style which helps
660eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000661"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000662which incur interpreter overhead.
663
664.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000665
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000666 def take(n, iterable):
667 "Return first n items of the iterable as a list"
668 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000669
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000670 def enumerate(iterable, start=0):
671 return izip(count(start), iterable)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000672
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000673 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000674 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000675 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000676
677 def nth(iterable, n):
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000678 "Returns the nth item or empty list"
679 return list(islice(iterable, n, n+1))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000680
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000681 def quantify(iterable, pred=bool):
682 "Count how many times the predicate is true"
683 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000684
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000685 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000686 """Returns the sequence elements and then returns None indefinitely.
687
688 Useful for emulating the behavior of the built-in map() function.
689 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000690 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000691
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000692 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000693 "Returns the sequence elements n times"
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000694 return chain.from_iterable(repeat(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000695
696 def dotproduct(vec1, vec2):
697 return sum(imap(operator.mul, vec1, vec2))
698
699 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000700 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000701
702 def repeatfunc(func, times=None, *args):
703 """Repeat calls to func with specified arguments.
704
705 Example: repeatfunc(random.random)
706 """
707 if times is None:
708 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000709 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000710
711 def pairwise(iterable):
712 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
713 a, b = tee(iterable)
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000714 for elem in b:
715 break
Georg Brandl8ec7f652007-08-15 14:28:01 +0000716 return izip(a, b)
717
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000718 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000719 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000720 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000721 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000722
Raymond Hettingera44327a2008-01-30 22:17:31 +0000723 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000724 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000725 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000726 pending = len(iterables)
727 nexts = cycle(iter(it).next for it in iterables)
728 while pending:
729 try:
730 for next in nexts:
731 yield next()
732 except StopIteration:
733 pending -= 1
734 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000735
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000736 def powerset(iterable):
Raymond Hettinger68d919e2009-01-25 21:31:47 +0000737 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
738 s = list(iterable)
739 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000740
Raymond Hettinger44e15812009-01-02 21:26:45 +0000741 def unique_everseen(iterable, key=None):
742 "List unique elements, preserving order. Remember all elements ever seen."
743 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000744 # unique_everseen('ABBCcAD', str.lower) --> A B C D
Raymond Hettinger44e15812009-01-02 21:26:45 +0000745 seen = set()
746 seen_add = seen.add
747 if key is None:
748 for element in iterable:
749 if element not in seen:
750 seen_add(element)
751 yield element
752 else:
753 for element in iterable:
754 k = key(element)
755 if k not in seen:
756 seen_add(k)
757 yield element
758
759 def unique_justseen(iterable, key=None):
760 "List unique elements, preserving order. Remember only the element just seen."
761 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
762 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
763 return imap(next, imap(itemgetter(1), groupby(iterable, key)))