blob: 4dfc08340b6b4f6913721fdd7b86353d13ae343e [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Georg Brandle7a09902007-10-21 12:10:28 +000017This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl8ec7f652007-08-15 14:28:01 +000018constructs from the Haskell and SML programming languages. Each has been recast
19in a form suitable for Python.
20
21The module standardizes a core set of fast, memory efficient tools that are
22useful by themselves or in combination. Standardization helps avoid the
23readability and reliability problems which arise when many different individuals
24create their own slightly varying implementations, each with their own quirks
25and naming conventions.
26
27The tools are designed to combine readily with one another. This makes it easy
28to construct more specialized tools succinctly and efficiently in pure Python.
29
30For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
31sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
32:func:`count` which can be combined to form ``imap(f, count())`` and produce an
33equivalent result.
34
35Likewise, the functional tools are designed to work well with the high-speed
36functions provided by the :mod:`operator` module.
37
Georg Brandl8ec7f652007-08-15 14:28:01 +000038Whether cast in pure python form or compiled code, tools that use iterators are
Raymond Hettingerf1f46f02008-07-19 23:58:47 +000039more memory efficient (and often faster) than their list based counterparts. Adopting
Georg Brandl8ec7f652007-08-15 14:28:01 +000040the principles of just-in-time manufacturing, they create data when and where
41needed instead of consuming memory with the computer equivalent of "inventory".
42
Georg Brandl8ec7f652007-08-15 14:28:01 +000043
44.. seealso::
45
46 The Standard ML Basis Library, `The Standard ML Basis Library
47 <http://www.standardml.org/Basis/>`_.
48
49 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
50 Libraries <http://www.haskell.org/definition/>`_.
51
52
53.. _itertools-functions:
54
55Itertool functions
56------------------
57
58The following module functions all construct and return iterators. Some provide
59streams of infinite length, so they should only be accessed by functions or
60loops that truncate the stream.
61
62
63.. function:: chain(*iterables)
64
65 Make an iterator that returns elements from the first iterable until it is
66 exhausted, then proceeds to the next iterable, until all of the iterables are
67 exhausted. Used for treating consecutive sequences as a single sequence.
68 Equivalent to::
69
70 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000071 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000072 for it in iterables:
73 for element in it:
74 yield element
75
76
Raymond Hettinger330958e2008-02-28 19:41:24 +000077.. function:: itertools.chain.from_iterable(iterable)
78
Georg Brandlc62ef8b2009-01-03 20:55:06 +000079 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e2008-02-28 19:41:24 +000080 single iterable argument that is evaluated lazily. Equivalent to::
81
82 @classmethod
83 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000084 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +000085 for it in iterables:
86 for element in it:
87 yield element
88
89 .. versionadded:: 2.6
90
Raymond Hettingerd553d852008-03-04 04:17:08 +000091
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000092.. function:: combinations(iterable, r)
93
Raymond Hettinger5eaffc42008-04-17 10:48:31 +000094 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000095
Georg Brandlc62ef8b2009-01-03 20:55:06 +000096 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000097 input *iterable* is sorted, the combination tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +000098 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000099
100 Elements are treated as unique based on their position, not on their
101 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000102 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000103
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000104 Equivalent to::
105
106 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000107 # combinations('ABCD', 2) --> AB AC AD BC BD CD
108 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000109 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000110 n = len(pool)
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000111 if r > n:
112 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000113 indices = range(r)
114 yield tuple(pool[i] for i in indices)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000115 while 1:
116 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000117 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000118 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000119 else:
120 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000121 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000122 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000123 indices[j] = indices[j-1] + 1
124 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000125
Raymond Hettingerd553d852008-03-04 04:17:08 +0000126 The code for :func:`combinations` can be also expressed as a subsequence
127 of :func:`permutations` after filtering entries where the elements are not
128 in sorted order (according to their position in the input pool)::
129
130 def combinations(iterable, r):
131 pool = tuple(iterable)
132 n = len(pool)
133 for indices in permutations(range(n), r):
134 if sorted(indices) == list(indices):
135 yield tuple(pool[i] for i in indices)
136
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000137 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
138 or zero when ``r > n``.
139
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000140 .. versionadded:: 2.6
141
Raymond Hettingerd081abc2009-01-27 02:58:49 +0000142.. function:: combinations_with_replacement(iterable, r)
143
144 Return *r* length subsequences of elements from the input *iterable*
145 allowing individual elements to be repeated more than once.
146
147 Combinations are emitted in lexicographic sort order. So, if the
148 input *iterable* is sorted, the combination tuples will be produced
149 in sorted order.
150
151 Elements are treated as unique based on their position, not on their
152 value. So if the input elements are unique, the generated combinations
153 will also be unique.
154
155 Equivalent to::
156
157 def combinations_with_replacement(iterable, r):
158 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
159 pool = tuple(iterable)
160 n = len(pool)
161 if not n and r:
162 return
163 indices = [0] * r
164 yield tuple(pool[i] for i in indices)
165 while 1:
166 for i in reversed(range(r)):
167 if indices[i] != n - 1:
168 break
169 else:
170 return
171 indices[i:] = [indices[i] + 1] * (r - i)
172 yield tuple(pool[i] for i in indices)
173
174 The code for :func:`combinations_with_replacement` can be also expressed as
175 a subsequence of :func:`product` after filtering entries where the elements
176 are not in sorted order (according to their position in the input pool)::
177
178 def combinations_with_replacement(iterable, r):
179 pool = tuple(iterable)
180 n = len(pool)
181 for indices in product(range(n), repeat=r):
182 if sorted(indices) == list(indices):
183 yield tuple(pool[i] for i in indices)
184
185 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
186
187 .. versionadded:: 2.7
188
Raymond Hettinger2bcb8e92009-01-25 21:04:14 +0000189.. function:: compress(data, selectors)
190
191 Make an iterator that filters elements from *data* returning only those that
192 have a corresponding element in *selectors* that evaluates to ``True``.
193 Stops when either the *data* or *selectors* iterables have been exhausted.
194 Equivalent to::
195
196 def compress(data, selectors):
197 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
198 return (d for d, s in izip(data, selectors) if s)
199
200 .. versionadded:: 2.7
201
202
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000203.. function:: count(n=0, step=1)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000204
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000205 Make an iterator that returns evenly spaced values starting with *n*. Often
206 used as an argument to :func:`imap` to generate consecutive data points.
207 Also, used with :func:`izip` to add sequence numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000208
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000209 def count(n=0, step=1):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000210 # count(10) --> 10 11 12 13 14 ...
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000211 # count(2.5, 0.5) -> 3.5 3.0 4.5 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000212 while True:
213 yield n
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000214 n += step
Georg Brandl8ec7f652007-08-15 14:28:01 +0000215
Georg Brandl8ec7f652007-08-15 14:28:01 +0000216
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000217 .. versionchanged:: 2.7
218 added *step* argument and allowed non-integer arguments.
219
Georg Brandl8ec7f652007-08-15 14:28:01 +0000220.. function:: cycle(iterable)
221
222 Make an iterator returning elements from the iterable and saving a copy of each.
223 When the iterable is exhausted, return elements from the saved copy. Repeats
224 indefinitely. Equivalent to::
225
226 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000227 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000228 saved = []
229 for element in iterable:
230 yield element
231 saved.append(element)
232 while saved:
233 for element in saved:
234 yield element
235
236 Note, this member of the toolkit may require significant auxiliary storage
237 (depending on the length of the iterable).
238
239
240.. function:: dropwhile(predicate, iterable)
241
242 Make an iterator that drops elements from the iterable as long as the predicate
243 is true; afterwards, returns every element. Note, the iterator does not produce
244 *any* output until the predicate first becomes false, so it may have a lengthy
245 start-up time. Equivalent to::
246
247 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000248 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000249 iterable = iter(iterable)
250 for x in iterable:
251 if not predicate(x):
252 yield x
253 break
254 for x in iterable:
255 yield x
256
257
258.. function:: groupby(iterable[, key])
259
260 Make an iterator that returns consecutive keys and groups from the *iterable*.
261 The *key* is a function computing a key value for each element. If not
262 specified or is ``None``, *key* defaults to an identity function and returns
263 the element unchanged. Generally, the iterable needs to already be sorted on
264 the same key function.
265
266 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
267 generates a break or new group every time the value of the key function changes
268 (which is why it is usually necessary to have sorted the data using the same key
269 function). That behavior differs from SQL's GROUP BY which aggregates common
270 elements regardless of their input order.
271
272 The returned group is itself an iterator that shares the underlying iterable
273 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
274 object is advanced, the previous group is no longer visible. So, if that data
275 is needed later, it should be stored as a list::
276
277 groups = []
278 uniquekeys = []
279 data = sorted(data, key=keyfunc)
280 for k, g in groupby(data, keyfunc):
281 groups.append(list(g)) # Store group iterator as a list
282 uniquekeys.append(k)
283
284 :func:`groupby` is equivalent to::
285
286 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000287 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettingerd507afd2009-02-04 10:52:32 +0000288 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000289 def __init__(self, iterable, key=None):
290 if key is None:
291 key = lambda x: x
292 self.keyfunc = key
293 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000294 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000295 def __iter__(self):
296 return self
297 def next(self):
298 while self.currkey == self.tgtkey:
299 self.currvalue = self.it.next() # Exit on StopIteration
300 self.currkey = self.keyfunc(self.currvalue)
301 self.tgtkey = self.currkey
302 return (self.currkey, self._grouper(self.tgtkey))
303 def _grouper(self, tgtkey):
304 while self.currkey == tgtkey:
305 yield self.currvalue
306 self.currvalue = self.it.next() # Exit on StopIteration
307 self.currkey = self.keyfunc(self.currvalue)
308
309 .. versionadded:: 2.4
310
311
312.. function:: ifilter(predicate, iterable)
313
314 Make an iterator that filters elements from iterable returning only those for
315 which the predicate is ``True``. If *predicate* is ``None``, return the items
316 that are true. Equivalent to::
317
318 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000319 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000320 if predicate is None:
321 predicate = bool
322 for x in iterable:
323 if predicate(x):
324 yield x
325
326
327.. function:: ifilterfalse(predicate, iterable)
328
329 Make an iterator that filters elements from iterable returning only those for
330 which the predicate is ``False``. If *predicate* is ``None``, return the items
331 that are false. Equivalent to::
332
333 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000334 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000335 if predicate is None:
336 predicate = bool
337 for x in iterable:
338 if not predicate(x):
339 yield x
340
341
342.. function:: imap(function, *iterables)
343
344 Make an iterator that computes the function using arguments from each of the
345 iterables. If *function* is set to ``None``, then :func:`imap` returns the
346 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
347 exhausted instead of filling in ``None`` for shorter iterables. The reason for
348 the difference is that infinite iterator arguments are typically an error for
349 :func:`map` (because the output is fully evaluated) but represent a common and
350 useful way of supplying arguments to :func:`imap`. Equivalent to::
351
352 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000353 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000354 iterables = map(iter, iterables)
355 while True:
Raymond Hettinger2dec48d2008-01-22 22:09:26 +0000356 args = [it.next() for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000357 if function is None:
358 yield tuple(args)
359 else:
360 yield function(*args)
361
362
363.. function:: islice(iterable, [start,] stop [, step])
364
365 Make an iterator that returns selected elements from the iterable. If *start* is
366 non-zero, then elements from the iterable are skipped until start is reached.
367 Afterward, elements are returned consecutively unless *step* is set higher than
368 one which results in items being skipped. If *stop* is ``None``, then iteration
369 continues until the iterator is exhausted, if at all; otherwise, it stops at the
370 specified position. Unlike regular slicing, :func:`islice` does not support
371 negative values for *start*, *stop*, or *step*. Can be used to extract related
372 fields from data where the internal structure has been flattened (for example, a
373 multi-line report may list a name field on every third line). Equivalent to::
374
375 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000376 # islice('ABCDEFG', 2) --> A B
377 # islice('ABCDEFG', 2, 4) --> C D
378 # islice('ABCDEFG', 2, None) --> C D E F G
379 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000380 s = slice(*args)
381 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
382 nexti = it.next()
383 for i, element in enumerate(iterable):
384 if i == nexti:
385 yield element
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000386 nexti = it.next()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000387
388 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
389 then the step defaults to one.
390
391 .. versionchanged:: 2.5
392 accept ``None`` values for default *start* and *step*.
393
394
395.. function:: izip(*iterables)
396
397 Make an iterator that aggregates elements from each of the iterables. Like
398 :func:`zip` except that it returns an iterator instead of a list. Used for
399 lock-step iteration over several iterables at a time. Equivalent to::
400
401 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000402 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000403 iterables = map(iter, iterables)
404 while iterables:
405 result = [it.next() for it in iterables]
406 yield tuple(result)
407
408 .. versionchanged:: 2.4
409 When no iterables are specified, returns a zero length iterator instead of
410 raising a :exc:`TypeError` exception.
411
Raymond Hettinger48c62932008-01-22 19:51:41 +0000412 The left-to-right evaluation order of the iterables is guaranteed. This
413 makes possible an idiom for clustering a data series into n-length groups
414 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000415
Raymond Hettinger48c62932008-01-22 19:51:41 +0000416 :func:`izip` should only be used with unequal length inputs when you don't
417 care about trailing, unmatched values from the longer iterables. If those
418 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000419
420
421.. function:: izip_longest(*iterables[, fillvalue])
422
423 Make an iterator that aggregates elements from each of the iterables. If the
424 iterables are of uneven length, missing values are filled-in with *fillvalue*.
425 Iteration continues until the longest iterable is exhausted. Equivalent to::
426
427 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000428 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000429 fillvalue = kwds.get('fillvalue')
430 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
431 yield counter() # yields the fillvalue, or raises IndexError
432 fillers = repeat(fillvalue)
433 iters = [chain(it, sentinel(), fillers) for it in args]
434 try:
435 for tup in izip(*iters):
436 yield tup
437 except IndexError:
438 pass
439
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000440 If one of the iterables is potentially infinite, then the
441 :func:`izip_longest` function should be wrapped with something that limits
442 the number of calls (for example :func:`islice` or :func:`takewhile`). If
443 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000444
445 .. versionadded:: 2.6
446
Raymond Hettinger330958e2008-02-28 19:41:24 +0000447.. function:: permutations(iterable[, r])
448
449 Return successive *r* length permutations of elements in the *iterable*.
450
451 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000452 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e2008-02-28 19:41:24 +0000453 are generated.
454
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000455 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e2008-02-28 19:41:24 +0000456 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000457 in sorted order.
Raymond Hettinger330958e2008-02-28 19:41:24 +0000458
459 Elements are treated as unique based on their position, not on their
460 value. So if the input elements are unique, there will be no repeat
461 values in each permutation.
462
Raymond Hettingerf287f172008-03-02 10:59:31 +0000463 Equivalent to::
464
465 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000466 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
467 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000468 pool = tuple(iterable)
469 n = len(pool)
470 r = n if r is None else r
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000471 if r > n:
472 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000473 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000474 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000475 yield tuple(pool[i] for i in indices[:r])
476 while n:
477 for i in reversed(range(r)):
478 cycles[i] -= 1
479 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000480 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000481 cycles[i] = n - i
482 else:
483 j = cycles[i]
484 indices[i], indices[-j] = indices[-j], indices[i]
485 yield tuple(pool[i] for i in indices[:r])
486 break
487 else:
488 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000489
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000490 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000491 :func:`product`, filtered to exclude entries with repeated elements (those
492 from the same position in the input pool)::
493
494 def permutations(iterable, r=None):
495 pool = tuple(iterable)
496 n = len(pool)
497 r = n if r is None else r
498 for indices in product(range(n), repeat=r):
499 if len(set(indices)) == r:
500 yield tuple(pool[i] for i in indices)
501
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000502 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
503 or zero when ``r > n``.
504
Raymond Hettinger330958e2008-02-28 19:41:24 +0000505 .. versionadded:: 2.6
506
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000507.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000508
509 Cartesian product of input iterables.
510
511 Equivalent to nested for-loops in a generator expression. For example,
512 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
513
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000514 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000515 on every iteration. This pattern creates a lexicographic ordering so that if
516 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000517 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000518
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000519 To compute the product of an iterable with itself, specify the number of
520 repetitions with the optional *repeat* keyword argument. For example,
521 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
522
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000523 This function is equivalent to the following code, except that the
524 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000525
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000526 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000527 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
528 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000529 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000530 result = [[]]
531 for pool in pools:
532 result = [x+[y] for x in result for y in pool]
533 for prod in result:
534 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000535
536 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000537
538.. function:: repeat(object[, times])
539
540 Make an iterator that returns *object* over and over again. Runs indefinitely
541 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000542 invariant function parameters. Also used with :func:`izip` to create constant
543 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000544
545 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000546 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000547 if times is None:
548 while True:
549 yield object
550 else:
551 for i in xrange(times):
552 yield object
553
554
555.. function:: starmap(function, iterable)
556
Raymond Hettinger47317092008-01-17 03:02:14 +0000557 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000558 the iterable. Used instead of :func:`imap` when argument parameters are already
559 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
560 difference between :func:`imap` and :func:`starmap` parallels the distinction
561 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
562
563 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000564 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000565 for args in iterable:
566 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000567
Raymond Hettinger47317092008-01-17 03:02:14 +0000568 .. versionchanged:: 2.6
569 Previously, :func:`starmap` required the function arguments to be tuples.
570 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000571
572.. function:: takewhile(predicate, iterable)
573
574 Make an iterator that returns elements from the iterable as long as the
575 predicate is true. Equivalent to::
576
577 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000578 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000579 for x in iterable:
580 if predicate(x):
581 yield x
582 else:
583 break
584
585
586.. function:: tee(iterable[, n=2])
587
588 Return *n* independent iterators from a single iterable. The case where ``n==2``
589 is equivalent to::
590
591 def tee(iterable):
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000592 def gen(next, data={}):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000593 for i in count():
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000594 if i in data:
595 yield data.pop(i)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000596 else:
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000597 data[i] = next()
598 yield data[i]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000599 it = iter(iterable)
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000600 return gen(it.next), gen(it.next)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000601
602 Note, once :func:`tee` has made a split, the original *iterable* should not be
603 used anywhere else; otherwise, the *iterable* could get advanced without the tee
604 objects being informed.
605
606 Note, this member of the toolkit may require significant auxiliary storage
607 (depending on how much temporary data needs to be stored). In general, if one
608 iterator is going to use most or all of the data before the other iterator, it
609 is faster to use :func:`list` instead of :func:`tee`.
610
611 .. versionadded:: 2.4
612
613
614.. _itertools-example:
615
616Examples
617--------
618
619The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000620can be combined.
621
622.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000623
Benjamin Peterson8ea99992009-01-01 16:43:12 +0000624 >>> # Show a dictionary sorted and grouped by value
Georg Brandl8ec7f652007-08-15 14:28:01 +0000625 >>> from operator import itemgetter
626 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
627 >>> di = sorted(d.iteritems(), key=itemgetter(1))
628 >>> for k, g in groupby(di, key=itemgetter(1)):
629 ... print k, map(itemgetter(0), g)
630 ...
631 1 ['a', 'c', 'e']
632 2 ['b', 'd', 'f']
633 3 ['g']
634
Benjamin Peterson8ea99992009-01-01 16:43:12 +0000635 >>> # Find runs of consecutive numbers using groupby. The key to the solution
636 >>> # is differencing with a range so that consecutive numbers all appear in
637 >>> # same group.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000638 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
639 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000640 ... print map(itemgetter(1), g)
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000641 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000642 [1]
643 [4, 5, 6]
644 [10]
645 [15, 16, 17, 18]
646 [22]
647 [25, 26, 27, 28]
648
649
650
651.. _itertools-recipes:
652
653Recipes
654-------
655
656This section shows recipes for creating an extended toolset using the existing
657itertools as building blocks.
658
659The extended tools offer the same high performance as the underlying toolset.
660The superior memory performance is kept by processing elements one at a time
661rather than bringing the whole iterable into memory all at once. Code volume is
662kept small by linking the tools together in a functional style which helps
663eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000664"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000665which incur interpreter overhead.
666
667.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000668
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000669 def take(n, iterable):
670 "Return first n items of the iterable as a list"
671 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000672
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000673 def enumerate(iterable, start=0):
674 return izip(count(start), iterable)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000675
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000676 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000677 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000678 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000679
680 def nth(iterable, n):
Raymond Hettingerd507afd2009-02-04 10:52:32 +0000681 "Returns the nth item or None"
682 return next(islice(iterable, n, None), None)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000683
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000684 def quantify(iterable, pred=bool):
685 "Count how many times the predicate is true"
686 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000687
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000688 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000689 """Returns the sequence elements and then returns None indefinitely.
690
691 Useful for emulating the behavior of the built-in map() function.
692 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000693 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000694
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000695 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000696 "Returns the sequence elements n times"
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000697 return chain.from_iterable(repeat(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000698
699 def dotproduct(vec1, vec2):
700 return sum(imap(operator.mul, vec1, vec2))
701
702 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000703 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000704
705 def repeatfunc(func, times=None, *args):
706 """Repeat calls to func with specified arguments.
707
708 Example: repeatfunc(random.random)
709 """
710 if times is None:
711 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000712 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000713
714 def pairwise(iterable):
715 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
716 a, b = tee(iterable)
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000717 for elem in b:
718 break
Georg Brandl8ec7f652007-08-15 14:28:01 +0000719 return izip(a, b)
720
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000721 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000722 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000723 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000724 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000725
Raymond Hettingera44327a2008-01-30 22:17:31 +0000726 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000727 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000728 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000729 pending = len(iterables)
730 nexts = cycle(iter(it).next for it in iterables)
731 while pending:
732 try:
733 for next in nexts:
734 yield next()
735 except StopIteration:
736 pending -= 1
737 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000738
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000739 def powerset(iterable):
Raymond Hettinger68d919e2009-01-25 21:31:47 +0000740 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
741 s = list(iterable)
742 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000743
Benjamin Peterson48291362009-01-31 20:01:48 +0000744 def unique_everseen(iterable, key=None):
745 "List unique elements, preserving order. Remember all elements ever seen."
746 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
747 # unique_everseen('ABBCcAD', str.lower) --> A B C D
748 seen = set()
749 seen_add = seen.add
750 if key is None:
751 for element in iterable:
752 if element not in seen:
753 seen_add(element)
754 yield element
755 else:
756 for element in iterable:
757 k = key(element)
758 if k not in seen:
759 seen_add(k)
760 yield element
Raymond Hettinger44e15812009-01-02 21:26:45 +0000761
Benjamin Peterson48291362009-01-31 20:01:48 +0000762 def unique_justseen(iterable, key=None):
763 "List unique elements, preserving order. Remember only the element just seen."
764 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
765 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
766 return imap(next, imap(itemgetter(1), groupby(iterable, key)))