blob: d4d130fea6d6694504694476d90569f3ccb2ac58 [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Raymond Hettingerd187d522009-02-23 21:23:04 +000017This module implements a number of :term:`iterator` building blocks inspired
18by constructs from APL, Haskell, and SML. Each has been recast in a form
19suitable for Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000020
21The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettingerd187d522009-02-23 21:23:04 +000022useful by themselves or in combination. Together, they form an "iterator
23algebra" making it possible to construct specialized tools succinctly and
24efficiently in pure Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000025
26For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
27sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
Raymond Hettingerd187d522009-02-23 21:23:04 +000028:func:`count` which can be combined to form ``imap(f, count())`` to produce an
Georg Brandl8ec7f652007-08-15 14:28:01 +000029equivalent result.
30
Raymond Hettingerd187d522009-02-23 21:23:04 +000031The tools also work well with the high-speed functions in the :mod:`operator`
32module. For example, the plus-operator can be mapped across two vectors to
33form an efficient dot-product: ``sum(imap(operator.add, vector1, vector2))``.
Georg Brandl8ec7f652007-08-15 14:28:01 +000034
Georg Brandl8ec7f652007-08-15 14:28:01 +000035
Raymond Hettingerd187d522009-02-23 21:23:04 +000036**Infinite Iterators:**
Georg Brandl8ec7f652007-08-15 14:28:01 +000037
Raymond Hettingerd187d522009-02-23 21:23:04 +000038 ================== ================= =================================================
39 Iterator Arguments Results
40 ================== ================= =================================================
41 :func:`count` start start, start+1, start+2, ...
42 :func:`cycle` p p0, p1, ... plast, p0, p1, ...
43 :func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times
44 ================== ================= =================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000045
Raymond Hettingerd187d522009-02-23 21:23:04 +000046**Iterators terminating on the shortest input sequence:**
47
48 ==================== ============================ =================================================
49 Iterator Arguments Results
50 ==================== ============================ =================================================
51 :func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ...
52 :func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails
53 :func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
54 :func:`ifilter` pred, seq elements of seq where pred(elem) is True
55 :func:`ifilterfalse` pred, seq elements of seq where pred(elem) is False
56 :func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step]
Raymond Hettingerd31d2942009-03-09 11:39:03 +000057 :func:`imap` func, p, q, ... func(p0, q0), func(p1, q1), ...
Raymond Hettingerd187d522009-02-23 21:23:04 +000058 :func:`starmap` func, seq func(\*seq[0]), fun(\*seq[1]), ...
59 :func:`tee` it, n it1, it2 , ... itn splits one iterator into n
60 :func:`takewhile` pred, seq seq[0], seq[1], until pred fails
61 :func:`izip` p, q, ... (p[0], q[0]), (p[1], q[1]), ...
62 :func:`izip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ...
63 ==================== ============================ =================================================
64
65**Combinatoric generators:**
66
67 ===================================== ==================== =================================================
68 Iterator Arguments Results
69 ===================================== ==================== =================================================
70 :func:`product` p, q, ... [repeat=1] cartesian product
71 :func:`permutations` p[, r] r-length permutations (without repeated elements)
72 :func:`combinations` p[, r] r-length combinations (sorted and no repeats)
73 ===================================== ==================== =================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000074
75
76.. _itertools-functions:
77
78Itertool functions
79------------------
80
81The following module functions all construct and return iterators. Some provide
82streams of infinite length, so they should only be accessed by functions or
83loops that truncate the stream.
84
85
86.. function:: chain(*iterables)
87
88 Make an iterator that returns elements from the first iterable until it is
89 exhausted, then proceeds to the next iterable, until all of the iterables are
90 exhausted. Used for treating consecutive sequences as a single sequence.
91 Equivalent to::
92
93 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000094 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000095 for it in iterables:
96 for element in it:
97 yield element
98
99
Raymond Hettinger330958e2008-02-28 19:41:24 +0000100.. function:: itertools.chain.from_iterable(iterable)
101
Georg Brandl734373c2009-01-03 21:55:17 +0000102 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e2008-02-28 19:41:24 +0000103 single iterable argument that is evaluated lazily. Equivalent to::
104
105 @classmethod
106 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000107 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +0000108 for it in iterables:
109 for element in it:
110 yield element
111
112 .. versionadded:: 2.6
113
Raymond Hettingerd553d852008-03-04 04:17:08 +0000114
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000115.. function:: combinations(iterable, r)
116
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000117 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000118
Georg Brandl734373c2009-01-03 21:55:17 +0000119 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000120 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl734373c2009-01-03 21:55:17 +0000121 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000122
123 Elements are treated as unique based on their position, not on their
124 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000125 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000126
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000127 Equivalent to::
128
129 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000130 # combinations('ABCD', 2) --> AB AC AD BC BD CD
131 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000132 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000133 n = len(pool)
Raymond Hettinger825758c2009-01-08 05:20:19 +0000134 if r > n:
135 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000136 indices = range(r)
137 yield tuple(pool[i] for i in indices)
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000138 while True:
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000139 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000140 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000141 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000142 else:
143 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000144 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000145 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000146 indices[j] = indices[j-1] + 1
147 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000148
Raymond Hettingerd553d852008-03-04 04:17:08 +0000149 The code for :func:`combinations` can be also expressed as a subsequence
150 of :func:`permutations` after filtering entries where the elements are not
151 in sorted order (according to their position in the input pool)::
152
153 def combinations(iterable, r):
154 pool = tuple(iterable)
155 n = len(pool)
156 for indices in permutations(range(n), r):
157 if sorted(indices) == list(indices):
158 yield tuple(pool[i] for i in indices)
159
Raymond Hettinger825758c2009-01-08 05:20:19 +0000160 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
161 or zero when ``r > n``.
162
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000163 .. versionadded:: 2.6
164
Georg Brandl8ec7f652007-08-15 14:28:01 +0000165.. function:: count([n])
166
167 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettinger50e90e22007-10-04 00:20:27 +0000168 specified *n* defaults to zero. Often used as an argument to :func:`imap` to
169 generate consecutive data points. Also, used with :func:`izip` to add sequence
170 numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000171
172 def count(n=0):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000173 # count(10) --> 10 11 12 13 14 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000174 while True:
175 yield n
176 n += 1
177
Georg Brandl8ec7f652007-08-15 14:28:01 +0000178
179.. function:: cycle(iterable)
180
181 Make an iterator returning elements from the iterable and saving a copy of each.
182 When the iterable is exhausted, return elements from the saved copy. Repeats
183 indefinitely. Equivalent to::
184
185 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000186 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000187 saved = []
188 for element in iterable:
189 yield element
190 saved.append(element)
191 while saved:
192 for element in saved:
193 yield element
194
195 Note, this member of the toolkit may require significant auxiliary storage
196 (depending on the length of the iterable).
197
198
199.. function:: dropwhile(predicate, iterable)
200
201 Make an iterator that drops elements from the iterable as long as the predicate
202 is true; afterwards, returns every element. Note, the iterator does not produce
203 *any* output until the predicate first becomes false, so it may have a lengthy
204 start-up time. Equivalent to::
205
206 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000207 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000208 iterable = iter(iterable)
209 for x in iterable:
210 if not predicate(x):
211 yield x
212 break
213 for x in iterable:
214 yield x
215
216
217.. function:: groupby(iterable[, key])
218
219 Make an iterator that returns consecutive keys and groups from the *iterable*.
220 The *key* is a function computing a key value for each element. If not
221 specified or is ``None``, *key* defaults to an identity function and returns
222 the element unchanged. Generally, the iterable needs to already be sorted on
223 the same key function.
224
225 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
226 generates a break or new group every time the value of the key function changes
227 (which is why it is usually necessary to have sorted the data using the same key
228 function). That behavior differs from SQL's GROUP BY which aggregates common
229 elements regardless of their input order.
230
231 The returned group is itself an iterator that shares the underlying iterable
232 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
233 object is advanced, the previous group is no longer visible. So, if that data
234 is needed later, it should be stored as a list::
235
236 groups = []
237 uniquekeys = []
238 data = sorted(data, key=keyfunc)
239 for k, g in groupby(data, keyfunc):
240 groups.append(list(g)) # Store group iterator as a list
241 uniquekeys.append(k)
242
243 :func:`groupby` is equivalent to::
244
245 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000246 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettinger11485b42009-02-04 19:34:31 +0000247 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000248 def __init__(self, iterable, key=None):
249 if key is None:
250 key = lambda x: x
251 self.keyfunc = key
252 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000253 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000254 def __iter__(self):
255 return self
256 def next(self):
257 while self.currkey == self.tgtkey:
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000258 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000259 self.currkey = self.keyfunc(self.currvalue)
260 self.tgtkey = self.currkey
261 return (self.currkey, self._grouper(self.tgtkey))
262 def _grouper(self, tgtkey):
263 while self.currkey == tgtkey:
264 yield self.currvalue
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000265 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000266 self.currkey = self.keyfunc(self.currvalue)
267
268 .. versionadded:: 2.4
269
270
271.. function:: ifilter(predicate, iterable)
272
273 Make an iterator that filters elements from iterable returning only those for
274 which the predicate is ``True``. If *predicate* is ``None``, return the items
275 that are true. Equivalent to::
276
277 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000278 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000279 if predicate is None:
280 predicate = bool
281 for x in iterable:
282 if predicate(x):
283 yield x
284
285
286.. function:: ifilterfalse(predicate, iterable)
287
288 Make an iterator that filters elements from iterable returning only those for
289 which the predicate is ``False``. If *predicate* is ``None``, return the items
290 that are false. Equivalent to::
291
292 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000293 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000294 if predicate is None:
295 predicate = bool
296 for x in iterable:
297 if not predicate(x):
298 yield x
299
300
301.. function:: imap(function, *iterables)
302
303 Make an iterator that computes the function using arguments from each of the
304 iterables. If *function* is set to ``None``, then :func:`imap` returns the
305 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
306 exhausted instead of filling in ``None`` for shorter iterables. The reason for
307 the difference is that infinite iterator arguments are typically an error for
308 :func:`map` (because the output is fully evaluated) but represent a common and
309 useful way of supplying arguments to :func:`imap`. Equivalent to::
310
311 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000312 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000313 iterables = map(iter, iterables)
314 while True:
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000315 args = [next(it) for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000316 if function is None:
317 yield tuple(args)
318 else:
319 yield function(*args)
320
321
322.. function:: islice(iterable, [start,] stop [, step])
323
324 Make an iterator that returns selected elements from the iterable. If *start* is
325 non-zero, then elements from the iterable are skipped until start is reached.
326 Afterward, elements are returned consecutively unless *step* is set higher than
327 one which results in items being skipped. If *stop* is ``None``, then iteration
328 continues until the iterator is exhausted, if at all; otherwise, it stops at the
329 specified position. Unlike regular slicing, :func:`islice` does not support
330 negative values for *start*, *stop*, or *step*. Can be used to extract related
331 fields from data where the internal structure has been flattened (for example, a
332 multi-line report may list a name field on every third line). Equivalent to::
333
334 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000335 # islice('ABCDEFG', 2) --> A B
336 # islice('ABCDEFG', 2, 4) --> C D
337 # islice('ABCDEFG', 2, None) --> C D E F G
338 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000339 s = slice(*args)
340 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000341 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000342 for i, element in enumerate(iterable):
343 if i == nexti:
344 yield element
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000345 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000346
347 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
348 then the step defaults to one.
349
350 .. versionchanged:: 2.5
351 accept ``None`` values for default *start* and *step*.
352
353
354.. function:: izip(*iterables)
355
356 Make an iterator that aggregates elements from each of the iterables. Like
357 :func:`zip` except that it returns an iterator instead of a list. Used for
358 lock-step iteration over several iterables at a time. Equivalent to::
359
360 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000361 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000362 iterables = map(iter, iterables)
363 while iterables:
Raymond Hettinger5894c2b2009-02-19 05:38:53 +0000364 yield yield tuple(map(next, iterables))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000365
366 .. versionchanged:: 2.4
367 When no iterables are specified, returns a zero length iterator instead of
368 raising a :exc:`TypeError` exception.
369
Raymond Hettinger48c62932008-01-22 19:51:41 +0000370 The left-to-right evaluation order of the iterables is guaranteed. This
371 makes possible an idiom for clustering a data series into n-length groups
372 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000373
Raymond Hettinger48c62932008-01-22 19:51:41 +0000374 :func:`izip` should only be used with unequal length inputs when you don't
375 care about trailing, unmatched values from the longer iterables. If those
376 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000377
378
379.. function:: izip_longest(*iterables[, fillvalue])
380
381 Make an iterator that aggregates elements from each of the iterables. If the
382 iterables are of uneven length, missing values are filled-in with *fillvalue*.
383 Iteration continues until the longest iterable is exhausted. Equivalent to::
384
385 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000386 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000387 fillvalue = kwds.get('fillvalue')
388 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
389 yield counter() # yields the fillvalue, or raises IndexError
390 fillers = repeat(fillvalue)
391 iters = [chain(it, sentinel(), fillers) for it in args]
392 try:
393 for tup in izip(*iters):
394 yield tup
395 except IndexError:
396 pass
397
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000398 If one of the iterables is potentially infinite, then the
399 :func:`izip_longest` function should be wrapped with something that limits
400 the number of calls (for example :func:`islice` or :func:`takewhile`). If
401 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000402
403 .. versionadded:: 2.6
404
Raymond Hettinger330958e2008-02-28 19:41:24 +0000405.. function:: permutations(iterable[, r])
406
407 Return successive *r* length permutations of elements in the *iterable*.
408
409 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl734373c2009-01-03 21:55:17 +0000410 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e2008-02-28 19:41:24 +0000411 are generated.
412
Georg Brandl734373c2009-01-03 21:55:17 +0000413 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e2008-02-28 19:41:24 +0000414 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl734373c2009-01-03 21:55:17 +0000415 in sorted order.
Raymond Hettinger330958e2008-02-28 19:41:24 +0000416
417 Elements are treated as unique based on their position, not on their
418 value. So if the input elements are unique, there will be no repeat
419 values in each permutation.
420
Raymond Hettingerf287f172008-03-02 10:59:31 +0000421 Equivalent to::
422
423 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000424 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
425 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000426 pool = tuple(iterable)
427 n = len(pool)
428 r = n if r is None else r
Raymond Hettinger825758c2009-01-08 05:20:19 +0000429 if r > n:
430 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000431 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000432 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000433 yield tuple(pool[i] for i in indices[:r])
434 while n:
435 for i in reversed(range(r)):
436 cycles[i] -= 1
437 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000438 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000439 cycles[i] = n - i
440 else:
441 j = cycles[i]
442 indices[i], indices[-j] = indices[-j], indices[i]
443 yield tuple(pool[i] for i in indices[:r])
444 break
445 else:
446 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000447
Georg Brandl734373c2009-01-03 21:55:17 +0000448 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000449 :func:`product`, filtered to exclude entries with repeated elements (those
450 from the same position in the input pool)::
451
452 def permutations(iterable, r=None):
453 pool = tuple(iterable)
454 n = len(pool)
455 r = n if r is None else r
456 for indices in product(range(n), repeat=r):
457 if len(set(indices)) == r:
458 yield tuple(pool[i] for i in indices)
459
Raymond Hettinger825758c2009-01-08 05:20:19 +0000460 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
461 or zero when ``r > n``.
462
Raymond Hettinger330958e2008-02-28 19:41:24 +0000463 .. versionadded:: 2.6
464
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000465.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000466
467 Cartesian product of input iterables.
468
469 Equivalent to nested for-loops in a generator expression. For example,
470 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
471
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000472 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000473 on every iteration. This pattern creates a lexicographic ordering so that if
474 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000475 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000476
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000477 To compute the product of an iterable with itself, specify the number of
478 repetitions with the optional *repeat* keyword argument. For example,
479 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
480
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000481 This function is equivalent to the following code, except that the
482 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000483
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000484 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000485 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
486 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000487 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000488 result = [[]]
489 for pool in pools:
490 result = [x+[y] for x in result for y in pool]
491 for prod in result:
492 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000493
494 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000495
496.. function:: repeat(object[, times])
497
498 Make an iterator that returns *object* over and over again. Runs indefinitely
499 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000500 invariant function parameters. Also used with :func:`izip` to create constant
501 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000502
503 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000504 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000505 if times is None:
506 while True:
507 yield object
508 else:
509 for i in xrange(times):
510 yield object
511
512
513.. function:: starmap(function, iterable)
514
Raymond Hettinger47317092008-01-17 03:02:14 +0000515 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000516 the iterable. Used instead of :func:`imap` when argument parameters are already
517 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
518 difference between :func:`imap` and :func:`starmap` parallels the distinction
519 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
520
521 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000522 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000523 for args in iterable:
524 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000525
Raymond Hettinger47317092008-01-17 03:02:14 +0000526 .. versionchanged:: 2.6
527 Previously, :func:`starmap` required the function arguments to be tuples.
528 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000529
530.. function:: takewhile(predicate, iterable)
531
532 Make an iterator that returns elements from the iterable as long as the
533 predicate is true. Equivalent to::
534
535 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000536 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000537 for x in iterable:
538 if predicate(x):
539 yield x
540 else:
541 break
542
543
544.. function:: tee(iterable[, n=2])
545
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000546 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000547
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000548 def tee(iterable, n=2):
549 it = iter(iterable)
550 deques = [collections.deque() for i in range(n)]
551 def gen(mydeque):
552 while True:
553 if not mydeque: # when the local deque is empty
554 newval = next(it) # fetch a new value and
555 for d in deques: # load it to all the deques
556 d.append(newval)
557 yield mydeque.popleft()
558 return tuple(gen(d) for d in deques)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000559
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000560 Once :func:`tee` has made a split, the original *iterable* should not be
561 used anywhere else; otherwise, the *iterable* could get advanced without
562 the tee objects being informed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000563
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000564 This itertool may require significant auxiliary storage (depending on how
565 much temporary data needs to be stored). In general, if one iterator uses
566 most or all of the data before another iterator starts, it is faster to use
567 :func:`list` instead of :func:`tee`.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000568
569 .. versionadded:: 2.4
570
571
572.. _itertools-example:
573
574Examples
575--------
576
577The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000578can be combined.
579
580.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000581
Georg Brandl734373c2009-01-03 21:55:17 +0000582 >>> # Show a dictionary sorted and grouped by value
Georg Brandl8ec7f652007-08-15 14:28:01 +0000583 >>> from operator import itemgetter
584 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
585 >>> di = sorted(d.iteritems(), key=itemgetter(1))
586 >>> for k, g in groupby(di, key=itemgetter(1)):
587 ... print k, map(itemgetter(0), g)
588 ...
589 1 ['a', 'c', 'e']
590 2 ['b', 'd', 'f']
591 3 ['g']
592
Georg Brandl734373c2009-01-03 21:55:17 +0000593 >>> # Find runs of consecutive numbers using groupby. The key to the solution
594 >>> # is differencing with a range so that consecutive numbers all appear in
595 >>> # same group.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000596 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
597 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000598 ... print map(itemgetter(1), g)
Georg Brandl734373c2009-01-03 21:55:17 +0000599 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000600 [1]
601 [4, 5, 6]
602 [10]
603 [15, 16, 17, 18]
604 [22]
605 [25, 26, 27, 28]
606
607
608
609.. _itertools-recipes:
610
611Recipes
612-------
613
614This section shows recipes for creating an extended toolset using the existing
615itertools as building blocks.
616
617The extended tools offer the same high performance as the underlying toolset.
618The superior memory performance is kept by processing elements one at a time
619rather than bringing the whole iterable into memory all at once. Code volume is
620kept small by linking the tools together in a functional style which helps
621eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000622"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000623which incur interpreter overhead.
624
625.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000626
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000627 def take(n, iterable):
628 "Return first n items of the iterable as a list"
629 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000630
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000631 def enumerate(iterable, start=0):
632 return izip(count(start), iterable)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000633
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000634 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000635 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000636 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000637
Raymond Hettingerfb30cca2009-03-09 11:58:33 +0000638 def consume(iterator, n):
639 "Advance the iterator n-steps ahead. If n is none, consume entirely."
640 collections.deque(islice(iterator, n), maxlen=0)
641
Raymond Hettinger5894c2b2009-02-19 05:38:53 +0000642 def nth(iterable, n, default=None):
643 "Returns the nth item or a default value"
644 return next(islice(iterable, n, None), default)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000645
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000646 def quantify(iterable, pred=bool):
647 "Count how many times the predicate is true"
648 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000649
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000650 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000651 """Returns the sequence elements and then returns None indefinitely.
652
653 Useful for emulating the behavior of the built-in map() function.
654 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000655 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000656
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000657 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000658 "Returns the sequence elements n times"
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000659 return chain.from_iterable(repeat(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000660
661 def dotproduct(vec1, vec2):
662 return sum(imap(operator.mul, vec1, vec2))
663
664 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000665 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000666
667 def repeatfunc(func, times=None, *args):
668 """Repeat calls to func with specified arguments.
669
670 Example: repeatfunc(random.random)
671 """
672 if times is None:
673 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000674 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000675
676 def pairwise(iterable):
677 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
678 a, b = tee(iterable)
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000679 next(b, None)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000680 return izip(a, b)
681
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000682 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000683 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000684 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000685 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000686
Raymond Hettingera44327a2008-01-30 22:17:31 +0000687 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000688 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000689 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000690 pending = len(iterables)
691 nexts = cycle(iter(it).next for it in iterables)
692 while pending:
693 try:
694 for next in nexts:
695 yield next()
696 except StopIteration:
697 pending -= 1
698 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000699
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000700 def powerset(iterable):
Raymond Hettinger5cd012b2009-01-26 02:17:52 +0000701 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
702 s = list(iterable)
703 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000704
Raymond Hettingere8b4b602008-03-11 00:19:07 +0000705 def compress(data, selectors):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000706 "compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F"
707 return (d for d, s in izip(data, selectors) if s)
Raymond Hettinger33691672008-07-19 00:43:00 +0000708
Georg Brandl8c81fda2008-07-23 16:00:44 +0000709 def combinations_with_replacement(iterable, r):
Raymond Hettinger825758c2009-01-08 05:20:19 +0000710 "combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC"
711 # number items returned: (n+r-1)! / r! / (n-1)!
Georg Brandl8c81fda2008-07-23 16:00:44 +0000712 pool = tuple(iterable)
713 n = len(pool)
Raymond Hettingercdc9f2c2009-01-27 06:38:00 +0000714 if not n and r:
715 return
Georg Brandl8c81fda2008-07-23 16:00:44 +0000716 indices = [0] * r
717 yield tuple(pool[i] for i in indices)
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000718 while True:
Georg Brandl8c81fda2008-07-23 16:00:44 +0000719 for i in reversed(range(r)):
720 if indices[i] != n - 1:
721 break
722 else:
723 return
724 indices[i:] = [indices[i] + 1] * (r - i)
725 yield tuple(pool[i] for i in indices)
Raymond Hettingerd8461652009-01-02 21:20:38 +0000726
Raymond Hettingerd187d522009-02-23 21:23:04 +0000727 def powerset(iterable):
728 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
729 s = list(iterable)
730 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettingerd8461652009-01-02 21:20:38 +0000731
Raymond Hettingerd187d522009-02-23 21:23:04 +0000732 def unique_everseen(iterable, key=None):
733 "List unique elements, preserving order. Remember all elements ever seen."
734 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
735 # unique_everseen('ABBCcAD', str.lower) --> A B C D
736 seen = set()
737 seen_add = seen.add
738 if key is None:
739 for element in iterable:
740 if element not in seen:
741 seen_add(element)
742 yield element
743 else:
744 for element in iterable:
745 k = key(element)
746 if k not in seen:
747 seen_add(k)
748 yield element
749
750 def unique_justseen(iterable, key=None):
751 "List unique elements, preserving order. Remember only the element just seen."
752 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
753 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
754 return imap(next, imap(itemgetter(1), groupby(iterable, key)))