blob: c0a13050e6f4dbe7cd2937fe87ad27be1272ca1d [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Raymond Hettingerd187d522009-02-23 21:23:04 +000017This module implements a number of :term:`iterator` building blocks inspired
18by constructs from APL, Haskell, and SML. Each has been recast in a form
19suitable for Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000020
21The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettingerd187d522009-02-23 21:23:04 +000022useful by themselves or in combination. Together, they form an "iterator
23algebra" making it possible to construct specialized tools succinctly and
24efficiently in pure Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000025
26For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
27sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
Raymond Hettingerd187d522009-02-23 21:23:04 +000028:func:`count` which can be combined to form ``imap(f, count())`` to produce an
Georg Brandl8ec7f652007-08-15 14:28:01 +000029equivalent result.
30
Raymond Hettinger2b458192009-03-12 00:35:05 +000031These tools and their built-in counterparts also work well with the high-speed
32functions in the :mod:`operator` module. For example, the multiplication
33operator can be mapped across two vectors to form an efficient dot-product:
34``sum(imap(operator.mul, vector1, vector2))``.
Georg Brandl8ec7f652007-08-15 14:28:01 +000035
Georg Brandl8ec7f652007-08-15 14:28:01 +000036
Raymond Hettingerd187d522009-02-23 21:23:04 +000037**Infinite Iterators:**
Georg Brandl8ec7f652007-08-15 14:28:01 +000038
Raymond Hettingerd187d522009-02-23 21:23:04 +000039 ================== ================= =================================================
40 Iterator Arguments Results
41 ================== ================= =================================================
42 :func:`count` start start, start+1, start+2, ...
43 :func:`cycle` p p0, p1, ... plast, p0, p1, ...
44 :func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times
45 ================== ================= =================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000046
Raymond Hettingerd187d522009-02-23 21:23:04 +000047**Iterators terminating on the shortest input sequence:**
48
49 ==================== ============================ =================================================
50 Iterator Arguments Results
51 ==================== ============================ =================================================
52 :func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ...
53 :func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails
54 :func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
55 :func:`ifilter` pred, seq elements of seq where pred(elem) is True
56 :func:`ifilterfalse` pred, seq elements of seq where pred(elem) is False
57 :func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step]
Raymond Hettingerd31d2942009-03-09 11:39:03 +000058 :func:`imap` func, p, q, ... func(p0, q0), func(p1, q1), ...
Raymond Hettinger8ef4e782009-03-10 13:05:40 +000059 :func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ...
Raymond Hettingerd187d522009-02-23 21:23:04 +000060 :func:`tee` it, n it1, it2 , ... itn splits one iterator into n
61 :func:`takewhile` pred, seq seq[0], seq[1], until pred fails
62 :func:`izip` p, q, ... (p[0], q[0]), (p[1], q[1]), ...
63 :func:`izip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ...
64 ==================== ============================ =================================================
65
66**Combinatoric generators:**
67
68 ===================================== ==================== =================================================
69 Iterator Arguments Results
70 ===================================== ==================== =================================================
71 :func:`product` p, q, ... [repeat=1] cartesian product
72 :func:`permutations` p[, r] r-length permutations (without repeated elements)
73 :func:`combinations` p[, r] r-length combinations (sorted and no repeats)
74 ===================================== ==================== =================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000075
76
77.. _itertools-functions:
78
79Itertool functions
80------------------
81
82The following module functions all construct and return iterators. Some provide
83streams of infinite length, so they should only be accessed by functions or
84loops that truncate the stream.
85
86
87.. function:: chain(*iterables)
88
89 Make an iterator that returns elements from the first iterable until it is
90 exhausted, then proceeds to the next iterable, until all of the iterables are
91 exhausted. Used for treating consecutive sequences as a single sequence.
92 Equivalent to::
93
94 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000095 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000096 for it in iterables:
97 for element in it:
98 yield element
99
100
Raymond Hettinger330958e2008-02-28 19:41:24 +0000101.. function:: itertools.chain.from_iterable(iterable)
102
Georg Brandl734373c2009-01-03 21:55:17 +0000103 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e2008-02-28 19:41:24 +0000104 single iterable argument that is evaluated lazily. Equivalent to::
105
106 @classmethod
107 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000108 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +0000109 for it in iterables:
110 for element in it:
111 yield element
112
113 .. versionadded:: 2.6
114
Raymond Hettingerd553d852008-03-04 04:17:08 +0000115
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000116.. function:: combinations(iterable, r)
117
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000118 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000119
Georg Brandl734373c2009-01-03 21:55:17 +0000120 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000121 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl734373c2009-01-03 21:55:17 +0000122 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000123
124 Elements are treated as unique based on their position, not on their
125 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000126 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000127
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000128 Equivalent to::
129
130 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000131 # combinations('ABCD', 2) --> AB AC AD BC BD CD
132 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000133 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000134 n = len(pool)
Raymond Hettinger825758c2009-01-08 05:20:19 +0000135 if r > n:
136 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000137 indices = range(r)
138 yield tuple(pool[i] for i in indices)
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000139 while True:
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000140 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000141 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000142 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000143 else:
144 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000145 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000146 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000147 indices[j] = indices[j-1] + 1
148 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000149
Raymond Hettingerd553d852008-03-04 04:17:08 +0000150 The code for :func:`combinations` can be also expressed as a subsequence
151 of :func:`permutations` after filtering entries where the elements are not
152 in sorted order (according to their position in the input pool)::
153
154 def combinations(iterable, r):
155 pool = tuple(iterable)
156 n = len(pool)
157 for indices in permutations(range(n), r):
158 if sorted(indices) == list(indices):
159 yield tuple(pool[i] for i in indices)
160
Raymond Hettinger825758c2009-01-08 05:20:19 +0000161 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
162 or zero when ``r > n``.
163
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000164 .. versionadded:: 2.6
165
Georg Brandl8ec7f652007-08-15 14:28:01 +0000166.. function:: count([n])
167
168 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettinger50e90e22007-10-04 00:20:27 +0000169 specified *n* defaults to zero. Often used as an argument to :func:`imap` to
170 generate consecutive data points. Also, used with :func:`izip` to add sequence
171 numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000172
173 def count(n=0):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000174 # count(10) --> 10 11 12 13 14 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000175 while True:
176 yield n
177 n += 1
178
Georg Brandl8ec7f652007-08-15 14:28:01 +0000179
180.. function:: cycle(iterable)
181
182 Make an iterator returning elements from the iterable and saving a copy of each.
183 When the iterable is exhausted, return elements from the saved copy. Repeats
184 indefinitely. Equivalent to::
185
186 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000187 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000188 saved = []
189 for element in iterable:
190 yield element
191 saved.append(element)
192 while saved:
193 for element in saved:
194 yield element
195
196 Note, this member of the toolkit may require significant auxiliary storage
197 (depending on the length of the iterable).
198
199
200.. function:: dropwhile(predicate, iterable)
201
202 Make an iterator that drops elements from the iterable as long as the predicate
203 is true; afterwards, returns every element. Note, the iterator does not produce
204 *any* output until the predicate first becomes false, so it may have a lengthy
205 start-up time. Equivalent to::
206
207 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000208 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000209 iterable = iter(iterable)
210 for x in iterable:
211 if not predicate(x):
212 yield x
213 break
214 for x in iterable:
215 yield x
216
217
218.. function:: groupby(iterable[, key])
219
220 Make an iterator that returns consecutive keys and groups from the *iterable*.
221 The *key* is a function computing a key value for each element. If not
222 specified or is ``None``, *key* defaults to an identity function and returns
223 the element unchanged. Generally, the iterable needs to already be sorted on
224 the same key function.
225
226 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
227 generates a break or new group every time the value of the key function changes
228 (which is why it is usually necessary to have sorted the data using the same key
229 function). That behavior differs from SQL's GROUP BY which aggregates common
230 elements regardless of their input order.
231
232 The returned group is itself an iterator that shares the underlying iterable
233 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
234 object is advanced, the previous group is no longer visible. So, if that data
235 is needed later, it should be stored as a list::
236
237 groups = []
238 uniquekeys = []
239 data = sorted(data, key=keyfunc)
240 for k, g in groupby(data, keyfunc):
241 groups.append(list(g)) # Store group iterator as a list
242 uniquekeys.append(k)
243
244 :func:`groupby` is equivalent to::
245
246 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000247 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettinger11485b42009-02-04 19:34:31 +0000248 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000249 def __init__(self, iterable, key=None):
250 if key is None:
251 key = lambda x: x
252 self.keyfunc = key
253 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000254 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000255 def __iter__(self):
256 return self
257 def next(self):
258 while self.currkey == self.tgtkey:
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000259 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000260 self.currkey = self.keyfunc(self.currvalue)
261 self.tgtkey = self.currkey
262 return (self.currkey, self._grouper(self.tgtkey))
263 def _grouper(self, tgtkey):
264 while self.currkey == tgtkey:
265 yield self.currvalue
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000266 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000267 self.currkey = self.keyfunc(self.currvalue)
268
269 .. versionadded:: 2.4
270
271
272.. function:: ifilter(predicate, iterable)
273
274 Make an iterator that filters elements from iterable returning only those for
275 which the predicate is ``True``. If *predicate* is ``None``, return the items
276 that are true. Equivalent to::
277
278 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000279 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000280 if predicate is None:
281 predicate = bool
282 for x in iterable:
283 if predicate(x):
284 yield x
285
286
287.. function:: ifilterfalse(predicate, iterable)
288
289 Make an iterator that filters elements from iterable returning only those for
290 which the predicate is ``False``. If *predicate* is ``None``, return the items
291 that are false. Equivalent to::
292
293 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000294 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000295 if predicate is None:
296 predicate = bool
297 for x in iterable:
298 if not predicate(x):
299 yield x
300
301
302.. function:: imap(function, *iterables)
303
304 Make an iterator that computes the function using arguments from each of the
305 iterables. If *function* is set to ``None``, then :func:`imap` returns the
306 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
307 exhausted instead of filling in ``None`` for shorter iterables. The reason for
308 the difference is that infinite iterator arguments are typically an error for
309 :func:`map` (because the output is fully evaluated) but represent a common and
310 useful way of supplying arguments to :func:`imap`. Equivalent to::
311
312 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000313 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000314 iterables = map(iter, iterables)
315 while True:
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000316 args = [next(it) for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000317 if function is None:
318 yield tuple(args)
319 else:
320 yield function(*args)
321
322
323.. function:: islice(iterable, [start,] stop [, step])
324
325 Make an iterator that returns selected elements from the iterable. If *start* is
326 non-zero, then elements from the iterable are skipped until start is reached.
327 Afterward, elements are returned consecutively unless *step* is set higher than
328 one which results in items being skipped. If *stop* is ``None``, then iteration
329 continues until the iterator is exhausted, if at all; otherwise, it stops at the
330 specified position. Unlike regular slicing, :func:`islice` does not support
331 negative values for *start*, *stop*, or *step*. Can be used to extract related
332 fields from data where the internal structure has been flattened (for example, a
333 multi-line report may list a name field on every third line). Equivalent to::
334
335 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000336 # islice('ABCDEFG', 2) --> A B
337 # islice('ABCDEFG', 2, 4) --> C D
338 # islice('ABCDEFG', 2, None) --> C D E F G
339 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000340 s = slice(*args)
341 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000342 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000343 for i, element in enumerate(iterable):
344 if i == nexti:
345 yield element
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000346 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000347
348 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
349 then the step defaults to one.
350
351 .. versionchanged:: 2.5
352 accept ``None`` values for default *start* and *step*.
353
354
355.. function:: izip(*iterables)
356
357 Make an iterator that aggregates elements from each of the iterables. Like
358 :func:`zip` except that it returns an iterator instead of a list. Used for
359 lock-step iteration over several iterables at a time. Equivalent to::
360
361 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000362 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000363 iterables = map(iter, iterables)
364 while iterables:
Raymond Hettinger5894c2b2009-02-19 05:38:53 +0000365 yield yield tuple(map(next, iterables))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000366
367 .. versionchanged:: 2.4
368 When no iterables are specified, returns a zero length iterator instead of
369 raising a :exc:`TypeError` exception.
370
Raymond Hettinger48c62932008-01-22 19:51:41 +0000371 The left-to-right evaluation order of the iterables is guaranteed. This
372 makes possible an idiom for clustering a data series into n-length groups
373 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000374
Raymond Hettinger48c62932008-01-22 19:51:41 +0000375 :func:`izip` should only be used with unequal length inputs when you don't
376 care about trailing, unmatched values from the longer iterables. If those
377 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000378
379
380.. function:: izip_longest(*iterables[, fillvalue])
381
382 Make an iterator that aggregates elements from each of the iterables. If the
383 iterables are of uneven length, missing values are filled-in with *fillvalue*.
384 Iteration continues until the longest iterable is exhausted. Equivalent to::
385
386 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000387 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000388 fillvalue = kwds.get('fillvalue')
389 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
390 yield counter() # yields the fillvalue, or raises IndexError
391 fillers = repeat(fillvalue)
392 iters = [chain(it, sentinel(), fillers) for it in args]
393 try:
394 for tup in izip(*iters):
395 yield tup
396 except IndexError:
397 pass
398
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000399 If one of the iterables is potentially infinite, then the
400 :func:`izip_longest` function should be wrapped with something that limits
401 the number of calls (for example :func:`islice` or :func:`takewhile`). If
402 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000403
404 .. versionadded:: 2.6
405
Raymond Hettinger330958e2008-02-28 19:41:24 +0000406.. function:: permutations(iterable[, r])
407
408 Return successive *r* length permutations of elements in the *iterable*.
409
410 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl734373c2009-01-03 21:55:17 +0000411 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e2008-02-28 19:41:24 +0000412 are generated.
413
Georg Brandl734373c2009-01-03 21:55:17 +0000414 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e2008-02-28 19:41:24 +0000415 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl734373c2009-01-03 21:55:17 +0000416 in sorted order.
Raymond Hettinger330958e2008-02-28 19:41:24 +0000417
418 Elements are treated as unique based on their position, not on their
419 value. So if the input elements are unique, there will be no repeat
420 values in each permutation.
421
Raymond Hettingerf287f172008-03-02 10:59:31 +0000422 Equivalent to::
423
424 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000425 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
426 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000427 pool = tuple(iterable)
428 n = len(pool)
429 r = n if r is None else r
Raymond Hettinger825758c2009-01-08 05:20:19 +0000430 if r > n:
431 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000432 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000433 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000434 yield tuple(pool[i] for i in indices[:r])
435 while n:
436 for i in reversed(range(r)):
437 cycles[i] -= 1
438 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000439 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000440 cycles[i] = n - i
441 else:
442 j = cycles[i]
443 indices[i], indices[-j] = indices[-j], indices[i]
444 yield tuple(pool[i] for i in indices[:r])
445 break
446 else:
447 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000448
Georg Brandl734373c2009-01-03 21:55:17 +0000449 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000450 :func:`product`, filtered to exclude entries with repeated elements (those
451 from the same position in the input pool)::
452
453 def permutations(iterable, r=None):
454 pool = tuple(iterable)
455 n = len(pool)
456 r = n if r is None else r
457 for indices in product(range(n), repeat=r):
458 if len(set(indices)) == r:
459 yield tuple(pool[i] for i in indices)
460
Raymond Hettinger825758c2009-01-08 05:20:19 +0000461 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
462 or zero when ``r > n``.
463
Raymond Hettinger330958e2008-02-28 19:41:24 +0000464 .. versionadded:: 2.6
465
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000466.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000467
468 Cartesian product of input iterables.
469
470 Equivalent to nested for-loops in a generator expression. For example,
471 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
472
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000473 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000474 on every iteration. This pattern creates a lexicographic ordering so that if
475 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000476 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000477
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000478 To compute the product of an iterable with itself, specify the number of
479 repetitions with the optional *repeat* keyword argument. For example,
480 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
481
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000482 This function is equivalent to the following code, except that the
483 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000484
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000485 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000486 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
487 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000488 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000489 result = [[]]
490 for pool in pools:
491 result = [x+[y] for x in result for y in pool]
492 for prod in result:
493 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000494
495 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000496
497.. function:: repeat(object[, times])
498
499 Make an iterator that returns *object* over and over again. Runs indefinitely
500 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000501 invariant function parameters. Also used with :func:`izip` to create constant
502 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000503
504 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000505 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000506 if times is None:
507 while True:
508 yield object
509 else:
510 for i in xrange(times):
511 yield object
512
513
514.. function:: starmap(function, iterable)
515
Raymond Hettinger47317092008-01-17 03:02:14 +0000516 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000517 the iterable. Used instead of :func:`imap` when argument parameters are already
518 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
519 difference between :func:`imap` and :func:`starmap` parallels the distinction
520 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
521
522 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000523 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000524 for args in iterable:
525 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000526
Raymond Hettinger47317092008-01-17 03:02:14 +0000527 .. versionchanged:: 2.6
528 Previously, :func:`starmap` required the function arguments to be tuples.
529 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000530
531.. function:: takewhile(predicate, iterable)
532
533 Make an iterator that returns elements from the iterable as long as the
534 predicate is true. Equivalent to::
535
536 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000537 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000538 for x in iterable:
539 if predicate(x):
540 yield x
541 else:
542 break
543
544
545.. function:: tee(iterable[, n=2])
546
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000547 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000548
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000549 def tee(iterable, n=2):
550 it = iter(iterable)
551 deques = [collections.deque() for i in range(n)]
552 def gen(mydeque):
553 while True:
554 if not mydeque: # when the local deque is empty
555 newval = next(it) # fetch a new value and
556 for d in deques: # load it to all the deques
557 d.append(newval)
558 yield mydeque.popleft()
559 return tuple(gen(d) for d in deques)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000560
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000561 Once :func:`tee` has made a split, the original *iterable* should not be
562 used anywhere else; otherwise, the *iterable* could get advanced without
563 the tee objects being informed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000564
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000565 This itertool may require significant auxiliary storage (depending on how
566 much temporary data needs to be stored). In general, if one iterator uses
567 most or all of the data before another iterator starts, it is faster to use
568 :func:`list` instead of :func:`tee`.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000569
570 .. versionadded:: 2.4
571
572
573.. _itertools-example:
574
575Examples
576--------
577
578The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000579can be combined.
580
581.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000582
Georg Brandl734373c2009-01-03 21:55:17 +0000583 >>> # Show a dictionary sorted and grouped by value
Georg Brandl8ec7f652007-08-15 14:28:01 +0000584 >>> from operator import itemgetter
585 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
586 >>> di = sorted(d.iteritems(), key=itemgetter(1))
587 >>> for k, g in groupby(di, key=itemgetter(1)):
588 ... print k, map(itemgetter(0), g)
589 ...
590 1 ['a', 'c', 'e']
591 2 ['b', 'd', 'f']
592 3 ['g']
593
Georg Brandl734373c2009-01-03 21:55:17 +0000594 >>> # Find runs of consecutive numbers using groupby. The key to the solution
595 >>> # is differencing with a range so that consecutive numbers all appear in
596 >>> # same group.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000597 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
598 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000599 ... print map(itemgetter(1), g)
Georg Brandl734373c2009-01-03 21:55:17 +0000600 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000601 [1]
602 [4, 5, 6]
603 [10]
604 [15, 16, 17, 18]
605 [22]
606 [25, 26, 27, 28]
607
608
609
610.. _itertools-recipes:
611
612Recipes
613-------
614
615This section shows recipes for creating an extended toolset using the existing
616itertools as building blocks.
617
618The extended tools offer the same high performance as the underlying toolset.
619The superior memory performance is kept by processing elements one at a time
620rather than bringing the whole iterable into memory all at once. Code volume is
621kept small by linking the tools together in a functional style which helps
622eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000623"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000624which incur interpreter overhead.
625
626.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000627
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000628 def take(n, iterable):
629 "Return first n items of the iterable as a list"
630 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000631
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000632 def enumerate(iterable, start=0):
633 return izip(count(start), iterable)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000634
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000635 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000636 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000637 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000638
Raymond Hettingerfb30cca2009-03-09 11:58:33 +0000639 def consume(iterator, n):
640 "Advance the iterator n-steps ahead. If n is none, consume entirely."
641 collections.deque(islice(iterator, n), maxlen=0)
642
Raymond Hettinger5894c2b2009-02-19 05:38:53 +0000643 def nth(iterable, n, default=None):
644 "Returns the nth item or a default value"
645 return next(islice(iterable, n, None), default)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000646
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000647 def quantify(iterable, pred=bool):
648 "Count how many times the predicate is true"
649 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000650
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000651 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000652 """Returns the sequence elements and then returns None indefinitely.
653
654 Useful for emulating the behavior of the built-in map() function.
655 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000656 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000657
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000658 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000659 "Returns the sequence elements n times"
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000660 return chain.from_iterable(repeat(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000661
662 def dotproduct(vec1, vec2):
663 return sum(imap(operator.mul, vec1, vec2))
664
665 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000666 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000667
668 def repeatfunc(func, times=None, *args):
669 """Repeat calls to func with specified arguments.
670
671 Example: repeatfunc(random.random)
672 """
673 if times is None:
674 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000675 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000676
677 def pairwise(iterable):
678 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
679 a, b = tee(iterable)
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000680 next(b, None)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000681 return izip(a, b)
682
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000683 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000684 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000685 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000686 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000687
Raymond Hettingera44327a2008-01-30 22:17:31 +0000688 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000689 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000690 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000691 pending = len(iterables)
692 nexts = cycle(iter(it).next for it in iterables)
693 while pending:
694 try:
695 for next in nexts:
696 yield next()
697 except StopIteration:
698 pending -= 1
699 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000700
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000701 def powerset(iterable):
Raymond Hettinger5cd012b2009-01-26 02:17:52 +0000702 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
703 s = list(iterable)
704 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000705
Raymond Hettingere8b4b602008-03-11 00:19:07 +0000706 def compress(data, selectors):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000707 "compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F"
708 return (d for d, s in izip(data, selectors) if s)
Raymond Hettinger33691672008-07-19 00:43:00 +0000709
Georg Brandl8c81fda2008-07-23 16:00:44 +0000710 def combinations_with_replacement(iterable, r):
Raymond Hettinger825758c2009-01-08 05:20:19 +0000711 "combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC"
712 # number items returned: (n+r-1)! / r! / (n-1)!
Georg Brandl8c81fda2008-07-23 16:00:44 +0000713 pool = tuple(iterable)
714 n = len(pool)
Raymond Hettingercdc9f2c2009-01-27 06:38:00 +0000715 if not n and r:
716 return
Georg Brandl8c81fda2008-07-23 16:00:44 +0000717 indices = [0] * r
718 yield tuple(pool[i] for i in indices)
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000719 while True:
Georg Brandl8c81fda2008-07-23 16:00:44 +0000720 for i in reversed(range(r)):
721 if indices[i] != n - 1:
722 break
723 else:
724 return
725 indices[i:] = [indices[i] + 1] * (r - i)
726 yield tuple(pool[i] for i in indices)
Raymond Hettingerd8461652009-01-02 21:20:38 +0000727
Raymond Hettingerd187d522009-02-23 21:23:04 +0000728 def powerset(iterable):
729 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
730 s = list(iterable)
731 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettingerd8461652009-01-02 21:20:38 +0000732
Raymond Hettingerd187d522009-02-23 21:23:04 +0000733 def unique_everseen(iterable, key=None):
734 "List unique elements, preserving order. Remember all elements ever seen."
735 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
736 # unique_everseen('ABBCcAD', str.lower) --> A B C D
737 seen = set()
738 seen_add = seen.add
739 if key is None:
740 for element in iterable:
741 if element not in seen:
742 seen_add(element)
743 yield element
744 else:
745 for element in iterable:
746 k = key(element)
747 if k not in seen:
748 seen_add(k)
749 yield element
750
751 def unique_justseen(iterable, key=None):
752 "List unique elements, preserving order. Remember only the element just seen."
753 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
754 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
755 return imap(next, imap(itemgetter(1), groupby(iterable, key)))