blob: 0b7907295d6f1f76c9d4f8e5b6ef8027da7b2982 [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Raymond Hettinger0aee9422009-02-17 11:00:27 +000017This module implements a number of :term:`iterator` building blocks inspired
18by constructs from APL, Haskell, and SML. Each has been recast in a form
19suitable for Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000020
21The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettinger0aee9422009-02-17 11:00:27 +000022useful by themselves or in combination. Together, they form an "iterator
23algebra" making it possible to construct specialized tools succinctly and
24efficiently in pure Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000025
26For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
27sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
Raymond Hettinger0aee9422009-02-17 11:00:27 +000028:func:`count` which can be combined to form ``imap(f, count())`` to produce an
Georg Brandl8ec7f652007-08-15 14:28:01 +000029equivalent result.
30
Raymond Hettingerefa7c132009-03-12 00:31:58 +000031These tools and their built-in counterparts also work well with the high-speed
32functions in the :mod:`operator` module. For example, the multiplication
33operator can be mapped across two vectors to form an efficient dot-product:
34``sum(imap(operator.mul, vector1, vector2))``.
Georg Brandl8ec7f652007-08-15 14:28:01 +000035
Georg Brandl8ec7f652007-08-15 14:28:01 +000036
Raymond Hettinger0aee9422009-02-17 11:00:27 +000037**Infinite Iterators:**
Georg Brandl8ec7f652007-08-15 14:28:01 +000038
Raymond Hettinger0aee9422009-02-17 11:00:27 +000039 ================== ================= =================================================
40 Iterator Arguments Results
41 ================== ================= =================================================
42 :func:`count` start, [step] start, start+step, start+2*step, ...
43 :func:`cycle` p p0, p1, ... plast, p0, p1, ...
44 :func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times
45 ================== ================= =================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000046
Raymond Hettinger0aee9422009-02-17 11:00:27 +000047**Iterators terminating on the shortest input sequence:**
48
49 ==================== ============================ =================================================
50 Iterator Arguments Results
51 ==================== ============================ =================================================
52 :func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ...
53 :func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ...
54 :func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails
55 :func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
56 :func:`ifilter` pred, seq elements of seq where pred(elem) is True
57 :func:`ifilterfalse` pred, seq elements of seq where pred(elem) is False
58 :func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step]
Raymond Hettingerfed84c72009-03-09 11:31:39 +000059 :func:`imap` func, p, q, ... func(p0, q0), func(p1, q1), ...
Raymond Hettinger8f195982009-03-10 13:04:30 +000060 :func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ...
Raymond Hettinger0aee9422009-02-17 11:00:27 +000061 :func:`tee` it, n it1, it2 , ... itn splits one iterator into n
62 :func:`takewhile` pred, seq seq[0], seq[1], until pred fails
63 :func:`izip` p, q, ... (p[0], q[0]), (p[1], q[1]), ...
64 :func:`izip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ...
65 ==================== ============================ =================================================
66
67**Combinatoric generators:**
68
69 ===================================== ==================== =================================================
70 Iterator Arguments Results
71 ===================================== ==================== =================================================
72 :func:`product` p, q, ... [repeat=1] cartesian product
73 :func:`permutations` p[, r] r-length permutations (without repeated elements)
74 :func:`combinations` p[, r] r-length combinations (sorted and no repeats)
75 :func:`combinations_with_replacement` p[, r] r-length combinations (sorted but with repeats)
76 ===================================== ==================== =================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000077
78
79.. _itertools-functions:
80
81Itertool functions
82------------------
83
84The following module functions all construct and return iterators. Some provide
85streams of infinite length, so they should only be accessed by functions or
86loops that truncate the stream.
87
88
89.. function:: chain(*iterables)
90
91 Make an iterator that returns elements from the first iterable until it is
92 exhausted, then proceeds to the next iterable, until all of the iterables are
93 exhausted. Used for treating consecutive sequences as a single sequence.
94 Equivalent to::
95
96 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000097 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000098 for it in iterables:
99 for element in it:
100 yield element
101
102
Raymond Hettinger330958e2008-02-28 19:41:24 +0000103.. function:: itertools.chain.from_iterable(iterable)
104
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000105 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e2008-02-28 19:41:24 +0000106 single iterable argument that is evaluated lazily. Equivalent to::
107
108 @classmethod
109 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000110 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +0000111 for it in iterables:
112 for element in it:
113 yield element
114
115 .. versionadded:: 2.6
116
Raymond Hettingerd553d852008-03-04 04:17:08 +0000117
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000118.. function:: combinations(iterable, r)
119
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000120 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000121
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000122 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000123 input *iterable* is sorted, the combination tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000124 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000125
126 Elements are treated as unique based on their position, not on their
127 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000128 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000129
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000130 Equivalent to::
131
132 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000133 # combinations('ABCD', 2) --> AB AC AD BC BD CD
134 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000135 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000136 n = len(pool)
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000137 if r > n:
138 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000139 indices = range(r)
140 yield tuple(pool[i] for i in indices)
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000141 while True:
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000142 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000143 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000144 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000145 else:
146 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000147 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000148 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000149 indices[j] = indices[j-1] + 1
150 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000151
Raymond Hettingerd553d852008-03-04 04:17:08 +0000152 The code for :func:`combinations` can be also expressed as a subsequence
153 of :func:`permutations` after filtering entries where the elements are not
154 in sorted order (according to their position in the input pool)::
155
156 def combinations(iterable, r):
157 pool = tuple(iterable)
158 n = len(pool)
159 for indices in permutations(range(n), r):
160 if sorted(indices) == list(indices):
161 yield tuple(pool[i] for i in indices)
162
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000163 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
164 or zero when ``r > n``.
165
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000166 .. versionadded:: 2.6
167
Raymond Hettingerd081abc2009-01-27 02:58:49 +0000168.. function:: combinations_with_replacement(iterable, r)
169
170 Return *r* length subsequences of elements from the input *iterable*
171 allowing individual elements to be repeated more than once.
172
173 Combinations are emitted in lexicographic sort order. So, if the
174 input *iterable* is sorted, the combination tuples will be produced
175 in sorted order.
176
177 Elements are treated as unique based on their position, not on their
178 value. So if the input elements are unique, the generated combinations
179 will also be unique.
180
181 Equivalent to::
182
183 def combinations_with_replacement(iterable, r):
184 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
185 pool = tuple(iterable)
186 n = len(pool)
187 if not n and r:
188 return
189 indices = [0] * r
190 yield tuple(pool[i] for i in indices)
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000191 while True:
Raymond Hettingerd081abc2009-01-27 02:58:49 +0000192 for i in reversed(range(r)):
193 if indices[i] != n - 1:
194 break
195 else:
196 return
197 indices[i:] = [indices[i] + 1] * (r - i)
198 yield tuple(pool[i] for i in indices)
199
200 The code for :func:`combinations_with_replacement` can be also expressed as
201 a subsequence of :func:`product` after filtering entries where the elements
202 are not in sorted order (according to their position in the input pool)::
203
204 def combinations_with_replacement(iterable, r):
205 pool = tuple(iterable)
206 n = len(pool)
207 for indices in product(range(n), repeat=r):
208 if sorted(indices) == list(indices):
209 yield tuple(pool[i] for i in indices)
210
211 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
212
213 .. versionadded:: 2.7
214
Raymond Hettinger2bcb8e92009-01-25 21:04:14 +0000215.. function:: compress(data, selectors)
216
217 Make an iterator that filters elements from *data* returning only those that
218 have a corresponding element in *selectors* that evaluates to ``True``.
Andrew M. Kuchlingefa97712009-03-30 23:08:24 +0000219 Stops when either the *data* or *selectors* iterables has been exhausted.
Raymond Hettinger2bcb8e92009-01-25 21:04:14 +0000220 Equivalent to::
221
222 def compress(data, selectors):
223 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
224 return (d for d, s in izip(data, selectors) if s)
225
226 .. versionadded:: 2.7
227
228
Raymond Hettingera4038032009-02-14 00:25:51 +0000229.. function:: count(start=0, step=1)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000230
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000231 Make an iterator that returns evenly spaced values starting with *n*. Often
232 used as an argument to :func:`imap` to generate consecutive data points.
233 Also, used with :func:`izip` to add sequence numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000234
Raymond Hettingera4038032009-02-14 00:25:51 +0000235 def count(start=0, step=1):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000236 # count(10) --> 10 11 12 13 14 ...
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000237 # count(2.5, 0.5) -> 3.5 3.0 4.5 ...
Raymond Hettingera4038032009-02-14 00:25:51 +0000238 n = start
Georg Brandl8ec7f652007-08-15 14:28:01 +0000239 while True:
240 yield n
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000241 n += step
Georg Brandl8ec7f652007-08-15 14:28:01 +0000242
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000243 .. versionchanged:: 2.7
244 added *step* argument and allowed non-integer arguments.
245
Georg Brandl8ec7f652007-08-15 14:28:01 +0000246.. function:: cycle(iterable)
247
248 Make an iterator returning elements from the iterable and saving a copy of each.
249 When the iterable is exhausted, return elements from the saved copy. Repeats
250 indefinitely. Equivalent to::
251
252 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000253 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000254 saved = []
255 for element in iterable:
256 yield element
257 saved.append(element)
258 while saved:
259 for element in saved:
260 yield element
261
262 Note, this member of the toolkit may require significant auxiliary storage
263 (depending on the length of the iterable).
264
265
266.. function:: dropwhile(predicate, iterable)
267
268 Make an iterator that drops elements from the iterable as long as the predicate
269 is true; afterwards, returns every element. Note, the iterator does not produce
270 *any* output until the predicate first becomes false, so it may have a lengthy
271 start-up time. Equivalent to::
272
273 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000274 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000275 iterable = iter(iterable)
276 for x in iterable:
277 if not predicate(x):
278 yield x
279 break
280 for x in iterable:
281 yield x
282
283
284.. function:: groupby(iterable[, key])
285
286 Make an iterator that returns consecutive keys and groups from the *iterable*.
287 The *key* is a function computing a key value for each element. If not
288 specified or is ``None``, *key* defaults to an identity function and returns
289 the element unchanged. Generally, the iterable needs to already be sorted on
290 the same key function.
291
292 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
293 generates a break or new group every time the value of the key function changes
294 (which is why it is usually necessary to have sorted the data using the same key
295 function). That behavior differs from SQL's GROUP BY which aggregates common
296 elements regardless of their input order.
297
298 The returned group is itself an iterator that shares the underlying iterable
299 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
300 object is advanced, the previous group is no longer visible. So, if that data
301 is needed later, it should be stored as a list::
302
303 groups = []
304 uniquekeys = []
305 data = sorted(data, key=keyfunc)
306 for k, g in groupby(data, keyfunc):
307 groups.append(list(g)) # Store group iterator as a list
308 uniquekeys.append(k)
309
310 :func:`groupby` is equivalent to::
311
312 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000313 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettingerd507afd2009-02-04 10:52:32 +0000314 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000315 def __init__(self, iterable, key=None):
316 if key is None:
317 key = lambda x: x
318 self.keyfunc = key
319 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000320 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000321 def __iter__(self):
322 return self
323 def next(self):
324 while self.currkey == self.tgtkey:
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000325 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000326 self.currkey = self.keyfunc(self.currvalue)
327 self.tgtkey = self.currkey
328 return (self.currkey, self._grouper(self.tgtkey))
329 def _grouper(self, tgtkey):
330 while self.currkey == tgtkey:
331 yield self.currvalue
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000332 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000333 self.currkey = self.keyfunc(self.currvalue)
334
335 .. versionadded:: 2.4
336
337
338.. function:: ifilter(predicate, iterable)
339
340 Make an iterator that filters elements from iterable returning only those for
341 which the predicate is ``True``. If *predicate* is ``None``, return the items
342 that are true. Equivalent to::
343
344 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000345 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000346 if predicate is None:
347 predicate = bool
348 for x in iterable:
349 if predicate(x):
350 yield x
351
352
353.. function:: ifilterfalse(predicate, iterable)
354
355 Make an iterator that filters elements from iterable returning only those for
356 which the predicate is ``False``. If *predicate* is ``None``, return the items
357 that are false. Equivalent to::
358
359 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000360 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000361 if predicate is None:
362 predicate = bool
363 for x in iterable:
364 if not predicate(x):
365 yield x
366
367
368.. function:: imap(function, *iterables)
369
370 Make an iterator that computes the function using arguments from each of the
371 iterables. If *function* is set to ``None``, then :func:`imap` returns the
372 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
373 exhausted instead of filling in ``None`` for shorter iterables. The reason for
374 the difference is that infinite iterator arguments are typically an error for
375 :func:`map` (because the output is fully evaluated) but represent a common and
376 useful way of supplying arguments to :func:`imap`. Equivalent to::
377
378 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000379 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000380 iterables = map(iter, iterables)
381 while True:
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000382 args = [next(it) for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000383 if function is None:
384 yield tuple(args)
385 else:
386 yield function(*args)
387
388
389.. function:: islice(iterable, [start,] stop [, step])
390
391 Make an iterator that returns selected elements from the iterable. If *start* is
392 non-zero, then elements from the iterable are skipped until start is reached.
393 Afterward, elements are returned consecutively unless *step* is set higher than
394 one which results in items being skipped. If *stop* is ``None``, then iteration
395 continues until the iterator is exhausted, if at all; otherwise, it stops at the
396 specified position. Unlike regular slicing, :func:`islice` does not support
397 negative values for *start*, *stop*, or *step*. Can be used to extract related
398 fields from data where the internal structure has been flattened (for example, a
399 multi-line report may list a name field on every third line). Equivalent to::
400
401 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000402 # islice('ABCDEFG', 2) --> A B
403 # islice('ABCDEFG', 2, 4) --> C D
404 # islice('ABCDEFG', 2, None) --> C D E F G
405 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000406 s = slice(*args)
407 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000408 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000409 for i, element in enumerate(iterable):
410 if i == nexti:
411 yield element
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000412 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000413
414 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
415 then the step defaults to one.
416
417 .. versionchanged:: 2.5
418 accept ``None`` values for default *start* and *step*.
419
420
421.. function:: izip(*iterables)
422
423 Make an iterator that aggregates elements from each of the iterables. Like
424 :func:`zip` except that it returns an iterator instead of a list. Used for
425 lock-step iteration over several iterables at a time. Equivalent to::
426
427 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000428 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000429 iterables = map(iter, iterables)
430 while iterables:
Raymond Hettingerf9bce832009-02-19 05:34:35 +0000431 yield yield tuple(map(next, iterables))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000432
433 .. versionchanged:: 2.4
434 When no iterables are specified, returns a zero length iterator instead of
435 raising a :exc:`TypeError` exception.
436
Raymond Hettinger48c62932008-01-22 19:51:41 +0000437 The left-to-right evaluation order of the iterables is guaranteed. This
438 makes possible an idiom for clustering a data series into n-length groups
439 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000440
Raymond Hettinger48c62932008-01-22 19:51:41 +0000441 :func:`izip` should only be used with unequal length inputs when you don't
442 care about trailing, unmatched values from the longer iterables. If those
443 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000444
445
446.. function:: izip_longest(*iterables[, fillvalue])
447
448 Make an iterator that aggregates elements from each of the iterables. If the
449 iterables are of uneven length, missing values are filled-in with *fillvalue*.
450 Iteration continues until the longest iterable is exhausted. Equivalent to::
451
452 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000453 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000454 fillvalue = kwds.get('fillvalue')
455 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
456 yield counter() # yields the fillvalue, or raises IndexError
457 fillers = repeat(fillvalue)
458 iters = [chain(it, sentinel(), fillers) for it in args]
459 try:
460 for tup in izip(*iters):
461 yield tup
462 except IndexError:
463 pass
464
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000465 If one of the iterables is potentially infinite, then the
466 :func:`izip_longest` function should be wrapped with something that limits
467 the number of calls (for example :func:`islice` or :func:`takewhile`). If
468 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000469
470 .. versionadded:: 2.6
471
Raymond Hettinger330958e2008-02-28 19:41:24 +0000472.. function:: permutations(iterable[, r])
473
474 Return successive *r* length permutations of elements in the *iterable*.
475
476 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000477 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e2008-02-28 19:41:24 +0000478 are generated.
479
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000480 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e2008-02-28 19:41:24 +0000481 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000482 in sorted order.
Raymond Hettinger330958e2008-02-28 19:41:24 +0000483
484 Elements are treated as unique based on their position, not on their
485 value. So if the input elements are unique, there will be no repeat
486 values in each permutation.
487
Raymond Hettingerf287f172008-03-02 10:59:31 +0000488 Equivalent to::
489
490 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000491 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
492 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000493 pool = tuple(iterable)
494 n = len(pool)
495 r = n if r is None else r
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000496 if r > n:
497 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000498 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000499 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000500 yield tuple(pool[i] for i in indices[:r])
501 while n:
502 for i in reversed(range(r)):
503 cycles[i] -= 1
504 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000505 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000506 cycles[i] = n - i
507 else:
508 j = cycles[i]
509 indices[i], indices[-j] = indices[-j], indices[i]
510 yield tuple(pool[i] for i in indices[:r])
511 break
512 else:
513 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000514
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000515 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000516 :func:`product`, filtered to exclude entries with repeated elements (those
517 from the same position in the input pool)::
518
519 def permutations(iterable, r=None):
520 pool = tuple(iterable)
521 n = len(pool)
522 r = n if r is None else r
523 for indices in product(range(n), repeat=r):
524 if len(set(indices)) == r:
525 yield tuple(pool[i] for i in indices)
526
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000527 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
528 or zero when ``r > n``.
529
Raymond Hettinger330958e2008-02-28 19:41:24 +0000530 .. versionadded:: 2.6
531
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000532.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000533
534 Cartesian product of input iterables.
535
536 Equivalent to nested for-loops in a generator expression. For example,
537 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
538
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000539 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000540 on every iteration. This pattern creates a lexicographic ordering so that if
541 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000542 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000543
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000544 To compute the product of an iterable with itself, specify the number of
545 repetitions with the optional *repeat* keyword argument. For example,
546 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
547
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000548 This function is equivalent to the following code, except that the
549 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000550
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000551 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000552 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
553 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000554 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000555 result = [[]]
556 for pool in pools:
557 result = [x+[y] for x in result for y in pool]
558 for prod in result:
559 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000560
561 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000562
563.. function:: repeat(object[, times])
564
565 Make an iterator that returns *object* over and over again. Runs indefinitely
566 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000567 invariant function parameters. Also used with :func:`izip` to create constant
568 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000569
570 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000571 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000572 if times is None:
573 while True:
574 yield object
575 else:
576 for i in xrange(times):
577 yield object
578
579
580.. function:: starmap(function, iterable)
581
Raymond Hettinger47317092008-01-17 03:02:14 +0000582 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000583 the iterable. Used instead of :func:`imap` when argument parameters are already
584 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
585 difference between :func:`imap` and :func:`starmap` parallels the distinction
586 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
587
588 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000589 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000590 for args in iterable:
591 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000592
Raymond Hettinger47317092008-01-17 03:02:14 +0000593 .. versionchanged:: 2.6
594 Previously, :func:`starmap` required the function arguments to be tuples.
595 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000596
597.. function:: takewhile(predicate, iterable)
598
599 Make an iterator that returns elements from the iterable as long as the
600 predicate is true. Equivalent to::
601
602 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000603 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000604 for x in iterable:
605 if predicate(x):
606 yield x
607 else:
608 break
609
610
611.. function:: tee(iterable[, n=2])
612
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000613 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000614
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000615 def tee(iterable, n=2):
616 it = iter(iterable)
617 deques = [collections.deque() for i in range(n)]
618 def gen(mydeque):
619 while True:
620 if not mydeque: # when the local deque is empty
621 newval = next(it) # fetch a new value and
622 for d in deques: # load it to all the deques
623 d.append(newval)
624 yield mydeque.popleft()
625 return tuple(gen(d) for d in deques)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000626
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000627 Once :func:`tee` has made a split, the original *iterable* should not be
628 used anywhere else; otherwise, the *iterable* could get advanced without
629 the tee objects being informed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000630
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000631 This itertool may require significant auxiliary storage (depending on how
632 much temporary data needs to be stored). In general, if one iterator uses
633 most or all of the data before another iterator starts, it is faster to use
634 :func:`list` instead of :func:`tee`.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000635
636 .. versionadded:: 2.4
637
638
639.. _itertools-example:
640
641Examples
642--------
643
644The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000645can be combined.
646
647.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000648
Benjamin Peterson8ea99992009-01-01 16:43:12 +0000649 >>> # Show a dictionary sorted and grouped by value
Georg Brandl8ec7f652007-08-15 14:28:01 +0000650 >>> from operator import itemgetter
651 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
652 >>> di = sorted(d.iteritems(), key=itemgetter(1))
653 >>> for k, g in groupby(di, key=itemgetter(1)):
654 ... print k, map(itemgetter(0), g)
655 ...
656 1 ['a', 'c', 'e']
657 2 ['b', 'd', 'f']
658 3 ['g']
659
Benjamin Peterson8ea99992009-01-01 16:43:12 +0000660 >>> # Find runs of consecutive numbers using groupby. The key to the solution
661 >>> # is differencing with a range so that consecutive numbers all appear in
662 >>> # same group.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000663 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
664 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000665 ... print map(itemgetter(1), g)
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000666 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000667 [1]
668 [4, 5, 6]
669 [10]
670 [15, 16, 17, 18]
671 [22]
672 [25, 26, 27, 28]
673
674
675
676.. _itertools-recipes:
677
678Recipes
679-------
680
681This section shows recipes for creating an extended toolset using the existing
682itertools as building blocks.
683
684The extended tools offer the same high performance as the underlying toolset.
685The superior memory performance is kept by processing elements one at a time
686rather than bringing the whole iterable into memory all at once. Code volume is
687kept small by linking the tools together in a functional style which helps
688eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000689"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000690which incur interpreter overhead.
691
692.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000693
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000694 def take(n, iterable):
695 "Return first n items of the iterable as a list"
696 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000697
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000698 def enumerate(iterable, start=0):
699 return izip(count(start), iterable)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000700
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000701 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000702 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000703 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000704
Raymond Hettinger3496a892009-03-09 11:57:29 +0000705 def consume(iterator, n):
706 "Advance the iterator n-steps ahead. If n is none, consume entirely."
707 collections.deque(islice(iterator, n), maxlen=0)
708
Raymond Hettingerf9bce832009-02-19 05:34:35 +0000709 def nth(iterable, n, default=None):
710 "Returns the nth item or a default value"
711 return next(islice(iterable, n, None), default)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000712
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000713 def quantify(iterable, pred=bool):
714 "Count how many times the predicate is true"
715 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000716
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000717 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000718 """Returns the sequence elements and then returns None indefinitely.
719
720 Useful for emulating the behavior of the built-in map() function.
721 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000722 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000723
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000724 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000725 "Returns the sequence elements n times"
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000726 return chain.from_iterable(repeat(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000727
728 def dotproduct(vec1, vec2):
729 return sum(imap(operator.mul, vec1, vec2))
730
731 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000732 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000733
734 def repeatfunc(func, times=None, *args):
735 """Repeat calls to func with specified arguments.
736
737 Example: repeatfunc(random.random)
738 """
739 if times is None:
740 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000741 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000742
743 def pairwise(iterable):
744 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
745 a, b = tee(iterable)
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000746 next(b, None)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000747 return izip(a, b)
748
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000749 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000750 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000751 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000752 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000753
Raymond Hettingera44327a2008-01-30 22:17:31 +0000754 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000755 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000756 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000757 pending = len(iterables)
758 nexts = cycle(iter(it).next for it in iterables)
759 while pending:
760 try:
761 for next in nexts:
762 yield next()
763 except StopIteration:
764 pending -= 1
765 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000766
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000767 def powerset(iterable):
Raymond Hettinger68d919e2009-01-25 21:31:47 +0000768 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
769 s = list(iterable)
770 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000771
Benjamin Peterson48291362009-01-31 20:01:48 +0000772 def unique_everseen(iterable, key=None):
773 "List unique elements, preserving order. Remember all elements ever seen."
774 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
775 # unique_everseen('ABBCcAD', str.lower) --> A B C D
776 seen = set()
777 seen_add = seen.add
778 if key is None:
779 for element in iterable:
780 if element not in seen:
781 seen_add(element)
782 yield element
783 else:
784 for element in iterable:
785 k = key(element)
786 if k not in seen:
787 seen_add(k)
788 yield element
Raymond Hettinger44e15812009-01-02 21:26:45 +0000789
Benjamin Peterson48291362009-01-31 20:01:48 +0000790 def unique_justseen(iterable, key=None):
791 "List unique elements, preserving order. Remember only the element just seen."
792 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
793 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
794 return imap(next, imap(itemgetter(1), groupby(iterable, key)))