blob: b7f407d6c65ac572acfafd239c3ed7fa778f382e [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Raymond Hettinger0aee9422009-02-17 11:00:27 +000017This module implements a number of :term:`iterator` building blocks inspired
18by constructs from APL, Haskell, and SML. Each has been recast in a form
19suitable for Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000020
21The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettinger0aee9422009-02-17 11:00:27 +000022useful by themselves or in combination. Together, they form an "iterator
23algebra" making it possible to construct specialized tools succinctly and
24efficiently in pure Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000025
26For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Ezio Melotti77a64e72010-01-21 20:50:57 +000027sequence ``f(0), f(1), ...``. The same effect can be achieved in Python
28by combining :func:`imap` and :func:`count` to form ``imap(f, count())``.
Georg Brandl8ec7f652007-08-15 14:28:01 +000029
Raymond Hettingerefa7c132009-03-12 00:31:58 +000030These tools and their built-in counterparts also work well with the high-speed
31functions in the :mod:`operator` module. For example, the multiplication
32operator can be mapped across two vectors to form an efficient dot-product:
33``sum(imap(operator.mul, vector1, vector2))``.
Georg Brandl8ec7f652007-08-15 14:28:01 +000034
Georg Brandl8ec7f652007-08-15 14:28:01 +000035
Raymond Hettinger0aee9422009-02-17 11:00:27 +000036**Infinite Iterators:**
Georg Brandl8ec7f652007-08-15 14:28:01 +000037
Raymond Hettingerf0f475d2009-04-10 13:16:50 +000038================== ================= ================================================= =========================================
39Iterator Arguments Results Example
40================== ================= ================================================= =========================================
41:func:`count` start, [step] start, start+step, start+2*step, ... ``count(10) --> 10 11 12 13 14 ...``
42:func:`cycle` p p0, p1, ... plast, p0, p1, ... ``cycle('ABCD') --> A B C D A B C D ...``
43:func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times ``repeat(10, 3) --> 10 10 10``
44================== ================= ================================================= =========================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000045
Raymond Hettinger0aee9422009-02-17 11:00:27 +000046**Iterators terminating on the shortest input sequence:**
47
Raymond Hettingerf0f475d2009-04-10 13:16:50 +000048==================== ============================ ================================================= =============================================================
49Iterator Arguments Results Example
50==================== ============================ ================================================= =============================================================
51:func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ... ``chain('ABC', 'DEF') --> A B C D E F``
52:func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ... ``compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F``
53:func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails ``dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1``
54:func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
55:func:`ifilter` pred, seq elements of seq where pred(elem) is True ``ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9``
56:func:`ifilterfalse` pred, seq elements of seq where pred(elem) is False ``ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8``
57:func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step] ``islice('ABCDEFG', 2, None) --> C D E F G``
58:func:`imap` func, p, q, ... func(p0, q0), func(p1, q1), ... ``imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000``
59:func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ... ``starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000``
60:func:`tee` it, n it1, it2 , ... itn splits one iterator into n
61:func:`takewhile` pred, seq seq[0], seq[1], until pred fails ``takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4``
62:func:`izip` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``izip('ABCD', 'xy') --> Ax By``
63:func:`izip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-``
64==================== ============================ ================================================= =============================================================
Raymond Hettinger0aee9422009-02-17 11:00:27 +000065
66**Combinatoric generators:**
67
Raymond Hettingerf0f475d2009-04-10 13:16:50 +000068============================================== ==================== =============================================================
69Iterator Arguments Results
70============================================== ==================== =============================================================
71:func:`product` p, q, ... [repeat=1] cartesian product, equivalent to a nested for-loop
72:func:`permutations` p[, r] r-length tuples, all possible orderings, no repeated elements
Raymond Hettinger9eac1192009-11-19 01:22:04 +000073:func:`combinations` p, r r-length tuples, in sorted order, no repeated elements
74:func:`combinations_with_replacement` p, r r-length tuples, in sorted order, with repeated elements
Raymond Hettingerf0f475d2009-04-10 13:16:50 +000075``product('ABCD', repeat=2)`` ``AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD``
76``permutations('ABCD', 2)`` ``AB AC AD BA BC BD CA CB CD DA DB DC``
77``combinations('ABCD', 2)`` ``AB AC AD BC BD CD``
78``combinations_with_replacement('ABCD', 2)`` ``AA AB AC AD BB BC BD CC CD DD``
79============================================== ==================== =============================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000080
81
82.. _itertools-functions:
83
84Itertool functions
85------------------
86
87The following module functions all construct and return iterators. Some provide
88streams of infinite length, so they should only be accessed by functions or
89loops that truncate the stream.
90
91
92.. function:: chain(*iterables)
93
94 Make an iterator that returns elements from the first iterable until it is
95 exhausted, then proceeds to the next iterable, until all of the iterables are
96 exhausted. Used for treating consecutive sequences as a single sequence.
97 Equivalent to::
98
99 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000100 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +0000101 for it in iterables:
102 for element in it:
103 yield element
104
105
Georg Brandld070cc52010-08-01 21:06:46 +0000106.. classmethod:: chain.from_iterable(iterable)
Raymond Hettinger330958e2008-02-28 19:41:24 +0000107
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000108 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e2008-02-28 19:41:24 +0000109 single iterable argument that is evaluated lazily. Equivalent to::
110
111 @classmethod
112 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000113 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +0000114 for it in iterables:
115 for element in it:
116 yield element
117
118 .. versionadded:: 2.6
119
Raymond Hettingerd553d852008-03-04 04:17:08 +0000120
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000121.. function:: combinations(iterable, r)
122
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000123 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000124
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000125 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000126 input *iterable* is sorted, the combination tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000127 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000128
129 Elements are treated as unique based on their position, not on their
130 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000131 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000132
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000133 Equivalent to::
134
135 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000136 # combinations('ABCD', 2) --> AB AC AD BC BD CD
137 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000138 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000139 n = len(pool)
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000140 if r > n:
141 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000142 indices = range(r)
143 yield tuple(pool[i] for i in indices)
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000144 while True:
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000145 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000146 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000147 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000148 else:
149 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000150 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000151 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000152 indices[j] = indices[j-1] + 1
153 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000154
Raymond Hettingerd553d852008-03-04 04:17:08 +0000155 The code for :func:`combinations` can be also expressed as a subsequence
156 of :func:`permutations` after filtering entries where the elements are not
157 in sorted order (according to their position in the input pool)::
158
159 def combinations(iterable, r):
160 pool = tuple(iterable)
161 n = len(pool)
162 for indices in permutations(range(n), r):
163 if sorted(indices) == list(indices):
164 yield tuple(pool[i] for i in indices)
165
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000166 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
167 or zero when ``r > n``.
168
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000169 .. versionadded:: 2.6
170
Raymond Hettingerd081abc2009-01-27 02:58:49 +0000171.. function:: combinations_with_replacement(iterable, r)
172
173 Return *r* length subsequences of elements from the input *iterable*
174 allowing individual elements to be repeated more than once.
175
176 Combinations are emitted in lexicographic sort order. So, if the
177 input *iterable* is sorted, the combination tuples will be produced
178 in sorted order.
179
180 Elements are treated as unique based on their position, not on their
181 value. So if the input elements are unique, the generated combinations
182 will also be unique.
183
184 Equivalent to::
185
186 def combinations_with_replacement(iterable, r):
187 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
188 pool = tuple(iterable)
189 n = len(pool)
190 if not n and r:
191 return
192 indices = [0] * r
193 yield tuple(pool[i] for i in indices)
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000194 while True:
Raymond Hettingerd081abc2009-01-27 02:58:49 +0000195 for i in reversed(range(r)):
196 if indices[i] != n - 1:
197 break
198 else:
199 return
200 indices[i:] = [indices[i] + 1] * (r - i)
201 yield tuple(pool[i] for i in indices)
202
203 The code for :func:`combinations_with_replacement` can be also expressed as
204 a subsequence of :func:`product` after filtering entries where the elements
205 are not in sorted order (according to their position in the input pool)::
206
207 def combinations_with_replacement(iterable, r):
208 pool = tuple(iterable)
209 n = len(pool)
210 for indices in product(range(n), repeat=r):
211 if sorted(indices) == list(indices):
212 yield tuple(pool[i] for i in indices)
213
214 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
215
216 .. versionadded:: 2.7
217
Raymond Hettinger2bcb8e92009-01-25 21:04:14 +0000218.. function:: compress(data, selectors)
219
220 Make an iterator that filters elements from *data* returning only those that
221 have a corresponding element in *selectors* that evaluates to ``True``.
Andrew M. Kuchlingefa97712009-03-30 23:08:24 +0000222 Stops when either the *data* or *selectors* iterables has been exhausted.
Raymond Hettinger2bcb8e92009-01-25 21:04:14 +0000223 Equivalent to::
224
225 def compress(data, selectors):
226 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
227 return (d for d, s in izip(data, selectors) if s)
228
229 .. versionadded:: 2.7
230
231
Raymond Hettingera4038032009-02-14 00:25:51 +0000232.. function:: count(start=0, step=1)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000233
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000234 Make an iterator that returns evenly spaced values starting with *n*. Often
235 used as an argument to :func:`imap` to generate consecutive data points.
236 Also, used with :func:`izip` to add sequence numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000237
Raymond Hettingera4038032009-02-14 00:25:51 +0000238 def count(start=0, step=1):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000239 # count(10) --> 10 11 12 13 14 ...
Raymond Hettinger97b31952011-02-14 06:03:41 +0000240 # count(2.5, 0.5) -> 2.5 3.0 3.5 ...
Raymond Hettingera4038032009-02-14 00:25:51 +0000241 n = start
Georg Brandl8ec7f652007-08-15 14:28:01 +0000242 while True:
243 yield n
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000244 n += step
Georg Brandl8ec7f652007-08-15 14:28:01 +0000245
Raymond Hettinger3a026242009-06-17 01:43:47 +0000246 When counting with floating point numbers, better accuracy can sometimes be
247 achieved by substituting multiplicative code such as: ``(start + step * i
248 for i in count())``.
249
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000250 .. versionchanged:: 2.7
251 added *step* argument and allowed non-integer arguments.
252
Georg Brandl8ec7f652007-08-15 14:28:01 +0000253.. function:: cycle(iterable)
254
255 Make an iterator returning elements from the iterable and saving a copy of each.
256 When the iterable is exhausted, return elements from the saved copy. Repeats
257 indefinitely. Equivalent to::
258
259 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000260 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000261 saved = []
262 for element in iterable:
263 yield element
264 saved.append(element)
265 while saved:
266 for element in saved:
267 yield element
268
269 Note, this member of the toolkit may require significant auxiliary storage
270 (depending on the length of the iterable).
271
272
273.. function:: dropwhile(predicate, iterable)
274
275 Make an iterator that drops elements from the iterable as long as the predicate
276 is true; afterwards, returns every element. Note, the iterator does not produce
277 *any* output until the predicate first becomes false, so it may have a lengthy
278 start-up time. Equivalent to::
279
280 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000281 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000282 iterable = iter(iterable)
283 for x in iterable:
284 if not predicate(x):
285 yield x
286 break
287 for x in iterable:
288 yield x
289
290
291.. function:: groupby(iterable[, key])
292
293 Make an iterator that returns consecutive keys and groups from the *iterable*.
294 The *key* is a function computing a key value for each element. If not
295 specified or is ``None``, *key* defaults to an identity function and returns
296 the element unchanged. Generally, the iterable needs to already be sorted on
297 the same key function.
298
299 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
300 generates a break or new group every time the value of the key function changes
301 (which is why it is usually necessary to have sorted the data using the same key
302 function). That behavior differs from SQL's GROUP BY which aggregates common
303 elements regardless of their input order.
304
305 The returned group is itself an iterator that shares the underlying iterable
306 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
307 object is advanced, the previous group is no longer visible. So, if that data
308 is needed later, it should be stored as a list::
309
310 groups = []
311 uniquekeys = []
312 data = sorted(data, key=keyfunc)
313 for k, g in groupby(data, keyfunc):
314 groups.append(list(g)) # Store group iterator as a list
315 uniquekeys.append(k)
316
317 :func:`groupby` is equivalent to::
318
319 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000320 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettingerd507afd2009-02-04 10:52:32 +0000321 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000322 def __init__(self, iterable, key=None):
323 if key is None:
324 key = lambda x: x
325 self.keyfunc = key
326 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000327 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000328 def __iter__(self):
329 return self
330 def next(self):
331 while self.currkey == self.tgtkey:
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000332 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000333 self.currkey = self.keyfunc(self.currvalue)
334 self.tgtkey = self.currkey
335 return (self.currkey, self._grouper(self.tgtkey))
336 def _grouper(self, tgtkey):
337 while self.currkey == tgtkey:
338 yield self.currvalue
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000339 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000340 self.currkey = self.keyfunc(self.currvalue)
341
342 .. versionadded:: 2.4
343
344
345.. function:: ifilter(predicate, iterable)
346
347 Make an iterator that filters elements from iterable returning only those for
348 which the predicate is ``True``. If *predicate* is ``None``, return the items
349 that are true. Equivalent to::
350
351 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000352 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000353 if predicate is None:
354 predicate = bool
355 for x in iterable:
356 if predicate(x):
357 yield x
358
359
360.. function:: ifilterfalse(predicate, iterable)
361
362 Make an iterator that filters elements from iterable returning only those for
363 which the predicate is ``False``. If *predicate* is ``None``, return the items
364 that are false. Equivalent to::
365
366 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000367 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000368 if predicate is None:
369 predicate = bool
370 for x in iterable:
371 if not predicate(x):
372 yield x
373
374
375.. function:: imap(function, *iterables)
376
377 Make an iterator that computes the function using arguments from each of the
378 iterables. If *function* is set to ``None``, then :func:`imap` returns the
379 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
380 exhausted instead of filling in ``None`` for shorter iterables. The reason for
381 the difference is that infinite iterator arguments are typically an error for
382 :func:`map` (because the output is fully evaluated) but represent a common and
383 useful way of supplying arguments to :func:`imap`. Equivalent to::
384
385 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000386 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000387 iterables = map(iter, iterables)
388 while True:
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000389 args = [next(it) for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000390 if function is None:
391 yield tuple(args)
392 else:
393 yield function(*args)
394
395
396.. function:: islice(iterable, [start,] stop [, step])
397
398 Make an iterator that returns selected elements from the iterable. If *start* is
399 non-zero, then elements from the iterable are skipped until start is reached.
400 Afterward, elements are returned consecutively unless *step* is set higher than
401 one which results in items being skipped. If *stop* is ``None``, then iteration
402 continues until the iterator is exhausted, if at all; otherwise, it stops at the
403 specified position. Unlike regular slicing, :func:`islice` does not support
404 negative values for *start*, *stop*, or *step*. Can be used to extract related
405 fields from data where the internal structure has been flattened (for example, a
406 multi-line report may list a name field on every third line). Equivalent to::
407
408 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000409 # islice('ABCDEFG', 2) --> A B
410 # islice('ABCDEFG', 2, 4) --> C D
411 # islice('ABCDEFG', 2, None) --> C D E F G
412 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000413 s = slice(*args)
414 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000415 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000416 for i, element in enumerate(iterable):
417 if i == nexti:
418 yield element
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000419 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000420
421 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
422 then the step defaults to one.
423
424 .. versionchanged:: 2.5
425 accept ``None`` values for default *start* and *step*.
426
427
428.. function:: izip(*iterables)
429
430 Make an iterator that aggregates elements from each of the iterables. Like
431 :func:`zip` except that it returns an iterator instead of a list. Used for
432 lock-step iteration over several iterables at a time. Equivalent to::
433
434 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000435 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000436 iterables = map(iter, iterables)
437 while iterables:
Raymond Hettingercccfc822009-04-20 18:23:57 +0000438 yield tuple(map(next, iterables))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000439
440 .. versionchanged:: 2.4
441 When no iterables are specified, returns a zero length iterator instead of
442 raising a :exc:`TypeError` exception.
443
Raymond Hettinger48c62932008-01-22 19:51:41 +0000444 The left-to-right evaluation order of the iterables is guaranteed. This
445 makes possible an idiom for clustering a data series into n-length groups
446 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000447
Raymond Hettinger48c62932008-01-22 19:51:41 +0000448 :func:`izip` should only be used with unequal length inputs when you don't
449 care about trailing, unmatched values from the longer iterables. If those
450 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000451
452
453.. function:: izip_longest(*iterables[, fillvalue])
454
455 Make an iterator that aggregates elements from each of the iterables. If the
456 iterables are of uneven length, missing values are filled-in with *fillvalue*.
457 Iteration continues until the longest iterable is exhausted. Equivalent to::
458
459 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000460 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000461 fillvalue = kwds.get('fillvalue')
462 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
463 yield counter() # yields the fillvalue, or raises IndexError
464 fillers = repeat(fillvalue)
465 iters = [chain(it, sentinel(), fillers) for it in args]
466 try:
467 for tup in izip(*iters):
468 yield tup
469 except IndexError:
470 pass
471
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000472 If one of the iterables is potentially infinite, then the
473 :func:`izip_longest` function should be wrapped with something that limits
474 the number of calls (for example :func:`islice` or :func:`takewhile`). If
475 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000476
477 .. versionadded:: 2.6
478
Raymond Hettinger330958e2008-02-28 19:41:24 +0000479.. function:: permutations(iterable[, r])
480
481 Return successive *r* length permutations of elements in the *iterable*.
482
483 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000484 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e2008-02-28 19:41:24 +0000485 are generated.
486
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000487 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e2008-02-28 19:41:24 +0000488 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000489 in sorted order.
Raymond Hettinger330958e2008-02-28 19:41:24 +0000490
491 Elements are treated as unique based on their position, not on their
492 value. So if the input elements are unique, there will be no repeat
493 values in each permutation.
494
Raymond Hettingerf287f172008-03-02 10:59:31 +0000495 Equivalent to::
496
497 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000498 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
499 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000500 pool = tuple(iterable)
501 n = len(pool)
502 r = n if r is None else r
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000503 if r > n:
504 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000505 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000506 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000507 yield tuple(pool[i] for i in indices[:r])
508 while n:
509 for i in reversed(range(r)):
510 cycles[i] -= 1
511 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000512 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000513 cycles[i] = n - i
514 else:
515 j = cycles[i]
516 indices[i], indices[-j] = indices[-j], indices[i]
517 yield tuple(pool[i] for i in indices[:r])
518 break
519 else:
520 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000521
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000522 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000523 :func:`product`, filtered to exclude entries with repeated elements (those
524 from the same position in the input pool)::
525
526 def permutations(iterable, r=None):
527 pool = tuple(iterable)
528 n = len(pool)
529 r = n if r is None else r
530 for indices in product(range(n), repeat=r):
531 if len(set(indices)) == r:
532 yield tuple(pool[i] for i in indices)
533
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000534 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
535 or zero when ``r > n``.
536
Raymond Hettinger330958e2008-02-28 19:41:24 +0000537 .. versionadded:: 2.6
538
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000539.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000540
541 Cartesian product of input iterables.
542
543 Equivalent to nested for-loops in a generator expression. For example,
544 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
545
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000546 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000547 on every iteration. This pattern creates a lexicographic ordering so that if
548 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000549 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000550
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000551 To compute the product of an iterable with itself, specify the number of
552 repetitions with the optional *repeat* keyword argument. For example,
553 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
554
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000555 This function is equivalent to the following code, except that the
556 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000557
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000558 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000559 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
560 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000561 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000562 result = [[]]
563 for pool in pools:
564 result = [x+[y] for x in result for y in pool]
565 for prod in result:
566 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000567
568 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000569
570.. function:: repeat(object[, times])
571
572 Make an iterator that returns *object* over and over again. Runs indefinitely
573 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000574 invariant function parameters. Also used with :func:`izip` to create constant
575 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000576
577 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000578 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000579 if times is None:
580 while True:
581 yield object
582 else:
583 for i in xrange(times):
584 yield object
585
586
587.. function:: starmap(function, iterable)
588
Raymond Hettinger47317092008-01-17 03:02:14 +0000589 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000590 the iterable. Used instead of :func:`imap` when argument parameters are already
591 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
592 difference between :func:`imap` and :func:`starmap` parallels the distinction
593 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
594
595 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000596 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000597 for args in iterable:
598 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000599
Raymond Hettinger47317092008-01-17 03:02:14 +0000600 .. versionchanged:: 2.6
601 Previously, :func:`starmap` required the function arguments to be tuples.
602 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000603
604.. function:: takewhile(predicate, iterable)
605
606 Make an iterator that returns elements from the iterable as long as the
607 predicate is true. Equivalent to::
608
609 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000610 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000611 for x in iterable:
612 if predicate(x):
613 yield x
614 else:
615 break
616
617
618.. function:: tee(iterable[, n=2])
619
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000620 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000621
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000622 def tee(iterable, n=2):
623 it = iter(iterable)
624 deques = [collections.deque() for i in range(n)]
625 def gen(mydeque):
626 while True:
627 if not mydeque: # when the local deque is empty
628 newval = next(it) # fetch a new value and
629 for d in deques: # load it to all the deques
630 d.append(newval)
631 yield mydeque.popleft()
632 return tuple(gen(d) for d in deques)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000633
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000634 Once :func:`tee` has made a split, the original *iterable* should not be
635 used anywhere else; otherwise, the *iterable* could get advanced without
636 the tee objects being informed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000637
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000638 This itertool may require significant auxiliary storage (depending on how
639 much temporary data needs to be stored). In general, if one iterator uses
640 most or all of the data before another iterator starts, it is faster to use
641 :func:`list` instead of :func:`tee`.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000642
643 .. versionadded:: 2.4
644
645
Georg Brandl8ec7f652007-08-15 14:28:01 +0000646.. _itertools-recipes:
647
648Recipes
649-------
650
651This section shows recipes for creating an extended toolset using the existing
652itertools as building blocks.
653
654The extended tools offer the same high performance as the underlying toolset.
655The superior memory performance is kept by processing elements one at a time
656rather than bringing the whole iterable into memory all at once. Code volume is
657kept small by linking the tools together in a functional style which helps
658eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000659"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000660which incur interpreter overhead.
661
662.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000663
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000664 def take(n, iterable):
665 "Return first n items of the iterable as a list"
666 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000667
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000668 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000669 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000670 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000671
Raymond Hettinger3496a892009-03-09 11:57:29 +0000672 def consume(iterator, n):
673 "Advance the iterator n-steps ahead. If n is none, consume entirely."
Raymond Hettingerb8d688c2010-03-28 18:25:01 +0000674 # Use functions that consume iterators at C speed.
675 if n is None:
676 # feed the entire iterator into a zero-length deque
677 collections.deque(iterator, maxlen=0)
678 else:
Georg Brandldb235c12010-10-06 09:33:55 +0000679 # advance to the empty slice starting at position n
Raymond Hettingerb8d688c2010-03-28 18:25:01 +0000680 next(islice(iterator, n, n), None)
Raymond Hettinger3496a892009-03-09 11:57:29 +0000681
Raymond Hettingerf9bce832009-02-19 05:34:35 +0000682 def nth(iterable, n, default=None):
683 "Returns the nth item or a default value"
684 return next(islice(iterable, n, None), default)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000685
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000686 def quantify(iterable, pred=bool):
687 "Count how many times the predicate is true"
688 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000689
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000690 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000691 """Returns the sequence elements and then returns None indefinitely.
692
693 Useful for emulating the behavior of the built-in map() function.
694 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000695 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000696
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000697 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000698 "Returns the sequence elements n times"
Raymond Hettingerf28dd0d2010-04-02 06:23:12 +0000699 return chain.from_iterable(repeat(tuple(iterable), n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000700
701 def dotproduct(vec1, vec2):
702 return sum(imap(operator.mul, vec1, vec2))
703
704 def flatten(listOfLists):
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000705 "Flatten one level of nesting"
706 return chain.from_iterable(listOfLists)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000707
708 def repeatfunc(func, times=None, *args):
709 """Repeat calls to func with specified arguments.
710
711 Example: repeatfunc(random.random)
712 """
713 if times is None:
714 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000715 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000716
717 def pairwise(iterable):
718 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
719 a, b = tee(iterable)
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000720 next(b, None)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000721 return izip(a, b)
722
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000723 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000724 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000725 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000726 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000727
Raymond Hettingera44327a2008-01-30 22:17:31 +0000728 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000729 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000730 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000731 pending = len(iterables)
732 nexts = cycle(iter(it).next for it in iterables)
733 while pending:
734 try:
735 for next in nexts:
736 yield next()
737 except StopIteration:
738 pending -= 1
739 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000740
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000741 def powerset(iterable):
Raymond Hettinger68d919e2009-01-25 21:31:47 +0000742 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
743 s = list(iterable)
744 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000745
Benjamin Peterson48291362009-01-31 20:01:48 +0000746 def unique_everseen(iterable, key=None):
747 "List unique elements, preserving order. Remember all elements ever seen."
748 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
749 # unique_everseen('ABBCcAD', str.lower) --> A B C D
750 seen = set()
751 seen_add = seen.add
752 if key is None:
Raymond Hettinger5b027f82010-03-28 18:02:41 +0000753 for element in ifilterfalse(seen.__contains__, iterable):
754 seen_add(element)
755 yield element
Benjamin Peterson48291362009-01-31 20:01:48 +0000756 else:
757 for element in iterable:
758 k = key(element)
759 if k not in seen:
760 seen_add(k)
761 yield element
Raymond Hettinger44e15812009-01-02 21:26:45 +0000762
Benjamin Peterson48291362009-01-31 20:01:48 +0000763 def unique_justseen(iterable, key=None):
764 "List unique elements, preserving order. Remember only the element just seen."
765 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
766 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
767 return imap(next, imap(itemgetter(1), groupby(iterable, key)))
Raymond Hettinger5b027f82010-03-28 18:02:41 +0000768
769 def iter_except(func, exception, first=None):
770 """ Call a function repeatedly until an exception is raised.
771
772 Converts a call-until-exception interface to an iterator interface.
773 Like __builtin__.iter(func, sentinel) but uses an exception instead
774 of a sentinel to end the loop.
775
776 Examples:
777 bsddbiter = iter_except(db.next, bsddb.error, db.first)
778 heapiter = iter_except(functools.partial(heappop, h), IndexError)
779 dictiter = iter_except(d.popitem, KeyError)
780 dequeiter = iter_except(d.popleft, IndexError)
781 queueiter = iter_except(q.get_nowait, Queue.Empty)
782 setiter = iter_except(s.pop, KeyError)
783
784 """
785 try:
786 if first is not None:
787 yield first()
788 while 1:
789 yield func()
790 except exception:
791 pass
792
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000793 def random_product(*args, **kwds):
794 "Random selection from itertools.product(*args, **kwds)"
795 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerf28dd0d2010-04-02 06:23:12 +0000796 return tuple(random.choice(pool) for pool in pools)
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000797
Raymond Hettingera1d61d02010-04-10 07:01:32 +0000798 def random_permutation(iterable, r=None):
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000799 "Random selection from itertools.permutations(iterable, r)"
Raymond Hettingerf28dd0d2010-04-02 06:23:12 +0000800 pool = tuple(iterable)
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000801 r = len(pool) if r is None else r
Raymond Hettingerf28dd0d2010-04-02 06:23:12 +0000802 return tuple(random.sample(pool, r))
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000803
804 def random_combination(iterable, r):
805 "Random selection from itertools.combinations(iterable, r)"
Raymond Hettingerf28dd0d2010-04-02 06:23:12 +0000806 pool = tuple(iterable)
Raymond Hettingera1d61d02010-04-10 07:01:32 +0000807 n = len(pool)
808 indices = sorted(random.sample(xrange(n), r))
809 return tuple(pool[i] for i in indices)
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000810
811 def random_combination_with_replacement(iterable, r):
812 "Random selection from itertools.combinations_with_replacement(iterable, r)"
Raymond Hettingerf28dd0d2010-04-02 06:23:12 +0000813 pool = tuple(iterable)
Raymond Hettingera1d61d02010-04-10 07:01:32 +0000814 n = len(pool)
815 indices = sorted(random.randrange(n) for i in xrange(r))
816 return tuple(pool[i] for i in indices)
Raymond Hettinger4bfd3bd2010-04-02 02:44:31 +0000817
Raymond Hettingerd282b932010-03-28 18:08:15 +0000818Note, many of the above recipes can be optimized by replacing global lookups
819with local variables defined as default values. For example, the
820*dotproduct* recipe can be written as::
821
822 def dotproduct(vec1, vec2, sum=sum, imap=imap, mul=operator.mul):
823 return sum(imap(mul, vec1, vec2))