:mod:`itertools` --- Functions creating iterators for efficient looping
=======================================================================

.. module:: itertools
   :synopsis: Functions creating iterators for efficient looping.
.. moduleauthor:: Raymond Hettinger <python@rcn.com>
.. sectionauthor:: Raymond Hettinger <python@rcn.com>


.. testsetup::

   from itertools import *

.. versionadded:: 2.3

This module implements a number of :term:`iterator` building blocks inspired
by constructs from APL, Haskell, and SML. Each has been recast in a form
suitable for Python.

The module standardizes a core set of fast, memory efficient tools that are
useful by themselves or in combination. Together, they form an "iterator
algebra" making it possible to construct specialized tools succinctly and
efficiently in pure Python.

For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
sequence ``f(0), f(1), ...``. The same effect can be achieved in Python
by combining :func:`imap` and :func:`count` to form ``imap(f, count())``.

These tools and their built-in counterparts also work well with the high-speed
functions in the :mod:`operator` module. For example, the multiplication
operator can be mapped across two vectors to form an efficient dot-product:
``sum(imap(operator.mul, vector1, vector2))``.

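The two compositions mentioned above can be sketched as follows. The helper
names ``tabulate`` and ``dotproduct`` are illustrative only, and the
``imap = map`` fallback is an assumption made so the sketch also runs on
Python 3, where the built-in ``map`` is already lazy.

```python
import operator
from itertools import count, islice

try:
    from itertools import imap  # Python 2
except ImportError:
    imap = map  # assumption: Python 3, where map() returns an iterator


def tabulate(f):
    """Return the infinite stream f(0), f(1), f(2), ..."""
    return imap(f, count())


def dotproduct(vector1, vector2):
    """Sum of the element-wise products of two vectors."""
    return sum(imap(operator.mul, vector1, vector2))


# The stream is infinite, so it has to be truncated before it is consumed.
squares = list(islice(tabulate(lambda n: n * n), 5))  # [0, 1, 4, 9, 16]
dot = dotproduct([1, 2, 3], [4, 5, 6])                 # 1*4 + 2*5 + 3*6 = 32
```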

**Infinite Iterators:**

==================  =================  =================================================  =========================================
Iterator            Arguments          Results                                            Example
==================  =================  =================================================  =========================================
:func:`count`       start, [step]      start, start+step, start+2*step, ...               ``count(10) --> 10 11 12 13 14 ...``
:func:`cycle`       p                  p0, p1, ... plast, p0, p1, ...                     ``cycle('ABCD') --> A B C D A B C D ...``
:func:`repeat`      elem [,n]          elem, elem, elem, ... endlessly or up to n times   ``repeat(10, 3) --> 10 10 10``
==================  =================  =================================================  =========================================

**Iterators terminating on the shortest input sequence:**

====================  ============================  =================================================  =============================================================
Iterator              Arguments                     Results                                            Example
====================  ============================  =================================================  =============================================================
:func:`chain`         p, q, ...                     p0, p1, ... plast, q0, q1, ...                     ``chain('ABC', 'DEF') --> A B C D E F``
:func:`compress`      data, selectors               (d[0] if s[0]), (d[1] if s[1]), ...                ``compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F``
:func:`dropwhile`     pred, seq                     seq[n], seq[n+1], starting when pred fails         ``dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1``
:func:`groupby`       iterable[, keyfunc]           sub-iterators grouped by value of keyfunc(v)
:func:`ifilter`       pred, seq                     elements of seq where pred(elem) is True           ``ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9``
:func:`ifilterfalse`  pred, seq                     elements of seq where pred(elem) is False          ``ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8``
:func:`islice`        seq, [start,] stop [, step]   elements from seq[start:stop:step]                 ``islice('ABCDEFG', 2, None) --> C D E F G``
:func:`imap`          func, p, q, ...               func(p0, q0), func(p1, q1), ...                    ``imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000``
:func:`starmap`       func, seq                     func(\*seq[0]), func(\*seq[1]), ...                ``starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000``
:func:`tee`           it, n                         it1, it2, ... itn  splits one iterator into n
:func:`takewhile`     pred, seq                     seq[0], seq[1], until pred fails                   ``takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4``
:func:`izip`          p, q, ...                     (p[0], q[0]), (p[1], q[1]), ...                    ``izip('ABCD', 'xy') --> Ax By``
:func:`izip_longest`  p, q, ...                     (p[0], q[0]), (p[1], q[1]), ...                    ``izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-``
====================  ============================  =================================================  =============================================================

**Combinatoric generators:**

==============================================  ====================  =============================================================
Iterator                                        Arguments             Results
==============================================  ====================  =============================================================
:func:`product`                                 p, q, ... [repeat=1]  cartesian product, equivalent to a nested for-loop
:func:`permutations`                            p[, r]                r-length tuples, all possible orderings, no repeated elements
:func:`combinations`                            p, r                  r-length tuples, in sorted order, no repeated elements
:func:`combinations_with_replacement`           p, r                  r-length tuples, in sorted order, with repeated elements
``product('ABCD', repeat=2)``                                         ``AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD``
``permutations('ABCD', 2)``                                           ``AB AC AD BA BC BD CA CB CD DA DB DC``
``combinations('ABCD', 2)``                                           ``AB AC AD BC BD CD``
``combinations_with_replacement('ABCD', 2)``                          ``AA AB AC AD BB BC BD CC CD DD``
==============================================  ====================  =============================================================


.. _itertools-functions:

Itertool functions
------------------

The following module functions all construct and return iterators. Some provide
streams of infinite length, so they should only be accessed by functions or
loops that truncate the stream.


.. function:: chain(*iterables)

   Make an iterator that returns elements from the first iterable until it is
   exhausted, then proceeds to the next iterable, until all of the iterables are
   exhausted. Used for treating consecutive sequences as a single sequence.
   Equivalent to::

      def chain(*iterables):
          # chain('ABC', 'DEF') --> A B C D E F
          for it in iterables:
              for element in it:
                  yield element


.. classmethod:: chain.from_iterable(iterable)

   Alternate constructor for :func:`chain`. Gets chained inputs from a
   single iterable argument that is evaluated lazily. Equivalent to::

      @classmethod
      def from_iterable(cls, iterables):
          # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
          for it in iterables:
              for element in it:
                  yield element

   .. versionadded:: 2.6

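The lazy evaluation of the single argument matters when the outer iterable is
itself unbounded. A minimal sketch, assuming a hypothetical generator
``numbered_rows`` that is not part of the module:

```python
from itertools import chain, islice


def numbered_rows():
    """Hypothetical infinite source of two-element lists, for illustration."""
    n = 0
    while True:
        yield [n, n + 1]
        n += 2


# chain(*numbered_rows()) would never return, because argument unpacking has
# to exhaust the generator first. from_iterable() pulls one row at a time.
flat = chain.from_iterable(numbered_rows())
first_five = list(islice(flat, 5))  # [0, 1, 2, 3, 4]
```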

.. function:: combinations(iterable, r)

   Return *r* length subsequences of elements from the input *iterable*.

   Combinations are emitted in lexicographic sort order. So, if the
   input *iterable* is sorted, the combination tuples will be produced
   in sorted order.

   Elements are treated as unique based on their position, not on their
   value. So if the input elements are unique, there will be no repeat
   values in each combination.

   Equivalent to::

      def combinations(iterable, r):
          # combinations('ABCD', 2) --> AB AC AD BC BD CD
          # combinations(range(4), 3) --> 012 013 023 123
          pool = tuple(iterable)
          n = len(pool)
          if r > n:
              return
          indices = range(r)
          yield tuple(pool[i] for i in indices)
          while True:
              for i in reversed(range(r)):
                  if indices[i] != i + n - r:
                      break
              else:
                  return
              indices[i] += 1
              for j in range(i+1, r):
                  indices[j] = indices[j-1] + 1
              yield tuple(pool[i] for i in indices)

   The code for :func:`combinations` can also be expressed as a subsequence
   of :func:`permutations` after filtering entries where the elements are not
   in sorted order (according to their position in the input pool)::

      def combinations(iterable, r):
          pool = tuple(iterable)
          n = len(pool)
          for indices in permutations(range(n), r):
              if sorted(indices) == list(indices):
                  yield tuple(pool[i] for i in indices)

   The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
   or zero when ``r > n``.

   .. versionadded:: 2.6

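The count formula and the "unique by position" rule can be checked with a
short sketch. The helper ``n_choose_r`` is a hypothetical name introduced
here for illustration.

```python
from itertools import combinations
from math import factorial


def n_choose_r(n, r):
    """Number of r-length combinations of n items; zero when r > n."""
    if r > n:
        return 0
    return factorial(n) // factorial(r) // factorial(n - r)


pairs = list(combinations('ABCD', 2))
assert len(pairs) == n_choose_r(4, 2) == 6

# Position, not value, decides uniqueness: the two 'A' characters are
# distinct elements, so ('A', 'B') is produced once for each of them.
dupes = list(combinations('AAB', 2))  # [('A','A'), ('A','B'), ('A','B')]

# Asking for more elements than the pool holds yields nothing.
too_long = list(combinations('AB', 3))  # []
```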

.. function:: combinations_with_replacement(iterable, r)

   Return *r* length subsequences of elements from the input *iterable*
   allowing individual elements to be repeated more than once.

   Combinations are emitted in lexicographic sort order. So, if the
   input *iterable* is sorted, the combination tuples will be produced
   in sorted order.

   Elements are treated as unique based on their position, not on their
   value. So if the input elements are unique, the generated combinations
   will also be unique.

   Equivalent to::

      def combinations_with_replacement(iterable, r):
          # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
          pool = tuple(iterable)
          n = len(pool)
          if not n and r:
              return
          indices = [0] * r
          yield tuple(pool[i] for i in indices)
          while True:
              for i in reversed(range(r)):
                  if indices[i] != n - 1:
                      break
              else:
                  return
              indices[i:] = [indices[i] + 1] * (r - i)
              yield tuple(pool[i] for i in indices)

   The code for :func:`combinations_with_replacement` can also be expressed as
   a subsequence of :func:`product` after filtering entries where the elements
   are not in sorted order (according to their position in the input pool)::

      def combinations_with_replacement(iterable, r):
          pool = tuple(iterable)
          n = len(pool)
          for indices in product(range(n), repeat=r):
              if sorted(indices) == list(indices):
                  yield tuple(pool[i] for i in indices)

   The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.

   .. versionadded:: 2.7

.. function:: compress(data, selectors)

   Make an iterator that filters elements from *data* returning only those that
   have a corresponding element in *selectors* that evaluates to ``True``.
   Stops when either the *data* or *selectors* iterables has been exhausted.
   Equivalent to::

      def compress(data, selectors):
          # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
          return (d for d, s in izip(data, selectors) if s)

   .. versionadded:: 2.7

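A small sketch of :func:`compress` with a computed selector list and with a
selector list that is shorter than the data. The sample values are made up
for illustration.

```python
from itertools import compress

temps = [18, 25, 31, 22, 35]          # hypothetical sample data
is_hot = [t > 24 for t in temps]      # selectors: one truth value per element
hot = list(compress(temps, is_hot))   # [25, 31, 35]

# The shorter input ends the iteration: 'D', 'E' and 'F' have no selector.
short = list(compress('ABCDEF', [1, 0, 1]))   # ['A', 'C']
```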

.. function:: count(start=0, step=1)

   Make an iterator that returns evenly spaced values starting with *start*. Often
   used as an argument to :func:`imap` to generate consecutive data points.
   Also used with :func:`izip` to add sequence numbers. Equivalent to::

      def count(start=0, step=1):
          # count(10) --> 10 11 12 13 14 ...
          # count(2.5, 0.5) --> 2.5 3.0 3.5 ...
          n = start
          while True:
              yield n
              n += step

   When counting with floating point numbers, better accuracy can sometimes be
   achieved by substituting multiplicative code such as: ``(start + step * i
   for i in count())``.

   .. versionchanged:: 2.7
      added *step* argument and allowed non-integer arguments.

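The difference between the additive and multiplicative forms can be sketched
like this. The values of ``start``, ``step`` and ``n`` are arbitrary, and the
size of any rounding error depends on the platform's floating point
arithmetic.

```python
from itertools import count, islice

start, step, n = 0.0, 0.1, 11

# Additive form: each value is the previous one plus step, so any rounding
# error introduced by an addition is carried into every later value.
additive = list(islice(count(start, step), n))

# Multiplicative form: each value is computed from the integer index with a
# single multiplication, so earlier errors are not carried forward.
multiplicative = [start + step * i for i in islice(count(), n)]

# Distance of the eleventh value from the exact answer 1.0 in each form.
additive_error = abs(additive[-1] - 1.0)
multiplicative_error = abs(multiplicative[-1] - 1.0)
```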

.. function:: cycle(iterable)

   Make an iterator returning elements from the iterable and saving a copy of each.
   When the iterable is exhausted, return elements from the saved copy. Repeats
   indefinitely. Equivalent to::

      def cycle(iterable):
          # cycle('ABCD') --> A B C D A B C D A B C D ...
          saved = []
          for element in iterable:
              yield element
              saved.append(element)
          while saved:
              for element in saved:
                  yield element

   Note, this member of the toolkit may require significant auxiliary storage
   (depending on the length of the iterable).


.. function:: dropwhile(predicate, iterable)

   Make an iterator that drops elements from the iterable as long as the predicate
   is true; afterwards, returns every element. Note, the iterator does not produce
   *any* output until the predicate first becomes false, so it may have a lengthy
   start-up time. Equivalent to::

      def dropwhile(predicate, iterable):
          # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
          iterable = iter(iterable)
          for x in iterable:
              if not predicate(x):
                  yield x
                  break
          for x in iterable:
              yield x


.. function:: groupby(iterable[, key])

   Make an iterator that returns consecutive keys and groups from the *iterable*.
   The *key* is a function computing a key value for each element. If not
   specified or is ``None``, *key* defaults to an identity function and returns
   the element unchanged. Generally, the iterable needs to already be sorted on
   the same key function.

   The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
   generates a break or new group every time the value of the key function changes
   (which is why it is usually necessary to have sorted the data using the same key
   function). That behavior differs from SQL's GROUP BY which aggregates common
   elements regardless of their input order.

   The returned group is itself an iterator that shares the underlying iterable
   with :func:`groupby`. Because the source is shared, when the :func:`groupby`
   object is advanced, the previous group is no longer visible. So, if that data
   is needed later, it should be stored as a list::

      groups = []
      uniquekeys = []
      data = sorted(data, key=keyfunc)
      for k, g in groupby(data, keyfunc):
          groups.append(list(g))      # Store group iterator as a list
          uniquekeys.append(k)

   :func:`groupby` is equivalent to::

      class groupby(object):
          # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
          # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
          def __init__(self, iterable, key=None):
              if key is None:
                  key = lambda x: x
              self.keyfunc = key
              self.it = iter(iterable)
              self.tgtkey = self.currkey = self.currvalue = object()
          def __iter__(self):
              return self
          def next(self):
              while self.currkey == self.tgtkey:
                  self.currvalue = next(self.it)    # Exit on StopIteration
                  self.currkey = self.keyfunc(self.currvalue)
              self.tgtkey = self.currkey
              return (self.currkey, self._grouper(self.tgtkey))
          def _grouper(self, tgtkey):
              while self.currkey == tgtkey:
                  yield self.currvalue
                  self.currvalue = next(self.it)    # Exit on StopIteration
                  self.currkey = self.keyfunc(self.currvalue)

   .. versionadded:: 2.4

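The ``uniq``-like behavior described above is easiest to see by grouping the
same data with and without sorting first. The word list and the
``first_letter`` key function are made up for this sketch.

```python
from itertools import groupby

words = ['apple', 'bob', 'avocado', 'banana', 'cherry']   # hypothetical data


def first_letter(word):
    """Key function: group words by their initial character."""
    return word[0]


# Unsorted input: a new group starts every time the key changes, so the
# keys 'a' and 'b' each show up twice.
raw = [(k, list(g)) for k, g in groupby(words, first_letter)]

# Sorting on the same key first gives one group per distinct key. sorted()
# is stable, so words keep their original relative order within a group.
ordered = sorted(words, key=first_letter)
by_letter = [(k, list(g)) for k, g in groupby(ordered, first_letter)]
```

Each group is materialized with ``list(g)`` inside the comprehension, before
:func:`groupby` advances to the next key, for the reason given above.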

.. function:: ifilter(predicate, iterable)

   Make an iterator that filters elements from iterable returning only those for
   which the predicate is ``True``. If *predicate* is ``None``, return the items
   that are true. Equivalent to::

      def ifilter(predicate, iterable):
          # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
          if predicate is None:
              predicate = bool
          for x in iterable:
              if predicate(x):
                  yield x


.. function:: ifilterfalse(predicate, iterable)

   Make an iterator that filters elements from iterable returning only those for
   which the predicate is ``False``. If *predicate* is ``None``, return the items
   that are false. Equivalent to::

      def ifilterfalse(predicate, iterable):
          # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
          if predicate is None:
              predicate = bool
          for x in iterable:
              if not predicate(x):
                  yield x


.. function:: imap(function, *iterables)

   Make an iterator that computes the function using arguments from each of the
   iterables. If *function* is set to ``None``, then :func:`imap` returns the
   arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
   exhausted instead of filling in ``None`` for shorter iterables. The reason for
   the difference is that infinite iterator arguments are typically an error for
   :func:`map` (because the output is fully evaluated) but represent a common and
   useful way of supplying arguments to :func:`imap`. Equivalent to::

      def imap(function, *iterables):
          # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
          iterables = map(iter, iterables)
          while True:
              args = [next(it) for it in iterables]
              if function is None:
                  yield tuple(args)
              else:
                  yield function(*args)


.. function:: islice(iterable, [start,] stop [, step])

   Make an iterator that returns selected elements from the iterable. If *start* is
   non-zero, then elements from the iterable are skipped until start is reached.
   Afterward, elements are returned consecutively unless *step* is set higher than
   one which results in items being skipped. If *stop* is ``None``, then iteration
   continues until the iterator is exhausted, if at all; otherwise, it stops at the
   specified position. Unlike regular slicing, :func:`islice` does not support
   negative values for *start*, *stop*, or *step*. Can be used to extract related
   fields from data where the internal structure has been flattened (for example, a
   multi-line report may list a name field on every third line). Equivalent to::

      def islice(iterable, *args):
          # islice('ABCDEFG', 2) --> A B
          # islice('ABCDEFG', 2, 4) --> C D
          # islice('ABCDEFG', 2, None) --> C D E F G
          # islice('ABCDEFG', 0, None, 2) --> A C E G
          s = slice(*args)
          # Only None means "no limit"; an explicit stop of 0 selects nothing.
          stop = sys.maxint if s.stop is None else s.stop
          it = iter(xrange(s.start or 0, stop, s.step or 1))
          nexti = next(it)
          for i, element in enumerate(iterable):
              if i == nexti:
                  yield element
                  nexti = next(it)

   If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
   then the step defaults to one.

   .. versionchanged:: 2.5
      accept ``None`` values for default *start* and *step*.

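The flattened-report use case mentioned above might look like the following
sketch. The report contents and the three-line record layout are assumptions
made for illustration.

```python
from itertools import islice

# Hypothetical flattened report: each record spans three lines
# (name, department, phone) with no separators between records.
report = [
    'Alice', 'Engineering', '555-0100',
    'Bob', 'Sales', '555-0101',
    'Carol', 'Support', '555-0102',
]

names = list(islice(report, 0, None, 3))    # every third line, from line 0
phones = list(islice(report, 2, None, 3))   # every third line, from line 2
```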

.. function:: izip(*iterables)

   Make an iterator that aggregates elements from each of the iterables. Like
   :func:`zip` except that it returns an iterator instead of a list. Used for
   lock-step iteration over several iterables at a time. Equivalent to::

      def izip(*iterables):
          # izip('ABCD', 'xy') --> Ax By
          iterators = map(iter, iterables)
          while iterators:
              yield tuple(map(next, iterators))

   .. versionchanged:: 2.4
      When no iterables are specified, returns a zero length iterator instead of
      raising a :exc:`TypeError` exception.

   The left-to-right evaluation order of the iterables is guaranteed. This
   makes possible an idiom for clustering a data series into n-length groups
   using ``izip(*[iter(s)]*n)``.

   :func:`izip` should only be used with unequal length inputs when you don't
   care about trailing, unmatched values from the longer iterables. If those
   values are important, use :func:`izip_longest` instead.

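The clustering idiom relies on passing the *same* iterator object *n* times.
A minimal sketch follows; the helper name ``cluster`` is hypothetical, and the
``izip = zip`` fallback is an assumption made so the sketch also runs on
Python 3.

```python
try:
    from itertools import izip  # Python 2
except ImportError:
    izip = zip  # assumption: Python 3, where zip() returns an iterator


def cluster(series, n):
    """Group a data series into n-length tuples, dropping any remainder."""
    # [iter(series)] * n is a list holding ONE iterator n times, so every
    # output tuple is filled by n consecutive next() calls, left to right.
    return izip(*[iter(series)] * n)


triples = list(cluster('ABCDEFGH', 3))   # [('A','B','C'), ('D','E','F')]
```

The trailing ``'G'`` and ``'H'`` are discarded because they do not fill a
complete group, which is the unmatched-values caveat described above.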

.. function:: izip_longest(*iterables[, fillvalue])

   Make an iterator that aggregates elements from each of the iterables. If the
   iterables are of uneven length, missing values are filled in with *fillvalue*.
   Iteration continues until the longest iterable is exhausted. Equivalent to::

      class ZipExhausted(Exception):
          pass

      def izip_longest(*args, **kwds):
          # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
          fillvalue = kwds.get('fillvalue')
          counter = [len(args) - 1]
          def sentinel():
              if not counter[0]:
                  raise ZipExhausted
              counter[0] -= 1
              yield fillvalue
          fillers = repeat(fillvalue)
          iterators = [chain(it, sentinel(), fillers) for it in args]
          try:
              while iterators:
                  yield tuple(map(next, iterators))
          except ZipExhausted:
              pass

   If one of the iterables is potentially infinite, then the
   :func:`izip_longest` function should be wrapped with something that limits
   the number of calls (for example :func:`islice` or :func:`takewhile`). If
   not specified, *fillvalue* defaults to ``None``.

   .. versionadded:: 2.6

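A sketch of the wrapping advice above, pairing an infinite :func:`count` with
a short string. The fallback import is an assumption for Python 3, where the
function is named ``zip_longest``.

```python
from itertools import count, islice

try:
    from itertools import izip_longest  # Python 2
except ImportError:
    from itertools import zip_longest as izip_longest  # assumption: Python 3

# count(1) never ends, so the "longest" input is infinite and islice() is
# what bounds the output.
labelled = list(islice(izip_longest(count(1), 'ab', fillvalue='-'), 4))
# [(1, 'a'), (2, 'b'), (3, '-'), (4, '-')]
```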

.. function:: permutations(iterable[, r])

   Return successive *r* length permutations of elements in the *iterable*.

   If *r* is not specified or is ``None``, then *r* defaults to the length
   of the *iterable* and all possible full-length permutations
   are generated.

   Permutations are emitted in lexicographic sort order. So, if the
   input *iterable* is sorted, the permutation tuples will be produced
   in sorted order.

   Elements are treated as unique based on their position, not on their
   value. So if the input elements are unique, there will be no repeat
   values in each permutation.

   Equivalent to::

      def permutations(iterable, r=None):
          # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
          # permutations(range(3)) --> 012 021 102 120 201 210
          pool = tuple(iterable)
          n = len(pool)
          r = n if r is None else r
          if r > n:
              return
          indices = range(n)
          cycles = range(n, n-r, -1)
          yield tuple(pool[i] for i in indices[:r])
          while n:
              for i in reversed(range(r)):
                  cycles[i] -= 1
                  if cycles[i] == 0:
                      indices[i:] = indices[i+1:] + indices[i:i+1]
                      cycles[i] = n - i
                  else:
                      j = cycles[i]
                      indices[i], indices[-j] = indices[-j], indices[i]
                      yield tuple(pool[i] for i in indices[:r])
                      break
              else:
                  return

   The code for :func:`permutations` can also be expressed as a subsequence of
   :func:`product`, filtered to exclude entries with repeated elements (those
   from the same position in the input pool)::

      def permutations(iterable, r=None):
          pool = tuple(iterable)
          n = len(pool)
          r = n if r is None else r
          for indices in product(range(n), repeat=r):
              if len(set(indices)) == r:
                  yield tuple(pool[i] for i in indices)

   The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
   or zero when ``r > n``.

   .. versionadded:: 2.6

.. function:: product(*iterables[, repeat])

   Cartesian product of input iterables.

   Equivalent to nested for-loops in a generator expression. For example,
   ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.

   The nested loops cycle like an odometer with the rightmost element advancing
   on every iteration. This pattern creates a lexicographic ordering so that if
   the input iterables are sorted, the product tuples are emitted in sorted
   order.

   To compute the product of an iterable with itself, specify the number of
   repetitions with the optional *repeat* keyword argument. For example,
   ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.

   This function is equivalent to the following code, except that the
   actual implementation does not build up intermediate results in memory::

      def product(*args, **kwds):
          # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
          # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
          pools = map(tuple, args) * kwds.get('repeat', 1)
          result = [[]]
          for pool in pools:
              result = [x+[y] for x in result for y in pool]
          for prod in result:
              yield tuple(prod)

   .. versionadded:: 2.6

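The nested-loop equivalence and the *repeat* shorthand can be checked
directly. The inputs ``A`` and ``B`` below are arbitrary sample values.

```python
from itertools import product

A, B = 'xy', [1, 2]

# product(A, B) matches the nested loops; the rightmost input advances
# fastest, like the last wheel of an odometer.
nested = [(a, b) for a in A for b in B]
pairs = list(product(A, B))   # [('x', 1), ('x', 2), ('y', 1), ('y', 2)]
assert pairs == nested

# repeat=n is shorthand for passing the same iterable n times.
assert list(product(A, repeat=3)) == list(product(A, A, A))
bits = [''.join(t) for t in product('01', repeat=2)]   # ['00', '01', '10', '11']
```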

.. function:: repeat(object[, times])

   Make an iterator that returns *object* over and over again. Runs indefinitely
   unless the *times* argument is specified. Used as argument to :func:`imap` for
   invariant function parameters. Also used with :func:`izip` to create constant
   fields in a tuple record. Equivalent to::

      def repeat(object, times=None):
          # repeat(10, 3) --> 10 10 10
          if times is None:
              while True:
                  yield object
          else:
              for i in xrange(times):
                  yield object

   A common use for *repeat* is to supply a stream of constant values to *imap*
   or *zip*::

      >>> list(imap(pow, xrange(10), repeat(2)))
      [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

.. function:: starmap(function, iterable)

   Make an iterator that computes the function using arguments obtained from
   the iterable. Used instead of :func:`imap` when argument parameters are already
   grouped in tuples from a single iterable (the data has been "pre-zipped"). The
   difference between :func:`imap` and :func:`starmap` parallels the distinction
   between ``function(a,b)`` and ``function(*c)``. Equivalent to::

      def starmap(function, iterable):
          # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
          for args in iterable:
              yield function(*args)

   .. versionchanged:: 2.6
      Previously, :func:`starmap` required the function arguments to be tuples.
      Now, any iterable is allowed.

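A short sketch of :func:`starmap` on "pre-zipped" data, including the relaxed
rule that each argument group may be any iterable rather than a tuple. The
sample data is invented for the example.

```python
from itertools import starmap
import operator

# Each element already holds the full argument list for one call.
pairs = [(2, 5), (3, 2), (10, 3)]
powers = list(starmap(pow, pairs))
# powers == [32, 9, 1000]

# Argument groups need not be tuples: a list or an iterator also works.
sums = list(starmap(operator.add, [[1, 2], iter([3, 4])]))
# sums == [3, 7]
```
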
.. function:: takewhile(predicate, iterable)

   Make an iterator that returns elements from the iterable as long as the
   predicate is true. Equivalent to::

      def takewhile(predicate, iterable):
          # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
          for x in iterable:
              if predicate(x):
                  yield x
              else:
                  break


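One consequence of the equivalent code for :func:`takewhile` is worth spelling
out: the element that first fails the predicate has already been pulled from
the underlying iterator, so it is discarded. A minimal sketch, assuming the
input is a one-shot iterator:

```python
from itertools import takewhile

it = iter([1, 4, 6, 4, 1])

head = list(takewhile(lambda x: x < 5, it))
# head == [1, 4]

# The 6 was consumed by the failed predicate test and is not available here.
rest = list(it)
# rest == [4, 1]
```

If the rejected element is still needed, the caller has to arrange for that
separately, for example by iterating by hand.
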
.. function:: tee(iterable[, n=2])

   Return *n* independent iterators from a single iterable. Equivalent to::

      def tee(iterable, n=2):
          it = iter(iterable)
          deques = [collections.deque() for i in range(n)]
          def gen(mydeque):
              while True:
                  if not mydeque:             # when the local deque is empty
                      newval = next(it)       # fetch a new value and
                      for d in deques:        # load it to all the deques
                          d.append(newval)
                  yield mydeque.popleft()
          return tuple(gen(d) for d in deques)

   Once :func:`tee` has made a split, the original *iterable* should not be
   used anywhere else; otherwise, the *iterable* could get advanced without
   the tee objects being informed.

   This itertool may require significant auxiliary storage (depending on how
   much temporary data needs to be stored). In general, if one iterator uses
   most or all of the data before another iterator starts, it is faster to use
   :func:`list` instead of :func:`tee`.

   .. versionadded:: 2.4


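The two points made about :func:`tee` can be seen in a small sketch: the
returned iterators advance independently, and advancing the original iterator
directly hides values from them. The variable names are chosen only for this
example.

```python
from itertools import tee

# The two branches advance independently of each other.
a, b = tee(iter(range(5)))
first_two = [next(a), next(a)]     # a is now two items ahead of b
all_b = list(b)                    # b still sees every item
rest_a = list(a)                   # a continues where it left off

# Hazard: touching the original iterator after the split.
source = iter(range(5))
c, d = tee(source)
skipped = next(source)             # advances source behind tee's back
seen_c = list(c)                   # the skipped item never reaches c or d
seen_d = list(d)
```
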
.. _itertools-recipes:

Recipes
-------

This section shows recipes for creating an extended toolset using the existing
itertools as building blocks.

The extended tools offer the same high performance as the underlying toolset.
The superior memory performance is kept by processing elements one at a time
rather than bringing the whole iterable into memory all at once. Code volume is
kept small by linking the tools together in a functional style which helps
eliminate temporary variables. High speed is retained by preferring
"vectorized" building blocks over the use of for-loops and :term:`generator`\s
which incur interpreter overhead.

.. testcode::

   import collections
   import operator
   import random
   from operator import itemgetter

   def take(n, iterable):
       "Return first n items of the iterable as a list"
       return list(islice(iterable, n))

   def tabulate(function, start=0):
       "Return function(start), function(start+1), ..."
       return imap(function, count(start))

   def consume(iterator, n):
       "Advance the iterator n-steps ahead. If n is None, consume entirely."
       # Use functions that consume iterators at C speed.
       if n is None:
           # feed the entire iterator into a zero-length deque
           collections.deque(iterator, maxlen=0)
       else:
           # advance to the empty slice starting at position n
           next(islice(iterator, n, n), None)

   def nth(iterable, n, default=None):
       "Returns the nth item or a default value"
       return next(islice(iterable, n, None), default)

   def quantify(iterable, pred=bool):
       "Count how many times the predicate is true"
       return sum(imap(pred, iterable))

   def padnone(iterable):
       """Returns the sequence elements and then returns None indefinitely.

       Useful for emulating the behavior of the built-in map() function.
       """
       return chain(iterable, repeat(None))

   def ncycles(iterable, n):
       "Returns the sequence elements n times"
       return chain.from_iterable(repeat(tuple(iterable), n))

   def dotproduct(vec1, vec2):
       return sum(imap(operator.mul, vec1, vec2))

   def flatten(listOfLists):
       "Flatten one level of nesting"
       return chain.from_iterable(listOfLists)

   def repeatfunc(func, times=None, *args):
       """Repeat calls to func with specified arguments.

       Example:  repeatfunc(random.random)
       """
       if times is None:
           return starmap(func, repeat(args))
       return starmap(func, repeat(args, times))

   def pairwise(iterable):
       "s -> (s0,s1), (s1,s2), (s2, s3), ..."
       a, b = tee(iterable)
       next(b, None)
       return izip(a, b)

   def grouper(n, iterable, fillvalue=None):
       "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
       args = [iter(iterable)] * n
       return izip_longest(fillvalue=fillvalue, *args)

   def roundrobin(*iterables):
       "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
       # Recipe credited to George Sakkis
       pending = len(iterables)
       nexts = cycle(iter(it).next for it in iterables)
       while pending:
           try:
               for next in nexts:
                   yield next()
           except StopIteration:
               pending -= 1
               nexts = cycle(islice(nexts, pending))

   def powerset(iterable):
       "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
       s = list(iterable)
       return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

   def unique_everseen(iterable, key=None):
       "List unique elements, preserving order. Remember all elements ever seen."
       # unique_everseen('AAAABBBCCDAABBB') --> A B C D
       # unique_everseen('ABBCcAD', str.lower) --> A B C D
       seen = set()
       seen_add = seen.add
       if key is None:
           for element in ifilterfalse(seen.__contains__, iterable):
               seen_add(element)
               yield element
       else:
           for element in iterable:
               k = key(element)
               if k not in seen:
                   seen_add(k)
                   yield element

   def unique_justseen(iterable, key=None):
       "List unique elements, preserving order. Remember only the element just seen."
       # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
       # unique_justseen('ABBCcAD', str.lower) --> A B C A D
       return imap(next, imap(itemgetter(1), groupby(iterable, key)))

   def iter_except(func, exception, first=None):
       """ Call a function repeatedly until an exception is raised.

       Converts a call-until-exception interface to an iterator interface.
       Like __builtin__.iter(func, sentinel) but uses an exception instead
       of a sentinel to end the loop.

       Examples:
           bsddbiter = iter_except(db.next, bsddb.error, db.first)
           heapiter = iter_except(functools.partial(heappop, h), IndexError)
           dictiter = iter_except(d.popitem, KeyError)
           dequeiter = iter_except(d.popleft, IndexError)
           queueiter = iter_except(q.get_nowait, Queue.Empty)
           setiter = iter_except(s.pop, KeyError)

       """
       try:
           if first is not None:
               yield first()
           while 1:
               yield func()
       except exception:
           pass

   def random_product(*args, **kwds):
       "Random selection from itertools.product(*args, **kwds)"
       pools = map(tuple, args) * kwds.get('repeat', 1)
       return tuple(random.choice(pool) for pool in pools)

   def random_permutation(iterable, r=None):
       "Random selection from itertools.permutations(iterable, r)"
       pool = tuple(iterable)
       r = len(pool) if r is None else r
       return tuple(random.sample(pool, r))

   def random_combination(iterable, r):
       "Random selection from itertools.combinations(iterable, r)"
       pool = tuple(iterable)
       n = len(pool)
       indices = sorted(random.sample(xrange(n), r))
       return tuple(pool[i] for i in indices)

   def random_combination_with_replacement(iterable, r):
       "Random selection from itertools.combinations_with_replacement(iterable, r)"
       pool = tuple(iterable)
       n = len(pool)
       indices = sorted(random.randrange(n) for i in xrange(r))
       return tuple(pool[i] for i in indices)

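To show how the recipes are meant to be used, here is a usage sketch for two of
them, *pairwise* and *grouper*. They are restated so the snippet is
self-contained; the import fallbacks to ``zip`` and ``zip_longest`` are an
assumption added so it also runs on interpreters without :func:`izip` and
:func:`izip_longest`.

```python
from itertools import tee

try:
    from itertools import izip, izip_longest          # Python 2
except ImportError:
    from itertools import zip_longest as izip_longest  # assumed Python 3 names
    izip = zip

def pairwise(iterable):
    "s -> (s0,s1), (s1,s2), (s2, s3), ..."
    a, b = tee(iterable)
    next(b, None)               # shift the second branch one step ahead
    return izip(a, b)

def grouper(n, iterable, fillvalue=None):
    "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
    args = [iter(iterable)] * n  # n references to one shared iterator
    return izip_longest(fillvalue=fillvalue, *args)

pairs = list(pairwise([1, 2, 3, 4]))
# pairs == [(1, 2), (2, 3), (3, 4)]

groups = [''.join(g) for g in grouper(3, 'ABCDEFG', 'x')]
# groups == ['ABC', 'DEF', 'Gxx']
```

*grouper* relies on all *n* arguments being the same iterator object, so each
output tuple draws *n* consecutive items from the input.
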
Note, many of the above recipes can be optimized by replacing global lookups
with local variables defined as default values. For example, the
*dotproduct* recipe can be written as::

   def dotproduct(vec1, vec2, sum=sum, imap=imap, mul=operator.mul):
       return sum(imap(mul, vec1, vec2))
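
The following sketch places the plain and the default-argument forms of
*dotproduct* side by side to show that they compute the same result and that
the defaults are bound once, when the function is defined. The name
``dotproduct_fast`` and the fallback from ``imap`` to the built-in ``map`` are
assumptions made for this example only; no timing figures are claimed.

```python
import operator

try:
    from itertools import imap      # Python 2
except ImportError:
    imap = map                      # assumption: built-in map is already lazy

def dotproduct(vec1, vec2):
    # sum, imap and operator.mul are looked up on every call.
    return sum(imap(operator.mul, vec1, vec2))

def dotproduct_fast(vec1, vec2, sum=sum, imap=imap, mul=operator.mul):
    # The same names are now locals, bound once at definition time.
    return sum(imap(mul, vec1, vec2))

plain = dotproduct([1, 2, 3], [4, 5, 6])
fast = dotproduct_fast([1, 2, 3], [4, 5, 6])
# plain == fast == 32

# The bound objects are stored on the function itself.
bound = dotproduct_fast.__defaults__
```

One trade-off of this style is that the extra parameters become part of the
signature, so a caller who passes additional positional arguments would
silently override them.
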