blob: 28625e8834c41b9b476f2d3caebeb098059e4751 [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001:mod:`itertools` --- Functions creating iterators for efficient looping
2=======================================================================
3
4.. module:: itertools
Raymond Hettinger30c73622010-12-01 23:45:20 +00005 :synopsis: Functions creating iterators for efficient looping.
Georg Brandl116aa622007-08-15 14:28:22 +00006.. moduleauthor:: Raymond Hettinger <python@rcn.com>
7.. sectionauthor:: Raymond Hettinger <python@rcn.com>
8
9
Christian Heimesfe337bf2008-03-23 21:54:12 +000010.. testsetup::
11
Raymond Hettinger30c73622010-12-01 23:45:20 +000012 from itertools import *
Christian Heimesfe337bf2008-03-23 21:54:12 +000013
14
Raymond Hettingerf76b9202009-02-17 20:00:59 +000015This module implements a number of :term:`iterator` building blocks inspired
16by constructs from APL, Haskell, and SML. Each has been recast in a form
17suitable for Python.
Georg Brandl116aa622007-08-15 14:28:22 +000018
19The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettingerf76b9202009-02-17 20:00:59 +000020useful by themselves or in combination. Together, they form an "iterator
21algebra" making it possible to construct specialized tools succinctly and
22efficiently in pure Python.
Georg Brandl116aa622007-08-15 14:28:22 +000023
24For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Ezio Melottib6605992010-01-21 20:57:24 +000025sequence ``f(0), f(1), ...``. The same effect can be achieved in Python
Raymond Hettingera6c60372008-03-13 01:26:19 +000026by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000027
Raymond Hettinger2c109ab2009-03-12 00:29:44 +000028These tools and their built-in counterparts also work well with the high-speed
29functions in the :mod:`operator` module. For example, the multiplication
30operator can be mapped across two vectors to form an efficient dot-product:
31``sum(map(operator.mul, vector1, vector2))``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
Georg Brandl116aa622007-08-15 14:28:22 +000033
Raymond Hettingerf76b9202009-02-17 20:00:59 +000034**Infinite Iterators:**
Georg Brandl116aa622007-08-15 14:28:22 +000035
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000036================== ================= ================================================= =========================================
37Iterator Arguments Results Example
38================== ================= ================================================= =========================================
39:func:`count` start, [step] start, start+step, start+2*step, ... ``count(10) --> 10 11 12 13 14 ...``
40:func:`cycle` p p0, p1, ... plast, p0, p1, ... ``cycle('ABCD') --> A B C D A B C D ...``
41:func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times ``repeat(10, 3) --> 10 10 10``
42================== ================= ================================================= =========================================
Georg Brandl116aa622007-08-15 14:28:22 +000043
Raymond Hettingerf76b9202009-02-17 20:00:59 +000044**Iterators terminating on the shortest input sequence:**
45
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000046==================== ============================ ================================================= =============================================================
47Iterator Arguments Results Example
48==================== ============================ ================================================= =============================================================
Raymond Hettinger2d93e6e2010-12-03 02:33:53 +000049:func:`accumulate` p p0, p0+p1, p0+p1+p2, ... ``accumulate([1,2,3,4,5]) --> 1 3 6 10 15``
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000050:func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ... ``chain('ABC', 'DEF') --> A B C D E F``
51:func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ... ``compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F``
52:func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails ``dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1``
Georg Brandlc2da5ce2009-08-13 09:16:39 +000053:func:`filterfalse` pred, seq elements of seq where pred(elem) is False ``filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8``
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000054:func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
55:func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step] ``islice('ABCDEFG', 2, None) --> C D E F G``
56:func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ... ``starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000``
57:func:`takewhile` pred, seq seq[0], seq[1], until pred fails ``takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4``
58:func:`tee` it, n it1, it2 , ... itn splits one iterator into n
Georg Brandlc2da5ce2009-08-13 09:16:39 +000059:func:`zip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-``
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000060==================== ============================ ================================================= =============================================================
Raymond Hettingerf76b9202009-02-17 20:00:59 +000061
62**Combinatoric generators:**
63
Raymond Hettinger7f587cd2009-04-10 19:43:50 +000064============================================== ==================== =============================================================
65Iterator Arguments Results
66============================================== ==================== =============================================================
67:func:`product` p, q, ... [repeat=1] cartesian product, equivalent to a nested for-loop
68:func:`permutations` p[, r] r-length tuples, all possible orderings, no repeated elements
Raymond Hettinger36c3c022009-11-19 01:20:07 +000069:func:`combinations` p, r r-length tuples, in sorted order, no repeated elements
70:func:`combinations_with_replacement` p, r r-length tuples, in sorted order, with repeated elements
Raymond Hettinger7f587cd2009-04-10 19:43:50 +000071``product('ABCD', repeat=2)`` ``AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD``
72``permutations('ABCD', 2)`` ``AB AC AD BA BC BD CA CB CD DA DB DC``
73``combinations('ABCD', 2)`` ``AB AC AD BC BD CD``
74``combinations_with_replacement('ABCD', 2)`` ``AA AB AC AD BB BC BD CC CD DD``
75============================================== ==================== =============================================================
Georg Brandl116aa622007-08-15 14:28:22 +000076
77
78.. _itertools-functions:
79
80Itertool functions
81------------------
82
83The following module functions all construct and return iterators. Some provide
84streams of infinite length, so they should only be accessed by functions or
85loops that truncate the stream.
86
Raymond Hettinger2d93e6e2010-12-03 02:33:53 +000087.. function:: accumulate(iterable)
Raymond Hettingeradb81462010-12-01 22:50:36 +000088
Raymond Hettinger2d93e6e2010-12-03 02:33:53 +000089 Make an iterator that returns accumulated sums. Elements may be any addable
90 type including :class:`Decimal` or :class:`Fraction`. Equivalent to::
Raymond Hettingeradb81462010-12-01 22:50:36 +000091
Raymond Hettingerd8ff4652010-12-03 02:09:34 +000092 def accumulate(iterable):
Raymond Hettingeradb81462010-12-01 22:50:36 +000093 'Return running totals'
Raymond Hettingerd8ff4652010-12-03 02:09:34 +000094 # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
95 it = iter(iterable)
96 total = next(it)
97 yield total
98 for element in it:
Raymond Hettingerd9404b52010-12-04 20:51:36 +000099 total = total + element
Raymond Hettingerd8ff4652010-12-03 02:09:34 +0000100 yield total
Raymond Hettingeradb81462010-12-01 22:50:36 +0000101
102 .. versionadded:: 3.2
Georg Brandl116aa622007-08-15 14:28:22 +0000103
104.. function:: chain(*iterables)
105
Raymond Hettinger30c73622010-12-01 23:45:20 +0000106 Make an iterator that returns elements from the first iterable until it is
107 exhausted, then proceeds to the next iterable, until all of the iterables are
108 exhausted. Used for treating consecutive sequences as a single sequence.
109 Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000110
Raymond Hettinger30c73622010-12-01 23:45:20 +0000111 def chain(*iterables):
112 # chain('ABC', 'DEF') --> A B C D E F
113 for it in iterables:
114 for element in it:
115 yield element
Georg Brandl116aa622007-08-15 14:28:22 +0000116
117
Georg Brandl933b9742010-07-29 14:36:11 +0000118.. classmethod:: chain.from_iterable(iterable)
Christian Heimes70e7ea22008-02-28 20:02:27 +0000119
Raymond Hettinger30c73622010-12-01 23:45:20 +0000120 Alternate constructor for :func:`chain`. Gets chained inputs from a
121 single iterable argument that is evaluated lazily. Equivalent to::
Christian Heimes70e7ea22008-02-28 20:02:27 +0000122
Raymond Hettinger30c73622010-12-01 23:45:20 +0000123 @classmethod
124 def from_iterable(iterables):
125 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
126 for it in iterables:
127 for element in it:
128 yield element
Christian Heimes70e7ea22008-02-28 20:02:27 +0000129
Christian Heimes78644762008-03-04 23:39:23 +0000130
Christian Heimes836baa52008-02-26 08:18:30 +0000131.. function:: combinations(iterable, r)
132
Raymond Hettinger30c73622010-12-01 23:45:20 +0000133 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +0000134
Raymond Hettinger30c73622010-12-01 23:45:20 +0000135 Combinations are emitted in lexicographic sort order. So, if the
136 input *iterable* is sorted, the combination tuples will be produced
137 in sorted order.
Christian Heimes836baa52008-02-26 08:18:30 +0000138
Raymond Hettinger30c73622010-12-01 23:45:20 +0000139 Elements are treated as unique based on their position, not on their
140 value. So if the input elements are unique, there will be no repeat
141 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +0000142
Raymond Hettinger30c73622010-12-01 23:45:20 +0000143 Equivalent to::
Christian Heimes836baa52008-02-26 08:18:30 +0000144
145 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000146 # combinations('ABCD', 2) --> AB AC AD BC BD CD
147 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000148 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000149 n = len(pool)
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000150 if r > n:
151 return
152 indices = list(range(r))
Christian Heimesb558a2e2008-03-02 22:46:37 +0000153 yield tuple(pool[i] for i in indices)
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000154 while True:
Christian Heimes380f7f22008-02-28 11:19:05 +0000155 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000156 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000157 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000158 else:
159 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000160 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000161 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000162 indices[j] = indices[j-1] + 1
163 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000164
Raymond Hettinger30c73622010-12-01 23:45:20 +0000165 The code for :func:`combinations` can be also expressed as a subsequence
166 of :func:`permutations` after filtering entries where the elements are not
167 in sorted order (according to their position in the input pool)::
Christian Heimes78644762008-03-04 23:39:23 +0000168
169 def combinations(iterable, r):
170 pool = tuple(iterable)
171 n = len(pool)
172 for indices in permutations(range(n), r):
173 if sorted(indices) == list(indices):
174 yield tuple(pool[i] for i in indices)
175
Raymond Hettinger30c73622010-12-01 23:45:20 +0000176 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
177 or zero when ``r > n``.
Christian Heimes836baa52008-02-26 08:18:30 +0000178
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000179.. function:: combinations_with_replacement(iterable, r)
180
Raymond Hettinger30c73622010-12-01 23:45:20 +0000181 Return *r* length subsequences of elements from the input *iterable*
182 allowing individual elements to be repeated more than once.
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000183
Raymond Hettinger30c73622010-12-01 23:45:20 +0000184 Combinations are emitted in lexicographic sort order. So, if the
185 input *iterable* is sorted, the combination tuples will be produced
186 in sorted order.
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000187
Raymond Hettinger30c73622010-12-01 23:45:20 +0000188 Elements are treated as unique based on their position, not on their
189 value. So if the input elements are unique, the generated combinations
190 will also be unique.
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000191
Raymond Hettinger30c73622010-12-01 23:45:20 +0000192 Equivalent to::
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000193
194 def combinations_with_replacement(iterable, r):
195 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
196 pool = tuple(iterable)
197 n = len(pool)
198 if not n and r:
199 return
200 indices = [0] * r
201 yield tuple(pool[i] for i in indices)
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000202 while True:
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000203 for i in reversed(range(r)):
204 if indices[i] != n - 1:
205 break
206 else:
207 return
208 indices[i:] = [indices[i] + 1] * (r - i)
209 yield tuple(pool[i] for i in indices)
210
Raymond Hettinger30c73622010-12-01 23:45:20 +0000211 The code for :func:`combinations_with_replacement` can be also expressed as
212 a subsequence of :func:`product` after filtering entries where the elements
213 are not in sorted order (according to their position in the input pool)::
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000214
215 def combinations_with_replacement(iterable, r):
216 pool = tuple(iterable)
217 n = len(pool)
218 for indices in product(range(n), repeat=r):
219 if sorted(indices) == list(indices):
220 yield tuple(pool[i] for i in indices)
221
Raymond Hettinger30c73622010-12-01 23:45:20 +0000222 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000223
Raymond Hettinger30c73622010-12-01 23:45:20 +0000224 .. versionadded:: 3.1
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000225
Georg Brandl67b21b72010-08-17 15:07:14 +0000226
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000227.. function:: compress(data, selectors)
228
Raymond Hettinger30c73622010-12-01 23:45:20 +0000229 Make an iterator that filters elements from *data* returning only those that
230 have a corresponding element in *selectors* that evaluates to ``True``.
231 Stops when either the *data* or *selectors* iterables has been exhausted.
232 Equivalent to::
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000233
Raymond Hettinger30c73622010-12-01 23:45:20 +0000234 def compress(data, selectors):
235 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
236 return (d for d, s in zip(data, selectors) if s)
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000237
Raymond Hettinger30c73622010-12-01 23:45:20 +0000238 .. versionadded:: 3.1
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000239
240
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000241.. function:: count(start=0, step=1)
Georg Brandl116aa622007-08-15 14:28:22 +0000242
Raymond Hettinger30c73622010-12-01 23:45:20 +0000243 Make an iterator that returns evenly spaced values starting with *n*. Often
244 used as an argument to :func:`map` to generate consecutive data points.
245 Also, used with :func:`zip` to add sequence numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000246
Raymond Hettinger30c73622010-12-01 23:45:20 +0000247 def count(start=0, step=1):
248 # count(10) --> 10 11 12 13 14 ...
Georg Brandl37a80dc2011-01-13 07:31:18 +0000249 # count(2.5, 0.5) -> 2.5 3.0 3.5 ...
Raymond Hettinger30c73622010-12-01 23:45:20 +0000250 n = start
251 while True:
252 yield n
253 n += step
Georg Brandl116aa622007-08-15 14:28:22 +0000254
Raymond Hettinger30c73622010-12-01 23:45:20 +0000255 When counting with floating point numbers, better accuracy can sometimes be
256 achieved by substituting multiplicative code such as: ``(start + step * i
257 for i in count())``.
Raymond Hettinger5bc472a2009-06-17 01:40:52 +0000258
Raymond Hettinger30c73622010-12-01 23:45:20 +0000259 .. versionchanged:: 3.1
260 Added *step* argument and allowed non-integer arguments.
Georg Brandl116aa622007-08-15 14:28:22 +0000261
262.. function:: cycle(iterable)
263
Raymond Hettinger30c73622010-12-01 23:45:20 +0000264 Make an iterator returning elements from the iterable and saving a copy of each.
265 When the iterable is exhausted, return elements from the saved copy. Repeats
266 indefinitely. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000267
Raymond Hettinger30c73622010-12-01 23:45:20 +0000268 def cycle(iterable):
269 # cycle('ABCD') --> A B C D A B C D A B C D ...
270 saved = []
271 for element in iterable:
272 yield element
273 saved.append(element)
274 while saved:
275 for element in saved:
Georg Brandl116aa622007-08-15 14:28:22 +0000276 yield element
277
Raymond Hettinger30c73622010-12-01 23:45:20 +0000278 Note, this member of the toolkit may require significant auxiliary storage
279 (depending on the length of the iterable).
Georg Brandl116aa622007-08-15 14:28:22 +0000280
281
282.. function:: dropwhile(predicate, iterable)
283
Raymond Hettinger30c73622010-12-01 23:45:20 +0000284 Make an iterator that drops elements from the iterable as long as the predicate
285 is true; afterwards, returns every element. Note, the iterator does not produce
286 *any* output until the predicate first becomes false, so it may have a lengthy
287 start-up time. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000288
Raymond Hettinger30c73622010-12-01 23:45:20 +0000289 def dropwhile(predicate, iterable):
290 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
291 iterable = iter(iterable)
292 for x in iterable:
293 if not predicate(x):
294 yield x
295 break
296 for x in iterable:
297 yield x
Georg Brandl116aa622007-08-15 14:28:22 +0000298
Raymond Hettinger749761e2009-01-27 04:42:48 +0000299.. function:: filterfalse(predicate, iterable)
300
Raymond Hettinger30c73622010-12-01 23:45:20 +0000301 Make an iterator that filters elements from iterable returning only those for
302 which the predicate is ``False``. If *predicate* is ``None``, return the items
303 that are false. Equivalent to::
Raymond Hettinger749761e2009-01-27 04:42:48 +0000304
Raymond Hettinger30c73622010-12-01 23:45:20 +0000305 def filterfalse(predicate, iterable):
306 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
307 if predicate is None:
308 predicate = bool
309 for x in iterable:
310 if not predicate(x):
311 yield x
Raymond Hettinger749761e2009-01-27 04:42:48 +0000312
Georg Brandl116aa622007-08-15 14:28:22 +0000313
Georg Brandl3dd33882009-06-01 17:35:27 +0000314.. function:: groupby(iterable, key=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000315
Raymond Hettinger30c73622010-12-01 23:45:20 +0000316 Make an iterator that returns consecutive keys and groups from the *iterable*.
317 The *key* is a function computing a key value for each element. If not
318 specified or is ``None``, *key* defaults to an identity function and returns
319 the element unchanged. Generally, the iterable needs to already be sorted on
320 the same key function.
Georg Brandl116aa622007-08-15 14:28:22 +0000321
Raymond Hettinger30c73622010-12-01 23:45:20 +0000322 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
323 generates a break or new group every time the value of the key function changes
324 (which is why it is usually necessary to have sorted the data using the same key
325 function). That behavior differs from SQL's GROUP BY which aggregates common
326 elements regardless of their input order.
Georg Brandl116aa622007-08-15 14:28:22 +0000327
Raymond Hettinger30c73622010-12-01 23:45:20 +0000328 The returned group is itself an iterator that shares the underlying iterable
329 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
330 object is advanced, the previous group is no longer visible. So, if that data
331 is needed later, it should be stored as a list::
Georg Brandl116aa622007-08-15 14:28:22 +0000332
Raymond Hettinger30c73622010-12-01 23:45:20 +0000333 groups = []
334 uniquekeys = []
335 data = sorted(data, key=keyfunc)
336 for k, g in groupby(data, keyfunc):
337 groups.append(list(g)) # Store group iterator as a list
338 uniquekeys.append(k)
Georg Brandl116aa622007-08-15 14:28:22 +0000339
Raymond Hettinger30c73622010-12-01 23:45:20 +0000340 :func:`groupby` is equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000341
Raymond Hettinger30c73622010-12-01 23:45:20 +0000342 class groupby:
343 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
344 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
345 def __init__(self, iterable, key=None):
346 if key is None:
347 key = lambda x: x
348 self.keyfunc = key
349 self.it = iter(iterable)
350 self.tgtkey = self.currkey = self.currvalue = object()
351 def __iter__(self):
352 return self
353 def __next__(self):
354 while self.currkey == self.tgtkey:
355 self.currvalue = next(self.it) # Exit on StopIteration
356 self.currkey = self.keyfunc(self.currvalue)
357 self.tgtkey = self.currkey
358 return (self.currkey, self._grouper(self.tgtkey))
359 def _grouper(self, tgtkey):
360 while self.currkey == tgtkey:
361 yield self.currvalue
362 self.currvalue = next(self.it) # Exit on StopIteration
363 self.currkey = self.keyfunc(self.currvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000364
Georg Brandl116aa622007-08-15 14:28:22 +0000365
Georg Brandl116aa622007-08-15 14:28:22 +0000366.. function:: islice(iterable, [start,] stop [, step])
367
Raymond Hettinger30c73622010-12-01 23:45:20 +0000368 Make an iterator that returns selected elements from the iterable. If *start* is
369 non-zero, then elements from the iterable are skipped until start is reached.
370 Afterward, elements are returned consecutively unless *step* is set higher than
371 one which results in items being skipped. If *stop* is ``None``, then iteration
372 continues until the iterator is exhausted, if at all; otherwise, it stops at the
373 specified position. Unlike regular slicing, :func:`islice` does not support
374 negative values for *start*, *stop*, or *step*. Can be used to extract related
375 fields from data where the internal structure has been flattened (for example, a
376 multi-line report may list a name field on every third line). Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000377
Raymond Hettinger30c73622010-12-01 23:45:20 +0000378 def islice(iterable, *args):
379 # islice('ABCDEFG', 2) --> A B
380 # islice('ABCDEFG', 2, 4) --> C D
381 # islice('ABCDEFG', 2, None) --> C D E F G
382 # islice('ABCDEFG', 0, None, 2) --> A C E G
383 s = slice(*args)
384 it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
385 nexti = next(it)
386 for i, element in enumerate(iterable):
387 if i == nexti:
388 yield element
389 nexti = next(it)
Georg Brandl116aa622007-08-15 14:28:22 +0000390
Raymond Hettinger30c73622010-12-01 23:45:20 +0000391 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
392 then the step defaults to one.
Georg Brandl116aa622007-08-15 14:28:22 +0000393
Georg Brandl116aa622007-08-15 14:28:22 +0000394
Georg Brandl3dd33882009-06-01 17:35:27 +0000395.. function:: permutations(iterable, r=None)
Christian Heimes70e7ea22008-02-28 20:02:27 +0000396
Raymond Hettinger30c73622010-12-01 23:45:20 +0000397 Return successive *r* length permutations of elements in the *iterable*.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000398
Raymond Hettinger30c73622010-12-01 23:45:20 +0000399 If *r* is not specified or is ``None``, then *r* defaults to the length
400 of the *iterable* and all possible full-length permutations
401 are generated.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000402
Raymond Hettinger30c73622010-12-01 23:45:20 +0000403 Permutations are emitted in lexicographic sort order. So, if the
404 input *iterable* is sorted, the permutation tuples will be produced
405 in sorted order.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000406
Raymond Hettinger30c73622010-12-01 23:45:20 +0000407 Elements are treated as unique based on their position, not on their
408 value. So if the input elements are unique, there will be no repeat
409 values in each permutation.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000410
Raymond Hettinger30c73622010-12-01 23:45:20 +0000411 Equivalent to::
Christian Heimesb558a2e2008-03-02 22:46:37 +0000412
413 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000414 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
415 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000416 pool = tuple(iterable)
417 n = len(pool)
418 r = n if r is None else r
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000419 if r > n:
420 return
421 indices = list(range(n))
Christian Heimesfe337bf2008-03-23 21:54:12 +0000422 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000423 yield tuple(pool[i] for i in indices[:r])
424 while n:
425 for i in reversed(range(r)):
426 cycles[i] -= 1
427 if cycles[i] == 0:
428 indices[i:] = indices[i+1:] + indices[i:i+1]
429 cycles[i] = n - i
430 else:
431 j = cycles[i]
432 indices[i], indices[-j] = indices[-j], indices[i]
433 yield tuple(pool[i] for i in indices[:r])
434 break
435 else:
436 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000437
Raymond Hettinger30c73622010-12-01 23:45:20 +0000438 The code for :func:`permutations` can be also expressed as a subsequence of
439 :func:`product`, filtered to exclude entries with repeated elements (those
440 from the same position in the input pool)::
Christian Heimes78644762008-03-04 23:39:23 +0000441
442 def permutations(iterable, r=None):
443 pool = tuple(iterable)
444 n = len(pool)
445 r = n if r is None else r
446 for indices in product(range(n), repeat=r):
447 if len(set(indices)) == r:
448 yield tuple(pool[i] for i in indices)
449
Raymond Hettinger30c73622010-12-01 23:45:20 +0000450 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
451 or zero when ``r > n``.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000452
Georg Brandl3dd33882009-06-01 17:35:27 +0000453.. function:: product(*iterables, repeat=1)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000454
Raymond Hettinger30c73622010-12-01 23:45:20 +0000455 Cartesian product of input iterables.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000456
Raymond Hettinger30c73622010-12-01 23:45:20 +0000457 Equivalent to nested for-loops in a generator expression. For example,
458 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000459
Raymond Hettinger30c73622010-12-01 23:45:20 +0000460 The nested loops cycle like an odometer with the rightmost element advancing
461 on every iteration. This pattern creates a lexicographic ordering so that if
462 the input's iterables are sorted, the product tuples are emitted in sorted
463 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000464
Raymond Hettinger30c73622010-12-01 23:45:20 +0000465 To compute the product of an iterable with itself, specify the number of
466 repetitions with the optional *repeat* keyword argument. For example,
467 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000468
Raymond Hettinger30c73622010-12-01 23:45:20 +0000469 This function is equivalent to the following code, except that the
470 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000471
Raymond Hettinger30c73622010-12-01 23:45:20 +0000472 def product(*args, repeat=1):
473 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
474 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
475 pools = [tuple(pool) for pool in args] * repeat
476 result = [[]]
477 for pool in pools:
478 result = [x+[y] for x in result for y in pool]
479 for prod in result:
480 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000481
482
Raymond Hettingerd75ad442009-06-01 19:16:52 +0000483.. function:: repeat(object[, times])
Georg Brandl116aa622007-08-15 14:28:22 +0000484
Raymond Hettinger30c73622010-12-01 23:45:20 +0000485 Make an iterator that returns *object* over and over again. Runs indefinitely
486 unless the *times* argument is specified. Used as argument to :func:`map` for
487 invariant parameters to the called function. Also used with :func:`zip` to
488 create an invariant part of a tuple record. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000489
Raymond Hettinger30c73622010-12-01 23:45:20 +0000490 def repeat(object, times=None):
491 # repeat(10, 3) --> 10 10 10
492 if times is None:
493 while True:
494 yield object
495 else:
496 for i in range(times):
497 yield object
Georg Brandl116aa622007-08-15 14:28:22 +0000498
499
500.. function:: starmap(function, iterable)
501
Raymond Hettinger30c73622010-12-01 23:45:20 +0000502 Make an iterator that computes the function using arguments obtained from
503 the iterable. Used instead of :func:`map` when argument parameters are already
504 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
505 difference between :func:`map` and :func:`starmap` parallels the distinction
506 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000507
Raymond Hettinger30c73622010-12-01 23:45:20 +0000508 def starmap(function, iterable):
509 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
510 for args in iterable:
511 yield function(*args)
Christian Heimes679db4a2008-01-18 09:56:22 +0000512
Georg Brandl116aa622007-08-15 14:28:22 +0000513
514.. function:: takewhile(predicate, iterable)
515
Raymond Hettinger30c73622010-12-01 23:45:20 +0000516 Make an iterator that returns elements from the iterable as long as the
517 predicate is true. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000518
Raymond Hettinger30c73622010-12-01 23:45:20 +0000519 def takewhile(predicate, iterable):
520 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
521 for x in iterable:
522 if predicate(x):
523 yield x
524 else:
525 break
Georg Brandl116aa622007-08-15 14:28:22 +0000526
527
Georg Brandl3dd33882009-06-01 17:35:27 +0000528.. function:: tee(iterable, n=2)
Georg Brandl116aa622007-08-15 14:28:22 +0000529
Raymond Hettinger30c73622010-12-01 23:45:20 +0000530 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000531
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000532 def tee(iterable, n=2):
533 it = iter(iterable)
534 deques = [collections.deque() for i in range(n)]
535 def gen(mydeque):
536 while True:
537 if not mydeque: # when the local deque is empty
538 newval = next(it) # fetch a new value and
539 for d in deques: # load it to all the deques
540 d.append(newval)
541 yield mydeque.popleft()
542 return tuple(gen(d) for d in deques)
Georg Brandl116aa622007-08-15 14:28:22 +0000543
Raymond Hettinger30c73622010-12-01 23:45:20 +0000544 Once :func:`tee` has made a split, the original *iterable* should not be
545 used anywhere else; otherwise, the *iterable* could get advanced without
546 the tee objects being informed.
Georg Brandl116aa622007-08-15 14:28:22 +0000547
Raymond Hettinger30c73622010-12-01 23:45:20 +0000548 This itertool may require significant auxiliary storage (depending on how
549 much temporary data needs to be stored). In general, if one iterator uses
550 most or all of the data before another iterator starts, it is faster to use
551 :func:`list` instead of :func:`tee`.
Georg Brandl116aa622007-08-15 14:28:22 +0000552
Georg Brandl116aa622007-08-15 14:28:22 +0000553
Georg Brandl3dd33882009-06-01 17:35:27 +0000554.. function:: zip_longest(*iterables, fillvalue=None)
Raymond Hettinger749761e2009-01-27 04:42:48 +0000555
Raymond Hettinger30c73622010-12-01 23:45:20 +0000556 Make an iterator that aggregates elements from each of the iterables. If the
557 iterables are of uneven length, missing values are filled-in with *fillvalue*.
558 Iteration continues until the longest iterable is exhausted. Equivalent to::
Raymond Hettinger749761e2009-01-27 04:42:48 +0000559
Raymond Hettinger6f45d182011-10-30 15:06:14 -0700560 class ZipExhausted(Exception):
561 pass
562
563 def zip_longest(*args, **kwds):
Raymond Hettinger30c73622010-12-01 23:45:20 +0000564 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Raymond Hettinger6f45d182011-10-30 15:06:14 -0700565 fillvalue = kwds.get('fillvalue')
566 counter = len(args) - 1
567 def sentinel():
568 nonlocal counter
569 if not counter:
570 raise ZipExhausted
571 counter -= 1
572 yield fillvalue
Raymond Hettinger30c73622010-12-01 23:45:20 +0000573 fillers = repeat(fillvalue)
Raymond Hettinger6f45d182011-10-30 15:06:14 -0700574 iterators = [chain(it, sentinel(), fillers) for it in args]
Raymond Hettinger30c73622010-12-01 23:45:20 +0000575 try:
Raymond Hettinger6f45d182011-10-30 15:06:14 -0700576 while iterators:
577 yield tuple(map(next, iterators))
578 except ZipExhausted:
Raymond Hettinger30c73622010-12-01 23:45:20 +0000579 pass
Raymond Hettinger749761e2009-01-27 04:42:48 +0000580
Raymond Hettinger30c73622010-12-01 23:45:20 +0000581 If one of the iterables is potentially infinite, then the :func:`zip_longest`
582 function should be wrapped with something that limits the number of calls
583 (for example :func:`islice` or :func:`takewhile`). If not specified,
584 *fillvalue* defaults to ``None``.
Raymond Hettinger749761e2009-01-27 04:42:48 +0000585
586
Georg Brandl116aa622007-08-15 14:28:22 +0000587.. _itertools-recipes:
588
Raymond Hettinger1fa76822010-12-06 23:31:36 +0000589Itertools Recipes
590-----------------
Georg Brandl116aa622007-08-15 14:28:22 +0000591
592This section shows recipes for creating an extended toolset using the existing
593itertools as building blocks.
594
595The extended tools offer the same high performance as the underlying toolset.
596The superior memory performance is kept by processing elements one at a time
597rather than bringing the whole iterable into memory all at once. Code volume is
598kept small by linking the tools together in a functional style which helps
599eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000600"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000601which incur interpreter overhead.
602
603.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000604
Raymond Hettinger30c73622010-12-01 23:45:20 +0000605 def take(n, iterable):
606 "Return first n items of the iterable as a list"
607 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000608
Raymond Hettinger30c73622010-12-01 23:45:20 +0000609 def tabulate(function, start=0):
610 "Return function(0), function(1), ..."
611 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000612
Raymond Hettinger30c73622010-12-01 23:45:20 +0000613 def consume(iterator, n):
614 "Advance the iterator n-steps ahead. If n is none, consume entirely."
615 # Use functions that consume iterators at C speed.
616 if n is None:
617 # feed the entire iterator into a zero-length deque
618 collections.deque(iterator, maxlen=0)
619 else:
620 # advance to the empty slice starting at position n
621 next(islice(iterator, n, n), None)
Raymond Hettingerfa007962009-03-09 11:55:25 +0000622
Raymond Hettinger30c73622010-12-01 23:45:20 +0000623 def nth(iterable, n, default=None):
624 "Returns the nth item or a default value"
625 return next(islice(iterable, n, None), default)
Georg Brandl116aa622007-08-15 14:28:22 +0000626
Raymond Hettinger30c73622010-12-01 23:45:20 +0000627 def quantify(iterable, pred=bool):
628 "Count how many times the predicate is true"
629 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000630
Raymond Hettinger30c73622010-12-01 23:45:20 +0000631 def padnone(iterable):
632 """Returns the sequence elements and then returns None indefinitely.
Georg Brandl116aa622007-08-15 14:28:22 +0000633
Raymond Hettinger30c73622010-12-01 23:45:20 +0000634 Useful for emulating the behavior of the built-in map() function.
635 """
636 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000637
Raymond Hettinger30c73622010-12-01 23:45:20 +0000638 def ncycles(iterable, n):
639 "Returns the sequence elements n times"
640 return chain.from_iterable(repeat(tuple(iterable), n))
Georg Brandl116aa622007-08-15 14:28:22 +0000641
Raymond Hettinger30c73622010-12-01 23:45:20 +0000642 def dotproduct(vec1, vec2):
643 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000644
Raymond Hettinger30c73622010-12-01 23:45:20 +0000645 def flatten(listOfLists):
646 "Flatten one level of nesting"
647 return chain.from_iterable(listOfLists)
Georg Brandl116aa622007-08-15 14:28:22 +0000648
Raymond Hettinger30c73622010-12-01 23:45:20 +0000649 def repeatfunc(func, times=None, *args):
650 """Repeat calls to func with specified arguments.
Georg Brandl116aa622007-08-15 14:28:22 +0000651
Raymond Hettinger30c73622010-12-01 23:45:20 +0000652 Example: repeatfunc(random.random)
653 """
654 if times is None:
655 return starmap(func, repeat(args))
656 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000657
Raymond Hettinger30c73622010-12-01 23:45:20 +0000658 def pairwise(iterable):
659 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
660 a, b = tee(iterable)
661 next(b, None)
662 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000663
Raymond Hettinger30c73622010-12-01 23:45:20 +0000664 def grouper(n, iterable, fillvalue=None):
665 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
666 args = [iter(iterable)] * n
667 return zip_longest(*args, fillvalue=fillvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000668
Raymond Hettinger30c73622010-12-01 23:45:20 +0000669 def roundrobin(*iterables):
670 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
671 # Recipe credited to George Sakkis
672 pending = len(iterables)
673 nexts = cycle(iter(it).__next__ for it in iterables)
674 while pending:
675 try:
676 for next in nexts:
677 yield next()
678 except StopIteration:
679 pending -= 1
680 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000681
Raymond Hettinger30c73622010-12-01 23:45:20 +0000682 def partition(pred, iterable):
683 'Use a predicate to partition entries into false entries and true entries'
684 # partition(is_odd, range(10)) --> 0 2 4 6 8 and 1 3 5 7 9
685 t1, t2 = tee(iterable)
686 return filterfalse(pred, t1), filter(pred, t2)
Raymond Hettinger5ce0aa22010-12-01 10:49:19 +0000687
Raymond Hettinger30c73622010-12-01 23:45:20 +0000688 def powerset(iterable):
689 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
690 s = list(iterable)
691 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger08d01ee2010-08-07 05:36:53 +0000692
Raymond Hettinger30c73622010-12-01 23:45:20 +0000693 def unique_everseen(iterable, key=None):
694 "List unique elements, preserving order. Remember all elements ever seen."
695 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
696 # unique_everseen('ABBCcAD', str.lower) --> A B C D
697 seen = set()
698 seen_add = seen.add
699 if key is None:
700 for element in filterfalse(seen.__contains__, iterable):
701 seen_add(element)
702 yield element
703 else:
704 for element in iterable:
705 k = key(element)
706 if k not in seen:
707 seen_add(k)
708 yield element
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000709
Raymond Hettinger30c73622010-12-01 23:45:20 +0000710 def unique_justseen(iterable, key=None):
711 "List unique elements, preserving order. Remember only the element just seen."
712 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
713 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
714 return map(next, map(itemgetter(1), groupby(iterable, key)))
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000715
Raymond Hettinger30c73622010-12-01 23:45:20 +0000716 def iter_except(func, exception, first=None):
717 """ Call a function repeatedly until an exception is raised.
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000718
Raymond Hettinger30c73622010-12-01 23:45:20 +0000719 Converts a call-until-exception interface to an iterator interface.
720 Like __builtin__.iter(func, sentinel) but uses an exception instead
721 of a sentinel to end the loop.
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000722
Raymond Hettinger30c73622010-12-01 23:45:20 +0000723 Examples:
724 iter_except(functools.partial(heappop, h), IndexError) # priority queue iterator
725 iter_except(d.popitem, KeyError) # non-blocking dict iterator
726 iter_except(d.popleft, IndexError) # non-blocking deque iterator
727 iter_except(q.get_nowait, Queue.Empty) # loop over a producer Queue
728 iter_except(s.pop, KeyError) # non-blocking set iterator
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000729
Raymond Hettinger30c73622010-12-01 23:45:20 +0000730 """
731 try:
732 if first is not None:
733 yield first() # For database APIs needing an initial cast to db.first()
734 while 1:
735 yield func()
736 except exception:
737 pass
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000738
Raymond Hettinger30c73622010-12-01 23:45:20 +0000739 def random_product(*args, repeat=1):
740 "Random selection from itertools.product(*args, **kwds)"
741 pools = [tuple(pool) for pool in args] * repeat
742 return tuple(random.choice(pool) for pool in pools)
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000743
Raymond Hettinger30c73622010-12-01 23:45:20 +0000744 def random_permutation(iterable, r=None):
745 "Random selection from itertools.permutations(iterable, r)"
746 pool = tuple(iterable)
747 r = len(pool) if r is None else r
748 return tuple(random.sample(pool, r))
Raymond Hettinger063a4b62010-04-02 04:18:18 +0000749
Raymond Hettinger30c73622010-12-01 23:45:20 +0000750 def random_combination(iterable, r):
751 "Random selection from itertools.combinations(iterable, r)"
752 pool = tuple(iterable)
753 n = len(pool)
754 indices = sorted(random.sample(range(n), r))
755 return tuple(pool[i] for i in indices)
Raymond Hettinger063a4b62010-04-02 04:18:18 +0000756
Raymond Hettinger30c73622010-12-01 23:45:20 +0000757 def random_combination_with_replacement(iterable, r):
758 "Random selection from itertools.combinations_with_replacement(iterable, r)"
759 pool = tuple(iterable)
760 n = len(pool)
761 indices = sorted(random.randrange(n) for i in range(r))
762 return tuple(pool[i] for i in indices)
Raymond Hettinger063a4b62010-04-02 04:18:18 +0000763
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000764Note, many of the above recipes can be optimized by replacing global lookups
765with local variables defined as default values. For example, the
766*dotproduct* recipe can be written as::
767
Raymond Hettinger30c73622010-12-01 23:45:20 +0000768 def dotproduct(vec1, vec2, sum=sum, map=map, mul=operator.mul):
769 return sum(map(mul, vec1, vec2))