blob: 308f925d76c9b4757c67b6752bf567fd5810732c [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001:mod:`itertools` --- Functions creating iterators for efficient looping
2=======================================================================
3
4.. module:: itertools
Raymond Hettinger30c73622010-12-01 23:45:20 +00005 :synopsis: Functions creating iterators for efficient looping.
Georg Brandl116aa622007-08-15 14:28:22 +00006.. moduleauthor:: Raymond Hettinger <python@rcn.com>
7.. sectionauthor:: Raymond Hettinger <python@rcn.com>
8
9
Christian Heimesfe337bf2008-03-23 21:54:12 +000010.. testsetup::
11
Raymond Hettinger30c73622010-12-01 23:45:20 +000012 from itertools import *
Christian Heimesfe337bf2008-03-23 21:54:12 +000013
14
Raymond Hettingerf76b9202009-02-17 20:00:59 +000015This module implements a number of :term:`iterator` building blocks inspired
16by constructs from APL, Haskell, and SML. Each has been recast in a form
17suitable for Python.
Georg Brandl116aa622007-08-15 14:28:22 +000018
19The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettingerf76b9202009-02-17 20:00:59 +000020useful by themselves or in combination. Together, they form an "iterator
21algebra" making it possible to construct specialized tools succinctly and
22efficiently in pure Python.
Georg Brandl116aa622007-08-15 14:28:22 +000023
24For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Ezio Melottib6605992010-01-21 20:57:24 +000025sequence ``f(0), f(1), ...``. The same effect can be achieved in Python
Raymond Hettingera6c60372008-03-13 01:26:19 +000026by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000027
Raymond Hettinger2c109ab2009-03-12 00:29:44 +000028These tools and their built-in counterparts also work well with the high-speed
29functions in the :mod:`operator` module. For example, the multiplication
30operator can be mapped across two vectors to form an efficient dot-product:
31``sum(map(operator.mul, vector1, vector2))``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
Georg Brandl116aa622007-08-15 14:28:22 +000033
Raymond Hettingerf76b9202009-02-17 20:00:59 +000034**Infinite Iterators:**
Georg Brandl116aa622007-08-15 14:28:22 +000035
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000036================== ================= ================================================= =========================================
37Iterator Arguments Results Example
38================== ================= ================================================= =========================================
39:func:`count` start, [step] start, start+step, start+2*step, ... ``count(10) --> 10 11 12 13 14 ...``
40:func:`cycle` p p0, p1, ... plast, p0, p1, ... ``cycle('ABCD') --> A B C D A B C D ...``
41:func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times ``repeat(10, 3) --> 10 10 10``
42================== ================= ================================================= =========================================
Georg Brandl116aa622007-08-15 14:28:22 +000043
Raymond Hettingerf76b9202009-02-17 20:00:59 +000044**Iterators terminating on the shortest input sequence:**
45
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000046==================== ============================ ================================================= =============================================================
47Iterator Arguments Results Example
48==================== ============================ ================================================= =============================================================
Raymond Hettinger2d93e6e2010-12-03 02:33:53 +000049:func:`accumulate` p p0, p0+p1, p0+p1+p2, ... ``accumulate([1,2,3,4,5]) --> 1 3 6 10 15``
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000050:func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ... ``chain('ABC', 'DEF') --> A B C D E F``
51:func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ... ``compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F``
52:func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails ``dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1``
Georg Brandlc2da5ce2009-08-13 09:16:39 +000053:func:`filterfalse` pred, seq elements of seq where pred(elem) is False ``filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8``
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000054:func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
55:func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step] ``islice('ABCDEFG', 2, None) --> C D E F G``
56:func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ... ``starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000``
57:func:`takewhile` pred, seq seq[0], seq[1], until pred fails ``takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4``
58:func:`tee` it, n it1, it2 , ... itn splits one iterator into n
Georg Brandlc2da5ce2009-08-13 09:16:39 +000059:func:`zip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-``
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000060==================== ============================ ================================================= =============================================================
Raymond Hettingerf76b9202009-02-17 20:00:59 +000061
62**Combinatoric generators:**
63
Raymond Hettinger7f587cd2009-04-10 19:43:50 +000064============================================== ==================== =============================================================
65Iterator Arguments Results
66============================================== ==================== =============================================================
67:func:`product` p, q, ... [repeat=1] cartesian product, equivalent to a nested for-loop
68:func:`permutations` p[, r] r-length tuples, all possible orderings, no repeated elements
Raymond Hettinger36c3c022009-11-19 01:20:07 +000069:func:`combinations` p, r r-length tuples, in sorted order, no repeated elements
70:func:`combinations_with_replacement` p, r r-length tuples, in sorted order, with repeated elements
Raymond Hettinger7f587cd2009-04-10 19:43:50 +000071``product('ABCD', repeat=2)`` ``AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD``
72``permutations('ABCD', 2)`` ``AB AC AD BA BC BD CA CB CD DA DB DC``
73``combinations('ABCD', 2)`` ``AB AC AD BC BD CD``
74``combinations_with_replacement('ABCD', 2)`` ``AA AB AC AD BB BC BD CC CD DD``
75============================================== ==================== =============================================================
Georg Brandl116aa622007-08-15 14:28:22 +000076
77
78.. _itertools-functions:
79
80Itertool functions
81------------------
82
83The following module functions all construct and return iterators. Some provide
84streams of infinite length, so they should only be accessed by functions or
85loops that truncate the stream.
86
Raymond Hettinger2d93e6e2010-12-03 02:33:53 +000087.. function:: accumulate(iterable)
Raymond Hettingeradb81462010-12-01 22:50:36 +000088
Raymond Hettinger2d93e6e2010-12-03 02:33:53 +000089 Make an iterator that returns accumulated sums. Elements may be any addable
90 type including :class:`Decimal` or :class:`Fraction`. Equivalent to::
Raymond Hettingeradb81462010-12-01 22:50:36 +000091
Raymond Hettingerd8ff4652010-12-03 02:09:34 +000092 def accumulate(iterable):
Raymond Hettingeradb81462010-12-01 22:50:36 +000093 'Return running totals'
Raymond Hettingerd8ff4652010-12-03 02:09:34 +000094 # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
95 it = iter(iterable)
96 total = next(it)
97 yield total
98 for element in it:
Raymond Hettingerd9404b52010-12-04 20:51:36 +000099 total = total + element
Raymond Hettingerd8ff4652010-12-03 02:09:34 +0000100 yield total
Raymond Hettingeradb81462010-12-01 22:50:36 +0000101
102 .. versionadded:: 3.2
Georg Brandl116aa622007-08-15 14:28:22 +0000103
104.. function:: chain(*iterables)
105
Raymond Hettinger30c73622010-12-01 23:45:20 +0000106 Make an iterator that returns elements from the first iterable until it is
107 exhausted, then proceeds to the next iterable, until all of the iterables are
108 exhausted. Used for treating consecutive sequences as a single sequence.
109 Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000110
Raymond Hettinger30c73622010-12-01 23:45:20 +0000111 def chain(*iterables):
112 # chain('ABC', 'DEF') --> A B C D E F
113 for it in iterables:
114 for element in it:
115 yield element
Georg Brandl116aa622007-08-15 14:28:22 +0000116
117
Georg Brandl933b9742010-07-29 14:36:11 +0000118.. classmethod:: chain.from_iterable(iterable)
Christian Heimes70e7ea22008-02-28 20:02:27 +0000119
Raymond Hettinger30c73622010-12-01 23:45:20 +0000120 Alternate constructor for :func:`chain`. Gets chained inputs from a
121 single iterable argument that is evaluated lazily. Equivalent to::
Christian Heimes70e7ea22008-02-28 20:02:27 +0000122
Raymond Hettinger30c73622010-12-01 23:45:20 +0000123 @classmethod
124 def from_iterable(iterables):
125 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
126 for it in iterables:
127 for element in it:
128 yield element
Christian Heimes70e7ea22008-02-28 20:02:27 +0000129
Christian Heimes78644762008-03-04 23:39:23 +0000130
Christian Heimes836baa52008-02-26 08:18:30 +0000131.. function:: combinations(iterable, r)
132
Raymond Hettinger30c73622010-12-01 23:45:20 +0000133 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +0000134
Raymond Hettinger30c73622010-12-01 23:45:20 +0000135 Combinations are emitted in lexicographic sort order. So, if the
136 input *iterable* is sorted, the combination tuples will be produced
137 in sorted order.
Christian Heimes836baa52008-02-26 08:18:30 +0000138
Raymond Hettinger30c73622010-12-01 23:45:20 +0000139 Elements are treated as unique based on their position, not on their
140 value. So if the input elements are unique, there will be no repeat
141 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +0000142
Raymond Hettinger30c73622010-12-01 23:45:20 +0000143 Equivalent to::
Christian Heimes836baa52008-02-26 08:18:30 +0000144
145 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000146 # combinations('ABCD', 2) --> AB AC AD BC BD CD
147 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000148 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000149 n = len(pool)
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000150 if r > n:
151 return
152 indices = list(range(r))
Christian Heimesb558a2e2008-03-02 22:46:37 +0000153 yield tuple(pool[i] for i in indices)
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000154 while True:
Christian Heimes380f7f22008-02-28 11:19:05 +0000155 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000156 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000157 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000158 else:
159 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000160 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000161 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000162 indices[j] = indices[j-1] + 1
163 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000164
Raymond Hettinger30c73622010-12-01 23:45:20 +0000165 The code for :func:`combinations` can be also expressed as a subsequence
166 of :func:`permutations` after filtering entries where the elements are not
167 in sorted order (according to their position in the input pool)::
Christian Heimes78644762008-03-04 23:39:23 +0000168
169 def combinations(iterable, r):
170 pool = tuple(iterable)
171 n = len(pool)
172 for indices in permutations(range(n), r):
173 if sorted(indices) == list(indices):
174 yield tuple(pool[i] for i in indices)
175
Raymond Hettinger30c73622010-12-01 23:45:20 +0000176 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
177 or zero when ``r > n``.
Christian Heimes836baa52008-02-26 08:18:30 +0000178
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000179.. function:: combinations_with_replacement(iterable, r)
180
Raymond Hettinger30c73622010-12-01 23:45:20 +0000181 Return *r* length subsequences of elements from the input *iterable*
182 allowing individual elements to be repeated more than once.
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000183
Raymond Hettinger30c73622010-12-01 23:45:20 +0000184 Combinations are emitted in lexicographic sort order. So, if the
185 input *iterable* is sorted, the combination tuples will be produced
186 in sorted order.
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000187
Raymond Hettinger30c73622010-12-01 23:45:20 +0000188 Elements are treated as unique based on their position, not on their
189 value. So if the input elements are unique, the generated combinations
190 will also be unique.
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000191
Raymond Hettinger30c73622010-12-01 23:45:20 +0000192 Equivalent to::
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000193
194 def combinations_with_replacement(iterable, r):
195 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
196 pool = tuple(iterable)
197 n = len(pool)
198 if not n and r:
199 return
200 indices = [0] * r
201 yield tuple(pool[i] for i in indices)
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000202 while True:
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000203 for i in reversed(range(r)):
204 if indices[i] != n - 1:
205 break
206 else:
207 return
208 indices[i:] = [indices[i] + 1] * (r - i)
209 yield tuple(pool[i] for i in indices)
210
Raymond Hettinger30c73622010-12-01 23:45:20 +0000211 The code for :func:`combinations_with_replacement` can be also expressed as
212 a subsequence of :func:`product` after filtering entries where the elements
213 are not in sorted order (according to their position in the input pool)::
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000214
215 def combinations_with_replacement(iterable, r):
216 pool = tuple(iterable)
217 n = len(pool)
218 for indices in product(range(n), repeat=r):
219 if sorted(indices) == list(indices):
220 yield tuple(pool[i] for i in indices)
221
Raymond Hettinger30c73622010-12-01 23:45:20 +0000222 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000223
Raymond Hettinger30c73622010-12-01 23:45:20 +0000224 .. versionadded:: 3.1
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000225
Georg Brandl67b21b72010-08-17 15:07:14 +0000226
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000227.. function:: compress(data, selectors)
228
Raymond Hettinger30c73622010-12-01 23:45:20 +0000229 Make an iterator that filters elements from *data* returning only those that
230 have a corresponding element in *selectors* that evaluates to ``True``.
231 Stops when either the *data* or *selectors* iterables has been exhausted.
232 Equivalent to::
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000233
Raymond Hettinger30c73622010-12-01 23:45:20 +0000234 def compress(data, selectors):
235 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
236 return (d for d, s in zip(data, selectors) if s)
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000237
Raymond Hettinger30c73622010-12-01 23:45:20 +0000238 .. versionadded:: 3.1
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000239
240
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000241.. function:: count(start=0, step=1)
Georg Brandl116aa622007-08-15 14:28:22 +0000242
Raymond Hettinger30c73622010-12-01 23:45:20 +0000243 Make an iterator that returns evenly spaced values starting with *n*. Often
244 used as an argument to :func:`map` to generate consecutive data points.
245 Also, used with :func:`zip` to add sequence numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000246
Raymond Hettinger30c73622010-12-01 23:45:20 +0000247 def count(start=0, step=1):
248 # count(10) --> 10 11 12 13 14 ...
Georg Brandl37a80dc2011-01-13 07:31:18 +0000249 # count(2.5, 0.5) -> 2.5 3.0 3.5 ...
Raymond Hettinger30c73622010-12-01 23:45:20 +0000250 n = start
251 while True:
252 yield n
253 n += step
Georg Brandl116aa622007-08-15 14:28:22 +0000254
Raymond Hettinger30c73622010-12-01 23:45:20 +0000255 When counting with floating point numbers, better accuracy can sometimes be
256 achieved by substituting multiplicative code such as: ``(start + step * i
257 for i in count())``.
Raymond Hettinger5bc472a2009-06-17 01:40:52 +0000258
Raymond Hettinger30c73622010-12-01 23:45:20 +0000259 .. versionchanged:: 3.1
260 Added *step* argument and allowed non-integer arguments.
Georg Brandl116aa622007-08-15 14:28:22 +0000261
262.. function:: cycle(iterable)
263
Raymond Hettinger30c73622010-12-01 23:45:20 +0000264 Make an iterator returning elements from the iterable and saving a copy of each.
265 When the iterable is exhausted, return elements from the saved copy. Repeats
266 indefinitely. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000267
Raymond Hettinger30c73622010-12-01 23:45:20 +0000268 def cycle(iterable):
269 # cycle('ABCD') --> A B C D A B C D A B C D ...
270 saved = []
271 for element in iterable:
272 yield element
273 saved.append(element)
274 while saved:
275 for element in saved:
Georg Brandl116aa622007-08-15 14:28:22 +0000276 yield element
277
Raymond Hettinger30c73622010-12-01 23:45:20 +0000278 Note, this member of the toolkit may require significant auxiliary storage
279 (depending on the length of the iterable).
Georg Brandl116aa622007-08-15 14:28:22 +0000280
281
282.. function:: dropwhile(predicate, iterable)
283
Raymond Hettinger30c73622010-12-01 23:45:20 +0000284 Make an iterator that drops elements from the iterable as long as the predicate
285 is true; afterwards, returns every element. Note, the iterator does not produce
286 *any* output until the predicate first becomes false, so it may have a lengthy
287 start-up time. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000288
Raymond Hettinger30c73622010-12-01 23:45:20 +0000289 def dropwhile(predicate, iterable):
290 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
291 iterable = iter(iterable)
292 for x in iterable:
293 if not predicate(x):
294 yield x
295 break
296 for x in iterable:
297 yield x
Georg Brandl116aa622007-08-15 14:28:22 +0000298
Raymond Hettinger749761e2009-01-27 04:42:48 +0000299.. function:: filterfalse(predicate, iterable)
300
Raymond Hettinger30c73622010-12-01 23:45:20 +0000301 Make an iterator that filters elements from iterable returning only those for
302 which the predicate is ``False``. If *predicate* is ``None``, return the items
303 that are false. Equivalent to::
Raymond Hettinger749761e2009-01-27 04:42:48 +0000304
Raymond Hettinger30c73622010-12-01 23:45:20 +0000305 def filterfalse(predicate, iterable):
306 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
307 if predicate is None:
308 predicate = bool
309 for x in iterable:
310 if not predicate(x):
311 yield x
Raymond Hettinger749761e2009-01-27 04:42:48 +0000312
Georg Brandl116aa622007-08-15 14:28:22 +0000313
Georg Brandl3dd33882009-06-01 17:35:27 +0000314.. function:: groupby(iterable, key=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000315
Raymond Hettinger30c73622010-12-01 23:45:20 +0000316 Make an iterator that returns consecutive keys and groups from the *iterable*.
317 The *key* is a function computing a key value for each element. If not
318 specified or is ``None``, *key* defaults to an identity function and returns
319 the element unchanged. Generally, the iterable needs to already be sorted on
320 the same key function.
Georg Brandl116aa622007-08-15 14:28:22 +0000321
Raymond Hettinger30c73622010-12-01 23:45:20 +0000322 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
323 generates a break or new group every time the value of the key function changes
324 (which is why it is usually necessary to have sorted the data using the same key
325 function). That behavior differs from SQL's GROUP BY which aggregates common
326 elements regardless of their input order.
Georg Brandl116aa622007-08-15 14:28:22 +0000327
Raymond Hettinger30c73622010-12-01 23:45:20 +0000328 The returned group is itself an iterator that shares the underlying iterable
329 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
330 object is advanced, the previous group is no longer visible. So, if that data
331 is needed later, it should be stored as a list::
Georg Brandl116aa622007-08-15 14:28:22 +0000332
Raymond Hettinger30c73622010-12-01 23:45:20 +0000333 groups = []
334 uniquekeys = []
335 data = sorted(data, key=keyfunc)
336 for k, g in groupby(data, keyfunc):
337 groups.append(list(g)) # Store group iterator as a list
338 uniquekeys.append(k)
Georg Brandl116aa622007-08-15 14:28:22 +0000339
Raymond Hettinger30c73622010-12-01 23:45:20 +0000340 :func:`groupby` is equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000341
Raymond Hettinger30c73622010-12-01 23:45:20 +0000342 class groupby:
343 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
344 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
345 def __init__(self, iterable, key=None):
346 if key is None:
347 key = lambda x: x
348 self.keyfunc = key
349 self.it = iter(iterable)
350 self.tgtkey = self.currkey = self.currvalue = object()
351 def __iter__(self):
352 return self
353 def __next__(self):
354 while self.currkey == self.tgtkey:
355 self.currvalue = next(self.it) # Exit on StopIteration
356 self.currkey = self.keyfunc(self.currvalue)
357 self.tgtkey = self.currkey
358 return (self.currkey, self._grouper(self.tgtkey))
359 def _grouper(self, tgtkey):
360 while self.currkey == tgtkey:
361 yield self.currvalue
362 self.currvalue = next(self.it) # Exit on StopIteration
363 self.currkey = self.keyfunc(self.currvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000364
Georg Brandl116aa622007-08-15 14:28:22 +0000365
Ezio Melottie0add762012-09-14 06:32:35 +0300366.. function:: islice(iterable, stop)
367 islice(iterable, start, stop[, step])
Georg Brandl116aa622007-08-15 14:28:22 +0000368
Raymond Hettinger30c73622010-12-01 23:45:20 +0000369 Make an iterator that returns selected elements from the iterable. If *start* is
370 non-zero, then elements from the iterable are skipped until start is reached.
371 Afterward, elements are returned consecutively unless *step* is set higher than
372 one which results in items being skipped. If *stop* is ``None``, then iteration
373 continues until the iterator is exhausted, if at all; otherwise, it stops at the
374 specified position. Unlike regular slicing, :func:`islice` does not support
375 negative values for *start*, *stop*, or *step*. Can be used to extract related
376 fields from data where the internal structure has been flattened (for example, a
377 multi-line report may list a name field on every third line). Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000378
Raymond Hettinger30c73622010-12-01 23:45:20 +0000379 def islice(iterable, *args):
380 # islice('ABCDEFG', 2) --> A B
381 # islice('ABCDEFG', 2, 4) --> C D
382 # islice('ABCDEFG', 2, None) --> C D E F G
383 # islice('ABCDEFG', 0, None, 2) --> A C E G
384 s = slice(*args)
385 it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
386 nexti = next(it)
387 for i, element in enumerate(iterable):
388 if i == nexti:
389 yield element
390 nexti = next(it)
Georg Brandl116aa622007-08-15 14:28:22 +0000391
Raymond Hettinger30c73622010-12-01 23:45:20 +0000392 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
393 then the step defaults to one.
Georg Brandl116aa622007-08-15 14:28:22 +0000394
Georg Brandl116aa622007-08-15 14:28:22 +0000395
Georg Brandl3dd33882009-06-01 17:35:27 +0000396.. function:: permutations(iterable, r=None)
Christian Heimes70e7ea22008-02-28 20:02:27 +0000397
Raymond Hettinger30c73622010-12-01 23:45:20 +0000398 Return successive *r* length permutations of elements in the *iterable*.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000399
Raymond Hettinger30c73622010-12-01 23:45:20 +0000400 If *r* is not specified or is ``None``, then *r* defaults to the length
401 of the *iterable* and all possible full-length permutations
402 are generated.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000403
Raymond Hettinger30c73622010-12-01 23:45:20 +0000404 Permutations are emitted in lexicographic sort order. So, if the
405 input *iterable* is sorted, the permutation tuples will be produced
406 in sorted order.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000407
Raymond Hettinger30c73622010-12-01 23:45:20 +0000408 Elements are treated as unique based on their position, not on their
409 value. So if the input elements are unique, there will be no repeat
410 values in each permutation.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000411
Raymond Hettinger30c73622010-12-01 23:45:20 +0000412 Equivalent to::
Christian Heimesb558a2e2008-03-02 22:46:37 +0000413
414 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000415 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
416 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000417 pool = tuple(iterable)
418 n = len(pool)
419 r = n if r is None else r
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000420 if r > n:
421 return
422 indices = list(range(n))
Sandro Tosi73866622011-12-25 17:25:45 +0100423 cycles = list(range(n, n-r, -1))
Christian Heimesb558a2e2008-03-02 22:46:37 +0000424 yield tuple(pool[i] for i in indices[:r])
425 while n:
426 for i in reversed(range(r)):
427 cycles[i] -= 1
428 if cycles[i] == 0:
429 indices[i:] = indices[i+1:] + indices[i:i+1]
430 cycles[i] = n - i
431 else:
432 j = cycles[i]
433 indices[i], indices[-j] = indices[-j], indices[i]
434 yield tuple(pool[i] for i in indices[:r])
435 break
436 else:
437 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000438
Raymond Hettinger30c73622010-12-01 23:45:20 +0000439 The code for :func:`permutations` can be also expressed as a subsequence of
440 :func:`product`, filtered to exclude entries with repeated elements (those
441 from the same position in the input pool)::
Christian Heimes78644762008-03-04 23:39:23 +0000442
443 def permutations(iterable, r=None):
444 pool = tuple(iterable)
445 n = len(pool)
446 r = n if r is None else r
447 for indices in product(range(n), repeat=r):
448 if len(set(indices)) == r:
449 yield tuple(pool[i] for i in indices)
450
Raymond Hettinger30c73622010-12-01 23:45:20 +0000451 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
452 or zero when ``r > n``.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000453
Georg Brandl3dd33882009-06-01 17:35:27 +0000454.. function:: product(*iterables, repeat=1)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000455
Raymond Hettinger30c73622010-12-01 23:45:20 +0000456 Cartesian product of input iterables.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000457
Raymond Hettinger30c73622010-12-01 23:45:20 +0000458 Equivalent to nested for-loops in a generator expression. For example,
459 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000460
Raymond Hettinger30c73622010-12-01 23:45:20 +0000461 The nested loops cycle like an odometer with the rightmost element advancing
462 on every iteration. This pattern creates a lexicographic ordering so that if
463 the input's iterables are sorted, the product tuples are emitted in sorted
464 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000465
Raymond Hettinger30c73622010-12-01 23:45:20 +0000466 To compute the product of an iterable with itself, specify the number of
467 repetitions with the optional *repeat* keyword argument. For example,
468 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000469
Raymond Hettinger30c73622010-12-01 23:45:20 +0000470 This function is equivalent to the following code, except that the
471 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000472
Raymond Hettinger30c73622010-12-01 23:45:20 +0000473 def product(*args, repeat=1):
474 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
475 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
476 pools = [tuple(pool) for pool in args] * repeat
477 result = [[]]
478 for pool in pools:
479 result = [x+[y] for x in result for y in pool]
480 for prod in result:
481 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000482
483
Raymond Hettingerd75ad442009-06-01 19:16:52 +0000484.. function:: repeat(object[, times])
Georg Brandl116aa622007-08-15 14:28:22 +0000485
Raymond Hettinger30c73622010-12-01 23:45:20 +0000486 Make an iterator that returns *object* over and over again. Runs indefinitely
487 unless the *times* argument is specified. Used as argument to :func:`map` for
488 invariant parameters to the called function. Also used with :func:`zip` to
489 create an invariant part of a tuple record. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000490
Raymond Hettinger30c73622010-12-01 23:45:20 +0000491 def repeat(object, times=None):
492 # repeat(10, 3) --> 10 10 10
493 if times is None:
494 while True:
495 yield object
496 else:
497 for i in range(times):
498 yield object
Georg Brandl116aa622007-08-15 14:28:22 +0000499
Raymond Hettingerfc3ba6b2012-02-01 09:07:40 -0800500 A common use for *repeat* is to supply a stream of constant values to *map*
501 or *zip*::
502
503 >>> list(map(pow, range(10), repeat(2)))
504 [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Georg Brandl116aa622007-08-15 14:28:22 +0000505
506.. function:: starmap(function, iterable)
507
Raymond Hettinger30c73622010-12-01 23:45:20 +0000508 Make an iterator that computes the function using arguments obtained from
509 the iterable. Used instead of :func:`map` when argument parameters are already
510 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
511 difference between :func:`map` and :func:`starmap` parallels the distinction
512 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000513
Raymond Hettinger30c73622010-12-01 23:45:20 +0000514 def starmap(function, iterable):
515 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
516 for args in iterable:
517 yield function(*args)
Christian Heimes679db4a2008-01-18 09:56:22 +0000518
Georg Brandl116aa622007-08-15 14:28:22 +0000519
520.. function:: takewhile(predicate, iterable)
521
Raymond Hettinger30c73622010-12-01 23:45:20 +0000522 Make an iterator that returns elements from the iterable as long as the
523 predicate is true. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000524
Raymond Hettinger30c73622010-12-01 23:45:20 +0000525 def takewhile(predicate, iterable):
526 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
527 for x in iterable:
528 if predicate(x):
529 yield x
530 else:
531 break
Georg Brandl116aa622007-08-15 14:28:22 +0000532
533
Georg Brandl3dd33882009-06-01 17:35:27 +0000534.. function:: tee(iterable, n=2)
Georg Brandl116aa622007-08-15 14:28:22 +0000535
Raymond Hettinger30c73622010-12-01 23:45:20 +0000536 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000537
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000538 def tee(iterable, n=2):
539 it = iter(iterable)
540 deques = [collections.deque() for i in range(n)]
541 def gen(mydeque):
542 while True:
543 if not mydeque: # when the local deque is empty
544 newval = next(it) # fetch a new value and
545 for d in deques: # load it to all the deques
546 d.append(newval)
547 yield mydeque.popleft()
548 return tuple(gen(d) for d in deques)
Georg Brandl116aa622007-08-15 14:28:22 +0000549
Raymond Hettinger30c73622010-12-01 23:45:20 +0000550 Once :func:`tee` has made a split, the original *iterable* should not be
551 used anywhere else; otherwise, the *iterable* could get advanced without
552 the tee objects being informed.
Georg Brandl116aa622007-08-15 14:28:22 +0000553
Raymond Hettinger30c73622010-12-01 23:45:20 +0000554 This itertool may require significant auxiliary storage (depending on how
555 much temporary data needs to be stored). In general, if one iterator uses
556 most or all of the data before another iterator starts, it is faster to use
557 :func:`list` instead of :func:`tee`.
Georg Brandl116aa622007-08-15 14:28:22 +0000558
Georg Brandl116aa622007-08-15 14:28:22 +0000559
Georg Brandl3dd33882009-06-01 17:35:27 +0000560.. function:: zip_longest(*iterables, fillvalue=None)
Raymond Hettinger749761e2009-01-27 04:42:48 +0000561
Raymond Hettinger30c73622010-12-01 23:45:20 +0000562 Make an iterator that aggregates elements from each of the iterables. If the
563 iterables are of uneven length, missing values are filled-in with *fillvalue*.
564 Iteration continues until the longest iterable is exhausted. Equivalent to::
Raymond Hettinger749761e2009-01-27 04:42:48 +0000565
Raymond Hettinger6f45d182011-10-30 15:06:14 -0700566 class ZipExhausted(Exception):
567 pass
568
569 def zip_longest(*args, **kwds):
Raymond Hettinger30c73622010-12-01 23:45:20 +0000570 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Raymond Hettinger6f45d182011-10-30 15:06:14 -0700571 fillvalue = kwds.get('fillvalue')
572 counter = len(args) - 1
573 def sentinel():
574 nonlocal counter
575 if not counter:
576 raise ZipExhausted
577 counter -= 1
578 yield fillvalue
Raymond Hettinger30c73622010-12-01 23:45:20 +0000579 fillers = repeat(fillvalue)
Raymond Hettinger6f45d182011-10-30 15:06:14 -0700580 iterators = [chain(it, sentinel(), fillers) for it in args]
Raymond Hettinger30c73622010-12-01 23:45:20 +0000581 try:
Raymond Hettinger6f45d182011-10-30 15:06:14 -0700582 while iterators:
583 yield tuple(map(next, iterators))
584 except ZipExhausted:
Raymond Hettinger30c73622010-12-01 23:45:20 +0000585 pass
Raymond Hettinger749761e2009-01-27 04:42:48 +0000586
Raymond Hettinger30c73622010-12-01 23:45:20 +0000587 If one of the iterables is potentially infinite, then the :func:`zip_longest`
588 function should be wrapped with something that limits the number of calls
589 (for example :func:`islice` or :func:`takewhile`). If not specified,
590 *fillvalue* defaults to ``None``.
Raymond Hettinger749761e2009-01-27 04:42:48 +0000591
592
Georg Brandl116aa622007-08-15 14:28:22 +0000593.. _itertools-recipes:
594
Raymond Hettinger1fa76822010-12-06 23:31:36 +0000595Itertools Recipes
596-----------------
Georg Brandl116aa622007-08-15 14:28:22 +0000597
598This section shows recipes for creating an extended toolset using the existing
599itertools as building blocks.
600
601The extended tools offer the same high performance as the underlying toolset.
602The superior memory performance is kept by processing elements one at a time
603rather than bringing the whole iterable into memory all at once. Code volume is
604kept small by linking the tools together in a functional style which helps
605eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000606"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000607which incur interpreter overhead.
608
609.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000610
Raymond Hettinger30c73622010-12-01 23:45:20 +0000611 def take(n, iterable):
612 "Return first n items of the iterable as a list"
613 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000614
Raymond Hettinger30c73622010-12-01 23:45:20 +0000615 def tabulate(function, start=0):
616 "Return function(0), function(1), ..."
617 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000618
Raymond Hettinger30c73622010-12-01 23:45:20 +0000619 def consume(iterator, n):
620 "Advance the iterator n-steps ahead. If n is none, consume entirely."
621 # Use functions that consume iterators at C speed.
622 if n is None:
623 # feed the entire iterator into a zero-length deque
624 collections.deque(iterator, maxlen=0)
625 else:
626 # advance to the empty slice starting at position n
627 next(islice(iterator, n, n), None)
Raymond Hettingerfa007962009-03-09 11:55:25 +0000628
Raymond Hettinger30c73622010-12-01 23:45:20 +0000629 def nth(iterable, n, default=None):
630 "Returns the nth item or a default value"
631 return next(islice(iterable, n, None), default)
Georg Brandl116aa622007-08-15 14:28:22 +0000632
Raymond Hettinger30c73622010-12-01 23:45:20 +0000633 def quantify(iterable, pred=bool):
634 "Count how many times the predicate is true"
635 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000636
Raymond Hettinger30c73622010-12-01 23:45:20 +0000637 def padnone(iterable):
638 """Returns the sequence elements and then returns None indefinitely.
Georg Brandl116aa622007-08-15 14:28:22 +0000639
Raymond Hettinger30c73622010-12-01 23:45:20 +0000640 Useful for emulating the behavior of the built-in map() function.
641 """
642 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000643
Raymond Hettinger30c73622010-12-01 23:45:20 +0000644 def ncycles(iterable, n):
645 "Returns the sequence elements n times"
646 return chain.from_iterable(repeat(tuple(iterable), n))
Georg Brandl116aa622007-08-15 14:28:22 +0000647
Raymond Hettinger30c73622010-12-01 23:45:20 +0000648 def dotproduct(vec1, vec2):
649 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000650
Raymond Hettinger30c73622010-12-01 23:45:20 +0000651 def flatten(listOfLists):
652 "Flatten one level of nesting"
653 return chain.from_iterable(listOfLists)
Georg Brandl116aa622007-08-15 14:28:22 +0000654
Raymond Hettinger30c73622010-12-01 23:45:20 +0000655 def repeatfunc(func, times=None, *args):
656 """Repeat calls to func with specified arguments.
Georg Brandl116aa622007-08-15 14:28:22 +0000657
Raymond Hettinger30c73622010-12-01 23:45:20 +0000658 Example: repeatfunc(random.random)
659 """
660 if times is None:
661 return starmap(func, repeat(args))
662 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000663
Raymond Hettinger30c73622010-12-01 23:45:20 +0000664 def pairwise(iterable):
665 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
666 a, b = tee(iterable)
667 next(b, None)
668 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000669
Raymond Hettinger30c73622010-12-01 23:45:20 +0000670 def grouper(n, iterable, fillvalue=None):
671 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
672 args = [iter(iterable)] * n
673 return zip_longest(*args, fillvalue=fillvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000674
Raymond Hettinger30c73622010-12-01 23:45:20 +0000675 def roundrobin(*iterables):
676 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
677 # Recipe credited to George Sakkis
678 pending = len(iterables)
679 nexts = cycle(iter(it).__next__ for it in iterables)
680 while pending:
681 try:
682 for next in nexts:
683 yield next()
684 except StopIteration:
685 pending -= 1
686 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000687
Raymond Hettinger30c73622010-12-01 23:45:20 +0000688 def partition(pred, iterable):
689 'Use a predicate to partition entries into false entries and true entries'
690 # partition(is_odd, range(10)) --> 0 2 4 6 8 and 1 3 5 7 9
691 t1, t2 = tee(iterable)
692 return filterfalse(pred, t1), filter(pred, t2)
Raymond Hettinger5ce0aa22010-12-01 10:49:19 +0000693
Raymond Hettinger30c73622010-12-01 23:45:20 +0000694 def powerset(iterable):
695 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
696 s = list(iterable)
697 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger08d01ee2010-08-07 05:36:53 +0000698
Raymond Hettinger30c73622010-12-01 23:45:20 +0000699 def unique_everseen(iterable, key=None):
700 "List unique elements, preserving order. Remember all elements ever seen."
701 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
702 # unique_everseen('ABBCcAD', str.lower) --> A B C D
703 seen = set()
704 seen_add = seen.add
705 if key is None:
706 for element in filterfalse(seen.__contains__, iterable):
707 seen_add(element)
708 yield element
709 else:
710 for element in iterable:
711 k = key(element)
712 if k not in seen:
713 seen_add(k)
714 yield element
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000715
Raymond Hettinger30c73622010-12-01 23:45:20 +0000716 def unique_justseen(iterable, key=None):
717 "List unique elements, preserving order. Remember only the element just seen."
718 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
719 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
720 return map(next, map(itemgetter(1), groupby(iterable, key)))
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000721
Raymond Hettinger30c73622010-12-01 23:45:20 +0000722 def iter_except(func, exception, first=None):
723 """ Call a function repeatedly until an exception is raised.
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000724
Raymond Hettinger30c73622010-12-01 23:45:20 +0000725 Converts a call-until-exception interface to an iterator interface.
726 Like __builtin__.iter(func, sentinel) but uses an exception instead
727 of a sentinel to end the loop.
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000728
Raymond Hettinger30c73622010-12-01 23:45:20 +0000729 Examples:
730 iter_except(functools.partial(heappop, h), IndexError) # priority queue iterator
731 iter_except(d.popitem, KeyError) # non-blocking dict iterator
732 iter_except(d.popleft, IndexError) # non-blocking deque iterator
733 iter_except(q.get_nowait, Queue.Empty) # loop over a producer Queue
734 iter_except(s.pop, KeyError) # non-blocking set iterator
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000735
Raymond Hettinger30c73622010-12-01 23:45:20 +0000736 """
737 try:
738 if first is not None:
739 yield first() # For database APIs needing an initial cast to db.first()
740 while 1:
741 yield func()
742 except exception:
743 pass
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000744
Raymond Hettinger30c73622010-12-01 23:45:20 +0000745 def random_product(*args, repeat=1):
746 "Random selection from itertools.product(*args, **kwds)"
747 pools = [tuple(pool) for pool in args] * repeat
748 return tuple(random.choice(pool) for pool in pools)
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000749
Raymond Hettinger30c73622010-12-01 23:45:20 +0000750 def random_permutation(iterable, r=None):
751 "Random selection from itertools.permutations(iterable, r)"
752 pool = tuple(iterable)
753 r = len(pool) if r is None else r
754 return tuple(random.sample(pool, r))
Raymond Hettinger063a4b62010-04-02 04:18:18 +0000755
Raymond Hettinger30c73622010-12-01 23:45:20 +0000756 def random_combination(iterable, r):
757 "Random selection from itertools.combinations(iterable, r)"
758 pool = tuple(iterable)
759 n = len(pool)
760 indices = sorted(random.sample(range(n), r))
761 return tuple(pool[i] for i in indices)
Raymond Hettinger063a4b62010-04-02 04:18:18 +0000762
Raymond Hettinger30c73622010-12-01 23:45:20 +0000763 def random_combination_with_replacement(iterable, r):
764 "Random selection from itertools.combinations_with_replacement(iterable, r)"
765 pool = tuple(iterable)
766 n = len(pool)
767 indices = sorted(random.randrange(n) for i in range(r))
768 return tuple(pool[i] for i in indices)
Raymond Hettinger063a4b62010-04-02 04:18:18 +0000769
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000770Note, many of the above recipes can be optimized by replacing global lookups
771with local variables defined as default values. For example, the
772*dotproduct* recipe can be written as::
773
Raymond Hettinger30c73622010-12-01 23:45:20 +0000774 def dotproduct(vec1, vec2, sum=sum, map=map, mul=operator.mul):
775 return sum(map(mul, vec1, vec2))