blob: 5c63b1268593fae5744f2bc8baf114ed566cf944 [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001:mod:`itertools` --- Functions creating iterators for efficient looping
2=======================================================================
3
4.. module:: itertools
5 :synopsis: Functions creating iterators for efficient looping.
6.. moduleauthor:: Raymond Hettinger <python@rcn.com>
7.. sectionauthor:: Raymond Hettinger <python@rcn.com>
8
9
Christian Heimesfe337bf2008-03-23 21:54:12 +000010.. testsetup::
11
12 from itertools import *
13
14
Raymond Hettingerf76b9202009-02-17 20:00:59 +000015This module implements a number of :term:`iterator` building blocks inspired
16by constructs from APL, Haskell, and SML. Each has been recast in a form
17suitable for Python.
Georg Brandl116aa622007-08-15 14:28:22 +000018
19The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettingerf76b9202009-02-17 20:00:59 +000020useful by themselves or in combination. Together, they form an "iterator
21algebra" making it possible to construct specialized tools succinctly and
22efficiently in pure Python.
Georg Brandl116aa622007-08-15 14:28:22 +000023
24For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Ezio Melottib6605992010-01-21 20:57:24 +000025sequence ``f(0), f(1), ...``. The same effect can be achieved in Python
Raymond Hettingera6c60372008-03-13 01:26:19 +000026by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000027
Raymond Hettinger2c109ab2009-03-12 00:29:44 +000028These tools and their built-in counterparts also work well with the high-speed
29functions in the :mod:`operator` module. For example, the multiplication
30operator can be mapped across two vectors to form an efficient dot-product:
31``sum(map(operator.mul, vector1, vector2))``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
Georg Brandl116aa622007-08-15 14:28:22 +000033
Raymond Hettingerf76b9202009-02-17 20:00:59 +000034**Infinite Iterators:**
Georg Brandl116aa622007-08-15 14:28:22 +000035
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000036================== ================= ================================================= =========================================
37Iterator Arguments Results Example
38================== ================= ================================================= =========================================
39:func:`count` start, [step] start, start+step, start+2*step, ... ``count(10) --> 10 11 12 13 14 ...``
40:func:`cycle` p p0, p1, ... plast, p0, p1, ... ``cycle('ABCD') --> A B C D A B C D ...``
41:func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times ``repeat(10, 3) --> 10 10 10``
42================== ================= ================================================= =========================================
Georg Brandl116aa622007-08-15 14:28:22 +000043
Raymond Hettingerf76b9202009-02-17 20:00:59 +000044**Iterators terminating on the shortest input sequence:**
45
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000046==================== ============================ ================================================= =============================================================
47Iterator Arguments Results Example
48==================== ============================ ================================================= =============================================================
49:func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ... ``chain('ABC', 'DEF') --> A B C D E F``
50:func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ... ``compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F``
51:func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails ``dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1``
Georg Brandlc2da5ce2009-08-13 09:16:39 +000052:func:`filterfalse` pred, seq elements of seq where pred(elem) is False ``filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8``
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000053:func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
54:func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step] ``islice('ABCDEFG', 2, None) --> C D E F G``
55:func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ... ``starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000``
56:func:`takewhile` pred, seq seq[0], seq[1], until pred fails ``takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4``
57:func:`tee` it, n it1, it2 , ... itn splits one iterator into n
Georg Brandlc2da5ce2009-08-13 09:16:39 +000058:func:`zip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-``
Raymond Hettinger5bfd8ce2009-04-10 19:02:36 +000059==================== ============================ ================================================= =============================================================
Raymond Hettingerf76b9202009-02-17 20:00:59 +000060
61**Combinatoric generators:**
62
Raymond Hettinger7f587cd2009-04-10 19:43:50 +000063============================================== ==================== =============================================================
64Iterator Arguments Results
65============================================== ==================== =============================================================
66:func:`product` p, q, ... [repeat=1] cartesian product, equivalent to a nested for-loop
67:func:`permutations` p[, r] r-length tuples, all possible orderings, no repeated elements
Raymond Hettinger36c3c022009-11-19 01:20:07 +000068:func:`combinations` p, r r-length tuples, in sorted order, no repeated elements
69:func:`combinations_with_replacement` p, r r-length tuples, in sorted order, with repeated elements
Raymond Hettinger7f587cd2009-04-10 19:43:50 +000070|
71``product('ABCD', repeat=2)`` ``AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD``
72``permutations('ABCD', 2)`` ``AB AC AD BA BC BD CA CB CD DA DB DC``
73``combinations('ABCD', 2)`` ``AB AC AD BC BD CD``
74``combinations_with_replacement('ABCD', 2)`` ``AA AB AC AD BB BC BD CC CD DD``
75============================================== ==================== =============================================================
Georg Brandl116aa622007-08-15 14:28:22 +000076
77
78.. _itertools-functions:
79
80Itertool functions
81------------------
82
83The following module functions all construct and return iterators. Some provide
84streams of infinite length, so they should only be accessed by functions or
85loops that truncate the stream.
86
87
88.. function:: chain(*iterables)
89
90 Make an iterator that returns elements from the first iterable until it is
91 exhausted, then proceeds to the next iterable, until all of the iterables are
92 exhausted. Used for treating consecutive sequences as a single sequence.
93 Equivalent to::
94
95 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000096 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000097 for it in iterables:
98 for element in it:
99 yield element
100
101
Christian Heimes70e7ea22008-02-28 20:02:27 +0000102.. function:: itertools.chain.from_iterable(iterable)
103
Georg Brandl48310cd2009-01-03 21:18:54 +0000104 Alternate constructor for :func:`chain`. Gets chained inputs from a
Christian Heimes70e7ea22008-02-28 20:02:27 +0000105 single iterable argument that is evaluated lazily. Equivalent to::
106
107 @classmethod
108 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000109 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +0000110 for it in iterables:
111 for element in it:
112 yield element
113
Christian Heimes78644762008-03-04 23:39:23 +0000114
Christian Heimes836baa52008-02-26 08:18:30 +0000115.. function:: combinations(iterable, r)
116
Christian Heimesdae2a892008-04-19 00:55:37 +0000117 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +0000118
Georg Brandl48310cd2009-01-03 21:18:54 +0000119 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +0000120 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000121 in sorted order.
Christian Heimes836baa52008-02-26 08:18:30 +0000122
123 Elements are treated as unique based on their position, not on their
124 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +0000125 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +0000126
Christian Heimes836baa52008-02-26 08:18:30 +0000127 Equivalent to::
128
129 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000130 # combinations('ABCD', 2) --> AB AC AD BC BD CD
131 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000132 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000133 n = len(pool)
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000134 if r > n:
135 return
136 indices = list(range(r))
Christian Heimesb558a2e2008-03-02 22:46:37 +0000137 yield tuple(pool[i] for i in indices)
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000138 while True:
Christian Heimes380f7f22008-02-28 11:19:05 +0000139 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000140 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000141 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000142 else:
143 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000144 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000145 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000146 indices[j] = indices[j-1] + 1
147 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000148
Christian Heimes78644762008-03-04 23:39:23 +0000149 The code for :func:`combinations` can be also expressed as a subsequence
150 of :func:`permutations` after filtering entries where the elements are not
151 in sorted order (according to their position in the input pool)::
152
153 def combinations(iterable, r):
154 pool = tuple(iterable)
155 n = len(pool)
156 for indices in permutations(range(n), r):
157 if sorted(indices) == list(indices):
158 yield tuple(pool[i] for i in indices)
159
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000160 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
161 or zero when ``r > n``.
Christian Heimes836baa52008-02-26 08:18:30 +0000162
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000163.. function:: combinations_with_replacement(iterable, r)
164
165 Return *r* length subsequences of elements from the input *iterable*
166 allowing individual elements to be repeated more than once.
167
168 Combinations are emitted in lexicographic sort order. So, if the
169 input *iterable* is sorted, the combination tuples will be produced
170 in sorted order.
171
172 Elements are treated as unique based on their position, not on their
173 value. So if the input elements are unique, the generated combinations
174 will also be unique.
175
176 Equivalent to::
177
178 def combinations_with_replacement(iterable, r):
179 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
180 pool = tuple(iterable)
181 n = len(pool)
182 if not n and r:
183 return
184 indices = [0] * r
185 yield tuple(pool[i] for i in indices)
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000186 while True:
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000187 for i in reversed(range(r)):
188 if indices[i] != n - 1:
189 break
190 else:
191 return
192 indices[i:] = [indices[i] + 1] * (r - i)
193 yield tuple(pool[i] for i in indices)
194
195 The code for :func:`combinations_with_replacement` can be also expressed as
196 a subsequence of :func:`product` after filtering entries where the elements
197 are not in sorted order (according to their position in the input pool)::
198
199 def combinations_with_replacement(iterable, r):
200 pool = tuple(iterable)
201 n = len(pool)
202 for indices in product(range(n), repeat=r):
203 if sorted(indices) == list(indices):
204 yield tuple(pool[i] for i in indices)
205
206 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
207
Raymond Hettinger30729212009-02-12 06:28:27 +0000208 .. versionadded:: 3.1
Raymond Hettingerd07d9392009-01-27 04:20:44 +0000209
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000210.. function:: compress(data, selectors)
211
212 Make an iterator that filters elements from *data* returning only those that
213 have a corresponding element in *selectors* that evaluates to ``True``.
Benjamin Petersond23f8222009-04-05 19:13:16 +0000214 Stops when either the *data* or *selectors* iterables has been exhausted.
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000215 Equivalent to::
216
217 def compress(data, selectors):
218 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
219 return (d for d, s in zip(data, selectors) if s)
220
Raymond Hettinger30729212009-02-12 06:28:27 +0000221 .. versionadded:: 3.1
Raymond Hettinger6b3b0fc2009-01-26 02:56:58 +0000222
223
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000224.. function:: count(start=0, step=1)
Georg Brandl116aa622007-08-15 14:28:22 +0000225
Raymond Hettinger30729212009-02-12 06:28:27 +0000226 Make an iterator that returns evenly spaced values starting with *n*. Often
227 used as an argument to :func:`map` to generate consecutive data points.
228 Also, used with :func:`zip` to add sequence numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000229
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000230 def count(start=0, step=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000231 # count(10) --> 10 11 12 13 14 ...
Raymond Hettinger30729212009-02-12 06:28:27 +0000232 # count(2.5, 0.5) -> 3.5 3.0 4.5 ...
Raymond Hettinger9e8dbbc2009-02-14 04:21:49 +0000233 n = start
Georg Brandl116aa622007-08-15 14:28:22 +0000234 while True:
235 yield n
Raymond Hettinger30729212009-02-12 06:28:27 +0000236 n += step
Georg Brandl116aa622007-08-15 14:28:22 +0000237
Raymond Hettinger5bc472a2009-06-17 01:40:52 +0000238 When counting with floating point numbers, better accuracy can sometimes be
239 achieved by substituting multiplicative code such as: ``(start + step * i
240 for i in count())``.
241
Raymond Hettinger30729212009-02-12 06:28:27 +0000242 .. versionchanged:: 3.1
243 added *step* argument and allowed non-integer arguments.
Georg Brandl116aa622007-08-15 14:28:22 +0000244
245.. function:: cycle(iterable)
246
247 Make an iterator returning elements from the iterable and saving a copy of each.
248 When the iterable is exhausted, return elements from the saved copy. Repeats
249 indefinitely. Equivalent to::
250
251 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000252 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000253 saved = []
254 for element in iterable:
255 yield element
256 saved.append(element)
257 while saved:
258 for element in saved:
259 yield element
260
261 Note, this member of the toolkit may require significant auxiliary storage
262 (depending on the length of the iterable).
263
264
265.. function:: dropwhile(predicate, iterable)
266
267 Make an iterator that drops elements from the iterable as long as the predicate
268 is true; afterwards, returns every element. Note, the iterator does not produce
269 *any* output until the predicate first becomes false, so it may have a lengthy
270 start-up time. Equivalent to::
271
272 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000273 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000274 iterable = iter(iterable)
275 for x in iterable:
276 if not predicate(x):
277 yield x
278 break
279 for x in iterable:
280 yield x
281
Raymond Hettinger749761e2009-01-27 04:42:48 +0000282.. function:: filterfalse(predicate, iterable)
283
284 Make an iterator that filters elements from iterable returning only those for
285 which the predicate is ``False``. If *predicate* is ``None``, return the items
286 that are false. Equivalent to::
287
288 def filterfalse(predicate, iterable):
289 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
290 if predicate is None:
291 predicate = bool
292 for x in iterable:
293 if not predicate(x):
294 yield x
295
Georg Brandl116aa622007-08-15 14:28:22 +0000296
Georg Brandl3dd33882009-06-01 17:35:27 +0000297.. function:: groupby(iterable, key=None)
Georg Brandl116aa622007-08-15 14:28:22 +0000298
299 Make an iterator that returns consecutive keys and groups from the *iterable*.
300 The *key* is a function computing a key value for each element. If not
301 specified or is ``None``, *key* defaults to an identity function and returns
302 the element unchanged. Generally, the iterable needs to already be sorted on
303 the same key function.
304
305 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
306 generates a break or new group every time the value of the key function changes
307 (which is why it is usually necessary to have sorted the data using the same key
308 function). That behavior differs from SQL's GROUP BY which aggregates common
309 elements regardless of their input order.
310
311 The returned group is itself an iterator that shares the underlying iterable
312 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
313 object is advanced, the previous group is no longer visible. So, if that data
314 is needed later, it should be stored as a list::
315
316 groups = []
317 uniquekeys = []
318 data = sorted(data, key=keyfunc)
319 for k, g in groupby(data, keyfunc):
320 groups.append(list(g)) # Store group iterator as a list
321 uniquekeys.append(k)
322
323 :func:`groupby` is equivalent to::
324
325 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000326 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettingerd04fa312009-02-04 19:45:13 +0000327 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000328 def __init__(self, iterable, key=None):
329 if key is None:
330 key = lambda x: x
331 self.keyfunc = key
332 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000333 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000334 def __iter__(self):
335 return self
336 def __next__(self):
337 while self.currkey == self.tgtkey:
Raymond Hettinger21315ba2009-02-23 19:38:09 +0000338 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl116aa622007-08-15 14:28:22 +0000339 self.currkey = self.keyfunc(self.currvalue)
340 self.tgtkey = self.currkey
341 return (self.currkey, self._grouper(self.tgtkey))
342 def _grouper(self, tgtkey):
343 while self.currkey == tgtkey:
344 yield self.currvalue
Raymond Hettinger21315ba2009-02-23 19:38:09 +0000345 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl116aa622007-08-15 14:28:22 +0000346 self.currkey = self.keyfunc(self.currvalue)
347
Georg Brandl116aa622007-08-15 14:28:22 +0000348
Georg Brandl116aa622007-08-15 14:28:22 +0000349.. function:: islice(iterable, [start,] stop [, step])
350
351 Make an iterator that returns selected elements from the iterable. If *start* is
352 non-zero, then elements from the iterable are skipped until start is reached.
353 Afterward, elements are returned consecutively unless *step* is set higher than
354 one which results in items being skipped. If *stop* is ``None``, then iteration
355 continues until the iterator is exhausted, if at all; otherwise, it stops at the
356 specified position. Unlike regular slicing, :func:`islice` does not support
357 negative values for *start*, *stop*, or *step*. Can be used to extract related
358 fields from data where the internal structure has been flattened (for example, a
359 multi-line report may list a name field on every third line). Equivalent to::
360
361 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000362 # islice('ABCDEFG', 2) --> A B
363 # islice('ABCDEFG', 2, 4) --> C D
364 # islice('ABCDEFG', 2, None) --> C D E F G
365 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000366 s = slice(*args)
Raymond Hettinger7c8494b2009-05-14 21:48:02 +0000367 it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
Georg Brandl116aa622007-08-15 14:28:22 +0000368 nexti = next(it)
369 for i, element in enumerate(iterable):
370 if i == nexti:
371 yield element
372 nexti = next(it)
373
374 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
375 then the step defaults to one.
376
Georg Brandl116aa622007-08-15 14:28:22 +0000377
Georg Brandl3dd33882009-06-01 17:35:27 +0000378.. function:: permutations(iterable, r=None)
Christian Heimes70e7ea22008-02-28 20:02:27 +0000379
380 Return successive *r* length permutations of elements in the *iterable*.
381
382 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl48310cd2009-01-03 21:18:54 +0000383 of the *iterable* and all possible full-length permutations
Christian Heimes70e7ea22008-02-28 20:02:27 +0000384 are generated.
385
Georg Brandl48310cd2009-01-03 21:18:54 +0000386 Permutations are emitted in lexicographic sort order. So, if the
Christian Heimes70e7ea22008-02-28 20:02:27 +0000387 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl48310cd2009-01-03 21:18:54 +0000388 in sorted order.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000389
390 Elements are treated as unique based on their position, not on their
391 value. So if the input elements are unique, there will be no repeat
392 values in each permutation.
393
Christian Heimesb558a2e2008-03-02 22:46:37 +0000394 Equivalent to::
395
396 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000397 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
398 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000399 pool = tuple(iterable)
400 n = len(pool)
401 r = n if r is None else r
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000402 if r > n:
403 return
404 indices = list(range(n))
Christian Heimesfe337bf2008-03-23 21:54:12 +0000405 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000406 yield tuple(pool[i] for i in indices[:r])
407 while n:
408 for i in reversed(range(r)):
409 cycles[i] -= 1
410 if cycles[i] == 0:
411 indices[i:] = indices[i+1:] + indices[i:i+1]
412 cycles[i] = n - i
413 else:
414 j = cycles[i]
415 indices[i], indices[-j] = indices[-j], indices[i]
416 yield tuple(pool[i] for i in indices[:r])
417 break
418 else:
419 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000420
Georg Brandl48310cd2009-01-03 21:18:54 +0000421 The code for :func:`permutations` can be also expressed as a subsequence of
Christian Heimes78644762008-03-04 23:39:23 +0000422 :func:`product`, filtered to exclude entries with repeated elements (those
423 from the same position in the input pool)::
424
425 def permutations(iterable, r=None):
426 pool = tuple(iterable)
427 n = len(pool)
428 r = n if r is None else r
429 for indices in product(range(n), repeat=r):
430 if len(set(indices)) == r:
431 yield tuple(pool[i] for i in indices)
432
Raymond Hettinger5bad41e2009-01-08 21:01:54 +0000433 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
434 or zero when ``r > n``.
Christian Heimes70e7ea22008-02-28 20:02:27 +0000435
Georg Brandl3dd33882009-06-01 17:35:27 +0000436.. function:: product(*iterables, repeat=1)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000437
438 Cartesian product of input iterables.
439
440 Equivalent to nested for-loops in a generator expression. For example,
441 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
442
Christian Heimesdae2a892008-04-19 00:55:37 +0000443 The nested loops cycle like an odometer with the rightmost element advancing
444 on every iteration. This pattern creates a lexicographic ordering so that if
445 the input's iterables are sorted, the product tuples are emitted in sorted
446 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000447
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000448 To compute the product of an iterable with itself, specify the number of
449 repetitions with the optional *repeat* keyword argument. For example,
450 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
451
Christian Heimes78644762008-03-04 23:39:23 +0000452 This function is equivalent to the following code, except that the
453 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000454
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000455 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000456 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
457 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger063a4b62010-04-02 04:18:18 +0000458 pools = [tuple(pool) for pool in args] * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000459 result = [[]]
460 for pool in pools:
461 result = [x+[y] for x in result for y in pool]
462 for prod in result:
463 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000464
465
Raymond Hettingerd75ad442009-06-01 19:16:52 +0000466.. function:: repeat(object[, times])
Georg Brandl116aa622007-08-15 14:28:22 +0000467
468 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000469 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000470 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000471 create an invariant part of a tuple record. Equivalent to::
472
473 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000474 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000475 if times is None:
476 while True:
477 yield object
478 else:
479 for i in range(times):
480 yield object
481
482
483.. function:: starmap(function, iterable)
484
Christian Heimes679db4a2008-01-18 09:56:22 +0000485 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000486 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000487 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000488 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000489 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
490
491 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000492 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000493 for args in iterable:
494 yield function(*args)
495
Georg Brandl116aa622007-08-15 14:28:22 +0000496
497.. function:: takewhile(predicate, iterable)
498
499 Make an iterator that returns elements from the iterable as long as the
500 predicate is true. Equivalent to::
501
502 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000503 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000504 for x in iterable:
505 if predicate(x):
506 yield x
507 else:
508 break
509
510
Georg Brandl3dd33882009-06-01 17:35:27 +0000511.. function:: tee(iterable, n=2)
Georg Brandl116aa622007-08-15 14:28:22 +0000512
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000513 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000514
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000515 def tee(iterable, n=2):
516 it = iter(iterable)
517 deques = [collections.deque() for i in range(n)]
518 def gen(mydeque):
519 while True:
520 if not mydeque: # when the local deque is empty
521 newval = next(it) # fetch a new value and
522 for d in deques: # load it to all the deques
523 d.append(newval)
524 yield mydeque.popleft()
525 return tuple(gen(d) for d in deques)
Georg Brandl116aa622007-08-15 14:28:22 +0000526
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000527 Once :func:`tee` has made a split, the original *iterable* should not be
528 used anywhere else; otherwise, the *iterable* could get advanced without
529 the tee objects being informed.
Georg Brandl116aa622007-08-15 14:28:22 +0000530
Raymond Hettingercf984ce2009-02-18 20:56:51 +0000531 This itertool may require significant auxiliary storage (depending on how
532 much temporary data needs to be stored). In general, if one iterator uses
533 most or all of the data before another iterator starts, it is faster to use
534 :func:`list` instead of :func:`tee`.
Georg Brandl116aa622007-08-15 14:28:22 +0000535
Georg Brandl116aa622007-08-15 14:28:22 +0000536
Georg Brandl3dd33882009-06-01 17:35:27 +0000537.. function:: zip_longest(*iterables, fillvalue=None)
Raymond Hettinger749761e2009-01-27 04:42:48 +0000538
539 Make an iterator that aggregates elements from each of the iterables. If the
540 iterables are of uneven length, missing values are filled-in with *fillvalue*.
541 Iteration continues until the longest iterable is exhausted. Equivalent to::
542
543 def zip_longest(*args, fillvalue=None):
544 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
545 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
546 yield counter() # yields the fillvalue, or raises IndexError
547 fillers = repeat(fillvalue)
548 iters = [chain(it, sentinel(), fillers) for it in args]
549 try:
550 for tup in zip(*iters):
551 yield tup
552 except IndexError:
553 pass
554
555 If one of the iterables is potentially infinite, then the :func:`zip_longest`
556 function should be wrapped with something that limits the number of calls
557 (for example :func:`islice` or :func:`takewhile`). If not specified,
558 *fillvalue* defaults to ``None``.
559
560
Georg Brandl116aa622007-08-15 14:28:22 +0000561.. _itertools-recipes:
562
563Recipes
564-------
565
566This section shows recipes for creating an extended toolset using the existing
567itertools as building blocks.
568
569The extended tools offer the same high performance as the underlying toolset.
570The superior memory performance is kept by processing elements one at a time
571rather than bringing the whole iterable into memory all at once. Code volume is
572kept small by linking the tools together in a functional style which helps
573eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000574"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000575which incur interpreter overhead.
576
577.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000578
Georg Brandl3dbca812008-07-23 16:10:53 +0000579 def take(n, iterable):
580 "Return first n items of the iterable as a list"
581 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000582
Georg Brandl3dbca812008-07-23 16:10:53 +0000583 def tabulate(function, start=0):
Georg Brandl116aa622007-08-15 14:28:22 +0000584 "Return function(0), function(1), ..."
Georg Brandl3dbca812008-07-23 16:10:53 +0000585 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000586
Raymond Hettingerfa007962009-03-09 11:55:25 +0000587 def consume(iterator, n):
588 "Advance the iterator n-steps ahead. If n is none, consume entirely."
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000589 # Use functions that consume iterators at C speed.
590 if n is None:
591 # feed the entire iterator into a zero-length deque
592 collections.deque(iterator, maxlen=0)
593 else:
594 # advance to the emtpy slice starting at position n
595 next(islice(iterator, n, n), None)
Raymond Hettingerfa007962009-03-09 11:55:25 +0000596
Raymond Hettingercdf8ba32009-02-19 04:45:07 +0000597 def nth(iterable, n, default=None):
598 "Returns the nth item or a default value"
599 return next(islice(iterable, n, None), default)
Georg Brandl116aa622007-08-15 14:28:22 +0000600
Georg Brandl3dbca812008-07-23 16:10:53 +0000601 def quantify(iterable, pred=bool):
602 "Count how many times the predicate is true"
603 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000604
Georg Brandl3dbca812008-07-23 16:10:53 +0000605 def padnone(iterable):
Georg Brandl116aa622007-08-15 14:28:22 +0000606 """Returns the sequence elements and then returns None indefinitely.
607
608 Useful for emulating the behavior of the built-in map() function.
609 """
Georg Brandl3dbca812008-07-23 16:10:53 +0000610 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000611
Georg Brandl3dbca812008-07-23 16:10:53 +0000612 def ncycles(iterable, n):
Georg Brandl116aa622007-08-15 14:28:22 +0000613 "Returns the sequence elements n times"
Raymond Hettinger0f3ec6d2010-04-02 04:50:35 +0000614 return chain.from_iterable(repeat(tuple(iterable), n))
Georg Brandl116aa622007-08-15 14:28:22 +0000615
616 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000617 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000618
619 def flatten(listOfLists):
Raymond Hettinger063a4b62010-04-02 04:18:18 +0000620 "Flatten one level of nesting"
621 return chain.from_iterable(listOfLists)
Georg Brandl116aa622007-08-15 14:28:22 +0000622
623 def repeatfunc(func, times=None, *args):
624 """Repeat calls to func with specified arguments.
625
626 Example: repeatfunc(random.random)
627 """
628 if times is None:
629 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000630 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000631
632 def pairwise(iterable):
633 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
634 a, b = tee(iterable)
Raymond Hettinger21315ba2009-02-23 19:38:09 +0000635 next(b, None)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000636 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000637
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000638 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000639 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000640 args = [iter(iterable)] * n
Raymond Hettinger883d2762009-01-27 04:57:51 +0000641 return zip_longest(*args, fillvalue=fillvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000642
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000643 def roundrobin(*iterables):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000644 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000645 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000646 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000647 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000648 while pending:
649 try:
650 for next in nexts:
651 yield next()
652 except StopIteration:
653 pending -= 1
654 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000655
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000656 def powerset(iterable):
Raymond Hettingerace67332009-01-26 02:23:50 +0000657 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
658 s = list(iterable)
659 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000660
Raymond Hettingerfd88ea72009-03-23 22:42:28 +0000661 def unique_everseen(iterable, key=None):
662 "List unique elements, preserving order. Remember all elements ever seen."
663 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
664 # unique_everseen('ABBCcAD', str.lower) --> A B C D
665 seen = set()
666 seen_add = seen.add
667 if key is None:
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000668 for element in filterfalse(seen.__contains__, iterable):
669 seen_add(element)
670 yield element
Raymond Hettingerfd88ea72009-03-23 22:42:28 +0000671 else:
672 for element in iterable:
673 k = key(element)
674 if k not in seen:
675 seen_add(k)
676 yield element
Raymond Hettingerad9d96b2009-01-02 21:39:07 +0000677
Raymond Hettingerfd88ea72009-03-23 22:42:28 +0000678 def unique_justseen(iterable, key=None):
679 "List unique elements, preserving order. Remember only the element just seen."
680 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
681 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
682 return map(next, map(itemgetter(1), groupby(iterable, key)))
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000683
684 def iter_except(func, exception, first=None):
685 """ Call a function repeatedly until an exception is raised.
686
687 Converts a call-until-exception interface to an iterator interface.
688 Like __builtin__.iter(func, sentinel) but uses an exception instead
689 of a sentinel to end the loop.
690
691 Examples:
692 iter_except(functools.partial(heappop, h), IndexError) # priority queue iterator
693 iter_except(d.popitem, KeyError) # non-blocking dict iterator
694 iter_except(d.popleft, IndexError) # non-blocking deque iterator
695 iter_except(q.get_nowait, Queue.Empty) # loop over a producer Queue
696 iter_except(s.pop, KeyError) # non-blocking set iterator
697
698 """
699 try:
700 if first is not None:
701 yield first() # For database APIs needing an initial cast to db.first()
702 while 1:
703 yield func()
704 except exception:
705 pass
706
Raymond Hettinger063a4b62010-04-02 04:18:18 +0000707 def random_product(*args, repeat=1):
708 "Random selection from itertools.product(*args, **kwds)"
709 pools = [tuple(pool) for pool in args] * repeat
Raymond Hettinger0f3ec6d2010-04-02 04:50:35 +0000710 return tuple(random.choice(pool) for pool in pools)
Raymond Hettinger063a4b62010-04-02 04:18:18 +0000711
712 def random_permuation(iterable, r=None):
713 "Random selection from itertools.permutations(iterable, r)"
714 pool = tuple(iterable)
715 r = len(pool) if r is None else r
Raymond Hettinger0f3ec6d2010-04-02 04:50:35 +0000716 return tuple(random.sample(pool, r))
Raymond Hettinger063a4b62010-04-02 04:18:18 +0000717
718 def random_combination(iterable, r):
719 "Random selection from itertools.combinations(iterable, r)"
720 pool = tuple(iterable)
Raymond Hettinger0f3ec6d2010-04-02 04:50:35 +0000721 return tuple(sorted(random.sample(pool, r), key=pool.index))
Raymond Hettinger063a4b62010-04-02 04:18:18 +0000722
723 def random_combination_with_replacement(iterable, r):
724 "Random selection from itertools.combinations_with_replacement(iterable, r)"
725 pool = tuple(iterable)
Raymond Hettinger0f3ec6d2010-04-02 04:50:35 +0000726 return tuple(sorted(map(random.choice, repeat(pool, r)), key=pool.index))
Raymond Hettinger063a4b62010-04-02 04:18:18 +0000727
Raymond Hettingerfc91aa22010-03-28 18:27:13 +0000728Note, many of the above recipes can be optimized by replacing global lookups
729with local variables defined as default values. For example, the
730*dotproduct* recipe can be written as::
731
732 def dotproduct(vec1, vec2, sum=sum, map=map, mul=operator.mul):
733 return sum(map(mul, vec1, vec2))