blob: 250c84249e31cf962d6ec3afde1a5fec21dc3cb9 [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Raymond Hettingerd187d522009-02-23 21:23:04 +000017This module implements a number of :term:`iterator` building blocks inspired
18by constructs from APL, Haskell, and SML. Each has been recast in a form
19suitable for Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000020
21The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettingerd187d522009-02-23 21:23:04 +000022useful by themselves or in combination. Together, they form an "iterator
23algebra" making it possible to construct specialized tools succinctly and
24efficiently in pure Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000025
26For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Ezio Melottib5ff9fa2010-01-21 20:52:23 +000027sequence ``f(0), f(1), ...``. The same effect can be achieved in Python
28by combining :func:`imap` and :func:`count` to form ``imap(f, count())``.
Georg Brandl8ec7f652007-08-15 14:28:01 +000029
Raymond Hettinger2b458192009-03-12 00:35:05 +000030These tools and their built-in counterparts also work well with the high-speed
31functions in the :mod:`operator` module. For example, the multiplication
32operator can be mapped across two vectors to form an efficient dot-product:
33``sum(imap(operator.mul, vector1, vector2))``.
Georg Brandl8ec7f652007-08-15 14:28:01 +000034
Georg Brandl8ec7f652007-08-15 14:28:01 +000035
Raymond Hettingerd187d522009-02-23 21:23:04 +000036**Infinite Iterators:**
Georg Brandl8ec7f652007-08-15 14:28:01 +000037
Raymond Hettinger92d19512009-04-10 18:52:23 +000038================== ================= ================================================= =========================================
39Iterator Arguments Results Example
40================== ================= ================================================= =========================================
Raymond Hettinger099769c2009-04-10 18:58:58 +000041:func:`count` start start, start+1, start+2, ... ``count(10) --> 10 11 12 13 14 ...``
Raymond Hettinger92d19512009-04-10 18:52:23 +000042:func:`cycle` p p0, p1, ... plast, p0, p1, ... ``cycle('ABCD') --> A B C D A B C D ...``
43:func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times ``repeat(10, 3) --> 10 10 10``
44================== ================= ================================================= =========================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000045
Raymond Hettingerd187d522009-02-23 21:23:04 +000046**Iterators terminating on the shortest input sequence:**
47
Raymond Hettinger92d19512009-04-10 18:52:23 +000048==================== ============================ ================================================= =============================================================
49Iterator Arguments Results Example
50==================== ============================ ================================================= =============================================================
51:func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ... ``chain('ABC', 'DEF') --> A B C D E F``
52:func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails ``dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1``
53:func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
54:func:`ifilter` pred, seq elements of seq where pred(elem) is True ``ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9``
55:func:`ifilterfalse` pred, seq elements of seq where pred(elem) is False ``ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8``
56:func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step] ``islice('ABCDEFG', 2, None) --> C D E F G``
57:func:`imap` func, p, q, ... func(p0, q0), func(p1, q1), ... ``imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000``
58:func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ... ``starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000``
59:func:`tee` it, n it1, it2 , ... itn splits one iterator into n
60:func:`takewhile` pred, seq seq[0], seq[1], until pred fails ``takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4``
61:func:`izip` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``izip('ABCD', 'xy') --> Ax By``
62:func:`izip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-``
63==================== ============================ ================================================= =============================================================
Raymond Hettingerd187d522009-02-23 21:23:04 +000064
65**Combinatoric generators:**
66
Raymond Hettinger21fde352009-04-10 19:46:51 +000067============================================== ==================== =============================================================
68Iterator Arguments Results
69============================================== ==================== =============================================================
70:func:`product` p, q, ... [repeat=1] cartesian product, equivalent to a nested for-loop
71:func:`permutations` p[, r] r-length tuples, all possible orderings, no repeated elements
Raymond Hettinger1aef4442009-11-19 01:26:23 +000072:func:`combinations` p, r r-length tuples, in sorted order, no repeated elements
Raymond Hettinger21fde352009-04-10 19:46:51 +000073|
74``product('ABCD', repeat=2)`` ``AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD``
75``permutations('ABCD', 2)`` ``AB AC AD BA BC BD CA CB CD DA DB DC``
76``combinations('ABCD', 2)`` ``AB AC AD BC BD CD``
77============================================== ==================== =============================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000078
79
80.. _itertools-functions:
81
82Itertool functions
83------------------
84
85The following module functions all construct and return iterators. Some provide
86streams of infinite length, so they should only be accessed by functions or
87loops that truncate the stream.
88
89
90.. function:: chain(*iterables)
91
92 Make an iterator that returns elements from the first iterable until it is
93 exhausted, then proceeds to the next iterable, until all of the iterables are
94 exhausted. Used for treating consecutive sequences as a single sequence.
95 Equivalent to::
96
97 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000098 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000099 for it in iterables:
100 for element in it:
101 yield element
102
103
Raymond Hettinger330958e62008-02-28 19:41:24 +0000104.. function:: itertools.chain.from_iterable(iterable)
105
Georg Brandl734373c2009-01-03 21:55:17 +0000106 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e62008-02-28 19:41:24 +0000107 single iterable argument that is evaluated lazily. Equivalent to::
108
109 @classmethod
110 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000111 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e62008-02-28 19:41:24 +0000112 for it in iterables:
113 for element in it:
114 yield element
115
116 .. versionadded:: 2.6
117
Raymond Hettingerd553d852008-03-04 04:17:08 +0000118
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000119.. function:: combinations(iterable, r)
120
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000121 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000122
Georg Brandl734373c2009-01-03 21:55:17 +0000123 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000124 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl734373c2009-01-03 21:55:17 +0000125 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000126
127 Elements are treated as unique based on their position, not on their
128 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e62008-02-28 19:41:24 +0000129 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000130
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000131 Equivalent to::
132
133 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000134 # combinations('ABCD', 2) --> AB AC AD BC BD CD
135 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000136 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000137 n = len(pool)
Raymond Hettinger825758c2009-01-08 05:20:19 +0000138 if r > n:
139 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000140 indices = range(r)
141 yield tuple(pool[i] for i in indices)
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000142 while True:
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000143 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000144 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000145 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000146 else:
147 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000148 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000149 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000150 indices[j] = indices[j-1] + 1
151 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000152
Raymond Hettingerd553d852008-03-04 04:17:08 +0000153 The code for :func:`combinations` can be also expressed as a subsequence
154 of :func:`permutations` after filtering entries where the elements are not
155 in sorted order (according to their position in the input pool)::
156
157 def combinations(iterable, r):
158 pool = tuple(iterable)
159 n = len(pool)
160 for indices in permutations(range(n), r):
161 if sorted(indices) == list(indices):
162 yield tuple(pool[i] for i in indices)
163
Raymond Hettinger825758c2009-01-08 05:20:19 +0000164 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
165 or zero when ``r > n``.
166
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000167 .. versionadded:: 2.6
168
Georg Brandl8ec7f652007-08-15 14:28:01 +0000169.. function:: count([n])
170
171 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettinger50e90e22007-10-04 00:20:27 +0000172 specified *n* defaults to zero. Often used as an argument to :func:`imap` to
173 generate consecutive data points. Also, used with :func:`izip` to add sequence
174 numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000175
176 def count(n=0):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000177 # count(10) --> 10 11 12 13 14 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000178 while True:
179 yield n
180 n += 1
181
Georg Brandl8ec7f652007-08-15 14:28:01 +0000182
183.. function:: cycle(iterable)
184
185 Make an iterator returning elements from the iterable and saving a copy of each.
186 When the iterable is exhausted, return elements from the saved copy. Repeats
187 indefinitely. Equivalent to::
188
189 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000190 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000191 saved = []
192 for element in iterable:
193 yield element
194 saved.append(element)
195 while saved:
196 for element in saved:
197 yield element
198
199 Note, this member of the toolkit may require significant auxiliary storage
200 (depending on the length of the iterable).
201
202
203.. function:: dropwhile(predicate, iterable)
204
205 Make an iterator that drops elements from the iterable as long as the predicate
206 is true; afterwards, returns every element. Note, the iterator does not produce
207 *any* output until the predicate first becomes false, so it may have a lengthy
208 start-up time. Equivalent to::
209
210 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000211 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000212 iterable = iter(iterable)
213 for x in iterable:
214 if not predicate(x):
215 yield x
216 break
217 for x in iterable:
218 yield x
219
220
221.. function:: groupby(iterable[, key])
222
223 Make an iterator that returns consecutive keys and groups from the *iterable*.
224 The *key* is a function computing a key value for each element. If not
225 specified or is ``None``, *key* defaults to an identity function and returns
226 the element unchanged. Generally, the iterable needs to already be sorted on
227 the same key function.
228
229 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
230 generates a break or new group every time the value of the key function changes
231 (which is why it is usually necessary to have sorted the data using the same key
232 function). That behavior differs from SQL's GROUP BY which aggregates common
233 elements regardless of their input order.
234
235 The returned group is itself an iterator that shares the underlying iterable
236 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
237 object is advanced, the previous group is no longer visible. So, if that data
238 is needed later, it should be stored as a list::
239
240 groups = []
241 uniquekeys = []
242 data = sorted(data, key=keyfunc)
243 for k, g in groupby(data, keyfunc):
244 groups.append(list(g)) # Store group iterator as a list
245 uniquekeys.append(k)
246
247 :func:`groupby` is equivalent to::
248
249 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000250 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettinger11485b42009-02-04 19:34:31 +0000251 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000252 def __init__(self, iterable, key=None):
253 if key is None:
254 key = lambda x: x
255 self.keyfunc = key
256 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000257 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000258 def __iter__(self):
259 return self
260 def next(self):
261 while self.currkey == self.tgtkey:
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000262 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000263 self.currkey = self.keyfunc(self.currvalue)
264 self.tgtkey = self.currkey
265 return (self.currkey, self._grouper(self.tgtkey))
266 def _grouper(self, tgtkey):
267 while self.currkey == tgtkey:
268 yield self.currvalue
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000269 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000270 self.currkey = self.keyfunc(self.currvalue)
271
272 .. versionadded:: 2.4
273
274
275.. function:: ifilter(predicate, iterable)
276
277 Make an iterator that filters elements from iterable returning only those for
278 which the predicate is ``True``. If *predicate* is ``None``, return the items
279 that are true. Equivalent to::
280
281 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000282 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000283 if predicate is None:
284 predicate = bool
285 for x in iterable:
286 if predicate(x):
287 yield x
288
289
290.. function:: ifilterfalse(predicate, iterable)
291
292 Make an iterator that filters elements from iterable returning only those for
293 which the predicate is ``False``. If *predicate* is ``None``, return the items
294 that are false. Equivalent to::
295
296 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000297 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000298 if predicate is None:
299 predicate = bool
300 for x in iterable:
301 if not predicate(x):
302 yield x
303
304
305.. function:: imap(function, *iterables)
306
307 Make an iterator that computes the function using arguments from each of the
308 iterables. If *function* is set to ``None``, then :func:`imap` returns the
309 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
310 exhausted instead of filling in ``None`` for shorter iterables. The reason for
311 the difference is that infinite iterator arguments are typically an error for
312 :func:`map` (because the output is fully evaluated) but represent a common and
313 useful way of supplying arguments to :func:`imap`. Equivalent to::
314
315 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000316 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000317 iterables = map(iter, iterables)
318 while True:
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000319 args = [next(it) for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000320 if function is None:
321 yield tuple(args)
322 else:
323 yield function(*args)
324
325
326.. function:: islice(iterable, [start,] stop [, step])
327
328 Make an iterator that returns selected elements from the iterable. If *start* is
329 non-zero, then elements from the iterable are skipped until start is reached.
330 Afterward, elements are returned consecutively unless *step* is set higher than
331 one which results in items being skipped. If *stop* is ``None``, then iteration
332 continues until the iterator is exhausted, if at all; otherwise, it stops at the
333 specified position. Unlike regular slicing, :func:`islice` does not support
334 negative values for *start*, *stop*, or *step*. Can be used to extract related
335 fields from data where the internal structure has been flattened (for example, a
336 multi-line report may list a name field on every third line). Equivalent to::
337
338 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000339 # islice('ABCDEFG', 2) --> A B
340 # islice('ABCDEFG', 2, 4) --> C D
341 # islice('ABCDEFG', 2, None) --> C D E F G
342 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000343 s = slice(*args)
344 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000345 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000346 for i, element in enumerate(iterable):
347 if i == nexti:
348 yield element
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000349 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000350
351 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
352 then the step defaults to one.
353
354 .. versionchanged:: 2.5
355 accept ``None`` values for default *start* and *step*.
356
357
358.. function:: izip(*iterables)
359
360 Make an iterator that aggregates elements from each of the iterables. Like
361 :func:`zip` except that it returns an iterator instead of a list. Used for
362 lock-step iteration over several iterables at a time. Equivalent to::
363
364 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000365 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000366 iterables = map(iter, iterables)
367 while iterables:
Raymond Hettingerceecb8d2009-04-20 18:23:26 +0000368 yield tuple(map(next, iterables))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000369
370 .. versionchanged:: 2.4
371 When no iterables are specified, returns a zero length iterator instead of
372 raising a :exc:`TypeError` exception.
373
Raymond Hettinger48c62932008-01-22 19:51:41 +0000374 The left-to-right evaluation order of the iterables is guaranteed. This
375 makes possible an idiom for clustering a data series into n-length groups
376 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000377
Raymond Hettinger48c62932008-01-22 19:51:41 +0000378 :func:`izip` should only be used with unequal length inputs when you don't
379 care about trailing, unmatched values from the longer iterables. If those
380 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000381
382
383.. function:: izip_longest(*iterables[, fillvalue])
384
385 Make an iterator that aggregates elements from each of the iterables. If the
386 iterables are of uneven length, missing values are filled-in with *fillvalue*.
387 Iteration continues until the longest iterable is exhausted. Equivalent to::
388
389 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000390 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000391 fillvalue = kwds.get('fillvalue')
392 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
393 yield counter() # yields the fillvalue, or raises IndexError
394 fillers = repeat(fillvalue)
395 iters = [chain(it, sentinel(), fillers) for it in args]
396 try:
397 for tup in izip(*iters):
398 yield tup
399 except IndexError:
400 pass
401
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000402 If one of the iterables is potentially infinite, then the
403 :func:`izip_longest` function should be wrapped with something that limits
404 the number of calls (for example :func:`islice` or :func:`takewhile`). If
405 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000406
407 .. versionadded:: 2.6
408
Raymond Hettinger330958e62008-02-28 19:41:24 +0000409.. function:: permutations(iterable[, r])
410
411 Return successive *r* length permutations of elements in the *iterable*.
412
413 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl734373c2009-01-03 21:55:17 +0000414 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e62008-02-28 19:41:24 +0000415 are generated.
416
Georg Brandl734373c2009-01-03 21:55:17 +0000417 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e62008-02-28 19:41:24 +0000418 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl734373c2009-01-03 21:55:17 +0000419 in sorted order.
Raymond Hettinger330958e62008-02-28 19:41:24 +0000420
421 Elements are treated as unique based on their position, not on their
422 value. So if the input elements are unique, there will be no repeat
423 values in each permutation.
424
Raymond Hettingerf287f172008-03-02 10:59:31 +0000425 Equivalent to::
426
427 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000428 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
429 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000430 pool = tuple(iterable)
431 n = len(pool)
432 r = n if r is None else r
Raymond Hettinger825758c2009-01-08 05:20:19 +0000433 if r > n:
434 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000435 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000436 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000437 yield tuple(pool[i] for i in indices[:r])
438 while n:
439 for i in reversed(range(r)):
440 cycles[i] -= 1
441 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000442 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000443 cycles[i] = n - i
444 else:
445 j = cycles[i]
446 indices[i], indices[-j] = indices[-j], indices[i]
447 yield tuple(pool[i] for i in indices[:r])
448 break
449 else:
450 return
Raymond Hettinger330958e62008-02-28 19:41:24 +0000451
Georg Brandl734373c2009-01-03 21:55:17 +0000452 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000453 :func:`product`, filtered to exclude entries with repeated elements (those
454 from the same position in the input pool)::
455
456 def permutations(iterable, r=None):
457 pool = tuple(iterable)
458 n = len(pool)
459 r = n if r is None else r
460 for indices in product(range(n), repeat=r):
461 if len(set(indices)) == r:
462 yield tuple(pool[i] for i in indices)
463
Raymond Hettinger825758c2009-01-08 05:20:19 +0000464 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
465 or zero when ``r > n``.
466
Raymond Hettinger330958e62008-02-28 19:41:24 +0000467 .. versionadded:: 2.6
468
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000469.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000470
471 Cartesian product of input iterables.
472
473 Equivalent to nested for-loops in a generator expression. For example,
474 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
475
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000476 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000477 on every iteration. This pattern creates a lexicographic ordering so that if
478 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000479 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000480
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000481 To compute the product of an iterable with itself, specify the number of
482 repetitions with the optional *repeat* keyword argument. For example,
483 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
484
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000485 This function is equivalent to the following code, except that the
486 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000487
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000488 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000489 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
490 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000491 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000492 result = [[]]
493 for pool in pools:
494 result = [x+[y] for x in result for y in pool]
495 for prod in result:
496 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000497
498 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000499
500.. function:: repeat(object[, times])
501
502 Make an iterator that returns *object* over and over again. Runs indefinitely
503 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000504 invariant function parameters. Also used with :func:`izip` to create constant
505 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000506
507 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000508 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000509 if times is None:
510 while True:
511 yield object
512 else:
513 for i in xrange(times):
514 yield object
515
516
517.. function:: starmap(function, iterable)
518
Raymond Hettinger47317092008-01-17 03:02:14 +0000519 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000520 the iterable. Used instead of :func:`imap` when argument parameters are already
521 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
522 difference between :func:`imap` and :func:`starmap` parallels the distinction
523 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
524
525 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000526 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000527 for args in iterable:
528 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000529
Raymond Hettinger47317092008-01-17 03:02:14 +0000530 .. versionchanged:: 2.6
531 Previously, :func:`starmap` required the function arguments to be tuples.
532 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000533
534.. function:: takewhile(predicate, iterable)
535
536 Make an iterator that returns elements from the iterable as long as the
537 predicate is true. Equivalent to::
538
539 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000540 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000541 for x in iterable:
542 if predicate(x):
543 yield x
544 else:
545 break
546
547
548.. function:: tee(iterable[, n=2])
549
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000550 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000551
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000552 def tee(iterable, n=2):
553 it = iter(iterable)
554 deques = [collections.deque() for i in range(n)]
555 def gen(mydeque):
556 while True:
557 if not mydeque: # when the local deque is empty
558 newval = next(it) # fetch a new value and
559 for d in deques: # load it to all the deques
560 d.append(newval)
561 yield mydeque.popleft()
562 return tuple(gen(d) for d in deques)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000563
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000564 Once :func:`tee` has made a split, the original *iterable* should not be
565 used anywhere else; otherwise, the *iterable* could get advanced without
566 the tee objects being informed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000567
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000568 This itertool may require significant auxiliary storage (depending on how
569 much temporary data needs to be stored). In general, if one iterator uses
570 most or all of the data before another iterator starts, it is faster to use
571 :func:`list` instead of :func:`tee`.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000572
573 .. versionadded:: 2.4
574
575
576.. _itertools-example:
577
578Examples
579--------
580
581The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000582can be combined.
583
584.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000585
Georg Brandl734373c2009-01-03 21:55:17 +0000586 >>> # Show a dictionary sorted and grouped by value
Georg Brandl8ec7f652007-08-15 14:28:01 +0000587 >>> from operator import itemgetter
588 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
589 >>> di = sorted(d.iteritems(), key=itemgetter(1))
590 >>> for k, g in groupby(di, key=itemgetter(1)):
591 ... print k, map(itemgetter(0), g)
592 ...
593 1 ['a', 'c', 'e']
594 2 ['b', 'd', 'f']
595 3 ['g']
596
Georg Brandl734373c2009-01-03 21:55:17 +0000597 >>> # Find runs of consecutive numbers using groupby. The key to the solution
598 >>> # is differencing with a range so that consecutive numbers all appear in
599 >>> # same group.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000600 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
601 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000602 ... print map(itemgetter(1), g)
Georg Brandl734373c2009-01-03 21:55:17 +0000603 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000604 [1]
605 [4, 5, 6]
606 [10]
607 [15, 16, 17, 18]
608 [22]
609 [25, 26, 27, 28]
610
611
612
613.. _itertools-recipes:
614
615Recipes
616-------
617
618This section shows recipes for creating an extended toolset using the existing
619itertools as building blocks.
620
621The extended tools offer the same high performance as the underlying toolset.
622The superior memory performance is kept by processing elements one at a time
623rather than bringing the whole iterable into memory all at once. Code volume is
624kept small by linking the tools together in a functional style which helps
625eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000626"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000627which incur interpreter overhead.
628
629.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000630
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000631 def take(n, iterable):
632 "Return first n items of the iterable as a list"
633 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000634
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000635 def enumerate(iterable, start=0):
636 return izip(count(start), iterable)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000637
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000638 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000639 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000640 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000641
Raymond Hettingerfb30cca2009-03-09 11:58:33 +0000642 def consume(iterator, n):
643 "Advance the iterator n-steps ahead. If n is none, consume entirely."
Raymond Hettingercef40c32010-01-24 03:34:56 +0000644 # The technique uses objects that consume iterators at C speed.
645 if n is None:
646 # feed the entire iterator into a zero-length deque
647 collections.deque(iterator, maxlen=0)
648 else:
649 # advance to the emtpy slice starting at position n
650 next(islice(iterator, n, n), None)
Raymond Hettingerfb30cca2009-03-09 11:58:33 +0000651
Raymond Hettinger5894c2b2009-02-19 05:38:53 +0000652 def nth(iterable, n, default=None):
653 "Returns the nth item or a default value"
654 return next(islice(iterable, n, None), default)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000655
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000656 def quantify(iterable, pred=bool):
657 "Count how many times the predicate is true"
658 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000659
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000660 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000661 """Returns the sequence elements and then returns None indefinitely.
662
663 Useful for emulating the behavior of the built-in map() function.
664 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000665 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000666
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000667 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000668 "Returns the sequence elements n times"
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000669 return chain.from_iterable(repeat(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000670
671 def dotproduct(vec1, vec2):
672 return sum(imap(operator.mul, vec1, vec2))
673
674 def flatten(listOfLists):
Raymond Hettinger330958e62008-02-28 19:41:24 +0000675 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000676
677 def repeatfunc(func, times=None, *args):
678 """Repeat calls to func with specified arguments.
679
680 Example: repeatfunc(random.random)
681 """
682 if times is None:
683 return starmap(func, repeat(args))
Raymond Hettinger330958e62008-02-28 19:41:24 +0000684 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000685
686 def pairwise(iterable):
687 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
688 a, b = tee(iterable)
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000689 next(b, None)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000690 return izip(a, b)
691
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000692 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000693 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000694 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000695 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000696
Raymond Hettingera44327a2008-01-30 22:17:31 +0000697 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000698 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e62008-02-28 19:41:24 +0000699 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000700 pending = len(iterables)
701 nexts = cycle(iter(it).next for it in iterables)
702 while pending:
703 try:
704 for next in nexts:
705 yield next()
706 except StopIteration:
707 pending -= 1
708 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000709
Raymond Hettingere8b4b602008-03-11 00:19:07 +0000710 def compress(data, selectors):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000711 "compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F"
712 return (d for d, s in izip(data, selectors) if s)
Raymond Hettinger33691672008-07-19 00:43:00 +0000713
Georg Brandl8c81fda2008-07-23 16:00:44 +0000714 def combinations_with_replacement(iterable, r):
Raymond Hettinger825758c2009-01-08 05:20:19 +0000715 "combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC"
716 # number items returned: (n+r-1)! / r! / (n-1)!
Georg Brandl8c81fda2008-07-23 16:00:44 +0000717 pool = tuple(iterable)
718 n = len(pool)
Raymond Hettingercdc9f2c2009-01-27 06:38:00 +0000719 if not n and r:
720 return
Georg Brandl8c81fda2008-07-23 16:00:44 +0000721 indices = [0] * r
722 yield tuple(pool[i] for i in indices)
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000723 while True:
Georg Brandl8c81fda2008-07-23 16:00:44 +0000724 for i in reversed(range(r)):
725 if indices[i] != n - 1:
726 break
727 else:
728 return
729 indices[i:] = [indices[i] + 1] * (r - i)
730 yield tuple(pool[i] for i in indices)
Raymond Hettingerd8461652009-01-02 21:20:38 +0000731
Raymond Hettingerd187d522009-02-23 21:23:04 +0000732 def powerset(iterable):
733 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
734 s = list(iterable)
735 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettingerd8461652009-01-02 21:20:38 +0000736
Raymond Hettingerd187d522009-02-23 21:23:04 +0000737 def unique_everseen(iterable, key=None):
738 "List unique elements, preserving order. Remember all elements ever seen."
739 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
740 # unique_everseen('ABBCcAD', str.lower) --> A B C D
741 seen = set()
742 seen_add = seen.add
743 if key is None:
744 for element in iterable:
745 if element not in seen:
746 seen_add(element)
747 yield element
748 else:
749 for element in iterable:
750 k = key(element)
751 if k not in seen:
752 seen_add(k)
753 yield element
754
755 def unique_justseen(iterable, key=None):
756 "List unique elements, preserving order. Remember only the element just seen."
757 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
758 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
759 return imap(next, imap(itemgetter(1), groupby(iterable, key)))
Raymond Hettingercef40c32010-01-24 03:34:56 +0000760
761Note, many of the above recipes can be optimized by replacing global lookups
762with local variables defined as default values. For example, the
763*dotproduct* recipe can be written as:
764
765 def dotproduct(vec1, vec2, sum=sum, imap=imap, mul=operator.mul):
766 return sum(imap(mul, vec1, vec2))