blob: 87e99d04f071e9608b834a5630f8305cf5ccef45 [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Raymond Hettingerd187d522009-02-23 21:23:04 +000017This module implements a number of :term:`iterator` building blocks inspired
18by constructs from APL, Haskell, and SML. Each has been recast in a form
19suitable for Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000020
21The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettingerd187d522009-02-23 21:23:04 +000022useful by themselves or in combination. Together, they form an "iterator
23algebra" making it possible to construct specialized tools succinctly and
24efficiently in pure Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000025
26For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
27sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
Raymond Hettingerd187d522009-02-23 21:23:04 +000028:func:`count` which can be combined to form ``imap(f, count())`` to produce an
Georg Brandl8ec7f652007-08-15 14:28:01 +000029equivalent result.
30
Raymond Hettinger2b458192009-03-12 00:35:05 +000031These tools and their built-in counterparts also work well with the high-speed
32functions in the :mod:`operator` module. For example, the multiplication
33operator can be mapped across two vectors to form an efficient dot-product:
34``sum(imap(operator.mul, vector1, vector2))``.
Georg Brandl8ec7f652007-08-15 14:28:01 +000035
Georg Brandl8ec7f652007-08-15 14:28:01 +000036
Raymond Hettingerd187d522009-02-23 21:23:04 +000037**Infinite Iterators:**
Georg Brandl8ec7f652007-08-15 14:28:01 +000038
Raymond Hettinger92d19512009-04-10 18:52:23 +000039================== ================= ================================================= =========================================
40Iterator Arguments Results Example
41================== ================= ================================================= =========================================
Raymond Hettinger099769c2009-04-10 18:58:58 +000042:func:`count` start start, start+1, start+2, ... ``count(10) --> 10 11 12 13 14 ...``
Raymond Hettinger92d19512009-04-10 18:52:23 +000043:func:`cycle` p p0, p1, ... plast, p0, p1, ... ``cycle('ABCD') --> A B C D A B C D ...``
44:func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times ``repeat(10, 3) --> 10 10 10``
45================== ================= ================================================= =========================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000046
Raymond Hettingerd187d522009-02-23 21:23:04 +000047**Iterators terminating on the shortest input sequence:**
48
Raymond Hettinger92d19512009-04-10 18:52:23 +000049==================== ============================ ================================================= =============================================================
50Iterator Arguments Results Example
51==================== ============================ ================================================= =============================================================
52:func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ... ``chain('ABC', 'DEF') --> A B C D E F``
53:func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails ``dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1``
54:func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
55:func:`ifilter` pred, seq elements of seq where pred(elem) is True ``ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9``
56:func:`ifilterfalse` pred, seq elements of seq where pred(elem) is False ``ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8``
57:func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step] ``islice('ABCDEFG', 2, None) --> C D E F G``
58:func:`imap` func, p, q, ... func(p0, q0), func(p1, q1), ... ``imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000``
59:func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ... ``starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000``
60:func:`tee` it, n it1, it2 , ... itn splits one iterator into n
61:func:`takewhile` pred, seq seq[0], seq[1], until pred fails ``takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4``
62:func:`izip` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``izip('ABCD', 'xy') --> Ax By``
63:func:`izip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-``
64==================== ============================ ================================================= =============================================================
Raymond Hettingerd187d522009-02-23 21:23:04 +000065
66**Combinatoric generators:**
67
Raymond Hettinger21fde352009-04-10 19:46:51 +000068============================================== ==================== =============================================================
69Iterator Arguments Results
70============================================== ==================== =============================================================
71:func:`product` p, q, ... [repeat=1] cartesian product, equivalent to a nested for-loop
72:func:`permutations` p[, r] r-length tuples, all possible orderings, no repeated elements
Raymond Hettinger1aef4442009-11-19 01:26:23 +000073:func:`combinations` p, r r-length tuples, in sorted order, no repeated elements
Raymond Hettinger21fde352009-04-10 19:46:51 +000074|
75``product('ABCD', repeat=2)`` ``AA AB AC AD BA BB BC BD CA CB CC CD DA DB DC DD``
76``permutations('ABCD', 2)`` ``AB AC AD BA BC BD CA CB CD DA DB DC``
77``combinations('ABCD', 2)`` ``AB AC AD BC BD CD``
78============================================== ==================== =============================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000079
80
81.. _itertools-functions:
82
83Itertool functions
84------------------
85
86The following module functions all construct and return iterators. Some provide
87streams of infinite length, so they should only be accessed by functions or
88loops that truncate the stream.
89
90
91.. function:: chain(*iterables)
92
93 Make an iterator that returns elements from the first iterable until it is
94 exhausted, then proceeds to the next iterable, until all of the iterables are
95 exhausted. Used for treating consecutive sequences as a single sequence.
96 Equivalent to::
97
98 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000099 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +0000100 for it in iterables:
101 for element in it:
102 yield element
103
104
Raymond Hettinger330958e2008-02-28 19:41:24 +0000105.. function:: itertools.chain.from_iterable(iterable)
106
Georg Brandl734373c2009-01-03 21:55:17 +0000107 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e2008-02-28 19:41:24 +0000108 single iterable argument that is evaluated lazily. Equivalent to::
109
110 @classmethod
111 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000112 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +0000113 for it in iterables:
114 for element in it:
115 yield element
116
117 .. versionadded:: 2.6
118
Raymond Hettingerd553d852008-03-04 04:17:08 +0000119
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000120.. function:: combinations(iterable, r)
121
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000122 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000123
Georg Brandl734373c2009-01-03 21:55:17 +0000124 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000125 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl734373c2009-01-03 21:55:17 +0000126 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000127
128 Elements are treated as unique based on their position, not on their
129 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000130 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000131
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000132 Equivalent to::
133
134 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000135 # combinations('ABCD', 2) --> AB AC AD BC BD CD
136 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000137 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000138 n = len(pool)
Raymond Hettinger825758c2009-01-08 05:20:19 +0000139 if r > n:
140 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000141 indices = range(r)
142 yield tuple(pool[i] for i in indices)
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000143 while True:
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000144 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000145 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000146 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000147 else:
148 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000149 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000150 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000151 indices[j] = indices[j-1] + 1
152 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000153
Raymond Hettingerd553d852008-03-04 04:17:08 +0000154 The code for :func:`combinations` can be also expressed as a subsequence
155 of :func:`permutations` after filtering entries where the elements are not
156 in sorted order (according to their position in the input pool)::
157
158 def combinations(iterable, r):
159 pool = tuple(iterable)
160 n = len(pool)
161 for indices in permutations(range(n), r):
162 if sorted(indices) == list(indices):
163 yield tuple(pool[i] for i in indices)
164
Raymond Hettinger825758c2009-01-08 05:20:19 +0000165 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
166 or zero when ``r > n``.
167
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000168 .. versionadded:: 2.6
169
Georg Brandl8ec7f652007-08-15 14:28:01 +0000170.. function:: count([n])
171
172 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettinger50e90e22007-10-04 00:20:27 +0000173 specified *n* defaults to zero. Often used as an argument to :func:`imap` to
174 generate consecutive data points. Also, used with :func:`izip` to add sequence
175 numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000176
177 def count(n=0):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000178 # count(10) --> 10 11 12 13 14 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000179 while True:
180 yield n
181 n += 1
182
Georg Brandl8ec7f652007-08-15 14:28:01 +0000183
184.. function:: cycle(iterable)
185
186 Make an iterator returning elements from the iterable and saving a copy of each.
187 When the iterable is exhausted, return elements from the saved copy. Repeats
188 indefinitely. Equivalent to::
189
190 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000191 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000192 saved = []
193 for element in iterable:
194 yield element
195 saved.append(element)
196 while saved:
197 for element in saved:
198 yield element
199
200 Note, this member of the toolkit may require significant auxiliary storage
201 (depending on the length of the iterable).
202
203
204.. function:: dropwhile(predicate, iterable)
205
206 Make an iterator that drops elements from the iterable as long as the predicate
207 is true; afterwards, returns every element. Note, the iterator does not produce
208 *any* output until the predicate first becomes false, so it may have a lengthy
209 start-up time. Equivalent to::
210
211 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000212 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000213 iterable = iter(iterable)
214 for x in iterable:
215 if not predicate(x):
216 yield x
217 break
218 for x in iterable:
219 yield x
220
221
222.. function:: groupby(iterable[, key])
223
224 Make an iterator that returns consecutive keys and groups from the *iterable*.
225 The *key* is a function computing a key value for each element. If not
226 specified or is ``None``, *key* defaults to an identity function and returns
227 the element unchanged. Generally, the iterable needs to already be sorted on
228 the same key function.
229
230 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
231 generates a break or new group every time the value of the key function changes
232 (which is why it is usually necessary to have sorted the data using the same key
233 function). That behavior differs from SQL's GROUP BY which aggregates common
234 elements regardless of their input order.
235
236 The returned group is itself an iterator that shares the underlying iterable
237 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
238 object is advanced, the previous group is no longer visible. So, if that data
239 is needed later, it should be stored as a list::
240
241 groups = []
242 uniquekeys = []
243 data = sorted(data, key=keyfunc)
244 for k, g in groupby(data, keyfunc):
245 groups.append(list(g)) # Store group iterator as a list
246 uniquekeys.append(k)
247
248 :func:`groupby` is equivalent to::
249
250 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000251 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettinger11485b42009-02-04 19:34:31 +0000252 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000253 def __init__(self, iterable, key=None):
254 if key is None:
255 key = lambda x: x
256 self.keyfunc = key
257 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000258 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000259 def __iter__(self):
260 return self
261 def next(self):
262 while self.currkey == self.tgtkey:
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000263 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000264 self.currkey = self.keyfunc(self.currvalue)
265 self.tgtkey = self.currkey
266 return (self.currkey, self._grouper(self.tgtkey))
267 def _grouper(self, tgtkey):
268 while self.currkey == tgtkey:
269 yield self.currvalue
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000270 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000271 self.currkey = self.keyfunc(self.currvalue)
272
273 .. versionadded:: 2.4
274
275
276.. function:: ifilter(predicate, iterable)
277
278 Make an iterator that filters elements from iterable returning only those for
279 which the predicate is ``True``. If *predicate* is ``None``, return the items
280 that are true. Equivalent to::
281
282 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000283 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000284 if predicate is None:
285 predicate = bool
286 for x in iterable:
287 if predicate(x):
288 yield x
289
290
291.. function:: ifilterfalse(predicate, iterable)
292
293 Make an iterator that filters elements from iterable returning only those for
294 which the predicate is ``False``. If *predicate* is ``None``, return the items
295 that are false. Equivalent to::
296
297 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000298 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000299 if predicate is None:
300 predicate = bool
301 for x in iterable:
302 if not predicate(x):
303 yield x
304
305
306.. function:: imap(function, *iterables)
307
308 Make an iterator that computes the function using arguments from each of the
309 iterables. If *function* is set to ``None``, then :func:`imap` returns the
310 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
311 exhausted instead of filling in ``None`` for shorter iterables. The reason for
312 the difference is that infinite iterator arguments are typically an error for
313 :func:`map` (because the output is fully evaluated) but represent a common and
314 useful way of supplying arguments to :func:`imap`. Equivalent to::
315
316 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000317 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000318 iterables = map(iter, iterables)
319 while True:
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000320 args = [next(it) for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000321 if function is None:
322 yield tuple(args)
323 else:
324 yield function(*args)
325
326
327.. function:: islice(iterable, [start,] stop [, step])
328
329 Make an iterator that returns selected elements from the iterable. If *start* is
330 non-zero, then elements from the iterable are skipped until start is reached.
331 Afterward, elements are returned consecutively unless *step* is set higher than
332 one which results in items being skipped. If *stop* is ``None``, then iteration
333 continues until the iterator is exhausted, if at all; otherwise, it stops at the
334 specified position. Unlike regular slicing, :func:`islice` does not support
335 negative values for *start*, *stop*, or *step*. Can be used to extract related
336 fields from data where the internal structure has been flattened (for example, a
337 multi-line report may list a name field on every third line). Equivalent to::
338
339 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000340 # islice('ABCDEFG', 2) --> A B
341 # islice('ABCDEFG', 2, 4) --> C D
342 # islice('ABCDEFG', 2, None) --> C D E F G
343 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000344 s = slice(*args)
345 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000346 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000347 for i, element in enumerate(iterable):
348 if i == nexti:
349 yield element
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000350 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000351
352 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
353 then the step defaults to one.
354
355 .. versionchanged:: 2.5
356 accept ``None`` values for default *start* and *step*.
357
358
359.. function:: izip(*iterables)
360
361 Make an iterator that aggregates elements from each of the iterables. Like
362 :func:`zip` except that it returns an iterator instead of a list. Used for
363 lock-step iteration over several iterables at a time. Equivalent to::
364
365 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000366 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000367 iterables = map(iter, iterables)
368 while iterables:
Raymond Hettingerceecb8d2009-04-20 18:23:26 +0000369 yield tuple(map(next, iterables))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000370
371 .. versionchanged:: 2.4
372 When no iterables are specified, returns a zero length iterator instead of
373 raising a :exc:`TypeError` exception.
374
Raymond Hettinger48c62932008-01-22 19:51:41 +0000375 The left-to-right evaluation order of the iterables is guaranteed. This
376 makes possible an idiom for clustering a data series into n-length groups
377 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000378
Raymond Hettinger48c62932008-01-22 19:51:41 +0000379 :func:`izip` should only be used with unequal length inputs when you don't
380 care about trailing, unmatched values from the longer iterables. If those
381 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000382
383
384.. function:: izip_longest(*iterables[, fillvalue])
385
386 Make an iterator that aggregates elements from each of the iterables. If the
387 iterables are of uneven length, missing values are filled-in with *fillvalue*.
388 Iteration continues until the longest iterable is exhausted. Equivalent to::
389
390 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000391 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000392 fillvalue = kwds.get('fillvalue')
393 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
394 yield counter() # yields the fillvalue, or raises IndexError
395 fillers = repeat(fillvalue)
396 iters = [chain(it, sentinel(), fillers) for it in args]
397 try:
398 for tup in izip(*iters):
399 yield tup
400 except IndexError:
401 pass
402
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000403 If one of the iterables is potentially infinite, then the
404 :func:`izip_longest` function should be wrapped with something that limits
405 the number of calls (for example :func:`islice` or :func:`takewhile`). If
406 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000407
408 .. versionadded:: 2.6
409
Raymond Hettinger330958e2008-02-28 19:41:24 +0000410.. function:: permutations(iterable[, r])
411
412 Return successive *r* length permutations of elements in the *iterable*.
413
414 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl734373c2009-01-03 21:55:17 +0000415 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e2008-02-28 19:41:24 +0000416 are generated.
417
Georg Brandl734373c2009-01-03 21:55:17 +0000418 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e2008-02-28 19:41:24 +0000419 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl734373c2009-01-03 21:55:17 +0000420 in sorted order.
Raymond Hettinger330958e2008-02-28 19:41:24 +0000421
422 Elements are treated as unique based on their position, not on their
423 value. So if the input elements are unique, there will be no repeat
424 values in each permutation.
425
Raymond Hettingerf287f172008-03-02 10:59:31 +0000426 Equivalent to::
427
428 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000429 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
430 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000431 pool = tuple(iterable)
432 n = len(pool)
433 r = n if r is None else r
Raymond Hettinger825758c2009-01-08 05:20:19 +0000434 if r > n:
435 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000436 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000437 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000438 yield tuple(pool[i] for i in indices[:r])
439 while n:
440 for i in reversed(range(r)):
441 cycles[i] -= 1
442 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000443 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000444 cycles[i] = n - i
445 else:
446 j = cycles[i]
447 indices[i], indices[-j] = indices[-j], indices[i]
448 yield tuple(pool[i] for i in indices[:r])
449 break
450 else:
451 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000452
Georg Brandl734373c2009-01-03 21:55:17 +0000453 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000454 :func:`product`, filtered to exclude entries with repeated elements (those
455 from the same position in the input pool)::
456
457 def permutations(iterable, r=None):
458 pool = tuple(iterable)
459 n = len(pool)
460 r = n if r is None else r
461 for indices in product(range(n), repeat=r):
462 if len(set(indices)) == r:
463 yield tuple(pool[i] for i in indices)
464
Raymond Hettinger825758c2009-01-08 05:20:19 +0000465 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
466 or zero when ``r > n``.
467
Raymond Hettinger330958e2008-02-28 19:41:24 +0000468 .. versionadded:: 2.6
469
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000470.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000471
472 Cartesian product of input iterables.
473
474 Equivalent to nested for-loops in a generator expression. For example,
475 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
476
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000477 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000478 on every iteration. This pattern creates a lexicographic ordering so that if
479 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000480 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000481
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000482 To compute the product of an iterable with itself, specify the number of
483 repetitions with the optional *repeat* keyword argument. For example,
484 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
485
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000486 This function is equivalent to the following code, except that the
487 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000488
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000489 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000490 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
491 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000492 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000493 result = [[]]
494 for pool in pools:
495 result = [x+[y] for x in result for y in pool]
496 for prod in result:
497 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000498
499 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000500
501.. function:: repeat(object[, times])
502
503 Make an iterator that returns *object* over and over again. Runs indefinitely
504 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000505 invariant function parameters. Also used with :func:`izip` to create constant
506 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000507
508 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000509 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000510 if times is None:
511 while True:
512 yield object
513 else:
514 for i in xrange(times):
515 yield object
516
517
518.. function:: starmap(function, iterable)
519
Raymond Hettinger47317092008-01-17 03:02:14 +0000520 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000521 the iterable. Used instead of :func:`imap` when argument parameters are already
522 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
523 difference between :func:`imap` and :func:`starmap` parallels the distinction
524 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
525
526 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000527 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000528 for args in iterable:
529 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000530
Raymond Hettinger47317092008-01-17 03:02:14 +0000531 .. versionchanged:: 2.6
532 Previously, :func:`starmap` required the function arguments to be tuples.
533 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000534
535.. function:: takewhile(predicate, iterable)
536
537 Make an iterator that returns elements from the iterable as long as the
538 predicate is true. Equivalent to::
539
540 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000541 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000542 for x in iterable:
543 if predicate(x):
544 yield x
545 else:
546 break
547
548
549.. function:: tee(iterable[, n=2])
550
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000551 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000552
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000553 def tee(iterable, n=2):
554 it = iter(iterable)
555 deques = [collections.deque() for i in range(n)]
556 def gen(mydeque):
557 while True:
558 if not mydeque: # when the local deque is empty
559 newval = next(it) # fetch a new value and
560 for d in deques: # load it to all the deques
561 d.append(newval)
562 yield mydeque.popleft()
563 return tuple(gen(d) for d in deques)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000564
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000565 Once :func:`tee` has made a split, the original *iterable* should not be
566 used anywhere else; otherwise, the *iterable* could get advanced without
567 the tee objects being informed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000568
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000569 This itertool may require significant auxiliary storage (depending on how
570 much temporary data needs to be stored). In general, if one iterator uses
571 most or all of the data before another iterator starts, it is faster to use
572 :func:`list` instead of :func:`tee`.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000573
574 .. versionadded:: 2.4
575
576
577.. _itertools-example:
578
579Examples
580--------
581
582The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000583can be combined.
584
585.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000586
Georg Brandl734373c2009-01-03 21:55:17 +0000587 >>> # Show a dictionary sorted and grouped by value
Georg Brandl8ec7f652007-08-15 14:28:01 +0000588 >>> from operator import itemgetter
589 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
590 >>> di = sorted(d.iteritems(), key=itemgetter(1))
591 >>> for k, g in groupby(di, key=itemgetter(1)):
592 ... print k, map(itemgetter(0), g)
593 ...
594 1 ['a', 'c', 'e']
595 2 ['b', 'd', 'f']
596 3 ['g']
597
Georg Brandl734373c2009-01-03 21:55:17 +0000598 >>> # Find runs of consecutive numbers using groupby. The key to the solution
599 >>> # is differencing with a range so that consecutive numbers all appear in
600 >>> # same group.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000601 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
602 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000603 ... print map(itemgetter(1), g)
Georg Brandl734373c2009-01-03 21:55:17 +0000604 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000605 [1]
606 [4, 5, 6]
607 [10]
608 [15, 16, 17, 18]
609 [22]
610 [25, 26, 27, 28]
611
612
613
614.. _itertools-recipes:
615
616Recipes
617-------
618
619This section shows recipes for creating an extended toolset using the existing
620itertools as building blocks.
621
622The extended tools offer the same high performance as the underlying toolset.
623The superior memory performance is kept by processing elements one at a time
624rather than bringing the whole iterable into memory all at once. Code volume is
625kept small by linking the tools together in a functional style which helps
626eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000627"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000628which incur interpreter overhead.
629
630.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000631
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000632 def take(n, iterable):
633 "Return first n items of the iterable as a list"
634 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000635
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000636 def enumerate(iterable, start=0):
637 return izip(count(start), iterable)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000638
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000639 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000640 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000641 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000642
Raymond Hettingerfb30cca2009-03-09 11:58:33 +0000643 def consume(iterator, n):
644 "Advance the iterator n-steps ahead. If n is none, consume entirely."
645 collections.deque(islice(iterator, n), maxlen=0)
646
Raymond Hettinger5894c2b2009-02-19 05:38:53 +0000647 def nth(iterable, n, default=None):
648 "Returns the nth item or a default value"
649 return next(islice(iterable, n, None), default)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000650
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000651 def quantify(iterable, pred=bool):
652 "Count how many times the predicate is true"
653 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000654
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000655 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000656 """Returns the sequence elements and then returns None indefinitely.
657
658 Useful for emulating the behavior of the built-in map() function.
659 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000660 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000661
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000662 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000663 "Returns the sequence elements n times"
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000664 return chain.from_iterable(repeat(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000665
666 def dotproduct(vec1, vec2):
667 return sum(imap(operator.mul, vec1, vec2))
668
669 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000670 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000671
672 def repeatfunc(func, times=None, *args):
673 """Repeat calls to func with specified arguments.
674
675 Example: repeatfunc(random.random)
676 """
677 if times is None:
678 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000679 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000680
681 def pairwise(iterable):
682 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
683 a, b = tee(iterable)
Raymond Hettingerfc9a6652009-02-23 19:59:57 +0000684 next(b, None)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000685 return izip(a, b)
686
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000687 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000688 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000689 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000690 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000691
Raymond Hettingera44327a2008-01-30 22:17:31 +0000692 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000693 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000694 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000695 pending = len(iterables)
696 nexts = cycle(iter(it).next for it in iterables)
697 while pending:
698 try:
699 for next in nexts:
700 yield next()
701 except StopIteration:
702 pending -= 1
703 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000704
Raymond Hettingere8b4b602008-03-11 00:19:07 +0000705 def compress(data, selectors):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000706 "compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F"
707 return (d for d, s in izip(data, selectors) if s)
Raymond Hettinger33691672008-07-19 00:43:00 +0000708
Georg Brandl8c81fda2008-07-23 16:00:44 +0000709 def combinations_with_replacement(iterable, r):
Raymond Hettinger825758c2009-01-08 05:20:19 +0000710 "combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC"
711 # number items returned: (n+r-1)! / r! / (n-1)!
Georg Brandl8c81fda2008-07-23 16:00:44 +0000712 pool = tuple(iterable)
713 n = len(pool)
Raymond Hettingercdc9f2c2009-01-27 06:38:00 +0000714 if not n and r:
715 return
Georg Brandl8c81fda2008-07-23 16:00:44 +0000716 indices = [0] * r
717 yield tuple(pool[i] for i in indices)
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000718 while True:
Georg Brandl8c81fda2008-07-23 16:00:44 +0000719 for i in reversed(range(r)):
720 if indices[i] != n - 1:
721 break
722 else:
723 return
724 indices[i:] = [indices[i] + 1] * (r - i)
725 yield tuple(pool[i] for i in indices)
Raymond Hettingerd8461652009-01-02 21:20:38 +0000726
Raymond Hettingerd187d522009-02-23 21:23:04 +0000727 def powerset(iterable):
728 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
729 s = list(iterable)
730 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettingerd8461652009-01-02 21:20:38 +0000731
Raymond Hettingerd187d522009-02-23 21:23:04 +0000732 def unique_everseen(iterable, key=None):
733 "List unique elements, preserving order. Remember all elements ever seen."
734 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
735 # unique_everseen('ABBCcAD', str.lower) --> A B C D
736 seen = set()
737 seen_add = seen.add
738 if key is None:
739 for element in iterable:
740 if element not in seen:
741 seen_add(element)
742 yield element
743 else:
744 for element in iterable:
745 k = key(element)
746 if k not in seen:
747 seen_add(k)
748 yield element
749
750 def unique_justseen(iterable, key=None):
751 "List unique elements, preserving order. Remember only the element just seen."
752 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
753 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
754 return imap(next, imap(itemgetter(1), groupby(iterable, key)))