blob: 1e946b11b0fdd7912c8c0e9c3330d17e7df0c31a [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Raymond Hettinger0aee9422009-02-17 11:00:27 +000017This module implements a number of :term:`iterator` building blocks inspired
18by constructs from APL, Haskell, and SML. Each has been recast in a form
19suitable for Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000020
21The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettinger0aee9422009-02-17 11:00:27 +000022useful by themselves or in combination. Together, they form an "iterator
23algebra" making it possible to construct specialized tools succinctly and
24efficiently in pure Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000025
26For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
27sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
Raymond Hettinger0aee9422009-02-17 11:00:27 +000028:func:`count` which can be combined to form ``imap(f, count())`` to produce an
Georg Brandl8ec7f652007-08-15 14:28:01 +000029equivalent result.
30
Raymond Hettinger0aee9422009-02-17 11:00:27 +000031The tools also work well with the high-speed functions in the :mod:`operator`
32module. For example, the plus-operator can be mapped across two vectors to
33form an efficient dot-product: ``sum(imap(operator.add, vector1, vector2))``.
Georg Brandl8ec7f652007-08-15 14:28:01 +000034
Georg Brandl8ec7f652007-08-15 14:28:01 +000035
Raymond Hettinger0aee9422009-02-17 11:00:27 +000036**Infinite Iterators:**
Georg Brandl8ec7f652007-08-15 14:28:01 +000037
Raymond Hettinger0aee9422009-02-17 11:00:27 +000038 ================== ================= =================================================
39 Iterator Arguments Results
40 ================== ================= =================================================
41 :func:`count` start, [step] start, start+step, start+2*step, ...
42 :func:`cycle` p p0, p1, ... plast, p0, p1, ...
43 :func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times
44 ================== ================= =================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000045
Raymond Hettinger0aee9422009-02-17 11:00:27 +000046**Iterators terminating on the shortest input sequence:**
47
48 ==================== ============================ =================================================
49 Iterator Arguments Results
50 ==================== ============================ =================================================
51 :func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ...
52 :func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ...
53 :func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails
54 :func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
55 :func:`ifilter` pred, seq elements of seq where pred(elem) is True
56 :func:`ifilterfalse` pred, seq elements of seq where pred(elem) is False
57 :func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step]
Raymond Hettingerfed84c72009-03-09 11:31:39 +000058 :func:`imap` func, p, q, ... func(p0, q0), func(p1, q1), ...
Raymond Hettinger0aee9422009-02-17 11:00:27 +000059 :func:`starmap` func, seq func(\*seq[0]), fun(\*seq[1]), ...
60 :func:`tee` it, n it1, it2 , ... itn splits one iterator into n
61 :func:`takewhile` pred, seq seq[0], seq[1], until pred fails
62 :func:`izip` p, q, ... (p[0], q[0]), (p[1], q[1]), ...
63 :func:`izip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ...
64 ==================== ============================ =================================================
65
66**Combinatoric generators:**
67
68 ===================================== ==================== =================================================
69 Iterator Arguments Results
70 ===================================== ==================== =================================================
71 :func:`product` p, q, ... [repeat=1] cartesian product
72 :func:`permutations` p[, r] r-length permutations (without repeated elements)
73 :func:`combinations` p[, r] r-length combinations (sorted and no repeats)
74 :func:`combinations_with_replacement` p[, r] r-length combinations (sorted but with repeats)
75 ===================================== ==================== =================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000076
77
78.. _itertools-functions:
79
80Itertool functions
81------------------
82
83The following module functions all construct and return iterators. Some provide
84streams of infinite length, so they should only be accessed by functions or
85loops that truncate the stream.
86
87
88.. function:: chain(*iterables)
89
90 Make an iterator that returns elements from the first iterable until it is
91 exhausted, then proceeds to the next iterable, until all of the iterables are
92 exhausted. Used for treating consecutive sequences as a single sequence.
93 Equivalent to::
94
95 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000096 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000097 for it in iterables:
98 for element in it:
99 yield element
100
101
Raymond Hettinger330958e2008-02-28 19:41:24 +0000102.. function:: itertools.chain.from_iterable(iterable)
103
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000104 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e2008-02-28 19:41:24 +0000105 single iterable argument that is evaluated lazily. Equivalent to::
106
107 @classmethod
108 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000109 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +0000110 for it in iterables:
111 for element in it:
112 yield element
113
114 .. versionadded:: 2.6
115
Raymond Hettingerd553d852008-03-04 04:17:08 +0000116
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000117.. function:: combinations(iterable, r)
118
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000119 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000120
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000121 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000122 input *iterable* is sorted, the combination tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000123 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000124
125 Elements are treated as unique based on their position, not on their
126 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000127 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000128
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000129 Equivalent to::
130
131 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000132 # combinations('ABCD', 2) --> AB AC AD BC BD CD
133 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000134 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000135 n = len(pool)
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000136 if r > n:
137 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000138 indices = range(r)
139 yield tuple(pool[i] for i in indices)
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000140 while True:
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000141 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000142 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000143 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000144 else:
145 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000146 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000147 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000148 indices[j] = indices[j-1] + 1
149 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000150
Raymond Hettingerd553d852008-03-04 04:17:08 +0000151 The code for :func:`combinations` can be also expressed as a subsequence
152 of :func:`permutations` after filtering entries where the elements are not
153 in sorted order (according to their position in the input pool)::
154
155 def combinations(iterable, r):
156 pool = tuple(iterable)
157 n = len(pool)
158 for indices in permutations(range(n), r):
159 if sorted(indices) == list(indices):
160 yield tuple(pool[i] for i in indices)
161
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000162 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
163 or zero when ``r > n``.
164
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000165 .. versionadded:: 2.6
166
Raymond Hettingerd081abc2009-01-27 02:58:49 +0000167.. function:: combinations_with_replacement(iterable, r)
168
169 Return *r* length subsequences of elements from the input *iterable*
170 allowing individual elements to be repeated more than once.
171
172 Combinations are emitted in lexicographic sort order. So, if the
173 input *iterable* is sorted, the combination tuples will be produced
174 in sorted order.
175
176 Elements are treated as unique based on their position, not on their
177 value. So if the input elements are unique, the generated combinations
178 will also be unique.
179
180 Equivalent to::
181
182 def combinations_with_replacement(iterable, r):
183 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
184 pool = tuple(iterable)
185 n = len(pool)
186 if not n and r:
187 return
188 indices = [0] * r
189 yield tuple(pool[i] for i in indices)
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000190 while True:
Raymond Hettingerd081abc2009-01-27 02:58:49 +0000191 for i in reversed(range(r)):
192 if indices[i] != n - 1:
193 break
194 else:
195 return
196 indices[i:] = [indices[i] + 1] * (r - i)
197 yield tuple(pool[i] for i in indices)
198
199 The code for :func:`combinations_with_replacement` can be also expressed as
200 a subsequence of :func:`product` after filtering entries where the elements
201 are not in sorted order (according to their position in the input pool)::
202
203 def combinations_with_replacement(iterable, r):
204 pool = tuple(iterable)
205 n = len(pool)
206 for indices in product(range(n), repeat=r):
207 if sorted(indices) == list(indices):
208 yield tuple(pool[i] for i in indices)
209
210 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
211
212 .. versionadded:: 2.7
213
Raymond Hettinger2bcb8e92009-01-25 21:04:14 +0000214.. function:: compress(data, selectors)
215
216 Make an iterator that filters elements from *data* returning only those that
217 have a corresponding element in *selectors* that evaluates to ``True``.
218 Stops when either the *data* or *selectors* iterables have been exhausted.
219 Equivalent to::
220
221 def compress(data, selectors):
222 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
223 return (d for d, s in izip(data, selectors) if s)
224
225 .. versionadded:: 2.7
226
227
Raymond Hettingera4038032009-02-14 00:25:51 +0000228.. function:: count(start=0, step=1)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000229
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000230 Make an iterator that returns evenly spaced values starting with *n*. Often
231 used as an argument to :func:`imap` to generate consecutive data points.
232 Also, used with :func:`izip` to add sequence numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000233
Raymond Hettingera4038032009-02-14 00:25:51 +0000234 def count(start=0, step=1):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000235 # count(10) --> 10 11 12 13 14 ...
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000236 # count(2.5, 0.5) -> 3.5 3.0 4.5 ...
Raymond Hettingera4038032009-02-14 00:25:51 +0000237 n = start
Georg Brandl8ec7f652007-08-15 14:28:01 +0000238 while True:
239 yield n
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000240 n += step
Georg Brandl8ec7f652007-08-15 14:28:01 +0000241
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000242 .. versionchanged:: 2.7
243 added *step* argument and allowed non-integer arguments.
244
Georg Brandl8ec7f652007-08-15 14:28:01 +0000245.. function:: cycle(iterable)
246
247 Make an iterator returning elements from the iterable and saving a copy of each.
248 When the iterable is exhausted, return elements from the saved copy. Repeats
249 indefinitely. Equivalent to::
250
251 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000252 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000253 saved = []
254 for element in iterable:
255 yield element
256 saved.append(element)
257 while saved:
258 for element in saved:
259 yield element
260
261 Note, this member of the toolkit may require significant auxiliary storage
262 (depending on the length of the iterable).
263
264
265.. function:: dropwhile(predicate, iterable)
266
267 Make an iterator that drops elements from the iterable as long as the predicate
268 is true; afterwards, returns every element. Note, the iterator does not produce
269 *any* output until the predicate first becomes false, so it may have a lengthy
270 start-up time. Equivalent to::
271
272 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000273 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000274 iterable = iter(iterable)
275 for x in iterable:
276 if not predicate(x):
277 yield x
278 break
279 for x in iterable:
280 yield x
281
282
283.. function:: groupby(iterable[, key])
284
285 Make an iterator that returns consecutive keys and groups from the *iterable*.
286 The *key* is a function computing a key value for each element. If not
287 specified or is ``None``, *key* defaults to an identity function and returns
288 the element unchanged. Generally, the iterable needs to already be sorted on
289 the same key function.
290
291 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
292 generates a break or new group every time the value of the key function changes
293 (which is why it is usually necessary to have sorted the data using the same key
294 function). That behavior differs from SQL's GROUP BY which aggregates common
295 elements regardless of their input order.
296
297 The returned group is itself an iterator that shares the underlying iterable
298 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
299 object is advanced, the previous group is no longer visible. So, if that data
300 is needed later, it should be stored as a list::
301
302 groups = []
303 uniquekeys = []
304 data = sorted(data, key=keyfunc)
305 for k, g in groupby(data, keyfunc):
306 groups.append(list(g)) # Store group iterator as a list
307 uniquekeys.append(k)
308
309 :func:`groupby` is equivalent to::
310
311 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000312 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettingerd507afd2009-02-04 10:52:32 +0000313 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000314 def __init__(self, iterable, key=None):
315 if key is None:
316 key = lambda x: x
317 self.keyfunc = key
318 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000319 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000320 def __iter__(self):
321 return self
322 def next(self):
323 while self.currkey == self.tgtkey:
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000324 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000325 self.currkey = self.keyfunc(self.currvalue)
326 self.tgtkey = self.currkey
327 return (self.currkey, self._grouper(self.tgtkey))
328 def _grouper(self, tgtkey):
329 while self.currkey == tgtkey:
330 yield self.currvalue
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000331 self.currvalue = next(self.it) # Exit on StopIteration
Georg Brandl8ec7f652007-08-15 14:28:01 +0000332 self.currkey = self.keyfunc(self.currvalue)
333
334 .. versionadded:: 2.4
335
336
337.. function:: ifilter(predicate, iterable)
338
339 Make an iterator that filters elements from iterable returning only those for
340 which the predicate is ``True``. If *predicate* is ``None``, return the items
341 that are true. Equivalent to::
342
343 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000344 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000345 if predicate is None:
346 predicate = bool
347 for x in iterable:
348 if predicate(x):
349 yield x
350
351
352.. function:: ifilterfalse(predicate, iterable)
353
354 Make an iterator that filters elements from iterable returning only those for
355 which the predicate is ``False``. If *predicate* is ``None``, return the items
356 that are false. Equivalent to::
357
358 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000359 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000360 if predicate is None:
361 predicate = bool
362 for x in iterable:
363 if not predicate(x):
364 yield x
365
366
367.. function:: imap(function, *iterables)
368
369 Make an iterator that computes the function using arguments from each of the
370 iterables. If *function* is set to ``None``, then :func:`imap` returns the
371 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
372 exhausted instead of filling in ``None`` for shorter iterables. The reason for
373 the difference is that infinite iterator arguments are typically an error for
374 :func:`map` (because the output is fully evaluated) but represent a common and
375 useful way of supplying arguments to :func:`imap`. Equivalent to::
376
377 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000378 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000379 iterables = map(iter, iterables)
380 while True:
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000381 args = [next(it) for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000382 if function is None:
383 yield tuple(args)
384 else:
385 yield function(*args)
386
387
388.. function:: islice(iterable, [start,] stop [, step])
389
390 Make an iterator that returns selected elements from the iterable. If *start* is
391 non-zero, then elements from the iterable are skipped until start is reached.
392 Afterward, elements are returned consecutively unless *step* is set higher than
393 one which results in items being skipped. If *stop* is ``None``, then iteration
394 continues until the iterator is exhausted, if at all; otherwise, it stops at the
395 specified position. Unlike regular slicing, :func:`islice` does not support
396 negative values for *start*, *stop*, or *step*. Can be used to extract related
397 fields from data where the internal structure has been flattened (for example, a
398 multi-line report may list a name field on every third line). Equivalent to::
399
400 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000401 # islice('ABCDEFG', 2) --> A B
402 # islice('ABCDEFG', 2, 4) --> C D
403 # islice('ABCDEFG', 2, None) --> C D E F G
404 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000405 s = slice(*args)
406 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000407 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000408 for i, element in enumerate(iterable):
409 if i == nexti:
410 yield element
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000411 nexti = next(it)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000412
413 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
414 then the step defaults to one.
415
416 .. versionchanged:: 2.5
417 accept ``None`` values for default *start* and *step*.
418
419
420.. function:: izip(*iterables)
421
422 Make an iterator that aggregates elements from each of the iterables. Like
423 :func:`zip` except that it returns an iterator instead of a list. Used for
424 lock-step iteration over several iterables at a time. Equivalent to::
425
426 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000427 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000428 iterables = map(iter, iterables)
429 while iterables:
Raymond Hettingerf9bce832009-02-19 05:34:35 +0000430 yield yield tuple(map(next, iterables))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000431
432 .. versionchanged:: 2.4
433 When no iterables are specified, returns a zero length iterator instead of
434 raising a :exc:`TypeError` exception.
435
Raymond Hettinger48c62932008-01-22 19:51:41 +0000436 The left-to-right evaluation order of the iterables is guaranteed. This
437 makes possible an idiom for clustering a data series into n-length groups
438 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000439
Raymond Hettinger48c62932008-01-22 19:51:41 +0000440 :func:`izip` should only be used with unequal length inputs when you don't
441 care about trailing, unmatched values from the longer iterables. If those
442 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000443
444
445.. function:: izip_longest(*iterables[, fillvalue])
446
447 Make an iterator that aggregates elements from each of the iterables. If the
448 iterables are of uneven length, missing values are filled-in with *fillvalue*.
449 Iteration continues until the longest iterable is exhausted. Equivalent to::
450
451 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000452 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000453 fillvalue = kwds.get('fillvalue')
454 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
455 yield counter() # yields the fillvalue, or raises IndexError
456 fillers = repeat(fillvalue)
457 iters = [chain(it, sentinel(), fillers) for it in args]
458 try:
459 for tup in izip(*iters):
460 yield tup
461 except IndexError:
462 pass
463
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000464 If one of the iterables is potentially infinite, then the
465 :func:`izip_longest` function should be wrapped with something that limits
466 the number of calls (for example :func:`islice` or :func:`takewhile`). If
467 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000468
469 .. versionadded:: 2.6
470
Raymond Hettinger330958e2008-02-28 19:41:24 +0000471.. function:: permutations(iterable[, r])
472
473 Return successive *r* length permutations of elements in the *iterable*.
474
475 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000476 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e2008-02-28 19:41:24 +0000477 are generated.
478
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000479 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e2008-02-28 19:41:24 +0000480 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000481 in sorted order.
Raymond Hettinger330958e2008-02-28 19:41:24 +0000482
483 Elements are treated as unique based on their position, not on their
484 value. So if the input elements are unique, there will be no repeat
485 values in each permutation.
486
Raymond Hettingerf287f172008-03-02 10:59:31 +0000487 Equivalent to::
488
489 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000490 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
491 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000492 pool = tuple(iterable)
493 n = len(pool)
494 r = n if r is None else r
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000495 if r > n:
496 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000497 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000498 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000499 yield tuple(pool[i] for i in indices[:r])
500 while n:
501 for i in reversed(range(r)):
502 cycles[i] -= 1
503 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000504 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000505 cycles[i] = n - i
506 else:
507 j = cycles[i]
508 indices[i], indices[-j] = indices[-j], indices[i]
509 yield tuple(pool[i] for i in indices[:r])
510 break
511 else:
512 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000513
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000514 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000515 :func:`product`, filtered to exclude entries with repeated elements (those
516 from the same position in the input pool)::
517
518 def permutations(iterable, r=None):
519 pool = tuple(iterable)
520 n = len(pool)
521 r = n if r is None else r
522 for indices in product(range(n), repeat=r):
523 if len(set(indices)) == r:
524 yield tuple(pool[i] for i in indices)
525
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000526 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
527 or zero when ``r > n``.
528
Raymond Hettinger330958e2008-02-28 19:41:24 +0000529 .. versionadded:: 2.6
530
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000531.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000532
533 Cartesian product of input iterables.
534
535 Equivalent to nested for-loops in a generator expression. For example,
536 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
537
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000538 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000539 on every iteration. This pattern creates a lexicographic ordering so that if
540 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000541 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000542
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000543 To compute the product of an iterable with itself, specify the number of
544 repetitions with the optional *repeat* keyword argument. For example,
545 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
546
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000547 This function is equivalent to the following code, except that the
548 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000549
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000550 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000551 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
552 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000553 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000554 result = [[]]
555 for pool in pools:
556 result = [x+[y] for x in result for y in pool]
557 for prod in result:
558 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000559
560 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000561
562.. function:: repeat(object[, times])
563
564 Make an iterator that returns *object* over and over again. Runs indefinitely
565 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000566 invariant function parameters. Also used with :func:`izip` to create constant
567 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000568
569 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000570 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000571 if times is None:
572 while True:
573 yield object
574 else:
575 for i in xrange(times):
576 yield object
577
578
579.. function:: starmap(function, iterable)
580
Raymond Hettinger47317092008-01-17 03:02:14 +0000581 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000582 the iterable. Used instead of :func:`imap` when argument parameters are already
583 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
584 difference between :func:`imap` and :func:`starmap` parallels the distinction
585 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
586
587 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000588 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000589 for args in iterable:
590 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000591
Raymond Hettinger47317092008-01-17 03:02:14 +0000592 .. versionchanged:: 2.6
593 Previously, :func:`starmap` required the function arguments to be tuples.
594 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000595
596.. function:: takewhile(predicate, iterable)
597
598 Make an iterator that returns elements from the iterable as long as the
599 predicate is true. Equivalent to::
600
601 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000602 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000603 for x in iterable:
604 if predicate(x):
605 yield x
606 else:
607 break
608
609
610.. function:: tee(iterable[, n=2])
611
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000612 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000613
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000614 def tee(iterable, n=2):
615 it = iter(iterable)
616 deques = [collections.deque() for i in range(n)]
617 def gen(mydeque):
618 while True:
619 if not mydeque: # when the local deque is empty
620 newval = next(it) # fetch a new value and
621 for d in deques: # load it to all the deques
622 d.append(newval)
623 yield mydeque.popleft()
624 return tuple(gen(d) for d in deques)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000625
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000626 Once :func:`tee` has made a split, the original *iterable* should not be
627 used anywhere else; otherwise, the *iterable* could get advanced without
628 the tee objects being informed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000629
Raymond Hettingerc8223b02009-02-18 20:54:53 +0000630 This itertool may require significant auxiliary storage (depending on how
631 much temporary data needs to be stored). In general, if one iterator uses
632 most or all of the data before another iterator starts, it is faster to use
633 :func:`list` instead of :func:`tee`.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000634
635 .. versionadded:: 2.4
636
637
638.. _itertools-example:
639
640Examples
641--------
642
643The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000644can be combined.
645
646.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000647
Benjamin Peterson8ea99992009-01-01 16:43:12 +0000648 >>> # Show a dictionary sorted and grouped by value
Georg Brandl8ec7f652007-08-15 14:28:01 +0000649 >>> from operator import itemgetter
650 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
651 >>> di = sorted(d.iteritems(), key=itemgetter(1))
652 >>> for k, g in groupby(di, key=itemgetter(1)):
653 ... print k, map(itemgetter(0), g)
654 ...
655 1 ['a', 'c', 'e']
656 2 ['b', 'd', 'f']
657 3 ['g']
658
Benjamin Peterson8ea99992009-01-01 16:43:12 +0000659 >>> # Find runs of consecutive numbers using groupby. The key to the solution
660 >>> # is differencing with a range so that consecutive numbers all appear in
661 >>> # same group.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000662 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
663 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000664 ... print map(itemgetter(1), g)
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000665 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000666 [1]
667 [4, 5, 6]
668 [10]
669 [15, 16, 17, 18]
670 [22]
671 [25, 26, 27, 28]
672
673
674
675.. _itertools-recipes:
676
677Recipes
678-------
679
680This section shows recipes for creating an extended toolset using the existing
681itertools as building blocks.
682
683The extended tools offer the same high performance as the underlying toolset.
684The superior memory performance is kept by processing elements one at a time
685rather than bringing the whole iterable into memory all at once. Code volume is
686kept small by linking the tools together in a functional style which helps
687eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000688"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000689which incur interpreter overhead.
690
691.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000692
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000693 def take(n, iterable):
694 "Return first n items of the iterable as a list"
695 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000696
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000697 def enumerate(iterable, start=0):
698 return izip(count(start), iterable)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000699
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000700 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000701 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000702 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000703
Raymond Hettinger3496a892009-03-09 11:57:29 +0000704 def consume(iterator, n):
705 "Advance the iterator n-steps ahead. If n is none, consume entirely."
706 collections.deque(islice(iterator, n), maxlen=0)
707
Raymond Hettingerf9bce832009-02-19 05:34:35 +0000708 def nth(iterable, n, default=None):
709 "Returns the nth item or a default value"
710 return next(islice(iterable, n, None), default)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000711
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000712 def quantify(iterable, pred=bool):
713 "Count how many times the predicate is true"
714 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000715
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000716 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000717 """Returns the sequence elements and then returns None indefinitely.
718
719 Useful for emulating the behavior of the built-in map() function.
720 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000721 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000722
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000723 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000724 "Returns the sequence elements n times"
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000725 return chain.from_iterable(repeat(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000726
727 def dotproduct(vec1, vec2):
728 return sum(imap(operator.mul, vec1, vec2))
729
730 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000731 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000732
733 def repeatfunc(func, times=None, *args):
734 """Repeat calls to func with specified arguments.
735
736 Example: repeatfunc(random.random)
737 """
738 if times is None:
739 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000740 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000741
742 def pairwise(iterable):
743 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
744 a, b = tee(iterable)
Raymond Hettingerd47442e2009-02-23 19:32:55 +0000745 next(b, None)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000746 return izip(a, b)
747
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000748 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000749 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000750 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000751 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000752
Raymond Hettingera44327a2008-01-30 22:17:31 +0000753 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000754 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000755 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000756 pending = len(iterables)
757 nexts = cycle(iter(it).next for it in iterables)
758 while pending:
759 try:
760 for next in nexts:
761 yield next()
762 except StopIteration:
763 pending -= 1
764 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000765
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000766 def powerset(iterable):
Raymond Hettinger68d919e2009-01-25 21:31:47 +0000767 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
768 s = list(iterable)
769 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000770
Benjamin Peterson48291362009-01-31 20:01:48 +0000771 def unique_everseen(iterable, key=None):
772 "List unique elements, preserving order. Remember all elements ever seen."
773 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
774 # unique_everseen('ABBCcAD', str.lower) --> A B C D
775 seen = set()
776 seen_add = seen.add
777 if key is None:
778 for element in iterable:
779 if element not in seen:
780 seen_add(element)
781 yield element
782 else:
783 for element in iterable:
784 k = key(element)
785 if k not in seen:
786 seen_add(k)
787 yield element
Raymond Hettinger44e15812009-01-02 21:26:45 +0000788
Benjamin Peterson48291362009-01-31 20:01:48 +0000789 def unique_justseen(iterable, key=None):
790 "List unique elements, preserving order. Remember only the element just seen."
791 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
792 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
793 return imap(next, imap(itemgetter(1), groupby(iterable, key)))