blob: 64aa634a2b92b1f20ef6055d6a4061c2ec345be9 [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Raymond Hettinger0aee9422009-02-17 11:00:27 +000017This module implements a number of :term:`iterator` building blocks inspired
18by constructs from APL, Haskell, and SML. Each has been recast in a form
19suitable for Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000020
21The module standardizes a core set of fast, memory efficient tools that are
Raymond Hettinger0aee9422009-02-17 11:00:27 +000022useful by themselves or in combination. Together, they form an "iterator
23algebra" making it possible to construct specialized tools succinctly and
24efficiently in pure Python.
Georg Brandl8ec7f652007-08-15 14:28:01 +000025
26For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
27sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
Raymond Hettinger0aee9422009-02-17 11:00:27 +000028:func:`count` which can be combined to form ``imap(f, count())`` to produce an
Georg Brandl8ec7f652007-08-15 14:28:01 +000029equivalent result.
30
Raymond Hettinger0aee9422009-02-17 11:00:27 +000031The tools also work well with the high-speed functions in the :mod:`operator`
32module. For example, the plus-operator can be mapped across two vectors to
33form an efficient dot-product: ``sum(imap(operator.add, vector1, vector2))``.
Georg Brandl8ec7f652007-08-15 14:28:01 +000034
Georg Brandl8ec7f652007-08-15 14:28:01 +000035
Raymond Hettinger0aee9422009-02-17 11:00:27 +000036**Infinite Iterators:**
Georg Brandl8ec7f652007-08-15 14:28:01 +000037
Raymond Hettinger0aee9422009-02-17 11:00:27 +000038 ================== ================= =================================================
39 Iterator Arguments Results
40 ================== ================= =================================================
41 :func:`count` start, [step] start, start+step, start+2*step, ...
42 :func:`cycle` p p0, p1, ... plast, p0, p1, ...
43 :func:`repeat` elem [,n] elem, elem, elem, ... endlessly or up to n times
44 ================== ================= =================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000045
Raymond Hettinger0aee9422009-02-17 11:00:27 +000046**Iterators terminating on the shortest input sequence:**
47
48 ==================== ============================ =================================================
49 Iterator Arguments Results
50 ==================== ============================ =================================================
51 :func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ...
52 :func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ...
53 :func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails
54 :func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v)
55 :func:`ifilter` pred, seq elements of seq where pred(elem) is True
56 :func:`ifilterfalse` pred, seq elements of seq where pred(elem) is False
57 :func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step]
58 :func:`imap` func, p, q, ... func(p0, q0), fun(p1, q1), ...
59 :func:`starmap` func, seq func(\*seq[0]), fun(\*seq[1]), ...
60 :func:`tee` it, n it1, it2 , ... itn splits one iterator into n
61 :func:`takewhile` pred, seq seq[0], seq[1], until pred fails
62 :func:`izip` p, q, ... (p[0], q[0]), (p[1], q[1]), ...
63 :func:`izip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ...
64 ==================== ============================ =================================================
65
66**Combinatoric generators:**
67
68 ===================================== ==================== =================================================
69 Iterator Arguments Results
70 ===================================== ==================== =================================================
71 :func:`product` p, q, ... [repeat=1] cartesian product
72 :func:`permutations` p[, r] r-length permutations (without repeated elements)
73 :func:`combinations` p[, r] r-length combinations (sorted and no repeats)
74 :func:`combinations_with_replacement` p[, r] r-length combinations (sorted but with repeats)
75 ===================================== ==================== =================================================
Georg Brandl8ec7f652007-08-15 14:28:01 +000076
77
78.. _itertools-functions:
79
80Itertool functions
81------------------
82
83The following module functions all construct and return iterators. Some provide
84streams of infinite length, so they should only be accessed by functions or
85loops that truncate the stream.
86
87
88.. function:: chain(*iterables)
89
90 Make an iterator that returns elements from the first iterable until it is
91 exhausted, then proceeds to the next iterable, until all of the iterables are
92 exhausted. Used for treating consecutive sequences as a single sequence.
93 Equivalent to::
94
95 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000096 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000097 for it in iterables:
98 for element in it:
99 yield element
100
101
Raymond Hettinger330958e2008-02-28 19:41:24 +0000102.. function:: itertools.chain.from_iterable(iterable)
103
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000104 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e2008-02-28 19:41:24 +0000105 single iterable argument that is evaluated lazily. Equivalent to::
106
107 @classmethod
108 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000109 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +0000110 for it in iterables:
111 for element in it:
112 yield element
113
114 .. versionadded:: 2.6
115
Raymond Hettingerd553d852008-03-04 04:17:08 +0000116
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000117.. function:: combinations(iterable, r)
118
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000119 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000120
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000121 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000122 input *iterable* is sorted, the combination tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000123 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000124
125 Elements are treated as unique based on their position, not on their
126 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000127 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000128
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000129 Equivalent to::
130
131 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000132 # combinations('ABCD', 2) --> AB AC AD BC BD CD
133 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000134 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000135 n = len(pool)
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000136 if r > n:
137 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000138 indices = range(r)
139 yield tuple(pool[i] for i in indices)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000140 while 1:
141 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000142 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000143 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000144 else:
145 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000146 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000147 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000148 indices[j] = indices[j-1] + 1
149 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000150
Raymond Hettingerd553d852008-03-04 04:17:08 +0000151 The code for :func:`combinations` can be also expressed as a subsequence
152 of :func:`permutations` after filtering entries where the elements are not
153 in sorted order (according to their position in the input pool)::
154
155 def combinations(iterable, r):
156 pool = tuple(iterable)
157 n = len(pool)
158 for indices in permutations(range(n), r):
159 if sorted(indices) == list(indices):
160 yield tuple(pool[i] for i in indices)
161
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000162 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
163 or zero when ``r > n``.
164
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000165 .. versionadded:: 2.6
166
Raymond Hettingerd081abc2009-01-27 02:58:49 +0000167.. function:: combinations_with_replacement(iterable, r)
168
169 Return *r* length subsequences of elements from the input *iterable*
170 allowing individual elements to be repeated more than once.
171
172 Combinations are emitted in lexicographic sort order. So, if the
173 input *iterable* is sorted, the combination tuples will be produced
174 in sorted order.
175
176 Elements are treated as unique based on their position, not on their
177 value. So if the input elements are unique, the generated combinations
178 will also be unique.
179
180 Equivalent to::
181
182 def combinations_with_replacement(iterable, r):
183 # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
184 pool = tuple(iterable)
185 n = len(pool)
186 if not n and r:
187 return
188 indices = [0] * r
189 yield tuple(pool[i] for i in indices)
190 while 1:
191 for i in reversed(range(r)):
192 if indices[i] != n - 1:
193 break
194 else:
195 return
196 indices[i:] = [indices[i] + 1] * (r - i)
197 yield tuple(pool[i] for i in indices)
198
199 The code for :func:`combinations_with_replacement` can be also expressed as
200 a subsequence of :func:`product` after filtering entries where the elements
201 are not in sorted order (according to their position in the input pool)::
202
203 def combinations_with_replacement(iterable, r):
204 pool = tuple(iterable)
205 n = len(pool)
206 for indices in product(range(n), repeat=r):
207 if sorted(indices) == list(indices):
208 yield tuple(pool[i] for i in indices)
209
210 The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.
211
212 .. versionadded:: 2.7
213
Raymond Hettinger2bcb8e92009-01-25 21:04:14 +0000214.. function:: compress(data, selectors)
215
216 Make an iterator that filters elements from *data* returning only those that
217 have a corresponding element in *selectors* that evaluates to ``True``.
218 Stops when either the *data* or *selectors* iterables have been exhausted.
219 Equivalent to::
220
221 def compress(data, selectors):
222 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
223 return (d for d, s in izip(data, selectors) if s)
224
225 .. versionadded:: 2.7
226
227
Raymond Hettingera4038032009-02-14 00:25:51 +0000228.. function:: count(start=0, step=1)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000229
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000230 Make an iterator that returns evenly spaced values starting with *n*. Often
231 used as an argument to :func:`imap` to generate consecutive data points.
232 Also, used with :func:`izip` to add sequence numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000233
Raymond Hettingera4038032009-02-14 00:25:51 +0000234 def count(start=0, step=1):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000235 # count(10) --> 10 11 12 13 14 ...
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000236 # count(2.5, 0.5) -> 3.5 3.0 4.5 ...
Raymond Hettingera4038032009-02-14 00:25:51 +0000237 n = start
Georg Brandl8ec7f652007-08-15 14:28:01 +0000238 while True:
239 yield n
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000240 n += step
Georg Brandl8ec7f652007-08-15 14:28:01 +0000241
Raymond Hettinger31c769c2009-02-12 05:39:46 +0000242 .. versionchanged:: 2.7
243 added *step* argument and allowed non-integer arguments.
244
Georg Brandl8ec7f652007-08-15 14:28:01 +0000245.. function:: cycle(iterable)
246
247 Make an iterator returning elements from the iterable and saving a copy of each.
248 When the iterable is exhausted, return elements from the saved copy. Repeats
249 indefinitely. Equivalent to::
250
251 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000252 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000253 saved = []
254 for element in iterable:
255 yield element
256 saved.append(element)
257 while saved:
258 for element in saved:
259 yield element
260
261 Note, this member of the toolkit may require significant auxiliary storage
262 (depending on the length of the iterable).
263
264
265.. function:: dropwhile(predicate, iterable)
266
267 Make an iterator that drops elements from the iterable as long as the predicate
268 is true; afterwards, returns every element. Note, the iterator does not produce
269 *any* output until the predicate first becomes false, so it may have a lengthy
270 start-up time. Equivalent to::
271
272 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000273 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000274 iterable = iter(iterable)
275 for x in iterable:
276 if not predicate(x):
277 yield x
278 break
279 for x in iterable:
280 yield x
281
282
283.. function:: groupby(iterable[, key])
284
285 Make an iterator that returns consecutive keys and groups from the *iterable*.
286 The *key* is a function computing a key value for each element. If not
287 specified or is ``None``, *key* defaults to an identity function and returns
288 the element unchanged. Generally, the iterable needs to already be sorted on
289 the same key function.
290
291 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
292 generates a break or new group every time the value of the key function changes
293 (which is why it is usually necessary to have sorted the data using the same key
294 function). That behavior differs from SQL's GROUP BY which aggregates common
295 elements regardless of their input order.
296
297 The returned group is itself an iterator that shares the underlying iterable
298 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
299 object is advanced, the previous group is no longer visible. So, if that data
300 is needed later, it should be stored as a list::
301
302 groups = []
303 uniquekeys = []
304 data = sorted(data, key=keyfunc)
305 for k, g in groupby(data, keyfunc):
306 groups.append(list(g)) # Store group iterator as a list
307 uniquekeys.append(k)
308
309 :func:`groupby` is equivalent to::
310
311 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000312 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettingerd507afd2009-02-04 10:52:32 +0000313 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000314 def __init__(self, iterable, key=None):
315 if key is None:
316 key = lambda x: x
317 self.keyfunc = key
318 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000319 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000320 def __iter__(self):
321 return self
322 def next(self):
323 while self.currkey == self.tgtkey:
324 self.currvalue = self.it.next() # Exit on StopIteration
325 self.currkey = self.keyfunc(self.currvalue)
326 self.tgtkey = self.currkey
327 return (self.currkey, self._grouper(self.tgtkey))
328 def _grouper(self, tgtkey):
329 while self.currkey == tgtkey:
330 yield self.currvalue
331 self.currvalue = self.it.next() # Exit on StopIteration
332 self.currkey = self.keyfunc(self.currvalue)
333
334 .. versionadded:: 2.4
335
336
337.. function:: ifilter(predicate, iterable)
338
339 Make an iterator that filters elements from iterable returning only those for
340 which the predicate is ``True``. If *predicate* is ``None``, return the items
341 that are true. Equivalent to::
342
343 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000344 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000345 if predicate is None:
346 predicate = bool
347 for x in iterable:
348 if predicate(x):
349 yield x
350
351
352.. function:: ifilterfalse(predicate, iterable)
353
354 Make an iterator that filters elements from iterable returning only those for
355 which the predicate is ``False``. If *predicate* is ``None``, return the items
356 that are false. Equivalent to::
357
358 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000359 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000360 if predicate is None:
361 predicate = bool
362 for x in iterable:
363 if not predicate(x):
364 yield x
365
366
367.. function:: imap(function, *iterables)
368
369 Make an iterator that computes the function using arguments from each of the
370 iterables. If *function* is set to ``None``, then :func:`imap` returns the
371 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
372 exhausted instead of filling in ``None`` for shorter iterables. The reason for
373 the difference is that infinite iterator arguments are typically an error for
374 :func:`map` (because the output is fully evaluated) but represent a common and
375 useful way of supplying arguments to :func:`imap`. Equivalent to::
376
377 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000378 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000379 iterables = map(iter, iterables)
380 while True:
Raymond Hettinger2dec48d2008-01-22 22:09:26 +0000381 args = [it.next() for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000382 if function is None:
383 yield tuple(args)
384 else:
385 yield function(*args)
386
387
388.. function:: islice(iterable, [start,] stop [, step])
389
390 Make an iterator that returns selected elements from the iterable. If *start* is
391 non-zero, then elements from the iterable are skipped until start is reached.
392 Afterward, elements are returned consecutively unless *step* is set higher than
393 one which results in items being skipped. If *stop* is ``None``, then iteration
394 continues until the iterator is exhausted, if at all; otherwise, it stops at the
395 specified position. Unlike regular slicing, :func:`islice` does not support
396 negative values for *start*, *stop*, or *step*. Can be used to extract related
397 fields from data where the internal structure has been flattened (for example, a
398 multi-line report may list a name field on every third line). Equivalent to::
399
400 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000401 # islice('ABCDEFG', 2) --> A B
402 # islice('ABCDEFG', 2, 4) --> C D
403 # islice('ABCDEFG', 2, None) --> C D E F G
404 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000405 s = slice(*args)
406 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
407 nexti = it.next()
408 for i, element in enumerate(iterable):
409 if i == nexti:
410 yield element
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000411 nexti = it.next()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000412
413 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
414 then the step defaults to one.
415
416 .. versionchanged:: 2.5
417 accept ``None`` values for default *start* and *step*.
418
419
420.. function:: izip(*iterables)
421
422 Make an iterator that aggregates elements from each of the iterables. Like
423 :func:`zip` except that it returns an iterator instead of a list. Used for
424 lock-step iteration over several iterables at a time. Equivalent to::
425
426 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000427 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000428 iterables = map(iter, iterables)
429 while iterables:
430 result = [it.next() for it in iterables]
431 yield tuple(result)
432
433 .. versionchanged:: 2.4
434 When no iterables are specified, returns a zero length iterator instead of
435 raising a :exc:`TypeError` exception.
436
Raymond Hettinger48c62932008-01-22 19:51:41 +0000437 The left-to-right evaluation order of the iterables is guaranteed. This
438 makes possible an idiom for clustering a data series into n-length groups
439 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000440
Raymond Hettinger48c62932008-01-22 19:51:41 +0000441 :func:`izip` should only be used with unequal length inputs when you don't
442 care about trailing, unmatched values from the longer iterables. If those
443 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000444
445
446.. function:: izip_longest(*iterables[, fillvalue])
447
448 Make an iterator that aggregates elements from each of the iterables. If the
449 iterables are of uneven length, missing values are filled-in with *fillvalue*.
450 Iteration continues until the longest iterable is exhausted. Equivalent to::
451
452 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000453 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000454 fillvalue = kwds.get('fillvalue')
455 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
456 yield counter() # yields the fillvalue, or raises IndexError
457 fillers = repeat(fillvalue)
458 iters = [chain(it, sentinel(), fillers) for it in args]
459 try:
460 for tup in izip(*iters):
461 yield tup
462 except IndexError:
463 pass
464
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000465 If one of the iterables is potentially infinite, then the
466 :func:`izip_longest` function should be wrapped with something that limits
467 the number of calls (for example :func:`islice` or :func:`takewhile`). If
468 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000469
470 .. versionadded:: 2.6
471
Raymond Hettinger330958e2008-02-28 19:41:24 +0000472.. function:: permutations(iterable[, r])
473
474 Return successive *r* length permutations of elements in the *iterable*.
475
476 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000477 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e2008-02-28 19:41:24 +0000478 are generated.
479
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000480 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e2008-02-28 19:41:24 +0000481 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000482 in sorted order.
Raymond Hettinger330958e2008-02-28 19:41:24 +0000483
484 Elements are treated as unique based on their position, not on their
485 value. So if the input elements are unique, there will be no repeat
486 values in each permutation.
487
Raymond Hettingerf287f172008-03-02 10:59:31 +0000488 Equivalent to::
489
490 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000491 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
492 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000493 pool = tuple(iterable)
494 n = len(pool)
495 r = n if r is None else r
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000496 if r > n:
497 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000498 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000499 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000500 yield tuple(pool[i] for i in indices[:r])
501 while n:
502 for i in reversed(range(r)):
503 cycles[i] -= 1
504 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000505 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000506 cycles[i] = n - i
507 else:
508 j = cycles[i]
509 indices[i], indices[-j] = indices[-j], indices[i]
510 yield tuple(pool[i] for i in indices[:r])
511 break
512 else:
513 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000514
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000515 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000516 :func:`product`, filtered to exclude entries with repeated elements (those
517 from the same position in the input pool)::
518
519 def permutations(iterable, r=None):
520 pool = tuple(iterable)
521 n = len(pool)
522 r = n if r is None else r
523 for indices in product(range(n), repeat=r):
524 if len(set(indices)) == r:
525 yield tuple(pool[i] for i in indices)
526
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000527 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
528 or zero when ``r > n``.
529
Raymond Hettinger330958e2008-02-28 19:41:24 +0000530 .. versionadded:: 2.6
531
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000532.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000533
534 Cartesian product of input iterables.
535
536 Equivalent to nested for-loops in a generator expression. For example,
537 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
538
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000539 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000540 on every iteration. This pattern creates a lexicographic ordering so that if
541 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000542 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000543
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000544 To compute the product of an iterable with itself, specify the number of
545 repetitions with the optional *repeat* keyword argument. For example,
546 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
547
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000548 This function is equivalent to the following code, except that the
549 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000550
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000551 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000552 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
553 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000554 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000555 result = [[]]
556 for pool in pools:
557 result = [x+[y] for x in result for y in pool]
558 for prod in result:
559 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000560
561 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000562
563.. function:: repeat(object[, times])
564
565 Make an iterator that returns *object* over and over again. Runs indefinitely
566 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000567 invariant function parameters. Also used with :func:`izip` to create constant
568 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000569
570 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000571 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000572 if times is None:
573 while True:
574 yield object
575 else:
576 for i in xrange(times):
577 yield object
578
579
580.. function:: starmap(function, iterable)
581
Raymond Hettinger47317092008-01-17 03:02:14 +0000582 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000583 the iterable. Used instead of :func:`imap` when argument parameters are already
584 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
585 difference between :func:`imap` and :func:`starmap` parallels the distinction
586 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
587
588 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000589 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000590 for args in iterable:
591 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000592
Raymond Hettinger47317092008-01-17 03:02:14 +0000593 .. versionchanged:: 2.6
594 Previously, :func:`starmap` required the function arguments to be tuples.
595 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000596
597.. function:: takewhile(predicate, iterable)
598
599 Make an iterator that returns elements from the iterable as long as the
600 predicate is true. Equivalent to::
601
602 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000603 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000604 for x in iterable:
605 if predicate(x):
606 yield x
607 else:
608 break
609
610
611.. function:: tee(iterable[, n=2])
612
613 Return *n* independent iterators from a single iterable. The case where ``n==2``
614 is equivalent to::
615
616 def tee(iterable):
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000617 def gen(next, data={}):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000618 for i in count():
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000619 if i in data:
620 yield data.pop(i)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000621 else:
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000622 data[i] = next()
623 yield data[i]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000624 it = iter(iterable)
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000625 return gen(it.next), gen(it.next)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000626
627 Note, once :func:`tee` has made a split, the original *iterable* should not be
628 used anywhere else; otherwise, the *iterable* could get advanced without the tee
629 objects being informed.
630
631 Note, this member of the toolkit may require significant auxiliary storage
632 (depending on how much temporary data needs to be stored). In general, if one
633 iterator is going to use most or all of the data before the other iterator, it
634 is faster to use :func:`list` instead of :func:`tee`.
635
636 .. versionadded:: 2.4
637
638
639.. _itertools-example:
640
641Examples
642--------
643
644The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000645can be combined.
646
647.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000648
Benjamin Peterson8ea99992009-01-01 16:43:12 +0000649 >>> # Show a dictionary sorted and grouped by value
Georg Brandl8ec7f652007-08-15 14:28:01 +0000650 >>> from operator import itemgetter
651 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
652 >>> di = sorted(d.iteritems(), key=itemgetter(1))
653 >>> for k, g in groupby(di, key=itemgetter(1)):
654 ... print k, map(itemgetter(0), g)
655 ...
656 1 ['a', 'c', 'e']
657 2 ['b', 'd', 'f']
658 3 ['g']
659
Benjamin Peterson8ea99992009-01-01 16:43:12 +0000660 >>> # Find runs of consecutive numbers using groupby. The key to the solution
661 >>> # is differencing with a range so that consecutive numbers all appear in
662 >>> # same group.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000663 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
664 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000665 ... print map(itemgetter(1), g)
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000666 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000667 [1]
668 [4, 5, 6]
669 [10]
670 [15, 16, 17, 18]
671 [22]
672 [25, 26, 27, 28]
673
674
675
676.. _itertools-recipes:
677
678Recipes
679-------
680
681This section shows recipes for creating an extended toolset using the existing
682itertools as building blocks.
683
684The extended tools offer the same high performance as the underlying toolset.
685The superior memory performance is kept by processing elements one at a time
686rather than bringing the whole iterable into memory all at once. Code volume is
687kept small by linking the tools together in a functional style which helps
688eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000689"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000690which incur interpreter overhead.
691
692.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000693
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000694 def take(n, iterable):
695 "Return first n items of the iterable as a list"
696 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000697
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000698 def enumerate(iterable, start=0):
699 return izip(count(start), iterable)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000700
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000701 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000702 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000703 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000704
705 def nth(iterable, n):
Raymond Hettingerd507afd2009-02-04 10:52:32 +0000706 "Returns the nth item or None"
707 return next(islice(iterable, n, None), None)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000708
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000709 def quantify(iterable, pred=bool):
710 "Count how many times the predicate is true"
711 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000712
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000713 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000714 """Returns the sequence elements and then returns None indefinitely.
715
716 Useful for emulating the behavior of the built-in map() function.
717 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000718 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000719
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000720 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000721 "Returns the sequence elements n times"
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000722 return chain.from_iterable(repeat(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000723
724 def dotproduct(vec1, vec2):
725 return sum(imap(operator.mul, vec1, vec2))
726
727 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000728 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000729
730 def repeatfunc(func, times=None, *args):
731 """Repeat calls to func with specified arguments.
732
733 Example: repeatfunc(random.random)
734 """
735 if times is None:
736 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000737 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000738
739 def pairwise(iterable):
740 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
741 a, b = tee(iterable)
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000742 for elem in b:
743 break
Georg Brandl8ec7f652007-08-15 14:28:01 +0000744 return izip(a, b)
745
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000746 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000747 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000748 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000749 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000750
Raymond Hettingera44327a2008-01-30 22:17:31 +0000751 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000752 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000753 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000754 pending = len(iterables)
755 nexts = cycle(iter(it).next for it in iterables)
756 while pending:
757 try:
758 for next in nexts:
759 yield next()
760 except StopIteration:
761 pending -= 1
762 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000763
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000764 def powerset(iterable):
Raymond Hettinger68d919e2009-01-25 21:31:47 +0000765 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
766 s = list(iterable)
767 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000768
Benjamin Peterson48291362009-01-31 20:01:48 +0000769 def unique_everseen(iterable, key=None):
770 "List unique elements, preserving order. Remember all elements ever seen."
771 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
772 # unique_everseen('ABBCcAD', str.lower) --> A B C D
773 seen = set()
774 seen_add = seen.add
775 if key is None:
776 for element in iterable:
777 if element not in seen:
778 seen_add(element)
779 yield element
780 else:
781 for element in iterable:
782 k = key(element)
783 if k not in seen:
784 seen_add(k)
785 yield element
Raymond Hettinger44e15812009-01-02 21:26:45 +0000786
Benjamin Peterson48291362009-01-31 20:01:48 +0000787 def unique_justseen(iterable, key=None):
788 "List unique elements, preserving order. Remember only the element just seen."
789 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
790 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
791 return imap(next, imap(itemgetter(1), groupby(iterable, key)))