blob: b7cd4318bf506c6cd13b9828cf91d671fcd4ced4 [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Georg Brandle7a09902007-10-21 12:10:28 +000017This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl8ec7f652007-08-15 14:28:01 +000018constructs from the Haskell and SML programming languages. Each has been recast
19in a form suitable for Python.
20
21The module standardizes a core set of fast, memory efficient tools that are
22useful by themselves or in combination. Standardization helps avoid the
23readability and reliability problems which arise when many different individuals
24create their own slightly varying implementations, each with their own quirks
25and naming conventions.
26
27The tools are designed to combine readily with one another. This makes it easy
28to construct more specialized tools succinctly and efficiently in pure Python.
29
30For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
31sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
32:func:`count` which can be combined to form ``imap(f, count())`` and produce an
33equivalent result.
34
35Likewise, the functional tools are designed to work well with the high-speed
36functions provided by the :mod:`operator` module.
37
Georg Brandl8ec7f652007-08-15 14:28:01 +000038Whether cast in pure python form or compiled code, tools that use iterators are
Raymond Hettingerf1f46f02008-07-19 23:58:47 +000039more memory efficient (and often faster) than their list based counterparts. Adopting
Georg Brandl8ec7f652007-08-15 14:28:01 +000040the principles of just-in-time manufacturing, they create data when and where
41needed instead of consuming memory with the computer equivalent of "inventory".
42
Georg Brandl8ec7f652007-08-15 14:28:01 +000043
44.. seealso::
45
46 The Standard ML Basis Library, `The Standard ML Basis Library
47 <http://www.standardml.org/Basis/>`_.
48
49 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
50 Libraries <http://www.haskell.org/definition/>`_.
51
52
53.. _itertools-functions:
54
55Itertool functions
56------------------
57
58The following module functions all construct and return iterators. Some provide
59streams of infinite length, so they should only be accessed by functions or
60loops that truncate the stream.
61
62
63.. function:: chain(*iterables)
64
65 Make an iterator that returns elements from the first iterable until it is
66 exhausted, then proceeds to the next iterable, until all of the iterables are
67 exhausted. Used for treating consecutive sequences as a single sequence.
68 Equivalent to::
69
70 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000071 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000072 for it in iterables:
73 for element in it:
74 yield element
75
76
Raymond Hettinger330958e2008-02-28 19:41:24 +000077.. function:: itertools.chain.from_iterable(iterable)
78
Georg Brandlc62ef8b2009-01-03 20:55:06 +000079 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e2008-02-28 19:41:24 +000080 single iterable argument that is evaluated lazily. Equivalent to::
81
82 @classmethod
83 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000084 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +000085 for it in iterables:
86 for element in it:
87 yield element
88
89 .. versionadded:: 2.6
90
Raymond Hettingerd553d852008-03-04 04:17:08 +000091
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000092.. function:: combinations(iterable, r)
93
Raymond Hettinger5eaffc42008-04-17 10:48:31 +000094 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000095
Georg Brandlc62ef8b2009-01-03 20:55:06 +000096 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000097 input *iterable* is sorted, the combination tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +000098 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000099
100 Elements are treated as unique based on their position, not on their
101 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000102 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000103
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000104 Equivalent to::
105
106 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000107 # combinations('ABCD', 2) --> AB AC AD BC BD CD
108 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000109 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000110 n = len(pool)
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000111 if r > n:
112 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000113 indices = range(r)
114 yield tuple(pool[i] for i in indices)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000115 while 1:
116 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000117 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000118 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000119 else:
120 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000121 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000122 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000123 indices[j] = indices[j-1] + 1
124 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000125
Raymond Hettingerd553d852008-03-04 04:17:08 +0000126 The code for :func:`combinations` can be also expressed as a subsequence
127 of :func:`permutations` after filtering entries where the elements are not
128 in sorted order (according to their position in the input pool)::
129
130 def combinations(iterable, r):
131 pool = tuple(iterable)
132 n = len(pool)
133 for indices in permutations(range(n), r):
134 if sorted(indices) == list(indices):
135 yield tuple(pool[i] for i in indices)
136
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000137 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
138 or zero when ``r > n``.
139
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000140 .. versionadded:: 2.6
141
Raymond Hettinger2bcb8e92009-01-25 21:04:14 +0000142.. function:: compress(data, selectors)
143
144 Make an iterator that filters elements from *data* returning only those that
145 have a corresponding element in *selectors* that evaluates to ``True``.
146 Stops when either the *data* or *selectors* iterables have been exhausted.
147 Equivalent to::
148
149 def compress(data, selectors):
150 # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
151 return (d for d, s in izip(data, selectors) if s)
152
153 .. versionadded:: 2.7
154
155
Georg Brandl8ec7f652007-08-15 14:28:01 +0000156.. function:: count([n])
157
158 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettinger50e90e22007-10-04 00:20:27 +0000159 specified *n* defaults to zero. Often used as an argument to :func:`imap` to
160 generate consecutive data points. Also, used with :func:`izip` to add sequence
161 numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000162
163 def count(n=0):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000164 # count(10) --> 10 11 12 13 14 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000165 while True:
166 yield n
167 n += 1
168
Georg Brandl8ec7f652007-08-15 14:28:01 +0000169
170.. function:: cycle(iterable)
171
172 Make an iterator returning elements from the iterable and saving a copy of each.
173 When the iterable is exhausted, return elements from the saved copy. Repeats
174 indefinitely. Equivalent to::
175
176 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000177 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000178 saved = []
179 for element in iterable:
180 yield element
181 saved.append(element)
182 while saved:
183 for element in saved:
184 yield element
185
186 Note, this member of the toolkit may require significant auxiliary storage
187 (depending on the length of the iterable).
188
189
190.. function:: dropwhile(predicate, iterable)
191
192 Make an iterator that drops elements from the iterable as long as the predicate
193 is true; afterwards, returns every element. Note, the iterator does not produce
194 *any* output until the predicate first becomes false, so it may have a lengthy
195 start-up time. Equivalent to::
196
197 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000198 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000199 iterable = iter(iterable)
200 for x in iterable:
201 if not predicate(x):
202 yield x
203 break
204 for x in iterable:
205 yield x
206
207
208.. function:: groupby(iterable[, key])
209
210 Make an iterator that returns consecutive keys and groups from the *iterable*.
211 The *key* is a function computing a key value for each element. If not
212 specified or is ``None``, *key* defaults to an identity function and returns
213 the element unchanged. Generally, the iterable needs to already be sorted on
214 the same key function.
215
216 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
217 generates a break or new group every time the value of the key function changes
218 (which is why it is usually necessary to have sorted the data using the same key
219 function). That behavior differs from SQL's GROUP BY which aggregates common
220 elements regardless of their input order.
221
222 The returned group is itself an iterator that shares the underlying iterable
223 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
224 object is advanced, the previous group is no longer visible. So, if that data
225 is needed later, it should be stored as a list::
226
227 groups = []
228 uniquekeys = []
229 data = sorted(data, key=keyfunc)
230 for k, g in groupby(data, keyfunc):
231 groups.append(list(g)) # Store group iterator as a list
232 uniquekeys.append(k)
233
234 :func:`groupby` is equivalent to::
235
236 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000237 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
238 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000239 def __init__(self, iterable, key=None):
240 if key is None:
241 key = lambda x: x
242 self.keyfunc = key
243 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000244 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000245 def __iter__(self):
246 return self
247 def next(self):
248 while self.currkey == self.tgtkey:
249 self.currvalue = self.it.next() # Exit on StopIteration
250 self.currkey = self.keyfunc(self.currvalue)
251 self.tgtkey = self.currkey
252 return (self.currkey, self._grouper(self.tgtkey))
253 def _grouper(self, tgtkey):
254 while self.currkey == tgtkey:
255 yield self.currvalue
256 self.currvalue = self.it.next() # Exit on StopIteration
257 self.currkey = self.keyfunc(self.currvalue)
258
259 .. versionadded:: 2.4
260
261
262.. function:: ifilter(predicate, iterable)
263
264 Make an iterator that filters elements from iterable returning only those for
265 which the predicate is ``True``. If *predicate* is ``None``, return the items
266 that are true. Equivalent to::
267
268 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000269 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000270 if predicate is None:
271 predicate = bool
272 for x in iterable:
273 if predicate(x):
274 yield x
275
276
277.. function:: ifilterfalse(predicate, iterable)
278
279 Make an iterator that filters elements from iterable returning only those for
280 which the predicate is ``False``. If *predicate* is ``None``, return the items
281 that are false. Equivalent to::
282
283 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000284 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000285 if predicate is None:
286 predicate = bool
287 for x in iterable:
288 if not predicate(x):
289 yield x
290
291
292.. function:: imap(function, *iterables)
293
294 Make an iterator that computes the function using arguments from each of the
295 iterables. If *function* is set to ``None``, then :func:`imap` returns the
296 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
297 exhausted instead of filling in ``None`` for shorter iterables. The reason for
298 the difference is that infinite iterator arguments are typically an error for
299 :func:`map` (because the output is fully evaluated) but represent a common and
300 useful way of supplying arguments to :func:`imap`. Equivalent to::
301
302 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000303 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000304 iterables = map(iter, iterables)
305 while True:
Raymond Hettinger2dec48d2008-01-22 22:09:26 +0000306 args = [it.next() for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000307 if function is None:
308 yield tuple(args)
309 else:
310 yield function(*args)
311
312
313.. function:: islice(iterable, [start,] stop [, step])
314
315 Make an iterator that returns selected elements from the iterable. If *start* is
316 non-zero, then elements from the iterable are skipped until start is reached.
317 Afterward, elements are returned consecutively unless *step* is set higher than
318 one which results in items being skipped. If *stop* is ``None``, then iteration
319 continues until the iterator is exhausted, if at all; otherwise, it stops at the
320 specified position. Unlike regular slicing, :func:`islice` does not support
321 negative values for *start*, *stop*, or *step*. Can be used to extract related
322 fields from data where the internal structure has been flattened (for example, a
323 multi-line report may list a name field on every third line). Equivalent to::
324
325 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000326 # islice('ABCDEFG', 2) --> A B
327 # islice('ABCDEFG', 2, 4) --> C D
328 # islice('ABCDEFG', 2, None) --> C D E F G
329 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000330 s = slice(*args)
331 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
332 nexti = it.next()
333 for i, element in enumerate(iterable):
334 if i == nexti:
335 yield element
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000336 nexti = it.next()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000337
338 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
339 then the step defaults to one.
340
341 .. versionchanged:: 2.5
342 accept ``None`` values for default *start* and *step*.
343
344
345.. function:: izip(*iterables)
346
347 Make an iterator that aggregates elements from each of the iterables. Like
348 :func:`zip` except that it returns an iterator instead of a list. Used for
349 lock-step iteration over several iterables at a time. Equivalent to::
350
351 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000352 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000353 iterables = map(iter, iterables)
354 while iterables:
355 result = [it.next() for it in iterables]
356 yield tuple(result)
357
358 .. versionchanged:: 2.4
359 When no iterables are specified, returns a zero length iterator instead of
360 raising a :exc:`TypeError` exception.
361
Raymond Hettinger48c62932008-01-22 19:51:41 +0000362 The left-to-right evaluation order of the iterables is guaranteed. This
363 makes possible an idiom for clustering a data series into n-length groups
364 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000365
Raymond Hettinger48c62932008-01-22 19:51:41 +0000366 :func:`izip` should only be used with unequal length inputs when you don't
367 care about trailing, unmatched values from the longer iterables. If those
368 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000369
370
371.. function:: izip_longest(*iterables[, fillvalue])
372
373 Make an iterator that aggregates elements from each of the iterables. If the
374 iterables are of uneven length, missing values are filled-in with *fillvalue*.
375 Iteration continues until the longest iterable is exhausted. Equivalent to::
376
377 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000378 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000379 fillvalue = kwds.get('fillvalue')
380 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
381 yield counter() # yields the fillvalue, or raises IndexError
382 fillers = repeat(fillvalue)
383 iters = [chain(it, sentinel(), fillers) for it in args]
384 try:
385 for tup in izip(*iters):
386 yield tup
387 except IndexError:
388 pass
389
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000390 If one of the iterables is potentially infinite, then the
391 :func:`izip_longest` function should be wrapped with something that limits
392 the number of calls (for example :func:`islice` or :func:`takewhile`). If
393 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000394
395 .. versionadded:: 2.6
396
Raymond Hettinger330958e2008-02-28 19:41:24 +0000397.. function:: permutations(iterable[, r])
398
399 Return successive *r* length permutations of elements in the *iterable*.
400
401 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000402 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e2008-02-28 19:41:24 +0000403 are generated.
404
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000405 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e2008-02-28 19:41:24 +0000406 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000407 in sorted order.
Raymond Hettinger330958e2008-02-28 19:41:24 +0000408
409 Elements are treated as unique based on their position, not on their
410 value. So if the input elements are unique, there will be no repeat
411 values in each permutation.
412
Raymond Hettingerf287f172008-03-02 10:59:31 +0000413 Equivalent to::
414
415 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000416 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
417 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000418 pool = tuple(iterable)
419 n = len(pool)
420 r = n if r is None else r
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000421 if r > n:
422 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000423 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000424 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000425 yield tuple(pool[i] for i in indices[:r])
426 while n:
427 for i in reversed(range(r)):
428 cycles[i] -= 1
429 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000430 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000431 cycles[i] = n - i
432 else:
433 j = cycles[i]
434 indices[i], indices[-j] = indices[-j], indices[i]
435 yield tuple(pool[i] for i in indices[:r])
436 break
437 else:
438 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000439
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000440 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000441 :func:`product`, filtered to exclude entries with repeated elements (those
442 from the same position in the input pool)::
443
444 def permutations(iterable, r=None):
445 pool = tuple(iterable)
446 n = len(pool)
447 r = n if r is None else r
448 for indices in product(range(n), repeat=r):
449 if len(set(indices)) == r:
450 yield tuple(pool[i] for i in indices)
451
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000452 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
453 or zero when ``r > n``.
454
Raymond Hettinger330958e2008-02-28 19:41:24 +0000455 .. versionadded:: 2.6
456
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000457.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000458
459 Cartesian product of input iterables.
460
461 Equivalent to nested for-loops in a generator expression. For example,
462 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
463
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000464 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000465 on every iteration. This pattern creates a lexicographic ordering so that if
466 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000467 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000468
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000469 To compute the product of an iterable with itself, specify the number of
470 repetitions with the optional *repeat* keyword argument. For example,
471 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
472
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000473 This function is equivalent to the following code, except that the
474 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000475
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000476 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000477 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
478 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000479 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000480 result = [[]]
481 for pool in pools:
482 result = [x+[y] for x in result for y in pool]
483 for prod in result:
484 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000485
486 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000487
488.. function:: repeat(object[, times])
489
490 Make an iterator that returns *object* over and over again. Runs indefinitely
491 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000492 invariant function parameters. Also used with :func:`izip` to create constant
493 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000494
495 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000496 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000497 if times is None:
498 while True:
499 yield object
500 else:
501 for i in xrange(times):
502 yield object
503
504
505.. function:: starmap(function, iterable)
506
Raymond Hettinger47317092008-01-17 03:02:14 +0000507 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000508 the iterable. Used instead of :func:`imap` when argument parameters are already
509 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
510 difference between :func:`imap` and :func:`starmap` parallels the distinction
511 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
512
513 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000514 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000515 for args in iterable:
516 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000517
Raymond Hettinger47317092008-01-17 03:02:14 +0000518 .. versionchanged:: 2.6
519 Previously, :func:`starmap` required the function arguments to be tuples.
520 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000521
522.. function:: takewhile(predicate, iterable)
523
524 Make an iterator that returns elements from the iterable as long as the
525 predicate is true. Equivalent to::
526
527 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000528 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000529 for x in iterable:
530 if predicate(x):
531 yield x
532 else:
533 break
534
535
536.. function:: tee(iterable[, n=2])
537
538 Return *n* independent iterators from a single iterable. The case where ``n==2``
539 is equivalent to::
540
541 def tee(iterable):
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000542 def gen(next, data={}):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000543 for i in count():
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000544 if i in data:
545 yield data.pop(i)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000546 else:
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000547 data[i] = next()
548 yield data[i]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000549 it = iter(iterable)
Raymond Hettinger5d332bb2007-12-29 22:09:34 +0000550 return gen(it.next), gen(it.next)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000551
552 Note, once :func:`tee` has made a split, the original *iterable* should not be
553 used anywhere else; otherwise, the *iterable* could get advanced without the tee
554 objects being informed.
555
556 Note, this member of the toolkit may require significant auxiliary storage
557 (depending on how much temporary data needs to be stored). In general, if one
558 iterator is going to use most or all of the data before the other iterator, it
559 is faster to use :func:`list` instead of :func:`tee`.
560
561 .. versionadded:: 2.4
562
563
564.. _itertools-example:
565
566Examples
567--------
568
569The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000570can be combined.
571
572.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000573
Benjamin Peterson8ea99992009-01-01 16:43:12 +0000574 >>> # Show a dictionary sorted and grouped by value
Georg Brandl8ec7f652007-08-15 14:28:01 +0000575 >>> from operator import itemgetter
576 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
577 >>> di = sorted(d.iteritems(), key=itemgetter(1))
578 >>> for k, g in groupby(di, key=itemgetter(1)):
579 ... print k, map(itemgetter(0), g)
580 ...
581 1 ['a', 'c', 'e']
582 2 ['b', 'd', 'f']
583 3 ['g']
584
Benjamin Peterson8ea99992009-01-01 16:43:12 +0000585 >>> # Find runs of consecutive numbers using groupby. The key to the solution
586 >>> # is differencing with a range so that consecutive numbers all appear in
587 >>> # same group.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000588 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
589 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000590 ... print map(itemgetter(1), g)
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000591 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000592 [1]
593 [4, 5, 6]
594 [10]
595 [15, 16, 17, 18]
596 [22]
597 [25, 26, 27, 28]
598
599
600
601.. _itertools-recipes:
602
603Recipes
604-------
605
606This section shows recipes for creating an extended toolset using the existing
607itertools as building blocks.
608
609The extended tools offer the same high performance as the underlying toolset.
610The superior memory performance is kept by processing elements one at a time
611rather than bringing the whole iterable into memory all at once. Code volume is
612kept small by linking the tools together in a functional style which helps
613eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000614"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000615which incur interpreter overhead.
616
617.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000618
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000619 def take(n, iterable):
620 "Return first n items of the iterable as a list"
621 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000622
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000623 def enumerate(iterable, start=0):
624 return izip(count(start), iterable)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000625
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000626 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000627 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000628 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000629
630 def nth(iterable, n):
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000631 "Returns the nth item or empty list"
632 return list(islice(iterable, n, n+1))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000633
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000634 def quantify(iterable, pred=bool):
635 "Count how many times the predicate is true"
636 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000637
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000638 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000639 """Returns the sequence elements and then returns None indefinitely.
640
641 Useful for emulating the behavior of the built-in map() function.
642 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000643 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000644
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000645 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000646 "Returns the sequence elements n times"
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000647 return chain.from_iterable(repeat(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000648
649 def dotproduct(vec1, vec2):
650 return sum(imap(operator.mul, vec1, vec2))
651
652 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000653 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000654
655 def repeatfunc(func, times=None, *args):
656 """Repeat calls to func with specified arguments.
657
658 Example: repeatfunc(random.random)
659 """
660 if times is None:
661 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000662 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000663
664 def pairwise(iterable):
665 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
666 a, b = tee(iterable)
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000667 for elem in b:
668 break
Georg Brandl8ec7f652007-08-15 14:28:01 +0000669 return izip(a, b)
670
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000671 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000672 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000673 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000674 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000675
Raymond Hettingera44327a2008-01-30 22:17:31 +0000676 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000677 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000678 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000679 pending = len(iterables)
680 nexts = cycle(iter(it).next for it in iterables)
681 while pending:
682 try:
683 for next in nexts:
684 yield next()
685 except StopIteration:
686 pending -= 1
687 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000688
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000689 def powerset(iterable):
Raymond Hettinger68d919e2009-01-25 21:31:47 +0000690 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
691 s = list(iterable)
692 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000693
Georg Brandl8c81fda2008-07-23 16:00:44 +0000694 def combinations_with_replacement(iterable, r):
Raymond Hettinger5b913e32009-01-08 06:39:04 +0000695 "combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC"
696 # number items returned: (n+r-1)! / r! / (n-1)!
Georg Brandl8c81fda2008-07-23 16:00:44 +0000697 pool = tuple(iterable)
698 n = len(pool)
699 indices = [0] * r
700 yield tuple(pool[i] for i in indices)
701 while 1:
702 for i in reversed(range(r)):
703 if indices[i] != n - 1:
704 break
705 else:
706 return
707 indices[i:] = [indices[i] + 1] * (r - i)
708 yield tuple(pool[i] for i in indices)
Raymond Hettinger44e15812009-01-02 21:26:45 +0000709
710 def unique_everseen(iterable, key=None):
711 "List unique elements, preserving order. Remember all elements ever seen."
712 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
Georg Brandlc62ef8b2009-01-03 20:55:06 +0000713 # unique_everseen('ABBCcAD', str.lower) --> A B C D
Raymond Hettinger44e15812009-01-02 21:26:45 +0000714 seen = set()
715 seen_add = seen.add
716 if key is None:
717 for element in iterable:
718 if element not in seen:
719 seen_add(element)
720 yield element
721 else:
722 for element in iterable:
723 k = key(element)
724 if k not in seen:
725 seen_add(k)
726 yield element
727
728 def unique_justseen(iterable, key=None):
729 "List unique elements, preserving order. Remember only the element just seen."
730 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
731 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
732 return imap(next, imap(itemgetter(1), groupby(iterable, key)))