blob: fbf672c5a92918dadb9ecde56a6394f4dd1fda3f [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandle8f1b002008-03-22 22:04:10 +000011.. testsetup::
12
13 from itertools import *
14
Georg Brandl8ec7f652007-08-15 14:28:01 +000015.. versionadded:: 2.3
16
Georg Brandle7a09902007-10-21 12:10:28 +000017This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl8ec7f652007-08-15 14:28:01 +000018constructs from the Haskell and SML programming languages. Each has been recast
19in a form suitable for Python.
20
21The module standardizes a core set of fast, memory efficient tools that are
22useful by themselves or in combination. Standardization helps avoid the
23readability and reliability problems which arise when many different individuals
24create their own slightly varying implementations, each with their own quirks
25and naming conventions.
26
27The tools are designed to combine readily with one another. This makes it easy
28to construct more specialized tools succinctly and efficiently in pure Python.
29
30For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
31sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
32:func:`count` which can be combined to form ``imap(f, count())`` and produce an
33equivalent result.
34
35Likewise, the functional tools are designed to work well with the high-speed
36functions provided by the :mod:`operator` module.
37
Georg Brandl8ec7f652007-08-15 14:28:01 +000038Whether cast in pure python form or compiled code, tools that use iterators are
Raymond Hettingerf1f46f02008-07-19 23:58:47 +000039more memory efficient (and often faster) than their list based counterparts. Adopting
Georg Brandl8ec7f652007-08-15 14:28:01 +000040the principles of just-in-time manufacturing, they create data when and where
41needed instead of consuming memory with the computer equivalent of "inventory".
42
Georg Brandl8ec7f652007-08-15 14:28:01 +000043
44.. seealso::
45
46 The Standard ML Basis Library, `The Standard ML Basis Library
47 <http://www.standardml.org/Basis/>`_.
48
49 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
50 Libraries <http://www.haskell.org/definition/>`_.
51
52
53.. _itertools-functions:
54
55Itertool functions
56------------------
57
58The following module functions all construct and return iterators. Some provide
59streams of infinite length, so they should only be accessed by functions or
60loops that truncate the stream.
61
62
63.. function:: chain(*iterables)
64
65 Make an iterator that returns elements from the first iterable until it is
66 exhausted, then proceeds to the next iterable, until all of the iterables are
67 exhausted. Used for treating consecutive sequences as a single sequence.
68 Equivalent to::
69
70 def chain(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000071 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl8ec7f652007-08-15 14:28:01 +000072 for it in iterables:
73 for element in it:
74 yield element
75
76
Raymond Hettinger330958e2008-02-28 19:41:24 +000077.. function:: itertools.chain.from_iterable(iterable)
78
Georg Brandl734373c2009-01-03 21:55:17 +000079 Alternate constructor for :func:`chain`. Gets chained inputs from a
Raymond Hettinger330958e2008-02-28 19:41:24 +000080 single iterable argument that is evaluated lazily. Equivalent to::
81
82 @classmethod
83 def from_iterable(iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +000084 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Raymond Hettinger330958e2008-02-28 19:41:24 +000085 for it in iterables:
86 for element in it:
87 yield element
88
89 .. versionadded:: 2.6
90
Raymond Hettingerd553d852008-03-04 04:17:08 +000091
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000092.. function:: combinations(iterable, r)
93
Raymond Hettinger5eaffc42008-04-17 10:48:31 +000094 Return *r* length subsequences of elements from the input *iterable*.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000095
Georg Brandl734373c2009-01-03 21:55:17 +000096 Combinations are emitted in lexicographic sort order. So, if the
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000097 input *iterable* is sorted, the combination tuples will be produced
Georg Brandl734373c2009-01-03 21:55:17 +000098 in sorted order.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +000099
100 Elements are treated as unique based on their position, not on their
101 value. So if the input elements are unique, there will be no repeat
Raymond Hettinger330958e2008-02-28 19:41:24 +0000102 values in each combination.
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000103
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000104 Equivalent to::
105
106 def combinations(iterable, r):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000107 # combinations('ABCD', 2) --> AB AC AD BC BD CD
108 # combinations(range(4), 3) --> 012 013 023 123
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000109 pool = tuple(iterable)
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000110 n = len(pool)
Raymond Hettinger825758c2009-01-08 05:20:19 +0000111 if r > n:
112 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000113 indices = range(r)
114 yield tuple(pool[i] for i in indices)
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000115 while True:
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000116 for i in reversed(range(r)):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000117 if indices[i] != i + n - r:
Raymond Hettingerc1052892008-02-27 01:44:34 +0000118 break
Raymond Hettinger93e804d2008-02-26 23:40:50 +0000119 else:
120 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000121 indices[i] += 1
Raymond Hettingerc1052892008-02-27 01:44:34 +0000122 for j in range(i+1, r):
Raymond Hettingerf287f172008-03-02 10:59:31 +0000123 indices[j] = indices[j-1] + 1
124 yield tuple(pool[i] for i in indices)
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000125
Raymond Hettingerd553d852008-03-04 04:17:08 +0000126 The code for :func:`combinations` can be also expressed as a subsequence
127 of :func:`permutations` after filtering entries where the elements are not
128 in sorted order (according to their position in the input pool)::
129
130 def combinations(iterable, r):
131 pool = tuple(iterable)
132 n = len(pool)
133 for indices in permutations(range(n), r):
134 if sorted(indices) == list(indices):
135 yield tuple(pool[i] for i in indices)
136
Raymond Hettinger825758c2009-01-08 05:20:19 +0000137 The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
138 or zero when ``r > n``.
139
Raymond Hettinger3fa41d52008-02-26 02:46:54 +0000140 .. versionadded:: 2.6
141
Georg Brandl8ec7f652007-08-15 14:28:01 +0000142.. function:: count([n])
143
144 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettinger50e90e22007-10-04 00:20:27 +0000145 specified *n* defaults to zero. Often used as an argument to :func:`imap` to
146 generate consecutive data points. Also, used with :func:`izip` to add sequence
147 numbers. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000148
149 def count(n=0):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000150 # count(10) --> 10 11 12 13 14 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000151 while True:
152 yield n
153 n += 1
154
Georg Brandl8ec7f652007-08-15 14:28:01 +0000155
156.. function:: cycle(iterable)
157
158 Make an iterator returning elements from the iterable and saving a copy of each.
159 When the iterable is exhausted, return elements from the saved copy. Repeats
160 indefinitely. Equivalent to::
161
162 def cycle(iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000163 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000164 saved = []
165 for element in iterable:
166 yield element
167 saved.append(element)
168 while saved:
169 for element in saved:
170 yield element
171
172 Note, this member of the toolkit may require significant auxiliary storage
173 (depending on the length of the iterable).
174
175
176.. function:: dropwhile(predicate, iterable)
177
178 Make an iterator that drops elements from the iterable as long as the predicate
179 is true; afterwards, returns every element. Note, the iterator does not produce
180 *any* output until the predicate first becomes false, so it may have a lengthy
181 start-up time. Equivalent to::
182
183 def dropwhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000184 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl8ec7f652007-08-15 14:28:01 +0000185 iterable = iter(iterable)
186 for x in iterable:
187 if not predicate(x):
188 yield x
189 break
190 for x in iterable:
191 yield x
192
193
194.. function:: groupby(iterable[, key])
195
196 Make an iterator that returns consecutive keys and groups from the *iterable*.
197 The *key* is a function computing a key value for each element. If not
198 specified or is ``None``, *key* defaults to an identity function and returns
199 the element unchanged. Generally, the iterable needs to already be sorted on
200 the same key function.
201
202 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
203 generates a break or new group every time the value of the key function changes
204 (which is why it is usually necessary to have sorted the data using the same key
205 function). That behavior differs from SQL's GROUP BY which aggregates common
206 elements regardless of their input order.
207
208 The returned group is itself an iterator that shares the underlying iterable
209 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
210 object is advanced, the previous group is no longer visible. So, if that data
211 is needed later, it should be stored as a list::
212
213 groups = []
214 uniquekeys = []
215 data = sorted(data, key=keyfunc)
216 for k, g in groupby(data, keyfunc):
217 groups.append(list(g)) # Store group iterator as a list
218 uniquekeys.append(k)
219
220 :func:`groupby` is equivalent to::
221
222 class groupby(object):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000223 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
Raymond Hettinger11485b42009-02-04 19:34:31 +0000224 # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl8ec7f652007-08-15 14:28:01 +0000225 def __init__(self, iterable, key=None):
226 if key is None:
227 key = lambda x: x
228 self.keyfunc = key
229 self.it = iter(iterable)
Raymond Hettinger81a885a2007-12-29 22:16:24 +0000230 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000231 def __iter__(self):
232 return self
233 def next(self):
234 while self.currkey == self.tgtkey:
235 self.currvalue = self.it.next() # Exit on StopIteration
236 self.currkey = self.keyfunc(self.currvalue)
237 self.tgtkey = self.currkey
238 return (self.currkey, self._grouper(self.tgtkey))
239 def _grouper(self, tgtkey):
240 while self.currkey == tgtkey:
241 yield self.currvalue
242 self.currvalue = self.it.next() # Exit on StopIteration
243 self.currkey = self.keyfunc(self.currvalue)
244
245 .. versionadded:: 2.4
246
247
248.. function:: ifilter(predicate, iterable)
249
250 Make an iterator that filters elements from iterable returning only those for
251 which the predicate is ``True``. If *predicate* is ``None``, return the items
252 that are true. Equivalent to::
253
254 def ifilter(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000255 # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
Georg Brandl8ec7f652007-08-15 14:28:01 +0000256 if predicate is None:
257 predicate = bool
258 for x in iterable:
259 if predicate(x):
260 yield x
261
262
263.. function:: ifilterfalse(predicate, iterable)
264
265 Make an iterator that filters elements from iterable returning only those for
266 which the predicate is ``False``. If *predicate* is ``None``, return the items
267 that are false. Equivalent to::
268
269 def ifilterfalse(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000270 # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl8ec7f652007-08-15 14:28:01 +0000271 if predicate is None:
272 predicate = bool
273 for x in iterable:
274 if not predicate(x):
275 yield x
276
277
278.. function:: imap(function, *iterables)
279
280 Make an iterator that computes the function using arguments from each of the
281 iterables. If *function* is set to ``None``, then :func:`imap` returns the
282 arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
283 exhausted instead of filling in ``None`` for shorter iterables. The reason for
284 the difference is that infinite iterator arguments are typically an error for
285 :func:`map` (because the output is fully evaluated) but represent a common and
286 useful way of supplying arguments to :func:`imap`. Equivalent to::
287
288 def imap(function, *iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000289 # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
Georg Brandl8ec7f652007-08-15 14:28:01 +0000290 iterables = map(iter, iterables)
291 while True:
Raymond Hettinger2dec48d2008-01-22 22:09:26 +0000292 args = [it.next() for it in iterables]
Georg Brandl8ec7f652007-08-15 14:28:01 +0000293 if function is None:
294 yield tuple(args)
295 else:
296 yield function(*args)
297
298
299.. function:: islice(iterable, [start,] stop [, step])
300
301 Make an iterator that returns selected elements from the iterable. If *start* is
302 non-zero, then elements from the iterable are skipped until start is reached.
303 Afterward, elements are returned consecutively unless *step* is set higher than
304 one which results in items being skipped. If *stop* is ``None``, then iteration
305 continues until the iterator is exhausted, if at all; otherwise, it stops at the
306 specified position. Unlike regular slicing, :func:`islice` does not support
307 negative values for *start*, *stop*, or *step*. Can be used to extract related
308 fields from data where the internal structure has been flattened (for example, a
309 multi-line report may list a name field on every third line). Equivalent to::
310
311 def islice(iterable, *args):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000312 # islice('ABCDEFG', 2) --> A B
313 # islice('ABCDEFG', 2, 4) --> C D
314 # islice('ABCDEFG', 2, None) --> C D E F G
315 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl8ec7f652007-08-15 14:28:01 +0000316 s = slice(*args)
317 it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
318 nexti = it.next()
319 for i, element in enumerate(iterable):
320 if i == nexti:
321 yield element
Georg Brandl734373c2009-01-03 21:55:17 +0000322 nexti = it.next()
Georg Brandl8ec7f652007-08-15 14:28:01 +0000323
324 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
325 then the step defaults to one.
326
327 .. versionchanged:: 2.5
328 accept ``None`` values for default *start* and *step*.
329
330
331.. function:: izip(*iterables)
332
333 Make an iterator that aggregates elements from each of the iterables. Like
334 :func:`zip` except that it returns an iterator instead of a list. Used for
335 lock-step iteration over several iterables at a time. Equivalent to::
336
337 def izip(*iterables):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000338 # izip('ABCD', 'xy') --> Ax By
Georg Brandl8ec7f652007-08-15 14:28:01 +0000339 iterables = map(iter, iterables)
340 while iterables:
Raymond Hettinger5894c2b2009-02-19 05:38:53 +0000341 yield yield tuple(map(next, iterables))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000342
343 .. versionchanged:: 2.4
344 When no iterables are specified, returns a zero length iterator instead of
345 raising a :exc:`TypeError` exception.
346
Raymond Hettinger48c62932008-01-22 19:51:41 +0000347 The left-to-right evaluation order of the iterables is guaranteed. This
348 makes possible an idiom for clustering a data series into n-length groups
349 using ``izip(*[iter(s)]*n)``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000350
Raymond Hettinger48c62932008-01-22 19:51:41 +0000351 :func:`izip` should only be used with unequal length inputs when you don't
352 care about trailing, unmatched values from the longer iterables. If those
353 values are important, use :func:`izip_longest` instead.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000354
355
356.. function:: izip_longest(*iterables[, fillvalue])
357
358 Make an iterator that aggregates elements from each of the iterables. If the
359 iterables are of uneven length, missing values are filled-in with *fillvalue*.
360 Iteration continues until the longest iterable is exhausted. Equivalent to::
361
362 def izip_longest(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000363 # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl8ec7f652007-08-15 14:28:01 +0000364 fillvalue = kwds.get('fillvalue')
365 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
366 yield counter() # yields the fillvalue, or raises IndexError
367 fillers = repeat(fillvalue)
368 iters = [chain(it, sentinel(), fillers) for it in args]
369 try:
370 for tup in izip(*iters):
371 yield tup
372 except IndexError:
373 pass
374
Benjamin Peterson5255cba2008-07-25 17:02:11 +0000375 If one of the iterables is potentially infinite, then the
376 :func:`izip_longest` function should be wrapped with something that limits
377 the number of calls (for example :func:`islice` or :func:`takewhile`). If
378 not specified, *fillvalue* defaults to ``None``.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000379
380 .. versionadded:: 2.6
381
Raymond Hettinger330958e2008-02-28 19:41:24 +0000382.. function:: permutations(iterable[, r])
383
384 Return successive *r* length permutations of elements in the *iterable*.
385
386 If *r* is not specified or is ``None``, then *r* defaults to the length
Georg Brandl734373c2009-01-03 21:55:17 +0000387 of the *iterable* and all possible full-length permutations
Raymond Hettinger330958e2008-02-28 19:41:24 +0000388 are generated.
389
Georg Brandl734373c2009-01-03 21:55:17 +0000390 Permutations are emitted in lexicographic sort order. So, if the
Raymond Hettinger330958e2008-02-28 19:41:24 +0000391 input *iterable* is sorted, the permutation tuples will be produced
Georg Brandl734373c2009-01-03 21:55:17 +0000392 in sorted order.
Raymond Hettinger330958e2008-02-28 19:41:24 +0000393
394 Elements are treated as unique based on their position, not on their
395 value. So if the input elements are unique, there will be no repeat
396 values in each permutation.
397
Raymond Hettingerf287f172008-03-02 10:59:31 +0000398 Equivalent to::
399
400 def permutations(iterable, r=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000401 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
402 # permutations(range(3)) --> 012 021 102 120 201 210
Raymond Hettingerf287f172008-03-02 10:59:31 +0000403 pool = tuple(iterable)
404 n = len(pool)
405 r = n if r is None else r
Raymond Hettinger825758c2009-01-08 05:20:19 +0000406 if r > n:
407 return
Raymond Hettingerf287f172008-03-02 10:59:31 +0000408 indices = range(n)
Raymond Hettingere70bb8d2008-03-23 00:55:46 +0000409 cycles = range(n, n-r, -1)
Raymond Hettingerf287f172008-03-02 10:59:31 +0000410 yield tuple(pool[i] for i in indices[:r])
411 while n:
412 for i in reversed(range(r)):
413 cycles[i] -= 1
414 if cycles[i] == 0:
Raymond Hettinger2b7a5c42008-03-02 11:17:51 +0000415 indices[i:] = indices[i+1:] + indices[i:i+1]
Raymond Hettingerf287f172008-03-02 10:59:31 +0000416 cycles[i] = n - i
417 else:
418 j = cycles[i]
419 indices[i], indices[-j] = indices[-j], indices[i]
420 yield tuple(pool[i] for i in indices[:r])
421 break
422 else:
423 return
Raymond Hettinger330958e2008-02-28 19:41:24 +0000424
Georg Brandl734373c2009-01-03 21:55:17 +0000425 The code for :func:`permutations` can be also expressed as a subsequence of
Raymond Hettingerd553d852008-03-04 04:17:08 +0000426 :func:`product`, filtered to exclude entries with repeated elements (those
427 from the same position in the input pool)::
428
429 def permutations(iterable, r=None):
430 pool = tuple(iterable)
431 n = len(pool)
432 r = n if r is None else r
433 for indices in product(range(n), repeat=r):
434 if len(set(indices)) == r:
435 yield tuple(pool[i] for i in indices)
436
Raymond Hettinger825758c2009-01-08 05:20:19 +0000437 The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
438 or zero when ``r > n``.
439
Raymond Hettinger330958e2008-02-28 19:41:24 +0000440 .. versionadded:: 2.6
441
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000442.. function:: product(*iterables[, repeat])
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000443
444 Cartesian product of input iterables.
445
446 Equivalent to nested for-loops in a generator expression. For example,
447 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
448
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000449 The nested loops cycle like an odometer with the rightmost element advancing
Andrew M. Kuchlinge2e03132008-04-17 20:44:06 +0000450 on every iteration. This pattern creates a lexicographic ordering so that if
451 the input's iterables are sorted, the product tuples are emitted in sorted
Raymond Hettinger5eaffc42008-04-17 10:48:31 +0000452 order.
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000453
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000454 To compute the product of an iterable with itself, specify the number of
455 repetitions with the optional *repeat* keyword argument. For example,
456 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
457
Andrew M. Kuchling684868a2008-03-04 01:47:38 +0000458 This function is equivalent to the following code, except that the
459 actual implementation does not build up intermediate results in memory::
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000460
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000461 def product(*args, **kwds):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000462 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
463 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettinger18750ab2008-02-28 09:23:48 +0000464 pools = map(tuple, args) * kwds.get('repeat', 1)
Raymond Hettingerd553d852008-03-04 04:17:08 +0000465 result = [[]]
466 for pool in pools:
467 result = [x+[y] for x in result for y in pool]
468 for prod in result:
469 yield tuple(prod)
Raymond Hettingerc5705a82008-02-22 19:50:06 +0000470
471 .. versionadded:: 2.6
Georg Brandl8ec7f652007-08-15 14:28:01 +0000472
473.. function:: repeat(object[, times])
474
475 Make an iterator that returns *object* over and over again. Runs indefinitely
476 unless the *times* argument is specified. Used as argument to :func:`imap` for
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000477 invariant function parameters. Also used with :func:`izip` to create constant
478 fields in a tuple record. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000479
480 def repeat(object, times=None):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000481 # repeat(10, 3) --> 10 10 10
Georg Brandl8ec7f652007-08-15 14:28:01 +0000482 if times is None:
483 while True:
484 yield object
485 else:
486 for i in xrange(times):
487 yield object
488
489
490.. function:: starmap(function, iterable)
491
Raymond Hettinger47317092008-01-17 03:02:14 +0000492 Make an iterator that computes the function using arguments obtained from
Georg Brandl8ec7f652007-08-15 14:28:01 +0000493 the iterable. Used instead of :func:`imap` when argument parameters are already
494 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
495 difference between :func:`imap` and :func:`starmap` parallels the distinction
496 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
497
498 def starmap(function, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000499 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Raymond Hettinger47317092008-01-17 03:02:14 +0000500 for args in iterable:
501 yield function(*args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000502
Raymond Hettinger47317092008-01-17 03:02:14 +0000503 .. versionchanged:: 2.6
504 Previously, :func:`starmap` required the function arguments to be tuples.
505 Now, any iterable is allowed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000506
507.. function:: takewhile(predicate, iterable)
508
509 Make an iterator that returns elements from the iterable as long as the
510 predicate is true. Equivalent to::
511
512 def takewhile(predicate, iterable):
Raymond Hettinger040f10e2008-03-06 01:15:52 +0000513 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl8ec7f652007-08-15 14:28:01 +0000514 for x in iterable:
515 if predicate(x):
516 yield x
517 else:
518 break
519
520
521.. function:: tee(iterable[, n=2])
522
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000523 Return *n* independent iterators from a single iterable. Equivalent to::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000524
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000525 def tee(iterable, n=2):
526 it = iter(iterable)
527 deques = [collections.deque() for i in range(n)]
528 def gen(mydeque):
529 while True:
530 if not mydeque: # when the local deque is empty
531 newval = next(it) # fetch a new value and
532 for d in deques: # load it to all the deques
533 d.append(newval)
534 yield mydeque.popleft()
535 return tuple(gen(d) for d in deques)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000536
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000537 Once :func:`tee` has made a split, the original *iterable* should not be
538 used anywhere else; otherwise, the *iterable* could get advanced without
539 the tee objects being informed.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000540
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000541 This itertool may require significant auxiliary storage (depending on how
542 much temporary data needs to be stored). In general, if one iterator uses
543 most or all of the data before another iterator starts, it is faster to use
544 :func:`list` instead of :func:`tee`.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000545
546 .. versionadded:: 2.4
547
548
549.. _itertools-example:
550
551Examples
552--------
553
554The following examples show common uses for each tool and demonstrate ways they
Georg Brandle8f1b002008-03-22 22:04:10 +0000555can be combined.
556
557.. doctest::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000558
Georg Brandl734373c2009-01-03 21:55:17 +0000559 >>> # Show a dictionary sorted and grouped by value
Georg Brandl8ec7f652007-08-15 14:28:01 +0000560 >>> from operator import itemgetter
561 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
562 >>> di = sorted(d.iteritems(), key=itemgetter(1))
563 >>> for k, g in groupby(di, key=itemgetter(1)):
564 ... print k, map(itemgetter(0), g)
565 ...
566 1 ['a', 'c', 'e']
567 2 ['b', 'd', 'f']
568 3 ['g']
569
Georg Brandl734373c2009-01-03 21:55:17 +0000570 >>> # Find runs of consecutive numbers using groupby. The key to the solution
571 >>> # is differencing with a range so that consecutive numbers all appear in
572 >>> # same group.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000573 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
574 >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
Georg Brandle8f1b002008-03-22 22:04:10 +0000575 ... print map(itemgetter(1), g)
Georg Brandl734373c2009-01-03 21:55:17 +0000576 ...
Georg Brandl8ec7f652007-08-15 14:28:01 +0000577 [1]
578 [4, 5, 6]
579 [10]
580 [15, 16, 17, 18]
581 [22]
582 [25, 26, 27, 28]
583
584
585
586.. _itertools-recipes:
587
588Recipes
589-------
590
591This section shows recipes for creating an extended toolset using the existing
592itertools as building blocks.
593
594The extended tools offer the same high performance as the underlying toolset.
595The superior memory performance is kept by processing elements one at a time
596rather than bringing the whole iterable into memory all at once. Code volume is
597kept small by linking the tools together in a functional style which helps
598eliminate temporary variables. High speed is retained by preferring
Georg Brandlcf3fb252007-10-21 10:52:38 +0000599"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Georg Brandle8f1b002008-03-22 22:04:10 +0000600which incur interpreter overhead.
601
602.. testcode::
Georg Brandl8ec7f652007-08-15 14:28:01 +0000603
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000604 def take(n, iterable):
605 "Return first n items of the iterable as a list"
606 return list(islice(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000607
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000608 def enumerate(iterable, start=0):
609 return izip(count(start), iterable)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000610
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000611 def tabulate(function, start=0):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000612 "Return function(0), function(1), ..."
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000613 return imap(function, count(start))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000614
Raymond Hettinger5894c2b2009-02-19 05:38:53 +0000615 def nth(iterable, n, default=None):
616 "Returns the nth item or a default value"
617 return next(islice(iterable, n, None), default)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000618
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000619 def quantify(iterable, pred=bool):
620 "Count how many times the predicate is true"
621 return sum(imap(pred, iterable))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000622
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000623 def padnone(iterable):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000624 """Returns the sequence elements and then returns None indefinitely.
625
626 Useful for emulating the behavior of the built-in map() function.
627 """
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000628 return chain(iterable, repeat(None))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000629
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000630 def ncycles(iterable, n):
Georg Brandl8ec7f652007-08-15 14:28:01 +0000631 "Returns the sequence elements n times"
Raymond Hettingerf1f46f02008-07-19 23:58:47 +0000632 return chain.from_iterable(repeat(iterable, n))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000633
634 def dotproduct(vec1, vec2):
635 return sum(imap(operator.mul, vec1, vec2))
636
637 def flatten(listOfLists):
Raymond Hettinger330958e2008-02-28 19:41:24 +0000638 return list(chain.from_iterable(listOfLists))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000639
640 def repeatfunc(func, times=None, *args):
641 """Repeat calls to func with specified arguments.
642
643 Example: repeatfunc(random.random)
644 """
645 if times is None:
646 return starmap(func, repeat(args))
Raymond Hettinger330958e2008-02-28 19:41:24 +0000647 return starmap(func, repeat(args, times))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000648
649 def pairwise(iterable):
650 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
651 a, b = tee(iterable)
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000652 for elem in b:
653 break
Georg Brandl8ec7f652007-08-15 14:28:01 +0000654 return izip(a, b)
655
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000656 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000657 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettinger38fb9be2008-03-07 01:33:20 +0000658 args = [iter(iterable)] * n
Raymond Hettingerf080e6d2008-07-31 01:19:50 +0000659 return izip_longest(fillvalue=fillvalue, *args)
Georg Brandl8ec7f652007-08-15 14:28:01 +0000660
Raymond Hettingera44327a2008-01-30 22:17:31 +0000661 def roundrobin(*iterables):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000662 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Raymond Hettinger330958e2008-02-28 19:41:24 +0000663 # Recipe credited to George Sakkis
Raymond Hettingera44327a2008-01-30 22:17:31 +0000664 pending = len(iterables)
665 nexts = cycle(iter(it).next for it in iterables)
666 while pending:
667 try:
668 for next in nexts:
669 yield next()
670 except StopIteration:
671 pending -= 1
672 nexts = cycle(islice(nexts, pending))
Georg Brandl8ec7f652007-08-15 14:28:01 +0000673
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000674 def powerset(iterable):
Raymond Hettinger5cd012b2009-01-26 02:17:52 +0000675 "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
676 s = list(iterable)
677 return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
Raymond Hettinger7832d4d2008-02-23 10:04:15 +0000678
Raymond Hettingere8b4b602008-03-11 00:19:07 +0000679 def compress(data, selectors):
Raymond Hettingerefdf7062008-07-30 07:27:30 +0000680 "compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F"
681 return (d for d, s in izip(data, selectors) if s)
Raymond Hettinger33691672008-07-19 00:43:00 +0000682
Georg Brandl8c81fda2008-07-23 16:00:44 +0000683 def combinations_with_replacement(iterable, r):
Raymond Hettinger825758c2009-01-08 05:20:19 +0000684 "combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC"
685 # number items returned: (n+r-1)! / r! / (n-1)!
Georg Brandl8c81fda2008-07-23 16:00:44 +0000686 pool = tuple(iterable)
687 n = len(pool)
Raymond Hettingercdc9f2c2009-01-27 06:38:00 +0000688 if not n and r:
689 return
Georg Brandl8c81fda2008-07-23 16:00:44 +0000690 indices = [0] * r
691 yield tuple(pool[i] for i in indices)
Raymond Hettingerbb94fd82009-02-18 21:04:16 +0000692 while True:
Georg Brandl8c81fda2008-07-23 16:00:44 +0000693 for i in reversed(range(r)):
694 if indices[i] != n - 1:
695 break
696 else:
697 return
698 indices[i:] = [indices[i] + 1] * (r - i)
699 yield tuple(pool[i] for i in indices)
Raymond Hettingerd8461652009-01-02 21:20:38 +0000700
701 def unique_everseen(iterable, key=None):
702 "List unique elements, preserving order. Remember all elements ever seen."
703 # unique_everseen('AAAABBBCCDAABBB') --> A B C D
Georg Brandl734373c2009-01-03 21:55:17 +0000704 # unique_everseen('ABBCcAD', str.lower) --> A B C D
Raymond Hettingerd8461652009-01-02 21:20:38 +0000705 seen = set()
706 seen_add = seen.add
707 if key is None:
708 for element in iterable:
709 if element not in seen:
710 seen_add(element)
711 yield element
712 else:
713 for element in iterable:
714 k = key(element)
715 if k not in seen:
716 seen_add(k)
717 yield element
718
719 def unique_justseen(iterable, key=None):
720 "List unique elements, preserving order. Remember only the element just seen."
721 # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B
722 # unique_justseen('ABBCcAD', str.lower) --> A B C A D
723 return imap(next, imap(itemgetter(1), groupby(iterable, key)))