blob: 79a69ea04375e1caaea9efa596d062b71b4b3439 [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Georg Brandl9afde1c2007-11-01 20:32:30 +000016This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl116aa622007-08-15 14:28:22 +000017constructs from the Haskell and SML programming languages. Each has been recast
18in a form suitable for Python.
19
20The module standardizes a core set of fast, memory efficient tools that are
21useful by themselves or in combination. Standardization helps avoid the
22readability and reliability problems which arise when many different individuals
23create their own slightly varying implementations, each with their own quirks
24and naming conventions.
25
26The tools are designed to combine readily with one another. This makes it easy
27to construct more specialized tools succinctly and efficiently in pure Python.
28
29For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000030sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
31by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
33Likewise, the functional tools are designed to work well with the high-speed
34functions provided by the :mod:`operator` module.
35
Georg Brandl116aa622007-08-15 14:28:22 +000036Whether cast in pure python form or compiled code, tools that use iterators are
Georg Brandl3dbca812008-07-23 16:10:53 +000037more memory efficient (and often faster) than their list based counterparts. Adopting
Georg Brandl116aa622007-08-15 14:28:22 +000038the principles of just-in-time manufacturing, they create data when and where
39needed instead of consuming memory with the computer equivalent of "inventory".
40
Georg Brandl116aa622007-08-15 14:28:22 +000041
42.. seealso::
43
44 The Standard ML Basis Library, `The Standard ML Basis Library
45 <http://www.standardml.org/Basis/>`_.
46
47 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
48 Libraries <http://www.haskell.org/definition/>`_.
49
50
51.. _itertools-functions:
52
53Itertool functions
54------------------
55
56The following module functions all construct and return iterators. Some provide
57streams of infinite length, so they should only be accessed by functions or
58loops that truncate the stream.
59
60
61.. function:: chain(*iterables)
62
63 Make an iterator that returns elements from the first iterable until it is
64 exhausted, then proceeds to the next iterable, until all of the iterables are
65 exhausted. Used for treating consecutive sequences as a single sequence.
66 Equivalent to::
67
68 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000069 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000070 for it in iterables:
71 for element in it:
72 yield element
73
74
Christian Heimes70e7ea22008-02-28 20:02:27 +000075.. function:: itertools.chain.from_iterable(iterable)
76
77 Alternate constructor for :func:`chain`. Gets chained inputs from a
78 single iterable argument that is evaluated lazily. Equivalent to::
79
80 @classmethod
81 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000082 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +000083 for it in iterables:
84 for element in it:
85 yield element
86
Christian Heimes78644762008-03-04 23:39:23 +000087
Christian Heimes836baa52008-02-26 08:18:30 +000088.. function:: combinations(iterable, r)
89
Christian Heimesdae2a892008-04-19 00:55:37 +000090 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +000091
Christian Heimes70e7ea22008-02-28 20:02:27 +000092 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +000093 input *iterable* is sorted, the combination tuples will be produced
94 in sorted order.
95
96 Elements are treated as unique based on their position, not on their
97 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +000098 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +000099
Christian Heimes836baa52008-02-26 08:18:30 +0000100 Equivalent to::
101
102 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000103 # combinations('ABCD', 2) --> AB AC AD BC BD CD
104 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000105 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000106 n = len(pool)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000107 indices = range(r)
108 yield tuple(pool[i] for i in indices)
Christian Heimes380f7f22008-02-28 11:19:05 +0000109 while 1:
110 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000111 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000112 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000113 else:
114 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000115 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000116 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000117 indices[j] = indices[j-1] + 1
118 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000119
Christian Heimes78644762008-03-04 23:39:23 +0000120 The code for :func:`combinations` can be also expressed as a subsequence
121 of :func:`permutations` after filtering entries where the elements are not
122 in sorted order (according to their position in the input pool)::
123
124 def combinations(iterable, r):
125 pool = tuple(iterable)
126 n = len(pool)
127 for indices in permutations(range(n), r):
128 if sorted(indices) == list(indices):
129 yield tuple(pool[i] for i in indices)
130
Christian Heimes836baa52008-02-26 08:18:30 +0000131
Georg Brandl116aa622007-08-15 14:28:22 +0000132.. function:: count([n])
133
134 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettingera6c60372008-03-13 01:26:19 +0000135 specified *n* defaults to zero. Often used as an argument to :func:`map` to
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000136 generate consecutive data points. Also, used with :func:`zip` to add sequence
Georg Brandl9afde1c2007-11-01 20:32:30 +0000137 numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000138
139 def count(n=0):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000140 # count(10) --> 10 11 12 13 14 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000141 while True:
142 yield n
143 n += 1
144
Georg Brandl116aa622007-08-15 14:28:22 +0000145
146.. function:: cycle(iterable)
147
148 Make an iterator returning elements from the iterable and saving a copy of each.
149 When the iterable is exhausted, return elements from the saved copy. Repeats
150 indefinitely. Equivalent to::
151
152 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000153 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000154 saved = []
155 for element in iterable:
156 yield element
157 saved.append(element)
158 while saved:
159 for element in saved:
160 yield element
161
162 Note, this member of the toolkit may require significant auxiliary storage
163 (depending on the length of the iterable).
164
165
166.. function:: dropwhile(predicate, iterable)
167
168 Make an iterator that drops elements from the iterable as long as the predicate
169 is true; afterwards, returns every element. Note, the iterator does not produce
170 *any* output until the predicate first becomes false, so it may have a lengthy
171 start-up time. Equivalent to::
172
173 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000174 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000175 iterable = iter(iterable)
176 for x in iterable:
177 if not predicate(x):
178 yield x
179 break
180 for x in iterable:
181 yield x
182
183
184.. function:: groupby(iterable[, key])
185
186 Make an iterator that returns consecutive keys and groups from the *iterable*.
187 The *key* is a function computing a key value for each element. If not
188 specified or is ``None``, *key* defaults to an identity function and returns
189 the element unchanged. Generally, the iterable needs to already be sorted on
190 the same key function.
191
192 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
193 generates a break or new group every time the value of the key function changes
194 (which is why it is usually necessary to have sorted the data using the same key
195 function). That behavior differs from SQL's GROUP BY which aggregates common
196 elements regardless of their input order.
197
198 The returned group is itself an iterator that shares the underlying iterable
199 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
200 object is advanced, the previous group is no longer visible. So, if that data
201 is needed later, it should be stored as a list::
202
203 groups = []
204 uniquekeys = []
205 data = sorted(data, key=keyfunc)
206 for k, g in groupby(data, keyfunc):
207 groups.append(list(g)) # Store group iterator as a list
208 uniquekeys.append(k)
209
210 :func:`groupby` is equivalent to::
211
212 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000213 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
214 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000215 def __init__(self, iterable, key=None):
216 if key is None:
217 key = lambda x: x
218 self.keyfunc = key
219 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000220 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000221 def __iter__(self):
222 return self
223 def __next__(self):
224 while self.currkey == self.tgtkey:
225 self.currvalue = next(self.it) # Exit on StopIteration
226 self.currkey = self.keyfunc(self.currvalue)
227 self.tgtkey = self.currkey
228 return (self.currkey, self._grouper(self.tgtkey))
229 def _grouper(self, tgtkey):
230 while self.currkey == tgtkey:
231 yield self.currvalue
232 self.currvalue = next(self.it) # Exit on StopIteration
233 self.currkey = self.keyfunc(self.currvalue)
234
Georg Brandl116aa622007-08-15 14:28:22 +0000235
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000236.. function:: filterfalse(predicate, iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000237
238 Make an iterator that filters elements from iterable returning only those for
239 which the predicate is ``False``. If *predicate* is ``None``, return the items
240 that are false. Equivalent to::
241
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000242 def filterfalse(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000243 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl116aa622007-08-15 14:28:22 +0000244 if predicate is None:
245 predicate = bool
246 for x in iterable:
247 if not predicate(x):
248 yield x
249
250
Georg Brandl116aa622007-08-15 14:28:22 +0000251.. function:: islice(iterable, [start,] stop [, step])
252
253 Make an iterator that returns selected elements from the iterable. If *start* is
254 non-zero, then elements from the iterable are skipped until start is reached.
255 Afterward, elements are returned consecutively unless *step* is set higher than
256 one which results in items being skipped. If *stop* is ``None``, then iteration
257 continues until the iterator is exhausted, if at all; otherwise, it stops at the
258 specified position. Unlike regular slicing, :func:`islice` does not support
259 negative values for *start*, *stop*, or *step*. Can be used to extract related
260 fields from data where the internal structure has been flattened (for example, a
261 multi-line report may list a name field on every third line). Equivalent to::
262
263 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000264 # islice('ABCDEFG', 2) --> A B
265 # islice('ABCDEFG', 2, 4) --> C D
266 # islice('ABCDEFG', 2, None) --> C D E F G
267 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000268 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000269 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000270 nexti = next(it)
271 for i, element in enumerate(iterable):
272 if i == nexti:
273 yield element
274 nexti = next(it)
275
276 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
277 then the step defaults to one.
278
Georg Brandl116aa622007-08-15 14:28:22 +0000279
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000280.. function:: zip_longest(*iterables[, fillvalue])
Georg Brandl116aa622007-08-15 14:28:22 +0000281
282 Make an iterator that aggregates elements from each of the iterables. If the
283 iterables are of uneven length, missing values are filled-in with *fillvalue*.
284 Iteration continues until the longest iterable is exhausted. Equivalent to::
285
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000286 def zip_longest(*args, fillvalue=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000287 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl116aa622007-08-15 14:28:22 +0000288 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
289 yield counter() # yields the fillvalue, or raises IndexError
290 fillers = repeat(fillvalue)
291 iters = [chain(it, sentinel(), fillers) for it in args]
292 try:
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000293 for tup in zip(*iters):
Georg Brandl116aa622007-08-15 14:28:22 +0000294 yield tup
295 except IndexError:
296 pass
297
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000298 If one of the iterables is potentially infinite, then the :func:`zip_longest`
Georg Brandl116aa622007-08-15 14:28:22 +0000299 function should be wrapped with something that limits the number of calls (for
300 example :func:`islice` or :func:`takewhile`).
301
Georg Brandl116aa622007-08-15 14:28:22 +0000302
Christian Heimes70e7ea22008-02-28 20:02:27 +0000303.. function:: permutations(iterable[, r])
304
305 Return successive *r* length permutations of elements in the *iterable*.
306
307 If *r* is not specified or is ``None``, then *r* defaults to the length
308 of the *iterable* and all possible full-length permutations
309 are generated.
310
311 Permutations are emitted in lexicographic sort order. So, if the
312 input *iterable* is sorted, the permutation tuples will be produced
313 in sorted order.
314
315 Elements are treated as unique based on their position, not on their
316 value. So if the input elements are unique, there will be no repeat
317 values in each permutation.
318
Christian Heimesb558a2e2008-03-02 22:46:37 +0000319 Equivalent to::
320
321 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000322 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
323 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000324 pool = tuple(iterable)
325 n = len(pool)
326 r = n if r is None else r
327 indices = range(n)
Christian Heimesfe337bf2008-03-23 21:54:12 +0000328 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000329 yield tuple(pool[i] for i in indices[:r])
330 while n:
331 for i in reversed(range(r)):
332 cycles[i] -= 1
333 if cycles[i] == 0:
334 indices[i:] = indices[i+1:] + indices[i:i+1]
335 cycles[i] = n - i
336 else:
337 j = cycles[i]
338 indices[i], indices[-j] = indices[-j], indices[i]
339 yield tuple(pool[i] for i in indices[:r])
340 break
341 else:
342 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000343
Christian Heimes78644762008-03-04 23:39:23 +0000344 The code for :func:`permutations` can be also expressed as a subsequence of
345 :func:`product`, filtered to exclude entries with repeated elements (those
346 from the same position in the input pool)::
347
348 def permutations(iterable, r=None):
349 pool = tuple(iterable)
350 n = len(pool)
351 r = n if r is None else r
352 for indices in product(range(n), repeat=r):
353 if len(set(indices)) == r:
354 yield tuple(pool[i] for i in indices)
355
Christian Heimes70e7ea22008-02-28 20:02:27 +0000356
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000357.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000358
359 Cartesian product of input iterables.
360
361 Equivalent to nested for-loops in a generator expression. For example,
362 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
363
Christian Heimesdae2a892008-04-19 00:55:37 +0000364 The nested loops cycle like an odometer with the rightmost element advancing
365 on every iteration. This pattern creates a lexicographic ordering so that if
366 the input's iterables are sorted, the product tuples are emitted in sorted
367 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000368
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000369 To compute the product of an iterable with itself, specify the number of
370 repetitions with the optional *repeat* keyword argument. For example,
371 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
372
Christian Heimes78644762008-03-04 23:39:23 +0000373 This function is equivalent to the following code, except that the
374 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000375
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000376 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000377 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
378 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000379 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000380 result = [[]]
381 for pool in pools:
382 result = [x+[y] for x in result for y in pool]
383 for prod in result:
384 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000385
386
Georg Brandl116aa622007-08-15 14:28:22 +0000387.. function:: repeat(object[, times])
388
389 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000390 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000391 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000392 create an invariant part of a tuple record. Equivalent to::
393
394 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000395 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000396 if times is None:
397 while True:
398 yield object
399 else:
400 for i in range(times):
401 yield object
402
403
404.. function:: starmap(function, iterable)
405
Christian Heimes679db4a2008-01-18 09:56:22 +0000406 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000407 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000408 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000409 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000410 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
411
412 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000413 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000414 for args in iterable:
415 yield function(*args)
416
Georg Brandl116aa622007-08-15 14:28:22 +0000417
418.. function:: takewhile(predicate, iterable)
419
420 Make an iterator that returns elements from the iterable as long as the
421 predicate is true. Equivalent to::
422
423 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000424 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000425 for x in iterable:
426 if predicate(x):
427 yield x
428 else:
429 break
430
431
432.. function:: tee(iterable[, n=2])
433
434 Return *n* independent iterators from a single iterable. The case where ``n==2``
435 is equivalent to::
436
437 def tee(iterable):
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000438 def gen(next, data={}):
Georg Brandl116aa622007-08-15 14:28:22 +0000439 for i in count():
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000440 if i in data:
441 yield data.pop(i)
Georg Brandl116aa622007-08-15 14:28:22 +0000442 else:
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000443 data[i] = next()
444 yield data[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000445 it = iter(iterable)
446 return (gen(it.__next__), gen(it.__next__))
447
448 Note, once :func:`tee` has made a split, the original *iterable* should not be
449 used anywhere else; otherwise, the *iterable* could get advanced without the tee
450 objects being informed.
451
452 Note, this member of the toolkit may require significant auxiliary storage
453 (depending on how much temporary data needs to be stored). In general, if one
454 iterator is going to use most or all of the data before the other iterator, it
455 is faster to use :func:`list` instead of :func:`tee`.
456
Georg Brandl116aa622007-08-15 14:28:22 +0000457
458.. _itertools-example:
459
460Examples
461--------
462
463The following examples show common uses for each tool and demonstrate ways they
Christian Heimesfe337bf2008-03-23 21:54:12 +0000464can be combined.
465
466.. doctest::
Georg Brandl116aa622007-08-15 14:28:22 +0000467
Georg Brandl116aa622007-08-15 14:28:22 +0000468 # Show a dictionary sorted and grouped by value
469 >>> from operator import itemgetter
470 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000471 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000472 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000473 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000474 ...
475 1 ['a', 'c', 'e']
476 2 ['b', 'd', 'f']
477 3 ['g']
478
479 # Find runs of consecutive numbers using groupby. The key to the solution
480 # is differencing with a range so that consecutive numbers all appear in
481 # same group.
482 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
483 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000484 ... print(map(operator.itemgetter(1), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000485 ...
486 [1]
487 [4, 5, 6]
488 [10]
489 [15, 16, 17, 18]
490 [22]
491 [25, 26, 27, 28]
492
493
494
495.. _itertools-recipes:
496
497Recipes
498-------
499
500This section shows recipes for creating an extended toolset using the existing
501itertools as building blocks.
502
503The extended tools offer the same high performance as the underlying toolset.
504The superior memory performance is kept by processing elements one at a time
505rather than bringing the whole iterable into memory all at once. Code volume is
506kept small by linking the tools together in a functional style which helps
507eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000508"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000509which incur interpreter overhead.
510
511.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000512
Georg Brandl3dbca812008-07-23 16:10:53 +0000513 def take(n, iterable):
514 "Return first n items of the iterable as a list"
515 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000516
Georg Brandl3dbca812008-07-23 16:10:53 +0000517 def enumerate(iterable, start=0):
518 return zip(count(start), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000519
Georg Brandl3dbca812008-07-23 16:10:53 +0000520 def tabulate(function, start=0):
Georg Brandl116aa622007-08-15 14:28:22 +0000521 "Return function(0), function(1), ..."
Georg Brandl3dbca812008-07-23 16:10:53 +0000522 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000523
Georg Brandl116aa622007-08-15 14:28:22 +0000524 def nth(iterable, n):
Georg Brandl3dbca812008-07-23 16:10:53 +0000525 "Returns the nth item or empty list"
526 return list(islice(iterable, n, n+1))
Georg Brandl116aa622007-08-15 14:28:22 +0000527
Georg Brandl3dbca812008-07-23 16:10:53 +0000528 def quantify(iterable, pred=bool):
529 "Count how many times the predicate is true"
530 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000531
Georg Brandl3dbca812008-07-23 16:10:53 +0000532 def padnone(iterable):
Georg Brandl116aa622007-08-15 14:28:22 +0000533 """Returns the sequence elements and then returns None indefinitely.
534
535 Useful for emulating the behavior of the built-in map() function.
536 """
Georg Brandl3dbca812008-07-23 16:10:53 +0000537 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000538
Georg Brandl3dbca812008-07-23 16:10:53 +0000539 def ncycles(iterable, n):
Georg Brandl116aa622007-08-15 14:28:22 +0000540 "Returns the sequence elements n times"
Georg Brandl3dbca812008-07-23 16:10:53 +0000541 return chain.from_iterable(repeat(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000542
543 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000544 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000545
546 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000547 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000548
549 def repeatfunc(func, times=None, *args):
550 """Repeat calls to func with specified arguments.
551
552 Example: repeatfunc(random.random)
553 """
554 if times is None:
555 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000556 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000557
558 def pairwise(iterable):
559 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
560 a, b = tee(iterable)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000561 for elem in b:
562 break
563 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000564
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000565 def grouper(n, iterable, fillvalue=None):
Georg Brandl116aa622007-08-15 14:28:22 +0000566 "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000567 args = [iter(iterable)] * n
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000568 return zip_longest(*args, fillvalue=fillvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000569
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000570 def roundrobin(*iterables):
571 "roundrobin('abc', 'd', 'ef') --> 'a', 'd', 'e', 'b', 'f', 'c'"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000572 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000573 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000574 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000575 while pending:
576 try:
577 for next in nexts:
578 yield next()
579 except StopIteration:
580 pending -= 1
581 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000582
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000583 def powerset(iterable):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000584 "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])"
585 # Recipe credited to Eric Raymond
586 pairs = [(2**i, x) for i, x in enumerate(iterable)]
587 for n in xrange(2**len(pairs)):
588 yield set(x for m, x in pairs if m&n)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000589
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000590 def compress(data, selectors):
591 "compress('abcdef', [1,0,1,0,1,1]) --> a c e f"
Georg Brandl3dbca812008-07-23 16:10:53 +0000592 decorated = zip(data, selectors)
593 filtered = filter(operator.itemgetter(1), decorated)
594 return map(operator.itemgetter(0), filtered)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000595
Georg Brandl3dbca812008-07-23 16:10:53 +0000596 def combinations_with_replacement(iterable, r):
597 "combinations_with_replacement('ABC', 3) --> AA AB AC BB BC CC"
598 pool = tuple(iterable)
599 n = len(pool)
600 indices = [0] * r
601 yield tuple(pool[i] for i in indices)
602 while True:
603 for i in reversed(range(r)):
604 if indices[i] != n - 1:
605 break
606 else:
607 return
608 indices[i:] = [indices[i] + 1] * (r - i)
609 yield tuple(pool[i] for i in indices)