blob: ec0f5334d2efe77844b47f2aa70483602df64455 [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Georg Brandl9afde1c2007-11-01 20:32:30 +000016This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl116aa622007-08-15 14:28:22 +000017constructs from the Haskell and SML programming languages. Each has been recast
18in a form suitable for Python.
19
20The module standardizes a core set of fast, memory efficient tools that are
21useful by themselves or in combination. Standardization helps avoid the
22readability and reliability problems which arise when many different individuals
23create their own slightly varying implementations, each with their own quirks
24and naming conventions.
25
26The tools are designed to combine readily with one another. This makes it easy
27to construct more specialized tools succinctly and efficiently in pure Python.
28
29For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000030sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
31by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
33Likewise, the functional tools are designed to work well with the high-speed
34functions provided by the :mod:`operator` module.
35
Georg Brandl116aa622007-08-15 14:28:22 +000036Whether cast in pure python form or compiled code, tools that use iterators are
Georg Brandl3dbca812008-07-23 16:10:53 +000037more memory efficient (and often faster) than their list based counterparts. Adopting
Georg Brandl116aa622007-08-15 14:28:22 +000038the principles of just-in-time manufacturing, they create data when and where
39needed instead of consuming memory with the computer equivalent of "inventory".
40
Georg Brandl116aa622007-08-15 14:28:22 +000041
42.. seealso::
43
44 The Standard ML Basis Library, `The Standard ML Basis Library
45 <http://www.standardml.org/Basis/>`_.
46
47 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
48 Libraries <http://www.haskell.org/definition/>`_.
49
50
51.. _itertools-functions:
52
53Itertool functions
54------------------
55
56The following module functions all construct and return iterators. Some provide
57streams of infinite length, so they should only be accessed by functions or
58loops that truncate the stream.
59
60
61.. function:: chain(*iterables)
62
63 Make an iterator that returns elements from the first iterable until it is
64 exhausted, then proceeds to the next iterable, until all of the iterables are
65 exhausted. Used for treating consecutive sequences as a single sequence.
66 Equivalent to::
67
68 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000069 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000070 for it in iterables:
71 for element in it:
72 yield element
73
74
Christian Heimes70e7ea22008-02-28 20:02:27 +000075.. function:: itertools.chain.from_iterable(iterable)
76
77 Alternate constructor for :func:`chain`. Gets chained inputs from a
78 single iterable argument that is evaluated lazily. Equivalent to::
79
80 @classmethod
81 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000082 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +000083 for it in iterables:
84 for element in it:
85 yield element
86
Christian Heimes78644762008-03-04 23:39:23 +000087
Christian Heimes836baa52008-02-26 08:18:30 +000088.. function:: combinations(iterable, r)
89
Christian Heimesdae2a892008-04-19 00:55:37 +000090 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +000091
Christian Heimes70e7ea22008-02-28 20:02:27 +000092 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +000093 input *iterable* is sorted, the combination tuples will be produced
94 in sorted order.
95
96 Elements are treated as unique based on their position, not on their
97 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +000098 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +000099
Christian Heimes836baa52008-02-26 08:18:30 +0000100 Equivalent to::
101
102 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000103 # combinations('ABCD', 2) --> AB AC AD BC BD CD
104 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000105 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000106 n = len(pool)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000107 indices = range(r)
108 yield tuple(pool[i] for i in indices)
Christian Heimes380f7f22008-02-28 11:19:05 +0000109 while 1:
110 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000111 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000112 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000113 else:
114 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000115 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000116 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000117 indices[j] = indices[j-1] + 1
118 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000119
Christian Heimes78644762008-03-04 23:39:23 +0000120 The code for :func:`combinations` can be also expressed as a subsequence
121 of :func:`permutations` after filtering entries where the elements are not
122 in sorted order (according to their position in the input pool)::
123
124 def combinations(iterable, r):
125 pool = tuple(iterable)
126 n = len(pool)
127 for indices in permutations(range(n), r):
128 if sorted(indices) == list(indices):
129 yield tuple(pool[i] for i in indices)
130
Christian Heimes836baa52008-02-26 08:18:30 +0000131
Georg Brandl116aa622007-08-15 14:28:22 +0000132.. function:: count([n])
133
134 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettingera6c60372008-03-13 01:26:19 +0000135 specified *n* defaults to zero. Often used as an argument to :func:`map` to
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000136 generate consecutive data points. Also, used with :func:`zip` to add sequence
Georg Brandl9afde1c2007-11-01 20:32:30 +0000137 numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000138
139 def count(n=0):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000140 # count(10) --> 10 11 12 13 14 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000141 while True:
142 yield n
143 n += 1
144
Georg Brandl116aa622007-08-15 14:28:22 +0000145
146.. function:: cycle(iterable)
147
148 Make an iterator returning elements from the iterable and saving a copy of each.
149 When the iterable is exhausted, return elements from the saved copy. Repeats
150 indefinitely. Equivalent to::
151
152 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000153 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000154 saved = []
155 for element in iterable:
156 yield element
157 saved.append(element)
158 while saved:
159 for element in saved:
160 yield element
161
162 Note, this member of the toolkit may require significant auxiliary storage
163 (depending on the length of the iterable).
164
165
166.. function:: dropwhile(predicate, iterable)
167
168 Make an iterator that drops elements from the iterable as long as the predicate
169 is true; afterwards, returns every element. Note, the iterator does not produce
170 *any* output until the predicate first becomes false, so it may have a lengthy
171 start-up time. Equivalent to::
172
173 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000174 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000175 iterable = iter(iterable)
176 for x in iterable:
177 if not predicate(x):
178 yield x
179 break
180 for x in iterable:
181 yield x
182
183
184.. function:: groupby(iterable[, key])
185
186 Make an iterator that returns consecutive keys and groups from the *iterable*.
187 The *key* is a function computing a key value for each element. If not
188 specified or is ``None``, *key* defaults to an identity function and returns
189 the element unchanged. Generally, the iterable needs to already be sorted on
190 the same key function.
191
192 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
193 generates a break or new group every time the value of the key function changes
194 (which is why it is usually necessary to have sorted the data using the same key
195 function). That behavior differs from SQL's GROUP BY which aggregates common
196 elements regardless of their input order.
197
198 The returned group is itself an iterator that shares the underlying iterable
199 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
200 object is advanced, the previous group is no longer visible. So, if that data
201 is needed later, it should be stored as a list::
202
203 groups = []
204 uniquekeys = []
205 data = sorted(data, key=keyfunc)
206 for k, g in groupby(data, keyfunc):
207 groups.append(list(g)) # Store group iterator as a list
208 uniquekeys.append(k)
209
210 :func:`groupby` is equivalent to::
211
212 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000213 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
214 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000215 def __init__(self, iterable, key=None):
216 if key is None:
217 key = lambda x: x
218 self.keyfunc = key
219 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000220 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000221 def __iter__(self):
222 return self
223 def __next__(self):
224 while self.currkey == self.tgtkey:
225 self.currvalue = next(self.it) # Exit on StopIteration
226 self.currkey = self.keyfunc(self.currvalue)
227 self.tgtkey = self.currkey
228 return (self.currkey, self._grouper(self.tgtkey))
229 def _grouper(self, tgtkey):
230 while self.currkey == tgtkey:
231 yield self.currvalue
232 self.currvalue = next(self.it) # Exit on StopIteration
233 self.currkey = self.keyfunc(self.currvalue)
234
Georg Brandl116aa622007-08-15 14:28:22 +0000235
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000236.. function:: filterfalse(predicate, iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000237
238 Make an iterator that filters elements from iterable returning only those for
239 which the predicate is ``False``. If *predicate* is ``None``, return the items
240 that are false. Equivalent to::
241
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000242 def filterfalse(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000243 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl116aa622007-08-15 14:28:22 +0000244 if predicate is None:
245 predicate = bool
246 for x in iterable:
247 if not predicate(x):
248 yield x
249
250
Georg Brandl116aa622007-08-15 14:28:22 +0000251.. function:: islice(iterable, [start,] stop [, step])
252
253 Make an iterator that returns selected elements from the iterable. If *start* is
254 non-zero, then elements from the iterable are skipped until start is reached.
255 Afterward, elements are returned consecutively unless *step* is set higher than
256 one which results in items being skipped. If *stop* is ``None``, then iteration
257 continues until the iterator is exhausted, if at all; otherwise, it stops at the
258 specified position. Unlike regular slicing, :func:`islice` does not support
259 negative values for *start*, *stop*, or *step*. Can be used to extract related
260 fields from data where the internal structure has been flattened (for example, a
261 multi-line report may list a name field on every third line). Equivalent to::
262
263 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000264 # islice('ABCDEFG', 2) --> A B
265 # islice('ABCDEFG', 2, 4) --> C D
266 # islice('ABCDEFG', 2, None) --> C D E F G
267 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000268 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000269 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000270 nexti = next(it)
271 for i, element in enumerate(iterable):
272 if i == nexti:
273 yield element
274 nexti = next(it)
275
276 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
277 then the step defaults to one.
278
Georg Brandl116aa622007-08-15 14:28:22 +0000279
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000280.. function:: zip_longest(*iterables[, fillvalue])
Georg Brandl116aa622007-08-15 14:28:22 +0000281
282 Make an iterator that aggregates elements from each of the iterables. If the
283 iterables are of uneven length, missing values are filled-in with *fillvalue*.
284 Iteration continues until the longest iterable is exhausted. Equivalent to::
285
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000286 def zip_longest(*args, fillvalue=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000287 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl116aa622007-08-15 14:28:22 +0000288 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
289 yield counter() # yields the fillvalue, or raises IndexError
290 fillers = repeat(fillvalue)
291 iters = [chain(it, sentinel(), fillers) for it in args]
292 try:
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000293 for tup in zip(*iters):
Georg Brandl116aa622007-08-15 14:28:22 +0000294 yield tup
295 except IndexError:
296 pass
297
Benjamin Peterson37d2fe02008-10-24 22:28:58 +0000298 If one of the iterables is potentially infinite, then the :func:`zip_longest`
299 function should be wrapped with something that limits the number of calls
300 (for example :func:`islice` or :func:`takewhile`). If not specified,
301 *fillvalue* defaults to ``None``.
Georg Brandl116aa622007-08-15 14:28:22 +0000302
Georg Brandl116aa622007-08-15 14:28:22 +0000303
Christian Heimes70e7ea22008-02-28 20:02:27 +0000304.. function:: permutations(iterable[, r])
305
306 Return successive *r* length permutations of elements in the *iterable*.
307
308 If *r* is not specified or is ``None``, then *r* defaults to the length
309 of the *iterable* and all possible full-length permutations
310 are generated.
311
312 Permutations are emitted in lexicographic sort order. So, if the
313 input *iterable* is sorted, the permutation tuples will be produced
314 in sorted order.
315
316 Elements are treated as unique based on their position, not on their
317 value. So if the input elements are unique, there will be no repeat
318 values in each permutation.
319
Christian Heimesb558a2e2008-03-02 22:46:37 +0000320 Equivalent to::
321
322 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000323 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
324 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000325 pool = tuple(iterable)
326 n = len(pool)
327 r = n if r is None else r
328 indices = range(n)
Christian Heimesfe337bf2008-03-23 21:54:12 +0000329 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000330 yield tuple(pool[i] for i in indices[:r])
331 while n:
332 for i in reversed(range(r)):
333 cycles[i] -= 1
334 if cycles[i] == 0:
335 indices[i:] = indices[i+1:] + indices[i:i+1]
336 cycles[i] = n - i
337 else:
338 j = cycles[i]
339 indices[i], indices[-j] = indices[-j], indices[i]
340 yield tuple(pool[i] for i in indices[:r])
341 break
342 else:
343 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000344
Christian Heimes78644762008-03-04 23:39:23 +0000345 The code for :func:`permutations` can be also expressed as a subsequence of
346 :func:`product`, filtered to exclude entries with repeated elements (those
347 from the same position in the input pool)::
348
349 def permutations(iterable, r=None):
350 pool = tuple(iterable)
351 n = len(pool)
352 r = n if r is None else r
353 for indices in product(range(n), repeat=r):
354 if len(set(indices)) == r:
355 yield tuple(pool[i] for i in indices)
356
Christian Heimes70e7ea22008-02-28 20:02:27 +0000357
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000358.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000359
360 Cartesian product of input iterables.
361
362 Equivalent to nested for-loops in a generator expression. For example,
363 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
364
Christian Heimesdae2a892008-04-19 00:55:37 +0000365 The nested loops cycle like an odometer with the rightmost element advancing
366 on every iteration. This pattern creates a lexicographic ordering so that if
367 the input's iterables are sorted, the product tuples are emitted in sorted
368 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000369
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000370 To compute the product of an iterable with itself, specify the number of
371 repetitions with the optional *repeat* keyword argument. For example,
372 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
373
Christian Heimes78644762008-03-04 23:39:23 +0000374 This function is equivalent to the following code, except that the
375 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000376
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000377 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000378 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
379 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000380 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000381 result = [[]]
382 for pool in pools:
383 result = [x+[y] for x in result for y in pool]
384 for prod in result:
385 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000386
387
Georg Brandl116aa622007-08-15 14:28:22 +0000388.. function:: repeat(object[, times])
389
390 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000391 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000392 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000393 create an invariant part of a tuple record. Equivalent to::
394
395 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000396 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000397 if times is None:
398 while True:
399 yield object
400 else:
401 for i in range(times):
402 yield object
403
404
405.. function:: starmap(function, iterable)
406
Christian Heimes679db4a2008-01-18 09:56:22 +0000407 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000408 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000409 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000410 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000411 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
412
413 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000414 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000415 for args in iterable:
416 yield function(*args)
417
Georg Brandl116aa622007-08-15 14:28:22 +0000418
419.. function:: takewhile(predicate, iterable)
420
421 Make an iterator that returns elements from the iterable as long as the
422 predicate is true. Equivalent to::
423
424 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000425 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000426 for x in iterable:
427 if predicate(x):
428 yield x
429 else:
430 break
431
432
433.. function:: tee(iterable[, n=2])
434
435 Return *n* independent iterators from a single iterable. The case where ``n==2``
436 is equivalent to::
437
438 def tee(iterable):
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000439 def gen(next, data={}):
Georg Brandl116aa622007-08-15 14:28:22 +0000440 for i in count():
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000441 if i in data:
442 yield data.pop(i)
Georg Brandl116aa622007-08-15 14:28:22 +0000443 else:
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000444 data[i] = next()
445 yield data[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000446 it = iter(iterable)
447 return (gen(it.__next__), gen(it.__next__))
448
449 Note, once :func:`tee` has made a split, the original *iterable* should not be
450 used anywhere else; otherwise, the *iterable* could get advanced without the tee
451 objects being informed.
452
453 Note, this member of the toolkit may require significant auxiliary storage
454 (depending on how much temporary data needs to be stored). In general, if one
455 iterator is going to use most or all of the data before the other iterator, it
456 is faster to use :func:`list` instead of :func:`tee`.
457
Georg Brandl116aa622007-08-15 14:28:22 +0000458
459.. _itertools-example:
460
461Examples
462--------
463
464The following examples show common uses for each tool and demonstrate ways they
Christian Heimesfe337bf2008-03-23 21:54:12 +0000465can be combined.
466
467.. doctest::
Georg Brandl116aa622007-08-15 14:28:22 +0000468
Georg Brandl116aa622007-08-15 14:28:22 +0000469 # Show a dictionary sorted and grouped by value
470 >>> from operator import itemgetter
471 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000472 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000473 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000474 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000475 ...
476 1 ['a', 'c', 'e']
477 2 ['b', 'd', 'f']
478 3 ['g']
479
480 # Find runs of consecutive numbers using groupby. The key to the solution
481 # is differencing with a range so that consecutive numbers all appear in
482 # same group.
483 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
484 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000485 ... print(map(operator.itemgetter(1), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000486 ...
487 [1]
488 [4, 5, 6]
489 [10]
490 [15, 16, 17, 18]
491 [22]
492 [25, 26, 27, 28]
493
494
495
496.. _itertools-recipes:
497
498Recipes
499-------
500
501This section shows recipes for creating an extended toolset using the existing
502itertools as building blocks.
503
504The extended tools offer the same high performance as the underlying toolset.
505The superior memory performance is kept by processing elements one at a time
506rather than bringing the whole iterable into memory all at once. Code volume is
507kept small by linking the tools together in a functional style which helps
508eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000509"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000510which incur interpreter overhead.
511
512.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000513
Georg Brandl3dbca812008-07-23 16:10:53 +0000514 def take(n, iterable):
515 "Return first n items of the iterable as a list"
516 return list(islice(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000517
Georg Brandl3dbca812008-07-23 16:10:53 +0000518 def enumerate(iterable, start=0):
519 return zip(count(start), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000520
Georg Brandl3dbca812008-07-23 16:10:53 +0000521 def tabulate(function, start=0):
Georg Brandl116aa622007-08-15 14:28:22 +0000522 "Return function(0), function(1), ..."
Georg Brandl3dbca812008-07-23 16:10:53 +0000523 return map(function, count(start))
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000524
Georg Brandl116aa622007-08-15 14:28:22 +0000525 def nth(iterable, n):
Georg Brandl3dbca812008-07-23 16:10:53 +0000526 "Returns the nth item or empty list"
527 return list(islice(iterable, n, n+1))
Georg Brandl116aa622007-08-15 14:28:22 +0000528
Georg Brandl3dbca812008-07-23 16:10:53 +0000529 def quantify(iterable, pred=bool):
530 "Count how many times the predicate is true"
531 return sum(map(pred, iterable))
Georg Brandl116aa622007-08-15 14:28:22 +0000532
Georg Brandl3dbca812008-07-23 16:10:53 +0000533 def padnone(iterable):
Georg Brandl116aa622007-08-15 14:28:22 +0000534 """Returns the sequence elements and then returns None indefinitely.
535
536 Useful for emulating the behavior of the built-in map() function.
537 """
Georg Brandl3dbca812008-07-23 16:10:53 +0000538 return chain(iterable, repeat(None))
Georg Brandl116aa622007-08-15 14:28:22 +0000539
Georg Brandl3dbca812008-07-23 16:10:53 +0000540 def ncycles(iterable, n):
Georg Brandl116aa622007-08-15 14:28:22 +0000541 "Returns the sequence elements n times"
Georg Brandl3dbca812008-07-23 16:10:53 +0000542 return chain.from_iterable(repeat(iterable, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000543
544 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000545 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000546
547 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000548 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000549
550 def repeatfunc(func, times=None, *args):
551 """Repeat calls to func with specified arguments.
552
553 Example: repeatfunc(random.random)
554 """
555 if times is None:
556 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000557 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000558
559 def pairwise(iterable):
560 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
561 a, b = tee(iterable)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000562 for elem in b:
563 break
564 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000565
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000566 def grouper(n, iterable, fillvalue=None):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000567 "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000568 args = [iter(iterable)] * n
Benjamin Petersond18de0e2008-07-31 20:21:46 +0000569 return zip_longest(fillvalue=fillvalue, *args)
Georg Brandl116aa622007-08-15 14:28:22 +0000570
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000571 def roundrobin(*iterables):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000572 "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000573 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000574 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000575 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000576 while pending:
577 try:
578 for next in nexts:
579 yield next()
580 except StopIteration:
581 pending -= 1
582 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000583
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000584 def powerset(iterable):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000585 "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])"
586 # Recipe credited to Eric Raymond
587 pairs = [(2**i, x) for i, x in enumerate(iterable)]
588 for n in xrange(2**len(pairs)):
589 yield set(x for m, x in pairs if m&n)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000590
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000591 def compress(data, selectors):
Raymond Hettingerf5a2e472008-07-30 07:37:37 +0000592 "compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F"
Benjamin Peterson37d2fe02008-10-24 22:28:58 +0000593 return (d for d, s in zip(data, selectors) if s)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000594
Georg Brandl3dbca812008-07-23 16:10:53 +0000595 def combinations_with_replacement(iterable, r):
596 "combinations_with_replacement('ABC', 3) --> AA AB AC BB BC CC"
597 pool = tuple(iterable)
598 n = len(pool)
599 indices = [0] * r
600 yield tuple(pool[i] for i in indices)
601 while True:
602 for i in reversed(range(r)):
603 if indices[i] != n - 1:
604 break
605 else:
606 return
607 indices[i:] = [indices[i] + 1] * (r - i)
608 yield tuple(pool[i] for i in indices)