blob: bc8986b809ec1eb831897f015477e3f204657bd8 [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandl9afde1c2007-11-01 20:32:30 +000011This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl116aa622007-08-15 14:28:22 +000012constructs from the Haskell and SML programming languages. Each has been recast
13in a form suitable for Python.
14
15The module standardizes a core set of fast, memory efficient tools that are
16useful by themselves or in combination. Standardization helps avoid the
17readability and reliability problems which arise when many different individuals
18create their own slightly varying implementations, each with their own quirks
19and naming conventions.
20
21The tools are designed to combine readily with one another. This makes it easy
22to construct more specialized tools succinctly and efficiently in pure Python.
23
24For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000025sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
26by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000027
28Likewise, the functional tools are designed to work well with the high-speed
29functions provided by the :mod:`operator` module.
30
31The module author welcomes suggestions for other basic building blocks to be
32added to future versions of the module.
33
34Whether cast in pure python form or compiled code, tools that use iterators are
35more memory efficient (and faster) than their list based counterparts. Adopting
36the principles of just-in-time manufacturing, they create data when and where
37needed instead of consuming memory with the computer equivalent of "inventory".
38
39The performance advantage of iterators becomes more acute as the number of
40elements increases -- at some point, lists grow large enough to severely impact
41memory cache performance and start running slowly.
42
43
44.. seealso::
45
46 The Standard ML Basis Library, `The Standard ML Basis Library
47 <http://www.standardml.org/Basis/>`_.
48
49 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
50 Libraries <http://www.haskell.org/definition/>`_.
51
52
53.. _itertools-functions:
54
55Itertool functions
56------------------
57
58The following module functions all construct and return iterators. Some provide
59streams of infinite length, so they should only be accessed by functions or
60loops that truncate the stream.
61
62
63.. function:: chain(*iterables)
64
65 Make an iterator that returns elements from the first iterable until it is
66 exhausted, then proceeds to the next iterable, until all of the iterables are
67 exhausted. Used for treating consecutive sequences as a single sequence.
68 Equivalent to::
69
70 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000071 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000072 for it in iterables:
73 for element in it:
74 yield element
75
76
Christian Heimes70e7ea22008-02-28 20:02:27 +000077.. function:: itertools.chain.from_iterable(iterable)
78
79 Alternate constructor for :func:`chain`. Gets chained inputs from a
80 single iterable argument that is evaluated lazily. Equivalent to::
81
82 @classmethod
83 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000084 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +000085 for it in iterables:
86 for element in it:
87 yield element
88
89 .. versionadded:: 2.6
90
Christian Heimes78644762008-03-04 23:39:23 +000091
Christian Heimes836baa52008-02-26 08:18:30 +000092.. function:: combinations(iterable, r)
93
94 Return successive *r* length combinations of elements in the *iterable*.
95
Christian Heimes70e7ea22008-02-28 20:02:27 +000096 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +000097 input *iterable* is sorted, the combination tuples will be produced
98 in sorted order.
99
100 Elements are treated as unique based on their position, not on their
101 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +0000102 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +0000103
104 Each result tuple is ordered to match the input order. So, every
105 combination is a subsequence of the input *iterable*.
106
Christian Heimes836baa52008-02-26 08:18:30 +0000107 Equivalent to::
108
109 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000110 # combinations('ABCD', 2) --> AB AC AD BC BD CD
111 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000112 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000113 n = len(pool)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000114 indices = range(r)
115 yield tuple(pool[i] for i in indices)
Christian Heimes380f7f22008-02-28 11:19:05 +0000116 while 1:
117 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000118 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000119 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000120 else:
121 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000122 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000123 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000124 indices[j] = indices[j-1] + 1
125 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000126
Christian Heimes78644762008-03-04 23:39:23 +0000127 The code for :func:`combinations` can be also expressed as a subsequence
128 of :func:`permutations` after filtering entries where the elements are not
129 in sorted order (according to their position in the input pool)::
130
131 def combinations(iterable, r):
132 pool = tuple(iterable)
133 n = len(pool)
134 for indices in permutations(range(n), r):
135 if sorted(indices) == list(indices):
136 yield tuple(pool[i] for i in indices)
137
Christian Heimes836baa52008-02-26 08:18:30 +0000138 .. versionadded:: 2.6
139
Georg Brandl116aa622007-08-15 14:28:22 +0000140.. function:: count([n])
141
142 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettingera6c60372008-03-13 01:26:19 +0000143 specified *n* defaults to zero. Often used as an argument to :func:`map` to
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000144 generate consecutive data points. Also, used with :func:`zip` to add sequence
Georg Brandl9afde1c2007-11-01 20:32:30 +0000145 numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000146
147 def count(n=0):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000148 # count(10) --> 10 11 12 13 14 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000149 while True:
150 yield n
151 n += 1
152
Georg Brandl116aa622007-08-15 14:28:22 +0000153
154.. function:: cycle(iterable)
155
156 Make an iterator returning elements from the iterable and saving a copy of each.
157 When the iterable is exhausted, return elements from the saved copy. Repeats
158 indefinitely. Equivalent to::
159
160 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000161 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000162 saved = []
163 for element in iterable:
164 yield element
165 saved.append(element)
166 while saved:
167 for element in saved:
168 yield element
169
170 Note, this member of the toolkit may require significant auxiliary storage
171 (depending on the length of the iterable).
172
173
174.. function:: dropwhile(predicate, iterable)
175
176 Make an iterator that drops elements from the iterable as long as the predicate
177 is true; afterwards, returns every element. Note, the iterator does not produce
178 *any* output until the predicate first becomes false, so it may have a lengthy
179 start-up time. Equivalent to::
180
181 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000182 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000183 iterable = iter(iterable)
184 for x in iterable:
185 if not predicate(x):
186 yield x
187 break
188 for x in iterable:
189 yield x
190
191
192.. function:: groupby(iterable[, key])
193
194 Make an iterator that returns consecutive keys and groups from the *iterable*.
195 The *key* is a function computing a key value for each element. If not
196 specified or is ``None``, *key* defaults to an identity function and returns
197 the element unchanged. Generally, the iterable needs to already be sorted on
198 the same key function.
199
200 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
201 generates a break or new group every time the value of the key function changes
202 (which is why it is usually necessary to have sorted the data using the same key
203 function). That behavior differs from SQL's GROUP BY which aggregates common
204 elements regardless of their input order.
205
206 The returned group is itself an iterator that shares the underlying iterable
207 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
208 object is advanced, the previous group is no longer visible. So, if that data
209 is needed later, it should be stored as a list::
210
211 groups = []
212 uniquekeys = []
213 data = sorted(data, key=keyfunc)
214 for k, g in groupby(data, keyfunc):
215 groups.append(list(g)) # Store group iterator as a list
216 uniquekeys.append(k)
217
218 :func:`groupby` is equivalent to::
219
220 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000221 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
222 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000223 def __init__(self, iterable, key=None):
224 if key is None:
225 key = lambda x: x
226 self.keyfunc = key
227 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000228 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000229 def __iter__(self):
230 return self
231 def __next__(self):
232 while self.currkey == self.tgtkey:
233 self.currvalue = next(self.it) # Exit on StopIteration
234 self.currkey = self.keyfunc(self.currvalue)
235 self.tgtkey = self.currkey
236 return (self.currkey, self._grouper(self.tgtkey))
237 def _grouper(self, tgtkey):
238 while self.currkey == tgtkey:
239 yield self.currvalue
240 self.currvalue = next(self.it) # Exit on StopIteration
241 self.currkey = self.keyfunc(self.currvalue)
242
Georg Brandl116aa622007-08-15 14:28:22 +0000243
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000244.. function:: filterfalse(predicate, iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000245
246 Make an iterator that filters elements from iterable returning only those for
247 which the predicate is ``False``. If *predicate* is ``None``, return the items
248 that are false. Equivalent to::
249
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000250 def filterfalse(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000251 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl116aa622007-08-15 14:28:22 +0000252 if predicate is None:
253 predicate = bool
254 for x in iterable:
255 if not predicate(x):
256 yield x
257
258
Georg Brandl116aa622007-08-15 14:28:22 +0000259.. function:: islice(iterable, [start,] stop [, step])
260
261 Make an iterator that returns selected elements from the iterable. If *start* is
262 non-zero, then elements from the iterable are skipped until start is reached.
263 Afterward, elements are returned consecutively unless *step* is set higher than
264 one which results in items being skipped. If *stop* is ``None``, then iteration
265 continues until the iterator is exhausted, if at all; otherwise, it stops at the
266 specified position. Unlike regular slicing, :func:`islice` does not support
267 negative values for *start*, *stop*, or *step*. Can be used to extract related
268 fields from data where the internal structure has been flattened (for example, a
269 multi-line report may list a name field on every third line). Equivalent to::
270
271 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000272 # islice('ABCDEFG', 2) --> A B
273 # islice('ABCDEFG', 2, 4) --> C D
274 # islice('ABCDEFG', 2, None) --> C D E F G
275 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000276 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000277 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000278 nexti = next(it)
279 for i, element in enumerate(iterable):
280 if i == nexti:
281 yield element
282 nexti = next(it)
283
284 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
285 then the step defaults to one.
286
Georg Brandl116aa622007-08-15 14:28:22 +0000287
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000288.. function:: zip_longest(*iterables[, fillvalue])
Georg Brandl116aa622007-08-15 14:28:22 +0000289
290 Make an iterator that aggregates elements from each of the iterables. If the
291 iterables are of uneven length, missing values are filled-in with *fillvalue*.
292 Iteration continues until the longest iterable is exhausted. Equivalent to::
293
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000294 def zip_longest(*args, **kwds):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000295 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl116aa622007-08-15 14:28:22 +0000296 fillvalue = kwds.get('fillvalue')
297 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
298 yield counter() # yields the fillvalue, or raises IndexError
299 fillers = repeat(fillvalue)
300 iters = [chain(it, sentinel(), fillers) for it in args]
301 try:
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000302 for tup in zip(*iters):
Georg Brandl116aa622007-08-15 14:28:22 +0000303 yield tup
304 except IndexError:
305 pass
306
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000307 If one of the iterables is potentially infinite, then the :func:`zip_longest`
Georg Brandl116aa622007-08-15 14:28:22 +0000308 function should be wrapped with something that limits the number of calls (for
309 example :func:`islice` or :func:`takewhile`).
310
Georg Brandl116aa622007-08-15 14:28:22 +0000311
Christian Heimes70e7ea22008-02-28 20:02:27 +0000312.. function:: permutations(iterable[, r])
313
314 Return successive *r* length permutations of elements in the *iterable*.
315
316 If *r* is not specified or is ``None``, then *r* defaults to the length
317 of the *iterable* and all possible full-length permutations
318 are generated.
319
320 Permutations are emitted in lexicographic sort order. So, if the
321 input *iterable* is sorted, the permutation tuples will be produced
322 in sorted order.
323
324 Elements are treated as unique based on their position, not on their
325 value. So if the input elements are unique, there will be no repeat
326 values in each permutation.
327
Christian Heimesb558a2e2008-03-02 22:46:37 +0000328 Equivalent to::
329
330 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000331 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
332 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000333 pool = tuple(iterable)
334 n = len(pool)
335 r = n if r is None else r
336 indices = range(n)
337 cycles = range(n-r+1, n+1)[::-1]
338 yield tuple(pool[i] for i in indices[:r])
339 while n:
340 for i in reversed(range(r)):
341 cycles[i] -= 1
342 if cycles[i] == 0:
343 indices[i:] = indices[i+1:] + indices[i:i+1]
344 cycles[i] = n - i
345 else:
346 j = cycles[i]
347 indices[i], indices[-j] = indices[-j], indices[i]
348 yield tuple(pool[i] for i in indices[:r])
349 break
350 else:
351 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000352
Christian Heimes78644762008-03-04 23:39:23 +0000353 The code for :func:`permutations` can be also expressed as a subsequence of
354 :func:`product`, filtered to exclude entries with repeated elements (those
355 from the same position in the input pool)::
356
357 def permutations(iterable, r=None):
358 pool = tuple(iterable)
359 n = len(pool)
360 r = n if r is None else r
361 for indices in product(range(n), repeat=r):
362 if len(set(indices)) == r:
363 yield tuple(pool[i] for i in indices)
364
Christian Heimes70e7ea22008-02-28 20:02:27 +0000365 .. versionadded:: 2.6
366
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000367.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000368
369 Cartesian product of input iterables.
370
371 Equivalent to nested for-loops in a generator expression. For example,
372 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
373
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000374 The leftmost iterators correspond to the outermost for-loop, so the output
375 tuples cycle like an odometer (with the rightmost element changing on every
Christian Heimes78644762008-03-04 23:39:23 +0000376 iteration). This results in a lexicographic ordering so that if the
377 inputs iterables are sorted, the product tuples are emitted
Christian Heimes836baa52008-02-26 08:18:30 +0000378 in sorted order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000379
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000380 To compute the product of an iterable with itself, specify the number of
381 repetitions with the optional *repeat* keyword argument. For example,
382 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
383
Christian Heimes78644762008-03-04 23:39:23 +0000384 This function is equivalent to the following code, except that the
385 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000386
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000387 def product(*args, **kwds):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000388 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
389 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000390 pools = map(tuple, args) * kwds.get('repeat', 1)
Christian Heimes78644762008-03-04 23:39:23 +0000391 result = [[]]
392 for pool in pools:
393 result = [x+[y] for x in result for y in pool]
394 for prod in result:
395 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000396
397
Georg Brandl116aa622007-08-15 14:28:22 +0000398.. function:: repeat(object[, times])
399
400 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000401 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000402 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000403 create an invariant part of a tuple record. Equivalent to::
404
405 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000406 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000407 if times is None:
408 while True:
409 yield object
410 else:
411 for i in range(times):
412 yield object
413
414
415.. function:: starmap(function, iterable)
416
Christian Heimes679db4a2008-01-18 09:56:22 +0000417 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000418 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000419 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000420 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000421 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
422
423 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000424 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000425 for args in iterable:
426 yield function(*args)
427
428 .. versionchanged:: 2.6
429 Previously, :func:`starmap` required the function arguments to be tuples.
430 Now, any iterable is allowed.
Georg Brandl116aa622007-08-15 14:28:22 +0000431
432
433.. function:: takewhile(predicate, iterable)
434
435 Make an iterator that returns elements from the iterable as long as the
436 predicate is true. Equivalent to::
437
438 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000439 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000440 for x in iterable:
441 if predicate(x):
442 yield x
443 else:
444 break
445
446
447.. function:: tee(iterable[, n=2])
448
449 Return *n* independent iterators from a single iterable. The case where ``n==2``
450 is equivalent to::
451
452 def tee(iterable):
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000453 def gen(next, data={}):
Georg Brandl116aa622007-08-15 14:28:22 +0000454 for i in count():
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000455 if i in data:
456 yield data.pop(i)
Georg Brandl116aa622007-08-15 14:28:22 +0000457 else:
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000458 data[i] = next()
459 yield data[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000460 it = iter(iterable)
461 return (gen(it.__next__), gen(it.__next__))
462
463 Note, once :func:`tee` has made a split, the original *iterable* should not be
464 used anywhere else; otherwise, the *iterable* could get advanced without the tee
465 objects being informed.
466
467 Note, this member of the toolkit may require significant auxiliary storage
468 (depending on how much temporary data needs to be stored). In general, if one
469 iterator is going to use most or all of the data before the other iterator, it
470 is faster to use :func:`list` instead of :func:`tee`.
471
Georg Brandl116aa622007-08-15 14:28:22 +0000472
473.. _itertools-example:
474
475Examples
476--------
477
478The following examples show common uses for each tool and demonstrate ways they
479can be combined. ::
480
Georg Brandl116aa622007-08-15 14:28:22 +0000481 # Show a dictionary sorted and grouped by value
482 >>> from operator import itemgetter
483 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000484 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000485 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000486 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000487 ...
488 1 ['a', 'c', 'e']
489 2 ['b', 'd', 'f']
490 3 ['g']
491
492 # Find runs of consecutive numbers using groupby. The key to the solution
493 # is differencing with a range so that consecutive numbers all appear in
494 # same group.
495 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
496 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000497 ... print(map(operator.itemgetter(1), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000498 ...
499 [1]
500 [4, 5, 6]
501 [10]
502 [15, 16, 17, 18]
503 [22]
504 [25, 26, 27, 28]
505
506
507
508.. _itertools-recipes:
509
510Recipes
511-------
512
513This section shows recipes for creating an extended toolset using the existing
514itertools as building blocks.
515
516The extended tools offer the same high performance as the underlying toolset.
517The superior memory performance is kept by processing elements one at a time
518rather than bringing the whole iterable into memory all at once. Code volume is
519kept small by linking the tools together in a functional style which helps
520eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000521"vectorized" building blocks over the use of for-loops and :term:`generator`\s
522which incur interpreter overhead. ::
Georg Brandl116aa622007-08-15 14:28:22 +0000523
524 def take(n, seq):
525 return list(islice(seq, n))
526
527 def enumerate(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000528 return zip(count(), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000529
530 def tabulate(function):
531 "Return function(0), function(1), ..."
Raymond Hettingera6c60372008-03-13 01:26:19 +0000532 return map(function, count())
Georg Brandl116aa622007-08-15 14:28:22 +0000533
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000534 def items(mapping):
535 return zip(mapping.keys(), mapping.values())
536
Georg Brandl116aa622007-08-15 14:28:22 +0000537 def nth(iterable, n):
538 "Returns the nth item or raise StopIteration"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000539 return next(islice(iterable, n, None))
Georg Brandl116aa622007-08-15 14:28:22 +0000540
541 def all(seq, pred=None):
542 "Returns True if pred(x) is true for every element in the iterable"
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000543 for elem in filterfalse(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000544 return False
545 return True
546
547 def any(seq, pred=None):
548 "Returns True if pred(x) is true for at least one element in the iterable"
Raymond Hettinger17301e92008-03-13 00:19:26 +0000549 for elem in filter(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000550 return True
551 return False
552
553 def no(seq, pred=None):
554 "Returns True if pred(x) is false for every element in the iterable"
Raymond Hettinger17301e92008-03-13 00:19:26 +0000555 for elem in filter(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000556 return False
557 return True
558
559 def quantify(seq, pred=None):
560 "Count how many times the predicate is true in the sequence"
Raymond Hettingera6c60372008-03-13 01:26:19 +0000561 return sum(map(pred, seq))
Georg Brandl116aa622007-08-15 14:28:22 +0000562
563 def padnone(seq):
564 """Returns the sequence elements and then returns None indefinitely.
565
566 Useful for emulating the behavior of the built-in map() function.
567 """
568 return chain(seq, repeat(None))
569
570 def ncycles(seq, n):
571 "Returns the sequence elements n times"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000572 return chain.from_iterable(repeat(seq, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000573
574 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000575 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000576
577 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000578 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000579
580 def repeatfunc(func, times=None, *args):
581 """Repeat calls to func with specified arguments.
582
583 Example: repeatfunc(random.random)
584 """
585 if times is None:
586 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000587 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000588
589 def pairwise(iterable):
590 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
591 a, b = tee(iterable)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000592 for elem in b:
593 break
594 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000595
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000596 def grouper(n, iterable, fillvalue=None):
Georg Brandl116aa622007-08-15 14:28:22 +0000597 "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000598 args = [iter(iterable)] * n
599 kwds = dict(fillvalue=fillvalue)
600 return zip_longest(*args, **kwds)
Georg Brandl116aa622007-08-15 14:28:22 +0000601
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000602 def roundrobin(*iterables):
603 "roundrobin('abc', 'd', 'ef') --> 'a', 'd', 'e', 'b', 'f', 'c'"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000604 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000605 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000606 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000607 while pending:
608 try:
609 for next in nexts:
610 yield next()
611 except StopIteration:
612 pending -= 1
613 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000614
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000615 def powerset(iterable):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000616 "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])"
617 # Recipe credited to Eric Raymond
618 pairs = [(2**i, x) for i, x in enumerate(iterable)]
619 for n in xrange(2**len(pairs)):
620 yield set(x for m, x in pairs if m&n)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000621
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000622 def compress(data, selectors):
623 "compress('abcdef', [1,0,1,0,1,1]) --> a c e f"
624 for d, s in zip(data, selectors):
625 if s:
626 yield d
627