blob: c9d4c60bd27abc00e272f66960e76f8bc2bde8a5 [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Georg Brandl9afde1c2007-11-01 20:32:30 +000016This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl116aa622007-08-15 14:28:22 +000017constructs from the Haskell and SML programming languages. Each has been recast
18in a form suitable for Python.
19
20The module standardizes a core set of fast, memory efficient tools that are
21useful by themselves or in combination. Standardization helps avoid the
22readability and reliability problems which arise when many different individuals
23create their own slightly varying implementations, each with their own quirks
24and naming conventions.
25
26The tools are designed to combine readily with one another. This makes it easy
27to construct more specialized tools succinctly and efficiently in pure Python.
28
29For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000030sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
31by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
33Likewise, the functional tools are designed to work well with the high-speed
34functions provided by the :mod:`operator` module.
35
36The module author welcomes suggestions for other basic building blocks to be
37added to future versions of the module.
38
39Whether cast in pure python form or compiled code, tools that use iterators are
40more memory efficient (and faster) than their list based counterparts. Adopting
41the principles of just-in-time manufacturing, they create data when and where
42needed instead of consuming memory with the computer equivalent of "inventory".
43
44The performance advantage of iterators becomes more acute as the number of
45elements increases -- at some point, lists grow large enough to severely impact
46memory cache performance and start running slowly.
47
48
49.. seealso::
50
51 The Standard ML Basis Library, `The Standard ML Basis Library
52 <http://www.standardml.org/Basis/>`_.
53
54 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
55 Libraries <http://www.haskell.org/definition/>`_.
56
57
58.. _itertools-functions:
59
60Itertool functions
61------------------
62
63The following module functions all construct and return iterators. Some provide
64streams of infinite length, so they should only be accessed by functions or
65loops that truncate the stream.
66
67
68.. function:: chain(*iterables)
69
70 Make an iterator that returns elements from the first iterable until it is
71 exhausted, then proceeds to the next iterable, until all of the iterables are
72 exhausted. Used for treating consecutive sequences as a single sequence.
73 Equivalent to::
74
75 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000076 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000077 for it in iterables:
78 for element in it:
79 yield element
80
81
Christian Heimes70e7ea22008-02-28 20:02:27 +000082.. function:: itertools.chain.from_iterable(iterable)
83
84 Alternate constructor for :func:`chain`. Gets chained inputs from a
85 single iterable argument that is evaluated lazily. Equivalent to::
86
87 @classmethod
88 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000089 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +000090 for it in iterables:
91 for element in it:
92 yield element
93
94 .. versionadded:: 2.6
95
Christian Heimes78644762008-03-04 23:39:23 +000096
Christian Heimes836baa52008-02-26 08:18:30 +000097.. function:: combinations(iterable, r)
98
Christian Heimesdae2a892008-04-19 00:55:37 +000099 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +0000100
Christian Heimes70e7ea22008-02-28 20:02:27 +0000101 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +0000102 input *iterable* is sorted, the combination tuples will be produced
103 in sorted order.
104
105 Elements are treated as unique based on their position, not on their
106 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +0000107 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +0000108
Christian Heimes836baa52008-02-26 08:18:30 +0000109 Equivalent to::
110
111 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000112 # combinations('ABCD', 2) --> AB AC AD BC BD CD
113 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000114 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000115 n = len(pool)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000116 indices = range(r)
117 yield tuple(pool[i] for i in indices)
Christian Heimes380f7f22008-02-28 11:19:05 +0000118 while 1:
119 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000120 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000121 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000122 else:
123 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000124 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000125 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000126 indices[j] = indices[j-1] + 1
127 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000128
Christian Heimes78644762008-03-04 23:39:23 +0000129 The code for :func:`combinations` can be also expressed as a subsequence
130 of :func:`permutations` after filtering entries where the elements are not
131 in sorted order (according to their position in the input pool)::
132
133 def combinations(iterable, r):
134 pool = tuple(iterable)
135 n = len(pool)
136 for indices in permutations(range(n), r):
137 if sorted(indices) == list(indices):
138 yield tuple(pool[i] for i in indices)
139
Christian Heimes836baa52008-02-26 08:18:30 +0000140 .. versionadded:: 2.6
141
Georg Brandl116aa622007-08-15 14:28:22 +0000142.. function:: count([n])
143
144 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettingera6c60372008-03-13 01:26:19 +0000145 specified *n* defaults to zero. Often used as an argument to :func:`map` to
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000146 generate consecutive data points. Also, used with :func:`zip` to add sequence
Georg Brandl9afde1c2007-11-01 20:32:30 +0000147 numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000148
149 def count(n=0):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000150 # count(10) --> 10 11 12 13 14 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000151 while True:
152 yield n
153 n += 1
154
Georg Brandl116aa622007-08-15 14:28:22 +0000155
156.. function:: cycle(iterable)
157
158 Make an iterator returning elements from the iterable and saving a copy of each.
159 When the iterable is exhausted, return elements from the saved copy. Repeats
160 indefinitely. Equivalent to::
161
162 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000163 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000164 saved = []
165 for element in iterable:
166 yield element
167 saved.append(element)
168 while saved:
169 for element in saved:
170 yield element
171
172 Note, this member of the toolkit may require significant auxiliary storage
173 (depending on the length of the iterable).
174
175
176.. function:: dropwhile(predicate, iterable)
177
178 Make an iterator that drops elements from the iterable as long as the predicate
179 is true; afterwards, returns every element. Note, the iterator does not produce
180 *any* output until the predicate first becomes false, so it may have a lengthy
181 start-up time. Equivalent to::
182
183 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000184 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000185 iterable = iter(iterable)
186 for x in iterable:
187 if not predicate(x):
188 yield x
189 break
190 for x in iterable:
191 yield x
192
193
194.. function:: groupby(iterable[, key])
195
196 Make an iterator that returns consecutive keys and groups from the *iterable*.
197 The *key* is a function computing a key value for each element. If not
198 specified or is ``None``, *key* defaults to an identity function and returns
199 the element unchanged. Generally, the iterable needs to already be sorted on
200 the same key function.
201
202 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
203 generates a break or new group every time the value of the key function changes
204 (which is why it is usually necessary to have sorted the data using the same key
205 function). That behavior differs from SQL's GROUP BY which aggregates common
206 elements regardless of their input order.
207
208 The returned group is itself an iterator that shares the underlying iterable
209 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
210 object is advanced, the previous group is no longer visible. So, if that data
211 is needed later, it should be stored as a list::
212
213 groups = []
214 uniquekeys = []
215 data = sorted(data, key=keyfunc)
216 for k, g in groupby(data, keyfunc):
217 groups.append(list(g)) # Store group iterator as a list
218 uniquekeys.append(k)
219
220 :func:`groupby` is equivalent to::
221
222 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000223 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
224 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000225 def __init__(self, iterable, key=None):
226 if key is None:
227 key = lambda x: x
228 self.keyfunc = key
229 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000230 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000231 def __iter__(self):
232 return self
233 def __next__(self):
234 while self.currkey == self.tgtkey:
235 self.currvalue = next(self.it) # Exit on StopIteration
236 self.currkey = self.keyfunc(self.currvalue)
237 self.tgtkey = self.currkey
238 return (self.currkey, self._grouper(self.tgtkey))
239 def _grouper(self, tgtkey):
240 while self.currkey == tgtkey:
241 yield self.currvalue
242 self.currvalue = next(self.it) # Exit on StopIteration
243 self.currkey = self.keyfunc(self.currvalue)
244
Georg Brandl116aa622007-08-15 14:28:22 +0000245
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000246.. function:: filterfalse(predicate, iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000247
248 Make an iterator that filters elements from iterable returning only those for
249 which the predicate is ``False``. If *predicate* is ``None``, return the items
250 that are false. Equivalent to::
251
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000252 def filterfalse(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000253 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl116aa622007-08-15 14:28:22 +0000254 if predicate is None:
255 predicate = bool
256 for x in iterable:
257 if not predicate(x):
258 yield x
259
260
Georg Brandl116aa622007-08-15 14:28:22 +0000261.. function:: islice(iterable, [start,] stop [, step])
262
263 Make an iterator that returns selected elements from the iterable. If *start* is
264 non-zero, then elements from the iterable are skipped until start is reached.
265 Afterward, elements are returned consecutively unless *step* is set higher than
266 one which results in items being skipped. If *stop* is ``None``, then iteration
267 continues until the iterator is exhausted, if at all; otherwise, it stops at the
268 specified position. Unlike regular slicing, :func:`islice` does not support
269 negative values for *start*, *stop*, or *step*. Can be used to extract related
270 fields from data where the internal structure has been flattened (for example, a
271 multi-line report may list a name field on every third line). Equivalent to::
272
273 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000274 # islice('ABCDEFG', 2) --> A B
275 # islice('ABCDEFG', 2, 4) --> C D
276 # islice('ABCDEFG', 2, None) --> C D E F G
277 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000278 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000279 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000280 nexti = next(it)
281 for i, element in enumerate(iterable):
282 if i == nexti:
283 yield element
284 nexti = next(it)
285
286 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
287 then the step defaults to one.
288
Georg Brandl116aa622007-08-15 14:28:22 +0000289
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000290.. function:: zip_longest(*iterables[, fillvalue])
Georg Brandl116aa622007-08-15 14:28:22 +0000291
292 Make an iterator that aggregates elements from each of the iterables. If the
293 iterables are of uneven length, missing values are filled-in with *fillvalue*.
294 Iteration continues until the longest iterable is exhausted. Equivalent to::
295
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000296 def zip_longest(*args, fillvalue=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000297 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl116aa622007-08-15 14:28:22 +0000298 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
299 yield counter() # yields the fillvalue, or raises IndexError
300 fillers = repeat(fillvalue)
301 iters = [chain(it, sentinel(), fillers) for it in args]
302 try:
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000303 for tup in zip(*iters):
Georg Brandl116aa622007-08-15 14:28:22 +0000304 yield tup
305 except IndexError:
306 pass
307
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000308 If one of the iterables is potentially infinite, then the :func:`zip_longest`
Georg Brandl116aa622007-08-15 14:28:22 +0000309 function should be wrapped with something that limits the number of calls (for
310 example :func:`islice` or :func:`takewhile`).
311
Georg Brandl116aa622007-08-15 14:28:22 +0000312
Christian Heimes70e7ea22008-02-28 20:02:27 +0000313.. function:: permutations(iterable[, r])
314
315 Return successive *r* length permutations of elements in the *iterable*.
316
317 If *r* is not specified or is ``None``, then *r* defaults to the length
318 of the *iterable* and all possible full-length permutations
319 are generated.
320
321 Permutations are emitted in lexicographic sort order. So, if the
322 input *iterable* is sorted, the permutation tuples will be produced
323 in sorted order.
324
325 Elements are treated as unique based on their position, not on their
326 value. So if the input elements are unique, there will be no repeat
327 values in each permutation.
328
Christian Heimesb558a2e2008-03-02 22:46:37 +0000329 Equivalent to::
330
331 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000332 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
333 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000334 pool = tuple(iterable)
335 n = len(pool)
336 r = n if r is None else r
337 indices = range(n)
Christian Heimesfe337bf2008-03-23 21:54:12 +0000338 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000339 yield tuple(pool[i] for i in indices[:r])
340 while n:
341 for i in reversed(range(r)):
342 cycles[i] -= 1
343 if cycles[i] == 0:
344 indices[i:] = indices[i+1:] + indices[i:i+1]
345 cycles[i] = n - i
346 else:
347 j = cycles[i]
348 indices[i], indices[-j] = indices[-j], indices[i]
349 yield tuple(pool[i] for i in indices[:r])
350 break
351 else:
352 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000353
Christian Heimes78644762008-03-04 23:39:23 +0000354 The code for :func:`permutations` can be also expressed as a subsequence of
355 :func:`product`, filtered to exclude entries with repeated elements (those
356 from the same position in the input pool)::
357
358 def permutations(iterable, r=None):
359 pool = tuple(iterable)
360 n = len(pool)
361 r = n if r is None else r
362 for indices in product(range(n), repeat=r):
363 if len(set(indices)) == r:
364 yield tuple(pool[i] for i in indices)
365
Christian Heimes70e7ea22008-02-28 20:02:27 +0000366 .. versionadded:: 2.6
367
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000368.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000369
370 Cartesian product of input iterables.
371
372 Equivalent to nested for-loops in a generator expression. For example,
373 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
374
Christian Heimesdae2a892008-04-19 00:55:37 +0000375 The nested loops cycle like an odometer with the rightmost element advancing
376 on every iteration. This pattern creates a lexicographic ordering so that if
377 the input's iterables are sorted, the product tuples are emitted in sorted
378 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000379
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000380 To compute the product of an iterable with itself, specify the number of
381 repetitions with the optional *repeat* keyword argument. For example,
382 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
383
Christian Heimes78644762008-03-04 23:39:23 +0000384 This function is equivalent to the following code, except that the
385 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000386
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000387 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000388 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
389 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000390 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000391 result = [[]]
392 for pool in pools:
393 result = [x+[y] for x in result for y in pool]
394 for prod in result:
395 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000396
397
Georg Brandl116aa622007-08-15 14:28:22 +0000398.. function:: repeat(object[, times])
399
400 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000401 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000402 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000403 create an invariant part of a tuple record. Equivalent to::
404
405 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000406 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000407 if times is None:
408 while True:
409 yield object
410 else:
411 for i in range(times):
412 yield object
413
414
415.. function:: starmap(function, iterable)
416
Christian Heimes679db4a2008-01-18 09:56:22 +0000417 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000418 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000419 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000420 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000421 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
422
423 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000424 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000425 for args in iterable:
426 yield function(*args)
427
428 .. versionchanged:: 2.6
429 Previously, :func:`starmap` required the function arguments to be tuples.
430 Now, any iterable is allowed.
Georg Brandl116aa622007-08-15 14:28:22 +0000431
432
433.. function:: takewhile(predicate, iterable)
434
435 Make an iterator that returns elements from the iterable as long as the
436 predicate is true. Equivalent to::
437
438 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000439 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000440 for x in iterable:
441 if predicate(x):
442 yield x
443 else:
444 break
445
446
447.. function:: tee(iterable[, n=2])
448
449 Return *n* independent iterators from a single iterable. The case where ``n==2``
450 is equivalent to::
451
452 def tee(iterable):
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000453 def gen(next, data={}):
Georg Brandl116aa622007-08-15 14:28:22 +0000454 for i in count():
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000455 if i in data:
456 yield data.pop(i)
Georg Brandl116aa622007-08-15 14:28:22 +0000457 else:
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000458 data[i] = next()
459 yield data[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000460 it = iter(iterable)
461 return (gen(it.__next__), gen(it.__next__))
462
463 Note, once :func:`tee` has made a split, the original *iterable* should not be
464 used anywhere else; otherwise, the *iterable* could get advanced without the tee
465 objects being informed.
466
467 Note, this member of the toolkit may require significant auxiliary storage
468 (depending on how much temporary data needs to be stored). In general, if one
469 iterator is going to use most or all of the data before the other iterator, it
470 is faster to use :func:`list` instead of :func:`tee`.
471
Georg Brandl116aa622007-08-15 14:28:22 +0000472
473.. _itertools-example:
474
475Examples
476--------
477
478The following examples show common uses for each tool and demonstrate ways they
Christian Heimesfe337bf2008-03-23 21:54:12 +0000479can be combined.
480
481.. doctest::
Georg Brandl116aa622007-08-15 14:28:22 +0000482
Georg Brandl116aa622007-08-15 14:28:22 +0000483 # Show a dictionary sorted and grouped by value
484 >>> from operator import itemgetter
485 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000486 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000487 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000488 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000489 ...
490 1 ['a', 'c', 'e']
491 2 ['b', 'd', 'f']
492 3 ['g']
493
494 # Find runs of consecutive numbers using groupby. The key to the solution
495 # is differencing with a range so that consecutive numbers all appear in
496 # same group.
497 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
498 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000499 ... print(map(operator.itemgetter(1), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000500 ...
501 [1]
502 [4, 5, 6]
503 [10]
504 [15, 16, 17, 18]
505 [22]
506 [25, 26, 27, 28]
507
508
509
510.. _itertools-recipes:
511
512Recipes
513-------
514
515This section shows recipes for creating an extended toolset using the existing
516itertools as building blocks.
517
518The extended tools offer the same high performance as the underlying toolset.
519The superior memory performance is kept by processing elements one at a time
520rather than bringing the whole iterable into memory all at once. Code volume is
521kept small by linking the tools together in a functional style which helps
522eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000523"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000524which incur interpreter overhead.
525
526.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000527
528 def take(n, seq):
529 return list(islice(seq, n))
530
531 def enumerate(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000532 return zip(count(), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000533
534 def tabulate(function):
535 "Return function(0), function(1), ..."
Raymond Hettingera6c60372008-03-13 01:26:19 +0000536 return map(function, count())
Georg Brandl116aa622007-08-15 14:28:22 +0000537
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000538 def items(mapping):
539 return zip(mapping.keys(), mapping.values())
540
Georg Brandl116aa622007-08-15 14:28:22 +0000541 def nth(iterable, n):
542 "Returns the nth item or raise StopIteration"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000543 return next(islice(iterable, n, None))
Georg Brandl116aa622007-08-15 14:28:22 +0000544
545 def all(seq, pred=None):
546 "Returns True if pred(x) is true for every element in the iterable"
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000547 for elem in filterfalse(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000548 return False
549 return True
550
551 def any(seq, pred=None):
552 "Returns True if pred(x) is true for at least one element in the iterable"
Raymond Hettinger17301e92008-03-13 00:19:26 +0000553 for elem in filter(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000554 return True
555 return False
556
557 def no(seq, pred=None):
558 "Returns True if pred(x) is false for every element in the iterable"
Raymond Hettinger17301e92008-03-13 00:19:26 +0000559 for elem in filter(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000560 return False
561 return True
562
563 def quantify(seq, pred=None):
564 "Count how many times the predicate is true in the sequence"
Raymond Hettingera6c60372008-03-13 01:26:19 +0000565 return sum(map(pred, seq))
Georg Brandl116aa622007-08-15 14:28:22 +0000566
567 def padnone(seq):
568 """Returns the sequence elements and then returns None indefinitely.
569
570 Useful for emulating the behavior of the built-in map() function.
571 """
572 return chain(seq, repeat(None))
573
574 def ncycles(seq, n):
575 "Returns the sequence elements n times"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000576 return chain.from_iterable(repeat(seq, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000577
578 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000579 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000580
581 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000582 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000583
584 def repeatfunc(func, times=None, *args):
585 """Repeat calls to func with specified arguments.
586
587 Example: repeatfunc(random.random)
588 """
589 if times is None:
590 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000591 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000592
593 def pairwise(iterable):
594 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
595 a, b = tee(iterable)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000596 for elem in b:
597 break
598 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000599
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000600 def grouper(n, iterable, fillvalue=None):
Georg Brandl116aa622007-08-15 14:28:22 +0000601 "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000602 args = [iter(iterable)] * n
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000603 return zip_longest(*args, fillvalue=fillvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000604
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000605 def roundrobin(*iterables):
606 "roundrobin('abc', 'd', 'ef') --> 'a', 'd', 'e', 'b', 'f', 'c'"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000607 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000608 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000609 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000610 while pending:
611 try:
612 for next in nexts:
613 yield next()
614 except StopIteration:
615 pending -= 1
616 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000617
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000618 def powerset(iterable):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000619 "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])"
620 # Recipe credited to Eric Raymond
621 pairs = [(2**i, x) for i, x in enumerate(iterable)]
622 for n in xrange(2**len(pairs)):
623 yield set(x for m, x in pairs if m&n)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000624
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000625 def compress(data, selectors):
626 "compress('abcdef', [1,0,1,0,1,1]) --> a c e f"
627 for d, s in zip(data, selectors):
628 if s:
629 yield d
630