blob: 0d2041060372159ae0f4a04de93d5399243fe873 [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Georg Brandl9afde1c2007-11-01 20:32:30 +000016This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl116aa622007-08-15 14:28:22 +000017constructs from the Haskell and SML programming languages. Each has been recast
18in a form suitable for Python.
19
20The module standardizes a core set of fast, memory efficient tools that are
21useful by themselves or in combination. Standardization helps avoid the
22readability and reliability problems which arise when many different individuals
23create their own slightly varying implementations, each with their own quirks
24and naming conventions.
25
26The tools are designed to combine readily with one another. This makes it easy
27to construct more specialized tools succinctly and efficiently in pure Python.
28
29For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000030sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
31by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
33Likewise, the functional tools are designed to work well with the high-speed
34functions provided by the :mod:`operator` module.
35
36The module author welcomes suggestions for other basic building blocks to be
37added to future versions of the module.
38
39Whether cast in pure python form or compiled code, tools that use iterators are
40more memory efficient (and faster) than their list based counterparts. Adopting
41the principles of just-in-time manufacturing, they create data when and where
42needed instead of consuming memory with the computer equivalent of "inventory".
43
44The performance advantage of iterators becomes more acute as the number of
45elements increases -- at some point, lists grow large enough to severely impact
46memory cache performance and start running slowly.
47
48
49.. seealso::
50
51 The Standard ML Basis Library, `The Standard ML Basis Library
52 <http://www.standardml.org/Basis/>`_.
53
54 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
55 Libraries <http://www.haskell.org/definition/>`_.
56
57
58.. _itertools-functions:
59
60Itertool functions
61------------------
62
63The following module functions all construct and return iterators. Some provide
64streams of infinite length, so they should only be accessed by functions or
65loops that truncate the stream.
66
67
68.. function:: chain(*iterables)
69
70 Make an iterator that returns elements from the first iterable until it is
71 exhausted, then proceeds to the next iterable, until all of the iterables are
72 exhausted. Used for treating consecutive sequences as a single sequence.
73 Equivalent to::
74
75 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000076 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000077 for it in iterables:
78 for element in it:
79 yield element
80
81
Christian Heimes70e7ea22008-02-28 20:02:27 +000082.. function:: itertools.chain.from_iterable(iterable)
83
84 Alternate constructor for :func:`chain`. Gets chained inputs from a
85 single iterable argument that is evaluated lazily. Equivalent to::
86
87 @classmethod
88 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000089 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +000090 for it in iterables:
91 for element in it:
92 yield element
93
94 .. versionadded:: 2.6
95
Christian Heimes78644762008-03-04 23:39:23 +000096
Christian Heimes836baa52008-02-26 08:18:30 +000097.. function:: combinations(iterable, r)
98
99 Return successive *r* length combinations of elements in the *iterable*.
100
Christian Heimes70e7ea22008-02-28 20:02:27 +0000101 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +0000102 input *iterable* is sorted, the combination tuples will be produced
103 in sorted order.
104
105 Elements are treated as unique based on their position, not on their
106 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +0000107 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +0000108
109 Each result tuple is ordered to match the input order. So, every
110 combination is a subsequence of the input *iterable*.
111
Christian Heimes836baa52008-02-26 08:18:30 +0000112 Equivalent to::
113
114 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000115 # combinations('ABCD', 2) --> AB AC AD BC BD CD
116 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000117 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000118 n = len(pool)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000119 indices = range(r)
120 yield tuple(pool[i] for i in indices)
Christian Heimes380f7f22008-02-28 11:19:05 +0000121 while 1:
122 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000123 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000124 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000125 else:
126 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000127 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000128 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000129 indices[j] = indices[j-1] + 1
130 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000131
Christian Heimes78644762008-03-04 23:39:23 +0000132 The code for :func:`combinations` can be also expressed as a subsequence
133 of :func:`permutations` after filtering entries where the elements are not
134 in sorted order (according to their position in the input pool)::
135
136 def combinations(iterable, r):
137 pool = tuple(iterable)
138 n = len(pool)
139 for indices in permutations(range(n), r):
140 if sorted(indices) == list(indices):
141 yield tuple(pool[i] for i in indices)
142
Christian Heimes836baa52008-02-26 08:18:30 +0000143 .. versionadded:: 2.6
144
Georg Brandl116aa622007-08-15 14:28:22 +0000145.. function:: count([n])
146
147 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettingera6c60372008-03-13 01:26:19 +0000148 specified *n* defaults to zero. Often used as an argument to :func:`map` to
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000149 generate consecutive data points. Also, used with :func:`zip` to add sequence
Georg Brandl9afde1c2007-11-01 20:32:30 +0000150 numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000151
152 def count(n=0):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000153 # count(10) --> 10 11 12 13 14 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000154 while True:
155 yield n
156 n += 1
157
Georg Brandl116aa622007-08-15 14:28:22 +0000158
159.. function:: cycle(iterable)
160
161 Make an iterator returning elements from the iterable and saving a copy of each.
162 When the iterable is exhausted, return elements from the saved copy. Repeats
163 indefinitely. Equivalent to::
164
165 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000166 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000167 saved = []
168 for element in iterable:
169 yield element
170 saved.append(element)
171 while saved:
172 for element in saved:
173 yield element
174
175 Note, this member of the toolkit may require significant auxiliary storage
176 (depending on the length of the iterable).
177
178
179.. function:: dropwhile(predicate, iterable)
180
181 Make an iterator that drops elements from the iterable as long as the predicate
182 is true; afterwards, returns every element. Note, the iterator does not produce
183 *any* output until the predicate first becomes false, so it may have a lengthy
184 start-up time. Equivalent to::
185
186 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000187 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000188 iterable = iter(iterable)
189 for x in iterable:
190 if not predicate(x):
191 yield x
192 break
193 for x in iterable:
194 yield x
195
196
197.. function:: groupby(iterable[, key])
198
199 Make an iterator that returns consecutive keys and groups from the *iterable*.
200 The *key* is a function computing a key value for each element. If not
201 specified or is ``None``, *key* defaults to an identity function and returns
202 the element unchanged. Generally, the iterable needs to already be sorted on
203 the same key function.
204
205 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
206 generates a break or new group every time the value of the key function changes
207 (which is why it is usually necessary to have sorted the data using the same key
208 function). That behavior differs from SQL's GROUP BY which aggregates common
209 elements regardless of their input order.
210
211 The returned group is itself an iterator that shares the underlying iterable
212 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
213 object is advanced, the previous group is no longer visible. So, if that data
214 is needed later, it should be stored as a list::
215
216 groups = []
217 uniquekeys = []
218 data = sorted(data, key=keyfunc)
219 for k, g in groupby(data, keyfunc):
220 groups.append(list(g)) # Store group iterator as a list
221 uniquekeys.append(k)
222
223 :func:`groupby` is equivalent to::
224
225 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000226 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
227 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000228 def __init__(self, iterable, key=None):
229 if key is None:
230 key = lambda x: x
231 self.keyfunc = key
232 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000233 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000234 def __iter__(self):
235 return self
236 def __next__(self):
237 while self.currkey == self.tgtkey:
238 self.currvalue = next(self.it) # Exit on StopIteration
239 self.currkey = self.keyfunc(self.currvalue)
240 self.tgtkey = self.currkey
241 return (self.currkey, self._grouper(self.tgtkey))
242 def _grouper(self, tgtkey):
243 while self.currkey == tgtkey:
244 yield self.currvalue
245 self.currvalue = next(self.it) # Exit on StopIteration
246 self.currkey = self.keyfunc(self.currvalue)
247
Georg Brandl116aa622007-08-15 14:28:22 +0000248
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000249.. function:: filterfalse(predicate, iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000250
251 Make an iterator that filters elements from iterable returning only those for
252 which the predicate is ``False``. If *predicate* is ``None``, return the items
253 that are false. Equivalent to::
254
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000255 def filterfalse(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000256 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl116aa622007-08-15 14:28:22 +0000257 if predicate is None:
258 predicate = bool
259 for x in iterable:
260 if not predicate(x):
261 yield x
262
263
Georg Brandl116aa622007-08-15 14:28:22 +0000264.. function:: islice(iterable, [start,] stop [, step])
265
266 Make an iterator that returns selected elements from the iterable. If *start* is
267 non-zero, then elements from the iterable are skipped until start is reached.
268 Afterward, elements are returned consecutively unless *step* is set higher than
269 one which results in items being skipped. If *stop* is ``None``, then iteration
270 continues until the iterator is exhausted, if at all; otherwise, it stops at the
271 specified position. Unlike regular slicing, :func:`islice` does not support
272 negative values for *start*, *stop*, or *step*. Can be used to extract related
273 fields from data where the internal structure has been flattened (for example, a
274 multi-line report may list a name field on every third line). Equivalent to::
275
276 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000277 # islice('ABCDEFG', 2) --> A B
278 # islice('ABCDEFG', 2, 4) --> C D
279 # islice('ABCDEFG', 2, None) --> C D E F G
280 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000281 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000282 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000283 nexti = next(it)
284 for i, element in enumerate(iterable):
285 if i == nexti:
286 yield element
287 nexti = next(it)
288
289 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
290 then the step defaults to one.
291
Georg Brandl116aa622007-08-15 14:28:22 +0000292
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000293.. function:: zip_longest(*iterables[, fillvalue])
Georg Brandl116aa622007-08-15 14:28:22 +0000294
295 Make an iterator that aggregates elements from each of the iterables. If the
296 iterables are of uneven length, missing values are filled-in with *fillvalue*.
297 Iteration continues until the longest iterable is exhausted. Equivalent to::
298
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000299 def zip_longest(*args, fillvalue=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000300 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl116aa622007-08-15 14:28:22 +0000301 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
302 yield counter() # yields the fillvalue, or raises IndexError
303 fillers = repeat(fillvalue)
304 iters = [chain(it, sentinel(), fillers) for it in args]
305 try:
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000306 for tup in zip(*iters):
Georg Brandl116aa622007-08-15 14:28:22 +0000307 yield tup
308 except IndexError:
309 pass
310
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000311 If one of the iterables is potentially infinite, then the :func:`zip_longest`
Georg Brandl116aa622007-08-15 14:28:22 +0000312 function should be wrapped with something that limits the number of calls (for
313 example :func:`islice` or :func:`takewhile`).
314
Georg Brandl116aa622007-08-15 14:28:22 +0000315
Christian Heimes70e7ea22008-02-28 20:02:27 +0000316.. function:: permutations(iterable[, r])
317
318 Return successive *r* length permutations of elements in the *iterable*.
319
320 If *r* is not specified or is ``None``, then *r* defaults to the length
321 of the *iterable* and all possible full-length permutations
322 are generated.
323
324 Permutations are emitted in lexicographic sort order. So, if the
325 input *iterable* is sorted, the permutation tuples will be produced
326 in sorted order.
327
328 Elements are treated as unique based on their position, not on their
329 value. So if the input elements are unique, there will be no repeat
330 values in each permutation.
331
Christian Heimesb558a2e2008-03-02 22:46:37 +0000332 Equivalent to::
333
334 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000335 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
336 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000337 pool = tuple(iterable)
338 n = len(pool)
339 r = n if r is None else r
340 indices = range(n)
Christian Heimesfe337bf2008-03-23 21:54:12 +0000341 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000342 yield tuple(pool[i] for i in indices[:r])
343 while n:
344 for i in reversed(range(r)):
345 cycles[i] -= 1
346 if cycles[i] == 0:
347 indices[i:] = indices[i+1:] + indices[i:i+1]
348 cycles[i] = n - i
349 else:
350 j = cycles[i]
351 indices[i], indices[-j] = indices[-j], indices[i]
352 yield tuple(pool[i] for i in indices[:r])
353 break
354 else:
355 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000356
Christian Heimes78644762008-03-04 23:39:23 +0000357 The code for :func:`permutations` can be also expressed as a subsequence of
358 :func:`product`, filtered to exclude entries with repeated elements (those
359 from the same position in the input pool)::
360
361 def permutations(iterable, r=None):
362 pool = tuple(iterable)
363 n = len(pool)
364 r = n if r is None else r
365 for indices in product(range(n), repeat=r):
366 if len(set(indices)) == r:
367 yield tuple(pool[i] for i in indices)
368
Christian Heimes70e7ea22008-02-28 20:02:27 +0000369 .. versionadded:: 2.6
370
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000371.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000372
373 Cartesian product of input iterables.
374
375 Equivalent to nested for-loops in a generator expression. For example,
376 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
377
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000378 The leftmost iterators correspond to the outermost for-loop, so the output
379 tuples cycle like an odometer (with the rightmost element changing on every
Christian Heimes78644762008-03-04 23:39:23 +0000380 iteration). This results in a lexicographic ordering so that if the
381 inputs iterables are sorted, the product tuples are emitted
Christian Heimes836baa52008-02-26 08:18:30 +0000382 in sorted order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000383
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000384 To compute the product of an iterable with itself, specify the number of
385 repetitions with the optional *repeat* keyword argument. For example,
386 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
387
Christian Heimes78644762008-03-04 23:39:23 +0000388 This function is equivalent to the following code, except that the
389 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000390
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000391 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000392 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
393 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000394 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000395 result = [[]]
396 for pool in pools:
397 result = [x+[y] for x in result for y in pool]
398 for prod in result:
399 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000400
401
Georg Brandl116aa622007-08-15 14:28:22 +0000402.. function:: repeat(object[, times])
403
404 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000405 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000406 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000407 create an invariant part of a tuple record. Equivalent to::
408
409 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000410 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000411 if times is None:
412 while True:
413 yield object
414 else:
415 for i in range(times):
416 yield object
417
418
419.. function:: starmap(function, iterable)
420
Christian Heimes679db4a2008-01-18 09:56:22 +0000421 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000422 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000423 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000424 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000425 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
426
427 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000428 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000429 for args in iterable:
430 yield function(*args)
431
432 .. versionchanged:: 2.6
433 Previously, :func:`starmap` required the function arguments to be tuples.
434 Now, any iterable is allowed.
Georg Brandl116aa622007-08-15 14:28:22 +0000435
436
437.. function:: takewhile(predicate, iterable)
438
439 Make an iterator that returns elements from the iterable as long as the
440 predicate is true. Equivalent to::
441
442 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000443 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000444 for x in iterable:
445 if predicate(x):
446 yield x
447 else:
448 break
449
450
451.. function:: tee(iterable[, n=2])
452
453 Return *n* independent iterators from a single iterable. The case where ``n==2``
454 is equivalent to::
455
456 def tee(iterable):
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000457 def gen(next, data={}):
Georg Brandl116aa622007-08-15 14:28:22 +0000458 for i in count():
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000459 if i in data:
460 yield data.pop(i)
Georg Brandl116aa622007-08-15 14:28:22 +0000461 else:
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000462 data[i] = next()
463 yield data[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000464 it = iter(iterable)
465 return (gen(it.__next__), gen(it.__next__))
466
467 Note, once :func:`tee` has made a split, the original *iterable* should not be
468 used anywhere else; otherwise, the *iterable* could get advanced without the tee
469 objects being informed.
470
471 Note, this member of the toolkit may require significant auxiliary storage
472 (depending on how much temporary data needs to be stored). In general, if one
473 iterator is going to use most or all of the data before the other iterator, it
474 is faster to use :func:`list` instead of :func:`tee`.
475
Georg Brandl116aa622007-08-15 14:28:22 +0000476
477.. _itertools-example:
478
479Examples
480--------
481
482The following examples show common uses for each tool and demonstrate ways they
Christian Heimesfe337bf2008-03-23 21:54:12 +0000483can be combined.
484
485.. doctest::
Georg Brandl116aa622007-08-15 14:28:22 +0000486
Georg Brandl116aa622007-08-15 14:28:22 +0000487 # Show a dictionary sorted and grouped by value
488 >>> from operator import itemgetter
489 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000490 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000491 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000492 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000493 ...
494 1 ['a', 'c', 'e']
495 2 ['b', 'd', 'f']
496 3 ['g']
497
498 # Find runs of consecutive numbers using groupby. The key to the solution
499 # is differencing with a range so that consecutive numbers all appear in
500 # same group.
501 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
502 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000503 ... print(map(operator.itemgetter(1), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000504 ...
505 [1]
506 [4, 5, 6]
507 [10]
508 [15, 16, 17, 18]
509 [22]
510 [25, 26, 27, 28]
511
512
513
514.. _itertools-recipes:
515
516Recipes
517-------
518
519This section shows recipes for creating an extended toolset using the existing
520itertools as building blocks.
521
522The extended tools offer the same high performance as the underlying toolset.
523The superior memory performance is kept by processing elements one at a time
524rather than bringing the whole iterable into memory all at once. Code volume is
525kept small by linking the tools together in a functional style which helps
526eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000527"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000528which incur interpreter overhead.
529
530.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000531
532 def take(n, seq):
533 return list(islice(seq, n))
534
535 def enumerate(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000536 return zip(count(), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000537
538 def tabulate(function):
539 "Return function(0), function(1), ..."
Raymond Hettingera6c60372008-03-13 01:26:19 +0000540 return map(function, count())
Georg Brandl116aa622007-08-15 14:28:22 +0000541
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000542 def items(mapping):
543 return zip(mapping.keys(), mapping.values())
544
Georg Brandl116aa622007-08-15 14:28:22 +0000545 def nth(iterable, n):
546 "Returns the nth item or raise StopIteration"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000547 return next(islice(iterable, n, None))
Georg Brandl116aa622007-08-15 14:28:22 +0000548
549 def all(seq, pred=None):
550 "Returns True if pred(x) is true for every element in the iterable"
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000551 for elem in filterfalse(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000552 return False
553 return True
554
555 def any(seq, pred=None):
556 "Returns True if pred(x) is true for at least one element in the iterable"
Raymond Hettinger17301e92008-03-13 00:19:26 +0000557 for elem in filter(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000558 return True
559 return False
560
561 def no(seq, pred=None):
562 "Returns True if pred(x) is false for every element in the iterable"
Raymond Hettinger17301e92008-03-13 00:19:26 +0000563 for elem in filter(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000564 return False
565 return True
566
567 def quantify(seq, pred=None):
568 "Count how many times the predicate is true in the sequence"
Raymond Hettingera6c60372008-03-13 01:26:19 +0000569 return sum(map(pred, seq))
Georg Brandl116aa622007-08-15 14:28:22 +0000570
571 def padnone(seq):
572 """Returns the sequence elements and then returns None indefinitely.
573
574 Useful for emulating the behavior of the built-in map() function.
575 """
576 return chain(seq, repeat(None))
577
578 def ncycles(seq, n):
579 "Returns the sequence elements n times"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000580 return chain.from_iterable(repeat(seq, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000581
582 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000583 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000584
585 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000586 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000587
588 def repeatfunc(func, times=None, *args):
589 """Repeat calls to func with specified arguments.
590
591 Example: repeatfunc(random.random)
592 """
593 if times is None:
594 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000595 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000596
597 def pairwise(iterable):
598 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
599 a, b = tee(iterable)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000600 for elem in b:
601 break
602 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000603
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000604 def grouper(n, iterable, fillvalue=None):
Georg Brandl116aa622007-08-15 14:28:22 +0000605 "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000606 args = [iter(iterable)] * n
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000607 return zip_longest(*args, fillvalue=fillvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000608
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000609 def roundrobin(*iterables):
610 "roundrobin('abc', 'd', 'ef') --> 'a', 'd', 'e', 'b', 'f', 'c'"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000611 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000612 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000613 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000614 while pending:
615 try:
616 for next in nexts:
617 yield next()
618 except StopIteration:
619 pending -= 1
620 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000621
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000622 def powerset(iterable):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000623 "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])"
624 # Recipe credited to Eric Raymond
625 pairs = [(2**i, x) for i, x in enumerate(iterable)]
626 for n in xrange(2**len(pairs)):
627 yield set(x for m, x in pairs if m&n)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000628
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000629 def compress(data, selectors):
630 "compress('abcdef', [1,0,1,0,1,1]) --> a c e f"
631 for d, s in zip(data, selectors):
632 if s:
633 yield d
634