blob: 6e7885e12eef0a04a2e1570dcd4d438037f21441 [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Georg Brandl9afde1c2007-11-01 20:32:30 +000011This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl116aa622007-08-15 14:28:22 +000012constructs from the Haskell and SML programming languages. Each has been recast
13in a form suitable for Python.
14
15The module standardizes a core set of fast, memory efficient tools that are
16useful by themselves or in combination. Standardization helps avoid the
17readability and reliability problems which arise when many different individuals
18create their own slightly varying implementations, each with their own quirks
19and naming conventions.
20
21The tools are designed to combine readily with one another. This makes it easy
22to construct more specialized tools succinctly and efficiently in pure Python.
23
24For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000025sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
26by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000027
28Likewise, the functional tools are designed to work well with the high-speed
29functions provided by the :mod:`operator` module.
30
31The module author welcomes suggestions for other basic building blocks to be
32added to future versions of the module.
33
34Whether cast in pure python form or compiled code, tools that use iterators are
35more memory efficient (and faster) than their list based counterparts. Adopting
36the principles of just-in-time manufacturing, they create data when and where
37needed instead of consuming memory with the computer equivalent of "inventory".
38
39The performance advantage of iterators becomes more acute as the number of
40elements increases -- at some point, lists grow large enough to severely impact
41memory cache performance and start running slowly.
42
43
44.. seealso::
45
46 The Standard ML Basis Library, `The Standard ML Basis Library
47 <http://www.standardml.org/Basis/>`_.
48
49 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
50 Libraries <http://www.haskell.org/definition/>`_.
51
52
53.. _itertools-functions:
54
55Itertool functions
56------------------
57
58The following module functions all construct and return iterators. Some provide
59streams of infinite length, so they should only be accessed by functions or
60loops that truncate the stream.
61
62
63.. function:: chain(*iterables)
64
65 Make an iterator that returns elements from the first iterable until it is
66 exhausted, then proceeds to the next iterable, until all of the iterables are
67 exhausted. Used for treating consecutive sequences as a single sequence.
68 Equivalent to::
69
70 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000071 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000072 for it in iterables:
73 for element in it:
74 yield element
75
76
Christian Heimes70e7ea22008-02-28 20:02:27 +000077.. function:: itertools.chain.from_iterable(iterable)
78
79 Alternate constructor for :func:`chain`. Gets chained inputs from a
80 single iterable argument that is evaluated lazily. Equivalent to::
81
82 @classmethod
83 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000084 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +000085 for it in iterables:
86 for element in it:
87 yield element
88
89 .. versionadded:: 2.6
90
Christian Heimes78644762008-03-04 23:39:23 +000091
Christian Heimes836baa52008-02-26 08:18:30 +000092.. function:: combinations(iterable, r)
93
94 Return successive *r* length combinations of elements in the *iterable*.
95
Christian Heimes70e7ea22008-02-28 20:02:27 +000096 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +000097 input *iterable* is sorted, the combination tuples will be produced
98 in sorted order.
99
100 Elements are treated as unique based on their position, not on their
101 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +0000102 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +0000103
104 Each result tuple is ordered to match the input order. So, every
105 combination is a subsequence of the input *iterable*.
106
Christian Heimes836baa52008-02-26 08:18:30 +0000107 Equivalent to::
108
109 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000110 # combinations('ABCD', 2) --> AB AC AD BC BD CD
111 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000112 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000113 n = len(pool)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000114 indices = range(r)
115 yield tuple(pool[i] for i in indices)
Christian Heimes380f7f22008-02-28 11:19:05 +0000116 while 1:
117 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000118 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000119 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000120 else:
121 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000122 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000123 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000124 indices[j] = indices[j-1] + 1
125 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000126
Christian Heimes78644762008-03-04 23:39:23 +0000127 The code for :func:`combinations` can be also expressed as a subsequence
128 of :func:`permutations` after filtering entries where the elements are not
129 in sorted order (according to their position in the input pool)::
130
131 def combinations(iterable, r):
132 pool = tuple(iterable)
133 n = len(pool)
134 for indices in permutations(range(n), r):
135 if sorted(indices) == list(indices):
136 yield tuple(pool[i] for i in indices)
137
Christian Heimes836baa52008-02-26 08:18:30 +0000138 .. versionadded:: 2.6
139
Georg Brandl116aa622007-08-15 14:28:22 +0000140.. function:: count([n])
141
142 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettingera6c60372008-03-13 01:26:19 +0000143 specified *n* defaults to zero. Often used as an argument to :func:`map` to
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000144 generate consecutive data points. Also, used with :func:`zip` to add sequence
Georg Brandl9afde1c2007-11-01 20:32:30 +0000145 numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000146
147 def count(n=0):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000148 # count(10) --> 10 11 12 13 14 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000149 while True:
150 yield n
151 n += 1
152
Georg Brandl116aa622007-08-15 14:28:22 +0000153
154.. function:: cycle(iterable)
155
156 Make an iterator returning elements from the iterable and saving a copy of each.
157 When the iterable is exhausted, return elements from the saved copy. Repeats
158 indefinitely. Equivalent to::
159
160 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000161 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000162 saved = []
163 for element in iterable:
164 yield element
165 saved.append(element)
166 while saved:
167 for element in saved:
168 yield element
169
170 Note, this member of the toolkit may require significant auxiliary storage
171 (depending on the length of the iterable).
172
173
174.. function:: dropwhile(predicate, iterable)
175
176 Make an iterator that drops elements from the iterable as long as the predicate
177 is true; afterwards, returns every element. Note, the iterator does not produce
178 *any* output until the predicate first becomes false, so it may have a lengthy
179 start-up time. Equivalent to::
180
181 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000182 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000183 iterable = iter(iterable)
184 for x in iterable:
185 if not predicate(x):
186 yield x
187 break
188 for x in iterable:
189 yield x
190
191
192.. function:: groupby(iterable[, key])
193
194 Make an iterator that returns consecutive keys and groups from the *iterable*.
195 The *key* is a function computing a key value for each element. If not
196 specified or is ``None``, *key* defaults to an identity function and returns
197 the element unchanged. Generally, the iterable needs to already be sorted on
198 the same key function.
199
200 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
201 generates a break or new group every time the value of the key function changes
202 (which is why it is usually necessary to have sorted the data using the same key
203 function). That behavior differs from SQL's GROUP BY which aggregates common
204 elements regardless of their input order.
205
206 The returned group is itself an iterator that shares the underlying iterable
207 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
208 object is advanced, the previous group is no longer visible. So, if that data
209 is needed later, it should be stored as a list::
210
211 groups = []
212 uniquekeys = []
213 data = sorted(data, key=keyfunc)
214 for k, g in groupby(data, keyfunc):
215 groups.append(list(g)) # Store group iterator as a list
216 uniquekeys.append(k)
217
218 :func:`groupby` is equivalent to::
219
220 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000221 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
222 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000223 def __init__(self, iterable, key=None):
224 if key is None:
225 key = lambda x: x
226 self.keyfunc = key
227 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000228 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000229 def __iter__(self):
230 return self
231 def __next__(self):
232 while self.currkey == self.tgtkey:
233 self.currvalue = next(self.it) # Exit on StopIteration
234 self.currkey = self.keyfunc(self.currvalue)
235 self.tgtkey = self.currkey
236 return (self.currkey, self._grouper(self.tgtkey))
237 def _grouper(self, tgtkey):
238 while self.currkey == tgtkey:
239 yield self.currvalue
240 self.currvalue = next(self.it) # Exit on StopIteration
241 self.currkey = self.keyfunc(self.currvalue)
242
Georg Brandl116aa622007-08-15 14:28:22 +0000243
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000244.. function:: filterfalse(predicate, iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000245
246 Make an iterator that filters elements from iterable returning only those for
247 which the predicate is ``False``. If *predicate* is ``None``, return the items
248 that are false. Equivalent to::
249
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000250 def filterfalse(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000251 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl116aa622007-08-15 14:28:22 +0000252 if predicate is None:
253 predicate = bool
254 for x in iterable:
255 if not predicate(x):
256 yield x
257
258
Georg Brandl116aa622007-08-15 14:28:22 +0000259.. function:: islice(iterable, [start,] stop [, step])
260
261 Make an iterator that returns selected elements from the iterable. If *start* is
262 non-zero, then elements from the iterable are skipped until start is reached.
263 Afterward, elements are returned consecutively unless *step* is set higher than
264 one which results in items being skipped. If *stop* is ``None``, then iteration
265 continues until the iterator is exhausted, if at all; otherwise, it stops at the
266 specified position. Unlike regular slicing, :func:`islice` does not support
267 negative values for *start*, *stop*, or *step*. Can be used to extract related
268 fields from data where the internal structure has been flattened (for example, a
269 multi-line report may list a name field on every third line). Equivalent to::
270
271 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000272 # islice('ABCDEFG', 2) --> A B
273 # islice('ABCDEFG', 2, 4) --> C D
274 # islice('ABCDEFG', 2, None) --> C D E F G
275 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000276 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000277 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000278 nexti = next(it)
279 for i, element in enumerate(iterable):
280 if i == nexti:
281 yield element
282 nexti = next(it)
283
284 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
285 then the step defaults to one.
286
Georg Brandl116aa622007-08-15 14:28:22 +0000287
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000288.. function:: zip_longest(*iterables[, fillvalue])
Georg Brandl116aa622007-08-15 14:28:22 +0000289
290 Make an iterator that aggregates elements from each of the iterables. If the
291 iterables are of uneven length, missing values are filled-in with *fillvalue*.
292 Iteration continues until the longest iterable is exhausted. Equivalent to::
293
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000294 def zip_longest(*args, fillvalue=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000295 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl116aa622007-08-15 14:28:22 +0000296 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
297 yield counter() # yields the fillvalue, or raises IndexError
298 fillers = repeat(fillvalue)
299 iters = [chain(it, sentinel(), fillers) for it in args]
300 try:
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000301 for tup in zip(*iters):
Georg Brandl116aa622007-08-15 14:28:22 +0000302 yield tup
303 except IndexError:
304 pass
305
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000306 If one of the iterables is potentially infinite, then the :func:`zip_longest`
Georg Brandl116aa622007-08-15 14:28:22 +0000307 function should be wrapped with something that limits the number of calls (for
308 example :func:`islice` or :func:`takewhile`).
309
Georg Brandl116aa622007-08-15 14:28:22 +0000310
Christian Heimes70e7ea22008-02-28 20:02:27 +0000311.. function:: permutations(iterable[, r])
312
313 Return successive *r* length permutations of elements in the *iterable*.
314
315 If *r* is not specified or is ``None``, then *r* defaults to the length
316 of the *iterable* and all possible full-length permutations
317 are generated.
318
319 Permutations are emitted in lexicographic sort order. So, if the
320 input *iterable* is sorted, the permutation tuples will be produced
321 in sorted order.
322
323 Elements are treated as unique based on their position, not on their
324 value. So if the input elements are unique, there will be no repeat
325 values in each permutation.
326
Christian Heimesb558a2e2008-03-02 22:46:37 +0000327 Equivalent to::
328
329 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000330 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
331 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000332 pool = tuple(iterable)
333 n = len(pool)
334 r = n if r is None else r
335 indices = range(n)
336 cycles = range(n-r+1, n+1)[::-1]
337 yield tuple(pool[i] for i in indices[:r])
338 while n:
339 for i in reversed(range(r)):
340 cycles[i] -= 1
341 if cycles[i] == 0:
342 indices[i:] = indices[i+1:] + indices[i:i+1]
343 cycles[i] = n - i
344 else:
345 j = cycles[i]
346 indices[i], indices[-j] = indices[-j], indices[i]
347 yield tuple(pool[i] for i in indices[:r])
348 break
349 else:
350 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000351
Christian Heimes78644762008-03-04 23:39:23 +0000352 The code for :func:`permutations` can be also expressed as a subsequence of
353 :func:`product`, filtered to exclude entries with repeated elements (those
354 from the same position in the input pool)::
355
356 def permutations(iterable, r=None):
357 pool = tuple(iterable)
358 n = len(pool)
359 r = n if r is None else r
360 for indices in product(range(n), repeat=r):
361 if len(set(indices)) == r:
362 yield tuple(pool[i] for i in indices)
363
Christian Heimes70e7ea22008-02-28 20:02:27 +0000364 .. versionadded:: 2.6
365
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000366.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000367
368 Cartesian product of input iterables.
369
370 Equivalent to nested for-loops in a generator expression. For example,
371 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
372
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000373 The leftmost iterators correspond to the outermost for-loop, so the output
374 tuples cycle like an odometer (with the rightmost element changing on every
Christian Heimes78644762008-03-04 23:39:23 +0000375 iteration). This results in a lexicographic ordering so that if the
376 inputs iterables are sorted, the product tuples are emitted
Christian Heimes836baa52008-02-26 08:18:30 +0000377 in sorted order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000378
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000379 To compute the product of an iterable with itself, specify the number of
380 repetitions with the optional *repeat* keyword argument. For example,
381 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
382
Christian Heimes78644762008-03-04 23:39:23 +0000383 This function is equivalent to the following code, except that the
384 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000385
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000386 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000387 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
388 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000389 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000390 result = [[]]
391 for pool in pools:
392 result = [x+[y] for x in result for y in pool]
393 for prod in result:
394 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000395
396
Georg Brandl116aa622007-08-15 14:28:22 +0000397.. function:: repeat(object[, times])
398
399 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000400 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000401 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000402 create an invariant part of a tuple record. Equivalent to::
403
404 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000405 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000406 if times is None:
407 while True:
408 yield object
409 else:
410 for i in range(times):
411 yield object
412
413
414.. function:: starmap(function, iterable)
415
Christian Heimes679db4a2008-01-18 09:56:22 +0000416 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000417 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000418 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000419 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000420 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
421
422 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000423 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000424 for args in iterable:
425 yield function(*args)
426
427 .. versionchanged:: 2.6
428 Previously, :func:`starmap` required the function arguments to be tuples.
429 Now, any iterable is allowed.
Georg Brandl116aa622007-08-15 14:28:22 +0000430
431
432.. function:: takewhile(predicate, iterable)
433
434 Make an iterator that returns elements from the iterable as long as the
435 predicate is true. Equivalent to::
436
437 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000438 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000439 for x in iterable:
440 if predicate(x):
441 yield x
442 else:
443 break
444
445
446.. function:: tee(iterable[, n=2])
447
448 Return *n* independent iterators from a single iterable. The case where ``n==2``
449 is equivalent to::
450
451 def tee(iterable):
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000452 def gen(next, data={}):
Georg Brandl116aa622007-08-15 14:28:22 +0000453 for i in count():
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000454 if i in data:
455 yield data.pop(i)
Georg Brandl116aa622007-08-15 14:28:22 +0000456 else:
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000457 data[i] = next()
458 yield data[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000459 it = iter(iterable)
460 return (gen(it.__next__), gen(it.__next__))
461
462 Note, once :func:`tee` has made a split, the original *iterable* should not be
463 used anywhere else; otherwise, the *iterable* could get advanced without the tee
464 objects being informed.
465
466 Note, this member of the toolkit may require significant auxiliary storage
467 (depending on how much temporary data needs to be stored). In general, if one
468 iterator is going to use most or all of the data before the other iterator, it
469 is faster to use :func:`list` instead of :func:`tee`.
470
Georg Brandl116aa622007-08-15 14:28:22 +0000471
472.. _itertools-example:
473
474Examples
475--------
476
477The following examples show common uses for each tool and demonstrate ways they
478can be combined. ::
479
Georg Brandl116aa622007-08-15 14:28:22 +0000480 # Show a dictionary sorted and grouped by value
481 >>> from operator import itemgetter
482 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000483 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000484 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000485 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000486 ...
487 1 ['a', 'c', 'e']
488 2 ['b', 'd', 'f']
489 3 ['g']
490
491 # Find runs of consecutive numbers using groupby. The key to the solution
492 # is differencing with a range so that consecutive numbers all appear in
493 # same group.
494 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
495 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000496 ... print(map(operator.itemgetter(1), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000497 ...
498 [1]
499 [4, 5, 6]
500 [10]
501 [15, 16, 17, 18]
502 [22]
503 [25, 26, 27, 28]
504
505
506
507.. _itertools-recipes:
508
509Recipes
510-------
511
512This section shows recipes for creating an extended toolset using the existing
513itertools as building blocks.
514
515The extended tools offer the same high performance as the underlying toolset.
516The superior memory performance is kept by processing elements one at a time
517rather than bringing the whole iterable into memory all at once. Code volume is
518kept small by linking the tools together in a functional style which helps
519eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000520"vectorized" building blocks over the use of for-loops and :term:`generator`\s
521which incur interpreter overhead. ::
Georg Brandl116aa622007-08-15 14:28:22 +0000522
523 def take(n, seq):
524 return list(islice(seq, n))
525
526 def enumerate(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000527 return zip(count(), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000528
529 def tabulate(function):
530 "Return function(0), function(1), ..."
Raymond Hettingera6c60372008-03-13 01:26:19 +0000531 return map(function, count())
Georg Brandl116aa622007-08-15 14:28:22 +0000532
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000533 def items(mapping):
534 return zip(mapping.keys(), mapping.values())
535
Georg Brandl116aa622007-08-15 14:28:22 +0000536 def nth(iterable, n):
537 "Returns the nth item or raise StopIteration"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000538 return next(islice(iterable, n, None))
Georg Brandl116aa622007-08-15 14:28:22 +0000539
540 def all(seq, pred=None):
541 "Returns True if pred(x) is true for every element in the iterable"
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000542 for elem in filterfalse(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000543 return False
544 return True
545
546 def any(seq, pred=None):
547 "Returns True if pred(x) is true for at least one element in the iterable"
Raymond Hettinger17301e92008-03-13 00:19:26 +0000548 for elem in filter(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000549 return True
550 return False
551
552 def no(seq, pred=None):
553 "Returns True if pred(x) is false for every element in the iterable"
Raymond Hettinger17301e92008-03-13 00:19:26 +0000554 for elem in filter(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000555 return False
556 return True
557
558 def quantify(seq, pred=None):
559 "Count how many times the predicate is true in the sequence"
Raymond Hettingera6c60372008-03-13 01:26:19 +0000560 return sum(map(pred, seq))
Georg Brandl116aa622007-08-15 14:28:22 +0000561
562 def padnone(seq):
563 """Returns the sequence elements and then returns None indefinitely.
564
565 Useful for emulating the behavior of the built-in map() function.
566 """
567 return chain(seq, repeat(None))
568
569 def ncycles(seq, n):
570 "Returns the sequence elements n times"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000571 return chain.from_iterable(repeat(seq, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000572
573 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000574 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000575
576 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000577 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000578
579 def repeatfunc(func, times=None, *args):
580 """Repeat calls to func with specified arguments.
581
582 Example: repeatfunc(random.random)
583 """
584 if times is None:
585 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000586 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000587
588 def pairwise(iterable):
589 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
590 a, b = tee(iterable)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000591 for elem in b:
592 break
593 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000594
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000595 def grouper(n, iterable, fillvalue=None):
Georg Brandl116aa622007-08-15 14:28:22 +0000596 "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000597 args = [iter(iterable)] * n
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000598 return zip_longest(*args, fillvalue=fillvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000599
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000600 def roundrobin(*iterables):
601 "roundrobin('abc', 'd', 'ef') --> 'a', 'd', 'e', 'b', 'f', 'c'"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000602 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000603 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000604 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000605 while pending:
606 try:
607 for next in nexts:
608 yield next()
609 except StopIteration:
610 pending -= 1
611 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000612
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000613 def powerset(iterable):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000614 "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])"
615 # Recipe credited to Eric Raymond
616 pairs = [(2**i, x) for i, x in enumerate(iterable)]
617 for n in xrange(2**len(pairs)):
618 yield set(x for m, x in pairs if m&n)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000619
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000620 def compress(data, selectors):
621 "compress('abcdef', [1,0,1,0,1,1]) --> a c e f"
622 for d, s in zip(data, selectors):
623 if s:
624 yield d
625