blob: 63cace61bd33dd5862a26fd4cee7016f2caa06d5 [file] [log] [blame]
Georg Brandl116aa622007-08-15 14:28:22 +00001
2:mod:`itertools` --- Functions creating iterators for efficient looping
3=======================================================================
4
5.. module:: itertools
6 :synopsis: Functions creating iterators for efficient looping.
7.. moduleauthor:: Raymond Hettinger <python@rcn.com>
8.. sectionauthor:: Raymond Hettinger <python@rcn.com>
9
10
Christian Heimesfe337bf2008-03-23 21:54:12 +000011.. testsetup::
12
13 from itertools import *
14
15
Georg Brandl9afde1c2007-11-01 20:32:30 +000016This module implements a number of :term:`iterator` building blocks inspired by
Georg Brandl116aa622007-08-15 14:28:22 +000017constructs from the Haskell and SML programming languages. Each has been recast
18in a form suitable for Python.
19
20The module standardizes a core set of fast, memory efficient tools that are
21useful by themselves or in combination. Standardization helps avoid the
22readability and reliability problems which arise when many different individuals
23create their own slightly varying implementations, each with their own quirks
24and naming conventions.
25
26The tools are designed to combine readily with one another. This makes it easy
27to construct more specialized tools succinctly and efficiently in pure Python.
28
29For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
Raymond Hettingera6c60372008-03-13 01:26:19 +000030sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python
31by combining :func:`map` and :func:`count` to form ``map(f, count())``.
Georg Brandl116aa622007-08-15 14:28:22 +000032
33Likewise, the functional tools are designed to work well with the high-speed
34functions provided by the :mod:`operator` module.
35
36The module author welcomes suggestions for other basic building blocks to be
37added to future versions of the module.
38
39Whether cast in pure python form or compiled code, tools that use iterators are
40more memory efficient (and faster) than their list based counterparts. Adopting
41the principles of just-in-time manufacturing, they create data when and where
42needed instead of consuming memory with the computer equivalent of "inventory".
43
44The performance advantage of iterators becomes more acute as the number of
45elements increases -- at some point, lists grow large enough to severely impact
46memory cache performance and start running slowly.
47
48
49.. seealso::
50
51 The Standard ML Basis Library, `The Standard ML Basis Library
52 <http://www.standardml.org/Basis/>`_.
53
54 Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
55 Libraries <http://www.haskell.org/definition/>`_.
56
57
58.. _itertools-functions:
59
60Itertool functions
61------------------
62
63The following module functions all construct and return iterators. Some provide
64streams of infinite length, so they should only be accessed by functions or
65loops that truncate the stream.
66
67
68.. function:: chain(*iterables)
69
70 Make an iterator that returns elements from the first iterable until it is
71 exhausted, then proceeds to the next iterable, until all of the iterables are
72 exhausted. Used for treating consecutive sequences as a single sequence.
73 Equivalent to::
74
75 def chain(*iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000076 # chain('ABC', 'DEF') --> A B C D E F
Georg Brandl116aa622007-08-15 14:28:22 +000077 for it in iterables:
78 for element in it:
79 yield element
80
81
Christian Heimes70e7ea22008-02-28 20:02:27 +000082.. function:: itertools.chain.from_iterable(iterable)
83
84 Alternate constructor for :func:`chain`. Gets chained inputs from a
85 single iterable argument that is evaluated lazily. Equivalent to::
86
87 @classmethod
88 def from_iterable(iterables):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +000089 # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
Christian Heimes70e7ea22008-02-28 20:02:27 +000090 for it in iterables:
91 for element in it:
92 yield element
93
Christian Heimes78644762008-03-04 23:39:23 +000094
Christian Heimes836baa52008-02-26 08:18:30 +000095.. function:: combinations(iterable, r)
96
Christian Heimesdae2a892008-04-19 00:55:37 +000097 Return *r* length subsequences of elements from the input *iterable*.
Christian Heimes836baa52008-02-26 08:18:30 +000098
Christian Heimes70e7ea22008-02-28 20:02:27 +000099 Combinations are emitted in lexicographic sort order. So, if the
Christian Heimes836baa52008-02-26 08:18:30 +0000100 input *iterable* is sorted, the combination tuples will be produced
101 in sorted order.
102
103 Elements are treated as unique based on their position, not on their
104 value. So if the input elements are unique, there will be no repeat
Christian Heimes70e7ea22008-02-28 20:02:27 +0000105 values in each combination.
Christian Heimes836baa52008-02-26 08:18:30 +0000106
Christian Heimes836baa52008-02-26 08:18:30 +0000107 Equivalent to::
108
109 def combinations(iterable, r):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000110 # combinations('ABCD', 2) --> AB AC AD BC BD CD
111 # combinations(range(4), 3) --> 012 013 023 123
Christian Heimes836baa52008-02-26 08:18:30 +0000112 pool = tuple(iterable)
Christian Heimes380f7f22008-02-28 11:19:05 +0000113 n = len(pool)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000114 indices = range(r)
115 yield tuple(pool[i] for i in indices)
Christian Heimes380f7f22008-02-28 11:19:05 +0000116 while 1:
117 for i in reversed(range(r)):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000118 if indices[i] != i + n - r:
Christian Heimes836baa52008-02-26 08:18:30 +0000119 break
Christian Heimes380f7f22008-02-28 11:19:05 +0000120 else:
121 return
Christian Heimesb558a2e2008-03-02 22:46:37 +0000122 indices[i] += 1
Christian Heimes380f7f22008-02-28 11:19:05 +0000123 for j in range(i+1, r):
Christian Heimesb558a2e2008-03-02 22:46:37 +0000124 indices[j] = indices[j-1] + 1
125 yield tuple(pool[i] for i in indices)
Christian Heimes836baa52008-02-26 08:18:30 +0000126
Christian Heimes78644762008-03-04 23:39:23 +0000127 The code for :func:`combinations` can be also expressed as a subsequence
128 of :func:`permutations` after filtering entries where the elements are not
129 in sorted order (according to their position in the input pool)::
130
131 def combinations(iterable, r):
132 pool = tuple(iterable)
133 n = len(pool)
134 for indices in permutations(range(n), r):
135 if sorted(indices) == list(indices):
136 yield tuple(pool[i] for i in indices)
137
Christian Heimes836baa52008-02-26 08:18:30 +0000138
Georg Brandl116aa622007-08-15 14:28:22 +0000139.. function:: count([n])
140
141 Make an iterator that returns consecutive integers starting with *n*. If not
Raymond Hettingera6c60372008-03-13 01:26:19 +0000142 specified *n* defaults to zero. Often used as an argument to :func:`map` to
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000143 generate consecutive data points. Also, used with :func:`zip` to add sequence
Georg Brandl9afde1c2007-11-01 20:32:30 +0000144 numbers. Equivalent to::
Georg Brandl116aa622007-08-15 14:28:22 +0000145
146 def count(n=0):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000147 # count(10) --> 10 11 12 13 14 ...
Georg Brandl116aa622007-08-15 14:28:22 +0000148 while True:
149 yield n
150 n += 1
151
Georg Brandl116aa622007-08-15 14:28:22 +0000152
153.. function:: cycle(iterable)
154
155 Make an iterator returning elements from the iterable and saving a copy of each.
156 When the iterable is exhausted, return elements from the saved copy. Repeats
157 indefinitely. Equivalent to::
158
159 def cycle(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000160 # cycle('ABCD') --> A B C D A B C D A B C D ...
Georg Brandl116aa622007-08-15 14:28:22 +0000161 saved = []
162 for element in iterable:
163 yield element
164 saved.append(element)
165 while saved:
166 for element in saved:
167 yield element
168
169 Note, this member of the toolkit may require significant auxiliary storage
170 (depending on the length of the iterable).
171
172
173.. function:: dropwhile(predicate, iterable)
174
175 Make an iterator that drops elements from the iterable as long as the predicate
176 is true; afterwards, returns every element. Note, the iterator does not produce
177 *any* output until the predicate first becomes false, so it may have a lengthy
178 start-up time. Equivalent to::
179
180 def dropwhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000181 # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
Georg Brandl116aa622007-08-15 14:28:22 +0000182 iterable = iter(iterable)
183 for x in iterable:
184 if not predicate(x):
185 yield x
186 break
187 for x in iterable:
188 yield x
189
190
191.. function:: groupby(iterable[, key])
192
193 Make an iterator that returns consecutive keys and groups from the *iterable*.
194 The *key* is a function computing a key value for each element. If not
195 specified or is ``None``, *key* defaults to an identity function and returns
196 the element unchanged. Generally, the iterable needs to already be sorted on
197 the same key function.
198
199 The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
200 generates a break or new group every time the value of the key function changes
201 (which is why it is usually necessary to have sorted the data using the same key
202 function). That behavior differs from SQL's GROUP BY which aggregates common
203 elements regardless of their input order.
204
205 The returned group is itself an iterator that shares the underlying iterable
206 with :func:`groupby`. Because the source is shared, when the :func:`groupby`
207 object is advanced, the previous group is no longer visible. So, if that data
208 is needed later, it should be stored as a list::
209
210 groups = []
211 uniquekeys = []
212 data = sorted(data, key=keyfunc)
213 for k, g in groupby(data, keyfunc):
214 groups.append(list(g)) # Store group iterator as a list
215 uniquekeys.append(k)
216
217 :func:`groupby` is equivalent to::
218
219 class groupby(object):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000220 # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
221 # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
Georg Brandl116aa622007-08-15 14:28:22 +0000222 def __init__(self, iterable, key=None):
223 if key is None:
224 key = lambda x: x
225 self.keyfunc = key
226 self.it = iter(iterable)
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000227 self.tgtkey = self.currkey = self.currvalue = object()
Georg Brandl116aa622007-08-15 14:28:22 +0000228 def __iter__(self):
229 return self
230 def __next__(self):
231 while self.currkey == self.tgtkey:
232 self.currvalue = next(self.it) # Exit on StopIteration
233 self.currkey = self.keyfunc(self.currvalue)
234 self.tgtkey = self.currkey
235 return (self.currkey, self._grouper(self.tgtkey))
236 def _grouper(self, tgtkey):
237 while self.currkey == tgtkey:
238 yield self.currvalue
239 self.currvalue = next(self.it) # Exit on StopIteration
240 self.currkey = self.keyfunc(self.currvalue)
241
Georg Brandl116aa622007-08-15 14:28:22 +0000242
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000243.. function:: filterfalse(predicate, iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000244
245 Make an iterator that filters elements from iterable returning only those for
246 which the predicate is ``False``. If *predicate* is ``None``, return the items
247 that are false. Equivalent to::
248
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000249 def filterfalse(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000250 # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
Georg Brandl116aa622007-08-15 14:28:22 +0000251 if predicate is None:
252 predicate = bool
253 for x in iterable:
254 if not predicate(x):
255 yield x
256
257
Georg Brandl116aa622007-08-15 14:28:22 +0000258.. function:: islice(iterable, [start,] stop [, step])
259
260 Make an iterator that returns selected elements from the iterable. If *start* is
261 non-zero, then elements from the iterable are skipped until start is reached.
262 Afterward, elements are returned consecutively unless *step* is set higher than
263 one which results in items being skipped. If *stop* is ``None``, then iteration
264 continues until the iterator is exhausted, if at all; otherwise, it stops at the
265 specified position. Unlike regular slicing, :func:`islice` does not support
266 negative values for *start*, *stop*, or *step*. Can be used to extract related
267 fields from data where the internal structure has been flattened (for example, a
268 multi-line report may list a name field on every third line). Equivalent to::
269
270 def islice(iterable, *args):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000271 # islice('ABCDEFG', 2) --> A B
272 # islice('ABCDEFG', 2, 4) --> C D
273 # islice('ABCDEFG', 2, None) --> C D E F G
274 # islice('ABCDEFG', 0, None, 2) --> A C E G
Georg Brandl116aa622007-08-15 14:28:22 +0000275 s = slice(*args)
Georg Brandlf6945182008-02-01 11:56:49 +0000276 it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)
Georg Brandl116aa622007-08-15 14:28:22 +0000277 nexti = next(it)
278 for i, element in enumerate(iterable):
279 if i == nexti:
280 yield element
281 nexti = next(it)
282
283 If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
284 then the step defaults to one.
285
Georg Brandl116aa622007-08-15 14:28:22 +0000286
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000287.. function:: zip_longest(*iterables[, fillvalue])
Georg Brandl116aa622007-08-15 14:28:22 +0000288
289 Make an iterator that aggregates elements from each of the iterables. If the
290 iterables are of uneven length, missing values are filled-in with *fillvalue*.
291 Iteration continues until the longest iterable is exhausted. Equivalent to::
292
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000293 def zip_longest(*args, fillvalue=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000294 # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
Georg Brandl116aa622007-08-15 14:28:22 +0000295 def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
296 yield counter() # yields the fillvalue, or raises IndexError
297 fillers = repeat(fillvalue)
298 iters = [chain(it, sentinel(), fillers) for it in args]
299 try:
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000300 for tup in zip(*iters):
Georg Brandl116aa622007-08-15 14:28:22 +0000301 yield tup
302 except IndexError:
303 pass
304
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000305 If one of the iterables is potentially infinite, then the :func:`zip_longest`
Georg Brandl116aa622007-08-15 14:28:22 +0000306 function should be wrapped with something that limits the number of calls (for
307 example :func:`islice` or :func:`takewhile`).
308
Georg Brandl116aa622007-08-15 14:28:22 +0000309
Christian Heimes70e7ea22008-02-28 20:02:27 +0000310.. function:: permutations(iterable[, r])
311
312 Return successive *r* length permutations of elements in the *iterable*.
313
314 If *r* is not specified or is ``None``, then *r* defaults to the length
315 of the *iterable* and all possible full-length permutations
316 are generated.
317
318 Permutations are emitted in lexicographic sort order. So, if the
319 input *iterable* is sorted, the permutation tuples will be produced
320 in sorted order.
321
322 Elements are treated as unique based on their position, not on their
323 value. So if the input elements are unique, there will be no repeat
324 values in each permutation.
325
Christian Heimesb558a2e2008-03-02 22:46:37 +0000326 Equivalent to::
327
328 def permutations(iterable, r=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000329 # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
330 # permutations(range(3)) --> 012 021 102 120 201 210
Christian Heimesb558a2e2008-03-02 22:46:37 +0000331 pool = tuple(iterable)
332 n = len(pool)
333 r = n if r is None else r
334 indices = range(n)
Christian Heimesfe337bf2008-03-23 21:54:12 +0000335 cycles = range(n, n-r, -1)
Christian Heimesb558a2e2008-03-02 22:46:37 +0000336 yield tuple(pool[i] for i in indices[:r])
337 while n:
338 for i in reversed(range(r)):
339 cycles[i] -= 1
340 if cycles[i] == 0:
341 indices[i:] = indices[i+1:] + indices[i:i+1]
342 cycles[i] = n - i
343 else:
344 j = cycles[i]
345 indices[i], indices[-j] = indices[-j], indices[i]
346 yield tuple(pool[i] for i in indices[:r])
347 break
348 else:
349 return
Christian Heimes70e7ea22008-02-28 20:02:27 +0000350
Christian Heimes78644762008-03-04 23:39:23 +0000351 The code for :func:`permutations` can be also expressed as a subsequence of
352 :func:`product`, filtered to exclude entries with repeated elements (those
353 from the same position in the input pool)::
354
355 def permutations(iterable, r=None):
356 pool = tuple(iterable)
357 n = len(pool)
358 r = n if r is None else r
359 for indices in product(range(n), repeat=r):
360 if len(set(indices)) == r:
361 yield tuple(pool[i] for i in indices)
362
Christian Heimes70e7ea22008-02-28 20:02:27 +0000363
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000364.. function:: product(*iterables[, repeat])
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000365
366 Cartesian product of input iterables.
367
368 Equivalent to nested for-loops in a generator expression. For example,
369 ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.
370
Christian Heimesdae2a892008-04-19 00:55:37 +0000371 The nested loops cycle like an odometer with the rightmost element advancing
372 on every iteration. This pattern creates a lexicographic ordering so that if
373 the input's iterables are sorted, the product tuples are emitted in sorted
374 order.
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000375
Christian Heimes9e7f1d22008-02-28 12:27:11 +0000376 To compute the product of an iterable with itself, specify the number of
377 repetitions with the optional *repeat* keyword argument. For example,
378 ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.
379
Christian Heimes78644762008-03-04 23:39:23 +0000380 This function is equivalent to the following code, except that the
381 actual implementation does not build up intermediate results in memory::
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000382
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000383 def product(*args, repeat=1):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000384 # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
385 # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000386 pools = map(tuple, args) * repeat
Christian Heimes78644762008-03-04 23:39:23 +0000387 result = [[]]
388 for pool in pools:
389 result = [x+[y] for x in result for y in pool]
390 for prod in result:
391 yield tuple(prod)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000392
393
Georg Brandl116aa622007-08-15 14:28:22 +0000394.. function:: repeat(object[, times])
395
396 Make an iterator that returns *object* over and over again. Runs indefinitely
Raymond Hettingera6c60372008-03-13 01:26:19 +0000397 unless the *times* argument is specified. Used as argument to :func:`map` for
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000398 invariant parameters to the called function. Also used with :func:`zip` to
Georg Brandl116aa622007-08-15 14:28:22 +0000399 create an invariant part of a tuple record. Equivalent to::
400
401 def repeat(object, times=None):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000402 # repeat(10, 3) --> 10 10 10
Georg Brandl116aa622007-08-15 14:28:22 +0000403 if times is None:
404 while True:
405 yield object
406 else:
407 for i in range(times):
408 yield object
409
410
411.. function:: starmap(function, iterable)
412
Christian Heimes679db4a2008-01-18 09:56:22 +0000413 Make an iterator that computes the function using arguments obtained from
Raymond Hettingera6c60372008-03-13 01:26:19 +0000414 the iterable. Used instead of :func:`map` when argument parameters are already
Georg Brandl116aa622007-08-15 14:28:22 +0000415 grouped in tuples from a single iterable (the data has been "pre-zipped"). The
Raymond Hettingera6c60372008-03-13 01:26:19 +0000416 difference between :func:`map` and :func:`starmap` parallels the distinction
Georg Brandl116aa622007-08-15 14:28:22 +0000417 between ``function(a,b)`` and ``function(*c)``. Equivalent to::
418
419 def starmap(function, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000420 # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
Christian Heimes679db4a2008-01-18 09:56:22 +0000421 for args in iterable:
422 yield function(*args)
423
Georg Brandl116aa622007-08-15 14:28:22 +0000424
425.. function:: takewhile(predicate, iterable)
426
427 Make an iterator that returns elements from the iterable as long as the
428 predicate is true. Equivalent to::
429
430 def takewhile(predicate, iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000431 # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
Georg Brandl116aa622007-08-15 14:28:22 +0000432 for x in iterable:
433 if predicate(x):
434 yield x
435 else:
436 break
437
438
439.. function:: tee(iterable[, n=2])
440
441 Return *n* independent iterators from a single iterable. The case where ``n==2``
442 is equivalent to::
443
444 def tee(iterable):
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000445 def gen(next, data={}):
Georg Brandl116aa622007-08-15 14:28:22 +0000446 for i in count():
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000447 if i in data:
448 yield data.pop(i)
Georg Brandl116aa622007-08-15 14:28:22 +0000449 else:
Christian Heimes5b5e81c2007-12-31 16:14:33 +0000450 data[i] = next()
451 yield data[i]
Georg Brandl116aa622007-08-15 14:28:22 +0000452 it = iter(iterable)
453 return (gen(it.__next__), gen(it.__next__))
454
455 Note, once :func:`tee` has made a split, the original *iterable* should not be
456 used anywhere else; otherwise, the *iterable* could get advanced without the tee
457 objects being informed.
458
459 Note, this member of the toolkit may require significant auxiliary storage
460 (depending on how much temporary data needs to be stored). In general, if one
461 iterator is going to use most or all of the data before the other iterator, it
462 is faster to use :func:`list` instead of :func:`tee`.
463
Georg Brandl116aa622007-08-15 14:28:22 +0000464
465.. _itertools-example:
466
467Examples
468--------
469
470The following examples show common uses for each tool and demonstrate ways they
Christian Heimesfe337bf2008-03-23 21:54:12 +0000471can be combined.
472
473.. doctest::
Georg Brandl116aa622007-08-15 14:28:22 +0000474
Georg Brandl116aa622007-08-15 14:28:22 +0000475 # Show a dictionary sorted and grouped by value
476 >>> from operator import itemgetter
477 >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
Fred Drake2e748782007-09-04 17:33:11 +0000478 >>> di = sorted(d.items(), key=itemgetter(1))
Georg Brandl116aa622007-08-15 14:28:22 +0000479 >>> for k, g in groupby(di, key=itemgetter(1)):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000480 ... print(k, map(itemgetter(0), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000481 ...
482 1 ['a', 'c', 'e']
483 2 ['b', 'd', 'f']
484 3 ['g']
485
486 # Find runs of consecutive numbers using groupby. The key to the solution
487 # is differencing with a range so that consecutive numbers all appear in
488 # same group.
489 >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
490 >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
Georg Brandl6911e3c2007-09-04 07:15:32 +0000491 ... print(map(operator.itemgetter(1), g))
Georg Brandl116aa622007-08-15 14:28:22 +0000492 ...
493 [1]
494 [4, 5, 6]
495 [10]
496 [15, 16, 17, 18]
497 [22]
498 [25, 26, 27, 28]
499
500
501
502.. _itertools-recipes:
503
504Recipes
505-------
506
507This section shows recipes for creating an extended toolset using the existing
508itertools as building blocks.
509
510The extended tools offer the same high performance as the underlying toolset.
511The superior memory performance is kept by processing elements one at a time
512rather than bringing the whole iterable into memory all at once. Code volume is
513kept small by linking the tools together in a functional style which helps
514eliminate temporary variables. High speed is retained by preferring
Georg Brandl9afde1c2007-11-01 20:32:30 +0000515"vectorized" building blocks over the use of for-loops and :term:`generator`\s
Christian Heimesfe337bf2008-03-23 21:54:12 +0000516which incur interpreter overhead.
517
518.. testcode::
Georg Brandl116aa622007-08-15 14:28:22 +0000519
520 def take(n, seq):
521 return list(islice(seq, n))
522
523 def enumerate(iterable):
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000524 return zip(count(), iterable)
Georg Brandl116aa622007-08-15 14:28:22 +0000525
526 def tabulate(function):
527 "Return function(0), function(1), ..."
Raymond Hettingera6c60372008-03-13 01:26:19 +0000528 return map(function, count())
Georg Brandl116aa622007-08-15 14:28:22 +0000529
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000530 def items(mapping):
531 return zip(mapping.keys(), mapping.values())
532
Georg Brandl116aa622007-08-15 14:28:22 +0000533 def nth(iterable, n):
534 "Returns the nth item or raise StopIteration"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000535 return next(islice(iterable, n, None))
Georg Brandl116aa622007-08-15 14:28:22 +0000536
537 def all(seq, pred=None):
538 "Returns True if pred(x) is true for every element in the iterable"
Raymond Hettingerb0002d22008-03-13 01:41:43 +0000539 for elem in filterfalse(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000540 return False
541 return True
542
543 def any(seq, pred=None):
544 "Returns True if pred(x) is true for at least one element in the iterable"
Raymond Hettinger17301e92008-03-13 00:19:26 +0000545 for elem in filter(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000546 return True
547 return False
548
549 def no(seq, pred=None):
550 "Returns True if pred(x) is false for every element in the iterable"
Raymond Hettinger17301e92008-03-13 00:19:26 +0000551 for elem in filter(pred, seq):
Georg Brandl116aa622007-08-15 14:28:22 +0000552 return False
553 return True
554
555 def quantify(seq, pred=None):
556 "Count how many times the predicate is true in the sequence"
Raymond Hettingera6c60372008-03-13 01:26:19 +0000557 return sum(map(pred, seq))
Georg Brandl116aa622007-08-15 14:28:22 +0000558
559 def padnone(seq):
560 """Returns the sequence elements and then returns None indefinitely.
561
562 Useful for emulating the behavior of the built-in map() function.
563 """
564 return chain(seq, repeat(None))
565
566 def ncycles(seq, n):
567 "Returns the sequence elements n times"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000568 return chain.from_iterable(repeat(seq, n))
Georg Brandl116aa622007-08-15 14:28:22 +0000569
570 def dotproduct(vec1, vec2):
Raymond Hettingera6c60372008-03-13 01:26:19 +0000571 return sum(map(operator.mul, vec1, vec2))
Georg Brandl116aa622007-08-15 14:28:22 +0000572
573 def flatten(listOfLists):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000574 return list(chain.from_iterable(listOfLists))
Georg Brandl116aa622007-08-15 14:28:22 +0000575
576 def repeatfunc(func, times=None, *args):
577 """Repeat calls to func with specified arguments.
578
579 Example: repeatfunc(random.random)
580 """
581 if times is None:
582 return starmap(func, repeat(args))
Christian Heimes70e7ea22008-02-28 20:02:27 +0000583 return starmap(func, repeat(args, times))
Georg Brandl116aa622007-08-15 14:28:22 +0000584
585 def pairwise(iterable):
586 "s -> (s0,s1), (s1,s2), (s2, s3), ..."
587 a, b = tee(iterable)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000588 for elem in b:
589 break
590 return zip(a, b)
Georg Brandl116aa622007-08-15 14:28:22 +0000591
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000592 def grouper(n, iterable, fillvalue=None):
Georg Brandl116aa622007-08-15 14:28:22 +0000593 "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000594 args = [iter(iterable)] * n
Raymond Hettingera137e1f2008-03-13 02:43:14 +0000595 return zip_longest(*args, fillvalue=fillvalue)
Georg Brandl116aa622007-08-15 14:28:22 +0000596
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000597 def roundrobin(*iterables):
598 "roundrobin('abc', 'd', 'ef') --> 'a', 'd', 'e', 'b', 'f', 'c'"
Christian Heimes70e7ea22008-02-28 20:02:27 +0000599 # Recipe credited to George Sakkis
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000600 pending = len(iterables)
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000601 nexts = cycle(iter(it).__next__ for it in iterables)
Christian Heimes7b3ce6a2008-01-31 14:31:45 +0000602 while pending:
603 try:
604 for next in nexts:
605 yield next()
606 except StopIteration:
607 pending -= 1
608 nexts = cycle(islice(nexts, pending))
Georg Brandl116aa622007-08-15 14:28:22 +0000609
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000610 def powerset(iterable):
Christian Heimes70e7ea22008-02-28 20:02:27 +0000611 "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])"
612 # Recipe credited to Eric Raymond
613 pairs = [(2**i, x) for i, x in enumerate(iterable)]
614 for n in xrange(2**len(pairs)):
615 yield set(x for m, x in pairs if m&n)
Christian Heimes90c3d9b2008-02-23 13:18:03 +0000616
Raymond Hettingerdd1150e2008-03-13 02:39:40 +0000617 def compress(data, selectors):
618 "compress('abcdef', [1,0,1,0,1,1]) --> a c e f"
619 for d, s in zip(data, selectors):
620 if s:
621 yield d
622