| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 1 |  | 
|  | 2 | :mod:`itertools` --- Functions creating iterators for efficient looping | 
|  | 3 | ======================================================================= | 
|  | 4 |  | 
|  | 5 | .. module:: itertools | 
|  | 6 | :synopsis: Functions creating iterators for efficient looping. | 
|  | 7 | .. moduleauthor:: Raymond Hettinger <python@rcn.com> | 
|  | 8 | .. sectionauthor:: Raymond Hettinger <python@rcn.com> | 
|  | 9 |  | 
|  | 10 |  | 
| Georg Brandl | e8f1b00 | 2008-03-22 22:04:10 +0000 | [diff] [blame] | 11 | .. testsetup:: | 
|  | 12 |  | 
|  | 13 | from itertools import * | 
|  | 14 |  | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 15 | .. versionadded:: 2.3 | 
|  | 16 |  | 
| Georg Brandl | e7a0990 | 2007-10-21 12:10:28 +0000 | [diff] [blame] | 17 | This module implements a number of :term:`iterator` building blocks inspired by | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 18 | constructs from the Haskell and SML programming languages.  Each has been recast | 
|  | 19 | in a form suitable for Python. | 
|  | 20 |  | 
|  | 21 | The module standardizes a core set of fast, memory efficient tools that are | 
|  | 22 | useful by themselves or in combination.  Standardization helps avoid the | 
|  | 23 | readability and reliability problems which arise when many different individuals | 
|  | 24 | create their own slightly varying implementations, each with their own quirks | 
|  | 25 | and naming conventions. | 
|  | 26 |  | 
|  | 27 | The tools are designed to combine readily with one another.  This makes it easy | 
|  | 28 | to construct more specialized tools succinctly and efficiently in pure Python. | 
|  | 29 |  | 
|  | 30 | For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a | 
|  | 31 | sequence ``f(0), f(1), ...``.  This toolbox provides :func:`imap` and | 
|  | 32 | :func:`count` which can be combined to form ``imap(f, count())`` and produce an | 
|  | 33 | equivalent result. | 
|  | 34 |  | 
|  | 35 | Likewise, the functional tools are designed to work well with the high-speed | 
|  | 36 | functions provided by the :mod:`operator` module. | 
|  | 37 |  | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 38 | Whether cast in pure python form or compiled code, tools that use iterators are | 
| Raymond Hettinger | f1f46f0 | 2008-07-19 23:58:47 +0000 | [diff] [blame] | 39 | more memory efficient (and often faster) than their list based counterparts. Adopting | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 40 | the principles of just-in-time manufacturing, they create data when and where | 
|  | 41 | needed instead of consuming memory with the computer equivalent of "inventory". | 
|  | 42 |  | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 43 |  | 
|  | 44 | .. seealso:: | 
|  | 45 |  | 
|  | 46 | The Standard ML Basis Library, `The Standard ML Basis Library | 
|  | 47 | <http://www.standardml.org/Basis/>`_. | 
|  | 48 |  | 
|  | 49 | Haskell, A Purely Functional Language, `Definition of Haskell and the Standard | 
|  | 50 | Libraries <http://www.haskell.org/definition/>`_. | 
|  | 51 |  | 
|  | 52 |  | 
|  | 53 | .. _itertools-functions: | 
|  | 54 |  | 
|  | 55 | Itertool functions | 
|  | 56 | ------------------ | 
|  | 57 |  | 
|  | 58 | The following module functions all construct and return iterators. Some provide | 
|  | 59 | streams of infinite length, so they should only be accessed by functions or | 
|  | 60 | loops that truncate the stream. | 
|  | 61 |  | 
|  | 62 |  | 
|  | 63 | .. function:: chain(*iterables) | 
|  | 64 |  | 
|  | 65 | Make an iterator that returns elements from the first iterable until it is | 
|  | 66 | exhausted, then proceeds to the next iterable, until all of the iterables are | 
|  | 67 | exhausted.  Used for treating consecutive sequences as a single sequence. | 
|  | 68 | Equivalent to:: | 
|  | 69 |  | 
|  | 70 | def chain(*iterables): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 71 | # chain('ABC', 'DEF') --> A B C D E F | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 72 | for it in iterables: | 
|  | 73 | for element in it: | 
|  | 74 | yield element | 
|  | 75 |  | 
|  | 76 |  | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 77 | .. function:: itertools.chain.from_iterable(iterable) | 
|  | 78 |  | 
| Georg Brandl | c62ef8b | 2009-01-03 20:55:06 +0000 | [diff] [blame] | 79 | Alternate constructor for :func:`chain`.  Gets chained inputs from a | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 80 | single iterable argument that is evaluated lazily.  Equivalent to:: | 
|  | 81 |  | 
|  | 82 | @classmethod | 
|  | 83 | def from_iterable(iterables): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 84 | # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 85 | for it in iterables: | 
|  | 86 | for element in it: | 
|  | 87 | yield element | 
|  | 88 |  | 
|  | 89 | .. versionadded:: 2.6 | 
|  | 90 |  | 
| Raymond Hettinger | d553d85 | 2008-03-04 04:17:08 +0000 | [diff] [blame] | 91 |  | 
| Raymond Hettinger | 3fa41d5 | 2008-02-26 02:46:54 +0000 | [diff] [blame] | 92 | .. function:: combinations(iterable, r) | 
|  | 93 |  | 
| Raymond Hettinger | 5eaffc4 | 2008-04-17 10:48:31 +0000 | [diff] [blame] | 94 | Return *r* length subsequences of elements from the input *iterable*. | 
| Raymond Hettinger | 3fa41d5 | 2008-02-26 02:46:54 +0000 | [diff] [blame] | 95 |  | 
| Georg Brandl | c62ef8b | 2009-01-03 20:55:06 +0000 | [diff] [blame] | 96 | Combinations are emitted in lexicographic sort order.  So, if the | 
| Raymond Hettinger | 3fa41d5 | 2008-02-26 02:46:54 +0000 | [diff] [blame] | 97 | input *iterable* is sorted, the combination tuples will be produced | 
| Georg Brandl | c62ef8b | 2009-01-03 20:55:06 +0000 | [diff] [blame] | 98 | in sorted order. | 
| Raymond Hettinger | 3fa41d5 | 2008-02-26 02:46:54 +0000 | [diff] [blame] | 99 |  | 
|  | 100 | Elements are treated as unique based on their position, not on their | 
|  | 101 | value.  So if the input elements are unique, there will be no repeat | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 102 | values in each combination. | 
| Raymond Hettinger | 3fa41d5 | 2008-02-26 02:46:54 +0000 | [diff] [blame] | 103 |  | 
| Raymond Hettinger | 3fa41d5 | 2008-02-26 02:46:54 +0000 | [diff] [blame] | 104 | Equivalent to:: | 
|  | 105 |  | 
|  | 106 | def combinations(iterable, r): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 107 | # combinations('ABCD', 2) --> AB AC AD BC BD CD | 
|  | 108 | # combinations(range(4), 3) --> 012 013 023 123 | 
| Raymond Hettinger | 3fa41d5 | 2008-02-26 02:46:54 +0000 | [diff] [blame] | 109 | pool = tuple(iterable) | 
| Raymond Hettinger | 93e804d | 2008-02-26 23:40:50 +0000 | [diff] [blame] | 110 | n = len(pool) | 
| Raymond Hettinger | 5b913e3 | 2009-01-08 06:39:04 +0000 | [diff] [blame] | 111 | if r > n: | 
|  | 112 | return | 
| Raymond Hettinger | f287f17 | 2008-03-02 10:59:31 +0000 | [diff] [blame] | 113 | indices = range(r) | 
|  | 114 | yield tuple(pool[i] for i in indices) | 
| Raymond Hettinger | 93e804d | 2008-02-26 23:40:50 +0000 | [diff] [blame] | 115 | while 1: | 
|  | 116 | for i in reversed(range(r)): | 
| Raymond Hettinger | f287f17 | 2008-03-02 10:59:31 +0000 | [diff] [blame] | 117 | if indices[i] != i + n - r: | 
| Raymond Hettinger | c105289 | 2008-02-27 01:44:34 +0000 | [diff] [blame] | 118 | break | 
| Raymond Hettinger | 93e804d | 2008-02-26 23:40:50 +0000 | [diff] [blame] | 119 | else: | 
|  | 120 | return | 
| Raymond Hettinger | f287f17 | 2008-03-02 10:59:31 +0000 | [diff] [blame] | 121 | indices[i] += 1 | 
| Raymond Hettinger | c105289 | 2008-02-27 01:44:34 +0000 | [diff] [blame] | 122 | for j in range(i+1, r): | 
| Raymond Hettinger | f287f17 | 2008-03-02 10:59:31 +0000 | [diff] [blame] | 123 | indices[j] = indices[j-1] + 1 | 
|  | 124 | yield tuple(pool[i] for i in indices) | 
| Raymond Hettinger | 3fa41d5 | 2008-02-26 02:46:54 +0000 | [diff] [blame] | 125 |  | 
| Raymond Hettinger | d553d85 | 2008-03-04 04:17:08 +0000 | [diff] [blame] | 126 | The code for :func:`combinations` can be also expressed as a subsequence | 
|  | 127 | of :func:`permutations` after filtering entries where the elements are not | 
|  | 128 | in sorted order (according to their position in the input pool):: | 
|  | 129 |  | 
|  | 130 | def combinations(iterable, r): | 
|  | 131 | pool = tuple(iterable) | 
|  | 132 | n = len(pool) | 
|  | 133 | for indices in permutations(range(n), r): | 
|  | 134 | if sorted(indices) == list(indices): | 
|  | 135 | yield tuple(pool[i] for i in indices) | 
|  | 136 |  | 
| Raymond Hettinger | 5b913e3 | 2009-01-08 06:39:04 +0000 | [diff] [blame] | 137 | The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n`` | 
|  | 138 | or zero when ``r > n``. | 
|  | 139 |  | 
| Raymond Hettinger | 3fa41d5 | 2008-02-26 02:46:54 +0000 | [diff] [blame] | 140 | .. versionadded:: 2.6 | 
|  | 141 |  | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 142 | .. function:: count([n]) | 
|  | 143 |  | 
|  | 144 | Make an iterator that returns consecutive integers starting with *n*. If not | 
| Raymond Hettinger | 50e90e2 | 2007-10-04 00:20:27 +0000 | [diff] [blame] | 145 | specified *n* defaults to zero.   Often used as an argument to :func:`imap` to | 
|  | 146 | generate consecutive data points. Also, used with :func:`izip` to add sequence | 
|  | 147 | numbers.  Equivalent to:: | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 148 |  | 
|  | 149 | def count(n=0): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 150 | # count(10) --> 10 11 12 13 14 ... | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 151 | while True: | 
|  | 152 | yield n | 
|  | 153 | n += 1 | 
|  | 154 |  | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 155 |  | 
|  | 156 | .. function:: cycle(iterable) | 
|  | 157 |  | 
|  | 158 | Make an iterator returning elements from the iterable and saving a copy of each. | 
|  | 159 | When the iterable is exhausted, return elements from the saved copy.  Repeats | 
|  | 160 | indefinitely.  Equivalent to:: | 
|  | 161 |  | 
|  | 162 | def cycle(iterable): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 163 | # cycle('ABCD') --> A B C D A B C D A B C D ... | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 164 | saved = [] | 
|  | 165 | for element in iterable: | 
|  | 166 | yield element | 
|  | 167 | saved.append(element) | 
|  | 168 | while saved: | 
|  | 169 | for element in saved: | 
|  | 170 | yield element | 
|  | 171 |  | 
|  | 172 | Note, this member of the toolkit may require significant auxiliary storage | 
|  | 173 | (depending on the length of the iterable). | 
|  | 174 |  | 
|  | 175 |  | 
|  | 176 | .. function:: dropwhile(predicate, iterable) | 
|  | 177 |  | 
|  | 178 | Make an iterator that drops elements from the iterable as long as the predicate | 
|  | 179 | is true; afterwards, returns every element.  Note, the iterator does not produce | 
|  | 180 | *any* output until the predicate first becomes false, so it may have a lengthy | 
|  | 181 | start-up time.  Equivalent to:: | 
|  | 182 |  | 
|  | 183 | def dropwhile(predicate, iterable): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 184 | # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1 | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 185 | iterable = iter(iterable) | 
|  | 186 | for x in iterable: | 
|  | 187 | if not predicate(x): | 
|  | 188 | yield x | 
|  | 189 | break | 
|  | 190 | for x in iterable: | 
|  | 191 | yield x | 
|  | 192 |  | 
|  | 193 |  | 
|  | 194 | .. function:: groupby(iterable[, key]) | 
|  | 195 |  | 
|  | 196 | Make an iterator that returns consecutive keys and groups from the *iterable*. | 
|  | 197 | The *key* is a function computing a key value for each element.  If not | 
|  | 198 | specified or is ``None``, *key* defaults to an identity function and returns | 
|  | 199 | the element unchanged.  Generally, the iterable needs to already be sorted on | 
|  | 200 | the same key function. | 
|  | 201 |  | 
|  | 202 | The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix.  It | 
|  | 203 | generates a break or new group every time the value of the key function changes | 
|  | 204 | (which is why it is usually necessary to have sorted the data using the same key | 
|  | 205 | function).  That behavior differs from SQL's GROUP BY which aggregates common | 
|  | 206 | elements regardless of their input order. | 
|  | 207 |  | 
|  | 208 | The returned group is itself an iterator that shares the underlying iterable | 
|  | 209 | with :func:`groupby`.  Because the source is shared, when the :func:`groupby` | 
|  | 210 | object is advanced, the previous group is no longer visible.  So, if that data | 
|  | 211 | is needed later, it should be stored as a list:: | 
|  | 212 |  | 
|  | 213 | groups = [] | 
|  | 214 | uniquekeys = [] | 
|  | 215 | data = sorted(data, key=keyfunc) | 
|  | 216 | for k, g in groupby(data, keyfunc): | 
|  | 217 | groups.append(list(g))      # Store group iterator as a list | 
|  | 218 | uniquekeys.append(k) | 
|  | 219 |  | 
|  | 220 | :func:`groupby` is equivalent to:: | 
|  | 221 |  | 
|  | 222 | class groupby(object): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 223 | # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B | 
|  | 224 | # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 225 | def __init__(self, iterable, key=None): | 
|  | 226 | if key is None: | 
|  | 227 | key = lambda x: x | 
|  | 228 | self.keyfunc = key | 
|  | 229 | self.it = iter(iterable) | 
| Raymond Hettinger | 81a885a | 2007-12-29 22:16:24 +0000 | [diff] [blame] | 230 | self.tgtkey = self.currkey = self.currvalue = object() | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 231 | def __iter__(self): | 
|  | 232 | return self | 
|  | 233 | def next(self): | 
|  | 234 | while self.currkey == self.tgtkey: | 
|  | 235 | self.currvalue = self.it.next() # Exit on StopIteration | 
|  | 236 | self.currkey = self.keyfunc(self.currvalue) | 
|  | 237 | self.tgtkey = self.currkey | 
|  | 238 | return (self.currkey, self._grouper(self.tgtkey)) | 
|  | 239 | def _grouper(self, tgtkey): | 
|  | 240 | while self.currkey == tgtkey: | 
|  | 241 | yield self.currvalue | 
|  | 242 | self.currvalue = self.it.next() # Exit on StopIteration | 
|  | 243 | self.currkey = self.keyfunc(self.currvalue) | 
|  | 244 |  | 
|  | 245 | .. versionadded:: 2.4 | 
|  | 246 |  | 
|  | 247 |  | 
|  | 248 | .. function:: ifilter(predicate, iterable) | 
|  | 249 |  | 
|  | 250 | Make an iterator that filters elements from iterable returning only those for | 
|  | 251 | which the predicate is ``True``. If *predicate* is ``None``, return the items | 
|  | 252 | that are true. Equivalent to:: | 
|  | 253 |  | 
|  | 254 | def ifilter(predicate, iterable): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 255 | # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9 | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 256 | if predicate is None: | 
|  | 257 | predicate = bool | 
|  | 258 | for x in iterable: | 
|  | 259 | if predicate(x): | 
|  | 260 | yield x | 
|  | 261 |  | 
|  | 262 |  | 
|  | 263 | .. function:: ifilterfalse(predicate, iterable) | 
|  | 264 |  | 
|  | 265 | Make an iterator that filters elements from iterable returning only those for | 
|  | 266 | which the predicate is ``False``. If *predicate* is ``None``, return the items | 
|  | 267 | that are false. Equivalent to:: | 
|  | 268 |  | 
|  | 269 | def ifilterfalse(predicate, iterable): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 270 | # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8 | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 271 | if predicate is None: | 
|  | 272 | predicate = bool | 
|  | 273 | for x in iterable: | 
|  | 274 | if not predicate(x): | 
|  | 275 | yield x | 
|  | 276 |  | 
|  | 277 |  | 
|  | 278 | .. function:: imap(function, *iterables) | 
|  | 279 |  | 
|  | 280 | Make an iterator that computes the function using arguments from each of the | 
|  | 281 | iterables.  If *function* is set to ``None``, then :func:`imap` returns the | 
|  | 282 | arguments as a tuple.  Like :func:`map` but stops when the shortest iterable is | 
|  | 283 | exhausted instead of filling in ``None`` for shorter iterables.  The reason for | 
|  | 284 | the difference is that infinite iterator arguments are typically an error for | 
|  | 285 | :func:`map` (because the output is fully evaluated) but represent a common and | 
|  | 286 | useful way of supplying arguments to :func:`imap`. Equivalent to:: | 
|  | 287 |  | 
|  | 288 | def imap(function, *iterables): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 289 | # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000 | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 290 | iterables = map(iter, iterables) | 
|  | 291 | while True: | 
| Raymond Hettinger | 2dec48d | 2008-01-22 22:09:26 +0000 | [diff] [blame] | 292 | args = [it.next() for it in iterables] | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 293 | if function is None: | 
|  | 294 | yield tuple(args) | 
|  | 295 | else: | 
|  | 296 | yield function(*args) | 
|  | 297 |  | 
|  | 298 |  | 
|  | 299 | .. function:: islice(iterable, [start,] stop [, step]) | 
|  | 300 |  | 
|  | 301 | Make an iterator that returns selected elements from the iterable. If *start* is | 
|  | 302 | non-zero, then elements from the iterable are skipped until start is reached. | 
|  | 303 | Afterward, elements are returned consecutively unless *step* is set higher than | 
|  | 304 | one which results in items being skipped.  If *stop* is ``None``, then iteration | 
|  | 305 | continues until the iterator is exhausted, if at all; otherwise, it stops at the | 
|  | 306 | specified position.  Unlike regular slicing, :func:`islice` does not support | 
|  | 307 | negative values for *start*, *stop*, or *step*.  Can be used to extract related | 
|  | 308 | fields from data where the internal structure has been flattened (for example, a | 
|  | 309 | multi-line report may list a name field on every third line).  Equivalent to:: | 
|  | 310 |  | 
|  | 311 | def islice(iterable, *args): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 312 | # islice('ABCDEFG', 2) --> A B | 
|  | 313 | # islice('ABCDEFG', 2, 4) --> C D | 
|  | 314 | # islice('ABCDEFG', 2, None) --> C D E F G | 
|  | 315 | # islice('ABCDEFG', 0, None, 2) --> A C E G | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 316 | s = slice(*args) | 
|  | 317 | it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1)) | 
|  | 318 | nexti = it.next() | 
|  | 319 | for i, element in enumerate(iterable): | 
|  | 320 | if i == nexti: | 
|  | 321 | yield element | 
| Georg Brandl | c62ef8b | 2009-01-03 20:55:06 +0000 | [diff] [blame] | 322 | nexti = it.next() | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 323 |  | 
|  | 324 | If *start* is ``None``, then iteration starts at zero. If *step* is ``None``, | 
|  | 325 | then the step defaults to one. | 
|  | 326 |  | 
|  | 327 | .. versionchanged:: 2.5 | 
|  | 328 | accept ``None`` values for default *start* and *step*. | 
|  | 329 |  | 
|  | 330 |  | 
|  | 331 | .. function:: izip(*iterables) | 
|  | 332 |  | 
|  | 333 | Make an iterator that aggregates elements from each of the iterables. Like | 
|  | 334 | :func:`zip` except that it returns an iterator instead of a list.  Used for | 
|  | 335 | lock-step iteration over several iterables at a time.  Equivalent to:: | 
|  | 336 |  | 
|  | 337 | def izip(*iterables): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 338 | # izip('ABCD', 'xy') --> Ax By | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 339 | iterables = map(iter, iterables) | 
|  | 340 | while iterables: | 
|  | 341 | result = [it.next() for it in iterables] | 
|  | 342 | yield tuple(result) | 
|  | 343 |  | 
|  | 344 | .. versionchanged:: 2.4 | 
|  | 345 | When no iterables are specified, returns a zero length iterator instead of | 
|  | 346 | raising a :exc:`TypeError` exception. | 
|  | 347 |  | 
| Raymond Hettinger | 48c6293 | 2008-01-22 19:51:41 +0000 | [diff] [blame] | 348 | The left-to-right evaluation order of the iterables is guaranteed. This | 
|  | 349 | makes possible an idiom for clustering a data series into n-length groups | 
|  | 350 | using ``izip(*[iter(s)]*n)``. | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 351 |  | 
| Raymond Hettinger | 48c6293 | 2008-01-22 19:51:41 +0000 | [diff] [blame] | 352 | :func:`izip` should only be used with unequal length inputs when you don't | 
|  | 353 | care about trailing, unmatched values from the longer iterables.  If those | 
|  | 354 | values are important, use :func:`izip_longest` instead. | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 355 |  | 
|  | 356 |  | 
|  | 357 | .. function:: izip_longest(*iterables[, fillvalue]) | 
|  | 358 |  | 
|  | 359 | Make an iterator that aggregates elements from each of the iterables. If the | 
|  | 360 | iterables are of uneven length, missing values are filled-in with *fillvalue*. | 
|  | 361 | Iteration continues until the longest iterable is exhausted.  Equivalent to:: | 
|  | 362 |  | 
|  | 363 | def izip_longest(*args, **kwds): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 364 | # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D- | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 365 | fillvalue = kwds.get('fillvalue') | 
|  | 366 | def sentinel(counter = ([fillvalue]*(len(args)-1)).pop): | 
|  | 367 | yield counter()         # yields the fillvalue, or raises IndexError | 
|  | 368 | fillers = repeat(fillvalue) | 
|  | 369 | iters = [chain(it, sentinel(), fillers) for it in args] | 
|  | 370 | try: | 
|  | 371 | for tup in izip(*iters): | 
|  | 372 | yield tup | 
|  | 373 | except IndexError: | 
|  | 374 | pass | 
|  | 375 |  | 
| Benjamin Peterson | 5255cba | 2008-07-25 17:02:11 +0000 | [diff] [blame] | 376 | If one of the iterables is potentially infinite, then the | 
|  | 377 | :func:`izip_longest` function should be wrapped with something that limits | 
|  | 378 | the number of calls (for example :func:`islice` or :func:`takewhile`).  If | 
|  | 379 | not specified, *fillvalue* defaults to ``None``. | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 380 |  | 
|  | 381 | .. versionadded:: 2.6 | 
|  | 382 |  | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 383 | .. function:: permutations(iterable[, r]) | 
|  | 384 |  | 
|  | 385 | Return successive *r* length permutations of elements in the *iterable*. | 
|  | 386 |  | 
|  | 387 | If *r* is not specified or is ``None``, then *r* defaults to the length | 
| Georg Brandl | c62ef8b | 2009-01-03 20:55:06 +0000 | [diff] [blame] | 388 | of the *iterable* and all possible full-length permutations | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 389 | are generated. | 
|  | 390 |  | 
| Georg Brandl | c62ef8b | 2009-01-03 20:55:06 +0000 | [diff] [blame] | 391 | Permutations are emitted in lexicographic sort order.  So, if the | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 392 | input *iterable* is sorted, the permutation tuples will be produced | 
| Georg Brandl | c62ef8b | 2009-01-03 20:55:06 +0000 | [diff] [blame] | 393 | in sorted order. | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 394 |  | 
|  | 395 | Elements are treated as unique based on their position, not on their | 
|  | 396 | value.  So if the input elements are unique, there will be no repeat | 
|  | 397 | values in each permutation. | 
|  | 398 |  | 
| Raymond Hettinger | f287f17 | 2008-03-02 10:59:31 +0000 | [diff] [blame] | 399 | Equivalent to:: | 
|  | 400 |  | 
|  | 401 | def permutations(iterable, r=None): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 402 | # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC | 
|  | 403 | # permutations(range(3)) --> 012 021 102 120 201 210 | 
| Raymond Hettinger | f287f17 | 2008-03-02 10:59:31 +0000 | [diff] [blame] | 404 | pool = tuple(iterable) | 
|  | 405 | n = len(pool) | 
|  | 406 | r = n if r is None else r | 
| Raymond Hettinger | 5b913e3 | 2009-01-08 06:39:04 +0000 | [diff] [blame] | 407 | if r > n: | 
|  | 408 | return | 
| Raymond Hettinger | f287f17 | 2008-03-02 10:59:31 +0000 | [diff] [blame] | 409 | indices = range(n) | 
| Raymond Hettinger | e70bb8d | 2008-03-23 00:55:46 +0000 | [diff] [blame] | 410 | cycles = range(n, n-r, -1) | 
| Raymond Hettinger | f287f17 | 2008-03-02 10:59:31 +0000 | [diff] [blame] | 411 | yield tuple(pool[i] for i in indices[:r]) | 
|  | 412 | while n: | 
|  | 413 | for i in reversed(range(r)): | 
|  | 414 | cycles[i] -= 1 | 
|  | 415 | if cycles[i] == 0: | 
| Raymond Hettinger | 2b7a5c4 | 2008-03-02 11:17:51 +0000 | [diff] [blame] | 416 | indices[i:] = indices[i+1:] + indices[i:i+1] | 
| Raymond Hettinger | f287f17 | 2008-03-02 10:59:31 +0000 | [diff] [blame] | 417 | cycles[i] = n - i | 
|  | 418 | else: | 
|  | 419 | j = cycles[i] | 
|  | 420 | indices[i], indices[-j] = indices[-j], indices[i] | 
|  | 421 | yield tuple(pool[i] for i in indices[:r]) | 
|  | 422 | break | 
|  | 423 | else: | 
|  | 424 | return | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 425 |  | 
| Georg Brandl | c62ef8b | 2009-01-03 20:55:06 +0000 | [diff] [blame] | 426 | The code for :func:`permutations` can be also expressed as a subsequence of | 
| Raymond Hettinger | d553d85 | 2008-03-04 04:17:08 +0000 | [diff] [blame] | 427 | :func:`product`, filtered to exclude entries with repeated elements (those | 
|  | 428 | from the same position in the input pool):: | 
|  | 429 |  | 
|  | 430 | def permutations(iterable, r=None): | 
|  | 431 | pool = tuple(iterable) | 
|  | 432 | n = len(pool) | 
|  | 433 | r = n if r is None else r | 
|  | 434 | for indices in product(range(n), repeat=r): | 
|  | 435 | if len(set(indices)) == r: | 
|  | 436 | yield tuple(pool[i] for i in indices) | 
|  | 437 |  | 
| Raymond Hettinger | 5b913e3 | 2009-01-08 06:39:04 +0000 | [diff] [blame] | 438 | The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n`` | 
|  | 439 | or zero when ``r > n``. | 
|  | 440 |  | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 441 | .. versionadded:: 2.6 | 
|  | 442 |  | 
| Raymond Hettinger | 18750ab | 2008-02-28 09:23:48 +0000 | [diff] [blame] | 443 | .. function:: product(*iterables[, repeat]) | 
| Raymond Hettinger | c5705a8 | 2008-02-22 19:50:06 +0000 | [diff] [blame] | 444 |  | 
|  | 445 | Cartesian product of input iterables. | 
|  | 446 |  | 
|  | 447 | Equivalent to nested for-loops in a generator expression. For example, | 
|  | 448 | ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``. | 
|  | 449 |  | 
| Raymond Hettinger | 5eaffc4 | 2008-04-17 10:48:31 +0000 | [diff] [blame] | 450 | The nested loops cycle like an odometer with the rightmost element advancing | 
| Andrew M. Kuchling | e2e0313 | 2008-04-17 20:44:06 +0000 | [diff] [blame] | 451 | on every iteration.  This pattern creates a lexicographic ordering so that if | 
|  | 452 | the input's iterables are sorted, the product tuples are emitted in sorted | 
| Raymond Hettinger | 5eaffc4 | 2008-04-17 10:48:31 +0000 | [diff] [blame] | 453 | order. | 
| Raymond Hettinger | c5705a8 | 2008-02-22 19:50:06 +0000 | [diff] [blame] | 454 |  | 
| Raymond Hettinger | 18750ab | 2008-02-28 09:23:48 +0000 | [diff] [blame] | 455 | To compute the product of an iterable with itself, specify the number of | 
|  | 456 | repetitions with the optional *repeat* keyword argument.  For example, | 
|  | 457 | ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``. | 
|  | 458 |  | 
| Andrew M. Kuchling | 684868a | 2008-03-04 01:47:38 +0000 | [diff] [blame] | 459 | This function is equivalent to the following code, except that the | 
|  | 460 | actual implementation does not build up intermediate results in memory:: | 
| Raymond Hettinger | c5705a8 | 2008-02-22 19:50:06 +0000 | [diff] [blame] | 461 |  | 
| Raymond Hettinger | 18750ab | 2008-02-28 09:23:48 +0000 | [diff] [blame] | 462 | def product(*args, **kwds): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 463 | # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy | 
|  | 464 | # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111 | 
| Raymond Hettinger | 18750ab | 2008-02-28 09:23:48 +0000 | [diff] [blame] | 465 | pools = map(tuple, args) * kwds.get('repeat', 1) | 
| Raymond Hettinger | d553d85 | 2008-03-04 04:17:08 +0000 | [diff] [blame] | 466 | result = [[]] | 
|  | 467 | for pool in pools: | 
|  | 468 | result = [x+[y] for x in result for y in pool] | 
|  | 469 | for prod in result: | 
|  | 470 | yield tuple(prod) | 
| Raymond Hettinger | c5705a8 | 2008-02-22 19:50:06 +0000 | [diff] [blame] | 471 |  | 
|  | 472 | .. versionadded:: 2.6 | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 473 |  | 
|  | 474 | .. function:: repeat(object[, times]) | 
|  | 475 |  | 
|  | 476 | Make an iterator that returns *object* over and over again. Runs indefinitely | 
|  | 477 | unless the *times* argument is specified. Used as argument to :func:`imap` for | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 478 | invariant function parameters.  Also used with :func:`izip` to create constant | 
|  | 479 | fields in a tuple record.  Equivalent to:: | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 480 |  | 
|  | 481 | def repeat(object, times=None): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 482 | # repeat(10, 3) --> 10 10 10 | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 483 | if times is None: | 
|  | 484 | while True: | 
|  | 485 | yield object | 
|  | 486 | else: | 
|  | 487 | for i in xrange(times): | 
|  | 488 | yield object | 
|  | 489 |  | 
|  | 490 |  | 
|  | 491 | .. function:: starmap(function, iterable) | 
|  | 492 |  | 
| Raymond Hettinger | 4731709 | 2008-01-17 03:02:14 +0000 | [diff] [blame] | 493 | Make an iterator that computes the function using arguments obtained from | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 494 | the iterable.  Used instead of :func:`imap` when argument parameters are already | 
|  | 495 | grouped in tuples from a single iterable (the data has been "pre-zipped").  The | 
|  | 496 | difference between :func:`imap` and :func:`starmap` parallels the distinction | 
|  | 497 | between ``function(a,b)`` and ``function(*c)``. Equivalent to:: | 
|  | 498 |  | 
|  | 499 | def starmap(function, iterable): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 500 | # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000 | 
| Raymond Hettinger | 4731709 | 2008-01-17 03:02:14 +0000 | [diff] [blame] | 501 | for args in iterable: | 
|  | 502 | yield function(*args) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 503 |  | 
| Raymond Hettinger | 4731709 | 2008-01-17 03:02:14 +0000 | [diff] [blame] | 504 | .. versionchanged:: 2.6 | 
|  | 505 | Previously, :func:`starmap` required the function arguments to be tuples. | 
|  | 506 | Now, any iterable is allowed. | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 507 |  | 
|  | 508 | .. function:: takewhile(predicate, iterable) | 
|  | 509 |  | 
|  | 510 | Make an iterator that returns elements from the iterable as long as the | 
|  | 511 | predicate is true.  Equivalent to:: | 
|  | 512 |  | 
|  | 513 | def takewhile(predicate, iterable): | 
| Raymond Hettinger | 040f10e | 2008-03-06 01:15:52 +0000 | [diff] [blame] | 514 | # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4 | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 515 | for x in iterable: | 
|  | 516 | if predicate(x): | 
|  | 517 | yield x | 
|  | 518 | else: | 
|  | 519 | break | 
|  | 520 |  | 
|  | 521 |  | 
|  | 522 | .. function:: tee(iterable[, n=2]) | 
|  | 523 |  | 
|  | 524 | Return *n* independent iterators from a single iterable. The case where ``n==2`` | 
|  | 525 | is equivalent to:: | 
|  | 526 |  | 
|  | 527 | def tee(iterable): | 
| Raymond Hettinger | 5d332bb | 2007-12-29 22:09:34 +0000 | [diff] [blame] | 528 | def gen(next, data={}): | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 529 | for i in count(): | 
| Raymond Hettinger | 5d332bb | 2007-12-29 22:09:34 +0000 | [diff] [blame] | 530 | if i in data: | 
|  | 531 | yield data.pop(i) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 532 | else: | 
| Raymond Hettinger | 5d332bb | 2007-12-29 22:09:34 +0000 | [diff] [blame] | 533 | data[i] = next() | 
|  | 534 | yield data[i] | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 535 | it = iter(iterable) | 
| Raymond Hettinger | 5d332bb | 2007-12-29 22:09:34 +0000 | [diff] [blame] | 536 | return gen(it.next), gen(it.next) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 537 |  | 
|  | 538 | Note, once :func:`tee` has made a split, the original *iterable* should not be | 
|  | 539 | used anywhere else; otherwise, the *iterable* could get advanced without the tee | 
|  | 540 | objects being informed. | 
|  | 541 |  | 
|  | 542 | Note, this member of the toolkit may require significant auxiliary storage | 
|  | 543 | (depending on how much temporary data needs to be stored). In general, if one | 
|  | 544 | iterator is going to use most or all of the data before the other iterator, it | 
|  | 545 | is faster to use :func:`list` instead of :func:`tee`. | 
|  | 546 |  | 
|  | 547 | .. versionadded:: 2.4 | 
|  | 548 |  | 
|  | 549 |  | 
|  | 550 | .. _itertools-example: | 
|  | 551 |  | 
|  | 552 | Examples | 
|  | 553 | -------- | 
|  | 554 |  | 
|  | 555 | The following examples show common uses for each tool and demonstrate ways they | 
| Georg Brandl | e8f1b00 | 2008-03-22 22:04:10 +0000 | [diff] [blame] | 556 | can be combined. | 
|  | 557 |  | 
|  | 558 | .. doctest:: | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 559 |  | 
| Benjamin Peterson | 8ea9999 | 2009-01-01 16:43:12 +0000 | [diff] [blame] | 560 | >>> # Show a dictionary sorted and grouped by value | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 561 | >>> from operator import itemgetter | 
|  | 562 | >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3) | 
|  | 563 | >>> di = sorted(d.iteritems(), key=itemgetter(1)) | 
|  | 564 | >>> for k, g in groupby(di, key=itemgetter(1)): | 
|  | 565 | ...     print k, map(itemgetter(0), g) | 
|  | 566 | ... | 
|  | 567 | 1 ['a', 'c', 'e'] | 
|  | 568 | 2 ['b', 'd', 'f'] | 
|  | 569 | 3 ['g'] | 
|  | 570 |  | 
| Benjamin Peterson | 8ea9999 | 2009-01-01 16:43:12 +0000 | [diff] [blame] | 571 | >>> # Find runs of consecutive numbers using groupby.  The key to the solution | 
|  | 572 | >>> # is differencing with a range so that consecutive numbers all appear in | 
|  | 573 | >>> # same group. | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 574 | >>> data = [ 1,  4,5,6, 10, 15,16,17,18, 22, 25,26,27,28] | 
|  | 575 | >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x): | 
| Georg Brandl | e8f1b00 | 2008-03-22 22:04:10 +0000 | [diff] [blame] | 576 | ...     print map(itemgetter(1), g) | 
| Georg Brandl | c62ef8b | 2009-01-03 20:55:06 +0000 | [diff] [blame] | 577 | ... | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 578 | [1] | 
|  | 579 | [4, 5, 6] | 
|  | 580 | [10] | 
|  | 581 | [15, 16, 17, 18] | 
|  | 582 | [22] | 
|  | 583 | [25, 26, 27, 28] | 
|  | 584 |  | 
|  | 585 |  | 
|  | 586 |  | 
|  | 587 | .. _itertools-recipes: | 
|  | 588 |  | 
|  | 589 | Recipes | 
|  | 590 | ------- | 
|  | 591 |  | 
|  | 592 | This section shows recipes for creating an extended toolset using the existing | 
|  | 593 | itertools as building blocks. | 
|  | 594 |  | 
|  | 595 | The extended tools offer the same high performance as the underlying toolset. | 
|  | 596 | The superior memory performance is kept by processing elements one at a time | 
|  | 597 | rather than bringing the whole iterable into memory all at once. Code volume is | 
|  | 598 | kept small by linking the tools together in a functional style which helps | 
|  | 599 | eliminate temporary variables.  High speed is retained by preferring | 
| Georg Brandl | cf3fb25 | 2007-10-21 10:52:38 +0000 | [diff] [blame] | 600 | "vectorized" building blocks over the use of for-loops and :term:`generator`\s | 
| Georg Brandl | e8f1b00 | 2008-03-22 22:04:10 +0000 | [diff] [blame] | 601 | which incur interpreter overhead. | 
|  | 602 |  | 
|  | 603 | .. testcode:: | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 604 |  | 
| Raymond Hettinger | f1f46f0 | 2008-07-19 23:58:47 +0000 | [diff] [blame] | 605 | def take(n, iterable): | 
|  | 606 | "Return first n items of the iterable as a list" | 
|  | 607 | return list(islice(iterable, n)) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 608 |  | 
| Raymond Hettinger | f1f46f0 | 2008-07-19 23:58:47 +0000 | [diff] [blame] | 609 | def enumerate(iterable, start=0): | 
|  | 610 | return izip(count(start), iterable) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 611 |  | 
| Raymond Hettinger | f1f46f0 | 2008-07-19 23:58:47 +0000 | [diff] [blame] | 612 | def tabulate(function, start=0): | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 613 | "Return function(0), function(1), ..." | 
| Raymond Hettinger | f1f46f0 | 2008-07-19 23:58:47 +0000 | [diff] [blame] | 614 | return imap(function, count(start)) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 615 |  | 
|  | 616 | def nth(iterable, n): | 
| Raymond Hettinger | f1f46f0 | 2008-07-19 23:58:47 +0000 | [diff] [blame] | 617 | "Returns the nth item or empty list" | 
|  | 618 | return list(islice(iterable, n, n+1)) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 619 |  | 
| Raymond Hettinger | f1f46f0 | 2008-07-19 23:58:47 +0000 | [diff] [blame] | 620 | def quantify(iterable, pred=bool): | 
|  | 621 | "Count how many times the predicate is true" | 
|  | 622 | return sum(imap(pred, iterable)) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 623 |  | 
| Raymond Hettinger | f1f46f0 | 2008-07-19 23:58:47 +0000 | [diff] [blame] | 624 | def padnone(iterable): | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 625 | """Returns the sequence elements and then returns None indefinitely. | 
|  | 626 |  | 
|  | 627 | Useful for emulating the behavior of the built-in map() function. | 
|  | 628 | """ | 
| Raymond Hettinger | f1f46f0 | 2008-07-19 23:58:47 +0000 | [diff] [blame] | 629 | return chain(iterable, repeat(None)) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 630 |  | 
| Raymond Hettinger | f1f46f0 | 2008-07-19 23:58:47 +0000 | [diff] [blame] | 631 | def ncycles(iterable, n): | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 632 | "Returns the sequence elements n times" | 
| Raymond Hettinger | f1f46f0 | 2008-07-19 23:58:47 +0000 | [diff] [blame] | 633 | return chain.from_iterable(repeat(iterable, n)) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 634 |  | 
|  | 635 | def dotproduct(vec1, vec2): | 
|  | 636 | return sum(imap(operator.mul, vec1, vec2)) | 
|  | 637 |  | 
|  | 638 | def flatten(listOfLists): | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 639 | return list(chain.from_iterable(listOfLists)) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 640 |  | 
|  | 641 | def repeatfunc(func, times=None, *args): | 
|  | 642 | """Repeat calls to func with specified arguments. | 
|  | 643 |  | 
|  | 644 | Example:  repeatfunc(random.random) | 
|  | 645 | """ | 
|  | 646 | if times is None: | 
|  | 647 | return starmap(func, repeat(args)) | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 648 | return starmap(func, repeat(args, times)) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 649 |  | 
|  | 650 | def pairwise(iterable): | 
|  | 651 | "s -> (s0,s1), (s1,s2), (s2, s3), ..." | 
|  | 652 | a, b = tee(iterable) | 
| Raymond Hettinger | 38fb9be | 2008-03-07 01:33:20 +0000 | [diff] [blame] | 653 | for elem in b: | 
|  | 654 | break | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 655 | return izip(a, b) | 
|  | 656 |  | 
| Raymond Hettinger | 38fb9be | 2008-03-07 01:33:20 +0000 | [diff] [blame] | 657 | def grouper(n, iterable, fillvalue=None): | 
| Raymond Hettinger | efdf706 | 2008-07-30 07:27:30 +0000 | [diff] [blame] | 658 | "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx" | 
| Raymond Hettinger | 38fb9be | 2008-03-07 01:33:20 +0000 | [diff] [blame] | 659 | args = [iter(iterable)] * n | 
| Raymond Hettinger | f080e6d | 2008-07-31 01:19:50 +0000 | [diff] [blame] | 660 | return izip_longest(fillvalue=fillvalue, *args) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 661 |  | 
| Raymond Hettinger | a44327a | 2008-01-30 22:17:31 +0000 | [diff] [blame] | 662 | def roundrobin(*iterables): | 
| Raymond Hettinger | efdf706 | 2008-07-30 07:27:30 +0000 | [diff] [blame] | 663 | "roundrobin('ABC', 'D', 'EF') --> A D E B F C" | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 664 | # Recipe credited to George Sakkis | 
| Raymond Hettinger | a44327a | 2008-01-30 22:17:31 +0000 | [diff] [blame] | 665 | pending = len(iterables) | 
|  | 666 | nexts = cycle(iter(it).next for it in iterables) | 
|  | 667 | while pending: | 
|  | 668 | try: | 
|  | 669 | for next in nexts: | 
|  | 670 | yield next() | 
|  | 671 | except StopIteration: | 
|  | 672 | pending -= 1 | 
|  | 673 | nexts = cycle(islice(nexts, pending)) | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 674 |  | 
| Raymond Hettinger | 7832d4d | 2008-02-23 10:04:15 +0000 | [diff] [blame] | 675 | def powerset(iterable): | 
| Raymond Hettinger | 330958e | 2008-02-28 19:41:24 +0000 | [diff] [blame] | 676 | "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])" | 
|  | 677 | # Recipe credited to Eric Raymond | 
|  | 678 | pairs = [(2**i, x) for i, x in enumerate(iterable)] | 
|  | 679 | for n in xrange(2**len(pairs)): | 
|  | 680 | yield set(x for m, x in pairs if m&n) | 
| Raymond Hettinger | 7832d4d | 2008-02-23 10:04:15 +0000 | [diff] [blame] | 681 |  | 
| Raymond Hettinger | e8b4b60 | 2008-03-11 00:19:07 +0000 | [diff] [blame] | 682 | def compress(data, selectors): | 
| Raymond Hettinger | efdf706 | 2008-07-30 07:27:30 +0000 | [diff] [blame] | 683 | "compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F" | 
|  | 684 | return (d for d, s in izip(data, selectors) if s) | 
| Raymond Hettinger | 3369167 | 2008-07-19 00:43:00 +0000 | [diff] [blame] | 685 |  | 
| Georg Brandl | 8c81fda | 2008-07-23 16:00:44 +0000 | [diff] [blame] | 686 | def combinations_with_replacement(iterable, r): | 
| Raymond Hettinger | 5b913e3 | 2009-01-08 06:39:04 +0000 | [diff] [blame] | 687 | "combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC" | 
|  | 688 | # number items returned:  (n+r-1)! / r! / (n-1)! | 
| Georg Brandl | 8c81fda | 2008-07-23 16:00:44 +0000 | [diff] [blame] | 689 | pool = tuple(iterable) | 
|  | 690 | n = len(pool) | 
|  | 691 | indices = [0] * r | 
|  | 692 | yield tuple(pool[i] for i in indices) | 
|  | 693 | while 1: | 
|  | 694 | for i in reversed(range(r)): | 
|  | 695 | if indices[i] != n - 1: | 
|  | 696 | break | 
|  | 697 | else: | 
|  | 698 | return | 
|  | 699 | indices[i:] = [indices[i] + 1] * (r - i) | 
|  | 700 | yield tuple(pool[i] for i in indices) | 
| Raymond Hettinger | 44e1581 | 2009-01-02 21:26:45 +0000 | [diff] [blame] | 701 |  | 
|  | 702 | def unique_everseen(iterable, key=None): | 
|  | 703 | "List unique elements, preserving order. Remember all elements ever seen." | 
|  | 704 | # unique_everseen('AAAABBBCCDAABBB') --> A B C D | 
| Georg Brandl | c62ef8b | 2009-01-03 20:55:06 +0000 | [diff] [blame] | 705 | # unique_everseen('ABBCcAD', str.lower) --> A B C D | 
| Raymond Hettinger | 44e1581 | 2009-01-02 21:26:45 +0000 | [diff] [blame] | 706 | seen = set() | 
|  | 707 | seen_add = seen.add | 
|  | 708 | if key is None: | 
|  | 709 | for element in iterable: | 
|  | 710 | if element not in seen: | 
|  | 711 | seen_add(element) | 
|  | 712 | yield element | 
|  | 713 | else: | 
|  | 714 | for element in iterable: | 
|  | 715 | k = key(element) | 
|  | 716 | if k not in seen: | 
|  | 717 | seen_add(k) | 
|  | 718 | yield element | 
|  | 719 |  | 
|  | 720 | def unique_justseen(iterable, key=None): | 
|  | 721 | "List unique elements, preserving order. Remember only the element just seen." | 
|  | 722 | # unique_justseen('AAAABBBCCDAABBB') --> A B C D A B | 
|  | 723 | # unique_justseen('ABBCcAD', str.lower) --> A B C A D | 
|  | 724 | return imap(next, imap(itemgetter(1), groupby(iterable, key))) |