Blame - Doc/library/itertools.rst - platform/external/python/cpython2

2007-08-15 14:28:01 +0000

[diff] [blame]

1

2

:mod:`itertools` --- Functions creating iterators for efficient looping

3

=======================================================================

4

5

.. module:: itertools

6

:synopsis: Functions creating iterators for efficient looping.

7

.. moduleauthor:: Raymond Hettinger <python@rcn.com>

8

.. sectionauthor:: Raymond Hettinger <python@rcn.com>

9

10

Georg Brandl

2008-03-22 22:04:10 +0000

[diff] [blame]

11

.. testsetup::

12

13

from itertools import *

14

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

15

.. versionadded:: 2.3

16

Georg Brandl

e7a0990

2007-10-21 12:10:28 +0000

[diff] [blame]

17

This module implements a number of :term:`iterator` building blocks inspired by

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

18

constructs from the Haskell and SML programming languages. Each has been recast

19

in a form suitable for Python.

20

21

The module standardizes a core set of fast, memory efficient tools that are

22

useful by themselves or in combination. Standardization helps avoid the

23

readability and reliability problems which arise when many different individuals

24

create their own slightly varying implementations, each with their own quirks

25

and naming conventions.

26

27

The tools are designed to combine readily with one another. This makes it easy

28

to construct more specialized tools succinctly and efficiently in pure Python.

29

30

For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a

31

sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and

32

:func:`count` which can be combined to form ``imap(f, count())`` and produce an

33

equivalent result.

34

35

Likewise, the functional tools are designed to work well with the high-speed

36

functions provided by the :mod:`operator` module.

37

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

38

Whether cast in pure python form or compiled code, tools that use iterators are

Raymond Hettinger

2008-07-19 23:58:47 +0000

[diff] [blame]

39

more memory efficient (and often faster) than their list based counterparts. Adopting

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

40

the principles of just-in-time manufacturing, they create data when and where

41

needed instead of consuming memory with the computer equivalent of "inventory".

42

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

.. seealso::

The Standard ML Basis Library, `The Standard ML Basis Library

47

<http://www.standardml.org/Basis/>`_.

48

49

Haskell, A Purely Functional Language, `Definition of Haskell and the Standard

50

Libraries <http://www.haskell.org/definition/>`_.

51

52

53

.. _itertools-functions:

Itertool functions

------------------

The following module functions all construct and return iterators. Some provide

59

streams of infinite length, so they should only be accessed by functions or

60

loops that truncate the stream.

61

62

63

.. function:: chain(*iterables)

64

65

Make an iterator that returns elements from the first iterable until it is

66

exhausted, then proceeds to the next iterable, until all of the iterables are

67

exhausted. Used for treating consecutive sequences as a single sequence.

68

Equivalent to::

69

70

def chain(*iterables):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

71

# chain('ABC', 'DEF') --> A B C D E F

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

for it in iterables:

for element in it:

yield element

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

77

.. function:: itertools.chain.from_iterable(iterable)

78

Georg Brandl

2009-01-03 20:55:06 +0000

[diff] [blame]

79

Alternate constructor for :func:`chain`. Gets chained inputs from a

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

80

single iterable argument that is evaluated lazily. Equivalent to::

81

82

@classmethod

83

def from_iterable(iterables):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

84

# chain.from_iterable(['ABC', 'DEF']) --> A B C D E F

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

for it in iterables:

for element in it:

yield element

.. versionadded:: 2.6

90

Raymond Hettinger

2008-03-04 04:17:08 +0000

[diff] [blame]

91

Raymond Hettinger

2008-02-26 02:46:54 +0000

[diff] [blame]

92

.. function:: combinations(iterable, r)

93

Raymond Hettinger

5eaffc4

2008-04-17 10:48:31 +0000

[diff] [blame]

94

Return *r* length subsequences of elements from the input *iterable*.

Raymond Hettinger

2008-02-26 02:46:54 +0000

[diff] [blame]

95

Georg Brandl

2009-01-03 20:55:06 +0000

[diff] [blame]

96

Combinations are emitted in lexicographic sort order. So, if the

Raymond Hettinger

2008-02-26 02:46:54 +0000

[diff] [blame]

97

input *iterable* is sorted, the combination tuples will be produced

Georg Brandl

2009-01-03 20:55:06 +0000

[diff] [blame]

98

in sorted order.

Raymond Hettinger

2008-02-26 02:46:54 +0000

[diff] [blame]

99

100

Elements are treated as unique based on their position, not on their

101

value. So if the input elements are unique, there will be no repeat

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

102

values in each combination.

Raymond Hettinger

2008-02-26 02:46:54 +0000

[diff] [blame]

103

Raymond Hettinger

2008-02-26 02:46:54 +0000

[diff] [blame]

104

Equivalent to::

105

106

def combinations(iterable, r):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

107

# combinations('ABCD', 2) --> AB AC AD BC BD CD

108

# combinations(range(4), 3) --> 012 013 023 123

Raymond Hettinger

2008-02-26 02:46:54 +0000

[diff] [blame]

109

pool = tuple(iterable)

Raymond Hettinger

93e804d

2008-02-26 23:40:50 +0000

[diff] [blame]

110

n = len(pool)

Raymond Hettinger

2009-01-08 06:39:04 +0000

[diff] [blame]

111

if r > n:

112

return

Raymond Hettinger

2008-03-02 10:59:31 +0000

[diff] [blame]

113

indices = range(r)

114

yield tuple(pool[i] for i in indices)

Raymond Hettinger

93e804d

2008-02-26 23:40:50 +0000

[diff] [blame]

115

while 1:

116

for i in reversed(range(r)):

Raymond Hettinger

2008-03-02 10:59:31 +0000

[diff] [blame]

117

if indices[i] != i + n - r:

Raymond Hettinger

c105289

2008-02-27 01:44:34 +0000

[diff] [blame]

118

break

Raymond Hettinger

93e804d

2008-02-26 23:40:50 +0000

[diff] [blame]

119

else:

120

return

Raymond Hettinger

2008-03-02 10:59:31 +0000

[diff] [blame]

121

indices[i] += 1

Raymond Hettinger

c105289

2008-02-27 01:44:34 +0000

[diff] [blame]

122

for j in range(i+1, r):

Raymond Hettinger

2008-03-02 10:59:31 +0000

[diff] [blame]

123

indices[j] = indices[j-1] + 1

124

yield tuple(pool[i] for i in indices)

Raymond Hettinger

2008-02-26 02:46:54 +0000

[diff] [blame]

125

Raymond Hettinger

2008-03-04 04:17:08 +0000

[diff] [blame]

126

The code for :func:`combinations` can be also expressed as a subsequence

127

of :func:`permutations` after filtering entries where the elements are not

128

in sorted order (according to their position in the input pool)::

129

130

def combinations(iterable, r):

131

pool = tuple(iterable)

132

n = len(pool)

133

for indices in permutations(range(n), r):

134

if sorted(indices) == list(indices):

135

yield tuple(pool[i] for i in indices)

136

Raymond Hettinger

2009-01-08 06:39:04 +0000

[diff] [blame]

137

The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``

138

or zero when ``r > n``.

139

Raymond Hettinger

2008-02-26 02:46:54 +0000

[diff] [blame]

140

.. versionadded:: 2.6

141

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

142

.. function:: count([n])

143

144

Make an iterator that returns consecutive integers starting with *n*. If not

Raymond Hettinger

50e90e2

2007-10-04 00:20:27 +0000

[diff] [blame]

145

specified *n* defaults to zero. Often used as an argument to :func:`imap` to

146

generate consecutive data points. Also, used with :func:`izip` to add sequence

147

numbers. Equivalent to::

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

148

149

def count(n=0):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

150

# count(10) --> 10 11 12 13 14 ...

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

while True:

yield n

n += 1

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

155

156

.. function:: cycle(iterable)

157

158

Make an iterator returning elements from the iterable and saving a copy of each.

159

When the iterable is exhausted, return elements from the saved copy. Repeats

160

indefinitely. Equivalent to::

161

162

def cycle(iterable):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

163

# cycle('ABCD') --> A B C D A B C D A B C D ...

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

164

saved = []

165

for element in iterable:

166

yield element

167

saved.append(element)

168

while saved:

169

for element in saved:

170

yield element

171

172

Note, this member of the toolkit may require significant auxiliary storage

173

(depending on the length of the iterable).

174

175

176

.. function:: dropwhile(predicate, iterable)

177

178

Make an iterator that drops elements from the iterable as long as the predicate

179

is true; afterwards, returns every element. Note, the iterator does not produce

180

*any* output until the predicate first becomes false, so it may have a lengthy

181

start-up time. Equivalent to::

182

183

def dropwhile(predicate, iterable):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

184

# dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

185

iterable = iter(iterable)

for x in iterable:

if not predicate(x):

yield x

break

for x in iterable:

yield x

.. function:: groupby(iterable[, key])

195

196

Make an iterator that returns consecutive keys and groups from the *iterable*.

197

The *key* is a function computing a key value for each element. If not

198

specified or is ``None``, *key* defaults to an identity function and returns

199

the element unchanged. Generally, the iterable needs to already be sorted on

200

the same key function.

201

202

The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It

203

generates a break or new group every time the value of the key function changes

204

(which is why it is usually necessary to have sorted the data using the same key

205

function). That behavior differs from SQL's GROUP BY which aggregates common

206

elements regardless of their input order.

207

208

The returned group is itself an iterator that shares the underlying iterable

209

with :func:`groupby`. Because the source is shared, when the :func:`groupby`

210

object is advanced, the previous group is no longer visible. So, if that data

211

is needed later, it should be stored as a list::

groups = []

uniquekeys = []

data = sorted(data, key=keyfunc)

216

for k, g in groupby(data, keyfunc):

217

groups.append(list(g)) # Store group iterator as a list

218

uniquekeys.append(k)

219

220

:func:`groupby` is equivalent to::

221

222

class groupby(object):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

223

# [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B

224

# [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

225

def __init__(self, iterable, key=None):

if key is None:

key = lambda x: x

self.keyfunc = key

self.it = iter(iterable)

Raymond Hettinger

81a885a

2007-12-29 22:16:24 +0000

[diff] [blame]

230

self.tgtkey = self.currkey = self.currvalue = object()

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

def __iter__(self):

return self

def next(self):

while self.currkey == self.tgtkey:

235

self.currvalue = self.it.next() # Exit on StopIteration

236

self.currkey = self.keyfunc(self.currvalue)

237

self.tgtkey = self.currkey

238

return (self.currkey, self._grouper(self.tgtkey))

239

def _grouper(self, tgtkey):

240

while self.currkey == tgtkey:

241

yield self.currvalue

242

self.currvalue = self.it.next() # Exit on StopIteration

243

self.currkey = self.keyfunc(self.currvalue)

244

245

.. versionadded:: 2.4

246

247

248

.. function:: ifilter(predicate, iterable)

249

250

Make an iterator that filters elements from iterable returning only those for

251

which the predicate is ``True``. If *predicate* is ``None``, return the items

252

that are true. Equivalent to::

253

254

def ifilter(predicate, iterable):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

255

# ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

256

if predicate is None:

predicate = bool

for x in iterable:

if predicate(x):

yield x

.. function:: ifilterfalse(predicate, iterable)

264

265

Make an iterator that filters elements from iterable returning only those for

266

which the predicate is ``False``. If *predicate* is ``None``, return the items

267

that are false. Equivalent to::

268

269

def ifilterfalse(predicate, iterable):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

270

# ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

271

if predicate is None:

predicate = bool

for x in iterable:

if not predicate(x):

yield x

.. function:: imap(function, *iterables)

279

280

Make an iterator that computes the function using arguments from each of the

281

iterables. If *function* is set to ``None``, then :func:`imap` returns the

282

arguments as a tuple. Like :func:`map` but stops when the shortest iterable is

283

exhausted instead of filling in ``None`` for shorter iterables. The reason for

284

the difference is that infinite iterator arguments are typically an error for

285

:func:`map` (because the output is fully evaluated) but represent a common and

286

useful way of supplying arguments to :func:`imap`. Equivalent to::

287

288

def imap(function, *iterables):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

289

# imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

290

iterables = map(iter, iterables)

291

while True:

Raymond Hettinger

2dec48d

2008-01-22 22:09:26 +0000

[diff] [blame]

292

args = [it.next() for it in iterables]

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

if function is None:

yield tuple(args)

else:

yield function(*args)

297

298

299

.. function:: islice(iterable, [start,] stop [, step])

300

301

Make an iterator that returns selected elements from the iterable. If *start* is

302

non-zero, then elements from the iterable are skipped until start is reached.

303

Afterward, elements are returned consecutively unless *step* is set higher than

304

one which results in items being skipped. If *stop* is ``None``, then iteration

305

continues until the iterator is exhausted, if at all; otherwise, it stops at the

306

specified position. Unlike regular slicing, :func:`islice` does not support

307

negative values for *start*, *stop*, or *step*. Can be used to extract related

308

fields from data where the internal structure has been flattened (for example, a

309

multi-line report may list a name field on every third line). Equivalent to::

310

311

def islice(iterable, *args):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

312

# islice('ABCDEFG', 2) --> A B

313

# islice('ABCDEFG', 2, 4) --> C D

314

# islice('ABCDEFG', 2, None) --> C D E F G

315

# islice('ABCDEFG', 0, None, 2) --> A C E G

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

316

s = slice(*args)

317

it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))

318

nexti = it.next()

319

for i, element in enumerate(iterable):

320

if i == nexti:

321

yield element

Georg Brandl

2009-01-03 20:55:06 +0000

[diff] [blame]

322

nexti = it.next()

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

323

324

If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,

325

then the step defaults to one.

326

327

.. versionchanged:: 2.5

328

accept ``None`` values for default *start* and *step*.

329

330

331

.. function:: izip(*iterables)

332

333

Make an iterator that aggregates elements from each of the iterables. Like

334

:func:`zip` except that it returns an iterator instead of a list. Used for

335

lock-step iteration over several iterables at a time. Equivalent to::

336

337

def izip(*iterables):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

338

# izip('ABCD', 'xy') --> Ax By

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

339

iterables = map(iter, iterables)

340

while iterables:

341

result = [it.next() for it in iterables]

342

yield tuple(result)

343

344

.. versionchanged:: 2.4

345

When no iterables are specified, returns a zero length iterator instead of

346

raising a :exc:`TypeError` exception.

347

Raymond Hettinger

48c6293

2008-01-22 19:51:41 +0000

[diff] [blame]

348

The left-to-right evaluation order of the iterables is guaranteed. This

349

makes possible an idiom for clustering a data series into n-length groups

350

using ``izip(*[iter(s)]*n)``.

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

351

Raymond Hettinger

48c6293

2008-01-22 19:51:41 +0000

[diff] [blame]

352

:func:`izip` should only be used with unequal length inputs when you don't

353

care about trailing, unmatched values from the longer iterables. If those

354

values are important, use :func:`izip_longest` instead.

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

355

356

357

.. function:: izip_longest(*iterables[, fillvalue])

358

359

Make an iterator that aggregates elements from each of the iterables. If the

360

iterables are of uneven length, missing values are filled-in with *fillvalue*.

361

Iteration continues until the longest iterable is exhausted. Equivalent to::

362

363

def izip_longest(*args, **kwds):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

364

# izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

365

fillvalue = kwds.get('fillvalue')

366

def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):

367

yield counter() # yields the fillvalue, or raises IndexError

368

fillers = repeat(fillvalue)

369

iters = [chain(it, sentinel(), fillers) for it in args]

370

try:

371

for tup in izip(*iters):

yield tup

except IndexError:

pass

Benjamin Peterson

2008-07-25 17:02:11 +0000

[diff] [blame]

376

If one of the iterables is potentially infinite, then the

377

:func:`izip_longest` function should be wrapped with something that limits

378

the number of calls (for example :func:`islice` or :func:`takewhile`). If

379

not specified, *fillvalue* defaults to ``None``.

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

380

381

.. versionadded:: 2.6

382

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

383

.. function:: permutations(iterable[, r])

384

385

Return successive *r* length permutations of elements in the *iterable*.

386

387

If *r* is not specified or is ``None``, then *r* defaults to the length

Georg Brandl

2009-01-03 20:55:06 +0000

[diff] [blame]

388

of the *iterable* and all possible full-length permutations

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

389

are generated.

390

Georg Brandl

2009-01-03 20:55:06 +0000

[diff] [blame]

391

Permutations are emitted in lexicographic sort order. So, if the

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

392

input *iterable* is sorted, the permutation tuples will be produced

Georg Brandl

2009-01-03 20:55:06 +0000

[diff] [blame]

393

in sorted order.

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

394

395

Elements are treated as unique based on their position, not on their

396

value. So if the input elements are unique, there will be no repeat

397

values in each permutation.

398

Raymond Hettinger

2008-03-02 10:59:31 +0000

[diff] [blame]

399

Equivalent to::

400

401

def permutations(iterable, r=None):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

402

# permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC

403

# permutations(range(3)) --> 012 021 102 120 201 210

Raymond Hettinger

2008-03-02 10:59:31 +0000

[diff] [blame]

404

pool = tuple(iterable)

405

n = len(pool)

406

r = n if r is None else r

Raymond Hettinger

2009-01-08 06:39:04 +0000

[diff] [blame]

407

if r > n:

408

return

Raymond Hettinger

2008-03-02 10:59:31 +0000

[diff] [blame]

409

indices = range(n)

Raymond Hettinger

e70bb8d

2008-03-23 00:55:46 +0000

[diff] [blame]

410

cycles = range(n, n-r, -1)

Raymond Hettinger

2008-03-02 10:59:31 +0000

[diff] [blame]

411

yield tuple(pool[i] for i in indices[:r])

412

while n:

413

for i in reversed(range(r)):

414

cycles[i] -= 1

415

if cycles[i] == 0:

Raymond Hettinger

2b7a5c4

2008-03-02 11:17:51 +0000

[diff] [blame]

416

indices[i:] = indices[i+1:] + indices[i:i+1]

Raymond Hettinger

2008-03-02 10:59:31 +0000

[diff] [blame]

cycles[i] = n - i

else:

j = cycles[i]

indices[i], indices[-j] = indices[-j], indices[i]

421

yield tuple(pool[i] for i in indices[:r])

422

break

423

else:

424

return

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

425

Georg Brandl

2009-01-03 20:55:06 +0000

[diff] [blame]

426

The code for :func:`permutations` can be also expressed as a subsequence of

Raymond Hettinger

2008-03-04 04:17:08 +0000

[diff] [blame]

427

:func:`product`, filtered to exclude entries with repeated elements (those

428

from the same position in the input pool)::

429

430

def permutations(iterable, r=None):

431

pool = tuple(iterable)

432

n = len(pool)

433

r = n if r is None else r

434

for indices in product(range(n), repeat=r):

435

if len(set(indices)) == r:

436

yield tuple(pool[i] for i in indices)

437

Raymond Hettinger

2009-01-08 06:39:04 +0000

[diff] [blame]

438

The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``

439

or zero when ``r > n``.

440

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

441

.. versionadded:: 2.6

442

Raymond Hettinger

2008-02-28 09:23:48 +0000

[diff] [blame]

443

.. function:: product(*iterables[, repeat])

Raymond Hettinger

2008-02-22 19:50:06 +0000

[diff] [blame]

444

445

Cartesian product of input iterables.

446

447

Equivalent to nested for-loops in a generator expression. For example,

448

``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.

449

Raymond Hettinger

5eaffc4

2008-04-17 10:48:31 +0000

[diff] [blame]

450

The nested loops cycle like an odometer with the rightmost element advancing

Andrew M. Kuchling

e2e0313

2008-04-17 20:44:06 +0000

[diff] [blame]

451

on every iteration. This pattern creates a lexicographic ordering so that if

452

the input's iterables are sorted, the product tuples are emitted in sorted

Raymond Hettinger

5eaffc4

2008-04-17 10:48:31 +0000

[diff] [blame]

453

order.

Raymond Hettinger

2008-02-22 19:50:06 +0000

[diff] [blame]

454

Raymond Hettinger

2008-02-28 09:23:48 +0000

[diff] [blame]

455

To compute the product of an iterable with itself, specify the number of

456

repetitions with the optional *repeat* keyword argument. For example,

457

``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.

458

Andrew M. Kuchling

684868a

2008-03-04 01:47:38 +0000

[diff] [blame]

459

This function is equivalent to the following code, except that the

460

actual implementation does not build up intermediate results in memory::

Raymond Hettinger

2008-02-22 19:50:06 +0000

[diff] [blame]

461

Raymond Hettinger

2008-02-28 09:23:48 +0000

[diff] [blame]

462

def product(*args, **kwds):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

463

# product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy

464

# product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111

Raymond Hettinger

2008-02-28 09:23:48 +0000

[diff] [blame]

465

pools = map(tuple, args) * kwds.get('repeat', 1)

Raymond Hettinger

2008-03-04 04:17:08 +0000

[diff] [blame]

466

result = [[]]

467

for pool in pools:

468

result = [x+[y] for x in result for y in pool]

469

for prod in result:

470

yield tuple(prod)

Raymond Hettinger

2008-02-22 19:50:06 +0000

[diff] [blame]

471

472

.. versionadded:: 2.6

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

473

474

.. function:: repeat(object[, times])

475

476

Make an iterator that returns *object* over and over again. Runs indefinitely

477

unless the *times* argument is specified. Used as argument to :func:`imap` for

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

478

invariant function parameters. Also used with :func:`izip` to create constant

479

fields in a tuple record. Equivalent to::

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

480

481

def repeat(object, times=None):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

482

# repeat(10, 3) --> 10 10 10

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

if times is None:

while True:

yield object

else:

for i in xrange(times):

yield object

.. function:: starmap(function, iterable)

492

Raymond Hettinger

4731709

2008-01-17 03:02:14 +0000

[diff] [blame]

493

Make an iterator that computes the function using arguments obtained from

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

494

the iterable. Used instead of :func:`imap` when argument parameters are already

495

grouped in tuples from a single iterable (the data has been "pre-zipped"). The

496

difference between :func:`imap` and :func:`starmap` parallels the distinction

497

between ``function(a,b)`` and ``function(*c)``. Equivalent to::

498

499

def starmap(function, iterable):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

500

# starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000

Raymond Hettinger

4731709

2008-01-17 03:02:14 +0000

[diff] [blame]

501

for args in iterable:

502

yield function(*args)

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

503

Raymond Hettinger

4731709

2008-01-17 03:02:14 +0000

[diff] [blame]

504

.. versionchanged:: 2.6

505

Previously, :func:`starmap` required the function arguments to be tuples.

506

Now, any iterable is allowed.

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

507

508

.. function:: takewhile(predicate, iterable)

509

510

Make an iterator that returns elements from the iterable as long as the

511

predicate is true. Equivalent to::

512

513

def takewhile(predicate, iterable):

Raymond Hettinger

2008-03-06 01:15:52 +0000

[diff] [blame]

514

# takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

for x in iterable:

if predicate(x):

yield x

else:

break

.. function:: tee(iterable[, n=2])

523

524

Return *n* independent iterators from a single iterable. The case where ``n==2``

525

is equivalent to::

526

527

def tee(iterable):

Raymond Hettinger

2007-12-29 22:09:34 +0000

[diff] [blame]

528

def gen(next, data={}):

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

529

for i in count():

Raymond Hettinger

2007-12-29 22:09:34 +0000

[diff] [blame]

530

if i in data:

531

yield data.pop(i)

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

532

else:

Raymond Hettinger

2007-12-29 22:09:34 +0000

[diff] [blame]

533

data[i] = next()

534

yield data[i]

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

535

it = iter(iterable)

Raymond Hettinger

2007-12-29 22:09:34 +0000

[diff] [blame]

536

return gen(it.next), gen(it.next)

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

537

538

Note, once :func:`tee` has made a split, the original *iterable* should not be

539

used anywhere else; otherwise, the *iterable* could get advanced without the tee

540

objects being informed.

541

542

Note, this member of the toolkit may require significant auxiliary storage

543

(depending on how much temporary data needs to be stored). In general, if one

544

iterator is going to use most or all of the data before the other iterator, it

545

is faster to use :func:`list` instead of :func:`tee`.

546

547

.. versionadded:: 2.4

548

549

550

.. _itertools-example:

Examples

--------

The following examples show common uses for each tool and demonstrate ways they

Georg Brandl

2008-03-22 22:04:10 +0000

[diff] [blame]

556

can be combined.

557

558

.. doctest::

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

559

Benjamin Peterson

8ea9999

2009-01-01 16:43:12 +0000

[diff] [blame]

560

>>> # Show a dictionary sorted and grouped by value

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

561

>>> from operator import itemgetter

562

>>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)

563

>>> di = sorted(d.iteritems(), key=itemgetter(1))

564

>>> for k, g in groupby(di, key=itemgetter(1)):

565

... print k, map(itemgetter(0), g)

...

1 ['a', 'c', 'e']

2 ['b', 'd', 'f']

3 ['g']

Benjamin Peterson

2009-01-01 16:43:12 +0000

[diff] [blame]

571

>>> # Find runs of consecutive numbers using groupby. The key to the solution

572

>>> # is differencing with a range so that consecutive numbers all appear in

573

>>> # same group.

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

574

>>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]

575

>>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):

Georg Brandl

2008-03-22 22:04:10 +0000

[diff] [blame]

576

... print map(itemgetter(1), g)

Georg Brandl

2009-01-03 20:55:06 +0000

[diff] [blame]

577

...

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

[1]

[4, 5, 6]

[10]

[15, 16, 17, 18]

[22]

[25, 26, 27, 28]

.. _itertools-recipes:

Recipes

-------

This section shows recipes for creating an extended toolset using the existing

593

itertools as building blocks.

594

595

The extended tools offer the same high performance as the underlying toolset.

596

The superior memory performance is kept by processing elements one at a time

597

rather than bringing the whole iterable into memory all at once. Code volume is

598

kept small by linking the tools together in a functional style which helps

599

eliminate temporary variables. High speed is retained by preferring

Georg Brandl

cf3fb25

2007-10-21 10:52:38 +0000

[diff] [blame]

600

"vectorized" building blocks over the use of for-loops and :term:`generator`\s

Georg Brandl

2008-03-22 22:04:10 +0000

[diff] [blame]

601

which incur interpreter overhead.

602

603

.. testcode::

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

604

Raymond Hettinger

2008-07-19 23:58:47 +0000

[diff] [blame]

605

def take(n, iterable):

606

"Return first n items of the iterable as a list"

607

return list(islice(iterable, n))

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

608

Raymond Hettinger

2008-07-19 23:58:47 +0000

[diff] [blame]

609

def enumerate(iterable, start=0):

610

return izip(count(start), iterable)

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

611

Raymond Hettinger

2008-07-19 23:58:47 +0000

[diff] [blame]

612

def tabulate(function, start=0):

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

613

"Return function(0), function(1), ..."

Raymond Hettinger

2008-07-19 23:58:47 +0000

[diff] [blame]

614

return imap(function, count(start))

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

615

616

def nth(iterable, n):

Raymond Hettinger

2008-07-19 23:58:47 +0000

[diff] [blame]

617

"Returns the nth item or empty list"

618

return list(islice(iterable, n, n+1))

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

619

Raymond Hettinger

2008-07-19 23:58:47 +0000

[diff] [blame]

620

def quantify(iterable, pred=bool):

621

"Count how many times the predicate is true"

622

return sum(imap(pred, iterable))

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

623

Raymond Hettinger

2008-07-19 23:58:47 +0000

[diff] [blame]

624

def padnone(iterable):

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

625

"""Returns the sequence elements and then returns None indefinitely.

626

627

Useful for emulating the behavior of the built-in map() function.

628

"""

Raymond Hettinger

2008-07-19 23:58:47 +0000

[diff] [blame]

629

return chain(iterable, repeat(None))

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

630

Raymond Hettinger

2008-07-19 23:58:47 +0000

[diff] [blame]

631

def ncycles(iterable, n):

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

632

"Returns the sequence elements n times"

Raymond Hettinger

2008-07-19 23:58:47 +0000

[diff] [blame]

633

return chain.from_iterable(repeat(iterable, n))

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

634

635

def dotproduct(vec1, vec2):

636

return sum(imap(operator.mul, vec1, vec2))

637

638

def flatten(listOfLists):

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

639

return list(chain.from_iterable(listOfLists))

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

640

641

def repeatfunc(func, times=None, *args):

642

"""Repeat calls to func with specified arguments.

643

644

Example: repeatfunc(random.random)

645

"""

646

if times is None:

647

return starmap(func, repeat(args))

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

648

return starmap(func, repeat(args, times))

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

649

650

def pairwise(iterable):

651

"s -> (s0,s1), (s1,s2), (s2, s3), ..."

652

a, b = tee(iterable)

Raymond Hettinger

38fb9be

2008-03-07 01:33:20 +0000

[diff] [blame]

653

for elem in b:

654

break

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

655

return izip(a, b)

656

Raymond Hettinger

38fb9be

2008-03-07 01:33:20 +0000

[diff] [blame]

657

def grouper(n, iterable, fillvalue=None):

Raymond Hettinger

efdf706

2008-07-30 07:27:30 +0000

[diff] [blame]

658

"grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"

Raymond Hettinger

38fb9be

2008-03-07 01:33:20 +0000

[diff] [blame]

659

args = [iter(iterable)] * n

Raymond Hettinger

f080e6d

2008-07-31 01:19:50 +0000

[diff] [blame]

660

return izip_longest(fillvalue=fillvalue, *args)

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

661

Raymond Hettinger

a44327a

2008-01-30 22:17:31 +0000

[diff] [blame]

662

def roundrobin(*iterables):

Raymond Hettinger

efdf706

2008-07-30 07:27:30 +0000

[diff] [blame]

663

"roundrobin('ABC', 'D', 'EF') --> A D E B F C"

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

664

# Recipe credited to George Sakkis

Raymond Hettinger

a44327a

2008-01-30 22:17:31 +0000

[diff] [blame]

665

pending = len(iterables)

666

nexts = cycle(iter(it).next for it in iterables)

while pending:

try:

for next in nexts:

yield next()

except StopIteration:

672

pending -= 1

673

nexts = cycle(islice(nexts, pending))

Georg Brandl

2007-08-15 14:28:01 +0000

[diff] [blame]

674

Raymond Hettinger

7832d4d

2008-02-23 10:04:15 +0000

[diff] [blame]

675

def powerset(iterable):

Raymond Hettinger

2008-02-28 19:41:24 +0000

[diff] [blame]

676

"powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])"

677

# Recipe credited to Eric Raymond

678

pairs = [(2**i, x) for i, x in enumerate(iterable)]

679

for n in xrange(2**len(pairs)):

680

yield set(x for m, x in pairs if m&n)

Raymond Hettinger

7832d4d

2008-02-23 10:04:15 +0000

[diff] [blame]

681

Raymond Hettinger

e8b4b60

2008-03-11 00:19:07 +0000

[diff] [blame]

682

def compress(data, selectors):

Raymond Hettinger

efdf706

2008-07-30 07:27:30 +0000

[diff] [blame]

683

"compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F"

684

return (d for d, s in izip(data, selectors) if s)

Raymond Hettinger

3369167

2008-07-19 00:43:00 +0000

[diff] [blame]

685

Georg Brandl

8c81fda

2008-07-23 16:00:44 +0000

[diff] [blame]

686

def combinations_with_replacement(iterable, r):

Raymond Hettinger

2009-01-08 06:39:04 +0000

[diff] [blame]

687

"combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC"

688

# number items returned: (n+r-1)! / r! / (n-1)!

Georg Brandl

8c81fda

2008-07-23 16:00:44 +0000

[diff] [blame]

689

pool = tuple(iterable)

690

n = len(pool)

691

indices = [0] * r

692

yield tuple(pool[i] for i in indices)

693

while 1:

694

for i in reversed(range(r)):

695

if indices[i] != n - 1:

break

else:

return

indices[i:] = [indices[i] + 1] * (r - i)

700

yield tuple(pool[i] for i in indices)

Raymond Hettinger

44e1581

2009-01-02 21:26:45 +0000

[diff] [blame]

701

702

def unique_everseen(iterable, key=None):

703

"List unique elements, preserving order. Remember all elements ever seen."

704

# unique_everseen('AAAABBBCCDAABBB') --> A B C D

Georg Brandl