Blame - Doc/library/itertools.rst - platform/external/python/cpython2

2007-08-15 14:28:22 +0000

[diff] [blame]

1

2

:mod:`itertools` --- Functions creating iterators for efficient looping

3

=======================================================================

4

5

.. module:: itertools

6

:synopsis: Functions creating iterators for efficient looping.

7

.. moduleauthor:: Raymond Hettinger <python@rcn.com>

8

.. sectionauthor:: Raymond Hettinger <python@rcn.com>

9

10

Christian Heimes

2008-03-23 21:54:12 +0000

[diff] [blame]

11

.. testsetup::

12

13

from itertools import *

14

15

Georg Brandl

9afde1c

2007-11-01 20:32:30 +0000

[diff] [blame]

16

This module implements a number of :term:`iterator` building blocks inspired by

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

17

constructs from the Haskell and SML programming languages. Each has been recast

18

in a form suitable for Python.

19

20

The module standardizes a core set of fast, memory efficient tools that are

21

useful by themselves or in combination. Standardization helps avoid the

22

readability and reliability problems which arise when many different individuals

23

create their own slightly varying implementations, each with their own quirks

24

and naming conventions.

25

26

The tools are designed to combine readily with one another. This makes it easy

27

to construct more specialized tools succinctly and efficiently in pure Python.

28

29

For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a

Raymond Hettinger

2008-03-13 01:26:19 +0000

[diff] [blame]

30

sequence ``f(0), f(1), ...``. But, this effect can be achieved in Python

31

by combining :func:`map` and :func:`count` to form ``map(f, count())``.

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

32

33

Likewise, the functional tools are designed to work well with the high-speed

34

functions provided by the :mod:`operator` module.

35

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

36

Whether cast in pure python form or compiled code, tools that use iterators are

Georg Brandl

2008-07-23 16:10:53 +0000

[diff] [blame]

37

more memory efficient (and often faster) than their list based counterparts. Adopting

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

38

the principles of just-in-time manufacturing, they create data when and where

39

needed instead of consuming memory with the computer equivalent of "inventory".

40

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

.. seealso::

The Standard ML Basis Library, `The Standard ML Basis Library

45

<http://www.standardml.org/Basis/>`_.

46

47

Haskell, A Purely Functional Language, `Definition of Haskell and the Standard

48

Libraries <http://www.haskell.org/definition/>`_.

49

50

51

.. _itertools-functions:

Itertool functions

------------------

The following module functions all construct and return iterators. Some provide

57

streams of infinite length, so they should only be accessed by functions or

58

loops that truncate the stream.

59

60

61

.. function:: chain(*iterables)

62

63

Make an iterator that returns elements from the first iterable until it is

64

exhausted, then proceeds to the next iterable, until all of the iterables are

65

exhausted. Used for treating consecutive sequences as a single sequence.

66

Equivalent to::

67

68

def chain(*iterables):

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

69

# chain('ABC', 'DEF') --> A B C D E F

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

for it in iterables:

for element in it:

yield element

Christian Heimes

2008-02-28 20:02:27 +0000

[diff] [blame]

75

.. function:: itertools.chain.from_iterable(iterable)

76

Georg Brandl

2009-01-03 21:18:54 +0000

[diff] [blame]

77

Alternate constructor for :func:`chain`. Gets chained inputs from a

Christian Heimes

2008-02-28 20:02:27 +0000

[diff] [blame]

78

single iterable argument that is evaluated lazily. Equivalent to::

79

80

@classmethod

81

def from_iterable(iterables):

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

82

# chain.from_iterable(['ABC', 'DEF']) --> A B C D E F

Christian Heimes

2008-02-28 20:02:27 +0000

[diff] [blame]

for it in iterables:

for element in it:

yield element

Christian Heimes

2008-03-04 23:39:23 +0000

[diff] [blame]

87

Christian Heimes

2008-02-26 08:18:30 +0000

[diff] [blame]

88

.. function:: combinations(iterable, r)

89

Christian Heimes

dae2a89

2008-04-19 00:55:37 +0000

[diff] [blame]

90

Return *r* length subsequences of elements from the input *iterable*.

Christian Heimes

2008-02-26 08:18:30 +0000

[diff] [blame]

91

Georg Brandl

2009-01-03 21:18:54 +0000

[diff] [blame]

92

Combinations are emitted in lexicographic sort order. So, if the

Christian Heimes

2008-02-26 08:18:30 +0000

[diff] [blame]

93

input *iterable* is sorted, the combination tuples will be produced

Georg Brandl

2009-01-03 21:18:54 +0000

[diff] [blame]

94

in sorted order.

Christian Heimes

2008-02-26 08:18:30 +0000

[diff] [blame]

95

96

Elements are treated as unique based on their position, not on their

97

value. So if the input elements are unique, there will be no repeat

Christian Heimes

2008-02-28 20:02:27 +0000

[diff] [blame]

98

values in each combination.

Christian Heimes

2008-02-26 08:18:30 +0000

[diff] [blame]

99

Christian Heimes

2008-02-26 08:18:30 +0000

[diff] [blame]

100

Equivalent to::

101

102

def combinations(iterable, r):

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

103

# combinations('ABCD', 2) --> AB AC AD BC BD CD

104

# combinations(range(4), 3) --> 012 013 023 123

Christian Heimes

2008-02-26 08:18:30 +0000

[diff] [blame]

105

pool = tuple(iterable)

Christian Heimes

2008-02-28 11:19:05 +0000

[diff] [blame]

106

n = len(pool)

Raymond Hettinger

2009-01-08 21:01:54 +0000

[diff] [blame]

107

if r > n:

108

return

109

indices = list(range(r))

Christian Heimes

2008-03-02 22:46:37 +0000

[diff] [blame]

110

yield tuple(pool[i] for i in indices)

Christian Heimes

2008-02-28 11:19:05 +0000

[diff] [blame]

111

while 1:

112

for i in reversed(range(r)):

Christian Heimes

2008-03-02 22:46:37 +0000

[diff] [blame]

113

if indices[i] != i + n - r:

Christian Heimes

2008-02-26 08:18:30 +0000

[diff] [blame]

114

break

Christian Heimes

2008-02-28 11:19:05 +0000

[diff] [blame]

115

else:

116

return

Christian Heimes

2008-03-02 22:46:37 +0000

[diff] [blame]

117

indices[i] += 1

Christian Heimes

2008-02-28 11:19:05 +0000

[diff] [blame]

118

for j in range(i+1, r):

Christian Heimes

2008-03-02 22:46:37 +0000

[diff] [blame]

119

indices[j] = indices[j-1] + 1

120

yield tuple(pool[i] for i in indices)

Christian Heimes

2008-02-26 08:18:30 +0000

[diff] [blame]

121

Christian Heimes

2008-03-04 23:39:23 +0000

[diff] [blame]

122

The code for :func:`combinations` can be also expressed as a subsequence

123

of :func:`permutations` after filtering entries where the elements are not

124

in sorted order (according to their position in the input pool)::

125

126

def combinations(iterable, r):

127

pool = tuple(iterable)

128

n = len(pool)

129

for indices in permutations(range(n), r):

130

if sorted(indices) == list(indices):

131

yield tuple(pool[i] for i in indices)

132

Raymond Hettinger

2009-01-08 21:01:54 +0000

[diff] [blame]

133

The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``

134

or zero when ``r > n``.

Christian Heimes

2008-02-26 08:18:30 +0000

[diff] [blame]

135

Raymond Hettinger

d07d939

2009-01-27 04:20:44 +0000

[diff] [blame]

136

.. function:: combinations_with_replacement(iterable, r)

137

138

Return *r* length subsequences of elements from the input *iterable*

139

allowing individual elements to be repeated more than once.

140

141

Combinations are emitted in lexicographic sort order. So, if the

142

input *iterable* is sorted, the combination tuples will be produced

143

in sorted order.

144

145

Elements are treated as unique based on their position, not on their

146

value. So if the input elements are unique, the generated combinations

will also be unique.

Equivalent to::

def combinations_with_replacement(iterable, r):

152

# combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC

153

pool = tuple(iterable)

n = len(pool)

if not n and r:

return

indices = [0] * r

yield tuple(pool[i] for i in indices)

159

while 1:

160

for i in reversed(range(r)):

161

if indices[i] != n - 1:

break

else:

return

indices[i:] = [indices[i] + 1] * (r - i)

166

yield tuple(pool[i] for i in indices)

167

168

The code for :func:`combinations_with_replacement` can be also expressed as

169

a subsequence of :func:`product` after filtering entries where the elements

170

are not in sorted order (according to their position in the input pool)::

171

172

def combinations_with_replacement(iterable, r):

173

pool = tuple(iterable)

174

n = len(pool)

175

for indices in product(range(n), repeat=r):

176

if sorted(indices) == list(indices):

177

yield tuple(pool[i] for i in indices)

178

179

The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.

180

181

.. versionadded:: 2.7

182

Raymond Hettinger

6b3b0fc

2009-01-26 02:56:58 +0000

[diff] [blame]

183

.. function:: compress(data, selectors)

184

185

Make an iterator that filters elements from *data* returning only those that

186

have a corresponding element in *selectors* that evaluates to ``True``.

187

Stops when either the *data* or *selectors* iterables have been exhausted.

188

Equivalent to::

189

190

def compress(data, selectors):

191

# compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F

192

return (d for d, s in zip(data, selectors) if s)

193

194

.. versionadded:: 2.7

195

196

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

197

.. function:: count([n])

198

199

Make an iterator that returns consecutive integers starting with *n*. If not

Raymond Hettinger

2008-03-13 01:26:19 +0000

[diff] [blame]

200

specified *n* defaults to zero. Often used as an argument to :func:`map` to

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

201

generate consecutive data points. Also, used with :func:`zip` to add sequence

Georg Brandl

9afde1c

2007-11-01 20:32:30 +0000

[diff] [blame]

202

numbers. Equivalent to::

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

203

204

def count(n=0):

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

205

# count(10) --> 10 11 12 13 14 ...

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

while True:

yield n

n += 1

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

210

211

.. function:: cycle(iterable)

212

213

Make an iterator returning elements from the iterable and saving a copy of each.

214

When the iterable is exhausted, return elements from the saved copy. Repeats

215

indefinitely. Equivalent to::

216

217

def cycle(iterable):

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

218

# cycle('ABCD') --> A B C D A B C D A B C D ...

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

219

saved = []

220

for element in iterable:

221

yield element

222

saved.append(element)

223

while saved:

224

for element in saved:

225

yield element

226

227

Note, this member of the toolkit may require significant auxiliary storage

228

(depending on the length of the iterable).

229

230

231

.. function:: dropwhile(predicate, iterable)

232

233

Make an iterator that drops elements from the iterable as long as the predicate

234

is true; afterwards, returns every element. Note, the iterator does not produce

235

*any* output until the predicate first becomes false, so it may have a lengthy

236

start-up time. Equivalent to::

237

238

def dropwhile(predicate, iterable):

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

239

# dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

240

iterable = iter(iterable)

for x in iterable:

if not predicate(x):

yield x

break

for x in iterable:

yield x

Raymond Hettinger

2009-01-27 04:42:48 +0000

[diff] [blame^]

248

.. function:: filterfalse(predicate, iterable)

249

250

Make an iterator that filters elements from iterable returning only those for

251

which the predicate is ``False``. If *predicate* is ``None``, return the items

252

that are false. Equivalent to::

253

254

def filterfalse(predicate, iterable):

255

# filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8

256

if predicate is None:

predicate = bool

for x in iterable:

if not predicate(x):

yield x

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

262

263

.. function:: groupby(iterable[, key])

264

265

Make an iterator that returns consecutive keys and groups from the *iterable*.

266

The *key* is a function computing a key value for each element. If not

267

specified or is ``None``, *key* defaults to an identity function and returns

268

the element unchanged. Generally, the iterable needs to already be sorted on

269

the same key function.

270

271

The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It

272

generates a break or new group every time the value of the key function changes

273

(which is why it is usually necessary to have sorted the data using the same key

274

function). That behavior differs from SQL's GROUP BY which aggregates common

275

elements regardless of their input order.

276

277

The returned group is itself an iterator that shares the underlying iterable

278

with :func:`groupby`. Because the source is shared, when the :func:`groupby`

279

object is advanced, the previous group is no longer visible. So, if that data

280

is needed later, it should be stored as a list::

groups = []

uniquekeys = []

data = sorted(data, key=keyfunc)

285

for k, g in groupby(data, keyfunc):

286

groups.append(list(g)) # Store group iterator as a list

287

uniquekeys.append(k)

288

289

:func:`groupby` is equivalent to::

290

291

class groupby(object):

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

292

# [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B

293

# [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

294

def __init__(self, iterable, key=None):

if key is None:

key = lambda x: x

self.keyfunc = key

self.it = iter(iterable)

Christian Heimes

2007-12-31 16:14:33 +0000

[diff] [blame]

299

self.tgtkey = self.currkey = self.currvalue = object()

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

def __iter__(self):

return self

def __next__(self):

while self.currkey == self.tgtkey:

304

self.currvalue = next(self.it) # Exit on StopIteration

305

self.currkey = self.keyfunc(self.currvalue)

306

self.tgtkey = self.currkey

307

return (self.currkey, self._grouper(self.tgtkey))

308

def _grouper(self, tgtkey):

309

while self.currkey == tgtkey:

310

yield self.currvalue

311

self.currvalue = next(self.it) # Exit on StopIteration

312

self.currkey = self.keyfunc(self.currvalue)

313

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

314

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

315

.. function:: islice(iterable, [start,] stop [, step])

316

317

Make an iterator that returns selected elements from the iterable. If *start* is

318

non-zero, then elements from the iterable are skipped until start is reached.

319

Afterward, elements are returned consecutively unless *step* is set higher than

320

one which results in items being skipped. If *stop* is ``None``, then iteration

321

continues until the iterator is exhausted, if at all; otherwise, it stops at the

322

specified position. Unlike regular slicing, :func:`islice` does not support

323

negative values for *start*, *stop*, or *step*. Can be used to extract related

324

fields from data where the internal structure has been flattened (for example, a

325

multi-line report may list a name field on every third line). Equivalent to::

326

327

def islice(iterable, *args):

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

328

# islice('ABCDEFG', 2) --> A B

329

# islice('ABCDEFG', 2, 4) --> C D

330

# islice('ABCDEFG', 2, None) --> C D E F G

331

# islice('ABCDEFG', 0, None, 2) --> A C E G

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

332

s = slice(*args)

Georg Brandl

f694518

2008-02-01 11:56:49 +0000

[diff] [blame]

333

it = range(s.start or 0, s.stop or sys.maxsize, s.step or 1)

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

334

nexti = next(it)

335

for i, element in enumerate(iterable):

if i == nexti:

yield element

nexti = next(it)

If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,

341

then the step defaults to one.

342

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

343

Christian Heimes

2008-02-28 20:02:27 +0000

[diff] [blame]

344

.. function:: permutations(iterable[, r])

345

346

Return successive *r* length permutations of elements in the *iterable*.

347

348

If *r* is not specified or is ``None``, then *r* defaults to the length

Georg Brandl

2009-01-03 21:18:54 +0000

[diff] [blame]

349

of the *iterable* and all possible full-length permutations

Christian Heimes

2008-02-28 20:02:27 +0000

[diff] [blame]

350

are generated.

351

Georg Brandl

2009-01-03 21:18:54 +0000

[diff] [blame]

352

Permutations are emitted in lexicographic sort order. So, if the

Christian Heimes

2008-02-28 20:02:27 +0000

[diff] [blame]

353

input *iterable* is sorted, the permutation tuples will be produced

Georg Brandl

2009-01-03 21:18:54 +0000

[diff] [blame]

354

in sorted order.

Christian Heimes

2008-02-28 20:02:27 +0000

[diff] [blame]

355

356

Elements are treated as unique based on their position, not on their

357

value. So if the input elements are unique, there will be no repeat

358

values in each permutation.

359

Christian Heimes

2008-03-02 22:46:37 +0000

[diff] [blame]

360

Equivalent to::

361

362

def permutations(iterable, r=None):

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

363

# permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC

364

# permutations(range(3)) --> 012 021 102 120 201 210

Christian Heimes

2008-03-02 22:46:37 +0000

[diff] [blame]

365

pool = tuple(iterable)

366

n = len(pool)

367

r = n if r is None else r

Raymond Hettinger

2009-01-08 21:01:54 +0000

[diff] [blame]

368

if r > n:

369

return

370

indices = list(range(n))

Christian Heimes

2008-03-23 21:54:12 +0000

[diff] [blame]

371

cycles = range(n, n-r, -1)

Christian Heimes

2008-03-02 22:46:37 +0000

[diff] [blame]

372

yield tuple(pool[i] for i in indices[:r])

373

while n:

374

for i in reversed(range(r)):

375

cycles[i] -= 1

376

if cycles[i] == 0:

377

indices[i:] = indices[i+1:] + indices[i:i+1]

cycles[i] = n - i

else:

j = cycles[i]

indices[i], indices[-j] = indices[-j], indices[i]

382

yield tuple(pool[i] for i in indices[:r])

383

break

384

else:

385

return

Christian Heimes

2008-02-28 20:02:27 +0000

[diff] [blame]

386

Georg Brandl

2009-01-03 21:18:54 +0000

[diff] [blame]

387

The code for :func:`permutations` can be also expressed as a subsequence of

Christian Heimes

2008-03-04 23:39:23 +0000

[diff] [blame]

388

:func:`product`, filtered to exclude entries with repeated elements (those

389

from the same position in the input pool)::

390

391

def permutations(iterable, r=None):

392

pool = tuple(iterable)

393

n = len(pool)

394

r = n if r is None else r

395

for indices in product(range(n), repeat=r):

396

if len(set(indices)) == r:

397

yield tuple(pool[i] for i in indices)

398

Raymond Hettinger

2009-01-08 21:01:54 +0000

[diff] [blame]

399

The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``

400

or zero when ``r > n``.

Christian Heimes

2008-02-28 20:02:27 +0000

[diff] [blame]

401

Christian Heimes

9e7f1d2

2008-02-28 12:27:11 +0000

[diff] [blame]

402

.. function:: product(*iterables[, repeat])

Christian Heimes

2008-02-23 13:18:03 +0000

[diff] [blame]

403

404

Cartesian product of input iterables.

405

406

Equivalent to nested for-loops in a generator expression. For example,

407

``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.

408

Christian Heimes

dae2a89

2008-04-19 00:55:37 +0000

[diff] [blame]

409

The nested loops cycle like an odometer with the rightmost element advancing

410

on every iteration. This pattern creates a lexicographic ordering so that if

411

the input's iterables are sorted, the product tuples are emitted in sorted

412

order.

Christian Heimes

2008-02-23 13:18:03 +0000

[diff] [blame]

413

Christian Heimes

9e7f1d2

2008-02-28 12:27:11 +0000

[diff] [blame]

414

To compute the product of an iterable with itself, specify the number of

415

repetitions with the optional *repeat* keyword argument. For example,

416

``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.

417

Christian Heimes

2008-03-04 23:39:23 +0000

[diff] [blame]

418

This function is equivalent to the following code, except that the

419

actual implementation does not build up intermediate results in memory::

Christian Heimes

2008-02-23 13:18:03 +0000

[diff] [blame]

420

Raymond Hettinger

a137e1f

2008-03-13 02:43:14 +0000

[diff] [blame]

421

def product(*args, repeat=1):

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

422

# product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy

423

# product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111

Raymond Hettinger

a137e1f

2008-03-13 02:43:14 +0000

[diff] [blame]

424

pools = map(tuple, args) * repeat

Christian Heimes

2008-03-04 23:39:23 +0000

[diff] [blame]

425

result = [[]]

426

for pool in pools:

427

result = [x+[y] for x in result for y in pool]

428

for prod in result:

429

yield tuple(prod)

Christian Heimes

2008-02-23 13:18:03 +0000

[diff] [blame]

430

431

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

432

.. function:: repeat(object[, times])

433

434

Make an iterator that returns *object* over and over again. Runs indefinitely

Raymond Hettinger

2008-03-13 01:26:19 +0000

[diff] [blame]

435

unless the *times* argument is specified. Used as argument to :func:`map` for

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

436

invariant parameters to the called function. Also used with :func:`zip` to

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

437

create an invariant part of a tuple record. Equivalent to::

438

439

def repeat(object, times=None):

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

440

# repeat(10, 3) --> 10 10 10

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

if times is None:

while True:

yield object

else:

for i in range(times):

yield object

.. function:: starmap(function, iterable)

450

Christian Heimes

679db4a

2008-01-18 09:56:22 +0000

[diff] [blame]

451

Make an iterator that computes the function using arguments obtained from

Raymond Hettinger

2008-03-13 01:26:19 +0000

[diff] [blame]

452

the iterable. Used instead of :func:`map` when argument parameters are already

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

453

grouped in tuples from a single iterable (the data has been "pre-zipped"). The

Raymond Hettinger

2008-03-13 01:26:19 +0000

[diff] [blame]

454

difference between :func:`map` and :func:`starmap` parallels the distinction

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

455

between ``function(a,b)`` and ``function(*c)``. Equivalent to::

456

457

def starmap(function, iterable):

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

458

# starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000

Christian Heimes

679db4a

2008-01-18 09:56:22 +0000

[diff] [blame]

459

for args in iterable:

460

yield function(*args)

461

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

462

463

.. function:: takewhile(predicate, iterable)

464

465

Make an iterator that returns elements from the iterable as long as the

466

predicate is true. Equivalent to::

467

468

def takewhile(predicate, iterable):

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

469

# takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

for x in iterable:

if predicate(x):

yield x

else:

break

.. function:: tee(iterable[, n=2])

478

479

Return *n* independent iterators from a single iterable. The case where ``n==2``

480

is equivalent to::

481

482

def tee(iterable):

Christian Heimes

2007-12-31 16:14:33 +0000

[diff] [blame]

483

def gen(next, data={}):

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

484

for i in count():

Christian Heimes

2007-12-31 16:14:33 +0000

[diff] [blame]

485

if i in data:

486

yield data.pop(i)

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

487

else:

Christian Heimes

2007-12-31 16:14:33 +0000

[diff] [blame]

488

data[i] = next()

489

yield data[i]

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

490

it = iter(iterable)

491

return (gen(it.__next__), gen(it.__next__))

492

493

Note, once :func:`tee` has made a split, the original *iterable* should not be

494

used anywhere else; otherwise, the *iterable* could get advanced without the tee

495

objects being informed.

496

497

Note, this member of the toolkit may require significant auxiliary storage

498

(depending on how much temporary data needs to be stored). In general, if one

499

iterator is going to use most or all of the data before the other iterator, it

500

is faster to use :func:`list` instead of :func:`tee`.

501

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

502

Raymond Hettinger

749761e

2009-01-27 04:42:48 +0000

[diff] [blame^]

503

.. function:: zip_longest(*iterables[, fillvalue])

504

505

Make an iterator that aggregates elements from each of the iterables. If the

506

iterables are of uneven length, missing values are filled-in with *fillvalue*.

507

Iteration continues until the longest iterable is exhausted. Equivalent to::

508

509

def zip_longest(*args, fillvalue=None):

510

# zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-

511

def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):

512

yield counter() # yields the fillvalue, or raises IndexError

513

fillers = repeat(fillvalue)

514

iters = [chain(it, sentinel(), fillers) for it in args]

515

try:

516

for tup in zip(*iters):

yield tup

except IndexError:

pass

If one of the iterables is potentially infinite, then the :func:`zip_longest`

522

function should be wrapped with something that limits the number of calls

523

(for example :func:`islice` or :func:`takewhile`). If not specified,

524

*fillvalue* defaults to ``None``.

525

526

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

527

.. _itertools-example:

Examples

--------

The following examples show common uses for each tool and demonstrate ways they

Christian Heimes

2008-03-23 21:54:12 +0000

[diff] [blame]

533

can be combined.

534

535

.. doctest::

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

536

Georg Brandl

b1441c7

2009-01-03 22:33:39 +0000

[diff] [blame]

537

>>> # Show a dictionary sorted and grouped by value

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

538

>>> from operator import itemgetter

539

>>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)

Fred Drake

2e74878

2007-09-04 17:33:11 +0000

[diff] [blame]

540

>>> di = sorted(d.items(), key=itemgetter(1))

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

541

>>> for k, g in groupby(di, key=itemgetter(1)):

Georg Brandl

6911e3c

2007-09-04 07:15:32 +0000

[diff] [blame]

542

... print(k, map(itemgetter(0), g))

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

...

1 ['a', 'c', 'e']

2 ['b', 'd', 'f']

3 ['g']

Georg Brandl

2009-01-03 22:33:39 +0000

[diff] [blame]

548

>>> # Find runs of consecutive numbers using groupby. The key to the solution

549

>>> # is differencing with a range so that consecutive numbers all appear in

550

>>> # same group.

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

551

>>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]

552

>>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):

Georg Brandl

6911e3c

2007-09-04 07:15:32 +0000

[diff] [blame]

553

... print(map(operator.itemgetter(1), g))

Georg Brandl

2009-01-03 21:18:54 +0000

[diff] [blame]

554

...

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

[1]

[4, 5, 6]

[10]

[15, 16, 17, 18]

[22]

[25, 26, 27, 28]

.. _itertools-recipes:

Recipes

-------

This section shows recipes for creating an extended toolset using the existing

570

itertools as building blocks.

571

572

The extended tools offer the same high performance as the underlying toolset.

573

The superior memory performance is kept by processing elements one at a time

574

rather than bringing the whole iterable into memory all at once. Code volume is

575

kept small by linking the tools together in a functional style which helps

576

eliminate temporary variables. High speed is retained by preferring

Georg Brandl

9afde1c

2007-11-01 20:32:30 +0000

[diff] [blame]

577

"vectorized" building blocks over the use of for-loops and :term:`generator`\s

Christian Heimes

2008-03-23 21:54:12 +0000

[diff] [blame]

578

which incur interpreter overhead.

579

580

.. testcode::

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

581

Georg Brandl

2008-07-23 16:10:53 +0000

[diff] [blame]

582

def take(n, iterable):

583

"Return first n items of the iterable as a list"

584

return list(islice(iterable, n))

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

585

Georg Brandl

2008-07-23 16:10:53 +0000

[diff] [blame]

586

def enumerate(iterable, start=0):

587

return zip(count(start), iterable)

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

588

Georg Brandl

2008-07-23 16:10:53 +0000

[diff] [blame]

589

def tabulate(function, start=0):

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

590

"Return function(0), function(1), ..."

Georg Brandl

2008-07-23 16:10:53 +0000

[diff] [blame]

591

return map(function, count(start))

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

592

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

593

def nth(iterable, n):

Georg Brandl

2008-07-23 16:10:53 +0000

[diff] [blame]

594

"Returns the nth item or empty list"

595

return list(islice(iterable, n, n+1))

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

596

Georg Brandl

2008-07-23 16:10:53 +0000

[diff] [blame]

597

def quantify(iterable, pred=bool):

598

"Count how many times the predicate is true"

599

return sum(map(pred, iterable))

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

600

Georg Brandl

2008-07-23 16:10:53 +0000

[diff] [blame]

601

def padnone(iterable):

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

602

"""Returns the sequence elements and then returns None indefinitely.

603

604

Useful for emulating the behavior of the built-in map() function.

605

"""

Georg Brandl

2008-07-23 16:10:53 +0000

[diff] [blame]

606

return chain(iterable, repeat(None))

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

607

Georg Brandl

2008-07-23 16:10:53 +0000

[diff] [blame]

608

def ncycles(iterable, n):

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

609

"Returns the sequence elements n times"

Georg Brandl

2008-07-23 16:10:53 +0000

[diff] [blame]

610

return chain.from_iterable(repeat(iterable, n))

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

611

612

def dotproduct(vec1, vec2):

Raymond Hettinger

2008-03-13 01:26:19 +0000

[diff] [blame]

613

return sum(map(operator.mul, vec1, vec2))

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

614

615

def flatten(listOfLists):

Christian Heimes

2008-02-28 20:02:27 +0000

[diff] [blame]

616

return list(chain.from_iterable(listOfLists))

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

617

618

def repeatfunc(func, times=None, *args):

619

"""Repeat calls to func with specified arguments.

620

621

Example: repeatfunc(random.random)

622

"""

623

if times is None:

624

return starmap(func, repeat(args))

Christian Heimes

2008-02-28 20:02:27 +0000

[diff] [blame]

625

return starmap(func, repeat(args, times))

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

626

627

def pairwise(iterable):

628

"s -> (s0,s1), (s1,s2), (s2, s3), ..."

629

a, b = tee(iterable)

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

630

for elem in b:

631

break

632

return zip(a, b)

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

633

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

634

def grouper(n, iterable, fillvalue=None):

Raymond Hettinger

f5a2e47

2008-07-30 07:37:37 +0000

[diff] [blame]

635

"grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

636

args = [iter(iterable)] * n

Benjamin Peterson

d18de0e

2008-07-31 20:21:46 +0000

[diff] [blame]

637

return zip_longest(fillvalue=fillvalue, *args)

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

638

Christian Heimes

7b3ce6a

2008-01-31 14:31:45 +0000

[diff] [blame]

639

def roundrobin(*iterables):

Raymond Hettinger

f5a2e47

2008-07-30 07:37:37 +0000

[diff] [blame]

640

"roundrobin('ABC', 'D', 'EF') --> A D E B F C"

Christian Heimes

2008-02-28 20:02:27 +0000

[diff] [blame]

641

# Recipe credited to George Sakkis

Christian Heimes

7b3ce6a

2008-01-31 14:31:45 +0000

[diff] [blame]

642

pending = len(iterables)

Raymond Hettinger

2008-03-13 02:39:40 +0000

[diff] [blame]

643

nexts = cycle(iter(it).__next__ for it in iterables)

Christian Heimes

7b3ce6a

2008-01-31 14:31:45 +0000

[diff] [blame]

while pending:

try:

for next in nexts:

yield next()

except StopIteration:

649

pending -= 1

650

nexts = cycle(islice(nexts, pending))

Georg Brandl

2007-08-15 14:28:22 +0000

[diff] [blame]

651

Christian Heimes

2008-02-23 13:18:03 +0000

[diff] [blame]

652

def powerset(iterable):

Raymond Hettinger

ace6733

2009-01-26 02:23:50 +0000

[diff] [blame]

653

"powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"

654

s = list(iterable)

655

return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))

Christian Heimes

2008-02-23 13:18:03 +0000

[diff] [blame]

656

Raymond Hettinger

ad9d96b

2009-01-02 21:39:07 +0000

[diff] [blame]

657

def unique_everseen(iterable, key=None):

658

"List unique elements, preserving order. Remember all elements ever seen."

659

# unique_everseen('AAAABBBCCDAABBB') --> A B C D

Georg Brandl