:mod:`itertools` --- Functions creating iterators for efficient looping
=======================================================================

.. module:: itertools
   :synopsis: Functions creating iterators for efficient looping.
.. moduleauthor:: Raymond Hettinger <python@rcn.com>
.. sectionauthor:: Raymond Hettinger <python@rcn.com>


.. testsetup::

   from itertools import *

This module implements a number of :term:`iterator` building blocks inspired by
constructs from the Haskell and SML programming languages. Each has been recast
in a form suitable for Python.

The module standardizes a core set of fast, memory-efficient tools that are
useful by themselves or in combination. Standardization helps avoid the
readability and reliability problems which arise when many different individuals
create their own slightly varying implementations, each with their own quirks
and naming conventions.

The tools are designed to combine readily with one another. This makes it easy
to construct more specialized tools succinctly and efficiently in pure Python.

For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
sequence ``f(0), f(1), ...``. The same effect can be achieved in Python
by combining :func:`map` and :func:`count` to form ``map(f, count())``.
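
The tabulation idiom can be sketched as follows. The function ``square`` is a
hypothetical stand-in for ``f``, and :func:`islice` is used only to truncate the
otherwise endless stream:

```python
from itertools import count, islice

def square(n):
    """Hypothetical example function; any one-argument callable works."""
    return n * n

# map(square, count()) never ends on its own, so islice() caps it at five items.
first_squares = list(islice(map(square, count()), 5))
print(first_squares)  # [0, 1, 4, 9, 16]
```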

Likewise, the functional tools are designed to work well with the high-speed
functions provided by the :mod:`operator` module.

The module author welcomes suggestions for other basic building blocks to be
added to future versions of the module.

Whether cast in pure Python form or compiled code, tools that use iterators are
more memory efficient (and often faster) than their list-based counterparts. Adopting
the principles of just-in-time manufacturing, they create data when and where
needed instead of consuming memory with the computer equivalent of "inventory".

The performance advantage of iterators becomes more acute as the number of
elements increases -- at some point, lists grow large enough to severely impact
memory cache performance and start running slowly.


.. seealso::

   The Standard ML Basis Library, `The Standard ML Basis Library
   <http://www.standardml.org/Basis/>`_.

   Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
   Libraries <http://www.haskell.org/definition/>`_.

.. _itertools-functions:

Itertool functions
------------------

The following module functions all construct and return iterators. Some provide
streams of infinite length, so they should only be accessed by functions or
loops that truncate the stream.

.. function:: chain(*iterables)

   Make an iterator that returns elements from the first iterable until it is
   exhausted, then proceeds to the next iterable, until all of the iterables are
   exhausted. Used for treating consecutive sequences as a single sequence.
   Equivalent to::

      def chain(*iterables):
          # chain('ABC', 'DEF') --> A B C D E F
          for it in iterables:
              for element in it:
                  yield element


.. function:: itertools.chain.from_iterable(iterable)

   Alternate constructor for :func:`chain`. Gets chained inputs from a
   single iterable argument that is evaluated lazily. Equivalent to::

      @classmethod
      def from_iterable(cls, iterables):
          # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
          for it in iterables:
              for element in it:
                  yield element

   .. versionadded:: 2.6
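
A short sketch of the two constructors side by side. The generator
``letter_rows`` is a hypothetical endless source invented for this example; it
illustrates that ``chain.from_iterable`` pulls its inputs on demand:

```python
from itertools import chain, count, islice

joined = list(chain('ABC', 'DEF'))

def letter_rows():
    """Hypothetical endless source of short iterables (not part of itertools)."""
    for n in count():
        yield 'AB' if n % 2 == 0 else 'CD'

# from_iterable() consumes the outer iterable lazily, so an endless one is fine
# as long as the result is truncated.
lazy = list(islice(chain.from_iterable(letter_rows()), 6))
```

Passing the same endless generator as ``chain(*letter_rows())`` would never
return, because argument unpacking has to exhaust the generator first.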

.. function:: combinations(iterable, r)

   Return successive *r* length combinations of elements in the *iterable*.

   Combinations are emitted in lexicographic sort order. So, if the
   input *iterable* is sorted, the combination tuples will be produced
   in sorted order.

   Elements are treated as unique based on their position, not on their
   value. So if the input elements are unique, there will be no repeat
   values in each combination.

   Each result tuple is ordered to match the input order. So, every
   combination is a subsequence of the input *iterable*.

   Equivalent to::

      def combinations(iterable, r):
          # combinations('ABCD', 2) --> AB AC AD BC BD CD
          # combinations(range(4), 3) --> 012 013 023 123
          pool = tuple(iterable)
          n = len(pool)
          if r > n:
              return                      # no r-length combination exists
          indices = list(range(r))        # a list, because it is mutated below
          yield tuple(pool[i] for i in indices)
          while True:
              for i in reversed(range(r)):
                  if indices[i] != i + n - r:
                      break
              else:
                  return
              indices[i] += 1
              for j in range(i+1, r):
                  indices[j] = indices[j-1] + 1
              yield tuple(pool[i] for i in indices)

   The code for :func:`combinations` can also be expressed as a subsequence
   of :func:`permutations` after filtering entries where the elements are not
   in sorted order (according to their position in the input pool)::

      def combinations(iterable, r):
          pool = tuple(iterable)
          n = len(pool)
          for indices in permutations(range(n), r):
              if sorted(indices) == list(indices):
                  yield tuple(pool[i] for i in indices)

   .. versionadded:: 2.6
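
A small sketch of the position-based uniqueness rule described above. It
assumes Python 3.8 or later for ``math.comb``, which is used only to
cross-check the number of results:

```python
from itertools import combinations
from math import comb

pairs = list(combinations('ABCD', 2))

# The two 'A' characters occupy different positions, so they count as
# distinct inputs and ('A', 'A') is a valid combination.
dupes = list(combinations('AAB', 2))
```

The length of ``pairs`` matches the binomial coefficient ``comb(4, 2)``.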

.. function:: count([n])

   Make an iterator that returns consecutive integers starting with *n*. If not
   specified, *n* defaults to zero. Often used as an argument to :func:`map` to
   generate consecutive data points. Also used with :func:`zip` to add sequence
   numbers. Equivalent to::

      def count(n=0):
          # count(10) --> 10 11 12 13 14 ...
          while True:
              yield n
              n += 1
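
As an illustration of the sequence-number use, here is a minimal sketch; the
list ``names`` is made-up sample data:

```python
from itertools import count

names = ['alpha', 'beta', 'gamma']  # hypothetical sample data

# zip() stops at the shorter input, so the endless counter is safe here.
numbered = list(zip(count(1), names))
```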

.. function:: cycle(iterable)

   Make an iterator returning elements from the iterable and saving a copy of each.
   When the iterable is exhausted, return elements from the saved copy. Repeats
   indefinitely. Equivalent to::

      def cycle(iterable):
          # cycle('ABCD') --> A B C D A B C D A B C D ...
          saved = []
          for element in iterable:
              yield element
              saved.append(element)
          while saved:
              for element in saved:
                  yield element

   Note, this member of the toolkit may require significant auxiliary storage
   (depending on the length of the iterable).
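
One possible use, sketched with made-up data: pairing a finite list with a
repeating pattern. Both ``rows`` and ``colors`` are hypothetical:

```python
from itertools import cycle

colors = ['red', 'green', 'blue']      # hypothetical repeating pattern
rows = ['r0', 'r1', 'r2', 'r3', 'r4']  # hypothetical finite data

# zip() ends when rows is exhausted, which truncates the endless cycle.
striped = list(zip(rows, cycle(colors)))
```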

.. function:: dropwhile(predicate, iterable)

   Make an iterator that drops elements from the iterable as long as the predicate
   is true; afterwards, returns every element. Note, the iterator does not produce
   *any* output until the predicate first becomes false, so it may have a lengthy
   start-up time. Equivalent to::

      def dropwhile(predicate, iterable):
          # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
          iterable = iter(iterable)
          for x in iterable:
              if not predicate(x):
                  yield x
                  break
          for x in iterable:
              yield x
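
The "afterwards, returns every element" rule means the predicate is no longer
consulted once it has failed. A brief sketch with made-up input lines:

```python
from itertools import dropwhile

# Hypothetical file contents: leading comment lines followed by data.
lines = ['# header', '# more header', 'data 1', '# inline note', 'data 2']

# Only the leading run of comment lines is dropped; the later one is kept.
body = list(dropwhile(lambda s: s.startswith('#'), lines))
```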

.. function:: groupby(iterable[, key])

   Make an iterator that returns consecutive keys and groups from the *iterable*.
   The *key* is a function computing a key value for each element. If not
   specified or is ``None``, *key* defaults to an identity function and returns
   the element unchanged. Generally, the iterable needs to already be sorted on
   the same key function.

   The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
   generates a break or new group every time the value of the key function changes
   (which is why it is usually necessary to have sorted the data using the same key
   function). That behavior differs from SQL's GROUP BY which aggregates common
   elements regardless of their input order.

   The returned group is itself an iterator that shares the underlying iterable
   with :func:`groupby`. Because the source is shared, when the :func:`groupby`
   object is advanced, the previous group is no longer visible. So, if that data
   is needed later, it should be stored as a list::

      groups = []
      uniquekeys = []
      data = sorted(data, key=keyfunc)
      for k, g in groupby(data, keyfunc):
          groups.append(list(g))      # Store group iterator as a list
          uniquekeys.append(k)

   :func:`groupby` is equivalent to::

      class groupby(object):
          # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
          # [(list(g)) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
          def __init__(self, iterable, key=None):
              if key is None:
                  key = lambda x: x
              self.keyfunc = key
              self.it = iter(iterable)
              self.tgtkey = self.currkey = self.currvalue = object()
          def __iter__(self):
              return self
          def __next__(self):
              while self.currkey == self.tgtkey:
                  self.currvalue = next(self.it)    # Exit on StopIteration
                  self.currkey = self.keyfunc(self.currvalue)
              self.tgtkey = self.currkey
              return (self.currkey, self._grouper(self.tgtkey))
          def _grouper(self, tgtkey):
              while self.currkey == tgtkey:
                  yield self.currvalue
                  try:
                      self.currvalue = next(self.it)
                  except StopIteration:
                      return      # a generator must not leak StopIteration
                  self.currkey = self.keyfunc(self.currvalue)
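
The difference between grouping unsorted and sorted input can be seen in a
small sketch. The word list and the ``first`` key function are made up for the
example:

```python
from itertools import groupby

words = ['apple', 'bob', 'avocado', 'banana']  # hypothetical sample data

def first(word):
    """Hypothetical key function: group words by their first letter."""
    return word[0]

# Unsorted input: a new group starts every time the key changes.
unsorted_keys = [k for k, g in groupby(words, key=first)]

# Sorting on the same key first gives one group per distinct key.
sorted_groups = {k: list(g)
                 for k, g in groupby(sorted(words, key=first), key=first)}
```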

.. function:: filterfalse(predicate, iterable)

   Make an iterator that filters elements from iterable returning only those for
   which the predicate is ``False``. If *predicate* is ``None``, return the items
   that are false. Equivalent to::

      def filterfalse(predicate, iterable):
          # filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
          if predicate is None:
              predicate = bool
          for x in iterable:
              if not predicate(x):
                  yield x
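
A brief sketch of both modes, with and without a predicate. The helper
``is_odd`` is a hypothetical name introduced for the example:

```python
from itertools import filterfalse

def is_odd(x):
    """Hypothetical predicate used for illustration."""
    return x % 2

evens = list(filterfalse(is_odd, range(10)))

# With predicate=None, the elements themselves are tested for truth.
falsy = list(filterfalse(None, [0, 1, '', 'a', None]))
```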

.. function:: islice(iterable, [start,] stop [, step])

   Make an iterator that returns selected elements from the iterable. If *start* is
   non-zero, then elements from the iterable are skipped until start is reached.
   Afterward, elements are returned consecutively unless *step* is set higher than
   one, which results in items being skipped. If *stop* is ``None``, then iteration
   continues until the iterator is exhausted, if at all; otherwise, it stops at the
   specified position. Unlike regular slicing, :func:`islice` does not support
   negative values for *start*, *stop*, or *step*. Can be used to extract related
   fields from data where the internal structure has been flattened (for example, a
   multi-line report may list a name field on every third line). Equivalent to::

      import sys

      def islice(iterable, *args):
          # islice('ABCDEFG', 2) --> A B
          # islice('ABCDEFG', 2, 4) --> C D
          # islice('ABCDEFG', 2, None) --> C D E F G
          # islice('ABCDEFG', 0, None, 2) --> A C E G
          s = slice(*args)
          # A range is not an iterator, so wrap it before calling next()
          it = iter(range(s.start or 0, s.stop or sys.maxsize, s.step or 1))
          try:
              nexti = next(it)
              for i, element in enumerate(iterable):
                  if i == nexti:
                      yield element
                      nexti = next(it)
          except StopIteration:
              return

   If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
   then the step defaults to one.
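
The flattened-report use case can be sketched like this. The ``report`` list is
made-up data in which every record spans three consecutive entries:

```python
from itertools import islice

# Hypothetical flattened records: name, age, city repeated for each person.
report = ['Ann', 'age 31', 'Oslo', 'Bob', 'age 45', 'Rome']

names = list(islice(report, 0, None, 3))   # every third entry from index 0
cities = list(islice(report, 2, None, 3))  # every third entry from index 2
```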

.. function:: zip_longest(*iterables[, fillvalue])

   Make an iterator that aggregates elements from each of the iterables. If the
   iterables are of uneven length, missing values are filled-in with *fillvalue*.
   Iteration continues until the longest iterable is exhausted. Equivalent to::

      def zip_longest(*args, fillvalue=None):
          # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
          def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
              yield counter()         # yields the fillvalue, or raises IndexError
          fillers = repeat(fillvalue)
          iters = [chain(it, sentinel(), fillers) for it in args]
          try:
              for tup in zip(*iters):
                  yield tup
          except IndexError:
              pass

   If one of the iterables is potentially infinite, then the :func:`zip_longest`
   function should be wrapped with something that limits the number of calls (for
   example :func:`islice` or :func:`takewhile`).
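
A minimal sketch of padding, followed by the advice about infinite inputs,
using :func:`count` as the endless iterable and :func:`islice` as the limiter:

```python
from itertools import count, islice, zip_longest

padded = list(zip_longest('ABCD', 'xy', fillvalue='-'))

# count() never ends, so zip_longest() would not either; islice() bounds it.
bounded = list(islice(zip_longest(count(), 'ab', fillvalue='?'), 4))
```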

.. function:: permutations(iterable[, r])

   Return successive *r* length permutations of elements in the *iterable*.

   If *r* is not specified or is ``None``, then *r* defaults to the length
   of the *iterable* and all possible full-length permutations
   are generated.

   Permutations are emitted in lexicographic sort order. So, if the
   input *iterable* is sorted, the permutation tuples will be produced
   in sorted order.

   Elements are treated as unique based on their position, not on their
   value. So if the input elements are unique, there will be no repeat
   values in each permutation.

   Equivalent to::

      def permutations(iterable, r=None):
          # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
          # permutations(range(3)) --> 012 021 102 120 201 210
          pool = tuple(iterable)
          n = len(pool)
          r = n if r is None else r
          if r > n:
              return                              # no r-length permutation exists
          indices = list(range(n))                # lists, because both are mutated
          cycles = list(range(n, n-r, -1))
          yield tuple(pool[i] for i in indices[:r])
          while n:
              for i in reversed(range(r)):
                  cycles[i] -= 1
                  if cycles[i] == 0:
                      indices[i:] = indices[i+1:] + indices[i:i+1]
                      cycles[i] = n - i
                  else:
                      j = cycles[i]
                      indices[i], indices[-j] = indices[-j], indices[i]
                      yield tuple(pool[i] for i in indices[:r])
                      break
              else:
                  return

   The code for :func:`permutations` can also be expressed as a subsequence of
   :func:`product`, filtered to exclude entries with repeated elements (those
   from the same position in the input pool)::

      def permutations(iterable, r=None):
          pool = tuple(iterable)
          n = len(pool)
          r = n if r is None else r
          for indices in product(range(n), repeat=r):
              if len(set(indices)) == r:
                  yield tuple(pool[i] for i in indices)

   .. versionadded:: 2.6
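
A quick sketch of the ordering and the number of results. It assumes Python
3.8 or later for ``math.perm``, used here only as a cross-check:

```python
from itertools import permutations
from math import perm

arrangements = [''.join(p) for p in permutations('ABC', 2)]

# With r omitted, full-length permutations are produced.
full = list(permutations(range(3)))
```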

.. function:: product(*iterables[, repeat])

   Cartesian product of input iterables.

   Equivalent to nested for-loops in a generator expression. For example,
   ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.

   The leftmost iterators correspond to the outermost for-loop, so the output
   tuples cycle like an odometer (with the rightmost element changing on every
   iteration). This results in a lexicographic ordering so that if the
   input iterables are sorted, the product tuples are emitted
   in sorted order.

   To compute the product of an iterable with itself, specify the number of
   repetitions with the optional *repeat* keyword argument. For example,
   ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.

   This function is equivalent to the following code, except that the
   actual implementation does not build up intermediate results in memory::

      def product(*args, repeat=1):
          # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
          # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
          pools = list(map(tuple, args)) * repeat
          result = [[]]
          for pool in pools:
              result = [x+[y] for x in result for y in pool]
          for prod in result:
              yield tuple(prod)
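
The equivalence with nested loops and the odometer ordering can be checked
with a short sketch; ``A`` and ``B`` are arbitrary sample inputs:

```python
from itertools import product

A, B = 'ab', [1, 2]  # arbitrary sample inputs

via_product = list(product(A, B))
via_loops = [(x, y) for x in A for y in B]

# With repeat=2 the rightmost position changes fastest, like an odometer.
bits = [''.join(map(str, t)) for t in product(range(2), repeat=2)]
```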

.. function:: repeat(object[, times])

   Make an iterator that returns *object* over and over again. Runs indefinitely
   unless the *times* argument is specified. Used as an argument to :func:`map` for
   invariant parameters to the called function. Also used with :func:`zip` to
   create an invariant part of a tuple record. Equivalent to::

      def repeat(object, times=None):
          # repeat(10, 3) --> 10 10 10
          if times is None:
              while True:
                  yield object
          else:
              for i in range(times):
                  yield object
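
A minimal sketch of both uses mentioned above: supplying a constant second
argument to :func:`map`, and adding a constant field with :func:`zip`. The
``'v1'`` tag is a made-up value:

```python
from itertools import repeat

# repeat(2) supplies the same exponent for every base; map() stops when
# range(5) is exhausted.
squares = list(map(pow, range(5), repeat(2)))

# A constant tag attached to each record (hypothetical data).
tagged = list(zip('ab', repeat('v1')))
```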

.. function:: starmap(function, iterable)

   Make an iterator that computes the function using arguments obtained from
   the iterable. Used instead of :func:`map` when argument parameters are already
   grouped in tuples from a single iterable (the data has been "pre-zipped"). The
   difference between :func:`map` and :func:`starmap` parallels the distinction
   between ``function(a,b)`` and ``function(*c)``. Equivalent to::

      def starmap(function, iterable):
          # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
          for args in iterable:
              yield function(*args)

   .. versionchanged:: 2.6
      Previously, :func:`starmap` required the function arguments to be tuples.
      Now, any iterable is allowed.
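
The parallel between :func:`map` and :func:`starmap` can be sketched by
computing the same results both ways from "pre-zipped" data:

```python
from itertools import starmap

pairs = [(2, 5), (3, 2), (10, 3)]  # arguments already grouped in tuples

from_starmap = list(starmap(pow, pairs))

# map() wants the arguments in separate iterables, so unzip them first.
bases, exps = zip(*pairs)
from_map = list(map(pow, bases, exps))
```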

.. function:: takewhile(predicate, iterable)

   Make an iterator that returns elements from the iterable as long as the
   predicate is true. Equivalent to::

      def takewhile(predicate, iterable):
          # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
          for x in iterable:
              if predicate(x):
                  yield x
              else:
                  break
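
One consequence of the equivalent code above is worth a sketch: the first
element that fails the predicate has already been read from the underlying
iterator, so it is not available afterwards.

```python
from itertools import takewhile

it = iter([1, 4, 6, 4, 1])
head = list(takewhile(lambda x: x < 5, it))

# The 6 was consumed while testing the predicate, so it is missing here.
rest = list(it)
```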

.. function:: tee(iterable[, n=2])

   Return *n* independent iterators from a single iterable. The case where ``n==2``
   is equivalent to::

      def tee(iterable):
          def gen(next, data):
              for i in count():
                  if i in data:
                      yield data.pop(i)
                  else:
                      try:
                          data[i] = next()
                      except StopIteration:
                          return
                      yield data[i]
          it = iter(iterable)
          data = {}       # buffer shared by the two returned iterators only
          return (gen(it.__next__, data), gen(it.__next__, data))

   Note, once :func:`tee` has made a split, the original *iterable* should not be
   used anywhere else; otherwise, the *iterable* could get advanced without the tee
   objects being informed.

   Note, this member of the toolkit may require significant auxiliary storage
   (depending on how much temporary data needs to be stored). In general, if one
   iterator is going to use most or all of the data before the other iterator, it
   is faster to use :func:`list` instead of :func:`tee`.
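
A minimal usage sketch. Each returned iterator sees the full sequence, even
though the source can only be traversed once:

```python
from itertools import tee

source = iter(range(5))  # a one-shot iterator
a, b = tee(source)

total = sum(a)    # consumes every element through the first iterator
largest = max(b)  # the second iterator still yields all five elements
```

Because ``a`` is fully consumed before ``b`` is touched, this is the situation
in which the note above suggests :func:`list` would be the faster choice.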

.. _itertools-example:

Examples
--------

The following examples show common uses for each tool and demonstrate ways they
can be combined.

.. doctest::

   # Show a dictionary sorted and grouped by value
   >>> from operator import itemgetter
   >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
   >>> di = sorted(d.items(), key=itemgetter(1))
   >>> for k, g in groupby(di, key=itemgetter(1)):
   ...     print(k, list(map(itemgetter(0), g)))
   ...
   1 ['a', 'c', 'e']
   2 ['b', 'd', 'f']
   3 ['g']

   # Find runs of consecutive numbers using groupby. The key to the solution
   # is differencing with a range so that consecutive numbers all appear in
   # same group.
   >>> data = [ 1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28]
   >>> for k, g in groupby(enumerate(data), lambda t:t[0]-t[1]):
   ...     print(list(map(itemgetter(1), g)))
   ...
   [1]
   [4, 5, 6]
   [10]
   [15, 16, 17, 18]
   [22]
   [25, 26, 27, 28]

.. _itertools-recipes:

Recipes
-------

This section shows recipes for creating an extended toolset using the existing
itertools as building blocks.

The extended tools offer the same high performance as the underlying toolset.
The superior memory performance is kept by processing elements one at a time
rather than bringing the whole iterable into memory all at once. Code volume is
kept small by linking the tools together in a functional style which helps
eliminate temporary variables. High speed is retained by preferring
"vectorized" building blocks over the use of for-loops and :term:`generator`\s
which incur interpreter overhead.

.. testcode::

   import operator

   def take(n, seq):
       return list(islice(seq, n))

   def enumerate(iterable):
       return zip(count(), iterable)

   def tabulate(function):
       "Return function(0), function(1), ..."
       return map(function, count())

   def items(mapping):
       return zip(mapping.keys(), mapping.values())

   def nth(iterable, n):
       "Returns the nth item or raise StopIteration"
       return next(islice(iterable, n, None))

   def all(seq, pred=None):
       "Returns True if pred(x) is true for every element in the iterable"
       for elem in filterfalse(pred, seq):
           return False
       return True

   def any(seq, pred=None):
       "Returns True if pred(x) is true for at least one element in the iterable"
       for elem in filter(pred, seq):
           return True
       return False

   def no(seq, pred=None):
       "Returns True if pred(x) is false for every element in the iterable"
       for elem in filter(pred, seq):
           return False
       return True

   def quantify(seq, pred=bool):
       "Count how many times the predicate is true in the sequence"
       return sum(map(pred, seq))

   def padnone(seq):
       """Returns the sequence elements and then returns None indefinitely.

       Useful for emulating the behavior of the built-in map() function.
       """
       return chain(seq, repeat(None))

   def ncycles(seq, n):
       "Returns the sequence elements n times"
       return chain.from_iterable(repeat(seq, n))

   def dotproduct(vec1, vec2):
       return sum(map(operator.mul, vec1, vec2))

   def flatten(listOfLists):
       return list(chain.from_iterable(listOfLists))

   def repeatfunc(func, times=None, *args):
       """Repeat calls to func with specified arguments.

       Example: repeatfunc(random.random)
       """
       if times is None:
           return starmap(func, repeat(args))
       return starmap(func, repeat(args, times))

   def pairwise(iterable):
       "s -> (s0,s1), (s1,s2), (s2, s3), ..."
       a, b = tee(iterable)
       for elem in b:
           break
       return zip(a, b)

   def grouper(n, iterable, fillvalue=None):
       "grouper(3, 'abcdefg', 'x') --> ('a','b','c'), ('d','e','f'), ('g','x','x')"
       args = [iter(iterable)] * n
       return zip_longest(*args, fillvalue=fillvalue)

   def roundrobin(*iterables):
       "roundrobin('abc', 'd', 'ef') --> 'a', 'd', 'e', 'b', 'f', 'c'"
       # Recipe credited to George Sakkis
       pending = len(iterables)
       nexts = cycle(iter(it).__next__ for it in iterables)
       while pending:
           try:
               for next in nexts:
                   yield next()
           except StopIteration:
               pending -= 1
               nexts = cycle(islice(nexts, pending))

   def powerset(iterable):
       "powerset('ab') --> set([]), set(['a']), set(['b']), set(['a', 'b'])"
       # Recipe credited to Eric Raymond
       pairs = [(2**i, x) for i, x in enumerate(iterable)]
       for n in range(2**len(pairs)):
           yield set(x for m, x in pairs if m&n)