Blame - Doc/library/itertools.rst - platform/external/python/cpython2

:mod:`itertools` --- Functions creating iterators for efficient looping
=======================================================================

.. module:: itertools
   :synopsis: Functions creating iterators for efficient looping.
.. moduleauthor:: Raymond Hettinger <python@rcn.com>
.. sectionauthor:: Raymond Hettinger <python@rcn.com>


.. testsetup::

   from itertools import *

.. versionadded:: 2.3

This module implements a number of :term:`iterator` building blocks inspired by
constructs from the Haskell and SML programming languages. Each has been recast
in a form suitable for Python.

The module standardizes a core set of fast, memory-efficient tools that are
useful by themselves or in combination. Standardization helps avoid the
readability and reliability problems which arise when many different individuals
create their own slightly varying implementations, each with their own quirks
and naming conventions.

The tools are designed to combine readily with one another. This makes it easy
to construct more specialized tools succinctly and efficiently in pure Python.

For instance, SML provides a tabulation tool: ``tabulate(f)`` which produces a
sequence ``f(0), f(1), ...``. This toolbox provides :func:`imap` and
:func:`count` which can be combined to form ``imap(f, count())`` and produce an
equivalent result.

Likewise, the functional tools are designed to work well with the high-speed
functions provided by the :mod:`operator` module.

Whether cast in pure Python form or compiled code, tools that use iterators are
more memory-efficient (and often faster) than their list-based counterparts. Adopting
the principles of just-in-time manufacturing, they create data when and where
needed instead of consuming memory with the computer equivalent of "inventory".

.. seealso::

   The Standard ML Basis Library, `The Standard ML Basis Library
   <http://www.standardml.org/Basis/>`_.

   Haskell, A Purely Functional Language, `Definition of Haskell and the Standard
   Libraries <http://www.haskell.org/definition/>`_.

.. _itertools-functions:

Itertool functions
------------------

The following module functions all construct and return iterators. Some provide
streams of infinite length, so they should only be accessed by functions or
loops that truncate the stream.

.. function:: chain(*iterables)

   Make an iterator that returns elements from the first iterable until it is
   exhausted, then proceeds to the next iterable, until all of the iterables are
   exhausted. Used for treating consecutive sequences as a single sequence.
   Equivalent to::

      def chain(*iterables):
          # chain('ABC', 'DEF') --> A B C D E F
          for it in iterables:
              for element in it:
                  yield element

.. function:: itertools.chain.from_iterable(iterable)

   Alternate constructor for :func:`chain`. Gets chained inputs from a
   single iterable argument that is evaluated lazily. Equivalent to::

      @classmethod
      def from_iterable(iterables):
          # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F
          for it in iterables:
              for element in it:
                  yield element

   .. versionadded:: 2.6
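
As a rough illustration of the two constructors, the sketch below chains a few small inputs. The sample data and variable names are made up for this example, and the code is written to run on both Python 2.6+ and Python 3.

```python
from itertools import chain, islice, repeat

# Hypothetical sample data, not taken from the text above.
letters = ['a', 'b']
numbers = (1, 2, 3)

# chain() walks the first iterable to exhaustion, then moves to the next.
flat = list(chain(letters, numbers))

# chain.from_iterable() takes one iterable that yields the inputs.
nested = [[1, 2], [3], []]
flattened = list(chain.from_iterable(nested))

# Because the outer iterable is consumed lazily, it may even be infinite,
# provided the result is truncated (here with islice).
endless = chain.from_iterable(repeat('ab'))
first_five = list(islice(endless, 5))
```

The last case would never finish with ``chain(*repeat('ab'))``, since argument unpacking has to consume the whole outer iterable up front.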

.. function:: combinations(iterable, r)

   Return *r* length subsequences of elements from the input *iterable*.

   Combinations are emitted in lexicographic sort order. So, if the
   input *iterable* is sorted, the combination tuples will be produced
   in sorted order.

   Elements are treated as unique based on their position, not on their
   value. So if the input elements are unique, there will be no repeat
   values in each combination.

   Equivalent to::

      def combinations(iterable, r):
          # combinations('ABCD', 2) --> AB AC AD BC BD CD
          # combinations(range(4), 3) --> 012 013 023 123
          pool = tuple(iterable)
          n = len(pool)
          if r > n:
              return
          indices = range(r)
          yield tuple(pool[i] for i in indices)
          while 1:
              for i in reversed(range(r)):
                  if indices[i] != i + n - r:
                      break
              else:
                  return
              indices[i] += 1
              for j in range(i+1, r):
                  indices[j] = indices[j-1] + 1
              yield tuple(pool[i] for i in indices)

   The code for :func:`combinations` can also be expressed as a subsequence
   of :func:`permutations` after filtering entries where the elements are not
   in sorted order (according to their position in the input pool)::

      def combinations(iterable, r):
          pool = tuple(iterable)
          n = len(pool)
          for indices in permutations(range(n), r):
              if sorted(indices) == list(indices):
                  yield tuple(pool[i] for i in indices)

   The number of items returned is ``n! / r! / (n-r)!`` when ``0 <= r <= n``
   or zero when ``r > n``.

   .. versionadded:: 2.6
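
The counting formula can be checked against the iterator directly. In the sketch below, ``n_choose_r`` is a helper name invented for this example (it is not part of :mod:`itertools`); the code should run on Python 2.6+ and Python 3.

```python
from itertools import combinations
from math import factorial

def n_choose_r(n, r):
    """Hypothetical helper: n! / r! / (n-r)! for 0 <= r <= n, else zero."""
    if r < 0 or r > n:
        return 0
    return factorial(n) // factorial(r) // factorial(n - r)

pool = 'ABCD'
pairs = [''.join(c) for c in combinations(pool, 2)]

# Asking for more elements than the pool holds yields nothing.
too_long = list(combinations('AB', 3))
```

Because the input ``'ABCD'`` is already sorted, ``pairs`` comes out in sorted order as well.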

.. function:: combinations_with_replacement(iterable, r)

   Return *r* length subsequences of elements from the input *iterable*
   allowing individual elements to be repeated more than once.

   Combinations are emitted in lexicographic sort order. So, if the
   input *iterable* is sorted, the combination tuples will be produced
   in sorted order.

   Elements are treated as unique based on their position, not on their
   value. So if the input elements are unique, the generated combinations
   will also be unique.

   Equivalent to::

      def combinations_with_replacement(iterable, r):
          # combinations_with_replacement('ABC', 2) --> AA AB AC BB BC CC
          pool = tuple(iterable)
          n = len(pool)
          if not n and r:
              return
          indices = [0] * r
          yield tuple(pool[i] for i in indices)
          while 1:
              for i in reversed(range(r)):
                  if indices[i] != n - 1:
                      break
              else:
                  return
              indices[i:] = [indices[i] + 1] * (r - i)
              yield tuple(pool[i] for i in indices)

   The code for :func:`combinations_with_replacement` can also be expressed as
   a subsequence of :func:`product` after filtering entries where the elements
   are not in sorted order (according to their position in the input pool)::

      def combinations_with_replacement(iterable, r):
          pool = tuple(iterable)
          n = len(pool)
          for indices in product(range(n), repeat=r):
              if sorted(indices) == list(indices):
                  yield tuple(pool[i] for i in indices)

   The number of items returned is ``(n+r-1)! / r! / (n-1)!`` when ``n > 0``.

   .. versionadded:: 2.7
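
The relationship to :func:`product` described above can be spot-checked on a small pool. This is only a sketch with made-up variable names; it assumes an interpreter that provides ``combinations_with_replacement`` (Python 2.7 or Python 3).

```python
from itertools import combinations_with_replacement, product
from math import factorial

pool = 'ABC'
r = 2
n = len(pool)

direct = list(combinations_with_replacement(pool, r))

# Keep only the product() index tuples that are already in sorted order.
filtered = [tuple(pool[i] for i in indices)
            for indices in product(range(n), repeat=r)
            if sorted(indices) == list(indices)]

# (n+r-1)! / r! / (n-1)! from the text above.
expected_count = factorial(n + r - 1) // factorial(r) // factorial(n - 1)
```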

.. function:: compress(data, selectors)

   Make an iterator that filters elements from *data* returning only those that
   have a corresponding element in *selectors* that evaluates to ``True``.
   Stops when either the *data* or *selectors* iterables have been exhausted.
   Equivalent to::

      def compress(data, selectors):
          # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
          return (d for d, s in izip(data, selectors) if s)

   .. versionadded:: 2.7
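
A short sketch of both behaviours (selection by truth value, and stopping at the shorter input) follows. The data is illustrative only, and the code assumes Python 2.7 or Python 3, where ``compress`` is available.

```python
from itertools import compress

data = 'ABCDEF'
selectors = [1, 0, 1, 0, 1, 1]

# Keep the elements whose selector is true.
picked = list(compress(data, selectors))

# With a shorter selectors iterable, iteration stops when it runs out.
short = list(compress(data, [True, True]))
```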

.. function:: count([n])

   Make an iterator that returns consecutive integers starting with *n*. If not
   specified, *n* defaults to zero. Often used as an argument to :func:`imap` to
   generate consecutive data points. Also used with :func:`izip` to add sequence
   numbers. Equivalent to::

      def count(n=0):
          # count(10) --> 10 11 12 13 14 ...
          while True:
              yield n
              n += 1

.. function:: cycle(iterable)

   Make an iterator returning elements from the iterable and saving a copy of each.
   When the iterable is exhausted, return elements from the saved copy. Repeats
   indefinitely. Equivalent to::

      def cycle(iterable):
          # cycle('ABCD') --> A B C D A B C D A B C D ...
          saved = []
          for element in iterable:
              yield element
              saved.append(element)
          while saved:
              for element in saved:
                    yield element

   Note, this member of the toolkit may require significant auxiliary storage
   (depending on the length of the iterable).

.. function:: dropwhile(predicate, iterable)

   Make an iterator that drops elements from the iterable as long as the predicate
   is true; afterwards, returns every element. Note, the iterator does not produce
   *any* output until the predicate first becomes false, so it may have a lengthy
   start-up time. Equivalent to::

      def dropwhile(predicate, iterable):
          # dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1
          iterable = iter(iterable)
          for x in iterable:
              if not predicate(x):
                  yield x
                  break
          for x in iterable:
              yield x
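
One point that is easy to miss is that the predicate is no longer consulted once it has returned false. The sketch below contrasts :func:`dropwhile` with :func:`takewhile` on the same made-up input; it should run on Python 2.6+ and Python 3.

```python
from itertools import dropwhile, takewhile

readings = [1, 4, 6, 4, 1]

def below_five(x):
    return x < 5

# Everything from the first failing element onward is passed through,
# including later elements (4 and 1) that would satisfy the predicate.
tail = list(dropwhile(below_five, readings))

# takewhile() is the complementary prefix.
head = list(takewhile(below_five, readings))
```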

.. function:: groupby(iterable[, key])

   Make an iterator that returns consecutive keys and groups from the *iterable*.
   The *key* is a function computing a key value for each element. If not
   specified or is ``None``, *key* defaults to an identity function and returns
   the element unchanged. Generally, the iterable needs to already be sorted on
   the same key function.

   The operation of :func:`groupby` is similar to the ``uniq`` filter in Unix. It
   generates a break or new group every time the value of the key function changes
   (which is why it is usually necessary to have sorted the data using the same key
   function). That behavior differs from SQL's GROUP BY which aggregates common
   elements regardless of their input order.

   The returned group is itself an iterator that shares the underlying iterable
   with :func:`groupby`. Because the source is shared, when the :func:`groupby`
   object is advanced, the previous group is no longer visible. So, if that data
   is needed later, it should be stored as a list::

      groups = []
      uniquekeys = []
      data = sorted(data, key=keyfunc)
      for k, g in groupby(data, keyfunc):
          groups.append(list(g))      # Store group iterator as a list
          uniquekeys.append(k)

   :func:`groupby` is equivalent to::

      class groupby(object):
          # [k for k, g in groupby('AAAABBBCCDAABBB')] --> A B C D A B
          # [list(g) for k, g in groupby('AAAABBBCCD')] --> AAAA BBB CC D
          def __init__(self, iterable, key=None):
              if key is None:
                  key = lambda x: x
              self.keyfunc = key
              self.it = iter(iterable)
              self.tgtkey = self.currkey = self.currvalue = object()
          def __iter__(self):
              return self
          def next(self):
              while self.currkey == self.tgtkey:
                  self.currvalue = self.it.next() # Exit on StopIteration
                  self.currkey = self.keyfunc(self.currvalue)
              self.tgtkey = self.currkey
              return (self.currkey, self._grouper(self.tgtkey))
          def _grouper(self, tgtkey):
              while self.currkey == tgtkey:
                  yield self.currvalue
                  self.currvalue = self.it.next() # Exit on StopIteration
                  self.currkey = self.keyfunc(self.currvalue)

   .. versionadded:: 2.4
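
To make the ``uniq``-like behaviour concrete, the sketch below groups the same made-up word list twice, once as given and once after sorting on the key. The names ``words`` and ``first_letter`` are illustrative only; the code should run on Python 2.6+ and Python 3.

```python
from itertools import groupby

words = ['apple', 'avocado', 'banana', 'apricot', 'blueberry']

def first_letter(word):
    return word[0]

# Unsorted input: a new group starts every time the key changes,
# so the key 'a' shows up twice.
unsorted_keys = [k for k, g in groupby(words, first_letter)]

# Sorting on the same key first gives one group per key.  Each group is
# turned into a list right away because it shares the underlying iterator.
grouped = dict((k, list(g))
               for k, g in groupby(sorted(words, key=first_letter), first_letter))
```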

.. function:: ifilter(predicate, iterable)

   Make an iterator that filters elements from iterable returning only those for
   which the predicate is ``True``. If *predicate* is ``None``, return the items
   that are true. Equivalent to::

      def ifilter(predicate, iterable):
          # ifilter(lambda x: x%2, range(10)) --> 1 3 5 7 9
          if predicate is None:
              predicate = bool
          for x in iterable:
              if predicate(x):
                  yield x

.. function:: ifilterfalse(predicate, iterable)

   Make an iterator that filters elements from iterable returning only those for
   which the predicate is ``False``. If *predicate* is ``None``, return the items
   that are false. Equivalent to::

      def ifilterfalse(predicate, iterable):
          # ifilterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8
          if predicate is None:
              predicate = bool
          for x in iterable:
              if not predicate(x):
                  yield x

.. function:: imap(function, *iterables)

   Make an iterator that computes the function using arguments from each of the
   iterables. If *function* is set to ``None``, then :func:`imap` returns the
   arguments as a tuple. Like :func:`map` but stops when the shortest iterable is
   exhausted instead of filling in ``None`` for shorter iterables. The reason for
   the difference is that infinite iterator arguments are typically an error for
   :func:`map` (because the output is fully evaluated) but represent a common and
   useful way of supplying arguments to :func:`imap`. Equivalent to::

      def imap(function, *iterables):
          # imap(pow, (2,3,10), (5,2,3)) --> 32 9 1000
          iterables = map(iter, iterables)
          while True:
              args = [it.next() for it in iterables]
              if function is None:
                  yield tuple(args)
              else:
                  yield function(*args)

.. function:: islice(iterable, [start,] stop [, step])

   Make an iterator that returns selected elements from the iterable. If *start* is
   non-zero, then elements from the iterable are skipped until start is reached.
   Afterward, elements are returned consecutively unless *step* is set higher than
   one which results in items being skipped. If *stop* is ``None``, then iteration
   continues until the iterator is exhausted, if at all; otherwise, it stops at the
   specified position. Unlike regular slicing, :func:`islice` does not support
   negative values for *start*, *stop*, or *step*. Can be used to extract related
   fields from data where the internal structure has been flattened (for example, a
   multi-line report may list a name field on every third line). Equivalent to::

      def islice(iterable, *args):
          # islice('ABCDEFG', 2) --> A B
          # islice('ABCDEFG', 2, 4) --> C D
          # islice('ABCDEFG', 2, None) --> C D E F G
          # islice('ABCDEFG', 0, None, 2) --> A C E G
          s = slice(*args)
          it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
          nexti = it.next()
          for i, element in enumerate(iterable):
              if i == nexti:
                  yield element
                  nexti = it.next()

   If *start* is ``None``, then iteration starts at zero. If *step* is ``None``,
   then the step defaults to one.

   .. versionchanged:: 2.5
      accept ``None`` values for default *start* and *step*.
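
The flattened-report use case mentioned above might look like the following sketch. The report contents and its three-line record layout are invented for the example; the code should run on Python 2.6+ and Python 3.

```python
from itertools import islice

# Hypothetical flattened report: name, age, city repeated for each record.
report = ['Alice', '30', 'Paris',
          'Bob', '25', 'Rome',
          'Carol', '41', 'Oslo']

# Every third line, starting at the name field (offset 0).
names = list(islice(report, 0, None, 3))

# Same stride, starting at the city field (offset 2).
cities = list(islice(report, 2, None, 3))

# A single argument is treated as *stop*.
first_two = list(islice(report, 2))
```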

.. function:: izip(*iterables)

   Make an iterator that aggregates elements from each of the iterables. Like
   :func:`zip` except that it returns an iterator instead of a list. Used for
   lock-step iteration over several iterables at a time. Equivalent to::

      def izip(*iterables):
          # izip('ABCD', 'xy') --> Ax By
          iterables = map(iter, iterables)
          while iterables:
              result = [it.next() for it in iterables]
              yield tuple(result)

   .. versionchanged:: 2.4
      When no iterables are specified, returns a zero length iterator instead of
      raising a :exc:`TypeError` exception.

   The left-to-right evaluation order of the iterables is guaranteed. This
   makes possible an idiom for clustering a data series into n-length groups
   using ``izip(*[iter(s)]*n)``.

   :func:`izip` should only be used with unequal length inputs when you don't
   care about trailing, unmatched values from the longer iterables. If those
   values are important, use :func:`izip_longest` instead.
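
The clustering idiom works because the list holds *n* references to one and the same iterator, so each output tuple pulls *n* consecutive items from it. A minimal sketch follows; ``cluster`` is a name made up for this example, and the import fallback is an assumption added so the code also runs on Python 3, where the builtin ``zip`` is already lazy.

```python
try:
    from itertools import izip      # Python 2
except ImportError:
    izip = zip                      # Python 3 fallback (assumption for portability)

def cluster(seq, n):
    """Hypothetical helper: group seq into n-length tuples, dropping leftovers."""
    return izip(*[iter(seq)] * n)

groups = list(cluster('ABCDEFG', 3))
```

The trailing ``'G'`` is silently discarded, which is the unmatched-values caveat described above.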

.. function:: izip_longest(*iterables[, fillvalue])

   Make an iterator that aggregates elements from each of the iterables. If the
   iterables are of uneven length, missing values are filled-in with *fillvalue*.
   Iteration continues until the longest iterable is exhausted. Equivalent to::

      def izip_longest(*args, **kwds):
          # izip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
          fillvalue = kwds.get('fillvalue')
          def sentinel(counter = ([fillvalue]*(len(args)-1)).pop):
              yield counter()         # yields the fillvalue, or raises IndexError
          fillers = repeat(fillvalue)
          iters = [chain(it, sentinel(), fillers) for it in args]
          try:
              for tup in izip(*iters):
                  yield tup
          except IndexError:
              pass

   If one of the iterables is potentially infinite, then the
   :func:`izip_longest` function should be wrapped with something that limits
   the number of calls (for example :func:`islice` or :func:`takewhile`). If
   not specified, *fillvalue* defaults to ``None``.

   .. versionadded:: 2.6
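
The sketch below shows the fill behaviour and one way of bounding an infinite input with :func:`islice`. The import fallback to ``zip_longest`` is an assumption added so the example also runs on Python 3, where the function was renamed.

```python
from itertools import count, islice

try:
    from itertools import izip_longest                      # Python 2.6+
except ImportError:
    from itertools import zip_longest as izip_longest       # Python 3 name

# Shorter inputs are padded with the fill value.
padded = list(izip_longest('ABCD', 'xy', fillvalue='-'))

# count() never ends, so the result is truncated explicitly.
bounded = list(islice(izip_longest(count(), 'ab'), 4))
```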

.. function:: permutations(iterable[, r])

   Return successive *r* length permutations of elements in the *iterable*.

   If *r* is not specified or is ``None``, then *r* defaults to the length
   of the *iterable* and all possible full-length permutations
   are generated.

   Permutations are emitted in lexicographic sort order. So, if the
   input *iterable* is sorted, the permutation tuples will be produced
   in sorted order.

   Elements are treated as unique based on their position, not on their
   value. So if the input elements are unique, there will be no repeat
   values in each permutation.

   Equivalent to::

      def permutations(iterable, r=None):
          # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
          # permutations(range(3)) --> 012 021 102 120 201 210
          pool = tuple(iterable)
          n = len(pool)
          r = n if r is None else r
          if r > n:
              return
          indices = range(n)
          cycles = range(n, n-r, -1)
          yield tuple(pool[i] for i in indices[:r])
          while n:
              for i in reversed(range(r)):
                  cycles[i] -= 1
                  if cycles[i] == 0:
                      indices[i:] = indices[i+1:] + indices[i:i+1]
                      cycles[i] = n - i
                  else:
                      j = cycles[i]
                      indices[i], indices[-j] = indices[-j], indices[i]
                      yield tuple(pool[i] for i in indices[:r])
                      break
              else:
                  return

   The code for :func:`permutations` can also be expressed as a subsequence of
   :func:`product`, filtered to exclude entries with repeated elements (those
   from the same position in the input pool)::

      def permutations(iterable, r=None):
          pool = tuple(iterable)
          n = len(pool)
          r = n if r is None else r
          for indices in product(range(n), repeat=r):
              if len(set(indices)) == r:
                  yield tuple(pool[i] for i in indices)

   The number of items returned is ``n! / (n-r)!`` when ``0 <= r <= n``
   or zero when ``r > n``.

   .. versionadded:: 2.6
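
The position-based notion of uniqueness is easiest to see with a repeated input value. In the sketch below, ``n_permute_r`` is a helper name invented for this example; the code should run on Python 2.6+ and Python 3.

```python
from itertools import permutations
from math import factorial

def n_permute_r(n, r):
    """Hypothetical helper: n! / (n-r)! for 0 <= r <= n, else zero."""
    if r < 0 or r > n:
        return 0
    return factorial(n) // factorial(n - r)

perms = [''.join(p) for p in permutations('ABC', 2)]

# The two 'A's sit at different positions, so both orderings are emitted
# even though the resulting tuples compare equal.
repeated = list(permutations('AA'))
```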

.. function:: product(*iterables[, repeat])

   Cartesian product of input iterables.

   Equivalent to nested for-loops in a generator expression. For example,
   ``product(A, B)`` returns the same as ``((x,y) for x in A for y in B)``.

   The nested loops cycle like an odometer with the rightmost element advancing
   on every iteration. This pattern creates a lexicographic ordering so that if
   the input's iterables are sorted, the product tuples are emitted in sorted
   order.

   To compute the product of an iterable with itself, specify the number of
   repetitions with the optional *repeat* keyword argument. For example,
   ``product(A, repeat=4)`` means the same as ``product(A, A, A, A)``.

   This function is equivalent to the following code, except that the
   actual implementation does not build up intermediate results in memory::

      def product(*args, **kwds):
          # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
          # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
          pools = map(tuple, args) * kwds.get('repeat', 1)
          result = [[]]
          for pool in pools:
              result = [x+[y] for x in result for y in pool]
          for prod in result:
              yield tuple(prod)

   .. versionadded:: 2.6
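
Both claims above (the nested-loop equivalence and the odometer ordering) can be checked on small inputs. The variable names in this sketch are illustrative; the code should run on Python 2.6+ and Python 3.

```python
from itertools import product

A = 'ab'
B = [1, 2]

# The same pairs, produced by explicit nested loops.
nested = [(x, y) for x in A for y in B]
pairs = list(product(A, B))

# With repeat=3 the rightmost position advances fastest, like an odometer.
bits = [''.join(str(d) for d in t) for t in product(range(2), repeat=3)]
```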

.. function:: repeat(object[, times])

   Make an iterator that returns *object* over and over again. Runs indefinitely
   unless the *times* argument is specified. Used as argument to :func:`imap` for
   invariant function parameters. Also used with :func:`izip` to create constant
   fields in a tuple record. Equivalent to::

      def repeat(object, times=None):
          # repeat(10, 3) --> 10 10 10
          if times is None:
              while True:
                  yield object
          else:
              for i in xrange(times):
                  yield object

.. function:: starmap(function, iterable)

   Make an iterator that computes the function using arguments obtained from
   the iterable. Used instead of :func:`imap` when argument parameters are already
   grouped in tuples from a single iterable (the data has been "pre-zipped"). The
   difference between :func:`imap` and :func:`starmap` parallels the distinction
   between ``function(a,b)`` and ``function(*c)``. Equivalent to::

      def starmap(function, iterable):
          # starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000
          for args in iterable:
              yield function(*args)

   .. versionchanged:: 2.6
      Previously, :func:`starmap` required the function arguments to be tuples.
      Now, any iterable is allowed.
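
The ``function(a,b)`` versus ``function(*c)`` distinction can be sketched as follows. The fallback that binds ``imap`` to the builtin ``map`` is an assumption added so the example also runs on Python 3, where ``imap`` no longer exists.

```python
from itertools import starmap

try:
    from itertools import imap      # Python 2
except ImportError:
    imap = map                      # Python 3 fallback (assumption for portability)

bases = (2, 3, 10)
exponents = (5, 2, 3)

# imap: the arguments arrive as separate, parallel iterables.
separate = list(imap(pow, bases, exponents))

# starmap: the arguments are already grouped ("pre-zipped") into tuples.
paired = list(starmap(pow, [(2, 5), (3, 2), (10, 3)]))
```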

.. function:: takewhile(predicate, iterable)

   Make an iterator that returns elements from the iterable as long as the
   predicate is true. Equivalent to::

      def takewhile(predicate, iterable):
          # takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4
          for x in iterable:
              if predicate(x):
                  yield x
              else:
                  break

.. function:: tee(iterable[, n=2])

   Return *n* independent iterators from a single iterable. The case where ``n==2``
   is equivalent to::

      def tee(iterable):
          def gen(next, data={}):
              for i in count():
                  if i in data:
                      yield data.pop(i)
                  else:
                      data[i] = next()
                      yield data[i]
          it = iter(iterable)
          return gen(it.next), gen(it.next)

   Note, once :func:`tee` has made a split, the original *iterable* should not be
   used anywhere else; otherwise, the *iterable* could get advanced without the tee
   objects being informed.

   Note, this member of the toolkit may require significant auxiliary storage
   (depending on how much temporary data needs to be stored). In general, if one
   iterator is going to use most or all of the data before the other iterator, it
   is faster to use :func:`list` instead of :func:`tee`.

   .. versionadded:: 2.4
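
A case where :func:`tee` fits well is when the two iterators stay close together, so little data has to be buffered. The sketch below pairs each element with its successor; the helper name ``pairwise`` and the use of the builtin ``zip`` are choices made for this example, which should run on Python 2.6+ and Python 3.

```python
from itertools import tee

def pairwise(iterable):
    """Sketch: s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = tee(iterable)
    next(b, None)       # advance the second iterator by one position
    return zip(a, b)    # only the tee objects are used from here on

steps = list(pairwise([1, 2, 3, 4]))
letters = list(pairwise(iter('abc')))
```

Since ``b`` is always exactly one element ahead of ``a``, the internal buffer never holds more than one item.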

609

610

611

.. _itertools-example:

Examples

--------

The following examples show common uses for each tool and demonstrate ways they

Georg Brandl

2008-03-22 22:04:10 +0000

[diff] [blame]

617

can be combined.

618

619

.. doctest::

    >>> # Show a dictionary sorted and grouped by value
    >>> from operator import itemgetter
    >>> d = dict(a=1, b=2, c=1, d=2, e=1, f=2, g=3)
    >>> di = sorted(d.iteritems(), key=itemgetter(1))
    >>> for k, g in groupby(di, key=itemgetter(1)):
    ...     print k, map(itemgetter(0), g)
    ...
    1 ['a', 'c', 'e']
    2 ['b', 'd', 'f']
    3 ['g']

    >>> # Find runs of consecutive numbers using groupby.  The key to the solution
    >>> # is differencing with a range so that consecutive numbers all appear in
    >>> # the same group.
    >>> data = [1, 4, 5, 6, 10, 15, 16, 17, 18, 22, 25, 26, 27, 28]
    >>> for k, g in groupby(enumerate(data), lambda (i,x):i-x):
    ...     print map(itemgetter(1), g)
    ...
    [1]
    [4, 5, 6]
    [10]
    [15, 16, 17, 18]
    [22]
    [25, 26, 27, 28]
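The differencing trick in the second example works because ``index - value`` stays constant across a run of consecutive integers and changes whenever a gap appears. The sketch below restates it without tuple-parameter unpacking, so it also runs on interpreters that do not accept ``lambda (i,x)``. The helper name ``consecutive_runs`` is a hypothetical one chosen for illustration.

```python
from itertools import groupby

def consecutive_runs(data):
    "Split increasing integers into lists of consecutive values (illustrative helper)"
    runs = []
    # index - value is the same for every member of a consecutive run,
    # so groupby() starts a new group exactly where a gap occurs.
    for _, group in groupby(enumerate(data), key=lambda pair: pair[0] - pair[1]):
        runs.append([value for _, value in group])
    return runs

data = [1, 4, 5, 6, 10, 15, 16, 17, 18, 22, 25, 26, 27, 28]
runs = consecutive_runs(data)
```

The input is assumed to be strictly increasing; with unsorted data the same key could recur in separate, non-adjacent groups.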

.. _itertools-recipes:

Recipes
-------

This section shows recipes for creating an extended toolset using the existing
itertools as building blocks.

The extended tools offer the same high performance as the underlying toolset.
The superior memory performance is kept by processing elements one at a time
rather than bringing the whole iterable into memory all at once. Code volume is
kept small by linking the tools together in a functional style which helps
eliminate temporary variables. High speed is retained by preferring
"vectorized" building blocks over the use of for-loops and :term:`generator`\s
which incur interpreter overhead.
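As a rough illustration of the "vectorized" style, the sketch below computes the same dot product twice: once with an explicit loop and once by handing the per-element work to :func:`imap` and :func:`sum`. The names ``dot_loop`` and ``dot_vectorized`` are hypothetical, and the ``try``/``except`` import is an assumption added only so the sketch also runs where :func:`imap` is unavailable.

```python
import operator

try:
    from itertools import imap      # Python 2
except ImportError:
    imap = map                      # Python 3: the built-in map is already lazy

def dot_loop(vec1, vec2):
    "Explicit loop: each multiply and add is executed as interpreted bytecode"
    total = 0
    for x, y in zip(vec1, vec2):
        total += x * y
    return total

def dot_vectorized(vec1, vec2):
    "Same result, with the per-element work done inside imap() and sum()"
    return sum(imap(operator.mul, vec1, vec2))

a = [1, 2, 3]
b = [4, 5, 6]
results = (dot_loop(a, b), dot_vectorized(a, b))
```

Both functions return the same value. The second form also needs no temporary variables, which is the code-volume point made above.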

.. testcode::

   import operator  # needed by dotproduct() below

   def take(n, iterable):
       "Return first n items of the iterable as a list"
       return list(islice(iterable, n))

   def enumerate(iterable, start=0):
       "Emulate the built-in enumerate(), with an optional starting index"
       return izip(count(start), iterable)

   def tabulate(function, start=0):
       "Return function(start), function(start+1), ..."
       return imap(function, count(start))

   def nth(iterable, n):
       "Return the nth item or None"
       return next(islice(iterable, n, None), None)

   def quantify(iterable, pred=bool):
       "Count how many times the predicate is true"
       return sum(imap(pred, iterable))

   def padnone(iterable):
       """Return the sequence elements and then return None indefinitely.

       Useful for emulating the behavior of the built-in map() function.
       """
       return chain(iterable, repeat(None))

   def ncycles(iterable, n):
       "Return the sequence elements n times"
       # tuple() snapshots the input so that one-shot iterators can be replayed
       return chain.from_iterable(repeat(tuple(iterable), n))

   def dotproduct(vec1, vec2):
       "Return the sum of the pairwise products of two vectors"
       return sum(imap(operator.mul, vec1, vec2))

   def flatten(listOfLists):
       "Flatten one level of nesting into a list"
       return list(chain.from_iterable(listOfLists))

   def repeatfunc(func, times=None, *args):
       """Repeat calls to func with specified arguments.

       Example:  repeatfunc(random.random)
       """
       if times is None:
           return starmap(func, repeat(args))
       return starmap(func, repeat(args, times))

   def pairwise(iterable):
       "s -> (s0,s1), (s1,s2), (s2, s3), ..."
       a, b = tee(iterable)
       next(b, None)       # advance the second iterator by one item
       return izip(a, b)

   def grouper(n, iterable, fillvalue=None):
       "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
       # n references to the same iterator, so each output tuple takes n items
       args = [iter(iterable)] * n
       return izip_longest(fillvalue=fillvalue, *args)

   def roundrobin(*iterables):
       "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
       # Recipe credited to George Sakkis
       pending = len(iterables)
       nexts = cycle(iter(it).next for it in iterables)
       while pending:
           try:
               for next in nexts:
                   yield next()
           except StopIteration:
               pending -= 1
               nexts = cycle(islice(nexts, pending))