:mod:`collections` --- High-performance container datatypes
===========================================================

.. module:: collections
   :synopsis: High-performance datatypes
.. moduleauthor:: Raymond Hettinger <python@rcn.com>
.. sectionauthor:: Raymond Hettinger <python@rcn.com>

.. versionadded:: 2.4

.. testsetup:: *

   from collections import *
   import itertools
   __name__ = '<doctest>'

This module implements high-performance container datatypes. Currently,
there are two datatypes, :class:`deque` and :class:`defaultdict`, and
one datatype factory function, :func:`namedtuple`.

.. versionchanged:: 2.5
   Added :class:`defaultdict`.

.. versionchanged:: 2.6
   Added :func:`namedtuple`.

The specialized containers in this module provide alternatives
to Python's general purpose built-in containers, :class:`dict`,
:class:`list`, :class:`set`, and :class:`tuple`.

Besides the containers provided here, the optional :mod:`bsddb`
module offers the ability to create in-memory or file based ordered
dictionaries with string keys using the :func:`bsddb.btopen` function.

In addition to containers, the collections module provides some ABCs
(abstract base classes) that can be used to test whether a class
provides a particular interface, for example, whether it is hashable or
a mapping.

.. versionchanged:: 2.6
   Added abstract base classes.

ABCs - abstract base classes
----------------------------

The collections module offers the following ABCs:

=========================  =====================  ======================  ====================================================
ABC                        Inherits               Abstract Methods        Mixin Methods
=========================  =====================  ======================  ====================================================
:class:`Container`                                ``__contains__``
:class:`Hashable`                                 ``__hash__``
:class:`Iterable`                                 ``__iter__``
:class:`Iterator`          :class:`Iterable`      ``next``                ``__iter__``
:class:`Sized`                                    ``__len__``
:class:`Callable`                                 ``__call__``

:class:`Sequence`          :class:`Sized`,        ``__getitem__``         ``__contains__``, ``__iter__``, ``__reversed__``,
                           :class:`Iterable`,                             ``index``, and ``count``
                           :class:`Container`

:class:`MutableSequence`   :class:`Sequence`      ``__setitem__``,        Inherited Sequence methods and
                                                  ``__delitem__``,        ``append``, ``reverse``, ``extend``, ``pop``,
                                                  and ``insert``          ``remove``, and ``__iadd__``

:class:`Set`               :class:`Sized`,                                ``__le__``, ``__lt__``, ``__eq__``, ``__ne__``,
                           :class:`Iterable`,                             ``__gt__``, ``__ge__``, ``__and__``, ``__or__``,
                           :class:`Container`                             ``__sub__``, ``__xor__``, and ``isdisjoint``

:class:`MutableSet`        :class:`Set`           ``add`` and             Inherited Set methods and
                                                  ``discard``             ``clear``, ``pop``, ``remove``, ``__ior__``,
                                                                          ``__iand__``, ``__ixor__``, and ``__isub__``

:class:`Mapping`           :class:`Sized`,        ``__getitem__``         ``__contains__``, ``keys``, ``items``, ``values``,
                           :class:`Iterable`,                             ``get``, ``__eq__``, and ``__ne__``
                           :class:`Container`

:class:`MutableMapping`    :class:`Mapping`       ``__setitem__`` and     Inherited Mapping methods and
                                                  ``__delitem__``         ``pop``, ``popitem``, ``clear``, ``update``,
                                                                          and ``setdefault``

:class:`MappingView`       :class:`Sized`                                 ``__len__``
:class:`KeysView`          :class:`MappingView`,                          ``__contains__``,
                           :class:`Set`                                   ``__iter__``
:class:`ItemsView`         :class:`MappingView`,                          ``__contains__``,
                           :class:`Set`                                   ``__iter__``
:class:`ValuesView`        :class:`MappingView`                           ``__contains__``, ``__iter__``
=========================  =====================  ======================  ====================================================

These ABCs allow us to ask classes or instances if they provide
particular functionality, for example::

    size = None
    if isinstance(myvar, collections.Sized):
        size = len(myvar)

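As a further sketch of the same idea, the snippet below probes a few built-in
objects against several ABCs at once. The helper name ``describe`` is
hypothetical, and the ``try``/``except`` import is an assumption added here so
that the example also runs on newer interpreters where the ABCs live in
``collections.abc``; it is not part of the module described in this section.

```python
try:                                    # newer interpreters (assumption)
    from collections.abc import Hashable, Iterable, Mapping, Sized
except ImportError:                     # the Python 2.6 layout described here
    from collections import Hashable, Iterable, Mapping, Sized

def describe(obj):
    'Return the names of the ABCs (from a fixed list) that obj satisfies.'
    abcs = [Hashable, Iterable, Mapping, Sized]
    return [abc.__name__ for abc in abcs if isinstance(obj, abc)]

list_abcs = describe([1, 2])      # lists are iterable and sized, not hashable
dict_abcs = describe({'a': 1})    # dicts are additionally mappings
int_abcs = describe(42)           # ints are only hashable
```

Because the test is made with :func:`isinstance`, the probed classes do not
need to inherit from the ABCs explicitly.
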
Several of the ABCs are also useful as mixins that make it easier to develop
classes supporting container APIs. For example, to write a class supporting
the full :class:`Set` API, it is only necessary to supply the three underlying
abstract methods: :meth:`__contains__`, :meth:`__iter__`, and :meth:`__len__`.
The ABC supplies the remaining methods such as :meth:`__and__` and
:meth:`isdisjoint` ::

    class ListBasedSet(collections.Set):
        ''' Alternate set implementation favoring space over speed
            and not requiring the set elements to be hashable. '''
        def __init__(self, iterable):
            self.elements = lst = []
            for value in iterable:
                if value not in lst:
                    lst.append(value)
        def __iter__(self):
            return iter(self.elements)
        def __contains__(self, value):
            return value in self.elements
        def __len__(self):
            return len(self.elements)

    s1 = ListBasedSet('abcdef')
    s2 = ListBasedSet('defghi')
    overlap = s1 & s2            # The __and__() method is supported automatically

Notes on using :class:`Set` and :class:`MutableSet` as a mixin:

(1)
   Since some set operations create new sets, the default mixin methods need
   a way to create new instances from an iterable. The class constructor is
   assumed to have a signature in the form ``ClassName(iterable)``.
   That assumption is factored out to an internal classmethod called
   :meth:`_from_iterable` which calls ``cls(iterable)`` to produce a new set.
   If the :class:`Set` mixin is being used in a class with a different
   constructor signature, you will need to override :meth:`_from_iterable`
   with a classmethod that can construct new instances from
   an iterable argument.

(2)
   To override the comparisons (presumably for speed, as the
   semantics are fixed), redefine :meth:`__le__` and
   then the other operations will automatically follow suit.

(3)
   The :class:`Set` mixin provides a :meth:`_hash` method to compute a hash value
   for the set; however, :meth:`__hash__` is not defined because not all sets
   are hashable or immutable. To add set hashability using mixins,
   inherit from both :class:`Set` and :class:`Hashable`, then define
   ``__hash__ = Set._hash``.

.. seealso::

   * `OrderedSet recipe <http://code.activestate.com/recipes/576694/>`_ for an
     example built on :class:`MutableSet`.

   * For more about ABCs, see the :mod:`abc` module and :pep:`3119`.


.. _deque-objects:

:class:`deque` objects
----------------------


.. class:: deque([iterable[, maxlen]])

   Returns a new deque object initialized left-to-right (using :meth:`append`) with
   data from *iterable*. If *iterable* is not specified, the new deque is empty.

   Deques are a generalization of stacks and queues (the name is pronounced "deck"
   and is short for "double-ended queue"). Deques support thread-safe, memory
   efficient appends and pops from either side of the deque with approximately the
   same O(1) performance in either direction.

   Though :class:`list` objects support similar operations, they are optimized for
   fast fixed-length operations and incur O(n) memory movement costs for
   ``pop(0)`` and ``insert(0, v)`` operations which change both the size and
   position of the underlying data representation.

   .. versionadded:: 2.4

   If *maxlen* is not specified or is ``None``, deques may grow to an
   arbitrary length. Otherwise, the deque is bounded to the specified maximum
   length. Once a bounded length deque is full, when new items are added, a
   corresponding number of items are discarded from the opposite end. Bounded
   length deques provide functionality similar to the ``tail`` filter in
   Unix. They are also useful for tracking transactions and other pools of data
   where only the most recent activity is of interest.

   .. versionchanged:: 2.6
      Added *maxlen* parameter.

   Deque objects support the following methods:


   .. method:: append(x)

      Add *x* to the right side of the deque.


   .. method:: appendleft(x)

      Add *x* to the left side of the deque.


   .. method:: clear()

      Remove all elements from the deque leaving it with length 0.


   .. method:: extend(iterable)

      Extend the right side of the deque by appending elements from the iterable
      argument.


   .. method:: extendleft(iterable)

      Extend the left side of the deque by appending elements from *iterable*.
      Note, the series of left appends results in reversing the order of
      elements in the iterable argument.


   .. method:: pop()

      Remove and return an element from the right side of the deque. If no
      elements are present, raises an :exc:`IndexError`.


   .. method:: popleft()

      Remove and return an element from the left side of the deque. If no
      elements are present, raises an :exc:`IndexError`.


   .. method:: remove(value)

      Remove the first occurrence of *value*. If not found, raises a
      :exc:`ValueError`.

      .. versionadded:: 2.5


   .. method:: rotate(n)

      Rotate the deque *n* steps to the right. If *n* is negative, rotate to
      the left. Rotating one step to the right is equivalent to:
      ``d.appendleft(d.pop())``.


In addition to the above, deques support iteration, pickling, ``len(d)``,
``reversed(d)``, ``copy.copy(d)``, ``copy.deepcopy(d)``, membership testing with
the :keyword:`in` operator, and subscript references such as ``d[-1]``. Indexed
access is O(1) at both ends but slows to O(n) in the middle. For fast random
access, use lists instead.

Example:

.. doctest::

   >>> from collections import deque
   >>> d = deque('ghi')                 # make a new deque with three items
   >>> for elem in d:                   # iterate over the deque's elements
   ...     print elem.upper()
   G
   H
   I

   >>> d.append('j')                    # add a new entry to the right side
   >>> d.appendleft('f')                # add a new entry to the left side
   >>> d                                # show the representation of the deque
   deque(['f', 'g', 'h', 'i', 'j'])

   >>> d.pop()                          # return and remove the rightmost item
   'j'
   >>> d.popleft()                      # return and remove the leftmost item
   'f'
   >>> list(d)                          # list the contents of the deque
   ['g', 'h', 'i']
   >>> d[0]                             # peek at leftmost item
   'g'
   >>> d[-1]                            # peek at rightmost item
   'i'

   >>> list(reversed(d))                # list the contents of a deque in reverse
   ['i', 'h', 'g']
   >>> 'h' in d                         # search the deque
   True
   >>> d.extend('jkl')                  # add multiple elements at once
   >>> d
   deque(['g', 'h', 'i', 'j', 'k', 'l'])
   >>> d.rotate(1)                      # right rotation
   >>> d
   deque(['l', 'g', 'h', 'i', 'j', 'k'])
   >>> d.rotate(-1)                     # left rotation
   >>> d
   deque(['g', 'h', 'i', 'j', 'k', 'l'])

   >>> deque(reversed(d))               # make a new deque in reverse order
   deque(['l', 'k', 'j', 'i', 'h', 'g'])
   >>> d.clear()                        # empty the deque
   >>> d.pop()                          # cannot pop from an empty deque
   Traceback (most recent call last):
     File "<pyshell#6>", line 1, in -toplevel-
       d.pop()
   IndexError: pop from an empty deque

   >>> d.extendleft('abc')              # extendleft() reverses the input order
   >>> d
   deque(['c', 'b', 'a'])


.. _deque-recipes:

:class:`deque` Recipes
^^^^^^^^^^^^^^^^^^^^^^

This section shows various approaches to working with deques.

The :meth:`rotate` method provides a way to implement :class:`deque` slicing and
deletion. For example, a pure Python implementation of ``del d[n]`` relies on
the :meth:`rotate` method to position elements to be popped::

    def delete_nth(d, n):
        d.rotate(-n)
        d.popleft()
        d.rotate(n)

To implement :class:`deque` slicing, use a similar approach applying
:meth:`rotate` to bring a target element to the left side of the deque. Remove
old entries with :meth:`popleft`, add new entries with :meth:`extend`, and then
reverse the rotation.
With minor variations on that approach, it is easy to implement Forth style
stack manipulations such as ``dup``, ``drop``, ``swap``, ``over``, ``pick``,
``rot``, and ``roll``.

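The rotate-based approach and a few of the Forth style operations might be
sketched as follows. All function names here are hypothetical, and the sketch
assumes that the right end of the deque is the top of the stack and that the
slice bounds satisfy ``0 <= start <= stop <= len(d)``.

```python
from collections import deque

def delete_slice(d, start, stop):
    'Remove d[start:stop] in place using rotate() and popleft().'
    d.rotate(-start)                    # bring d[start] to the left end
    for _ in range(stop - start):
        d.popleft()                     # drop the old entries
    d.rotate(start)                     # restore the original alignment

def dup(stack):
    'Duplicate the top item:  a b -> a b b'
    stack.append(stack[-1])

def drop(stack):
    'Discard the top item:  a b -> a'
    stack.pop()

def swap(stack):
    'Exchange the top two items:  a b -> b a'
    top = stack.pop()
    below = stack.pop()
    stack.append(top)
    stack.append(below)

def rot(stack):
    'Bring the third item to the top:  a b c -> b c a'
    top = stack.pop()
    second = stack.pop()
    third = stack.pop()
    stack.extend([second, top, third])

letters = deque('abcdef')
delete_slice(letters, 1, 3)             # removes 'b' and 'c'

s = deque([1, 2, 3])
swap(s)                                 # s now holds 1, 3, 2
rot(s)                                  # s now holds 3, 2, 1
```

Each stack operation touches only the right end of the deque, so it keeps the
O(1) cost of :meth:`append` and :meth:`pop`.
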
Multi-pass data reduction algorithms can be succinctly expressed and efficiently
coded by extracting elements with multiple calls to :meth:`popleft`, applying
a reduction function, and calling :meth:`append` to add the result back to the
deque.

For example, building a balanced binary tree of nested lists entails reducing
two adjacent nodes into one by grouping them in a list:

   >>> def maketree(iterable):
   ...     d = deque(iterable)
   ...     while len(d) > 1:
   ...         pair = [d.popleft(), d.popleft()]
   ...         d.append(pair)
   ...     return list(d)
   ...
   >>> print maketree('abcdefgh')
   [[[['a', 'b'], ['c', 'd']], [['e', 'f'], ['g', 'h']]]]

Bounded length deques provide functionality similar to the ``tail`` filter
in Unix::

   def tail(filename, n=10):
       'Return the last n lines of a file'
       return deque(open(filename), n)

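The discarding behaviour of a bounded deque can be seen without a file. This
is a small sketch; the helper ``last_n`` and the event names are hypothetical
and only serve as sample input.

```python
from collections import deque

def last_n(iterable, n=3):
    'Return the final n items of iterable as a list (hypothetical helper).'
    return list(deque(iterable, n))

recent = deque([], 3)                   # bounded to three items
for event in ['login', 'view', 'edit', 'save', 'logout']:
    recent.append(event)                # a full deque drops its leftmost item

left = deque([1, 2, 3], 3)
left.appendleft(0)                      # adding on the left drops the rightmost item
```

After the loop ``recent`` holds only the three newest events, and ``left``
holds ``0, 1, 2`` because the item at the opposite end was discarded.
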
.. _defaultdict-objects:

:class:`defaultdict` objects
----------------------------


.. class:: defaultdict([default_factory[, ...]])

   Returns a new dictionary-like object. :class:`defaultdict` is a subclass of the
   builtin :class:`dict` class. It overrides one method and adds one writable
   instance variable. The remaining functionality is the same as for the
   :class:`dict` class and is not documented here.

   The first argument provides the initial value for the :attr:`default_factory`
   attribute; it defaults to ``None``. All remaining arguments are treated the same
   as if they were passed to the :class:`dict` constructor, including keyword
   arguments.

   .. versionadded:: 2.5

   :class:`defaultdict` objects support the following method in addition to the
   standard :class:`dict` operations:


   .. method:: defaultdict.__missing__(key)

      If the :attr:`default_factory` attribute is ``None``, this raises a
      :exc:`KeyError` exception with the *key* as argument.

      If :attr:`default_factory` is not ``None``, it is called without arguments
      to provide a default value for the given *key*, this value is inserted in
      the dictionary for the *key*, and returned.

      If calling :attr:`default_factory` raises an exception this exception is
      propagated unchanged.

      This method is called by the :meth:`__getitem__` method of the
      :class:`dict` class when the requested key is not found; whatever it
      returns or raises is then returned or raised by :meth:`__getitem__`.


   :class:`defaultdict` objects support the following instance variable:


   .. attribute:: defaultdict.default_factory

      This attribute is used by the :meth:`__missing__` method; it is
      initialized from the first argument to the constructor, if present, or to
      ``None``, if absent.


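Since :meth:`__missing__` is reached only through :meth:`__getitem__`, other
lookups behave as they do for a plain :class:`dict`. The sketch below shows
the difference; the variable names are illustrative only.

```python
from collections import defaultdict

d = defaultdict(list)
d['a'].append(1)                 # lookup misses, __missing__ stores and returns []
looked_up = d['b']               # even a plain read inserts the default
via_get = d.get('c')             # get() bypasses __missing__ and returns None

no_factory = defaultdict()       # default_factory stays None
try:
    no_factory['x']
except KeyError as exc:
    missing_key = exc.args[0]    # the key is passed as the exception argument
```

After this runs, ``d`` contains the keys ``'a'`` and ``'b'`` but not ``'c'``,
and ``missing_key`` is ``'x'``.
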
.. _defaultdict-examples:

:class:`defaultdict` Examples
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Using :class:`list` as the :attr:`default_factory`, it is easy to group a
sequence of key-value pairs into a dictionary of lists:

   >>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
   >>> d = defaultdict(list)
   >>> for k, v in s:
   ...     d[k].append(v)
   ...
   >>> d.items()
   [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

When each key is encountered for the first time, it is not already in the
mapping; so an entry is automatically created using the :attr:`default_factory`
function which returns an empty :class:`list`. The :meth:`list.append`
operation then attaches the value to the new list. When keys are encountered
again, the look-up proceeds normally (returning the list for that key) and the
:meth:`list.append` operation adds another value to the list. This technique is
simpler and faster than an equivalent technique using :meth:`dict.setdefault`:

   >>> d = {}
   >>> for k, v in s:
   ...     d.setdefault(k, []).append(v)
   ...
   >>> d.items()
   [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

Setting the :attr:`default_factory` to :class:`int` makes the
:class:`defaultdict` useful for counting (like a bag or multiset in other
languages):

   >>> s = 'mississippi'
   >>> d = defaultdict(int)
   >>> for k in s:
   ...     d[k] += 1
   ...
   >>> d.items()
   [('i', 4), ('p', 2), ('s', 4), ('m', 1)]

When a letter is first encountered, it is missing from the mapping, so the
:attr:`default_factory` function calls :func:`int` to supply a default count of
zero. The increment operation then builds up the count for each letter.

The function :func:`int` which always returns zero is just a special case of
constant functions. A faster and more flexible way to create constant functions
is to use :func:`itertools.repeat` which can supply any constant value (not just
zero):

   >>> def constant_factory(value):
   ...     return itertools.repeat(value).next
   >>> d = defaultdict(constant_factory('<missing>'))
   >>> d.update(name='John', action='ran')
   >>> '%(name)s %(action)s to %(object)s' % d
   'John ran to <missing>'

Setting the :attr:`default_factory` to :class:`set` makes the
:class:`defaultdict` useful for building a dictionary of sets:

   >>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]
   >>> d = defaultdict(set)
   >>> for k, v in s:
   ...     d[k].add(v)
   ...
   >>> d.items()
   [('blue', set([2, 4])), ('red', set([1, 3]))]


.. _named-tuple-factory:

:func:`namedtuple` Factory Function for Tuples with Named Fields
----------------------------------------------------------------

Named tuples assign meaning to each position in a tuple and allow for more readable,
self-documenting code. They can be used wherever regular tuples are used, and
they add the ability to access fields by name instead of position index.

.. function:: namedtuple(typename, field_names, [verbose])

   Returns a new tuple subclass named *typename*. The new subclass is used to
   create tuple-like objects that have fields accessible by attribute lookup as
   well as being indexable and iterable. Instances of the subclass also have a
   helpful docstring (with typename and field_names) and a helpful :meth:`__repr__`
   method which lists the tuple contents in a ``name=value`` format.

   The *field_names* are a single string with each fieldname separated by whitespace
   and/or commas, for example ``'x y'`` or ``'x, y'``. Alternatively, *field_names*
   can be a sequence of strings such as ``['x', 'y']``.

   Any valid Python identifier may be used for a fieldname except for names
   starting with an underscore. Valid identifiers consist of letters, digits,
   and underscores but do not start with a digit or underscore and cannot be
   a :mod:`keyword` such as *class*, *for*, *return*, *global*, *pass*, *print*,
   or *raise*.

   If *verbose* is true, the class definition is printed just before being built.

   Named tuple instances do not have per-instance dictionaries, so they are
   lightweight and require no more memory than regular tuples.

   .. versionadded:: 2.6

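The rules for *field_names* can be checked with a short sketch. The type
names ``P1``, ``P2``, ``P3`` and ``Bad`` are hypothetical, and the three
rejected strings were picked to break one rule each (a leading underscore, a
keyword, and a leading digit).

```python
from collections import namedtuple

# The three spellings of field_names below are equivalent.
P1 = namedtuple('P1', 'x y')
P2 = namedtuple('P2', 'x, y')
P3 = namedtuple('P3', ['x', 'y'])

rejected = []
for bad in ['x _y', 'x class', 'x 2y']:
    try:
        namedtuple('Bad', bad)
    except ValueError:
        rejected.append(bad)     # invalid field names raise ValueError
```

All three valid spellings yield the fields ``('x', 'y')``, and every string in
the loop ends up in ``rejected``.
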
Example:

.. doctest::
   :options: +NORMALIZE_WHITESPACE

   >>> Point = namedtuple('Point', 'x y', verbose=True)
   class Point(tuple):
           'Point(x, y)'
   <BLANKLINE>
           __slots__ = ()
   <BLANKLINE>
           _fields = ('x', 'y')
   <BLANKLINE>
           def __new__(cls, x, y):
               return tuple.__new__(cls, (x, y))
   <BLANKLINE>
           @classmethod
           def _make(cls, iterable, new=tuple.__new__, len=len):
               'Make a new Point object from a sequence or iterable'
               result = new(cls, iterable)
               if len(result) != 2:
                   raise TypeError('Expected 2 arguments, got %d' % len(result))
               return result
   <BLANKLINE>
           def __repr__(self):
               return 'Point(x=%r, y=%r)' % self
   <BLANKLINE>
           def _asdict(t):
               'Return a new dict which maps field names to their values'
               return {'x': t[0], 'y': t[1]}
   <BLANKLINE>
           def _replace(self, **kwds):
               'Return a new Point object replacing specified fields with new values'
               result = self._make(map(kwds.pop, ('x', 'y'), self))
               if kwds:
                   raise ValueError('Got unexpected field names: %r' % kwds.keys())
               return result
   <BLANKLINE>
           def __getnewargs__(self):
               return tuple(self)
   <BLANKLINE>
           x = property(itemgetter(0))
           y = property(itemgetter(1))

   >>> p = Point(11, y=22)     # instantiate with positional or keyword arguments
   >>> p[0] + p[1]             # indexable like the plain tuple (11, 22)
   33
   >>> x, y = p                # unpack like a regular tuple
   >>> x, y
   (11, 22)
   >>> p.x + p.y               # fields also accessible by name
   33
   >>> p                       # readable __repr__ with a name=value style
   Point(x=11, y=22)

Named tuples are especially useful for assigning field names to result tuples returned
by the :mod:`csv` or :mod:`sqlite3` modules::

   EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')

   import csv
   for emp in map(EmployeeRecord._make, csv.reader(open("employees.csv", "rb"))):
       print emp.name, emp.title

   import sqlite3
   conn = sqlite3.connect('/companydata')
   cursor = conn.cursor()
   cursor.execute('SELECT name, age, title, department, paygrade FROM employees')
   for emp in map(EmployeeRecord._make, cursor.fetchall()):
       print emp.name, emp.title

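The :mod:`csv` pattern can be tried without an ``employees.csv`` file by
feeding :func:`csv.reader` a list of lines. This is a sketch under that
assumption; the two sample rows are made up for illustration and do not come
from any real data set.

```python
import csv
from collections import namedtuple

EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')

# Stand-in for the lines of employees.csv (made-up sample data).
lines = ['Ada,36,Engineer,Research,7', 'Grace,45,Director,Operations,9']

staff = [EmployeeRecord._make(row) for row in csv.reader(lines)]
titles = dict((emp.name, emp.title) for emp in staff)

# A row with the wrong number of columns is rejected by _make().
try:
    EmployeeRecord._make(['Ada', '36'])
    short_row_rejected = False
except TypeError:
    short_row_rejected = True
```

Note that :func:`csv.reader` yields every column as a string, so ``emp.age``
is ``'36'`` rather than an integer unless it is converted explicitly.
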
Raymond Hettinger85dfcf32007-12-18 23:51:15 +0000588In addition to the methods inherited from tuples, named tuples support
Raymond Hettingerac5742e2008-01-08 02:24:15 +0000589three additional methods and one attribute. To prevent conflicts with
590field names, the method and attribute names start with an underscore.
Raymond Hettinger85dfcf32007-12-18 23:51:15 +0000591
.. method:: somenamedtuple._make(iterable)

   Class method that makes a new instance from an existing sequence or iterable.

.. doctest::

   >>> t = [11, 22]
   >>> Point._make(t)
   Point(x=11, y=22)

.. method:: somenamedtuple._asdict()

   Return a new dict which maps field names to their corresponding values::

      >>> p._asdict()
      {'x': 11, 'y': 22}

.. method:: somenamedtuple._replace(**kwargs)

   Return a new instance of the named tuple replacing specified fields with new
   values::

      >>> p = Point(x=11, y=22)
      >>> p._replace(x=33)
      Point(x=33, y=22)

      >>> for partnum, record in inventory.items():
      ...     inventory[partnum] = record._replace(price=newprices[partnum], timestamp=time.time())

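The inventory loop relies on ``inventory`` and ``newprices`` being defined
elsewhere. A self-contained sketch follows; the ``Part`` record type, the part
numbers, and the prices are all hypothetical and exist only to make the loop
runnable.

```python
import time
from collections import namedtuple

# Hypothetical record type and data, invented for illustration.
Part = namedtuple('Part', 'name price timestamp')
inventory = {
    'A100': Part('bolt', 0.25, 0.0),
    'B200': Part('nut', 0.10, 0.0),
}
newprices = {'A100': 0.30, 'B200': 0.12}

old_bolt = inventory['A100']    # keep a reference to the original record
now = time.time()

# Iterate over a snapshot of the items while rebinding the values.
for partnum, record in list(inventory.items()):
    inventory[partnum] = record._replace(price=newprices[partnum], timestamp=now)
```

Because named tuples are immutable, ``_replace`` leaves ``old_bolt`` untouched
and the dictionary entry is rebound to a new record.
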
.. attribute:: somenamedtuple._fields

   Tuple of strings listing the field names.  Useful for introspection
   and for creating new named tuple types from existing named tuples.

.. doctest::

   >>> p._fields            # view the field names
   ('x', 'y')

   >>> Color = namedtuple('Color', 'red green blue')
   >>> Pixel = namedtuple('Pixel', Point._fields + Color._fields)
   >>> Pixel(11, 22, 128, 255, 0)
   Pixel(x=11, y=22, red=128, green=255, blue=0)

To retrieve a field whose name is stored in a string, use the :func:`getattr`
function:

    >>> getattr(p, 'x')
    11

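As a small sketch of where this is handy, the field names might be chosen at
run time. The ``wanted`` list below is a made-up stand-in for names read from
user input or a configuration file.

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
p = Point(11, 22)

# Hypothetical list of field names decided at run time.
wanted = ['y', 'x']
values = [getattr(p, name) for name in wanted]

# getattr() also accepts a default for names that are not fields.
missing = getattr(p, 'z', None)
```
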
To convert a dictionary to a named tuple, use the double-star operator
(as described in :ref:`tut-unpacking-arguments`):

    >>> d = {'x': 11, 'y': 22}
    >>> Point(**d)
    Point(x=11, y=22)

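The conversion only succeeds when the dictionary keys match the field names
exactly. The sketch below assumes a dictionary with one extra, invented key
(``label``) and shows one possible way to filter it out using ``_fields``.

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')

# Hypothetical input carrying one key that is not a Point field.
d = {'x': 11, 'y': 22, 'label': 'origin'}

try:
    Point(**d)
    rejected = False
except TypeError:
    # Unexpected keyword arguments are refused by the constructor.
    rejected = True

# Keep only the keys that the named tuple declares.
p = Point(**dict((name, d[name]) for name in Point._fields))
```
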
Since a named tuple is a regular Python class, it is easy to add or change
functionality with a subclass.  Here is how to add a calculated field and
a fixed-width print format:

    >>> class Point(namedtuple('Point', 'x y')):
    ...     __slots__ = ()
    ...     @property
    ...     def hypot(self):
    ...         return (self.x ** 2 + self.y ** 2) ** 0.5
    ...     def __str__(self):
    ...         return 'Point: x=%6.3f y=%6.3f hypot=%6.3f' % (self.x, self.y, self.hypot)

    >>> for p in Point(3, 4), Point(14, 5/7.):
    ...     print p
    Point: x= 3.000 y= 4.000 hypot= 5.000
    Point: x=14.000 y= 0.714 hypot=14.018

The subclass shown above sets ``__slots__`` to an empty tuple.  This helps
keep memory requirements low by preventing the creation of instance dictionaries.

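One visible consequence can be sketched as follows. The class names
``SlottedPoint`` and ``PlainPoint`` are invented for the comparison: without an
instance dictionary there is nowhere to store an ad-hoc attribute.

```python
from collections import namedtuple

class SlottedPoint(namedtuple('SlottedPoint', 'x y')):
    __slots__ = ()          # no per-instance dictionary is created

class PlainPoint(namedtuple('PlainPoint', 'x y')):
    pass                    # a subclass without __slots__ gets one

slotted = SlottedPoint(3, 4)
plain = PlainPoint(3, 4)

plain.note = 'allowed'      # stored in the instance dictionary
try:
    slotted.note = 'refused'
    accepted = True
except AttributeError:
    # There is no instance dictionary to hold the new attribute.
    accepted = False
```
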
Subclassing is not useful for adding new, stored fields.  Instead, simply
create a new named tuple type from the :attr:`_fields` attribute:

    >>> Point3D = namedtuple('Point3D', Point._fields + ('z',))

Default values can be implemented by using :meth:`_replace` to
customize a prototype instance:

    >>> Account = namedtuple('Account', 'owner balance transaction_count')
    >>> default_account = Account('<owner name>', 0.0, 0)
    >>> johns_account = default_account._replace(owner='John')

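If many instances are built from the same prototype, the call can be wrapped
in a small factory. The helper name ``new_account`` is hypothetical and not
part of the module; this is one possible arrangement rather than a recommended
API.

```python
from collections import namedtuple

Account = namedtuple('Account', 'owner balance transaction_count')
default_account = Account('<owner name>', 0.0, 0)

def new_account(**overrides):
    # Hypothetical helper: start from the prototype, override selected fields.
    return default_account._replace(**overrides)

johns_account = new_account(owner='John')
janes_account = new_account(owner='Jane', balance=100.0)
```

The prototype itself is never modified, so every call starts from the same
defaults.
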
Enumerated constants can be implemented with named tuples, but it is simpler
and more efficient to use a simple class declaration:

    >>> Status = namedtuple('Status', 'open pending closed')._make(range(3))
    >>> Status.open, Status.pending, Status.closed
    (0, 1, 2)
    >>> class Status:
    ...     open, pending, closed = range(3)

.. seealso::

   `Named tuple recipe <http://code.activestate.com/recipes/500261/>`_
   adapted for Python 2.4.