:mod:`collections` --- Container datatypes
==========================================

.. module:: collections
   :synopsis: Container datatypes
.. moduleauthor:: Raymond Hettinger <python@rcn.com>
.. sectionauthor:: Raymond Hettinger <python@rcn.com>

.. testsetup:: *

   from collections import *
   import itertools
   __name__ = '<doctest>'

This module implements high-performance container datatypes.  Currently,
there are two datatypes, :class:`deque` and :class:`defaultdict`, and
one datatype factory function, :func:`namedtuple`.  This module also
provides the :class:`UserDict` and :class:`UserList` classes, which may
be useful when inheriting directly from :class:`dict` or
:class:`list` isn't convenient.

The specialized containers in this module are alternatives
to Python's general-purpose built-in containers, :class:`dict`,
:class:`list`, :class:`set`, and :class:`tuple`.
Besides the containers provided here, the optional :mod:`bsddb`
module offers the ability to create in-memory or file-based ordered
dictionaries with string keys using the :meth:`bsddb.btopen` method.

In addition to containers, the collections module provides some ABCs
(abstract base classes) that can be used to test whether a class
provides a particular interface, for example, whether it is hashable or
whether it is a mapping.  Some of them can also be used as mixin classes.

ABCs - abstract base classes
----------------------------

The collections module offers the following ABCs:

=========================  ====================  ======================  ====================================================
ABC                        Inherits              Abstract Methods        Mixin Methods
=========================  ====================  ======================  ====================================================
:class:`Container`                               ``__contains__``
:class:`Hashable`                                ``__hash__``
:class:`Iterable`                                ``__iter__``
:class:`Iterator`          :class:`Iterable`     ``__next__``            ``__iter__``
:class:`Sized`                                   ``__len__``

:class:`Mapping`           :class:`Sized`,       ``__getitem__``,        ``__contains__``, ``keys``, ``items``, ``values``,
                           :class:`Iterable`,    ``__len__``, and        ``get``, ``__eq__``, and ``__ne__``
                           :class:`Container`    ``__iter__``

:class:`MutableMapping`    :class:`Mapping`      ``__getitem__``,        Inherited Mapping methods and
                                                 ``__setitem__``,        ``pop``, ``popitem``, ``clear``, ``update``,
                                                 ``__delitem__``,        and ``setdefault``
                                                 ``__iter__``, and
                                                 ``__len__``

:class:`Sequence`          :class:`Sized`,       ``__getitem__``         ``__contains__``, ``__iter__``, ``__reversed__``,
                           :class:`Iterable`,    and ``__len__``         ``index``, and ``count``
                           :class:`Container`

:class:`MutableSequence`   :class:`Sequence`     ``__getitem__``,        Inherited Sequence methods and
                                                 ``__delitem__``,        ``append``, ``reverse``, ``extend``, ``pop``,
                                                 ``insert``,             ``remove``, and ``__iadd__``
                                                 and ``__len__``

:class:`Set`               :class:`Sized`,       ``__len__``,            ``__le__``, ``__lt__``, ``__eq__``, ``__ne__``,
                           :class:`Iterable`,    ``__iter__``, and       ``__gt__``, ``__ge__``, ``__and__``, ``__or__``,
                           :class:`Container`    ``__contains__``        ``__sub__``, ``__xor__``, and ``isdisjoint``

:class:`MutableSet`        :class:`Set`          ``add`` and             Inherited Set methods and
                                                 ``discard``             ``clear``, ``pop``, ``remove``, ``__ior__``,
                                                                         ``__iand__``, ``__ixor__``, and ``__isub__``
=========================  ====================  ======================  ====================================================

These ABCs allow us to ask classes or instances if they provide
particular functionality, for example::

    size = None
    if isinstance(myvar, collections.Sized):
        size = len(myvar)

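The same kind of check can be made against several ABCs at once.  The sketch
below is only an illustration: ``describe`` is a hypothetical helper, and it
assumes a current Python 3 interpreter, where the ABCs are imported from
``collections.abc`` rather than from ``collections`` directly.

```python
from collections.abc import Hashable, Iterable, Mapping, Sized

def describe(obj):
    """Return the names of the ABCs (from a fixed list) that obj satisfies."""
    abcs = (Hashable, Iterable, Mapping, Sized)
    return [abc.__name__ for abc in abcs if isinstance(obj, abc)]

list_result = describe([1, 2])    # lists are iterable and sized, but not hashable
dict_result = describe({'a': 1})  # dicts additionally satisfy Mapping
int_result = describe(42)         # ints are hashable only
```

No inheritance from the ABCs is needed for these checks to succeed; the
built-in types are recognized because they define the required methods.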
Several of the ABCs are also useful as mixins that make it easier to develop
classes supporting container APIs.  For example, to write a class supporting
the full :class:`Set` API, it is only necessary to supply the three underlying
abstract methods: :meth:`__contains__`, :meth:`__iter__`, and :meth:`__len__`.
The ABC supplies the remaining methods such as :meth:`__and__` and
:meth:`isdisjoint` ::

    class ListBasedSet(collections.Set):
        ''' Alternate set implementation favoring space over speed
            and not requiring the set elements to be hashable. '''
        def __init__(self, iterable):
            self.elements = lst = []
            for value in iterable:
                if value not in lst:
                    lst.append(value)
        def __iter__(self):
            return iter(self.elements)
        def __contains__(self, value):
            return value in self.elements
        def __len__(self):
            return len(self.elements)

    s1 = ListBasedSet('abcdef')
    s2 = ListBasedSet('defghi')
    overlap = s1 & s2            # The __and__() method is supported automatically

Notes on using :class:`Set` and :class:`MutableSet` as a mixin:

(1)
   Since some set operations create new sets, the default mixin methods need
   a way to create new instances from an iterable.  The class constructor is
   assumed to have a signature in the form ``ClassName(iterable)``.
   That assumption is factored out to a single internal classmethod called
   :meth:`_from_iterable`, which calls ``cls(iterable)`` to produce a new set.
   If the :class:`Set` mixin is being used in a class with a different
   constructor signature, you will need to override :meth:`_from_iterable`
   with a classmethod that can construct new instances from
   an iterable argument.

(2)
   To override the comparisons (presumably for speed, as the
   semantics are fixed), redefine :meth:`__le__` and
   then the other operations will automatically follow suit.

(3)
   The :class:`Set` mixin provides a :meth:`_hash` method to compute a hash value
   for the set; however, :meth:`__hash__` is not defined because not all sets
   are hashable or immutable.  To add set hashability using mixins,
   inherit from both :class:`Set` and :class:`Hashable`, then define
   ``__hash__ = Set._hash``.

(For more about ABCs, see the :mod:`abc` module and :pep:`3119`.)


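Notes (1) and (3) can be illustrated together.  The following is a minimal
sketch rather than a recommended design: ``TaggedSet`` and its ``tag``
argument are hypothetical, and the imports assume a current Python 3
interpreter where the ABCs live in ``collections.abc``.

```python
from collections.abc import Hashable, Set

class TaggedSet(Set, Hashable):
    """Hypothetical immutable set whose constructor takes an extra argument."""

    __hash__ = Set._hash                 # note (3): opt in to hashing

    def __init__(self, tag, iterable):
        self.tag = tag
        # dict.fromkeys() drops duplicates while keeping first-seen order
        self.elements = tuple(dict.fromkeys(iterable))

    @classmethod
    def _from_iterable(cls, iterable):
        # note (1): the constructor is not ClassName(iterable), so tell the
        # mixin methods how to build a new instance from an iterable
        return cls('derived', iterable)

    def __contains__(self, value):
        return value in self.elements

    def __iter__(self):
        return iter(self.elements)

    def __len__(self):
        return len(self.elements)

a = TaggedSet('a', 'abcd')
b = TaggedSet('b', 'cdef')
common = a & b                           # built through _from_iterable()
```

Because ``_hash`` depends only on the elements, two ``TaggedSet`` instances
that compare equal also hash equally, regardless of their tags or of the order
in which the elements were supplied.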
.. _deque-objects:

:class:`deque` objects
----------------------


.. class:: deque([iterable[, maxlen]])

   Returns a new deque object initialized left-to-right (using :meth:`append`) with
   data from *iterable*.  If *iterable* is not specified, the new deque is empty.

   Deques are a generalization of stacks and queues (the name is pronounced "deck"
   and is short for "double-ended queue").  Deques support thread-safe, memory
   efficient appends and pops from either side of the deque with approximately the
   same O(1) performance in either direction.

   Though :class:`list` objects support similar operations, they are optimized for
   fast fixed-length operations and incur O(n) memory movement costs for
   ``pop(0)`` and ``insert(0, v)`` operations which change both the size and
   position of the underlying data representation.

   If *maxlen* is not specified or is ``None``, deques may grow to an
   arbitrary length.  Otherwise, the deque is bounded to the specified maximum
   length.  Once a bounded length deque is full, when new items are added, a
   corresponding number of items are discarded from the opposite end.  Bounded
   length deques provide functionality similar to the ``tail`` filter in
   Unix.  They are also useful for tracking transactions and other pools of data
   where only the most recent activity is of interest.

   Deque objects support the following methods:

   .. method:: append(x)

      Add *x* to the right side of the deque.


   .. method:: appendleft(x)

      Add *x* to the left side of the deque.


   .. method:: clear()

      Remove all elements from the deque leaving it with length 0.


   .. method:: extend(iterable)

      Extend the right side of the deque by appending elements from the iterable
      argument.


   .. method:: extendleft(iterable)

      Extend the left side of the deque by appending elements from *iterable*.
      Note, the series of left appends results in reversing the order of
      elements in the iterable argument.


   .. method:: pop()

      Remove and return an element from the right side of the deque.  If no
      elements are present, raises an :exc:`IndexError`.


   .. method:: popleft()

      Remove and return an element from the left side of the deque.  If no
      elements are present, raises an :exc:`IndexError`.


   .. method:: remove(value)

      Remove the first occurrence of *value*.  If not found, raises a
      :exc:`ValueError`.


   .. method:: rotate(n)

      Rotate the deque *n* steps to the right.  If *n* is negative, rotate to
      the left.  Rotating one step to the right is equivalent to:
      ``d.appendleft(d.pop())``.


In addition to the above, deques support iteration, pickling, ``len(d)``,
``reversed(d)``, ``copy.copy(d)``, ``copy.deepcopy(d)``, membership testing with
the :keyword:`in` operator, and subscript references such as ``d[-1]``.

Example:

.. doctest::

   >>> from collections import deque
   >>> d = deque('ghi')                 # make a new deque with three items
   >>> for elem in d:                   # iterate over the deque's elements
   ...     print(elem.upper())
   G
   H
   I

   >>> d.append('j')                    # add a new entry to the right side
   >>> d.appendleft('f')                # add a new entry to the left side
   >>> d                                # show the representation of the deque
   deque(['f', 'g', 'h', 'i', 'j'])

   >>> d.pop()                          # return and remove the rightmost item
   'j'
   >>> d.popleft()                      # return and remove the leftmost item
   'f'
   >>> list(d)                          # list the contents of the deque
   ['g', 'h', 'i']
   >>> d[0]                             # peek at leftmost item
   'g'
   >>> d[-1]                            # peek at rightmost item
   'i'

   >>> list(reversed(d))                # list the contents of a deque in reverse
   ['i', 'h', 'g']
   >>> 'h' in d                         # search the deque
   True
   >>> d.extend('jkl')                  # add multiple elements at once
   >>> d
   deque(['g', 'h', 'i', 'j', 'k', 'l'])
   >>> d.rotate(1)                      # right rotation
   >>> d
   deque(['l', 'g', 'h', 'i', 'j', 'k'])
   >>> d.rotate(-1)                     # left rotation
   >>> d
   deque(['g', 'h', 'i', 'j', 'k', 'l'])

   >>> deque(reversed(d))               # make a new deque in reverse order
   deque(['l', 'k', 'j', 'i', 'h', 'g'])
   >>> d.clear()                        # empty the deque
   >>> d.pop()                          # cannot pop from an empty deque
   Traceback (most recent call last):
     File "<pyshell#6>", line 1, in -toplevel-
       d.pop()
   IndexError: pop from an empty deque

   >>> d.extendleft('abc')              # extendleft() reverses the input order
   >>> d
   deque(['c', 'b', 'a'])


.. _deque-recipes:

:class:`deque` Recipes
^^^^^^^^^^^^^^^^^^^^^^

This section shows various approaches to working with deques.

The :meth:`rotate` method provides a way to implement :class:`deque` slicing and
deletion.  For example, a pure Python implementation of ``del d[n]`` relies on
the :meth:`rotate` method to position elements to be popped::

    def delete_nth(d, n):
        d.rotate(-n)
        d.popleft()
        d.rotate(n)

To implement :class:`deque` slicing, use a similar approach applying
:meth:`rotate` to bring a target element to the left side of the deque.  Remove
old entries with :meth:`popleft`, add new entries with :meth:`extend`, and then
reverse the rotation.
With minor variations on that approach, it is easy to implement Forth style
stack manipulations such as ``dup``, ``drop``, ``swap``, ``over``, ``pick``,
``rot``, and ``roll``.

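The slicing approach described above might be sketched as follows.  The names
``replace_slice`` and ``rot`` are hypothetical, only non-negative indices with
``start <= stop <= len(d)`` are handled, and ``rot`` assumes the right end of
the deque is the top of the stack.

```python
from collections import deque

def replace_slice(d, start, stop, new_items):
    """Replace d[start:stop] in place with new_items."""
    new_items = list(new_items)
    d.rotate(-start)                     # bring d[start] to the left end
    for _ in range(stop - start):
        d.popleft()                      # remove the old entries
    d.extend(new_items)                  # add the new entries on the right
    d.rotate(start + len(new_items))     # reverse the rotation

def rot(d):
    """Forth-style rot: move the third item from the top to the top."""
    d.rotate(2)                          # third-from-top is now rightmost
    third = d.pop()
    d.rotate(-2)                         # restore the order of the rest
    d.append(third)

letters = deque('abcdef')
replace_slice(letters, 1, 3, 'XYZ')      # replaces 'b' and 'c'

stack = deque([1, 2, 3, 4])
rot(stack)
```

Since the new entries are appended on the right, the final rotation has to
move both the ``start`` leading elements and the new entries back to the
front, which is why it is ``start + len(new_items)`` steps instead of just
``start``.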
Multi-pass data reduction algorithms can be succinctly expressed and efficiently
coded by extracting elements with multiple calls to :meth:`popleft`, applying
a reduction function, and calling :meth:`append` to add the result back to the
deque.

For example, building a balanced binary tree of nested lists entails reducing
two adjacent nodes into one by grouping them in a list:

    >>> def maketree(iterable):
    ...     d = deque(iterable)
    ...     while len(d) > 1:
    ...         pair = [d.popleft(), d.popleft()]
    ...         d.append(pair)
    ...     return list(d)
    ...
    >>> print(maketree('abcdefgh'))
    [[[['a', 'b'], ['c', 'd']], [['e', 'f'], ['g', 'h']]]]

Bounded length deques provide functionality similar to the ``tail`` filter
in Unix::

    def tail(filename, n=10):
        'Return the last n lines of a file'
        with open(filename) as f:        # close the file once it has been read
            return deque(f, n)

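The discard behaviour of a bounded deque can be seen without a file.  This is
a small illustrative sketch; the event names are made up.

```python
from collections import deque

recent = deque(maxlen=3)                 # keep only the three newest items
for event in ['login', 'view', 'edit', 'save', 'logout']:
    recent.append(event)                 # once full, the leftmost item is dropped

after_appends = list(recent)

recent.appendleft('start')               # adding on the left drops from the right
after_appendleft = list(recent)
```

Items are always discarded from the end opposite to the one being added to,
so the deque never grows beyond its *maxlen*.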
.. _defaultdict-objects:

:class:`defaultdict` objects
----------------------------


.. class:: defaultdict([default_factory[, ...]])

   Returns a new dictionary-like object.  :class:`defaultdict` is a subclass of the
   builtin :class:`dict` class.  It overrides one method and adds one writable
   instance variable.  The remaining functionality is the same as for the
   :class:`dict` class and is not documented here.

   The first argument provides the initial value for the :attr:`default_factory`
   attribute; it defaults to ``None``.  All remaining arguments are treated the same
   as if they were passed to the :class:`dict` constructor, including keyword
   arguments.


   :class:`defaultdict` objects support the following method in addition to the
   standard :class:`dict` operations:

   .. method:: defaultdict.__missing__(key)

      If the :attr:`default_factory` attribute is ``None``, this raises a
      :exc:`KeyError` exception with the *key* as argument.

      If :attr:`default_factory` is not ``None``, it is called without arguments
      to provide a default value for the given *key*; this value is inserted in
      the dictionary for the *key*, and returned.

      If calling :attr:`default_factory` raises an exception, this exception is
      propagated unchanged.

      This method is called by the :meth:`__getitem__` method of the
      :class:`dict` class when the requested key is not found; whatever it
      returns or raises is then returned or raised by :meth:`__getitem__`.


   :class:`defaultdict` objects support the following instance variable:


   .. attribute:: defaultdict.default_factory

      This attribute is used by the :meth:`__missing__` method; it is
      initialized from the first argument to the constructor, if present, or to
      ``None``, if absent.


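The interaction between :attr:`default_factory` and :meth:`__missing__` can
be sketched as follows.  The key names are arbitrary, and the remark about
``get()`` is a property of :class:`dict` rather than something stated above.

```python
from collections import defaultdict

d = defaultdict()                        # default_factory is None
try:
    d['absent']
except KeyError as exc:
    missing_key = exc.args[0]            # the key is the exception's argument

d.default_factory = list                 # the attribute is writable
d['absent'].append(1)                    # __missing__ inserts [] and returns it

# Only subscript lookups go through __missing__; get() does not,
# so no entry is created for 'other'.
looked_up = d.get('other')
```

After this runs, ``d`` holds the single entry ``'absent'``; the ``get()``
call neither invoked the factory nor added a key.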
.. _defaultdict-examples:

:class:`defaultdict` Examples
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Using :class:`list` as the :attr:`default_factory`, it is easy to group a
sequence of key-value pairs into a dictionary of lists:

    >>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
    >>> d = defaultdict(list)
    >>> for k, v in s:
    ...     d[k].append(v)
    ...
    >>> sorted(d.items())
    [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

When each key is encountered for the first time, it is not already in the
mapping; so an entry is automatically created using the :attr:`default_factory`
function which returns an empty :class:`list`.  The :meth:`list.append`
operation then attaches the value to the new list.  When keys are encountered
again, the look-up proceeds normally (returning the list for that key) and the
:meth:`list.append` operation adds another value to the list.  This technique is
simpler and faster than an equivalent technique using :meth:`dict.setdefault`:

    >>> d = {}
    >>> for k, v in s:
    ...     d.setdefault(k, []).append(v)
    ...
    >>> sorted(d.items())
    [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

Setting the :attr:`default_factory` to :class:`int` makes the
:class:`defaultdict` useful for counting (like a bag or multiset in other
languages):

    >>> s = 'mississippi'
    >>> d = defaultdict(int)
    >>> for k in s:
    ...     d[k] += 1
    ...
    >>> sorted(d.items())
    [('i', 4), ('m', 1), ('p', 2), ('s', 4)]

When a letter is first encountered, it is missing from the mapping, so the
:attr:`default_factory` function calls :func:`int` to supply a default count of
zero.  The increment operation then builds up the count for each letter.

The function :func:`int`, which always returns zero, is just a special case of
constant functions.  A faster and more flexible way to create constant functions
is to use a lambda function which can supply any constant value (not just
zero):

    >>> def constant_factory(value):
    ...     return lambda: value
    >>> d = defaultdict(constant_factory('<missing>'))
    >>> d.update(name='John', action='ran')
    >>> '%(name)s %(action)s to %(object)s' % d
    'John ran to <missing>'

Setting the :attr:`default_factory` to :class:`set` makes the
:class:`defaultdict` useful for building a dictionary of sets:

    >>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]
    >>> d = defaultdict(set)
    >>> for k, v in s:
    ...     d[k].add(v)
    ...
    >>> sorted(d.items())
    [('blue', {2, 4}), ('red', {1, 3})]


.. _named-tuple-factory:

:func:`namedtuple` Factory Function for Tuples with Named Fields
----------------------------------------------------------------

Named tuples assign meaning to each position in a tuple and allow for more readable,
self-documenting code.  They can be used wherever regular tuples are used, and
they add the ability to access fields by name instead of position index.

.. function:: namedtuple(typename, fieldnames, [verbose])

   Returns a new tuple subclass named *typename*.  The new subclass is used to
   create tuple-like objects that have fields accessible by attribute lookup as
   well as being indexable and iterable.  Instances of the subclass also have a
   helpful docstring (with typename and fieldnames) and a helpful :meth:`__repr__`
   method which lists the tuple contents in a ``name=value`` format.

   The *fieldnames* are a single string with each fieldname separated by whitespace
   and/or commas, for example ``'x y'`` or ``'x, y'``.  Alternatively, *fieldnames*
   can be a sequence of strings such as ``['x', 'y']``.

   Any valid Python identifier may be used for a fieldname except for names
   starting with an underscore.  Valid identifiers consist of letters, digits,
   and underscores but do not start with a digit or underscore and cannot be
   a :mod:`keyword` such as *class*, *for*, *return*, *global*, *pass*,
   or *raise*.

   If *verbose* is true, the class definition is printed just before being built.

   Named tuple instances do not have per-instance dictionaries, so they are
   lightweight and require no more memory than regular tuples.

Example:

.. doctest::
   :options: +NORMALIZE_WHITESPACE

   >>> Point = namedtuple('Point', 'x y', verbose=True)
   class Point(tuple):
           'Point(x, y)'
   <BLANKLINE>
           __slots__ = ()
   <BLANKLINE>
           _fields = ('x', 'y')
   <BLANKLINE>
           def __new__(cls, x, y):
               return tuple.__new__(cls, (x, y))
   <BLANKLINE>
           @classmethod
           def _make(cls, iterable, new=tuple.__new__, len=len):
               'Make a new Point object from a sequence or iterable'
               result = new(cls, iterable)
               if len(result) != 2:
                   raise TypeError('Expected 2 arguments, got %d' % len(result))
               return result
   <BLANKLINE>
           def __repr__(self):
               return 'Point(x=%r, y=%r)' % self
   <BLANKLINE>
           def _asdict(t):
               'Return a new dict which maps field names to their values'
               return {'x': t[0], 'y': t[1]}
   <BLANKLINE>
           def _replace(self, **kwds):
               'Return a new Point object replacing specified fields with new values'
               result = self._make(map(kwds.pop, ('x', 'y'), self))
               if kwds:
                   raise ValueError('Got unexpected field names: %r' % kwds.keys())
               return result
   <BLANKLINE>
           x = property(itemgetter(0))
           y = property(itemgetter(1))

   >>> p = Point(11, y=22)     # instantiate with positional or keyword arguments
   >>> p[0] + p[1]             # indexable like the plain tuple (11, 22)
   33
   >>> x, y = p                # unpack like a regular tuple
   >>> x, y
   (11, 22)
   >>> p.x + p.y               # fields also accessible by name
   33
   >>> p                       # readable __repr__ with a name=value style
   Point(x=11, y=22)

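The accepted forms of *fieldnames*, and the rejection of invalid names, can
be checked directly.  This is a short sketch; the type names ``A``, ``B``,
``C`` and ``Bad`` are placeholders, and it relies on :func:`namedtuple`
raising :exc:`ValueError` for field names it does not accept.

```python
from collections import namedtuple

A = namedtuple('A', 'x y')               # whitespace-separated string
B = namedtuple('B', 'x, y')              # comma-separated string
C = namedtuple('C', ['x', 'y'])          # sequence of strings

same_fields = A._fields == B._fields == C._fields == ('x', 'y')

# A keyword, a leading underscore, and a leading digit are all rejected.
rejected = []
for bad in ('class', '_hidden', '2nd'):
    try:
        namedtuple('Bad', ['x', bad])
    except ValueError:
        rejected.append(bad)
```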
Named tuples are especially useful for assigning field names to result tuples returned
by the :mod:`csv` or :mod:`sqlite3` modules::

    EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade')

    import csv
    # In Python 3, csv.reader() expects a text-mode file opened with newline=''
    for emp in map(EmployeeRecord._make, csv.reader(open("employees.csv", newline=''))):
        print(emp.name, emp.title)

    import sqlite3
    conn = sqlite3.connect('/companydata')
    cursor = conn.cursor()
    cursor.execute('SELECT name, age, title, department, paygrade FROM employees')
    for emp in map(EmployeeRecord._make, cursor.fetchall()):
        print(emp.name, emp.title)

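The :mod:`csv` half of that example can be tried without a file on disk.  In
the sketch below, the two rows of sample data are invented and an
``io.StringIO`` object stands in for ``employees.csv``.

```python
import csv
import io
from collections import namedtuple

EmployeeRecord = namedtuple('EmployeeRecord',
                            'name, age, title, department, paygrade')

# Hypothetical in-memory data standing in for employees.csv
sample = io.StringIO("Alice,30,Engineer,Research,5\nBob,45,Manager,Sales,7\n")

employees = [EmployeeRecord._make(row) for row in csv.reader(sample)]
titles = {emp.name: emp.title for emp in employees}
first_age = employees[0].age             # still a string: csv does no conversion
```

Note that :meth:`_make` checks only the number of fields, so every value read
from the CSV data remains a string unless it is converted explicitly.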
Christian Heimes99170a52007-12-19 02:07:34 +0000553In addition to the methods inherited from tuples, named tuples support
Christian Heimes2380ac72008-01-09 00:17:24 +0000554three additional methods and one attribute. To prevent conflicts with
555field names, the method and attribute names start with an underscore.
Christian Heimes99170a52007-12-19 02:07:34 +0000556
Christian Heimes790c8232008-01-07 21:14:23 +0000557.. method:: somenamedtuple._make(iterable)
Christian Heimes99170a52007-12-19 02:07:34 +0000558
Christian Heimesfaf2f632008-01-06 16:59:19 +0000559 Class method that makes a new instance from an existing sequence or iterable.
Christian Heimes99170a52007-12-19 02:07:34 +0000560
Christian Heimesfe337bf2008-03-23 21:54:12 +0000561.. doctest::
Thomas Wouters1b7f8912007-09-19 03:06:30 +0000562
Christian Heimesfaf2f632008-01-06 16:59:19 +0000563 >>> t = [11, 22]
564 >>> Point._make(t)
565 Point(x=11, y=22)
Thomas Wouters1b7f8912007-09-19 03:06:30 +0000566
Christian Heimes790c8232008-01-07 21:14:23 +0000567.. method:: somenamedtuple._asdict()
Georg Brandl9afde1c2007-11-01 20:32:30 +0000568
Christian Heimesfe337bf2008-03-23 21:54:12 +0000569 Return a new dict which maps field names to their corresponding values::
Georg Brandl9afde1c2007-11-01 20:32:30 +0000570
Christian Heimes0449f632007-12-15 01:27:15 +0000571 >>> p._asdict()
Georg Brandl9afde1c2007-11-01 20:32:30 +0000572 {'x': 11, 'y': 22}
Christian Heimesfe337bf2008-03-23 21:54:12 +0000573
.. method:: somenamedtuple._replace(**kwargs)

   Return a new instance of the named tuple, replacing the specified fields
   with new values::

      >>> p = Point(x=11, y=22)
      >>> p._replace(x=33)
      Point(x=33, y=22)

      >>> for partnum, record in inventory.items():
      ...     inventory[partnum] = record._replace(price=newprices[partnum], timestamp=time.time())

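Because named tuples are immutable, ``_replace`` never modifies the instance
it is called on.  The sketch below, again assuming a two-field ``Point``,
shows the original surviving unchanged and an unknown field name being
rejected:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
p = Point(x=11, y=22)
q = p._replace(x=33)

print(p)   # the original is unchanged
print(q)

# Unknown field names are rejected.  The exception type has differed between
# Python versions, so this sketch catches both candidates.
try:
    p._replace(z=1)
except (TypeError, ValueError) as exc:
    print('rejected:', exc)
```
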
.. attribute:: somenamedtuple._fields

   Tuple of strings listing the field names.  Useful for introspection
   and for creating new named tuple types from existing named tuples.

.. doctest::

   >>> p._fields            # view the field names
   ('x', 'y')

   >>> Color = namedtuple('Color', 'red green blue')
   >>> Pixel = namedtuple('Pixel', Point._fields + Color._fields)
   >>> Pixel(11, 22, 128, 255, 0)
   Pixel(x=11, y=22, red=128, green=255, blue=0)

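For the introspection use case, a short sketch (assuming the same ``Point``
type) pairs each field name with its value and builds a header line from the
names:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')
p = Point(11, 22)

# Pair each field name with its value; p iterates like a plain tuple.
pairs = list(zip(p._fields, p))
for name, value in pairs:
    print(name, '=', value)

# The field names are ordinary strings, so they can be joined into a header.
header = ','.join(Point._fields)
print(header)
```
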
To retrieve a field whose name is stored in a string, use the :func:`getattr`
function:

    >>> getattr(p, 'x')
    11

To convert a dictionary to a named tuple, use the double-star-operator [#]_:

    >>> d = {'x': 11, 'y': 22}
    >>> Point(**d)
    Point(x=11, y=22)

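The conversion only works when the dictionary's keys match the field names
exactly.  As a hedged sketch, a dictionary carrying extra keys (the ``label``
key here is hypothetical) can be narrowed with ``_fields`` before unpacking:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')

# Hypothetical input carrying one key that Point does not define.
d = {'x': 11, 'y': 22, 'label': 'origin'}

# Point(**d) would raise TypeError because of 'label', so keep only the
# keys that match the declared fields.
p = Point(**{name: d[name] for name in Point._fields})
print(p)   # Point(x=11, y=22)
```
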
Since a named tuple is a regular Python class, it is easy to add or change
functionality with a subclass.  Here is how to add a calculated field and
a fixed-width print format:

    >>> class Point(namedtuple('Point', 'x y')):
    ...     __slots__ = ()
    ...     @property
    ...     def hypot(self):
    ...         return (self.x ** 2 + self.y ** 2) ** 0.5
    ...     def __str__(self):
    ...         return 'Point: x=%6.3f y=%6.3f hypot=%6.3f' % (self.x, self.y, self.hypot)

    >>> for p in Point(3, 4), Point(14, 5/7.):
    ...     print(p)
    Point: x= 3.000 y= 4.000 hypot= 5.000
    Point: x=14.000 y= 0.714 hypot=14.018

The subclass shown above sets ``__slots__`` to an empty tuple.  This helps
keep memory requirements low by preventing the creation of instance dictionaries.

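The effect of the empty ``__slots__`` can be observed directly.  In this
sketch the class names ``SlottedPoint`` and ``PlainPoint`` are made up for
the comparison; only the slotted subclass refuses ad hoc attributes, because
it has no instance dictionary to store them in:

```python
from collections import namedtuple

class SlottedPoint(namedtuple('Point', 'x y')):
    __slots__ = ()          # no per-instance dictionary is created

class PlainPoint(namedtuple('Point', 'x y')):
    pass                    # instances get an instance dictionary

plain = PlainPoint(3, 4)
plain.note = 'allowed'      # stored in the instance dictionary

slotted = SlottedPoint(3, 4)
try:
    slotted.note = 'not allowed'
except AttributeError as exc:
    print('rejected:', exc)
```
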
Subclassing is not useful for adding new, stored fields.  Instead, simply
create a new named tuple type from the :attr:`_fields` attribute:

    >>> Point3D = namedtuple('Point3D', Point._fields + ('z',))

Default values can be implemented by using :meth:`_replace` to
customize a prototype instance:

    >>> Account = namedtuple('Account', 'owner balance transaction_count')
    >>> default_account = Account('<owner name>', 0.0, 0)
    >>> johns_account = default_account._replace(owner='John')

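The prototype approach can be wrapped in a small factory.  The helper name
``new_account`` below is hypothetical and not part of the module; it simply
copies the prototype and overrides whichever fields are passed:

```python
from collections import namedtuple

Account = namedtuple('Account', 'owner balance transaction_count')
default_account = Account('<owner name>', 0.0, 0)

def new_account(**overrides):
    """Hypothetical helper: copy the prototype, overriding selected fields."""
    return default_account._replace(**overrides)

johns_account = new_account(owner='John')
print(johns_account)
# Account(owner='John', balance=0.0, transaction_count=0)
```
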
Enumerated constants can be implemented with named tuples, but it is simpler
and more efficient to use a simple class declaration:

    >>> Status = namedtuple('Status', 'open pending closed')._make(range(3))
    >>> Status.open, Status.pending, Status.closed
    (0, 1, 2)
    >>> class Status:
    ...     open, pending, closed = range(3)

.. rubric:: Footnotes

.. [#] For information on the double-star-operator see
   :ref:`tut-unpacking-arguments` and :ref:`calls`.


:class:`UserDict` objects
-------------------------

The :class:`UserDict` class acts as a wrapper around dictionary objects.
The need for this class has been partially supplanted by the ability to
subclass directly from :class:`dict`; however, this class can be easier
to work with because the underlying dictionary is accessible as an
attribute.

.. class:: UserDict([initialdata])

   Class that simulates a dictionary.  The instance's contents are kept in a
   regular dictionary, which is accessible via the :attr:`data` attribute of
   :class:`UserDict` instances.  If *initialdata* is provided, :attr:`data` is
   initialized with its contents; note that a reference to *initialdata* will not
   be kept, allowing it to be used for other purposes.

In addition to supporting the methods and operations of mappings,
:class:`UserDict` instances provide the following attribute:

.. attribute:: UserDict.data

   A real dictionary used to store the contents of the :class:`UserDict`
   instance.

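A minimal sketch of why the :attr:`data` attribute is convenient: a subclass
can override the item methods and delegate to the underlying dictionary.  The
class name ``LowerKeyDict`` and the sample keys are assumptions made for this
example.

```python
from collections import UserDict

class LowerKeyDict(UserDict):
    """Hypothetical mapping that folds string keys to lower case."""

    def __setitem__(self, key, value):
        # Store under the folded key in the wrapped dictionary.
        self.data[key.lower()] = value

    def __getitem__(self, key):
        return self.data[key.lower()]

initial = {'Host': 'example.org'}
headers = LowerKeyDict(initial)          # contents are copied via __setitem__
headers['Content-Type'] = 'text/plain'

print(headers['HOST'])
print(headers.data)
print(headers.data is initial)           # no reference to initial is kept
```

A fuller version would also override ``__contains__`` and ``__delitem__``,
which this sketch leaves at their inherited behavior.
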
:class:`UserList` objects
-------------------------

This class acts as a wrapper around list objects.  It is a useful base class
for your own list-like classes, which can inherit from it and override
existing methods or add new ones.  In this way, one can add new behaviors to
lists.

The need for this class has been partially supplanted by the ability to
subclass directly from :class:`list`; however, this class can be easier
to work with because the underlying list is accessible as an attribute.

.. class:: UserList([list])

   Class that simulates a list.  The instance's contents are kept in a regular
   list, which is accessible via the :attr:`data` attribute of :class:`UserList`
   instances.  The instance's contents are initially set to a copy of *list*,
   defaulting to the empty list ``[]``.  *list* can be any iterable, for
   example a real Python list or a :class:`UserList` object.

In addition to supporting the methods and operations of mutable sequences,
:class:`UserList` instances provide the following attribute:

.. attribute:: UserList.data

   A real :class:`list` object used to store the contents of the
   :class:`UserList` instance.

**Subclassing requirements:** Subclasses of :class:`UserList` are expected to
offer a constructor which can be called with either no arguments or one
argument.  List operations which return a new sequence attempt to create an
instance of the actual implementation class.  To do so, they assume that the
constructor can be called with a single parameter, which is a sequence object
used as a data source.

If a derived class does not wish to comply with this requirement, all of the
special methods supported by this class will need to be overridden; please
consult the sources for information about the methods which need to be provided
in that case.

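The constructor requirement can be illustrated with a short sketch.  The
``Stack`` class and its ``peek`` method are hypothetical; the point is that
concatenation and repetition return a ``Stack`` because the constructor
accepts a single sequence argument.

```python
from collections import UserList

class Stack(UserList):
    """Hypothetical list-like class with one extra method."""

    def __init__(self, initlist=None):
        # Callable with zero or one argument, as the requirement asks.
        super().__init__(initlist)

    def peek(self):
        # The wrapped list is available as self.data.
        return self.data[-1]

s = Stack([1, 2, 3])
combined = s + [4]       # built by calling Stack(sequence)
doubled = s * 2          # likewise

print(type(combined).__name__, combined.data)
print(type(doubled).__name__, doubled.peek())
```
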
:class:`UserString` objects
---------------------------

The :class:`UserString` class acts as a wrapper around string objects.
The need for this class has been partially supplanted by the ability to
subclass directly from :class:`str`; however, this class can be easier
to work with because the underlying string is accessible as an
attribute.

.. class:: UserString([sequence])

   Class that simulates a string or a Unicode string object.  The instance's
   content is kept in a regular string object, which is accessible via the
   :attr:`data` attribute of :class:`UserString` instances.  The instance's
   contents are initially set to a copy of *sequence*.  The *sequence* can
   be an instance of :class:`bytes`, :class:`str`, :class:`UserString` (or a
   subclass) or an arbitrary sequence which can be converted into a string using
   the built-in :func:`str` function.
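
As a brief sketch of the behavior described above, the hypothetical ``Shout``
subclass below adds one method on top of the wrapped string, and a plain
:class:`UserString` is built from a non-string value to show the conversion
through :func:`str`:

```python
from collections import UserString

class Shout(UserString):
    """Hypothetical string-like class with one extra method."""

    def shout(self):
        # The wrapped str object is available as self.data.
        return self.data.upper() + '!'

s = Shout('hello')
n = UserString(42)         # non-string input is converted with str()
longer = s + ' world'      # concatenation returns the subclass type

print(s.shout())           # HELLO!
print(repr(n.data))
print(type(longer).__name__, longer.data)
```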