Blame - Doc/library/collections.rst - platform/external/python/cpython2

Georg Brandl	8ec7f65	2007-08-15 14:28:01 +0000	[diff] [blame]	1
				2	:mod:`collections` --- High-performance container datatypes
				3	===========================================================
				4
				5	.. module:: collections
				6	:synopsis: High-performance datatypes
				7	.. moduleauthor:: Raymond Hettinger <python@rcn.com>
				8	.. sectionauthor:: Raymond Hettinger <python@rcn.com>
				9
				10
				11	.. versionadded:: 2.4
				12
				13	This module implements high-performance container datatypes. Currently,
				14	there are two datatypes, :class:`deque` and :class:`defaultdict`, and
				15	one datatype factory function, :func:`NamedTuple`. Python already
				16	includes built-in containers, :class:`dict`, :class:`list`,
				17	:class:`set`, and :class:`tuple`. In addition, the optional :mod:`bsddb`
				18	module has a :meth:`bsddb.btopen` method that can be used to create in-memory
				19	or file based ordered dictionaries with string keys.
				20
				21	Future editions of the standard library may include balanced trees and
				22	ordered dictionaries.
				23
				24	.. versionchanged:: 2.5
				25	Added :class:`defaultdict`.
				26
				27	.. versionchanged:: 2.6
				28	Added :func:`NamedTuple`.
				29
				30
				31	.. _deque-objects:
				32
				33	:class:`deque` objects
				34	----------------------
				35
				36
				37	.. class:: deque([iterable])
				38
				39	Returns a new deque object initialized left-to-right (using :meth:`append`) with
				40	data from iterable. If iterable is not specified, the new deque is empty.
				41
				42	Deques are a generalization of stacks and queues (the name is pronounced "deck"
				43	and is short for "double-ended queue"). Deques support thread-safe, memory
				44	efficient appends and pops from either side of the deque with approximately the
				45	same O(1) performance in either direction.
				46
				47	Though :class:`list` objects support similar operations, they are optimized for
				48	fast fixed-length operations and incur O(n) memory movement costs for
				49	``pop(0)`` and ``insert(0, v)`` operations which change both the size and
				50	position of the underlying data representation.
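
As a small illustration of the difference described above, the sketch below drains the same items from the left end of a list and of a deque. Both produce the same order; only the cost of each removal differs. The helper names ``drain_list`` and ``drain_deque`` are made up for this example.

```python
from collections import deque

def drain_list(items):
    """Consume a list from the left; each pop(0) shifts the remaining items."""
    items = list(items)
    out = []
    while items:
        out.append(items.pop(0))   # O(n) memory movement per call
    return out

def drain_deque(items):
    """Consume a deque from the left; popleft() leaves other items in place."""
    d = deque(items)
    out = []
    while d:
        out.append(d.popleft())    # approximately O(1) per call
    return out

list_result = drain_list('abcd')
deque_result = drain_deque('abcd')
```

For short sequences the difference is negligible; it starts to matter when a long sequence is used as a queue.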
				51
				52	.. versionadded:: 2.4
				53
				54	Deque objects support the following methods:
				55
				56
				57	.. method:: deque.append(x)
				58
				59	Add x to the right side of the deque.
				60
				61
				62	.. method:: deque.appendleft(x)
				63
				64	Add x to the left side of the deque.
				65
				66
				67	.. method:: deque.clear()
				68
				69	Remove all elements from the deque leaving it with length 0.
				70
				71
				72	.. method:: deque.extend(iterable)
				73
				74	Extend the right side of the deque by appending elements from the iterable
				75	argument.
				76
				77
				78	.. method:: deque.extendleft(iterable)
				79
				80	Extend the left side of the deque by appending elements from iterable. Note
				81	that the series of left appends reverses the order of the elements in the
				82	iterable argument.
				83
				84
				85	.. method:: deque.pop()
				86
				87	Remove and return an element from the right side of the deque. If no elements
				88	are present, raises an :exc:`IndexError`.
				89
				90
				91	.. method:: deque.popleft()
				92
				93	Remove and return an element from the left side of the deque. If no elements are
				94	present, raises an :exc:`IndexError`.
				95
				96
				97	.. method:: deque.remove(value)
				98
				99	Remove the first occurrence of value. If not found, raises a
				100	:exc:`ValueError`.
				101
				102	.. versionadded:: 2.5
				103
				104
				105	.. method:: deque.rotate(n)
				106
				107	Rotate the deque n steps to the right. If n is negative, rotate to the
				108	left. Rotating one step to the right is equivalent to:
				109	``d.appendleft(d.pop())``.
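
The equivalence can be checked with a short sketch. The helper ``rotate_by_hand`` is a hypothetical name, and treating a left step as ``d.append(d.popleft())`` is an assumption that mirrors the right-rotation rule stated above.

```python
from collections import deque

def rotate_by_hand(d, n):
    """Rotate d by n steps using only pops and appends."""
    if not d:
        return                         # nothing to rotate in an empty deque
    for _ in range(abs(n)):
        if n > 0:
            d.appendleft(d.pop())      # one step to the right
        else:
            d.append(d.popleft())      # one step to the left

a = deque('ghijkl')
b = deque('ghijkl')
a.rotate(2)
rotate_by_hand(b, 2)

c = deque('ghijkl')
e = deque('ghijkl')
c.rotate(-1)
rotate_by_hand(e, -1)
```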
				110
				111	In addition to the above, deques support iteration, pickling, ``len(d)``,
				112	``reversed(d)``, ``copy.copy(d)``, ``copy.deepcopy(d)``, membership testing with
				113	the :keyword:`in` operator, and subscript references such as ``d[-1]``.
				114
				115	Example::
				116
				117	>>> from collections import deque
				118	>>> d = deque('ghi') # make a new deque with three items
				119	>>> for elem in d: # iterate over the deque's elements
				120	...     print elem.upper()
				121	G
				122	H
				123	I
				124
				125	>>> d.append('j') # add a new entry to the right side
				126	>>> d.appendleft('f') # add a new entry to the left side
				127	>>> d # show the representation of the deque
				128	deque(['f', 'g', 'h', 'i', 'j'])
				129
				130	>>> d.pop() # return and remove the rightmost item
				131	'j'
				132	>>> d.popleft() # return and remove the leftmost item
				133	'f'
				134	>>> list(d) # list the contents of the deque
				135	['g', 'h', 'i']
				136	>>> d[0] # peek at leftmost item
				137	'g'
				138	>>> d[-1] # peek at rightmost item
				139	'i'
				140
				141	>>> list(reversed(d)) # list the contents of a deque in reverse
				142	['i', 'h', 'g']
				143	>>> 'h' in d # search the deque
				144	True
				145	>>> d.extend('jkl') # add multiple elements at once
				146	>>> d
				147	deque(['g', 'h', 'i', 'j', 'k', 'l'])
				148	>>> d.rotate(1) # right rotation
				149	>>> d
				150	deque(['l', 'g', 'h', 'i', 'j', 'k'])
				151	>>> d.rotate(-1) # left rotation
				152	>>> d
				153	deque(['g', 'h', 'i', 'j', 'k', 'l'])
				154
				155	>>> deque(reversed(d)) # make a new deque in reverse order
				156	deque(['l', 'k', 'j', 'i', 'h', 'g'])
				157	>>> d.clear() # empty the deque
				158	>>> d.pop() # cannot pop from an empty deque
				159	Traceback (most recent call last):
				160	  File "<pyshell#6>", line 1, in -toplevel-
				161	    d.pop()
				162	IndexError: pop from an empty deque
				163
				164	>>> d.extendleft('abc') # extendleft() reverses the input order
				165	>>> d
				166	deque(['c', 'b', 'a'])
				167
				168
				169	.. _deque-recipes:
				170
				171	Recipes
				172	^^^^^^^
				173
				174	This section shows various approaches to working with deques.
				175
				176	The :meth:`rotate` method provides a way to implement :class:`deque` slicing and
				177	deletion. For example, a pure Python implementation of ``del d[n]`` relies on
				178	the :meth:`rotate` method to position elements to be popped::
				179
				180	def delete_nth(d, n):
				181	    d.rotate(-n)
				182	    d.popleft()
				183	    d.rotate(n)
				184
				185	To implement :class:`deque` slicing, use a similar approach applying
				186	:meth:`rotate` to bring a target element to the left side of the deque. Remove
				187	old entries with :meth:`popleft`, add new entries with :meth:`extend`, and then
				188	reverse the rotation.
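
A minimal sketch of that slicing approach follows. The function name ``replace_slice`` and its argument checks are assumptions made for this example. Note that the final rotation has to account for the number of new entries as well as the original offset.

```python
from collections import deque

def replace_slice(d, start, stop, new_items):
    """Emulate d[start:stop] = new_items for 0 <= start <= stop <= len(d)."""
    if not 0 <= start <= stop <= len(d):
        raise IndexError('slice bounds out of range')
    new_items = list(new_items)
    d.rotate(-start)                   # bring d[start] to the left side
    for _ in range(stop - start):
        d.popleft()                    # remove the old entries
    d.extend(new_items)                # add the new entries on the right
    d.rotate(start + len(new_items))   # move prefix and new entries back in front

d = deque('abcdefg')
replace_slice(d, 2, 4, 'XYZ')          # like d[2:4] = 'XYZ' on a list
```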
				189
				190	With minor variations on that approach, it is easy to implement Forth style
				191	stack manipulations such as ``dup``, ``drop``, ``swap``, ``over``, ``pick``,
				192	``rot``, and ``roll``.
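
One possible reading of those stack words is sketched below, treating the right end of the deque as the top of the stack. The function names follow the Forth words, but the implementations are illustrative assumptions rather than a reference implementation.

```python
from collections import deque

# The top of the stack is the right end of the deque.

def pick(stack, n):
    """Copy the item n places below the top onto the top."""
    stack.append(stack[-1 - n])

def roll(stack, n):
    """Move the item n places below the top onto the top."""
    stack.rotate(n)        # park the n items above the target on the left
    item = stack.pop()     # the target is now the rightmost item
    stack.rotate(-n)       # bring the parked items back
    stack.append(item)

def dup(stack):
    pick(stack, 0)         # ( a -- a a )

def over(stack):
    pick(stack, 1)         # ( a b -- a b a )

def swap(stack):
    roll(stack, 1)         # ( a b -- b a )

def rot(stack):
    roll(stack, 2)         # ( a b c -- b c a )

def drop(stack):
    stack.pop()            # ( a -- )

stack = deque([1, 2, 3])
rot(stack)
after_rot = list(stack)
swap(stack)
after_swap = list(stack)
over(stack)
dup(stack)
drop(stack)
final = list(stack)
```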
				193
				194	A round-robin task server can be built from a :class:`deque` using
				195	:meth:`popleft` to select the current task and :meth:`append` to add it back to
				196	the tasklist if the input stream is not exhausted::
				197
				198	>>> def roundrobin(*iterables):
				199	...     pending = deque(iter(i) for i in iterables)
				200	...     while pending:
				201	...         task = pending.popleft()
				202	...         try:
				203	...             yield task.next()
				204	...         except StopIteration:
				205	...             continue
				206	...         pending.append(task)
				207	...
				208	>>> for value in roundrobin('abc', 'd', 'efgh'):
				209	...     print value
				210
				211	a
				212	d
				213	e
				214	b
				215	f
				216	c
				217	g
				218	h
				219
				220
				221	Multi-pass data reduction algorithms can be succinctly expressed and efficiently
				222	coded by extracting elements with multiple calls to :meth:`popleft`, applying
				223	the reduction function, and calling :meth:`append` to add the result back to the
				224	queue.
				225
				226	For example, building a balanced binary tree of nested lists entails reducing
				227	two adjacent nodes into one by grouping them in a list::
				228
				229	>>> def maketree(iterable):
				230	...     d = deque(iterable)
				231	...     while len(d) > 1:
				232	...         pair = [d.popleft(), d.popleft()]
				233	...         d.append(pair)
				234	...     return list(d)
				235	...
				236	>>> print maketree('abcdefgh')
				237	[[[['a', 'b'], ['c', 'd']], [['e', 'f'], ['g', 'h']]]]
				238
				239
				240
				241	.. _defaultdict-objects:
				242
				243	:class:`defaultdict` objects
				244	----------------------------
				245
				246
				247	.. class:: defaultdict([default_factory[, ...]])
				248
				249	Returns a new dictionary-like object. :class:`defaultdict` is a subclass of the
				250	builtin :class:`dict` class. It overrides one method and adds one writable
				251	instance variable. The remaining functionality is the same as for the
				252	:class:`dict` class and is not documented here.
				253
				254	The first argument provides the initial value for the :attr:`default_factory`
				255	attribute; it defaults to ``None``. All remaining arguments are treated the same
				256	as if they were passed to the :class:`dict` constructor, including keyword
				257	arguments.
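
The constructor signature described above can be exercised as follows. This is only a sketch; the keys and values are arbitrary.

```python
from collections import defaultdict

# First argument: the default_factory. The rest is handled like dict(...).
d = defaultdict(list, {'a': [1]}, b=[2])
d['c'].append(3)            # missing key: list() supplies the default

# With no arguments at all, default_factory is None.
plain = defaultdict()
```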
				258
				259	.. versionadded:: 2.5
				260
				261	:class:`defaultdict` objects support the following method in addition to the
				262	standard :class:`dict` operations:
				263
				264
				265	.. method:: defaultdict.__missing__(key)
				266
				267	If the :attr:`default_factory` attribute is ``None``, this raises a
				268	:exc:`KeyError` exception with the key as argument.
				269
				270	If :attr:`default_factory` is not ``None``, it is called without arguments to
				271	provide a default value for the given key; this value is inserted in the
				272	dictionary for the key and returned.
				273
				274	If calling :attr:`default_factory` raises an exception, this exception is
				275	propagated unchanged.
				276
				277	This method is called by the :meth:`__getitem__` method of the :class:`dict`
				278	class when the requested key is not found; whatever it returns or raises is then
				279	returned or raised by :meth:`__getitem__`.
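
The three cases above can be observed in a short sketch. ``failing_factory`` is a hypothetical factory written for this example. The ``get()`` call is included to show that only ``__getitem__`` goes through :meth:`__missing__`.

```python
from collections import defaultdict

# Case 1: a factory is set, so the default is created, inserted and returned.
d = defaultdict(list)
via_getitem = d['spam']          # goes through __missing__
via_get = d.get('eggs')          # get() does not call __missing__

# Case 2: no factory, so a KeyError is raised with the key as its argument.
no_factory = defaultdict()
try:
    no_factory['spam']
    key_error_args = None
except KeyError as exc:
    key_error_args = exc.args

# Case 3: an exception raised by the factory is propagated unchanged.
def failing_factory():
    raise RuntimeError('no default available')

broken = defaultdict(failing_factory)
try:
    broken['spam']
    propagated = None
except RuntimeError as exc:
    propagated = str(exc)
```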
				280
				281	:class:`defaultdict` objects support the following instance variable:
				282
				283
				284	.. attribute:: defaultdict.default_factory
				285
				286	This attribute is used by the :meth:`__missing__` method; it is initialized from
				287	the first argument to the constructor, if present, or to ``None``, if absent.
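
Because the attribute is writable, the behavior for missing keys can be changed after construction. The sketch below, with arbitrary sample data, builds counts and then sets ``default_factory`` to ``None`` so that later lookups of unknown keys fail instead of inserting new entries.

```python
from collections import defaultdict

counts = defaultdict(int)
for ch in 'abca':
    counts[ch] += 1              # int() supplies 0 for new keys

counts.default_factory = None    # stop creating entries on lookup
try:
    counts['z']
    lookup_failed = False
except KeyError:
    lookup_failed = True
```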
				288
				289
				290	.. _defaultdict-examples:
				291
				292	:class:`defaultdict` Examples
				293	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				294
				295	Using :class:`list` as the :attr:`default_factory`, it is easy to group a
				296	sequence of key-value pairs into a dictionary of lists::
				297
				298	>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
				299	>>> d = defaultdict(list)
				300	>>> for k, v in s:
				301	...     d[k].append(v)
				302	...
				303	>>> d.items()
				304	[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
				305
				306	When each key is encountered for the first time, it is not already in the
				307	mapping; so an entry is automatically created using the :attr:`default_factory`
				308	function which returns an empty :class:`list`. The :meth:`list.append`
				309	operation then attaches the value to the new list. When keys are encountered
				310	again, the look-up proceeds normally (returning the list for that key) and the
				311	:meth:`list.append` operation adds another value to the list. This technique is
				312	simpler and faster than an equivalent technique using :meth:`dict.setdefault`::
				313
				314	>>> d = {}
				315	>>> for k, v in s:
				316	...     d.setdefault(k, []).append(v)
				317	...
				318	>>> d.items()
				319	[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
				320
				321	Setting the :attr:`default_factory` to :class:`int` makes the
				322	:class:`defaultdict` useful for counting (like a bag or multiset in other
				323	languages)::
				324
				325	>>> s = 'mississippi'
				326	>>> d = defaultdict(int)
				327	>>> for k in s:
				328	...     d[k] += 1
				329	...
				330	>>> d.items()
				331	[('i', 4), ('p', 2), ('s', 4), ('m', 1)]
				332
				333	When a letter is first encountered, it is missing from the mapping, so the
				334	:attr:`default_factory` function calls :func:`int` to supply a default count of
				335	zero. The increment operation then builds up the count for each letter.
				336
				337	The function :func:`int`, which always returns zero, is just a special case of
				338	constant functions. A faster and more flexible way to create constant functions
				339	is to use :func:`itertools.repeat` which can supply any constant value (not just
				340	zero)::
				341
				342	>>> def constant_factory(value):
				343	...     return itertools.repeat(value).next
				344	>>> d = defaultdict(constant_factory('<missing>'))
				345	>>> d.update(name='John', action='ran')
				346	>>> '%(name)s %(action)s to %(object)s' % d
				347	'John ran to <missing>'
				348
				349	Setting the :attr:`default_factory` to :class:`set` makes the
				350	:class:`defaultdict` useful for building a dictionary of sets::
				351
				352	>>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]
				353	>>> d = defaultdict(set)
				354	>>> for k, v in s:
				355	...     d[k].add(v)
				356	...
				357	>>> d.items()
				358	[('blue', set([2, 4])), ('red', set([1, 3]))]
				359
				360
				361	.. _named-tuple-factory:
				362
Raymond Hettinger	7268e9d	2007-09-20 03:03:43 +0000	[diff] [blame]	363	:func:`NamedTuple` Factory Function for Tuples with Named Fields
				364	----------------------------------------------------------------
Georg Brandl	8ec7f65	2007-08-15 14:28:01 +0000	[diff] [blame]	365
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	366	Named tuples assign meaning to each position in a tuple and allow for more readable,
				367	self-documenting code. They can be used wherever regular tuples are used, and
				368	they add the ability to access fields by name instead of position index.
Georg Brandl	8ec7f65	2007-08-15 14:28:01 +0000	[diff] [blame]	369
Raymond Hettinger	2b03d45	2007-09-18 03:33:19 +0000	[diff] [blame]	370	.. function:: NamedTuple(typename, fieldnames, [verbose])
Georg Brandl	8ec7f65	2007-08-15 14:28:01 +0000	[diff] [blame]	371
				372	Returns a new tuple subclass named typename. The new subclass is used to
				373	create tuple-like objects that have fields accessible by attribute lookup as
				374	well as being indexable and iterable. Instances of the subclass also have a
				375	helpful docstring (with typename and fieldnames) and a helpful :meth:`__repr__`
				376	method which lists the tuple contents in a ``name=value`` format.
				377
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	378	The fieldnames are specified in a single string with each fieldname separated by
Raymond Hettinger	7268e9d	2007-09-20 03:03:43 +0000	[diff] [blame]	379	a space and/or comma. Any valid Python identifier may be used for a fieldname.
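
The accepted fieldname format can be illustrated with a small helper. ``split_fieldnames`` is a hypothetical function written for this example, and it is not necessarily how the factory itself parses its argument.

```python
def split_fieldnames(fieldnames):
    """Split a string of names separated by spaces and/or commas."""
    # Treat commas as whitespace, then split on runs of whitespace.
    return tuple(fieldnames.replace(',', ' ').split())

spaces = split_fieldnames('x y')
mixed = split_fieldnames('red, green,blue')
```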
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	380
Raymond Hettinger	7268e9d	2007-09-20 03:03:43 +0000	[diff] [blame]	381	If verbose is true, the generated class definition is printed.
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	382
				383	NamedTuple instances do not have per-instance dictionaries, so they are
Raymond Hettinger	7268e9d	2007-09-20 03:03:43 +0000	[diff] [blame]	384	lightweight and require no more memory than regular tuples.
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	385
Georg Brandl	8ec7f65	2007-08-15 14:28:01 +0000	[diff] [blame]	386	.. versionadded:: 2.6
				387
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	388	Example::
Georg Brandl	8ec7f65	2007-08-15 14:28:01 +0000	[diff] [blame]	389
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	390	>>> Point = NamedTuple('Point', 'x y', True)
				391	class Point(tuple):
				392	    'Point(x, y)'
				393	    __slots__ = ()
				394	    __fields__ = ('x', 'y')
				395	    def __new__(cls, x, y):
				396	        return tuple.__new__(cls, (x, y))
				397	    def __repr__(self):
				398	        return 'Point(x=%r, y=%r)' % self
				399	    def __replace__(self, field, value):
				400	        'Return a new Point object replacing one field with a new value'
				401	        return Point(**dict(zip(('x', 'y'), self) + [(field, value)]))
				402	    x = property(itemgetter(0))
				403	    y = property(itemgetter(1))
Georg Brandl	8ec7f65	2007-08-15 14:28:01 +0000	[diff] [blame]	404
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	405	>>> p = Point(11, y=22) # instantiate with positional or keyword arguments
				406	>>> p[0] + p[1] # indexable like the regular tuple (11, 22)
				407	33
				408	>>> x, y = p # unpack like a regular tuple
				409	>>> x, y
				410	(11, 22)
				411	>>> p.x + p.y # fields also accessible by name
				412	33
				413	>>> p # readable __repr__ with a name=value style
				414	Point(x=11, y=22)
Georg Brandl	8ec7f65	2007-08-15 14:28:01 +0000	[diff] [blame]	415
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	416	Named tuples are especially useful for assigning field names to result tuples returned
				417	by the :mod:`csv` or :mod:`sqlite3` modules::
Georg Brandl	8ec7f65	2007-08-15 14:28:01 +0000	[diff] [blame]	418
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	419	from itertools import starmap
				420	import csv
				421	EmployeeRecord = NamedTuple('EmployeeRecord', 'name age title department paygrade')
				422	for emp in starmap(EmployeeRecord, csv.reader(open("employees.csv", "rb"))):
				423	    print emp.name, emp.title
Georg Brandl	8ec7f65	2007-08-15 14:28:01 +0000	[diff] [blame]	424
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	425	When casting a single record to a NamedTuple, use the star-operator [#]_ to unpack
				426	the values::
Georg Brandl	8ec7f65	2007-08-15 14:28:01 +0000	[diff] [blame]	427
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	428	>>> t = [11, 22]
				429	>>> Point(*t) # the star-operator unpacks any iterable object
				430	Point(x=11, y=22)
Raymond Hettinger	2b03d45	2007-09-18 03:33:19 +0000	[diff] [blame]	431
Raymond Hettinger	d36a60e	2007-09-17 00:55:00 +0000	[diff] [blame]	432	In addition to the methods inherited from tuples, named tuples support
				433	an additional method and an informational read-only attribute.
				434
				435	.. method:: somenamedtuple.__replace__(field, value)
				436
Raymond Hettinger	7268e9d	2007-09-20 03:03:43 +0000	[diff] [blame]	437	Return a new instance of the named tuple replacing the named field with a new value:
				438
				439	::
Raymond Hettinger	d36a60e	2007-09-17 00:55:00 +0000	[diff] [blame]	440
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	441	>>> p = Point(x=11, y=22)
Raymond Hettinger	d36a60e	2007-09-17 00:55:00 +0000	[diff] [blame]	442	>>> p.__replace__('x', 33)
				443	Point(x=33, y=22)
				444
				445	>>> for recordnum, record in enumerate(inventory):
				446	...     inventory[recordnum] = record.__replace__('total', record.price * record.quantity)
				447
Raymond Hettinger	d36a60e	2007-09-17 00:55:00 +0000	[diff] [blame]	448	.. attribute:: somenamedtuple.__fields__
				449
				450	A tuple of strings listing the field names. This is useful for introspection,
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	451	for converting a named tuple instance to a dictionary, and for combining named tuple
Raymond Hettinger	7268e9d	2007-09-20 03:03:43 +0000	[diff] [blame]	452	types to create new named tuple types:
				453
				454	::
Raymond Hettinger	d36a60e	2007-09-17 00:55:00 +0000	[diff] [blame]	455
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	456	>>> p.__fields__ # view the field names
				457	('x', 'y')
				458	>>> dict(zip(p.__fields__, p)) # convert to a dictionary
				459	{'y': 22, 'x': 11}
Raymond Hettinger	d36a60e	2007-09-17 00:55:00 +0000	[diff] [blame]	460
Raymond Hettinger	cbab594	2007-09-18 22:18:02 +0000	[diff] [blame]	461	>>> Color = NamedTuple('Color', 'red green blue')
				462	>>> pixel_fields = ' '.join(Point.__fields__ + Color.__fields__) # combine fields
				463	>>> Pixel = NamedTuple('Pixel', pixel_fields)
				464	>>> Pixel(11, 22, 128, 255, 0)
				465	Pixel(x=11, y=22, red=128, green=255, blue=0)
Raymond Hettinger	d36a60e	2007-09-17 00:55:00 +0000	[diff] [blame]	466
Mark Summerfield	7f626f4	2007-08-30 15:03:03 +0000	[diff] [blame]	467	.. rubric:: Footnotes
				468
				469	.. [#] For information on the star-operator see
				470	:ref:`tut-unpacking-arguments` and :ref:`calls`.