Blame - Doc/tutorial/datastructures.rst - platform/external/python/cpython2

blob: a5599566edf0973571a3524e16a43bc2f608b6a3 [file] [log] [blame]

Georg Brandl	8ec7f65	2007-08-15 14:28:01 +0000	[diff] [blame^]	1	.. _tut-structures:
				2
				3	***************
				4	Data Structures
				5	***************
				6
				7	This chapter describes some things you've learned about already in more detail,
				8	and adds some new things as well.
				9
				10
				11	.. _tut-morelists:
				12
				13	More on Lists
				14	=============
				15
				16	The list data type has some more methods. Here are all of the methods of list
				17	objects:
				18
				19
				20	.. method:: list.append(x)
				21
				22	Add an item to the end of the list; equivalent to ``a[len(a):] = [x]``.
				23
				24
				25	.. method:: list.extend(L)
				26
				27	Extend the list by appending all the items in the given list; equivalent to
				28	``a[len(a):] = L``.
				29
				30
				31	.. method:: list.insert(i, x)
				32
				33	Insert an item at a given position. The first argument is the index of the
				34	element before which to insert, so ``a.insert(0, x)`` inserts at the front of
				35	the list, and ``a.insert(len(a), x)`` is equivalent to ``a.append(x)``.
				36
				37
				38	.. method:: list.remove(x)
				39
				40	Remove the first item from the list whose value is x. It is an error if there
				41	is no such item.
				42
				43
				44	.. method:: list.pop([i])
				45
				46	Remove the item at the given position in the list, and return it. If no index
				47	is specified, ``a.pop()`` removes and returns the last item in the list. (The
				48	square brackets around the i in the method signature denote that the parameter
				49	is optional, not that you should type square brackets at that position. You
				50	will see this notation frequently in the Python Library Reference.)
				51
				52
				53	.. method:: list.index(x)
				54
				55	Return the index in the list of the first item whose value is x. It is an
				56	error if there is no such item.
				57
				58
				59	.. method:: list.count(x)
				60
				61	Return the number of times x appears in the list.
				62
				63
				64	.. method:: list.sort()
				65
				66	Sort the items of the list, in place.
				67
				68
				69	.. method:: list.reverse()
				70
				71	Reverse the elements of the list, in place.
				72
				73	An example that uses most of the list methods::
				74
				75	>>> a = [66.25, 333, 333, 1, 1234.5]
				76	>>> print a.count(333), a.count(66.25), a.count('x')
				77	2 1 0
				78	>>> a.insert(2, -1)
				79	>>> a.append(333)
				80	>>> a
				81	[66.25, 333, -1, 333, 1, 1234.5, 333]
				82	>>> a.index(333)
				83	1
				84	>>> a.remove(333)
				85	>>> a
				86	[66.25, -1, 333, 1, 1234.5, 333]
				87	>>> a.reverse()
				88	>>> a
				89	[333, 1234.5, 1, 333, -1, 66.25]
				90	>>> a.sort()
				91	>>> a
				92	[-1, 1, 66.25, 333, 333, 1234.5]
				93
				94
				95	.. _tut-lists-as-stacks:
				96
				97	Using Lists as Stacks
				98	---------------------
				99
				100	.. sectionauthor:: Ka-Ping Yee <ping@lfw.org>
				101
				102
				103	The list methods make it very easy to use a list as a stack, where the last
				104	element added is the first element retrieved ("last-in, first-out"). To add an
				105	item to the top of the stack, use :meth:`append`. To retrieve an item from the
				106	top of the stack, use :meth:`pop` without an explicit index. For example::
				107
				108	>>> stack = [3, 4, 5]
				109	>>> stack.append(6)
				110	>>> stack.append(7)
				111	>>> stack
				112	[3, 4, 5, 6, 7]
				113	>>> stack.pop()
				114	7
				115	>>> stack
				116	[3, 4, 5, 6]
				117	>>> stack.pop()
				118	6
				119	>>> stack.pop()
				120	5
				121	>>> stack
				122	[3, 4]
				123
				124
				125	.. _tut-lists-as-queues:
				126
				127	Using Lists as Queues
				128	---------------------
				129
				130	.. sectionauthor:: Ka-Ping Yee <ping@lfw.org>
				131
				132
				133	You can also use a list conveniently as a queue, where the first element added
				134	is the first element retrieved ("first-in, first-out"). To add an item to the
				135	back of the queue, use :meth:`append`. To retrieve an item from the front of
				136	the queue, use :meth:`pop` with ``0`` as the index. For example::
				137
				138	>>> queue = ["Eric", "John", "Michael"]
				139	>>> queue.append("Terry") # Terry arrives
				140	>>> queue.append("Graham") # Graham arrives
				141	>>> queue.pop(0)
				142	'Eric'
				143	>>> queue.pop(0)
				144	'John'
				145	>>> queue
				146	['Michael', 'Terry', 'Graham']
				147
				148
				149	.. _tut-functional:
				150
				151	Functional Programming Tools
				152	----------------------------
				153
				154	There are three built-in functions that are very useful when used with lists:
				155	:func:`filter`, :func:`map`, and :func:`reduce`.
				156
				157	``filter(function, sequence)`` returns a sequence consisting of those items from
				158	the sequence for which ``function(item)`` is true. If sequence is a
				159	:class:`string` or :class:`tuple`, the result will be of the same type;
				160	otherwise, it is always a :class:`list`. For example, to compute some primes::
				161
				162	>>> def f(x): return x % 2 != 0 and x % 3 != 0
				163	...
				164	>>> filter(f, range(2, 25))
				165	[5, 7, 11, 13, 17, 19, 23]
				166
				167	``map(function, sequence)`` calls ``function(item)`` for each of the sequence's
				168	items and returns a list of the return values. For example, to compute some
				169	cubes::
				170
				171	>>> def cube(x): return xxx
				172	...
				173	>>> map(cube, range(1, 11))
				174	[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
				175
				176	More than one sequence may be passed; the function must then have as many
				177	arguments as there are sequences and is called with the corresponding item from
				178	each sequence (or ``None`` if some sequence is shorter than another). For
				179	example::
				180
				181	>>> seq = range(8)
				182	>>> def add(x, y): return x+y
				183	...
				184	>>> map(add, seq, seq)
				185	[0, 2, 4, 6, 8, 10, 12, 14]
				186
				187	``reduce(function, sequence)`` returns a single value constructed by calling the
				188	binary function function on the first two items of the sequence, then on the
				189	result and the next item, and so on. For example, to compute the sum of the
				190	numbers 1 through 10::
				191
				192	>>> def add(x,y): return x+y
				193	...
				194	>>> reduce(add, range(1, 11))
				195	55
				196
				197	If there's only one item in the sequence, its value is returned; if the sequence
				198	is empty, an exception is raised.
				199
				200	A third argument can be passed to indicate the starting value. In this case the
				201	starting value is returned for an empty sequence, and the function is first
				202	applied to the starting value and the first sequence item, then to the result
				203	and the next item, and so on. For example, ::
				204
				205	>>> def sum(seq):
				206	... def add(x,y): return x+y
				207	... return reduce(add, seq, 0)
				208	...
				209	>>> sum(range(1, 11))
				210	55
				211	>>> sum([])
				212	0
				213
				214	Don't use this example's definition of :func:`sum`: since summing numbers is
				215	such a common need, a built-in function ``sum(sequence)`` is already provided,
				216	and works exactly like this.
				217
				218	.. versionadded:: 2.3
				219
				220
				221	List Comprehensions
				222	-------------------
				223
				224	List comprehensions provide a concise way to create lists without resorting to
				225	use of :func:`map`, :func:`filter` and/or :keyword:`lambda`. The resulting list
				226	definition tends often to be clearer than lists built using those constructs.
				227	Each list comprehension consists of an expression followed by a :keyword:`for`
				228	clause, then zero or more :keyword:`for` or :keyword:`if` clauses. The result
				229	will be a list resulting from evaluating the expression in the context of the
				230	:keyword:`for` and :keyword:`if` clauses which follow it. If the expression
				231	would evaluate to a tuple, it must be parenthesized. ::
				232
				233	>>> freshfruit = [' banana', ' loganberry ', 'passion fruit ']
				234	>>> [weapon.strip() for weapon in freshfruit]
				235	['banana', 'loganberry', 'passion fruit']
				236	>>> vec = [2, 4, 6]
				237	>>> [3*x for x in vec]
				238	[6, 12, 18]
				239	>>> [3*x for x in vec if x > 3]
				240	[12, 18]
				241	>>> [3*x for x in vec if x < 2]
				242	[]
				243	>>> [[x,x**2] for x in vec]
				244	[[2, 4], [4, 16], [6, 36]]
				245	>>> [x, x**2 for x in vec] # error - parens required for tuples
				246	File "<stdin>", line 1, in ?
				247	[x, x**2 for x in vec]
				248	^
				249	SyntaxError: invalid syntax
				250	>>> [(x, x**2) for x in vec]
				251	[(2, 4), (4, 16), (6, 36)]
				252	>>> vec1 = [2, 4, 6]
				253	>>> vec2 = [4, 3, -9]
				254	>>> [x*y for x in vec1 for y in vec2]
				255	[8, 6, -18, 16, 12, -36, 24, 18, -54]
				256	>>> [x+y for x in vec1 for y in vec2]
				257	[6, 5, -7, 8, 7, -5, 10, 9, -3]
				258	>>> [vec1[i]*vec2[i] for i in range(len(vec1))]
				259	[8, 12, -54]
				260
				261	List comprehensions are much more flexible than :func:`map` and can be applied
				262	to complex expressions and nested functions::
				263
				264	>>> [str(round(355/113.0, i)) for i in range(1,6)]
				265	['3.1', '3.14', '3.142', '3.1416', '3.14159']
				266
				267
				268	.. _tut-del:
				269
				270	The :keyword:`del` statement
				271	============================
				272
				273	There is a way to remove an item from a list given its index instead of its
				274	value: the :keyword:`del` statement. This differs from the :meth:`pop` method
				275	which returns a value. The :keyword:`del` statement can also be used to remove
				276	slices from a list or clear the entire list (which we did earlier by assignment
				277	of an empty list to the slice). For example::
				278
				279	>>> a = [-1, 1, 66.25, 333, 333, 1234.5]
				280	>>> del a[0]
				281	>>> a
				282	[1, 66.25, 333, 333, 1234.5]
				283	>>> del a[2:4]
				284	>>> a
				285	[1, 66.25, 1234.5]
				286	>>> del a[:]
				287	>>> a
				288	[]
				289
				290	:keyword:`del` can also be used to delete entire variables::
				291
				292	>>> del a
				293
				294	Referencing the name ``a`` hereafter is an error (at least until another value
				295	is assigned to it). We'll find other uses for :keyword:`del` later.
				296
				297
				298	.. _tut-tuples:
				299
				300	Tuples and Sequences
				301	====================
				302
				303	We saw that lists and strings have many common properties, such as indexing and
				304	slicing operations. They are two examples of sequence data types (see
				305	:ref:`typesseq`). Since Python is an evolving language, other sequence data
				306	types may be added. There is also another standard sequence data type: the
				307	tuple.
				308
				309	A tuple consists of a number of values separated by commas, for instance::
				310
				311	>>> t = 12345, 54321, 'hello!'
				312	>>> t[0]
				313	12345
				314	>>> t
				315	(12345, 54321, 'hello!')
				316	>>> # Tuples may be nested:
				317	... u = t, (1, 2, 3, 4, 5)
				318	>>> u
				319	((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
				320
				321	As you see, on output tuples are always enclosed in parentheses, so that nested
				322	tuples are interpreted correctly; they may be input with or without surrounding
				323	parentheses, although often parentheses are necessary anyway (if the tuple is
				324	part of a larger expression).
				325
				326	Tuples have many uses. For example: (x, y) coordinate pairs, employee records
				327	from a database, etc. Tuples, like strings, are immutable: it is not possible
				328	to assign to the individual items of a tuple (you can simulate much of the same
				329	effect with slicing and concatenation, though). It is also possible to create
				330	tuples which contain mutable objects, such as lists.
				331
				332	A special problem is the construction of tuples containing 0 or 1 items: the
				333	syntax has some extra quirks to accommodate these. Empty tuples are constructed
				334	by an empty pair of parentheses; a tuple with one item is constructed by
				335	following a value with a comma (it is not sufficient to enclose a single value
				336	in parentheses). Ugly, but effective. For example::
				337
				338	>>> empty = ()
				339	>>> singleton = 'hello', # <-- note trailing comma
				340	>>> len(empty)
				341	0
				342	>>> len(singleton)
				343	1
				344	>>> singleton
				345	('hello',)
				346
				347	The statement ``t = 12345, 54321, 'hello!'`` is an example of tuple packing:
				348	the values ``12345``, ``54321`` and ``'hello!'`` are packed together in a tuple.
				349	The reverse operation is also possible::
				350
				351	>>> x, y, z = t
				352
				353	This is called, appropriately enough, sequence unpacking. Sequence unpacking
				354	requires the list of variables on the left to have the same number of elements
				355	as the length of the sequence. Note that multiple assignment is really just a
				356	combination of tuple packing and sequence unpacking!
				357
				358	There is a small bit of asymmetry here: packing multiple values always creates
				359	a tuple, and unpacking works for any sequence.
				360
				361	.. % XXX Add a bit on the difference between tuples and lists.
				362
				363
				364	.. _tut-sets:
				365
				366	Sets
				367	====
				368
				369	Python also includes a data type for sets. A set is an unordered collection
				370	with no duplicate elements. Basic uses include membership testing and
				371	eliminating duplicate entries. Set objects also support mathematical operations
				372	like union, intersection, difference, and symmetric difference.
				373
				374	Here is a brief demonstration::
				375
				376	>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
				377	>>> fruit = set(basket) # create a set without duplicates
				378	>>> fruit
				379	set(['orange', 'pear', 'apple', 'banana'])
				380	>>> 'orange' in fruit # fast membership testing
				381	True
				382	>>> 'crabgrass' in fruit
				383	False
				384
				385	>>> # Demonstrate set operations on unique letters from two words
				386	...
				387	>>> a = set('abracadabra')
				388	>>> b = set('alacazam')
				389	>>> a # unique letters in a
				390	set(['a', 'r', 'b', 'c', 'd'])
				391	>>> a - b # letters in a but not in b
				392	set(['r', 'd', 'b'])
				393	>>> a \| b # letters in either a or b
				394	set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
				395	>>> a & b # letters in both a and b
				396	set(['a', 'c'])
				397	>>> a ^ b # letters in a or b but not both
				398	set(['r', 'd', 'b', 'm', 'z', 'l'])
				399
				400
				401	.. _tut-dictionaries:
				402
				403	Dictionaries
				404	============
				405
				406	Another useful data type built into Python is the dictionary (see
				407	:ref:`typesmapping`). Dictionaries are sometimes found in other languages as
				408	"associative memories" or "associative arrays". Unlike sequences, which are
				409	indexed by a range of numbers, dictionaries are indexed by keys, which can be
				410	any immutable type; strings and numbers can always be keys. Tuples can be used
				411	as keys if they contain only strings, numbers, or tuples; if a tuple contains
				412	any mutable object either directly or indirectly, it cannot be used as a key.
				413	You can't use lists as keys, since lists can be modified in place using index
				414	assignments, slice assignments, or methods like :meth:`append` and
				415	:meth:`extend`.
				416
				417	It is best to think of a dictionary as an unordered set of key: value pairs,
				418	with the requirement that the keys are unique (within one dictionary). A pair of
				419	braces creates an empty dictionary: ``{}``. Placing a comma-separated list of
				420	key:value pairs within the braces adds initial key:value pairs to the
				421	dictionary; this is also the way dictionaries are written on output.
				422
				423	The main operations on a dictionary are storing a value with some key and
				424	extracting the value given the key. It is also possible to delete a key:value
				425	pair with ``del``. If you store using a key that is already in use, the old
				426	value associated with that key is forgotten. It is an error to extract a value
				427	using a non-existent key.
				428
				429	The :meth:`keys` method of a dictionary object returns a list of all the keys
				430	used in the dictionary, in arbitrary order (if you want it sorted, just apply
				431	the :meth:`sort` method to the list of keys). To check whether a single key is
				432	in the dictionary, either use the dictionary's :meth:`has_key` method or the
				433	:keyword:`in` keyword.
				434
				435	Here is a small example using a dictionary::
				436
				437	>>> tel = {'jack': 4098, 'sape': 4139}
				438	>>> tel['guido'] = 4127
				439	>>> tel
				440	{'sape': 4139, 'guido': 4127, 'jack': 4098}
				441	>>> tel['jack']
				442	4098
				443	>>> del tel['sape']
				444	>>> tel['irv'] = 4127
				445	>>> tel
				446	{'guido': 4127, 'irv': 4127, 'jack': 4098}
				447	>>> tel.keys()
				448	['guido', 'irv', 'jack']
				449	>>> tel.has_key('guido')
				450	True
				451	>>> 'guido' in tel
				452	True
				453
				454	The :func:`dict` constructor builds dictionaries directly from lists of
				455	key-value pairs stored as tuples. When the pairs form a pattern, list
				456	comprehensions can compactly specify the key-value list. ::
				457
				458	>>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
				459	{'sape': 4139, 'jack': 4098, 'guido': 4127}
				460	>>> dict([(x, x**2) for x in (2, 4, 6)]) # use a list comprehension
				461	{2: 4, 4: 16, 6: 36}
				462
				463	Later in the tutorial, we will learn about Generator Expressions which are even
				464	better suited for the task of supplying key-values pairs to the :func:`dict`
				465	constructor.
				466
				467	When the keys are simple strings, it is sometimes easier to specify pairs using
				468	keyword arguments::
				469
				470	>>> dict(sape=4139, guido=4127, jack=4098)
				471	{'sape': 4139, 'jack': 4098, 'guido': 4127}
				472
				473
				474	.. _tut-loopidioms:
				475
				476	Looping Techniques
				477	==================
				478
				479	When looping through dictionaries, the key and corresponding value can be
				480	retrieved at the same time using the :meth:`iteritems` method. ::
				481
				482	>>> knights = {'gallahad': 'the pure', 'robin': 'the brave'}
				483	>>> for k, v in knights.iteritems():
				484	... print k, v
				485	...
				486	gallahad the pure
				487	robin the brave
				488
				489	When looping through a sequence, the position index and corresponding value can
				490	be retrieved at the same time using the :func:`enumerate` function. ::
				491
				492	>>> for i, v in enumerate(['tic', 'tac', 'toe']):
				493	... print i, v
				494	...
				495	0 tic
				496	1 tac
				497	2 toe
				498
				499	To loop over two or more sequences at the same time, the entries can be paired
				500	with the :func:`zip` function. ::
				501
				502	>>> questions = ['name', 'quest', 'favorite color']
				503	>>> answers = ['lancelot', 'the holy grail', 'blue']
				504	>>> for q, a in zip(questions, answers):
				505	... print 'What is your %s? It is %s.' % (q, a)
				506	...
				507	What is your name? It is lancelot.
				508	What is your quest? It is the holy grail.
				509	What is your favorite color? It is blue.
				510
				511	To loop over a sequence in reverse, first specify the sequence in a forward
				512	direction and then call the :func:`reversed` function. ::
				513
				514	>>> for i in reversed(xrange(1,10,2)):
				515	... print i
				516	...
				517	9
				518	7
				519	5
				520	3
				521	1
				522
				523	To loop over a sequence in sorted order, use the :func:`sorted` function which
				524	returns a new sorted list while leaving the source unaltered. ::
				525
				526	>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
				527	>>> for f in sorted(set(basket)):
				528	... print f
				529	...
				530	apple
				531	banana
				532	orange
				533	pear
				534
				535
				536	.. _tut-conditions:
				537
				538	More on Conditions
				539	==================
				540
				541	The conditions used in ``while`` and ``if`` statements can contain any
				542	operators, not just comparisons.
				543
				544	The comparison operators ``in`` and ``not in`` check whether a value occurs
				545	(does not occur) in a sequence. The operators ``is`` and ``is not`` compare
				546	whether two objects are really the same object; this only matters for mutable
				547	objects like lists. All comparison operators have the same priority, which is
				548	lower than that of all numerical operators.
				549
				550	Comparisons can be chained. For example, ``a < b == c`` tests whether ``a`` is
				551	less than ``b`` and moreover ``b`` equals ``c``.
				552
				553	Comparisons may be combined using the Boolean operators ``and`` and ``or``, and
				554	the outcome of a comparison (or of any other Boolean expression) may be negated
				555	with ``not``. These have lower priorities than comparison operators; between
				556	them, ``not`` has the highest priority and ``or`` the lowest, so that ``A and
				557	not B or C`` is equivalent to ``(A and (not B)) or C``. As always, parentheses
				558	can be used to express the desired composition.
				559
				560	The Boolean operators ``and`` and ``or`` are so-called short-circuit
				561	operators: their arguments are evaluated from left to right, and evaluation
				562	stops as soon as the outcome is determined. For example, if ``A`` and ``C`` are
				563	true but ``B`` is false, ``A and B and C`` does not evaluate the expression
				564	``C``. When used as a general value and not as a Boolean, the return value of a
				565	short-circuit operator is the last evaluated argument.
				566
				567	It is possible to assign the result of a comparison or other Boolean expression
				568	to a variable. For example, ::
				569
				570	>>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
				571	>>> non_null = string1 or string2 or string3
				572	>>> non_null
				573	'Trondheim'
				574
				575	Note that in Python, unlike C, assignment cannot occur inside expressions. C
				576	programmers may grumble about this, but it avoids a common class of problems
				577	encountered in C programs: typing ``=`` in an expression when ``==`` was
				578	intended.
				579
				580
				581	.. _tut-comparing:
				582
				583	Comparing Sequences and Other Types
				584	===================================
				585
				586	Sequence objects may be compared to other objects with the same sequence type.
				587	The comparison uses lexicographical ordering: first the first two items are
				588	compared, and if they differ this determines the outcome of the comparison; if
				589	they are equal, the next two items are compared, and so on, until either
				590	sequence is exhausted. If two items to be compared are themselves sequences of
				591	the same type, the lexicographical comparison is carried out recursively. If
				592	all items of two sequences compare equal, the sequences are considered equal.
				593	If one sequence is an initial sub-sequence of the other, the shorter sequence is
				594	the smaller (lesser) one. Lexicographical ordering for strings uses the ASCII
				595	ordering for individual characters. Some examples of comparisons between
				596	sequences of the same type::
				597
				598	(1, 2, 3) < (1, 2, 4)
				599	[1, 2, 3] < [1, 2, 4]
				600	'ABC' < 'C' < 'Pascal' < 'Python'
				601	(1, 2, 3, 4) < (1, 2, 4)
				602	(1, 2) < (1, 2, -1)
				603	(1, 2, 3) == (1.0, 2.0, 3.0)
				604	(1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4)
				605
				606	Note that comparing objects of different types is legal. The outcome is
				607	deterministic but arbitrary: the types are ordered by their name. Thus, a list
				608	is always smaller than a string, a string is always smaller than a tuple, etc.
				609	[#]_ Mixed numeric types are compared according to their numeric value, so 0
				610	equals 0.0, etc.
				611
				612
				613	.. rubric:: Footnotes
				614
				615	.. [#] The rules for comparing objects of different types should not be relied upon;
				616	they may change in a future version of the language.
				617