blob: a5599566edf0973571a3524e16a43bc2f608b6a3 [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001.. _tut-structures:
2
3***************
4Data Structures
5***************
6
7This chapter describes some things you've learned about already in more detail,
8and adds some new things as well.
9
10
11.. _tut-morelists:
12
13More on Lists
14=============
15
16The list data type has some more methods. Here are all of the methods of list
17objects:
18
19
20.. method:: list.append(x)
21
22 Add an item to the end of the list; equivalent to ``a[len(a):] = [x]``.
23
24
25.. method:: list.extend(L)
26
27 Extend the list by appending all the items in the given list; equivalent to
28 ``a[len(a):] = L``.
29
30
31.. method:: list.insert(i, x)
32
33 Insert an item at a given position. The first argument is the index of the
34 element before which to insert, so ``a.insert(0, x)`` inserts at the front of
35 the list, and ``a.insert(len(a), x)`` is equivalent to ``a.append(x)``.
36
37
38.. method:: list.remove(x)
39
40 Remove the first item from the list whose value is *x*. It is an error if there
41 is no such item.
42
43
44.. method:: list.pop([i])
45
46 Remove the item at the given position in the list, and return it. If no index
47 is specified, ``a.pop()`` removes and returns the last item in the list. (The
48 square brackets around the *i* in the method signature denote that the parameter
49 is optional, not that you should type square brackets at that position. You
50 will see this notation frequently in the Python Library Reference.)
51
52
53.. method:: list.index(x)
54
55 Return the index in the list of the first item whose value is *x*. It is an
56 error if there is no such item.
57
58
59.. method:: list.count(x)
60
61 Return the number of times *x* appears in the list.
62
63
64.. method:: list.sort()
65
66 Sort the items of the list, in place.
67
68
69.. method:: list.reverse()
70
71 Reverse the elements of the list, in place.
72
73An example that uses most of the list methods::
74
75 >>> a = [66.25, 333, 333, 1, 1234.5]
76 >>> print a.count(333), a.count(66.25), a.count('x')
77 2 1 0
78 >>> a.insert(2, -1)
79 >>> a.append(333)
80 >>> a
81 [66.25, 333, -1, 333, 1, 1234.5, 333]
82 >>> a.index(333)
83 1
84 >>> a.remove(333)
85 >>> a
86 [66.25, -1, 333, 1, 1234.5, 333]
87 >>> a.reverse()
88 >>> a
89 [333, 1234.5, 1, 333, -1, 66.25]
90 >>> a.sort()
91 >>> a
92 [-1, 1, 66.25, 333, 333, 1234.5]
93
94
95.. _tut-lists-as-stacks:
96
97Using Lists as Stacks
98---------------------
99
100.. sectionauthor:: Ka-Ping Yee <ping@lfw.org>
101
102
103The list methods make it very easy to use a list as a stack, where the last
104element added is the first element retrieved ("last-in, first-out"). To add an
105item to the top of the stack, use :meth:`append`. To retrieve an item from the
106top of the stack, use :meth:`pop` without an explicit index. For example::
107
108 >>> stack = [3, 4, 5]
109 >>> stack.append(6)
110 >>> stack.append(7)
111 >>> stack
112 [3, 4, 5, 6, 7]
113 >>> stack.pop()
114 7
115 >>> stack
116 [3, 4, 5, 6]
117 >>> stack.pop()
118 6
119 >>> stack.pop()
120 5
121 >>> stack
122 [3, 4]
123
124
125.. _tut-lists-as-queues:
126
127Using Lists as Queues
128---------------------
129
130.. sectionauthor:: Ka-Ping Yee <ping@lfw.org>
131
132
133You can also use a list conveniently as a queue, where the first element added
134is the first element retrieved ("first-in, first-out"). To add an item to the
135back of the queue, use :meth:`append`. To retrieve an item from the front of
136the queue, use :meth:`pop` with ``0`` as the index. For example::
137
138 >>> queue = ["Eric", "John", "Michael"]
139 >>> queue.append("Terry") # Terry arrives
140 >>> queue.append("Graham") # Graham arrives
141 >>> queue.pop(0)
142 'Eric'
143 >>> queue.pop(0)
144 'John'
145 >>> queue
146 ['Michael', 'Terry', 'Graham']
147
148
149.. _tut-functional:
150
151Functional Programming Tools
152----------------------------
153
154There are three built-in functions that are very useful when used with lists:
155:func:`filter`, :func:`map`, and :func:`reduce`.
156
157``filter(function, sequence)`` returns a sequence consisting of those items from
158the sequence for which ``function(item)`` is true. If *sequence* is a
159:class:`string` or :class:`tuple`, the result will be of the same type;
160otherwise, it is always a :class:`list`. For example, to compute some primes::
161
162 >>> def f(x): return x % 2 != 0 and x % 3 != 0
163 ...
164 >>> filter(f, range(2, 25))
165 [5, 7, 11, 13, 17, 19, 23]
166
167``map(function, sequence)`` calls ``function(item)`` for each of the sequence's
168items and returns a list of the return values. For example, to compute some
169cubes::
170
171 >>> def cube(x): return x*x*x
172 ...
173 >>> map(cube, range(1, 11))
174 [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
175
176More than one sequence may be passed; the function must then have as many
177arguments as there are sequences and is called with the corresponding item from
178each sequence (or ``None`` if some sequence is shorter than another). For
179example::
180
181 >>> seq = range(8)
182 >>> def add(x, y): return x+y
183 ...
184 >>> map(add, seq, seq)
185 [0, 2, 4, 6, 8, 10, 12, 14]
186
187``reduce(function, sequence)`` returns a single value constructed by calling the
188binary function *function* on the first two items of the sequence, then on the
189result and the next item, and so on. For example, to compute the sum of the
190numbers 1 through 10::
191
192 >>> def add(x,y): return x+y
193 ...
194 >>> reduce(add, range(1, 11))
195 55
196
197If there's only one item in the sequence, its value is returned; if the sequence
198is empty, an exception is raised.
199
200A third argument can be passed to indicate the starting value. In this case the
201starting value is returned for an empty sequence, and the function is first
202applied to the starting value and the first sequence item, then to the result
203and the next item, and so on. For example, ::
204
205 >>> def sum(seq):
206 ... def add(x,y): return x+y
207 ... return reduce(add, seq, 0)
208 ...
209 >>> sum(range(1, 11))
210 55
211 >>> sum([])
212 0
213
214Don't use this example's definition of :func:`sum`: since summing numbers is
215such a common need, a built-in function ``sum(sequence)`` is already provided,
216and works exactly like this.
217
218.. versionadded:: 2.3
219
220
221List Comprehensions
222-------------------
223
224List comprehensions provide a concise way to create lists without resorting to
225use of :func:`map`, :func:`filter` and/or :keyword:`lambda`. The resulting list
226definition tends often to be clearer than lists built using those constructs.
227Each list comprehension consists of an expression followed by a :keyword:`for`
228clause, then zero or more :keyword:`for` or :keyword:`if` clauses. The result
229will be a list resulting from evaluating the expression in the context of the
230:keyword:`for` and :keyword:`if` clauses which follow it. If the expression
231would evaluate to a tuple, it must be parenthesized. ::
232
233 >>> freshfruit = [' banana', ' loganberry ', 'passion fruit ']
234 >>> [weapon.strip() for weapon in freshfruit]
235 ['banana', 'loganberry', 'passion fruit']
236 >>> vec = [2, 4, 6]
237 >>> [3*x for x in vec]
238 [6, 12, 18]
239 >>> [3*x for x in vec if x > 3]
240 [12, 18]
241 >>> [3*x for x in vec if x < 2]
242 []
243 >>> [[x,x**2] for x in vec]
244 [[2, 4], [4, 16], [6, 36]]
245 >>> [x, x**2 for x in vec] # error - parens required for tuples
246 File "<stdin>", line 1, in ?
247 [x, x**2 for x in vec]
248 ^
249 SyntaxError: invalid syntax
250 >>> [(x, x**2) for x in vec]
251 [(2, 4), (4, 16), (6, 36)]
252 >>> vec1 = [2, 4, 6]
253 >>> vec2 = [4, 3, -9]
254 >>> [x*y for x in vec1 for y in vec2]
255 [8, 6, -18, 16, 12, -36, 24, 18, -54]
256 >>> [x+y for x in vec1 for y in vec2]
257 [6, 5, -7, 8, 7, -5, 10, 9, -3]
258 >>> [vec1[i]*vec2[i] for i in range(len(vec1))]
259 [8, 12, -54]
260
261List comprehensions are much more flexible than :func:`map` and can be applied
262to complex expressions and nested functions::
263
264 >>> [str(round(355/113.0, i)) for i in range(1,6)]
265 ['3.1', '3.14', '3.142', '3.1416', '3.14159']
266
267
268.. _tut-del:
269
270The :keyword:`del` statement
271============================
272
273There is a way to remove an item from a list given its index instead of its
274value: the :keyword:`del` statement. This differs from the :meth:`pop` method
275which returns a value. The :keyword:`del` statement can also be used to remove
276slices from a list or clear the entire list (which we did earlier by assignment
277of an empty list to the slice). For example::
278
279 >>> a = [-1, 1, 66.25, 333, 333, 1234.5]
280 >>> del a[0]
281 >>> a
282 [1, 66.25, 333, 333, 1234.5]
283 >>> del a[2:4]
284 >>> a
285 [1, 66.25, 1234.5]
286 >>> del a[:]
287 >>> a
288 []
289
290:keyword:`del` can also be used to delete entire variables::
291
292 >>> del a
293
294Referencing the name ``a`` hereafter is an error (at least until another value
295is assigned to it). We'll find other uses for :keyword:`del` later.
296
297
298.. _tut-tuples:
299
300Tuples and Sequences
301====================
302
303We saw that lists and strings have many common properties, such as indexing and
304slicing operations. They are two examples of *sequence* data types (see
305:ref:`typesseq`). Since Python is an evolving language, other sequence data
306types may be added. There is also another standard sequence data type: the
307*tuple*.
308
309A tuple consists of a number of values separated by commas, for instance::
310
311 >>> t = 12345, 54321, 'hello!'
312 >>> t[0]
313 12345
314 >>> t
315 (12345, 54321, 'hello!')
316 >>> # Tuples may be nested:
317 ... u = t, (1, 2, 3, 4, 5)
318 >>> u
319 ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
320
321As you see, on output tuples are always enclosed in parentheses, so that nested
322tuples are interpreted correctly; they may be input with or without surrounding
323parentheses, although often parentheses are necessary anyway (if the tuple is
324part of a larger expression).
325
326Tuples have many uses. For example: (x, y) coordinate pairs, employee records
327from a database, etc. Tuples, like strings, are immutable: it is not possible
328to assign to the individual items of a tuple (you can simulate much of the same
329effect with slicing and concatenation, though). It is also possible to create
330tuples which contain mutable objects, such as lists.
331
332A special problem is the construction of tuples containing 0 or 1 items: the
333syntax has some extra quirks to accommodate these. Empty tuples are constructed
334by an empty pair of parentheses; a tuple with one item is constructed by
335following a value with a comma (it is not sufficient to enclose a single value
336in parentheses). Ugly, but effective. For example::
337
338 >>> empty = ()
339 >>> singleton = 'hello', # <-- note trailing comma
340 >>> len(empty)
341 0
342 >>> len(singleton)
343 1
344 >>> singleton
345 ('hello',)
346
347The statement ``t = 12345, 54321, 'hello!'`` is an example of *tuple packing*:
348the values ``12345``, ``54321`` and ``'hello!'`` are packed together in a tuple.
349The reverse operation is also possible::
350
351 >>> x, y, z = t
352
353This is called, appropriately enough, *sequence unpacking*. Sequence unpacking
354requires the list of variables on the left to have the same number of elements
355as the length of the sequence. Note that multiple assignment is really just a
356combination of tuple packing and sequence unpacking!
357
358There is a small bit of asymmetry here: packing multiple values always creates
359a tuple, and unpacking works for any sequence.
360
361.. % XXX Add a bit on the difference between tuples and lists.
362
363
364.. _tut-sets:
365
366Sets
367====
368
369Python also includes a data type for *sets*. A set is an unordered collection
370with no duplicate elements. Basic uses include membership testing and
371eliminating duplicate entries. Set objects also support mathematical operations
372like union, intersection, difference, and symmetric difference.
373
374Here is a brief demonstration::
375
376 >>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
377 >>> fruit = set(basket) # create a set without duplicates
378 >>> fruit
379 set(['orange', 'pear', 'apple', 'banana'])
380 >>> 'orange' in fruit # fast membership testing
381 True
382 >>> 'crabgrass' in fruit
383 False
384
385 >>> # Demonstrate set operations on unique letters from two words
386 ...
387 >>> a = set('abracadabra')
388 >>> b = set('alacazam')
389 >>> a # unique letters in a
390 set(['a', 'r', 'b', 'c', 'd'])
391 >>> a - b # letters in a but not in b
392 set(['r', 'd', 'b'])
393 >>> a | b # letters in either a or b
394 set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
395 >>> a & b # letters in both a and b
396 set(['a', 'c'])
397 >>> a ^ b # letters in a or b but not both
398 set(['r', 'd', 'b', 'm', 'z', 'l'])
399
400
401.. _tut-dictionaries:
402
403Dictionaries
404============
405
406Another useful data type built into Python is the *dictionary* (see
407:ref:`typesmapping`). Dictionaries are sometimes found in other languages as
408"associative memories" or "associative arrays". Unlike sequences, which are
409indexed by a range of numbers, dictionaries are indexed by *keys*, which can be
410any immutable type; strings and numbers can always be keys. Tuples can be used
411as keys if they contain only strings, numbers, or tuples; if a tuple contains
412any mutable object either directly or indirectly, it cannot be used as a key.
413You can't use lists as keys, since lists can be modified in place using index
414assignments, slice assignments, or methods like :meth:`append` and
415:meth:`extend`.
416
417It is best to think of a dictionary as an unordered set of *key: value* pairs,
418with the requirement that the keys are unique (within one dictionary). A pair of
419braces creates an empty dictionary: ``{}``. Placing a comma-separated list of
420key:value pairs within the braces adds initial key:value pairs to the
421dictionary; this is also the way dictionaries are written on output.
422
423The main operations on a dictionary are storing a value with some key and
424extracting the value given the key. It is also possible to delete a key:value
425pair with ``del``. If you store using a key that is already in use, the old
426value associated with that key is forgotten. It is an error to extract a value
427using a non-existent key.
428
429The :meth:`keys` method of a dictionary object returns a list of all the keys
430used in the dictionary, in arbitrary order (if you want it sorted, just apply
431the :meth:`sort` method to the list of keys). To check whether a single key is
432in the dictionary, either use the dictionary's :meth:`has_key` method or the
433:keyword:`in` keyword.
434
435Here is a small example using a dictionary::
436
437 >>> tel = {'jack': 4098, 'sape': 4139}
438 >>> tel['guido'] = 4127
439 >>> tel
440 {'sape': 4139, 'guido': 4127, 'jack': 4098}
441 >>> tel['jack']
442 4098
443 >>> del tel['sape']
444 >>> tel['irv'] = 4127
445 >>> tel
446 {'guido': 4127, 'irv': 4127, 'jack': 4098}
447 >>> tel.keys()
448 ['guido', 'irv', 'jack']
449 >>> tel.has_key('guido')
450 True
451 >>> 'guido' in tel
452 True
453
454The :func:`dict` constructor builds dictionaries directly from lists of
455key-value pairs stored as tuples. When the pairs form a pattern, list
456comprehensions can compactly specify the key-value list. ::
457
458 >>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
459 {'sape': 4139, 'jack': 4098, 'guido': 4127}
460 >>> dict([(x, x**2) for x in (2, 4, 6)]) # use a list comprehension
461 {2: 4, 4: 16, 6: 36}
462
463Later in the tutorial, we will learn about Generator Expressions which are even
464better suited for the task of supplying key-values pairs to the :func:`dict`
465constructor.
466
467When the keys are simple strings, it is sometimes easier to specify pairs using
468keyword arguments::
469
470 >>> dict(sape=4139, guido=4127, jack=4098)
471 {'sape': 4139, 'jack': 4098, 'guido': 4127}
472
473
474.. _tut-loopidioms:
475
476Looping Techniques
477==================
478
479When looping through dictionaries, the key and corresponding value can be
480retrieved at the same time using the :meth:`iteritems` method. ::
481
482 >>> knights = {'gallahad': 'the pure', 'robin': 'the brave'}
483 >>> for k, v in knights.iteritems():
484 ... print k, v
485 ...
486 gallahad the pure
487 robin the brave
488
489When looping through a sequence, the position index and corresponding value can
490be retrieved at the same time using the :func:`enumerate` function. ::
491
492 >>> for i, v in enumerate(['tic', 'tac', 'toe']):
493 ... print i, v
494 ...
495 0 tic
496 1 tac
497 2 toe
498
499To loop over two or more sequences at the same time, the entries can be paired
500with the :func:`zip` function. ::
501
502 >>> questions = ['name', 'quest', 'favorite color']
503 >>> answers = ['lancelot', 'the holy grail', 'blue']
504 >>> for q, a in zip(questions, answers):
505 ... print 'What is your %s? It is %s.' % (q, a)
506 ...
507 What is your name? It is lancelot.
508 What is your quest? It is the holy grail.
509 What is your favorite color? It is blue.
510
511To loop over a sequence in reverse, first specify the sequence in a forward
512direction and then call the :func:`reversed` function. ::
513
514 >>> for i in reversed(xrange(1,10,2)):
515 ... print i
516 ...
517 9
518 7
519 5
520 3
521 1
522
523To loop over a sequence in sorted order, use the :func:`sorted` function which
524returns a new sorted list while leaving the source unaltered. ::
525
526 >>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
527 >>> for f in sorted(set(basket)):
528 ... print f
529 ...
530 apple
531 banana
532 orange
533 pear
534
535
536.. _tut-conditions:
537
538More on Conditions
539==================
540
541The conditions used in ``while`` and ``if`` statements can contain any
542operators, not just comparisons.
543
544The comparison operators ``in`` and ``not in`` check whether a value occurs
545(does not occur) in a sequence. The operators ``is`` and ``is not`` compare
546whether two objects are really the same object; this only matters for mutable
547objects like lists. All comparison operators have the same priority, which is
548lower than that of all numerical operators.
549
550Comparisons can be chained. For example, ``a < b == c`` tests whether ``a`` is
551less than ``b`` and moreover ``b`` equals ``c``.
552
553Comparisons may be combined using the Boolean operators ``and`` and ``or``, and
554the outcome of a comparison (or of any other Boolean expression) may be negated
555with ``not``. These have lower priorities than comparison operators; between
556them, ``not`` has the highest priority and ``or`` the lowest, so that ``A and
557not B or C`` is equivalent to ``(A and (not B)) or C``. As always, parentheses
558can be used to express the desired composition.
559
560The Boolean operators ``and`` and ``or`` are so-called *short-circuit*
561operators: their arguments are evaluated from left to right, and evaluation
562stops as soon as the outcome is determined. For example, if ``A`` and ``C`` are
563true but ``B`` is false, ``A and B and C`` does not evaluate the expression
564``C``. When used as a general value and not as a Boolean, the return value of a
565short-circuit operator is the last evaluated argument.
566
567It is possible to assign the result of a comparison or other Boolean expression
568to a variable. For example, ::
569
570 >>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
571 >>> non_null = string1 or string2 or string3
572 >>> non_null
573 'Trondheim'
574
575Note that in Python, unlike C, assignment cannot occur inside expressions. C
576programmers may grumble about this, but it avoids a common class of problems
577encountered in C programs: typing ``=`` in an expression when ``==`` was
578intended.
579
580
581.. _tut-comparing:
582
583Comparing Sequences and Other Types
584===================================
585
586Sequence objects may be compared to other objects with the same sequence type.
587The comparison uses *lexicographical* ordering: first the first two items are
588compared, and if they differ this determines the outcome of the comparison; if
589they are equal, the next two items are compared, and so on, until either
590sequence is exhausted. If two items to be compared are themselves sequences of
591the same type, the lexicographical comparison is carried out recursively. If
592all items of two sequences compare equal, the sequences are considered equal.
593If one sequence is an initial sub-sequence of the other, the shorter sequence is
594the smaller (lesser) one. Lexicographical ordering for strings uses the ASCII
595ordering for individual characters. Some examples of comparisons between
596sequences of the same type::
597
598 (1, 2, 3) < (1, 2, 4)
599 [1, 2, 3] < [1, 2, 4]
600 'ABC' < 'C' < 'Pascal' < 'Python'
601 (1, 2, 3, 4) < (1, 2, 4)
602 (1, 2) < (1, 2, -1)
603 (1, 2, 3) == (1.0, 2.0, 3.0)
604 (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4)
605
606Note that comparing objects of different types is legal. The outcome is
607deterministic but arbitrary: the types are ordered by their name. Thus, a list
608is always smaller than a string, a string is always smaller than a tuple, etc.
609[#]_ Mixed numeric types are compared according to their numeric value, so 0
610equals 0.0, etc.
611
612
613.. rubric:: Footnotes
614
615.. [#] The rules for comparing objects of different types should not be relied upon;
616 they may change in a future version of the language.
617