blob: 9f3320f4c3acc964f8da79d39e85f5eb478fe124 [file] [log] [blame]
Georg Brandl8ec7f652007-08-15 14:28:01 +00001.. _tut-structures:
2
3***************
4Data Structures
5***************
6
7This chapter describes some things you've learned about already in more detail,
8and adds some new things as well.
9
10
11.. _tut-morelists:
12
13More on Lists
14=============
15
16The list data type has some more methods. Here are all of the methods of list
17objects:
18
19
20.. method:: list.append(x)
21
22 Add an item to the end of the list; equivalent to ``a[len(a):] = [x]``.
23
24
25.. method:: list.extend(L)
26
27 Extend the list by appending all the items in the given list; equivalent to
28 ``a[len(a):] = L``.
29
30
31.. method:: list.insert(i, x)
32
33 Insert an item at a given position. The first argument is the index of the
34 element before which to insert, so ``a.insert(0, x)`` inserts at the front of
35 the list, and ``a.insert(len(a), x)`` is equivalent to ``a.append(x)``.
36
37
38.. method:: list.remove(x)
39
40 Remove the first item from the list whose value is *x*. It is an error if there
41 is no such item.
42
43
44.. method:: list.pop([i])
45
46 Remove the item at the given position in the list, and return it. If no index
47 is specified, ``a.pop()`` removes and returns the last item in the list. (The
48 square brackets around the *i* in the method signature denote that the parameter
49 is optional, not that you should type square brackets at that position. You
50 will see this notation frequently in the Python Library Reference.)
51
52
53.. method:: list.index(x)
54
55 Return the index in the list of the first item whose value is *x*. It is an
56 error if there is no such item.
57
58
59.. method:: list.count(x)
60
61 Return the number of times *x* appears in the list.
62
63
64.. method:: list.sort()
65
66 Sort the items of the list, in place.
67
68
69.. method:: list.reverse()
70
71 Reverse the elements of the list, in place.
72
73An example that uses most of the list methods::
74
75 >>> a = [66.25, 333, 333, 1, 1234.5]
76 >>> print a.count(333), a.count(66.25), a.count('x')
77 2 1 0
78 >>> a.insert(2, -1)
79 >>> a.append(333)
80 >>> a
81 [66.25, 333, -1, 333, 1, 1234.5, 333]
82 >>> a.index(333)
83 1
84 >>> a.remove(333)
85 >>> a
86 [66.25, -1, 333, 1, 1234.5, 333]
87 >>> a.reverse()
88 >>> a
89 [333, 1234.5, 1, 333, -1, 66.25]
90 >>> a.sort()
91 >>> a
92 [-1, 1, 66.25, 333, 333, 1234.5]
93
94
95.. _tut-lists-as-stacks:
96
97Using Lists as Stacks
98---------------------
99
100.. sectionauthor:: Ka-Ping Yee <ping@lfw.org>
101
102
103The list methods make it very easy to use a list as a stack, where the last
104element added is the first element retrieved ("last-in, first-out"). To add an
105item to the top of the stack, use :meth:`append`. To retrieve an item from the
106top of the stack, use :meth:`pop` without an explicit index. For example::
107
108 >>> stack = [3, 4, 5]
109 >>> stack.append(6)
110 >>> stack.append(7)
111 >>> stack
112 [3, 4, 5, 6, 7]
113 >>> stack.pop()
114 7
115 >>> stack
116 [3, 4, 5, 6]
117 >>> stack.pop()
118 6
119 >>> stack.pop()
120 5
121 >>> stack
122 [3, 4]
123
124
125.. _tut-lists-as-queues:
126
127Using Lists as Queues
128---------------------
129
130.. sectionauthor:: Ka-Ping Yee <ping@lfw.org>
131
132
133You can also use a list conveniently as a queue, where the first element added
134is the first element retrieved ("first-in, first-out"). To add an item to the
135back of the queue, use :meth:`append`. To retrieve an item from the front of
136the queue, use :meth:`pop` with ``0`` as the index. For example::
137
138 >>> queue = ["Eric", "John", "Michael"]
139 >>> queue.append("Terry") # Terry arrives
140 >>> queue.append("Graham") # Graham arrives
141 >>> queue.pop(0)
142 'Eric'
143 >>> queue.pop(0)
144 'John'
145 >>> queue
146 ['Michael', 'Terry', 'Graham']
147
148
149.. _tut-functional:
150
151Functional Programming Tools
152----------------------------
153
154There are three built-in functions that are very useful when used with lists:
155:func:`filter`, :func:`map`, and :func:`reduce`.
156
157``filter(function, sequence)`` returns a sequence consisting of those items from
158the sequence for which ``function(item)`` is true. If *sequence* is a
159:class:`string` or :class:`tuple`, the result will be of the same type;
160otherwise, it is always a :class:`list`. For example, to compute some primes::
161
162 >>> def f(x): return x % 2 != 0 and x % 3 != 0
163 ...
164 >>> filter(f, range(2, 25))
165 [5, 7, 11, 13, 17, 19, 23]
166
167``map(function, sequence)`` calls ``function(item)`` for each of the sequence's
168items and returns a list of the return values. For example, to compute some
169cubes::
170
171 >>> def cube(x): return x*x*x
172 ...
173 >>> map(cube, range(1, 11))
174 [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
175
176More than one sequence may be passed; the function must then have as many
177arguments as there are sequences and is called with the corresponding item from
178each sequence (or ``None`` if some sequence is shorter than another). For
179example::
180
181 >>> seq = range(8)
182 >>> def add(x, y): return x+y
183 ...
184 >>> map(add, seq, seq)
185 [0, 2, 4, 6, 8, 10, 12, 14]
186
187``reduce(function, sequence)`` returns a single value constructed by calling the
188binary function *function* on the first two items of the sequence, then on the
189result and the next item, and so on. For example, to compute the sum of the
190numbers 1 through 10::
191
192 >>> def add(x,y): return x+y
193 ...
194 >>> reduce(add, range(1, 11))
195 55
196
197If there's only one item in the sequence, its value is returned; if the sequence
198is empty, an exception is raised.
199
200A third argument can be passed to indicate the starting value. In this case the
201starting value is returned for an empty sequence, and the function is first
202applied to the starting value and the first sequence item, then to the result
203and the next item, and so on. For example, ::
204
205 >>> def sum(seq):
206 ... def add(x,y): return x+y
207 ... return reduce(add, seq, 0)
208 ...
209 >>> sum(range(1, 11))
210 55
211 >>> sum([])
212 0
213
214Don't use this example's definition of :func:`sum`: since summing numbers is
215such a common need, a built-in function ``sum(sequence)`` is already provided,
216and works exactly like this.
217
218.. versionadded:: 2.3
219
220
221List Comprehensions
222-------------------
223
224List comprehensions provide a concise way to create lists without resorting to
225use of :func:`map`, :func:`filter` and/or :keyword:`lambda`. The resulting list
226definition tends often to be clearer than lists built using those constructs.
227Each list comprehension consists of an expression followed by a :keyword:`for`
228clause, then zero or more :keyword:`for` or :keyword:`if` clauses. The result
229will be a list resulting from evaluating the expression in the context of the
230:keyword:`for` and :keyword:`if` clauses which follow it. If the expression
231would evaluate to a tuple, it must be parenthesized. ::
232
233 >>> freshfruit = [' banana', ' loganberry ', 'passion fruit ']
234 >>> [weapon.strip() for weapon in freshfruit]
235 ['banana', 'loganberry', 'passion fruit']
236 >>> vec = [2, 4, 6]
237 >>> [3*x for x in vec]
238 [6, 12, 18]
239 >>> [3*x for x in vec if x > 3]
240 [12, 18]
241 >>> [3*x for x in vec if x < 2]
242 []
243 >>> [[x,x**2] for x in vec]
244 [[2, 4], [4, 16], [6, 36]]
245 >>> [x, x**2 for x in vec] # error - parens required for tuples
246 File "<stdin>", line 1, in ?
247 [x, x**2 for x in vec]
248 ^
249 SyntaxError: invalid syntax
250 >>> [(x, x**2) for x in vec]
251 [(2, 4), (4, 16), (6, 36)]
252 >>> vec1 = [2, 4, 6]
253 >>> vec2 = [4, 3, -9]
254 >>> [x*y for x in vec1 for y in vec2]
255 [8, 6, -18, 16, 12, -36, 24, 18, -54]
256 >>> [x+y for x in vec1 for y in vec2]
257 [6, 5, -7, 8, 7, -5, 10, 9, -3]
258 >>> [vec1[i]*vec2[i] for i in range(len(vec1))]
259 [8, 12, -54]
260
261List comprehensions are much more flexible than :func:`map` and can be applied
262to complex expressions and nested functions::
263
264 >>> [str(round(355/113.0, i)) for i in range(1,6)]
265 ['3.1', '3.14', '3.142', '3.1416', '3.14159']
266
267
Georg Brandladbda842007-12-14 19:03:36 +0000268Nested List Comprehensions
269--------------------------
270
271If you've got the stomach for it, list comprehensions can be nested. They are a
272powerful tool but -- like all powerful tools -- they need to be used carefully,
273if at all.
274
275Consider the following example of a 3x3 matrix held as a list containing three
276lists, one list per row::
277
278 >>> mat = [
279 ... [1, 2, 3],
280 ... [4, 5, 6],
281 ... [7, 8, 9],
282 ... ]
283
284Now, if you wanted to swap rows and columns, you could use a list
285comprehension::
286
287 >>> print [[row[i] for row in mat] for i in [0, 1, 2]]
288 [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
289
290Special care has to be taken for the *nested* list comprehension:
291
292 To avoid apprehension when nesting list comprehensions, read from right to
293 left.
294
295A more verbose version of this snippet shows the flow explicitly::
296
297 for i in [0, 1, 2]:
298 for row in mat:
299 print row[i],
300 print
301
302In real world, you should prefer builtin functions to complex flow statements.
303The :func:`zip` function would do a great job for this use case::
304
305 >>> zip(*mat)
306 [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
307
308See :ref:`tut-unpacking-arguments` for details on the asterisk in this line.
309
Georg Brandl8ec7f652007-08-15 14:28:01 +0000310.. _tut-del:
311
312The :keyword:`del` statement
313============================
314
315There is a way to remove an item from a list given its index instead of its
316value: the :keyword:`del` statement. This differs from the :meth:`pop` method
317which returns a value. The :keyword:`del` statement can also be used to remove
318slices from a list or clear the entire list (which we did earlier by assignment
319of an empty list to the slice). For example::
320
321 >>> a = [-1, 1, 66.25, 333, 333, 1234.5]
322 >>> del a[0]
323 >>> a
324 [1, 66.25, 333, 333, 1234.5]
325 >>> del a[2:4]
326 >>> a
327 [1, 66.25, 1234.5]
328 >>> del a[:]
329 >>> a
330 []
331
332:keyword:`del` can also be used to delete entire variables::
333
334 >>> del a
335
336Referencing the name ``a`` hereafter is an error (at least until another value
337is assigned to it). We'll find other uses for :keyword:`del` later.
338
339
340.. _tut-tuples:
341
342Tuples and Sequences
343====================
344
345We saw that lists and strings have many common properties, such as indexing and
346slicing operations. They are two examples of *sequence* data types (see
347:ref:`typesseq`). Since Python is an evolving language, other sequence data
348types may be added. There is also another standard sequence data type: the
349*tuple*.
350
351A tuple consists of a number of values separated by commas, for instance::
352
353 >>> t = 12345, 54321, 'hello!'
354 >>> t[0]
355 12345
356 >>> t
357 (12345, 54321, 'hello!')
358 >>> # Tuples may be nested:
359 ... u = t, (1, 2, 3, 4, 5)
360 >>> u
361 ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
362
363As you see, on output tuples are always enclosed in parentheses, so that nested
364tuples are interpreted correctly; they may be input with or without surrounding
365parentheses, although often parentheses are necessary anyway (if the tuple is
366part of a larger expression).
367
368Tuples have many uses. For example: (x, y) coordinate pairs, employee records
369from a database, etc. Tuples, like strings, are immutable: it is not possible
370to assign to the individual items of a tuple (you can simulate much of the same
371effect with slicing and concatenation, though). It is also possible to create
372tuples which contain mutable objects, such as lists.
373
374A special problem is the construction of tuples containing 0 or 1 items: the
375syntax has some extra quirks to accommodate these. Empty tuples are constructed
376by an empty pair of parentheses; a tuple with one item is constructed by
377following a value with a comma (it is not sufficient to enclose a single value
378in parentheses). Ugly, but effective. For example::
379
380 >>> empty = ()
381 >>> singleton = 'hello', # <-- note trailing comma
382 >>> len(empty)
383 0
384 >>> len(singleton)
385 1
386 >>> singleton
387 ('hello',)
388
389The statement ``t = 12345, 54321, 'hello!'`` is an example of *tuple packing*:
390the values ``12345``, ``54321`` and ``'hello!'`` are packed together in a tuple.
391The reverse operation is also possible::
392
393 >>> x, y, z = t
394
395This is called, appropriately enough, *sequence unpacking*. Sequence unpacking
396requires the list of variables on the left to have the same number of elements
397as the length of the sequence. Note that multiple assignment is really just a
398combination of tuple packing and sequence unpacking!
399
400There is a small bit of asymmetry here: packing multiple values always creates
401a tuple, and unpacking works for any sequence.
402
Georg Brandlb19be572007-12-29 10:57:00 +0000403.. XXX Add a bit on the difference between tuples and lists.
Georg Brandl8ec7f652007-08-15 14:28:01 +0000404
405
406.. _tut-sets:
407
408Sets
409====
410
411Python also includes a data type for *sets*. A set is an unordered collection
412with no duplicate elements. Basic uses include membership testing and
413eliminating duplicate entries. Set objects also support mathematical operations
414like union, intersection, difference, and symmetric difference.
415
416Here is a brief demonstration::
417
418 >>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
419 >>> fruit = set(basket) # create a set without duplicates
420 >>> fruit
421 set(['orange', 'pear', 'apple', 'banana'])
422 >>> 'orange' in fruit # fast membership testing
423 True
424 >>> 'crabgrass' in fruit
425 False
426
427 >>> # Demonstrate set operations on unique letters from two words
428 ...
429 >>> a = set('abracadabra')
430 >>> b = set('alacazam')
431 >>> a # unique letters in a
432 set(['a', 'r', 'b', 'c', 'd'])
433 >>> a - b # letters in a but not in b
434 set(['r', 'd', 'b'])
435 >>> a | b # letters in either a or b
436 set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
437 >>> a & b # letters in both a and b
438 set(['a', 'c'])
439 >>> a ^ b # letters in a or b but not both
440 set(['r', 'd', 'b', 'm', 'z', 'l'])
441
442
443.. _tut-dictionaries:
444
445Dictionaries
446============
447
448Another useful data type built into Python is the *dictionary* (see
449:ref:`typesmapping`). Dictionaries are sometimes found in other languages as
450"associative memories" or "associative arrays". Unlike sequences, which are
451indexed by a range of numbers, dictionaries are indexed by *keys*, which can be
452any immutable type; strings and numbers can always be keys. Tuples can be used
453as keys if they contain only strings, numbers, or tuples; if a tuple contains
454any mutable object either directly or indirectly, it cannot be used as a key.
455You can't use lists as keys, since lists can be modified in place using index
456assignments, slice assignments, or methods like :meth:`append` and
457:meth:`extend`.
458
459It is best to think of a dictionary as an unordered set of *key: value* pairs,
460with the requirement that the keys are unique (within one dictionary). A pair of
461braces creates an empty dictionary: ``{}``. Placing a comma-separated list of
462key:value pairs within the braces adds initial key:value pairs to the
463dictionary; this is also the way dictionaries are written on output.
464
465The main operations on a dictionary are storing a value with some key and
466extracting the value given the key. It is also possible to delete a key:value
467pair with ``del``. If you store using a key that is already in use, the old
468value associated with that key is forgotten. It is an error to extract a value
469using a non-existent key.
470
471The :meth:`keys` method of a dictionary object returns a list of all the keys
472used in the dictionary, in arbitrary order (if you want it sorted, just apply
473the :meth:`sort` method to the list of keys). To check whether a single key is
474in the dictionary, either use the dictionary's :meth:`has_key` method or the
475:keyword:`in` keyword.
476
477Here is a small example using a dictionary::
478
479 >>> tel = {'jack': 4098, 'sape': 4139}
480 >>> tel['guido'] = 4127
481 >>> tel
482 {'sape': 4139, 'guido': 4127, 'jack': 4098}
483 >>> tel['jack']
484 4098
485 >>> del tel['sape']
486 >>> tel['irv'] = 4127
487 >>> tel
488 {'guido': 4127, 'irv': 4127, 'jack': 4098}
489 >>> tel.keys()
490 ['guido', 'irv', 'jack']
491 >>> tel.has_key('guido')
492 True
493 >>> 'guido' in tel
494 True
495
496The :func:`dict` constructor builds dictionaries directly from lists of
497key-value pairs stored as tuples. When the pairs form a pattern, list
498comprehensions can compactly specify the key-value list. ::
499
500 >>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
501 {'sape': 4139, 'jack': 4098, 'guido': 4127}
502 >>> dict([(x, x**2) for x in (2, 4, 6)]) # use a list comprehension
503 {2: 4, 4: 16, 6: 36}
504
505Later in the tutorial, we will learn about Generator Expressions which are even
506better suited for the task of supplying key-values pairs to the :func:`dict`
507constructor.
508
509When the keys are simple strings, it is sometimes easier to specify pairs using
510keyword arguments::
511
512 >>> dict(sape=4139, guido=4127, jack=4098)
513 {'sape': 4139, 'jack': 4098, 'guido': 4127}
514
515
516.. _tut-loopidioms:
517
518Looping Techniques
519==================
520
521When looping through dictionaries, the key and corresponding value can be
522retrieved at the same time using the :meth:`iteritems` method. ::
523
524 >>> knights = {'gallahad': 'the pure', 'robin': 'the brave'}
525 >>> for k, v in knights.iteritems():
526 ... print k, v
527 ...
528 gallahad the pure
529 robin the brave
530
531When looping through a sequence, the position index and corresponding value can
532be retrieved at the same time using the :func:`enumerate` function. ::
533
534 >>> for i, v in enumerate(['tic', 'tac', 'toe']):
535 ... print i, v
536 ...
537 0 tic
538 1 tac
539 2 toe
540
541To loop over two or more sequences at the same time, the entries can be paired
542with the :func:`zip` function. ::
543
544 >>> questions = ['name', 'quest', 'favorite color']
545 >>> answers = ['lancelot', 'the holy grail', 'blue']
546 >>> for q, a in zip(questions, answers):
547 ... print 'What is your %s? It is %s.' % (q, a)
548 ...
549 What is your name? It is lancelot.
550 What is your quest? It is the holy grail.
551 What is your favorite color? It is blue.
552
553To loop over a sequence in reverse, first specify the sequence in a forward
554direction and then call the :func:`reversed` function. ::
555
556 >>> for i in reversed(xrange(1,10,2)):
557 ... print i
558 ...
559 9
560 7
561 5
562 3
563 1
564
565To loop over a sequence in sorted order, use the :func:`sorted` function which
566returns a new sorted list while leaving the source unaltered. ::
567
568 >>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
569 >>> for f in sorted(set(basket)):
570 ... print f
571 ...
572 apple
573 banana
574 orange
575 pear
576
577
578.. _tut-conditions:
579
580More on Conditions
581==================
582
583The conditions used in ``while`` and ``if`` statements can contain any
584operators, not just comparisons.
585
586The comparison operators ``in`` and ``not in`` check whether a value occurs
587(does not occur) in a sequence. The operators ``is`` and ``is not`` compare
588whether two objects are really the same object; this only matters for mutable
589objects like lists. All comparison operators have the same priority, which is
590lower than that of all numerical operators.
591
592Comparisons can be chained. For example, ``a < b == c`` tests whether ``a`` is
593less than ``b`` and moreover ``b`` equals ``c``.
594
595Comparisons may be combined using the Boolean operators ``and`` and ``or``, and
596the outcome of a comparison (or of any other Boolean expression) may be negated
597with ``not``. These have lower priorities than comparison operators; between
598them, ``not`` has the highest priority and ``or`` the lowest, so that ``A and
599not B or C`` is equivalent to ``(A and (not B)) or C``. As always, parentheses
600can be used to express the desired composition.
601
602The Boolean operators ``and`` and ``or`` are so-called *short-circuit*
603operators: their arguments are evaluated from left to right, and evaluation
604stops as soon as the outcome is determined. For example, if ``A`` and ``C`` are
605true but ``B`` is false, ``A and B and C`` does not evaluate the expression
606``C``. When used as a general value and not as a Boolean, the return value of a
607short-circuit operator is the last evaluated argument.
608
609It is possible to assign the result of a comparison or other Boolean expression
610to a variable. For example, ::
611
612 >>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
613 >>> non_null = string1 or string2 or string3
614 >>> non_null
615 'Trondheim'
616
617Note that in Python, unlike C, assignment cannot occur inside expressions. C
618programmers may grumble about this, but it avoids a common class of problems
619encountered in C programs: typing ``=`` in an expression when ``==`` was
620intended.
621
622
623.. _tut-comparing:
624
625Comparing Sequences and Other Types
626===================================
627
628Sequence objects may be compared to other objects with the same sequence type.
629The comparison uses *lexicographical* ordering: first the first two items are
630compared, and if they differ this determines the outcome of the comparison; if
631they are equal, the next two items are compared, and so on, until either
632sequence is exhausted. If two items to be compared are themselves sequences of
633the same type, the lexicographical comparison is carried out recursively. If
634all items of two sequences compare equal, the sequences are considered equal.
635If one sequence is an initial sub-sequence of the other, the shorter sequence is
636the smaller (lesser) one. Lexicographical ordering for strings uses the ASCII
637ordering for individual characters. Some examples of comparisons between
638sequences of the same type::
639
640 (1, 2, 3) < (1, 2, 4)
641 [1, 2, 3] < [1, 2, 4]
642 'ABC' < 'C' < 'Pascal' < 'Python'
643 (1, 2, 3, 4) < (1, 2, 4)
644 (1, 2) < (1, 2, -1)
645 (1, 2, 3) == (1.0, 2.0, 3.0)
646 (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4)
647
648Note that comparing objects of different types is legal. The outcome is
649deterministic but arbitrary: the types are ordered by their name. Thus, a list
650is always smaller than a string, a string is always smaller than a tuple, etc.
651[#]_ Mixed numeric types are compared according to their numeric value, so 0
652equals 0.0, etc.
653
654
655.. rubric:: Footnotes
656
657.. [#] The rules for comparing objects of different types should not be relied upon;
658 they may change in a future version of the language.
659