.. _tut-informal:

**********************************
An Informal Introduction to Python
**********************************

In the following examples, input and output are distinguished by the presence or
absence of prompts (``>>>`` and ``...``): to repeat the example, you must type
everything after the prompt, when the prompt appears; lines that do not begin
with a prompt are output from the interpreter. Note that a secondary prompt on a
line by itself in an example means you must type a blank line; this is used to
end a multi-line command.

Many of the examples in this manual, even those entered at the interactive
prompt, include comments. Comments in Python start with the hash character,
``#``, and extend to the end of the physical line. A comment may appear at the
start of a line or following whitespace or code, but not within a string
literal. A hash character within a string literal is just a hash character.
Since comments exist to clarify code and are not interpreted by Python, they may
be omitted when typing in examples.

Some examples::

   # this is the first comment
   SPAM = 1                 # and this is the second comment
                            # ... and now a third!
   STRING = "# This is not a comment."


.. _tut-calculator:

Using Python as a Calculator
============================

Let's try some simple Python commands. Start the interpreter and wait for the
primary prompt, ``>>>``. (It shouldn't take long.)


.. _tut-numbers:

Numbers
-------

The interpreter acts as a simple calculator: you can type an expression at it
and it will write the value. Expression syntax is straightforward: the
operators ``+``, ``-``, ``*`` and ``/`` work just like in most other languages
(for example, Pascal or C); parentheses can be used for grouping. For example::

   >>> 2+2
   4
   >>> # This is a comment
   ... 2+2
   4
   >>> 2+2  # and a comment on the same line as code
   4
   >>> (50-5*6)/4
   5
   >>> # Integer division returns the floor:
   ... 7/3
   2
   >>> 7/-3
   -3

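
The flooring shown above can also be requested explicitly. The following is
only a sketch, written as a script rather than an interactive session; it
assumes Python 2.2 or later, where the ``//`` operator is available, and the
variable names are made up for illustration::

   # Floor division rounds toward negative infinity, not toward zero.
   positive_quotient = 7 // 3       # 2
   negative_quotient = 7 // -3      # -3, not -2

   # divmod() returns the floored quotient together with the remainder;
   # quotient * divisor + remainder always gives back the dividend.
   quotient, remainder = divmod(7, -3)
   reconstructed = quotient * -3 + remainder

   # Making one operand a float requests ordinary, unfloored division.
   true_quotient = 7.0 / 2          # 3.5
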
The equal sign (``'='``) is used to assign a value to a variable. Afterwards, no
result is displayed before the next interactive prompt::

   >>> width = 20
   >>> height = 5*9
   >>> width * height
   900

A value can be assigned to several variables simultaneously::

   >>> x = y = z = 0  # Zero x, y and z
   >>> x
   0
   >>> y
   0
   >>> z
   0

Variables must be "defined" (assigned a value) before they can be used, or an
error will occur::

   >>> # try to access an undefined variable
   ... n
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   NameError: name 'n' is not defined

There is full support for floating point; operators with mixed type operands
convert the integer operand to floating point::

   >>> 3 * 3.75 / 1.5
   7.5
   >>> 7.0 / 2
   3.5

Complex numbers are also supported; imaginary numbers are written with a suffix
of ``j`` or ``J``. Complex numbers with a nonzero real component are written as
``(real+imagj)``, or can be created with the ``complex(real, imag)`` function.
::

   >>> 1j * 1J
   (-1+0j)
   >>> 1j * complex(0,1)
   (-1+0j)
   >>> 3+1j*3
   (3+3j)
   >>> (3+1j)*3
   (9+3j)
   >>> (1+2j)/(1+1j)
   (1.5+0.5j)

Complex numbers are always represented as two floating point numbers, the real
and imaginary part. To extract these parts from a complex number *z*, use
``z.real`` and ``z.imag``. ::

   >>> a=1.5+0.5j
   >>> a.real
   1.5
   >>> a.imag
   0.5

The conversion functions to floating point and integer (:func:`float`,
:func:`int` and :func:`long`) don't work for complex numbers, because there is
no one correct way to convert a complex number to a real number. Use ``abs(z)``
to get its magnitude (as a float) or ``z.real`` to get its real part. ::

   >>> a=3.0+4.0j
   >>> float(a)
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: can't convert complex to float; use abs(z)
   >>> a.real
   3.0
   >>> a.imag
   4.0
   >>> abs(a)  # sqrt(a.real**2 + a.imag**2)
   5.0

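
The relationship between ``abs(z)`` and the two parts can be checked in a short
script. This is a sketch only: it uses :func:`math.sqrt` from the standard
library, which this chapter has not introduced, and the variable names are
illustrative::

   import math

   z = complex(3.0, 4.0)            # the same value as the literal 3.0+4.0j

   # abs(z) is the magnitude, sqrt(z.real**2 + z.imag**2).
   magnitude = abs(z)
   by_hand = math.sqrt(z.real ** 2 + z.imag ** 2)

   # float(z) raises TypeError, so pick the part you want explicitly.
   try:
       float(z)
       conversion_failed = False
   except TypeError:
       conversion_failed = True

   real_part = z.real
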
In interactive mode, the last printed expression is assigned to the variable
``_``. This means that when you are using Python as a desk calculator, it is
somewhat easier to continue calculations, for example::

   >>> tax = 12.5 / 100
   >>> price = 100.50
   >>> price * tax
   12.5625
   >>> price + _
   113.0625
   >>> round(_, 2)
   113.06

This variable should be treated as read-only by the user. Don't explicitly
assign a value to it: doing so creates an independent local variable with the
same name, which masks the built-in variable and its magic behavior.


.. _tut-strings:

Strings
-------

Besides numbers, Python can also manipulate strings, which can be expressed in
several ways. They can be enclosed in single quotes or double quotes::

   >>> 'spam eggs'
   'spam eggs'
   >>> 'doesn\'t'
   "doesn't"
   >>> "doesn't"
   "doesn't"
   >>> '"Yes," he said.'
   '"Yes," he said.'
   >>> "\"Yes,\" he said."
   '"Yes," he said.'
   >>> '"Isn\'t," she said.'
   '"Isn\'t," she said.'

String literals can span multiple lines in several ways. Continuation lines can
be used, with a backslash as the last character on the line indicating that the
next line is a logical continuation of the line::

   hello = "This is a rather long string containing\n\
   several lines of text just as you would do in C.\n\
       Note that whitespace at the beginning of the line is\
    significant."

   print hello

Note that newlines still need to be embedded in the string using ``\n``; the
newline following the trailing backslash is discarded. This example would print
the following::

   This is a rather long string containing
   several lines of text just as you would do in C.
       Note that whitespace at the beginning of the line is significant.

Or, strings can be surrounded by a pair of matching triple-quotes: ``"""`` or
``'''``. Line endings do not need to be escaped when using triple-quotes, but
they will be included in the string. ::

   print """
   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to
   """

produces the following output::

   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to

If we make the string literal a "raw" string, ``\n`` sequences are not converted
to newlines, but the backslash at the end of the line, and the newline character
in the source, are both included in the string as data. Thus, the example::

   hello = r"This is a rather long string containing\n\
   several lines of text much as you would do in C."

   print hello

would print::

   This is a rather long string containing\n\
   several lines of text much as you would do in C.

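
The three spellings can be compared side by side. This is an illustrative
sketch written as a script; the variable names and the sample text are made up
for the example::

   # A backslash-newline inside an ordinary literal is dropped, so the
   # line break itself has to be written as \n.
   continued = "first line\n\
   second line"

   # A triple-quoted literal keeps the line break typed in the source.
   triple = """first line
   second line"""

   # A raw literal keeps \n as two characters, and also keeps the
   # trailing backslash and the source newline as data.
   raw = r"first line\n\
   second line"

   same_text = (continued == triple)
   raw_matches = (raw == "first line\\n\\\nsecond line")
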
The interpreter prints the result of string operations in the same way as they
are typed for input: inside quotes, and with quotes and other special characters
escaped by backslashes, to show the precise value. The string is enclosed in
double quotes if the string contains a single quote and no double quotes, else
it's enclosed in single quotes. (The :keyword:`print` statement, described
later, can be used to write strings without quotes or escapes.)

Strings can be concatenated (glued together) with the ``+`` operator, and
repeated with ``*``::

   >>> word = 'Help' + 'A'
   >>> word
   'HelpA'
   >>> '<' + word*5 + '>'
   '<HelpAHelpAHelpAHelpAHelpA>'

Two string literals next to each other are automatically concatenated; the first
line above could also have been written ``word = 'Help' 'A'``. This only works
with two literals, not with arbitrary string expressions::

   >>> 'str' 'ing'             #  <-  This is ok
   'string'
   >>> 'str'.strip() + 'ing'   #  <-  This is ok
   'string'
   >>> 'str'.strip() 'ing'     #  <-  This is invalid
     File "<stdin>", line 1
       'str'.strip() 'ing'
                         ^
   SyntaxError: invalid syntax

Strings can be subscripted (indexed); as in C, the first character of a string
has subscript (index) 0. There is no separate character type; a character is
simply a string of size one. As in Icon, substrings can be specified with the
*slice notation*: two indices separated by a colon. ::

   >>> word[4]
   'A'
   >>> word[0:2]
   'He'
   >>> word[2:4]
   'lp'

Slice indices have useful defaults; an omitted first index defaults to zero, an
omitted second index defaults to the size of the string being sliced. ::

   >>> word[:2]    # The first two characters
   'He'
   >>> word[2:]    # Everything except the first two characters
   'lpA'

Unlike C strings, Python strings cannot be changed. Assigning to an indexed
position in the string results in an error::

   >>> word[0] = 'x'
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: object does not support item assignment
   >>> word[:1] = 'Splat'
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: object does not support slice assignment

However, creating a new string with the combined content is easy and efficient::

   >>> 'x' + word[1:]
   'xelpA'
   >>> 'Splat' + word[4]
   'SplatA'

Here's a useful invariant of slice operations: ``s[:i] + s[i:]`` equals ``s``.
::

   >>> word[:2] + word[2:]
   'HelpA'
   >>> word[:3] + word[3:]
   'HelpA'

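
The invariant is not limited to the two split points tried above. As a sketch,
the following script checks it for every split point in a range that also
covers negative and out-of-range values; it assumes a Python version that has
the built-in :func:`all` and generator expressions, and the names are
illustrative::

   word = 'HelpA'

   # Check s[:i] + s[i:] == s for every split point, including negative
   # indices and indices beyond either end of the string.
   split_points = range(-len(word) - 3, len(word) + 4)
   invariant_holds = all(word[:i] + word[i:] == word for i in split_points)

   # Strings are immutable, so "changing" a character means building a
   # new string from slices of the old one.
   replaced = 'x' + word[1:]
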
Degenerate slice indices are handled gracefully: an index that is too large is
replaced by the string size, and an upper bound smaller than the lower bound
returns an empty string. ::

   >>> word[1:100]
   'elpA'
   >>> word[10:]
   ''
   >>> word[2:1]
   ''

Indices may be negative numbers, to start counting from the right. For example::

   >>> word[-1]     # The last character
   'A'
   >>> word[-2]     # The last-but-one character
   'p'
   >>> word[-2:]    # The last two characters
   'pA'
   >>> word[:-2]    # Everything except the last two characters
   'Hel'

But note that -0 is really the same as 0, so it does not count from the right!
::

   >>> word[-0]     # (since -0 equals 0)
   'H'

Out-of-range negative slice indices are truncated, but don't try this for
single-element (non-slice) indices::

   >>> word[-100:]
   'HelpA'
   >>> word[-10]    # error
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   IndexError: string index out of range

One way to remember how slices work is to think of the indices as pointing
*between* characters, with the left edge of the first character numbered 0.
Then the right edge of the last character of a string of *n* characters has
index *n*, for example::

    +---+---+---+---+---+
    | H | e | l | p | A |
    +---+---+---+---+---+
    0   1   2   3   4   5
   -5  -4  -3  -2  -1

The first row of numbers gives the position of the indices 0...5 in the string;
the second row gives the corresponding negative indices. The slice from *i* to
*j* consists of all characters between the edges labeled *i* and *j*,
respectively.

For non-negative indices, the length of a slice is the difference of the
indices, if both are within bounds. For example, the length of ``word[1:3]`` is
2.

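
The rules pictured in the diagram can be checked mechanically. The script below
is a sketch under the same assumptions as the text (a five-character string);
the variable names are illustrative and :func:`all` is assumed to be
available::

   word = 'HelpA'
   n = len(word)

   # A negative index k names the same character as the index n + k.
   same_character = all(word[k] == word[n + k] for k in range(-n, 0))

   # For in-bounds indices 0 <= i <= j <= n, the slice length is j - i.
   length_rule_holds = all(
       len(word[i:j]) == j - i
       for i in range(n + 1)
       for j in range(i, n + 1)
   )

   # Out-of-range slice indices are truncated to the ends of the string...
   clamped = word[-100:100]

   # ...but an out-of-range single index raises IndexError.
   try:
       word[-10]
       index_error_raised = False
   except IndexError:
       index_error_raised = True
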
The built-in function :func:`len` returns the length of a string::

   >>> s = 'supercalifragilisticexpialidocious'
   >>> len(s)
   34


.. seealso::

   :ref:`typesseq`
      Strings, and the Unicode strings described in the next section, are
      examples of *sequence types*, and support the common operations supported
      by such types.

   :ref:`string-methods`
      Both strings and Unicode strings support a large number of methods for
      basic transformations and searching.

   :ref:`new-string-formatting`
      Information about string formatting with :meth:`str.format` is described
      here.

   :ref:`string-formatting`
      The old formatting operations invoked when strings and Unicode strings are
      the left operand of the ``%`` operator are described in more detail here.


.. _tut-unicodestrings:

Unicode Strings
---------------

.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com>


Starting with Python 2.0 a new data type for storing text data is available to
the programmer: the Unicode object. It can be used to store and manipulate
Unicode data (see http://www.unicode.org/) and integrates well with the existing
string objects, providing auto-conversions where necessary.

Unicode has the advantage of providing one ordinal for every character in every
script used in modern and ancient texts. Previously, there were only 256
possible ordinals for script characters. Texts were typically bound to a code
page which mapped the ordinals to script characters. This led to a great deal of
confusion, especially with respect to internationalization (usually written as
``i18n``, that is, ``'i'`` + 18 characters + ``'n'``) of software. Unicode solves
these problems by defining one code page for all scripts.

Creating Unicode strings in Python is just as simple as creating normal
strings::

   >>> u'Hello World !'
   u'Hello World !'

The small ``'u'`` in front of the quote indicates that a Unicode string is
supposed to be created. If you want to include special characters in the string,
you can do so by using the Python *Unicode-Escape* encoding. The following
example shows how::

   >>> u'Hello\u0020World !'
   u'Hello World !'

The escape sequence ``\u0020`` inserts the Unicode character with the ordinal
value 0x0020 (the space character) at the given position.

Other characters are interpreted by using their respective ordinal values
directly as Unicode ordinals. If you have literal strings in the standard
Latin-1 encoding that is used in many Western countries, you will find it
convenient that the lower 256 characters of Unicode are the same as the 256
characters of Latin-1.

For experts, there is also a raw mode just like the one for normal strings. You
have to prefix the opening quote with 'ur' to have Python use the
*Raw-Unicode-Escape* encoding. It will only apply the above ``\uXXXX``
conversion if there is an odd number of backslashes in front of the small
'u'. ::

   >>> ur'Hello\u0020World !'
   u'Hello World !'
   >>> ur'Hello\\u0020World !'
   u'Hello\\\\u0020World !'

The raw mode is most useful when you have to enter lots of backslashes, as can
be necessary in regular expressions.

Apart from these standard encodings, Python provides a whole set of other ways
of creating Unicode strings on the basis of a known encoding.

.. index:: builtin: unicode

The built-in function :func:`unicode` provides access to all registered Unicode
codecs (COders and DECoders). Some of the more well known encodings which these
codecs can convert are *Latin-1*, *ASCII*, *UTF-8*, and *UTF-16*. The latter two
are variable-length encodings that store each Unicode character in one or more
bytes. The default encoding is normally set to ASCII, which passes through
characters in the range 0 to 127 and rejects any other characters with an error.
When a Unicode string is printed, written to a file, or converted with
:func:`str`, conversion takes place using this default encoding. ::

   >>> u"abc"
   u'abc'
   >>> str(u"abc")
   'abc'
   >>> u"äöü"
   u'\xe4\xf6\xfc'
   >>> str(u"äöü")
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

To convert a Unicode string into an 8-bit string using a specific encoding,
Unicode objects provide an :meth:`encode` method that takes one argument, the
name of the encoding. Lowercase names for encodings are preferred. ::

   >>> u"äöü".encode('utf-8')
   '\xc3\xa4\xc3\xb6\xc3\xbc'

If you have data in a specific encoding and want to produce a corresponding
Unicode string from it, you can use the :func:`unicode` function with the
encoding name as the second argument. ::

   >>> unicode('\xc3\xa4\xc3\xb6\xc3\xbc', 'utf-8')
   u'\xe4\xf6\xfc'

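
Encoding and decoding are inverse operations, which a short round trip makes
visible. This is a sketch only: it decodes with the ``decode`` method of the
encoded string rather than with :func:`unicode` as in the text above, and the
variable names are illustrative::

   text = u'\xe4\xf6\xfc'           # the three umlaut characters used above

   utf8_bytes = text.encode('utf-8')        # two bytes per character here
   latin1_bytes = text.encode('latin-1')    # one byte per character

   # Decoding with the matching codec restores the original text.
   round_trip = utf8_bytes.decode('utf-8')

   # ASCII only covers ordinals 0 to 127, so encoding this text fails.
   try:
       text.encode('ascii')
       ascii_failed = False
   except UnicodeEncodeError:
       ascii_failed = True
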

.. _tut-lists:

Lists
-----

Python knows a number of *compound* data types, used to group together other
values. The most versatile is the *list*, which can be written as a list of
comma-separated values (items) between square brackets. List items need not all
have the same type. ::

   >>> a = ['spam', 'eggs', 100, 1234]
   >>> a
   ['spam', 'eggs', 100, 1234]

Like string indices, list indices start at 0, and lists can be sliced,
concatenated and so on::

   >>> a[0]
   'spam'
   >>> a[3]
   1234
   >>> a[-2]
   100
   >>> a[1:-1]
   ['eggs', 100]
   >>> a[:2] + ['bacon', 2*2]
   ['spam', 'eggs', 'bacon', 4]
   >>> 3*a[:3] + ['Boo!']
   ['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boo!']

Unlike strings, which are *immutable*, lists allow their individual elements to
be changed::

   >>> a
   ['spam', 'eggs', 100, 1234]
   >>> a[2] = a[2] + 23
   >>> a
   ['spam', 'eggs', 123, 1234]

Assignment to slices is also possible, and this can even change the size of the
list or clear it entirely::

   >>> # Replace some items:
   ... a[0:2] = [1, 12]
   >>> a
   [1, 12, 123, 1234]
   >>> # Remove some:
   ... a[0:2] = []
   >>> a
   [123, 1234]
   >>> # Insert some:
   ... a[1:1] = ['bletch', 'xyzzy']
   >>> a
   [123, 'bletch', 'xyzzy', 1234]
   >>> # Insert (a copy of) itself at the beginning
   ... a[:0] = a
   >>> a
   [123, 'bletch', 'xyzzy', 1234, 123, 'bletch', 'xyzzy', 1234]
   >>> # Clear the list: replace all items with an empty list
   ... a[:] = []
   >>> a
   []

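
Slice assignment changes the existing list object, which is different from
assigning a new list to the name. The following script is a sketch of that
difference; the names ``alias``, ``b`` and ``other`` are made up for the
example::

   a = [1, 12, 123, 1234]
   alias = a                    # a second name for the same list object

   a[1:3] = ['x']               # two items replaced by one: the list shrinks
   after_replace = a[:]         # a[:] makes a copy to keep as a snapshot

   a[1:1] = ['y', 'z']          # an empty slice marks an insertion point
   after_insert = a[:]

   a[:] = []                    # empties the existing list object in place
   alias_is_empty = (alias == [])

   b = [1, 2, 3]
   other = b
   b = []                       # rebinds the name b; the old list is untouched
   other_unchanged = (other == [1, 2, 3])
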
The built-in function :func:`len` also applies to lists::

   >>> a = ['a', 'b', 'c', 'd']
   >>> len(a)
   4

It is possible to nest lists (create lists containing other lists), for
example::

   >>> q = [2, 3]
   >>> p = [1, q, 4]
   >>> len(p)
   3
   >>> p[1]
   [2, 3]
   >>> p[1][0]
   2
   >>> p[1].append('xtra')     # See section 5.1
   >>> p
   [1, [2, 3, 'xtra'], 4]
   >>> q
   [2, 3, 'xtra']

Note that in the last example, ``p[1]`` and ``q`` really refer to the same
object! We'll come back to *object semantics* later.

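
The sharing can be made explicit with the ``is`` operator, and avoided by
nesting a copy instead. This is a minimal sketch; ``is`` is not introduced until
later in the tutorial, and ``p_copy`` is a hypothetical name used only here::

   q = [2, 3]
   p = [1, q, 4]                # p[1] is a reference to q, not a copy of it
   p_copy = [1, q[:], 4]        # q[:] is a new list with the same items

   q.append('xtra')             # change the shared list through the name q

   shared = p[1] is q           # one object, reachable in two ways
   seen_through_p = p[1]        # the change is visible here
   independent = p_copy[1]      # the copy was made before the change
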

.. _tut-firststeps:

First Steps Towards Programming
===============================

Of course, we can use Python for more complicated tasks than adding two and two
together. For instance, we can write an initial sub-sequence of the *Fibonacci*
series as follows::

   >>> # Fibonacci series:
   ... # the sum of two elements defines the next
   ... a, b = 0, 1
   >>> while b < 10:
   ...     print b
   ...     a, b = b, a+b
   ...
   1
   1
   2
   3
   5
   8

This example introduces several new features.

* The first line contains a *multiple assignment*: the variables ``a`` and ``b``
  simultaneously get the new values 0 and 1. On the last line this is used again,
  demonstrating that the expressions on the right-hand side are all evaluated
  first before any of the assignments take place. The right-hand side expressions
  are evaluated from left to right.

* The :keyword:`while` loop executes as long as the condition (here: ``b < 10``)
  remains true. In Python, as in C, any non-zero integer value is true; zero is
  false. The condition may also be a string or list value, in fact any sequence;
  anything with a non-zero length is true, empty sequences are false. The test
  used in the example is a simple comparison. The standard comparison operators
  are written the same as in C: ``<`` (less than), ``>`` (greater than), ``==``
  (equal to), ``<=`` (less than or equal to), ``>=`` (greater than or equal to)
  and ``!=`` (not equal to).

* The *body* of the loop is *indented*: indentation is Python's way of grouping
  statements. Python does not (yet!) provide an intelligent input line editing
  facility, so you have to type a tab or space(s) for each indented line. In
  practice you will prepare more complicated input for Python with a text editor;
  most text editors have an auto-indent facility. When a compound statement is
  entered interactively, it must be followed by a blank line to indicate
  completion (since the parser cannot guess when you have typed the last line).
  Note that each line within a basic block must be indented by the same amount.

* The :keyword:`print` statement writes the value of the expression(s) it is
  given. It differs from just writing the expression you want to write (as we did
  earlier in the calculator examples) in the way it handles multiple expressions
  and strings. Strings are printed without quotes, and a space is inserted
  between items, so you can format things nicely, like this::

     >>> i = 256*256
     >>> print 'The value of i is', i
     The value of i is 65536

  A trailing comma avoids the newline after the output::

     >>> a, b = 0, 1
     >>> while b < 1000:
     ...     print b,
     ...     a, b = b, a+b
     ...
     1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987

  Note that the interpreter inserts a newline before it prints the next prompt if
  the last line was not completed.
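
The first two points can be tried out in a script as well. The sketch below
collects the series in a list instead of printing it, so that the result can be
inspected afterwards; ``limit``, ``terms`` and the other names are illustrative
and not part of the example above::

   limit = 10
   terms = []
   a, b = 0, 1
   while b < limit:
       terms.append(b)
       # Both right-hand expressions are evaluated before either name is
       # rebound, so a + b still uses the old value of a.
       a, b = b, a + b

   # The same evaluate-then-assign rule swaps two values without a
   # temporary variable.
   x, y = 1, 2
   x, y = y, x

   # An empty sequence counts as false in a condition; any non-empty
   # sequence counts as true, whatever it contains.
   truth_values = [bool(''), bool('spam'), bool([]), bool([0])]
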