.. _tut-informal:

**********************************
An Informal Introduction to Python
**********************************

In the following examples, input and output are distinguished by the presence or
absence of prompts (``>>>`` and ``...``): to repeat the example, you must type
everything after the prompt, when the prompt appears; lines that do not begin
with a prompt are output from the interpreter. Note that a secondary prompt on a
line by itself in an example means you must type a blank line; this is used to
end a multi-line command.

Many of the examples in this manual, even those entered at the interactive
prompt, include comments. Comments in Python start with the hash character,
``#``, and extend to the end of the physical line. A comment may appear at the
start of a line or following whitespace or code, but not within a string
literal. A hash character within a string literal is just a hash character.
Since comments are to clarify code and are not interpreted by Python, they may
be omitted when typing in examples.

Some examples::

   # this is the first comment
   SPAM = 1                 # and this is the second comment
                            # ... and now a third!
   STRING = "# This is not a comment."


.. _tut-calculator:

Using Python as a Calculator
============================

Let's try some simple Python commands. Start the interpreter and wait for the
primary prompt, ``>>>``. (It shouldn't take long.)


.. _tut-numbers:

Numbers
-------

The interpreter acts as a simple calculator: you can type an expression at it
and it will write the value. Expression syntax is straightforward: the
operators ``+``, ``-``, ``*`` and ``/`` work just like in most other languages
(for example, Pascal or C); parentheses can be used for grouping. For example::

   >>> 2+2
   4
   >>> # This is a comment
   ... 2+2
   4
   >>> 2+2  # and a comment on the same line as code
   4
   >>> (50-5*6)/4
   5
   >>> # Integer division returns the floor:
   ... 7/3
   2
   >>> 7/-3
   -3

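The floor behaviour of integer division shown above can also be requested
explicitly. The following is a minimal sketch, not part of the original
examples: it uses the ``//`` operator and the built-in :func:`divmod`, and the
variable names are chosen only for illustration. It calls ``print`` with a
single parenthesised argument so that it runs unchanged on the Python 2
described here and on Python 3.

```python
# Floor division rounds toward negative infinity, not toward zero.
quotient_pos = 7 // 3      # 2
quotient_neg = 7 // -3     # -3, because -2.33... is floored to -3

# divmod() returns the floored quotient together with the matching remainder,
# so that divisor * quotient + remainder == dividend always holds.
q, r = divmod(7, -3)       # q == -3, r == -2

print(quotient_pos)        # 2
print(quotient_neg)        # -3
print(q * -3 + r)          # 7
```
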
The equal sign (``'='``) is used to assign a value to a variable. Afterwards, no
result is displayed before the next interactive prompt::

   >>> width = 20
   >>> height = 5*9
   >>> width * height
   900

A value can be assigned to several variables simultaneously::

   >>> x = y = z = 0  # Zero x, y and z
   >>> x
   0
   >>> y
   0
   >>> z
   0

Variables must be "defined" (assigned a value) before they can be used, or an
error will occur::

   >>> # try to access an undefined variable
   ... n
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   NameError: name 'n' is not defined

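In a script, the same error can be caught rather than left to stop the program.
This is only an illustrative sketch: the name ``undefined_name_for_demo`` is
hypothetical and is deliberately never assigned, and the ``try`` statement used
here is covered later in the tutorial.

```python
try:
    undefined_name_for_demo       # hypothetical name, never assigned anywhere
    lookup_succeeded = True
except NameError as exc:
    lookup_succeeded = False
    message = str(exc)            # the text shown after "NameError:"

print(lookup_succeeded)           # False
print(message)
```
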
There is full support for floating point; operators with mixed type operands
convert the integer operand to floating point::

   >>> 3 * 3.75 / 1.5
   7.5
   >>> 7.0 / 2
   3.5

Complex numbers are also supported; imaginary numbers are written with a suffix
of ``j`` or ``J``. Complex numbers with a nonzero real component are written as
``(real+imagj)``, or can be created with the ``complex(real, imag)`` function.
::

   >>> 1j * 1J
   (-1+0j)
   >>> 1j * complex(0,1)
   (-1+0j)
   >>> 3+1j*3
   (3+3j)
   >>> (3+1j)*3
   (9+3j)
   >>> (1+2j)/(1+1j)
   (1.5+0.5j)

Complex numbers are always represented as two floating point numbers, the real
and imaginary part. To extract these parts from a complex number *z*, use
``z.real`` and ``z.imag``. ::

   >>> a=1.5+0.5j
   >>> a.real
   1.5
   >>> a.imag
   0.5

The conversion functions to floating point and integer (:func:`float`,
:func:`int` and :func:`long`) don't work for complex numbers --- there is no one
correct way to convert a complex number to a real number. Use ``abs(z)`` to get
its magnitude (as a float) or ``z.real`` to get its real part. ::

   >>> a=3.0+4.0j
   >>> float(a)
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   TypeError: can't convert complex to float; use abs(z)
   >>> a.real
   3.0
   >>> a.imag
   4.0
   >>> abs(a)  # sqrt(a.real**2 + a.imag**2)
   5.0
   >>>

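The relationship between ``abs(z)`` and the two components can be checked in a
short script. This is a sketch under the assumption that the standard
:mod:`math` module is available; the variable names are illustrative.

```python
import math

z = 3.0 + 4.0j

# abs() on a complex number returns its magnitude as a float.
magnitude = abs(z)

# The same value computed by hand from the real and imaginary parts.
by_hand = math.sqrt(z.real ** 2 + z.imag ** 2)

# float(z) raises TypeError, so choose explicitly which value you want.
try:
    float(z)
    conversion_failed = False
except TypeError:
    conversion_failed = True

print(magnitude)           # 5.0
print(by_hand)             # 5.0
print(conversion_failed)   # True
```
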
In interactive mode, the last printed expression is assigned to the variable
``_``. This means that when you are using Python as a desk calculator, it is
somewhat easier to continue calculations, for example::

   >>> tax = 12.5 / 100
   >>> price = 100.50
   >>> price * tax
   12.5625
   >>> price + _
   113.0625
   >>> round(_, 2)
   113.06
   >>>

This variable should be treated as read-only by the user. Don't explicitly
assign a value to it --- you would create an independent local variable with the
same name masking the built-in variable with its magic behavior.


.. _tut-strings:

Strings
-------

Besides numbers, Python can also manipulate strings, which can be expressed in
several ways. They can be enclosed in single quotes or double quotes::

   >>> 'spam eggs'
   'spam eggs'
   >>> 'doesn\'t'
   "doesn't"
   >>> "doesn't"
   "doesn't"
   >>> '"Yes," he said.'
   '"Yes," he said.'
   >>> "\"Yes,\" he said."
   '"Yes," he said.'
   >>> '"Isn\'t," she said.'
   '"Isn\'t," she said.'

String literals can span multiple lines in several ways. Continuation lines can
be used, with a backslash as the last character on the line indicating that the
next line is a logical continuation of the line::

   hello = "This is a rather long string containing\n\
   several lines of text just as you would do in C.\n\
       Note that whitespace at the beginning of the line is\
    significant."

   print hello

Note that newlines still need to be embedded in the string using ``\n``; the
newline following the trailing backslash is discarded. This example would print
the following::

   This is a rather long string containing
   several lines of text just as you would do in C.
       Note that whitespace at the beginning of the line is significant.

If we make the string literal a "raw" string, however, the ``\n`` sequences are
not converted to newlines, but the backslash at the end of the line, and the
newline character in the source, are both included in the string as data. Thus,
the example::

   hello = r"This is a rather long string containing\n\
   several lines of text much as you would do in C."

   print hello

would print::

   This is a rather long string containing\n\
   several lines of text much as you would do in C.

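The difference between an ordinary and a raw string literal is easy to see by
comparing lengths. The following is a small sketch with illustrative variable
names; ``print`` is called with a single parenthesised argument so that the
code runs on both Python 2 and Python 3.

```python
cooked = "a\nb"    # \n is translated to one newline character
raw = r"a\nb"      # the backslash and the 'n' are kept as two characters

print(len(cooked))       # 3
print(len(raw))          # 4
print(raw == "a\\nb")    # True: same as escaping the backslash by hand
```
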
Or, strings can be surrounded in a pair of matching triple-quotes: ``"""`` or
``'''``. Line endings do not need to be escaped when using triple-quotes, but
they will be included in the string. ::

   print """
   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to
   """

produces the following output::

   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to

The interpreter prints the result of string operations in the same way as they
are typed for input: inside quotes, and with quotes and other funny characters
escaped by backslashes, to show the precise value. The string is enclosed in
double quotes if the string contains a single quote and no double quotes, else
it's enclosed in single quotes. (The :keyword:`print` statement, described
later, can be used to write strings without quotes or escapes.)

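The two ways of displaying a string correspond to the built-in functions
:func:`repr` and :func:`str`. As a rough sketch (the variable names are
illustrative), the quoting rule described above can be observed directly:

```python
text = "doesn't"

shown_by_interpreter = repr(text)   # what the interactive prompt echoes
shown_by_print = str(text)          # what print writes

print(shown_by_interpreter)   # "doesn't"
print(shown_by_print)         # doesn't

# With no single quote in the string, single quotes are used instead.
print(repr('spam eggs'))      # 'spam eggs'
```
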
Strings can be concatenated (glued together) with the ``+`` operator, and
repeated with ``*``::

   >>> word = 'Help' + 'A'
   >>> word
   'HelpA'
   >>> '<' + word*5 + '>'
   '<HelpAHelpAHelpAHelpAHelpA>'

Two string literals next to each other are automatically concatenated; the first
line above could also have been written ``word = 'Help' 'A'``; this only works
with two literals, not with arbitrary string expressions::

   >>> 'str' 'ing'                   #  <-  This is ok
   'string'
   >>> 'str'.strip() + 'ing'         #  <-  This is ok
   'string'
   >>> 'str'.strip() 'ing'           #  <-  This is invalid
     File "<stdin>", line 1, in ?
       'str'.strip() 'ing'
                         ^
   SyntaxError: invalid syntax

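One common use of adjacent-literal concatenation is breaking a long string
across source lines. The sketch below is an illustration rather than part of
the original examples: the enclosing parentheses are what allow the expression
to continue onto a second line, and the variable names are arbitrary.

```python
# The two literals inside the parentheses are joined into one string;
# no + operator is needed between literals.
message = ('Two string literals next to each other '
           'are automatically concatenated.')

# Anything that is not a literal still needs an explicit +.
prefix = 'str'
joined = prefix.strip() + 'ing'

print(message)
print(joined)    # string
```
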
Strings can be subscripted (indexed); like in C, the first character of a string
has subscript (index) 0. There is no separate character type; a character is
simply a string of size one. Like in Icon, substrings can be specified with the
*slice notation*: two indices separated by a colon. ::

   >>> word[4]
   'A'
   >>> word[0:2]
   'He'
   >>> word[2:4]
   'lp'

Slice indices have useful defaults; an omitted first index defaults to zero, an
omitted second index defaults to the size of the string being sliced. ::

   >>> word[:2]    # The first two characters
   'He'
   >>> word[2:]    # Everything except the first two characters
   'lpA'

Unlike a C string, Python strings cannot be changed. Assigning to an indexed
position in the string results in an error::

   >>> word[0] = 'x'
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   TypeError: object doesn't support item assignment
   >>> word[:1] = 'Splat'
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   TypeError: object doesn't support slice assignment

However, creating a new string with the combined content is easy and efficient::

   >>> 'x' + word[1:]
   'xelpA'
   >>> 'Splat' + word[4]
   'SplatA'

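The same contrast can be written as a script. This is a minimal sketch with
illustrative names; it catches the :exc:`TypeError` instead of letting it stop
the program, and then builds a new string while leaving the original untouched.

```python
word = 'HelpA'

try:
    word[0] = 'x'              # strings do not support item assignment
    assignment_worked = True
except TypeError:
    assignment_worked = False

# Build a new string from pieces of the old one instead.
new_word = 'x' + word[1:]

print(assignment_worked)   # False
print(new_word)            # xelpA
print(word)                # HelpA
```
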
Here's a useful invariant of slice operations: ``s[:i] + s[i:]`` equals ``s``.
::

   >>> word[:2] + word[2:]
   'HelpA'
   >>> word[:3] + word[3:]
   'HelpA'

Degenerate slice indices are handled gracefully: an index that is too large is
replaced by the string size, an upper bound smaller than the lower bound returns
an empty string. ::

   >>> word[1:100]
   'elpA'
   >>> word[10:]
   ''
   >>> word[2:1]
   ''

Indices may be negative numbers, to start counting from the right. For example::

   >>> word[-1]     # The last character
   'A'
   >>> word[-2]     # The last-but-one character
   'p'
   >>> word[-2:]    # The last two characters
   'pA'
   >>> word[:-2]    # Everything except the last two characters
   'Hel'

But note that -0 is really the same as 0, so it does not count from the right!
::

   >>> word[-0]     # (since -0 equals 0)
   'H'

Out-of-range negative slice indices are truncated, but don't try this for
single-element (non-slice) indices::

   >>> word[-100:]
   'HelpA'
   >>> word[-10]    # error
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   IndexError: string index out of range

One way to remember how slices work is to think of the indices as pointing
*between* characters, with the left edge of the first character numbered 0.
Then the right edge of the last character of a string of *n* characters has
index *n*, for example::

    +---+---+---+---+---+
    | H | e | l | p | A |
    +---+---+---+---+---+
    0   1   2   3   4   5
   -5  -4  -3  -2  -1

The first row of numbers gives the position of the indices 0...5 in the string;
the second row gives the corresponding negative indices. The slice from *i* to
*j* consists of all characters between the edges labeled *i* and *j*,
respectively.

For non-negative indices, the length of a slice is the difference of the
indices, if both are within bounds. For example, the length of ``word[1:3]`` is
2.

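The slice rules above can be checked mechanically. The following sketch uses
illustrative names and tests the invariant ``s[:i] + s[i:] == s`` for a range
of indices, including negative and out-of-range ones, then confirms the
slice-length rule for in-bounds indices.

```python
s = 'HelpA'

# The invariant holds for every i, because out-of-range slice
# indices are silently clipped to the ends of the string.
invariant_holds = all(s[:i] + s[i:] == s for i in range(-10, 11))

# For in-bounds, non-negative indices the slice length is j - i.
i, j = 1, 3
slice_length = len(s[i:j])

print(invariant_holds)   # True
print(slice_length)      # 2
print(repr(s[10:]))      # ''
print(repr(s[2:1]))      # ''
```
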
The built-in function :func:`len` returns the length of a string::

   >>> s = 'supercalifragilisticexpialidocious'
   >>> len(s)
   34


.. seealso::

   :ref:`typesseq`
      Strings, and the Unicode strings described in the next section, are
      examples of *sequence types*, and support the common operations supported
      by such types.

   :ref:`string-methods`
      Both strings and Unicode strings support a large number of methods for
      basic transformations and searching.

   :ref:`new-string-formatting`
      Information about string formatting with :meth:`str.format` is described
      here.

   :ref:`string-formatting`
      The old formatting operations invoked when strings and Unicode strings are
      the left operand of the ``%`` operator are described in more detail here.


.. _tut-unicodestrings:

Unicode Strings
---------------

.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com>


Starting with Python 2.0 a new data type for storing text data is available to
the programmer: the Unicode object. It can be used to store and manipulate
Unicode data (see http://www.unicode.org/) and integrates well with the existing
string objects, providing auto-conversions where necessary.

Unicode has the advantage of providing one ordinal for every character in every
script used in modern and ancient texts. Previously, there were only 256
possible ordinals for script characters. Texts were typically bound to a code
page which mapped the ordinals to script characters. This led to a great deal of
confusion, especially with respect to internationalization (usually written as
``i18n`` --- ``'i'`` + 18 characters + ``'n'``) of software. Unicode solves
these problems by defining one code page for all scripts.

Creating Unicode strings in Python is just as simple as creating normal
strings::

   >>> u'Hello World !'
   u'Hello World !'

The small ``'u'`` in front of the quote indicates that a Unicode string is
supposed to be created. If you want to include special characters in the string,
you can do so by using the Python *Unicode-Escape* encoding. The following
example shows how::

   >>> u'Hello\u0020World !'
   u'Hello World !'

The escape sequence ``\u0020`` indicates to insert the Unicode character with
the ordinal value 0x0020 (the space character) at the given position.

Other characters are interpreted by using their respective ordinal values
directly as Unicode ordinals. If you have literal strings in the standard
Latin-1 encoding that is used in many Western countries, you will find it
convenient that the lower 256 characters of Unicode are the same as the 256
characters of Latin-1.

For experts, there is also a raw mode just like the one for normal strings. You
have to prefix the opening quote with 'ur' to have Python use the
*Raw-Unicode-Escape* encoding. It will only apply the above ``\uXXXX``
conversion if there is an odd number of backslashes in front of the small
'u'. ::

   >>> ur'Hello\u0020World !'
   u'Hello World !'
   >>> ur'Hello\\u0020World !'
   u'Hello\\\\u0020World !'

The raw mode is most useful when you have to enter lots of backslashes, as can
be necessary in regular expressions.

Apart from these standard encodings, Python provides a whole set of other ways
of creating Unicode strings on the basis of a known encoding.

.. index:: builtin: unicode

The built-in function :func:`unicode` provides access to all registered Unicode
codecs (COders and DECoders). Some of the better-known encodings which these
codecs can convert are *Latin-1*, *ASCII*, *UTF-8*, and *UTF-16*. The latter two
are variable-length encodings that store each Unicode character in one or more
bytes. The default encoding is normally set to ASCII, which passes through
characters in the range 0 to 127 and rejects any other characters with an error.
When a Unicode string is printed, written to a file, or converted with
:func:`str`, conversion takes place using this default encoding. ::

   >>> u"abc"
   u'abc'
   >>> str(u"abc")
   'abc'
   >>> u"äöü"
   u'\xe4\xf6\xfc'
   >>> str(u"äöü")
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

To convert a Unicode string into an 8-bit string using a specific encoding,
Unicode objects provide an :func:`encode` method that takes one argument, the
name of the encoding. Lowercase names for encodings are preferred. ::

   >>> u"äöü".encode('utf-8')
   '\xc3\xa4\xc3\xb6\xc3\xbc'

If you have data in a specific encoding and want to produce a corresponding
Unicode string from it, you can use the :func:`unicode` function with the
encoding name as the second argument. ::

   >>> unicode('\xc3\xa4\xc3\xb6\xc3\xbc', 'utf-8')
   u'\xe4\xf6\xfc'


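The encode and decode steps form a round trip. The sketch below is an
illustration with arbitrary variable names; instead of calling
:func:`unicode`, it uses the ``decode`` method of the 8-bit string, which is
an alternative spelling chosen here so that the same code also runs on
Python 3. The characters are written with ``\x`` escapes to keep the source
file pure ASCII.

```python
text = u"\xe4\xf6\xfc"             # a-umlaut, o-umlaut, u-umlaut

encoded = text.encode('utf-8')     # 8-bit string; two bytes per character here
decoded = encoded.decode('utf-8')  # back to a Unicode string

# ASCII cannot represent these characters, so encoding to it fails.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(len(text))         # 3
print(len(encoded))      # 6
print(decoded == text)   # True
print(ascii_ok)          # False
```
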
.. _tut-lists:

Lists
-----

Python knows a number of *compound* data types, used to group together other
values. The most versatile is the *list*, which can be written as a list of
comma-separated values (items) between square brackets. List items need not all
have the same type. ::

   >>> a = ['spam', 'eggs', 100, 1234]
   >>> a
   ['spam', 'eggs', 100, 1234]

Like string indices, list indices start at 0, and lists can be sliced,
concatenated and so on::

   >>> a[0]
   'spam'
   >>> a[3]
   1234
   >>> a[-2]
   100
   >>> a[1:-1]
   ['eggs', 100]
   >>> a[:2] + ['bacon', 2*2]
   ['spam', 'eggs', 'bacon', 4]
   >>> 3*a[:3] + ['Boo!']
   ['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boo!']

Unlike strings, which are *immutable*, it is possible to change individual
elements of a list::

   >>> a
   ['spam', 'eggs', 100, 1234]
   >>> a[2] = a[2] + 23
   >>> a
   ['spam', 'eggs', 123, 1234]

Assignment to slices is also possible, and this can even change the size of the
list or clear it entirely::

   >>> # Replace some items:
   ... a[0:2] = [1, 12]
   >>> a
   [1, 12, 123, 1234]
   >>> # Remove some:
   ... a[0:2] = []
   >>> a
   [123, 1234]
   >>> # Insert some:
   ... a[1:1] = ['bletch', 'xyzzy']
   >>> a
   [123, 'bletch', 'xyzzy', 1234]
   >>> # Insert (a copy of) itself at the beginning
   ... a[:0] = a
   >>> a
   [123, 'bletch', 'xyzzy', 1234, 123, 'bletch', 'xyzzy', 1234]
   >>> # Clear the list: replace all items with an empty list
   ... a[:] = []
   >>> a
   []

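Slice assignment changes the existing list object rather than creating a new
one. The following sketch makes that visible by keeping a second name,
``alias``, bound to the same list; both ``alias`` and ``snapshot`` are
illustrative names that do not appear in the examples above.

```python
a = [1, 12, 123, 1234]
alias = a                       # second name for the same list object

a[0:2] = []                     # remove the first two items
a[1:1] = ['bletch', 'xyzzy']    # insert before index 1, replacing nothing
snapshot = a[:]                 # a slice copy, independent of later changes

a[:] = []                       # clear the list in place

print(snapshot)   # [123, 'bletch', 'xyzzy', 1234]
print(a)          # []
print(alias)      # []  (the alias sees the in-place change)
```
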
The built-in function :func:`len` also applies to lists::

   >>> a = ['a', 'b', 'c', 'd']
   >>> len(a)
   4

It is possible to nest lists (create lists containing other lists), for
example::

   >>> q = [2, 3]
   >>> p = [1, q, 4]
   >>> len(p)
   3
   >>> p[1]
   [2, 3]
   >>> p[1][0]
   2
   >>> p[1].append('xtra')     # See section 5.1
   >>> p
   [1, [2, 3, 'xtra'], 4]
   >>> q
   [2, 3, 'xtra']

Note that in the last example, ``p[1]`` and ``q`` really refer to the same
object! We'll come back to *object semantics* later.


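The shared reference can be confirmed with the ``is`` operator, and avoided by
nesting a copy instead of the list itself. This is a sketch only: ``p_copy``
is a hypothetical name introduced for the comparison, and the slice ``q[:]``
is used as one simple way to copy a list.

```python
q = [2, 3]
p = [1, q, 4]

same_object = p[1] is q      # True: both refer to one list object

p[1].append('xtra')          # the change is visible through q as well

# Nesting a slice copy gives the outer list its own separate inner list.
p_copy = [1, q[:], 4]
q.append('more')

print(same_object)   # True
print(q)             # [2, 3, 'xtra', 'more']
print(p[1])          # [2, 3, 'xtra', 'more']
print(p_copy[1])     # [2, 3, 'xtra']
```
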
.. _tut-firststeps:

First Steps Towards Programming
===============================

Of course, we can use Python for more complicated tasks than adding two and two
together. For instance, we can write an initial sub-sequence of the *Fibonacci*
series as follows::

   >>> # Fibonacci series:
   ... # the sum of two elements defines the next
   ... a, b = 0, 1
   >>> while b < 10:
   ...     print b
   ...     a, b = b, a+b
   ...
   1
   1
   2
   3
   5
   8

This example introduces several new features.

* The first line contains a *multiple assignment*: the variables ``a`` and ``b``
  simultaneously get the new values 0 and 1. On the last line this is used again,
  demonstrating that the expressions on the right-hand side are all evaluated
  first before any of the assignments take place. The right-hand side expressions
  are evaluated from the left to the right.

* The :keyword:`while` loop executes as long as the condition (here: ``b < 10``)
  remains true. In Python, like in C, any non-zero integer value is true; zero is
  false. The condition may also be a string or list value, in fact any sequence;
  anything with a non-zero length is true, empty sequences are false. The test
  used in the example is a simple comparison. The standard comparison operators
  are written the same as in C: ``<`` (less than), ``>`` (greater than), ``==``
  (equal to), ``<=`` (less than or equal to), ``>=`` (greater than or equal to)
  and ``!=`` (not equal to).

* The *body* of the loop is *indented*: indentation is Python's way of grouping
  statements. Python does not (yet!) provide an intelligent input line editing
  facility, so you have to type a tab or space(s) for each indented line. In
  practice you will prepare more complicated input for Python with a text editor;
  most text editors have an auto-indent facility. When a compound statement is
  entered interactively, it must be followed by a blank line to indicate
  completion (since the parser cannot guess when you have typed the last line).
  Note that each line within a basic block must be indented by the same amount.

* The :keyword:`print` statement writes the value of the expression(s) it is
  given. It differs from just writing the expression you want to write (as we did
  earlier in the calculator examples) in the way it handles multiple expressions
  and strings. Strings are printed without quotes, and a space is inserted
  between items, so you can format things nicely, like this::

     >>> i = 256*256
     >>> print 'The value of i is', i
     The value of i is 65536

  A trailing comma avoids the newline after the output::

     >>> a, b = 0, 1
     >>> while b < 1000:
     ...     print b,
     ...     a, b = b, a+b
     ...
     1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987

  Note that the interpreter inserts a newline before it prints the next prompt if
  the last line was not completed.

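
The points in the list above can be tried out in a script. The following is a
minimal sketch rather than a definitive version of the example: it collects the
series in a list (``series`` and ``limit`` are illustrative names) instead of
printing each value, shows why the simultaneous assignment differs from two
separate assignments, and checks the truth values of empty and non-empty
sequences. ``print`` is called with a single parenthesised argument so that the
code runs on both Python 2 and Python 3.

```python
limit = 10
series = []
a, b = 0, 1
while b < limit:
    series.append(b)
    # Both right-hand expressions are evaluated before either name is rebound.
    a, b = b, a + b

# Multiple assignment versus two separate assignments.
x, y = 1, 2
x, y = y, x + y        # uses the old x: x == 2, y == 3

u, v = 1, 2
u = v
v = u + v              # uses the new u: v == 4, not 3

# Empty sequences are false; anything with a non-zero length is true.
empty_is_true = bool('') or bool([])
nonempty_is_true = bool('0') and bool([0])

print(series)              # [1, 1, 2, 3, 5, 8]
print((x, y))              # (2, 3)
print(v)                   # 4
print(empty_is_true)       # False
print(nonempty_is_true)    # True
```
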