.. _tut-informal:

**********************************
An Informal Introduction to Python
**********************************

In the following examples, input and output are distinguished by the presence or
absence of prompts (``>>>`` and ``...``): to repeat the example, you must type
everything after the prompt, when the prompt appears; lines that do not begin
with a prompt are output from the interpreter. Note that a secondary prompt on a
line by itself in an example means you must type a blank line; this is used to
end a multi-line command.

Many of the examples in this manual, even those entered at the interactive
prompt, include comments. Comments in Python start with the hash character,
``#``, and extend to the end of the physical line. A comment may appear at the
start of a line or following whitespace or code, but not within a string
literal. A hash character within a string literal is just a hash character.
Since comments are to clarify code and are not interpreted by Python, they may
be omitted when typing in examples.

Some examples::

   # this is the first comment
   SPAM = 1                 # and this is the second comment
                            # ... and now a third!
   STRING = "# This is not a comment."


.. _tut-calculator:

Using Python as a Calculator
============================

Let's try some simple Python commands. Start the interpreter and wait for the
primary prompt, ``>>>``. (It shouldn't take long.)


.. _tut-numbers:

Numbers
-------

The interpreter acts as a simple calculator: you can type an expression at it
and it will write the value. Expression syntax is straightforward: the
operators ``+``, ``-``, ``*`` and ``/`` work just like in most other languages
(for example, Pascal or C); parentheses can be used for grouping. For example::

   >>> 2+2
   4
   >>> # This is a comment
   ... 2+2
   4
   >>> 2+2  # and a comment on the same line as code
   4
   >>> (50-5*6)/4
   5.0
   >>> 8/5 # Fractions aren't lost when dividing integers
   1.6

Note: You might not see exactly the same result; floating point results can
differ from one machine to another. We will say more later about controlling
the appearance of floating point output. See also :ref:`tut-fp-issues` for a
full discussion of some of the subtleties of floating point numbers and their
representations.

To do integer division and get an integer result,
discarding any fractional result, there is another operator, ``//``::

   >>> # Integer division returns the floor:
   ... 7//3
   2
   >>> 7//-3
   -3
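
The floor behaviour matters mostly for negative operands. The following is a minimal sketch of how ``//`` relates to ``%``, :func:`divmod` and truncation toward zero; the variable names are only illustrative and do not come from the tutorial session.

```python
import math

quotient = 7 // -3              # floor of -2.33..., i.e. -3
remainder = 7 % -3              # takes the sign of the divisor: -2
pair = divmod(7, -3)            # quotient and remainder in one call
truncated = math.trunc(7 / -3)  # rounds toward zero instead: -2

# Floor division and remainder are tied together by this identity.
identity_holds = all(
    (a // b) * b + a % b == a
    for a in range(-7, 8)
    for b in (-3, -2, 2, 3)
)
```

If you need rounding toward zero, as in C, ``math.trunc`` on the true quotient is one way to get it.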

The equal sign (``'='``) is used to assign a value to a variable. Afterwards, no
result is displayed before the next interactive prompt::

   >>> width = 20
   >>> height = 5*9
   >>> width * height
   900

A value can be assigned to several variables simultaneously::

   >>> x = y = z = 0  # Zero x, y and z
   >>> x
   0
   >>> y
   0
   >>> z
   0

Variables must be "defined" (assigned a value) before they can be used, or an
error will occur::

   >>> # try to access an undefined variable
   ... n
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   NameError: name 'n' is not defined

There is full support for floating point; operators with mixed type operands
convert the integer operand to floating point::

   >>> 3 * 3.75 / 1.5
   7.5
   >>> 7.0 / 2
   3.5

Complex numbers are also supported; imaginary numbers are written with a suffix
of ``j`` or ``J``. Complex numbers with a nonzero real component are written as
``(real+imagj)``, or can be created with the ``complex(real, imag)`` function.
::

   >>> 1j * 1J
   (-1+0j)
   >>> 1j * complex(0, 1)
   (-1+0j)
   >>> 3+1j*3
   (3+3j)
   >>> (3+1j)*3
   (9+3j)
   >>> (1+2j)/(1+1j)
   (1.5+0.5j)

Complex numbers are always represented as two floating point numbers, the real
and imaginary part. To extract these parts from a complex number *z*, use
``z.real`` and ``z.imag``. ::

   >>> a=1.5+0.5j
   >>> a.real
   1.5
   >>> a.imag
   0.5

The conversion functions to floating point and integer (:func:`float`,
:func:`int`) don't work for complex numbers --- there is not one correct way to
convert a complex number to a real number. Use ``abs(z)`` to get its magnitude
(as a float) or ``z.real`` to get its real part::

   >>> a=3.0+4.0j
   >>> float(a)
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: can't convert complex to float; use abs(z)
   >>> a.real
   3.0
   >>> a.imag
   4.0
   >>> abs(a)  # sqrt(a.real**2 + a.imag**2)
   5.0
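
As a rough cross-check of the comment above, the magnitude can also be computed from the two parts directly. This sketch assumes nothing beyond the standard :mod:`math` module; the names ``z``, ``magnitude`` and ``from_parts`` are illustrative.

```python
import math

z = 3.0 + 4.0j

magnitude = abs(z)                       # built-in magnitude of a complex number
from_parts = math.hypot(z.real, z.imag)  # sqrt(real**2 + imag**2)

# Multiplying by the conjugate yields the squared magnitude as the real part.
squared = (z * z.conjugate()).real
```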

In interactive mode, the last printed expression is assigned to the variable
``_``. This means that when you are using Python as a desk calculator, it is
somewhat easier to continue calculations, for example::

   >>> tax = 12.5 / 100
   >>> price = 100.50
   >>> price * tax
   12.5625
   >>> price + _
   113.0625
   >>> round(_, 2)
   113.06

This variable should be treated as read-only by the user. Don't explicitly
assign a value to it --- you would create an independent local variable with the
same name masking the built-in variable with its magic behavior.


.. _tut-strings:

Strings
-------

Besides numbers, Python can also manipulate strings, which can be expressed in
several ways. They can be enclosed in single quotes or double quotes::

   >>> 'spam eggs'
   'spam eggs'
   >>> 'doesn\'t'
   "doesn't"
   >>> "doesn't"
   "doesn't"
   >>> '"Yes," he said.'
   '"Yes," he said.'
   >>> "\"Yes,\" he said."
   '"Yes," he said.'
   >>> '"Isn\'t," she said.'
   '"Isn\'t," she said.'

The interpreter prints the result of string operations in the same way as they
are typed for input: inside quotes, and with quotes and other special characters
escaped by backslashes, to show the precise value. The string is enclosed in
double quotes if the string contains a single quote and no double quotes;
otherwise it is enclosed in single quotes. The :func:`print` function produces
more readable output by omitting the enclosing quotes and printing escaped and
special characters.

String literals can span multiple lines in several ways. Continuation lines can
be used, with a backslash as the last character on the line indicating that the
next line is a logical continuation of the line::

   hello = "This is a rather long string containing\n\
   several lines of text just as you would do in C.\n\
       Note that whitespace at the beginning of the line is\
    significant."

   print(hello)

Note that newlines still need to be embedded in the string using ``\n``; the
newline following the trailing backslash is discarded. This example would print
the following:

.. code-block:: text

   This is a rather long string containing
   several lines of text just as you would do in C.
       Note that whitespace at the beginning of the line is significant.

Or, strings can be surrounded in a pair of matching triple-quotes: ``"""`` or
``'''``. End of lines do not need to be escaped when using triple-quotes, but
they will be included in the string. In the following example the backslash
after the opening quotes keeps the initial newline out of the string::

   print("""\
   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to
   """)

produces the following output:

.. code-block:: text

   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to

If we make the string literal a "raw" string, ``\n`` sequences are not converted
to newlines, but the backslash at the end of the line, and the newline character
in the source, are both included in the string as data. Thus, the example::

   hello = r"This is a rather long string containing\n\
   several lines of text much as you would do in C."

   print(hello)

would print:

.. code-block:: text

   This is a rather long string containing\n\
   several lines of text much as you would do in C.
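
The difference between an ordinary and a raw literal is easiest to see by counting characters. A small sketch, using made-up variable names and a hypothetical Windows-style path:

```python
escaped = "a\nb"   # \n is translated into a single newline character
raw = r"a\nb"      # the backslash and the 'n' are kept as two characters

escaped_length = len(escaped)   # 'a', newline, 'b'
raw_length = len(raw)           # 'a', backslash, 'n', 'b'

# A raw literal is equivalent to doubling every backslash by hand.
same_text = raw == "a\\nb"

# Raw strings are handy where backslashes are data, e.g. this hypothetical path.
path = r"C:\new\table"
```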

Strings can be concatenated (glued together) with the ``+`` operator, and
repeated with ``*``::

   >>> word = 'Help' + 'A'
   >>> word
   'HelpA'
   >>> '<' + word*5 + '>'
   '<HelpAHelpAHelpAHelpAHelpA>'

Two string literals next to each other are automatically concatenated; the first
line above could also have been written ``word = 'Help' 'A'``. This only works
with two literals, not with arbitrary string expressions::

   >>> 'str' 'ing'             #  <-  This is ok
   'string'
   >>> 'str'.strip() + 'ing'   #  <-  This is ok
   'string'
   >>> 'str'.strip() 'ing'     #  <-  This is invalid
     File "<stdin>", line 1
       'str'.strip() 'ing'
                         ^
   SyntaxError: invalid syntax
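
One common use of adjacent-literal concatenation is breaking a long string across source lines inside parentheses. A short sketch with illustrative names; note that anything that is not a literal still needs ``+``:

```python
# Adjacent literals are joined when the code is compiled, so no operator is needed.
text = ('Put several strings within parentheses '
        'to have them joined together.')

# A variable is an expression, not a literal, so it must be joined with +.
prefix = 'Py'
joined = prefix + 'thon'
```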

Strings can be subscripted (indexed); like in C, the first character of a string
has subscript (index) 0. There is no separate character type; a character is
simply a string of size one. As in the Icon programming language, substrings
can be specified with the *slice notation*: two indices separated by a colon.
::

   >>> word[4]
   'A'
   >>> word[0:2]
   'He'
   >>> word[2:4]
   'lp'

Slice indices have useful defaults; an omitted first index defaults to zero, an
omitted second index defaults to the size of the string being sliced. ::

   >>> word[:2]    # The first two characters
   'He'
   >>> word[2:]    # Everything except the first two characters
   'lpA'

Unlike C strings, Python strings cannot be changed. Assigning to an indexed
position or to a slice of the string results in an error::

   >>> word[0] = 'x'
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: 'str' object does not support item assignment
   >>> word[:1] = 'Splat'
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   TypeError: 'str' object does not support item assignment

However, creating a new string with the combined content is easy and efficient::

   >>> 'x' + word[1:]
   'xelpA'
   >>> 'Splat' + word[4]
   'SplatA'

Here's a useful invariant of slice operations: ``s[:i] + s[i:]`` equals ``s``.
::

   >>> word[:2] + word[2:]
   'HelpA'
   >>> word[:3] + word[3:]
   'HelpA'
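
The invariant is not limited to in-range cut points. The sketch below checks it for negative and out-of-range values of ``i`` as well; ``split_and_rejoin`` is a hypothetical helper written for this illustration.

```python
def split_and_rejoin(s, i):
    """Cut s at index i and glue the two pieces back together."""
    return s[:i] + s[i:]

word = 'HelpA'

# Cut points from far left of the string to far right of it.
cut_points = range(-10, 11)
invariant_holds = all(split_and_rejoin(word, i) == word for i in cut_points)
```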

Degenerate slice indices are handled gracefully: an index that is too large is
replaced by the string size, an upper bound smaller than the lower bound returns
an empty string. ::

   >>> word[1:100]
   'elpA'
   >>> word[10:]
   ''
   >>> word[2:1]
   ''

Indices may be negative numbers, to start counting from the right. For example::

   >>> word[-1]     # The last character
   'A'
   >>> word[-2]     # The last-but-one character
   'p'
   >>> word[-2:]    # The last two characters
   'pA'
   >>> word[:-2]    # Everything except the last two characters
   'Hel'

But note that -0 is really the same as 0, so it does not count from the right!
::

   >>> word[-0]     # (since -0 equals 0)
   'H'

Out-of-range negative slice indices are truncated, but don't try this for
single-element (non-slice) indices::

   >>> word[-100:]
   'HelpA'
   >>> word[-10]    # error
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   IndexError: string index out of range

One way to remember how slices work is to think of the indices as pointing
*between* characters, with the left edge of the first character numbered 0.
Then the right edge of the last character of a string of *n* characters has
index *n*, for example::

    +---+---+---+---+---+
    | H | e | l | p | A |
    +---+---+---+---+---+
    0   1   2   3   4   5
   -5  -4  -3  -2  -1

The first row of numbers gives the position of the indices 0...5 in the string;
the second row gives the corresponding negative indices. The slice from *i* to
*j* consists of all characters between the edges labeled *i* and *j*,
respectively.

For non-negative indices, the length of a slice is the difference of the
indices, if both are within bounds. For example, the length of ``word[1:3]`` is
2.
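
Combining this rule with the clamping of degenerate indices gives a way to predict the length of any slice with non-negative indices. The function ``slice_length`` below is a hypothetical helper for illustration, not part of Python:

```python
def slice_length(s, i, j):
    """Predict len(s[i:j]) for non-negative indices i and j."""
    n = len(s)
    lo, hi = min(i, n), min(j, n)   # indices that are too large are clamped to n
    return max(0, hi - lo)          # an upper bound below the lower one gives ''

word = 'HelpA'
in_bounds = slice_length(word, 1, 3)      # both indices within bounds: 3 - 1
clamped = slice_length(word, 1, 100)      # upper index clamped to len(word)
reversed_bounds = slice_length(word, 2, 1)
```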

The built-in function :func:`len` returns the length of a string::

   >>> s = 'supercalifragilisticexpialidocious'
   >>> len(s)
   34


.. seealso::

   :ref:`typesseq`
      Strings are examples of *sequence types*, and support the common
      operations supported by such types.

   :ref:`string-methods`
      Strings support a large number of methods for
      basic transformations and searching.

   :ref:`string-formatting`
      Information about string formatting with :meth:`str.format` is described
      here.

   :ref:`old-string-formatting`
      The old formatting operations invoked when strings are
      the left operand of the ``%`` operator are described in more detail here.


.. _tut-unicodestrings:

About Unicode
-------------

.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com>


Starting with Python 3.0 all strings support Unicode (see
http://www.unicode.org/).

Unicode has the advantage of providing one ordinal for every character in every
script used in modern and ancient texts. Previously, there were only 256
possible ordinals for script characters. Texts were typically bound to a code
page which mapped the ordinals to script characters. This led to a great deal of
confusion, especially with respect to internationalization (usually written as
``i18n`` --- ``'i'`` + 18 characters + ``'n'``) of software. Unicode solves
these problems by defining one code page for all scripts.

If you want to include special characters in a string,
you can do so by using the Python *Unicode-Escape* encoding. The following
example shows how::

   >>> 'Hello\u0020World !'
   'Hello World !'

The escape sequence ``\u0020`` inserts the Unicode character with
the ordinal value 0x0020 (the space character) at the given position.

Other characters are interpreted by using their respective ordinal values
directly as Unicode ordinals. If you have literal strings in the standard
Latin-1 encoding that is used in many Western countries, you will find it
convenient that the lower 256 characters of Unicode are the same as the 256
characters of Latin-1.

Apart from these standard encodings, Python provides a whole set of other ways
of creating Unicode strings on the basis of a known encoding.

To convert a string into a sequence of bytes using a specific encoding,
string objects provide an :meth:`~str.encode` method that takes one argument, the
name of the encoding. Lowercase names for encodings are preferred. ::

   >>> "Äpfel".encode('utf-8')
   b'\xc3\x84pfel'
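
Encoding has a counterpart: a bytes object can be turned back into a string with its ``decode`` method. The following sketch round-trips the example above and also illustrates the Latin-1 remark; the variable names are illustrative, and the string is written with a ``\u`` escape so the snippet does not depend on the source file's encoding.

```python
text = '\u00c4pfel'                 # the same string as "Äpfel"

utf8_bytes = text.encode('utf-8')   # str -> bytes; the first character takes two bytes
round_trip = utf8_bytes.decode('utf-8')

# In Latin-1 every character in the first 256 code points is a single byte
# whose value equals the character's Unicode ordinal.
latin1_bytes = text.encode('latin-1')
same_ordinal = latin1_bytes[0] == ord(text[0])
```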

.. _tut-lists:

Lists
-----

Python knows a number of *compound* data types, used to group together other
values. The most versatile is the *list*, which can be written as a list of
comma-separated values (items) between square brackets. List items need not all
have the same type. ::

   >>> a = ['spam', 'eggs', 100, 1234]
   >>> a
   ['spam', 'eggs', 100, 1234]

Like string indices, list indices start at 0, and lists can be sliced,
concatenated and so on::

   >>> a[0]
   'spam'
   >>> a[3]
   1234
   >>> a[-2]
   100
   >>> a[1:-1]
   ['eggs', 100]
   >>> a[:2] + ['bacon', 2*2]
   ['spam', 'eggs', 'bacon', 4]
   >>> 3*a[:3] + ['Boo!']
   ['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boo!']

All slice operations return a new list containing the requested elements. This
means that the following slice returns a shallow copy of the list *a*::

   >>> a[:]
   ['spam', 'eggs', 100, 1234]
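
To see what "a new list" means in practice, it helps to contrast a slice copy with plain assignment, which only adds a second name for the same list. A minimal sketch with illustrative names (``menu``, ``alias`` and ``copy`` are not part of the tutorial session):

```python
menu = ['spam', 'eggs', 100, 1234]

alias = menu      # a second name for the very same list object
copy = menu[:]    # a new list holding the same items

copy[0] = 'ham'       # changes only the copy
alias[1] = 'bacon'    # changes menu too, because alias is menu
```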

Unlike strings, which are *immutable*, lists are mutable: it is possible to
change individual elements of a list::

   >>> a
   ['spam', 'eggs', 100, 1234]
   >>> a[2] = a[2] + 23
   >>> a
   ['spam', 'eggs', 123, 1234]

Assignment to slices is also possible, and this can even change the size of the
list or clear it entirely::

   >>> # Replace some items:
   ... a[0:2] = [1, 12]
   >>> a
   [1, 12, 123, 1234]
   >>> # Remove some:
   ... a[0:2] = []
   >>> a
   [123, 1234]
   >>> # Insert some:
   ... a[1:1] = ['bletch', 'xyzzy']
   >>> a
   [123, 'bletch', 'xyzzy', 1234]
   >>> # Insert (a copy of) itself at the beginning
   ... a[:0] = a
   >>> a
   [123, 'bletch', 'xyzzy', 1234, 123, 'bletch', 'xyzzy', 1234]
   >>> # Clear the list: replace all items with an empty list
   ... a[:] = []
   >>> a
   []

The built-in function :func:`len` also applies to lists::

   >>> a = ['a', 'b', 'c', 'd']
   >>> len(a)
   4

It is possible to nest lists (create lists containing other lists), for
example::

   >>> q = [2, 3]
   >>> p = [1, q, 4]
   >>> len(p)
   3
   >>> p[1]
   [2, 3]
   >>> p[1][0]
   2

You can add something to the end of the list::

   >>> p[1].append('xtra')
   >>> p
   [1, [2, 3, 'xtra'], 4]
   >>> q
   [2, 3, 'xtra']

Note that in the last example, ``p[1]`` and ``q`` really refer to the same
object! We'll come back to *object semantics* later.
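
The ``is`` operator makes this sharing visible. The sketch below also shows that a slice copy of the outer list still shares the inner list, and one possible way to avoid that by copying the inner list explicitly; ``p_copy`` and ``independent`` are illustrative names.

```python
q = [2, 3]
p = [1, q, 4]

same_object = p[1] is q       # the nested item is q itself, not a copy of it

p_copy = p[:]                 # shallow copy: new outer list, same inner list
shares_inner = p_copy[1] is q

independent = [1, list(q), 4]  # copy the inner list explicitly

q.append('xtra')              # visible through p and p_copy, not through independent
```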


.. _tut-firststeps:

First Steps Towards Programming
===============================

Of course, we can use Python for more complicated tasks than adding two and two
together. For instance, we can write an initial sub-sequence of the *Fibonacci*
series as follows::

   >>> # Fibonacci series:
   ... # the sum of two elements defines the next
   ... a, b = 0, 1
   >>> while b < 10:
   ...     print(b)
   ...     a, b = b, a+b
   ...
   1
   1
   2
   3
   5
   8

This example introduces several new features.

* The first line contains a *multiple assignment*: the variables ``a`` and ``b``
  simultaneously get the new values 0 and 1. On the last line this is used again,
  demonstrating that the expressions on the right-hand side are all evaluated
  first before any of the assignments take place. The right-hand side expressions
  are evaluated from the left to the right.

* The :keyword:`while` loop executes as long as the condition (here: ``b < 10``)
  remains true. In Python, like in C, any non-zero integer value is true; zero is
  false. The condition may also be a string or list value, in fact any sequence;
  anything with a non-zero length is true, empty sequences are false. The test
  used in the example is a simple comparison. The standard comparison operators
  are written the same as in C: ``<`` (less than), ``>`` (greater than), ``==``
  (equal to), ``<=`` (less than or equal to), ``>=`` (greater than or equal to)
  and ``!=`` (not equal to).

* The *body* of the loop is *indented*: indentation is Python's way of grouping
  statements. Python does not (yet!) provide an intelligent input line editing
  facility, so you have to type a tab or space(s) for each indented line. In
  practice you will prepare more complicated input for Python with a text editor;
  most text editors have an auto-indent facility. When a compound statement is
  entered interactively, it must be followed by a blank line to indicate
  completion (since the parser cannot guess when you have typed the last line).
  Note that each line within a basic block must be indented by the same amount.

* The :func:`print` function writes the value of the expression(s) it is
  given. It differs from just writing the expression you want to write (as we did
  earlier in the calculator examples) in the way it handles multiple
  expressions, floating point quantities,
  and strings. Strings are printed without quotes, and a space is inserted
  between items, so you can format things nicely, like this::

     >>> i = 256*256
     >>> print('The value of i is', i)
     The value of i is 65536

  The keyword argument *end* can be used to avoid the newline after the output,
  or end the output with a different string::

     >>> a, b = 0, 1
     >>> while b < 1000:
     ...     print(b, end=' ')
     ...     a, b = b, a+b
     ...
     1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987