.. _tut-io:

****************
Input and Output
****************

There are several ways to present the output of a program; data can be printed
in a human-readable form, or written to a file for future use. This chapter will
discuss some of the possibilities.


.. _tut-formatting:

Fancier Output Formatting
=========================

So far we've encountered two ways of writing values: *expression statements* and
the :keyword:`print` statement.  (A third way is using the :meth:`write` method
of file objects; the standard output file can be referenced as ``sys.stdout``.
See the Library Reference for more information on this.)

.. index:: module: string

Often you'll want more control over the formatting of your output than simply
printing space-separated values.  There are two ways to format your output.  The
first way is to do all the string handling yourself: using string slicing and
concatenation operations you can create any layout you can imagine.  String
objects have some methods that perform useful operations for padding
strings to a given column width; these will be discussed shortly.  The second
way is to use the :meth:`str.format` method.

One question remains, of course: how do you convert values to strings? Luckily,
Python has ways to convert any value to a string: pass it to the :func:`repr`
or :func:`str` functions.

The :func:`str` function is meant to return representations of values which are
fairly human-readable, while :func:`repr` is meant to generate representations
which can be read by the interpreter (or will force a :exc:`SyntaxError` if
there is no equivalent syntax).  For objects which don't have a particular
representation for human consumption, :func:`str` will return the same value as
:func:`repr`.  Many values, such as numbers or structures like lists and
dictionaries, have the same representation using either function.  Strings and
floating point numbers, in particular, have two distinct representations.

Some examples::

   >>> s = 'Hello, world.'
   >>> str(s)
   'Hello, world.'
   >>> repr(s)
   "'Hello, world.'"
   >>> str(0.1)
   '0.1'
   >>> repr(0.1)
   '0.10000000000000001'
   >>> x = 10 * 3.25
   >>> y = 200 * 200
   >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
   >>> print s
   The value of x is 32.5, and y is 40000...
   >>> # The repr() of a string adds string quotes and backslashes:
   ... hello = 'hello, world\n'
   >>> hellos = repr(hello)
   >>> print hellos
   'hello, world\n'
   >>> # The argument to repr() may be any Python object:
   ... repr((x, y, ('spam', 'eggs')))
   "(32.5, 40000, ('spam', 'eggs'))"
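
The claim that :func:`repr` output "can be read by the interpreter" can be checked
with a small sketch.  It feeds each representation back through :func:`eval`,
which is used here only as a demonstration on trusted literals; the variable
names are illustrative.

```python
# repr() aims for text the interpreter can read back;
# str() aims for text that is pleasant to read.
values = [42, 3.5, 'hello, world\n', (32.5, 40000, ('spam', 'eggs'))]

# Evaluating the repr() of each value should give back an equal value.
round_trips = [eval(repr(v)) == v for v in values]

text = 'hello, world\n'
str_length = len(str(text))     # the string itself: 13 characters
repr_length = len(repr(text))   # quotes added, newline escaped: 16 characters
```

Not every object round-trips this way; for many objects :func:`repr` only
returns a description enclosed in angle brackets.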

Here are two ways to write a table of squares and cubes::

   >>> for x in range(1, 11):
   ...     print repr(x).rjust(2), repr(x*x).rjust(3),
   ...     # Note trailing comma on previous line
   ...     print repr(x*x*x).rjust(4)
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

   >>> for x in range(1,11):
   ...     print '{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x)
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

(Note that in the first example, one space between each column was added by the
way :keyword:`print` works: it always adds spaces between its arguments.)

This example demonstrates the :meth:`rjust` method of string objects, which
right-justifies a string in a field of a given width by padding it with spaces
on the left.  There are similar methods :meth:`ljust` and :meth:`center`.  These
methods do not write anything; they just return a new string.  If the input
string is too long, they don't truncate it, but return it unchanged; this will
mess up your column layout but that's usually better than the alternative,
which would be lying about a value.  (If you really want truncation you can
always add a slice operation, as in ``x.ljust(n)[:n]``.)
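
A minimal sketch of the padding behaviour described above, including the
too-long case and the slice-based truncation; the sample string is arbitrary.

```python
name = 'Sjoerd'

padded = name.ljust(10)          # padded on the right to 10 characters
centered = name.center(10)       # padded equally on both sides
unchanged = name.ljust(3)        # field too narrow: returned as-is
truncated = name.ljust(3)[:3]    # explicit slice cuts it to the field width
```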

There is another method, :meth:`zfill`, which pads a numeric string on the left
with zeros.  It understands plus and minus signs::

   >>> '12'.zfill(5)
   '00012'
   >>> '-3.14'.zfill(7)
   '-003.14'
   >>> '3.14159265359'.zfill(5)
   '3.14159265359'

Basic usage of the :meth:`str.format` method looks like this::

   >>> print 'We are the {0} who say "{1}!"'.format('knights', 'Ni')
   We are the knights who say "Ni!"

The braces and characters within them (called format fields) are replaced with
the objects passed into the :meth:`~str.format` method.  The number in the
braces refers to the position of the object passed into the
:meth:`~str.format` method. ::

   >>> print '{0} and {1}'.format('spam', 'eggs')
   spam and eggs
   >>> print '{1} and {0}'.format('spam', 'eggs')
   eggs and spam

If keyword arguments are used in the :meth:`~str.format` method, their values
are referred to by using the name of the argument. ::

   >>> print 'This {food} is {adjective}.'.format(
   ...       food='spam', adjective='absolutely horrible')
   This spam is absolutely horrible.

Positional and keyword arguments can be arbitrarily combined::

   >>> print 'The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
   ...                                                    other='Georg')
   The story of Bill, Manfred, and Georg.

An optional ``':'`` and format specifier can follow the field name. This allows
greater control over how the value is formatted.  The following example
rounds Pi to three places after the decimal point. ::

   >>> import math
   >>> print 'The value of PI is approximately {0:.3f}.'.format(math.pi)
   The value of PI is approximately 3.142.
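
The ``.3f`` specifier rounds rather than cuts off digits, which is why the
output ends in ``2`` and not ``1``.  The sketch below contrasts the two; the
explicit truncation via :func:`math.floor` is only an illustration and is not
something :meth:`str.format` does for you.

```python
import math

rounded = '{0:.3f}'.format(math.pi)      # rounds to the nearest thousandth

# Cutting digits off by hand gives a different last digit.
cut_off = math.floor(math.pi * 1000) / 1000.0
truncated = '{0:.3f}'.format(cut_off)

# A width can be combined with the precision: 8 characters in total.
padded = '{0:8.3f}'.format(math.pi)
```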

Passing an integer after the ``':'`` will cause that field to be a minimum
number of characters wide.  This is useful for making tables pretty. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
   >>> for name, phone in table.items():
   ...     print '{0:10} ==> {1:10d}'.format(name, phone)
   ...
   Jack       ==>       4098
   Dcab       ==>       7678
   Sjoerd     ==>       4127
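
As the table shows, strings are padded on the right and numbers on the left by
default.  A short sketch of this, and of the fact that the width is only a
minimum:

```python
text_field = '{0:10}'.format('Jack')      # strings are left-aligned by default
number_field = '{0:10d}'.format(4098)     # numbers are right-aligned by default

# The width is a minimum: longer values are never cut off.
too_long = '{0:3}'.format('Sjoerd')

# '<' and '>' in the specifier override the default alignment.
left_number = '{0:<10d}'.format(4098)
right_text = '{0:>10}'.format('Jack')
```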

If you have a really long format string that you don't want to split up, it
would be nice if you could reference the variables to be formatted by name
instead of by position.  This can be done by simply passing the dict and using
square brackets ``'[]'`` to access the keys::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print ('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
   ...        'Dcab: {0[Dcab]:d}'.format(table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the
``**`` notation. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print 'Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table)
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in :func:`vars`
function, which returns a dictionary containing all local variables.
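
A minimal sketch of the :func:`vars` combination; ``describe`` is a
hypothetical helper written for this illustration.

```python
def describe(name, phone):
    # Called without arguments, vars() returns the local namespace as a
    # dictionary, here {'name': ..., 'phone': ...}, which ** then unpacks
    # into keyword arguments for format().
    return '{name}: {phone:d}'.format(**vars())

result = describe('Jack', 4098)
```

Every local variable becomes available to the format string this way, so the
field names must match the variable names exactly.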

For a complete overview of string formatting with :meth:`str.format`, see
:ref:`formatstrings`.


Old string formatting
---------------------

The ``%`` operator can also be used for string formatting. It interprets the
left argument much like a :cfunc:`sprintf`\ -style format string to be applied
to the right argument, and returns the string resulting from this formatting
operation. For example::

   >>> import math
   >>> print 'The value of PI is approximately %5.3f.' % math.pi
   The value of PI is approximately 3.142.
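
The right argument may also be a tuple (for several values) or a dictionary
(for named values).  A brief sketch with made-up sample data:

```python
single = 'Total: %d items' % 3

# Several values are passed as a tuple; '-' left-justifies within the width.
several = '%-10s ==> %5d' % ('Jack', 4098)

# With a dictionary, each conversion names its key in parentheses.
by_name = '%(name)s has %(count)d messages' % {'name': 'Sjoerd', 'count': 2}
```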

Since :meth:`str.format` is quite new, a lot of Python code still uses the ``%``
operator. However, because this old style of formatting will eventually be
removed from the language, :meth:`str.format` should generally be used.

More information can be found in the :ref:`string-formatting` section.


.. _tut-files:

Reading and Writing Files
=========================

.. index::
   builtin: open
   object: file

:func:`open` returns a file object, and is most commonly used with two
arguments: ``open(filename, mode)``.

::

   >>> f = open('/tmp/workfile', 'w')
   >>> print f
   <open file '/tmp/workfile', mode 'w' at 80a0960>

The first argument is a string containing the filename.  The second argument is
another string containing a few characters describing the way in which the file
will be used.  *mode* can be ``'r'`` when the file will only be read, ``'w'``
for only writing (an existing file with the same name will be erased), and
``'a'`` opens the file for appending; any data written to the file is
automatically added to the end.  ``'r+'`` opens the file for both reading and
writing. The *mode* argument is optional; ``'r'`` will be assumed if it's
omitted.
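
The effect of each mode can be seen in a small sketch.  It works on a scratch
file created with :mod:`tempfile` rather than ``/tmp/workfile``; the path and
the text written are arbitrary choices for the demonstration.

```python
import os
import tempfile

# A scratch location for the demonstration; the file name is arbitrary.
directory = tempfile.mkdtemp()
path = os.path.join(directory, 'workfile')

f = open(path, 'w')        # create the file (or erase it) for writing
f.write('first line\n')
f.close()

f = open(path, 'a')        # append: new data is added to the end
f.write('second line\n')
f.close()

f = open(path)             # mode omitted, so 'r' is assumed
contents = f.read()
f.close()

f = open(path, 'w')        # opening with 'w' again erases the old contents
f.close()
size_after_w = os.path.getsize(path)

# Clean up the scratch file and directory.
os.remove(path)
os.rmdir(directory)
```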

On Windows, ``'b'`` appended to the mode opens the file in binary mode, so there
are also modes like ``'rb'``, ``'wb'``, and ``'r+b'``.  Windows makes a
distinction between text and binary files; the end-of-line characters in text
files are automatically altered slightly when data is read or written.  This
behind-the-scenes modification to file data is fine for ASCII text files, but
it'll corrupt binary data like that in :file:`JPEG` or :file:`EXE` files.  Be
very careful to use binary mode when reading and writing such files.  On Unix,
it doesn't hurt to append a ``'b'`` to the mode, so you can use it
platform-independently for all binary files.
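
A sketch of a binary round trip.  The byte string is made up for the example
and deliberately contains end-of-line characters, which binary mode must leave
untouched on every platform.

```python
import os
import tempfile

# Arbitrary sample data that happens to contain '\r\n' and '\n' bytes.
data = b'\x00\x01\r\n\x02\n\xff'

fd, path = tempfile.mkstemp()   # scratch file for the demonstration
os.close(fd)

f = open(path, 'wb')            # binary mode: bytes are written unchanged
f.write(data)
f.close()

f = open(path, 'rb')            # binary mode: bytes are read unchanged
read_back = f.read()
f.close()

size_on_disk = os.path.getsize(path)
os.remove(path)
```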


.. _tut-filemethods:

Methods of File Objects
-----------------------

The rest of the examples in this section will assume that a file object called
``f`` has already been created.

To read a file's contents, call ``f.read(size)``, which reads some quantity of
data and returns it as a string.  *size* is an optional numeric argument.  When
*size* is omitted or negative, the entire contents of the file will be read and
returned; it's your problem if the file is twice as large as your machine's
memory. Otherwise, at most *size* bytes are read and returned.  If the end of
the file has been reached, ``f.read()`` will return an empty string (``""``).
::

   >>> f.read()
   'This is the entire file.\n'
   >>> f.read()
   ''
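
When a file may be too large to read in one call, the *size* argument and the
empty-string result can be combined into a loop.  This is a sketch only;
``count_bytes`` and its ``chunk_size`` parameter are hypothetical names.

```python
import os
import tempfile

def count_bytes(path, chunk_size=4096):
    """Add up a file's length by reading it one chunk at a time."""
    total = 0
    f = open(path, 'rb')
    try:
        while True:
            chunk = f.read(chunk_size)   # at most chunk_size bytes
            if not chunk:                # empty result: end of file reached
                break
            total += len(chunk)
    finally:
        f.close()
    return total

# Try it on a scratch file of known size.
fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'wb')
f.write(b'x' * 10000)
f.close()

total = count_bytes(path, 1024)
os.remove(path)
```

Only one chunk is held in memory at a time, whatever the size of the file.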

``f.readline()`` reads a single line from the file; a newline character (``\n``)
is left at the end of the string, and is only omitted on the last line of the
file if the file doesn't end in a newline.  This makes the return value
unambiguous; if ``f.readline()`` returns an empty string, the end of the file
has been reached, while a blank line is represented by ``'\n'``, a string
containing only a single newline.   ::

   >>> f.readline()
   'This is the first line of the file.\n'
   >>> f.readline()
   'Second line of the file\n'
   >>> f.readline()
   ''

``f.readlines()`` returns a list containing all the lines of data in the file.
If given an optional parameter *sizehint*, it reads that many bytes from the
file and enough more to complete a line, and returns the lines from that.  This
is often used to allow efficient reading of a large file by lines, but without
having to load the entire file in memory.  Only complete lines will be returned.
::

   >>> f.readlines()
   ['This is the first line of the file.\n', 'Second line of the file\n']
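
A sketch of batch-wise reading with *sizehint*.  How many lines each call
returns depends on the Python version and its internal buffering, so the
example relies only on two things: every call returns complete lines, and an
empty list signals the end of the file.  ``count_lines`` is a hypothetical
helper.

```python
import os
import tempfile

def count_lines(path, sizehint=100):
    """Count lines by reading them in batches of roughly sizehint bytes."""
    count = 0
    f = open(path)
    try:
        while True:
            lines = f.readlines(sizehint)   # a batch of complete lines
            if not lines:                   # empty list: end of file
                break
            count += len(lines)
    finally:
        f.close()
    return count

# Try it on a scratch file with a known number of lines.
fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'w')
for i in range(50):
    f.write('line %d\n' % i)
f.close()

line_count = count_lines(path)
os.remove(path)
```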

An alternative approach to reading lines is to loop over the file object. This is
memory efficient, fast, and leads to simpler code::

   >>> for line in f:
   ...     print line,
   ...
   This is the first line of the file.
   Second line of the file

The alternative approach is simpler but does not provide as fine-grained
control.  Since the two approaches manage line buffering differently, they
should not be mixed.

``f.write(string)`` writes the contents of *string* to the file, returning
``None``.   ::

   >>> f.write('This is a test\n')

To write something other than a string, it needs to be converted to a string
first::

   >>> value = ('the answer', 42)
   >>> s = str(value)
   >>> f.write(s)

``f.tell()`` returns an integer giving the file object's current position in the
file, measured in bytes from the beginning of the file.  To change the file
object's position, use ``f.seek(offset, from_what)``.  The position is computed
by adding *offset* to a reference point; the reference point is selected by
the *from_what* argument.  A *from_what* value of 0 measures from the beginning
of the file, 1 uses the current file position, and 2 uses the end of the file as
the reference point.  *from_what* can be omitted and defaults to 0, using the
beginning of the file as the reference point. ::

   >>> f = open('/tmp/workfile', 'r+')
   >>> f.write('0123456789abcdef')
   >>> f.seek(5)     # Go to the 6th byte in the file
   >>> f.read(1)
   '5'
   >>> f.seek(-3, 2) # Go to the 3rd byte before the end
   >>> f.read(1)
   'd'
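
The session above uses reference points 0 and 2.  The sketch below adds the
remaining case, 1 (the current position), and uses ``f.tell()`` to confirm
where each call lands.  It opens a scratch file in binary mode, which is an
assumption made so that byte offsets are exact on every platform.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()   # scratch file for the demonstration
os.close(fd)
f = open(path, 'wb')
f.write(b'0123456789abcdef')
f.close()

f = open(path, 'rb')
f.seek(4)                       # from_what omitted: 4 bytes from the start
pair = f.read(2)                # reads b'45'; the position is now 6
f.seek(3, 1)                    # 3 bytes forward from the current position
position = f.tell()             # 6 + 3
next_byte = f.read(1)
f.seek(0, 2)                    # offset 0 from the end of the file
size = f.tell()                 # so tell() now reports the file's length
f.close()
os.remove(path)
```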

When you're done with a file, call ``f.close()`` to close it and free up any
system resources taken up by the open file.  After calling ``f.close()``,
attempts to use the file object will automatically fail. ::

   >>> f.close()
   >>> f.read()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: I/O operation on closed file

It is good practice to use the :keyword:`with` keyword when dealing with file
objects.  This has the advantage that the file is properly closed after its
suite finishes, even if an exception is raised on the way.  It is also much
shorter than writing equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

   >>> with open('/tmp/workfile', 'r') as f:
   ...     read_data = f.read()
   >>> f.closed
   True
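
For comparison, here is a sketch of the longer :keyword:`try`\ -\ :keyword:`finally`
form, followed by a check that :keyword:`with` closes the file even when its
suite raises.  The scratch file and the simulated error are assumptions made
for the demonstration.

```python
import os
import tempfile

fd, path = tempfile.mkstemp()   # scratch file for the demonstration
os.close(fd)
f = open(path, 'w')
f.write('some data\n')
f.close()

# The same guarantee spelled out by hand: close() runs whatever happens.
f = open(path, 'r')
try:
    read_data = f.read()
finally:
    f.close()
closed_after_finally = f.closed

# The with statement closes the file even though the suite raises.
try:
    with open(path, 'r') as g:
        raise RuntimeError('simulated failure')
except RuntimeError:
    pass
closed_after_error = g.closed

os.remove(path)
```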

File objects have some additional methods, such as :meth:`~file.isatty` and
:meth:`~file.truncate` which are less frequently used; consult the Library
Reference for a complete guide to file objects.


.. _tut-pickle:

The :mod:`pickle` Module
------------------------

.. index:: module: pickle

Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123.  However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.

Rather than have users be constantly writing and debugging code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a string representation; this process
is called :dfn:`pickling`.  Reconstructing the object from the string
representation is called :dfn:`unpickling`.  Between pickling and unpickling,
the string representing the object may have been stored in a file or database,
or sent over a network connection to some distant machine.

If you have an object ``x``, and a file object ``f`` that's been opened for
writing, the simplest way to pickle the object takes only one line of code::

   pickle.dump(x, f)

To unpickle the object again, if ``f`` is a file object which has been opened
for reading::

   x = pickle.load(f)

(There are other variants of this, used when pickling many objects or when you
don't want to write the pickled data to a file; consult the complete
documentation for :mod:`pickle` in the Python Library Reference.)
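
A sketch of a complete round trip, first through a file and then through an
in-memory string using :func:`pickle.dumps` and :func:`pickle.loads`.  The
sample record and the scratch file are made up for the example, and the file
is opened in binary mode as an assumption so that the pickled data is never
altered by end-of-line translation.

```python
import os
import pickle
import tempfile

# An arbitrary nested object to save and restore.
record = {'name': 'Sjoerd', 'phones': [4127, 4098], 'point': (32.5, 40000)}

fd, path = tempfile.mkstemp()   # scratch file for the demonstration
os.close(fd)

f = open(path, 'wb')
pickle.dump(record, f)          # pickle the object into the file
f.close()

f = open(path, 'rb')
restored = pickle.load(f)       # rebuild an equal object from the file
f.close()
os.remove(path)

# The same round trip without a file.
data = pickle.dumps(record)
copy = pickle.loads(data)
```

The restored objects are equal to the original but are new, independent
objects.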

:mod:`pickle` is the standard way to make Python objects which can be stored and
reused by other programs or by a future invocation of the same program; the
technical term for this is a :dfn:`persistent` object.  Because :mod:`pickle` is
so widely used, many authors who write Python extensions take care to ensure
that new data types such as matrices can be properly pickled and unpickled.