.. _tut-io:

****************
Input and Output
****************

There are several ways to present the output of a program; data can be printed
in a human-readable form, or written to a file for future use. This chapter will
discuss some of the possibilities.


.. _tut-formatting:

Fancier Output Formatting
=========================

So far we've encountered two ways of writing values: *expression statements* and
the :func:`print` function. (A third way is using the :meth:`write` method
of file objects; the standard output file can be referenced as ``sys.stdout``.
See the Library Reference for more information on this.)

Often you'll want more control over the formatting of your output than simply
printing space-separated values. There are two ways to format your output. The
first is to do all the string handling yourself: using string slicing and
concatenation operations you can create any layout you can imagine. The
string type has some methods that perform useful operations for padding
strings to a given column width; these will be discussed shortly. The second
way is to use the :meth:`str.format` method.

The :mod:`string` module contains a :class:`~string.Template` class which offers
yet another way to substitute values into strings.

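As a rough illustration of :class:`~string.Template`, the sketch below fills in
``$``-style placeholders. The template text and the names ``who`` and ``what``
are made up for this example.

```python
from string import Template

# Placeholders are written as $name (or ${name} when followed by other text).
greeting = Template('$who likes $what')

# substitute() requires a value for every placeholder.
filled = greeting.substitute(who='tim', what='kung pao')
print(filled)

# safe_substitute() leaves unknown placeholders in place instead of raising KeyError.
partial = greeting.safe_substitute(who='tim')
print(partial)
```

Templates are mostly useful when the format string comes from a user or a
configuration file, since the placeholder syntax is deliberately simple.
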
One question remains, of course: how do you convert values to strings? Luckily,
Python has ways to convert any value to a string: pass it to the :func:`repr`
or :func:`str` functions.

The :func:`str` function is meant to return representations of values which are
fairly human-readable, while :func:`repr` is meant to generate representations
which can be read by the interpreter (or will force a :exc:`SyntaxError` if
there is no equivalent syntax). For objects which don't have a particular
representation for human consumption, :func:`str` will return the same value as
:func:`repr`. Many values, such as numbers or structures like lists and
dictionaries, have the same representation using either function. Strings, in
particular, have two distinct representations.

Some examples::

   >>> s = 'Hello, world.'
   >>> str(s)
   'Hello, world.'
   >>> repr(s)
   "'Hello, world.'"
   >>> str(1/7)
   '0.14285714285714285'
   >>> x = 10 * 3.25
   >>> y = 200 * 200
   >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
   >>> print(s)
   The value of x is 32.5, and y is 40000...
   >>> # The repr() of a string adds string quotes and backslashes:
   ... hello = 'hello, world\n'
   >>> hellos = repr(hello)
   >>> print(hellos)
   'hello, world\n'
   >>> # The argument to repr() may be any Python object:
   ... repr((x, y, ('spam', 'eggs')))
   "(32.5, 40000, ('spam', 'eggs'))"

Here are two ways to write a table of squares and cubes::

   >>> for x in range(1, 11):
   ...     print(repr(x).rjust(2), repr(x*x).rjust(3), end=' ')
   ...     # Note use of 'end' on previous line
   ...     print(repr(x*x*x).rjust(4))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

   >>> for x in range(1, 11):
   ...     print('{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

(Note that in the first example, one space between each column was added by the
way :func:`print` works: by default it separates its arguments with a single
space.)

This example demonstrates the :meth:`str.rjust` method of string
objects, which right-justifies a string in a field of a given width by padding
it with spaces on the left. There are similar methods :meth:`str.ljust` and
:meth:`str.center`. These methods do not write anything; they just return a
new string. If the input string is too long, they don't truncate it, but
return it unchanged; this will mess up your column layout, but that's usually
better than the alternative, which would be lying about a value. (If you
really want truncation you can always add a slice operation, as in
``x.ljust(n)[:n]``.)

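The padding methods described above can be sketched as follows. The sample
strings and the width of 8 are arbitrary choices for illustration.

```python
name = 'spam'

left = name.ljust(8)            # pad on the right with spaces
centered = name.center(8, '*')  # an optional fill character may be given
right = name.rjust(8, '.')      # pad on the left

# A string longer than the requested width is returned unchanged ...
too_long = 'hovercraft'.ljust(4)
# ... unless you slice the result yourself to force truncation.
truncated = 'hovercraft'.ljust(4)[:4]

for value in (left, centered, right, too_long, truncated):
    print(repr(value))
```
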
There is another method, :meth:`str.zfill`, which pads a numeric string on the
left with zeros. It understands plus and minus signs::

   >>> '12'.zfill(5)
   '00012'
   >>> '-3.14'.zfill(7)
   '-003.14'
   >>> '3.14159265359'.zfill(5)
   '3.14159265359'

Basic usage of the :meth:`str.format` method looks like this::

   >>> print('We are the {} who say "{}!"'.format('knights', 'Ni'))
   We are the knights who say "Ni!"

The brackets and characters within them (called format fields) are replaced with
the objects passed into the :meth:`str.format` method. A number in the
brackets can be used to refer to the position of the object passed into the
:meth:`str.format` method. ::

   >>> print('{0} and {1}'.format('spam', 'eggs'))
   spam and eggs
   >>> print('{1} and {0}'.format('spam', 'eggs'))
   eggs and spam

If keyword arguments are used in the :meth:`str.format` method, their values
are referred to by using the name of the argument. ::

   >>> print('This {food} is {adjective}.'.format(
   ...       food='spam', adjective='absolutely horrible'))
   This spam is absolutely horrible.

Positional and keyword arguments can be arbitrarily combined::

   >>> print('The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
   ...                                                    other='Georg'))
   The story of Bill, Manfred, and Georg.

``'!a'`` (apply :func:`ascii`), ``'!s'`` (apply :func:`str`) and ``'!r'``
(apply :func:`repr`) can be used to convert the value before it is formatted::

   >>> import math
   >>> print('The value of PI is approximately {}.'.format(math.pi))
   The value of PI is approximately 3.141592653589793.
   >>> print('The value of PI is approximately {!r}.'.format(math.pi))
   The value of PI is approximately 3.141592653589793.
   >>> # str() and repr() agree for floats; for strings, '!r' adds quotes:
   >>> print('The food is {!r}.'.format('spam'))
   The food is 'spam'.

An optional ``':'`` and format specifier can follow the field name. This allows
greater control over how the value is formatted. The following example
rounds Pi to three places after the decimal point::

   >>> import math
   >>> print('The value of PI is approximately {0:.3f}.'.format(math.pi))
   The value of PI is approximately 3.142.

Passing an integer after the ``':'`` will cause that field to be a minimum
number of characters wide. This is useful for making tables pretty. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
   >>> for name, phone in table.items():
   ...     print('{0:10} ==> {1:10d}'.format(name, phone))
   ...
   Sjoerd     ==>       4127
   Jack       ==>       4098
   Dcab       ==>       7678

If you have a really long format string that you don't want to split up, it
would be nice if you could reference the variables to be formatted by name
instead of by position. This can be done by simply passing the dict and using
square brackets ``'[]'`` to access the keys::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
   ...       'Dcab: {0[Dcab]:d}'.format(table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the
``**`` notation. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in function
:func:`vars`, which returns a dictionary containing all local variables.

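A minimal sketch of the :func:`vars` idiom, assuming a hypothetical function
``describe`` whose local variables ``name`` and ``phone`` exist only for this
illustration:

```python
def describe():
    # Hypothetical local variables used only for this illustration.
    name = 'Jack'
    phone = 4098
    # With no argument, vars() returns the local namespace as a dictionary,
    # which ** then unpacks into keyword arguments for format().
    return '{name}: {phone:d}'.format(**vars())

line = describe()
print(line)
```
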
For a complete overview of string formatting with :meth:`str.format`, see
:ref:`formatstrings`.


Old string formatting
---------------------

The ``%`` operator can also be used for string formatting. It interprets the
left argument much like a :c:func:`sprintf`\ -style format string to be applied
to the right argument, and returns the string resulting from this formatting
operation. For example::

   >>> import math
   >>> print('The value of PI is approximately %5.3f.' % math.pi)
   The value of PI is approximately 3.142.

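When more than one value is formatted with ``%``, the right argument is a
tuple, or a dictionary if the format string uses ``%(name)s``-style fields.
The following sketch uses made-up sample values:

```python
# Several values are passed as a tuple, matched to the fields by position.
row = '%-10s ==> %5d' % ('Jack', 4098)
print(row)

# With a dictionary, each field names the key whose value it wants.
by_name = '%(name)s has %(count)d items' % {'name': 'Sjoerd', 'count': 3}
print(by_name)
```
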
A lot of existing Python code still uses the ``%`` operator, and it remains
part of the language. For new code, however, :meth:`str.format` should
generally be preferred.

More information can be found in the :ref:`old-string-formatting` section.


.. _tut-files:

Reading and Writing Files
=========================

.. index::
   builtin: open
   object: file

:func:`open` returns a :term:`file object`, and is most commonly used with
two arguments: ``open(filename, mode)``.

::

   >>> f = open('/tmp/workfile', 'w')
   >>> print(f)
   <_io.TextIOWrapper name='/tmp/workfile' mode='w' encoding='UTF-8'>

.. The encoding shown in the repr depends on the platform.

The first argument is a string containing the filename. The second argument is
another string containing a few characters describing the way in which the file
will be used. *mode* can be ``'r'`` when the file will only be read, ``'w'``
for only writing (an existing file with the same name will be erased), and
``'a'`` for appending; any data written to the file is automatically added to
the end. ``'r+'`` opens the file for both reading and writing. The *mode*
argument is optional; ``'r'`` will be assumed if it's omitted.

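The difference between ``'w'`` and ``'a'`` can be seen in a small experiment.
This is only a sketch: it writes to a throwaway temporary directory rather
than ``/tmp/workfile``, and the file contents are invented.

```python
import os
import tempfile

# A temporary directory stands in for /tmp so the example cleans up after itself.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')

    with open(path, 'w') as f:   # creates the file
        f.write('first\n')
    with open(path, 'a') as f:   # adds to the end
        f.write('second\n')
    with open(path, 'w') as f:   # erases what was there before
        f.write('replaced\n')
    with open(path, 'a') as f:
        f.write('appended\n')

    with open(path) as f:        # mode omitted, so 'r' is assumed
        contents = f.read()

print(contents, end='')
```
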
Normally, files are opened in :dfn:`text mode`, which means that you read and
write strings from and to the file, which are encoded in a specific encoding
(the default is platform dependent; pass the *encoding* argument to
:func:`open` to choose one explicitly). ``'b'`` appended to the mode opens the
file in :dfn:`binary mode`: now the data is read and written in the form of
bytes objects. This mode should be used for all files that don't contain text.

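To make the distinction concrete, the sketch below writes one short string and
reads it back in both modes. The temporary path and the sample word are
assumptions made for the example; the encoding is given explicitly so the
result does not depend on the platform.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')

    # Text mode: str objects go in and are encoded on the way to disk.
    with open(path, 'w', encoding='utf-8') as f:
        f.write('caf\u00e9')

    # Text mode again: the bytes are decoded back into a str.
    with open(path, 'r', encoding='utf-8') as f:
        text = f.read()

    # Binary mode: the raw bytes are returned untouched.
    with open(path, 'rb') as f:
        raw = f.read()

print(repr(text), len(text))
print(repr(raw), len(raw))
```

The string has four characters, but the accented letter occupies two bytes in
UTF-8, so the bytes object is one element longer.
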
In text mode, the default is to convert platform-specific line endings (``\n``
on Unix, ``\r\n`` on Windows) to just ``\n`` on reading and ``\n`` back to
platform-specific line endings on writing. This behind-the-scenes modification
to file data is fine for text files, but will corrupt binary data like that in
:file:`JPEG` or :file:`EXE` files. Be very careful to use binary mode when
reading and writing such files.


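The line-ending conversion can be observed on any platform by asking for
Windows-style endings explicitly through the *newline* argument of
:func:`open`. This is an illustrative sketch; the temporary path and the
sample text are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')

    # newline='\r\n' forces each '\n' written to become '\r\n' on disk,
    # which is what text mode does by default on Windows.
    with open(path, 'w', newline='\r\n') as f:
        f.write('one\ntwo\n')

    # Binary mode shows what is really stored in the file.
    with open(path, 'rb') as f:
        raw = f.read()

    # Reading in text mode converts the line endings back to '\n'.
    with open(path, 'r') as f:
        text = f.read()

print(repr(raw))
print(repr(text))
```
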
.. _tut-filemethods:

Methods of File Objects
-----------------------

The rest of the examples in this section will assume that a file object called
``f`` has already been created.

To read a file's contents, call ``f.read(size)``, which reads some quantity of
data and returns it as a string (in text mode) or bytes object (in binary
mode). *size* is an optional numeric argument. When *size* is omitted or
negative, the entire contents of the file will be read and returned; it's your
problem if the file is twice as large as your machine's memory. Otherwise, at
most *size* characters (in text mode) or *size* bytes (in binary mode) are read
and returned. If the end of the file has been reached, ``f.read()`` will return
an empty string (``''``). ::

   >>> f.read()
   'This is the entire file.\n'
   >>> f.read()
   ''

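One common use of the *size* argument is reading a file in fixed-size pieces
until ``read()`` returns an empty string. In this sketch an
:class:`io.StringIO` object stands in for a real text file, and the chunk size
of 10 is arbitrary.

```python
import io

# io.StringIO behaves like a text file held in memory.
f = io.StringIO('This is the entire file.\n')

chunks = []
while True:
    chunk = f.read(10)   # at most 10 characters per call
    if chunk == '':      # an empty string signals end of file
        break
    chunks.append(chunk)

print(chunks)
```
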
``f.readline()`` reads a single line from the file; a newline character (``\n``)
is left at the end of the string, and is only omitted on the last line of the
file if the file doesn't end in a newline. This makes the return value
unambiguous; if ``f.readline()`` returns an empty string, the end of the file
has been reached, while a blank line is represented by ``'\n'``, a string
containing only a single newline. ::

   >>> f.readline()
   'This is the first line of the file.\n'
   >>> f.readline()
   'Second line of the file\n'
   >>> f.readline()
   ''

``f.readlines()`` returns a list containing all the lines of data in the file.
If given an optional parameter *sizehint*, it reads that many bytes from the
file and enough more to complete a line, and returns the lines from that. This
is often used to allow efficient reading of a large file by lines, but without
having to load the entire file in memory. Only complete lines will be returned.
::

   >>> f.readlines()
   ['This is the first line of the file.\n', 'Second line of the file\n']

An alternative approach to reading lines is to loop over the file object. This is
memory efficient, fast, and leads to simpler code::

   >>> for line in f:
   ...     print(line, end='')
   ...
   This is the first line of the file.
   Second line of the file

The alternative approach is simpler but does not provide as fine-grained
control. Since the two approaches manage line buffering differently, they
should not be mixed.

``f.write(string)`` writes the contents of *string* to the file, returning
the number of characters written. ::

   >>> f.write('This is a test\n')
   15

To write something other than a string, it needs to be converted to a string
first::

   >>> value = ('the answer', 42)
   >>> s = str(value)
   >>> f.write(s)
   18

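Putting the pieces together, a list of numbers can be saved one per line and
read back by converting each line with :func:`int`. This is a sketch under the
assumption that one value per line is an acceptable file layout; the path and
the numbers are invented.

```python
import os
import tempfile

numbers = [4127, 4098, 7678]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')

    # Each number is converted to a string and written on its own line.
    with open(path, 'w') as f:
        for n in numbers:
            f.write('{}\n'.format(n))

    # int() ignores surrounding whitespace, including the trailing newline.
    with open(path) as f:
        restored = [int(line) for line in f]

print(restored)
```
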
``f.tell()`` returns an integer giving the file object's current position in the
file, measured in bytes from the beginning of the file when in binary mode and
as an opaque number when in text mode. To change the file
object's position, use ``f.seek(offset, from_what)``. The position is computed
by adding *offset* to a reference point; the reference point is selected by
the *from_what* argument. A *from_what* value of 0 measures from the beginning
of the file, 1 uses the current file position, and 2 uses the end of the file as
the reference point. *from_what* can be omitted and defaults to 0, using the
beginning of the file as the reference point. ::

   >>> f = open('/tmp/workfile', 'rb+')
   >>> f.write(b'0123456789abcdef')
   16
   >>> f.seek(5)     # Go to the 6th byte in the file
   5
   >>> f.read(1)
   b'5'
   >>> f.seek(-3, 2) # Go to the 3rd byte before the end
   13
   >>> f.read(1)
   b'd'

In text files (those opened without a ``b`` in the mode string), only seeks
relative to the beginning of the file are allowed (the exception being seeking
to the very file end with ``seek(0, 2)``).

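In practice, the safe pattern in text mode is to remember a position returned
by ``f.tell()`` and later pass it back to ``f.seek()``. The sketch below
assumes a small two-line file created just for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('first\nsecond\n')

    with open(path, 'r', encoding='utf-8') as f:
        f.readline()           # consume the first line
        mark = f.tell()        # remember where the second line starts
        rest = f.readline()
        f.seek(mark)           # return to the remembered position
        again = f.readline()   # the same line is read a second time
        f.seek(0, 2)           # jumping to the very end is also allowed
        tail = f.read()

print(repr(rest), repr(again), repr(tail))
```
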
When you're done with a file, call ``f.close()`` to close it and free up any
system resources taken up by the open file. After calling ``f.close()``,
attempts to use the file object will automatically fail. ::

   >>> f.close()
   >>> f.read()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: I/O operation on closed file.

It is good practice to use the :keyword:`with` keyword when dealing with file
objects. This has the advantage that the file is properly closed after its
suite finishes, even if an exception is raised on the way. It is also much
shorter than writing equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

   >>> with open('/tmp/workfile', 'r') as f:
   ...     read_data = f.read()
   >>> f.closed
   True

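For comparison, here is roughly what the longer :keyword:`try`\ -\
:keyword:`finally` form looks like. It is a sketch only: the temporary path
replaces ``/tmp/workfile``, and the file is created first so that there is
something to read.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')   # stand-in for /tmp/workfile
    with open(path, 'w') as out:
        out.write('some data\n')

    f = open(path, 'r')
    try:
        read_data = f.read()
    finally:
        f.close()   # runs whether or not read() raised an exception

print(f.closed)
```
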
File objects have some additional methods, such as :meth:`~file.isatty` and
:meth:`~file.truncate`, which are less frequently used; consult the Library
Reference for a complete guide to file objects.


.. _tut-pickle:

The :mod:`pickle` Module
------------------------

.. index:: module: pickle

Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.

Rather than have users constantly writing and debugging code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a byte-string representation; this
process is called :dfn:`pickling`. Reconstructing the object from the
byte-string representation is called :dfn:`unpickling`. Between pickling and
unpickling, the bytes representing the object may have been stored in a file
or other data store, or sent over a network connection to some distant machine.

If you have an object ``x``, and a file object ``f`` that's been opened for
writing in binary mode, the simplest way to pickle the object takes only one
line of code::

   pickle.dump(x, f)

To unpickle the object again, if ``f`` is a file object which has been opened
for reading in binary mode::

   x = pickle.load(f)

(There are other variants of this, used when pickling many objects or when you
don't want to write the pickled data to a file; consult the complete
documentation for :mod:`pickle` in the Python Library Reference.)

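A complete round trip might look like the following sketch. The dictionary
``data`` and the file name ``data.pickle`` are invented for the example;
``pickle.dumps`` and ``pickle.loads`` are the in-memory variants that work on
bytes objects instead of files.

```python
import os
import pickle
import tempfile

data = {'name': 'Sjoerd', 'phones': [4127, 4098], 'ratio': 1 / 7}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'data.pickle')

    # Pickled data is bytes, so the file is opened in binary mode.
    with open(path, 'wb') as f:
        pickle.dump(data, f)

    with open(path, 'rb') as f:
        restored = pickle.load(f)

# The same conversion without a file: dumps() returns a bytes object.
blob = pickle.dumps(data)
copy = pickle.loads(blob)

print(restored == data, type(blob).__name__)
```

The restored object is equal to the original but is a new object, not the
same one.
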
:mod:`pickle` is the standard way to make Python objects which can be stored and
reused by other programs or by a future invocation of the same program; the
technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is
so widely used, many authors who write Python extensions take care to ensure
that new data types such as matrices can be properly pickled and unpickled.

