.. _tut-io:

****************
Input and Output
****************

There are several ways to present the output of a program; data can be printed
in a human-readable form, or written to a file for future use. This chapter will
discuss some of the possibilities.


.. _tut-formatting:

Fancier Output Formatting
=========================

So far we've encountered two ways of writing values: *expression statements* and
the :func:`print` function. (A third way is using the :meth:`write` method
of file objects; the standard output file can be referenced as ``sys.stdout``.
See the Library Reference for more information on this.)

Often you'll want more control over the formatting of your output than simply
printing space-separated values. There are several ways to format output.

* To use :ref:`formatted string literals <tut-f-strings>`, begin a string
  with ``f`` or ``F`` before the opening quotation mark or triple quotation mark.
  Inside this string, you can write a Python expression between ``{`` and ``}``
  characters that can refer to variables or literal values.

  ::

     >>> year = 2016
     >>> event = 'Referendum'
     >>> f'Results of the {year} {event}'
     'Results of the 2016 Referendum'

* The :meth:`str.format` method of strings requires more manual
  effort. You'll still use ``{`` and ``}`` to mark where a variable
  will be substituted and can provide detailed formatting directives,
  but you'll also need to provide the information to be formatted.

  ::

     >>> yes_votes = 42_572_654
     >>> no_votes = 43_132_495
     >>> percentage = yes_votes / (yes_votes + no_votes)
     >>> '{:-9} YES votes {:2.2%}'.format(yes_votes, percentage)
     ' 42572654 YES votes 49.67%'

* Finally, you can do all the string handling yourself by using string slicing and
  concatenation operations to create any layout you can imagine. The
  string type has some methods that perform useful operations for padding
  strings to a given column width.

When you don't need fancy output but just want a quick display of some
variables for debugging purposes, you can convert any value to a string with
the :func:`repr` or :func:`str` functions.

The :func:`str` function is meant to return representations of values which are
fairly human-readable, while :func:`repr` is meant to generate representations
which can be read by the interpreter (or will force a :exc:`SyntaxError` if
there is no equivalent syntax). For objects which don't have a particular
representation for human consumption, :func:`str` will return the same value as
:func:`repr`. Many values, such as numbers or structures like lists and
dictionaries, have the same representation using either function. Strings, in
particular, have two distinct representations.

Some examples::

   >>> s = 'Hello, world.'
   >>> str(s)
   'Hello, world.'
   >>> repr(s)
   "'Hello, world.'"
   >>> str(1/7)
   '0.14285714285714285'
   >>> x = 10 * 3.25
   >>> y = 200 * 200
   >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
   >>> print(s)
   The value of x is 32.5, and y is 40000...
   >>> # The repr() of a string adds string quotes and backslashes:
   ... hello = 'hello, world\n'
   >>> hellos = repr(hello)
   >>> print(hellos)
   'hello, world\n'
   >>> # The argument to repr() may be any Python object:
   ... repr((x, y, ('spam', 'eggs')))
   "(32.5, 40000, ('spam', 'eggs'))"

The :mod:`string` module contains a :class:`~string.Template` class that offers
yet another way to substitute values into strings, using placeholders like
``$x`` and replacing them with values from a dictionary, but offers much less
control of the formatting.
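
As a minimal sketch of the :class:`~string.Template` approach, the snippet
below fills ``$``-placeholders from a dictionary. The placeholder names and
values are made up for illustration only.

```python
from string import Template

# Hypothetical placeholder names and values, chosen only for illustration.
template = Template('$who likes $what')
values = {'who': 'Tim', 'what': 'kung pao'}

# substitute() replaces every placeholder and raises KeyError if one is missing.
filled = template.substitute(values)
print(filled)

# safe_substitute() leaves unknown placeholders untouched instead of raising.
partial = Template('$who owes $amount').safe_substitute(who='Tim')
print(partial)
```

Note that there is no way to ask for a field width or a number of decimal
places here; that is the reduced control mentioned above.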


.. _tut-f-strings:

Formatted String Literals
-------------------------

:ref:`Formatted string literals <f-strings>` (also called f-strings for
short) let you include the value of Python expressions inside a string by
prefixing the string with ``f`` or ``F`` and writing expressions as
``{expression}``.

An optional format specifier can follow the expression. This allows greater
control over how the value is formatted. The following example rounds pi to
three places after the decimal::

   >>> import math
   >>> print(f'The value of pi is approximately {math.pi:.3f}.')
   The value of pi is approximately 3.142.

Passing an integer after the ``':'`` will cause that field to be a minimum
number of characters wide. This is useful for making columns line up. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
   >>> for name, phone in table.items():
   ...     print(f'{name:10} ==> {phone:10d}')
   ...
   Sjoerd     ==>       4127
   Jack       ==>       4098
   Dcab       ==>       7678

Other modifiers can be used to convert the value before it is formatted.
``'!a'`` applies :func:`ascii`, ``'!s'`` applies :func:`str`, and ``'!r'``
applies :func:`repr`::

   >>> animals = 'eels'
   >>> print(f'My hovercraft is full of {animals}.')
   My hovercraft is full of eels.
   >>> print(f'My hovercraft is full of {animals!r}.')
   My hovercraft is full of 'eels'.

For a reference on these format specifications, see
the reference guide for the :ref:`formatspec`.
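
The conversions and format specifiers described above can also be combined in
one field. The following sketch uses made-up sample values (``name`` and
``width`` are not part of the tutorial's examples) to show ``!a``, a
conversion followed by a specifier, and a width taken from another variable.

```python
import math

name = 'Zoë'    # hypothetical sample value with a non-ASCII character
width = 12      # hypothetical field width

# !a applies ascii(), which escapes non-ASCII characters.
escaped = f'{name!a}'

# A conversion may be followed by a specifier: repr() first, then
# right-align the result in a field ten characters wide.
padded = f'{name!r:>10}'

# Parts of the specifier can come from expressions in nested braces.
pi_text = f'{math.pi:{width}.4f}'

print(escaped)
print(padded)
print(pi_text)
```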

.. _tut-string-format:

The String format() Method
--------------------------

Basic usage of the :meth:`str.format` method looks like this::

   >>> print('We are the {} who say "{}!"'.format('knights', 'Ni'))
   We are the knights who say "Ni!"

The brackets and characters within them (called format fields) are replaced with
the objects passed into the :meth:`str.format` method. A number in the
brackets can be used to refer to the position of the object passed into the
:meth:`str.format` method. ::

   >>> print('{0} and {1}'.format('spam', 'eggs'))
   spam and eggs
   >>> print('{1} and {0}'.format('spam', 'eggs'))
   eggs and spam

If keyword arguments are used in the :meth:`str.format` method, their values
are referred to by using the name of the argument. ::

   >>> print('This {food} is {adjective}.'.format(
   ...       food='spam', adjective='absolutely horrible'))
   This spam is absolutely horrible.

Positional and keyword arguments can be arbitrarily combined::

   >>> print('The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
   ...                                                    other='Georg'))
   The story of Bill, Manfred, and Georg.

If you have a really long format string that you don't want to split up, it
would be nice if you could reference the variables to be formatted by name
instead of by position. This can be done by simply passing the dict and using
square brackets ``'[]'`` to access the keys. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
   ...       'Dcab: {0[Dcab]:d}'.format(table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the
``**`` notation. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in function
:func:`vars`, which returns a dictionary containing all local variables.
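
A small sketch of the :func:`vars` combination follows. The function name
``describe`` and its arguments are hypothetical; the point is only that the
local names become available as keyword arguments to :meth:`str.format`.

```python
def describe(name, phone):
    # Called without arguments inside a function, vars() returns the local
    # namespace, here a dictionary with the keys 'name' and 'phone'.
    return '{name:10} ==> {phone:10d}'.format(**vars())

line = describe('Jack', 4098)
print(line)
```

A formatted string literal gives the same result with less ceremony, so this
pattern is mostly of interest when the format string is built or stored
separately from the values.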

As an example, the following lines produce a tidily-aligned
set of columns giving integers and their squares and cubes::

   >>> for x in range(1, 11):
   ...     print('{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

For a complete overview of string formatting with :meth:`str.format`, see
:ref:`formatstrings`.


Manual String Formatting
------------------------

Here's the same table of squares and cubes, formatted manually::

   >>> for x in range(1, 11):
   ...     print(repr(x).rjust(2), repr(x*x).rjust(3), end=' ')
   ...     # Note use of 'end' on previous line
   ...     print(repr(x*x*x).rjust(4))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

(Note that the one space between each column was added by the
way :func:`print` works: it always adds spaces between its arguments.)

The :meth:`str.rjust` method of string objects right-justifies a string in a
field of a given width by padding it with spaces on the left. There are
similar methods :meth:`str.ljust` and :meth:`str.center`. These methods do
not write anything; they just return a new string. If the input string is too
long, they don't truncate it, but return it unchanged; this will mess up your
column layout, but that's usually better than the alternative, which would be
lying about a value. (If you really want truncation you can always add a
slice operation, as in ``x.ljust(n)[:n]``.)
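
The padding methods can be compared side by side in a short sketch. The
sample strings are arbitrary; the optional fill-character argument shown for
:meth:`str.rjust` is accepted by :meth:`str.ljust` and :meth:`str.center`
as well.

```python
word = 'spam'
long_word = 'hovercraft'

left = word.ljust(8)          # pad on the right
centered = word.center(8)     # pad on both sides
dotted = word.rjust(8, '.')   # optional second argument: the fill character

unchanged = long_word.ljust(4)       # too long: returned as-is, not truncated
truncated = long_word.ljust(4)[:4]   # an explicit slice enforces the width

for text in (left, centered, dotted, unchanged, truncated):
    print(repr(text))
```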

There is another method, :meth:`str.zfill`, which pads a numeric string on the
left with zeros. It understands plus and minus signs::

   >>> '12'.zfill(5)
   '00012'
   >>> '-3.14'.zfill(7)
   '-003.14'
   >>> '3.14159265359'.zfill(5)
   '3.14159265359'


Old string formatting
---------------------

The ``%`` operator can also be used for string formatting. It interprets the
left argument much like a :c:func:`sprintf`\ -style format string to be applied
to the right argument, and returns the string resulting from this formatting
operation. For example::

   >>> import math
   >>> print('The value of pi is approximately %5.3f.' % math.pi)
   The value of pi is approximately 3.142.

More information can be found in the :ref:`old-string-formatting` section.
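
When more than one value is involved, the right argument of ``%`` is a tuple
or a mapping. The sketch below uses made-up values to show both forms, plus
the ``%%`` escape for a literal percent sign.

```python
# Several values are passed as a tuple, matched to the fields in order.
line = '%-10s ==> %5d' % ('Jack', 4098)

# Named fields take their values from a mapping.
named = '%(name)s has %(count)d messages' % {'name': 'Jack', 'count': 3}

# A literal percent sign is written as %%.
ratio = '%.1f%%' % 49.67

print(line)
print(named)
print(ratio)
```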


.. _tut-files:

Reading and Writing Files
=========================

.. index::
   builtin: open
   object: file

:func:`open` returns a :term:`file object`, and is most commonly used with
two arguments: ``open(filename, mode)``.

::

   >>> f = open('workfile', 'w')
   >>> print(f)
   <_io.TextIOWrapper name='workfile' mode='w' encoding='UTF-8'>

The first argument is a string containing the filename. The second argument is
another string containing a few characters describing the way in which the file
will be used. *mode* can be ``'r'`` when the file will only be read, ``'w'``
for only writing (an existing file with the same name will be erased), and
``'a'`` opens the file for appending; any data written to the file is
automatically added to the end. ``'r+'`` opens the file for both reading and
writing. The *mode* argument is optional; ``'r'`` will be assumed if it's
omitted.
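
The effect of the different modes can be tried out safely in a temporary
directory. This is only an illustrative sketch: the use of :mod:`tempfile`
and the sample strings are assumptions made here so that no real file is
touched.

```python
import os
import tempfile

# Work in a throwaway directory so that no existing file is modified.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')

    with open(path, 'w') as f:      # create the file (or erase an old one)
        f.write('first\n')
    with open(path, 'a') as f:      # append to the end
        f.write('second\n')
    with open(path) as f:           # mode defaults to 'r'
        after_append = f.read()

    with open(path, 'w') as f:      # 'w' erases the previous contents
        f.write('replaced\n')
    with open(path, 'r+') as f:     # read and write, without erasing
        before = f.read()
        f.write('added\n')          # position is at the end after read()
        f.seek(0)
        after_update = f.read()

print(repr(after_append))
print(repr(before))
print(repr(after_update))
```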

Normally, files are opened in :dfn:`text mode`, which means that you read and
write strings from and to the file, which are encoded in a specific encoding. If
encoding is not specified, the default is platform dependent (see
:func:`open`). ``'b'`` appended to the mode opens the file in
:dfn:`binary mode`: now the data is read and written in the form of bytes
objects. This mode should be used for all files that don't contain text.

In text mode, the default when reading is to convert platform-specific line
endings (``\n`` on Unix, ``\r\n`` on Windows) to just ``\n``. When writing in
text mode, the default is to convert occurrences of ``\n`` back to
platform-specific line endings. This behind-the-scenes modification
to file data is fine for text files, but will corrupt binary data like that in
:file:`JPEG` or :file:`EXE` files. Be very careful to use binary mode when
reading and writing such files.
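
The difference between the two modes can be observed directly. In this
sketch the file names and the byte values are arbitrary, and a temporary
directory is used as an assumption to keep the example self-contained.

```python
import os
import tempfile

payload = bytes([0x89, 0x0D, 0x0A, 0x00, 0xFF])   # arbitrary non-text bytes

with tempfile.TemporaryDirectory() as tmp:
    # Binary mode: bytes go in and come out unchanged.
    bin_path = os.path.join(tmp, 'data.bin')
    with open(bin_path, 'wb') as f:
        f.write(payload)
    with open(bin_path, 'rb') as f:
        round_trip = f.read()

    # Text mode: '\n' is written as the platform's line ending, which
    # becomes visible when the same file is read back in binary mode.
    text_path = os.path.join(tmp, 'lines.txt')
    with open(text_path, 'w') as f:
        f.write('a\nb\n')
    with open(text_path, 'rb') as f:
        raw = f.read()

print(round_trip == payload)
print(raw)   # differs between Unix and Windows
```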

It is good practice to use the :keyword:`with` keyword when dealing
with file objects. The advantage is that the file is properly closed
after its suite finishes, even if an exception is raised at some
point. Using :keyword:`!with` is also much shorter than writing
equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

   >>> with open('workfile') as f:
   ...     read_data = f.read()
   ...
   >>> f.closed
   True

If you're not using the :keyword:`with` keyword, then you should call
``f.close()`` to close the file and immediately free up any system
resources used by it. If you don't explicitly close a file, Python's
garbage collector will eventually destroy the object and close the
open file for you, but the file may stay open for a while. Another
risk is that different Python implementations will do this cleanup at
different times.
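
For comparison, here is roughly what the :keyword:`try`\ -\ :keyword:`finally`
form looks like when :keyword:`with` is not used. The temporary directory and
the file contents are assumptions made to keep the sketch runnable on its own.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')
    with open(path, 'w') as f:
        f.write('This is the entire file.\n')

    # Without 'with': close explicitly, even if read() raises an exception.
    f = open(path)
    try:
        read_data = f.read()
    finally:
        f.close()
    closed_after = f.closed

print(repr(read_data))
print(closed_after)
```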

After a file object is closed, either by a :keyword:`with` statement
or by calling ``f.close()``, attempts to use the file object will
automatically fail. ::

   >>> f.close()
   >>> f.read()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: I/O operation on closed file.


.. _tut-filemethods:

Methods of File Objects
-----------------------

The rest of the examples in this section will assume that a file object called
``f`` has already been created.

To read a file's contents, call ``f.read(size)``, which reads some quantity of
data and returns it as a string (in text mode) or bytes object (in binary mode).
*size* is an optional numeric argument. When *size* is omitted or negative, the
entire contents of the file will be read and returned; it's your problem if the
file is twice as large as your machine's memory. Otherwise, at most *size*
characters (in text mode) or *size* bytes (in binary mode) are read and returned.
If the end of the file has been reached, ``f.read()`` will return an empty
string (``''``). ::

   >>> f.read()
   'This is the entire file.\n'
   >>> f.read()
   ''

``f.readline()`` reads a single line from the file; a newline character (``\n``)
is left at the end of the string, and is only omitted on the last line of the
file if the file doesn't end in a newline. This makes the return value
unambiguous; if ``f.readline()`` returns an empty string, the end of the file
has been reached, while a blank line is represented by ``'\n'``, a string
containing only a single newline. ::

   >>> f.readline()
   'This is the first line of the file.\n'
   >>> f.readline()
   'Second line of the file\n'
   >>> f.readline()
   ''

For reading lines from a file, you can loop over the file object. This is memory
efficient, fast, and leads to simple code::

   >>> for line in f:
   ...     print(line, end='')
   ...
   This is the first line of the file.
   Second line of the file

If you want to read all the lines of a file in a list you can also use
``list(f)`` or ``f.readlines()``.
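
The reading methods can be mixed, since they all advance the same file
position. The sketch below uses :class:`io.StringIO` as a stand-in for a text
file opened for reading; that substitution and the sample text are
assumptions made for illustration.

```python
import io

# io.StringIO behaves like a text file opened for reading.
f = io.StringIO('first\nsecond\n\nlast')

first = f.readline()    # one line, newline included
rest = list(f)          # every remaining line, same as f.readlines()
at_end = f.readline()   # '' signals the end of the file

print(repr(first))
print(rest)
print(repr(at_end))
```

The blank third line shows up as ``'\n'`` in the list, while the final line
has no trailing newline because the data does not end with one.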

``f.write(string)`` writes the contents of *string* to the file, returning
the number of characters written. ::

   >>> f.write('This is a test\n')
   15

Other types of objects need to be converted -- either to a string (in text mode)
or a bytes object (in binary mode) -- before writing them::

   >>> value = ('the answer', 42)
   >>> s = str(value)  # convert the tuple to string
   >>> f.write(s)
   18

``f.tell()`` returns an integer giving the file object's current position in
the file, represented as the number of bytes from the beginning of the file
when in binary mode and an opaque number when in text mode.

To change the file object's position, use ``f.seek(offset, whence)``. The
position is computed from adding *offset* to a reference point; the reference
point is selected by the *whence* argument. A *whence* value of 0 measures from
the beginning of the file, 1 uses the current file position, and 2 uses the end
of the file as the reference point. *whence* can be omitted and defaults to 0,
using the beginning of the file as the reference point. ::

   >>> f = open('workfile', 'rb+')
   >>> f.write(b'0123456789abcdef')
   16
   >>> f.seek(5)      # Go to the 6th byte in the file
   5
   >>> f.read(1)
   b'5'
   >>> f.seek(-3, 2)  # Go to the 3rd byte before the end
   13
   >>> f.read(1)
   b'd'

In text files (those opened without a ``b`` in the mode string), only seeks
relative to the beginning of the file are allowed (the exception being seeking
to the very file end with ``seek(0, 2)``) and the only valid *offset* values are
those returned from ``f.tell()``, or zero. Any other *offset* value produces
undefined behaviour.
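
In practice this means that a text-mode position is best treated as a
bookmark: save what ``f.tell()`` returns and hand it back to ``f.seek()``
later. A minimal sketch, assuming a temporary file and made-up contents:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'workfile')
    with open(path, 'w', encoding='utf-8') as f:
        f.write('first line\nsecond line\n')

    with open(path, encoding='utf-8') as f:
        f.readline()
        bookmark = f.tell()     # opaque value; only pass it back to seek()
        second = f.readline()
        f.seek(bookmark)        # return to the saved position
        again = f.readline()
        f.seek(0, 2)            # the one allowed seek relative to the end
        at_end = f.read()

print(repr(second))
print(repr(again))
print(repr(at_end))
```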

File objects have some additional methods, such as :meth:`~file.isatty` and
:meth:`~file.truncate` which are less frequently used; consult the Library
Reference for a complete guide to file objects.


.. _tut-json:

Saving structured data with :mod:`json`
---------------------------------------

.. index:: module: json

Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. When you want to save more complex data
types like nested lists and dictionaries, parsing and serializing by hand
becomes complicated.
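
For plain numbers the manual approach is still manageable, as the following
sketch shows. It uses :class:`io.StringIO` in place of a real text file and
stores one number per line; both choices are assumptions made for this
example.

```python
import io

f = io.StringIO()             # stand-in for a text file

# Writing: each number has to be converted to a string first.
for n in (123, 45):
    f.write(str(n) + '\n')

# Reading: each line comes back as a string and is converted with int(),
# which ignores the trailing newline.
f.seek(0)
numbers = [int(line) for line in f]
total = sum(numbers)

print(numbers, total)
```

Once the data is nested (a list of dictionaries, say), a format and parser of
your own would be needed, which is where a standard format becomes attractive.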

Rather than having users constantly writing and debugging code to save
complicated data types to files, Python allows you to use the popular data
interchange format called `JSON (JavaScript Object Notation)
<http://json.org>`_. The standard module called :mod:`json` can take Python
data hierarchies, and convert them to string representations; this process is
called :dfn:`serializing`. Reconstructing the data from the string representation
is called :dfn:`deserializing`. Between serializing and deserializing, the
string representing the object may have been stored in a file or data, or
sent over a network connection to some distant machine.

.. note::
   The JSON format is commonly used by modern applications to allow for data
   exchange. Many programmers are already familiar with it, which makes
   it a good choice for interoperability.

If you have an object ``x``, you can view its JSON string representation with a
simple line of code::

   >>> import json
   >>> json.dumps([1, 'simple', 'list'])
   '[1, "simple", "list"]'

Another variant of the :func:`~json.dumps` function, called :func:`~json.dump`,
simply serializes the object to a :term:`text file`. So if ``f`` is a
:term:`text file` object opened for writing, we can do this::

   json.dump(x, f)

To decode the object again, if ``f`` is a :term:`text file` object which has
been opened for reading::

   x = json.load(f)
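
Put together, a round trip might look like the sketch below. The sample data
is invented, and :class:`io.StringIO` stands in for a real text file so that
the example does not depend on the file system.

```python
import io
import json

# Hypothetical sample data built from JSON-friendly types.
x = {'name': 'Jack', 'phones': [4098, 4139], 'active': True, 'note': None}

f = io.StringIO()      # stand-in for a text file opened for writing
json.dump(x, f)

f.seek(0)              # rewind, as if the file had been reopened for reading
restored = json.load(f)
print(restored == x)

# Not every Python value survives unchanged: tuples come back as lists,
# and dictionary keys are always strings in JSON.
converted = json.loads(json.dumps({'point': (1, 2), 3: 'three'}))
print(converted)
```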

This simple serialization technique can handle lists and dictionaries, but
serializing arbitrary class instances in JSON requires a bit of extra effort.
The reference for the :mod:`json` module contains an explanation of this.
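
One possible form of that extra effort is sketched here, using the ``default``
argument of :func:`json.dumps` and the ``object_hook`` argument of
:func:`json.loads`. The ``Point`` class, the helper functions and the
``'__point__'`` marker key are all hypothetical names chosen for this
illustration, not a convention of the :mod:`json` module.

```python
import json

class Point:
    """Hypothetical class used only for this example."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

def encode_point(obj):
    # Called by json.dumps() for objects it cannot serialize itself.
    if isinstance(obj, Point):
        return {'__point__': True, 'x': obj.x, 'y': obj.y}
    raise TypeError(f'{type(obj).__name__} is not JSON serializable')

def decode_point(d):
    # Called by json.loads() for every decoded JSON object (dictionary).
    if d.get('__point__'):
        return Point(d['x'], d['y'])
    return d

text = json.dumps([Point(1, 2)], default=encode_point)
points = json.loads(text, object_hook=decode_point)

print(text)
print(points[0].x, points[0].y)
```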

.. seealso::

   :mod:`pickle` - the pickle module

   Contrary to :ref:`JSON <tut-json>`, *pickle* is a protocol which allows
   the serialization of arbitrarily complex Python objects. As such, it is
   specific to Python and cannot be used to communicate with applications
   written in other languages. It is also insecure by default:
   deserializing pickle data coming from an untrusted source can execute
   arbitrary code, if the data was crafted by a skilled attacker.