.. _tut-io:

****************
Input and Output
****************

There are several ways to present the output of a program; data can be printed
in a human-readable form, or written to a file for future use. This chapter will
discuss some of the possibilities.


.. _tut-formatting:

Fancier Output Formatting
=========================

So far we've encountered two ways of writing values: *expression statements* and
the :func:`print` function. (A third way is using the :meth:`write` method
of file objects; the standard output file can be referenced as ``sys.stdout``.
See the Library Reference for more information on this.)

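As a minimal sketch of that third way, the snippet below writes to
``sys.stdout`` directly. Unlike :func:`print`, :meth:`write` takes a single
string and adds neither separators nor a trailing newline. The variable names
are only illustrative.

```python
import sys

count = 3  # illustrative value, not from the tutorial

# write() accepts exactly one string, so convert the number and add the
# newline yourself; it returns the number of characters written.
written = sys.stdout.write('count = ' + str(count) + '\n')
```
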
Often you'll want more control over the formatting of your output than simply
printing space-separated values. There are two ways to format your output. The
first way is to do all the string handling yourself: using string slicing and
concatenation operations you can create any layout you can imagine. The
string type has some methods that perform useful operations for padding
strings to a given column width; these will be discussed shortly. The second
way is to use the :meth:`str.format` method.

The :mod:`string` module contains a :class:`~string.Template` class which offers
yet another way to substitute values into strings.

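A brief sketch of :class:`~string.Template` substitution follows. The
placeholder names ``who`` and ``what`` are made up for this illustration.

```python
from string import Template

# $name marks a placeholder; substitute() fills it from keyword arguments.
greeting = Template('$who likes $what')
line = greeting.substitute(who='Guido', what='spam')
print(line)

# safe_substitute() leaves unknown placeholders in place instead of
# raising KeyError.
partial = greeting.safe_substitute(who='Guido')
print(partial)
```
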
One question remains, of course: how do you convert values to strings? Luckily,
Python has ways to convert any value to a string: pass it to the :func:`repr`
or :func:`str` functions.

The :func:`str` function is meant to return representations of values which are
fairly human-readable, while :func:`repr` is meant to generate representations
which can be read by the interpreter (or will force a :exc:`SyntaxError` if
there is no equivalent syntax). For objects which don't have a particular
representation for human consumption, :func:`str` will return the same value as
:func:`repr`. Many values, such as numbers or structures like lists and
dictionaries, have the same representation using either function. Strings, in
particular, have two distinct representations.

Some examples::

   >>> s = 'Hello, world.'
   >>> str(s)
   'Hello, world.'
   >>> repr(s)
   "'Hello, world.'"
   >>> str(1.0/7.0)
   '0.14285714285714285'
   >>> repr(1.0/7.0)
   '0.14285714285714285'
   >>> x = 10 * 3.25
   >>> y = 200 * 200
   >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
   >>> print(s)
   The value of x is 32.5, and y is 40000...
   >>> # The repr() of a string adds string quotes and backslashes:
   ... hello = 'hello, world\n'
   >>> hellos = repr(hello)
   >>> print(hellos)
   'hello, world\n'
   >>> # The argument to repr() may be any Python object:
   ... repr((x, y, ('spam', 'eggs')))
   "(32.5, 40000, ('spam', 'eggs'))"

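The "can be read by the interpreter" property can be checked in a short script.
This sketch assumes the value is built only from built-in types, for which
:func:`repr` yields valid Python source. :func:`eval` is used purely for
illustration and should only ever be given trusted text.

```python
value = (32.5, 40000, ('spam', 'eggs'))

# repr() of built-in containers is valid Python source, so evaluating it
# rebuilds an equal object.
text = repr(value)
restored = eval(text)
print(restored == value)

# For a string, str() returns the text itself, while repr() adds quotes
# and escapes the newline.
hello = 'hello, world\n'
print(len(str(hello)), len(repr(hello)))
```
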
Here are two ways to write a table of squares and cubes::

   >>> for x in range(1, 11):
   ...     print(repr(x).rjust(2), repr(x*x).rjust(3), end=' ')
   ...     # Note use of 'end' on previous line
   ...     print(repr(x*x*x).rjust(4))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

   >>> for x in range(1, 11):
   ...     print('{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

(Note that in the first example, one space between each column was added by the
way :func:`print` works: it always adds spaces between its arguments.)

This example demonstrates the :meth:`str.rjust` method of string
objects, which right-justifies a string in a field of a given width by padding
it with spaces on the left. There are similar methods :meth:`str.ljust` and
:meth:`str.center`. These methods do not write anything; they just return a
new string. If the input string is too long, they don't truncate it, but
return it unchanged; this will mess up your column layout, but that's usually
better than the alternative, which would be lying about a value. (If you
really want truncation you can always add a slice operation, as in
``x.ljust(n)[:n]``.)

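The three justification methods and the truncating slice can be compared side
by side. The sample strings and the field width of 8 are arbitrary choices for
this sketch.

```python
word = 'spam'

left = word.ljust(8)       # pad on the right
centered = word.center(8)  # pad on both sides
right = word.rjust(8)      # pad on the left

# A string longer than the field is returned unchanged...
too_long = 'spam and eggs'.ljust(8)
# ...unless you slice the result to force truncation.
truncated = 'spam and eggs'.ljust(8)[:8]

for s in (left, centered, right, too_long, truncated):
    print('|' + s + '|')
```
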
There is another method, :meth:`str.zfill`, which pads a numeric string on the
left with zeros. It understands plus and minus signs::

   >>> '12'.zfill(5)
   '00012'
   >>> '-3.14'.zfill(7)
   '-003.14'
   >>> '3.14159265359'.zfill(5)
   '3.14159265359'

Basic usage of the :meth:`str.format` method looks like this::

   >>> print('We are the {} who say "{}!"'.format('knights', 'Ni'))
   We are the knights who say "Ni!"

The braces and characters within them (called format fields) are replaced with
the objects passed into the :meth:`str.format` method. A number in the
braces can be used to refer to the position of the object passed into the
:meth:`str.format` method. ::

   >>> print('{0} and {1}'.format('spam', 'eggs'))
   spam and eggs
   >>> print('{1} and {0}'.format('spam', 'eggs'))
   eggs and spam

If keyword arguments are used in the :meth:`str.format` method, their values
are referred to by using the name of the argument. ::

   >>> print('This {food} is {adjective}.'.format(
   ...       food='spam', adjective='absolutely horrible'))
   This spam is absolutely horrible.

Positional and keyword arguments can be arbitrarily combined::

   >>> print('The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
   ...                                                    other='Georg'))
   The story of Bill, Manfred, and Georg.

``'!a'`` (apply :func:`ascii`), ``'!s'`` (apply :func:`str`) and ``'!r'``
(apply :func:`repr`) can be used to convert the value before it is formatted.
For a float, :func:`str` and :func:`repr` give the same text in current Python
versions, so the difference shows most clearly with a string::

   >>> import math
   >>> print('The value of PI is approximately {!r}.'.format(math.pi))
   The value of PI is approximately 3.141592653589793.
   >>> print('The word is {}.'.format('spam'))
   The word is spam.
   >>> print('The word is {!r}.'.format('spam'))
   The word is 'spam'.

An optional ``':'`` and format specifier can follow the field name. This allows
greater control over how the value is formatted. The following example
rounds Pi to three places after the decimal::

   >>> import math
   >>> print('The value of PI is approximately {0:.3f}.'.format(math.pi))
   The value of PI is approximately 3.142.

Passing an integer after the ``':'`` will cause that field to be a minimum
number of characters wide. This is useful for making tables pretty. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
   >>> for name, phone in table.items():
   ...     print('{0:10} ==> {1:10d}'.format(name, phone))
   ...
   Sjoerd     ==>       4127
   Jack       ==>       4098
   Dcab       ==>       7678

If you have a really long format string that you don't want to split up, it
would be nice if you could reference the variables to be formatted by name
instead of by position. This can be done by simply passing the dict and using
square brackets ``'[]'`` to access the keys::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
   ...       'Dcab: {0[Dcab]:d}'.format(table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the
``**`` notation. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in function
:func:`vars`, which returns a dictionary containing all local variables.

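A small sketch of the :func:`vars` combination follows. The function name
``describe`` and its local variables are hypothetical, chosen only to show
local names being picked up by the format string.

```python
def describe():
    food = 'spam'
    adjective = 'absolutely horrible'
    # vars() with no argument returns the local namespace as a dict,
    # so each {name} field is filled from the local variable of that name.
    return 'This {food} is {adjective}.'.format(**vars())

message = describe()
print(message)
```
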
For a complete overview of string formatting with :meth:`str.format`, see
:ref:`formatstrings`.


Old string formatting
---------------------

The ``%`` operator can also be used for string formatting. It interprets the
left argument much like a :cfunc:`sprintf`\ -style format string to be applied
to the right argument, and returns the string resulting from this formatting
operation. For example::

   >>> import math
   >>> print('The value of PI is approximately %5.3f.' % math.pi)
   The value of PI is approximately 3.142.

A lot of existing Python code still uses the ``%`` operator, and it remains
part of the language. For new code, :meth:`str.format` should generally be
used.

More information can be found in the :ref:`old-string-formatting` section.


.. _tut-files:

Reading and Writing Files
=========================

.. index::
   builtin: open
   object: file

:func:`open` returns a :term:`file object`, and is most commonly used with
two arguments: ``open(filename, mode)``.

::

   >>> f = open('/tmp/workfile', 'w')

.. XXX str(f) is <io.TextIOWrapper object at 0x82e8dc4>

   >>> print(f)
   <open file '/tmp/workfile', mode 'w' at 80a0960>

The first argument is a string containing the filename. The second argument is
another string containing a few characters describing the way in which the file
will be used. *mode* can be ``'r'`` when the file will only be read, ``'w'``
for only writing (an existing file with the same name will be erased), and
``'a'`` opens the file for appending; any data written to the file is
automatically added to the end. ``'r+'`` opens the file for both reading and
writing. The *mode* argument is optional; ``'r'`` will be assumed if it's
omitted.

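The four modes can be tried out on a scratch file. This sketch assumes a
temporary directory created with :mod:`tempfile`; the file name and contents
are illustrative.

```python
import os
import tempfile

# A scratch file in a fresh temporary directory (the path is illustrative).
path = os.path.join(tempfile.mkdtemp(), 'workfile')

f = open(path, 'w')        # 'w' creates the file or erases an existing one
f.write('first\n')
f.close()

f = open(path, 'a')        # 'a' adds to the end
f.write('second\n')
f.close()

f = open(path)             # mode omitted, so 'r' is assumed
contents = f.read()
f.close()

f = open(path, 'r+')       # 'r+' allows both reading and writing
f.write('FIRST')           # overwrites the first five characters in place
f.seek(0)
updated = f.read()
f.close()
```
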
Normally, files are opened in :dfn:`text mode`, which means that you read and
write strings from and to the file, which are encoded in a specific encoding.
If *encoding* is not given, the default depends on the Python version and
platform (see :func:`open`). ``'b'`` appended to the mode opens the file in
:dfn:`binary mode`: now the data is read and written in the form of bytes
objects. This mode should be used for all files that don't contain text.

In text mode, the default is to convert platform-specific line endings (``\n``
on Unix, ``\r\n`` on Windows) to just ``\n`` on reading and ``\n`` back to
platform-specific line endings on writing. This behind-the-scenes modification
to file data is fine for text files, but will corrupt binary data like that in
:file:`JPEG` or :file:`EXE` files. Be very careful to use binary mode when
reading and writing such files.


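The following sketch round-trips arbitrary bytes through a file in binary
mode. The payload deliberately contains the byte values of ``\r`` and ``\n``,
which text mode could translate. The path is an illustrative temporary file.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.bin')  # illustrative path

# Every possible byte value, including the bytes for '\r' and '\n'.
payload = bytes(range(256))

f = open(path, 'wb')       # binary mode: bytes are written unchanged
f.write(payload)
f.close()

f = open(path, 'rb')       # binary mode: read() returns a bytes object
restored = f.read()
f.close()

print(type(restored).__name__, restored == payload)
```
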
.. _tut-filemethods:

Methods of File Objects
-----------------------

The rest of the examples in this section will assume that a file object called
``f`` has already been created.

To read a file's contents, call ``f.read(size)``, which reads some quantity of
data and returns it as a string or bytes object. *size* is an optional numeric
argument. When *size* is omitted or negative, the entire contents of the file
will be read and returned; it's your problem if the file is twice as large as
your machine's memory. Otherwise, at most *size* characters (in text mode) or
*size* bytes (in binary mode) are read and returned.
If the end of the file has been reached, ``f.read()`` will return an empty
string (``''``). ::

   >>> f.read()
   'This is the entire file.\n'
   >>> f.read()
   ''

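When a file may be too large to read at once, ``read(size)`` can be called in
a loop until it returns an empty string. In this sketch :class:`io.StringIO`
stands in for a file opened in text mode, and the chunk size of 10 is
arbitrary.

```python
import io

# io.StringIO stands in for a text file opened with open(); it supports
# the same read() method.
f = io.StringIO('This is the entire file.\n')

chunks = []
while True:
    chunk = f.read(10)     # at most 10 characters per call
    if chunk == '':        # empty string: end of file reached
        break
    chunks.append(chunk)

print(chunks)
```
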
``f.readline()`` reads a single line from the file; a newline character (``\n``)
is left at the end of the string, and is only omitted on the last line of the
file if the file doesn't end in a newline. This makes the return value
unambiguous; if ``f.readline()`` returns an empty string, the end of the file
has been reached, while a blank line is represented by ``'\n'``, a string
containing only a single newline. ::

   >>> f.readline()
   'This is the first line of the file.\n'
   >>> f.readline()
   'Second line of the file\n'
   >>> f.readline()
   ''

``f.readlines()`` returns a list containing all the lines of data in the file.
If given an optional parameter *sizehint*, it reads that many bytes from the
file and enough more to complete a line, and returns the lines from that. This
is often used to allow efficient reading of a large file by lines, but without
having to load the entire file in memory. Only complete lines will be returned.
::

   >>> f.readlines()
   ['This is the first line of the file.\n', 'Second line of the file\n']

An alternative approach to reading lines is to loop over the file object. This is
memory efficient, fast, and leads to simpler code::

   >>> for line in f:
   ...     print(line, end='')
   ...
   This is the first line of the file.
   Second line of the file

The alternative approach is simpler but does not provide as fine-grained
control. Since the two approaches manage line buffering differently, they
should not be mixed.

``f.write(string)`` writes the contents of *string* to the file, returning
the number of characters written. ::

   >>> f.write('This is a test\n')
   15

To write something other than a string, it needs to be converted to a string
first::

   >>> value = ('the answer', 42)
   >>> s = str(value)
   >>> f.write(s)
   18

``f.tell()`` returns an integer giving the file object's current position in the
file, measured in bytes from the beginning of the file when in binary mode, and
an opaque number when in text mode. To change the file
object's position, use ``f.seek(offset, from_what)``. The position is computed
from adding *offset* to a reference point; the reference point is selected by
the *from_what* argument. A *from_what* value of 0 measures from the beginning
of the file, 1 uses the current file position, and 2 uses the end of the file as
the reference point. *from_what* can be omitted and defaults to 0, using the
beginning of the file as the reference point. ::

   >>> f = open('/tmp/workfile', 'rb+')
   >>> f.write(b'0123456789abcdef')
   16
   >>> f.seek(5)     # Go to the 6th byte in the file
   5
   >>> f.read(1)
   b'5'
   >>> f.seek(-3, 2) # Go to the 3rd byte before the end
   13
   >>> f.read(1)
   b'd'

In text files (those opened without a ``b`` in the mode string), only seeks
relative to the beginning of the file are allowed (the exception being seeking
to the very file end with ``seek(0, 2)``).

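In text mode, the offsets that are safe to pass to ``seek()`` are zero and
values previously returned by ``tell()``. The sketch below remembers a
position and returns to it. The temporary path and the file contents are
illustrative.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')  # illustrative path

f = open(path, 'w+')       # text mode, reading and writing
f.write('first line\nsecond line\n')
f.seek(0)                  # absolute seek to the beginning is allowed

f.readline()               # skip the first line
mark = f.tell()            # remember where the second line starts
second = f.readline()

f.seek(mark)               # go back to the remembered position
again = f.readline()

f.seek(0, 2)               # the one permitted seek relative to the end
at_end = f.read()
f.close()
```
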
When you're done with a file, call ``f.close()`` to close it and free up any
system resources taken up by the open file. After calling ``f.close()``,
attempts to use the file object will automatically fail. ::

   >>> f.close()
   >>> f.read()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: I/O operation on closed file.

It is good practice to use the :keyword:`with` keyword when dealing with file
objects. This has the advantage that the file is properly closed after its
suite finishes, even if an exception is raised on the way. It is also much
shorter than writing equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

   >>> with open('/tmp/workfile', 'r') as f:
   ...     read_data = f.read()
   >>> f.closed
   True

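For comparison, here is a sketch of the longer :keyword:`try`\ -\
:keyword:`finally` form that the :keyword:`with` statement replaces. The
scratch file is created first so the example is self-contained; its path and
contents are illustrative.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')  # illustrative path
with open(path, 'w') as f:
    f.write('This is the entire file.\n')

# Roughly what the with statement does for you when reading:
f = open(path, 'r')
try:
    read_data = f.read()
finally:
    f.close()              # runs whether or not read() raised
```
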
File objects have some additional methods, such as :meth:`~file.isatty` and
:meth:`~file.truncate` which are less frequently used; consult the Library
Reference for a complete guide to file objects.


.. _tut-pickle:

The :mod:`pickle` Module
------------------------

.. index:: module: pickle

Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.

Rather than have users be constantly writing and debugging code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a byte string representation; this
process is called :dfn:`pickling`. Reconstructing the object from the byte
string representation is called :dfn:`unpickling`. Between pickling and
unpickling, the bytes representing the object may have been stored in a file or
in other data storage, or sent over a network connection to some distant
machine.

If you have an object ``x``, and a file object ``f`` that's been opened for
writing in binary mode, the simplest way to pickle the object takes only one
line of code::

   pickle.dump(x, f)

To unpickle the object again, if ``f`` is a file object which has been opened
for reading in binary mode::

   x = pickle.load(f)

(There are other variants of this, used when pickling many objects or when you
don't want to write the pickled data to a file; consult the complete
documentation for :mod:`pickle` in the Python Library Reference.)

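Put together, a pickle round trip might look like the sketch below. The sample
object and the temporary path are illustrative; note that both files are
opened in binary mode. The last lines show the in-memory variants
``pickle.dumps`` and ``pickle.loads``. Only unpickle data that comes from a
source you trust.

```python
import os
import pickle
import tempfile

x = {'name': 'spam', 'sizes': [1, 2, 3], 'pair': ('the answer', 42)}
path = os.path.join(tempfile.mkdtemp(), 'data.pickle')  # illustrative path

with open(path, 'wb') as f:    # pickle data is bytes, so use binary mode
    pickle.dump(x, f)

with open(path, 'rb') as f:
    y = pickle.load(f)

# Variants that skip the file: dumps() returns the pickled bytes and
# loads() rebuilds an object from them.
data = pickle.dumps(x)
z = pickle.loads(data)

print(y == x, z == x, y is x)
```
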
:mod:`pickle` is the standard way to make Python objects which can be stored and
reused by other programs or by a future invocation of the same program; the
technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is
so widely used, many authors who write Python extensions take care to ensure
that new data types such as matrices can be properly pickled and unpickled.

434