.. _tut-io:

****************
Input and Output
****************

There are several ways to present the output of a program; data can be printed
in a human-readable form, or written to a file for future use. This chapter will
discuss some of the possibilities.


.. _tut-formatting:

Fancier Output Formatting
=========================

So far we've encountered two ways of writing values: *expression statements* and
the :func:`print` function. (A third way is using the :meth:`write` method
of file objects; the standard output file can be referenced as ``sys.stdout``.
See the Library Reference for more information on this.)

.. index:: module: string

Often you'll want more control over the formatting of your output than simply
printing space-separated values. There are two ways to format your output. The
first way is to do all the string handling yourself: using string slicing and
concatenation operations, you can create any layout you can imagine. String
objects have some useful methods for padding strings to a given column width;
these will be discussed shortly. The second way is to use the
:meth:`str.format` method.

The :mod:`string` module contains a :class:`~string.Template` class which
offers yet another way to substitute values into strings.
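
As a brief illustration of the :class:`~string.Template` approach, the
following sketch substitutes named values into a ``$``-style placeholder
string. The names ``greeting``, ``message`` and ``partial`` are chosen only for
this example.

```python
from string import Template

# Placeholders are written as $name (or ${name} when followed by text).
greeting = Template('$who likes $what')

# substitute() takes the values as keyword arguments or as a mapping.
message = greeting.substitute(who='tim', what='kung pao')
print(message)

# safe_substitute() leaves unknown placeholders untouched instead of
# raising KeyError.
partial = greeting.safe_substitute(who='tim')
print(partial)
```

Templates are less flexible than :meth:`str.format`, but the simpler syntax can
be convenient when the placeholder strings are supplied by users.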

One question remains, of course: how do you convert values to strings? Luckily,
Python has ways to convert any value to a string: pass it to the :func:`repr`
or :func:`str` functions.

The :func:`str` function is meant to return representations of values which are
fairly human-readable, while :func:`repr` is meant to generate representations
which can be read by the interpreter (or will force a :exc:`SyntaxError` if
there is no equivalent syntax). For objects which don't have a particular
representation for human consumption, :func:`str` will return the same value as
:func:`repr`. Many values, such as numbers or structures like lists and
dictionaries, have the same representation using either function. Strings and
floating point numbers, in particular, have two distinct representations.

Some examples::

   >>> s = 'Hello, world.'
   >>> str(s)
   'Hello, world.'
   >>> repr(s)
   "'Hello, world.'"
   >>> str(1.0/7.0)
   '0.142857142857'
   >>> repr(1.0/7.0)
   '0.14285714285714285'
   >>> x = 10 * 3.25
   >>> y = 200 * 200
   >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
   >>> print(s)
   The value of x is 32.5, and y is 40000...
   >>> # The repr() of a string adds string quotes and backslashes:
   ... hello = 'hello, world\n'
   >>> hellos = repr(hello)
   >>> print(hellos)
   'hello, world\n'
   >>> # The argument to repr() may be any Python object:
   ... repr((x, y, ('spam', 'eggs')))
   "(32.5, 40000, ('spam', 'eggs'))"
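
The fallback from :func:`str` to :func:`repr` can be seen with a small class of
your own. The following is only a sketch: ``Point`` is a hypothetical class,
invented for this example, that defines ``__repr__()`` but no ``__str__()``.

```python
class Point:
    """A hypothetical class that defines only __repr__()."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return 'Point({!r}, {!r})'.format(self.x, self.y)


p = Point(1, 2)
# With no __str__() defined, str() falls back to repr().
as_str = str(p)
as_repr = repr(p)
print(as_str)
print(as_repr)
```

Both calls print the same text here, because the class offers no separate
representation for human consumption.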

Here are two ways to write a table of squares and cubes::

   >>> for x in range(1, 11):
   ...     print(repr(x).rjust(2), repr(x*x).rjust(3), end=' ')
   ...     # Note use of 'end' on previous line
   ...     print(repr(x*x*x).rjust(4))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

   >>> for x in range(1, 11):
   ...     print('{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

(Note that in the first example, one space between each column was added by the
way :func:`print` works: it always adds spaces between its arguments.)

This example demonstrates the :meth:`rjust` method of string objects, which
right-justifies a string in a field of a given width by padding it with spaces
on the left. There are similar methods :meth:`ljust` and :meth:`center`. These
methods do not write anything; they just return a new string. If the input
string is too long, they don't truncate it, but return it unchanged; this will
mess up your column layout, but that's usually better than the alternative,
which would be lying about a value. (If you really want truncation you can
always add a slice operation, as in ``x.ljust(n)[:n]``.)
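
The padding methods and the truncating slice can be combined as in the sketch
below. The ``rows`` data and the column widths (8 and 4) are made up for the
example.

```python
# Hypothetical table data, chosen only to illustrate padding.
rows = [('spam', 3), ('eggs', 12), ('a very long name', 7)]

lines = []
for name, count in rows:
    # ljust() pads on the right; the slice truncates over-long names to 8.
    label = name.ljust(8)[:8]
    # rjust() pads on the left so the numbers line up in a 4-wide column.
    lines.append(label + '|' + str(count).rjust(4))

for line in lines:
    print(line)

# center() accepts an optional fill character as its second argument.
centered = 'title'.center(11, '*')
print(centered)
```

Only the last name is changed by the slice; the shorter ones are simply padded
to the column width.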

There is another method, :meth:`zfill`, which pads a numeric string on the left
with zeros. It understands plus and minus signs::

   >>> '12'.zfill(5)
   '00012'
   >>> '-3.14'.zfill(7)
   '-003.14'
   >>> '3.14159265359'.zfill(5)
   '3.14159265359'

Basic usage of the :meth:`str.format` method looks like this::

   >>> print('We are the {} who say "{}!"'.format('knights', 'Ni'))
   We are the knights who say "Ni!"

The brackets and characters within them (called format fields) are replaced with
the objects passed into the :meth:`~str.format` method. A number in the
brackets can be used to refer to the position of the object passed into the
:meth:`~str.format` method. ::

   >>> print('{0} and {1}'.format('spam', 'eggs'))
   spam and eggs
   >>> print('{1} and {0}'.format('spam', 'eggs'))
   eggs and spam

If keyword arguments are used in the :meth:`~str.format` method, their values
are referred to by using the name of the argument. ::

   >>> print('This {food} is {adjective}.'.format(
   ...       food='spam', adjective='absolutely horrible'))
   This spam is absolutely horrible.

Positional and keyword arguments can be arbitrarily combined::

   >>> print('The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
   ...                                                    other='Georg'))
   The story of Bill, Manfred, and Georg.

``'!a'`` (apply :func:`ascii`), ``'!s'`` (apply :func:`str`) and ``'!r'``
(apply :func:`repr`) can be used to convert the value before it is formatted::

   >>> import math
   >>> print('The value of PI is approximately {}.'.format(math.pi))
   The value of PI is approximately 3.14159265359.
   >>> print('The value of PI is approximately {!r}.'.format(math.pi))
   The value of PI is approximately 3.141592653589793.
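
The three conversions are easiest to tell apart on a string. The sketch below
uses a sample string (an assumption of this example) containing a non-ASCII
character and a newline, and checks each conversion against the corresponding
built-in function.

```python
# A sample string with a non-ASCII character (e with acute) and a newline.
text = 'caf\u00e9\n'

plain = '{!s}'.format(text)     # same as str(text)
quoted = '{!r}'.format(text)    # same as repr(text)
escaped = '{!a}'.format(text)   # same as ascii(text)

# ascii() escapes everything outside the ASCII range, so this line is
# safe to print on any terminal.
print(escaped)
```

``'!a'`` is mostly useful when the output has to stay within ASCII, for
instance in log files.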

An optional ``':'`` and format specifier can follow the field name. This allows
greater control over how the value is formatted. The following example
rounds Pi to three places after the decimal point::

   >>> import math
   >>> print('The value of PI is approximately {0:.3f}.'.format(math.pi))
   The value of PI is approximately 3.142.

Passing an integer after the ``':'`` will cause that field to be a minimum
number of characters wide. This is useful for making tables pretty. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
   >>> for name, phone in table.items():
   ...     print('{0:10} ==> {1:10d}'.format(name, phone))
   ...
   Jack       ==>       4098
   Dcab       ==>       7678
   Sjoerd     ==>       4127
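
A variation on the same idea, written as a script rather than an interactive
session. It is only a sketch: the ``|`` separator and the width of 6 for the
number column are arbitrary choices, and :func:`sorted` is used so that the
order of the rows does not depend on the dictionary.

```python
table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}

rows = []
# sorted() fixes the order, so the output does not depend on dict ordering.
for name, phone in sorted(table.items()):
    # Name in a 10-wide field, number right-aligned in a 6-wide field.
    rows.append('{0:10}|{1:6d}'.format(name, phone))

for row in rows:
    print(row)
```

Strings are left-aligned and numbers right-aligned by default, which is why the
two columns line up without any explicit alignment flag.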

If you have a really long format string that you don't want to split up, it
would be nice if you could reference the variables to be formatted by name
instead of by position. This can be done by simply passing the dict and using
square brackets ``'[]'`` to access the keys::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
   ...       'Dcab: {0[Dcab]:d}'.format(table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the
``**`` notation. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in function
:func:`vars`, which returns a dictionary containing all local variables.
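
A minimal sketch of that combination follows. The function ``describe`` and its
parameters are hypothetical; inside it, ``vars()`` is called without arguments
and therefore returns the function's local variables.

```python
def describe(name, phone):
    # vars() with no argument returns the local namespace as a dict,
    # here {'name': ..., 'phone': ...}.
    return '{name} can be reached at {phone:d}'.format(**vars())


result = describe('Jack', 4098)
print(result)
```

This keeps the format string readable, at the cost of tying the field names to
the names of the local variables.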

For a complete overview of string formatting with :meth:`str.format`, see
:ref:`formatstrings`.


Old string formatting
---------------------

The ``%`` operator can also be used for string formatting. It interprets the
left argument much like a :cfunc:`sprintf`\ -style format string to be applied
to the right argument, and returns the string resulting from this formatting
operation. For example::

   >>> import math
   >>> print('The value of PI is approximately %5.3f.' % math.pi)
   The value of PI is approximately 3.142.
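
When more than one value is to be formatted, the right argument is a tuple or a
mapping. The sketch below shows both forms; the values are invented for the
example.

```python
# A tuple on the right-hand side supplies several values in order.
pair = '%s has %d items' % ('basket', 3)

# A mapping on the right-hand side is addressed with %(key)s fields;
# '-8' left-aligns in 8 columns and '03' zero-pads to 3 digits.
named = '%(name)-8s|%(count)03d' % {'name': 'eggs', 'count': 7}

print(pair)
print(named)
```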

Since :meth:`str.format` is quite new, a lot of Python code still uses the ``%``
operator. However, because this old style of formatting will eventually be
removed from the language, :meth:`str.format` should generally be used.

More information can be found in the :ref:`old-string-formatting` section.


.. _tut-files:

Reading and Writing Files
=========================

.. index::
   builtin: open
   object: file

:func:`open` returns a file object, and is most commonly used with two
arguments: ``open(filename, mode)``.

::

   >>> f = open('/tmp/workfile', 'w')

.. XXX str(f) is <io.TextIOWrapper object at 0x82e8dc4>

   >>> print(f)
   <open file '/tmp/workfile', mode 'w' at 80a0960>

The first argument is a string containing the filename. The second argument is
another string containing a few characters describing the way in which the file
will be used. *mode* can be ``'r'`` when the file will only be read, ``'w'``
for only writing (an existing file with the same name will be erased), and
``'a'`` for appending; any data written to the file is automatically added to
the end. ``'r+'`` opens the file for both reading and writing. The *mode*
argument is optional; ``'r'`` will be assumed if it's omitted.
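
The four modes described above can be tried out with the following sketch. It
writes to a scratch file in a temporary directory (an assumption of this
example, so that no existing file is touched) instead of ``/tmp/workfile``.

```python
import os
import tempfile

# A scratch path used only for this example.
path = os.path.join(tempfile.mkdtemp(), 'workfile')

f = open(path, 'w')        # create (or truncate) the file for writing
f.write('first line\n')
f.close()

f = open(path, 'a')        # append: new data goes at the end
f.write('second line\n')
f.close()

f = open(path)             # mode omitted, so 'r' is assumed
contents = f.read()
f.close()

f = open(path, 'r+')       # read and write without truncating
f.write('FIRST')           # overwrites the first five characters
f.seek(0)
updated = f.read()
f.close()

print(updated, end='')
```

Note that ``'r+'`` overwrites data in place, whereas ``'w'`` would have emptied
the file first.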

Normally, files are opened in :dfn:`text mode`, which means that you read and
write strings from and to the file, which are encoded in a specific encoding
(the default encoding is platform dependent). ``'b'`` appended to the mode
opens the file in :dfn:`binary mode`: now the data is read and written in the
form of bytes objects. This mode should be used for all files that don't
contain text.

In text mode, the default is to convert platform-specific line endings (``\n``
on Unix, ``\r\n`` on Windows) to just ``\n`` on reading and ``\n`` back to
platform-specific line endings on writing. This behind-the-scenes modification
to file data is fine for text files, but will corrupt binary data like that in
:file:`JPEG` or :file:`EXE` files. Be very careful to use binary mode when
reading and writing such files.
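
The difference between the two modes shows up as soon as a file contains a
non-ASCII character. In the sketch below, the scratch file name, the explicit
``encoding='utf-8'`` and ``newline='\n'`` arguments are choices made for this
example so that the result is the same on every platform.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'example.txt')  # scratch file

# Text mode: str in, encoded on the way out. The encoding is given
# explicitly here so that the result does not depend on the platform,
# and newline='\n' switches off line-ending translation on writing.
with open(path, 'w', encoding='utf-8', newline='\n') as f:
    f.write('na\u00efve\n')

# Binary mode: the raw bytes come back unchanged.
with open(path, 'rb') as f:
    raw = f.read()

# Text mode again: bytes are decoded back to str.
with open(path, 'r', encoding='utf-8') as f:
    text = f.read()

print(raw)
print(len(text), len(raw))
```

The string has six characters but the file holds seven bytes, because the
accented letter takes two bytes in UTF-8.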


.. _tut-filemethods:

Methods of File Objects
-----------------------

The rest of the examples in this section will assume that a file object called
``f`` has already been created.

To read a file's contents, call ``f.read(size)``, which reads some quantity of
data and returns it as a string or bytes object. *size* is an optional numeric
argument. When *size* is omitted or negative, the entire contents of the file
will be read and returned; it's your problem if the file is twice as large as
your machine's memory. Otherwise, at most *size* characters (in text mode) or
*size* bytes (in binary mode) are read and returned. If the end of the file has
been reached, ``f.read()`` will return an empty string (``''``). ::

   >>> f.read()
   'This is the entire file.\n'
   >>> f.read()
   ''
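
Passing *size* makes it possible to process a large file piece by piece. The
following sketch assumes a chunk size of 8 and uses :class:`io.StringIO` as a
stand-in for a text file, so that it runs without a file on disk.

```python
import io

# io.StringIO stands in for a text file opened with open(); the data is
# made up for this example.
f = io.StringIO('0123456789' * 3)

chunks = []
while True:
    chunk = f.read(8)      # at most 8 characters per call
    if chunk == '':        # empty string: end of file reached
        break
    chunks.append(chunk)

print(chunks)
```

The last chunk is shorter than the others, since ``read()`` returns whatever is
left when fewer than *size* characters remain.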

``f.readline()`` reads a single line from the file; a newline character (``\n``)
is left at the end of the string, and is only omitted on the last line of the
file if the file doesn't end in a newline. This makes the return value
unambiguous; if ``f.readline()`` returns an empty string, the end of the file
has been reached, while a blank line is represented by ``'\n'``, a string
containing only a single newline. ::

   >>> f.readline()
   'This is the first line of the file.\n'
   >>> f.readline()
   'Second line of the file\n'
   >>> f.readline()
   ''

``f.readlines()`` returns a list containing all the lines of data in the file.
If given an optional parameter *sizehint*, it reads that many bytes from the
file and enough more to complete a line, and returns the lines from that. This
is often used to allow efficient reading of a large file by lines, but without
having to load the entire file in memory. Only complete lines will be returned.
::

   >>> f.readlines()
   ['This is the first line of the file.\n', 'Second line of the file\n']

An alternative approach to reading lines is to loop over the file object. This is
memory efficient, fast, and leads to simpler code::

   >>> for line in f:
   ...     print(line, end='')
   ...
   This is the first line of the file.
   Second line of the file

The alternative approach is simpler but does not provide as fine-grained
control. Since the two approaches manage line buffering differently, they
should not be mixed.

``f.write(string)`` writes the contents of *string* to the file, returning
the number of characters written. ::

   >>> f.write('This is a test\n')
   15

To write something other than a string, it needs to be converted to a string
first::

   >>> value = ('the answer', 42)
   >>> s = str(value)
   >>> f.write(s)
   18

``f.tell()`` returns an integer giving the file object's current position in the
file, measured in bytes from the beginning of the file. To change the file
object's position, use ``f.seek(offset, from_what)``. The position is computed
by adding *offset* to a reference point; the reference point is selected by
the *from_what* argument. A *from_what* value of 0 measures from the beginning
of the file, 1 uses the current file position, and 2 uses the end of the file as
the reference point. *from_what* can be omitted and defaults to 0, using the
beginning of the file as the reference point. ::

   >>> f = open('/tmp/workfile', 'rb+')
   >>> f.write(b'0123456789abcdef')
   16
   >>> f.seek(5)     # Go to the 6th byte in the file
   5
   >>> f.read(1)
   b'5'
   >>> f.seek(-3, 2) # Go to the 3rd byte before the end
   13
   >>> f.read(1)
   b'd'

In text files (those opened without a ``b`` in the mode string), only seeks
relative to the beginning of the file are allowed (the exception being seeking
to the very file end with ``seek(0, 2)``).
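
The interplay of ``tell()`` and the three reference points can be followed in
the sketch below. It assumes an in-memory :class:`io.BytesIO` object in place
of a file opened in binary mode, so that it runs without a file on disk.

```python
import io

# io.BytesIO behaves like a file opened in binary mode; it is used here
# so the example does not need a file on disk.
f = io.BytesIO(b'0123456789abcdef')

f.seek(4)                  # absolute position (from_what defaults to 0)
first = f.read(2)
after_read = f.tell()      # reading advanced the position by 2

f.seek(3, 1)               # 3 bytes forward from the current position
second = f.read(1)

f.seek(0, 2)               # jump to the end of the file
size = f.tell()

print(first, after_read, second, size)
```

Seeking to the end and calling ``tell()`` is a common way to find the size of a
binary file that is already open.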

When you're done with a file, call ``f.close()`` to close it and free up any
system resources taken up by the open file. After calling ``f.close()``,
attempts to use the file object will automatically fail. ::

   >>> f.close()
   >>> f.read()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: I/O operation on closed file

It is good practice to use the :keyword:`with` keyword when dealing with file
objects. This has the advantage that the file is properly closed after its
suite finishes, even if an exception is raised on the way. It is also much
shorter than writing equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

   >>> with open('/tmp/workfile', 'r') as f:
   ...     read_data = f.read()
   >>> f.closed
   True
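
For comparison, here is a sketch of roughly what the ``with`` statement does for
a file object, written out with :keyword:`try` and :keyword:`finally`. The
scratch file is created first only so that the example can run on its own.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')  # scratch file
with open(path, 'w') as f:
    f.write('some data\n')

# Roughly what the with statement does for a file object:
f = open(path, 'r')
try:
    read_data = f.read()
finally:
    f.close()              # runs even if read() raises an exception

print(f.closed)
```

The ``with`` form says the same thing in two lines and cannot be left half
written, which is why it is preferred.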

File objects have some additional methods, such as :meth:`~file.isatty` and
:meth:`~file.truncate` which are less frequently used; consult the Library
Reference for a complete guide to file objects.


.. _tut-pickle:

The :mod:`pickle` Module
------------------------

.. index:: module: pickle

Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.

Rather than have users constantly writing and debugging code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a string representation; this process
is called :dfn:`pickling`. Reconstructing the object from the string
representation is called :dfn:`unpickling`. Between pickling and unpickling,
the string representing the object may have been stored in a file or data, or
sent over a network connection to some distant machine.

If you have an object ``x``, and a file object ``f`` that's been opened for
writing, the simplest way to pickle the object takes only one line of code::

   pickle.dump(x, f)

To unpickle the object again, if ``f`` is a file object which has been opened
for reading::

   x = pickle.load(f)

(There are other variants of this, used when pickling many objects or when you
don't want to write the pickled data to a file; consult the complete
documentation for :mod:`pickle` in the Python Library Reference.)
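
Put together, a complete round trip might look like the sketch below. The
scratch file name and the sample dictionary are assumptions of this example.
Note that in Python 3 the pickled representation is a bytes object, so the file
has to be opened in binary mode (``'wb'`` and ``'rb'``).

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.pickle')  # scratch file
x = {'name': 'spam', 'sizes': [1, 2, 3], 'point': (0.5, -1)}

# pickle produces bytes, so the file is opened in binary mode.
with open(path, 'wb') as f:
    pickle.dump(x, f)

with open(path, 'rb') as f:
    y = pickle.load(f)

# dumps()/loads() do the same without a file.
z = pickle.loads(pickle.dumps(x))

print(y == x, y is x)
```

The unpickled object is equal to the original but is a new, independent copy.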

:mod:`pickle` is the standard way to make Python objects which can be stored and
reused by other programs or by a future invocation of the same program; the
technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is
so widely used, many authors who write Python extensions take care to ensure
that new data types such as matrices can be properly pickled and unpickled.
435