.. _tut-io:

****************
Input and Output
****************

There are several ways to present the output of a program; data can be printed
in a human-readable form, or written to a file for future use. This chapter will
discuss some of the possibilities.


.. _tut-formatting:

Fancier Output Formatting
=========================

So far we've encountered two ways of writing values: *expression statements* and
the :func:`print` function. (A third way is using the :meth:`write` method
of file objects; the standard output file can be referenced as ``sys.stdout``.
See the Library Reference for more information on this.)

.. index:: module: string

Often you'll want more control over the formatting of your output than simply
printing space-separated values. There are two ways to format your output. The
first way is to do all the string handling yourself: using string slicing and
concatenation operations, you can create any layout you can imagine. String
objects have some useful methods for padding strings to a given column width;
these will be discussed shortly. The second way is to use the
:meth:`str.format` method.

The :mod:`string` module contains a :class:`~string.Template` class which
offers yet another way to substitute values into strings.
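
As a minimal sketch of the :class:`~string.Template` approach (the placeholder
names ``who`` and ``what`` are made up for illustration), ``$``-prefixed names
in the template are replaced by the keyword arguments passed to it::

   from string import Template

   # $who and $what are placeholders; the names are illustrative only.
   t = Template('$who likes $what')
   message = t.substitute(who='Tim', what='kung pao')
   print(message)

   # safe_substitute() leaves unknown placeholders in place instead of
   # raising KeyError.
   partial = t.safe_substitute(who='Tim')
   print(partial)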

One question remains, of course: how do you convert values to strings? Luckily,
Python has ways to convert any value to a string: pass it to the :func:`repr`
or :func:`str` functions.

The :func:`str` function is meant to return representations of values which are
fairly human-readable, while :func:`repr` is meant to generate representations
which can be read by the interpreter (or will force a :exc:`SyntaxError` if
there is no equivalent syntax). For objects which don't have a particular
representation for human consumption, :func:`str` will return the same value as
:func:`repr`. Many values, such as numbers or structures like lists and
dictionaries, have the same representation using either function. Strings and
floating point numbers, in particular, have two distinct representations.

Some examples::

   >>> s = 'Hello, world.'
   >>> str(s)
   'Hello, world.'
   >>> repr(s)
   "'Hello, world.'"
   >>> str(1.0/7.0)
   '0.142857142857'
   >>> repr(1.0/7.0)
   '0.14285714285714285'
   >>> x = 10 * 3.25
   >>> y = 200 * 200
   >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
   >>> print(s)
   The value of x is 32.5, and y is 40000...
   >>> # The repr() of a string adds string quotes and backslashes:
   ... hello = 'hello, world\n'
   >>> hellos = repr(hello)
   >>> print(hellos)
   'hello, world\n'
   >>> # The argument to repr() may be any Python object:
   ... repr((x, y, ('spam', 'eggs')))
   "(32.5, 40000, ('spam', 'eggs'))"

Here are two ways to write a table of squares and cubes::

   >>> for x in range(1, 11):
   ...     print(repr(x).rjust(2), repr(x*x).rjust(3), end=' ')
   ...     # Note use of 'end' on previous line
   ...     print(repr(x*x*x).rjust(4))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

   >>> for x in range(1, 11):
   ...     print('{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

(Note that in the first example, one space between each column was added by the
way :func:`print` works: by default it adds a space between its arguments.)

This example demonstrates the :meth:`rjust` method of string objects, which
right-justifies a string in a field of a given width by padding it with spaces
on the left. There are similar methods :meth:`ljust` and :meth:`center`. These
methods do not write anything, they just return a new string. If the input
string is too long, they don't truncate it, but return it unchanged; this will
mess up your column layout but that's usually better than the alternative,
which would be lying about a value. (If you really want truncation you can
always add a slice operation, as in ``x.ljust(n)[:n]``.)
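
As a small sketch of these methods (the sample strings and the width of 8 are
arbitrary choices for illustration)::

   width = 8  # arbitrary field width for this sketch

   left = 'spam'.ljust(width)        # pad on the right
   right = 'spam'.rjust(width)       # pad on the left
   middle = 'spam'.center(width)     # pad on both sides
   print(repr(left), repr(right), repr(middle))

   # A string longer than the field is returned unchanged ...
   too_long = 'lumberjack'.ljust(width)
   # ... unless it is sliced afterwards to force truncation.
   truncated = 'lumberjack'.ljust(width)[:width]
   print(repr(too_long), repr(truncated))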

There is another method, :meth:`zfill`, which pads a numeric string on the left
with zeros. It understands plus and minus signs::

   >>> '12'.zfill(5)
   '00012'
   >>> '-3.14'.zfill(7)
   '-003.14'
   >>> '3.14159265359'.zfill(5)
   '3.14159265359'

Basic usage of the :meth:`str.format` method looks like this::

   >>> print('We are the {0} who say "{1}!"'.format('knights', 'Ni'))
   We are the knights who say "Ni!"

The braces and the characters within them (called format fields) are replaced
with the objects passed into the :meth:`~str.format` method. The number in the
braces refers to the position of the object passed into the
:meth:`~str.format` method. ::

   >>> print('{0} and {1}'.format('spam', 'eggs'))
   spam and eggs
   >>> print('{1} and {0}'.format('spam', 'eggs'))
   eggs and spam

If keyword arguments are used in the :meth:`~str.format` method, their values
are referred to by using the name of the argument. ::

   >>> print('This {food} is {adjective}.'.format(
   ...       food='spam', adjective='absolutely horrible'))
   This spam is absolutely horrible.

Positional and keyword arguments can be arbitrarily combined::

   >>> print('The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
   ...                                                    other='Georg'))
   The story of Bill, Manfred, and Georg.

An optional ``':'`` and format specifier can follow the field name. This allows
greater control over how the value is formatted. The following example
rounds Pi to three places after the decimal point::

   >>> import math
   >>> print('The value of PI is approximately {0:.3f}.'.format(math.pi))
   The value of PI is approximately 3.142.

Passing an integer after the ``':'`` will cause that field to be a minimum
number of characters wide. This is useful for making tables pretty. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
   >>> for name, phone in table.items():
   ...     print('{0:10} ==> {1:10d}'.format(name, phone))
   ...
   Jack       ==>       4098
   Dcab       ==>       7678
   Sjoerd     ==>       4127

If you have a really long format string that you don't want to split up, it
would be nice if you could reference the variables to be formatted by name
instead of by position. This can be done by simply passing the dict and using
square brackets ``'[]'`` to access the keys::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
   ...       'Dcab: {0[Dcab]:d}'.format(table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the
``**`` notation. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in function
:func:`vars`, which returns a dictionary containing all local variables.
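
A minimal sketch of that combination (the function and variable names are
invented for this example)::

   def describe():
       # Hypothetical local variables, used only for illustration.
       name = 'Sjoerd'
       phone = 4127
       # vars() without arguments returns the local namespace as a dict,
       # which ** unpacks into keyword arguments for format().
       return '{name}: {phone:d}'.format(**vars())

   result = describe()
   print(result)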

For a complete overview of string formatting with :meth:`str.format`, see
:ref:`formatstrings`.


Old string formatting
---------------------

The ``%`` operator can also be used for string formatting. It interprets the
left argument much like a :cfunc:`sprintf`\ -style format string to be applied
to the right argument, and returns the string resulting from this formatting
operation. For example::

   >>> import math
   >>> print('The value of PI is approximately %5.3f.' % math.pi)
   The value of PI is approximately 3.142.

Since :meth:`str.format` is quite new, a lot of Python code still uses the ``%``
operator. New code should nevertheless generally prefer :meth:`str.format`.

More information can be found in the :ref:`old-string-formatting` section.
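
For comparison, here is a brief sketch (with made-up sample values) of the same
line produced by both styles, including ``%`` formatting with a tuple and with
a dictionary on the right-hand side::

   name, phone = 'Jack', 4098  # sample values for illustration

   old_tuple = '%-10s ==> %10d' % (name, phone)
   old_dict = '%(name)-10s ==> %(phone)10d' % {'name': name, 'phone': phone}
   new_style = '{0:10} ==> {1:10d}'.format(name, phone)

   print(old_tuple)
   print(old_tuple == old_dict == new_style)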


.. _tut-files:

Reading and Writing Files
=========================

.. index::
   builtin: open
   object: file

:func:`open` returns a file object, and is most commonly used with two
arguments: ``open(filename, mode)``.

::

   >>> f = open('/tmp/workfile', 'w')

.. XXX str(f) is <io.TextIOWrapper object at 0x82e8dc4>

   >>> print(f)
   <open file '/tmp/workfile', mode 'w' at 80a0960>

The first argument is a string containing the filename. The second argument is
another string containing a few characters describing the way in which the file
will be used. *mode* can be ``'r'`` when the file will only be read, ``'w'``
for only writing (an existing file with the same name will be erased), and
``'a'`` opens the file for appending; any data written to the file is
automatically added to the end. ``'r+'`` opens the file for both reading and
writing. The *mode* argument is optional; ``'r'`` will be assumed if it's
omitted.
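
The effect of the modes can be sketched as follows; the file name
``workfile.txt`` and the temporary directory are arbitrary choices for this
example::

   import os
   import tempfile

   # Work in a throwaway directory so no existing file is touched.
   path = os.path.join(tempfile.mkdtemp(), 'workfile.txt')

   f = open(path, 'w')        # 'w' creates the file (or erases an old one)
   f.write('first\n')
   f.close()

   f = open(path, 'a')        # 'a' adds to the end of the existing data
   f.write('second\n')
   f.close()

   f = open(path)             # mode omitted, so 'r' is assumed
   contents = f.read()
   f.close()
   print(repr(contents))

   f = open(path, 'w')        # reopening with 'w' erases the old contents
   f.close()
   size_after_rewrite = os.path.getsize(path)
   print(size_after_rewrite)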

Normally, files are opened in :dfn:`text mode`, which means that you read and
write strings from and to the file, which are encoded in a specific encoding
(the default encoding is platform dependent). ``'b'`` appended to the mode
opens the file in :dfn:`binary mode`: now the data is read and written in the
form of bytes objects. This mode should be used for all files that don't
contain text.

In text mode, the default is to convert platform-specific line endings (``\n``
on Unix, ``\r\n`` on Windows) to just ``\n`` on reading and ``\n`` back to
platform-specific line endings on writing. This behind-the-scenes modification
to file data is fine for text files, but will corrupt binary data like that in
:file:`JPEG` or :file:`EXE` files. Be very careful to use binary mode when
reading and writing such files.
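
The difference can be sketched like this (the file name and the sample text
are assumptions made for the example; the encoding is passed explicitly so the
result does not depend on the platform default)::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'modes.txt')

   f = open(path, 'w', encoding='utf-8')   # text mode: str is encoded
   f.write('caf\u00e9\n')
   f.close()

   f = open(path, 'rb')                    # binary mode: raw bytes
   raw = f.read()
   f.close()

   f = open(path, 'r', encoding='utf-8')   # text mode: bytes are decoded
   text = f.read()
   f.close()

   # The byte count differs from the character count: the accented
   # letter takes two bytes in UTF-8, and the line ending may too.
   print(type(raw).__name__, len(raw))
   print(type(text).__name__, len(text))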


.. _tut-filemethods:

Methods of File Objects
-----------------------

The rest of the examples in this section will assume that a file object called
``f`` has already been created.

To read a file's contents, call ``f.read(size)``, which reads some quantity of
data and returns it as a string or bytes object. *size* is an optional numeric
argument. When *size* is omitted or negative, the entire contents of the file
will be read and returned; it's your problem if the file is twice as large as
your machine's memory. Otherwise, at most *size* characters (in text mode) or
bytes (in binary mode) are read and returned. If the end of the file has been
reached, ``f.read()`` will return an empty string (``''``). ::

   >>> f.read()
   'This is the entire file.\n'
   >>> f.read()
   ''
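
When a file may be too large to read at once, passing *size* allows reading it
piece by piece. A rough sketch, in which the file contents and the chunk size
of 4 are arbitrary::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'chunks.txt')
   f = open(path, 'w')
   f.write('0123456789')
   f.close()

   chunks = []
   f = open(path)
   while True:
       chunk = f.read(4)      # at most 4 characters per call
       if chunk == '':        # an empty string signals end of file
           break
       chunks.append(chunk)
   f.close()
   print(chunks)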

``f.readline()`` reads a single line from the file; a newline character (``\n``)
is left at the end of the string, and is only omitted on the last line of the
file if the file doesn't end in a newline. This makes the return value
unambiguous; if ``f.readline()`` returns an empty string, the end of the file
has been reached, while a blank line is represented by ``'\n'``, a string
containing only a single newline. ::

   >>> f.readline()
   'This is the first line of the file.\n'
   >>> f.readline()
   'Second line of the file\n'
   >>> f.readline()
   ''

``f.readlines()`` returns a list containing all the lines of data in the file.
If given an optional parameter *sizehint*, it reads that many bytes from the
file and enough more to complete a line, and returns the lines from that. This
is often used to allow efficient reading of a large file by lines, but without
having to load the entire file in memory. Only complete lines will be returned.
::

   >>> f.readlines()
   ['This is the first line of the file.\n', 'Second line of the file\n']

An alternative approach to reading lines is to loop over the file object. This is
memory efficient, fast, and leads to simpler code::

   >>> for line in f:
   ...     print(line, end='')
   ...
   This is the first line of the file.
   Second line of the file

The alternative approach is simpler but does not provide as fine-grained
control. Since the two approaches manage line buffering differently, they
should not be mixed.

``f.write(string)`` writes the contents of *string* to the file, returning
the number of characters written. ::

   >>> f.write('This is a test\n')
   15

To write something other than a string, it needs to be converted to a string
first::

   >>> value = ('the answer', 42)
   >>> s = str(value)
   >>> f.write(s)
   18

``f.tell()`` returns an integer giving the file object's current position in the
file, measured in bytes from the beginning of the file. To change the file
object's position, use ``f.seek(offset, from_what)``. The position is computed
by adding *offset* to a reference point; the reference point is selected by
the *from_what* argument. A *from_what* value of 0 measures from the beginning
of the file, 1 uses the current file position, and 2 uses the end of the file as
the reference point. *from_what* can be omitted and defaults to 0, using the
beginning of the file as the reference point. ::

   >>> f = open('/tmp/workfile', 'rb+')
   >>> f.write(b'0123456789abcdef')
   16
   >>> f.seek(5)     # Go to the 6th byte in the file
   5
   >>> f.read(1)
   b'5'
   >>> f.seek(-3, 2) # Go to the 3rd byte before the end
   13
   >>> f.read(1)
   b'd'

In text files (those opened without a ``b`` in the mode string), only seeks
relative to the beginning of the file are allowed (the exception being seeking
to the very file end with ``seek(0, 2)``).
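
The reference points can also be spelled with the named constants from the
:mod:`os` module instead of bare integers. A short sketch, using an in-memory
:class:`io.BytesIO` object as a stand-in for a file opened in binary mode::

   import io
   import os

   f = io.BytesIO(b'0123456789abcdef')   # behaves like a binary file

   f.seek(5, os.SEEK_SET)     # same as f.seek(5, 0)
   first = f.read(1)          # reading advances the position to 6
   f.seek(2, os.SEEK_CUR)     # same as f.seek(2, 1): skip two bytes
   second = f.read(1)
   f.seek(-3, os.SEEK_END)    # same as f.seek(-3, 2)
   third = f.read(1)
   position = f.tell()
   print(first, second, third, position)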

When you're done with a file, call ``f.close()`` to close it and free up any
system resources taken up by the open file. After calling ``f.close()``,
attempts to use the file object will automatically fail. ::

   >>> f.close()
   >>> f.read()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: I/O operation on closed file

It is good practice to use the :keyword:`with` keyword when dealing with file
objects. This has the advantage that the file is properly closed after its
suite finishes, even if an exception is raised on the way. It is also much
shorter than writing equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

   >>> with open('/tmp/workfile', 'r') as f:
   ...     read_data = f.read()
   >>> f.closed
   True
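
For comparison, a sketch of roughly what the :keyword:`with` statement saves
you from writing by hand (the temporary file is created here only so that the
example is self-contained)::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'workfile')
   with open(path, 'w') as f:
       f.write('This is the entire file.\n')

   # Hand-written counterpart of "with open(path, 'r') as f: ..."
   f = open(path, 'r')
   try:
       read_data = f.read()
   finally:
       f.close()              # runs even if f.read() raised an exception

   print(f.closed, repr(read_data))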

File objects have some additional methods, such as :meth:`~file.isatty` and
:meth:`~file.truncate` which are less frequently used; consult the Library
Reference for a complete guide to file objects.


.. _tut-pickle:

The :mod:`pickle` Module
------------------------

.. index:: module: pickle

Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.

Rather than have users be constantly writing and debugging code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a byte string representation; this
process is called :dfn:`pickling`. Reconstructing the object from the byte
string representation is called :dfn:`unpickling`. Between pickling and
unpickling, the bytes representing the object may have been stored in a file,
or sent over a network connection to some distant machine.

If you have an object ``x``, and a file object ``f`` that's been opened for
writing in binary mode, the simplest way to pickle the object takes only one
line of code::

   pickle.dump(x, f)

To unpickle the object again, if ``f`` is a file object which has been opened
for reading in binary mode::

   x = pickle.load(f)

(There are other variants of this, used when pickling many objects or when you
don't want to write the pickled data to a file; consult the complete
documentation for :mod:`pickle` in the Python Library Reference.)
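
A self-contained sketch of a round trip (the sample object and the temporary
file name are assumptions made for this example)::

   import os
   import pickle
   import tempfile

   x = {'spam': [1, 2, 3], 'eggs': ('the answer', 42)}   # sample object
   path = os.path.join(tempfile.mkdtemp(), 'data.pickle')

   f = open(path, 'wb')       # pickle data is binary, hence 'wb'
   pickle.dump(x, f)
   f.close()

   f = open(path, 'rb')
   y = pickle.load(f)
   f.close()

   # dumps() and loads() do the same without a file, using a bytes object.
   z = pickle.loads(pickle.dumps(x))
   print(y == x, z == x, y is x)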

:mod:`pickle` is the standard way to make Python objects which can be stored and
reused by other programs or by a future invocation of the same program; the
technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is
so widely used, many authors who write Python extensions take care to ensure
that new data types such as matrices can be properly pickled and unpickled.
426