.. _tut-io:

****************
Input and Output
****************

There are several ways to present the output of a program; data can be printed
in a human-readable form, or written to a file for future use. This chapter will
discuss some of the possibilities.


.. _tut-formatting:

Fancier Output Formatting
=========================

So far we've encountered two ways of writing values: *expression statements* and
the :func:`print` function. (A third way is using the :meth:`write` method
of file objects; the standard output file can be referenced as ``sys.stdout``.
See the Library Reference for more information on this.)

.. index:: module: string

Often you'll want more control over the formatting of your output than simply
printing space-separated values. There are two ways to format your output. The
first way is to do all the string handling yourself; using string slicing and
concatenation operations you can create any layout you can imagine. String
objects have some useful methods for padding strings to a given column width;
these will be discussed shortly. The second way is to use the
:meth:`str.format` method.

The :mod:`string` module contains a :class:`Template` class which offers yet
another way to substitute values into strings.

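As a brief sketch of that alternative, :class:`Template` uses ``$``-based
placeholders that are filled in by its ``substitute`` method. The placeholder
names and values below are made up for illustration:

```python
from string import Template

# Placeholders are written as $name and filled in by keyword.
greeting = Template('$who likes $what')
result = greeting.substitute(who='tim', what='kung pao')
print(result)
```
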
One question remains, of course: how do you convert values to strings? Luckily,
Python has ways to convert any value to a string: pass it to the :func:`repr`
or :func:`str` functions.

The :func:`str` function is meant to return representations of values which are
fairly human-readable, while :func:`repr` is meant to generate representations
which can be read by the interpreter (or will force a :exc:`SyntaxError` if
there is no equivalent syntax). For objects which don't have a particular
representation for human consumption, :func:`str` will return the same value as
:func:`repr`. Many values, such as numbers or structures like lists and
dictionaries, have the same representation using either function. Strings and
floating point numbers, in particular, have two distinct representations.

Some examples::

   >>> s = 'Hello, world.'
   >>> str(s)
   'Hello, world.'
   >>> repr(s)
   "'Hello, world.'"
   >>> str(0.1)
   '0.1'
   >>> repr(0.1)
   '0.10000000000000001'
   >>> x = 10 * 3.25
   >>> y = 200 * 200
   >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
   >>> print(s)
   The value of x is 32.5, and y is 40000...
   >>> # The repr() of a string adds string quotes and backslashes:
   ... hello = 'hello, world\n'
   >>> hellos = repr(hello)
   >>> print(hellos)
   'hello, world\n'
   >>> # The argument to repr() may be any Python object:
   ... repr((x, y, ('spam', 'eggs')))
   "(32.5, 40000, ('spam', 'eggs'))"

Here are two ways to write a table of squares and cubes::

   >>> for x in range(1, 11):
   ...     print(repr(x).rjust(2), repr(x*x).rjust(3), end=' ')
   ...     # Note use of 'end' on previous line
   ...     print(repr(x*x*x).rjust(4))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

   >>> for x in range(1, 11):
   ...     print('{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

(Note that in the first example, one space between each column was added by the
way :func:`print` works: it always adds spaces between its arguments.)

This example demonstrates the :meth:`rjust` method of string objects, which
right-justifies a string in a field of a given width by padding it with spaces
on the left. There are similar methods :meth:`ljust` and :meth:`center`. These
methods do not write anything; they just return a new string. If the input
string is too long, they don't truncate it, but return it unchanged; this will
mess up your column layout, but that's usually better than the alternative,
which would be lying about a value. (If you really want truncation you can
always add a slice operation, as in ``x.ljust(n)[:n]``.)

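The following sketch puts the three padding methods side by side and shows the
slicing trick for forced truncation. The sample words and field widths are
arbitrary choices for illustration:

```python
# Pad the same word three ways in a field of width 10.
word = 'spam'
left = word.ljust(10)
right = word.rjust(10)
middle = word.center(10)
print(repr(left), repr(right), repr(middle))

# A string longer than the field is returned unchanged ...
long_word = 'lumberjack'
unchanged = long_word.ljust(4)
# ... unless you slice the result to force truncation.
truncated = long_word.ljust(4)[:4]
print(unchanged, truncated)
```
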
There is another method, :meth:`zfill`, which pads a numeric string on the left
with zeros. It understands about plus and minus signs::

   >>> '12'.zfill(5)
   '00012'
   >>> '-3.14'.zfill(7)
   '-003.14'
   >>> '3.14159265359'.zfill(5)
   '3.14159265359'

Basic usage of the :meth:`str.format` method looks like this::

   >>> print('We are the {0} who say "{1}!"'.format('knights', 'Ni'))
   We are the knights who say "Ni!"

The brackets and characters within them (called format fields) are replaced with
the objects passed into the format method. The number in the brackets refers to
the position of the object passed into the format method. ::

   >>> print('{0} and {1}'.format('spam', 'eggs'))
   spam and eggs
   >>> print('{1} and {0}'.format('spam', 'eggs'))
   eggs and spam

If keyword arguments are used in the format method, their values are referred to
by using the name of the argument. ::

   >>> print('This {food} is {adjective}.'.format(
   ...       food='spam', adjective='absolutely horrible'))
   This spam is absolutely horrible.

Positional and keyword arguments can be arbitrarily combined::

   >>> print('The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
   ...                                                    other='Georg'))
   The story of Bill, Manfred, and Georg.

An optional ``':'`` and format specifier can follow the field name. This allows
greater control over how the value is formatted. The following example
rounds pi to three places after the decimal point::

   >>> import math
   >>> print('The value of PI is approximately {0:.3f}.'.format(math.pi))
   The value of PI is approximately 3.142.

Passing an integer after the ``':'`` will cause that field to be a minimum
number of characters wide. This is useful for making tables pretty. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
   >>> for name, phone in table.items():
   ...     print('{0:10} ==> {1:10d}'.format(name, phone))
   ...
   Jack       ==>       4098
   Dcab       ==>       7678
   Sjoerd     ==>       4127

If you have a really long format string that you don't want to split up, it
would be nice if you could reference the variables to be formatted by name
instead of by position. This can be done by simply passing the dict and using
square brackets ``'[]'`` to access the keys::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
   ...       'Dcab: {0[Dcab]:d}'.format(table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the
``**`` notation::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in :func:`vars`
function, which returns a dictionary containing all local variables.

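A minimal sketch of that combination follows. The function name ``describe``
and its local variables are hypothetical; the point is only that the names of
the local variables become the field names in the format string:

```python
def describe():
    # Local names become dictionary keys in vars().
    food = 'spam'
    adjective = 'absolutely horrible'
    return 'This {food} is {adjective}.'.format(**vars())

message = describe()
print(message)
```
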
For a complete overview of string formatting with :meth:`str.format`, see
:ref:`formatstrings`.


Old string formatting
---------------------

The ``%`` operator can also be used for string formatting. It interprets the
left argument much like a :cfunc:`sprintf`\ -style format string to be applied
to the right argument, and returns the string resulting from this formatting
operation. For example::

   >>> import math
   >>> print('The value of PI is approximately %5.3f.' % math.pi)
   The value of PI is approximately 3.142.

Since :meth:`str.format` is quite new, a lot of Python code still uses the ``%``
operator. However, because this old style of formatting will eventually be
removed from the language, :meth:`str.format` should generally be used.

More information can be found in the :ref:`old-string-formatting` section.


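When porting old code, it can help to see both styles next to each other. The
sketch below formats the same value with the ``%`` operator and with
:meth:`str.format`; the variable names are illustrative only:

```python
import math

# The same value formatted with the % operator and with str.format().
old_style = 'The value of PI is approximately %5.3f.' % math.pi
new_style = 'The value of PI is approximately {0:5.3f}.'.format(math.pi)
print(old_style)
print(new_style)
```
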
.. _tut-files:

Reading and Writing Files
=========================

.. index::
   builtin: open
   object: file

:func:`open` returns a file object, and is most commonly used with two
arguments: ``open(filename, mode)``.

::

   >>> f = open('/tmp/workfile', 'w')
   >>> print(f)
   <open file '/tmp/workfile', mode 'w' at 80a0960>

The first argument is a string containing the filename. The second argument is
another string containing a few characters describing the way in which the file
will be used. *mode* can be ``'r'`` when the file will only be read, ``'w'``
for only writing (an existing file with the same name will be erased), and
``'a'`` opens the file for appending; any data written to the file is
automatically added to the end. ``'r+'`` opens the file for both reading and
writing. The *mode* argument is optional; ``'r'`` will be assumed if it's
omitted.

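The effect of the modes can be tried out on a scratch file. This is only a
sketch: the temporary directory and the file name ``workfile`` are assumptions
made for the example, and it uses the :keyword:`with` statement so that each
file is closed before it is reopened:

```python
import os
import tempfile

# A throwaway path; the file name is made up for this example.
path = os.path.join(tempfile.mkdtemp(), 'workfile')

with open(path, 'w') as f:      # 'w' creates the file or erases an existing one
    f.write('first\n')
with open(path, 'a') as f:      # 'a' adds to the end
    f.write('second\n')
with open(path) as f:           # mode omitted, so 'r' is assumed
    contents = f.read()
with open(path, 'w') as f:      # reopening with 'w' discards the old contents
    pass
with open(path) as f:
    after_truncate = f.read()

print(repr(contents), repr(after_truncate))
```
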
On Windows and the Macintosh, ``'b'`` appended to the mode opens the file in
binary mode, so there are also modes like ``'rb'``, ``'wb'``, and ``'r+b'``.
Windows makes a distinction between text and binary files; the end-of-line
characters in text files are automatically altered slightly when data is read or
written. This behind-the-scenes modification to file data is fine for ASCII
text files, but it'll corrupt binary data like that in :file:`JPEG` or
:file:`EXE` files. Be very careful to use binary mode when reading and writing
such files. On Unix, it doesn't hurt to append a ``'b'`` to the mode, so
you can use it platform-independently for all binary files.


.. _tut-filemethods:

Methods of File Objects
-----------------------

The rest of the examples in this section will assume that a file object called
``f`` has already been created.

To read a file's contents, call ``f.read(size)``, which reads some quantity of
data and returns it as a string. *size* is an optional numeric argument. When
*size* is omitted or negative, the entire contents of the file will be read and
returned; it's your problem if the file is twice as large as your machine's
memory. Otherwise, at most *size* bytes are read and returned. If the end of
the file has been reached, ``f.read()`` will return an empty string (``""``).
::

   >>> f.read()
   'This is the entire file.\n'
   >>> f.read()
   ''

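One way to avoid loading a large file at once is to call ``f.read(size)``
repeatedly until it returns an empty string. In the sketch below an
``io.StringIO`` object stands in for a real file so that the example is
self-contained, and the chunk size of 8 is an arbitrary choice:

```python
import io

# io.StringIO stands in for a real file so the example is self-contained.
f = io.StringIO('This is the entire file.\n')

chunks = []
while True:
    chunk = f.read(8)       # at most 8 characters per call
    if chunk == '':         # an empty string means end of file
        break
    chunks.append(chunk)

print(chunks)
```
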
``f.readline()`` reads a single line from the file; a newline character (``\n``)
is left at the end of the string, and is only omitted on the last line of the
file if the file doesn't end in a newline. This makes the return value
unambiguous; if ``f.readline()`` returns an empty string, the end of the file
has been reached, while a blank line is represented by ``'\n'``, a string
containing only a single newline. ::

   >>> f.readline()
   'This is the first line of the file.\n'
   >>> f.readline()
   'Second line of the file\n'
   >>> f.readline()
   ''

``f.readlines()`` returns a list containing all the lines of data in the file.
If given an optional parameter *sizehint*, it reads that many bytes from the
file and enough more to complete a line, and returns the lines from that. This
is often used to allow efficient reading of a large file by lines, but without
having to load the entire file in memory. Only complete lines will be returned.
::

   >>> f.readlines()
   ['This is the first line of the file.\n', 'Second line of the file\n']

An alternative approach to reading lines is to loop over the file object. This is
memory efficient, fast, and leads to simpler code::

   >>> for line in f:
   ...     print(line, end='')
   ...
   This is the first line of the file.
   Second line of the file

The alternative approach is simpler but does not provide as fine-grained
control. Since the two approaches manage line buffering differently, they
should not be mixed.

``f.write(string)`` writes the contents of *string* to the file, returning
the number of characters written. ::

   >>> f.write('This is a test\n')
   15

To write something other than a string, it needs to be converted to a string
first::

   >>> value = ('the answer', 42)
   >>> s = str(value)
   >>> f.write(s)
   18

``f.tell()`` returns an integer giving the file object's current position in the
file, measured in bytes from the beginning of the file. To change the file
object's position, use ``f.seek(offset, from_what)``. The position is computed
by adding *offset* to a reference point; the reference point is selected by
the *from_what* argument. A *from_what* value of 0 measures from the beginning
of the file, 1 uses the current file position, and 2 uses the end of the file as
the reference point. *from_what* can be omitted and defaults to 0, using the
beginning of the file as the reference point. ::

   >>> f = open('/tmp/workfile', 'rb+')
   >>> f.write(b'0123456789abcdef')
   16
   >>> f.seek(5)      # Go to the 6th byte in the file
   5
   >>> f.read(1)
   b'5'
   >>> f.seek(-3, 2)  # Go to the 3rd byte before the end
   13
   >>> f.read(1)
   b'd'

In text files (those opened without a ``b`` in the mode string), only seeks
relative to the beginning of the file are allowed (the exception being seeking
to the very file end with ``seek(0, 2)``).

When you're done with a file, call ``f.close()`` to close it and free up any
system resources taken up by the open file. After calling ``f.close()``,
attempts to use the file object will automatically fail. ::

   >>> f.close()
   >>> f.read()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   ValueError: I/O operation on closed file

It is good practice to use the :keyword:`with` keyword when dealing with file
objects. This has the advantage that the file is properly closed after its
suite finishes, even if an exception is raised on the way. It is also much
shorter than writing equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

   >>> with open('/tmp/workfile', 'r') as f:
   ...     read_data = f.read()
   >>> f.closed
   True

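For comparison, here is roughly what the longer :keyword:`try`\ -\
:keyword:`finally` form looks like. It is a sketch rather than an exact
expansion of :keyword:`with`, and the scratch file it reads is created only for
this example:

```python
import os
import tempfile

# Hypothetical scratch file, created so there is something to read.
path = os.path.join(tempfile.mkdtemp(), 'workfile')
with open(path, 'w') as f:
    f.write('some data\n')

# Roughly what the with statement does for you:
f = open(path, 'r')
try:
    read_data = f.read()
finally:
    f.close()   # runs even if read() raises an exception

print(f.closed, repr(read_data))
```
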
File objects have some additional methods, such as :meth:`isatty` and
:meth:`truncate` which are less frequently used; consult the Library Reference
for a complete guide to file objects.


.. _tut-pickle:

The :mod:`pickle` Module
------------------------

.. index:: module: pickle

Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.

Rather than have users be constantly writing and debugging code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a string representation; this process
is called :dfn:`pickling`. Reconstructing the object from the string
representation is called :dfn:`unpickling`. Between pickling and unpickling,
the string representing the object may have been stored in a file or data, or
sent over a network connection to some distant machine.

If you have an object ``x``, and a file object ``f`` that's been opened for
writing in binary mode, the simplest way to pickle the object takes only one
line of code::

   pickle.dump(x, f)

To unpickle the object again, if ``f`` is a file object which has been opened
for reading in binary mode::

   x = pickle.load(f)

(There are other variants of this, used when pickling many objects or when you
don't want to write the pickled data to a file; consult the complete
documentation for :mod:`pickle` in the Python Library Reference.)

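Put together, a complete round trip might look like the following sketch. The
sample object and the file name ``data.pickle`` are invented for the example;
``pickle.dumps`` and ``pickle.loads`` are the variants that work without a
file:

```python
import os
import pickle
import tempfile

# Hypothetical data and scratch file for the round trip.
x = {'name': 'Sjoerd', 'phones': [4127, 4098], 'pair': ('the answer', 42)}
path = os.path.join(tempfile.mkdtemp(), 'data.pickle')

with open(path, 'wb') as f:     # pickle data is binary
    pickle.dump(x, f)
with open(path, 'rb') as f:
    restored = pickle.load(f)

# dumps()/loads() do the same without a file.
also_restored = pickle.loads(pickle.dumps(x))
print(restored == x, also_restored == x)
```
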
:mod:`pickle` is the standard way to make Python objects which can be stored and
reused by other programs or by a future invocation of the same program; the
technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is
so widely used, many authors who write Python extensions take care to ensure
that new data types such as matrices can be properly pickled and unpickled.
