.. _tut-io:

****************
Input and Output
****************

There are several ways to present the output of a program; data can be printed
in a human-readable form, or written to a file for future use. This chapter will
discuss some of the possibilities.


.. _tut-formatting:

Fancier Output Formatting
=========================

So far we've encountered two ways of writing values: *expression statements* and
the :keyword:`print` statement. (A third way is using the :meth:`write` method
of file objects; the standard output file can be referenced as ``sys.stdout``.
See the Library Reference for more information on this.)

.. index:: module: string

Often you'll want more control over the formatting of your output than simply
printing space-separated values. There are two ways to format your output. The
first way is to do all the string handling yourself; using string slicing and
concatenation operations you can create any layout you can imagine. String
objects have some methods that perform useful operations for padding strings to
a given column width; these will be discussed shortly. The second way is to use
the :meth:`str.format` method.

One question remains, of course: how do you convert values to strings? Luckily,
Python has ways to convert any value to a string: pass it to the :func:`repr`
or :func:`str` functions.

The :func:`str` function is meant to return representations of values which are
fairly human-readable, while :func:`repr` is meant to generate representations
which can be read by the interpreter (or will force a :exc:`SyntaxError` if
there is no equivalent syntax). For objects which don't have a particular
representation for human consumption, :func:`str` will return the same value as
:func:`repr`. Many values, such as numbers or structures like lists and
dictionaries, have the same representation using either function. Strings and
floating point numbers, in particular, have two distinct representations.

Some examples::

   >>> s = 'Hello, world.'
   >>> str(s)
   'Hello, world.'
   >>> repr(s)
   "'Hello, world.'"
   >>> str(0.1)
   '0.1'
   >>> repr(0.1)
   '0.10000000000000001'
   >>> x = 10 * 3.25
   >>> y = 200 * 200
   >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
   >>> print s
   The value of x is 32.5, and y is 40000...
   >>> # The repr() of a string adds string quotes and backslashes:
   ... hello = 'hello, world\n'
   >>> hellos = repr(hello)
   >>> print hellos
   'hello, world\n'
   >>> # The argument to repr() may be any Python object:
   ... repr((x, y, ('spam', 'eggs')))
   "(32.5, 40000, ('spam', 'eggs'))"

Here are two ways to write a table of squares and cubes::

   >>> for x in range(1, 11):
   ...     print repr(x).rjust(2), repr(x*x).rjust(3),
   ...     # Note trailing comma on previous line
   ...     print repr(x*x*x).rjust(4)
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

   >>> for x in range(1,11):
   ...     print '{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x)
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

(Note that in the first example, one space between each column was added by the
way :keyword:`print` works: it always adds spaces between its arguments.)

This example demonstrates the :meth:`rjust` method of string objects, which
right-justifies a string in a field of a given width by padding it with spaces
on the left. There are similar methods :meth:`ljust` and :meth:`center`. These
methods do not write anything; they just return a new string. If the input
string is too long, they don't truncate it, but return it unchanged; this will
mess up your column layout, but that's usually better than the alternative,
which would be lying about a value. (If you really want truncation you can
always add a slice operation, as in ``x.ljust(n)[:n]``.)

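As a rough illustration of the padding methods just described, the sketch below
applies :meth:`ljust`, :meth:`center` and :meth:`rjust` to the same string and
shows the slice trick for truncation. The sample strings and variable names are
made up for the example. ``print(...)`` is called with a single argument so the
snippet behaves the same whether ``print`` is a statement or a function.

```python
# Hypothetical sample data, chosen only to show the padding methods.
name = 'spam'
long_name = 'spam, eggs and bacon'

left = name.ljust(8)        # pad with spaces on the right
centered = name.center(8)   # pad with spaces on both sides
right = name.rjust(8)       # pad with spaces on the left

# A string longer than the field width is returned unchanged ...
unchanged = long_name.ljust(8)
# ... unless a slice is added to force truncation to the field width.
truncated = long_name.ljust(8)[:8]

print(repr(left))       # 'spam    '
print(repr(centered))   # '  spam  '
print(repr(right))      # '    spam'
print(repr(truncated))  # 'spam, eg'
```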
There is another method, :meth:`zfill`, which pads a numeric string on the left
with zeros. It understands plus and minus signs::

   >>> '12'.zfill(5)
   '00012'
   >>> '-3.14'.zfill(7)
   '-003.14'
   >>> '3.14159265359'.zfill(5)
   '3.14159265359'

Basic usage of the :meth:`str.format` method looks like this::

   >>> print 'We are the {0} who say "{1}!"'.format('knights', 'Ni')
   We are the knights who say "Ni!"

The brackets and characters within them (called format fields) are replaced with
the objects passed into the format method. The number in the brackets refers to
the position of the object passed into the format method. ::

   >>> print '{0} and {1}'.format('spam', 'eggs')
   spam and eggs
   >>> print '{1} and {0}'.format('spam', 'eggs')
   eggs and spam

If keyword arguments are used in the format method, their values are referred to
by using the name of the argument. ::

   >>> print 'This {food} is {adjective}.'.format(
   ...       food='spam', adjective='absolutely horrible')
   This spam is absolutely horrible.

Positional and keyword arguments can be arbitrarily combined::

   >>> print 'The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
   ...                                                    other='Georg')
   The story of Bill, Manfred, and Georg.

An optional ``':'`` and format specifier can follow the field name. This allows
greater control over how the value is formatted. The following example
rounds Pi to three places after the decimal point::

   >>> import math
   >>> print 'The value of PI is approximately {0:.3f}.'.format(math.pi)
   The value of PI is approximately 3.142.

Passing an integer after the ``':'`` will cause that field to be a minimum
number of characters wide. This is useful for making tables pretty. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
   >>> for name, phone in table.items():
   ...     print '{0:10} ==> {1:10d}'.format(name, phone)
   ...
   Jack       ==>       4098
   Dcab       ==>       7678
   Sjoerd     ==>       4127

If you have a really long format string that you don't want to split up, it
would be nice if you could reference the variables to be formatted by name
instead of by position. This can be done by simply passing the dict and using
square brackets ``'[]'`` to access the keys. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print ('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
   ...        'Dcab: {0[Dcab]:d}'.format(table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the
``**`` notation. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print 'Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table)
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in :func:`vars`
function, which, when called without arguments, returns a dictionary containing
all local variables.

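A minimal sketch of the :func:`vars` combination mentioned above: inside a
function, the local variable names become usable as field names. The function
``describe`` and its variables are hypothetical, introduced only for this
illustration.

```python
def describe():
    # Hypothetical local variables, used only for illustration.
    food = 'spam'
    adjective = 'absolutely horrible'
    # vars() with no arguments returns the local namespace as a dictionary,
    # so ** unpacks each local variable as a keyword argument.
    return 'This {food} is {adjective}.'.format(**vars())

message = describe()
print(message)  # This spam is absolutely horrible.
```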
For a complete overview of string formatting with :meth:`str.format`, see
:ref:`formatstrings`.


Old string formatting
---------------------

The ``%`` operator can also be used for string formatting. It interprets the
left argument much like a :cfunc:`sprintf`\ -style format string to be applied
to the right argument, and returns the string resulting from this formatting
operation. For example::

   >>> import math
   >>> print 'The value of PI is approximately %5.3f.' % math.pi
   The value of PI is approximately 3.142.

Since :meth:`str.format` is quite new, a lot of Python code still uses the ``%``
operator. However, because this old style of formatting will eventually be
removed from the language, :meth:`str.format` should generally be used.

More information can be found in the :ref:`string-formatting` section.


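To make the relationship between the two styles concrete, the sketch below
formats the same values once with ``%`` and once with :meth:`str.format`. The
variable names and the ``'basket'`` example are made up; only the two
formatting mechanisms themselves come from the text above.

```python
import math

# The same float formatted with each style; both use width 5, precision 3.
old_style = 'The value of PI is approximately %5.3f.' % math.pi
new_style = 'The value of PI is approximately {0:5.3f}.'.format(math.pi)

# With the % operator, several values are passed as a tuple.
old_pair = '%s has %d items' % ('basket', 3)
new_pair = '{0} has {1:d} items'.format('basket', 3)

print(old_style)  # The value of PI is approximately 3.142.
print(old_pair)   # basket has 3 items
```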
.. _tut-files:

Reading and Writing Files
=========================

.. index::
   builtin: open
   object: file

:func:`open` returns a file object, and is most commonly used with two
arguments: ``open(filename, mode)``.

::

   >>> f = open('/tmp/workfile', 'w')
   >>> print f
   <open file '/tmp/workfile', mode 'w' at 80a0960>

The first argument is a string containing the filename. The second argument is
another string containing a few characters describing the way in which the file
will be used. *mode* can be ``'r'`` when the file will only be read, ``'w'``
for only writing (an existing file with the same name will be erased), or
``'a'`` for appending; any data written to the file is then automatically added
to the end. ``'r+'`` opens the file for both reading and writing. The *mode*
argument is optional; ``'r'`` will be assumed if it's omitted.

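The effect of each mode can be seen in a short experiment. This is only a
sketch: the scratch file is created in a temporary directory obtained from
:mod:`tempfile`, and the path and variable names are assumptions made for the
example.

```python
import os
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'workfile')

f = open(path, 'w')     # 'w' creates the file, or erases an existing one
f.write('first line\n')
f.close()

f = open(path, 'a')     # 'a' adds new data to the end
f.write('second line\n')
f.close()

f = open(path)          # mode omitted, so 'r' is assumed
contents = f.read()
f.close()

f = open(path, 'r+')    # 'r+' allows reading and writing without erasing
f.write('FIRST')        # overwrites the first five characters in place
f.seek(0)
updated = f.read()
f.close()
```

After this runs, ``contents`` holds both lines as written, and ``updated``
differs only in its first five characters.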
On Windows, ``'b'`` appended to the mode opens the file in binary mode, so there
are also modes like ``'rb'``, ``'wb'``, and ``'r+b'``. Windows makes a
distinction between text and binary files; the end-of-line characters in text
files are automatically altered slightly when data is read or written. This
behind-the-scenes modification to file data is fine for ASCII text files, but
it'll corrupt binary data like that in :file:`JPEG` or :file:`EXE` files. Be
very careful to use binary mode when reading and writing such files. On Unix,
it doesn't hurt to append a ``'b'`` to the mode, so you can use it
platform-independently for all binary files.


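A small sketch of the platform-independent habit recommended above: data that
is not text is written with ``'wb'`` and read back with ``'rb'``, so the bytes
come back exactly as stored. The file name and the payload are invented for the
example; the payload deliberately contains both ``\r\n`` and ``\n`` sequences,
which are the bytes a text-mode translation could alter.

```python
import os
import tempfile

# Hypothetical scratch file in a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'data.bin')

# Made-up binary payload containing both line-ending conventions.
payload = b'\x89PNG\r\n\x1a\n\x00\xff'

f = open(path, 'wb')    # binary mode: bytes are written untouched
f.write(payload)
f.close()

f = open(path, 'rb')    # binary mode: bytes are read back untouched
data = f.read()
f.close()
```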
.. _tut-filemethods:

Methods of File Objects
-----------------------

The rest of the examples in this section will assume that a file object called
``f`` has already been created.

To read a file's contents, call ``f.read(size)``, which reads some quantity of
data and returns it as a string. *size* is an optional numeric argument. When
*size* is omitted or negative, the entire contents of the file will be read and
returned; it's your problem if the file is twice as large as your machine's
memory. Otherwise, at most *size* bytes are read and returned. If the end of
the file has been reached, ``f.read()`` will return an empty string (``""``).
::

   >>> f.read()
   'This is the entire file.\n'
   >>> f.read()
   ''

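One common way to use the *size* argument is to process a file in fixed-size
pieces, stopping when ``f.read(size)`` returns an empty string. The sketch
below assumes a made-up 100-character scratch file and a chunk size of 32; both
are arbitrary choices for illustration.

```python
import os
import tempfile

# Hypothetical scratch file holding 100 characters.
path = os.path.join(tempfile.mkdtemp(), 'workfile')
f = open(path, 'w')
f.write('0123456789' * 10)
f.close()

chunks = []
f = open(path, 'r')
while True:
    chunk = f.read(32)    # returns at most 32 characters
    if chunk == '':       # an empty string signals the end of the file
        break
    chunks.append(chunk)
f.close()

sizes = [len(c) for c in chunks]  # three full chunks and a remainder of 4
```

Only one chunk is held in memory at a time inside the loop, which is the point
of passing *size* when the file may be large.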
``f.readline()`` reads a single line from the file; a newline character (``\n``)
is left at the end of the string, and is only omitted on the last line of the
file if the file doesn't end in a newline. This makes the return value
unambiguous; if ``f.readline()`` returns an empty string, the end of the file
has been reached, while a blank line is represented by ``'\n'``, a string
containing only a single newline. ::

   >>> f.readline()
   'This is the first line of the file.\n'
   >>> f.readline()
   'Second line of the file\n'
   >>> f.readline()
   ''

``f.readlines()`` returns a list containing all the lines of data in the file.
If given an optional parameter *sizehint*, it reads that many bytes from the
file and enough more to complete a line, and returns the lines from that. This
is often used to allow efficient reading of a large file by lines, but without
having to load the entire file in memory. Only complete lines will be returned.
::

   >>> f.readlines()
   ['This is the first line of the file.\n', 'Second line of the file\n']

An alternative approach to reading lines is to loop over the file object. This is
memory efficient, fast, and leads to simpler code::

   >>> for line in f:
   ...     print line,
   ...
   This is the first line of the file.
   Second line of the file

Looping over the file object is simpler than calling ``f.readline()`` or
``f.readlines()``, but does not provide as fine-grained control. Since the two
approaches manage line buffering differently, they should not be mixed.

``f.write(string)`` writes the contents of *string* to the file, returning
``None``. ::

   >>> f.write('This is a test\n')

To write something other than a string, it needs to be converted to a string
first::

   >>> value = ('the answer', 42)
   >>> s = str(value)
   >>> f.write(s)

``f.tell()`` returns an integer giving the file object's current position in the
file, measured in bytes from the beginning of the file. To change the file
object's position, use ``f.seek(offset, from_what)``. The position is computed
by adding *offset* to a reference point; the reference point is selected by
the *from_what* argument. A *from_what* value of 0 measures from the beginning
of the file, 1 uses the current file position, and 2 uses the end of the file as
the reference point. *from_what* can be omitted and defaults to 0, using the
beginning of the file as the reference point. ::

   >>> f = open('/tmp/workfile', 'r+')
   >>> f.write('0123456789abcdef')
   >>> f.seek(5)     # Go to the 6th byte in the file
   >>> f.read(1)
   '5'
   >>> f.seek(-3, 2) # Go to the 3rd byte before the end
   >>> f.read(1)
   'd'

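The session above uses reference points 0 and 2. The following sketch adds a
seek relative to the current position (reference point 1) and checks the result
with ``f.tell()``. Opening the file in binary mode is an assumption made here
so that offsets are exact byte counts on every platform; the scratch path is
likewise invented for the example.

```python
import os
import tempfile

# Hypothetical scratch file with known contents.
path = os.path.join(tempfile.mkdtemp(), 'workfile')
f = open(path, 'wb')
f.write(b'0123456789abcdef')
f.close()

f = open(path, 'rb')
f.seek(4)               # reference point 0: 4 bytes from the beginning
first = f.read(2)       # reads b'45'; the position is now 6
f.seek(3, 1)            # reference point 1: 3 bytes past the current position
second = f.read(1)      # reads b'9'
position = f.tell()     # 10
f.seek(-1, 2)           # reference point 2: 1 byte before the end
last = f.read(1)        # reads b'f'
f.close()
```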
When you're done with a file, call ``f.close()`` to close it and free up any
system resources taken up by the open file. After calling ``f.close()``,
attempts to use the file object will automatically fail. ::

   >>> f.close()
   >>> f.read()
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   ValueError: I/O operation on closed file

It is good practice to use the :keyword:`with` keyword when dealing with file
objects. This has the advantage that the file is properly closed after its
suite finishes, even if an exception is raised on the way. It is also much
shorter than writing equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

   >>> with open('/tmp/workfile', 'r') as f:
   ...     read_data = f.read()
   >>> f.closed
   True

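For comparison, this is roughly what the longer :keyword:`try`\ -\
:keyword:`finally` form looks like. It is a sketch of the general pattern
rather than an exact expansion of the :keyword:`with` statement, and the
scratch file is an assumption made so the snippet can run on its own.

```python
import os
import tempfile

# Hypothetical scratch file so that there is something to read.
path = os.path.join(tempfile.mkdtemp(), 'workfile')
f = open(path, 'w')
f.write('This is the entire file.\n')
f.close()

f = open(path, 'r')
try:
    read_data = f.read()
finally:
    # Runs whether or not f.read() raised an exception.
    f.close()

closed_after = f.closed   # True
```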
File objects have some additional methods, such as :meth:`isatty` and
:meth:`truncate`, which are less frequently used; consult the Library Reference
for a complete guide to file objects.


.. _tut-pickle:

The :mod:`pickle` Module
------------------------

.. index:: module: pickle

Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.

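The manual approach for numbers can be sketched as follows: each number is
converted with :func:`str` on the way out and with :func:`int` on the way back
in. The one-number-per-line layout, the scratch file and the sample values are
all choices made for this illustration, not a prescribed format.

```python
import os
import tempfile

# Hypothetical scratch file and sample values.
path = os.path.join(tempfile.mkdtemp(), 'numbers.txt')
numbers = [4127, 4098, 7678]

with open(path, 'w') as f:
    for n in numbers:
        f.write(str(n) + '\n')   # write() only accepts strings

with open(path, 'r') as f:
    # int() ignores the trailing newline left on each line.
    restored = [int(line) for line in f]
```

Even this simple case needs a convention for separating the values, which hints
at why nested structures quickly become tedious to save by hand.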
Rather than have users constantly writing and debugging code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a string representation; this process
is called :dfn:`pickling`. Reconstructing the object from the string
representation is called :dfn:`unpickling`. Between pickling and unpickling,
the string representing the object may have been stored in a file, or
sent over a network connection to some distant machine.

If you have an object ``x``, and a file object ``f`` that's been opened for
writing, the simplest way to pickle the object takes only one line of code::

   pickle.dump(x, f)

To unpickle the object again, if ``f`` is a file object which has been opened
for reading::

   x = pickle.load(f)

(There are other variants of this, used when pickling many objects or when you
don't want to write the pickled data to a file; consult the complete
documentation for :mod:`pickle` in the Python Library Reference.)

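Putting the two calls together, a complete round trip might look like the
sketch below. The sample object, the scratch path and the variable names are
invented for the example, and opening the file in binary mode is an assumption
made here so that the pickled data is stored unaltered on every platform. The
last two lines show one of the other variants, :func:`pickle.dumps` and
:func:`pickle.loads`, which work on a string instead of a file.

```python
import os
import pickle
import tempfile

# Hypothetical scratch file and sample object.
path = os.path.join(tempfile.mkdtemp(), 'data.pickle')
x = {'name': 'Sjoerd', 'phones': [4127, 4098], 'point': (1.5, -2)}

with open(path, 'wb') as f:
    pickle.dump(x, f)        # pickling: object -> file

with open(path, 'rb') as f:
    y = pickle.load(f)       # unpickling: file -> a new, equal object

# Variant that does not involve a file at all.
s = pickle.dumps(x)
z = pickle.loads(s)
```

The unpickled objects compare equal to ``x`` but are separate copies, so
changing ``y`` afterwards does not affect ``x``.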
:mod:`pickle` is the standard way to make Python objects which can be stored and
reused by other programs or by a future invocation of the same program; the
technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is
so widely used, many authors who write Python extensions take care to ensure
that new data types such as matrices can be properly pickled and unpickled.

407