.. _tut-io:

****************
Input and Output
****************

There are several ways to present the output of a program; data can be printed
in a human-readable form, or written to a file for future use. This chapter will
discuss some of the possibilities.


.. _tut-formatting:

Fancier Output Formatting
=========================

So far we've encountered two ways of writing values: *expression statements* and
the :keyword:`print` statement. (A third way is using the :meth:`write` method
of file objects; the standard output file can be referenced as ``sys.stdout``.
See the Library Reference for more information on this.)

.. index:: module: string

Often you'll want more control over the formatting of your output than simply
printing space-separated values. There are two ways to format your output. The
first way is to do all the string handling yourself; using string slicing and
concatenation operations you can create any layout you can imagine. The
standard module :mod:`string` contains some useful operations for padding
strings to a given column width; these will be discussed shortly. The second
way is to use the :meth:`str.format` method.

One question remains, of course: how do you convert values to strings? Luckily,
Python has ways to convert any value to a string: pass it to the :func:`repr`
or :func:`str` functions.

The :func:`str` function is meant to return representations of values which are
fairly human-readable, while :func:`repr` is meant to generate representations
which can be read by the interpreter (or will force a :exc:`SyntaxError` if
there is no equivalent syntax). For objects which don't have a particular
representation for human consumption, :func:`str` will return the same value as
:func:`repr`. Many values, such as numbers or structures like lists and
dictionaries, have the same representation using either function. Strings and
floating point numbers, in particular, have two distinct representations.

Some examples::

   >>> s = 'Hello, world.'
   >>> str(s)
   'Hello, world.'
   >>> repr(s)
   "'Hello, world.'"
   >>> str(1.0/7.0)
   '0.142857142857'
   >>> repr(1.0/7.0)
   '0.14285714285714285'
   >>> x = 10 * 3.25
   >>> y = 200 * 200
   >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
   >>> print s
   The value of x is 32.5, and y is 40000...
   >>> # The repr() of a string adds string quotes and backslashes:
   ... hello = 'hello, world\n'
   >>> hellos = repr(hello)
   >>> print hellos
   'hello, world\n'
   >>> # The argument to repr() may be any Python object:
   ... repr((x, y, ('spam', 'eggs')))
   "(32.5, 40000, ('spam', 'eggs'))"

Here are two ways to write a table of squares and cubes::

   >>> for x in range(1, 11):
   ...     print repr(x).rjust(2), repr(x*x).rjust(3),
   ...     # Note trailing comma on previous line
   ...     print repr(x*x*x).rjust(4)
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

   >>> for x in range(1,11):
   ...     print '{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x)
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

(Note that in the first example, one space between each column was added by the
way :keyword:`print` works: it always adds spaces between its arguments.)

This example demonstrates the :meth:`rjust` method of string objects, which
right-justifies a string in a field of a given width by padding it with spaces
on the left. There are similar methods :meth:`ljust` and :meth:`center`. These
methods do not write anything; they just return a new string. If the input
string is too long, they don't truncate it, but return it unchanged; this will
mess up your column layout, but that's usually better than the alternative,
which would be lying about a value. (If you really want truncation you can
always add a slice operation, as in ``x.ljust(n)[:n]``.)
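
The padding methods described above can be tried side by side. The following
is a small sketch rather than part of the original example set; the sample
words and variable names are made up for illustration, and the results are
stored in variables so that the code runs unchanged on Python 2 and Python 3.

```python
# Padding helpers return new strings; the original string is left untouched.
word = 'spam'
left = word.ljust(8)      # 'spam    '
right = word.rjust(8)     # '    spam'
middle = word.center(8)   # '  spam  '

# A string longer than the field width is returned unchanged ...
long_word = 'lumberjack'
unchanged = long_word.ljust(4)        # 'lumberjack'
# ... unless you slice the result to force truncation.
truncated = long_word.ljust(4)[:4]    # 'lumb'
```

Slicing after padding keeps every column the same width, at the cost of
hiding part of the value.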

There is another method, :meth:`zfill`, which pads a numeric string on the left
with zeros. It understands plus and minus signs::

   >>> '12'.zfill(5)
   '00012'
   >>> '-3.14'.zfill(7)
   '-003.14'
   >>> '3.14159265359'.zfill(5)
   '3.14159265359'

Basic usage of the :meth:`str.format` method looks like this::

   >>> print 'We are the {} who say "{}!"'.format('knights', 'Ni')
   We are the knights who say "Ni!"

The brackets and characters within them (called format fields) are replaced with
the objects passed into the :meth:`~str.format` method. A number in the
brackets refers to the position of the object passed into the
:meth:`~str.format` method. ::

   >>> print '{0} and {1}'.format('spam', 'eggs')
   spam and eggs
   >>> print '{1} and {0}'.format('spam', 'eggs')
   eggs and spam

If keyword arguments are used in the :meth:`~str.format` method, their values
are referred to by using the name of the argument. ::

   >>> print 'This {food} is {adjective}.'.format(
   ...       food='spam', adjective='absolutely horrible')
   This spam is absolutely horrible.

Positional and keyword arguments can be arbitrarily combined::

   >>> print 'The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
   ...                                                    other='Georg')
   The story of Bill, Manfred, and Georg.

``'!s'`` (apply :func:`str`) and ``'!r'`` (apply :func:`repr`) can be used to
convert the value before it is formatted. ::

   >>> import math
   >>> print 'The value of PI is approximately {}.'.format(math.pi)
   The value of PI is approximately 3.14159265359.
   >>> print 'The value of PI is approximately {!r}.'.format(math.pi)
   The value of PI is approximately 3.141592653589793.
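
The difference between the two conversions is easiest to see with a string,
because :func:`repr` adds quotes. This is an illustrative sketch with a made-up
sample value; it stores the results instead of printing them so that it runs on
both Python 2.7 and Python 3.

```python
name = 'spam'

# '!s' applies str() before formatting: no quotes are added.
plain = 'The word is {!s}.'.format(name)     # "The word is spam."

# '!r' applies repr() before formatting: the quotes become part of the text.
quoted = 'The word is {!r}.'.format(name)    # "The word is 'spam'."
```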

An optional ``':'`` and format specifier can follow the field name. This allows
greater control over how the value is formatted. The following example
rounds Pi to three places after the decimal::

   >>> import math
   >>> print 'The value of PI is approximately {0:.3f}.'.format(math.pi)
   The value of PI is approximately 3.142.

Passing an integer after the ``':'`` will cause that field to be a minimum
number of characters wide. This is useful for making tables pretty. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
   >>> for name, phone in table.items():
   ...     print '{0:10} ==> {1:10d}'.format(name, phone)
   ...
   Jack       ==>       4098
   Dcab       ==>       7678
   Sjoerd     ==>       4127

If you have a really long format string that you don't want to split up, it
would be nice if you could reference the variables to be formatted by name
instead of by position. This can be done by simply passing the dict and using
square brackets ``'[]'`` to access the keys ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print ('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
   ...        'Dcab: {0[Dcab]:d}'.format(table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the
``**`` notation. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print 'Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table)
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in :func:`vars`
function, which returns a dictionary containing all local variables.
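
As a sketch of the :func:`vars` combination, the hypothetical helper below (the
function name and its parameters are invented for this illustration) formats
its own arguments by name:

```python
def describe(name, phone):
    # Called without arguments, vars() returns the local namespace as a
    # dictionary, so {name} and {phone} pick up the two parameters.
    return '{name}: {phone:d}'.format(**vars())

line = describe('Jack', 4098)   # 'Jack: 4098'
```

This keeps the format string readable, but it also exposes every local
variable to the format string, so it is best kept to short functions.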

For a complete overview of string formatting with :meth:`str.format`, see
:ref:`formatstrings`.


Old string formatting
---------------------

The ``%`` operator can also be used for string formatting. It interprets the
left argument much like a :cfunc:`sprintf`\ -style format string to be applied
to the right argument, and returns the string resulting from this formatting
operation. For example::

   >>> import math
   >>> print 'The value of PI is approximately %5.3f.' % math.pi
   The value of PI is approximately 3.142.

Since :meth:`str.format` is quite new, a lot of Python code still uses the ``%``
operator. However, because this old style of formatting will eventually be
removed from the language, :meth:`str.format` should generally be used.

More information can be found in the :ref:`string-formatting` section.
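
When reading older code it helps to know that the right argument of ``%`` can
also be a tuple or a dictionary. The sketch below uses made-up sample values
and compares one value formatted both ways; it is an illustration, not an
exhaustive description of the operator.

```python
import math

# A tuple on the right supplies several values, in order.
pair = '%s has %d items' % ('basket', 3)

# A dictionary on the right lets the format string refer to values by key.
named = '%(name)s: %(phone)d' % {'name': 'Jack', 'phone': 4098}

# The same value formatted with the operator and with str.format().
old_style = '%5.3f' % math.pi
new_style = '{0:5.3f}'.format(math.pi)
```

Both ``old_style`` and ``new_style`` hold ``'3.142'``.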


.. _tut-files:

Reading and Writing Files
=========================

.. index::
   builtin: open
   object: file

:func:`open` returns a file object, and is most commonly used with two
arguments: ``open(filename, mode)``.

::

   >>> f = open('/tmp/workfile', 'w')
   >>> print f
   <open file '/tmp/workfile', mode 'w' at 80a0960>

The first argument is a string containing the filename. The second argument is
another string containing a few characters describing the way in which the file
will be used. *mode* can be ``'r'`` when the file will only be read, ``'w'``
for only writing (an existing file with the same name will be erased), or
``'a'`` for appending; any data written to the file is
automatically added to the end. ``'r+'`` opens the file for both reading and
writing. The *mode* argument is optional; ``'r'`` will be assumed if it's
omitted.
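
The effect of the modes can be checked with a scratch file. In this sketch the
file name is hypothetical and the file is created in a temporary directory
obtained from the standard :mod:`tempfile` module, so nothing of value is
overwritten.

```python
import os
import tempfile

# A scratch file in a fresh temporary directory (hypothetical name).
path = os.path.join(tempfile.mkdtemp(), 'workfile')

f = open(path, 'w')      # create the file, or erase an existing one
f.write('first\n')
f.close()

f = open(path, 'a')      # append: writes always go to the end
f.write('second\n')
f.close()

f = open(path)           # mode omitted, so 'r' is assumed
contents = f.read()      # 'first\nsecond\n'
f.close()

f = open(path, 'w')      # reopening with 'w' erases the old contents
f.close()
size_after_truncate = os.path.getsize(path)   # 0
```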

On Windows, ``'b'`` appended to the mode opens the file in binary mode, so there
are also modes like ``'rb'``, ``'wb'``, and ``'r+b'``. Windows makes a
distinction between text and binary files; the end-of-line characters in text
files are automatically altered slightly when data is read or written. This
behind-the-scenes modification to file data is fine for ASCII text files, but
it'll corrupt binary data like that in :file:`JPEG` or :file:`EXE` files. Be
very careful to use binary mode when reading and writing such files. On Unix,
it doesn't hurt to append a ``'b'`` to the mode, so you can use it
platform-independently for all binary files.
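
A minimal round trip in binary mode might look like the following. The file
name and the payload are invented for this sketch; the payload deliberately
contains carriage return and line feed bytes, which are the ones text mode on
Windows would alter.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.bin')   # hypothetical name
payload = b'\x89PNG\r\n\x1a\n'   # arbitrary bytes, including CR and LF

f = open(path, 'wb')     # 'b' requests binary mode: no newline translation
f.write(payload)
f.close()

f = open(path, 'rb')
roundtrip = f.read()     # identical to payload on every platform
f.close()
```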


.. _tut-filemethods:

Methods of File Objects
-----------------------

The rest of the examples in this section will assume that a file object called
``f`` has already been created.

To read a file's contents, call ``f.read(size)``, which reads some quantity of
data and returns it as a string. *size* is an optional numeric argument. When
*size* is omitted or negative, the entire contents of the file will be read and
returned; it's your problem if the file is twice as large as your machine's
memory. Otherwise, at most *size* bytes are read and returned. If the end of
the file has been reached, ``f.read()`` will return an empty string (``""``).
::

   >>> f.read()
   'This is the entire file.\n'
   >>> f.read()
   ''

``f.readline()`` reads a single line from the file; a newline character (``\n``)
is left at the end of the string, and is only omitted on the last line of the
file if the file doesn't end in a newline. This makes the return value
unambiguous; if ``f.readline()`` returns an empty string, the end of the file
has been reached, while a blank line is represented by ``'\n'``, a string
containing only a single newline. ::

   >>> f.readline()
   'This is the first line of the file.\n'
   >>> f.readline()
   'Second line of the file\n'
   >>> f.readline()
   ''

``f.readlines()`` returns a list containing all the lines of data in the file.
If given an optional parameter *sizehint*, it reads that many bytes from the
file and enough more to complete a line, and returns the lines from that. This
is often used to allow efficient reading of a large file by lines, but without
having to load the entire file in memory. Only complete lines will be returned.
::

   >>> f.readlines()
   ['This is the first line of the file.\n', 'Second line of the file\n']

An alternative approach to reading lines is to loop over the file object. This is
memory efficient, fast, and leads to simpler code::

   >>> for line in f:
   ...     print line,
   ...
   This is the first line of the file.
   Second line of the file

The alternative approach is simpler but does not provide as fine-grained
control. Since the two approaches manage line buffering differently, they
should not be mixed.
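
Looping over the file object is usually all that line-by-line processing
needs. The sketch below first writes the two sample lines to a scratch file
(hypothetical name, created through :mod:`tempfile`) and then collects the
length of each line; the variable names are made up for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')   # hypothetical name
f = open(path, 'w')
f.write('This is the first line of the file.\nSecond line of the file\n')
f.close()

f = open(path)
lengths = []
for line in f:
    # Each line keeps its trailing newline; strip it before measuring.
    lengths.append(len(line.rstrip('\n')))
f.close()
```

Only one line is held in memory at a time, which is why this form also suits
files too large to read in one piece.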

``f.write(string)`` writes the contents of *string* to the file, returning
``None``. ::

   >>> f.write('This is a test\n')

To write something other than a string, it needs to be converted to a string
first::

   >>> value = ('the answer', 42)
   >>> s = str(value)
   >>> f.write(s)

``f.tell()`` returns an integer giving the file object's current position in the
file, measured in bytes from the beginning of the file. To change the file
object's position, use ``f.seek(offset, from_what)``. The position is computed
from adding *offset* to a reference point; the reference point is selected by
the *from_what* argument. A *from_what* value of 0 measures from the beginning
of the file, 1 uses the current file position, and 2 uses the end of the file as
the reference point. *from_what* can be omitted and defaults to 0, using the
beginning of the file as the reference point. ::

   >>> f = open('/tmp/workfile', 'r+')
   >>> f.write('0123456789abcdef')
   >>> f.seek(5)     # Go to the 6th byte in the file
   >>> f.read(1)
   '5'
   >>> f.seek(-3, 2) # Go to the 3rd byte before the end
   >>> f.read(1)
   'd'
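
The example above does not use ``f.tell()`` or a *from_what* value of 1. The
following sketch adds both. It assumes a scratch file with a hypothetical name
and opens it with mode ``'w+b'`` (read and write, binary, starting from an
empty file), which is not discussed in the text but is a standard mode.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')   # hypothetical name
f = open(path, 'w+b')             # binary read/write on a fresh file
f.write(b'0123456789abcdef')

f.seek(5)                         # from_what defaults to 0: start of file
at_five = f.read(1)               # the byte '5'; position is now 6
pos_after_read = f.tell()         # 6

f.seek(2, 1)                      # 2 bytes forward from the current position
skipped = f.read(1)               # the byte '8'

f.seek(-3, 2)                     # 3 bytes before the end of the file
near_end = f.read(1)              # the byte 'd'
f.close()
```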

When you're done with a file, call ``f.close()`` to close it and free up any
system resources taken up by the open file. After calling ``f.close()``,
attempts to use the file object will automatically fail. ::

   >>> f.close()
   >>> f.read()
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   ValueError: I/O operation on closed file

It is good practice to use the :keyword:`with` keyword when dealing with file
objects. This has the advantage that the file is properly closed after its
suite finishes, even if an exception is raised on the way. It is also much
shorter than writing equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

   >>> with open('/tmp/workfile', 'r') as f:
   ...     read_data = f.read()
   >>> f.closed
   True
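
For comparison, here is roughly what the :keyword:`try`\ -\ :keyword:`finally`
form looks like next to the :keyword:`with` form. This is a sketch using a
scratch file with a hypothetical name; the variable names are invented for the
illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')   # hypothetical name
with open(path, 'w') as f:
    f.write('This is a test\n')

# Spelled out with try/finally, the read looks like this.
f = open(path, 'r')
try:
    read_data = f.read()
finally:
    f.close()          # runs whether or not read() raised an exception
closed_after_finally = f.closed

# The with statement does the same in two lines.
with open(path, 'r') as g:
    same_data = g.read()
closed_after_with = g.closed
```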

File objects have some additional methods, such as :meth:`~file.isatty` and
:meth:`~file.truncate` which are less frequently used; consult the Library
Reference for a complete guide to file objects.


.. _tut-pickle:

The :mod:`pickle` Module
------------------------

.. index:: module: pickle

Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.

Rather than have users be constantly writing and debugging code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a string representation; this process
is called :dfn:`pickling`. Reconstructing the object from the string
representation is called :dfn:`unpickling`. Between pickling and unpickling,
the string representing the object may have been stored in a file or as data, or
sent over a network connection to some distant machine.

If you have an object ``x``, and a file object ``f`` that's been opened for
writing, the simplest way to pickle the object takes only one line of code::

   pickle.dump(x, f)

To unpickle the object again, if ``f`` is a file object which has been opened
for reading::

   x = pickle.load(f)

(There are other variants of this, used when pickling many objects or when you
don't want to write the pickled data to a file; consult the complete
documentation for :mod:`pickle` in the Python Library Reference.)
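
Put together, a complete round trip could look like the sketch below. The
sample object and the file name are made up for illustration, and the file is
opened in binary mode, which is an assumption made here so that the same code
works regardless of platform and pickle protocol.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.pickle')   # hypothetical name
x = {'name': 'Jack', 'numbers': [4098, 4139], 'pair': ('the answer', 42)}

f = open(path, 'wb')     # binary mode keeps the pickled bytes intact
pickle.dump(x, f)
f.close()

f = open(path, 'rb')
y = pickle.load(f)       # a new object equal to x
f.close()
```

The unpickled ``y`` compares equal to ``x`` but is a separate object, so
changing one does not affect the other.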

:mod:`pickle` is the standard way to make Python objects which can be stored and
reused by other programs or by a future invocation of the same program; the
technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is
so widely used, many authors who write Python extensions take care to ensure
that new data types such as matrices can be properly pickled and unpickled.
417