.. _tut-io:

****************
Input and Output
****************

There are several ways to present the output of a program; data can be printed
in a human-readable form, or written to a file for future use. This chapter will
discuss some of the possibilities.


.. _tut-formatting:

Fancier Output Formatting
=========================

So far we've encountered two ways of writing values: *expression statements* and
the :keyword:`print` statement. (A third way is using the :meth:`write` method
of file objects; the standard output file can be referenced as ``sys.stdout``.
See the Library Reference for more information on this.)

.. index:: module: string

Often you'll want more control over the formatting of your output than simply
printing space-separated values. There are two ways to format your output. The
first is to do all the string handling yourself: using string slicing and
concatenation operations you can create any layout you can imagine. The
standard module :mod:`string` contains some useful operations for padding
strings to a given column width; these will be discussed shortly. The second
way is to use the :meth:`str.format` method.

One question remains, of course: how do you convert values to strings? Luckily,
Python has ways to convert any value to a string: pass it to the :func:`repr`
or :func:`str` functions. Reverse quotes (``````) are equivalent to
:func:`repr`, but they are no longer used in modern Python code and will be
removed in future versions of the language.

The :func:`str` function is meant to return representations of values which are
fairly human-readable, while :func:`repr` is meant to generate representations
which can be read by the interpreter (or will force a :exc:`SyntaxError` if
there is no equivalent syntax). For objects which don't have a particular
representation for human consumption, :func:`str` will return the same value as
:func:`repr`. Many values, such as numbers or structures like lists and
dictionaries, have the same representation using either function. Strings and
floating point numbers, in particular, have two distinct representations.

Some examples::

   >>> s = 'Hello, world.'
   >>> str(s)
   'Hello, world.'
   >>> repr(s)
   "'Hello, world.'"
   >>> str(0.1)
   '0.1'
   >>> repr(0.1)
   '0.10000000000000001'
   >>> x = 10 * 3.25
   >>> y = 200 * 200
   >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
   >>> print s
   The value of x is 32.5, and y is 40000...
   >>> # The repr() of a string adds string quotes and backslashes:
   ... hello = 'hello, world\n'
   >>> hellos = repr(hello)
   >>> print hellos
   'hello, world\n'
   >>> # The argument to repr() may be any Python object:
   ... repr((x, y, ('spam', 'eggs')))
   "(32.5, 40000, ('spam', 'eggs'))"
   >>> # reverse quotes are convenient in interactive sessions:
   ... `x, y, ('spam', 'eggs')`
   "(32.5, 40000, ('spam', 'eggs'))"

Here are two ways to write a table of squares and cubes::

   >>> for x in range(1, 11):
   ...     print repr(x).rjust(2), repr(x*x).rjust(3),
   ...     # Note trailing comma on previous line
   ...     print repr(x*x*x).rjust(4)
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

   >>> for x in range(1,11):
   ...     print '{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x)
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

(Note that in the first example, one space between each column was added by the
way :keyword:`print` works: it always adds spaces between its arguments.)

This example demonstrates the :meth:`rjust` method of string objects, which
right-justifies a string in a field of a given width by padding it with spaces
on the left. There are similar methods :meth:`ljust` and :meth:`center`. These
methods do not write anything; they just return a new string. If the input
string is too long, they don't truncate it, but return it unchanged; this will
mess up your column layout but that's usually better than the alternative,
which would be lying about a value. (If you really want truncation you can
always add a slice operation, as in ``x.ljust(n)[:n]``.)

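
The three justification methods and the truncation idiom described above can be
sketched as follows. The sample strings and the field width of 8 are arbitrary
choices for illustration.

```python
label = 'spam'

left = label.ljust(8)       # pad on the right:  'spam    '
middle = label.center(8)    # pad on both sides: '  spam  '
right = label.rjust(8)      # pad on the left:   '    spam'

# A string longer than the field is returned unchanged, not truncated.
too_long = 'extraordinary'.ljust(8)

# Slicing after padding forces the result to exactly 8 characters.
clipped = 'extraordinary'.ljust(8)[:8]
```

Each call returns a new string; ``label`` itself is never modified.
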
There is another method, :meth:`zfill`, which pads a numeric string on the left
with zeros. It understands plus and minus signs::

   >>> '12'.zfill(5)
   '00012'
   >>> '-3.14'.zfill(7)
   '-003.14'
   >>> '3.14159265359'.zfill(5)
   '3.14159265359'

Basic usage of the :meth:`str.format` method looks like this::

   >>> print 'We are the {0} who say "{1}!"'.format('knights', 'Ni')
   We are the knights who say "Ni!"

The braces and the characters within them (called format fields) are replaced
with the objects passed into the format method. The number in the braces refers
to the position of the object passed into the format method. ::

   >>> print '{0} and {1}'.format('spam', 'eggs')
   spam and eggs
   >>> print '{1} and {0}'.format('spam', 'eggs')
   eggs and spam

If keyword arguments are used in the format method, their values are referred to
by using the name of the argument. ::

   >>> print 'This {food} is {adjective}.'.format(food='spam', adjective='absolutely horrible')
   This spam is absolutely horrible.

Positional and keyword arguments can be arbitrarily combined::

   >>> print 'The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred', other='Georg')
   The story of Bill, Manfred, and Georg.

An optional ``':'`` and format specifier can follow the field name. This allows
greater control over how the value is formatted. The following example
rounds pi to three places after the decimal point::

   >>> import math
   >>> print 'The value of PI is approximately {0:.3f}.'.format(math.pi)
   The value of PI is approximately 3.142.

Passing an integer after the ``':'`` will cause that field to be a minimum
number of characters wide. This is useful for making tables pretty. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
   >>> for name, phone in table.items():
   ...     print '{0:10} ==> {1:10d}'.format(name, phone)
   ...
   Jack       ==>       4098
   Dcab       ==>       7678
   Sjoerd     ==>       4127

If you have a really long format string that you don't want to split up, it
would be nice if you could reference the variables to be formatted by name
instead of by position. This can be done by simply passing the dict and using
square brackets ``'[]'`` to access the keys ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print 'Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; Dcab: {0[Dcab]:d}'.format(table)
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the
``**`` notation. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print 'Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table)
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in :func:`vars`
function, which returns a dictionary containing all local variables.

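
A minimal sketch of the ``vars()`` combination follows. The function name
``describe`` and its parameters are hypothetical, chosen only for illustration;
called without arguments inside a function, ``vars()`` returns that function's
local namespace as a dictionary.

```python
def describe(name, phone):
    # Here vars() is {'name': name, 'phone': phone}, so the format fields
    # can refer to the local variables by name.
    return '{name}: {phone:d}'.format(**vars())


line = describe('Jack', 4098)
```

This keeps the format string readable, at the cost of exposing every local
variable to it, so it is best kept to short functions.
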
For a complete overview of string formatting with :meth:`str.format`, see
:ref:`formatstrings`.


Old string formatting
---------------------

The ``%`` operator can also be used for string formatting. It interprets the
left argument much like a :cfunc:`sprintf`\ -style format string to be applied
to the right argument, and returns the string resulting from this formatting
operation. For example::

   >>> import math
   >>> print 'The value of PI is approximately %5.3f.' % math.pi
   The value of PI is approximately 3.142.

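
The right argument of ``%`` can take a few shapes. The sketch below, with
arbitrary sample values, shows a single value, a tuple matched by position, and
a dictionary matched by key.

```python
import math

# A single value can stand alone on the right-hand side.
single = 'pi is roughly %5.3f' % math.pi

# Several values are wrapped in a tuple and matched by position.
several = '%s has %d entries' % ('table', 3)

# With a dictionary, the format string refers to keys by name.
named = '%(name)s: %(phone)d' % {'name': 'Jack', 'phone': 4098}
```
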
Since :meth:`str.format` is quite new, a lot of Python code still uses the ``%``
operator. However, because this old style of formatting will eventually be
removed from the language, :meth:`str.format` should generally be used.

More information can be found in the :ref:`string-formatting` section.


.. _tut-files:

Reading and Writing Files
=========================

.. index::
   builtin: open
   object: file

:func:`open` returns a file object, and is most commonly used with two
arguments: ``open(filename, mode)``.

::

   >>> f = open('/tmp/workfile', 'w')
   >>> print f
   <open file '/tmp/workfile', mode 'w' at 80a0960>

The first argument is a string containing the filename. The second argument is
another string containing a few characters describing the way in which the file
will be used. *mode* can be ``'r'`` when the file will only be read, ``'w'``
for only writing (an existing file with the same name will be erased), and
``'a'`` for appending; any data written to the file is automatically added to
the end. ``'r+'`` opens the file for both reading and writing. The *mode*
argument is optional; ``'r'`` will be assumed if it's omitted.

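
The modes described above can be exercised with a short sketch. The scratch
directory created with ``tempfile.mkdtemp()`` and the file name ``workfile`` are
assumptions made so that the example does not touch any existing file.

```python
import os
import tempfile

# Scratch location for this sketch only.
path = os.path.join(tempfile.mkdtemp(), 'workfile')

f = open(path, 'w')      # 'w': create the file, or erase an existing one
f.write('first line\n')
f.close()

f = open(path, 'a')      # 'a': everything written goes to the end
f.write('second line\n')
f.close()

f = open(path)           # mode omitted, so 'r' is assumed
contents = f.read()
f.close()
```

Reopening the same path with ``'w'`` instead of ``'a'`` in the second step would
have discarded the first line.
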
On Windows and the Macintosh, ``'b'`` appended to the mode opens the file in
binary mode, so there are also modes like ``'rb'``, ``'wb'``, and ``'r+b'``.
Windows makes a distinction between text and binary files; the end-of-line
characters in text files are automatically altered slightly when data is read or
written. This behind-the-scenes modification to file data is fine for ASCII
text files, but it'll corrupt binary data like that in :file:`JPEG` or
:file:`EXE` files. Be very careful to use binary mode when reading and writing
such files. On Unix, it doesn't hurt to append a ``'b'`` to the mode, so
you can use it platform-independently for all binary files.


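
As a small illustration of binary mode, the sketch below writes a few bytes that
contain carriage-return and line-feed characters and reads them back. The
scratch path and the sample payload are assumptions for this example.

```python
import os
import tempfile

# Scratch location for this sketch only.
path = os.path.join(tempfile.mkdtemp(), 'sample.bin')

# Eight bytes, including CR and LF, that text mode might alter on some platforms.
payload = b'\x89PNG\r\n\x1a\n'

f = open(path, 'wb')     # binary mode: bytes are written unchanged
f.write(payload)
f.close()

f = open(path, 'rb')     # binary mode: bytes are read unchanged
round_trip = f.read()
f.close()
```

With ``'wb'`` and ``'rb'`` the data read back should match what was written,
byte for byte, on every platform.
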
.. _tut-filemethods:

Methods of File Objects
-----------------------

The rest of the examples in this section will assume that a file object called
``f`` has already been created.

To read a file's contents, call ``f.read(size)``, which reads some quantity of
data and returns it as a string. *size* is an optional numeric argument. When
*size* is omitted or negative, the entire contents of the file will be read and
returned; it's your problem if the file is twice as large as your machine's
memory. Otherwise, at most *size* bytes are read and returned. If the end of
the file has been reached, ``f.read()`` will return an empty string (``""``).
::

   >>> f.read()
   'This is the entire file.\n'
   >>> f.read()
   ''

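
When a file may be too large to read in one call, the *size* argument allows
reading it piece by piece. The following is a minimal sketch; the scratch file,
its contents, and the chunk size of 4 are arbitrary assumptions.

```python
import os
import tempfile

# Scratch file for this sketch only.
path = os.path.join(tempfile.mkdtemp(), 'workfile')
f = open(path, 'w')
f.write('0123456789')
f.close()

chunks = []
f = open(path)
while True:
    chunk = f.read(4)    # at most 4 characters per call
    if chunk == '':      # an empty string signals the end of the file
        break
    chunks.append(chunk)
f.close()
```

The last chunk is shorter than the requested size because the file ran out
first.
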
``f.readline()`` reads a single line from the file; a newline character (``\n``)
is left at the end of the string, and is only omitted on the last line of the
file if the file doesn't end in a newline. This makes the return value
unambiguous; if ``f.readline()`` returns an empty string, the end of the file
has been reached, while a blank line is represented by ``'\n'``, a string
containing only a single newline. ::

   >>> f.readline()
   'This is the first line of the file.\n'
   >>> f.readline()
   'Second line of the file\n'
   >>> f.readline()
   ''

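
The difference between a blank line and the end of the file can be seen in a
short loop. This sketch uses a scratch file whose contents are invented for the
example: a normal line, a blank line, and a final line without a newline.

```python
import os
import tempfile

# Scratch file for this sketch only.
path = os.path.join(tempfile.mkdtemp(), 'workfile')
f = open(path, 'w')
f.write('first\n\nlast without newline')
f.close()

lines = []
f = open(path)
while True:
    line = f.readline()
    if line == '':       # only the end of the file yields an empty string
        break
    lines.append(line)   # a blank line arrives as '\n', so it is kept
f.close()
```
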
``f.readlines()`` returns a list containing all the lines of data in the file.
If given an optional parameter *sizehint*, it reads that many bytes from the
file and enough more to complete a line, and returns the lines from that. This
is often used to allow efficient reading of a large file by lines, but without
having to load the entire file in memory. Only complete lines will be returned.
::

   >>> f.readlines()
   ['This is the first line of the file.\n', 'Second line of the file\n']

An alternative approach to reading lines is to loop over the file object. This is
memory efficient, fast, and leads to simpler code::

   >>> for line in f:
   ...     print line,
   ...
   This is the first line of the file.
   Second line of the file

This alternative approach is simpler but does not provide as fine-grained
control. Since the two approaches manage line buffering differently, they
should not be mixed.

``f.write(string)`` writes the contents of *string* to the file, returning
``None``. ::

   >>> f.write('This is a test\n')

To write something other than a string, it needs to be converted to a string
first::

   >>> value = ('the answer', 42)
   >>> s = str(value)
   >>> f.write(s)

``f.tell()`` returns an integer giving the file object's current position in the
file, measured in bytes from the beginning of the file. To change the file
object's position, use ``f.seek(offset, from_what)``. The position is computed
by adding *offset* to a reference point; the reference point is selected by
the *from_what* argument. A *from_what* value of 0 measures from the beginning
of the file, 1 uses the current file position, and 2 uses the end of the file as
the reference point. *from_what* can be omitted and defaults to 0, using the
beginning of the file as the reference point. ::

   >>> f = open('/tmp/workfile', 'r+')
   >>> f.write('0123456789abcdef')
   >>> f.seek(5)     # Go to the 6th byte in the file
   >>> f.read(1)
   '5'
   >>> f.seek(-3, 2) # Go to the 3rd byte before the end
   >>> f.read(1)
   'd'

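
All three reference points, together with ``f.tell()``, can be sketched as
below. The scratch file is an assumption for the example, and binary mode is
used here so that positions are plain byte offsets.

```python
import os
import tempfile

# Scratch file for this sketch only.
path = os.path.join(tempfile.mkdtemp(), 'workfile')
f = open(path, 'wb')
f.write(b'0123456789abcdef')
f.close()

f = open(path, 'rb')
f.seek(5)                # from_what defaults to 0: offset from the beginning
at_five = f.read(1)      # the byte at offset 5
after_read = f.tell()    # reading one byte moved the position to 6

f.seek(2, 1)             # 1: two bytes forward from the current position
relative = f.read(1)     # the byte at offset 8

f.seek(-3, 2)            # 2: three bytes back from the end of the file
near_end = f.read(1)     # the byte at offset 13
f.close()
```
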
When you're done with a file, call ``f.close()`` to close it and free up any
system resources taken up by the open file. After calling ``f.close()``,
attempts to use the file object will automatically fail. ::

   >>> f.close()
   >>> f.read()
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   ValueError: I/O operation on closed file

It is good practice to use the :keyword:`with` keyword when dealing with file
objects. This has the advantage that the file is properly closed after its
suite finishes, even if an exception is raised on the way. It is also much
shorter than writing equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

   >>> with open('/tmp/workfile', 'r') as f:
   ...     read_data = f.read()
   >>> f.closed
   True

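
For comparison, here is a sketch of the longer ``try``/``finally`` form that
achieves the same cleanup. The scratch file and its contents are assumptions
made so the example is self-contained.

```python
import os
import tempfile

# Scratch file for this sketch only.
path = os.path.join(tempfile.mkdtemp(), 'workfile')
f = open(path, 'w')
f.write('This is the entire file.\n')
f.close()

f = open(path, 'r')
try:
    read_data = f.read()
finally:
    f.close()            # runs whether or not read() raised an exception

closed_after = f.closed
```

The ``with`` form expresses the same guarantee in two lines, which is why it is
usually preferred.
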
File objects have some additional methods, such as :meth:`isatty` and
:meth:`truncate`, which are less frequently used; consult the Library Reference
for a complete guide to file objects.


.. _tut-pickle:

The :mod:`pickle` Module
------------------------

.. index:: module: pickle

Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.

Rather than have users constantly writing and debugging code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a string representation; this process
is called :dfn:`pickling`. Reconstructing the object from the string
representation is called :dfn:`unpickling`. Between pickling and unpickling,
the string representing the object may have been stored in a file or some other
data store, or sent over a network connection to some distant machine.

If you have an object ``x``, and a file object ``f`` that's been opened for
writing, the simplest way to pickle the object takes only one line of code::

   pickle.dump(x, f)

To unpickle the object again, if ``f`` is a file object which has been opened
for reading::

   x = pickle.load(f)

(There are other variants of this, used when pickling many objects or when you
don't want to write the pickled data to a file; consult the complete
documentation for :mod:`pickle` in the Python Library Reference.)

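
Putting the two calls together, a complete round trip might look like the sketch
below. The scratch path and the sample object are assumptions for illustration;
the files are opened in binary mode, which is a safe choice for pickled data.
The ``pickle.dumps`` and ``pickle.loads`` functions are one of the variants
mentioned above that work without a file.

```python
import os
import pickle
import tempfile

# Scratch location and sample data for this sketch only.
path = os.path.join(tempfile.mkdtemp(), 'data.pickle')
x = {'name': 'Jack', 'phones': [4098, 4127], 'point': (1.5, -2)}

f = open(path, 'wb')     # opened for writing
pickle.dump(x, f)
f.close()

f = open(path, 'rb')     # opened for reading
restored = pickle.load(f)
f.close()

# The same conversion without a file: to a string and back.
as_string = pickle.dumps(x)
copy = pickle.loads(as_string)
```

The unpickled objects are equal to ``x`` but are separate copies, so changing
one does not affect the other.
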
:mod:`pickle` is the standard way to make Python objects which can be stored and
reused by other programs or by a future invocation of the same program; the
technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is
so widely used, many authors who write Python extensions take care to ensure
that new data types such as matrices can be properly pickled and unpickled.

409