.. _tut-io:

****************
Input and Output
****************

There are several ways to present the output of a program; data can be printed
in a human-readable form, or written to a file for future use. This chapter will
discuss some of the possibilities.


.. _tut-formatting:

Fancier Output Formatting
=========================

So far we've encountered two ways of writing values: *expression statements* and
the :keyword:`print` statement. (A third way is using the :meth:`write` method
of file objects; the standard output file can be referenced as ``sys.stdout``.
See the Library Reference for more information on this.)

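As a minimal sketch of that third way (the text written here is only an
illustration), :meth:`write` can be called on ``sys.stdout`` directly. It
emits exactly the string it is given, so any spaces and the final newline have
to be part of the string::

   import sys

   # Build the whole line first; write() adds no separators and no newline.
   message = 'The answer is ' + str(42) + '\n'
   sys.stdout.write(message)
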
.. index:: module: string

Often you'll want more control over the formatting of your output than simply
printing space-separated values. There are two ways to format your output; the
first way is to do all the string handling yourself; using string slicing and
concatenation operations you can create any layout you can imagine. The
standard module :mod:`string` contains some useful operations for padding
strings to a given column width; these will be discussed shortly. The second
way is to use the :meth:`str.format` method.

One question remains, of course: how do you convert values to strings? Luckily,
Python has ways to convert any value to a string: pass it to the :func:`repr`
or :func:`str` functions. Reverse quotes (``````) are equivalent to
:func:`repr`, but they are no longer used in modern Python code and will be
removed in future versions of the language.

The :func:`str` function is meant to return representations of values which are
fairly human-readable, while :func:`repr` is meant to generate representations
which can be read by the interpreter (or will force a :exc:`SyntaxError` if
there is no equivalent syntax). For objects which don't have a particular
representation for human consumption, :func:`str` will return the same value as
:func:`repr`. Many values, such as numbers or structures like lists and
dictionaries, have the same representation using either function. Strings and
floating point numbers, in particular, have two distinct representations.

Some examples::

   >>> s = 'Hello, world.'
   >>> str(s)
   'Hello, world.'
   >>> repr(s)
   "'Hello, world.'"
   >>> str(0.1)
   '0.1'
   >>> repr(0.1)
   '0.10000000000000001'
   >>> x = 10 * 3.25
   >>> y = 200 * 200
   >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
   >>> print s
   The value of x is 32.5, and y is 40000...
   >>> # The repr() of a string adds string quotes and backslashes:
   ... hello = 'hello, world\n'
   >>> hellos = repr(hello)
   >>> print hellos
   'hello, world\n'
   >>> # The argument to repr() may be any Python object:
   ... repr((x, y, ('spam', 'eggs')))
   "(32.5, 40000, ('spam', 'eggs'))"
   >>> # reverse quotes are convenient in interactive sessions:
   ... `x, y, ('spam', 'eggs')`
   "(32.5, 40000, ('spam', 'eggs'))"

Here are two ways to write a table of squares and cubes::

   >>> for x in range(1, 11):
   ...     print repr(x).rjust(2), repr(x*x).rjust(3),
   ...     # Note trailing comma on previous line
   ...     print repr(x*x*x).rjust(4)
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

   >>> for x in range(1,11):
   ...     print '{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x)
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

(Note that in the first example, one space between each column was added by the
way :keyword:`print` works: it always adds spaces between its arguments.)

This example demonstrates the :meth:`rjust` method of string objects, which
right-justifies a string in a field of a given width by padding it with spaces
on the left. There are similar methods :meth:`ljust` and :meth:`center`. These
methods do not write anything; they just return a new string. If the input
string is too long, they don't truncate it, but return it unchanged; this will
mess up your column layout but that's usually better than the alternative,
which would be lying about a value. (If you really want truncation you can
always add a slice operation, as in ``x.ljust(n)[:n]``.)

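The three justification methods and the slicing trick can be compared side by
side. This is only a small sketch; the sample strings and the field width of 8
are arbitrary choices::

   word = 'spam'
   left = word.ljust(8)        # 'spam    '
   middle = word.center(8)     # '  spam  '
   right = word.rjust(8)       # '    spam'

   # A string longer than the field is returned unchanged...
   long_word = 'spam and eggs'
   unchanged = long_word.ljust(8)       # 'spam and eggs'
   # ...unless a slice is added to cut it down to the field width.
   truncated = long_word.ljust(8)[:8]   # 'spam and'
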
There is another method, :meth:`zfill`, which pads a numeric string on the left
with zeros. It understands plus and minus signs::

   >>> '12'.zfill(5)
   '00012'
   >>> '-3.14'.zfill(7)
   '-003.14'
   >>> '3.14159265359'.zfill(5)
   '3.14159265359'

Basic usage of the :meth:`str.format` method looks like this::

   >>> print 'We are the {0} who say "{1}!"'.format('knights', 'Ni')
   We are the knights who say "Ni!"

The brackets and characters within them (called format fields) are replaced with
the objects passed into the format method. The number in the brackets refers to
the position of the object passed into the format method. ::

   >>> print '{0} and {1}'.format('spam', 'eggs')
   spam and eggs
   >>> print '{1} and {0}'.format('spam', 'eggs')
   eggs and spam

If keyword arguments are used in the format method, their values are referred to
by using the name of the argument. ::

   >>> print 'This {food} is {adjective}.'.format(
   ...       food='spam', adjective='absolutely horrible')
   This spam is absolutely horrible.

Positional and keyword arguments can be arbitrarily combined::

   >>> print 'The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
   ...                                                    other='Georg')
   The story of Bill, Manfred, and Georg.

An optional ``':'`` and format specifier can follow the field name. This allows
greater control over how the value is formatted. The following example
rounds Pi to three places after the decimal point. ::

   >>> import math
   >>> print 'The value of PI is approximately {0:.3f}.'.format(math.pi)
   The value of PI is approximately 3.142.

Passing an integer after the ``':'`` will cause that field to be a minimum
number of characters wide. This is useful for making tables pretty. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
   >>> for name, phone in table.items():
   ...     print '{0:10} ==> {1:10d}'.format(name, phone)
   ...
   Jack       ==>       4098
   Dcab       ==>       7678
   Sjoerd     ==>       4127

If you have a really long format string that you don't want to split up, it
would be nice if you could reference the variables to be formatted by name
instead of by position. This can be done by simply passing the dict and using
square brackets ``'[]'`` to access the keys ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print ('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
   ...        'Dcab: {0[Dcab]:d}'.format(table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the
``**`` notation. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print 'Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table)
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in :func:`vars`
function, which returns a dictionary containing all local variables.

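A small sketch of that combination follows. The function ``describe`` and its
parameters are made up for this illustration; inside the function,
:func:`vars` called without arguments returns the local names, which ``**``
then hands to :meth:`str.format` as keyword arguments::

   def describe(name, phone):
       # vars() holds the local variables, here the two parameters, so the
       # format fields can refer to them by name.
       return '{name}: {phone:d}'.format(**vars())

   line = describe('Jack', 4098)   # 'Jack: 4098'
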
For a complete overview of string formatting with :meth:`str.format`, see
:ref:`formatstrings`.


Old string formatting
---------------------

The ``%`` operator can also be used for string formatting. It interprets the
left argument much like a :cfunc:`sprintf`\ -style format string to be applied
to the right argument, and returns the string resulting from this formatting
operation. For example::

   >>> import math
   >>> print 'The value of PI is approximately %5.3f.' % math.pi
   The value of PI is approximately 3.142.

Since :meth:`str.format` is quite new, a lot of Python code still uses the ``%``
operator. However, because this old style of formatting will eventually be
removed from the language, :meth:`str.format` should generally be used.

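When reading or converting older code, it can help to see the two styles next
to each other. The following sketch uses made-up sample values and pairs each
``%`` expression with a :meth:`str.format` call that produces the same
string::

   import math

   # One value: both styles round to three places after the decimal point.
   old_one = 'PI is approximately %5.3f.' % math.pi
   new_one = 'PI is approximately {0:5.3f}.'.format(math.pi)

   # Several values: % takes a tuple, format() takes positional arguments.
   old_many = '%s has %d entries' % ('table', 3)
   new_many = '{0} has {1:d} entries'.format('table', 3)

   # Named values: % takes a dictionary, format() takes keyword arguments.
   old_named = '%(name)s: %(phone)d' % {'name': 'Jack', 'phone': 4098}
   new_named = '{name}: {phone:d}'.format(name='Jack', phone=4098)
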
More information can be found in the :ref:`string-formatting` section.


.. _tut-files:

Reading and Writing Files
=========================

.. index::
   builtin: open
   object: file

:func:`open` returns a file object, and is most commonly used with two
arguments: ``open(filename, mode)``.

::

   >>> f = open('/tmp/workfile', 'w')
   >>> print f
   <open file '/tmp/workfile', mode 'w' at 80a0960>

The first argument is a string containing the filename. The second argument is
another string containing a few characters describing the way in which the file
will be used. *mode* can be ``'r'`` when the file will only be read, ``'w'``
for only writing (an existing file with the same name will be erased), and
``'a'`` opens the file for appending; any data written to the file is
automatically added to the end. ``'r+'`` opens the file for both reading and
writing. The *mode* argument is optional; ``'r'`` will be assumed if it's
omitted.

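The difference between the modes is easiest to see by opening the same file
several times. The sketch below is an illustration only: it assumes a scratch
file in a directory created with :mod:`tempfile` instead of
``/tmp/workfile``, so that it can be run anywhere::

   import os
   import tempfile

   # Hypothetical scratch file in a fresh temporary directory.
   path = os.path.join(tempfile.mkdtemp(), 'workfile')

   f = open(path, 'w')        # creates the file, or erases an existing one
   f.write('first\n')
   f.close()

   f = open(path, 'a')        # existing contents are kept; writes go to the end
   f.write('second\n')
   f.close()

   f = open(path)             # mode omitted, so 'r' is assumed
   after_append = f.read()    # 'first\nsecond\n'
   f.close()

   f = open(path, 'w')        # opening with 'w' again erases the old contents
   f.write('third\n')
   f.close()

   f = open(path)
   after_rewrite = f.read()   # 'third\n'
   f.close()
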
On Windows and the Macintosh, ``'b'`` appended to the mode opens the file in
binary mode, so there are also modes like ``'rb'``, ``'wb'``, and ``'r+b'``.
Windows makes a distinction between text and binary files; the end-of-line
characters in text files are automatically altered slightly when data is read or
written. This behind-the-scenes modification to file data is fine for ASCII
text files, but it'll corrupt binary data like that in :file:`JPEG` or
:file:`EXE` files. Be very careful to use binary mode when reading and writing
such files. On Unix, it doesn't hurt to append a ``'b'`` to the mode, so
you can use it platform-independently for all binary files.


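As a rough illustration, the following sketch writes a few bytes that include
both end-of-line characters and reads them back in binary mode. The file name
and the byte values are arbitrary choices for this example::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'data.bin')  # hypothetical name

   # Six bytes, including a carriage return and two newlines.
   payload = b'\x89\r\n\x1a\n\x00'

   f = open(path, 'wb')       # binary mode: no end-of-line translation
   f.write(payload)
   f.close()

   f = open(path, 'rb')
   data = f.read()
   f.close()

   unchanged = (data == payload)   # True
   size = os.path.getsize(path)    # 6: one byte on disk per byte written

Had the file been opened with ``'w'`` on Windows, each newline would have been
expanded on disk and the file would no longer hold the original bytes.

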
.. _tut-filemethods:

Methods of File Objects
-----------------------

The rest of the examples in this section will assume that a file object called
``f`` has already been created.

To read a file's contents, call ``f.read(size)``, which reads some quantity of
data and returns it as a string. *size* is an optional numeric argument. When
*size* is omitted or negative, the entire contents of the file will be read and
returned; it's your problem if the file is twice as large as your machine's
memory. Otherwise, at most *size* bytes are read and returned. If the end of
the file has been reached, ``f.read()`` will return an empty string (``""``).
::

   >>> f.read()
   'This is the entire file.\n'
   >>> f.read()
   ''

``f.readline()`` reads a single line from the file; a newline character (``\n``)
is left at the end of the string, and is only omitted on the last line of the
file if the file doesn't end in a newline. This makes the return value
unambiguous; if ``f.readline()`` returns an empty string, the end of the file
has been reached, while a blank line is represented by ``'\n'``, a string
containing only a single newline. ::

   >>> f.readline()
   'This is the first line of the file.\n'
   >>> f.readline()
   'Second line of the file\n'
   >>> f.readline()
   ''

``f.readlines()`` returns a list containing all the lines of data in the file.
If given an optional parameter *sizehint*, it reads that many bytes from the
file and enough more to complete a line, and returns the lines from that. This
is often used to allow efficient reading of a large file by lines, but without
having to load the entire file in memory. Only complete lines will be returned.
::

   >>> f.readlines()
   ['This is the first line of the file.\n', 'Second line of the file\n']

An alternative approach to reading lines is to loop over the file object. This is
memory efficient, fast, and leads to simpler code::

   >>> for line in f:
   ...     print line,
   ...
   This is the first line of the file.
   Second line of the file

The alternative approach is simpler but does not provide as fine-grained
control. Since the two approaches manage line buffering differently, they
should not be mixed.

``f.write(string)`` writes the contents of *string* to the file, returning
``None``. ::

   >>> f.write('This is a test\n')

To write something other than a string, it needs to be converted to a string
first::

   >>> value = ('the answer', 42)
   >>> s = str(value)
   >>> f.write(s)

``f.tell()`` returns an integer giving the file object's current position in the
file, measured in bytes from the beginning of the file. To change the file
object's position, use ``f.seek(offset, from_what)``. The position is computed
from adding *offset* to a reference point; the reference point is selected by
the *from_what* argument. A *from_what* value of 0 measures from the beginning
of the file, 1 uses the current file position, and 2 uses the end of the file as
the reference point. *from_what* can be omitted and defaults to 0, using the
beginning of the file as the reference point. ::

   >>> f = open('/tmp/workfile', 'r+')
   >>> f.write('0123456789abcdef')
   >>> f.seek(5)     # Go to the 6th byte in the file
   >>> f.read(1)
   '5'
   >>> f.seek(-3, 2) # Go to the 3rd byte before the end
   >>> f.read(1)
   'd'

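The session above does not show ``f.tell()`` or a *from_what* value of 1. The
following sketch fills in both; it assumes a scratch file created with
:mod:`tempfile` and opens it in binary mode so that positions are plain byte
counts::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'workfile')  # hypothetical file

   f = open(path, 'wb')
   f.write(b'0123456789abcdef')
   f.close()

   f = open(path, 'rb')
   f.seek(5)                    # from_what defaults to 0: count from the start
   at_start = f.tell()          # 5
   first = f.read(1)            # the byte '5'; the position is now 6

   f.seek(2, 1)                 # 1: two bytes forward from the current position
   at_current = f.tell()        # 8
   second = f.read(1)           # the byte '8'

   f.seek(-3, 2)                # 2: three bytes before the end of the file
   at_end = f.tell()            # 13
   third = f.read(1)            # the byte 'd'

   f.seek(0, 2)                 # the end itself, so tell() gives the file size
   size = f.tell()              # 16
   f.close()
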
When you're done with a file, call ``f.close()`` to close it and free up any
system resources taken up by the open file. After calling ``f.close()``,
attempts to use the file object will automatically fail. ::

   >>> f.close()
   >>> f.read()
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   ValueError: I/O operation on closed file

It is good practice to use the :keyword:`with` keyword when dealing with file
objects. This has the advantage that the file is properly closed after its
suite finishes, even if an exception is raised on the way. It is also much
shorter than writing equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

   >>> with open('/tmp/workfile', 'r') as f:
   ...     read_data = f.read()
   >>> f.closed
   True

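For comparison, here is roughly what the longer :keyword:`try`\ -\
:keyword:`finally` form looks like. This is a sketch for a file object only;
the scratch file and its contents are assumptions made so that the example can
be run on its own::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'workfile')  # hypothetical file
   f = open(path, 'w')
   f.write('This is a test\n')
   f.close()

   f = open(path, 'r')
   try:
       read_data = f.read()
   finally:
       # Runs whether or not read() raised an exception.
       f.close()

   is_closed = f.closed         # True
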
File objects have some additional methods, such as :meth:`isatty` and
:meth:`truncate` which are less frequently used; consult the Library Reference
for a complete guide to file objects.


.. _tut-pickle:

The :mod:`pickle` Module
------------------------

.. index:: module: pickle

Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.

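The "bit more effort" for numbers might look like the sketch below, which
stores one integer per line and converts each line back with :func:`int`. The
file name, the sample numbers and the one-per-line layout are assumptions made
for this example, not a fixed convention::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'numbers.txt')  # hypothetical file
   numbers = [123, 45, 6789]

   f = open(path, 'w')
   for n in numbers:
       f.write(str(n) + '\n')   # write() only accepts strings
   f.close()

   f = open(path)
   # int() ignores the newline that is left at the end of each line.
   restored = [int(line) for line in f]
   f.close()

Even this simple case needs a hand-written format; nested lists or
dictionaries would need considerably more code.
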
Rather than have users be constantly writing and debugging code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a string representation; this process
is called :dfn:`pickling`. Reconstructing the object from the string
representation is called :dfn:`unpickling`. Between pickling and unpickling,
the string representing the object may have been stored in a file or data, or
sent over a network connection to some distant machine.

If you have an object ``x``, and a file object ``f`` that's been opened for
writing, the simplest way to pickle the object takes only one line of code::

   pickle.dump(x, f)

To unpickle the object again, if ``f`` is a file object which has been opened
for reading::

   x = pickle.load(f)

(There are other variants of this, used when pickling many objects or when you
don't want to write the pickled data to a file; consult the complete
documentation for :mod:`pickle` in the Python Library Reference.)

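Put together, a complete round trip might look like the following sketch. The
sample object and the file name are made up, and the file is opened in binary
mode as a precaution on platforms that distinguish text from binary files. The
last two lines use :func:`pickle.dumps` and :func:`pickle.loads`, the variants
that work on a string instead of a file::

   import os
   import pickle
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'data.pickle')  # hypothetical file
   x = {'name': 'Jack', 'phones': [4098, 4127], 'point': (1.5, -2)}

   f = open(path, 'wb')
   pickle.dump(x, f)            # pickle x into the file
   f.close()

   f = open(path, 'rb')
   y = pickle.load(f)           # rebuild an equal, but separate, object
   f.close()

   s = pickle.dumps(x)          # pickle without a file
   z = pickle.loads(s)
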
:mod:`pickle` is the standard way to make Python objects which can be stored and
reused by other programs or by a future invocation of the same program; the
technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is
so widely used, many authors who write Python extensions take care to ensure
that new data types such as matrices can be properly pickled and unpickled.