.. _tut-io:

****************
Input and Output
****************

There are several ways to present the output of a program; data can be printed
in a human-readable form, or written to a file for future use. This chapter will
discuss some of the possibilities.


.. _tut-formatting:

Fancier Output Formatting
=========================

So far we've encountered two ways of writing values: *expression statements* and
the :func:`print` function. (A third way is using the :meth:`write` method
of file objects; the standard output file can be referenced as ``sys.stdout``.
See the Library Reference for more information on this.)

.. index:: module: string

Often you'll want more control over the formatting of your output than simply
printing space-separated values. There are two ways to format your output. The
first way is to do all the string handling yourself: using string slicing and
concatenation operations you can create any layout you can imagine. String
objects have some methods that perform useful operations for padding strings
to a given column width; these will be discussed shortly. The second way is to
use the :meth:`str.format` method.

The :mod:`string` module contains a :class:`Template` class which offers yet
another way to substitute values into strings.

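As a brief illustration of :class:`Template`, the sketch below fills in named
``$``-placeholders from keyword arguments. The placeholder names and the
message text are made up for this example::

   from string import Template

   # Placeholders are written as $name; substitute() fills them in.
   t = Template('$who likes $what')
   line = t.substitute(who='Tim', what='spam')
   print(line)

   # safe_substitute() leaves unknown placeholders in place
   # instead of raising KeyError.
   partial = t.safe_substitute(who='Tim')
   print(partial)
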
One question remains, of course: how do you convert values to strings? Luckily,
Python has ways to convert any value to a string: pass it to the :func:`repr`
or :func:`str` functions. Reverse quotes (backticks) used to be equivalent to
:func:`repr`, but they are no longer used in modern Python code and have been
removed from the language.

The :func:`str` function is meant to return representations of values which are
fairly human-readable, while :func:`repr` is meant to generate representations
which can be read by the interpreter (or will force a :exc:`SyntaxError` if
there is no equivalent syntax). For objects which don't have a particular
representation for human consumption, :func:`str` will return the same value as
:func:`repr`. Many values, such as numbers or structures like lists and
dictionaries, have the same representation using either function. Strings and
floating point numbers, in particular, have two distinct representations.

Some examples::

   >>> s = 'Hello, world.'
   >>> str(s)
   'Hello, world.'
   >>> repr(s)
   "'Hello, world.'"
   >>> str(0.1)
   '0.1'
   >>> repr(0.1)
   '0.10000000000000001'
   >>> x = 10 * 3.25
   >>> y = 200 * 200
   >>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
   >>> print(s)
   The value of x is 32.5, and y is 40000...
   >>> # The repr() of a string adds string quotes and backslashes:
   ... hello = 'hello, world\n'
   >>> hellos = repr(hello)
   >>> print(hellos)
   'hello, world\n'
   >>> # The argument to repr() may be any Python object:
   ... repr((x, y, ('spam', 'eggs')))
   "(32.5, 40000, ('spam', 'eggs'))"

Here are two ways to write a table of squares and cubes::

   >>> for x in range(1, 11):
   ...     print(repr(x).rjust(2), repr(x*x).rjust(3), end=' ')
   ...     # Note use of 'end' on previous line
   ...     print(repr(x*x*x).rjust(4))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

   >>> for x in range(1, 11):
   ...     print('{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x))
   ...
    1   1    1
    2   4    8
    3   9   27
    4  16   64
    5  25  125
    6  36  216
    7  49  343
    8  64  512
    9  81  729
   10 100 1000

(Note that in the first example, one space between each column was added by the
way :func:`print` works: it always adds spaces between its arguments.)

This example demonstrates the :meth:`rjust` method of string objects, which
right-justifies a string in a field of a given width by padding it with spaces
on the left. There are similar methods :meth:`ljust` and :meth:`center`. These
methods do not write anything; they just return a new string. If the input
string is too long, they don't truncate it, but return it unchanged; this will
mess up your column layout, but that's usually better than the alternative,
which would be lying about a value. (If you really want truncation you can
always add a slice operation, as in ``x.ljust(n)[:n]``.)

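The three justification methods and the slicing idiom can be compared side by
side. This is only a small sketch; the sample strings and the field width of 8
are arbitrary choices for illustration::

   name = 'spam'
   left = name.ljust(8)       # padded with spaces on the right
   middle = name.center(8)    # padded on both sides
   right = name.rjust(8)      # padded with spaces on the left

   # A string longer than the field is returned unchanged ...
   long_name = 'lovely spam'
   unchanged = long_name.ljust(8)
   # ... unless it is explicitly sliced down to the field width.
   truncated = long_name.ljust(8)[:8]

   for field in (left, middle, right, unchanged, truncated):
       print(repr(field))
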
There is another method, :meth:`zfill`, which pads a numeric string on the left
with zeros. It understands about plus and minus signs::

   >>> '12'.zfill(5)
   '00012'
   >>> '-3.14'.zfill(7)
   '-003.14'
   >>> '3.14159265359'.zfill(5)
   '3.14159265359'

Basic usage of the :meth:`str.format` method looks like this::

   >>> print('We are the {0} who say "{1}!"'.format('knights', 'Ni'))
   We are the knights who say "Ni!"

The braces and the characters within them (called format fields) are replaced
with the objects passed into the format method. The number in the braces refers
to the position of the object passed into the format method. ::

   >>> print('{0} and {1}'.format('spam', 'eggs'))
   spam and eggs
   >>> print('{1} and {0}'.format('spam', 'eggs'))
   eggs and spam

If keyword arguments are used in the format method, their values are referred to
by using the name of the argument. ::

   >>> print('This {food} is {adjective}.'.format(
   ...       food='spam', adjective='absolutely horrible'))
   This spam is absolutely horrible.

Positional and keyword arguments can be arbitrarily combined::

   >>> print('The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
   ...                                                    other='Georg'))
   The story of Bill, Manfred, and Georg.

An optional ``':'`` and format specifier can follow the field name. This allows
greater control over how the value is formatted. The following example
rounds Pi to three places after the decimal point::

   >>> import math
   >>> print('The value of PI is approximately {0:.3f}.'.format(math.pi))
   The value of PI is approximately 3.142.

Passing an integer after the ``':'`` will cause that field to be a minimum
number of characters wide. This is useful for making tables pretty. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
   >>> for name, phone in table.items():
   ...     print('{0:10} ==> {1:10d}'.format(name, phone))
   ...
   Jack       ==>       4098
   Dcab       ==>       7678
   Sjoerd     ==>       4127

If you have a really long format string that you don't want to split up, it
would be nice if you could reference the variables to be formatted by name
instead of by position. This can be done by simply passing the dict and using
square brackets ``'[]'`` to access the keys ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
   ...       'Dcab: {0[Dcab]:d}'.format(table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This could also be done by passing the table as keyword arguments with the
``**`` notation. ::

   >>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
   >>> print('Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table))
   Jack: 4098; Sjoerd: 4127; Dcab: 8637678

This is particularly useful in combination with the built-in :func:`vars`
function, which returns a dictionary containing all local variables.

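One way the combination with :func:`vars` might look is sketched below. The
function ``describe`` and its arguments are hypothetical names chosen for this
example::

   def describe(name, age):
       # vars() with no argument returns the local namespace as a dict,
       # so the field names can refer directly to local variables.
       return '{name} is {age:d} years old.'.format(**vars())

   message = describe('Sjoerd', 41)
   print(message)
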
For a complete overview of string formatting with :meth:`str.format`, see
:ref:`formatstrings`.


Old string formatting
---------------------

The ``%`` operator can also be used for string formatting. It interprets the
left argument much like a :cfunc:`sprintf`\ -style format string to be applied
to the right argument, and returns the string resulting from this formatting
operation. For example::

   >>> import math
   >>> print('The value of PI is approximately %5.3f.' % math.pi)
   The value of PI is approximately 3.142.

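When more than one value has to be formatted, the right argument of ``%`` can
be a tuple, or a dictionary if the format string uses ``%(name)`` keys. The
following is a small sketch with made-up sample data::

   import math

   # A tuple supplies one value per conversion, in order.
   line = '%-10s ==> %5d' % ('Jack', 4098)

   # A dictionary supplies values by name.
   named = '%(name)s: %(phone)d' % {'name': 'Sjoerd', 'phone': 4127}

   # A single value needs no tuple.
   rounded = '%5.3f' % math.pi

   for text in (line, named, rounded):
       print(text)
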
Since :meth:`str.format` is quite new, a lot of Python code still uses the ``%``
operator. However, because this old style of formatting will eventually be
removed from the language, :meth:`str.format` should generally be used.

More information can be found in the :ref:`old-string-formatting` section.


.. _tut-files:

Reading and Writing Files
=========================

.. index::
   builtin: open
   object: file

:func:`open` returns a file object, and is most commonly used with two
arguments: ``open(filename, mode)``.

::

   >>> f = open('/tmp/workfile', 'w')
   >>> print(f)
   <open file '/tmp/workfile', mode 'w' at 80a0960>

The first argument is a string containing the filename. The second argument is
another string containing a few characters describing the way in which the file
will be used. *mode* can be ``'r'`` when the file will only be read, ``'w'``
for only writing (an existing file with the same name will be erased), and
``'a'`` opens the file for appending; any data written to the file is
automatically added to the end. ``'r+'`` opens the file for both reading and
writing. The *mode* argument is optional; ``'r'`` will be assumed if it's
omitted.

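The effect of the different modes can be tried out on a scratch file. The
sketch below assumes a throwaway file in a temporary directory; the file name
and the text written to it are arbitrary::

   import os
   import tempfile

   # A scratch file in a temporary directory; the name is arbitrary.
   path = os.path.join(tempfile.mkdtemp(), 'workfile')

   f = open(path, 'w')      # 'w' creates the file (or erases an existing one)
   f.write('first line\n')
   f.close()

   f = open(path, 'a')      # 'a' adds to the end of the existing data
   f.write('second line\n')
   f.close()

   f = open(path)           # mode omitted, so 'r' is assumed
   contents = f.read()
   f.close()

   print(contents, end='')
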
On Windows and the Macintosh, ``'b'`` appended to the mode opens the file in
binary mode, so there are also modes like ``'rb'``, ``'wb'``, and ``'r+b'``.
Windows makes a distinction between text and binary files; the end-of-line
characters in text files are automatically altered slightly when data is read or
written. This behind-the-scenes modification to file data is fine for ASCII
text files, but it'll corrupt binary data like that in :file:`JPEG` or
:file:`EXE` files. Be very careful to use binary mode when reading and writing
such files. On Unix, it doesn't hurt to append a ``'b'`` to the mode, so
you can use it platform-independently for all binary files.

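A round trip through a binary-mode file might look like the following sketch.
It assumes a Python version with a :class:`bytes` type; the file name and the
byte values are arbitrary and were chosen to include the end-of-line bytes that
text mode could alter::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'data.bin')   # arbitrary name
   payload = bytes([0, 10, 13, 255])   # includes the bytes for '\n' and '\r'

   f = open(path, 'wb')     # binary mode: bytes are written unchanged
   f.write(payload)
   f.close()

   f = open(path, 'rb')
   data = f.read()
   f.close()

   print(data == payload)
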
.. _tut-filemethods:

Methods of File Objects
-----------------------

The rest of the examples in this section will assume that a file object called
``f`` has already been created.

To read a file's contents, call ``f.read(size)``, which reads some quantity of
data and returns it as a string. *size* is an optional numeric argument. When
*size* is omitted or negative, the entire contents of the file will be read and
returned; it's your problem if the file is twice as large as your machine's
memory. Otherwise, at most *size* bytes are read and returned. If the end of
the file has been reached, ``f.read()`` will return an empty string (``""``).
::

   >>> f.read()
   'This is the entire file.\n'
   >>> f.read()
   ''

``f.readline()`` reads a single line from the file; a newline character (``\n``)
is left at the end of the string, and is only omitted on the last line of the
file if the file doesn't end in a newline. This makes the return value
unambiguous; if ``f.readline()`` returns an empty string, the end of the file
has been reached, while a blank line is represented by ``'\n'``, a string
containing only a single newline. ::

   >>> f.readline()
   'This is the first line of the file.\n'
   >>> f.readline()
   'Second line of the file\n'
   >>> f.readline()
   ''

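Because only the end of the file produces an empty string, a loop around
``f.readline()`` can tell a blank line from the end of the data. In the sketch
below an :class:`io.StringIO` object stands in for a real file, and the sample
text is invented::

   import io

   # io.StringIO behaves like a text file opened for reading.
   f = io.StringIO('first line\n\nlast line without newline')

   lines = []
   while True:
       line = f.readline()
       if line == '':       # empty string: end of file reached
           break
       lines.append(line)   # a blank line arrives as '\n', not ''

   print(lines)
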
``f.readlines()`` returns a list containing all the lines of data in the file.
If given an optional parameter *sizehint*, it reads that many bytes from the
file and enough more to complete a line, and returns the lines from that. This
is often used to allow efficient reading of a large file by lines, but without
having to load the entire file in memory. Only complete lines will be returned.
::

   >>> f.readlines()
   ['This is the first line of the file.\n', 'Second line of the file\n']

An alternative approach to reading lines is to loop over the file object. This is
memory efficient, fast, and leads to simpler code::

   >>> for line in f:
   ...     print(line, end='')
   ...
   This is the first line of the file.
   Second line of the file

The alternative approach is simpler but does not provide as fine-grained
control. Since the two approaches manage line buffering differently, they
should not be mixed.

``f.write(string)`` writes the contents of *string* to the file, returning
``None``. ::

   >>> f.write('This is a test\n')

To write something other than a string, it needs to be converted to a string
first::

   >>> value = ('the answer', 42)
   >>> s = str(value)
   >>> f.write(s)

``f.tell()`` returns an integer giving the file object's current position in the
file, measured in bytes from the beginning of the file. To change the file
object's position, use ``f.seek(offset, from_what)``. The position is computed
from adding *offset* to a reference point; the reference point is selected by
the *from_what* argument. A *from_what* value of 0 measures from the beginning
of the file, 1 uses the current file position, and 2 uses the end of the file as
the reference point. *from_what* can be omitted and defaults to 0, using the
beginning of the file as the reference point. ::

   >>> f = open('/tmp/workfile', 'r+')
   >>> f.write('0123456789abcdef')
   >>> f.seek(5)     # Go to the 6th byte in the file
   >>> f.read(1)
   '5'
   >>> f.seek(-3, 2) # Go to the 3rd byte before the end
   >>> f.read(1)
   'd'

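All three reference points can be exercised together with ``f.tell()``. The
sketch below opens a scratch file in binary mode, on the assumption that
offsets should be plain byte counts; the file name is arbitrary::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'workfile')   # arbitrary name

   f = open(path, 'wb+')        # binary mode, reading and writing
   f.write(b'0123456789abcdef')

   f.seek(5)                    # from_what omitted: offset from the start
   start_pos = f.tell()
   first = f.read(1)            # reading advances the position by one byte

   f.seek(2, 1)                 # 1: two bytes forward from the current position
   second = f.read(1)

   f.seek(-3, 2)                # 2: three bytes before the end of the file
   third = f.read(1)
   end_pos = f.tell()
   f.close()

   print(start_pos, first, second, third, end_pos)
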
When you're done with a file, call ``f.close()`` to close it and free up any
system resources taken up by the open file. After calling ``f.close()``,
attempts to use the file object will automatically fail. ::

   >>> f.close()
   >>> f.read()
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   ValueError: I/O operation on closed file

It is good practice to use the :keyword:`with` keyword when dealing with file
objects. This has the advantage that the file is properly closed after its
suite finishes, even if an exception is raised on the way. It is also much
shorter than writing equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::

   >>> with open('/tmp/workfile', 'r') as f:
   ...     read_data = f.read()
   >>> f.closed
   True

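For comparison, the longer :keyword:`try`\ -\ :keyword:`finally` form might be
written roughly as follows. This is a sketch of the idea rather than the exact
expansion of :keyword:`with`; the scratch file name is arbitrary::

   import os
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'workfile')   # arbitrary name
   with open(path, 'w') as f:
       f.write('This is a test\n')

   # Roughly what a with statement arranges when reading the file back:
   f = open(path, 'r')
   try:
       read_data = f.read()
   finally:
       f.close()        # runs whether or not read() raised an exception

   print(f.closed)
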
File objects have some additional methods, such as :meth:`isatty` and
:meth:`truncate`, which are less frequently used; consult the Library Reference
for a complete guide to file objects.


.. _tut-pickle:

The :mod:`pickle` Module
------------------------

.. index:: module: pickle

Strings can easily be written to and read from a file. Numbers take a bit more
effort, since the :meth:`read` method only returns strings, which will have to
be passed to a function like :func:`int`, which takes a string like ``'123'``
and returns its numeric value 123. However, when you want to save more complex
data types like lists, dictionaries, or class instances, things get a lot more
complicated.

Rather than having users constantly write and debug code to save
complicated data types, Python provides a standard module called :mod:`pickle`.
This is an amazing module that can take almost any Python object (even some
forms of Python code!), and convert it to a string representation; this process
is called :dfn:`pickling`. Reconstructing the object from the string
representation is called :dfn:`unpickling`. Between pickling and unpickling,
the string representing the object may have been stored in a file or data, or
sent over a network connection to some distant machine.

398
399If you have an object ``x``, and a file object ``f`` that's been opened for
400writing, the simplest way to pickle the object takes only one line of code::
401
402 pickle.dump(x, f)
403
404To unpickle the object again, if ``f`` is a file object which has been opened
405for reading::
406
407 x = pickle.load(f)
408
409(There are other variants of this, used when pickling many objects or when you
410don't want to write the pickled data to a file; consult the complete
411documentation for :mod:`pickle` in the Python Library Reference.)
412
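Put together, a complete round trip might look like the sketch below. It
assumes the files are opened in binary mode, since current versions of
:mod:`pickle` produce bytes; the file name and the sample object are invented
for this example::

   import os
   import pickle
   import tempfile

   path = os.path.join(tempfile.mkdtemp(), 'data.pickle')   # arbitrary name
   x = {'name': 'spam', 'counts': [1, 2, 3], 'pair': ('the answer', 42)}

   # Pickled data is bytes here, so the file is opened in binary mode.
   with open(path, 'wb') as f:
       pickle.dump(x, f)

   with open(path, 'rb') as f:
       y = pickle.load(f)

   print(y == x)    # an equal object was reconstructed
   print(y is x)    # but it is a new object, not the original one
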
:mod:`pickle` is the standard way to make Python objects which can be stored and
reused by other programs or by a future invocation of the same program; the
technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is
so widely used, many authors who write Python extensions take care to ensure
that new data types such as matrices can be properly pickled and unpickled.


419