:mod:`fileinput` --- Iterate over lines from multiple input streams
===================================================================

.. module:: fileinput
   :synopsis: Loop over standard input or a list of files.
.. moduleauthor:: Guido van Rossum <guido@python.org>
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>

**Source code:** :source:`Lib/fileinput.py`

--------------

This module implements a helper class and functions to quickly write a
loop over standard input or a list of files. If you just want to read or
write one file, see :func:`open`.

The typical use is::

   import fileinput
   for line in fileinput.input():
       process(line)

This iterates over the lines of all files listed in ``sys.argv[1:]``, defaulting
to ``sys.stdin`` if the list is empty.  If a filename is ``'-'``, it is also
replaced by ``sys.stdin``.  To specify an alternative list of filenames, pass it
as the first argument to :func:`.input`.  A single file name is also allowed.

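As a minimal sketch of passing an explicit list of names instead of relying on
``sys.argv[1:]``, the following creates two throwaway files and reads them back.
The file names and contents are made up for illustration.

```python
import fileinput
import os
import tempfile

# Create two small temporary files so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "first.txt")
    second = os.path.join(tmp, "second.txt")
    with open(first, "w") as fh:
        fh.write("a1\na2\n")
    with open(second, "w") as fh:
        fh.write("b1\n")

    # An explicit sequence of names replaces the sys.argv[1:] default.
    with fileinput.input(files=[first, second]) as f:
        all_lines = list(f)

    # A single file name (not wrapped in a list) is accepted as well.
    with fileinput.input(files=first) as f:
        first_only = list(f)

print(all_lines)
print(first_only)
```

The lines of both files arrive as one continuous stream, in the order the names
were given.
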
All files are opened in text mode by default, but you can override this by
specifying the *mode* parameter in the call to :func:`.input` or
:class:`FileInput`.  If an I/O error occurs while opening or reading a file,
:exc:`IOError` is raised.

If ``sys.stdin`` is used more than once, the second and further uses will return
no lines, except perhaps for interactive use, or if it has been explicitly reset
(e.g. using ``sys.stdin.seek(0)``).

Empty files are opened and immediately closed; the only time their presence in
the list of filenames is noticeable at all is when the last file opened is
empty.

Lines are returned with any newlines intact, which means that the last line in
a file may not have one.

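The newline behaviour can be seen in a small sketch like the one below, which
assumes a hypothetical file whose last line lacks a trailing newline.  Stripping
with ``rstrip("\n")`` is one possible way to normalise the lines, not something
the module does for you.

```python
import fileinput
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "no_final_newline.txt")
    with open(path, "w") as fh:
        fh.write("first\nlast")  # deliberately no trailing newline

    with fileinput.input(files=[path]) as f:
        raw = list(f)

# Remove the newline only where one is present.
stripped = [line.rstrip("\n") for line in raw]

print(raw)
print(stripped)
```
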
You can control how files are opened by providing an opening hook via the
*openhook* parameter to :func:`fileinput.input` or :class:`FileInput`. The
hook must be a function that takes two arguments, *filename* and *mode*, and
returns an accordingly opened file-like object. Two useful hooks are already
provided by this module.

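A custom hook might look like the following sketch.  ``logging_hook`` and the
``opened`` list are hypothetical names introduced here; the hook simply records
what it was asked to open and then defers to :func:`open`.

```python
import fileinput
import os
import tempfile

opened = []  # every (basename, mode) pair the hook is asked to open


def logging_hook(filename, mode):
    """Hypothetical hook: remember the request, then defer to open()."""
    opened.append((os.path.basename(filename), mode))
    return open(filename, mode)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.txt")
    with open(path, "w") as fh:
        fh.write("x\ny\n")

    # The hook is called once per file with the two documented arguments.
    with fileinput.input(files=[path], openhook=logging_hook) as f:
        hooked_lines = list(f)

print(opened)
print(hooked_lines)
```
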
The following function is the primary interface of this module:


.. function:: input(files=None, inplace=False, backup='', bufsize=0, mode='r', openhook=None)

   Create an instance of the :class:`FileInput` class.  The instance will be used
   as global state for the functions of this module, and is also returned to use
   during iteration.  The parameters to this function will be passed along to the
   constructor of the :class:`FileInput` class.

   The :class:`FileInput` instance can be used as a context manager in the
   :keyword:`with` statement.  In this example, the input is closed after the
   :keyword:`with` statement is exited, even if an exception occurs::

      with fileinput.input(files=('spam.txt', 'eggs.txt')) as f:
          for line in f:
              process(line)

   .. versionchanged:: 3.2
      Can be used as a context manager.


The following functions use the global state created by :func:`fileinput.input`;
if there is no active state, :exc:`RuntimeError` is raised.


.. function:: filename()

   Return the name of the file currently being read.  Before the first line has
   been read, returns ``None``.


.. function:: fileno()

   Return the integer "file descriptor" for the current file. When no file is
   opened (before the first line and between files), returns ``-1``.


.. function:: lineno()

   Return the cumulative line number of the line that has just been read.  Before
   the first line has been read, returns ``0``.  After the last line of the last
   file has been read, returns the line number of that line.


.. function:: filelineno()

   Return the line number in the current file.  Before the first line has been
   read, returns ``0``.  After the last line of the last file has been read,
   returns the line number of that line within the file.


.. function:: isfirstline()

   Return ``True`` if the line just read is the first line of its file, otherwise
   return ``False``.


.. function:: isstdin()

   Return ``True`` if the last line was read from ``sys.stdin``, otherwise return
   ``False``.

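To illustrate how the position functions above relate to each other, here is a
small sketch over two made-up files.  It records, for every line, the file name,
the cumulative line number, the per-file line number and whether the line is the
first of its file.

```python
import fileinput
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in (("a.txt", "a1\na2\n"), ("b.txt", "b1\n")):
        path = os.path.join(tmp, name)
        with open(path, "w") as fh:
            fh.write(text)
        paths.append(path)

    report = []
    with fileinput.input(files=paths) as f:
        # Nothing has been read yet: no file name, line number zero.
        before = (fileinput.filename(), fileinput.lineno())
        for line in f:
            report.append((
                os.path.basename(fileinput.filename()),
                fileinput.lineno(),       # counts across all files
                fileinput.filelineno(),   # restarts at 1 for each file
                fileinput.isfirstline(),
            ))

print(before)
for entry in report:
    print(entry)
```
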
.. function:: nextfile()

   Close the current file so that the next iteration will read the first line from
   the next file (if any); lines not read from the file will not count towards the
   cumulative line count.  The filename is not changed until after the first line
   of the next file has been read.  Before the first line has been read, this
   function has no effect; it cannot be used to skip the first file.  After the
   last line of the last file has been read, this function has no effect.

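One possible use of :func:`nextfile` is to read only the first line of each
file, as in the sketch below.  The file names and the "header" convention are
assumptions made for the example.

```python
import fileinput
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    paths = []
    contents = (
        ("a.txt", "# header a\na-body-1\na-body-2\n"),
        ("b.txt", "# header b\nb-body-1\n"),
    )
    for name, text in contents:
        path = os.path.join(tmp, name)
        with open(path, "w") as fh:
            fh.write(text)
        paths.append(path)

    headers = []
    counts = []
    with fileinput.input(files=paths) as f:
        for line in f:
            headers.append(line.rstrip("\n"))
            counts.append(fileinput.lineno())
            # Skip whatever is left of the current file.
            fileinput.nextfile()

print(headers)
print(counts)
```

The skipped body lines are not included in the cumulative count, so the second
header is reported as line 2 rather than line 4.
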
.. function:: close()

   Close the sequence.

The class which implements the sequence behavior provided by the module is
available for subclassing as well:


.. class:: FileInput(files=None, inplace=False, backup='', bufsize=0, mode='r', openhook=None)

   Class :class:`FileInput` is the implementation; its methods :meth:`filename`,
   :meth:`fileno`, :meth:`lineno`, :meth:`filelineno`, :meth:`isfirstline`,
   :meth:`isstdin`, :meth:`nextfile` and :meth:`close` correspond to the functions
   of the same name in the module. In addition, it has a :meth:`readline` method
   which returns the next input line, and a :meth:`__getitem__` method which
   implements the sequence behavior.  The sequence must be accessed in strictly
   sequential order; random access and :meth:`readline` cannot be mixed.

   With *mode* you can specify which file mode will be passed to :func:`open`. It
   must be one of ``'r'``, ``'rU'``, ``'U'`` or ``'rb'``.

   The *openhook*, when given, must be a function that takes two arguments,
   *filename* and *mode*, and returns an accordingly opened file-like object. You
   cannot use *inplace* and *openhook* together.

   A :class:`FileInput` instance can be used as a context manager in the
   :keyword:`with` statement.  In this example, *input* is closed after the
   :keyword:`with` statement is exited, even if an exception occurs::

      with FileInput(files=('spam.txt', 'eggs.txt')) as input:
          process(input)

   .. versionchanged:: 3.2
      Can be used as a context manager.

**Optional in-place filtering:** if the keyword argument ``inplace=True`` is
passed to :func:`fileinput.input` or to the :class:`FileInput` constructor, the
file is moved to a backup file and standard output is directed to the input file
(if a file of the same name as the backup file already exists, it will be
replaced silently).  This makes it possible to write a filter that rewrites its
input file in place.  If the *backup* parameter is given (typically as
``backup='.<some extension>'``), it specifies the extension for the backup file,
and the backup file remains around; by default, the extension is ``'.bak'`` and
it is deleted when the output file is closed.  In-place filtering is disabled
when standard input is read.

.. note::

   The current implementation does not work for MS-DOS 8+3 filesystems.

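The in-place mechanism can be sketched as follows: while the loop runs,
:func:`print` writes into the file being rewritten.  The file name, its contents
and the ``'.orig'`` backup extension are illustrative choices.

```python
import fileinput
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")
    with open(path, "w") as fh:
        fh.write("alpha\nbeta\n")

    # Inside the loop, standard output is redirected into notes.txt.
    with fileinput.input(files=[path], inplace=True, backup=".orig") as f:
        for line in f:
            print(line.upper(), end="")  # the line already ends in a newline

    with open(path) as fh:
        rewritten = fh.read()
    # Because *backup* was given, the original content is kept on disk.
    with open(path + ".orig") as fh:
        kept_backup = fh.read()

print(repr(rewritten))
print(repr(kept_backup))
```

Anything printed inside the loop ends up in the file, so diagnostic output
there should go to ``sys.stderr`` instead.
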
The following two opening hooks are provided by this module:

.. function:: hook_compressed(filename, mode)

   Transparently opens files compressed with gzip and bzip2 (recognized by the
   extensions ``'.gz'`` and ``'.bz2'``) using the :mod:`gzip` and :mod:`bz2`
   modules.  If the filename extension is not ``'.gz'`` or ``'.bz2'``, the file is
   opened normally (i.e., using :func:`open` without any decompression).

   Usage example:  ``fi = fileinput.FileInput(openhook=fileinput.hook_compressed)``

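A rough sketch of :func:`hook_compressed` reading a gzip file and a plain file
in one pass is shown below.  Binary mode (``mode='rb'``) is an assumption made
here so that lines come back as bytes regardless of how the running Python
version treats text mode for compressed files; the file names are made up.

```python
import fileinput
import gzip
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    gz_path = os.path.join(tmp, "log.txt.gz")
    plain_path = os.path.join(tmp, "log.txt")
    with gzip.open(gz_path, "wb") as fh:
        fh.write(b"zipped 1\nzipped 2\n")
    with open(plain_path, "wb") as fh:
        fh.write(b"plain 1\n")

    # The hook picks gzip for the '.gz' name and plain open() for the other.
    with fileinput.input(files=[gz_path, plain_path], mode="rb",
                         openhook=fileinput.hook_compressed) as f:
        mixed = [line.decode("ascii") for line in f]

print(mixed)
```
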
.. function:: hook_encoded(encoding)

   Returns a hook which opens each file with :func:`codecs.open`, using the given
   *encoding* to read the file.

   Usage example: ``fi = fileinput.FileInput(openhook=fileinput.hook_encoded("iso-8859-1"))``
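
As an illustrative sketch, the following writes a short ISO-8859-1 encoded file
and reads it back through :func:`hook_encoded`.  The file name and its content
are assumptions made for the example.

```python
import fileinput
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "latin1.txt")
    # "caf\u00e9" encodes to a single 0xE9 byte for the accented letter.
    with open(path, "wb") as fh:
        fh.write("caf\u00e9\n".encode("iso-8859-1"))

    hook = fileinput.hook_encoded("iso-8859-1")
    with fileinput.input(files=[path], openhook=hook) as f:
        decoded = list(f)

print(decoded)
```

Each line is decoded with the requested encoding rather than the platform
default.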