\section{\module{fileinput} ---
         Iterate over lines from multiple input streams}
\declaremodule{standard}{fileinput}
\moduleauthor{Guido van Rossum}{guido@python.org}
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}

\modulesynopsis{Perl-like iteration over lines from multiple input
streams, with ``save in place'' capability.}


This module implements a helper class and functions to quickly write a
loop over standard input or a list of files.

The typical use is:

\begin{verbatim}
import fileinput
for line in fileinput.input():
    process(line)
\end{verbatim}

This iterates over the lines of all files listed in
\code{sys.argv[1:]}, defaulting to \code{sys.stdin} if the list is
empty.  If a filename is \code{'-'}, it is also replaced by
\code{sys.stdin}.  To specify an alternative list of filenames, pass
it as the first argument to \function{input()}.  A single file name is
also allowed.
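
As a minimal sketch of passing an explicit list of filenames, the
following creates two scratch files and iterates over both.  The file
names and the use of \module{tempfile} are assumptions made only for
illustration, and the code is written for a recent Python release.

```python
import fileinput
import os
import tempfile

# Create two small sample files (hypothetical names) in a scratch directory.
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in [("a.txt", "alpha\nbeta\n"), ("b.txt", "gamma\n")]:
    path = os.path.join(tmpdir, name)
    with open(path, "w") as f:
        f.write(text)
    paths.append(path)

# Pass the list explicitly instead of relying on sys.argv[1:].
lines = []
for line in fileinput.input(paths):
    lines.append(line)
fileinput.close()
```

The lines of the second file follow those of the first without any
separator, as if the files had been concatenated.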

All files are opened in text mode by default, but you can override this by
specifying the \var{mode} parameter in the call to \function{input()}
or \class{FileInput()}.  If an I/O error occurs during opening or reading
a file, \exception{IOError} is raised.

If \code{sys.stdin} is used more than once, the second and later uses
will return no lines, except perhaps for interactive use, or if it has
been explicitly reset (e.g. using \code{sys.stdin.seek(0)}).

Empty files are opened and immediately closed; the only time their
presence in the list of filenames is noticeable at all is when the
last file opened is empty.

The last line of a file may not end in a newline character; lines are
returned with their trailing newline when one is present.
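
The newline behavior can be illustrated with a short sketch.  The
sample file is hypothetical and is created with \module{tempfile} only
so that the example is self-contained.

```python
import fileinput
import os
import tempfile

# Hypothetical sample file whose last line has no trailing newline.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("first\nlast")

lines = list(fileinput.input([path]))
fileinput.close()
os.remove(path)

# Strip the newline only where one is present.
stripped = [line.rstrip("\n") for line in lines]
```

Code that needs uniform lines can strip the newline as shown rather
than assume that every line ends with one.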

You can control how files are opened by providing an opening hook via the
\var{openhook} parameter to \function{input()} or \class{FileInput()}.
The hook must be a function that takes two arguments, \var{filename}
and \var{mode}, and returns an accordingly opened file-like object.
Two useful hooks are already provided by this module.
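
A user-defined hook might look like the following sketch.  The hook
name, the list used for logging, and the scratch file are all
hypothetical; the hook simply records each call and then opens the
file normally.

```python
import fileinput
import os
import tempfile

opened = []  # records every (basename, mode) pair the hook receives

def logging_hook(filename, mode):
    # Hypothetical hook: note the call, then open the file as usual.
    opened.append((os.path.basename(filename), mode))
    return open(filename, mode)

# Scratch file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("one\ntwo\n")

fi = fileinput.FileInput(files=[path], openhook=logging_hook)
lines = list(fi)
fi.close()
os.remove(path)
```

The hook is called once per file, with the mode that was passed to the
constructor (the default text mode here).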

The following function is the primary interface of this module:

\begin{funcdesc}{input}{\optional{files\optional{, inplace\optional{,
                        backup\optional{, mode\optional{, openhook}}}}}}
  Create an instance of the \class{FileInput} class.  The instance
  will be used as global state for the functions of this module, and
  is also returned to use during iteration.  The parameters to this
  function will be passed along to the constructor of the
  \class{FileInput} class.

  \versionchanged[Added the \var{mode} and \var{openhook} parameters]{2.5}
\end{funcdesc}


The following functions use the global state created by
\function{input()}; if there is no active state,
\exception{RuntimeError} is raised.

\begin{funcdesc}{filename}{}
  Return the name of the file currently being read.  Before the first
  line has been read, returns \code{None}.
\end{funcdesc}

\begin{funcdesc}{fileno}{}
  Return the integer ``file descriptor'' for the current file.  When no
  file is opened (before the first line and between files), returns
  \code{-1}.
\versionadded{2.5}
\end{funcdesc}

\begin{funcdesc}{lineno}{}
  Return the cumulative line number of the line that has just been
  read.  Before the first line has been read, returns \code{0}.  After
  the last line of the last file has been read, returns the line
  number of that line.
\end{funcdesc}

\begin{funcdesc}{filelineno}{}
  Return the line number in the current file.  Before the first line
  has been read, returns \code{0}.  After the last line of the last
  file has been read, returns the line number of that line within the
  file.
\end{funcdesc}

\begin{funcdesc}{isfirstline}{}
  Returns true if the line just read is the first line of its file,
  otherwise returns false.
\end{funcdesc}
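
The state functions described so far can be combined as in the
following sketch, which records the file name, the cumulative line
number, the per-file line number, and the first-line flag for every
line.  The two sample files are hypothetical and exist only to make
the example self-contained.

```python
import fileinput
import os
import tempfile

# Two hypothetical sample files with two lines each.
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in [("first.txt", "a\nb\n"), ("second.txt", "c\nd\n")]:
    path = os.path.join(tmpdir, name)
    with open(path, "w") as f:
        f.write(text)
    paths.append(path)

records = []
for line in fileinput.input(paths):
    records.append((os.path.basename(fileinput.filename()),
                    fileinput.lineno(),       # cumulative across files
                    fileinput.filelineno(),   # restarts for each file
                    fileinput.isfirstline()))
fileinput.close()
```

The cumulative count keeps increasing across the file boundary, while
the per-file count starts again at 1 for the second file.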

\begin{funcdesc}{isstdin}{}
  Returns true if the last line was read from \code{sys.stdin},
  otherwise returns false.
\end{funcdesc}

\begin{funcdesc}{nextfile}{}
  Close the current file so that the next iteration will read the
  first line from the next file (if any); lines not read from the file
  will not count towards the cumulative line count.  The filename is
  not changed until after the first line of the next file has been
  read.  Before the first line has been read, this function has no
  effect; it cannot be used to skip the first file.  After the last
  line of the last file has been read, this function has no effect.
\end{funcdesc}
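
One possible use of \function{nextfile()} is to read only the first
line of each file, as in the sketch below.  The sample files are
hypothetical and are created only for the purpose of the example.

```python
import fileinput
import os
import tempfile

# Two hypothetical sample files with two lines each.
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in [("first.txt", "a\nb\n"), ("second.txt", "c\nd\n")]:
    path = os.path.join(tmpdir, name)
    with open(path, "w") as f:
        f.write(text)
    paths.append(path)

heads = []
for line in fileinput.input(paths):
    heads.append(line)
    # Skip whatever remains of the current file.
    fileinput.nextfile()
fileinput.close()
```

Because \function{nextfile()} is called after a line has been read, it
skips the remainder of the current file, never the file as a whole.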

\begin{funcdesc}{close}{}
  Close the sequence.
\end{funcdesc}


The class which implements the sequence behavior provided by the
module is available for subclassing as well:

\begin{classdesc}{FileInput}{\optional{files\optional{,
                             inplace\optional{, backup\optional{,
                             mode\optional{, openhook}}}}}}
  Class \class{FileInput} is the implementation; its methods
  \method{filename()}, \method{fileno()}, \method{lineno()},
  \method{filelineno()}, \method{isfirstline()}, \method{isstdin()},
  \method{nextfile()} and \method{close()} correspond to the functions
  of the same name in the module.
  In addition it has a \method{readline()} method which
  returns the next input line, and a \method{__getitem__()} method
  which implements the sequence behavior.  The sequence must be
  accessed in strictly sequential order; random access and
  \method{readline()} cannot be mixed.

  With \var{mode} you can specify which file mode will be passed to
  \function{open()}.  It must be one of \code{'r'}, \code{'rU'},
  \code{'U'} and \code{'rb'}.

  The \var{openhook}, when given, must be a function that takes two arguments,
  \var{filename} and \var{mode}, and returns an accordingly opened
  file-like object.
  You cannot use \var{inplace} and \var{openhook} together.

  \versionchanged[Added the \var{mode} and \var{openhook} parameters]{2.5}
\end{classdesc}

\strong{Optional in-place filtering:} if the keyword argument
\code{\var{inplace}=1} is passed to \function{input()} or to the
\class{FileInput} constructor, the file is moved to a backup file and
standard output is directed to the input file (if a file of the same
name as the backup file already exists, it will be replaced silently).
This makes it possible to write a filter that rewrites its input file
in place.  If the keyword argument \code{\var{backup}='.<some
extension>'} is also given, it specifies the extension for the backup
file, and the backup file remains around; by default, the extension is
\code{'.bak'} and it is deleted when the output file is closed.  In-place
filtering is disabled when standard input is read.
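
In-place filtering can be sketched as follows: each line of a scratch
file is rewritten in upper case, and the original contents are kept in
a backup file.  The file, its contents, and the \code{'.orig'}
extension are assumptions chosen for illustration.

```python
import fileinput
import os
import tempfile

# Hypothetical scratch file to be rewritten in place.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as f:
    f.write("hello\nworld\n")

# inplace=1 redirects standard output into the file being read;
# backup=".orig" keeps the original contents next to it.
for line in fileinput.input([path], inplace=1, backup=".orig"):
    print(line.rstrip("\n").upper())
fileinput.close()

with open(path) as f:
    rewritten = f.read()
with open(path + ".orig") as f:
    original = f.read()
```

Anything printed inside the loop ends up in the file, so diagnostic
output should go to \code{sys.stderr} while the filter is running.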

\strong{Caveat:} The current implementation does not work for MS-DOS
8+3 filesystems.


The two following opening hooks are provided by this module:

\begin{funcdesc}{hook_compressed}{filename, mode}
  Transparently opens files compressed with gzip and bzip2 (recognized
  by the extensions \code{'.gz'} and \code{'.bz2'}) using the \module{gzip}
  and \module{bz2} modules.  If the filename extension is not \code{'.gz'}
  or \code{'.bz2'}, the file is opened normally (i.e.,
  using \function{open()} without any decompression).

  Usage example:
  \samp{fi = fileinput.FileInput(openhook=fileinput.hook_compressed)}

  \versionadded{2.5}
\end{funcdesc}
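
A somewhat fuller sketch of the compressed hook reads one plain file
and one gzip-compressed file in a single pass.  The file names are
hypothetical.  Binary mode (\code{'rb'}) is an assumption of this
sketch, chosen so that both files yield the same kind of line object
on recent Python releases.

```python
import fileinput
import gzip
import os
import tempfile

tmpdir = tempfile.mkdtemp()
plain_path = os.path.join(tmpdir, "plain.txt")
gz_path = os.path.join(tmpdir, "packed.txt.gz")

# One uncompressed and one gzip-compressed sample file.
with open(plain_path, "wb") as f:
    f.write(b"plain line\n")
with gzip.open(gz_path, "wb") as f:
    f.write(b"compressed line\n")

# The hook picks the opener from the filename extension.
fi = fileinput.FileInput(files=[plain_path, gz_path], mode="rb",
                         openhook=fileinput.hook_compressed)
lines = list(fi)
fi.close()
```

The loop body does not need to know which of the files was compressed.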

\begin{funcdesc}{hook_encoded}{encoding}
  Returns a hook which opens each file with \function{codecs.open()},
  using the given \var{encoding} to read the file.

  Usage example:
  \samp{fi = fileinput.FileInput(openhook=fileinput.hook_encoded("iso-8859-1"))}

  \note{With this hook, \class{FileInput} might return Unicode strings
        depending on the specified \var{encoding}.}
  \versionadded{2.5}
\end{funcdesc}
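
The effect of the encoding hook can be seen in the following sketch,
which writes a few ISO-8859-1 bytes to a scratch file and reads them
back as decoded text.  The file and its contents are hypothetical, and
the code is written for a recent Python release.

```python
import fileinput
import os
import tempfile

# Hypothetical scratch file holding "caf\xe9" encoded as ISO-8859-1.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(b"caf\xe9\n")

fi = fileinput.FileInput(files=[path],
                         openhook=fileinput.hook_encoded("iso-8859-1"))
lines = list(fi)
fi.close()
os.remove(path)
```

The byte \code{0xE9} comes back as the single character U+00E9, which
shows that the hook decoded the file with the requested encoding.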