\section{\module{multifile} ---
         Support for files containing distinct parts}

\declaremodule{standard}{multifile}
\modulesynopsis{Support for reading files which contain distinct
                parts, such as some MIME data.}
\sectionauthor{Eric S. Raymond}{esr@snark.thyrsus.com}

\deprecated{2.5}{The \refmodule{email} package should be used in
                 preference to the \module{multifile} module.
                 This module is present only to maintain backward
                 compatibility.}

The \class{MultiFile} object enables you to treat sections of a text
file as file-like input objects, with \code{''} being returned by
\method{readline()} when a given delimiter pattern is encountered. The
defaults of this class are designed to make it useful for parsing
MIME multipart messages, but by subclassing it and overriding methods
it can be easily adapted for more general use.

\begin{classdesc}{MultiFile}{fp\optional{, seekable}}
Create a multi-file. You must instantiate this class with an input
object argument for the \class{MultiFile} instance to get lines from,
such as a file object returned by \function{open()}.

\class{MultiFile} only ever looks at the input object's
\method{readline()}, \method{seek()} and \method{tell()} methods, and
the latter two are only needed if you want random access to the
individual MIME parts. To use \class{MultiFile} on a non-seekable
stream object, set the optional \var{seekable} argument to false; this
will prevent using the input object's \method{seek()} and
\method{tell()} methods.
\end{classdesc}

It will be useful to know that in \class{MultiFile}'s view of the world, text
is composed of three kinds of lines: data, section-dividers, and
end-markers. \class{MultiFile} is designed to support parsing of
messages that may have multiple nested message parts, each with its
own pattern for section-divider and end-marker lines.

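The three kinds of lines can be told apart with a few string
comparisons. The following is a minimal sketch, not the module's own
code: \function{classify_line()} is a hypothetical helper, and it
assumes the default MIME decorations described under
\method{section_divider()} and \method{end_marker()}.

```python
def classify_line(line, boundary):
    """Classify one input line as 'data', 'section-divider' or 'end-marker'.

    Assumption: MIME-style decorations, i.e. '--' + boundary for a
    divider and '--' + boundary + '--' for an end-marker.  Trailing
    whitespace (including LF or CR-LF) is ignored in the comparison.
    """
    stripped = line.rstrip()
    if stripped == "--" + boundary + "--":
        return "end-marker"
    if stripped == "--" + boundary:
        return "section-divider"
    return "data"


# Illustrative input; the boundary string "XYZ" is made up.
message = [
    "preamble text\n",
    "--XYZ\n",
    "first part\n",
    "--XYZ\r\n",
    "second part\n",
    "--XYZ--\n",
]
kinds = [classify_line(line, "XYZ") for line in message]
```

For the six sample lines, \code{kinds} holds data, section-divider,
data, section-divider, data and end-marker, in that order.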
\begin{seealso}
  \seemodule{email}{Comprehensive email handling package; supersedes
                    the \module{multifile} module.}
\end{seealso}


\subsection{MultiFile Objects \label{MultiFile-objects}}

A \class{MultiFile} instance has the following methods:

\begin{methoddesc}[MultiFile]{readline}{}
Read a line. If the line is data (not a section-divider or end-marker
or real EOF) return it. If the line matches the most-recently-stacked
boundary, return \code{''} and set \code{self.last} to 1 or 0 according
to whether the match is or is not an end-marker. If the line matches any
other stacked boundary, raise an error. On encountering end-of-file on the
underlying stream object, the method raises \exception{Error} unless
all boundaries have been popped.
\end{methoddesc}

\begin{methoddesc}[MultiFile]{readlines}{}
Return all lines remaining in this part as a list of strings.
\end{methoddesc}

\begin{methoddesc}[MultiFile]{read}{}
Read all lines, up to the next section. Return them as a single
(multiline) string. Note that this doesn't take a size argument!
\end{methoddesc}

\begin{methoddesc}[MultiFile]{seek}{pos\optional{, whence}}
Seek. Seek indices are relative to the start of the current section.
The \var{pos} and \var{whence} arguments are interpreted as for a file
seek.
\end{methoddesc}

\begin{methoddesc}[MultiFile]{tell}{}
Return the file position relative to the start of the current section.
\end{methoddesc}

\begin{methoddesc}[MultiFile]{next}{}
Skip lines to the next section (that is, read lines until a
section-divider or end-marker has been consumed). Return true if
there is such a section, false if an end-marker is seen. Re-enable
the most-recently-pushed boundary.
\end{methoddesc}

\begin{methoddesc}[MultiFile]{is_data}{str}
Return true if \var{str} is data and false if it might be a section
boundary. As written, it tests for a prefix other than \code{'-}\code{-'} at
start of line (which all MIME boundaries have) but it is declared so
it can be overridden in derived classes.

Note that this test is intended as a fast guard for the real
boundary tests; if it always returns false it will merely slow
processing, not cause it to fail.
\end{methoddesc}

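The guard-then-verify arrangement can be sketched as follows. Both
function names are hypothetical stand-ins rather than the module's
code; only the prefix test mirrors the default behaviour described
above, and the full comparison assumes the default MIME decorations.

```python
def is_data(line):
    """Cheap guard: a line without a '--' prefix cannot be a MIME boundary."""
    return not line.startswith("--")


def is_boundary(line, boundary):
    """Full test, run only when the cheap guard could not rule the line out."""
    if is_data(line):
        return False
    stripped = line.rstrip()
    return stripped in ("--" + boundary, "--" + boundary + "--")
```

A line such as \code{'-}\code{- signature'} passes the guard as a
possible boundary but is then rejected by the full comparison, which
is why a guard that errs towards returning false costs time without
changing the result.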
\begin{methoddesc}[MultiFile]{push}{str}
Push a boundary string. When a decorated version of this boundary
is found as an input line, it will be interpreted as a section-divider
or end-marker (depending on the decoration, see \rfc{2045}). All subsequent
reads will return the empty string to indicate end-of-file, until a
call to \method{pop()} removes the boundary or a \method{next()} call
re-enables it.

It is possible to push more than one boundary. Encountering the
most-recently-pushed boundary will return EOF; encountering any other
boundary will raise an error.
\end{methoddesc}

\begin{methoddesc}[MultiFile]{pop}{}
Pop a section boundary. This boundary will no longer be interpreted
as EOF.
\end{methoddesc}

\begin{methoddesc}[MultiFile]{section_divider}{str}
Turn a boundary into a section-divider line. By default, this
method prepends \code{'-}\code{-'} (which MIME section boundaries have) but
it is declared so it can be overridden in derived classes. This
method need not append LF or CR-LF, as comparison with the result
ignores trailing whitespace.
\end{methoddesc}

\begin{methoddesc}[MultiFile]{end_marker}{str}
Turn a boundary string into an end-marker line. By default, this
method prepends \code{'-}\code{-'} and appends \code{'-}\code{-'} (like a
MIME-multipart end-of-message marker) but it is declared so it can be
overridden in derived classes. This method need not append LF or
CR-LF, as comparison with the result ignores trailing whitespace.
\end{methoddesc}

Finally, \class{MultiFile} instances have two public instance variables:

\begin{memberdesc}[MultiFile]{level}
Nesting depth of the current part.
\end{memberdesc}

\begin{memberdesc}[MultiFile]{last}
True if the last end-of-file was for an end-of-message marker.
\end{memberdesc}


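To see how \method{push()}, \method{next()}, \method{readline()},
\method{pop()} and \member{last} fit together, the following stand-in
models the protocol described in this section. It is an illustration
written under stated assumptions, not the implementation of
\class{MultiFile}: \class{BoundaryReader} is a hypothetical name,
\exception{ValueError} stands in for the module's \exception{Error},
the default MIME decorations are hard-coded, and \method{seek()},
\method{tell()} and \member{level} are left out.

```python
import io


class BoundaryReader:
    """Hypothetical stand-in that mimics the push()/next()/pop() protocol."""

    def __init__(self, fp):
        self.fp = fp
        self.stack = []       # stack[-1] is the most recently pushed boundary
        self.at_eof = False   # True once the current boundary has been hit
        self.last = False     # True if that boundary line was an end-marker

    def push(self, boundary):
        self.stack.append(boundary)

    def pop(self):
        self.stack.pop()
        self.at_eof = False
        self.last = False

    def readline(self):
        if self.at_eof:
            return ""
        line = self.fp.readline()
        if not line:
            # Real end of input is only acceptable with no boundary pending.
            if self.stack:
                raise ValueError("end of input with boundaries still pushed")
            return ""
        stripped = line.rstrip()
        for depth, boundary in enumerate(reversed(self.stack)):
            divider = "--" + boundary
            marker = "--" + boundary + "--"
            if stripped in (divider, marker):
                if depth > 0:
                    raise ValueError("matched an outer boundary first")
                self.at_eof = True
                self.last = stripped == marker
                return ""
        return line

    def next(self):
        # Skip whatever is left of the current section.
        while self.readline():
            pass
        if self.last:
            return False
        self.at_eof = False   # re-enable the current boundary
        return True


text = "intro\n--B\npart one\n--B\npart two\n--B--\nepilogue\n"
reader = BoundaryReader(io.StringIO(text))
reader.push("B")
parts = []
while reader.next():
    parts.append(reader.readline())
reader.pop()
```

In this run \code{parts} collects the first line of each of the two
sections, and after \method{pop()} the text following the end-marker
can be read as ordinary data.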
\subsection{\class{MultiFile} Example \label{multifile-example}}
\sectionauthor{Skip Montanaro}{skip@mojam.com}

\begin{verbatim}
import mimetools
import multifile
import StringIO

def extract_mime_part_matching(stream, mimetype):
    """Return the first element in a multipart MIME message on stream
    matching mimetype."""

    msg = mimetools.Message(stream)
    msgtype = msg.gettype()
    params = msg.getplist()

    data = StringIO.StringIO()
    if msgtype[:10] == "multipart/":

        file = multifile.MultiFile(stream)
        # Treat the message's boundary as end-of-file for each part.
        file.push(msg.getparam("boundary"))
        while file.next():
            submsg = mimetools.Message(file)
            try:
                data = StringIO.StringIO()
                mimetools.decode(file, data, submsg.getencoding())
            except ValueError:
                # Unknown transfer encoding; skip this part.
                continue
            if submsg.gettype() == mimetype:
                break
        file.pop()
    return data.getvalue()
\end{verbatim}
Guido van Rossum8668e8e1998-06-28 17:55:53 +0000175\end{verbatim}