% Documentation by ESR
\section{Standard Module \module{multifile}}
\declaremodule[multifile]{standard}{multifile}

\modulesynopsis{None}


The \class{MultiFile} object enables you to treat sections of a text
file as file-like input objects, with \code{''} being returned by
\method{readline()} when a given delimiter pattern is encountered. The
defaults of this class are designed to make it useful for parsing
MIME multipart messages, but by subclassing it and overriding methods
it can be easily adapted for more general use.

\begin{classdesc}{MultiFile}{fp\optional{, seekable}}
Create a multi-file. You must instantiate this class with an input
object argument for the \class{MultiFile} instance to get lines from,
such as a file object returned by \function{open()}.

\class{MultiFile} only ever looks at the input object's
\method{readline()}, \method{seek()} and \method{tell()} methods, and
the latter two are only needed if you want random access to the
individual MIME parts. To use \class{MultiFile} on a non-seekable
stream object, set the optional \var{seekable} argument to false; this
will prevent using the input object's \method{seek()} and
\method{tell()} methods.
\end{classdesc}

It will be useful to know that in \class{MultiFile}'s view of the world, text
is composed of three kinds of lines: data, section-dividers, and
end-markers. \class{MultiFile} is designed to support parsing of
messages that may have multiple nested message parts, each with its
own pattern for section-divider and end-marker lines.

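The three kinds of lines can be illustrated with a short, self-contained
sketch. It does not use the \module{multifile} module itself: the
\function{classify()} helper and the sample boundaries \code{'OUTER'}
and \code{'INNER'} are hypothetical, and the \code{'--'} decorations are
assumed to be the MIME-style defaults described later in this section.

\begin{verbatim}
def classify(line, boundaries):
    """Return (kind, boundary) for one input line.

    boundaries lists every boundary string currently in play.
    Assumes MIME-style decorations: '--' + boundary marks a
    section-divider, '--' + boundary + '--' marks an end-marker.
    """
    # Lines that do not start with '--' are always data.
    if not line.startswith('--'):
        return ('data', None)
    # Trailing whitespace (including LF or CR-LF) is ignored.
    marker = line.rstrip()
    for boundary in boundaries:
        if marker == '--' + boundary:
            return ('section-divider', boundary)
        if marker == '--' + boundary + '--':
            return ('end-marker', boundary)
    # Starts with '--' but matches no known boundary: still data.
    return ('data', None)

# A made-up message with one part nested inside another.
message = [
    '--OUTER\n',
    'first part\n',
    '--OUTER\n',
    '--INNER\n',
    'nested part\n',
    '--INNER--\n',
    '--OUTER--\n',
]
kinds = [classify(line, ['OUTER', 'INNER']) for line in message]
\end{verbatim}

Here the inner part has its own divider and end-marker lines, distinct
from those of the enclosing part, which is what allows nested parts to
be told apart.
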
\subsection{MultiFile Objects}
\label{MultiFile-objects}

A \class{MultiFile} instance has the following methods:

\begin{methoddesc}{push}{str}
Push a boundary string. When an appropriately decorated version of
this boundary is found as an input line, it will be interpreted as a
section-divider or end-marker. All subsequent
reads will return the empty string to indicate end-of-file, until a
call to \method{pop()} removes the boundary or a \method{next()} call
re-enables it.

It is possible to push more than one boundary. Encountering the
most-recently-pushed boundary makes reads return end-of-file;
encountering any other pushed boundary raises an error.
\end{methoddesc}

\begin{methoddesc}{readline}{}
Read a line. If the line is data (not a section-divider or end-marker
or real EOF) return it. If the line matches the most-recently-stacked
boundary, return \code{''} and set \code{self.last} to 1 if the match
is an end-marker and to 0 if it is a section-divider. If the line
matches any other stacked boundary, raise an error. On encountering
end-of-file on the underlying stream object, the method raises
\exception{Error} unless all boundaries have been popped.
\end{methoddesc}

\begin{methoddesc}{readlines}{}
Return all lines remaining in this part as a list of strings.
\end{methoddesc}

\begin{methoddesc}{read}{}
Read all lines, up to the next section. Return them as a single
(multiline) string. Note that this doesn't take a size argument!
\end{methoddesc}

\begin{methoddesc}{next}{}
Skip lines to the next section (that is, read lines until a
section-divider or end-marker has been consumed). Return true if
there is such a section, false if an end-marker is seen. Re-enable
the most-recently-pushed boundary.
\end{methoddesc}

\begin{methoddesc}{pop}{}
Pop a section boundary. This boundary will no longer be interpreted
as EOF.
\end{methoddesc}

\begin{methoddesc}{seek}{pos\optional{, whence}}
Seek. Seek indices are relative to the start of the current section.
The \var{pos} and \var{whence} arguments are interpreted as for a file
seek.
\end{methoddesc}

\begin{methoddesc}{tell}{}
Return the file position relative to the start of the current section.
\end{methoddesc}

\begin{methoddesc}{is_data}{str}
Return true if \var{str} is data and false if it might be a section
boundary. As written, it tests for a prefix other than \code{'--'} at
start of line (which all MIME boundaries have) but it is declared so
it can be overridden in derived classes.

Note that this test is intended as a fast guard for the real
boundary tests; if it always returns false it will merely slow
processing, not cause it to fail.
\end{methoddesc}

\begin{methoddesc}{section_divider}{str}
Turn a boundary into a section-divider line. By default, this
method prepends \code{'--'} (which MIME section boundaries have) but
it is declared so it can be overridden in derived classes. This
method need not append LF or CR-LF, as comparison with the result
ignores trailing whitespace.
\end{methoddesc}

\begin{methoddesc}{end_marker}{str}
Turn a boundary string into an end-marker line. By default, this
method prepends \code{'--'} and appends \code{'--'} (like a
MIME-multipart end-of-message marker) but it is declared so it can be
overridden in derived classes. This method need not append LF or
CR-LF, as comparison with the result ignores trailing whitespace.
\end{methoddesc}

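The three hook methods described above are the intended points of
customization for formats other than MIME. The following sketch shows
the idea using a hypothetical stand-in base class rather than
\class{MultiFile} itself; the names \class{BoundaryHooks},
\class{HashHooks} and \method{kind()}, and the \code{'\#part'} /
\code{'\#end'} line format, are all invented for illustration.

\begin{verbatim}
class BoundaryHooks:
    """Hypothetical stand-in with the default MIME-style hooks."""

    def is_data(self, line):
        # Fast guard: a line without the '--' prefix is always data.
        return not line.startswith('--')

    def section_divider(self, boundary):
        return '--' + boundary

    def end_marker(self, boundary):
        return '--' + boundary + '--'

    def kind(self, line, boundary):
        if self.is_data(line):
            return 'data'
        # Trailing whitespace is ignored, so the hooks need not
        # append LF or CR-LF.
        marker = line.rstrip()
        if marker == self.section_divider(boundary):
            return 'section-divider'
        if marker == self.end_marker(boundary):
            return 'end-marker'
        return 'data'


class HashHooks(BoundaryHooks):
    """Made-up format: '#part NAME' divides, '#end NAME' ends."""

    def is_data(self, line):
        return not line.startswith('#')

    def section_divider(self, boundary):
        return '#part ' + boundary

    def end_marker(self, boundary):
        return '#end ' + boundary


mime = BoundaryHooks()
custom = HashHooks()
mime_kinds = [mime.kind(line, 'XYZ')
              for line in ['--XYZ\n', '#part XYZ\n']]
custom_kinds = [custom.kind(line, 'XYZ')
                for line in ['--XYZ\n', '#part XYZ\n', '#end XYZ\r\n']]
\end{verbatim}

Only the three hooks change in the subclass; the matching logic in
\method{kind()} is inherited unchanged.
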
Finally, \class{MultiFile} instances have two public instance variables:

\begin{memberdesc}{level}
Nesting depth of the current part.
\end{memberdesc}

\begin{memberdesc}{last}
True if the last end-of-file was for an end-of-message marker.
\end{memberdesc}


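The interplay of \method{push()}, \method{readline()}, \method{next()},
\method{pop()} and \member{last} can be sketched with a small stand-in
class. This is an illustration under simplifying assumptions, not the
module's implementation: the \class{BoundaryReader} class and its
\code{stopped} attribute are hypothetical, the default \code{'--'}
decorations are hard-coded, \exception{ValueError} stands in for the
module's own exception, and seeking and \member{level} are omitted.

\begin{verbatim}
import io


class BoundaryReader:
    """Hypothetical stand-in for the boundary handling above."""

    def __init__(self, fp):
        self.fp = fp
        self.stack = []       # pushed boundaries, most recent last
        self.stopped = False  # True while reads report end-of-file
        self.last = 0         # 1 if the boundary hit was an end-marker

    def push(self, boundary):
        self.stack.append(boundary)

    def pop(self):
        # The popped boundary is no longer treated as end-of-file.
        self.stack.pop()
        self.stopped = False
        self.last = 0

    def readline(self):
        if self.stopped:
            return ''
        line = self.fp.readline()
        if not line:
            if self.stack:
                raise ValueError('input ended with boundaries pushed')
            return ''
        if not line.startswith('--'):
            return line       # fast path: cannot be a boundary
        marker = line.rstrip()
        for depth, boundary in enumerate(reversed(self.stack)):
            if marker == '--' + boundary:
                is_end = 0
            elif marker == '--' + boundary + '--':
                is_end = 1
            else:
                continue
            if depth > 0:
                raise ValueError('outer boundary inside nested part')
            self.last = is_end
            self.stopped = True
            return ''
        return line

    def readlines(self):
        lines = []
        while True:
            line = self.readline()
            if not line:
                return lines
            lines.append(line)

    def next(self):
        # Skip whatever remains of the current part.
        while self.readline():
            pass
        if self.last:
            return False      # end-marker seen: no further section
        self.stopped = False  # re-enable the most recent boundary
        return True


text = ('preamble\n--XYZ\nfirst part\n--XYZ\n'
        'second part\n--XYZ--\nepilogue\n')
reader = BoundaryReader(io.StringIO(text))
reader.push('XYZ')
preamble = reader.readlines()
parts = []
while reader.next():
    parts.append(reader.readlines())
reader.pop()
epilogue = reader.readlines()
\end{verbatim}

In this sketch \method{next()} returns false once the end-marker has
been consumed, and only \method{pop()} makes the text after it readable
again.
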
\subsection{\class{MultiFile} Example}
\label{multifile-example}

% This is almost unreadable; should be re-written when someone gets time.

\begin{verbatim}
import sys
from multifile import MultiFile

# outer_boundary and inner_boundary are the boundary strings of the
# message being parsed; they must be defined before this point.
fp = MultiFile(sys.stdin, 0)
fp.push(outer_boundary)
message1 = fp.readlines()
# We should now be either at real EOF or stopped on a message
# boundary. Re-enable the outer boundary.
fp.next()
# Read another message with the same delimiter
message2 = fp.readlines()
# Re-enable that delimiter again
fp.next()
# Now look for a message subpart with a different boundary
fp.push(inner_boundary)
sub_header = fp.readlines()
# If no exception has been thrown, we're looking at the start of
# the message subpart. Reset and grab the subpart
fp.next()
sub_body = fp.readlines()
# Got it. Now pop the inner boundary to re-enable the outer one.
fp.pop()
# Read to next outer boundary
message3 = fp.readlines()
\end{verbatim}