:mod:`multifile` --- Support for files containing distinct parts
================================================================

.. module:: multifile
   :synopsis: Support for reading files which contain distinct parts, such as some MIME data.
.. sectionauthor:: Eric S. Raymond <esr@snark.thyrsus.com>


.. deprecated:: 2.5
   The :mod:`email` package should be used in preference to the :mod:`multifile`
   module. This module is present only to maintain backward compatibility.

The :class:`MultiFile` object enables you to treat sections of a text file as
file-like input objects, with ``''`` being returned by :meth:`readline` when a
given delimiter pattern is encountered. The defaults of this class are designed
to make it useful for parsing MIME multipart messages, but by subclassing it and
overriding methods it can be easily adapted for more general use.


.. class:: MultiFile(fp[, seekable])

   Create a multi-file. You must instantiate this class with an input object
   argument for the :class:`MultiFile` instance to get lines from, such as a file
   object returned by :func:`open`.

   :class:`MultiFile` only ever looks at the input object's :meth:`readline`,
   :meth:`seek` and :meth:`tell` methods, and the latter two are only needed if you
   want random access to the individual MIME parts. To use :class:`MultiFile` on a
   non-seekable stream object, set the optional *seekable* argument to false; this
   will prevent using the input object's :meth:`seek` and :meth:`tell` methods.

In :class:`MultiFile`'s view of the world, text is composed of three kinds of
lines: data, section-dividers, and end-markers. :class:`MultiFile` is designed
to support parsing of messages that may have multiple nested message parts,
each with its own pattern for section-divider and end-marker lines.

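The three kinds of lines can be illustrated without the module itself. The
following is a minimal sketch in plain Python, assuming the default MIME
decoration described for :meth:`section_divider`, :meth:`end_marker` and
:meth:`is_data` below. The helper ``classify_line`` and the boundary ``XYZ``
are hypothetical names used only for illustration; this is not the module's
own code.

```python
def section_divider(boundary):
    # Default decoration described for MultiFile.section_divider: prepend '--'.
    return "--" + boundary


def end_marker(boundary):
    # Default decoration described for MultiFile.end_marker: '--' on both sides.
    return "--" + boundary + "--"


def classify_line(line, boundary):
    """Hypothetical helper: label a line 'data', 'section-divider' or 'end-marker'."""
    # Fast guard in the spirit of is_data(): lines not starting with '--' are data.
    if not line.startswith("--"):
        return "data"
    # Trailing whitespace is ignored, so LF and CR-LF endings both match.
    marker = line.rstrip()
    if marker == section_divider(boundary):
        return "section-divider"
    if marker == end_marker(boundary):
        return "end-marker"
    return "data"


lines = ["--XYZ\r\n", "first part\n", "--XYZ\n", "-- not a boundary\n", "--XYZ--\n"]
kinds = [classify_line(line, "XYZ") for line in lines]
```

A line that merely starts with ``'--'`` passes the fast guard but is still
classified as data when it matches neither decorated form of the boundary.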

.. seealso::

   Module :mod:`email`
      Comprehensive email handling package; supersedes the :mod:`multifile` module.

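Since the :mod:`email` package is the recommended replacement, the sketch below
shows one way to pull the first part of a given type out of a multipart message
with it. The message text ``RAW`` and the helper ``first_part_matching`` are
made up for illustration; only :func:`email.message_from_string`,
``Message.walk()``, ``Message.get_content_type()`` and
``Message.get_payload(decode=True)`` are taken from the package.

```python
import email

# A small made-up multipart message used only for this illustration.
RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>hello</p>\n"
    "--XYZ--\n"
)


def first_part_matching(raw_message, mimetype):
    """Hypothetical helper: decoded payload of the first matching part, or None."""
    msg = email.message_from_string(raw_message)
    # walk() visits the container message first, then each nested part in order.
    for part in msg.walk():
        if part.get_content_type() == mimetype:
            # decode=True undoes any Content-Transfer-Encoding on the part.
            return part.get_payload(decode=True)
    return None


html = first_part_matching(RAW, "text/html")
missing = first_part_matching(RAW, "image/png")
```

Unlike :class:`MultiFile`, this approach needs no explicit boundary handling:
the parser reads the boundary from the ``Content-Type`` header.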

.. _multifile-objects:

MultiFile Objects
-----------------

A :class:`MultiFile` instance has the following methods:


.. method:: MultiFile.readline()

   Read a line. If the line is data (not a section-divider or end-marker or real
   EOF) return it. If the line matches the most-recently-stacked boundary, return
   ``''`` and set ``self.last`` to 1 or 0 according to whether the match is or is
   not an end-marker. If the line matches any other stacked boundary, raise an
   error. On encountering end-of-file on the underlying stream object, the method
   raises :exc:`Error` unless all boundaries have been popped.


.. method:: MultiFile.readlines()

   Return all lines remaining in this part as a list of strings.


.. method:: MultiFile.read()

   Read all lines, up to the next section. Return them as a single (multiline)
   string. Note that this doesn't take a size argument!


.. method:: MultiFile.seek(pos[, whence])

   Seek. Seek indices are relative to the start of the current section. The *pos*
   and *whence* arguments are interpreted as for a file seek.


.. method:: MultiFile.tell()

   Return the file position relative to the start of the current section.


.. method:: MultiFile.next()

   Skip lines to the next section (that is, read lines until a section-divider or
   end-marker has been consumed). Return true if there is such a section, false if
   an end-marker is seen. Re-enable the most-recently-pushed boundary.


.. method:: MultiFile.is_data(str)

   Return true if *str* is data and false if it might be a section boundary. As
   written, it tests for a prefix other than ``'--'`` at start of line (which
   all MIME boundaries have) but it is declared so it can be overridden in derived
   classes.

   Note that this test is intended as a fast guard for the real boundary
   tests; if it always returns false it will merely slow processing, not cause it
   to fail.


.. method:: MultiFile.push(str)

   Push a boundary string. When a decorated version of this boundary is found as
   an input line, it will be interpreted as a section-divider or end-marker
   (depending on the decoration, see :rfc:`2045`). All subsequent reads will
   return the empty string to indicate end-of-file, until a call to :meth:`pop`
   removes the boundary or a :meth:`next` call re-enables it.

   It is possible to push more than one boundary. Encountering the
   most-recently-pushed boundary will return EOF; encountering any other
   boundary will raise an error.

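The stacking rule for nested boundaries can be modeled in a few lines. The
sketch below is an illustrative model only, not the module's implementation:
``BoundaryStack``, ``BoundaryError`` and ``check`` are hypothetical names, and
the default MIME decoration (``'--'`` prefix, optional ``'--'`` suffix) is
assumed.

```python
class BoundaryError(Exception):
    """Stand-in for the module's Error exception (hypothetical name)."""


class BoundaryStack(object):
    """Illustrative model of the push()/pop() rule; not the module's code."""

    def __init__(self):
        self.stack = []

    def push(self, boundary):
        self.stack.append(boundary)

    def pop(self):
        return self.stack.pop()

    def check(self, line):
        """Return 'data' or 'eof'; raise if an enclosing boundary is hit."""
        marker = line.rstrip()
        # Walk from the most recently pushed boundary outwards.
        for depth, boundary in enumerate(reversed(self.stack)):
            if marker in ("--" + boundary, "--" + boundary + "--"):
                if depth > 0:
                    # An outer boundary appeared before the inner part ended.
                    raise BoundaryError("unexpected outer boundary: %r" % marker)
                return "eof"
        return "data"


stack = BoundaryStack()
stack.push("outer")
stack.push("inner")
results = [stack.check("text\n"), stack.check("--inner\n")]
try:
    stack.check("--outer\n")
    outer_raised = False
except BoundaryError:
    outer_raised = True
stack.pop()  # the inner boundary is no longer active
after_pop = stack.check("--outer--\n")
```

Once the inner boundary has been popped, the outer boundary becomes the
most recently pushed one, so meeting it signals end-of-file instead of an error.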

.. method:: MultiFile.pop()

   Pop a section boundary. This boundary will no longer be interpreted as EOF.


.. method:: MultiFile.section_divider(str)

   Turn a boundary into a section-divider line. By default, this method
   prepends ``'--'`` (which MIME section boundaries have) but it is declared so
   it can be overridden in derived classes. This method need not append LF or
   CR-LF, as comparison with the result ignores trailing whitespace.


.. method:: MultiFile.end_marker(str)

   Turn a boundary string into an end-marker line. By default, this method
   prepends ``'--'`` and appends ``'--'`` (like a MIME-multipart end-of-message
   marker) but it is declared so it can be overridden in derived classes. This
   method need not append LF or CR-LF, as comparison with the result ignores
   trailing whitespace.

Finally, :class:`MultiFile` instances have two public instance variables:


.. attribute:: MultiFile.level

   Nesting depth of the current part.


.. attribute:: MultiFile.last

   True if the last end-of-file was for an end-of-message marker.


.. _multifile-example:

:class:`MultiFile` Example
--------------------------

.. sectionauthor:: Skip Montanaro <skip@mojam.com>


::

   import mimetools
   import multifile
   import StringIO

   def extract_mime_part_matching(stream, mimetype):
       """Return the first element in a multipart MIME message on stream
       matching mimetype."""

       # Parse the top-level headers; this leaves stream at the message body.
       msg = mimetools.Message(stream)
       msgtype = msg.gettype()
       params = msg.getplist()

       data = StringIO.StringIO()
       if msgtype[:10] == "multipart/":

           file = multifile.MultiFile(stream)
           # Treat the MIME boundary as the end-of-file marker for each part.
           file.push(msg.getparam("boundary"))
           while file.next():
               # Each part has its own headers, followed by an encoded body.
               submsg = mimetools.Message(file)
               try:
                   data = StringIO.StringIO()
                   mimetools.decode(file, data, submsg.getencoding())
               except ValueError:
                   # Unknown transfer encoding; skip this part.
                   continue
               if submsg.gettype() == mimetype:
                   break
           file.pop()
       return data.getvalue()
