:mod:`multifile` --- Support for files containing distinct parts
================================================================

.. module:: multifile
   :synopsis: Support for reading files which contain distinct parts, such as some MIME data.
   :deprecated:
.. sectionauthor:: Eric S. Raymond <esr@snark.thyrsus.com>


.. deprecated:: 2.5
   The :mod:`email` package should be used in preference to the :mod:`multifile`
   module. This module is present only to maintain backward compatibility.

The :class:`MultiFile` object enables you to treat sections of a text file as
file-like input objects, with ``''`` being returned by :meth:`readline` when a
given delimiter pattern is encountered.  The defaults of this class are designed
to make it useful for parsing MIME multipart messages, but by subclassing it and
overriding methods it can easily be adapted for more general use.


.. class:: MultiFile(fp[, seekable])

   Create a multi-file.  You must instantiate this class with an input object
   argument for the :class:`MultiFile` instance to get lines from, such as a file
   object returned by :func:`open`.

   :class:`MultiFile` only ever looks at the input object's :meth:`readline`,
   :meth:`seek` and :meth:`tell` methods, and the latter two are only needed if you
   want random access to the individual MIME parts. To use :class:`MultiFile` on a
   non-seekable stream object, set the optional *seekable* argument to false; this
   will prevent using the input object's :meth:`seek` and :meth:`tell` methods.

In :class:`MultiFile`'s model, text is composed of three kinds of lines: data,
section-dividers, and end-markers.  :class:`MultiFile` is designed to support
parsing of messages that may have multiple nested message parts, each with its
own pattern for section-divider and end-marker lines.

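The three kinds of lines can be illustrated with a small stand-alone sketch.
It does not use the module itself: the function name ``classify_line`` is
hypothetical, and the decorations assume the MIME defaults (a ``'--'`` prefix
for section-dividers, an additional ``'--'`` suffix for end-markers, trailing
whitespace ignored).

```python
def classify_line(line, boundary):
    """Classify one input line against a single boundary string.

    Hypothetical helper for illustration; not part of the multifile module.
    """
    # Fast guard: a line that does not start with '--' can only be data.
    if line[:2] != "--":
        return "data"
    marker = line.rstrip()  # trailing whitespace (LF, CR-LF) is ignored
    if marker == "--" + boundary:
        return "section-divider"
    if marker == "--" + boundary + "--":
        return "end-marker"
    # Starts with '--' but matches neither decoration: still data.
    return "data"


lines = ["--XYZ\r\n", "hello\n", "-- not a marker\n", "--XYZ--\n"]
kinds = [classify_line(line, "XYZ") for line in lines]
```

With nested parts, the same test would be applied once per stacked boundary,
innermost first.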

.. seealso::

   Module :mod:`email`
      Comprehensive email handling package; supersedes the :mod:`multifile` module.


.. _multifile-objects:

MultiFile Objects
-----------------

A :class:`MultiFile` instance has the following methods:


.. method:: MultiFile.readline()

   Read a line.  If the line is data (not a section-divider, end-marker, or real
   EOF), return it.  If the line matches the most-recently-stacked boundary, return
   ``''`` and set ``self.last`` to 1 if the match is an end-marker, or to 0 if it
   is a section-divider.  If the line matches any other stacked boundary, raise
   :exc:`Error`.  On encountering end-of-file on the underlying stream object, the
   method raises :exc:`Error` unless all boundaries have been popped.


.. method:: MultiFile.readlines()

   Return all lines remaining in this part as a list of strings.


.. method:: MultiFile.read()

   Read all lines, up to the next section.  Return them as a single (multiline)
   string.  Note that this method does not take a size argument.


.. method:: MultiFile.seek(pos[, whence])

   Seek.  Seek indices are relative to the start of the current section. The *pos*
   and *whence* arguments are interpreted as for a file seek.


.. method:: MultiFile.tell()

   Return the file position relative to the start of the current section.


.. method:: MultiFile.next()

   Skip lines to the next section (that is, read lines until a section-divider or
   end-marker has been consumed).  Return true if there is such a section, false if
   an end-marker is seen.  Re-enable the most-recently-pushed boundary.


.. method:: MultiFile.is_data(str)

   Return true if *str* is data and false if it might be a section boundary.  As
   written, it tests for a prefix other than ``'--'`` at start of line (which
   all MIME boundaries have) but it is declared so it can be overridden in derived
   classes.

   Note that this test is intended as a fast guard for the real boundary
   tests; if it always returns false it will merely slow processing, not cause it
   to fail.


.. method:: MultiFile.push(str)

   Push a boundary string.  When a decorated version of this boundary is found as
   an input line, it will be interpreted as a section-divider or end-marker
   (depending on the decoration, see :rfc:`2046`).  All subsequent reads will
   return the empty string to indicate end-of-file, until a call to :meth:`pop`
   removes the boundary or a :meth:`next` call re-enables it.

   It is possible to push more than one boundary.  Encountering the
   most-recently-pushed boundary will return EOF; encountering any other
   boundary will raise an error.


.. method:: MultiFile.pop()

   Pop a section boundary.  This boundary will no longer be interpreted as EOF.


.. method:: MultiFile.section_divider(str)

   Turn a boundary into a section-divider line.  By default, this method
   prepends ``'--'`` (which MIME section boundaries have) but it is declared so
   it can be overridden in derived classes.  This method need not append LF or
   CR-LF, as comparison with the result ignores trailing whitespace.


.. method:: MultiFile.end_marker(str)

   Turn a boundary string into an end-marker line.  By default, this method
   prepends ``'--'`` and appends ``'--'`` (like a MIME-multipart end-of-message
   marker) but it is declared so it can be overridden in derived classes.  This
   method need not append LF or CR-LF, as comparison with the result ignores
   trailing whitespace.

Finally, :class:`MultiFile` instances have two public instance variables:


.. attribute:: MultiFile.level

   Nesting depth of the current part.


.. attribute:: MultiFile.last

   True if the last end-of-file was for an end-of-message marker.

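The interplay of :meth:`push`, :meth:`next`, :meth:`readline` and :meth:`pop`
can be sketched with a heavily simplified stand-in class.  This is not the
module's source: ``MiniMultiFile`` and ``BoundaryError`` are hypothetical names,
seeking is omitted, and the decorations are hard-wired to the MIME defaults.

```python
import io


class BoundaryError(Exception):
    """Raised on an unexpected boundary or a premature end of input."""


class MiniMultiFile:
    """Simplified, hypothetical stand-in for the push/next/pop protocol."""

    def __init__(self, fp):
        self.fp = fp
        self.stack = []  # pushed boundary strings, innermost last
        self.level = 0   # > 0 while reads report end-of-file
        self.last = 0    # 1 if the last end-of-file was an end-marker

    def push(self, boundary):
        self.stack.append(boundary)

    def pop(self):
        self.stack.pop()
        self.level = max(0, self.level - 1)
        self.last = 0

    def readline(self):
        if self.level > 0:  # boundary already seen: keep reporting EOF
            return ""
        line = self.fp.readline()
        if not line:  # real end of the underlying stream
            if self.stack:
                raise BoundaryError("EOF before all boundaries were popped")
            return ""
        if line[:2] != "--":  # fast guard, in the spirit of is_data()
            return line
        marker = line.rstrip()
        for depth, boundary in enumerate(reversed(self.stack), start=1):
            if marker == "--" + boundary:
                self.last = 0
            elif marker == "--" + boundary + "--":
                self.last = 1
            else:
                continue
            if depth > 1:  # matched a boundary other than the innermost
                raise BoundaryError("outer boundary inside an inner part")
            self.level = depth
            return ""
        return line  # '--' prefix, but not a stacked boundary

    def next(self):
        while self.readline():  # skip the rest of the current part
            pass
        if self.last:
            return False
        self.level = 0  # re-enable the innermost boundary
        return True


text = (
    "preamble\n"
    "--XYZ\n"
    "part one\n"
    "--XYZ\n"
    "part two\n"
    "--XYZ--\n"
    "epilogue\n"
)
mf = MiniMultiFile(io.StringIO(text))
mf.push("XYZ")
parts = []
while mf.next():
    # Read the current part line by line until '' signals its end.
    parts.append("".join(iter(mf.readline, "")))
mf.pop()
epilogue = mf.readline()  # after pop(), 'XYZ' lines are ordinary data again
```

The loop collects one string per part; once the boundary has been popped, the
text after the end-marker is read like any other data.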

.. _multifile-example:

:class:`MultiFile` Example
--------------------------

.. sectionauthor:: Skip Montanaro <skip@mojam.com>


::

   # Python 2 only: mimetools and multifile were removed in Python 3.
   import mimetools
   import multifile
   import StringIO

   def extract_mime_part_matching(stream, mimetype):
       """Return the first element in a multipart MIME message on stream
       matching mimetype, or '' if no element matches."""

       # Parse the top-level headers; stream is left at the message body.
       msg = mimetools.Message(stream)
       msgtype = msg.gettype()

       result = ""
       if msgtype[:10] == "multipart/":

           mfile = multifile.MultiFile(stream)
           mfile.push(msg.getparam("boundary"))
           # Each successful next() positions mfile at the start of a part.
           while mfile.next():
               submsg = mimetools.Message(mfile)
               if submsg.gettype() != mimetype:
                   continue
               data = StringIO.StringIO()
               try:
                   mimetools.decode(mfile, data, submsg.getencoding())
               except ValueError:
                   # Unknown transfer encoding; try the next part.
                   continue
               result = data.getvalue()
               break
           mfile.pop()
       return result

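For comparison, roughly the same task can be sketched with the :mod:`email`
package, which the deprecation note recommends.  This is an illustration rather
than a drop-in replacement: the function name ``extract_part_with_email`` and
the sample message are hypothetical, the input is a string instead of a stream,
and the function returns ``None`` when no part matches.

```python
import email


def extract_part_with_email(text, mimetype):
    """Return the decoded payload (bytes) of the first part whose content
    type equals mimetype, or None if there is no such part.

    Hypothetical helper for illustration only.
    """
    msg = email.message_from_string(text)
    if not msg.is_multipart():
        return None
    # walk() visits the message itself and then every nested part.
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        if part.get_content_type() == mimetype:
            # decode=True undoes base64 or quoted-printable encoding.
            return part.get_payload(decode=True)
    return None


sample = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>hello</p>\n"
    "--XYZ--\n"
)
html = extract_part_with_email(sample, "text/html")
```

Here the parser handles boundary detection and nesting internally, so no
explicit boundary stack is needed.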