:mod:`email.generator`: Generating MIME documents
-------------------------------------------------

.. module:: email.generator
   :synopsis: Generate flat text email messages from a message structure.

**Source code:** :source:`Lib/email/generator.py`

--------------

One of the most common tasks is to generate the flat text of the email message
represented by a message object structure. You will need to do this if you want
to send your message via the :mod:`smtplib` module or the :mod:`nntplib` module,
or print the message on the console. Taking a message object structure and
producing a flat text document is the job of the :class:`Generator` class.

Again, as with the :mod:`email.parser` module, you aren't limited to the
functionality of the bundled generator; you could write one from scratch
yourself. However, the bundled generator knows how to generate most email in a
standards-compliant way, should handle MIME and non-MIME email messages just
fine, and is designed so that the transformation from flat text, to a message
structure via the :class:`~email.parser.Parser` class, and back to flat text,
is idempotent (the input is identical to the output) [#]_. On the other hand,
using the Generator on a :class:`~email.message.Message` constructed
programmatically may result in changes to the :class:`~email.message.Message`
object as defaults are filled in.

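
The round trip described above can be sketched as follows. This is a minimal
illustration rather than a definitive recipe: the addresses and message text
are made up, and ``mangle_from_=False`` and ``maxheaderlen=0`` are passed so
that the generator does not alter body lines or rewrap headers.

```python
import io
from email.generator import Generator
from email.parser import Parser

# Hypothetical flat text; the addresses and wording are illustrative only.
original = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Greetings\n"
    "\n"
    "Hello, Bob.\n"
)

# Flat text -> message object structure.
msg = Parser().parsestr(original)

# Message object structure -> flat text, written to an in-memory text file.
buffer = io.StringIO()
Generator(buffer, mangle_from_=False, maxheaderlen=0).flatten(msg)
round_tripped = buffer.getvalue()

print(round_tripped == original)
```

For a simple message such as this one the two strings should compare equal;
the footnote lists cases in which the round trip is not exact.
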
:class:`bytes` output can be generated using the :class:`BytesGenerator` class.
If the message object structure contains non-ASCII bytes, this generator's
:meth:`~BytesGenerator.flatten` method will emit the original bytes. Parsing a
binary message and then flattening it with :class:`BytesGenerator` should be
idempotent for standards-compliant messages.

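
As a hedged sketch of the binary round trip, the example below parses a small
message whose body contains UTF-8 encoded text and flattens it again with
:class:`BytesGenerator`. The message content is invented for illustration, and
the default policy attached by :func:`email.message_from_bytes` is assumed.

```python
import io
from email import message_from_bytes
from email.generator import BytesGenerator

# Hypothetical 8bit message; the body holds non-ASCII (UTF-8) bytes.
body = "Caf\u00e9 opens at noon.\n".encode("utf-8")
raw = (
    b"From: alice@example.com\n"
    b"To: bob@example.com\n"
    b"Subject: Lunch\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="utf-8"\n'
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
) + body

msg = message_from_bytes(raw)

# BytesGenerator needs a binary file-like object as its output file.
buffer = io.BytesIO()
BytesGenerator(buffer, mangle_from_=False, maxheaderlen=0).flatten(msg)
flattened = buffer.getvalue()

print(flattened == raw)
```
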
Here are the public methods of the :class:`Generator` class, imported from the
:mod:`email.generator` module:


.. class:: Generator(outfp, mangle_from_=True, maxheaderlen=78, *, policy=None)

   The constructor for the :class:`Generator` class takes a :term:`file-like object`
   called *outfp* for an argument. *outfp* must support the :meth:`write` method
   and be usable as the output file for the :func:`print` function.

   Optional *mangle_from_* is a flag that, when ``True``, puts a ``>`` character in
   front of any line in the body that starts exactly as ``From``, i.e. ``From``
   followed by a space at the beginning of the line. This is the only guaranteed
   portable way to avoid having such lines be mistaken for a Unix mailbox format
   envelope header separator (see `WHY THE CONTENT-LENGTH FORMAT IS BAD
   <https://www.jwz.org/doc/content-length.html>`_ for details). *mangle_from_*
   defaults to ``True``, but you might want to set this to ``False`` if you are not
   writing Unix mailbox format files.

   Optional *maxheaderlen* specifies the longest length for a non-continued header.
   When a header line is longer than *maxheaderlen* (in characters, with tabs
   expanded to 8 spaces), the header will be split as defined in the
   :class:`~email.header.Header` class. Set to zero to disable header wrapping.
   The default is 78, as recommended (but not required) by :rfc:`2822`.

   The *policy* keyword specifies a :mod:`~email.policy` object that controls a
   number of aspects of the generator's operation. If no *policy* is specified,
   then the *policy* attached to the message object passed to :meth:`flatten`
   is used.

   .. versionchanged:: 3.3 Added the *policy* keyword.

   The other public :class:`Generator` methods are:


   .. method:: flatten(msg, unixfrom=False, linesep=None)

      Print the textual representation of the message object structure rooted at
      *msg* to the output file specified when the :class:`Generator` instance
      was created. Subparts are visited depth-first and the resulting text will
      be properly MIME encoded.

      Optional *unixfrom* is a flag that forces the printing of the envelope
      header delimiter before the first :rfc:`2822` header of the root message
      object. If the root object has no envelope header, a standard one is
      crafted. By default, this is set to ``False`` to inhibit the printing of
      the envelope delimiter.

      Note that for subparts, no envelope header is ever printed.

      Optional *linesep* specifies the line separator character used to
      terminate lines in the output. If specified it overrides the value
      specified by the *msg*\'s or ``Generator``\'s ``policy``.

      Because strings cannot represent non-ASCII bytes, if the policy that
      applies when ``flatten`` is run has :attr:`~email.policy.Policy.cte_type`
      set to ``8bit``, ``Generator`` will operate as if it were set to
      ``7bit``. This means that messages parsed with a Bytes parser that have
      a :mailheader:`Content-Transfer-Encoding` of ``8bit`` will be converted
      to use a ``7bit`` Content-Transfer-Encoding. Non-ASCII bytes in the
      headers will be :rfc:`2047` encoded with a charset of ``unknown-8bit``.

      .. versionchanged:: 3.2
         Added support for re-encoding ``8bit`` message bodies, and the
         *linesep* argument.

   .. method:: clone(fp)

      Return an independent clone of this :class:`Generator` instance with the
      exact same options.

   .. method:: write(s)

      Write the string *s* to the underlying file object, i.e. *outfp* passed to
      :class:`Generator`'s constructor. This provides just enough file-like API
      for :class:`Generator` instances to be used in the :func:`print` function.

As a convenience, see the :class:`~email.message.Message` methods
:meth:`~email.message.Message.as_string` and ``str(aMessage)``, a.k.a.
:meth:`~email.message.Message.__str__`, which simplify the generation of a
formatted string representation of a message object. For more detail, see
:mod:`email.message`.

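
The effect of the *mangle_from_*, *unixfrom* and *linesep* options can be
sketched as shown below. The helper ``flatten_to_text`` is a hypothetical
convenience wrapper written for this example, not part of the module, and the
message content is made up.

```python
import io
from email.generator import Generator
from email.message import Message

msg = Message()
msg["Subject"] = "Status"
# The first body line begins with "From ", which mbox readers could mistake
# for an envelope header separator.
msg.set_payload("From here on, things get interesting.\nRegards\n")


def flatten_to_text(message, *, mangle_from_=True, unixfrom=False, linesep=None):
    """Hypothetical helper: flatten *message* into a str via Generator."""
    buffer = io.StringIO()
    generator = Generator(buffer, mangle_from_=mangle_from_)
    generator.flatten(message, unixfrom=unixfrom, linesep=linesep)
    return buffer.getvalue()


mangled = flatten_to_text(msg, mangle_from_=True)      # body line gets a ">" prefix
unmangled = flatten_to_text(msg, mangle_from_=False)   # body is left untouched
with_envelope = flatten_to_text(msg, unixfrom=True)    # starts with an envelope line
crlf = flatten_to_text(msg, mangle_from_=False, linesep="\r\n")

print(mangled)
print(unmangled == msg.as_string())
```

For a short message like this one, ``msg.as_string()`` should produce the same
text as a :class:`Generator` created with ``mangle_from_=False``.
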
.. class:: BytesGenerator(outfp, mangle_from_=True, maxheaderlen=78, *, \
                          policy=None)

   The constructor for the :class:`BytesGenerator` class takes a binary
   :term:`file-like object` called *outfp* for an argument. *outfp* must
   support a :meth:`write` method that accepts binary data.

   Optional *mangle_from_* is a flag that, when ``True``, puts a ``>``
   character in front of any line in the body that starts exactly as ``From``,
   i.e. ``From`` followed by a space at the beginning of the line. This is the
   only guaranteed portable way to avoid having such lines be mistaken for a
   Unix mailbox format envelope header separator (see `WHY THE CONTENT-LENGTH
   FORMAT IS BAD <https://www.jwz.org/doc/content-length.html>`_ for details).
   *mangle_from_* defaults to ``True``, but you might want to set this to
   ``False`` if you are not writing Unix mailbox format files.

   Optional *maxheaderlen* specifies the longest length for a non-continued
   header. When a header line is longer than *maxheaderlen* (in characters,
   with tabs expanded to 8 spaces), the header will be split as defined in the
   :class:`~email.header.Header` class. Set to zero to disable header
   wrapping. The default is 78, as recommended (but not required) by
   :rfc:`2822`.


   The *policy* keyword specifies a :mod:`~email.policy` object that controls a
   number of aspects of the generator's operation. If no *policy* is specified,
   then the *policy* attached to the message object passed to :meth:`flatten`
   is used.

   .. versionchanged:: 3.3 Added the *policy* keyword.

   The other public :class:`BytesGenerator` methods are:


   .. method:: flatten(msg, unixfrom=False, linesep=None)

      Print the textual representation of the message object structure rooted
      at *msg* to the output file specified when the :class:`BytesGenerator`
      instance was created. Subparts are visited depth-first and the resulting
      text will be properly MIME encoded. If the :mod:`~email.policy` option
      :attr:`~email.policy.Policy.cte_type` is ``8bit`` (the default),
      then any bytes with the high bit set in the original parsed message that
      have not been modified will be copied faithfully to the output. If
      ``cte_type`` is ``7bit``, the bytes will be converted as needed
      using an ASCII-compatible Content-Transfer-Encoding. In particular,
      RFC-invalid non-ASCII bytes in headers will be encoded using the MIME
      ``unknown-8bit`` character set, thus rendering them RFC-compliant.

      .. XXX: There should be a complementary option that just does the RFC
         compliance transformation but leaves CTE 8bit parts alone.

      Messages parsed with a Bytes parser that have a
      :mailheader:`Content-Transfer-Encoding` of ``8bit`` will be reconstructed
      as ``8bit`` if they have not been modified.

      Optional *unixfrom* is a flag that forces the printing of the envelope
      header delimiter before the first :rfc:`2822` header of the root message
      object. If the root object has no envelope header, a standard one is
      crafted. By default, this is set to ``False`` to inhibit the printing of
      the envelope delimiter.

      Note that for subparts, no envelope header is ever printed.

      Optional *linesep* specifies the line separator character used to
      terminate lines in the output. If specified it overrides the value
      specified by the ``Generator``\'s or *msg*\'s ``policy``.

   .. method:: clone(fp)

      Return an independent clone of this :class:`BytesGenerator` instance with
      the exact same options.

   .. method:: write(s)

      Write the string *s* to the underlying file object. *s* is encoded using
      the ``ASCII`` codec and written to the *write* method of the *outfp*
      passed to the :class:`BytesGenerator`'s constructor. This
      provides just enough file-like API for :class:`BytesGenerator` instances
      to be used in the :func:`print` function.

   .. versionadded:: 3.2

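
The difference between the ``8bit`` and ``7bit`` settings of
:attr:`~email.policy.Policy.cte_type` can be sketched as follows. The example
assumes the :data:`email.policy.compat32` policy as a starting point; the
message content and the helper ``flatten_to_bytes`` are hypothetical and exist
only for this illustration.

```python
import io
from email import message_from_bytes, policy
from email.generator import BytesGenerator

# Hypothetical message declaring an 8bit body with UTF-8 encoded text.
body = "Caf\u00e9 opens at noon.\n".encode("utf-8")
raw = (
    b"Subject: Lunch\n"
    b"MIME-Version: 1.0\n"
    b'Content-Type: text/plain; charset="utf-8"\n'
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
) + body
msg = message_from_bytes(raw)


def flatten_to_bytes(message, generator_policy):
    """Hypothetical helper: flatten *message* into bytes under a given policy."""
    buffer = io.BytesIO()
    generator = BytesGenerator(buffer, mangle_from_=False, policy=generator_policy)
    generator.flatten(message)
    return buffer.getvalue()


# cte_type="8bit" (the compat32 default): the original bytes are copied through.
eight_bit = flatten_to_bytes(msg, policy.compat32)

# cte_type="7bit": the body is re-encoded with an ASCII-compatible
# Content-Transfer-Encoding, so the output contains only ASCII bytes.
seven_bit = flatten_to_bytes(msg, policy.compat32.clone(cte_type="7bit"))

print(eight_bit.decode("utf-8"))
print(seven_bit.decode("ascii"))
```
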
The :mod:`email.generator` module also provides a derived class, called
:class:`DecodedGenerator`, which is like the :class:`Generator` base class,
except that non-\ :mimetype:`text` parts are substituted with a format string
representing the part.


.. class:: DecodedGenerator(outfp, mangle_from_=True, maxheaderlen=78, fmt=None)

   This class, derived from :class:`Generator`, walks through all the subparts of a
   message. If the subpart is of main type :mimetype:`text`, then it prints the
   decoded payload of the subpart. Optional *mangle_from_* and *maxheaderlen* are
   as with the :class:`Generator` base class.

   If the subpart is not of main type :mimetype:`text`, optional *fmt* is a format
   string that is used instead of the message payload. *fmt* is expanded with the
   following keywords, ``%(keyword)s`` format:

   * ``type`` -- Full MIME type of the non-\ :mimetype:`text` part

   * ``maintype`` -- Main MIME type of the non-\ :mimetype:`text` part

   * ``subtype`` -- Sub-MIME type of the non-\ :mimetype:`text` part

   * ``filename`` -- Filename of the non-\ :mimetype:`text` part

   * ``description`` -- Description associated with the non-\ :mimetype:`text` part

   * ``encoding`` -- Content transfer encoding of the non-\ :mimetype:`text` part

   The default value for *fmt* is ``None``, meaning ::

      [Non-text (%(type)s) part of message omitted, filename %(filename)s]


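
A minimal sketch of :class:`DecodedGenerator` with a custom *fmt* string is
shown below. The multipart message, the attachment name ``report.bin`` and the
format string are all invented for this example.

```python
import io
from email.generator import DecodedGenerator
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Build a hypothetical two-part message: a text part and a binary attachment.
msg = MIMEMultipart()
msg["Subject"] = "Quarterly report"
msg.attach(MIMEText("The report is attached.\n", "plain", "us-ascii"))

attachment = MIMEApplication(b"\x00\x01\x02", "octet-stream")
attachment.add_header("Content-Disposition", "attachment", filename="report.bin")
msg.attach(attachment)

# Text parts are printed as-is; every other part is replaced by the expanded
# format string.
buffer = io.StringIO()
generator = DecodedGenerator(
    buffer,
    fmt="[%(type)s part omitted: %(filename)s, %(encoding)s]",
)
generator.flatten(msg)
summary = buffer.getvalue()

print(summary)
```
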
.. rubric:: Footnotes

.. [#] This statement assumes that you use the appropriate setting for the
       ``unixfrom`` argument, and that you set ``maxheaderlen=0`` (which will
       preserve whatever the input line lengths were). It is also not strictly
       true, since in many cases runs of whitespace in headers are collapsed
       into single blanks. The latter is a bug that will eventually be fixed.