:mod:`email.generator`: Generating MIME documents
-------------------------------------------------

.. module:: email.generator
   :synopsis: Generate flat text email messages from a message structure.

**Source code:** :source:`Lib/email/generator.py`

--------------

One of the most common tasks is to generate the flat (serialized) version of
the email message represented by a message object structure. You will need to
do this if you want to send your message via :meth:`smtplib.SMTP.sendmail` or
the :mod:`nntplib` module, or print the message on the console. Taking a
message object structure and producing a serialized representation is the job
of the generator classes.

As with the :mod:`email.parser` module, you aren't limited to the functionality
of the bundled generator; you could write one from scratch yourself. However,
the bundled generator knows how to generate most email in a standards-compliant
way, should handle MIME and non-MIME email messages just fine, and is designed
so that the bytes-oriented parsing and generation operations are inverses,
assuming the same non-transforming :mod:`~email.policy` is used for both. That
is, parsing the serialized byte stream via the
:class:`~email.parser.BytesParser` class and then regenerating the serialized
byte stream using :class:`BytesGenerator` should produce output identical to
the input [#]_. (On the other hand, using the generator on an
:class:`~email.message.EmailMessage` constructed by program may result in
changes to the :class:`~email.message.EmailMessage` object as defaults are
filled in.)
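
As a minimal sketch of the round trip described above, the following parses a
byte string and regenerates it. The sample message and the name
``round_trip_policy`` are illustrative assumptions; the policy sets
:attr:`~email.policy.Policy.refold_source` to ``'none'`` so that headers read
from the source are not refolded:

.. code-block:: python

   from io import BytesIO
   from email import policy
   from email.generator import BytesGenerator
   from email.parser import BytesParser

   # A small, well-formed message using CRLF line endings (sample data).
   raw = (
       b"From: alice@example.com\r\n"
       b"To: bob@example.com\r\n"
       b"Subject: Round trip\r\n"
       b"\r\n"
       b"Hello, Bob.\r\n"
   )

   # Use the same non-transforming policy for parsing and for generation.
   round_trip_policy = policy.SMTP.clone(refold_source="none")

   msg = BytesParser(policy=round_trip_policy).parsebytes(raw)
   out = BytesIO()
   BytesGenerator(out, policy=round_trip_policy).flatten(msg)
   regenerated = out.getvalue()

   # For this input the regenerated bytes should match the original.
   print(regenerated == raw)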

The :class:`Generator` class can be used to flatten a message into a text (as
opposed to binary) serialized representation, but since Unicode cannot
represent binary data directly, the message is of necessity transformed into
something that contains only ASCII characters, using the standard email RFC
Content Transfer Encoding techniques for encoding email messages for transport
over channels that are not "8 bit clean".

To accommodate reproducible processing of S/MIME-signed messages,
:class:`Generator` disables header folding for message parts of type
``multipart/signed`` and all subparts.


.. class:: BytesGenerator(outfp, mangle_from_=None, maxheaderlen=None, *, \
                          policy=None)

   Return a :class:`BytesGenerator` object that will write any message provided
   to the :meth:`flatten` method, or any surrogateescape encoded text provided
   to the :meth:`write` method, to the :term:`file-like object` *outfp*.
   *outfp* must support a ``write`` method that accepts binary data.

   If optional *mangle_from_* is ``True``, put a ``>`` character in front of
   any line in the body that starts with the exact string ``"From "``, that is
   ``From`` followed by a space at the beginning of a line. *mangle_from_*
   defaults to the value of the :attr:`~email.policy.Policy.mangle_from_`
   setting of the *policy* (which is ``True`` for the
   :data:`~email.policy.compat32` policy and ``False`` for all others).
   *mangle_from_* is intended for use when messages are stored in unix mbox
   format (see :mod:`mailbox` and `WHY THE CONTENT-LENGTH FORMAT IS BAD
   <https://www.jwz.org/doc/content-length.html>`_).
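
   A short sketch of the effect of *mangle_from_*, passing the flag explicitly
   in both cases so that the result does not depend on any default. The
   message contents are made up for illustration:

   .. code-block:: python

      from io import BytesIO
      from email.generator import BytesGenerator
      from email.message import EmailMessage

      msg = EmailMessage()
      msg["Subject"] = "Status"
      # The body starts with "From ", which mbox readers could mistake
      # for the start of a new message.
      msg.set_content("From here on, things get easier.\n")

      plain, mangled = BytesIO(), BytesIO()
      BytesGenerator(plain, mangle_from_=False).flatten(msg)
      BytesGenerator(mangled, mangle_from_=True).flatten(msg)

      plain_bytes = plain.getvalue()      # body line left untouched
      mangled_bytes = mangled.getvalue()  # body line prefixed with ">"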

   If *maxheaderlen* is not ``None``, refold any header lines that are longer
   than *maxheaderlen*, or if ``0``, do not rewrap any headers. If
   *maxheaderlen* is ``None`` (the default), wrap headers and other message
   lines according to the *policy* settings.

   If *policy* is specified, use that policy to control message generation. If
   *policy* is ``None`` (the default), use the policy associated with the
   :class:`~email.message.Message` or :class:`~email.message.EmailMessage`
   object passed to ``flatten`` to control the message generation. See
   :mod:`email.policy` for details on what *policy* controls.

   .. versionadded:: 3.2

   .. versionchanged:: 3.3 Added the *policy* keyword.

   .. versionchanged:: 3.6 The default behavior of the *mangle_from_*
      and *maxheaderlen* parameters is to follow the policy.


   .. method:: flatten(msg, unixfrom=False, linesep=None)

      Print the textual representation of the message object structure rooted
      at *msg* to the output file specified when the :class:`BytesGenerator`
      instance was created.

      If the :mod:`~email.policy` option :attr:`~email.policy.Policy.cte_type`
      is ``8bit`` (the default), copy any headers in the original parsed
      message that have not been modified to the output with any bytes with the
      high bit set reproduced as in the original, and preserve the non-ASCII
      :mailheader:`Content-Transfer-Encoding` of any body parts that have them.
      If ``cte_type`` is ``7bit``, convert the bytes with the high bit set as
      needed using an ASCII-compatible :mailheader:`Content-Transfer-Encoding`.
      That is, transform parts with non-ASCII
      :mailheader:`Content-Transfer-Encoding`
      (:mailheader:`Content-Transfer-Encoding: 8bit`) to an ASCII compatible
      :mailheader:`Content-Transfer-Encoding`, and encode RFC-invalid non-ASCII
      bytes in headers using the MIME ``unknown-8bit`` character set, thus
      rendering them RFC-compliant.
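
      The difference between the two ``cte_type`` settings can be sketched as
      follows. The sample message is an assumption made for illustration; it
      is parsed with :func:`email.message_from_bytes`, which uses the
      :data:`~email.policy.compat32` policy unless told otherwise:

      .. code-block:: python

         from io import BytesIO
         from email import message_from_bytes
         from email.generator import BytesGenerator

         # Sample 8bit message: the body contains raw latin-1 bytes.
         source = (
             b"From: foo@example.com\n"
             b"MIME-Version: 1.0\n"
             b'Content-Type: text/plain; charset="latin-1"\n'
             b"Content-Transfer-Encoding: 8bit\n"
             b"\n"
             b"oh l\xe0 l\xe0\n"
         )
         msg = message_from_bytes(source)

         # cte_type="8bit": the high-bit bytes are written as they were read.
         eight_bit = BytesIO()
         BytesGenerator(eight_bit, policy=msg.policy).flatten(msg)

         # cte_type="7bit": the body is re-encoded so the output is pure ASCII.
         seven_bit = BytesIO()
         seven_bit_policy = msg.policy.clone(cte_type="7bit")
         BytesGenerator(seven_bit, policy=seven_bit_policy).flatten(msg)

         eight_bit_bytes = eight_bit.getvalue()
         seven_bit_bytes = seven_bit.getvalue()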

      .. XXX: There should be an option that just does the RFC
         compliance transformation on headers but leaves CTE 8bit parts alone.

      If *unixfrom* is ``True``, print the envelope header delimiter used by
      the Unix mailbox format (see :mod:`mailbox`) before the first of the
      :rfc:`5322` headers of the root message object. If the root object has
      no envelope header, craft a standard one. The default is ``False``.
      Note that for subparts, no envelope header is ever printed.

      If *linesep* is not ``None``, use it as the line separator between
      all the lines of the flattened message. If *linesep* is ``None`` (the
      default), use the value specified in the *policy*.
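
      A brief sketch of *unixfrom* and *linesep* used together; the message
      is a made-up example with no envelope header of its own, so a standard
      one is crafted:

      .. code-block:: python

         from io import BytesIO
         from email.generator import BytesGenerator
         from email.message import EmailMessage

         msg = EmailMessage()
         msg["From"] = "alice@example.com"
         msg["Subject"] = "Hi"
         msg.set_content("Hello\n")

         out = BytesIO()
         # Emit an mbox-style "From " envelope line and use CRLF line endings.
         BytesGenerator(out, policy=msg.policy).flatten(
             msg, unixfrom=True, linesep="\r\n"
         )
         data = out.getvalue()

         first_line = data.split(b"\r\n", 1)[0]  # the crafted envelope line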

      .. XXX: flatten should take a *policy* keyword.


   .. method:: clone(fp)

      Return an independent clone of this :class:`BytesGenerator` instance with
      the exact same option settings, and *fp* as the new *outfp*.


   .. method:: write(s)

      Encode *s* using the ``ASCII`` codec and the ``surrogateescape`` error
      handler, and pass it to the *write* method of the *outfp* passed to the
      :class:`BytesGenerator`'s constructor.
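
      For instance, text that was decoded from bytes with the
      ``surrogateescape`` error handler comes back out as the original bytes
      (a small sketch with made-up data):

      .. code-block:: python

         from io import BytesIO
         from email.generator import BytesGenerator

         original = b"caf\xe9\n"
         # Non-ASCII bytes become lone surrogates in the decoded string.
         text = original.decode("ascii", "surrogateescape")

         out = BytesIO()
         BytesGenerator(out).write(text)
         written = out.getvalue()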


As a convenience, :class:`~email.message.EmailMessage` provides the methods
:meth:`~email.message.EmailMessage.as_bytes` and ``bytes(aMessage)`` (a.k.a.
:meth:`~email.message.EmailMessage.__bytes__`), which simplify the generation of
a serialized binary representation of a message object. For more detail, see
:mod:`email.message`.
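
The following sketch compares the convenience methods with an explicit
:class:`BytesGenerator`; the message is a made-up example, and the generator
is given the message's own policy with *mangle_from_* disabled:

.. code-block:: python

   from io import BytesIO
   from email.generator import BytesGenerator
   from email.message import EmailMessage

   msg = EmailMessage()
   msg["From"] = "alice@example.com"
   msg["Subject"] = "Hi"
   msg.set_content("Hello, Bob.\n")

   out = BytesIO()
   BytesGenerator(out, mangle_from_=False, policy=msg.policy).flatten(msg)

   # All three spellings should yield the same serialized bytes here.
   same = msg.as_bytes() == bytes(msg) == out.getvalue()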


Because strings cannot represent binary data, the :class:`Generator` class must
convert any binary data in any message it flattens to an ASCII compatible
format, by converting it to an ASCII compatible
:mailheader:`Content-Transfer-Encoding`. Using the terminology of the email
RFCs, you can think of this as :class:`Generator` serializing to an I/O stream
that is not "8 bit clean". In other words, most applications will want
to be using :class:`BytesGenerator`, and not :class:`Generator`.

.. class:: Generator(outfp, mangle_from_=None, maxheaderlen=None, *, \
                     policy=None)

   Return a :class:`Generator` object that will write any message provided
   to the :meth:`flatten` method, or any text provided to the :meth:`write`
   method, to the :term:`file-like object` *outfp*. *outfp* must support a
   ``write`` method that accepts string data.

   If optional *mangle_from_* is ``True``, put a ``>`` character in front of
   any line in the body that starts with the exact string ``"From "``, that is
   ``From`` followed by a space at the beginning of a line. *mangle_from_*
   defaults to the value of the :attr:`~email.policy.Policy.mangle_from_`
   setting of the *policy* (which is ``True`` for the
   :data:`~email.policy.compat32` policy and ``False`` for all others).
   *mangle_from_* is intended for use when messages are stored in unix mbox
   format (see :mod:`mailbox` and `WHY THE CONTENT-LENGTH FORMAT IS BAD
   <https://www.jwz.org/doc/content-length.html>`_).

   If *maxheaderlen* is not ``None``, refold any header lines that are longer
   than *maxheaderlen*, or if ``0``, do not rewrap any headers. If
   *maxheaderlen* is ``None`` (the default), wrap headers and other message
   lines according to the *policy* settings.

   If *policy* is specified, use that policy to control message generation. If
   *policy* is ``None`` (the default), use the policy associated with the
   :class:`~email.message.Message` or :class:`~email.message.EmailMessage`
   object passed to ``flatten`` to control the message generation. See
   :mod:`email.policy` for details on what *policy* controls.

   .. versionchanged:: 3.3 Added the *policy* keyword.

   .. versionchanged:: 3.6 The default behavior of the *mangle_from_*
      and *maxheaderlen* parameters is to follow the policy.


   .. method:: flatten(msg, unixfrom=False, linesep=None)

      Print the textual representation of the message object structure rooted
      at *msg* to the output file specified when the :class:`Generator`
      instance was created.

      If the :mod:`~email.policy` option :attr:`~email.policy.Policy.cte_type`
      is ``8bit``, generate the message as if the option were set to ``7bit``.
      (This is required because strings cannot represent non-ASCII bytes.)
      Convert any bytes with the high bit set as needed using an
      ASCII-compatible :mailheader:`Content-Transfer-Encoding`. That is,
      transform parts with non-ASCII :mailheader:`Content-Transfer-Encoding`
      (:mailheader:`Content-Transfer-Encoding: 8bit`) to an ASCII compatible
      :mailheader:`Content-Transfer-Encoding`, and encode RFC-invalid non-ASCII
      bytes in headers using the MIME ``unknown-8bit`` character set, thus
      rendering them RFC-compliant.
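
      As an illustration, flattening a parsed ``8bit`` message with
      :class:`Generator` yields ASCII-only text even though the policy's
      ``cte_type`` is ``8bit``. The sample message below is an assumption
      made for this sketch:

      .. code-block:: python

         from io import StringIO
         from email import message_from_bytes
         from email.generator import Generator

         # Sample message whose body holds raw UTF-8 bytes ("Grüße").
         source = (
             b"Subject: test\n"
             b"MIME-Version: 1.0\n"
             b'Content-Type: text/plain; charset="utf-8"\n'
             b"Content-Transfer-Encoding: 8bit\n"
             b"\n"
             b"Gr\xc3\xbc\xc3\x9fe\n"
         )
         msg = message_from_bytes(source)

         out = StringIO()
         Generator(out).flatten(msg)
         text = out.getvalue()

         # The 8bit body has been replaced by an ASCII-compatible encoding.
         print(text.isascii())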

      If *unixfrom* is ``True``, print the envelope header delimiter used by
      the Unix mailbox format (see :mod:`mailbox`) before the first of the
      :rfc:`5322` headers of the root message object. If the root object has
      no envelope header, craft a standard one. The default is ``False``.
      Note that for subparts, no envelope header is ever printed.

      If *linesep* is not ``None``, use it as the line separator between
      all the lines of the flattened message. If *linesep* is ``None`` (the
      default), use the value specified in the *policy*.

      .. XXX: flatten should take a *policy* keyword.

      .. versionchanged:: 3.2
         Added support for re-encoding ``8bit`` message bodies, and the
         *linesep* argument.


   .. method:: clone(fp)

      Return an independent clone of this :class:`Generator` instance with the
      exact same options, and *fp* as the new *outfp*.


   .. method:: write(s)

      Write *s* to the *write* method of the *outfp* passed to the
      :class:`Generator`'s constructor. This provides just enough file-like
      API for :class:`Generator` instances to be used in the :func:`print`
      function.
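
      A minimal sketch of passing a :class:`Generator` as the *file* argument
      of :func:`print`; the header line is made-up sample text:

      .. code-block:: python

         from io import StringIO
         from email.generator import Generator

         out = StringIO()
         gen = Generator(out)

         # print() only needs a write() method on its file argument.
         print("X-Note: written via print()", file=gen)
         printed = out.getvalue()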

As a convenience, :class:`~email.message.EmailMessage` provides the methods
:meth:`~email.message.EmailMessage.as_string` and ``str(aMessage)`` (a.k.a.
:meth:`~email.message.EmailMessage.__str__`), which simplify the generation of
a formatted string representation of a message object. For more detail, see
:mod:`email.message`.


The :mod:`email.generator` module also provides a derived class,
:class:`DecodedGenerator`, which is like the :class:`Generator` base class,
except that non-\ :mimetype:`text` parts are not serialized, but are instead
represented in the output stream by a string derived from a template filled
in with information about the part.

.. class:: DecodedGenerator(outfp, mangle_from_=None, maxheaderlen=None, \
                            fmt=None, *, policy=None)

   Act like :class:`Generator`, except that for any subpart of the message
   passed to :meth:`Generator.flatten`, if the subpart is of main type
   :mimetype:`text`, print the decoded payload of the subpart, and if the main
   type is not :mimetype:`text`, instead of printing it, fill in the string
   *fmt* using information from the part and print the resulting
   filled-in string.

   To fill in *fmt*, execute ``fmt % part_info``, where ``part_info``
   is a dictionary composed of the following keys and values:

   * ``type`` -- Full MIME type of the non-\ :mimetype:`text` part

   * ``maintype`` -- Main MIME type of the non-\ :mimetype:`text` part

   * ``subtype`` -- Sub-MIME type of the non-\ :mimetype:`text` part

   * ``filename`` -- Filename of the non-\ :mimetype:`text` part

   * ``description`` -- Description associated with the non-\ :mimetype:`text` part

   * ``encoding`` -- Content transfer encoding of the non-\ :mimetype:`text` part

   If *fmt* is ``None``, use the following default *fmt*:

      "[Non-text (%(type)s) part of message omitted, filename %(filename)s]"

   Optional *mangle_from_* and *maxheaderlen* are as with the
   :class:`Generator` base class.
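
   A small sketch of a custom *fmt*. The message, the attachment bytes and
   the template text are all made up for illustration:

   .. code-block:: python

      from io import StringIO
      from email.generator import DecodedGenerator
      from email.message import EmailMessage

      msg = EmailMessage()
      msg["Subject"] = "Report"
      msg.set_content("See the attached image.\n")
      # Placeholder bytes standing in for real image data.
      msg.add_attachment(
          b"\x89PNG placeholder",
          maintype="image",
          subtype="png",
          filename="chart.png",
      )

      out = StringIO()
      template = "[%(type)s omitted: %(filename)s]"
      DecodedGenerator(out, fmt=template).flatten(msg)
      text = out.getvalue()

      # The text part appears as-is; the image part is replaced by the
      # filled-in template.
      print(text)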


.. rubric:: Footnotes

.. [#] This statement assumes that you use the appropriate setting for
       ``unixfrom``, and that there are no :mod:`~email.policy` settings
       calling for automatic adjustments (for example,
       :attr:`~email.policy.Policy.refold_source` must be ``none``, which is
       *not* the default). It is also not 100% true: if the message does not
       conform to the RFC standards, information about the exact original
       text is occasionally lost during parsing error recovery. It is a goal
       to fix these latter edge cases when possible.