:mod:`email.generator`: Generating MIME documents
-------------------------------------------------

.. module:: email.generator
   :synopsis: Generate flat text email messages from a message structure.

**Source code:** :source:`Lib/email/generator.py`

--------------

One of the most common tasks is to generate the flat (serialized) version of
the email message represented by a message object structure. You will need to
do this if you want to send your message via :meth:`smtplib.SMTP.sendmail` or
the :mod:`nntplib` module, or print the message on the console. Taking a
message object structure and producing a serialized representation is the job
of the generator classes.

As with the :mod:`email.parser` module, you aren't limited to the functionality
of the bundled generator; you could write one from scratch yourself. However,
the bundled generator knows how to generate most email in a standards-compliant
way, should handle MIME and non-MIME email messages just fine, and is designed
so that the bytes-oriented parsing and generation operations are inverses,
assuming the same non-transforming :mod:`~email.policy` is used for both. That
is, parsing the serialized byte stream via the
:class:`~email.parser.BytesParser` class and then regenerating the serialized
byte stream using :class:`BytesGenerator` should produce output identical to
the input [#]_. (On the other hand, using the generator on an
:class:`~email.message.EmailMessage` constructed programmatically may result in
changes to the :class:`~email.message.EmailMessage` object as defaults are
filled in.)
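
As a rough illustration of the round-trip property described above, the
following sketch parses a small hand-written message and regenerates it. The
sample message and the variable names are invented for this example, and both
steps simply rely on the parser's default policy rather than an explicitly
chosen one:

.. code-block:: python

   from io import BytesIO
   from email.generator import BytesGenerator
   from email.parser import BytesParser

   # A small, well-formed sample message (made up for this example).  The
   # body deliberately has no line starting with "From ", so From-mangling
   # cannot alter it.
   raw = (
       b"From: alice@example.com\n"
       b"To: bob@example.com\n"
       b"Subject: Round trip\n"
       b"\n"
       b"First body line.\n"
       b"Second body line.\n"
   )

   msg = BytesParser().parsebytes(raw)

   out = BytesIO()
   # With no policy argument, flatten() uses the policy attached to the message.
   BytesGenerator(out).flatten(msg)
   regenerated = out.getvalue()

   print(regenerated == raw)

For a simple, standards-conformant message like this one the two byte strings
are expected to match; the footnote describes situations in which they may not.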

The :class:`Generator` class can be used to flatten a message into a text (as
opposed to binary) serialized representation, but since Unicode cannot
represent binary data directly, the message is of necessity transformed into
something that contains only ASCII characters, using the standard email RFC
Content Transfer Encoding techniques for encoding email messages for transport
over channels that are not "8 bit clean".


.. class:: BytesGenerator(outfp, mangle_from_=None, maxheaderlen=None, *, \
                          policy=None)

   Return a :class:`BytesGenerator` object that will write any message provided
   to the :meth:`flatten` method, or any surrogateescape encoded text provided
   to the :meth:`write` method, to the :term:`file-like object` *outfp*.
   *outfp* must support a ``write`` method that accepts binary data.

   If optional *mangle_from_* is ``True``, put a ``>`` character in front of
   any line in the body that starts with the exact string ``"From "``, that is
   ``From`` followed by a space at the beginning of a line. *mangle_from_*
   defaults to the value of the :attr:`~email.policy.Policy.mangle_from_`
   setting of the *policy* (which is ``True`` for the
   :data:`~email.policy.compat32` policy and ``False`` for all others).
   *mangle_from_* is intended for use when messages are stored in unix mbox
   format (see :mod:`mailbox` and `WHY THE CONTENT-LENGTH FORMAT IS BAD
   <http://www.jwz.org/doc/content-length.html>`_).

   If *maxheaderlen* is not ``None``, refold any header lines that are longer
   than *maxheaderlen*, or if ``0``, do not rewrap any headers. If
   *maxheaderlen* is ``None`` (the default), wrap headers and other message
   lines according to the *policy* settings.

   If *policy* is specified, use that policy to control message generation. If
   *policy* is ``None`` (the default), use the policy associated with the
   :class:`~email.message.Message` or :class:`~email.message.EmailMessage`
   object passed to ``flatten`` to control the message generation. See
   :mod:`email.policy` for details on what *policy* controls.

   .. versionadded:: 3.2

   .. versionchanged:: 3.3 Added the *policy* keyword.

   .. versionchanged:: 3.6 The default behavior of the *mangle_from_*
      and *maxheaderlen* parameters is to follow the policy.


   .. method:: flatten(msg, unixfrom=False, linesep=None)

      Print the textual representation of the message object structure rooted
      at *msg* to the output file specified when the :class:`BytesGenerator`
      instance was created.

      If the :mod:`~email.policy` option :attr:`~email.policy.Policy.cte_type`
      is ``8bit`` (the default), copy any headers in the original parsed
      message that have not been modified to the output with any bytes with the
      high bit set reproduced as in the original, and preserve the non-ASCII
      :mailheader:`Content-Transfer-Encoding` of any body parts that have them.
      If ``cte_type`` is ``7bit``, convert the bytes with the high bit set as
      needed using an ASCII-compatible :mailheader:`Content-Transfer-Encoding`.
      That is, transform parts with non-ASCII
      :mailheader:`Content-Transfer-Encoding`
      (:mailheader:`Content-Transfer-Encoding: 8bit`) to an ASCII-compatible
      :mailheader:`Content-Transfer-Encoding`, and encode RFC-invalid non-ASCII
      bytes in headers using the MIME ``unknown-8bit`` character set, thus
      rendering them RFC-compliant.

      .. XXX: There should be an option that just does the RFC
         compliance transformation on headers but leaves CTE 8bit parts alone.

      If *unixfrom* is ``True``, print the envelope header delimiter used by
      the Unix mailbox format (see :mod:`mailbox`) before the first of the
      :rfc:`5322` headers of the root message object. If the root object has
      no envelope header, craft a standard one. The default is ``False``.
      Note that for subparts, no envelope header is ever printed.

      If *linesep* is not ``None``, use it as the separator character between
      all the lines of the flattened message. If *linesep* is ``None`` (the
      default), use the value specified in the *policy*.

      .. XXX: flatten should take a *policy* keyword.


   .. method:: clone(fp)

      Return an independent clone of this :class:`BytesGenerator` instance with
      the exact same option settings, and *fp* as the new *outfp*.


   .. method:: write(s)

      Encode *s* using the ``ASCII`` codec and the ``surrogateescape`` error
      handler, and pass it to the *write* method of the *outfp* passed to the
      :class:`BytesGenerator`'s constructor.
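
The *unixfrom* and *linesep* arguments of :meth:`BytesGenerator.flatten` can be
sketched as follows. The addresses, the message text, and the variable names
are placeholders invented for this example. A ``"\r\n"`` separator is used
because that is the line ending SMTP expects on the wire:

.. code-block:: python

   from io import BytesIO
   from email.generator import BytesGenerator
   from email.message import EmailMessage

   # Placeholder message, built programmatically.
   msg = EmailMessage()
   msg["From"] = "alice@example.com"
   msg["To"] = "bob@example.com"
   msg["Subject"] = "Line endings"
   msg.set_content("Hello Bob.\nSee you soon.\n")

   buf = BytesIO()
   gen = BytesGenerator(buf, policy=msg.policy)
   # unixfrom=True asks for an mbox-style envelope line before the headers;
   # linesep overrides the separator that the policy would otherwise supply.
   gen.flatten(msg, unixfrom=True, linesep="\r\n")
   data = buf.getvalue()

   first_line = data.split(b"\r\n", 1)[0]
   print(first_line)

Because the message has no envelope header of its own, the first line is a
crafted one beginning with ``From nobody``, followed by a timestamp.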


As a convenience, :class:`~email.message.EmailMessage` provides the methods
:meth:`~email.message.EmailMessage.as_bytes` and ``bytes(aMessage)`` (a.k.a.
:meth:`~email.message.EmailMessage.__bytes__`), which simplify the generation of
a serialized binary representation of a message object. For more detail, see
:mod:`email.message`.
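
To show how the convenience methods relate to the generator class, this sketch
serializes the same message both ways. The message contents are invented, and
the explicit ``mangle_from_=False`` and ``policy=msg.policy`` arguments are an
assumption chosen so that both paths are configured alike:

.. code-block:: python

   from io import BytesIO
   from email.generator import BytesGenerator
   from email.message import EmailMessage

   # Placeholder message for the comparison.
   msg = EmailMessage()
   msg["From"] = "alice@example.com"
   msg["To"] = "bob@example.com"
   msg["Subject"] = "Shortcut"
   msg.set_content("Same bytes either way.\n")

   # The long way: an explicit generator writing into a binary buffer.
   buf = BytesIO()
   BytesGenerator(buf, mangle_from_=False, policy=msg.policy).flatten(msg)
   explicit = buf.getvalue()

   # The short way: the convenience methods on the message itself.
   via_method = msg.as_bytes()
   via_builtin = bytes(msg)

   print(explicit == via_method == via_builtin)

The explicit form is mainly useful when the output should go straight to a
file or socket wrapper, or when non-default generator options are needed.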


Because strings cannot represent binary data, the :class:`Generator` class must
convert any binary data in any message it flattens to an ASCII-compatible
format, by re-encoding it with an ASCII-compatible
:mailheader:`Content-Transfer-Encoding`. Using the terminology of the email
RFCs, you can think of this as :class:`Generator` serializing to an I/O stream
that is not "8 bit clean". In other words, most applications will want
to be using :class:`BytesGenerator`, and not :class:`Generator`.
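
The difference can be sketched by flattening one parsed message with both
classes. The sample message below is invented for this example; it declares
``Content-Transfer-Encoding: 8bit`` and carries UTF-8 bytes in its body:

.. code-block:: python

   from io import BytesIO, StringIO
   from email import message_from_bytes
   from email.generator import BytesGenerator, Generator

   body = "caf\u00e9 au lait\n".encode("utf-8")
   raw = (
       b"Subject: menu\n"
       b"MIME-Version: 1.0\n"
       b'Content-Type: text/plain; charset="utf-8"\n'
       b"Content-Transfer-Encoding: 8bit\n"
       b"\n"
   ) + body

   msg = message_from_bytes(raw)

   # Text output: the 8bit body has to be re-encoded so the result is ASCII.
   text_out = StringIO()
   Generator(text_out).flatten(msg)
   text = text_out.getvalue()

   # Binary output: the original high-bit bytes can be written as they are.
   bin_out = BytesIO()
   BytesGenerator(bin_out).flatten(msg)
   binary = bin_out.getvalue()

   print(text)
   print(binary)

In the text output the part is expected to carry an ASCII-compatible
:mailheader:`Content-Transfer-Encoding` such as ``base64``, while the binary
output keeps the ``8bit`` body bytes.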

.. class:: Generator(outfp, mangle_from_=None, maxheaderlen=None, *, \
                     policy=None)

   Return a :class:`Generator` object that will write any message provided
   to the :meth:`flatten` method, or any text provided to the :meth:`write`
   method, to the :term:`file-like object` *outfp*. *outfp* must support a
   ``write`` method that accepts string data.

   If optional *mangle_from_* is ``True``, put a ``>`` character in front of
   any line in the body that starts with the exact string ``"From "``, that is
   ``From`` followed by a space at the beginning of a line. *mangle_from_*
   defaults to the value of the :attr:`~email.policy.Policy.mangle_from_`
   setting of the *policy* (which is ``True`` for the
   :data:`~email.policy.compat32` policy and ``False`` for all others).
   *mangle_from_* is intended for use when messages are stored in unix mbox
   format (see :mod:`mailbox` and `WHY THE CONTENT-LENGTH FORMAT IS BAD
   <http://www.jwz.org/doc/content-length.html>`_).

   If *maxheaderlen* is not ``None``, refold any header lines that are longer
   than *maxheaderlen*, or if ``0``, do not rewrap any headers. If
   *maxheaderlen* is ``None`` (the default), wrap headers and other message
   lines according to the *policy* settings.

   If *policy* is specified, use that policy to control message generation. If
   *policy* is ``None`` (the default), use the policy associated with the
   :class:`~email.message.Message` or :class:`~email.message.EmailMessage`
   object passed to ``flatten`` to control the message generation. See
   :mod:`email.policy` for details on what *policy* controls.

   .. versionchanged:: 3.3 Added the *policy* keyword.

   .. versionchanged:: 3.6 The default behavior of the *mangle_from_*
      and *maxheaderlen* parameters is to follow the policy.


   .. method:: flatten(msg, unixfrom=False, linesep=None)

      Print the textual representation of the message object structure rooted
      at *msg* to the output file specified when the :class:`Generator`
      instance was created.

      If the :mod:`~email.policy` option :attr:`~email.policy.Policy.cte_type`
      is ``8bit``, generate the message as if the option were set to ``7bit``.
      (This is required because strings cannot represent non-ASCII bytes.)
      Convert any bytes with the high bit set as needed using an
      ASCII-compatible :mailheader:`Content-Transfer-Encoding`. That is,
      transform parts with non-ASCII :mailheader:`Content-Transfer-Encoding`
      (:mailheader:`Content-Transfer-Encoding: 8bit`) to an ASCII-compatible
      :mailheader:`Content-Transfer-Encoding`, and encode RFC-invalid non-ASCII
      bytes in headers using the MIME ``unknown-8bit`` character set, thus
      rendering them RFC-compliant.

      If *unixfrom* is ``True``, print the envelope header delimiter used by
      the Unix mailbox format (see :mod:`mailbox`) before the first of the
      :rfc:`5322` headers of the root message object. If the root object has
      no envelope header, craft a standard one. The default is ``False``.
      Note that for subparts, no envelope header is ever printed.

      If *linesep* is not ``None``, use it as the separator character between
      all the lines of the flattened message. If *linesep* is ``None`` (the
      default), use the value specified in the *policy*.

      .. XXX: flatten should take a *policy* keyword.

      .. versionchanged:: 3.2
         Added support for re-encoding ``8bit`` message bodies, and the
         *linesep* argument.


   .. method:: clone(fp)

      Return an independent clone of this :class:`Generator` instance with the
      exact same options, and *fp* as the new *outfp*.


   .. method:: write(s)

      Write *s* to the *write* method of the *outfp* passed to the
      :class:`Generator`'s constructor. This provides just enough file-like
      API for :class:`Generator` instances to be used in the :func:`print`
      function.
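
The effect of the *mangle_from_* constructor argument can be sketched by
flattening one message twice. The message text is invented for this example,
and *mangle_from_* is passed explicitly in both cases so that the result does
not depend on any default:

.. code-block:: python

   from io import StringIO
   from email.generator import Generator
   from email.message import EmailMessage

   # Placeholder message whose second body line starts with "From ".
   msg = EmailMessage()
   msg["Subject"] = "mbox test"
   msg.set_content("Hello,\nFrom here on it gets easy.\n")

   plain_out = StringIO()
   Generator(plain_out, mangle_from_=False).flatten(msg)
   plain = plain_out.getvalue()

   mangled_out = StringIO()
   Generator(mangled_out, mangle_from_=True).flatten(msg)
   mangled = mangled_out.getvalue()

   print(mangled)

Only the body line beginning with ``From`` followed by a space gains a leading
``>`` in the mangled output; headers and other body lines are left alone.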


As a convenience, :class:`~email.message.EmailMessage` provides the methods
:meth:`~email.message.EmailMessage.as_string` and ``str(aMessage)`` (a.k.a.
:meth:`~email.message.EmailMessage.__str__`), which simplify the generation of
a formatted string representation of a message object. For more detail, see
:mod:`email.message`.


The :mod:`email.generator` module also provides a derived class,
:class:`DecodedGenerator`, which is like the :class:`Generator` base class,
except that non-\ :mimetype:`text` parts are not serialized, but are instead
represented in the output stream by a string derived from a template filled
in with information about the part.

.. class:: DecodedGenerator(outfp, mangle_from_=None, maxheaderlen=None, \
                            fmt=None, *, policy=None)

   Act like :class:`Generator`, except that for any subpart of the message
   passed to :meth:`Generator.flatten`, if the subpart is of main type
   :mimetype:`text`, print the decoded payload of the subpart, and if the main
   type is not :mimetype:`text`, instead of printing it fill in the string
   *fmt* using information from the part and print the resulting
   filled-in string.

   To fill in *fmt*, execute ``fmt % part_info``, where ``part_info``
   is a dictionary composed of the following keys and values:

   * ``type`` -- Full MIME type of the non-\ :mimetype:`text` part

   * ``maintype`` -- Main MIME type of the non-\ :mimetype:`text` part

   * ``subtype`` -- Sub-MIME type of the non-\ :mimetype:`text` part

   * ``filename`` -- Filename of the non-\ :mimetype:`text` part

   * ``description`` -- Description associated with the non-\ :mimetype:`text` part

   * ``encoding`` -- Content transfer encoding of the non-\ :mimetype:`text` part

   If *fmt* is ``None``, use the following default *fmt*:

      "[Non-text (%(type)s) part of message omitted, filename %(filename)s]"

   Optional *mangle_from_* and *maxheaderlen* are as with the
   :class:`Generator` base class.
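
A minimal sketch of :class:`DecodedGenerator` with a custom *fmt* follows. The
message, the attachment bytes, the file name, and the template wording are all
invented for this example; only the template keys come from the list above:

.. code-block:: python

   from io import StringIO
   from email.generator import DecodedGenerator
   from email.message import EmailMessage

   # Placeholder message with one text part and one binary attachment.
   msg = EmailMessage()
   msg["Subject"] = "Report"
   msg.set_content("See the attached file.\n")
   msg.add_attachment(
       b"\x00\x01\x02",
       maintype="application",
       subtype="octet-stream",
       filename="data.bin",
   )

   out = StringIO()
   # The template is filled in with the old-style ``%`` operator.
   gen = DecodedGenerator(out, fmt="[%(type)s part: %(filename)s]")
   gen.flatten(msg)
   summary = out.getvalue()

   print(summary)

The text part appears in the output as ordinary text, while the attachment is
replaced by the filled-in template line.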


.. rubric:: Footnotes

.. [#] This statement assumes that you use the appropriate setting for
       ``unixfrom``, and that there are no :mod:`~email.policy` settings
       calling for automatic adjustments (for example,
       :attr:`~email.policy.Policy.refold_source` must be ``none``, which is
       *not* the default). It is also not 100% true, since if the message
       does not conform to the RFC standards, information about the exact
       original text is occasionally lost during parsing error recovery. It is
       a goal to fix these latter edge cases when possible.