:mod:`email.contentmanager`: Managing MIME Content
--------------------------------------------------

.. module:: email.contentmanager
   :synopsis: Storing and Retrieving Content from MIME Parts

.. moduleauthor:: R. David Murray <rdmurray@bitdance.com>
.. sectionauthor:: R. David Murray <rdmurray@bitdance.com>


.. note::

   The contentmanager module has been included in the standard library on a
   :term:`provisional basis <provisional package>`. Backwards incompatible
   changes (up to and including removal of the module) may occur if deemed
   necessary by the core developers.

.. versionadded:: 3.4
   as a :term:`provisional module <provisional package>`.

The :mod:`~email.message` module provides a class that can represent an
arbitrary email message. That basic message model has a useful and flexible
API, but it provides only a lower-level API for interacting with the generic
parts of a message (the headers, generic header parameters, and the payload,
which may be a list of sub-parts). This module provides classes and tools
that provide an enhanced and extensible API for dealing with various specific
types of content, including the ability to retrieve the content of the message
as a specialized object type rather than as a simple bytes object. The module
automatically takes care of the RFC-specified MIME details (required headers
and parameters, etc.) for certain common content types and content properties,
and support for additional types can be added by an application using the
extension mechanisms.

This module defines the eponymous "Content Manager" classes. The base
:class:`.ContentManager` class defines an API for registering content
management functions which extract data from ``Message`` objects or insert data
and headers into ``Message`` objects, thus providing a way of converting
between ``Message`` objects containing data and other representations of that
data (Python data types, specialized Python objects, external files, etc.). The
module also defines one concrete content manager: :data:`raw_data_manager`
converts between MIME content types and ``str`` or ``bytes`` data. It also
provides a convenient API for managing the MIME parameters when inserting
content into ``Message``\ s, and it handles inserting and extracting
``Message`` objects when dealing with the ``message/rfc822`` content type.

Another part of the enhanced interface is subclasses of
:class:`~email.message.Message` that provide new convenience API functions,
including convenience methods for calling the Content Managers derived from
this module.

.. note::

   Although :class:`.EmailMessage` and :class:`.MIMEPart` are currently
   documented in this module because of the provisional nature of the code, the
   implementation lives in the :mod:`email.message` module.

.. currentmodule:: email.message

.. class:: EmailMessage(policy=default)

   If *policy* is specified (it must be an instance of a :mod:`~email.policy`
   class), use the rules it specifies to update and serialize the representation
   of the message. If *policy* is not set, use the
   :class:`~email.policy.default` policy, which follows the rules of the email
   RFCs except for line endings (instead of the RFC mandated ``\r\n``, it uses
   the Python standard ``\n`` line endings). For more information see the
   :mod:`~email.policy` documentation.

   This class is a subclass of :class:`~email.message.Message`. It adds
   the following attribute and methods:


   .. attribute:: is_attachment

      Set to ``True`` if there is a :mailheader:`Content-Disposition` header
      and its (case insensitive) value is ``attachment``, ``False`` otherwise.


   .. method:: get_body(preferencelist=('related', 'html', 'plain'))

      Return the MIME part that is the best candidate to be the "body" of the
      message.

      *preferencelist* must be a sequence of strings from the set ``related``,
      ``html``, and ``plain``, and indicates the order of preference for the
      content type of the part returned.

      Start looking for candidate matches with the object on which the
      ``get_body`` method is called.

      If ``related`` is not included in *preferencelist*, consider the root
      part (or subpart of the root part) of any ``multipart/related``
      encountered as a candidate if the (sub-)part matches a preference.

      When encountering a ``multipart/related``, check the ``start`` parameter
      and if a part with a matching :mailheader:`Content-ID` is found, consider
      only it when looking for candidate matches. Otherwise consider only the
      first (default root) part of the ``multipart/related``.

      If a part has a :mailheader:`Content-Disposition` header, only consider
      the part a candidate match if the value of the header is ``inline``.

      If none of the candidates matches any of the preferences in
      *preferencelist*, return ``None``.

      Notes: (1) For most applications the only *preferencelist* combinations
      that really make sense are ``('plain',)``, ``('html', 'plain')``, and the
      default, ``('related', 'html', 'plain')``. (2) Because matching starts
      with the object on which ``get_body`` is called, calling ``get_body`` on
      a ``multipart/related`` will return the object itself unless
      *preferencelist* has a non-default value. (3) Messages (or message parts)
      that do not specify a :mailheader:`Content-Type` or whose
      :mailheader:`Content-Type` header is invalid will be treated as if they
      are of type ``text/plain``, which may occasionally cause ``get_body`` to
      return unexpected results.


   .. method:: iter_attachments()

      Return an iterator over all of the parts of the message that are not
      candidate "body" parts. That is, skip the first occurrence of each of
      ``text/plain``, ``text/html``, ``multipart/related``, or
      ``multipart/alternative`` (unless they are explicitly marked as
      attachments via :mailheader:`Content-Disposition: attachment`), and
      return all remaining parts. When applied directly to a
      ``multipart/related``, return an iterator over all the related parts
      except the root part (that is, the part pointed to by the ``start``
      parameter, or the first part if there is no ``start`` parameter or the
      ``start`` parameter doesn't match the :mailheader:`Content-ID` of any of
      the parts). When applied directly to a ``multipart/alternative`` or a
      non-``multipart``, return an empty iterator.


   .. method:: iter_parts()

      Return an iterator over all of the immediate sub-parts of the message,
      which will be empty for a non-``multipart``. (See also
      :meth:`~email.message.Message.walk`.)


   .. method:: get_content(*args, content_manager=None, **kw)

      Call the ``get_content`` method of the *content_manager*, passing self
      as the message object, and passing along any other arguments or keywords
      as additional arguments. If *content_manager* is not specified, use
      the ``content_manager`` specified by the current :mod:`~email.policy`.


   .. method:: set_content(*args, content_manager=None, **kw)

      Call the ``set_content`` method of the *content_manager*, passing self
      as the message object, and passing along any other arguments or keywords
      as additional arguments. If *content_manager* is not specified, use
      the ``content_manager`` specified by the current :mod:`~email.policy`.


   .. method:: make_related(boundary=None)

      Convert a non-``multipart`` message into a ``multipart/related`` message,
      moving any existing :mailheader:`Content-` headers and payload into a
      (new) first part of the ``multipart``. If *boundary* is specified, use
      it as the boundary string in the multipart, otherwise leave the boundary
      to be automatically created when it is needed (for example, when the
      message is serialized).


   .. method:: make_alternative(boundary=None)

      Convert a non-``multipart`` or a ``multipart/related`` into a
      ``multipart/alternative``, moving any existing :mailheader:`Content-`
      headers and payload into a (new) first part of the ``multipart``. If
      *boundary* is specified, use it as the boundary string in the multipart,
      otherwise leave the boundary to be automatically created when it is
      needed (for example, when the message is serialized).


   .. method:: make_mixed(boundary=None)

      Convert a non-``multipart``, a ``multipart/related``, or a
      ``multipart/alternative`` into a ``multipart/mixed``, moving any existing
      :mailheader:`Content-` headers and payload into a (new) first part of the
      ``multipart``. If *boundary* is specified, use it as the boundary string
      in the multipart, otherwise leave the boundary to be automatically
      created when it is needed (for example, when the message is serialized).


   .. method:: add_related(*args, content_manager=None, **kw)

      If the message is a ``multipart/related``, create a new message
      object, pass all of the arguments to its :meth:`set_content` method,
      and :meth:`~email.message.Message.attach` it to the ``multipart``. If
      the message is a non-``multipart``, call :meth:`make_related` and then
      proceed as above. If the message is any other type of ``multipart``,
      raise a :exc:`TypeError`. If *content_manager* is not specified, use
      the ``content_manager`` specified by the current :mod:`~email.policy`.
      If the added part has no :mailheader:`Content-Disposition` header,
      add one with the value ``inline``.


   .. method:: add_alternative(*args, content_manager=None, **kw)

      If the message is a ``multipart/alternative``, create a new message
      object, pass all of the arguments to its :meth:`set_content` method, and
      :meth:`~email.message.Message.attach` it to the ``multipart``. If the
      message is a non-``multipart`` or ``multipart/related``, call
      :meth:`make_alternative` and then proceed as above. If the message is
      any other type of ``multipart``, raise a :exc:`TypeError`. If
      *content_manager* is not specified, use the ``content_manager`` specified
      by the current :mod:`~email.policy`.


   .. method:: add_attachment(*args, content_manager=None, **kw)

      If the message is a ``multipart/mixed``, create a new message object,
      pass all of the arguments to its :meth:`set_content` method, and
      :meth:`~email.message.Message.attach` it to the ``multipart``. If the
      message is a non-``multipart``, ``multipart/related``, or
      ``multipart/alternative``, call :meth:`make_mixed` and then proceed as
      above. If *content_manager* is not specified, use the ``content_manager``
      specified by the current :mod:`~email.policy`. If the added part
      has no :mailheader:`Content-Disposition` header, add one with the value
      ``attachment``. This method can be used both for explicit attachments
      (:mailheader:`Content-Disposition: attachment`) and ``inline`` attachments
      (:mailheader:`Content-Disposition: inline`), by passing appropriate
      options to the ``content_manager``.


   .. method:: clear()

      Remove the payload and all of the headers.


   .. method:: clear_content()

      Remove the payload and all of the :mailheader:`Content-` headers, leaving
      all other headers intact and in their original order.

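The following is a minimal sketch of how the convenience methods above fit
together, assuming the default policy and therefore the ``raw_data_manager``
content manager. The subject, body text, and ``report.pdf`` payload are
made-up illustration values, not part of the API.

```python
from email.message import EmailMessage

# Start with a simple text/plain message.
msg = EmailMessage()
msg['Subject'] = 'Quarterly report'
msg.set_content('Plain-text body.\n')

# Adding an alternative converts the message to multipart/alternative,
# moving the existing text/plain content into the first sub-part.
msg.add_alternative('<p>HTML body.</p>\n', subtype='html')

# Adding an attachment converts the message to multipart/mixed.
# bytes content requires an explicit maintype and subtype.
msg.add_attachment(b'%PDF-1.4 placeholder',
                   maintype='application', subtype='pdf',
                   filename='report.pdf')

# Pick a body part according to different preference lists.
plain = msg.get_body(preferencelist=('plain',))
html = msg.get_body()  # default preference order prefers HTML over plain

# Everything that is not a candidate body part is an attachment.
attachments = list(msg.iter_attachments())
for part in attachments:
    print(part.get_content_type(), part.get_filename())
```

Only the ``Content-`` headers move into the new first sub-part on each
conversion; headers such as :mailheader:`Subject` stay on the outermost
message.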

.. class:: MIMEPart(policy=default)

   This class represents a subpart of a MIME message. It is identical to
   :class:`EmailMessage`, except that no :mailheader:`MIME-Version` headers are
   added when :meth:`~EmailMessage.set_content` is called, since sub-parts do
   not need their own :mailheader:`MIME-Version` headers.

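As a small illustration of that difference, the sketch below sets the same
content on both classes and compares the resulting headers. The variable names
and the body text are arbitrary.

```python
from email.message import EmailMessage, MIMEPart

top = EmailMessage()
top.set_content('hello\n')

part = MIMEPart()
part.set_content('hello\n')

# A top-level message carries MIME-Version; a sub-part does not.
print(top['MIME-Version'])
print(part['MIME-Version'])
```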

.. currentmodule:: email.contentmanager

.. class:: ContentManager()

   Base class for content managers. Provides the standard registry mechanisms
   to register converters between MIME content and other representations, as
   well as the ``get_content`` and ``set_content`` dispatch methods.


   .. method:: get_content(msg, *args, **kw)

      Look up a handler function based on the ``mimetype`` of *msg* (see next
      paragraph), call it, passing through all arguments, and return the result
      of the call. The expectation is that the handler will extract the
      payload from *msg* and return an object that encodes information about
      the extracted data.

      To find the handler, look for the following keys in the registry,
      stopping with the first one found:

      * the string representing the full MIME type (``maintype/subtype``)
      * the string representing the ``maintype``
      * the empty string

      If none of these keys produce a handler, raise a :exc:`KeyError` for the
      full MIME type.


   .. method:: set_content(msg, obj, *args, **kw)

      If the ``maintype`` is ``multipart``, raise a :exc:`TypeError`; otherwise
      look up a handler function based on the type of *obj* (see next
      paragraph), call :meth:`~email.message.EmailMessage.clear_content` on the
      *msg*, and call the handler function, passing through all arguments. The
      expectation is that the handler will transform and store *obj* into
      *msg*, possibly making other changes to *msg* as well, such as adding
      various MIME headers to encode information needed to interpret the stored
      data.

      To find the handler, obtain the type of *obj* (``typ = type(obj)``), and
      look for the following keys in the registry, stopping with the first one
      found:

      * the type itself (``typ``)
      * the type's fully qualified name (``typ.__module__ + '.' +
        typ.__qualname__``)
      * the type's qualname (``typ.__qualname__``)
      * the type's name (``typ.__name__``)

      If none of the above match, repeat all of the checks above for each of
      the types in the :term:`MRO` (``typ.__mro__``). Finally, if no other key
      yields a handler, check for a handler for the key ``None``. If there is
      no handler for ``None``, raise a :exc:`KeyError` for the fully
      qualified name of the type.

      Also add a :mailheader:`MIME-Version` header if one is not present (see
      also :class:`.MIMEPart`).


   .. method:: add_get_handler(key, handler)

      Record the function *handler* as the handler for *key*. For the possible
      values of *key*, see :meth:`get_content`.


   .. method:: add_set_handler(typekey, handler)

      Record *handler* as the function to call when an object of a type
      matching *typekey* is passed to :meth:`set_content`. For the possible
      values of *typekey*, see :meth:`set_content`.

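The registry can be extended with application-specific handlers. The sketch
below is one possible way to do that, not a recommended design: it builds a
hypothetical ``json_manager`` that stores ``dict`` objects as
``application/json`` parts. The names ``json_manager``, ``get_json``, and
``set_json`` are invented for this example, and the handlers delegate the byte
handling to :data:`raw_data_manager`.

```python
import json
from email.contentmanager import ContentManager, raw_data_manager
from email.message import EmailMessage


def get_json(msg):
    # Hypothetical get handler: decode the raw bytes and parse them as JSON.
    return json.loads(raw_data_manager.get_content(msg).decode('utf-8'))


def set_json(msg, obj):
    # Hypothetical set handler: serialize obj and store it as bytes.
    data = json.dumps(obj).encode('utf-8')
    raw_data_manager.set_content(msg, data,
                                 maintype='application', subtype='json')


json_manager = ContentManager()
# Keyed by the full MIME type, the first key that get_content looks for.
json_manager.add_get_handler('application/json', get_json)
# Keyed by the type itself, the first key that set_content looks for.
json_manager.add_set_handler(dict, set_json)

msg = EmailMessage()
msg.set_content({'answer': 42}, content_manager=json_manager)
restored = msg.get_content(content_manager=json_manager)
```

Because ``json_manager`` registers no handler for ``str`` and none for the key
``None``, passing a string to its ``set_content`` raises :exc:`KeyError`.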

Content Manager Instances
~~~~~~~~~~~~~~~~~~~~~~~~~

Currently the email package provides only one concrete content manager,
:data:`raw_data_manager`, although more may be added in the future.
:data:`raw_data_manager` is the
:attr:`~email.policy.EmailPolicy.content_manager` provided by
:class:`~email.policy.EmailPolicy` and its derivatives.


.. data:: raw_data_manager

   This content manager provides only a minimum interface beyond that provided
   by :class:`~email.message.Message` itself: it deals only with text, raw
   byte strings, and :class:`~email.message.Message` objects. Nevertheless, it
   provides significant advantages compared to the base API: ``get_content`` on
   a text part will return a unicode string without the application needing to
   manually decode it, ``set_content`` provides a rich set of options for
   controlling the headers added to a part and controlling the content transfer
   encoding, and it enables the use of the various ``add_`` methods, thereby
   simplifying the creation of multipart messages.

   .. method:: get_content(msg, errors='replace')

      Return the payload of the part as either a string (for ``text`` parts), a
      :class:`~email.message.EmailMessage` object (for ``message/rfc822``
      parts), or a ``bytes`` object (for all other non-multipart types). Raise
      a :exc:`KeyError` if called on a ``multipart``. If the part is a
      ``text`` part and *errors* is specified, use it as the error handler when
      decoding the payload to unicode. The default error handler is
      ``replace``.

   .. method:: set_content(msg, <'str'>, subtype="plain", charset='utf-8', \
                           cte=None, \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)
               set_content(msg, <'bytes'>, maintype, subtype, cte="base64", \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)
               set_content(msg, <'Message'>, cte=None, \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)
               set_content(msg, <'list'>, subtype='mixed', \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)

      Add headers and payload to *msg*:

      Add a :mailheader:`Content-Type` header with a ``maintype/subtype``
      value.

      * For ``str``, set the MIME ``maintype`` to ``text``, and set the
        subtype to *subtype* if it is specified, or ``plain`` if it is not.
      * For ``bytes``, use the specified *maintype* and *subtype*, or
        raise a :exc:`TypeError` if they are not specified.
      * For :class:`~email.message.Message` objects, set the maintype to
        ``message``, and set the subtype to *subtype* if it is specified
        or ``rfc822`` if it is not. If *subtype* is ``partial``, raise an
        error (``bytes`` objects must be used to construct
        ``message/partial`` parts).
      * For *<'list'>*, which should be a list of
        :class:`~email.message.Message` objects, set the ``maintype`` to
        ``multipart``, and the ``subtype`` to *subtype* if it is
        specified, and ``mixed`` if it is not. If the message parts in
        the *<'list'>* have :mailheader:`MIME-Version` headers, remove
        them.

      If *charset* is provided (which is valid only for ``str``), encode the
      string to bytes using the specified character set. The default is
      ``utf-8``. If the specified *charset* is a known alias for a standard
      MIME charset name, use the standard charset instead.

      If *cte* is set, encode the payload using the specified content transfer
      encoding, and set the :mailheader:`Content-Transfer-Encoding` header to
      that value. For ``str`` objects, if it is not set use heuristics to
      determine the most compact encoding. Possible values for *cte* are
      ``quoted-printable``, ``base64``, ``7bit``, ``8bit``, and ``binary``.
      If the input cannot be encoded in the specified encoding (for example,
      ``7bit``), raise a :exc:`ValueError`. For
      :class:`~email.message.Message`, per :rfc:`2046`, raise an error if a
      *cte* of ``quoted-printable`` or ``base64`` is requested for *subtype*
      ``rfc822``, and for any *cte* other than ``7bit`` for *subtype*
      ``external-body``. For ``message/rfc822``, use ``8bit`` if *cte* is not
      specified. For all other values of *subtype*, use ``7bit``.

      .. note:: A *cte* of ``binary`` does not actually work correctly yet.
         The ``Message`` object as modified by ``set_content`` is correct, but
         :class:`~email.generator.BytesGenerator` does not serialize it
         correctly.

      If *disposition* is set, use it as the value of the
      :mailheader:`Content-Disposition` header. If not specified, and
      *filename* is specified, add the header with the value ``attachment``.
      If it is not specified and *filename* is also not specified, do not add
      the header. The only valid values for *disposition* are ``attachment``
      and ``inline``.

      If *filename* is specified, use it as the value of the ``filename``
      parameter of the :mailheader:`Content-Disposition` header. There is no
      default.

      If *cid* is specified, add a :mailheader:`Content-ID` header with
      *cid* as its value.

      If *params* is specified, iterate its ``items`` method and use the
      resulting ``(key, value)`` pairs to set additional parameters on the
      :mailheader:`Content-Type` header.

      If *headers* is specified and is a list of strings of the form
      ``headername: headervalue`` or a list of ``header`` objects
      (distinguished from strings by having a ``name`` attribute), add the
      headers to *msg*.
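
The options described above can be combined in a single call. The following is
a rough sketch for a ``str`` payload; the text, the file name ``menu.txt``, the
``format`` parameter, and the ``X-Example`` header are placeholder values chosen
only for illustration.

```python
from email.contentmanager import raw_data_manager
from email.message import EmailMessage

msg = EmailMessage()
raw_data_manager.set_content(
    msg,
    'Caf\u00e9 menu\n',
    subtype='plain',
    charset='latin-1',            # alias, recorded under the standard name
    cte='quoted-printable',       # explicit content transfer encoding
    disposition='inline',
    filename='menu.txt',
    params={'format': 'flowed'},  # extra Content-Type parameter
    headers=['X-Example: demo'],  # "headername: headervalue" strings
)

# Reading the content back decodes the transfer encoding and the charset.
text = raw_data_manager.get_content(msg)
print(msg['Content-Type'])
print(msg['Content-Disposition'])
```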