:mod:`email.contentmanager`: Managing MIME Content
--------------------------------------------------
3
4.. module:: email.contentmanager
5 :synopsis: Storing and Retrieving Content from MIME Parts
6
7.. moduleauthor:: R. David Murray <rdmurray@bitdance.com>
8.. sectionauthor:: R. David Murray <rdmurray@bitdance.com>
9
10
11.. note::
12
13 The contentmanager module has been included in the standard library on a
14 :term:`provisional basis <provisional package>`. Backwards incompatible
15 changes (up to and including removal of the module) may occur if deemed
16 necessary by the core developers.
17
18.. versionadded:: 3.4
19 as a :term:`provisional module <provisional package>`.
20
The :mod:`~email.message` module provides a class that can represent an
arbitrary email message.  That basic message model has a useful and flexible
API, but it offers only lower-level access to the generic parts of a message
(the headers, generic header parameters, and the payload, which may be a list
of sub-parts).  This module provides classes and tools that offer an enhanced
and extensible API for dealing with various specific types of content,
including the ability to retrieve the content of the message as a specialized
object type rather than as a simple bytes object.  The module automatically
takes care of the RFC-specified MIME details (required headers and parameters,
etc.) for certain common content types and content properties, and support
for additional types can be added by an application using the extension
mechanisms.
33
This module defines the eponymous "Content Manager" classes.  The base
:class:`.ContentManager` class defines an API for registering content
management functions which extract data from ``Message`` objects or insert data
and headers into ``Message`` objects, thus providing a way of converting
between ``Message`` objects containing data and other representations of that
data (Python data types, specialized Python objects, external files, etc.).
The module also defines one concrete content manager: :data:`raw_data_manager`
converts between MIME content types and ``str`` or ``bytes`` data.  It also
provides a convenient API for managing the MIME parameters when inserting
content into ``Message``\ s, and it handles inserting and extracting
``Message`` objects when dealing with the ``message/rfc822`` content type.
45
46Another part of the enhanced interface is subclasses of
47:class:`~email.message.Message` that provide new convenience API functions,
48including convenience methods for calling the Content Managers derived from
49this module.
50
51.. note::
52
53 Although :class:`.EmailMessage` and :class:`.MIMEPart` are currently
54 documented in this module because of the provisional nature of the code, the
55 implementation lives in the :mod:`email.message` module.
56
57
58.. class:: EmailMessage(policy=default)
59
   If *policy* is specified (it must be an instance of a :mod:`~email.policy`
   class), use the rules it specifies to update and serialize the
   representation of the message.  If *policy* is not set, use the
   :class:`~email.policy.default` policy, which follows the rules of the email
   RFCs except for line endings (instead of the RFC-mandated ``\r\n``, it uses
   the Python standard ``\n`` line endings).  For more information see the
   :mod:`~email.policy` documentation.
67
68 This class is a subclass of :class:`~email.message.Message`. It adds
69 the following methods:
70
71
72 .. attribute:: is_attachment
73
74 Set to ``True`` if there is a :mailheader:`Content-Disposition` header
75 and its (case insensitive) value is ``attachment``, ``False`` otherwise.
76
77
78 .. method:: get_body(preferencelist=('related', 'html', 'plain'))
79
80 Return the MIME part that is the best candidate to be the "body" of the
81 message.
82
83 *preferencelist* must be a sequence of strings from the set ``related``,
84 ``html``, and ``plain``, and indicates the order of preference for the
85 content type of the part returned.
86
87 Start looking for candidate matches with the object on which the
88 ``get_body`` method is called.
89
      If ``related`` is not included in *preferencelist*, consider the root
      part (or subpart of the root part) of any ``multipart/related``
      encountered as a candidate if the (sub-)part matches a preference.
93
94 When encountering a ``multipart/related``, check the ``start`` parameter
95 and if a part with a matching :mailheader:`Content-ID` is found, consider
96 only it when looking for candidate matches. Otherwise consider only the
97 first (default root) part of the ``multipart/related``.
98
      If a part has a :mailheader:`Content-Disposition` header, only consider
      the part a candidate match if the value of the header is ``inline``.

      If none of the candidates matches any of the preferences in
      *preferencelist*, return ``None``.
104
105 Notes: (1) For most applications the only *preferencelist* combinations
106 that really make sense are ``('plain',)``, ``('html', 'plain')``, and the
107 default, ``('related', 'html', 'plain')``. (2) Because matching starts
108 with the object on which ``get_body`` is called, calling ``get_body`` on
109 a ``multipart/related`` will return the object itself unless
110 *preferencelist* has a non-default value. (3) Messages (or message parts)
111 that do not specify a :mailheader:`Content-Type` or whose
112 :mailheader:`Content-Type` header is invalid will be treated as if they
113 are of type ``text/plain``, which may occasionally cause ``get_body`` to
114 return unexpected results.
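The effect of *preferencelist* can be seen in a small sketch. The message
below is invented for illustration, and the import assumes that
``EmailMessage`` is importable from :mod:`email.message`, where its
implementation lives:

```python
from email.message import EmailMessage

# Build a message that offers the same text in two alternative forms.
msg = EmailMessage()
msg["Subject"] = "Greetings"
msg.set_content("Hello in plain text.")
msg.add_alternative("<p>Hello in <b>HTML</b>.</p>", subtype="html")

# With the default preference order, the HTML alternative is preferred
# over the plain text one.
preferred = msg.get_body()
preferred_type = preferred.get_content_type()

# Restricting the preference list selects the plain text part instead.
plain = msg.get_body(preferencelist=("plain",))
plain_type = plain.get_content_type()

print(preferred_type, plain_type)
```

An application that can only display plain text would typically pass
``('plain',)``, as in the second call.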
115
116
117 .. method:: iter_attachments()
118
      Return an iterator over all of the parts of the message that are not
      candidate "body" parts.  That is, skip the first occurrence of each of
      ``text/plain``, ``text/html``, ``multipart/related``, or
      ``multipart/alternative`` (unless they are explicitly marked as
      attachments via :mailheader:`Content-Disposition: attachment`), and
      return all remaining parts.  When applied directly to a
      ``multipart/related``, return an iterator over all the related parts
      except the root part (i.e. the part pointed to by the ``start``
      parameter, or the first part if there is no ``start`` parameter or the
      ``start`` parameter doesn't match the :mailheader:`Content-ID` of any of
      the parts).  When applied directly to a ``multipart/alternative`` or a
      non-``multipart``, return an empty iterator.
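As a rough illustration of the skipping rule, the sketch below builds a
``multipart/mixed`` message with one body part and two attachments. The file
names and payloads are placeholders chosen for the example:

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("See the attached files.")

# Placeholder payloads; real applications would read these from files.
msg.add_attachment(b"%PDF-1.4 placeholder", maintype="application",
                   subtype="pdf", filename="report.pdf")
msg.add_attachment("a,b\n1,2\n", subtype="csv", filename="data.csv")

# The leading text/plain part is a candidate body, so it is skipped.
attachment_names = [part.get_filename() for part in msg.iter_attachments()]
attachment_types = [part.get_content_type()
                    for part in msg.iter_attachments()]

print(attachment_names)
print(attachment_types)
```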
131
132
133 .. method:: iter_parts()
134
      Return an iterator over all of the immediate sub-parts of the message,
      which will be empty for a non-``multipart``.  (See also
      :meth:`~email.message.Message.walk`.)

139
140 .. method:: get_content(*args, content_manager=None, **kw)
141
142 Call the ``get_content`` method of the *content_manager*, passing self
143 as the message object, and passing along any other arguments or keywords
144 as additional arguments. If *content_manager* is not specified, use
145 the ``content_manager`` specified by the current :mod:`~email.policy`.
146
147
148 .. method:: set_content(*args, content_manager=None, **kw)
149
150 Call the ``set_content`` method of the *content_manager*, passing self
151 as the message object, and passing along any other arguments or keywords
152 as additional arguments. If *content_manager* is not specified, use
153 the ``content_manager`` specified by the current :mod:`~email.policy`.
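A minimal round trip through ``set_content`` and ``get_content`` might look
like the following. It relies on the policy's default content manager for the
first calls and then passes :data:`raw_data_manager` explicitly; the sample
text is arbitrary:

```python
from email.contentmanager import raw_data_manager
from email.message import EmailMessage

original = "Caf\u00e9 au lait, s'il vous pla\u00eet.\n"

msg = EmailMessage()
# Uses the content manager supplied by the message's policy.
msg.set_content(original)
text = msg.get_content()

# The same call with the content manager named explicitly.
explicit = msg.get_content(content_manager=raw_data_manager)

charset = msg.get_content_charset()
print(repr(text), charset)
```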
154
155
156 .. method:: make_related(boundary=None)
157
158 Convert a non-``multipart`` message into a ``multipart/related`` message,
159 moving any existing :mailheader:`Content-` headers and payload into a
160 (new) first part of the ``multipart``. If *boundary* is specified, use
161 it as the boundary string in the multipart, otherwise leave the boundary
162 to be automatically created when it is needed (for example, when the
163 message is serialized).
164
165
166 .. method:: make_alternative(boundary=None)
167
168 Convert a non-``multipart`` or a ``multipart/related`` into a
169 ``multipart/alternative``, moving any existing :mailheader:`Content-`
170 headers and payload into a (new) first part of the ``multipart``. If
171 *boundary* is specified, use it as the boundary string in the multipart,
172 otherwise leave the boundary to be automatically created when it is
173 needed (for example, when the message is serialized).
174
175
176 .. method:: make_mixed(boundary=None)
177
      Convert a non-``multipart``, a ``multipart/related``, or a
      ``multipart/alternative`` into a ``multipart/mixed``, moving any existing
      :mailheader:`Content-` headers and payload into a (new) first part of the
      ``multipart``.  If *boundary* is specified, use it as the boundary string
      in the multipart, otherwise leave the boundary to be automatically
      created when it is needed (for example, when the message is serialized).
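The three ``make_`` methods behave alike, so one sketch using
``make_mixed`` should be enough to show the conversion. The boundary string
here is an arbitrary value chosen for the example:

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Status"
msg.set_content("All systems nominal.")

# Convert the simple text message into a multipart/mixed container.
msg.make_mixed(boundary="example-boundary")

outer_type = msg.get_content_type()
boundary = msg.get_boundary()

# The original Content- headers and payload now live in the first part,
# while other headers such as Subject stay on the outer message.
parts = list(msg.iter_parts())
first_part_type = parts[0].get_content_type()
subject = msg["Subject"]

print(outer_type, boundary, first_part_type, subject)
```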
184
185
186 .. method:: add_related(*args, content_manager=None, **kw)
187
188 If the message is a ``multipart/related``, create a new message
189 object, pass all of the arguments to its :meth:`set_content` method,
190 and :meth:`~email.message.Message.attach` it to the ``multipart``. If
191 the message is a non-``multipart``, call :meth:`make_related` and then
192 proceed as above. If the message is any other type of ``multipart``,
193 raise a :exc:`TypeError`. If *content_manager* is not specified, use
194 the ``content_manager`` specified by the current :mod:`~email.policy`.
195 If the added part has no :mailheader:`Content-Disposition` header,
196 add one with the value ``inline``.
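One common use of ``add_related`` is attaching an image that an HTML body
refers to by :mailheader:`Content-ID`. The sketch below assumes a made-up
image payload and uses ``example.com`` as the message-id domain only to keep
the generated id predictable:

```python
from email.message import EmailMessage
from email.utils import make_msgid

# make_msgid returns a value wrapped in angle brackets, e.g. <...@example.com>.
image_cid = make_msgid(domain="example.com")

html_part = EmailMessage()
# The cid: URL uses the id without the surrounding angle brackets.
html_part.set_content(
    '<p>Logo: <img src="cid:{}"></p>'.format(image_cid[1:-1]),
    subtype="html",
)

# Placeholder bytes standing in for real image data.
html_part.add_related(b"GIF89a", maintype="image", subtype="gif",
                      cid=image_cid)

container_type = html_part.get_content_type()
related = list(html_part.iter_parts())
image_disposition = related[1].get_content_disposition()
image_content_id = str(related[1]["Content-ID"])

print(container_type, image_disposition, image_content_id)
```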
197
198
199 .. method:: add_alternative(*args, content_manager=None, **kw)
200
201 If the message is a ``multipart/alternative``, create a new message
202 object, pass all of the arguments to its :meth:`set_content` method, and
203 :meth:`~email.message.Message.attach` it to the ``multipart``. If the
204 message is a non-``multipart`` or ``multipart/related``, call
205 :meth:`make_alternative` and then proceed as above. If the message is
206 any other type of ``multipart``, raise a :exc:`TypeError`. If
207 *content_manager* is not specified, use the ``content_manager`` specified
208 by the current :mod:`~email.policy`.
209
210
211 .. method:: add_attachment(*args, content_manager=None, **kw)
212
      If the message is a ``multipart/mixed``, create a new message object,
      pass all of the arguments to its :meth:`set_content` method, and
      :meth:`~email.message.Message.attach` it to the ``multipart``.  If the
      message is a non-``multipart``, ``multipart/related``, or
      ``multipart/alternative``, call :meth:`make_mixed` and then proceed as
      above.  If *content_manager* is not specified, use the ``content_manager``
      specified by the current :mod:`~email.policy`.  If the added part
      has no :mailheader:`Content-Disposition` header, add one with the value
      ``attachment``.  This method can be used both for explicit attachments
      (:mailheader:`Content-Disposition: attachment`) and ``inline`` attachments
      (:mailheader:`Content-Disposition: inline`), by passing appropriate
      options to the ``content_manager``.
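The ``add_`` methods can be combined to build a nested structure step by
step. The following sketch, with invented content, shows the conversions that
happen implicitly along the way:

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("Plain text body.")

# A non-multipart message is converted to multipart/alternative first.
msg.add_alternative("<p>HTML body.</p>", subtype="html")

# The multipart/alternative is then wrapped in a multipart/mixed.
msg.add_attachment(b"\x00\x01\x02", maintype="application",
                   subtype="octet-stream", filename="blob.bin")

# walk() visits the message itself and then every nested part.
structure = [part.get_content_type() for part in msg.walk()]
last_part = list(msg.iter_parts())[-1]
disposition = last_part.get_content_disposition()

for content_type in structure:
    print(content_type)
```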
225
226
227 .. method:: clear()
228
229 Remove the payload and all of the headers.
230
231
232 .. method:: clear_content()
233
      Remove the payload and all of the :mailheader:`Content-` headers,
      leaving all other headers intact and in their original order.
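The difference from ``clear`` is easiest to see on a small message. In this
sketch the addresses and text are placeholders:

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["Subject"] = "Draft"
msg.set_content("First draft.")

had_content_type = "Content-Type" in msg

# Drop the payload and the Content- headers, keep everything else.
msg.clear_content()

remaining = msg.keys()
payload = msg.get_payload()

print(remaining, payload)
```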
236
237
238.. class:: ContentManager()
239
240 Base class for content managers. Provides the standard registry mechanisms
241 to register converters between MIME content and other representations, as
242 well as the ``get_content`` and ``set_content`` dispatch methods.
243
244
245 .. method:: get_content(msg, *args, **kw)
246
247 Look up a handler function based on the ``mimetype`` of *msg* (see next
248 paragraph), call it, passing through all arguments, and return the result
249 of the call. The expectation is that the handler will extract the
250 payload from *msg* and return an object that encodes information about
251 the extracted data.
252
253 To find the handler, look for the following keys in the registry,
254 stopping with the first one found:
255
256 * the string representing the full MIME type (``maintype/subtype``)
257 * the string representing the ``maintype``
258 * the empty string
259
260 If none of these keys produce a handler, raise a :exc:`KeyError` for the
261 full MIME type.
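The lookup order can be exercised with a small custom manager. The handler
functions ``get_json`` and ``get_fallback`` below are hypothetical names
invented for this sketch, not part of the module:

```python
import json
from email.contentmanager import ContentManager
from email.message import EmailMessage


def get_json(msg):
    # Hypothetical handler: decode the payload and parse it as JSON.
    return json.loads(msg.get_payload(decode=True).decode("utf-8"))


def get_fallback(msg):
    # Hypothetical catch-all handler: return the decoded payload bytes.
    return msg.get_payload(decode=True)


manager = ContentManager()
manager.add_get_handler("application/json", get_json)  # full MIME type
manager.add_get_handler("", get_fallback)              # empty-string key

json_msg = EmailMessage()
json_msg.set_content(b'{"answer": 42}', maintype="application", subtype="json")
png_msg = EmailMessage()
png_msg.set_content(b"\x89PNG", maintype="image", subtype="png")

data = json_msg.get_content(content_manager=manager)
raw = png_msg.get_content(content_manager=manager)

# A manager with no matching key raises KeyError for the full MIME type.
try:
    ContentManager().get_content(json_msg)
except KeyError as exc:
    missing = exc.args[0]

print(data, raw, missing)
```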
262
263
264 .. method:: set_content(msg, obj, *args, **kw)
265
      If the ``maintype`` of *msg* is ``multipart``, raise a :exc:`TypeError`;
      otherwise look up a handler function based on the type of *obj* (see next
      paragraph), call :meth:`~email.message.EmailMessage.clear_content` on the
      *msg*, and call the handler function, passing through all arguments.  The
      expectation is that the handler will transform *obj* and store it in
      *msg*, possibly making other changes to *msg* as well, such as adding
      various MIME headers to encode information needed to interpret the stored
      data.
274
275 To find the handler, obtain the type of *obj* (``typ = type(obj)``), and
276 look for the following keys in the registry, stopping with the first one
277 found:
278
      * the type itself (``typ``)
      * the type's fully qualified name (``typ.__module__ + '.' +
        typ.__qualname__``)
      * the type's qualname (``typ.__qualname__``)
      * the type's name (``typ.__name__``)
284
285 If none of the above match, repeat all of the checks above for each of
286 the types in the :term:`MRO` (``typ.__mro__``). Finally, if no other key
287 yields a handler, check for a handler for the key ``None``. If there is
288 no handler for ``None``, raise a :exc:`KeyError` for the fully
289 qualified name of the type.
290
291 Also add a :mailheader:`MIME-Version` header if one is not present (see
292 also :class:`.MIMEPart`).
293
294
295 .. method:: add_get_handler(key, handler)
296
297 Record the function *handler* as the handler for *key*. For the possible
298 values of *key*, see :meth:`get_content`.
299
300
301 .. method:: add_set_handler(typekey, handler)
302
303 Record *handler* as the function to call when an object of a type
304 matching *typekey* is passed to :meth:`set_content`. For the possible
305 values of *typekey*, see :meth:`set_content`.
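A set handler registered for a type is also found for subclasses of that
type, through the MRO lookup described under ``set_content``. The ``Point``
classes, the ``set_point`` handler, and the ``application/x-point`` content
type in this sketch are all invented for illustration:

```python
from email.contentmanager import ContentManager
from email.message import EmailMessage


class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class LabeledPoint(Point):
    # Subclass used to show that the MRO is searched for a handler.
    def __init__(self, x, y, label):
        super().__init__(x, y)
        self.label = label


def set_point(msg, point):
    # Hypothetical handler: store the coordinates as a short text payload
    # under a made-up content type.
    msg["Content-Type"] = "application/x-point"
    msg.set_payload("{},{}".format(point.x, point.y))


manager = ContentManager()
manager.add_set_handler(Point, set_point)

msg = EmailMessage()
msg.set_content(LabeledPoint(1, 2, "origin-ish"), content_manager=manager)

stored_type = msg.get_content_type()
stored_payload = msg.get_payload()

print(stored_type, stored_payload)
```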
306
307
308.. class:: MIMEPart(policy=default)
309
310 This class represents a subpart of a MIME message. It is identical to
311 :class:`EmailMessage`, except that no :mailheader:`MIME-Version` headers are
312 added when :meth:`~EmailMessage.set_content` is called, since sub-parts do
313 not need their own :mailheader:`MIME-Version` headers.
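The difference between the two classes can be checked directly. This is a
small sketch that assumes both classes are importable from
:mod:`email.message`:

```python
from email.message import EmailMessage, MIMEPart

top_level = EmailMessage()
top_level.set_content("Top-level message text.")

sub_part = MIMEPart()
sub_part.set_content("Sub-part text.")

# Only the top-level message class adds a MIME-Version header.
top_has_version = "MIME-Version" in top_level
part_has_version = "MIME-Version" in sub_part

print(top_has_version, part_has_version)
```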
314
315
316Content Manager Instances
317~~~~~~~~~~~~~~~~~~~~~~~~~
318
Currently the email package provides only one concrete content manager,
:data:`raw_data_manager`, although more may be added in the future.
:data:`raw_data_manager` is the
:attr:`~email.policy.EmailPolicy.content_manager` provided by
:class:`~email.policy.EmailPolicy` and its derivatives.
324
325
326.. data:: raw_data_manager
327
328 This content manager provides only a minimum interface beyond that provided
329 by :class:`~email.message.Message` itself: it deals only with text, raw
330 byte strings, and :class:`~email.message.Message` objects. Nevertheless, it
331 provides significant advantages compared to the base API: ``get_content`` on
332 a text part will return a unicode string without the application needing to
333 manually decode it, ``set_content`` provides a rich set of options for
334 controlling the headers added to a part and controlling the content transfer
335 encoding, and it enables the use of the various ``add_`` methods, thereby
336 simplifying the creation of multipart messages.
337
338 .. method:: get_content(msg, errors='replace')
339
340 Return the payload of the part as either a string (for ``text`` parts), a
341 :class:`~email.message.EmailMessage` object (for ``message/rfc822``
342 parts), or a ``bytes`` object (for all other non-multipart types). Raise
343 a :exc:`KeyError` if called on a ``multipart``. If the part is a
344 ``text`` part and *errors* is specified, use it as the error handler when
345 decoding the payload to unicode. The default error handler is
346 ``replace``.
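The *errors* argument matters when a text part contains bytes that are not
valid in its declared charset. The raw message in this sketch is constructed
by hand to contain such a byte:

```python
from email import message_from_bytes, policy

# The body contains a Latin-1 byte (0xE9) although the part claims utf-8.
raw = (b"Content-Type: text/plain; charset=utf-8\r\n"
       b"Content-Transfer-Encoding: 8bit\r\n"
       b"\r\n"
       b"caf\xe9\r\n")
msg = message_from_bytes(raw, policy=policy.default)

# Default handler: the undecodable byte becomes U+FFFD.
replaced = msg.get_content()

# An alternative handler simply drops the undecodable byte.
ignored = msg.get_content(errors="ignore")

print(repr(replaced), repr(ignored))
```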
347
   .. method:: set_content(msg, <'str'>, subtype="plain", charset='utf-8', \
                           cte=None, \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)
               set_content(msg, <'bytes'>, maintype, subtype, cte="base64", \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)
               set_content(msg, <'Message'>, cte=None, \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)
               set_content(msg, <'list'>, subtype='mixed', \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)

361
362 Add headers and payload to *msg*:
363
364 Add a :mailheader:`Content-Type` header with a ``maintype/subtype``
365 value.
366
367 * For ``str``, set the MIME ``maintype`` to ``text``, and set the
368 subtype to *subtype* if it is specified, or ``plain`` if it is not.
369 * For ``bytes``, use the specified *maintype* and *subtype*, or
370 raise a :exc:`TypeError` if they are not specified.
371 * For :class:`~email.message.Message` objects, set the maintype to
372 ``message``, and set the subtype to *subtype* if it is specified
373 or ``rfc822`` if it is not. If *subtype* is ``partial``, raise an
374 error (``bytes`` objects must be used to construct
375 ``message/partial`` parts).
376 * For *<'list'>*, which should be a list of
377 :class:`~email.message.Message` objects, set the ``maintype`` to
378 ``multipart``, and the ``subtype`` to *subtype* if it is
379 specified, and ``mixed`` if it is not. If the message parts in
380 the *<'list'>* have :mailheader:`MIME-Version` headers, remove
381 them.
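The ``str``, ``bytes``, and ``Message`` cases above can be sketched as
follows. The payloads are placeholders, and the calls go through
``EmailMessage.set_content``, which delegates to the policy's content
manager:

```python
from email.message import EmailMessage

# str: the maintype is text, the subtype defaults to plain.
text_msg = EmailMessage()
text_msg.set_content("Just some text.")

# bytes: maintype and subtype must be given explicitly.
image_msg = EmailMessage()
image_msg.set_content(b"\x89PNG\r\n", maintype="image", subtype="png")

# Omitting them is an error.
try:
    EmailMessage().set_content(b"data")
except TypeError:
    missing_type_rejected = True
else:
    missing_type_rejected = False

# Message object: stored as message/rfc822 by default.
inner = EmailMessage()
inner["Subject"] = "Original"
inner.set_content("Forwarded text.")
wrapper = EmailMessage()
wrapper.set_content(inner)

types = [m.get_content_type() for m in (text_msg, image_msg, wrapper)]
print(types, missing_type_rejected)
```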
382
383 If *charset* is provided (which is valid only for ``str``), encode the
384 string to bytes using the specified character set. The default is
385 ``utf-8``. If the specified *charset* is a known alias for a standard
386 MIME charset name, use the standard charset instead.
387
      If *cte* is set, encode the payload using the specified content transfer
      encoding, and set the :mailheader:`Content-Transfer-Encoding` header to
      that value.  For ``str`` objects, if it is not set use heuristics to
      determine the most compact encoding.  Possible values for *cte* are
      ``quoted-printable``, ``base64``, ``7bit``, ``8bit``, and ``binary``.
      If the input cannot be encoded in the specified encoding (e.g.
      ``7bit``), raise a :exc:`ValueError`.  For
      :class:`~email.message.Message`, per :rfc:`2046`, raise an error if a
      *cte* of ``quoted-printable`` or ``base64`` is requested for *subtype*
      ``rfc822``, and for any *cte* other than ``7bit`` for *subtype*
      ``external-body``.  For ``message/rfc822``, use ``8bit`` if *cte* is not
      specified.  For all other values of *subtype*, use ``7bit``.
400
401 .. note:: A *cte* of ``binary`` does not actually work correctly yet.
402 The ``Message`` object as modified by ``set_content`` is correct, but
403 :class:`~email.generator.BytesGenerator` does not serialize it
404 correctly.
405
406 If *disposition* is set, use it as the value of the
407 :mailheader:`Content-Disposition` header. If not specified, and
408 *filename* is specified, add the header with the value ``attachment``.
409 If it is not specified and *filename* is also not specified, do not add
410 the header. The only valid values for *disposition* are ``attachment``
411 and ``inline``.
412
413 If *filename* is specified, use it as the value of the ``filename``
414 parameter of the :mailheader:`Content-Disposition` header. There is no
415 default.
416
417 If *cid* is specified, add a :mailheader:`Content-ID` header with
418 *cid* as its value.
419
      If *params* is specified, iterate its ``items`` method and use the
      resulting ``(key, value)`` pairs to set additional parameters on the
      :mailheader:`Content-Type` header.

      If *headers* is specified and is a list of strings of the form
      ``headername: headervalue`` or a list of ``header`` objects
      (distinguished from strings by having a ``name`` attribute), add the
      headers to *msg*.
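Several of the keyword arguments described above can be combined in one
call. In this sketch the parameter name ``header``, the
:mailheader:`X-Report-Id` header, and the file name are all illustrative
values, not anything the module requires:

```python
from email.message import EmailMessage

body = "name,qty\nwidget,3\n"

msg = EmailMessage()
msg.set_content(
    body,
    subtype="csv",
    cte="quoted-printable",            # explicit content transfer encoding
    filename="inventory.csv",          # implies Content-Disposition: attachment
    params={"header": "present"},      # extra Content-Type parameter
    headers=["X-Report-Id: 12345"],    # extra header in "name: value" form
)

content_type = msg.get_content_type()
cte = str(msg["Content-Transfer-Encoding"])
disposition = msg.get_content_disposition()
extra_param = msg.get_param("header")
report_id = str(msg["X-Report-Id"])

# Text that is not ASCII cannot be stored with a 7bit encoding.
try:
    EmailMessage().set_content("caf\u00e9\n", cte="7bit")
except ValueError:
    seven_bit_rejected = True
else:
    seven_bit_rejected = False

print(content_type, cte, disposition, extra_param, report_id)
```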