:mod:`email.contentmanager`: Managing MIME Content
--------------------------------------------------

.. module:: email.contentmanager
   :synopsis: Storing and Retrieving Content from MIME Parts

.. moduleauthor:: R. David Murray <rdmurray@bitdance.com>
.. sectionauthor:: R. David Murray <rdmurray@bitdance.com>


.. note::

   The contentmanager module has been included in the standard library on a
   :term:`provisional basis <provisional package>`. Backwards incompatible
   changes (up to and including removal of the module) may occur if deemed
   necessary by the core developers.

.. versionadded:: 3.4
   as a :term:`provisional module <provisional package>`.

The :mod:`~email.message` module provides a class that can represent an
arbitrary email message. That basic message model has a useful and flexible
API, but it provides only a lower-level API for interacting with the generic
parts of a message (the headers, generic header parameters, and the payload,
which may be a list of sub-parts). This module provides classes and tools
that offer an enhanced and extensible API for dealing with various specific
types of content, including the ability to retrieve the content of the message
as a specialized object type rather than as a simple bytes object. The module
automatically takes care of the RFC-specified MIME details (required headers
and parameters, etc.) for certain common content types and content properties,
and support for additional types can be added by an application using the
extension mechanisms.

This module defines the eponymous "Content Manager" classes. The base
:class:`.ContentManager` class defines an API for registering content
management functions which extract data from ``Message`` objects or insert data
and headers into ``Message`` objects, thus providing a way of converting
between ``Message`` objects containing data and other representations of that
data (Python data types, specialized Python objects, external files, etc.). The
module also defines one concrete content manager: :data:`raw_data_manager`
converts between MIME content types and ``str`` or ``bytes`` data. It also
provides a convenient API for managing the MIME parameters when inserting
content into ``Message``\ s, and it handles inserting and extracting
``Message`` objects when dealing with the ``message/rfc822`` content type.

Another part of the enhanced interface is subclasses of
:class:`~email.message.Message` that provide new convenience API functions,
including convenience methods for calling the Content Managers derived from
this module.

.. note::

   Although :class:`.EmailMessage` and :class:`.MIMEPart` are currently
   documented in this module because of the provisional nature of the code, the
   implementation lives in the :mod:`email.message` module.

.. currentmodule:: email.message

.. class:: EmailMessage(policy=default)

   If *policy* is specified (it must be an instance of a :mod:`~email.policy`
   class) use the rules it specifies to update and serialize the representation
   of the message. If *policy* is not set, use the
   :class:`~email.policy.default` policy, which follows the rules of the email
   RFCs except for line endings (instead of the RFC mandated ``\r\n``, it uses
   the Python standard ``\n`` line endings). For more information see the
   :mod:`~email.policy` documentation.

   This class is a subclass of :class:`~email.message.Message`. It adds
   the following methods:


   .. method:: is_attachment()

      Return ``True`` if there is a :mailheader:`Content-Disposition` header
      and its (case insensitive) value is ``attachment``, ``False`` otherwise.

      .. versionchanged:: 3.4.2
         is_attachment is now a method instead of a property, for consistency
         with :meth:`~email.message.Message.is_multipart`.

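      A minimal sketch of this check, assuming a message built with
      :meth:`set_content` and :meth:`add_attachment`; the payload bytes and
      the file name are made up for illustration:

      .. code-block:: python

         from email.message import EmailMessage

         msg = EmailMessage()
         msg.set_content("See the attached notes.")
         # Hypothetical payload and file name, used only for illustration.
         msg.add_attachment(b"meeting notes", maintype="application",
                            subtype="octet-stream", filename="notes.bin")

         # add_attachment() converted the message to multipart/mixed; the
         # original text is now the first part, the new data the second.
         body, attachment = msg.iter_parts()
         print(body.is_attachment())        # False: no Content-Disposition
         print(attachment.is_attachment())  # True: disposition is attachment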

   .. method:: get_body(preferencelist=('related', 'html', 'plain'))

      Return the MIME part that is the best candidate to be the "body" of the
      message.

      *preferencelist* must be a sequence of strings from the set ``related``,
      ``html``, and ``plain``, and indicates the order of preference for the
      content type of the part returned.

      Start looking for candidate matches with the object on which the
      ``get_body`` method is called.

      If ``related`` is not included in *preferencelist*, consider the root
      part (or subpart of the root part) of any ``multipart/related``
      encountered as a candidate if the (sub-)part matches a preference.

      When encountering a ``multipart/related``, check the ``start`` parameter
      and if a part with a matching :mailheader:`Content-ID` is found, consider
      only it when looking for candidate matches. Otherwise consider only the
      first (default root) part of the ``multipart/related``.

      If a part has a :mailheader:`Content-Disposition` header, only consider
      the part a candidate match if the value of the header is ``inline``.

      If none of the candidates matches any of the preferences in
      *preferencelist*, return ``None``.

      Notes: (1) For most applications the only *preferencelist* combinations
      that really make sense are ``('plain',)``, ``('html', 'plain')``, and the
      default, ``('related', 'html', 'plain')``. (2) Because matching starts
      with the object on which ``get_body`` is called, calling ``get_body`` on
      a ``multipart/related`` will return the object itself unless
      *preferencelist* has a non-default value. (3) Messages (or message parts)
      that do not specify a :mailheader:`Content-Type` or whose
      :mailheader:`Content-Type` header is invalid will be treated as if they
      are of type ``text/plain``, which may occasionally cause ``get_body`` to
      return unexpected results.

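      The effect of *preferencelist* can be sketched as follows, assuming a
      simple ``multipart/alternative`` message with a plain text and an HTML
      rendering (the message text is illustrative only):

      .. code-block:: python

         from email.message import EmailMessage

         msg = EmailMessage()
         msg.set_content("Plain text version.")
         msg.add_alternative("<p>HTML version.</p>", subtype="html")

         # With the default preference order, 'html' outranks 'plain'.
         default_body = msg.get_body()
         # Restricting the list selects the plain text part instead.
         plain_body = msg.get_body(preferencelist=("plain",))
         # No part satisfies a preference for 'related' only.
         no_body = msg.get_body(preferencelist=("related",))

         print(default_body.get_content_type())  # text/html
         print(plain_body.get_content_type())    # text/plain
         print(no_body)                          # None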

   .. method:: iter_attachments()

      Return an iterator over all of the parts of the message that are not
      candidate "body" parts. That is, skip the first occurrence of each of
      ``text/plain``, ``text/html``, ``multipart/related``, or
      ``multipart/alternative`` (unless they are explicitly marked as
      attachments via :mailheader:`Content-Disposition: attachment`), and
      return all remaining parts. When applied directly to a
      ``multipart/related``, return an iterator over all of the related parts
      except the root part (i.e. the part pointed to by the ``start`` parameter,
      or the first part if there is no ``start`` parameter or the ``start``
      parameter doesn't match the :mailheader:`Content-ID` of any of the
      parts). When applied directly to a ``multipart/alternative`` or a
      non-``multipart``, return an empty iterator.

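      As an illustration, the sketch below lists the file names of the
      attachments of a small message. The payloads and file names are
      placeholders, not real file contents:

      .. code-block:: python

         from email.message import EmailMessage

         msg = EmailMessage()
         msg.set_content("Report attached.")
         # Placeholder payloads; only the MIME types and names matter here.
         msg.add_attachment(b"placeholder pdf bytes", maintype="application",
                            subtype="pdf", filename="report.pdf")
         msg.add_attachment(b"col1,col2\n1,2\n", maintype="text",
                            subtype="csv", filename="data.csv")

         # The text/plain body is skipped; both attachments are returned.
         names = [part.get_filename() for part in msg.iter_attachments()]
         print(names)  # ['report.pdf', 'data.csv']

         # A non-multipart message has no attachments.
         empty = list(EmailMessage().iter_attachments())
         print(empty)  # []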

   .. method:: iter_parts()

      Return an iterator over all of the immediate sub-parts of the message,
      which will be empty for a non-``multipart``. (See also
      :meth:`~email.message.Message.walk`.)

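      The difference from :meth:`~email.message.Message.walk` can be seen in
      a short sketch, assuming a message with an alternative body and one
      attachment (contents are illustrative):

      .. code-block:: python

         from email.message import EmailMessage

         msg = EmailMessage()
         msg.set_content("plain")
         msg.add_alternative("<p>html</p>", subtype="html")
         msg.add_attachment(b"\x00\x01", maintype="application",
                            subtype="octet-stream")

         # iter_parts() only looks one level down ...
         immediate = [p.get_content_type() for p in msg.iter_parts()]
         # ... while walk() visits the message itself and every nested part.
         everything = [p.get_content_type() for p in msg.walk()]

         print(immediate)
         print(everything)

      Here ``immediate`` contains the ``multipart/alternative`` container and
      the attachment, while ``everything`` also includes the top-level
      ``multipart/mixed`` and the two text parts inside the alternative.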

   .. method:: get_content(*args, content_manager=None, **kw)

      Call the ``get_content`` method of the *content_manager*, passing self
      as the message object, and passing along any other arguments or keywords
      as additional arguments. If *content_manager* is not specified, use
      the ``content_manager`` specified by the current :mod:`~email.policy`.


   .. method:: set_content(*args, content_manager=None, **kw)

      Call the ``set_content`` method of the *content_manager*, passing self
      as the message object, and passing along any other arguments or keywords
      as additional arguments. If *content_manager* is not specified, use
      the ``content_manager`` specified by the current :mod:`~email.policy`.

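      A minimal round trip through both methods might look like the
      following sketch. It relies on the default policy, whose content manager
      is :data:`~email.contentmanager.raw_data_manager`; the sample strings
      and bytes are arbitrary:

      .. code-block:: python

         from email.contentmanager import raw_data_manager
         from email.message import EmailMessage

         msg = EmailMessage()

         # No content_manager given: the policy's content manager is used.
         msg.set_content("Hello, world.")
         text = msg.get_content()

         # The same call with the content manager passed explicitly.
         same_text = msg.get_content(content_manager=raw_data_manager)

         # Extra arguments are forwarded to the content manager unchanged.
         msg.set_content(b"\x00\x01\x02", maintype="application",
                         subtype="octet-stream")
         data = msg.get_content()

         print(repr(text))  # 'Hello, world.\n'
         print(repr(data))  # b'\x00\x01\x02'

      Note that the text comes back with a trailing newline, because the
      content manager stores text as complete lines.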

   .. method:: make_related(boundary=None)

      Convert a non-``multipart`` message into a ``multipart/related`` message,
      moving any existing :mailheader:`Content-` headers and payload into a
      (new) first part of the ``multipart``. If *boundary* is specified, use
      it as the boundary string in the multipart, otherwise leave the boundary
      to be automatically created when it is needed (for example, when the
      message is serialized).


   .. method:: make_alternative(boundary=None)

      Convert a non-``multipart`` or a ``multipart/related`` into a
      ``multipart/alternative``, moving any existing :mailheader:`Content-`
      headers and payload into a (new) first part of the ``multipart``. If
      *boundary* is specified, use it as the boundary string in the multipart,
      otherwise leave the boundary to be automatically created when it is
      needed (for example, when the message is serialized).


   .. method:: make_mixed(boundary=None)

      Convert a non-``multipart``, a ``multipart/related``, or a
      ``multipart/alternative`` into a ``multipart/mixed``, moving any existing
      :mailheader:`Content-` headers and payload into a (new) first part of the
      ``multipart``. If *boundary* is specified, use it as the boundary string
      in the multipart, otherwise leave the boundary to be automatically
      created when it is needed (for example, when the message is serialized).

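      The conversions above nest the existing content one level deeper each
      time. A short sketch, assuming a plain text message and an arbitrary
      boundary string chosen for the example:

      .. code-block:: python

         from email.message import EmailMessage

         msg = EmailMessage()
         msg["Subject"] = "Demo"
         msg.set_content("Hello")

         # The text/plain content moves into a new first sub-part.
         msg.make_alternative()
         after_alternative = msg.get_content_type()
         alternative_parts = [p.get_content_type() for p in msg.iter_parts()]

         # The whole multipart/alternative now becomes the first sub-part.
         msg.make_mixed(boundary="BOUNDARY")
         after_mixed = msg.get_content_type()
         mixed_parts = [p.get_content_type() for p in msg.iter_parts()]

         print(after_alternative, alternative_parts)
         print(after_mixed, mixed_parts, msg.get_boundary())
         print(msg["Subject"])  # non-Content- headers stay on the outer message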

   .. method:: add_related(*args, content_manager=None, **kw)

      If the message is a ``multipart/related``, create a new message
      object, pass all of the arguments to its :meth:`set_content` method,
      and :meth:`~email.message.Message.attach` it to the ``multipart``. If
      the message is a non-``multipart``, call :meth:`make_related` and then
      proceed as above. If the message is any other type of ``multipart``,
      raise a :exc:`TypeError`. If *content_manager* is not specified, use
      the ``content_manager`` specified by the current :mod:`~email.policy`.
      If the added part has no :mailheader:`Content-Disposition` header,
      add one with the value ``inline``.

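      A typical use is an HTML body that refers to an embedded image. The
      sketch below assumes a made-up Content-ID and placeholder image bytes;
      a real application would generate a unique ID and read actual image
      data:

      .. code-block:: python

         from email.message import EmailMessage

         # Hypothetical Content-ID, for illustration only.
         cid = "<logo@example.com>"

         msg = EmailMessage()
         msg.set_content('<p><img src="cid:logo@example.com"></p>',
                         subtype="html")
         # Placeholder bytes standing in for real PNG data.
         msg.add_related(b"placeholder image bytes", maintype="image",
                         subtype="png", cid=cid)

         root, image = msg.iter_parts()
         print(msg.get_content_type())        # multipart/related
         print(root.get_content_type())       # text/html
         print(image["Content-Disposition"])  # inline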

   .. method:: add_alternative(*args, content_manager=None, **kw)

      If the message is a ``multipart/alternative``, create a new message
      object, pass all of the arguments to its :meth:`set_content` method, and
      :meth:`~email.message.Message.attach` it to the ``multipart``. If the
      message is a non-``multipart`` or ``multipart/related``, call
      :meth:`make_alternative` and then proceed as above. If the message is
      any other type of ``multipart``, raise a :exc:`TypeError`. If
      *content_manager* is not specified, use the ``content_manager`` specified
      by the current :mod:`~email.policy`.


   .. method:: add_attachment(*args, content_manager=None, **kw)

      If the message is a ``multipart/mixed``, create a new message object,
      pass all of the arguments to its :meth:`set_content` method, and
      :meth:`~email.message.Message.attach` it to the ``multipart``. If the
      message is a non-``multipart``, ``multipart/related``, or
      ``multipart/alternative``, call :meth:`make_mixed` and then proceed as
      above. If *content_manager* is not specified, use the ``content_manager``
      specified by the current :mod:`~email.policy`. If the added part
      has no :mailheader:`Content-Disposition` header, add one with the value
      ``attachment``. This method can be used both for explicit attachments
      (:mailheader:`Content-Disposition: attachment`) and ``inline`` attachments
      (:mailheader:`Content-Disposition: inline`), by passing appropriate
      options to the ``content_manager``.

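      Both kinds of attachment can be sketched as follows, assuming the
      default content manager (which accepts the *filename* and *disposition*
      keywords); the payloads and the file name are illustrative:

      .. code-block:: python

         from email.message import EmailMessage

         msg = EmailMessage()
         msg.set_content("Two parts follow.")

         # Explicit attachment: a file name implies 'attachment'.
         msg.add_attachment(b"raw bytes", maintype="application",
                            subtype="octet-stream", filename="blob.bin")
         # Inline attachment: the disposition is requested explicitly.
         msg.add_attachment("An inline note.", disposition="inline")

         body, explicit, inline = msg.iter_parts()
         print(explicit.is_attachment())       # True
         print(inline.is_attachment())         # False
         print(inline["Content-Disposition"])  # inline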

   .. method:: clear()

      Remove the payload and all of the headers.


   .. method:: clear_content()

      Remove the payload and all of the :mailheader:`Content-` headers, leaving
      all other headers intact and in their original order.

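      The difference between the two methods can be sketched like this,
      using an arbitrary subject and body:

      .. code-block:: python

         from email.message import EmailMessage

         msg = EmailMessage()
         msg["Subject"] = "Status"
         msg.set_content("old body")

         # clear_content() drops the payload and the Content-* headers only.
         msg.clear_content()
         kept_subject = "Subject" in msg
         kept_content_type = "Content-Type" in msg
         payload_after_clear_content = msg.get_payload()

         # clear() drops everything.
         msg.clear()
         headers_after_clear = msg.keys()

         print(kept_subject, kept_content_type)  # True False
         print(payload_after_clear_content)      # None
         print(headers_after_clear)              # []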

.. class:: MIMEPart(policy=default)

   This class represents a subpart of a MIME message. It is identical to
   :class:`EmailMessage`, except that no :mailheader:`MIME-Version` headers are
   added when :meth:`~EmailMessage.set_content` is called, since sub-parts do
   not need their own :mailheader:`MIME-Version` headers.

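   A quick way to observe the difference, as a sketch with arbitrary content:

   .. code-block:: python

      from email.message import EmailMessage, MIMEPart

      message = EmailMessage()
      message.set_content("top-level text")

      part = MIMEPart()
      part.set_content("sub-part text")

      print("MIME-Version" in message)  # True
      print("MIME-Version" in part)     # False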

.. currentmodule:: email.contentmanager

.. class:: ContentManager()

   Base class for content managers. Provides the standard registry mechanisms
   to register converters between MIME content and other representations, as
   well as the ``get_content`` and ``set_content`` dispatch methods.


   .. method:: get_content(msg, *args, **kw)

      Look up a handler function based on the ``mimetype`` of *msg* (see next
      paragraph), call it, passing through all arguments, and return the result
      of the call. The expectation is that the handler will extract the
      payload from *msg* and return an object that encodes information about
      the extracted data.

      To find the handler, look for the following keys in the registry,
      stopping with the first one found:

      * the string representing the full MIME type (``maintype/subtype``)
      * the string representing the ``maintype``
      * the empty string

      If none of these keys produce a handler, raise a :exc:`KeyError` for the
      full MIME type.

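      The lookup order can be sketched with three trivial handlers that only
      report which key matched. The ``make_part`` helper and the handler
      return values are hypothetical, introduced just for this example:

      .. code-block:: python

         from email.contentmanager import ContentManager
         from email.message import EmailMessage

         def make_part(content_type):
             """Hypothetical helper: an empty part with the given type."""
             part = EmailMessage()
             part["Content-Type"] = content_type
             return part

         manager = ContentManager()
         manager.add_get_handler("text/plain", lambda msg: "full type")
         manager.add_get_handler("text", lambda msg: "maintype")
         manager.add_get_handler("", lambda msg: "empty string")

         results = [manager.get_content(make_part(ctype))
                    for ctype in ("text/plain", "text/html", "image/png")]
         print(results)  # ['full type', 'maintype', 'empty string']

         # Without any matching key, the full MIME type is the KeyError value.
         try:
             ContentManager().get_content(make_part("image/png"))
         except KeyError as exc:
             missing = exc.args[0]
         print(missing)  # image/png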

   .. method:: set_content(msg, obj, *args, **kw)

      If the ``maintype`` is ``multipart``, raise a :exc:`TypeError`; otherwise
      look up a handler function based on the type of *obj* (see next
      paragraph), call :meth:`~email.message.EmailMessage.clear_content` on the
      *msg*, and call the handler function, passing through all arguments. The
      expectation is that the handler will transform and store *obj* into
      *msg*, possibly making other changes to *msg* as well, such as adding
      various MIME headers to encode information needed to interpret the stored
      data.

      To find the handler, obtain the type of *obj* (``typ = type(obj)``), and
      look for the following keys in the registry, stopping with the first one
      found:

      * the type itself (``typ``)
      * the type's fully qualified name (``typ.__module__ + '.' +
        typ.__qualname__``)
      * the type's qualname (``typ.__qualname__``)
      * the type's name (``typ.__name__``)

      If none of the above match, repeat all of the checks above for each of
      the types in the :term:`MRO` (``typ.__mro__``). Finally, if no other key
      yields a handler, check for a handler for the key ``None``. If there is
      no handler for ``None``, raise a :exc:`KeyError` for the fully
      qualified name of the type.

      Also add a :mailheader:`MIME-Version` header if one is not present (see
      also :class:`.MIMEPart`).

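      A sketch of the type lookup, using handlers that merely record which
      registry key was used. The ``make_handler`` factory and the labels are
      hypothetical, and the handlers store a plain string payload only so
      that the example is complete:

      .. code-block:: python

         from email.contentmanager import ContentManager
         from email.message import EmailMessage
         from fractions import Fraction

         used = []

         def make_handler(label):
             """Hypothetical factory for handlers that record their label."""
             def handler(msg, obj, *args, **kw):
                 used.append(label)
                 msg["Content-Type"] = "text/plain"
                 msg.set_payload(str(obj))
             return handler

         manager = ContentManager()
         manager.add_set_handler(int, make_handler("type"))
         manager.add_set_handler("fractions.Fraction",
                                 make_handler("qualified name"))
         manager.add_set_handler(None, make_handler("None fallback"))

         # 7 matches int directly; True is a bool, found via the MRO (int);
         # Fraction matches by its fully qualified name; float matches
         # nothing and falls back to the None handler.
         for obj in (7, True, Fraction(1, 2), 3.5):
             manager.set_content(EmailMessage(), obj)

         print(used)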

   .. method:: add_get_handler(key, handler)

      Record the function *handler* as the handler for *key*. For the possible
      values of *key*, see :meth:`get_content`.


   .. method:: add_set_handler(typekey, handler)

      Record *handler* as the function to call when an object of a type
      matching *typekey* is passed to :meth:`set_content`. For the possible
      values of *typekey*, see :meth:`set_content`.

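      As an illustration of the two registration methods working together,
      the sketch below defines a small content manager that stores ``dict``
      objects as ``application/json``. The handler names and the choice of
      JSON are assumptions made for the example; the handlers delegate the
      byte-level work to :data:`raw_data_manager`:

      .. code-block:: python

         import json
         from email.contentmanager import ContentManager, raw_data_manager
         from email.message import EmailMessage

         def get_json(msg):
             # Hypothetical get handler: decode the raw bytes as JSON.
             raw = raw_data_manager.get_content(msg)
             return json.loads(raw.decode("utf-8"))

         def set_json(msg, obj, **kw):
             # Hypothetical set handler: serialize, then store as bytes.
             raw_data_manager.set_content(
                 msg, json.dumps(obj).encode("utf-8"),
                 maintype="application", subtype="json", **kw)

         json_manager = ContentManager()
         json_manager.add_get_handler("application/json", get_json)
         json_manager.add_set_handler(dict, set_json)

         msg = EmailMessage()
         msg.set_content({"id": 7, "tags": ["a", "b"]},
                         content_manager=json_manager)
         restored = msg.get_content(content_manager=json_manager)

         print(msg.get_content_type())  # application/json
         print(restored)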
322
R David Murray3da240f2013-10-16 22:48:40 -0400323Content Manager Instances
324~~~~~~~~~~~~~~~~~~~~~~~~~
325
326Currently the email package provides only one concrete content manager,
327:data:`raw_data_manager`, although more may be added in the future.
328:data:`raw_data_manager` is the
329:attr:`~email.policy.EmailPolicy.content_manager` provided by
330:attr:`~email.policy.EmailPolicy` and its derivatives.
331
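This relationship can be checked directly. The sketch below uses two of the
predefined :class:`~email.policy.EmailPolicy` instances from
:mod:`email.policy`:

.. code-block:: python

   from email import policy
   from email.contentmanager import raw_data_manager

   uses_raw_default = policy.default.content_manager is raw_data_manager
   uses_raw_smtp = policy.SMTP.content_manager is raw_data_manager

   print(uses_raw_default)  # True
   print(uses_raw_smtp)     # True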

.. data:: raw_data_manager

   This content manager provides only a minimum interface beyond that provided
   by :class:`~email.message.Message` itself: it deals only with text, raw
   byte strings, and :class:`~email.message.Message` objects. Nevertheless, it
   provides significant advantages compared to the base API: ``get_content`` on
   a text part will return a unicode string without the application needing to
   manually decode it, ``set_content`` provides a rich set of options for
   controlling the headers added to a part and controlling the content transfer
   encoding, and it enables the use of the various ``add_`` methods, thereby
   simplifying the creation of multipart messages.

   .. method:: get_content(msg, errors='replace')

      Return the payload of the part as either a string (for ``text`` parts), a
      :class:`~email.message.EmailMessage` object (for ``message/rfc822``
      parts), or a ``bytes`` object (for all other non-multipart types). Raise
      a :exc:`KeyError` if called on a ``multipart``. If the part is a
      ``text`` part and *errors* is specified, use it as the error handler when
      decoding the payload to unicode. The default error handler is
      ``replace``.
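
      These return types can be sketched as follows, using small messages
      whose contents are arbitrary examples:

      .. code-block:: python

         from email.contentmanager import raw_data_manager
         from email.message import EmailMessage

         text = EmailMessage()
         text.set_content("hi")

         binary = EmailMessage()
         binary.set_content(b"\x00\x01", maintype="application",
                            subtype="octet-stream")

         inner = EmailMessage()
         inner["Subject"] = "inner"
         inner.set_content("wrapped")
         wrapper = EmailMessage()
         wrapper.set_content(inner)   # becomes message/rfc822

         multi = EmailMessage()
         multi.make_mixed()

         text_value = raw_data_manager.get_content(text)       # str
         bytes_value = raw_data_manager.get_content(binary)    # bytes
         message_value = raw_data_manager.get_content(wrapper) # message object
         try:
             raw_data_manager.get_content(multi)
             multipart_error = None
         except KeyError as exc:
             multipart_error = exc

         print(type(text_value).__name__, type(bytes_value).__name__)
         print(message_value["Subject"])
         print(repr(multipart_error))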

   .. method:: set_content(msg, <'str'>, subtype="plain", charset='utf-8', \
                           cte=None, \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)
               set_content(msg, <'bytes'>, maintype, subtype, cte="base64", \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)
               set_content(msg, <'Message'>, cte=None, \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)
               set_content(msg, <'list'>, subtype='mixed', \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)

      Add headers and payload to *msg*:

      Add a :mailheader:`Content-Type` header with a ``maintype/subtype``
      value.

      * For ``str``, set the MIME ``maintype`` to ``text``, and set the
        subtype to *subtype* if it is specified, or ``plain`` if it is not.
      * For ``bytes``, use the specified *maintype* and *subtype*, or
        raise a :exc:`TypeError` if they are not specified.
      * For :class:`~email.message.Message` objects, set the maintype to
        ``message``, and set the subtype to *subtype* if it is specified
        or ``rfc822`` if it is not. If *subtype* is ``partial``, raise an
        error (``bytes`` objects must be used to construct
        ``message/partial`` parts).
      * For *<'list'>*, which should be a list of
        :class:`~email.message.Message` objects, set the ``maintype`` to
        ``multipart``, and the ``subtype`` to *subtype* if it is
        specified, and ``mixed`` if it is not. If the message parts in
        the *<'list'>* have :mailheader:`MIME-Version` headers, remove
        them.

      If *charset* is provided (which is valid only for ``str``), encode the
      string to bytes using the specified character set. The default is
      ``utf-8``. If the specified *charset* is a known alias for a standard
      MIME charset name, use the standard charset instead.

      If *cte* is set, encode the payload using the specified content transfer
      encoding, and set the :mailheader:`Content-Transfer-Encoding` header to
      that value. For ``str`` objects, if it is not set use heuristics to
      determine the most compact encoding. Possible values for *cte* are
      ``quoted-printable``, ``base64``, ``7bit``, ``8bit``, and ``binary``.
      If the input cannot be encoded in the specified encoding (e.g. ``7bit``),
      raise a :exc:`ValueError`. For :class:`~email.message.Message`, per
      :rfc:`2046`, raise an error if a *cte* of ``quoted-printable`` or
      ``base64`` is requested for *subtype* ``rfc822``, and for any *cte*
      other than ``7bit`` for *subtype* ``external-body``. For
      ``message/rfc822``, use ``8bit`` if *cte* is not specified. For all
      other values of *subtype*, use ``7bit``.

      .. note:: A *cte* of ``binary`` does not actually work correctly yet.
         The ``Message`` object as modified by ``set_content`` is correct, but
         :class:`~email.generator.BytesGenerator` does not serialize it
         correctly.

      If *disposition* is set, use it as the value of the
      :mailheader:`Content-Disposition` header. If not specified, and
      *filename* is specified, add the header with the value ``attachment``.
      If it is not specified and *filename* is also not specified, do not add
      the header. The only valid values for *disposition* are ``attachment``
      and ``inline``.

      If *filename* is specified, use it as the value of the ``filename``
      parameter of the :mailheader:`Content-Disposition` header. There is no
      default.

      If *cid* is specified, add a :mailheader:`Content-ID` header with
      *cid* as its value.

      If *params* is specified, iterate its ``items`` method and use the
      resulting ``(key, value)`` pairs to set additional parameters on the
      :mailheader:`Content-Type` header.

      If *headers* is specified and is a list of strings of the form
      ``headername: headervalue`` or a list of ``header`` objects
      (distinguished from strings by having a ``name`` attribute), add the
      headers to *msg*.
434 headers to *msg*.