:mod:`email.contentmanager`: Managing MIME Content
--------------------------------------------------

.. module:: email.contentmanager
   :synopsis: Storing and Retrieving Content from MIME Parts

.. moduleauthor:: R. David Murray <rdmurray@bitdance.com>
.. sectionauthor:: R. David Murray <rdmurray@bitdance.com>

.. versionadded:: 3.4
   as a :term:`provisional module <provisional package>`.

**Source code:** :source:`Lib/email/contentmanager.py`

.. note::

   The contentmanager module has been included in the standard library on a
   :term:`provisional basis <provisional package>`. Backwards incompatible
   changes (up to and including removal of the module) may occur if deemed
   necessary by the core developers.

--------------

The :mod:`~email.message` module provides a class that can represent an
arbitrary email message. That basic message model has a useful and flexible
API, but it provides only a lower-level API for interacting with the generic
parts of a message (the headers, generic header parameters, and the payload,
which may be a list of sub-parts). This module provides classes and tools
that offer an enhanced and extensible API for dealing with various specific
types of content, including the ability to retrieve the content of the message
as a specialized object type rather than as a simple bytes object. The module
automatically takes care of the RFC-specified MIME details (required headers
and parameters, etc.) for certain common content types and content properties,
and support for additional types can be added by an application using the
extension mechanisms.

This module defines the eponymous "Content Manager" classes. The base
:class:`.ContentManager` class defines an API for registering content
management functions which extract data from ``Message`` objects or insert data
and headers into ``Message`` objects, thus providing a way of converting
between ``Message`` objects containing data and other representations of that
data (Python data types, specialized Python objects, external files, etc.). The
module also defines one concrete content manager: :data:`raw_data_manager`
converts between MIME content types and ``str`` or ``bytes`` data. It also
provides a convenient API for managing the MIME parameters when inserting
content into ``Message``\ s, and it handles inserting and extracting
``Message`` objects when dealing with the ``message/rfc822`` content type.

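As an illustration of the conversion described above, the following minimal
sketch stores a ``str`` and a ``bytes`` object in two messages and reads them
back. It assumes the default policy (and therefore :data:`raw_data_manager`);
the variable names and sample data are purely illustrative.

```python
from email.message import EmailMessage

# Text: a str goes in, and a decoded str comes back out.
text_msg = EmailMessage()
text_msg.set_content("Grüße aus Köln\n")
text_type = text_msg.get_content_type()
text_back = text_msg.get_content()

# Binary data: bytes need an explicit maintype and subtype.
data = bytes(range(8))
bin_msg = EmailMessage()
bin_msg.set_content(data, maintype="application", subtype="octet-stream")
bin_type = bin_msg.get_content_type()
bin_back = bin_msg.get_content()

print(text_type, repr(text_back))
print(bin_type, repr(bin_back))
```

In both cases the application never touches the charset or the content
transfer encoding directly; the content manager chooses and records them.
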
Another part of the enhanced interface is subclasses of
:class:`~email.message.Message` that provide new convenience API functions,
including convenience methods for calling the Content Managers derived from
this module.

.. note::

   Although :class:`.EmailMessage` and :class:`.MIMEPart` are currently
   documented in this module because of the provisional nature of the code, the
   implementation lives in the :mod:`email.message` module.

.. currentmodule:: email.message

.. class:: EmailMessage(policy=default)

   If *policy* is specified (it must be an instance of a :mod:`~email.policy`
   class) use the rules it specifies to update and serialize the representation
   of the message. If *policy* is not set, use the
   :class:`~email.policy.default` policy, which follows the rules of the email
   RFCs except for line endings (instead of the RFC mandated ``\r\n``, it uses
   the Python standard ``\n`` line endings). For more information see the
   :mod:`~email.policy` documentation.

   This class is a subclass of :class:`~email.message.Message`. It adds
   the following methods:


   .. method:: is_attachment()

      Return ``True`` if there is a :mailheader:`Content-Disposition` header
      and its (case insensitive) value is ``attachment``, ``False`` otherwise.

      .. versionchanged:: 3.4.2
         is_attachment is now a method instead of a property, for consistency
         with :meth:`~email.message.Message.is_multipart`.


   .. method:: get_body(preferencelist=('related', 'html', 'plain'))

      Return the MIME part that is the best candidate to be the "body" of the
      message.

      *preferencelist* must be a sequence of strings from the set ``related``,
      ``html``, and ``plain``, and indicates the order of preference for the
      content type of the part returned.

      Start looking for candidate matches with the object on which the
      ``get_body`` method is called.

      If ``related`` is not included in *preferencelist*, consider the root
      part (or subpart of the root part) of any ``multipart/related``
      encountered as a candidate if the (sub-)part matches a preference.

      When encountering a ``multipart/related``, check the ``start`` parameter
      and if a part with a matching :mailheader:`Content-ID` is found, consider
      only it when looking for candidate matches. Otherwise consider only the
      first (default root) part of the ``multipart/related``.

      If a part has a :mailheader:`Content-Disposition` header, only consider
      the part a candidate match if the value of the header is ``inline``.

      If none of the candidates matches any of the preferences in
      *preferencelist*, return ``None``.

      Notes: (1) For most applications the only *preferencelist* combinations
      that really make sense are ``('plain',)``, ``('html', 'plain')``, and the
      default, ``('related', 'html', 'plain')``. (2) Because matching starts
      with the object on which ``get_body`` is called, calling ``get_body`` on
      a ``multipart/related`` will return the object itself unless
      *preferencelist* has a non-default value. (3) Messages (or message parts)
      that do not specify a :mailheader:`Content-Type` or whose
      :mailheader:`Content-Type` header is invalid will be treated as if they
      are of type ``text/plain``, which may occasionally cause ``get_body`` to
      return unexpected results.


   .. method:: iter_attachments()

      Return an iterator over all of the parts of the message that are not
      candidate "body" parts. That is, skip the first occurrence of each of
      ``text/plain``, ``text/html``, ``multipart/related``, or
      ``multipart/alternative`` (unless they are explicitly marked as
      attachments via :mailheader:`Content-Disposition: attachment`), and
      return all remaining parts. When applied directly to a
      ``multipart/related``, return an iterator over all the related parts
      except the root part (i.e. the part pointed to by the ``start`` parameter,
      or the first part if there is no ``start`` parameter or the ``start``
      parameter doesn't match the :mailheader:`Content-ID` of any of the
      parts). When applied directly to a ``multipart/alternative`` or a
      non-``multipart``, return an empty iterator.


   .. method:: iter_parts()

      Return an iterator over all of the immediate sub-parts of the message,
      which will be empty for a non-``multipart``. (See also
      :meth:`~email.message.Message.walk`.)


   .. method:: get_content(*args, content_manager=None, **kw)

      Call the ``get_content`` method of the *content_manager*, passing self
      as the message object, and passing along any other arguments or keywords
      as additional arguments. If *content_manager* is not specified, use
      the ``content_manager`` specified by the current :mod:`~email.policy`.


   .. method:: set_content(*args, content_manager=None, **kw)

      Call the ``set_content`` method of the *content_manager*, passing self
      as the message object, and passing along any other arguments or keywords
      as additional arguments. If *content_manager* is not specified, use
      the ``content_manager`` specified by the current :mod:`~email.policy`.


   .. method:: make_related(boundary=None)

      Convert a non-``multipart`` message into a ``multipart/related`` message,
      moving any existing :mailheader:`Content-` headers and payload into a
      (new) first part of the ``multipart``. If *boundary* is specified, use
      it as the boundary string in the multipart, otherwise leave the boundary
      to be automatically created when it is needed (for example, when the
      message is serialized).


   .. method:: make_alternative(boundary=None)

      Convert a non-``multipart`` or a ``multipart/related`` into a
      ``multipart/alternative``, moving any existing :mailheader:`Content-`
      headers and payload into a (new) first part of the ``multipart``. If
      *boundary* is specified, use it as the boundary string in the multipart,
      otherwise leave the boundary to be automatically created when it is
      needed (for example, when the message is serialized).


   .. method:: make_mixed(boundary=None)

      Convert a non-``multipart``, a ``multipart/related``, or a
      ``multipart/alternative`` into a ``multipart/mixed``, moving any existing
      :mailheader:`Content-` headers and payload into a (new) first part of the
      ``multipart``. If *boundary* is specified, use it as the boundary string
      in the multipart, otherwise leave the boundary to be automatically
      created when it is needed (for example, when the message is serialized).


   .. method:: add_related(*args, content_manager=None, **kw)

      If the message is a ``multipart/related``, create a new message
      object, pass all of the arguments to its :meth:`set_content` method,
      and :meth:`~email.message.Message.attach` it to the ``multipart``. If
      the message is a non-``multipart``, call :meth:`make_related` and then
      proceed as above. If the message is any other type of ``multipart``,
      raise a :exc:`TypeError`. If *content_manager* is not specified, use
      the ``content_manager`` specified by the current :mod:`~email.policy`.
      If the added part has no :mailheader:`Content-Disposition` header,
      add one with the value ``inline``.


   .. method:: add_alternative(*args, content_manager=None, **kw)

      If the message is a ``multipart/alternative``, create a new message
      object, pass all of the arguments to its :meth:`set_content` method, and
      :meth:`~email.message.Message.attach` it to the ``multipart``. If the
      message is a non-``multipart`` or ``multipart/related``, call
      :meth:`make_alternative` and then proceed as above. If the message is
      any other type of ``multipart``, raise a :exc:`TypeError`. If
      *content_manager* is not specified, use the ``content_manager`` specified
      by the current :mod:`~email.policy`.


   .. method:: add_attachment(*args, content_manager=None, **kw)

      If the message is a ``multipart/mixed``, create a new message object,
      pass all of the arguments to its :meth:`set_content` method, and
      :meth:`~email.message.Message.attach` it to the ``multipart``. If the
      message is a non-``multipart``, ``multipart/related``, or
      ``multipart/alternative``, call :meth:`make_mixed` and then proceed as
      above. If *content_manager* is not specified, use the ``content_manager``
      specified by the current :mod:`~email.policy`. If the added part
      has no :mailheader:`Content-Disposition` header, add one with the value
      ``attachment``. This method can be used both for explicit attachments
      (:mailheader:`Content-Disposition: attachment`) and ``inline`` attachments
      (:mailheader:`Content-Disposition: inline`), by passing appropriate
      options to the ``content_manager``.


   .. method:: clear()

      Remove the payload and all of the headers.


   .. method:: clear_content()

      Remove the payload and all of the :mailheader:`Content-` headers, leaving
      all other headers intact and in their original order.


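The :class:`EmailMessage` methods above are typically used together. The
following is a small sketch, assuming the default policy, of building a message
with a plain text body, an HTML alternative, and one attachment, and then
locating those pieces again. The subject, file name, and payloads are made up
for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Report"
msg.set_content("Plain text body\n")                       # text/plain
msg.add_alternative("<p>HTML body</p>\n", subtype="html")  # becomes multipart/alternative
msg.add_attachment(b"\x00\x01\x02", maintype="application",
                   subtype="octet-stream",
                   filename="data.bin")                    # becomes multipart/mixed

top_type = msg.get_content_type()

# The default preference list prefers HTML over plain text.
html_body = msg.get_body()
plain_body = msg.get_body(preferencelist=("plain",))

# Everything that is not a candidate body part is reported as an attachment.
attachments = list(msg.iter_attachments())

print(top_type)
print(html_body.get_content_type(), plain_body.get_content_type())
print([part.get_filename() for part in attachments])
```

Note that each ``add_`` call converts the message to the required
``multipart`` type on demand, so the caller does not need to call the
``make_`` methods explicitly.
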
.. class:: MIMEPart(policy=default)

   This class represents a subpart of a MIME message. It is identical to
   :class:`EmailMessage`, except that no :mailheader:`MIME-Version` headers are
   added when :meth:`~EmailMessage.set_content` is called, since sub-parts do
   not need their own :mailheader:`MIME-Version` headers.


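The difference between the two classes can be observed directly. This short
sketch, which assumes the default policy and uses illustrative variable names,
sets the same kind of content on both and checks for the
:mailheader:`MIME-Version` header.

```python
from email.message import EmailMessage, MIMEPart

top = EmailMessage()
top.set_content("top-level text\n")

part = MIMEPart()
part.set_content("sub-part text\n")

# Only the top-level message class adds a MIME-Version header.
top_has_version = "MIME-Version" in top
part_has_version = "MIME-Version" in part

print(top_has_version, part_has_version)
```
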
.. currentmodule:: email.contentmanager

.. class:: ContentManager()

   Base class for content managers. Provides the standard registry mechanisms
   to register converters between MIME content and other representations, as
   well as the ``get_content`` and ``set_content`` dispatch methods.


   .. method:: get_content(msg, *args, **kw)

      Look up a handler function based on the ``mimetype`` of *msg* (see next
      paragraph), call it, passing through all arguments, and return the result
      of the call. The expectation is that the handler will extract the
      payload from *msg* and return an object that encodes information about
      the extracted data.

      To find the handler, look for the following keys in the registry,
      stopping with the first one found:

          * the string representing the full MIME type (``maintype/subtype``)
          * the string representing the ``maintype``
          * the empty string

      If none of these keys produce a handler, raise a :exc:`KeyError` for the
      full MIME type.


   .. method:: set_content(msg, obj, *args, **kw)

      If the ``maintype`` of *msg* is ``multipart``, raise a :exc:`TypeError`;
      otherwise look up a handler function based on the type of *obj* (see next
      paragraph), call :meth:`~email.message.EmailMessage.clear_content` on the
      *msg*, and call the handler function, passing through all arguments. The
      expectation is that the handler will transform and store *obj* into
      *msg*, possibly making other changes to *msg* as well, such as adding
      various MIME headers to encode information needed to interpret the stored
      data.

      To find the handler, obtain the type of *obj* (``typ = type(obj)``), and
      look for the following keys in the registry, stopping with the first one
      found:

          * the type itself (``typ``)
          * the type's fully qualified name (``typ.__module__ + '.' +
            typ.__qualname__``)
          * the type's qualname (``typ.__qualname__``)
          * the type's name (``typ.__name__``)

      If none of the above match, repeat all of the checks above for each of
      the types in the :term:`MRO` (``typ.__mro__``). Finally, if no other key
      yields a handler, check for a handler for the key ``None``. If there is
      no handler for ``None``, raise a :exc:`KeyError` for the fully
      qualified name of the type.

      Also add a :mailheader:`MIME-Version` header if one is not present (see
      also :class:`.MIMEPart`).


   .. method:: add_get_handler(key, handler)

      Record the function *handler* as the handler for *key*. For the possible
      values of *key*, see :meth:`get_content`.


   .. method:: add_set_handler(typekey, handler)

      Record *handler* as the function to call when an object of a type
      matching *typekey* is passed to :meth:`set_content`. For the possible
      values of *typekey*, see :meth:`set_content`.


Content Manager Instances
~~~~~~~~~~~~~~~~~~~~~~~~~

Currently the email package provides only one concrete content manager,
:data:`raw_data_manager`, although more may be added in the future.
:data:`raw_data_manager` is the
:attr:`~email.policy.EmailPolicy.content_manager` provided by
:class:`~email.policy.EmailPolicy` and its derivatives.


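A quick way to confirm which content manager a policy supplies is to inspect
its ``content_manager`` attribute. The sketch below checks two of the standard
:class:`~email.policy.EmailPolicy` instances and shows that omitting
*content_manager* gives the same result as naming :data:`raw_data_manager`
explicitly; the variable names are illustrative.

```python
from email import policy
from email.contentmanager import raw_data_manager
from email.message import EmailMessage

default_is_raw = policy.default.content_manager is raw_data_manager
smtp_is_raw = policy.SMTP.content_manager is raw_data_manager

msg = EmailMessage(policy=policy.SMTP)
msg.set_content("hello\n")
implicit = msg.get_content()                                  # manager from the policy
explicit = msg.get_content(content_manager=raw_data_manager)  # manager named directly

print(default_is_raw, smtp_is_raw, implicit == explicit)
```
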
.. data:: raw_data_manager

   This content manager provides only a minimum interface beyond that provided
   by :class:`~email.message.Message` itself: it deals only with text, raw
   byte strings, and :class:`~email.message.Message` objects. Nevertheless, it
   provides significant advantages compared to the base API: ``get_content`` on
   a text part will return a unicode string without the application needing to
   manually decode it, ``set_content`` provides a rich set of options for
   controlling the headers added to a part and controlling the content transfer
   encoding, and it enables the use of the various ``add_`` methods, thereby
   simplifying the creation of multipart messages.

   .. method:: get_content(msg, errors='replace')

      Return the payload of the part as either a string (for ``text`` parts), an
      :class:`~email.message.EmailMessage` object (for ``message/rfc822``
      parts), or a ``bytes`` object (for all other non-multipart types). Raise
      a :exc:`KeyError` if called on a ``multipart``. If the part is a
      ``text`` part and *errors* is specified, use it as the error handler when
      decoding the payload to unicode. The default error handler is
      ``replace``.

   .. method:: set_content(msg, <'str'>, subtype="plain", charset='utf-8', \
                           cte=None, \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)
               set_content(msg, <'bytes'>, maintype, subtype, cte="base64", \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)
               set_content(msg, <'Message'>, cte=None, \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)
               set_content(msg, <'list'>, subtype='mixed', \
                           disposition=None, filename=None, cid=None, \
                           params=None, headers=None)

      Add headers and payload to *msg*:

      Add a :mailheader:`Content-Type` header with a ``maintype/subtype``
      value.

          * For ``str``, set the MIME ``maintype`` to ``text``, and set the
            subtype to *subtype* if it is specified, or ``plain`` if it is not.
          * For ``bytes``, use the specified *maintype* and *subtype*, or
            raise a :exc:`TypeError` if they are not specified.
          * For :class:`~email.message.Message` objects, set the maintype to
            ``message``, and set the subtype to *subtype* if it is specified
            or ``rfc822`` if it is not. If *subtype* is ``partial``, raise an
            error (``bytes`` objects must be used to construct
            ``message/partial`` parts).
          * For *<'list'>*, which should be a list of
            :class:`~email.message.Message` objects, set the ``maintype`` to
            ``multipart``, and the ``subtype`` to *subtype* if it is
            specified, and ``mixed`` if it is not. If the message parts in
            the *<'list'>* have :mailheader:`MIME-Version` headers, remove
            them.

      If *charset* is provided (which is valid only for ``str``), encode the
      string to bytes using the specified character set. The default is
      ``utf-8``. If the specified *charset* is a known alias for a standard
      MIME charset name, use the standard charset instead.

      If *cte* is set, encode the payload using the specified content transfer
      encoding, and set the :mailheader:`Content-Transfer-Encoding` header to
      that value. For ``str`` objects, if it is not set use heuristics to
      determine the most compact encoding. Possible values for *cte* are
      ``quoted-printable``, ``base64``, ``7bit``, ``8bit``, and ``binary``.
      If the input cannot be encoded in the specified encoding (e.g. ``7bit``),
      raise a :exc:`ValueError`. For :class:`~email.message.Message`, per
      :rfc:`2046`, raise an error if a *cte* of ``quoted-printable`` or
      ``base64`` is requested for *subtype* ``rfc822``, and for any *cte*
      other than ``7bit`` for *subtype* ``external-body``. For
      ``message/rfc822``, use ``8bit`` if *cte* is not specified. For all
      other values of *subtype*, use ``7bit``.

      .. note:: A *cte* of ``binary`` does not actually work correctly yet.
         The ``Message`` object as modified by ``set_content`` is correct, but
         :class:`~email.generator.BytesGenerator` does not serialize it
         correctly.

      If *disposition* is set, use it as the value of the
      :mailheader:`Content-Disposition` header. If not specified, and
      *filename* is specified, add the header with the value ``attachment``.
      If it is not specified and *filename* is also not specified, do not add
      the header. The only valid values for *disposition* are ``attachment``
      and ``inline``.

      If *filename* is specified, use it as the value of the ``filename``
      parameter of the :mailheader:`Content-Disposition` header. There is no
      default.

      If *cid* is specified, add a :mailheader:`Content-ID` header with
      *cid* as its value.

      If *params* is specified, iterate its ``items`` method and use the
      resulting ``(key, value)`` pairs to set additional parameters on the
      :mailheader:`Content-Type` header.

      If *headers* is specified and is a list of strings of the form
      ``headername: headervalue`` or a list of ``header`` objects
      (distinguished from strings by having a ``name`` attribute), add the
      headers to *msg*.
437 headers to *msg*.