| 1 | :mod:`email.contentmanager`: Managing MIME Content |
| 2 | -------------------------------------------------- |
| 3 | |
| 4 | .. module:: email.contentmanager |
| 5 | :synopsis: Storing and Retrieving Content from MIME Parts |
| 6 | |
| 7 | .. moduleauthor:: R. David Murray <rdmurray@bitdance.com> |
| 8 | .. sectionauthor:: R. David Murray <rdmurray@bitdance.com> |
| 9 | |
| 10 | |
| 11 | .. note:: |
| 12 | |
| 13 | The contentmanager module has been included in the standard library on a |
| 14 | :term:`provisional basis <provisional package>`. Backwards incompatible |
| 15 | changes (up to and including removal of the module) may occur if deemed |
| 16 | necessary by the core developers. |
| 17 | |
| 18 | .. versionadded:: 3.4 |
| 19 | as a :term:`provisional module <provisional package>`. |
| 20 | |
| 21 | The :mod:`~email.message` module provides a class that can represent an |
| 22 | arbitrary email message. That basic message model has a useful and flexible |
| 23 | API, but it provides only a lower-level API for interacting with the generic |
| 24 | parts of a message (the headers, generic header parameters, and the payload, |
| 25 | which may be a list of sub-parts). This module provides classes and tools |
| 26 | that offer an enhanced and extensible API for dealing with various specific |
| 27 | types of content, including the ability to retrieve the content of the message |
| 28 | as a specialized object type rather than as a simple bytes object. The module |
| 29 | automatically takes care of the RFC-specified MIME details (required headers |
| 30 | and parameters, etc.) for certain common content types, |
| 31 | and support for additional types can be added by an application using the |
| 32 | extension mechanisms. |
| 33 | |
| 34 | This module defines the eponymous "Content Manager" classes. The base |
| 35 | :class:`.ContentManager` class defines an API for registering content |
| 36 | management functions which extract data from ``Message`` objects or insert data |
| 37 | and headers into ``Message`` objects, thus providing a way of converting |
| 38 | between ``Message`` objects containing data and other representations of that |
| 39 | data (Python data types, specialized Python objects, external files, etc.). The |
| 40 | module also defines one concrete content manager: :data:`raw_data_manager` |
| 41 | converts between MIME content types and ``str`` or ``bytes`` data. It |
| 42 | provides a convenient API for managing the MIME parameters when inserting |
| 43 | content into ``Message``\ s, and it handles inserting and extracting |
| 44 | ``Message`` objects when dealing with the ``message/rfc822`` content type. |
| 45 | |
| 46 | Another part of the enhanced interface is subclasses of |
| 47 | :class:`~email.message.Message` that provide new convenience API functions, |
| 48 | including convenience methods for calling the Content Managers derived from |
| 49 | this module. |
| 50 | |
| 51 | .. note:: |
| 52 | |
| 53 | Although :class:`.EmailMessage` and :class:`.MIMEPart` are currently |
| 54 | documented in this module because of the provisional nature of the code, the |
| 55 | implementation lives in the :mod:`email.message` module. |
| 56 | |
| 57 | |
| 58 | .. class:: EmailMessage(policy=default) |
| 59 | |
| 60 | If *policy* is specified (it must be an instance of a :mod:`~email.policy` |
| 61 | class) use the rules it specifies to update and serialize the representation |
| 62 | of the message. If *policy* is not set, use the |
| 63 | :data:`~email.policy.default` policy, which follows the rules of the email |
| 64 | RFCs except for line endings (instead of the RFC mandated ``\r\n``, it uses |
| 65 | the Python standard ``\n`` line endings). For more information see the |
| 66 | :mod:`~email.policy` documentation. |
| 67 | |
| 68 | This class is a subclass of :class:`~email.message.Message`. It adds |
| 69 | the following methods: |
| 70 | |
| 71 | |
| 72 | .. attribute:: is_attachment |
| 73 | |
| 74 | Set to ``True`` if there is a :mailheader:`Content-Disposition` header |
| 75 | and its (case insensitive) value is ``attachment``, ``False`` otherwise. |
| 76 | |
| 77 | |
| 78 | .. method:: get_body(preferencelist=('related', 'html', 'plain')) |
| 79 | |
| 80 | Return the MIME part that is the best candidate to be the "body" of the |
| 81 | message. |
| 82 | |
| 83 | *preferencelist* must be a sequence of strings from the set ``related``, |
| 84 | ``html``, and ``plain``, and indicates the order of preference for the |
| 85 | content type of the part returned. |
| 86 | |
| 87 | Start looking for candidate matches with the object on which the |
| 88 | ``get_body`` method is called. |
| 89 | |
| 90 | If ``related`` is not included in *preferencelist*, consider the root |
| 91 | part (or subpart of the root part) of any related encountered as a |
| 92 | candidate if the (sub-)part matches a preference. |
| 93 | |
| 94 | When encountering a ``multipart/related``, check the ``start`` parameter |
| 95 | and if a part with a matching :mailheader:`Content-ID` is found, consider |
| 96 | only it when looking for candidate matches. Otherwise consider only the |
| 97 | first (default root) part of the ``multipart/related``. |
| 98 | |
| 99 | If a part has a :mailheader:`Content-Disposition` header, only consider |
| 100 | the part a candidate match if the value of the header is ``inline``. |
| 101 | |
| 102 | If none of the candidates matches any of the preferences in |
| 103 | *preferencelist*, return ``None``. |
| 104 | |
| 105 | Notes: (1) For most applications the only *preferencelist* combinations |
| 106 | that really make sense are ``('plain',)``, ``('html', 'plain')``, and the |
| 107 | default, ``('related', 'html', 'plain')``. (2) Because matching starts |
| 108 | with the object on which ``get_body`` is called, calling ``get_body`` on |
| 109 | a ``multipart/related`` will return the object itself unless |
| 110 | *preferencelist* has a non-default value. (3) Messages (or message parts) |
| 111 | that do not specify a :mailheader:`Content-Type` or whose |
| 112 | :mailheader:`Content-Type` header is invalid will be treated as if they |
| 113 | are of type ``text/plain``, which may occasionally cause ``get_body`` to |
| 114 | return unexpected results. |
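The selection rules above can be illustrated with a small sketch. It assumes a Python version where :class:`EmailMessage` is importable from :mod:`email.message` (where the implementation lives); the subject and body strings are made up for the example.

```python
from email.message import EmailMessage

# Build a message with a plain-text body and an HTML alternative.
msg = EmailMessage()
msg["Subject"] = "Greeting"
msg.set_content("Hello in plain text.")
msg.add_alternative("<p>Hello in <b>HTML</b>.</p>", subtype="html")

# The default preference order is ('related', 'html', 'plain').
default_type = msg.get_body().get_content_type()
plain_type = msg.get_body(preferencelist=("plain",)).get_content_type()

# No multipart/related part exists, so nothing matches this preference.
no_match = msg.get_body(preferencelist=("related",))
```

Here the default call should pick the HTML part, the ``('plain',)`` call the plain-text part, and the ``('related',)`` call should find no candidate at all.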
| 115 | |
| 116 | |
| 117 | .. method:: iter_attachments() |
| 118 | |
| 119 | Return an iterator over all of the parts of the message that are not |
| 120 | candidate "body" parts. That is, skip the first occurrence of each of |
| 121 | ``text/plain``, ``text/html``, ``multipart/related``, or |
| 122 | ``multipart/alternative`` (unless they are explicitly marked as |
| 123 | attachments via :mailheader:`Content-Disposition: attachment`), and |
| 124 | return all remaining parts. When applied directly to a |
| 125 | ``multipart/related``, return an iterator over all the related parts |
| 126 | except the root part (i.e., the part pointed to by the ``start`` parameter, |
| 127 | or the first part if there is no ``start`` parameter or the ``start`` |
| 128 | parameter doesn't match the :mailheader:`Content-ID` of any of the |
| 129 | parts). When applied directly to a ``multipart/alternative`` or a |
| 130 | non-``multipart``, return an empty iterator. |
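As a rough illustration of the skipping behavior, the sketch below builds a message with one body part and two attachments. The file names and payloads are placeholders invented for the example, not real files.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("See the attached files.")
# Placeholder payloads; a real application would read these from disk.
msg.add_attachment(b"%PDF-1.4 placeholder", maintype="application",
                   subtype="pdf", filename="report.pdf")
msg.add_attachment("a,b\n1,2\n", subtype="csv", filename="data.csv")

# The first text/plain part is the body candidate and is skipped.
attachment_names = [part.get_filename() for part in msg.iter_attachments()]
attachment_types = [part.get_content_type() for part in msg.iter_attachments()]
```

Note that the ``text/csv`` part is returned even though its maintype is ``text``, because only ``text/plain`` and ``text/html`` count as body candidates.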
| 131 | |
| 132 | |
| 133 | .. method:: iter_parts() |
| 134 | |
| 135 | Return an iterator over all of the immediate sub-parts of the message, |
| 136 | which will be empty for a non-``multipart``. (See also |
| 137 | :meth:`~email.message.Message.walk`.) |
| 138 | |
| 139 | |
| 140 | .. method:: get_content(*args, content_manager=None, **kw) |
| 141 | |
| 142 | Call the ``get_content`` method of the *content_manager*, passing self |
| 143 | as the message object, and passing along any other arguments or keywords |
| 144 | as additional arguments. If *content_manager* is not specified, use |
| 145 | the ``content_manager`` specified by the current :mod:`~email.policy`. |
| 146 | |
| 147 | |
| 148 | .. method:: set_content(*args, content_manager=None, **kw) |
| 149 | |
| 150 | Call the ``set_content`` method of the *content_manager*, passing self |
| 151 | as the message object, and passing along any other arguments or keywords |
| 152 | as additional arguments. If *content_manager* is not specified, use |
| 153 | the ``content_manager`` specified by the current :mod:`~email.policy`. |
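A minimal round trip through these two methods might look like the following. It assumes the default policy, whose content manager is :data:`raw_data_manager`; the text is arbitrary.

```python
from email.contentmanager import raw_data_manager
from email.message import EmailMessage

msg = EmailMessage()
# No content_manager given, so the policy's content manager is used.
msg.set_content("Caf\u00e9 au lait")
text = msg.get_content()

# Passing the manager explicitly should give the same result here.
same_text = msg.get_content(content_manager=raw_data_manager)
charset = msg.get_content_charset()
```

The returned string is already decoded to ``str``; the caller does not need to inspect the charset or the content transfer encoding.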
| 154 | |
| 155 | |
| 156 | .. method:: make_related(boundary=None) |
| 157 | |
| 158 | Convert a non-``multipart`` message into a ``multipart/related`` message, |
| 159 | moving any existing :mailheader:`Content-` headers and payload into a |
| 160 | (new) first part of the ``multipart``. If *boundary* is specified, use |
| 161 | it as the boundary string in the multipart, otherwise leave the boundary |
| 162 | to be automatically created when it is needed (for example, when the |
| 163 | message is serialized). |
| 164 | |
| 165 | |
| 166 | .. method:: make_alternative(boundary=None) |
| 167 | |
| 168 | Convert a non-``multipart`` or a ``multipart/related`` into a |
| 169 | ``multipart/alternative``, moving any existing :mailheader:`Content-` |
| 170 | headers and payload into a (new) first part of the ``multipart``. If |
| 171 | *boundary* is specified, use it as the boundary string in the multipart, |
| 172 | otherwise leave the boundary to be automatically created when it is |
| 173 | needed (for example, when the message is serialized). |
| 174 | |
| 175 | |
| 176 | .. method:: make_mixed(boundary=None) |
| 177 | |
| 178 | Convert a non-``multipart``, a ``multipart/related``, or a |
| 179 | ``multipart/alternative`` into a ``multipart/mixed``, moving any existing |
| 180 | :mailheader:`Content-` headers and payload into a (new) first part of the |
| 181 | ``multipart``. If *boundary* is specified, use it as the boundary string |
| 182 | in the multipart, otherwise leave the boundary to be automatically |
| 183 | created when it is needed (for example, when the message is serialized). |
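The three ``make_`` methods follow the same pattern. The sketch below uses ``make_mixed`` with an explicit boundary string chosen only for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Report"
msg.set_content("Body text")

# Convert the simple text message into a multipart/mixed container.
msg.make_mixed(boundary="example-boundary")

outer_type = msg.get_content_type()
boundary = msg.get_boundary()
# The original Content- headers and payload now live in the first sub-part.
part_types = [part.get_content_type() for part in msg.iter_parts()]
subject = msg["Subject"]
```

Headers that do not start with ``Content-``, such as ``Subject``, stay on the outer message.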
| 184 | |
| 185 | |
| 186 | .. method:: add_related(*args, content_manager=None, **kw) |
| 187 | |
| 188 | If the message is a ``multipart/related``, create a new message |
| 189 | object, pass all of the arguments to its :meth:`set_content` method, |
| 190 | and :meth:`~email.message.Message.attach` it to the ``multipart``. If |
| 191 | the message is a non-``multipart``, call :meth:`make_related` and then |
| 192 | proceed as above. If the message is any other type of ``multipart``, |
| 193 | raise a :exc:`TypeError`. If *content_manager* is not specified, use |
| 194 | the ``content_manager`` specified by the current :mod:`~email.policy`. |
| 195 | If the added part has no :mailheader:`Content-Disposition` header, |
| 196 | add one with the value ``inline``. |
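For example, an HTML body can reference an inline image through a ``cid:`` URL. This is only a sketch: the image bytes are a placeholder and the Content-ID value is invented; ``get_content_disposition`` is assumed to be available (Python 3.5 or later).

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content('<p>Logo: <img src="cid:logo@example.invalid"></p>',
                subtype="html")
# Placeholder bytes standing in for real PNG data.
msg.add_related(b"\x89PNG placeholder", maintype="image", subtype="png",
                cid="<logo@example.invalid>")

outer_type = msg.get_content_type()
parts = list(msg.iter_parts())
image = parts[1]
image_disposition = image.get_content_disposition()
image_cid = image["Content-ID"]
```

The original HTML content becomes the root (first) part of the ``multipart/related``, and the image part receives an ``inline`` disposition because none was specified.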
| 197 | |
| 198 | |
| 199 | .. method:: add_alternative(*args, content_manager=None, **kw) |
| 200 | |
| 201 | If the message is a ``multipart/alternative``, create a new message |
| 202 | object, pass all of the arguments to its :meth:`set_content` method, and |
| 203 | :meth:`~email.message.Message.attach` it to the ``multipart``. If the |
| 204 | message is a non-``multipart`` or ``multipart/related``, call |
| 205 | :meth:`make_alternative` and then proceed as above. If the message is |
| 206 | any other type of ``multipart``, raise a :exc:`TypeError`. If |
| 207 | *content_manager* is not specified, use the ``content_manager`` specified |
| 208 | by the current :mod:`~email.policy`. |
| 209 | |
| 210 | |
| 211 | .. method:: add_attachment(*args, content_manager=None, **kw) |
| 212 | |
| 213 | If the message is a ``multipart/mixed``, create a new message object, |
| 214 | pass all of the arguments to its :meth:`set_content` method, and |
| 215 | :meth:`~email.message.Message.attach` it to the ``multipart``. If the |
| 216 | message is a non-``multipart``, ``multipart/related``, or |
| 217 | ``multipart/alternative``, call :meth:`make_mixed` and then proceed as |
| 218 | above. If *content_manager* is not specified, use the ``content_manager`` |
| 219 | specified by the current :mod:`~email.policy`. If the added part |
| 220 | has no :mailheader:`Content-Disposition` header, add one with the value |
| 221 | ``attachment``. This method can be used both for explicit attachments |
| 222 | (:mailheader:`Content-Disposition: attachment`) and ``inline`` attachments |
| 223 | (:mailheader:`Content-Disposition: inline`), by passing appropriate |
| 224 | options to the ``content_manager``. |
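The difference between the default and an explicit *disposition* can be sketched as follows. The payloads and file names are placeholders, and ``get_content_disposition`` is assumed to be available (Python 3.5 or later).

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("Photos attached.")
# No disposition given: the part is marked as an attachment.
msg.add_attachment(b"fake image bytes", maintype="image", subtype="jpeg",
                   filename="a.jpg")
# An explicit disposition is passed through to the content manager.
msg.add_attachment(b"fake image bytes", maintype="image", subtype="jpeg",
                   disposition="inline", filename="b.jpg")

outer_type = msg.get_content_type()
dispositions = [part.get_content_disposition() for part in msg.iter_parts()]
```

The first sub-part is the original text body, which has no :mailheader:`Content-Disposition` header at all.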
| 225 | |
| 226 | |
| 227 | .. method:: clear() |
| 228 | |
| 229 | Remove the payload and all of the headers. |
| 230 | |
| 231 | |
| 232 | .. method:: clear_content() |
| 233 | |
| 234 | Remove the payload and all of the :mailheader:`Content-` headers, leaving |
| 235 | all other headers intact and in their original order. |
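A short sketch contrasting ``clear_content`` with ``clear``; the header values are invented for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "Draft"
msg["To"] = "someone@example.invalid"
msg.set_content("temporary text")

# Drop only the payload and the Content-* headers.
msg.clear_content()
has_content_type = "Content-Type" in msg
subject_after_clear_content = msg["Subject"]
payload_after_clear_content = msg.get_payload()

# Drop everything, including the remaining headers.
msg.clear()
headers_after_clear = msg.keys()
```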
| 236 | |
| 237 | |
| 238 | .. class:: ContentManager() |
| 239 | |
| 240 | Base class for content managers. Provides the standard registry mechanisms |
| 241 | to register converters between MIME content and other representations, as |
| 242 | well as the ``get_content`` and ``set_content`` dispatch methods. |
| 243 | |
| 244 | |
| 245 | .. method:: get_content(msg, *args, **kw) |
| 246 | |
| 247 | Look up a handler function based on the ``mimetype`` of *msg* (see next |
| 248 | paragraph), call it, passing through all arguments, and return the result |
| 249 | of the call. The expectation is that the handler will extract the |
| 250 | payload from *msg* and return an object that encodes information about |
| 251 | the extracted data. |
| 252 | |
| 253 | To find the handler, look for the following keys in the registry, |
| 254 | stopping with the first one found: |
| 255 | |
| 256 | * the string representing the full MIME type (``maintype/subtype``) |
| 257 | * the string representing the ``maintype`` |
| 258 | * the empty string |
| 259 | |
| 260 | If none of these keys produce a handler, raise a :exc:`KeyError` for the |
| 261 | full MIME type. |
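The lookup order can be observed with three trivial handlers. This is a sketch only: the handlers return fixed strings instead of extracting a payload, and ``make_part`` is a hypothetical helper defined just for this example.

```python
from email.contentmanager import ContentManager
from email.message import EmailMessage

manager = ContentManager()
manager.add_get_handler("text/html", lambda msg: "full-type handler")
manager.add_get_handler("text", lambda msg: "maintype handler")
manager.add_get_handler("", lambda msg: "catch-all handler")


def make_part(content_type):
    """Hypothetical helper: an empty part with the given Content-Type."""
    part = EmailMessage()
    part["Content-Type"] = content_type
    return part


results = [manager.get_content(make_part(content_type))
           for content_type in ("text/html", "text/plain", "image/png")]
```

``text/html`` matches on the full type, ``text/plain`` falls back to the ``text`` maintype, and ``image/png`` reaches the empty-string key.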
| 262 | |
| 263 | |
| 264 | .. method:: set_content(msg, obj, *args, **kw) |
| 265 | |
| 266 | If the ``maintype`` is ``multipart``, raise a :exc:`TypeError`; otherwise |
| 267 | look up a handler function based on the type of *obj* (see next |
| 268 | paragraph), call :meth:`~email.message.EmailMessage.clear_content` on the |
| 269 | *msg*, and call the handler function, passing through all arguments. The |
| 270 | expectation is that the handler will transform and store *obj* into |
| 271 | *msg*, possibly making other changes to *msg* as well, such as adding |
| 272 | various MIME headers to encode information needed to interpret the stored |
| 273 | data. |
| 274 | |
| 275 | To find the handler, obtain the type of *obj* (``typ = type(obj)``), and |
| 276 | look for the following keys in the registry, stopping with the first one |
| 277 | found: |
| 278 | |
| 279 | * the type itself (``typ``) |
| 280 | * the type's fully qualified name (``typ.__module__ + '.' + |
| 281 | typ.__qualname__``). |
| 282 | * the type's qualname (``typ.__qualname__``) |
| 283 | * the type's name (``typ.__name__``). |
| 284 | |
| 285 | If none of the above match, repeat all of the checks above for each of |
| 286 | the types in the :term:`MRO` (``typ.__mro__``). Finally, if no other key |
| 287 | yields a handler, check for a handler for the key ``None``. If there is |
| 288 | no handler for ``None``, raise a :exc:`KeyError` for the fully |
| 289 | qualified name of the type. |
| 290 | |
| 291 | Also add a :mailheader:`MIME-Version` header if one is not present (see |
| 292 | also :class:`.MIMEPart`). |
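The type-based lookup can be sketched in the same way. The classes ``Base`` and ``Child`` are hypothetical, and the handlers only record which one was selected rather than storing anything in the message.

```python
from email.contentmanager import ContentManager
from email.message import EmailMessage
from fractions import Fraction


class Base:
    """Hypothetical application type."""


class Child(Base):
    """Subclass, matched through the MRO of its type."""


calls = []
manager = ContentManager()
manager.add_set_handler(Base, lambda msg, obj: calls.append("Base handler"))
manager.add_set_handler("fractions.Fraction",
                        lambda msg, obj: calls.append("name handler"))
manager.add_set_handler(None, lambda msg, obj: calls.append("fallback handler"))

manager.set_content(EmailMessage(), Child())         # found via Child.__mro__
manager.set_content(EmailMessage(), Fraction(1, 2))  # found via qualified name
manager.set_content(EmailMessage(), 42)              # only the None key matches
```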
| 293 | |
| 294 | |
| 295 | .. method:: add_get_handler(key, handler) |
| 296 | |
| 297 | Record the function *handler* as the handler for *key*. For the possible |
| 298 | values of *key*, see :meth:`get_content`. |
| 299 | |
| 300 | |
| 301 | .. method:: add_set_handler(typekey, handler) |
| 302 | |
| 303 | Record *handler* as the function to call when an object of a type |
| 304 | matching *typekey* is passed to :meth:`set_content`. For the possible |
| 305 | values of *typekey*, see :meth:`set_content`. |
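Putting both registration methods together, an application could teach a content manager about JSON. This is a minimal sketch, not part of the email package: ``get_json`` and ``set_json`` are hypothetical handlers, and the choice of ``7bit`` relies on ``json.dumps`` producing ASCII-only output by default.

```python
import json
from email.contentmanager import ContentManager
from email.message import EmailMessage


def get_json(msg):
    """Hypothetical get handler: decode the payload and parse it as JSON."""
    return json.loads(msg.get_payload(decode=True).decode("utf-8"))


def set_json(msg, obj):
    """Hypothetical set handler: store obj as an application/json payload."""
    msg["Content-Type"] = "application/json"
    msg["Content-Transfer-Encoding"] = "7bit"
    msg.set_payload(json.dumps(obj))  # ASCII-only with the default settings


json_manager = ContentManager()
json_manager.add_get_handler("application/json", get_json)
json_manager.add_set_handler(dict, set_json)

msg = EmailMessage()
msg.set_content({"name": "example", "values": [1, 2, 3]},
                content_manager=json_manager)
restored = msg.get_content(content_manager=json_manager)
```

A real application would more likely extend a copy of an existing manager or install its manager through a policy, so that text and binary content keep working alongside the new type.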
| 306 | |
| 307 | |
| 308 | .. class:: MIMEPart(policy=default) |
| 309 | |
| 310 | This class represents a subpart of a MIME message. It is identical to |
| 311 | :class:`EmailMessage`, except that no :mailheader:`MIME-Version` headers are |
| 312 | added when :meth:`~EmailMessage.set_content` is called, since sub-parts do |
| 313 | not need their own :mailheader:`MIME-Version` headers. |
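The difference can be checked directly. The sketch assumes both classes are imported from :mod:`email.message`, where the implementation lives.

```python
from email.message import EmailMessage, MIMEPart

top_level = EmailMessage()
top_level.set_content("top-level message")

sub_part = MIMEPart()
sub_part.set_content("sub-part")

top_has_mime_version = "MIME-Version" in top_level
part_has_mime_version = "MIME-Version" in sub_part
```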
| 314 | |
| 315 | |
| 316 | Content Manager Instances |
| 317 | ~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 318 | |
| 319 | Currently the email package provides only one concrete content manager, |
| 320 | :data:`raw_data_manager`, although more may be added in the future. |
| 321 | :data:`raw_data_manager` is the |
| 322 | :attr:`~email.policy.EmailPolicy.content_manager` provided by |
| 323 | :class:`~email.policy.EmailPolicy` and its derivatives. |
| 324 | |
| 325 | |
| 326 | .. data:: raw_data_manager |
| 327 | |
| 328 | This content manager provides only a minimum interface beyond that provided |
| 329 | by :class:`~email.message.Message` itself: it deals only with text, raw |
| 330 | byte strings, and :class:`~email.message.Message` objects. Nevertheless, it |
| 331 | provides significant advantages compared to the base API: ``get_content`` on |
| 332 | a text part will return a unicode string without the application needing to |
| 333 | manually decode it, ``set_content`` provides a rich set of options for |
| 334 | controlling the headers added to a part and controlling the content transfer |
| 335 | encoding, and it enables the use of the various ``add_`` methods, thereby |
| 336 | simplifying the creation of multipart messages. |
| 337 | |
| 338 | .. method:: get_content(msg, errors='replace') |
| 339 | |
| 340 | Return the payload of the part as either a string (for ``text`` parts), a |
| 341 | :class:`~email.message.EmailMessage` object (for ``message/rfc822`` |
| 342 | parts), or a ``bytes`` object (for all other non-multipart types). Raise |
| 343 | a :exc:`KeyError` if called on a ``multipart``. If the part is a |
| 344 | ``text`` part and *errors* is specified, use it as the error handler when |
| 345 | decoding the payload to unicode. The default error handler is |
| 346 | ``replace``. |
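The return types and the *errors* argument can be sketched as follows. The parts are built with ``set_content`` purely for illustration, and the malformed message is a hand-written byte string that claims ASCII but contains a non-ASCII byte.

```python
from email import message_from_bytes, policy
from email.contentmanager import raw_data_manager
from email.message import EmailMessage

text_part = EmailMessage()
text_part.set_content("hello")
binary_part = EmailMessage()
binary_part.set_content(b"\x00\x01", maintype="application",
                        subtype="octet-stream")
inner = EmailMessage()
inner["Subject"] = "inner"
inner.set_content("inner body")
wrapper = EmailMessage()
wrapper.set_content(inner)  # becomes a message/rfc822 part

text = raw_data_manager.get_content(text_part)      # str
data = raw_data_manager.get_content(binary_part)    # bytes
embedded = raw_data_manager.get_content(wrapper)    # message object

# A part that declares ASCII but contains the byte 0xE9.
bad = message_from_bytes(
    b"Content-Type: text/plain; charset=ascii\n\ncaf\xe9\n",
    policy=policy.default)
replaced = raw_data_manager.get_content(bad)                  # errors='replace'
ignored = raw_data_manager.get_content(bad, errors="ignore")

# Multipart containers have no handler in raw_data_manager.
multi = EmailMessage()
multi.make_mixed()
try:
    raw_data_manager.get_content(multi)
    multipart_rejected = False
except KeyError:
    multipart_rejected = True
```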
| 347 | |
| 348 | .. method:: set_content(msg, <'str'>, subtype="plain", charset='utf-8', \ |
| 349 | cte=None, \ |
| 350 | disposition=None, filename=None, cid=None, \ |
| 351 | params=None, headers=None) |
| 352 | set_content(msg, <'bytes'>, maintype, subtype, cte="base64", \ |
| 353 | disposition=None, filename=None, cid=None, \ |
| 354 | params=None, headers=None) |
| 355 | set_content(msg, <'Message'>, cte=None, \ |
| 356 | disposition=None, filename=None, cid=None, \ |
| 357 | params=None, headers=None) |
| 358 | set_content(msg, <'list'>, subtype='mixed', \ |
| 359 | disposition=None, filename=None, cid=None, \ |
| 360 | params=None, headers=None) |
| 361 | |
| 362 | Add headers and payload to *msg*: |
| 363 | |
| 364 | Add a :mailheader:`Content-Type` header with a ``maintype/subtype`` |
| 365 | value. |
| 366 | |
| 367 | * For ``str``, set the MIME ``maintype`` to ``text``, and set the |
| 368 | subtype to *subtype* if it is specified, or ``plain`` if it is not. |
| 369 | * For ``bytes``, use the specified *maintype* and *subtype*, or |
| 370 | raise a :exc:`TypeError` if they are not specified. |
| 371 | * For :class:`~email.message.Message` objects, set the maintype to |
| 372 | ``message``, and set the subtype to *subtype* if it is specified |
| 373 | or ``rfc822`` if it is not. If *subtype* is ``partial``, raise an |
| 374 | error (``bytes`` objects must be used to construct |
| 375 | ``message/partial`` parts). |
| 376 | * For *<'list'>*, which should be a list of |
| 377 | :class:`~email.message.Message` objects, set the ``maintype`` to |
| 378 | ``multipart``, and the ``subtype`` to *subtype* if it is |
| 379 | specified, or ``mixed`` if it is not. If the message parts in |
| 380 | the *<'list'>* have :mailheader:`MIME-Version` headers, remove |
| 381 | them. |
| 382 | |
| 383 | If *charset* is provided (which is valid only for ``str``), encode the |
| 384 | string to bytes using the specified character set. The default is |
| 385 | ``utf-8``. If the specified *charset* is a known alias for a standard |
| 386 | MIME charset name, use the standard charset instead. |
| 387 | |
| 388 | If *cte* is set, encode the payload using the specified content transfer |
| 389 | encoding, and set the :mailheader:`Content-Transfer-Encoding` header to |
| 390 | that value. For ``str`` objects, if it is not set use heuristics to |
| 391 | determine the most compact encoding. Possible values for *cte* are |
| 392 | ``quoted-printable``, ``base64``, ``7bit``, ``8bit``, and ``binary``. |
| 393 | If the input cannot be encoded in the specified encoding (e.g., ``7bit``), |
| 394 | raise a :exc:`ValueError`. For :class:`~email.message.Message`, per |
| 395 | :rfc:`2046`, raise an error if a *cte* of ``quoted-printable`` or |
| 396 | ``base64`` is requested for *subtype* ``rfc822``, and for any *cte* |
| 397 | other than ``7bit`` for *subtype* ``external-body``. For |
| 398 | ``message/rfc822``, use ``8bit`` if *cte* is not specified. For all |
| 399 | other values of *subtype*, use ``7bit``. |
| 400 | |
| 401 | .. note:: A *cte* of ``binary`` does not actually work correctly yet. |
| 402 | The ``Message`` object as modified by ``set_content`` is correct, but |
| 403 | :class:`~email.generator.BytesGenerator` does not serialize it |
| 404 | correctly. |
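The effect of *cte* on text content can be sketched like this, assuming the default policy. The strings are arbitrary.

```python
from email.message import EmailMessage

# No cte given: a heuristic picks an encoding for the text.
auto = EmailMessage()
auto.set_content("plain ascii text")
auto_cte = auto["Content-Transfer-Encoding"]

# An explicit cte is applied and recorded in the header.
forced = EmailMessage()
forced.set_content("plain ascii text", cte="quoted-printable")
forced_cte = forced["Content-Transfer-Encoding"]

# Non-ASCII text cannot be represented as 7bit.
try:
    EmailMessage().set_content("caf\u00e9", cte="7bit")
    rejected = False
except ValueError:
    rejected = True
```

For short ASCII-only text the heuristic is expected to settle on ``7bit``, since no encoding is needed.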
| 405 | |
| 406 | If *disposition* is set, use it as the value of the |
| 407 | :mailheader:`Content-Disposition` header. If not specified, and |
| 408 | *filename* is specified, add the header with the value ``attachment``. |
| 409 | If it is not specified and *filename* is also not specified, do not add |
| 410 | the header. The only valid values for *disposition* are ``attachment`` |
| 411 | and ``inline``. |
| 412 | |
| 413 | If *filename* is specified, use it as the value of the ``filename`` |
| 414 | parameter of the :mailheader:`Content-Disposition` header. There is no |
| 415 | default. |
| 416 | |
| 417 | If *cid* is specified, add a :mailheader:`Content-ID` header with |
| 418 | *cid* as its value. |
| 419 | |
| 420 | If *params* is specified, iterate its ``items`` method and use the |
| 421 | resulting ``(key, value)`` pairs to set additional parameters on the |
| 422 | :mailheader:`Content-Type` header. |
| 423 | |
| 424 | If *headers* is specified and is a list of strings of the form |
| 425 | ``headername: headervalue`` or a list of ``header`` objects |
| 426 | (distinguished from strings by having a ``name`` attribute), add the |
| 427 | headers to *msg*. |
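The optional arguments can be combined in a single call. The following sketch sets every option on a small CSV text part; the file name, Content-ID, and ``X-Example`` header are invented for illustration.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content(
    "col1,col2\n1,2\n",
    subtype="csv",
    charset="utf-8",
    cte="base64",
    disposition="attachment",
    filename="table.csv",
    cid="<table@example.invalid>",
    params={"header": "present"},   # extra Content-Type parameter
    headers=["X-Example: demo"],    # 'headername: headervalue' strings
)

content_type = msg.get_content_type()
header_param = msg.get_param("header")
filename = msg.get_filename()
cte = msg["Content-Transfer-Encoding"]
extra_header = msg["X-Example"]
round_trip = msg.get_content()
```

Because *filename* is given, the *disposition* argument could have been omitted here; it would then default to ``attachment``.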