:mod:`email`: Representing an email message
-------------------------------------------

.. module:: email.message
   :synopsis: The base class representing email messages.


The central class in the :mod:`email` package is the :class:`Message` class,
imported from the :mod:`email.message` module. It is the base class for the
:mod:`email` object model. :class:`Message` provides the core functionality for
setting and querying header fields, and for accessing message bodies.

Conceptually, a :class:`Message` object consists of *headers* and *payloads*.
Headers are :rfc:`2822` style field names and values where the field name and
value are separated by a colon. The colon is not part of either the field name
or the field value.

Headers are stored and returned in case-preserving form but are matched
case-insensitively. There may also be a single envelope header, also known as
the *Unix-From* header or the ``From_`` header. The payload is either a string
in the case of simple message objects or a list of :class:`Message` objects for
MIME container documents (e.g. :mimetype:`multipart/\*` and
:mimetype:`message/rfc822`).

:class:`Message` objects provide a mapping style interface for accessing the
message headers, and an explicit interface for accessing both the headers and
the payload. They also provide convenience methods for generating a flat text
representation of the message object tree, for accessing commonly used header
parameters, and for recursively walking over the object tree.

Here are the methods of the :class:`Message` class:


.. class:: Message()

   The constructor takes no arguments.


.. method:: Message.as_string([unixfrom])

   Return the entire message flattened as a string. When optional *unixfrom* is
   ``True``, the envelope header is included in the returned string. *unixfrom*
   defaults to ``False``.

   Note that this method is provided as a convenience and may not always format the
   message the way you want. For example, by default it mangles lines that begin
   with ``From``. For more flexibility, instantiate a :class:`Generator` instance
   and use its :meth:`flatten` method directly. For example::

      from cStringIO import StringIO
      from email.generator import Generator
      # msg is an existing Message instance
      fp = StringIO()
      g = Generator(fp, mangle_from_=False, maxheaderlen=60)
      g.flatten(msg)
      text = fp.getvalue()


.. method:: Message.__str__()

   Equivalent to ``as_string(unixfrom=True)``.


.. method:: Message.is_multipart()

   Return ``True`` if the message's payload is a list of sub-\ :class:`Message`
   objects, otherwise return ``False``. When :meth:`is_multipart` returns
   ``False``, the payload should be a string object.


.. method:: Message.set_unixfrom(unixfrom)

   Set the message's envelope header to *unixfrom*, which should be a string.


.. method:: Message.get_unixfrom()

   Return the message's envelope header. Defaults to ``None`` if the envelope
   header was never set.

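As a minimal sketch of the envelope-header methods described above (the
address and date in the ``From_`` line are made-up illustration values), the
following sets the envelope header and flattens the message with and without
it:

```python
from email.message import Message

msg = Message()
msg['Subject'] = 'greetings'
msg.set_payload('Hello.\n')

# No envelope header has been set yet, so this is None.
before = msg.get_unixfrom()

# Hypothetical envelope line, for illustration only.
msg.set_unixfrom('From sender@example.com Sat Jan  1 00:00:00 2000')

with_envelope = msg.as_string(unixfrom=True)   # envelope line comes first
without_envelope = msg.as_string()             # headers only, no envelope
```

Only the flattened text that was requested with ``unixfrom=True`` begins with
the envelope line; the default output starts directly with the headers.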

.. method:: Message.attach(payload)

   Add the given *payload* to the current payload, which must be ``None`` or a list
   of :class:`Message` objects before the call. After the call, the payload will
   always be a list of :class:`Message` objects. If you want to set the payload to
   a scalar object (e.g. a string), use :meth:`set_payload` instead.


.. method:: Message.get_payload([i[, decode]])

   Return a reference to the current payload, which will be a list of
   :class:`Message` objects when :meth:`is_multipart` is ``True``, or a string
   when :meth:`is_multipart` is ``False``. If the payload is a list and you
   mutate the list object, you modify the message's payload in place.

   With optional argument *i*, :meth:`get_payload` will return the *i*-th element
   of the payload, counting from zero, if :meth:`is_multipart` is ``True``. An
   :exc:`IndexError` will be raised if *i* is less than 0 or greater than or equal
   to the number of items in the payload. If the payload is a string (i.e.
   :meth:`is_multipart` is ``False``) and *i* is given, a :exc:`TypeError` is
   raised.

   Optional *decode* is a flag indicating whether the payload should be decoded or
   not, according to the :mailheader:`Content-Transfer-Encoding` header. When
   ``True`` and the message is not a multipart, the payload will be decoded if this
   header's value is ``quoted-printable`` or ``base64``. If some other encoding is
   used, or the :mailheader:`Content-Transfer-Encoding` header is missing, or if
   the payload has bogus base64 data, the payload is returned as-is (undecoded).
   If the message is a multipart and the *decode* flag is ``True``, then ``None``
   is returned. The default for *decode* is ``False``.

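The payload rules above can be sketched as follows. The example is
illustrative only: the variable names are invented, and the decoded value is a
byte string on interpreters that distinguish bytes from text.

```python
from email.message import Message

# A leaf part whose payload is a base64-encoded string.
leaf = Message()
leaf['Content-Transfer-Encoding'] = 'base64'
leaf.set_payload('aGVsbG8=')              # base64 for "hello"

raw = leaf.get_payload()                  # the string exactly as stored
decoded = leaf.get_payload(decode=True)   # decoded per Content-Transfer-Encoding

# attach() turns the container's payload from None into a list.
container = Message()
container['Content-Type'] = 'multipart/mixed'
container.attach(leaf)

first = container.get_payload(0)                        # the attached part itself
container_decoded = container.get_payload(decode=True)  # None for a multipart

# Indexing into a string payload is an error.
try:
    leaf.get_payload(0)
    index_on_string = 'no error'
except TypeError:
    index_on_string = 'TypeError'
```

Because :meth:`get_payload` returns a reference, ``first`` is the same object
as ``leaf``, not a copy.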

.. method:: Message.set_payload(payload[, charset])

   Set the entire message object's payload to *payload*. It is the client's
   responsibility to ensure the payload invariants. Optional *charset* sets the
   message's default character set; see :meth:`set_charset` for details.

   .. versionchanged:: 2.2.2
      *charset* argument added.


.. method:: Message.set_charset(charset)

   Set the character set of the payload to *charset*, which can either be a
   :class:`Charset` instance (see :mod:`email.charset`), a string naming a
   character set, or ``None``. If it is a string, it will be converted to a
   :class:`Charset` instance. If *charset* is ``None``, the ``charset`` parameter
   will be removed from the :mailheader:`Content-Type` header. Anything else will
   generate a :exc:`TypeError`.

   The message will be assumed to be of type :mimetype:`text/\*` encoded with
   *charset.input_charset*. It will be converted to *charset.output_charset* and
   encoded properly, if needed, when generating the plain text representation of
   the message. MIME headers (:mailheader:`MIME-Version`,
   :mailheader:`Content-Type`, :mailheader:`Content-Transfer-Encoding`) will be
   added as needed.

   .. versionadded:: 2.2.2

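A short sketch of how the *charset* argument interacts with the headers,
assuming a plain text body. It also uses :meth:`get_charset` and
:meth:`get_content_charset`, which are documented further down.

```python
from email.message import Message

msg = Message()
# The charset argument is handed on to set_charset().
msg.set_payload('Hello, world.\n', 'utf-8')

charset_obj = msg.get_charset()        # an email.charset.Charset instance
declared = msg.get_content_charset()   # read back from the Content-Type header
mime_version = msg['MIME-Version']     # added as a side effect

# Passing None removes the charset parameter again.
msg.set_charset(None)
after_removal = msg.get_param('charset')
```

Note the two different views: ``charset_obj`` is the :class:`Charset` object
attached to the payload, while ``declared`` is the string stored in the
header.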

.. method:: Message.get_charset()

   Return the :class:`Charset` instance associated with the message's payload.

   .. versionadded:: 2.2.2

The following methods implement a mapping-like interface for accessing the
message's :rfc:`2822` headers. Note that there are some semantic differences
between these methods and a normal mapping (i.e. dictionary) interface. For
example, in a dictionary there are no duplicate keys, but here there may be
duplicate message headers. Also, in dictionaries there is no guaranteed order
to the keys returned by :meth:`keys`, but in a :class:`Message` object, headers
are always returned in the order they appeared in the original message, or were
added to the message later. Any header that is deleted and then re-added is
always appended to the end of the header list.

These semantic differences are intentional and are biased toward maximal
convenience.

Note that in all cases, any envelope header present in the message is not
included in the mapping interface.


.. method:: Message.__len__()

   Return the total number of headers, including duplicates.


.. method:: Message.__contains__(name)

   Return true if the message object has a field named *name*. Matching is done
   case-insensitively and *name* should not include the trailing colon. Used for
   the ``in`` operator, e.g.::

      if 'message-id' in myMessage:
          print 'Message-ID:', myMessage['message-id']


.. method:: Message.__getitem__(name)

   Return the value of the named header field. *name* should not include the colon
   field separator. If the header is missing, ``None`` is returned; a
   :exc:`KeyError` is never raised.

   Note that if the named field appears more than once in the message's headers,
   exactly which of those field values will be returned is undefined. Use the
   :meth:`get_all` method to get the values of all the extant named headers.


.. method:: Message.__setitem__(name, val)

   Add a header to the message with field name *name* and value *val*. The field
   is appended to the end of the message's existing fields.

   Note that this does *not* overwrite or delete any existing header with the same
   name. If you want to ensure that the new header is the only one present in the
   message with field name *name*, delete the field first, e.g.::

      del msg['subject']
      msg['subject'] = 'Python roolz!'


.. method:: Message.__delitem__(name)

   Delete all occurrences of the field with name *name* from the message's headers.
   No exception is raised if the named field isn't present in the headers.


.. method:: Message.has_key(name)

   Return true if the message contains a header field named *name*, otherwise
   return false.


.. method:: Message.keys()

   Return a list of all the message's header field names.


.. method:: Message.values()

   Return a list of all the message's field values.


.. method:: Message.items()

   Return a list of 2-tuples containing all the message's field names and values.

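The differences from a plain dictionary are easiest to see side by side. The
following sketch, with made-up header values, exercises duplicate headers,
case-insensitive lookup, the ``None`` result for a missing header, and
deletion of every occurrence:

```python
from email.message import Message

msg = Message()
msg['Received'] = 'from a.example'
msg['Received'] = 'from b.example'    # appended, not overwritten
msg['Subject'] = 'hello'

count = len(msg)                      # duplicates are counted
has_subject = 'SUBJECT' in msg        # matching ignores case
subject = msg['subject']
missing = msg['X-No-Such-Header']     # None rather than a KeyError
names_before = msg.keys()             # insertion order, case preserved

del msg['received']                   # removes every Received header
msg['Subject'] = 'hello again'        # a second Subject header is added
names_after = msg.keys()
pairs = msg.items()
```

Assigning to ``msg['Subject']`` a second time leaves two
:mailheader:`Subject` headers in place, which is why the ``del`` idiom shown
for :meth:`__setitem__` is needed when a single header is wanted.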

.. method:: Message.get(name[, failobj])

   Return the value of the named header field. This is identical to
   :meth:`__getitem__` except that optional *failobj* is returned if the named
   header is missing (defaults to ``None``).

Here are some additional useful methods:


.. method:: Message.get_all(name[, failobj])

   Return a list of all the values for the field named *name*. If there are no such
   named headers in the message, *failobj* is returned (defaults to ``None``).


.. method:: Message.add_header(_name, _value, **_params)

   Extended header setting. This method is similar to :meth:`__setitem__` except
   that additional header parameters can be provided as keyword arguments. *_name*
   is the header field to add and *_value* is the *primary* value for the header.

   For each item in the keyword argument dictionary *_params*, the key is taken as
   the parameter name, with underscores converted to dashes (since dashes are
   illegal in Python identifiers). Normally, the parameter will be added as
   ``key="value"`` unless the value is ``None``, in which case only the key will be
   added.

   Here's an example::

      msg.add_header('Content-Disposition', 'attachment', filename='bud.gif')

   This will add a header that looks like ::

      Content-Disposition: attachment; filename="bud.gif"


.. method:: Message.replace_header(_name, _value)

   Replace a header. Replace the first header found in the message that matches
   *_name*, retaining header order and field name case. If no matching header was
   found, a :exc:`KeyError` is raised.

   .. versionadded:: 2.2.2

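A small sketch contrasting :meth:`get_all`, :meth:`replace_header` and
:meth:`add_header`. The addresses are placeholders chosen for the example.

```python
from email.message import Message

msg = Message()
msg['To'] = 'a@example.com'
msg['Subject'] = 'draft'
msg['To'] = 'b@example.com'

recipients = msg.get_all('to')        # every To value, in order
no_cc = msg.get_all('cc', [])         # failobj when the header is absent

# The header is matched case-insensitively; its position and the
# original spelling of the field name are kept.
msg.replace_header('subject', 'final')
order_after = msg.items()

try:
    msg.replace_header('X-No-Such-Header', 'value')
    outcome = 'replaced'
except KeyError:
    outcome = 'KeyError'

# Keyword arguments become header parameters.
msg.add_header('Content-Disposition', 'attachment', filename='bud.gif')
disposition = msg['Content-Disposition']
```

Unlike the ``del`` followed by assignment idiom, :meth:`replace_header` leaves
the :mailheader:`Subject` header between the two :mailheader:`To` headers.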

.. method:: Message.get_content_type()

   Return the message's content type. The returned string is coerced to lower
   case and has the form :mimetype:`maintype/subtype`. If there was no
   :mailheader:`Content-Type` header in the message the default type as given by
   :meth:`get_default_type` will be returned. Since according to :rfc:`2045`,
   messages always have a default type, :meth:`get_content_type` will always return
   a value.

   :rfc:`2045` defines a message's default type to be :mimetype:`text/plain` unless
   it appears inside a :mimetype:`multipart/digest` container, in which case it
   would be :mimetype:`message/rfc822`. If the :mailheader:`Content-Type` header
   has an invalid type specification, :rfc:`2045` mandates that the default type be
   :mimetype:`text/plain`.

   .. versionadded:: 2.2.2


.. method:: Message.get_content_maintype()

   Return the message's main content type. This is the :mimetype:`maintype` part
   of the string returned by :meth:`get_content_type`.

   .. versionadded:: 2.2.2


.. method:: Message.get_content_subtype()

   Return the message's sub-content type. This is the :mimetype:`subtype` part of
   the string returned by :meth:`get_content_type`.

   .. versionadded:: 2.2.2


.. method:: Message.get_default_type()

   Return the default content type. Most messages have a default content type of
   :mimetype:`text/plain`, except for messages that are subparts of
   :mimetype:`multipart/digest` containers. Such subparts have a default content
   type of :mimetype:`message/rfc822`.

   .. versionadded:: 2.2.2


.. method:: Message.set_default_type(ctype)

   Set the default content type. *ctype* should either be :mimetype:`text/plain`
   or :mimetype:`message/rfc822`, although this is not enforced. The default
   content type is not stored in the :mailheader:`Content-Type` header.

   .. versionadded:: 2.2.2

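The fallback rules for the content type can be sketched as below. The header
values are invented for the example; ``garbage`` stands in for any value that
is not of the form :mimetype:`maintype/subtype`.

```python
from email.message import Message

msg = Message()
no_header = msg.get_content_type()          # falls back to the default type

# The default type is kept separately from any Content-Type header.
msg.set_default_type('message/rfc822')
digest_default = msg.get_content_type()

# An explicit header wins and is reported in lower case.
msg['Content-Type'] = 'Image/PNG; name="logo.png"'
full_type = msg.get_content_type()
main_type = msg.get_content_maintype()
sub_type = msg.get_content_subtype()
still_default = msg.get_default_type()

# An invalid type specification is reported as text/plain.
bad = Message()
bad['Content-Type'] = 'garbage'
invalid = bad.get_content_type()
```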

.. method:: Message.get_params([failobj[, header[, unquote]]])

   Return the message's :mailheader:`Content-Type` parameters, as a list. The
   elements of the returned list are 2-tuples of key/value pairs, as split on the
   ``'='`` sign. The left hand side of the ``'='`` is the key, while the right
   hand side is the value. If there is no ``'='`` sign in the parameter the value
   is the empty string, otherwise the value is as described in :meth:`get_param`
   and is unquoted if optional *unquote* is ``True`` (the default).

   Optional *failobj* is the object to return if there is no
   :mailheader:`Content-Type` header. Optional *header* is the header to search
   instead of :mailheader:`Content-Type`.

   .. versionchanged:: 2.2.2
      *unquote* argument added.


.. method:: Message.get_param(param[, failobj[, header[, unquote]]])

   Return the value of the :mailheader:`Content-Type` header's parameter *param* as
   a string. If the message has no :mailheader:`Content-Type` header or if there
   is no such parameter, then *failobj* is returned (defaults to ``None``).

   Optional *header*, if given, specifies the message header to use instead of
   :mailheader:`Content-Type`.

   Parameter keys are always compared case insensitively. The return value can
   either be a string, or a 3-tuple if the parameter was :rfc:`2231` encoded. When
   it's a 3-tuple, the elements of the value are of the form ``(CHARSET, LANGUAGE,
   VALUE)``. Note that both ``CHARSET`` and ``LANGUAGE`` can be ``None``, in which
   case you should consider ``VALUE`` to be encoded in the ``us-ascii`` charset.
   You can usually ignore ``LANGUAGE``.

   If your application doesn't care whether the parameter was encoded as in
   :rfc:`2231`, you can collapse the parameter value by calling
   :func:`email.utils.collapse_rfc2231_value`, passing in the return value from
   :meth:`get_param`. This will return a suitably decoded Unicode string when the
   value is a tuple, or the original string unquoted if it isn't. For example::

      rawparam = msg.get_param('foo')
      param = email.utils.collapse_rfc2231_value(rawparam)

   In any case, the parameter value (either the returned string, or the ``VALUE``
   item in the 3-tuple) is always unquoted, unless *unquote* is set to ``False``.

   .. versionchanged:: 2.2.2
      *unquote* argument added, and 3-tuple return value possible.

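To make the two possible return shapes of :meth:`get_param` concrete, here is
a sketch with one ordinary parameter and one :rfc:`2231` encoded parameter.
The header values and the ``title`` parameter are invented for the example,
and the helper is imported under its lowercase module name ``email.utils``.

```python
from email.message import Message
from email.utils import collapse_rfc2231_value

msg = Message()
msg['Content-Type'] = 'text/plain; charset="ISO-8859-1"; format=flowed'

params = msg.get_params()                     # list of (key, value) 2-tuples
charset = msg.get_param('CHARSET')            # keys compare case-insensitively
fallback = msg.get_param('missing', 'n/a')    # failobj for an absent parameter

# An RFC 2231 encoded parameter comes back as (CHARSET, LANGUAGE, VALUE).
encoded = Message()
encoded['Content-Type'] = "text/plain; title*=us-ascii'en'This%20is%20fun"
raw_title = encoded.get_param('title')
title = collapse_rfc2231_value(raw_title)     # plain text either way
```

The first element of ``params`` is the content type itself, paired with an
empty string because it has no ``'='`` sign.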

.. method:: Message.set_param(param, value[, header[, requote[, charset[, language]]]])

   Set a parameter in the :mailheader:`Content-Type` header. If the parameter
   already exists in the header, its value will be replaced with *value*. If the
   :mailheader:`Content-Type` header has not yet been defined for this message, it
   will be set to :mimetype:`text/plain` and the new parameter value will be
   appended as per :rfc:`2045`.

   Optional *header* specifies an alternative header to :mailheader:`Content-Type`,
   and all parameters will be quoted as necessary unless optional *requote* is
   ``False`` (the default is ``True``).

   If optional *charset* is specified, the parameter will be encoded according to
   :rfc:`2231`. Optional *language* specifies the RFC 2231 language, defaulting to
   the empty string. Both *charset* and *language* should be strings.

   .. versionadded:: 2.2.2


.. method:: Message.del_param(param[, header[, requote]])

   Remove the given parameter completely from the :mailheader:`Content-Type`
   header. The header will be re-written in place without the parameter or its
   value. All values will be quoted as necessary unless *requote* is ``False``
   (the default is ``True``). Optional *header* specifies an alternative to
   :mailheader:`Content-Type`.

   .. versionadded:: 2.2.2


.. method:: Message.set_type(type[, header][, requote])

   Set the main type and subtype for the :mailheader:`Content-Type` header. *type*
   must be a string in the form :mimetype:`maintype/subtype`, otherwise a
   :exc:`ValueError` is raised.

   This method replaces the :mailheader:`Content-Type` header, keeping all the
   parameters in place. If *requote* is ``False``, this leaves the existing
   header's quoting as is, otherwise the parameters will be quoted (the default).

   An alternative header can be specified in the *header* argument. When the
   :mailheader:`Content-Type` header is set, a :mailheader:`MIME-Version` header is
   also added.

   .. versionadded:: 2.2.2

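The three header-editing methods above compose naturally. The sketch below
starts from a message with no :mailheader:`Content-Type` header at all; the
chosen types and charsets are arbitrary.

```python
from email.message import Message

msg = Message()
msg.set_param('charset', 'utf-8')         # creates a text/plain Content-Type
type_created = msg.get_content_type()

msg.set_param('charset', 'iso-8859-1')    # replaces the existing value
charset = msg.get_param('charset')

msg.set_type('text/html')                 # the parameters are kept in place
type_after = msg.get_content_type()
charset_after = msg.get_param('charset')
mime_version = msg['MIME-Version']        # added by set_type()

msg.del_param('charset')
charset_removed = msg.get_param('charset')

# A type without exactly one slash is rejected.
try:
    msg.set_type('not-a-mime-type')
    outcome = 'accepted'
except ValueError:
    outcome = 'ValueError'
```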

.. method:: Message.get_filename([failobj])

   Return the value of the ``filename`` parameter of the
   :mailheader:`Content-Disposition` header of the message. If the header does not
   have a ``filename`` parameter, this method falls back to looking for the
   ``name`` parameter. If neither is found, or the header is missing, then
   *failobj* is returned. The returned string will always be unquoted as per
   :func:`email.utils.unquote`.


.. method:: Message.get_boundary([failobj])

   Return the value of the ``boundary`` parameter of the :mailheader:`Content-Type`
   header of the message, or *failobj* if either the header is missing, or has no
   ``boundary`` parameter. The returned string will always be unquoted as per
   :func:`email.utils.unquote`.


.. method:: Message.set_boundary(boundary)

   Set the ``boundary`` parameter of the :mailheader:`Content-Type` header to
   *boundary*. :meth:`set_boundary` will always quote *boundary* if necessary. A
   :exc:`HeaderParseError` is raised if the message object has no
   :mailheader:`Content-Type` header.

   Note that using this method is subtly different from deleting the old
   :mailheader:`Content-Type` header and adding a new one with the new boundary via
   :meth:`add_header`, because :meth:`set_boundary` preserves the order of the
   :mailheader:`Content-Type` header in the list of headers. However, it does *not*
   preserve any continuation lines which may have been present in the original
   :mailheader:`Content-Type` header.

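A brief sketch of the filename and boundary accessors. The boundary string
``XYZ`` and the ``X-Trailer`` header are placeholders used only to make the
header order visible.

```python
from email import errors
from email.message import Message

part = Message()
part.add_header('Content-Disposition', 'attachment', filename='bud.gif')
filename = part.get_filename()
no_name = Message().get_filename('unnamed')   # failobj when nothing is found

container = Message()
try:
    container.set_boundary('XYZ')             # no Content-Type header yet
    outcome = 'boundary set'
except errors.HeaderParseError:
    outcome = 'HeaderParseError'

container['Subject'] = 'report'
container['Content-Type'] = 'multipart/mixed'
container['X-Trailer'] = 'last'               # hypothetical trailing header
container.set_boundary('XYZ')
boundary = container.get_boundary()
order = container.keys()                      # Content-Type keeps its position
```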

.. method:: Message.get_content_charset([failobj])

   Return the ``charset`` parameter of the :mailheader:`Content-Type` header,
   coerced to lower case. If there is no :mailheader:`Content-Type` header, or if
   that header has no ``charset`` parameter, *failobj* is returned.

   Note that this method differs from :meth:`get_charset` which returns the
   :class:`Charset` instance for the default encoding of the message body.

   .. versionadded:: 2.2.2


.. method:: Message.get_charsets([failobj])

   Return a list containing the character set names in the message. If the message
   is a :mimetype:`multipart`, then the list will contain one element for each
   subpart in the payload, otherwise, it will be a list of length 1.

   Each item in the list will be a string which is the value of the ``charset``
   parameter in the :mailheader:`Content-Type` header for the represented subpart.
   However, if the subpart has no :mailheader:`Content-Type` header, no ``charset``
   parameter, or is not of the :mimetype:`text` main MIME type, then that item in
   the returned list will be *failobj*.

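The following sketch builds a two-part message by hand and collects its
character sets. It reflects the behaviour of the implementations this example
was checked against, where the list also carries an entry for the container
itself because the parts are visited in :meth:`walk` order; treat that detail
as an observation rather than a guarantee.

```python
from email.message import Message

outer = Message()
outer['Content-Type'] = 'multipart/mixed'

for name in ('UTF-8', 'iso-8859-1'):
    part = Message()
    part['Content-Type'] = 'text/plain; charset="%s"' % name
    part.set_payload('body')
    outer.attach(part)

# Per-part lookup: the value is coerced to lower case.
first_charset = outer.get_payload(0).get_content_charset()
container_charset = outer.get_content_charset('none')   # no charset parameter

# One entry per visited part; failobj fills the gaps.
charsets = outer.get_charsets('none')
```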

.. method:: Message.walk()

   The :meth:`walk` method is an all-purpose generator which can be used to iterate
   over all the parts and subparts of a message object tree, in depth-first
   traversal order. You will typically use :meth:`walk` as the iterator in a
   ``for`` loop; each iteration returns the next subpart.

   Here's an example that prints the MIME type of every part of a multipart message
   structure::

      >>> for part in msg.walk():
      ...     print part.get_content_type()
      multipart/report
      text/plain
      message/delivery-status
      text/plain
      text/plain
      message/rfc822

.. versionchanged:: 2.5
   The previously deprecated methods :meth:`get_type`, :meth:`get_main_type`, and
   :meth:`get_subtype` were removed.

:class:`Message` objects can also optionally contain two instance attributes,
which can be used when generating the plain text of a MIME message.


.. data:: preamble

   The format of a MIME document allows for some text between the blank line
   following the headers, and the first multipart boundary string. Normally, this
   text is never visible in a MIME-aware mail reader because it falls outside the
   standard MIME armor. However, when viewing the raw text of the message, or when
   viewing the message in a non-MIME aware reader, this text can become visible.

   The *preamble* attribute contains this leading extra-armor text for MIME
   documents. When the :class:`Parser` discovers some text after the headers but
   before the first boundary string, it assigns this text to the message's
   *preamble* attribute. When the :class:`Generator` is writing out the plain text
   representation of a MIME message, and it finds the message has a *preamble*
   attribute, it will write this text in the area between the headers and the first
   boundary. See :mod:`email.parser` and :mod:`email.generator` for details.

   Note that if the message object has no preamble, the *preamble* attribute will
   be ``None``.


.. data:: epilogue

   The *epilogue* attribute acts the same way as the *preamble* attribute, except
   that it contains text that appears between the last boundary and the end of the
   message.

   .. versionchanged:: 2.5
      You do not need to set the epilogue to the empty string in order for the
      :class:`Generator` to print a newline at the end of the file.

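Where the *preamble* and *epilogue* text ends up in the flattened output can
be sketched as follows. The boundary string ``BOUNDARY`` and both pieces of
text are invented for the example.

```python
from email.message import Message

part = Message()
part['Content-Type'] = 'text/plain'
part.set_payload('Body text.\n')

outer = Message()
outer['Content-Type'] = 'multipart/mixed; boundary="BOUNDARY"'
default_preamble = outer.preamble     # None until a value is assigned
outer.attach(part)

outer.preamble = 'This is a multi-part message in MIME format.\n'
outer.epilogue = 'Text after the closing boundary.\n'

text = outer.as_string()

# Positions of the interesting pieces in the flattened message.
preamble_at = text.index('This is a multi-part message')
first_boundary_at = text.index('--BOUNDARY')
last_boundary_at = text.index('--BOUNDARY--')
epilogue_at = text.index('Text after the closing boundary')
```

The preamble lands before the first boundary line and the epilogue after the
closing one, so a MIME-aware reader would show neither.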

.. data:: defects

   The *defects* attribute contains a list of all the problems found when parsing
   this message. See :mod:`email.errors` for a detailed description of the
   possible parsing defects.

   .. versionadded:: 2.4
548