\section{\module{email.Message} ---
         The Message class}

\declaremodule{standard}{email.Message}
\modulesynopsis{The base class representing email messages.}
\sectionauthor{Barry A. Warsaw}{barry@zope.com}

\versionadded{2.2}

The \module{Message} module provides a single class, the
\class{Message} class. This class is the base class for the
\module{email} package object model. It has a fairly extensive set of
methods to get and set email headers and email payloads. For an
introduction to the \module{email} package, please read the
\refmodule{email} package overview.

\class{Message} instances can be created either directly, or
indirectly by using a \refmodule{email.Parser}. \class{Message}
objects provide a mapping style interface for accessing the message
headers, and an explicit interface for accessing both the headers and
the payload. They also provide convenience methods for generating a flat
text representation of the message object tree, for accessing commonly
used header parameters, and for recursively walking over the object
tree.

Here are the methods of the \class{Message} class:

\begin{methoddesc}[Message]{as_string}{\optional{unixfrom}}
Return the entire formatted message as a string. Optional
\var{unixfrom}, when true, causes the \emph{Unix-From} envelope
header to be included; it defaults to 0.
\end{methoddesc}

\begin{methoddesc}[Message]{__str__}{}
Equivalent to \method{aMessage.as_string(unixfrom=1)}.
\end{methoddesc}

\begin{methoddesc}[Message]{is_multipart}{}
Return 1 if the message's payload is a list of sub-\class{Message}
objects, otherwise return 0. When \method{is_multipart()} returns 0,
the payload should either be a string object, or a single
\class{Message} instance.
\end{methoddesc}

\begin{methoddesc}[Message]{set_unixfrom}{unixfrom}
Set the \emph{Unix-From} header (a.k.a.\ the envelope header or
\code{From_} header) to \var{unixfrom}, which should be a string.
\end{methoddesc}

\begin{methoddesc}[Message]{get_unixfrom}{}
Return the \emph{Unix-From} header. Defaults to \code{None} if the
\emph{Unix-From} header was never set.
\end{methoddesc}
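
As a rough illustration of the envelope header methods above, the
following sketch sets a \emph{Unix-From} line and compares the output of
\method{as_string()} with and without it. It assumes a Python release in
which the class can be imported as \code{email.message.Message}, falling
back to the \module{email.Message} spelling used in this section; the
address and date in the envelope line are made-up values.

```python
# Sketch only: the import location is an assumption about the installed
# Python; the second spelling is the one documented in this section.
try:
    from email.message import Message
except ImportError:
    from email.Message import Message

msg = Message()
msg['Subject'] = 'Hello'
msg.set_payload('Body text\n')

# No envelope header has been set yet, so this is None.
before = msg.get_unixfrom()

# Hypothetical envelope line, used purely for illustration.
msg.set_unixfrom('From sender@example.com Sat Jan  1 00:00:00 2000')

without_envelope = msg.as_string()           # starts with the Subject header
with_envelope = msg.as_string(unixfrom=1)    # starts with the envelope line
```

The envelope line is stored separately from the ordinary headers, which
is why it only shows up in the flattened text when \var{unixfrom} is true.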

\begin{methoddesc}[Message]{add_payload}{payload}
Add \var{payload} to the message object's existing payload. If, prior
to calling this method, the object's payload was \code{None}
(i.e. never before set), then after this method is called, the payload
will be the argument \var{payload}.

If the object's payload was already a list
(i.e. \method{is_multipart()} returns 1), then \var{payload} is
appended to the end of the existing payload list.

For any other type of existing payload, \method{add_payload()} will
transform the payload into a list consisting of the old payload
and \var{payload}, but only if the document is already a MIME
multipart document. This condition is satisfied if the main type of
the message's \code{Content-Type:} header is \code{multipart}, or if
there is no \code{Content-Type:} header. In any other situation,
\exception{MultipartConversionError} is raised.
\end{methoddesc}

\begin{methoddesc}[Message]{attach}{payload}
Synonymous with \method{add_payload()}.
\end{methoddesc}

\begin{methoddesc}[Message]{get_payload}{\optional{i\optional{, decode}}}
Return the current payload, which will be a list of \class{Message}
objects when \method{is_multipart()} returns 1, or a scalar (either a
string or a single \class{Message} instance) when
\method{is_multipart()} returns 0.

With optional \var{i}, \method{get_payload()} will return the
\var{i}-th element of the payload, counting from zero, if
\method{is_multipart()} returns 1. An \code{IndexError} will be raised
if \var{i} is less than 0 or greater than or equal to the number of
items in the payload. If the payload is scalar
(i.e. \method{is_multipart()} returns 0) and \var{i} is given, a
\code{TypeError} is raised.

Optional \var{decode} is a flag indicating whether the payload should be
decoded or not, according to the \code{Content-Transfer-Encoding:} header.
When true and the message is not a multipart, the payload will be
decoded if this header's value is \samp{quoted-printable} or
\samp{base64}. If some other encoding is used, or the
\code{Content-Transfer-Encoding:} header is
missing, the payload is returned as-is (undecoded). If the message is
a multipart and the \var{decode} flag is true, then \code{None} is
returned.
\end{methoddesc}
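
The payload rules above can be sketched as follows. This is only an
illustration, assuming the class is importable as
\code{email.message.Message} (or \module{email.Message} on older
releases); \code{text_part()} is a hypothetical helper, and the type of
the decoded value (string or bytes) depends on the Python version in use.

```python
try:
    from email.message import Message
except ImportError:
    from email.Message import Message

def text_part(text):
    # Hypothetical helper: build a leaf text/plain part.
    part = Message()
    part['Content-Type'] = 'text/plain'
    part.set_payload(text)
    return part

outer = Message()
outer['Content-Type'] = 'multipart/mixed; boundary="BOUNDARY"'
outer.attach(text_part('first'))
outer.attach(text_part('second'))

second = outer.get_payload(1)        # second subpart, counting from zero

try:
    outer.get_payload(2)             # only two subparts exist
    out_of_range = 0
except IndexError:
    out_of_range = 1

try:
    second.get_payload(0)            # scalar payload, so an index is an error
    scalar_indexed = 0
except TypeError:
    scalar_indexed = 1

encoded = Message()
encoded['Content-Transfer-Encoding'] = 'base64'
encoded.set_payload('aGVsbG8=')
raw = encoded.get_payload()                      # returned exactly as stored
decoded = encoded.get_payload(decode=1)          # base64 decoded
multipart_decoded = outer.get_payload(decode=1)  # None for a multipart
```

Two parts are attached before the container is inspected so that the
payload is a list regardless of how the first attachment is stored.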

\begin{methoddesc}[Message]{set_payload}{payload}
Set the entire message object's payload to \var{payload}. It is the
client's responsibility to maintain the payload invariants.
\end{methoddesc}

The following methods implement a mapping-like interface for accessing
the message object's \rfc{2822} headers. Note that there are some
semantic differences between these methods and a normal mapping
(i.e. dictionary) interface. For example, in a dictionary there are
no duplicate keys, but here there may be duplicate message headers. Also,
in dictionaries there is no guaranteed order to the keys returned by
\method{keys()}, but in a \class{Message} object, there is an explicit
order. These semantic differences are intentional and are biased
toward maximal convenience.

Note that in all cases, any optional \emph{Unix-From} header the message
may have is not included in the mapping interface.

\begin{methoddesc}[Message]{__len__}{}
Return the total number of headers, including duplicates.
\end{methoddesc}

\begin{methoddesc}[Message]{__contains__}{name}
Return true if the message object has a field named \var{name}.
Matching is done case-insensitively and \var{name} should not include the
trailing colon. Used for the \code{in} operator,
e.g.:

\begin{verbatim}
if 'message-id' in myMessage:
    print 'Message-ID:', myMessage['message-id']
\end{verbatim}
\end{methoddesc}

\begin{methoddesc}[Message]{__getitem__}{name}
Return the value of the named header field. \var{name} should not
include the colon field separator. If the header is missing,
\code{None} is returned; a \code{KeyError} is never raised.

Note that if the named field appears more than once in the message's
headers, exactly which of those field values will be returned is
undefined. Use the \method{get_all()} method to get the values of all
the extant named headers.
\end{methoddesc}

\begin{methoddesc}[Message]{__setitem__}{name, val}
Add a header to the message with field name \var{name} and value
\var{val}. The field is appended to the end of the message's existing
fields.

Note that this does \emph{not} overwrite or delete any existing header
with the same name. If you want to ensure that the new header is the
only one present in the message with field name
\var{name}, first use \method{__delitem__()} to delete all named
fields, e.g.:

\begin{verbatim}
del msg['subject']
msg['subject'] = 'Python roolz!'
\end{verbatim}
\end{methoddesc}

\begin{methoddesc}[Message]{__delitem__}{name}
Delete all occurrences of the field with name \var{name} from the
message's headers. No exception is raised if the named field isn't
present in the headers.
\end{methoddesc}

\begin{methoddesc}[Message]{has_key}{name}
Return 1 if the message contains a header field named \var{name},
otherwise return 0.
\end{methoddesc}

\begin{methoddesc}[Message]{keys}{}
Return a list of all the message's header field names. These keys
will be returned in the order in which they were added to the message
via \method{__setitem__()}, and may contain duplicates. Any fields
deleted and then subsequently re-added are always appended to the end
of the header list.
\end{methoddesc}

\begin{methoddesc}[Message]{values}{}
Return a list of all the message's field values. These will be returned
in the order in which they were added to the message via
\method{__setitem__()}, and may contain duplicates. Any fields
deleted and then subsequently re-added are always appended to the end
of the header list.
\end{methoddesc}

\begin{methoddesc}[Message]{items}{}
Return a list of 2-tuples containing all the message's field names and
values. These will be returned in the order in which they were added to
the message via \method{__setitem__()}, and may contain duplicates.
Any fields deleted and then subsequently re-added are always appended
to the end of the header list.
\end{methoddesc}

\begin{methoddesc}[Message]{get}{name\optional{, failobj}}
Return the value of the named header field. This is identical to
\method{__getitem__()} except that optional \var{failobj} is returned
if the named header is missing (defaults to \code{None}).
\end{methoddesc}
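
The differences from an ordinary dictionary described above can be seen
in a short sketch. It assumes the class is importable as
\code{email.message.Message} (or \module{email.Message} on older
releases); the header values are arbitrary illustration data.

```python
try:
    from email.message import Message
except ImportError:
    from email.Message import Message

msg = Message()
msg['Received'] = 'from a.example.com'
msg['Subject'] = 'first subject'
msg['Received'] = 'from b.example.com'   # appended, not overwritten

count = len(msg)                    # duplicates are counted
names = list(msg.keys())            # insertion order, duplicates kept
has_subject = 'SUBJECT' in msg      # field names match case-insensitively
missing = msg['X-Missing']          # None rather than a KeyError
fallback = msg.get('X-Missing', 'n/a')

# Replace, rather than add to, the Subject header.
del msg['subject']
msg['Subject'] = 'second subject'
names_after = list(msg.keys())      # the re-added field moves to the end
```

Because assignment always appends, deleting first is the only way to be
sure a field name ends up with a single value.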

Here are some additional useful methods:

\begin{methoddesc}[Message]{get_all}{name\optional{, failobj}}
Return a list of all the values for the field named \var{name}. These
will be returned in the order in which they were added to the message
via \method{__setitem__()}. Any fields
deleted and then subsequently re-added are always appended to the end
of the list.

If there are no such named headers in the message, \var{failobj} is
returned (defaults to \code{None}).
\end{methoddesc}
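
A minimal sketch of \method{get_all()} with repeated headers follows,
assuming the class is importable as \code{email.message.Message} (or
\module{email.Message} on older releases). The header values are
invented for the example.

```python
try:
    from email.message import Message
except ImportError:
    from email.Message import Message

msg = Message()
msg['Received'] = 'hop 1'
msg['To'] = 'someone@example.com'
msg['Received'] = 'hop 2'

hops = msg.get_all('Received')         # every value, in header order
absent = msg.get_all('Cc')             # None by default
absent_list = msg.get_all('Cc', [])    # caller-chosen failobj

# Passing [] as failobj lets the results be combined without a None check.
recipients = msg.get_all('To', []) + msg.get_all('Cc', [])
```

Supplying an empty list as \var{failobj} is a convenient idiom when the
result is going to be iterated or concatenated.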

\begin{methoddesc}[Message]{add_header}{_name, _value, **_params}
Extended header setting. This method is similar to
\method{__setitem__()} except that additional header parameters can be
provided as keyword arguments. \var{_name} is the header to set and
\var{_value} is the \emph{primary} value for the header.

For each item in the keyword argument dictionary \var{_params}, the
key is taken as the parameter name, with underscores converted to
dashes (since dashes are illegal in Python identifiers). Normally,
the parameter will be added as \code{key="value"} unless the value is
\code{None}, in which case only the key will be added.

Here's an example:

\begin{verbatim}
msg.add_header('Content-Disposition', 'attachment', filename='bud.gif')
\end{verbatim}

This will add a header that looks like

\begin{verbatim}
Content-Disposition: attachment; filename="bud.gif"
\end{verbatim}
\end{methoddesc}
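
The underscore conversion and the \code{None} case can be sketched in
the same way. The \code{X-Example} and \code{X-Flag} header names and
their parameters are hypothetical, and the import assumes the class is
available as \code{email.message.Message} (or \module{email.Message} on
older releases).

```python
try:
    from email.message import Message
except ImportError:
    from email.Message import Message

msg = Message()
msg.add_header('Content-Disposition', 'attachment', filename='bud.gif')

# Underscores in the keyword name become dashes in the parameter name.
msg.add_header('X-Example', 'primary', my_param='1')

# A value of None adds the bare key with no "=value" part.
msg.add_header('X-Flag', 'on', secure=None)

disposition = msg['Content-Disposition']
example = msg['X-Example']
flag = msg['X-Flag']
```

One keyword argument is used per call here so that the result does not
depend on the order in which keyword arguments are processed.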

\begin{methoddesc}[Message]{get_type}{\optional{failobj}}
Return the message's content type, as a string of the form
``maintype/subtype'' as taken from the \code{Content-Type:} header.
The returned string is coerced to lowercase.

If there is no \code{Content-Type:} header in the message,
\var{failobj} is returned (defaults to \code{None}).
\end{methoddesc}

\begin{methoddesc}[Message]{get_main_type}{\optional{failobj}}
Return the message's \emph{main} content type. This essentially returns the
\var{maintype} part of the string returned by \method{get_type()}, with the
same semantics for \var{failobj}.
\end{methoddesc}

\begin{methoddesc}[Message]{get_subtype}{\optional{failobj}}
Return the message's sub-content type. This essentially returns the
\var{subtype} part of the string returned by \method{get_type()}, with the
same semantics for \var{failobj}.
\end{methoddesc}

\begin{methoddesc}[Message]{get_params}{\optional{failobj\optional{, header}}}
Return the message's \code{Content-Type:} parameters, as a list. The
elements of the returned list are 2-tuples of key/value pairs, as
split on the \samp{=} sign. The left hand side of the \samp{=} is the
key, while the right hand side is the value. If there is no \samp{=}
sign in the parameter, the value is the empty string. The value is
always unquoted with \method{Utils.unquote()}.

Optional \var{failobj} is the object to return if there is no
\code{Content-Type:} header. Optional \var{header} is the header to
search instead of \code{Content-Type:}.
\end{methoddesc}

\begin{methoddesc}[Message]{get_param}{param\optional{,
    failobj\optional{, header}}}
Return the value of the \code{Content-Type:} header's parameter
\var{param} as a string. If the message has no \code{Content-Type:}
header or if there is no such parameter, then \var{failobj} is
returned (defaults to \code{None}).

Optional \var{header}, if given, specifies the message header to use
instead of \code{Content-Type:}.
\end{methoddesc}
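
A short sketch of parameter lookup follows, assuming the class is
importable as \code{email.message.Message} (or \module{email.Message}
on older releases). The header values are illustration data, and the
exact contents of the list returned by \method{get_params()} are not
spelled out here because they may vary between releases.

```python
try:
    from email.message import Message
except ImportError:
    from email.Message import Message

msg = Message()
msg['Content-Type'] = 'text/plain; charset="us-ascii"; format=flowed'
msg['Content-Disposition'] = 'attachment; filename="bud.gif"'

charset = msg.get_param('charset')                  # quotes are removed
fmt = msg.get_param('format')
missing = msg.get_param('boundary', 'no-boundary')  # failobj is returned

# Look in a header other than Content-Type.
filename = msg.get_param('filename', header='Content-Disposition')

# All Content-Type parameters as (key, value) 2-tuples.
pairs = msg.get_params()
```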

\begin{methoddesc}[Message]{get_charsets}{\optional{failobj}}
Return a list containing the character set names in the message. If
the message is a \code{multipart}, then the list will contain one
element for each subpart in the payload; otherwise, it will be a list
of length 1.

Each item in the list will be a string which is the value of the
\code{charset} parameter in the \code{Content-Type:} header for the
represented subpart. However, if the subpart has no
\code{Content-Type:} header, no \code{charset} parameter, or is not of
the \code{text} main MIME type, then that item in the returned list
will be \var{failobj}.
\end{methoddesc}

\begin{methoddesc}[Message]{get_filename}{\optional{failobj}}
Return the value of the \code{filename} parameter of the
\code{Content-Disposition:} header of the message, or \var{failobj} if
the header is missing or has no \code{filename} parameter.
The returned string will always be unquoted as per
\method{Utils.unquote()}.
\end{methoddesc}
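
As an illustration of \method{get_filename()}, the sketch below collects
the file names of the attachments in a two-part message. It assumes the
class is importable as \code{email.message.Message} (or
\module{email.Message} on older releases); the parts and their payloads
are placeholders rather than real data.

```python
try:
    from email.message import Message
except ImportError:
    from email.Message import Message

report = Message()
report['Content-Type'] = 'multipart/mixed; boundary="BOUNDARY"'

note = Message()
note['Content-Type'] = 'text/plain'
note.set_payload('see attached\n')

image = Message()
image['Content-Type'] = 'image/gif'
image.add_header('Content-Disposition', 'attachment', filename='bud.gif')
image.set_payload('placeholder payload')   # not real image data

report.attach(note)
report.attach(image)

filenames = []
for part in report.get_payload():
    name = part.get_filename()       # None when no filename is given
    if name is not None:
        filenames.append(name)

label = note.get_filename('unnamed')   # failobj for a part with no filename
```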

\begin{methoddesc}[Message]{get_boundary}{\optional{failobj}}
Return the value of the \code{boundary} parameter of the
\code{Content-Type:} header of the message, or \var{failobj} if
the header is missing or has no \code{boundary} parameter. The
returned string will always be unquoted as per
\method{Utils.unquote()}.
\end{methoddesc}

\begin{methoddesc}[Message]{set_boundary}{boundary}
Set the \code{boundary} parameter of the \code{Content-Type:} header
to \var{boundary}. \method{set_boundary()} will always quote
\var{boundary}, so you should not quote it yourself. A
\code{HeaderParseError} is raised if the message object has no
\code{Content-Type:} header.

Note that using this method is subtly different from deleting the old
\code{Content-Type:} header and adding a new one with the new boundary
via \method{add_header()}, because \method{set_boundary()} preserves the
order of the \code{Content-Type:} header in the list of headers.
However, it does \emph{not} preserve any continuation lines which may
have been present in the original \code{Content-Type:} header.
\end{methoddesc}
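
The boundary methods can be exercised as in the following sketch. The
import locations are assumptions about the installed Python
(\code{email.message} and \code{email.errors}, with the capitalized
module names as a fallback), and the boundary strings are arbitrary.

```python
try:
    from email.message import Message
    from email.errors import HeaderParseError
except ImportError:
    from email.Message import Message
    from email.Errors import HeaderParseError

msg = Message()
msg['Content-Type'] = 'multipart/mixed; boundary="old-boundary"'
msg['Subject'] = 'report'

old = msg.get_boundary()             # returned without the quotes
msg.set_boundary('new-boundary')     # pass the boundary unquoted
new = msg.get_boundary()
order = list(msg.keys())             # Content-Type keeps its position

# A message with no Content-Type header cannot have its boundary set.
bare = Message()
try:
    bare.set_boundary('x')
    raised = 0
except HeaderParseError:
    raised = 1
```

Had the header been deleted and re-added instead, \code{Content-Type}
would have moved after \code{Subject} in the header list.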

\begin{methoddesc}[Message]{walk}{}
The \method{walk()} method is an all-purpose generator which can be
used to iterate over all the parts and subparts of a message object
tree, in depth-first traversal order. You will typically use
\method{walk()} as the iterator in a \code{for ... in} loop; each
iteration returns the next subpart.

Here's an example that prints the MIME type of every part of a message
object tree:

\begin{verbatim}
>>> for part in msg.walk():
...     print part.get_type('text/plain')
multipart/report
text/plain
message/delivery-status
text/plain
text/plain
message/rfc822
\end{verbatim}
\end{methoddesc}
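
To make the traversal order concrete, the sketch below builds a small
nested tree by hand and records the order in which \method{walk()}
visits it. It assumes the class is importable as
\code{email.message.Message} (or \module{email.Message} on older
releases); \code{make_part()} is a hypothetical helper, and the content
type is read straight from the header rather than through a
type accessor.

```python
try:
    from email.message import Message
except ImportError:
    from email.Message import Message

def make_part(content_type, payload=None):
    # Hypothetical helper: a part with a Content-Type and optional payload.
    part = Message()
    part['Content-Type'] = content_type
    if payload is not None:
        part.set_payload(payload)
    return part

inner = make_part('multipart/alternative; boundary="INNER"')
inner.attach(make_part('text/plain', 'plain body'))
inner.attach(make_part('text/html', '<p>html body</p>'))

outer = make_part('multipart/mixed; boundary="OUTER"')
outer.attach(inner)
outer.attach(make_part('application/octet-stream', 'data'))

# Keep only the "maintype/subtype" portion of each Content-Type header.
visited = [part['Content-Type'].split(';')[0] for part in outer.walk()]
```

The container itself is visited first, and both children of the inner
multipart are visited before the second child of the outer one.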

\class{Message} objects can also optionally contain two instance
attributes, which can be used when generating the plain text of a MIME
message.

\begin{datadesc}{preamble}
The format of a MIME document allows for some text between the blank
line following the headers, and the first multipart boundary string.
Normally, this text is never visible in a MIME-aware mail reader
because it falls outside the standard MIME armor. However, when
viewing the raw text of the message, or when viewing the message in a
non-MIME aware reader, this text can become visible.

The \var{preamble} attribute contains this leading extra-armor text
for MIME documents. When the \class{Parser} discovers some text after
the headers but before the first boundary string, it assigns this text
to the message's \var{preamble} attribute. When the \class{Generator}
is writing out the plain text representation of a MIME message, and it
finds the message has a \var{preamble} attribute, it will write this
text in the area between the headers and the first boundary.

Note that if the message object has no preamble, the
\var{preamble} attribute will be \code{None}.
\end{datadesc}

\begin{datadesc}{epilogue}
The \var{epilogue} attribute acts the same way as the \var{preamble}
attribute, except that it contains text that appears between the last
boundary and the end of the message.
\end{datadesc}
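
The placement of the two attributes in the flattened text can be
checked with a small sketch. It assumes the class is importable as
\code{email.message.Message} (or \module{email.Message} on older
releases); the boundary string and all text are made up for the
example.

```python
try:
    from email.message import Message
except ImportError:
    from email.Message import Message

def text_part(text):
    # Hypothetical helper: build a leaf text/plain part.
    part = Message()
    part['Content-Type'] = 'text/plain'
    part.set_payload(text)
    return part

outer = Message()
outer['Content-Type'] = 'multipart/mixed; boundary="BOUNDARY"'
outer.preamble = 'This text is skipped by MIME-aware readers.'
outer.epilogue = 'Trailing text after the last boundary.'
outer.attach(text_part('first body\n'))
outer.attach(text_part('second body\n'))

flat = outer.as_string()

# Positions of the interesting pieces within the flattened message.
preamble_at = flat.index(outer.preamble)
first_boundary_at = flat.index('--BOUNDARY')
last_boundary_at = flat.rindex('--BOUNDARY--')
epilogue_at = flat.index(outer.epilogue)
```

The preamble lands before the first boundary line and the epilogue
after the closing one, so neither is part of any subpart.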