\section{Standard Module \sectcode{rfc822}}
\stmodindex{rfc822}

\renewcommand{\indexsubitem}{(in module rfc822)}

This module defines a class, \code{Message}, which represents a
collection of ``email headers'' as defined by the Internet standard
RFC 822.  It is used in various contexts, usually to read such headers
from a file.

A \code{Message} instance is instantiated with an open file object as
parameter.  Instantiation reads headers from the file up to a blank
line and stores them in the instance; after instantiation, the file is
positioned directly after the blank line that terminates the headers.
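
The reading behavior described above can be sketched as follows.  This
is a minimal illustration, not the module's actual code: the helper
\code{readheaders} is hypothetical, and \code{io.StringIO} stands in
for an open file.

```python
import io

def readheaders(fp):
    """Collect lines up to the first blank line; leave fp just past it."""
    headers = []
    while True:
        line = fp.readline()
        # Stop at end of file or at the blank line that ends the headers.
        if line in ('', '\n', '\r\n'):
            break
        headers.append(line)
    return headers

fp = io.StringIO('From: jack@cwi.nl\nSubject: Hello\n\nBody text\n')
headers = readheaders(fp)
body = fp.read()   # the file is now positioned at the start of the body
```

After the call, \code{headers} holds the two header lines and
\code{body} holds only the text that follows the blank line.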

Input lines as read from the file may either be terminated by CR-LF or
by a single linefeed; a terminating CR-LF is replaced by a single
linefeed before the line is stored.

All header matching is done independent of upper or lower case;
e.g. \code{m['From']}, \code{m['from']} and \code{m['FROM']} all yield
the same result.
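
One way to picture case-independent matching is to lowercase both the
requested name and each header line before comparing them.  The helper
\code{lookup} below is hypothetical and only a sketch of the idea; it
makes no claim about how the module itself is written.

```python
def lookup(headers, name):
    """Return the stripped value of the first header matching name, or None."""
    prefix = name.lower() + ':'
    for line in headers:
        # Compare in lower case so that the spelling of the name is irrelevant.
        if line.lower().startswith(prefix):
            return line[len(prefix):].strip()
    return None

headers = ['From: jack@cwi.nl\n', 'Subject: Hello\n']
results = [lookup(headers, n) for n in ('From', 'from', 'FROM')]
```

All three entries of \code{results} are the same string.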

\subsection{Message Objects}

A \code{Message} instance has the following methods:

\begin{funcdesc}{rewindbody}{}
Seek to the start of the message body.  This only works if the file
object is seekable.
\end{funcdesc}

\begin{funcdesc}{getallmatchingheaders}{name}
Return a list of lines consisting of all headers matching
\var{name}, if any.  Each physical line, whether it is a continuation
line or not, is a separate list item.  Return the empty list if no
header matches \var{name}.
\end{funcdesc}

\begin{funcdesc}{getfirstmatchingheader}{name}
Return a list of lines comprising the first header matching
\var{name}, and its continuation line(s), if any.  Return \code{None}
if there is no header matching \var{name}.
\end{funcdesc}
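
The difference between the two matching methods, including their
different ``no match'' results, can be sketched as below.  The
functions \code{allmatching} and \code{firstmatching} are hypothetical
stand-ins that operate on a plain list of header lines.  They assume
that a continuation line is one that starts with a space or a tab.

```python
def allmatching(headers, name):
    """All physical lines of every header matching name (case-insensitive)."""
    prefix = name.lower() + ':'
    result = []
    inheader = False
    for line in headers:
        # A line that does not start with whitespace begins a new header.
        if line[:1] not in (' ', '\t'):
            inheader = line.lower().startswith(prefix)
        if inheader:
            result.append(line)
    return result

def firstmatching(headers, name):
    """Lines of the first matching header only, or None if there is none."""
    prefix = name.lower() + ':'
    result = []
    for line in headers:
        if result:
            # Already inside the first match: keep only continuation lines.
            if line[:1] not in (' ', '\t'):
                break
            result.append(line)
        elif line.lower().startswith(prefix):
            result.append(line)
    return result or None

headers = ['Received: from a\n', '\tby b\n', 'Subject: Hi\n',
           'Received: from c\n']
every = allmatching(headers, 'received')
first = firstmatching(headers, 'received')
```

Here \code{every} contains three lines (both \code{Received} headers
and the continuation line), while \code{first} contains only the first
header and its continuation line.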

\begin{funcdesc}{getrawheader}{name}
Return a single string consisting of the text after the colon in the
first header matching \var{name}.  This includes leading whitespace,
the trailing linefeed, and internal linefeeds and whitespace if
any continuation line(s) were present.  Return \code{None} if there is
no header matching \var{name}.
\end{funcdesc}

\begin{funcdesc}{getheader}{name}
Like \code{getrawheader(\var{name})}, but strip leading and trailing
whitespace (but not internal whitespace).
\end{funcdesc}
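
To make the difference between the raw and the stripped value
concrete, the sketch below builds both from the physical lines of one
header.  The helper \code{rawvalue} is hypothetical; it only mirrors
the description above.

```python
def rawvalue(lines):
    """Text after the colon of a header given as its physical lines."""
    first = lines[0]
    # Keep everything after the first colon, then add continuation lines.
    return first[first.index(':') + 1:] + ''.join(lines[1:])

lines = ['Subject: a long\n', '\tsubject line\n']
raw = rawvalue(lines)    # leading space and trailing linefeed are kept
value = raw.strip()      # only the outer whitespace is removed
```

The internal linefeed and tab survive in \code{value}; only the
leading space and the final linefeed are gone.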

\begin{funcdesc}{getaddr}{name}
Return a pair (full name, email address) parsed from the string
returned by \code{getheader(\var{name})}.  If no header matching
\var{name} exists, return \code{(None, None)}; otherwise both the full
name and the address are (possibly empty) strings.

Example: If \code{m}'s first \code{From} header contains the string\\
\code{'jack@cwi.nl (Jack Jansen)'}, then
\code{m.getaddr('From')} will yield the pair
\code{('Jack Jansen', 'jack@cwi.nl')}.
If the header contained
\code{'Jack Jansen <jack@cwi.nl>'} instead, it would yield
exactly the same result.
\end{funcdesc}
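
The two address forms from the example can be tried out with
\code{email.utils.parseaddr} from the current standard library.  It is
used here purely as a stand-in: that it parses these strings the same
way as \code{getaddr} is an assumption of this sketch, not something
this section states.

```python
from email.utils import parseaddr

# Stand-in for getaddr(): parse the two header forms from the example.
first = parseaddr('jack@cwi.nl (Jack Jansen)')
second = parseaddr('Jack Jansen <jack@cwi.nl>')
```

Both calls return the same (full name, email address) pair.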

\begin{funcdesc}{getaddrlist}{name}
This is similar to \code{getaddr(\var{name})}, but parses a header
containing a list of email addresses (e.g. a \code{To} header) and
returns a list of (full name, email address) pairs (even if there was
only one address in the header).  If there is no header matching
\var{name}, return an empty list.

XXX The current version of this function is not really correct.  It
yields bogus results if a full name contains a comma.
\end{funcdesc}
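
The comma problem is easy to reproduce.  The sketch below assumes that
the failure comes from splitting the header on every comma, which is a
guess about the cause rather than a description of the module's code.
The function \code{splitaddresses} is a hypothetical alternative that
skips commas inside double quotes.

```python
header = '"Jansen, Jack" <jack@cwi.nl>, guido@cwi.nl'

# Naive approach (assumed failure mode): split on every comma.
naive = [part.strip() for part in header.split(',')]

def splitaddresses(text):
    """Split on commas that are outside double quotes."""
    parts, current, quoted = [], '', False
    for ch in text:
        if ch == '"':
            quoted = not quoted
        if ch == ',' and not quoted:
            parts.append(current.strip())
            current = ''
        else:
            current += ch
    parts.append(current.strip())
    return parts

careful = splitaddresses(header)
```

The naive split produces three pieces for two addresses, whereas
\code{careful} has two.  The sketch still ignores commas inside
parenthesized comments, so it is not a complete solution either.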

\begin{funcdesc}{getdate}{name}
Retrieve a header using \code{getheader} and parse it into a 9-tuple
compatible with \code{time.mktime()}.  If there is no header matching
\var{name}, or it is unparsable, return \code{None}.

Date parsing appears to be a black art, and not all mailers adhere to
the standard.  While this function has been tested and found correct
on a large collection of email from many sources, it may still
occasionally yield an incorrect result.
\end{funcdesc}
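
What ``a 9-tuple compatible with \code{time.mktime()}'' means in
practice can be shown with \code{email.utils.parsedate} from the
current standard library.  It serves only as a stand-in here; that it
behaves like \code{getdate} is an assumption of this sketch.

```python
import time
from email.utils import parsedate

parsed = parsedate('Fri, 17 Mar 1995 16:07:09 +0000')
# parsed is a 9-tuple; mktime() interprets its fields as local time.
seconds = time.mktime(parsed)

# An unparsable string gives None instead of a tuple.
bad = parsedate('not a date')
```

Note that this stand-in drops the time zone offset, so
\code{seconds} depends on the local time zone of the machine.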

\code{Message} instances also support a read-only mapping interface.
In particular: \code{m[name]} is the same as \code{m.getheader(name)};
and \code{len(m)}, \code{m.has_key(name)}, \code{m.keys()},
\code{m.values()} and \code{m.items()} act as expected (and
consistently).
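
A read-only mapping of this kind can be sketched with
\code{collections.abc.Mapping}.  The class \code{HeaderView} is
hypothetical.  Keeping only the first occurrence of each name and
skipping continuation lines are simplifying assumptions, and the
\code{in} operator plays the role of \code{has_key} here.

```python
from collections.abc import Mapping

class HeaderView(Mapping):
    """Read-only, case-insensitive view of header lines (hypothetical)."""

    def __init__(self, headers):
        self._values = {}
        for line in headers:
            # Assumption: continuation lines are ignored in this sketch.
            if line[:1] in (' ', '\t'):
                continue
            name, sep, value = line.partition(':')
            # Assumption: keep only the first occurrence of each name.
            if sep:
                self._values.setdefault(name.lower(), value.strip())

    def __getitem__(self, name):
        return self._values[name.lower()]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)

m = HeaderView(['From: jack@cwi.nl\n', 'Subject: Hello\n'])
```

Because \code{HeaderView} defines no item assignment, an attempt such
as \code{m['To'] = 'x'} raises \code{TypeError}, which is what makes
the mapping read-only.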

Finally, \code{Message} instances have two public instance variables:

\begin{datadesc}{headers}
A list containing the entire set of header lines, in the order in
which they were read.  Each line contains a trailing newline.  The
blank line terminating the headers is not contained in the list.
\end{datadesc}

\begin{datadesc}{fp}
The file object passed at instantiation time.
\end{datadesc}