\section{\module{mailbox} ---
         Read various mailbox formats}

\declaremodule{standard}{mailbox}
\modulesynopsis{Read various mailbox formats.}


This module defines a number of classes that allow easy and uniform
access to mail messages in a (\UNIX) mailbox.

\begin{classdesc}{UnixMailbox}{fp\optional{, factory}}
Access to a classic \UNIX-style mailbox, where all messages are
contained in a single file and separated by \samp{From }
(a.k.a.\ \samp{From_}) lines.  The file object \var{fp} points to the
mailbox file.  The optional \var{factory} parameter is a callable that
should create new message objects.  \var{factory} is called with one
argument, \var{fp}, by the \method{next()} method of the mailbox
object.  The default is the \class{rfc822.Message} class (see the
\refmodule{rfc822} module, and the note below).

For maximum portability, messages in a \UNIX-style mailbox are
separated by any line that begins exactly with the string
\code{'From '} (note the trailing space) if preceded by exactly two
newlines.  Because of the wide range of variations in practice, nothing
else on the \samp{From_} line should be considered.  However, the
current implementation doesn't check for the leading two newlines.
This is fine for most applications.

The \class{UnixMailbox} class implements a stricter version of
\samp{From_} line checking, using a regular expression that usually
matches \samp{From_} delimiters correctly.  It considers messages to be
separated by \samp{From \var{name} \var{time}} lines.  For maximum
portability, use the \class{PortableUnixMailbox} class instead.  This
class is identical to \class{UnixMailbox} except that individual
messages are separated by only \samp{From } lines.

For more information, see
\citetitle[http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html]{Configuring
Netscape Mail on \UNIX: Why the Content-Length Format is Bad}.
\end{classdesc}

\begin{classdesc}{PortableUnixMailbox}{fp\optional{, factory}}
A less-strict version of \class{UnixMailbox}, which considers only the
\samp{From } at the beginning of the line separating messages.  The
``\var{name} \var{time}'' portion of the From line is ignored, to
protect against some variations that are observed in practice.  This
works since lines in the message which begin with \code{'From '} are
quoted by mail handling software at delivery time.
\end{classdesc}
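
The difference between the strict and the portable delimiter rules can
be sketched in a few lines of Python.  This is an illustration only:
the patterns \code{STRICT_FROM} and \code{PORTABLE_FROM} and the helper
\function{split_messages()} are assumptions made for this example, not
the expressions used by the module, and the strict pattern accepts only
one common date layout.

```python
import re

# Assumed patterns for illustration; the module's own regular
# expression may differ.
PORTABLE_FROM = re.compile(r"^From ")
STRICT_FROM = re.compile(
    r"^From \S+\s+"                          # envelope sender ("name")
    r"[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d "  # weekday, month, day
    r"\d\d:\d\d(:\d\d)? "                    # time of day
    r"\d{4}\s*$"                             # four-digit year
)


def split_messages(lines, strict=False):
    """Group lines into messages, starting a new one at each delimiter."""
    pattern = STRICT_FROM if strict else PORTABLE_FROM
    messages = []
    for line in lines:
        if pattern.match(line):
            messages.append([line])          # delimiter opens a message
        elif messages:
            messages[-1].append(line)        # header or body line
    return messages


sample = [
    "From alice@example.com Thu Jan  1 00:00:00 1998\n",
    "Subject: first\n",
    "\n",
    "From now on, this line is unquoted body text.\n",
    "\n",
    "From bob@example.com Fri Jan  2 12:30:00 1998\n",
    "Subject: second\n",
    "\n",
    "Hello.\n",
]

portable_count = len(split_messages(sample))
strict_count = len(split_messages(sample, strict=True))
```

With this sample the portable rule treats the unquoted body line as a
delimiter and finds one message more than the strict rule does, which
is why the portable rule depends on such lines being quoted at
delivery time.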

\begin{classdesc}{MmdfMailbox}{fp\optional{, factory}}
Access an MMDF-style mailbox, where all messages are contained
in a single file and separated by lines consisting of four control-A
characters.  The file object \var{fp} points to the mailbox file.
The optional \var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}
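
As a rough sketch of the MMDF layout, the following code splits a list
of lines on the four control-A delimiter.  It assumes that every
message is bracketed by one delimiter line before it and one after it;
the names \code{MMDF_DELIMITER} and \function{split_mmdf()} are
hypothetical and are not part of the module.

```python
MMDF_DELIMITER = "\x01\x01\x01\x01\n"   # a line of four control-A characters


def split_mmdf(lines):
    """Return a list of messages, each a list of lines.

    Assumes each message has a delimiter line before and after it.
    """
    messages = []
    current = None                       # None means "between messages"
    for line in lines:
        if line == MMDF_DELIMITER:
            if current is None:
                current = []             # opening delimiter
            else:
                messages.append(current) # closing delimiter
                current = None
        elif current is not None:
            current.append(line)
    return messages


mmdf_lines = [
    MMDF_DELIMITER, "Subject: one\n", "\n", "Body one\n", MMDF_DELIMITER,
    MMDF_DELIMITER, "Subject: two\n", "\n", "Body two\n", MMDF_DELIMITER,
]
mmdf_messages = split_mmdf(mmdf_lines)
```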

\begin{classdesc}{MHMailbox}{dirname\optional{, factory}}
Access an MH mailbox, a directory with each message in a separate
file with a numeric name.
The name of the mailbox directory is passed in \var{dirname}.
\var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}

\begin{classdesc}{Maildir}{dirname\optional{, factory}}
Access a Qmail mail directory.  All new and current mail for the
mailbox specified by \var{dirname} is made available.
\var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}

\begin{classdesc}{BabylMailbox}{fp\optional{, factory}}
Access a Babyl mailbox, which is similar to an MMDF mailbox.  In
Babyl format, each message has two sets of headers, the
\emph{original} headers and the \emph{visible} headers.  The original
headers appear before a line containing only \code{'*** EOOH ***'}
(End-Of-Original-Headers) and the visible headers appear after the
\code{EOOH} line.  Babyl-compliant mail readers will show you only the
visible headers, and \class{BabylMailbox} objects will return messages
containing only the visible headers.  You'll have to do your own
parsing of the mailbox file to get at the original headers.  Mail
messages start with the EOOH line and end with a line containing only
\code{'\e{}037\e{}014'}.  \var{factory} is as with the
\class{UnixMailbox} class.
\end{classdesc}

Note that because the \refmodule{rfc822} module is deprecated, it is
recommended that you use the \refmodule{email} package to create
message objects from a mailbox.  (The default can't be changed for
backwards compatibility reasons.)  The safest way to do this is with
a bit of code:

\begin{verbatim}
import email
import email.Errors
import mailbox

def msgfactory(fp):
    try:
        return email.message_from_file(fp)
    except email.Errors.MessageParseError:
        # Don't return None since that will
        # stop the mailbox iterator
        return ''

mbox = mailbox.UnixMailbox(fp, msgfactory)
\end{verbatim}

The above wrapper is defensive against ill-formed MIME messages in the
mailbox, but you have to be prepared to receive the empty string from
the mailbox's \method{next()} method.  On the other hand, if you
know your mailbox contains only well-formed MIME messages, you can
simplify this to:

\begin{verbatim}
import email
import mailbox

mbox = mailbox.UnixMailbox(fp, email.message_from_file)
\end{verbatim}

\begin{seealso}
  \seetitle[http://www.qmail.org/man/man5/mbox.html]{mbox -
            file containing mail messages}{Description of the
            traditional ``mbox'' mailbox format.}
  \seetitle[http://www.qmail.org/man/man5/maildir.html]{maildir -
            directory for incoming mail messages}{Description of the
            ``maildir'' mailbox format.}
  \seetitle[http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html]{Configuring
            Netscape Mail on \UNIX: Why the Content-Length Format is
            Bad}{A description of problems with relying on the
            \mailheader{Content-Length} header for messages stored in
            mailbox files.}
\end{seealso}


\subsection{Mailbox Objects \label{mailbox-objects}}

All implementations of mailbox objects are iterable objects, and
have one externally visible method.  This method is used by iterators
created from mailbox objects and may also be used directly.

\begin{methoddesc}[mailbox]{next}{}
Return the next message in the mailbox, created with the optional
\var{factory} argument passed into the mailbox object's constructor.
By default this is an \class{rfc822.Message}
object (see the \refmodule{rfc822} module).  Depending on the mailbox
implementation, the \var{fp} attribute of this object may be a true
file object or a class instance simulating a file object, taking care
of things like message boundaries if multiple mail messages are
contained in a single file.  If no more messages are available,
this method returns \code{None}.
\end{methoddesc}
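
The \method{next()} protocol can be illustrated with a small stand-in
class.  \class{ListMailbox} and \function{tolerant_factory()} are
hypothetical names invented for this sketch; the stand-in hands raw
strings to the factory instead of file objects, and building the
iterator with \code{iter(self.next, None)} is an assumption about how
such an object could be made iterable, not a statement about the
module's internals.

```python
class ListMailbox:
    """Hypothetical stand-in that mimics the mailbox next() protocol."""

    def __init__(self, raw_messages, factory=None):
        self._pending = list(raw_messages)
        # Without a factory, hand back the raw text unchanged.
        self._factory = factory if factory is not None else (lambda raw: raw)

    def next(self):
        if not self._pending:
            return None                  # no more messages
        return self._factory(self._pending.pop(0))

    def __iter__(self):
        # Call next() repeatedly and stop when it returns None.
        return iter(self.next, None)


def tolerant_factory(raw):
    # Like the defensive wrapper shown earlier: return '' instead of
    # None for unusable input so that iteration does not stop early.
    if not raw.startswith("Subject:"):
        return ""
    return raw.strip()


box = ListMailbox(
    ["Subject: one\n", "garbage\n", "Subject: two\n"],
    tolerant_factory,
)
good = [msg for msg in box if msg != ""]
exhausted = box.next()
```

Because the empty string differs from \code{None}, the loop passes over
the unusable entry and keeps going; the caller only has to filter out
the empty strings.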