| \section{\module{mailbox} --- |
| Read various mailbox formats} |
| |
| \declaremodule{standard}{mailbox} |
| \modulesynopsis{Read various mailbox formats.} |
| |
| |
| This module defines a number of classes that allow easy and uniform |
| access to mail messages in a (\UNIX) mailbox. |
| |
| \begin{classdesc}{UnixMailbox}{fp\optional{, factory}} |
| Access to a classic \UNIX-style mailbox, where all messages are |
| contained in a single file and separated by \samp{From } |
| (a.k.a.\ \samp{From_}) lines. The file object \var{fp} points to the |
| mailbox file. The optional \var{factory} parameter is a callable that |
| should create new message objects. \var{factory} is called with one |
| argument, \var{fp} by the \method{next()} method of the mailbox |
| object. The default is the \class{rfc822.Message} class (see the |
| \refmodule{rfc822} module -- and the note below). |
| |
| For maximum portability, messages in a \UNIX-style mailbox are |
| separated by any line that begins exactly with the string \code{'From |
| '} (note the trailing space) if preceded by exactly two newlines. |
| Because of the wide-range of variations in practice, nothing else on |
| the From_ line should be considered. However, the current |
| implementation doesn't check for the leading two newlines. This is |
| usually fine for most applications. |
| |
| The \class{UnixMailbox} class implements a more strict version of |
| From_ line checking, using a regular expression that usually correctly |
| matched From_ delimiters. It considers delimiter line to be separated |
| by \samp{From \var{name} \var{time}} lines. For maximum portability, |
| use the \class{PortableUnixMailbox} class instead. This class is |
| identical to \class{UnixMailbox} except that individual messages are |
| separated by only \samp{From } lines. |
| |
| For more information, see |
| \citetitle[http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html]{Configuring |
| Netscape Mail on \UNIX: Why the Content-Length Format is Bad}. |
| \end{classdesc} |
| |
| \begin{classdesc}{PortableUnixMailbox}{fp\optional{, factory}} |
| A less-strict version of \class{UnixMailbox}, which considers only the |
| \samp{From } at the beginning of the line separating messages. The |
| ``\var{name} \var{time}'' portion of the From line is ignored, to |
| protect against some variations that are observed in practice. This |
| works since lines in the message which begin with \code{'From '} are |
| quoted by mail handling software at delivery-time. |
| \end{classdesc} |
| |
| \begin{classdesc}{MmdfMailbox}{fp\optional{, factory}} |
| Access an MMDF-style mailbox, where all messages are contained |
| in a single file and separated by lines consisting of 4 control-A |
| characters. The file object \var{fp} points to the mailbox file. |
| Optional \var{factory} is as with the \class{UnixMailbox} class. |
| \end{classdesc} |
| |
| \begin{classdesc}{MHMailbox}{dirname\optional{, factory}} |
| Access an MH mailbox, a directory with each message in a separate |
| file with a numeric name. |
| The name of the mailbox directory is passed in \var{dirname}. |
| \var{factory} is as with the \class{UnixMailbox} class. |
| \end{classdesc} |
| |
| \begin{classdesc}{Maildir}{dirname\optional{, factory}} |
| Access a Qmail mail directory. All new and current mail for the |
| mailbox specified by \var{dirname} is made available. |
| \var{factory} is as with the \class{UnixMailbox} class. |
| \end{classdesc} |
| |
| \begin{classdesc}{BabylMailbox}{fp\optional{, factory}} |
| Access a Babyl mailbox, which is similar to an MMDF mailbox. In |
| Babyl format, each message has two sets of headers, the |
| \emph{original} headers and the \emph{visible} headers. The original |
| headers appear before a a line containing only \code{'*** EOOH ***'} |
| (End-Of-Original-Headers) and the visible headers appear after the |
| \code{EOOH} line. Babyl-compliant mail readers will show you only the |
| visible headers, and \class{BabylMailbox} objects will return messages |
| containing only the visible headers. You'll have to do your own |
| parsing of the mailbox file to get at the original headers. Mail |
| messages start with the EOOH line and end with a line containing only |
| \code{'\e{}037\e{}014'}. \var{factory} is as with the |
| \class{UnixMailbox} class. |
| \end{classdesc} |
| |
| Note that because the \refmodule{rfc822} module is deprecated, it is |
| recommended that you use the \refmodule{email} package to create |
| message objects from a mailbox. (The default can't be changed for |
| backwards compatibility reasons.) The safest way to do this is with |
| bit of code: |
| |
| \begin{verbatim} |
| import email |
| import email.Errors |
| import mailbox |
| |
| def msgfactory(fp): |
| try: |
| return email.message_from_file(fp) |
| except email.Errors.MessageParseError: |
| # Don't return None since that will |
| # stop the mailbox iterator |
| return '' |
| |
| mbox = mailbox.UnixMailbox(fp, msgfactory) |
| \end{verbatim} |
| |
| The above wrapper is defensive against ill-formed MIME messages in the |
| mailbox, but you have to be prepared to receive the empty string from |
| the mailbox's \function{next()} method. On the other hand, if you |
| know your mailbox contains only well-formed MIME messages, you can |
| simplify this to: |
| |
| \begin{verbatim} |
| import email |
| import mailbox |
| |
| mbox = mailbox.UnixMailbox(fp, email.message_from_file) |
| \end{verbatim} |
| |
| \begin{seealso} |
| \seetitle[http://www.qmail.org/man/man5/mbox.html]{mbox - |
| file containing mail messages}{Description of the |
| traditional ``mbox'' mailbox format.} |
| \seetitle[http://www.qmail.org/man/man5/maildir.html]{maildir - |
| directory for incoming mail messages}{Description of the |
| ``maildir'' mailbox format.} |
| \seetitle[http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html]{Configuring |
| Netscape Mail on \UNIX: Why the Content-Length Format is |
| Bad}{A description of problems with relying on the |
| \mailheader{Content-Length} header for messages stored in |
| mailbox files.} |
| \end{seealso} |
| |
| |
| \subsection{Mailbox Objects \label{mailbox-objects}} |
| |
| All implementations of mailbox objects are iterable objects, and |
| have one externally visible method. This method is used by iterators |
| created from mailbox objects and may also be used directly. |
| |
| \begin{methoddesc}[mailbox]{next}{} |
| Return the next message in the mailbox, created with the optional |
| \var{factory} argument passed into the mailbox object's constructor. |
| By default this is an \class{rfc822.Message} |
| object (see the \refmodule{rfc822} module). Depending on the mailbox |
| implementation the \var{fp} attribute of this object may be a true |
| file object or a class instance simulating a file object, taking care |
| of things like message boundaries if multiple mail messages are |
| contained in a single file, etc. If no more messages are available, |
| this method returns \code{None}. |
| \end{methoddesc} |