\section{\module{mailbox} ---
         Read various mailbox formats}

\declaremodule{standard}{mailbox}
\modulesynopsis{Read various mailbox formats.}


This module defines a number of classes that allow easy and uniform
access to mail messages in a (\UNIX) mailbox.

\begin{classdesc}{UnixMailbox}{fp\optional{, factory}}
Access a classic \UNIX-style mailbox, where all messages are
contained in a single file and separated by \samp{From }
(a.k.a.\ \samp{From_}) lines. The file object \var{fp} points to the
mailbox file. The optional \var{factory} parameter is a callable that
should create new message objects. \var{factory} is called with one
argument, \var{fp}, by the \method{next()} method of the mailbox
object. The default is the \class{rfc822.Message} class (see the
\refmodule{rfc822} module -- and the note below).

\note{For reasons of this module's internal implementation, you will probably
want to open the \var{fp} object in binary mode. This is especially important
on Windows.}

For maximum portability, messages in a \UNIX-style mailbox are
separated by any line that begins exactly with the string
\code{'From '} (note the trailing space) if preceded by exactly two
newlines. Because of the wide range of variations in practice, nothing
else on the From_ line should be considered. However, the current
implementation doesn't check for the leading two newlines. This is
fine for most applications.

The \class{UnixMailbox} class implements a stricter version of
From_ line checking, using a regular expression that usually correctly
matches From_ delimiters. It considers messages to be separated
by \samp{From \var{name} \var{time}} lines. For maximum portability,
use the \class{PortableUnixMailbox} class instead, which is
identical to \class{UnixMailbox} except that individual messages are
separated by only \samp{From } lines.

For more information, see
\citetitle[http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html]{Configuring
Netscape Mail on \UNIX: Why the Content-Length Format is Bad}.
\end{classdesc}

\begin{classdesc}{PortableUnixMailbox}{fp\optional{, factory}}
A less-strict version of \class{UnixMailbox}, which considers only the
\samp{From } at the beginning of the line separating messages. The
``\var{name} \var{time}'' portion of the From line is ignored, to
protect against some variations that are observed in practice. This
works since lines in the message which begin with \code{'From '} are
quoted by mail handling software at delivery-time.
\end{classdesc}
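
The difference between the strict and the portable separator check can
be sketched in a few lines. This is an illustration under stated
assumptions, not the module's implementation: the two patterns, the
helper functions and the sample text are all hypothetical, and the
strict pattern only approximates a \samp{From \var{name} \var{time}}
line.

\begin{verbatim}
import re

# Hypothetical patterns, for illustration only.
# Portable: anything that starts with 'From ' is a separator.
PORTABLE = re.compile(r"From ")
# Strict: 'From name Www Mmm dd hh:mm:ss yyyy' (an approximation).
STRICT = re.compile(
    r"From \S+ +\w{3} +\w{3} +\d{1,2} +\d{2}:\d{2}:\d{2} +\d{4}$")

def is_separator(line, strict=False):
    """Return True if line looks like a message separator."""
    pattern = STRICT if strict else PORTABLE
    return pattern.match(line.rstrip("\r\n")) is not None

def split_messages(text, strict=False):
    """Split mailbox text into messages, dropping separator lines."""
    messages = []
    current = None
    for line in text.splitlines(True):
        if is_separator(line, strict):
            current = []
            messages.append(current)
        elif current is not None:
            current.append(line)
    return ["".join(lines) for lines in messages]

sample = (
    "From alice@example.com Mon Jan  1 00:00:00 2001\n"
    "Subject: a\n"
    "\n"
    "From here on, trouble.\n"
    "\n"
    "From bob@example.com Tue Jan  2 10:20:30 2001\n"
    "Subject: b\n"
    "\n"
    "body\n"
)
\end{verbatim}

With the strict pattern, the unquoted body line
\samp{From here on, trouble.} stays inside the first message, so the
sample splits into two messages. With the portable pattern the same
line starts a third message, which is why the portable approach relies
on the delivery software having quoted such lines.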

\begin{classdesc}{MmdfMailbox}{fp\optional{, factory}}
Access an MMDF-style mailbox, where all messages are contained
in a single file and separated by lines consisting of 4 control-A
characters. The file object \var{fp} points to the mailbox file.
Optional \var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}
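
As a rough illustration of the MMDF layout, the following sketch
splits mailbox text on lines made up of four control-A characters. It
is not how \class{MmdfMailbox} is implemented; the helper name is
hypothetical, and the sketch assumes that every message is both
preceded and followed by its own delimiter line.

\begin{verbatim}
# Four control-A (0x01) characters make up a delimiter line.
MMDF_DELIMITER = "\x01\x01\x01\x01"

def split_mmdf(text):
    """Return the messages found between pairs of delimiter lines.

    Assumption: each message has an opening and a closing delimiter.
    """
    messages = []
    current = None
    for line in text.splitlines(True):
        if line.rstrip("\r\n") == MMDF_DELIMITER:
            if current is None:
                current = []                      # opening delimiter
            else:
                messages.append("".join(current))
                current = None                    # closing delimiter
        elif current is not None:
            current.append(line)
    return messages
\end{verbatim}

Text outside a pair of delimiter lines is ignored by this sketch.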

\begin{classdesc}{MHMailbox}{dirname\optional{, factory}}
Access an MH mailbox, a directory with each message in a separate
file with a numeric name.
The name of the mailbox directory is passed in \var{dirname}.
\var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}

\begin{classdesc}{Maildir}{dirname\optional{, factory}}
Access a Qmail mail directory. All new and current mail for the
mailbox specified by \var{dirname} is made available.
\var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}

\begin{classdesc}{BabylMailbox}{fp\optional{, factory}}
Access a Babyl mailbox, which is similar to an MMDF mailbox. In
Babyl format, each message has two sets of headers, the
\emph{original} headers and the \emph{visible} headers. The original
headers appear before a line containing only \code{'*** EOOH ***'}
(End-Of-Original-Headers) and the visible headers appear after the
\code{EOOH} line. Babyl-compliant mail readers will show you only the
visible headers, and \class{BabylMailbox} objects will return messages
containing only the visible headers. You'll have to do your own
parsing of the mailbox file to get at the original headers. Mail
messages start with the EOOH line and end with a line containing only
\code{'\e{}037\e{}014'}. \var{factory} is as with the
\class{UnixMailbox} class.
\end{classdesc}

Note that because the \refmodule{rfc822} module is deprecated, it is
recommended that you use the \refmodule{email} package to create
message objects from a mailbox. (The default can't be changed for
backwards compatibility reasons.) The safest way to do this is with
this bit of code:

\begin{verbatim}
import email
import email.Errors
import mailbox

def msgfactory(fp):
    try:
        return email.message_from_file(fp)
    except email.Errors.MessageParseError:
        # Don't return None since that will
        # stop the mailbox iterator
        return ''

mbox = mailbox.UnixMailbox(fp, msgfactory)
\end{verbatim}

The above wrapper is defensive against ill-formed MIME messages in the
mailbox, but you have to be prepared to receive the empty string from
the mailbox's \method{next()} method. On the other hand, if you
know your mailbox contains only well-formed MIME messages, you can
simplify this to:

\begin{verbatim}
import email
import mailbox

mbox = mailbox.UnixMailbox(fp, email.message_from_file)
\end{verbatim}
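
When the defensive \function{msgfactory()} wrapper is used, the code
consuming the mailbox has to skip the empty strings it produces. One
way to do so is a small generator, sketched here. The helper name is
hypothetical, and the sketch relies only on the mailbox being
iterable:

\begin{verbatim}
def parsed_messages(mbox):
    """Yield only the messages that the factory managed to parse."""
    for msg in mbox:
        if msg == '':
            # Placeholder returned by msgfactory() for an
            # ill-formed message; skip it.
            continue
        yield msg
\end{verbatim}

A loop such as \code{for msg in parsed_messages(mbox):} then sees only
successfully parsed messages.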

\begin{seealso}
  \seetitle[http://www.qmail.org/man/man5/mbox.html]{mbox -
            file containing mail messages}{Description of the
            traditional ``mbox'' mailbox format.}
  \seetitle[http://www.qmail.org/man/man5/maildir.html]{maildir -
            directory for incoming mail messages}{Description of the
            ``maildir'' mailbox format.}
  \seetitle[http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html]{Configuring
            Netscape Mail on \UNIX: Why the Content-Length Format is
            Bad}{A description of problems with relying on the
            \mailheader{Content-Length} header for messages stored in
            mailbox files.}
\end{seealso}


\subsection{Mailbox Objects \label{mailbox-objects}}

All implementations of mailbox objects are iterable objects, and
have one externally visible method. This method is used by iterators
created from mailbox objects and may also be used directly.

\begin{methoddesc}[mailbox]{next}{}
Return the next message in the mailbox, created with the optional
\var{factory} argument passed into the mailbox object's constructor.
By default this is an \class{rfc822.Message}
object (see the \refmodule{rfc822} module). Depending on the mailbox
implementation the \var{fp} attribute of this object may be a true
file object or a class instance simulating a file object, taking care
of things like message boundaries if multiple mail messages are
contained in a single file, etc. If no more messages are available,
this method returns \code{None}.
\end{methoddesc}
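
The relationship between \method{next()} and iteration can be
illustrated with a toy class. \class{ListMailbox} below is a
hypothetical stand-in backed by a list, not one of the classes in this
module, and building the iterator with the two-argument form of
\function{iter()} is an assumption about how such an object might be
written. It shows why a \var{factory} that returns \code{None} would
end iteration early.

\begin{verbatim}
class ListMailbox:
    """Hypothetical stand-in for a mailbox object, backed by a list."""

    def __init__(self, messages, factory=None):
        self._messages = list(messages)
        self._index = 0
        # Without a factory, hand back the stored item unchanged.
        self._factory = factory or (lambda item: item)

    def next(self):
        """Return the next message, or None when none are left."""
        if self._index >= len(self._messages):
            return None
        item = self._messages[self._index]
        self._index += 1
        return self._factory(item)

    def __iter__(self):
        # Call self.next() repeatedly until it returns None.
        return iter(self.next, None)

box = ListMailbox(["first", "second"], factory=str.upper)
seen = list(box)
\end{verbatim}

After the loop is exhausted, further calls to \method{next()} keep
returning \code{None}.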