\section{\module{mailbox} ---
         Read various mailbox formats}

\declaremodule{standard}{mailbox}
\modulesynopsis{Read various mailbox formats.}


This module defines a number of classes that allow easy and uniform
access to mail messages in a (\UNIX) mailbox.

\begin{classdesc}{UnixMailbox}{fp\optional{, factory}}
Access to a classic \UNIX-style mailbox, where all messages are
contained in a single file and separated by \samp{From }
(a.k.a.\ \samp{From_}) lines. The file object \var{fp} points to the
mailbox file. The optional \var{factory} parameter is a callable that
should create new message objects. \var{factory} is called with one
argument, \var{fp}, by the \method{next()} method of the mailbox
object. The default is the \class{rfc822.Message} class (see the
\refmodule{rfc822} module -- and the note below).

\begin{notice}
  For reasons of this module's internal implementation, you will
  probably want to open the \var{fp} object in binary mode. This is
  especially important on Windows.
\end{notice}

For maximum portability, messages in a \UNIX-style mailbox are
separated by any line that begins exactly with the string
\code{'From '} (note the trailing space) if preceded by exactly two
newlines. Because of the wide range of variations in practice, nothing
else on the From_ line should be considered. However, the current
implementation doesn't check for the leading two newlines. This is
usually fine for most applications.

The \class{UnixMailbox} class implements a stricter version of
From_ line checking, using a regular expression that usually correctly
matches From_ delimiters. It considers messages to be separated
by \samp{From \var{name} \var{time}} lines. For maximum portability,
use the \class{PortableUnixMailbox} class instead. This class is
identical to \class{UnixMailbox} except that individual messages are
separated by only \samp{From } lines.

For more information, see
\citetitle[http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html]{Configuring
Netscape Mail on \UNIX: Why the Content-Length Format is Bad}.
\end{classdesc}

\begin{classdesc}{PortableUnixMailbox}{fp\optional{, factory}}
A less-strict version of \class{UnixMailbox}, which considers only the
\samp{From } at the beginning of the line separating messages. The
``\var{name} \var{time}'' portion of the From line is ignored, to
protect against some variations that are observed in practice. This
works since lines in the message which begin with \code{'From '} are
quoted by mail handling software at delivery-time.
\end{classdesc}
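The relaxed delimiter rule can be illustrated without the module
itself. The following is a minimal sketch, not the module's
implementation: the helper \function{split_on_from_lines()} and the
sample text are hypothetical, the code is written for Python 3, and it
assumes the whole mailbox fits in a string.

```python
import io


def split_on_from_lines(text):
    """Split mbox-style text into messages at lines starting with 'From '.

    Hypothetical helper for illustration only. Just the 'From ' prefix
    is examined; the rest of the delimiter line is ignored.
    """
    messages = []
    current = None
    for line in io.StringIO(text):
        if line.startswith('From '):
            # A delimiter line closes the previous message, if any,
            # and opens a new one.
            if current is not None:
                messages.append(''.join(current))
            current = [line]
        elif current is not None:
            current.append(line)
        # Lines that precede the first delimiter are ignored.
    if current is not None:
        messages.append(''.join(current))
    return messages


# Made-up sample: the body line of the second message was quoted with
# '>' at delivery time, so it is not mistaken for a delimiter.
sample = (
    'From alice Mon Jan  1 00:00:00 2001\n'
    'Subject: first\n'
    '\n'
    'Hello\n'
    '\n'
    'From bob Tue Jan  2 00:00:00 2001\n'
    'Subject: second\n'
    '\n'
    '>From here on, this body line was quoted.\n'
)
messages = split_on_from_lines(sample)
```

Because only the prefix is tested, unusual \var{name} or \var{time}
fields on the delimiter line do not affect where messages are split.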

\begin{classdesc}{MmdfMailbox}{fp\optional{, factory}}
Access an MMDF-style mailbox, where all messages are contained
in a single file and separated by lines consisting of 4 control-A
characters. The file object \var{fp} points to the mailbox file.
Optional \var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}
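As an illustration of the MMDF delimiter, the sketch below splits a
string at lines made up of four control-A characters. It is not the
module's implementation: the name \function{split_mmdf()} and the
sample data are hypothetical, and the code is written for Python 3.

```python
import io

# Four control-A (ASCII 1) characters form the delimiter line.
MMDF_DELIMITER = '\001\001\001\001'


def split_mmdf(text):
    """Split MMDF-style text at delimiter lines (hypothetical helper)."""
    messages = []
    current = []
    for line in io.StringIO(text):
        if line.rstrip('\r\n') == MMDF_DELIMITER:
            # A delimiter ends the message collected so far; runs of
            # consecutive delimiters produce no empty messages.
            if current:
                messages.append(''.join(current))
                current = []
        else:
            current.append(line)
    if current:
        messages.append(''.join(current))
    return messages


# Made-up sample in which each message is bracketed by delimiter lines.
sample = (
    '\001\001\001\001\n'
    'Subject: first\n'
    '\n'
    'Hello\n'
    '\001\001\001\001\n'
    '\001\001\001\001\n'
    'Subject: second\n'
    '\n'
    'World\n'
    '\001\001\001\001\n'
)
messages = split_mmdf(sample)
```

Treating every delimiter line as a separator and discarding empty
chunks keeps the sketch indifferent to whether a file brackets each
message with delimiters or places a single delimiter between messages.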

\begin{classdesc}{MHMailbox}{dirname\optional{, factory}}
Access an MH mailbox, a directory with each message in a separate
file with a numeric name.
The name of the mailbox directory is passed in \var{dirname}.
\var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}

\begin{classdesc}{Maildir}{dirname\optional{, factory}}
Access a Qmail mail directory. All new and current mail for the
mailbox specified by \var{dirname} is made available.
\var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}

\begin{classdesc}{BabylMailbox}{fp\optional{, factory}}
Access a Babyl mailbox, which is similar to an MMDF mailbox. In
Babyl format, each message has two sets of headers, the
\emph{original} headers and the \emph{visible} headers. The original
headers appear before a line containing only \code{'*** EOOH ***'}
(End-Of-Original-Headers) and the visible headers appear after the
\code{EOOH} line. Babyl-compliant mail readers will show you only the
visible headers, and \class{BabylMailbox} objects will return messages
containing only the visible headers. You'll have to do your own
parsing of the mailbox file to get at the original headers. Mail
messages start with the EOOH line and end with a line containing only
\code{'\e{}037\e{}014'}. \var{factory} is as with the
\class{UnixMailbox} class.
\end{classdesc}
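Recovering the original headers of a Babyl message means parsing the
raw text yourself. The sketch below shows one possible approach for a
single message that has already been cut out of the mailbox file. The
helper \function{split_babyl_headers()} and the sample are
hypothetical, and the sample is simplified: real Babyl files may
contain further lines that this section does not describe.

```python
EOOH = '*** EOOH ***'


def split_babyl_headers(raw):
    """Return (original_headers, visible_part) for one raw message.

    Hypothetical helper: everything before the first line that
    consists only of the EOOH marker is treated as the original
    headers, everything after it as the visible headers and body.
    """
    before = []
    after = []
    seen_eooh = False
    # Split on '\n' only, so control characters such as form feed are
    # never treated as line boundaries.
    for line in raw.split('\n'):
        if not seen_eooh and line == EOOH:
            seen_eooh = True
        elif seen_eooh:
            after.append(line)
        else:
            before.append(line)
    return '\n'.join(before), '\n'.join(after)


# Made-up, simplified message text.
raw_message = (
    'Received: by relay.example.org\n'
    'Subject: raw subject\n'
    '*** EOOH ***\n'
    'Subject: shown subject\n'
    '\n'
    'Body text\n'
)
original, visible = split_babyl_headers(raw_message)
```

If the marker line is absent, the sketch returns the whole text as the
first element and an empty string as the second, so callers can detect
that case.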

Note that because the \refmodule{rfc822} module is deprecated, it is
recommended that you use the \refmodule{email} package to create
message objects from a mailbox. (The default can't be changed for
backwards compatibility reasons.) The safest way to do this is with
a bit of code:

\begin{verbatim}
import email
import email.Errors
import mailbox

def msgfactory(fp):
    try:
        return email.message_from_file(fp)
    except email.Errors.MessageParseError:
        # Don't return None since that will
        # stop the mailbox iterator
        return ''

mbox = mailbox.UnixMailbox(fp, msgfactory)
\end{verbatim}

The above wrapper is defensive against ill-formed MIME messages in the
mailbox, but you have to be prepared to receive the empty string from
the mailbox's \method{next()} method. On the other hand, if you
know your mailbox contains only well-formed MIME messages, you can
simplify this to:

\begin{verbatim}
import email
import mailbox

mbox = mailbox.UnixMailbox(fp, email.message_from_file)
\end{verbatim}

\begin{seealso}
  \seetitle[http://www.qmail.org/man/man5/mbox.html]{mbox -
            file containing mail messages}{Description of the
            traditional ``mbox'' mailbox format.}
  \seetitle[http://www.qmail.org/man/man5/maildir.html]{maildir -
            directory for incoming mail messages}{Description of the
            ``maildir'' mailbox format.}
  \seetitle[http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html]{Configuring
            Netscape Mail on \UNIX: Why the Content-Length Format is
            Bad}{A description of problems with relying on the
            \mailheader{Content-Length} header for messages stored in
            mailbox files.}
\end{seealso}


\subsection{Mailbox Objects \label{mailbox-objects}}

All implementations of mailbox objects are iterable objects, and
have one externally visible method. This method is used by iterators
created from mailbox objects and may also be used directly.

\begin{methoddesc}[mailbox]{next}{}
Return the next message in the mailbox, created with the optional
\var{factory} argument passed into the mailbox object's constructor.
By default this is an \class{rfc822.Message}
object (see the \refmodule{rfc822} module). Depending on the mailbox
implementation the \var{fp} attribute of this object may be a true
file object or a class instance simulating a file object, taking care
of things like message boundaries if multiple mail messages are
contained in a single file, etc. If no more messages are available,
this method returns \code{None}.
\end{methoddesc}
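When \method{next()} is called directly, the caller has to tell the
\code{None} end marker apart from an empty-string placeholder returned
by a defensive factory. The loop below is a sketch of that pattern.
The \class{FakeMailbox} class is a hypothetical stand-in that only
mimics the documented \method{next()} behaviour, so that the example
runs without a mailbox file, and \function{collect_messages()} is
likewise an illustrative name.

```python
class FakeMailbox:
    """Hypothetical stand-in mimicking the documented next() protocol."""

    def __init__(self, messages):
        self._messages = list(messages)

    def next(self):
        # Return the next message, or None once the mailbox is exhausted.
        if self._messages:
            return self._messages.pop(0)
        return None


def collect_messages(mbox):
    """Drain a mailbox-like object, skipping empty-string placeholders."""
    collected = []
    skipped = 0
    while True:
        msg = mbox.next()
        if msg is None:
            # Documented end-of-mailbox signal.
            break
        if msg == '':
            # Placeholder produced by a factory that could not parse
            # the message; count it and carry on.
            skipped += 1
            continue
        collected.append(msg)
    return collected, skipped


collected, skipped = collect_messages(FakeMailbox(['first', '', 'second']))
```

Comparing against \code{None} explicitly, rather than testing for
truth, is what keeps the empty-string placeholder from ending the loop
early.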