\section{\module{mailbox} ---
         Read various mailbox formats}

\declaremodule{standard}{mailbox}
\modulesynopsis{Read various mailbox formats.}


This module defines a number of classes that allow easy and uniform
access to mail messages in a (\UNIX) mailbox.

\begin{classdesc}{UnixMailbox}{fp\optional{, factory}}
Access to a classic \UNIX-style mailbox, where all messages are
contained in a single file and separated by \samp{From }
(a.k.a.\ \samp{From_}) lines. The file object \var{fp} points to the
mailbox file. The optional \var{factory} parameter is a callable that
should create new message objects. \var{factory} is called with one
argument, \var{fp}, by the \method{next()} method of the mailbox
object. The default is the \class{rfc822.Message} class (see the
\refmodule{rfc822} module -- and the note below).

For maximum portability, messages in a \UNIX-style mailbox are
separated by any line that begins exactly with the string
\code{'From '} (note the trailing space) if preceded by exactly two
newlines. Because of the wide range of variations in practice, nothing
else on the From_ line should be considered. However, the current
implementation doesn't check for the leading two newlines. This is
fine for most applications.

The \class{UnixMailbox} class implements a stricter version of
From_ line checking, using a regular expression that usually matches
From_ delimiters correctly. It considers messages to be separated
by \samp{From \var{name} \var{time}} lines. For maximum portability,
use the \class{PortableUnixMailbox} class instead. This class is
identical to \class{UnixMailbox} except that individual messages are
separated by only \samp{From } lines.

For more information, see
\citetitle[http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html]{Configuring
Netscape Mail on \UNIX: Why the Content-Length Format is Bad}.
\end{classdesc}

\begin{classdesc}{PortableUnixMailbox}{fp\optional{, factory}}
A less-strict version of \class{UnixMailbox}, which considers only the
\samp{From } at the beginning of the line separating messages. The
``\var{name} \var{time}'' portion of the From_ line is ignored, to
protect against some variations that are observed in practice. This
works because lines in the message which begin with \code{'From '} are
quoted by mail handling software at delivery time.
\end{classdesc}
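
The difference between the strict and the portable delimiter rule can
be sketched in a few lines of Python. This is an illustration only:
\code{STRICT\_FROM} is a hypothetical pattern written for this example,
not the regular expression used by this module, and the helper
functions are likewise invented here.

```python
import re

# Hypothetical strict pattern: "From <name> <weekday> <month> <day> <time> <year>".
# An illustrative guess, NOT the expression used by the mailbox module.
STRICT_FROM = re.compile(
    r"From \S+ +[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]?\d \d\d:\d\d(:\d\d)? \d{4}\s*$"
)


def is_portable_delimiter(line):
    """Portable rule: any line starting with 'From ' separates messages."""
    return line.startswith("From ")


def is_strict_delimiter(line):
    """Strict rule: the line must also look like 'From name time'."""
    return STRICT_FROM.match(line) is not None


def split_messages(text, is_delimiter):
    """Split mailbox text into messages, each starting at a delimiter line."""
    messages = []
    current = None
    for line in text.splitlines(True):
        if is_delimiter(line):
            if current is not None:
                messages.append("".join(current))
            current = [line]
        elif current is not None:
            current.append(line)
    if current is not None:
        messages.append("".join(current))
    return messages


# The body line "From here on ..." was deliberately left unquoted.
SAMPLE = (
    "From alice@example.com Mon Jan  1 10:00:00 2001\n"
    "Subject: first\n"
    "\n"
    "From here on the body is unquoted.\n"
    "\n"
    "From bob@example.com Tue Jan  2 11:30:00 2001\n"
    "Subject: second\n"
    "\n"
    "Body two.\n"
)

strict_count = len(split_messages(SAMPLE, is_strict_delimiter))
portable_count = len(split_messages(SAMPLE, is_portable_delimiter))
```

With this sample the strict rule finds two messages, while the portable
rule finds three because it also splits at the unquoted body line. This
is why the portable rule depends on delivery software quoting such
lines.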

\begin{classdesc}{MmdfMailbox}{fp\optional{, factory}}
Access an MMDF-style mailbox, where all messages are contained
in a single file and separated by lines consisting of four control-A
characters. The file object \var{fp} points to the mailbox file.
Optional \var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}
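
As a rough sketch of how such a file can be split, the following
assumes that every message is bracketed by one separator line before
it and one after it. That bracketing, and the name
\code{split\_mmdf}, are assumptions made for this example rather than
a description of how \class{MmdfMailbox} is implemented.

```python
SEPARATOR = "\x01\x01\x01\x01"  # four control-A characters


def split_mmdf(text):
    """Return the messages found between pairs of separator lines.

    Assumes each message has a separator line before and after it.
    """
    messages = []
    current = None  # None means "not inside a message"
    for line in text.splitlines(True):
        if line.rstrip("\r\n") == SEPARATOR:
            if current is None:
                current = []  # opening separator
            else:
                messages.append("".join(current))
                current = None  # closing separator
        elif current is not None:
            current.append(line)
    return messages


SAMPLE_MMDF = (
    "\x01\x01\x01\x01\n"
    "Subject: a\n\nBody one\n"
    "\x01\x01\x01\x01\n"
    "\x01\x01\x01\x01\n"
    "Subject: b\n\nBody two\n"
    "\x01\x01\x01\x01\n"
)

mmdf_messages = split_mmdf(SAMPLE_MMDF)
```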

\begin{classdesc}{MHMailbox}{dirname\optional{, factory}}
Access an MH mailbox, a directory with each message in a separate
file with a numeric name.
The name of the mailbox directory is passed in \var{dirname}.
\var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}
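
The directory layout can be illustrated with a short sketch that
selects the numerically named files and orders them by number. The
function names are hypothetical, and sorting by numeric value is an
assumption of this example, not a documented guarantee of
\class{MHMailbox}.

```python
import os
import re
import tempfile


def list_mh_messages(dirname):
    """Return paths of numerically named files in dirname, in numeric order."""
    names = [n for n in os.listdir(dirname) if re.fullmatch(r"[0-9]+", n)]
    names.sort(key=int)  # numeric order, so "2" sorts before "10"
    return [os.path.join(dirname, n) for n in names]


def demo():
    """Build a throwaway MH-like directory and list its message files."""
    with tempfile.TemporaryDirectory() as dirname:
        # ".mh_sequences" stands in for a non-message file to be skipped.
        for name in ("10", "2", "1", ".mh_sequences"):
            with open(os.path.join(dirname, name), "w") as f:
                f.write("Subject: message %s\n\nbody\n" % name)
        return [os.path.basename(p) for p in list_mh_messages(dirname)]


mh_names = demo()
```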

\begin{classdesc}{Maildir}{dirname\optional{, factory}}
Access a Qmail mail directory. All new and current mail for the
mailbox specified by \var{dirname} is made available.
\var{factory} is as with the \class{UnixMailbox} class.
\end{classdesc}

\begin{classdesc}{BabylMailbox}{fp\optional{, factory}}
Access a Babyl mailbox, which is similar to an MMDF mailbox. In
Babyl format, each message has two sets of headers, the
\emph{original} headers and the \emph{visible} headers. The original
headers appear before a line containing only \code{'*** EOOH ***'}
(End-Of-Original-Headers) and the visible headers appear after the
\code{EOOH} line. Babyl-compliant mail readers will show you only the
visible headers, and \class{BabylMailbox} objects will return messages
containing only the visible headers. You'll have to do your own
parsing of the mailbox file to get at the original headers. Mail
messages start with the EOOH line and end with a line containing only
\code{'\e{}037\e{}014'}. \var{factory} is as with the
\class{UnixMailbox} class.
\end{classdesc}

Note that because the \refmodule{rfc822} module is deprecated, it is
recommended that you use the \refmodule{email} package to create
message objects from a mailbox. (The default can't be changed for
backwards compatibility reasons.) The safest way to do this is with
the following bit of code:

\begin{verbatim}
import email
import email.Errors
import mailbox

def msgfactory(fp):
    try:
        return email.message_from_file(fp)
    except email.Errors.MessageParseError:
        # Don't return None since that will
        # stop the mailbox iterator
        return ''

mbox = mailbox.UnixMailbox(fp, msgfactory)
\end{verbatim}

The above wrapper is defensive against ill-formed MIME messages in the
mailbox, but you have to be prepared to receive the empty string from
the mailbox's \method{next()} method. On the other hand, if you
know your mailbox contains only well-formed MIME messages, you can
simplify this to:

\begin{verbatim}
import email
import mailbox

mbox = mailbox.UnixMailbox(fp, email.message_from_file)
\end{verbatim}

\begin{seealso}
  \seetitle[http://www.qmail.org/man/man5/mbox.html]{mbox -
            file containing mail messages}{Description of the
            traditional ``mbox'' mailbox format.}
  \seetitle[http://www.qmail.org/man/man5/maildir.html]{maildir -
            directory for incoming mail messages}{Description of the
            ``maildir'' mailbox format.}
  \seetitle[http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html]{Configuring
            Netscape Mail on \UNIX: Why the Content-Length Format is
            Bad}{A description of problems with relying on the
            \mailheader{Content-Length} header for messages stored in
            mailbox files.}
\end{seealso}


\subsection{Mailbox Objects \label{mailbox-objects}}

All implementations of mailbox objects are iterable objects, and
have one externally visible method. This method is used by iterators
created from mailbox objects and may also be used directly.

\begin{methoddesc}[mailbox]{next}{}
Return the next message in the mailbox, created with the optional
\var{factory} argument passed into the mailbox object's constructor.
By default this is an \class{rfc822.Message}
object (see the \refmodule{rfc822} module). Depending on the mailbox
implementation, the \var{fp} attribute of this object may be a true
file object or a class instance simulating a file object, taking care
of things like message boundaries if multiple mail messages are
contained in a single file, etc. If no more messages are available,
this method returns \code{None}.
\end{methoddesc}
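
The relationship between \method{next()} and iteration can be
sketched with a small stand-in class. \class{ListMailbox} is
hypothetical and exists only for this example. It assumes that
iteration simply calls \method{next()} until \code{None} is returned,
which is consistent with the behaviour described above but is not
taken from the module's source.

```python
class ListMailbox:
    """Hypothetical stand-in for a mailbox object; not part of the module."""

    def __init__(self, messages):
        self._messages = list(messages)
        self._index = 0

    def next(self):
        """Return the next message, or None when no messages are left."""
        if self._index >= len(self._messages):
            return None
        message = self._messages[self._index]
        self._index += 1
        return message

    def __iter__(self):
        # Call next() repeatedly until it returns the None sentinel.
        return iter(self.next, None)


# An empty string does not end iteration; only None does.
box = ListMailbox(["first", "", "third"])
seen = [message for message in box]
after_end = box.next()
```

Because only \code{None} ends the loop, a \var{factory} that returns
the empty string for an unparseable message lets iteration continue
past it, whereas one that returns \code{None} would stop at that
message.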