% Copyright (C) 2001 Python Software Foundation
% Author: barry@zope.com (Barry Warsaw)

\section{\module{email} --
 An email and MIME handling package}

\declaremodule{standard}{email}
\modulesynopsis{Package supporting the parsing, manipulation, and
 generation of email messages, including MIME documents.}
\moduleauthor{Barry A. Warsaw}{barry@zope.com}

\versionadded{2.2}

The \module{email} package is a library for managing email messages,
including MIME and other \rfc{2822}-based message documents. It
subsumes most of the functionality in several older standard modules
such as \module{rfc822}, \module{mimetools}, and \module{multifile}, as
well as non-standard packages such as \module{mimecntl}.

The primary distinguishing feature of the \module{email} package is
that it splits the parsing and generating of email messages from the
internal \emph{object model} representation of email. Applications
using the \module{email} package deal primarily with objects; you can
add sub-objects to messages, remove sub-objects from messages,
completely re-arrange the contents, etc. There is a separate parser
and a separate generator which handle the transformation from flat
text to the object model, and then back to flat text again. There
are also handy subclasses for some common MIME object types, and a few
miscellaneous utilities that help with such common tasks as extracting
and parsing message field values, creating RFC-compliant dates, etc.

The following sections describe the functionality of the
\module{email} package. The ordering follows a progression that
should be common in applications: an email message is read as flat
text from a file or other source, the text is parsed to produce an
object model representation of the email message, this model is
manipulated, and finally the model is rendered back into
flat text.

It is also possible to create the object model out of whole cloth,
i.e. completely from scratch. From there, the same progression of
manipulating the model and rendering it as flat text applies.

Also included are detailed specifications of all the classes and
modules that the \module{email} package provides, the exception
classes you might encounter while using the \module{email} package,
some auxiliary utilities, and a few examples. For users of the older
\module{mimelib} package, from which the \module{email} package is
descended, a section on differences and porting is provided.

\subsection{Representing an email message}

The primary object in the \module{email} package is the
\class{Message} class, provided in the \refmodule{email.Message}
module. \class{Message} is the base class for the \module{email}
object model. It provides the core functionality for setting and
querying header fields, and for accessing message bodies.

Conceptually, a \class{Message} object consists of \emph{headers} and
\emph{payloads}. Headers are \rfc{2822} style field names and
values where the field name and value are separated by a colon. The
colon is not part of either the field name or the field value.

Headers are stored and returned in case-preserving form but are
matched case-insensitively. There may also be a single
\emph{Unix-From} header, also known as the envelope header or the
\code{From_} header. The payload is either a string in the case of
simple message objects, a list of \class{Message} objects for
multipart MIME documents, or a single \class{Message} instance for
\code{message/rfc822} type objects.

\class{Message} objects provide a mapping style interface for
accessing the message headers, and an explicit interface for accessing
both the headers and the payload. They provide convenience methods for
generating a flat text representation of the message object tree, for
accessing commonly used header parameters, and for recursively walking
over the object tree.
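
As a rough illustration of the mapping style interface and the string
payload of a simple message, consider the sketch below. The fallback
import of \module{email.message} is an assumption for later Python
releases, where the module name is lowercase; the addresses and header
values are made up for the example.

```python
# Assumption: later Python releases spell the module name in lowercase;
# the capitalized name is the one described in this document.
try:
    from email.Message import Message
except ImportError:
    from email.message import Message

msg = Message()

# Mapping-style interface: each assignment appends a header field.
msg['Subject'] = 'Status report'
msg['To'] = 'recipient@example.com'

# Lookup is case-insensitive, but the stored field name keeps its case.
subject = msg['subject']
names = msg.keys()

# A simple (non-multipart) message carries a string payload.
msg.set_payload('All systems nominal.\n')
body = msg.get_payload()
```

Note that the header was stored as \code{Subject} yet found under
\code{subject}, while \method{keys()} still reports the original
spelling.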

\subsection{Parsing email messages}
Message object trees can be created in one of two ways: they can be
created from whole cloth by instantiating \class{Message} objects and
stringing them together via \method{add_payload()} and
\method{set_payload()} calls, or they can be created by parsing a flat text
representation of the email message.

The \module{email} package provides a standard parser that understands
most email document structures, including MIME documents. You can
pass the parser a string or a file object, and the parser will return
to you the root \class{Message} instance of the object tree. For
simple, non-MIME messages the payload of this root object will likely
be a string (e.g. containing the text of the message). For MIME
messages, the root object will return 1 from its
\method{is_multipart()} method, and the subparts can be accessed via
the \method{get_payload()} and \method{walk()} methods.

Note that the parser can be extended in limited ways, and of course
you can implement your own parser completely from scratch. There is
no magical connection between the \module{email} package's bundled
parser and the \class{Message} class, so your custom parser can create
message object trees in any way it finds necessary. The \module{email}
package's parser is described in detail in the
\refmodule{email.Parser} module documentation.
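
A minimal sketch of parsing a multipart document and reaching its
subparts might look as follows. The message text, the boundary string
\code{XXXX}, and the variable names are invented for the example; only
\function{message_from_string()}, \method{is_multipart()},
\method{get_payload()}, and \method{walk()} come from the package.

```python
import email

# A small two-part MIME document, written out as flat text.
text = (
    'From: sender@example.com\n'
    'Subject: Two parts\n'
    'MIME-Version: 1.0\n'
    'Content-Type: multipart/mixed; boundary="XXXX"\n'
    '\n'
    '--XXXX\n'
    'Content-Type: text/plain\n'
    '\n'
    'first part\n'
    '--XXXX\n'
    'Content-Type: text/plain\n'
    '\n'
    'second part\n'
    '--XXXX--\n'
)

# Parse the text; the result is the root of the message object tree.
root = email.message_from_string(text)

bodies = []
if root.is_multipart():
    # For a multipart message the payload is a list of subparts.
    for part in root.get_payload():
        bodies.append(part.get_payload())

# walk() visits the root first and then every subpart in turn.
walked = 0
for part in root.walk():
    walked += 1
```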

\subsection{Generating MIME documents}
One of the most common tasks is to generate the flat text of the email
message represented by a message object tree. You will need to do
this if you want to send your message via the \refmodule{smtplib}
module or the \refmodule{nntplib} module, or print the message on the
console. Taking a message object tree and producing a flat text
document is the job of the \refmodule{email.Generator} module.

Again, as with the \refmodule{email.Parser} module, you aren't limited
to the functionality of the bundled generator; you could write one
from scratch yourself. However, the bundled generator knows how to
generate most email in a standards-compliant way, should handle MIME
and non-MIME email messages just fine, and is designed so that the
transformation from flat text, to an object tree via the
\class{Parser} class, and back to flat text, is idempotent (the input
is identical to the output).
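
The round trip described above can be sketched with a very simple
message. This example relies on the \method{as_string()} convenience
method of \class{Message} rather than driving a generator object
directly, and the message text is invented for the illustration;
messages with unusual formatting may not survive the trip unchanged.

```python
import email

text = (
    'From: sender@example.com\n'
    'To: recipient@example.com\n'
    'Subject: Round trip\n'
    '\n'
    'A short body.\n'
)

# Flat text -> object tree.
msg = email.message_from_string(text)

# Object tree -> flat text, using the Message convenience method.
flat = msg.as_string()

# For this simple message the output matches the input.
round_trip_ok = (flat == text)
```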

\subsection{Creating email and MIME objects from scratch}

Ordinarily, you get a message object tree by passing some text to a
parser, which parses the text and returns the root of the message
object tree. However, you can also build a complete object tree from
scratch, or even individual \class{Message} objects by hand. In fact,
you can also take an existing tree and add new \class{Message}
objects, move them around, etc. This makes a very convenient
interface for slicing-and-dicing MIME messages.

You can create a new object tree by creating \class{Message}
instances, adding payloads and all the appropriate headers manually.
For MIME messages though, the \module{email} package provides some
convenient classes to make things easier. Each of these classes
should be imported from a module with the same name as the class, from
within the \module{email} package. E.g.:

\begin{verbatim}
import email.MIMEImage          # the class is then email.MIMEImage.MIMEImage
\end{verbatim}

or

\begin{verbatim}
from email.MIMEText import MIMEText
\end{verbatim}

Here are the classes:

\begin{classdesc}{MIMEBase}{_maintype, _subtype, **_params}
This is the base class for all the MIME-specific subclasses of
\class{Message}. Ordinarily you won't create instances specifically
of \class{MIMEBase}, although you could. \class{MIMEBase} is provided
primarily as a convenient base class for more specific MIME-aware
subclasses.

\var{_maintype} is the \code{Content-Type:} major type (e.g. \code{text} or
\code{image}), and \var{_subtype} is the \code{Content-Type:} minor type
(e.g. \code{plain} or \code{gif}). \var{_params} is a parameter
key/value dictionary and is passed directly to
\method{Message.add_header()}.

The \class{MIMEBase} class always adds a \code{Content-Type:} header
(based on \var{_maintype}, \var{_subtype}, and \var{_params}), and a
\code{MIME-Version:} header (always set to \code{1.0}).
\end{classdesc}

\begin{classdesc}{MIMEImage}{_imagedata\optional{, _subtype\optional{,
 _encoder\optional{, **_params}}}}

A subclass of \class{MIMEBase}, the \class{MIMEImage} class is used to
create MIME message objects of major type \code{image}.
\var{_imagedata} is a string containing the raw image data. If this
data can be decoded by the standard Python module \refmodule{imghdr},
then the subtype will be automatically included in the
\code{Content-Type:} header. Otherwise you can explicitly specify the
image subtype via the \var{_subtype} parameter. If the minor type could
not be guessed and \var{_subtype} was not given, then
\exception{TypeError} is raised.

Optional \var{_encoder} is a callable (i.e. function) which will
perform the actual encoding of the image data for transport. This
callable takes one argument, which is the \class{MIMEImage} instance.
It should use \method{get_payload()} and \method{set_payload()} to
change the payload to encoded form. It should also add any
\code{Content-Transfer-Encoding:} or other headers to the message
object as necessary. The default encoding is \emph{Base64}. See the
\refmodule{email.Encoders} module for a list of the built-in encoders.

\var{_params} are passed straight through to the \class{MIMEBase}
constructor.
\end{classdesc}
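
The following sketch shows the two ways the image subtype can be
settled: given explicitly, or guessed from the data. The byte string
\code{fake_gif} is a made-up placeholder rather than a real image, and
the fallback import of \module{email.mime.image} is an assumption for
later Python releases, which also expect the image data as a bytes
object.

```python
# Assumption: later Python releases keep this class in email.mime.image.
try:
    from email.MIMEImage import MIMEImage
except ImportError:
    from email.mime.image import MIMEImage

# Hypothetical stand-in for real image data; it is not a decodable image.
fake_gif = b'GIF89a' + b'\x00' * 16

# With an explicit subtype no guessing takes place.  The default encoder
# converts the payload to Base64 and records that in a header.
img = MIMEImage(fake_gif, 'gif')

# Without a subtype, data that cannot be recognized raises TypeError.
try:
    MIMEImage(b'not an image')
    guess_failed = False
except TypeError:
    guess_failed = True
```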

\begin{classdesc}{MIMEText}{_text\optional{, _subtype\optional{,
 _charset\optional{, _encoder}}}}
A subclass of \class{MIMEBase}, the \class{MIMEText} class is used to
create MIME objects of major type \code{text}. \var{_text} is the string
for the payload. \var{_subtype} is the minor type and defaults to
\code{plain}. \var{_charset} is the character set of the text and is
passed as a parameter to the \class{MIMEBase} constructor; it defaults
to \code{us-ascii}. No guessing or encoding is performed on the text
data, but a newline is appended to \var{_text} if it doesn't already
end with a newline.

The \var{_encoder} argument is as with the \class{MIMEImage} class
constructor, except that the default encoding for \class{MIMEText}
objects is one that doesn't actually modify the payload, but does set
the \code{Content-Transfer-Encoding:} header to \code{7bit} or
\code{8bit} as appropriate.
\end{classdesc}
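
A short sketch of building text parts, assuming only the \var{_text}
and \var{_subtype} arguments described above. The fallback import of
\module{email.mime.text} is an assumption for later Python releases;
the message contents are invented.

```python
# Assumption: later Python releases keep this class in email.mime.text.
try:
    from email.MIMEText import MIMEText
except ImportError:
    from email.mime.text import MIMEText

# A text/plain part using the default us-ascii character set.
note = MIMEText('Hello, world!\n')
note['Subject'] = 'Greetings'

# The second argument selects the minor type, here text/html.
html = MIMEText('<p>Hello</p>\n', 'html')

# The constructor has already supplied the MIME headers.
content_type = note['Content-Type']
charset = note.get_param('charset')
encoding = note['Content-Transfer-Encoding']
```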

\begin{classdesc}{MIMEMessage}{_msg\optional{, _subtype}}
A subclass of \class{MIMEBase}, the \class{MIMEMessage} class is used to
create MIME objects of main type \code{message}. \var{_msg} is used as
the payload, and must be an instance of class \class{Message} (or a
subclass thereof), otherwise a \exception{TypeError} is raised.

Optional \var{_subtype} sets the subtype of the message; it defaults
to \code{rfc822}.
\end{classdesc}
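
As a sketch of wrapping one message inside another, for instance when
forwarding, something like the following could be used. The fallback
imports of \module{email.mime.text} and \module{email.mime.message}
are assumptions for later Python releases, and the subjects and body
text are invented.

```python
# Assumption: later Python releases keep these classes under email.mime.
try:
    from email.MIMEText import MIMEText
    from email.MIMEMessage import MIMEMessage
except ImportError:
    from email.mime.text import MIMEText
    from email.mime.message import MIMEMessage

# The message to be wrapped; any Message instance or subclass will do.
inner = MIMEText('Original text.\n')
inner['Subject'] = 'Original subject'

# Wrap it; the subtype defaults to rfc822.
wrapper = MIMEMessage(inner)
wrapper['Subject'] = 'Fwd: Original subject'

# Anything that is not a Message instance is rejected.
try:
    MIMEMessage('not a Message instance')
    rejected = False
except TypeError:
    rejected = True
```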

\subsection{Encoders, Exceptions, Utilities, and Iterators}

The \module{email} package provides various encoders for safe
transport of binary payloads in \class{MIMEImage} and \class{MIMEText}
instances. See the \refmodule{email.Encoders} module for more
details.

All of the class exceptions that the \module{email} package can raise
are available in the \refmodule{email.Errors} module.

Some miscellaneous utility functions are available in the
\refmodule{email.Utils} module.

Iterating over a message object tree is easy with the
\method{Message.walk()} method; some additional helper iterators are
available in the \refmodule{email.Iterators} module.
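
To give a flavor of the utility functions, the sketch below splits an
address field into its parts and produces an RFC-style date string.
It assumes that \function{parseaddr()} and \function{formatdate()} are
among the functions the utilities module provides, and that later
Python releases spell the module name \module{email.utils}; the
address is invented.

```python
# Assumption: later Python releases spell the module name in lowercase.
try:
    from email.Utils import parseaddr, formatdate
except ImportError:
    from email.utils import parseaddr, formatdate

# Split an address field value into a real name and an email address.
realname, address = parseaddr('Jane Doe <jane@example.com>')

# Format a time value (seconds since the epoch) as a date string
# suitable for a Date: header.
stamp = formatdate(0)
```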

\subsection{Differences from \module{mimelib}}

The \module{email} package was originally prototyped as a separate
library called \module{mimelib}. Changes have been made so that
method names are more consistent, and some methods or modules have
either been added or removed. The semantics of some of the methods
have also changed. For the most part, any functionality available in
\module{mimelib} is still available in the \module{email} package,
albeit often in a different way.

Here is a brief description of the differences between the
\module{mimelib} and the \module{email} packages, along with hints on
how to port your applications.

Of course, the most visible difference between the two packages is
that the package name has been changed to \module{email}. In
addition, the top-level package has the following differences:

\begin{itemize}
\item \function{messageFromString()} has been renamed to
 \function{message_from_string()}.
\item \function{messageFromFile()} has been renamed to
 \function{message_from_file()}.
\end{itemize}
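
For porting purposes, a call under the new name
\function{message_from_file()} might look like the sketch below. An
in-memory file object stands in for a real file, and the fallback
import of \module{io} is an assumption for later Python releases.

```python
import email

# Assumption: later Python releases provide StringIO in the io module.
try:
    from StringIO import StringIO
except ImportError:
    from io import StringIO

# Any file-like object opened for reading text will do.
fp = StringIO('Subject: From a file\n\nbody\n')
msg = email.message_from_file(fp)

subject = msg['Subject']
body = msg.get_payload()
```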

The \class{Message} class has the following differences:

\begin{itemize}
\item The method \method{asString()} was renamed to \method{as_string()}.
\item The method \method{ismultipart()} was renamed to
 \method{is_multipart()}.
\item The \method{get_payload()} method has grown a \var{decode}
 optional argument.
\item The method \method{getall()} was renamed to \method{get_all()}.
\item The method \method{addheader()} was renamed to \method{add_header()}.
\item The method \method{gettype()} was renamed to \method{get_type()}.
\item The method \method{getmaintype()} was renamed to
 \method{get_main_type()}.
\item The method \method{getsubtype()} was renamed to
 \method{get_subtype()}.
\item The method \method{getparams()} was renamed to
 \method{get_params()}.
 Also, whereas \method{getparams()} returned a list of strings,
 \method{get_params()} returns a list of 2-tuples, effectively
 the key/value pairs of the parameters, split on the \samp{=}
 sign.
\item The method \method{getparam()} was renamed to \method{get_param()}.
\item The method \method{getcharsets()} was renamed to
 \method{get_charsets()}.
\item The method \method{getfilename()} was renamed to
 \method{get_filename()}.
\item The method \method{getboundary()} was renamed to
 \method{get_boundary()}.
\item The method \method{setboundary()} was renamed to
 \method{set_boundary()}.
\item The method \method{getdecodedpayload()} was removed. To get
 similar functionality, pass the value 1 to the \var{decode} flag
 of the \method{get_payload()} method.
\item The method \method{getpayloadastext()} was removed. Similar
 functionality
 is supported by the \class{DecodedGenerator} class in the
 \refmodule{email.Generator} module.
\item The method \method{getbodyastext()} was removed. You can get
 similar functionality by creating an iterator with
 \function{typed_subpart_iterator()} in the
 \refmodule{email.Iterators} module.
\end{itemize}
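
Several of the renamed methods from the list above can be exercised
together, as in this sketch. The message text, relay names, and file
name are invented. Note that under later Python releases the decoded
payload is returned as a bytes object rather than a string.

```python
import email

msg = email.message_from_string(
    'Received: by relay-1.example.com\n'
    'Received: by relay-2.example.com\n'
    'Content-Type: text/plain; charset="us-ascii"\n'
    'Content-Transfer-Encoding: base64\n'
    '\n'
    'aGVsbG8=\n'
)

# get_all() returns every value of a repeated header field.
hops = msg.get_all('Received')

# get_params() returns 2-tuples; get_param() looks up a single one.
params = msg.get_params()
charset = msg.get_param('charset')

# add_header() accepts extra parameters as keyword arguments.
msg.add_header('Content-Disposition', 'attachment', filename='hello.txt')
filename = msg.get_filename()

# The decode flag replaces the removed getdecodedpayload() method.
raw = msg.get_payload()
decoded = msg.get_payload(decode=1)
```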

The \class{Parser} class has no differences in its public interface.
It does have some additional smarts to recognize
\code{message/delivery-status} type messages, which it represents as
a \class{Message} instance containing separate \class{Message}
subparts for each header block in the delivery status
notification\footnote{Delivery Status Notifications (DSN) are defined
in \rfc{1894}.}.

The \class{Generator} class has no differences in its public
interface. There is a new class in the \refmodule{email.Generator}
module though, called \class{DecodedGenerator}, which provides most of
the functionality previously available in the
\method{Message.getpayloadastext()} method.

The following modules and classes have been changed:

\begin{itemize}
\item The \class{MIMEBase} class constructor arguments \var{_major}
 and \var{_minor} have changed to \var{_maintype} and
 \var{_subtype} respectively.
\item The \code{Image} class/module has been renamed to
 \code{MIMEImage}. The \var{_minor} argument has been renamed to
 \var{_subtype}.
\item The \code{Text} class/module has been renamed to
 \code{MIMEText}. The \var{_minor} argument has been renamed to
 \var{_subtype}.
\item The \code{MessageRFC822} class/module has been renamed to
 \code{MIMEMessage}. Note that an earlier version of
 \module{mimelib} called this class/module \code{RFC822}, but
 that clashed with the Python standard library module
 \refmodule{rfc822} on some case-insensitive file systems.

 Also, the \class{MIMEMessage} class now represents any kind of
 MIME message with main type \code{message}. It takes an
 optional argument \var{_subtype} which is used to set the MIME
 subtype. \var{_subtype} defaults to \code{rfc822}.
\end{itemize}

\module{mimelib} provided some utility functions in its
\module{address} and \module{date} modules. All of these functions
have been moved to the \refmodule{email.Utils} module.

The \code{MsgReader} class/module has been removed. Its functionality
is most closely supported in the \function{body_line_iterator()}
function in the \refmodule{email.Iterators} module.
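
A minimal sketch of reading a message body line by line with
\function{body_line_iterator()} follows. The fallback import of
\module{email.iterators} is an assumption for later Python releases,
and the message text is invented.

```python
import email

# Assumption: later Python releases spell the module name in lowercase.
try:
    from email.Iterators import body_line_iterator
except ImportError:
    from email.iterators import body_line_iterator

msg = email.message_from_string('Subject: Lines\n\nline one\nline two\n')

# Collect the body one line at a time; headers are skipped.
lines = []
for line in body_line_iterator(msg):
    lines.append(line)
```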

\subsection{Examples}

Coming soon...
358