:mod:`rfc822` --- Parse RFC 2822 mail headers
=============================================

.. module:: rfc822
   :synopsis: Parse RFC 2822 style mail messages.
   :deprecated:


.. deprecated:: 2.3
   The :mod:`email` package should be used in preference to the :mod:`rfc822`
   module. This module is present only to maintain backward compatibility.

This module defines a class, :class:`Message`, which represents an "email
message" as defined by the Internet standard :rfc:`2822`. [#]_ Such messages
consist of a collection of message headers, and a message body. This module
also defines a helper class :class:`AddressList` for parsing :rfc:`2822`
addresses. Please refer to the RFC for information on the specific syntax of
:rfc:`2822` messages.

.. index:: module: mailbox

The :mod:`mailbox` module provides classes to read mailboxes produced by
various end-user mail programs.


.. class:: Message(file[, seekable])

   A :class:`Message` instance is instantiated with an input object as parameter.
   :class:`Message` relies only on the input object having a :meth:`readline`
   method; in particular, ordinary file objects qualify. Instantiation reads
   headers from the input object up to a delimiter line (normally a blank line)
   and stores them in the instance. The message body, following the headers, is
   not consumed.

   This class can work with any input object that supports a :meth:`readline`
   method. If the input object has seek and tell capability, the
   :meth:`rewindbody` method will work; also, illegal lines will be pushed back
   onto the input stream. If the input object lacks seek but has an :meth:`unread`
   method that can push back a line of input, :class:`Message` will use that to
   push back illegal lines. Thus this class can be used to parse messages coming
   from a buffered stream.

   The optional *seekable* argument is provided as a workaround for certain stdio
   libraries in which :cfunc:`tell` discards buffered data before discovering that
   the :cfunc:`lseek` system call doesn't work. For maximum portability, you
   should set the *seekable* argument to zero to prevent that initial :meth:`tell`
   when passing in an unseekable object such as a file object created from a socket
   object.

   Input lines as read from the file may be terminated either by CR-LF or by a
   single linefeed; a terminating CR-LF is replaced by a single linefeed before the
   line is stored.

   All header matching is case-insensitive; e.g. ``m['From']``, ``m['from']``
   and ``m['FROM']`` all yield the same result.

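The reading behaviour described above can be illustrated without the module
itself. The following is a minimal sketch, not the actual :class:`Message`
implementation: ``read_headers`` and ``lookup`` are hypothetical helper names,
continuation lines and the push-back of illegal lines are ignored, and the
input object only needs a :meth:`readline` method.

```python
import io


def read_headers(fp):
    """Read header lines from fp up to the first blank (delimiter) line."""
    lines = []
    while True:
        line = fp.readline()
        if not line:
            break                      # end of input
        if line.endswith('\r\n'):
            line = line[:-2] + '\n'    # a terminating CR-LF becomes a linefeed
        if line.strip() == '':
            break                      # delimiter consumed; body stays unread
        lines.append(line)
    return lines


def lookup(lines, name):
    """Return the value of the first header called name, ignoring case."""
    prefix = name.lower() + ':'
    for line in lines:
        if line.lower().startswith(prefix):
            return line[len(prefix):].strip()
    return None


fp = io.StringIO(
    "From: jack@cwi.nl (Jack Jansen)\r\n"
    "Subject: Hello\r\n"
    "\r\n"
    "Body text.\r\n"
)
headers = read_headers(fp)
body = fp.read()                       # the body is still available on fp
print(lookup(headers, 'FROM'))         # jack@cwi.nl (Jack Jansen)
```

After the headers are read, the input object is positioned at the start of the
body, which is why ``fp.read()`` in the sketch returns only the body text.
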

.. class:: AddressList(field)

   You may instantiate the :class:`AddressList` helper class using a single string
   parameter, a comma-separated list of :rfc:`2822` addresses to be parsed. (The
   parameter ``None`` yields an empty list.)


.. function:: quote(str)

   Return a new string with backslashes in *str* replaced by two backslashes and
   double quotes replaced by backslash-double quote.


.. function:: unquote(str)

   Return a new string which is an *unquoted* version of *str*. If *str* begins
   and ends with double quotes, they are stripped off. Likewise if *str* begins
   and ends with angle brackets, they are stripped off.

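A short illustration of :func:`quote` and :func:`unquote` follows. Because
:mod:`rfc822` exists only in Python 2, the sketch falls back to the functions
of the same name in :mod:`email.utils` on Python 3; treating the two as
interchangeable for these inputs is an assumption of the example.

```python
try:
    from rfc822 import quote, unquote        # Python 2
except ImportError:
    from email.utils import quote, unquote   # Python 3 counterparts (assumed equivalent here)

quoted = quote('Jack "J" Jansen')
print(quoted)                      # Jack \"J\" Jansen

print(unquote('"Jack Jansen"'))    # Jack Jansen
print(unquote('<jack@cwi.nl>'))    # jack@cwi.nl
print(unquote('jack@cwi.nl'))      # unchanged: nothing surrounds the string
```
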

.. function:: parseaddr(address)

   Parse *address*, which should be the value of some address-containing field such
   as :mailheader:`To` or :mailheader:`Cc`, into its constituent "realname" and
   "email address" parts. Returns a 2-tuple of that information, unless the parse
   fails, in which case a 2-tuple ``(None, None)`` is returned.


.. function:: dump_address_pair(pair)

   The inverse of :func:`parseaddr`, this takes a 2-tuple of the form ``(realname,
   email_address)`` and returns the string value suitable for a :mailheader:`To` or
   :mailheader:`Cc` header. If the first element of *pair* is false, then the
   second element is returned unmodified.

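The round trip between :func:`parseaddr` and :func:`dump_address_pair` can be
sketched as follows. On Python 3, where :mod:`rfc822` is not available, the
example substitutes :func:`email.utils.parseaddr` and
:func:`email.utils.formataddr`; using ``formataddr`` as a stand-in for
:func:`dump_address_pair` is an assumption of this sketch, and the exact
quoting of the generated string may differ between the two.

```python
try:
    from rfc822 import parseaddr, dump_address_pair    # Python 2
except ImportError:
    # Python 3: closest counterparts, assumed close enough for this example
    from email.utils import parseaddr
    from email.utils import formataddr as dump_address_pair

pair = parseaddr('Jack Jansen <jack@cwi.nl>')
print(pair)                                   # ('Jack Jansen', 'jack@cwi.nl')

# The comment-style form yields the same pair.
same = parseaddr('jack@cwi.nl (Jack Jansen)')

# Turn the pair back into a header value, then parse it again.
header_value = dump_address_pair(pair)
round_trip = parseaddr(header_value)

# With a false first element, the address is returned unmodified.
bare = dump_address_pair(('', 'jack@cwi.nl'))
print(bare)                                   # jack@cwi.nl
```
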

.. function:: parsedate(date)

   Attempts to parse a date according to the rules in :rfc:`2822`. However, some
   mailers don't follow that format as specified, so :func:`parsedate` tries to
   guess correctly in such cases. *date* is a string containing an :rfc:`2822`
   date, such as ``'Mon, 20 Nov 1995 19:12:08 -0500'``. If it succeeds in parsing
   the date, :func:`parsedate` returns a 9-tuple that can be passed directly to
   :func:`time.mktime`; otherwise ``None`` will be returned. Note that indexes 6,
   7, and 8 of the result tuple are not usable.

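As a brief illustration, the date string used above can be parsed like this.
The fallback to :func:`email.utils.parsedate` on Python 3 is an assumption of
the sketch; only the first six fields are inspected, since indexes 6, 7, and 8
are not usable.

```python
import time

try:
    from rfc822 import parsedate         # Python 2
except ImportError:
    from email.utils import parsedate    # Python 3 counterpart (assumed equivalent here)

fields = parsedate('Mon, 20 Nov 1995 19:12:08 -0500')
print(fields[:6])                  # (1995, 11, 20, 19, 12, 8)

# The 9-tuple can be handed to time.mktime(), which interprets the
# written clock time as local time and ignores the "-0500" offset.
local_ts = time.mktime(fields)

print(parsedate('not a date'))     # None
```
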

.. function:: parsedate_tz(date)

   Performs the same function as :func:`parsedate`, but returns either ``None`` or
   a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
   :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC
   (which is the official term for Greenwich Mean Time). (Note that the sign of
   the timezone offset is the opposite of the sign of the ``time.timezone``
   variable for the same timezone; the latter variable follows the POSIX standard
   while this module follows :rfc:`2822`.) If the input string has no timezone,
   the last element of the tuple returned is ``None``. Note that indexes 6, 7, and
   8 of the result tuple are not usable.


.. function:: mktime_tz(tuple)

   Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC timestamp. If
   the timezone item in the tuple is ``None``, assume local time. Minor
   deficiency: this first interprets the first 8 elements as a local time and then
   compensates for the timezone difference; this may yield a slight error around
   daylight saving time switch dates. The error is not enough to worry about for
   common use.

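The following sketch shows the timezone offset and the resulting UTC
timestamp for the sample date. It assumes the :mod:`email.utils` functions of
the same names as a fallback on Python 3; the offset is expressed in seconds,
and ``expected`` is a name introduced only for the comparison.

```python
import calendar

try:
    from rfc822 import parsedate_tz, mktime_tz        # Python 2
except ImportError:
    from email.utils import parsedate_tz, mktime_tz   # Python 3 counterparts (assumed equivalent here)

fields = parsedate_tz('Mon, 20 Nov 1995 19:12:08 -0500')
offset = fields[9]
print(offset)                      # -18000: five hours west of UTC, in seconds

utc_ts = mktime_tz(fields)

# 19:12:08 at UTC-5 is 00:12:08 UTC on the following day.
expected = calendar.timegm((1995, 11, 21, 0, 12, 8, 0, 0, 0))
```

Note the sign convention: the offset is negative for zones west of UTC,
whereas ``time.timezone`` would be positive for the same zone.
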

.. seealso::

   Module :mod:`email`
      Comprehensive email handling package; supersedes the :mod:`rfc822` module.

   Module :mod:`mailbox`
      Classes to read various mailbox formats produced by end-user mail programs.

   Module :mod:`mimetools`
      Subclass of :class:`rfc822.Message` that handles MIME encoded messages.


.. _message-objects:

Message Objects
---------------

A :class:`Message` instance has the following methods:


.. method:: Message.rewindbody()

   Seek to the start of the message body. This only works if the file object is
   seekable.


.. method:: Message.isheader(line)

   Returns a line's canonicalized fieldname (the dictionary key that will be used
   to index it) if the line is a legal :rfc:`2822` header; otherwise returns
   ``None`` (implying that parsing should stop here and the line be pushed back on
   the input stream). It is sometimes useful to override this method in a
   subclass.


.. method:: Message.islast(line)

   Return true if the given line is a delimiter on which :class:`Message` should
   stop. The delimiter line is consumed, and the file object's read location is
   positioned immediately after it. By default this method just checks that the
   line is blank, but you can override it in a subclass.


.. method:: Message.iscomment(line)

   Return ``True`` if the given line should be ignored entirely, just skipped. By
   default this is a stub that always returns ``False``, but you can override it in
   a subclass.


.. method:: Message.getallmatchingheaders(name)

   Return a list of lines consisting of all headers matching *name*, if any. Each
   physical line, whether it is a continuation line or not, is a separate list
   item. Return the empty list if no header matches *name*.


.. method:: Message.getfirstmatchingheader(name)

   Return a list of lines comprising the first header matching *name*, and its
   continuation line(s), if any. Return ``None`` if there is no header matching
   *name*.


.. method:: Message.getrawheader(name)

   Return a single string consisting of the text after the colon in the first
   header matching *name*. This includes leading whitespace, the trailing
   linefeed, and internal linefeeds and whitespace if any continuation
   line(s) were present. Return ``None`` if there is no header matching *name*.


.. method:: Message.getheader(name[, default])

   Like ``getrawheader(name)``, but strip leading and trailing whitespace.
   Internal whitespace is not stripped. The optional *default* argument can be
   used to specify a different default to be returned when there is no header
   matching *name*.


.. method:: Message.get(name[, default])

   An alias for :meth:`getheader`, to make the interface more compatible with
   regular dictionaries.


.. method:: Message.getaddr(name)

   Return a pair ``(full name, email address)`` parsed from the string returned by
   ``getheader(name)``. If no header matching *name* exists, return ``(None,
   None)``; otherwise both the full name and the address are (possibly empty)
   strings.

   Example: If *m*'s first :mailheader:`From` header contains the string
   ``'jack@cwi.nl (Jack Jansen)'``, then ``m.getaddr('From')`` will yield the pair
   ``('Jack Jansen', 'jack@cwi.nl')``. If the header contained ``'Jack Jansen
   <jack@cwi.nl>'`` instead, it would yield the exact same result.


.. method:: Message.getaddrlist(name)

   This is similar to ``getaddr(name)``, but parses a header containing a list of
   email addresses (e.g. a :mailheader:`To` header) and returns a list of ``(full
   name, email address)`` pairs (even if there was only one address in the header).
   If there is no header matching *name*, return an empty list.

   If multiple headers exist that match the named header (e.g. if there are several
   :mailheader:`Cc` headers), all are parsed for addresses. Any continuation lines
   the named headers contain are also parsed.

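The behaviour of collecting addresses from several headers of the same name
can be sketched with the :mod:`email` package, which supersedes this module.
This is not :meth:`getaddrlist` itself: the example swaps in
``Message.get_all()`` together with :func:`email.utils.getaddresses` as the
nearest equivalent, and the sample addresses are made up.

```python
import email
from email.utils import getaddresses

raw = (
    "To: Jack Jansen <jack@cwi.nl>\n"
    "Cc: a@example.org, Bee <b@example.org>\n"
    "Cc: c@example.org\n"
    "\n"
    "Body text.\n"
)
msg = email.message_from_string(raw)

# Every Cc header contributes its addresses to a single list of pairs.
cc_pairs = getaddresses(msg.get_all('cc', []))
print(cc_pairs)
# [('', 'a@example.org'), ('Bee', 'b@example.org'), ('', 'c@example.org')]

# A single address still comes back as a one-element list.
to_pairs = getaddresses(msg.get_all('to', []))

# No matching header gives an empty list.
bcc_pairs = getaddresses(msg.get_all('bcc', []))
```
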

.. method:: Message.getdate(name)

   Retrieve a header using :meth:`getheader` and parse it into a 9-tuple compatible
   with :func:`time.mktime`; note that fields 6, 7, and 8 are not usable. If
   there is no header matching *name*, or it is unparsable, return ``None``.

   Date parsing appears to be a black art, and not all mailers adhere to the
   standard. While this method has been tested and found correct on a large
   collection of email from many sources, it is still possible that it may
   occasionally yield an incorrect result.


.. method:: Message.getdate_tz(name)

   Retrieve a header using :meth:`getheader` and parse it into a 10-tuple; the
   first 9 elements will make a tuple compatible with :func:`time.mktime`, and the
   10th is a number giving the offset of the date's timezone from UTC. Note that
   fields 6, 7, and 8 are not usable. Similarly to :meth:`getdate`, if there is
   no header matching *name*, or it is unparsable, return ``None``.

:class:`Message` instances also support a limited mapping interface. In
particular: ``m[name]`` is like ``m.getheader(name)`` but raises :exc:`KeyError`
if there is no matching header; and ``len(m)``, ``m.get(name[, default])``,
``m.has_key(name)``, ``m.keys()``, ``m.values()``, ``m.items()``, and
``m.setdefault(name[, default])`` act as expected, with the one difference
that :meth:`setdefault` uses an empty string as the default value.
:class:`Message` instances also support the writable mapping interface ``m[name]
= value`` and ``del m[name]``. :class:`Message` objects do not support the
:meth:`clear`, :meth:`copy`, :meth:`popitem`, or :meth:`update` methods of the
mapping interface. (Support for :meth:`get` and :meth:`setdefault` was only
added in Python 2.2.)

Finally, :class:`Message` instances have some public instance variables:


.. attribute:: Message.headers

   A list containing the entire set of header lines, in the order in which they
   were read (except that setitem calls may disturb this order). Each line contains
   a trailing newline. The blank line terminating the headers is not contained in
   the list.


.. attribute:: Message.fp

   The file or file-like object passed at instantiation time. This can be used to
   read the message content.


.. attribute:: Message.unixfrom

   The Unix ``From`` line, if the message had one, or an empty string. This is
   needed to regenerate the message in some contexts, such as an ``mbox``\ -style
   mailbox file.


.. _addresslist-objects:

AddressList Objects
-------------------

An :class:`AddressList` instance has the following methods:


.. method:: AddressList.__len__()

   Return the number of addresses in the address list.


.. method:: AddressList.__str__()

   Return a canonicalized string representation of the address list. Addresses are
   rendered in ``"name" <host@domain>`` form, comma-separated.


.. method:: AddressList.__add__(alist)

   Return a new :class:`AddressList` instance that contains all addresses in both
   :class:`AddressList` operands, with duplicates removed (set union).


.. method:: AddressList.__iadd__(alist)

   In-place version of :meth:`__add__`; turns this :class:`AddressList` instance
   into the union of itself and the right-hand instance, *alist*.


.. method:: AddressList.__sub__(alist)

   Return a new :class:`AddressList` instance that contains every address in the
   left-hand :class:`AddressList` operand that is not present in the right-hand
   address operand (set difference).


.. method:: AddressList.__isub__(alist)

   In-place version of :meth:`__sub__`, removing addresses in this list which are
   also in *alist*.

Finally, :class:`AddressList` instances have one public instance variable:


.. attribute:: AddressList.addresslist

   A list of 2-tuples of strings, one per address. In each tuple, the first item
   is the canonicalized name part and the second is the actual route-address (an
   ``'@'``\ -separated username-host.domain pair).

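The set-like operations described for :class:`AddressList` can be modelled on
plain lists of ``(name, address)`` pairs. The sketch below is an illustration
of the semantics under stated assumptions, not the class itself: ``parse``,
``union`` and ``difference`` are hypothetical helpers, and
:func:`email.utils.getaddresses` is used to obtain the pairs so that the
example also runs where :mod:`rfc822` is unavailable.

```python
from email.utils import getaddresses


def parse(field):
    """Hypothetical helper: parse a header value into (name, address) pairs."""
    return getaddresses([field]) if field else []


def union(left, right):
    """All pairs from both lists, without duplicates (models __add__)."""
    return left + [pair for pair in right if pair not in left]


def difference(left, right):
    """Pairs of left that do not occur in right (models __sub__)."""
    return [pair for pair in left if pair not in right]


a = parse('jack@cwi.nl, guido@cwi.nl')
b = parse('guido@cwi.nl, sjoerd@cwi.nl')

both = union(a, b)
print(both)
# [('', 'jack@cwi.nl'), ('', 'guido@cwi.nl'), ('', 'sjoerd@cwi.nl')]

only_a = difference(a, b)
print(only_a)                     # [('', 'jack@cwi.nl')]

print(parse(None))                # []
```
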
.. rubric:: Footnotes

.. [#] This module originally conformed to :rfc:`822`, hence the name. Since then,
   :rfc:`2822` has been released as an update to :rfc:`822`. This module should be
   considered :rfc:`2822`\ -conformant, especially in cases where the syntax or
   semantics have changed since :rfc:`822`.
