:mod:`email`: Miscellaneous utilities
-------------------------------------

.. module:: email.utils
   :synopsis: Miscellaneous email package utilities.


There are several useful utilities provided in the :mod:`email.utils` module:


.. function:: quote(str)

   Return a new string with backslashes in *str* replaced by two backslashes, and
   double quotes replaced by backslash-double quote.


.. function:: unquote(str)

   Return a new string which is an *unquoted* version of *str*. If *str* ends and
   begins with double quotes, they are stripped off. Likewise if *str* ends and
   begins with angle brackets, they are stripped off.
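As a rough illustration of :func:`quote` and :func:`unquote`, the sketch below escapes a string and then strips the surrounding delimiters again. The sample strings are made up for the example.

```python
from email.utils import quote, unquote

# A string containing both double quotes and a backslash.
raw = 'Say "hi" \\ bye'

# quote() doubles backslashes and escapes double quotes.
quoted = quote(raw)

# unquote() strips one pair of enclosing double quotes or angle brackets.
from_quotes = unquote('"hello"')
from_angles = unquote('<user@example.com>')

# Wrapping the quoted form in double quotes and unquoting recovers the input here.
round_trip = unquote('"' + quoted + '"')
```

Note that :func:`quote` does not add the enclosing double quotes itself; the caller supplies them.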


.. function:: parseaddr(address)

   Parse *address*, which should be the value of some address-containing field
   such as :mailheader:`To` or :mailheader:`Cc`, into its constituent *realname*
   and *email address* parts. Returns a 2-tuple of that information, unless the
   parse fails, in which case a 2-tuple of ``('', '')`` is returned.


.. function:: formataddr(pair, charset='utf-8')

   The inverse of :func:`parseaddr`, this takes a 2-tuple of the form ``(realname,
   email_address)`` and returns the string value suitable for a :mailheader:`To` or
   :mailheader:`Cc` header. If the first element of *pair* is false, then the
   second element is returned unmodified.

   Optional *charset* is the character set that will be used in the :rfc:`2047`
   encoding of the ``realname`` if the ``realname`` contains non-ASCII
   characters. Can be an instance of :class:`str` or a
   :class:`~email.charset.Charset`. Defaults to ``utf-8``.

   .. versionchanged:: 3.3
      Added the *charset* option.
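A minimal sketch of :func:`parseaddr` and :func:`formataddr` used together follows. The names and addresses are invented for the example, and the exact :rfc:`2047` encoding chosen for the non-ASCII name is deliberately not spelled out.

```python
from email.utils import formataddr, parseaddr

# Split a header value into (realname, email_address).
pair = parseaddr('Jane Doe <jane@example.com>')

# formataddr() reverses the operation.
header_value = formataddr(pair)

# With a false realname, only the address is returned.
bare = formataddr(('', 'jane@example.com'))

# A non-ASCII realname is RFC 2047 encoded using the default utf-8 charset.
encoded = formataddr(('Jöhn Döe', 'john@example.com'))

# An empty value yields a pair of empty strings.
failed = parseaddr('')
```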


.. function:: getaddresses(fieldvalues)

   This function returns a list of 2-tuples of the form returned by
   :func:`parseaddr`. *fieldvalues* is a sequence of header field values as might
   be returned by :meth:`Message.get_all`. Here's a simple example that gets all
   the recipients of a message::

      from email.utils import getaddresses

      tos = msg.get_all('to', [])
      ccs = msg.get_all('cc', [])
      resent_tos = msg.get_all('resent-to', [])
      resent_ccs = msg.get_all('resent-cc', [])
      all_recipients = getaddresses(tos + ccs + resent_tos + resent_ccs)
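To show the shape of the result, here is a self-contained variant of the example above. The message and its addresses are hypothetical, built only for the illustration.

```python
from email.message import Message
from email.utils import getaddresses

# Build a small message by hand; real code would usually parse one.
msg = Message()
msg['To'] = 'Alice <alice@example.com>, bob@example.com'
msg['Cc'] = 'Carol <carol@example.com>'

tos = msg.get_all('to', [])
ccs = msg.get_all('cc', [])

# Each header value may hold several addresses; all are flattened into one list.
recipients = getaddresses(tos + ccs)
```

An address without a display name comes back with an empty string as its first element.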


.. function:: parsedate(date)

   Attempts to parse a date according to the rules in :rfc:`2822`. However, some
   mailers don't follow that format as specified, so :func:`parsedate` tries to
   guess correctly in such cases. *date* is a string containing an :rfc:`2822`
   date, such as ``"Mon, 20 Nov 1995 19:12:08 -0500"``. If it succeeds in parsing
   the date, :func:`parsedate` returns a 9-tuple that can be passed directly to
   :func:`time.mktime`; otherwise ``None`` will be returned. Note that indexes 6,
   7, and 8 of the result tuple are not usable.


.. function:: parsedate_tz(date)

   Performs the same function as :func:`parsedate`, but returns either ``None`` or
   a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
   :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC
   (which is the official term for Greenwich Mean Time) [#]_. If the input string
   has no timezone, the last element of the tuple returned is ``None``. Note that
   indexes 6, 7, and 8 of the result tuple are not usable.


.. function:: parsedate_to_datetime(date)

   The inverse of :func:`format_datetime`. Performs the same function as
   :func:`parsedate`, but on success returns a :class:`~datetime.datetime`. If
   the input date has a timezone of ``-0000``, the ``datetime`` will be a naive
   ``datetime``, and if the date conforms to the RFCs it will represent a
   time in UTC but with no indication of the actual source timezone of the
   message the date comes from. If the input date has any other valid timezone
   offset, the ``datetime`` will be an aware ``datetime`` with the
   corresponding :class:`~datetime.timezone` :class:`~datetime.tzinfo`.

   .. versionadded:: 3.3
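A short sketch of the naive versus aware distinction made by :func:`parsedate_to_datetime`, using the same sample date with two different offsets:

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# A real offset produces an aware datetime carrying that offset.
aware = parsedate_to_datetime('Mon, 20 Nov 1995 19:12:08 -0500')
aware_offset = aware.utcoffset()

# The special -0000 offset produces a naive datetime.
naive = parsedate_to_datetime('Mon, 20 Nov 1995 19:12:08 -0000')
naive_tzinfo = naive.tzinfo
```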


.. function:: mktime_tz(tuple)

   Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC timestamp. If
   the timezone item in the tuple is ``None``, assume local time. Minor
   deficiency: :func:`mktime_tz` interprets the first 8 elements of *tuple* as a
   local time and then compensates for the timezone difference. This may yield a
   slight error around changes in daylight savings time, though not worth worrying
   about for common use.
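As a sketch, :func:`mktime_tz` can be chained with :func:`parsedate_tz` to get a timestamp. The comparison value is computed independently with :mod:`datetime`, on the assumption that ``19:12:08 -0500`` corresponds to ``00:12:08`` UTC on the following day.

```python
from datetime import datetime, timezone
from email.utils import mktime_tz, parsedate_tz

parsed = parsedate_tz('Mon, 20 Nov 1995 19:12:08 -0500')

# Seconds since the epoch, with the -0500 offset already applied.
timestamp = mktime_tz(parsed)

# The same instant expressed directly in UTC, for comparison.
expected = datetime(1995, 11, 21, 0, 12, 8, tzinfo=timezone.utc).timestamp()
```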


.. function:: formatdate(timeval=None, localtime=False, usegmt=False)

   Returns a date string as per :rfc:`2822`, e.g.::

      Fri, 09 Nov 2001 01:08:47 -0000

   Optional *timeval*, if given, is a floating point time value as accepted by
   :func:`time.gmtime` and :func:`time.localtime`; otherwise the current time is
   used.

   Optional *localtime* is a flag that, when ``True``, interprets *timeval* and
   returns a date relative to the local timezone instead of UTC, properly taking
   daylight savings time into account. The default is ``False``, meaning UTC is
   used.

   Optional *usegmt* is a flag that, when ``True``, outputs a date string with the
   timezone as an ASCII string ``GMT``, rather than a numeric ``-0000``. This is
   needed for some protocols (such as HTTP). This only applies when *localtime* is
   ``False``. The default is ``False``.
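The effect of *usegmt* can be seen by formatting a fixed *timeval*. The sketch below uses ``0`` (the epoch) so that the output does not depend on the current time.

```python
from email.utils import formatdate

# Default: UTC with the numeric -0000 zone.
numeric_zone = formatdate(0)

# usegmt=True: same instant, with the literal GMT zone (as HTTP expects).
gmt_zone = formatdate(0, usegmt=True)
```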


.. function:: format_datetime(dt, usegmt=False)

   Like :func:`formatdate`, but the input is a :class:`~datetime.datetime`
   instance. If it is a naive datetime, it is assumed to be "UTC with no
   information about the source timezone", and the conventional ``-0000`` is
   used for the timezone. If it is an aware ``datetime``, then the numeric
   timezone offset is used. If it is an aware ``datetime`` with offset zero,
   then *usegmt* may be set to ``True``, in which case the string ``GMT`` is
   used instead of the numeric timezone offset. This provides a way to generate
   standards-conformant HTTP date headers.

   .. versionadded:: 3.3
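The three cases described for :func:`format_datetime` can be sketched as follows, reusing the date from the :func:`formatdate` example. The ``-0500`` zone is an arbitrary choice for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

naive = datetime(2001, 11, 9, 1, 8, 47)

# Naive datetime: the conventional -0000 zone is used.
from_naive = format_datetime(naive)

# Aware datetime: the numeric offset is used.
eastern = timezone(timedelta(hours=-5))
from_aware = format_datetime(naive.replace(tzinfo=eastern))

# Aware datetime with offset zero: usegmt=True yields the GMT string.
from_utc = format_datetime(naive.replace(tzinfo=timezone.utc), usegmt=True)
```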


.. function:: make_msgid(idstring=None, domain=None)

   Returns a string suitable for an :rfc:`2822`\ -compliant
   :mailheader:`Message-ID` header. Optional *idstring*, if given, is a string
   used to strengthen the uniqueness of the message id. Optional *domain*, if
   given, provides the portion of the msgid after the '@'. The default is the
   local hostname. It is not normally necessary to override this default, but
   it may be useful in certain cases, such as constructing a distributed system
   that uses a consistent domain name across multiple hosts.

   .. versionchanged:: 3.2
      Added the *domain* keyword.
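A brief sketch of :func:`make_msgid` with both optional arguments. The *idstring* and *domain* values are placeholders; the part before the ``@`` varies between calls, so only the stable parts are examined.

```python
from email.utils import make_msgid

# Hypothetical idstring and domain, chosen only for the example.
msgid = make_msgid(idstring='order42', domain='mail.example.com')

# The result is wrapped in angle brackets and ends with the given domain.
has_brackets = msgid.startswith('<') and msgid.endswith('>')
has_domain = msgid.endswith('@mail.example.com>')
has_idstring = 'order42' in msgid
```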


.. function:: decode_rfc2231(s)

   Decode the string *s* according to :rfc:`2231`.


.. function:: encode_rfc2231(s, charset=None, language=None)

   Encode the string *s* according to :rfc:`2231`. Optional *charset* and
   *language*, if given, are the character set name and language name to use. If
   neither is given, *s* is returned as-is. If *charset* is given but *language*
   is not, the string is encoded using the empty string for *language*.
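The sketch below encodes a made-up file name with :func:`encode_rfc2231` and splits the result again with :func:`decode_rfc2231`. As observed in CPython, the decoding step only separates the charset, language, and value parts; it leaves the percent-escapes in the value untouched.

```python
from email.utils import decode_rfc2231, encode_rfc2231

# Charset and language are joined to the percent-encoded value with ticks.
encoded = encode_rfc2231('naïve file.txt', charset='utf-8', language='en')

# With a charset but no language, the language field is left empty.
no_language = encode_rfc2231('a b', charset='utf-8')

# Splitting gives back the three components (the value stays percent-encoded).
charset, language, value = decode_rfc2231(encoded)
```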


.. function:: collapse_rfc2231_value(value, errors='replace', fallback_charset='us-ascii')

   When a header parameter is encoded in :rfc:`2231` format,
   :meth:`Message.get_param` may return a 3-tuple containing the character set,
   language, and value. :func:`collapse_rfc2231_value` turns this into a Unicode
   string. Optional *errors* is passed to the *errors* argument of :class:`str`'s
   :meth:`encode` method; it defaults to ``'replace'``. Optional
   *fallback_charset* specifies the character set to use if the one in the
   :rfc:`2231` header is not known by Python; it defaults to ``'us-ascii'``.

   For convenience, if the *value* passed to :func:`collapse_rfc2231_value` is not
   a tuple, it should be a string and it is returned unquoted.
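A minimal sketch of both input forms accepted by :func:`collapse_rfc2231_value`. The 3-tuple is written by hand here and is an assumption about the shape :meth:`Message.get_param` may return: the value holds the UTF-8 bytes of ``naïve.txt`` as individual Latin-1 characters.

```python
from email.utils import collapse_rfc2231_value

# Hand-written (charset, language, value) tuple; '\xc3\xaf' is 'ï' in UTF-8.
param = ('utf-8', 'en', 'na\xc3\xafve.txt')
collapsed = collapse_rfc2231_value(param)

# A plain string is simply returned unquoted.
plain = collapse_rfc2231_value('"report.txt"')
```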


.. function:: decode_params(params)

   Decode parameters list according to :rfc:`2231`. *params* is a sequence of
   2-tuples containing elements of the form ``(content-type, string-value)``.
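As an illustration of :func:`decode_params`, the sketch below passes a hand-built parameter list containing one :rfc:`2231`-encoded entry. The list contents are hypothetical, and the exact quoting of the decoded value is an implementation detail, so it is normalized with :func:`unquote` before use.

```python
from email.utils import decode_params, unquote

# First entry is passed through; 'name*' marks an RFC 2231 encoded value.
params = [
    ('content-type', 'text/plain'),
    ('name*', "utf-8'en'na%C3%AFve.txt"),
]
decoded = decode_params(params)

first = decoded[0]
name, (charset, language, value) = decoded[1]

# The percent-escapes are expanded to one character per byte.
raw_value = unquote(value)
```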


.. rubric:: Footnotes

.. [#] Note that the sign of the timezone offset is the opposite of the sign of the
   ``time.timezone`` variable for the same timezone; the latter variable follows
   the POSIX standard while this module follows :rfc:`2822`.