:mod:`email.utils`: Miscellaneous utilities
-------------------------------------------

.. module:: email.utils
   :synopsis: Miscellaneous email package utilities.


There are several useful utilities provided in the :mod:`email.utils` module:


.. function:: quote(str)

   Return a new string with backslashes in *str* replaced by two backslashes, and
   double quotes replaced by backslash-double quote.


.. function:: unquote(str)

   Return a new string which is an *unquoted* version of *str*.  If *str* ends and
   begins with double quotes, they are stripped off.  Likewise if *str* ends and
   begins with angle brackets, they are stripped off.

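
As a brief illustration of the two functions above, the following sketch quotes
a string containing a backslash and double quotes, then strips surrounding
quotes and angle brackets. The sample strings are made up for the example.

```python
from email.utils import quote, unquote

# The text being quoted is: path\to "file"
raw = 'path\\to "file"'
quoted = quote(raw)      # backslashes are doubled, double quotes are escaped
print(quoted)            # path\\to \"file\"

# unquote() strips one pair of surrounding double quotes or angle brackets.
print(unquote('"Jane Doe"'))            # Jane Doe
print(unquote('<jane@example.com>'))    # jane@example.com
print(unquote('plain'))                 # plain (nothing to strip)
```
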

.. function:: parseaddr(address)

   Parse address -- which should be the value of some address-containing field such
   as :mailheader:`To` or :mailheader:`Cc` -- into its constituent *realname* and
   *email address* parts.  Returns a tuple of that information, unless the parse
   fails, in which case a 2-tuple of ``('', '')`` is returned.


.. function:: formataddr(pair, charset='utf-8')

   The inverse of :func:`parseaddr`, this takes a 2-tuple of the form ``(realname,
   email_address)`` and returns the string value suitable for a :mailheader:`To` or
   :mailheader:`Cc` header.  If the first element of *pair* is false, then the
   second element is returned unmodified.

   Optional *charset* is the character set that will be used in the :rfc:`2047`
   encoding of the ``realname`` if the ``realname`` contains non-ASCII
   characters.  *charset* can be an instance of :class:`str` or a
   :class:`~email.charset.Charset`.  Defaults to ``utf-8``.

   .. versionchanged:: 3.3
      Added the *charset* option.

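
The round trip between :func:`parseaddr` and :func:`formataddr` can be sketched
as follows. The names and addresses are invented for the example, and the exact
form of the :rfc:`2047` encoded word is not shown because it depends on the
encoding the library picks.

```python
from email.utils import formataddr, parseaddr

# Split a header value into its realname and email address parts.
name, addr = parseaddr('Jane Doe <jane@example.com>')
print(name, addr)                    # Jane Doe jane@example.com

# formataddr() reverses the operation.
header_value = formataddr((name, addr))
print(header_value)                  # Jane Doe <jane@example.com>

# A false realname means the address is returned unmodified.
print(formataddr(('', addr)))        # jane@example.com

# A non-ASCII realname is RFC 2047 encoded using charset (utf-8 by default).
encoded = formataddr(('Zo\xeb Smith', 'zoe@example.com'))
print(encoded)

# A value that cannot be parsed yields a pair of empty strings.
print(parseaddr(''))                 # ('', '')
```
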

.. function:: getaddresses(fieldvalues)

   This function returns a list of 2-tuples of the form returned by
   :func:`parseaddr`.  *fieldvalues* is a sequence of header field values as
   might be returned by
   :meth:`Message.get_all <email.message.Message.get_all>`.  Here's a simple
   example that gets all the recipients of a message::

      from email.utils import getaddresses

      # msg is assumed to be an existing email.message.Message instance.
      tos = msg.get_all('to', [])
      ccs = msg.get_all('cc', [])
      resent_tos = msg.get_all('resent-to', [])
      resent_ccs = msg.get_all('resent-cc', [])
      all_recipients = getaddresses(tos + ccs + resent_tos + resent_ccs)


.. function:: parsedate(date)

   Attempts to parse a date according to the rules in :rfc:`2822`.  However, some
   mailers don't follow that format as specified, so :func:`parsedate` tries to
   guess correctly in such cases.  *date* is a string containing an :rfc:`2822`
   date, such as ``"Mon, 20 Nov 1995 19:12:08 -0500"``.  If it succeeds in parsing
   the date, :func:`parsedate` returns a 9-tuple that can be passed directly to
   :func:`time.mktime`; otherwise ``None`` will be returned.  Note that indexes 6,
   7, and 8 of the result tuple are not usable.


.. function:: parsedate_tz(date)

   Performs the same function as :func:`parsedate`, but returns either ``None`` or
   a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
   :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC
   (which is the official term for Greenwich Mean Time) [#]_.  If the input string
   has no timezone, the last element of the tuple returned is ``None``.  Note that
   indexes 6, 7, and 8 of the result tuple are not usable.

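
A short sketch of :func:`parsedate` and :func:`parsedate_tz` applied to the
sample date used above. Only the first six fields and the timezone offset are
inspected, since indexes 6, 7, and 8 are not usable.

```python
from email.utils import parsedate, parsedate_tz

date = 'Mon, 20 Nov 1995 19:12:08 -0500'

# Year, month, day, hour, minute, second.
fields = parsedate(date)
print(fields[:6])            # (1995, 11, 20, 19, 12, 8)

# The tenth element is the offset from UTC in seconds (-0500 is -18000).
fields_tz = parsedate_tz(date)
print(fields_tz[9])          # -18000

# Input that cannot be parsed as a date gives None.
print(parsedate('not a date'))   # None
```
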

.. function:: parsedate_to_datetime(date)

   The inverse of :func:`format_datetime`.  Performs the same function as
   :func:`parsedate`, but on success returns a :class:`~datetime.datetime`.  If
   the input date has a timezone of ``-0000``, the ``datetime`` will be a naive
   ``datetime``, and if the date is conforming to the RFCs it will represent a
   time in UTC but with no indication of the actual source timezone of the
   message the date comes from.  If the input date has any other valid timezone
   offset, the ``datetime`` will be an aware ``datetime`` with the
   corresponding :class:`~datetime.timezone` :class:`~datetime.tzinfo`.

   .. versionadded:: 3.3

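
The difference between the aware and the naive result of
:func:`parsedate_to_datetime` can be illustrated with a minimal sketch. The
date strings are examples only.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# A numeric offset other than -0000 gives an aware datetime.
aware = parsedate_to_datetime('Mon, 20 Nov 1995 19:12:08 -0500')
print(aware.isoformat())                          # 1995-11-20T19:12:08-05:00
print(aware.utcoffset() == timedelta(hours=-5))   # True

# -0000 means "UTC, source timezone unknown" and gives a naive datetime.
naive = parsedate_to_datetime('Mon, 20 Nov 1995 19:12:08 -0000')
print(naive.tzinfo)                               # None
```
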

.. function:: mktime_tz(tuple)

   Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC
   timestamp (seconds since the Epoch).  If the timezone item in the
   tuple is ``None``, assume local time.

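
As a sketch of how :func:`mktime_tz` combines with :func:`parsedate_tz`, the
example below converts a date with an explicit offset to a timestamp and checks
it against the same instant built with :mod:`datetime`. The comparison value is
computed in the example, not taken from the library documentation.

```python
from datetime import datetime, timezone
from email.utils import mktime_tz, parsedate_tz

fields = parsedate_tz('Mon, 20 Nov 1995 19:12:08 -0500')
timestamp = mktime_tz(fields)

# 19:12:08 at UTC-5 is 00:12:08 UTC on the following day.
expected = datetime(1995, 11, 21, 0, 12, 8, tzinfo=timezone.utc).timestamp()
print(timestamp == expected)    # True
```
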

.. function:: formatdate(timeval=None, localtime=False, usegmt=False)

   Returns a date string as per :rfc:`2822`, e.g.::

      Fri, 09 Nov 2001 01:08:47 -0000

   Optional *timeval* if given is a floating point time value as accepted by
   :func:`time.gmtime` and :func:`time.localtime`, otherwise the current time is
   used.

   Optional *localtime* is a flag that when ``True``, interprets *timeval*, and
   returns a date relative to the local timezone instead of UTC, properly taking
   daylight saving time into account.  The default is ``False`` meaning UTC is
   used.

   Optional *usegmt* is a flag that when ``True``, outputs a date string with the
   timezone as an ASCII string ``GMT``, rather than a numeric ``-0000``.  This is
   needed for some protocols (such as HTTP).  This only applies when *localtime* is
   ``False``.  The default is ``False``.

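
The effect of the :func:`formatdate` flags can be sketched with a fixed
*timeval* of ``0`` (the Epoch), so that the UTC output is predictable. The
local-time result is not shown because it depends on the machine's timezone.

```python
from email.utils import formatdate

# UTC with the numeric -0000 zone (the default).
print(formatdate(0))                 # Thu, 01 Jan 1970 00:00:00 -0000

# UTC spelled as GMT, as some protocols such as HTTP require.
print(formatdate(0, usegmt=True))    # Thu, 01 Jan 1970 00:00:00 GMT

# Relative to the local timezone; the output varies from machine to machine.
local = formatdate(0, localtime=True)

# With no timeval the current time is used.
now = formatdate()
```
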

.. function:: format_datetime(dt, usegmt=False)

   Like :func:`formatdate`, but the input is a :class:`~datetime.datetime`
   instance.  If it is a naive datetime, it is assumed to be "UTC with no
   information about the source timezone", and the conventional ``-0000`` is
   used for the timezone.  If it is an aware ``datetime``, then the numeric
   timezone offset is used.  If it is an aware ``datetime`` with offset zero,
   then *usegmt* may be set to ``True``, in which case the string ``GMT`` is
   used instead of the numeric timezone offset.  This provides a way to
   generate standards-conformant HTTP date headers.

   .. versionadded:: 3.3

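
A minimal sketch of the three cases described for :func:`format_datetime`: a
naive ``datetime``, an aware UTC ``datetime``, and an aware ``datetime`` with a
non-zero offset. The moment chosen matches the sample date string shown for
:func:`formatdate`.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

moment = datetime(2001, 11, 9, 1, 8, 47)

# Naive: treated as UTC with an unknown source timezone.
print(format_datetime(moment))               # Fri, 09 Nov 2001 01:08:47 -0000

# Aware, offset zero: numeric offset by default, GMT on request.
utc = moment.replace(tzinfo=timezone.utc)
print(format_datetime(utc))                  # Fri, 09 Nov 2001 01:08:47 +0000
print(format_datetime(utc, usegmt=True))     # Fri, 09 Nov 2001 01:08:47 GMT

# Aware, non-zero offset: the numeric offset is used.
eastern = moment.replace(tzinfo=timezone(timedelta(hours=-5)))
print(format_datetime(eastern))              # Fri, 09 Nov 2001 01:08:47 -0500
```
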

.. function:: localtime(dt=None)

   Return local time as an aware datetime object.  If called without
   arguments, return current time.  Otherwise the *dt* argument should be a
   :class:`~datetime.datetime` instance, and it is converted to the local time
   zone according to the system time zone database.  If *dt* is naive (that
   is, ``dt.tzinfo`` is ``None``), it is assumed to be in local time.  In this
   case, a positive or zero value for *isdst* causes ``localtime`` to presume
   initially that summer time (for example, Daylight Saving Time) is or is not
   (respectively) in effect for the specified time.  A negative value for
   *isdst* causes ``localtime`` to attempt to divine whether summer time
   is in effect for the specified time.

   .. versionadded:: 3.3

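
The following sketch shows :func:`localtime` with and without an argument. The
actual offset printed depends on the system time zone, so only properties that
hold on any machine are noted in the comments.

```python
from datetime import datetime, timezone
from email.utils import localtime

# Current time as an aware datetime in the local time zone.
now = localtime()
print(now.tzinfo is not None)        # True

# An aware datetime is converted to the local zone; the instant is unchanged.
utc_moment = datetime(2001, 11, 9, 1, 8, 47, tzinfo=timezone.utc)
local_moment = localtime(utc_moment)
print(local_moment == utc_moment)    # True
print(local_moment.utcoffset())      # machine-dependent
```
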

.. function:: make_msgid(idstring=None, domain=None)

   Returns a string suitable for an :rfc:`2822`\ -compliant
   :mailheader:`Message-ID` header.  Optional *idstring* if given, is a string
   used to strengthen the uniqueness of the message id.  Optional *domain* if
   given provides the portion of the msgid after the '@'.  The default is the
   local hostname.  It is not normally necessary to override this default, but
   it may be useful in certain cases, such as constructing a distributed system
   that uses a consistent domain name across multiple hosts.

   .. versionchanged:: 3.2
      Added the *domain* keyword.

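
A small sketch of :func:`make_msgid` with an explicit *domain*. The domain
``example.com`` and the *idstring* ``orders`` are placeholders; the part before
the ``@`` varies on every call, so it is not shown.

```python
from email.utils import make_msgid

# Fix the domain so that every host in a cluster produces the same suffix.
msgid = make_msgid(domain='example.com')
print(msgid.endswith('@example.com>'))   # True

# An idstring is included in the local part of the id.
tagged = make_msgid('orders', domain='example.com')
print('orders' in tagged)                # True
```
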

.. function:: decode_rfc2231(s)

   Decode the string *s* according to :rfc:`2231`.


.. function:: encode_rfc2231(s, charset=None, language=None)

   Encode the string *s* according to :rfc:`2231`.  Optional *charset* and
   *language*, if given, are the character set name and language name to use.  If
   neither is given, *s* is returned as-is.  If *charset* is given but *language*
   is not, the string is encoded using the empty string for *language*.

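
The :rfc:`2231` helpers above can be tried out with a short sketch. The sample
string is arbitrary, and the note about the value staying percent-encoded
reflects what current CPython does, not something this section specifies.

```python
from email.utils import decode_rfc2231, encode_rfc2231

# charset and language are joined to the percent-encoded value with quotes.
encoded = encode_rfc2231('hello world', 'utf-8', 'en')
print(encoded)                               # utf-8'en'hello%20world

# With a charset but no language, the language part is empty.
print(encode_rfc2231('hello world', 'utf-8'))    # utf-8''hello%20world

# Decoding splits the string back into its three parts.  In current CPython
# the value part is left percent-encoded at this stage.
charset, language, text = decode_rfc2231(encoded)
print(charset, language, text)               # utf-8 en hello%20world
```
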

.. function:: collapse_rfc2231_value(value, errors='replace', fallback_charset='us-ascii')

   When a header parameter is encoded in :rfc:`2231` format,
   :meth:`Message.get_param <email.message.Message.get_param>` may return a
   3-tuple containing the character set,
   language, and value.  :func:`collapse_rfc2231_value` turns this into a Unicode
   string.  Optional *errors* is passed to the *errors* argument of :class:`str`'s
   :meth:`~str.encode` method; it defaults to ``'replace'``.  Optional
   *fallback_charset* specifies the character set to use if the one in the
   :rfc:`2231` header is not known by Python; it defaults to ``'us-ascii'``.

   For convenience, if the *value* passed to :func:`collapse_rfc2231_value` is not
   a tuple, it should be a string and it is returned unquoted.

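
A hedged sketch of :func:`collapse_rfc2231_value` on a hand-built 3-tuple. The
tuple is an assumption about the shape of the value: it carries the raw UTF-8
bytes of the text as individual characters, which is how such a value appears
to arrive from the parameter decoding in current CPython.

```python
from email.utils import collapse_rfc2231_value

# (charset, language, value): the value holds the UTF-8 bytes of "naïve.txt"
# as one character per byte.
param = ('utf-8', 'en', 'na\xc3\xafve.txt')
filename = collapse_rfc2231_value(param)
print(filename == 'na\xefve.txt')              # True

# A plain string is simply returned unquoted.
print(collapse_rfc2231_value('"report.txt"'))  # report.txt
```
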

.. function:: decode_params(params)

   Decode parameters list according to :rfc:`2231`.  *params* is a sequence of
   2-tuples containing elements of the form ``(content-type, string-value)``.


.. rubric:: Footnotes

.. [#] Note that the sign of the timezone offset is the opposite of the sign of the
   ``time.timezone`` variable for the same timezone; the latter variable follows
   the POSIX standard while this module follows :rfc:`2822`.