:mod:`email.utils`: Miscellaneous utilities
-------------------------------------------

.. module:: email.utils
   :synopsis: Miscellaneous email package utilities.

**Source code:** :source:`Lib/email/utils.py`

--------------

There are a couple of useful utilities provided in the :mod:`email.utils`
module:

.. function:: localtime(dt=None, isdst=-1)

   Return local time as an aware datetime object. If called without
   arguments, return current time. Otherwise the *dt* argument should be a
   :class:`~datetime.datetime` instance, and it is converted to the local time
   zone according to the system time zone database. If *dt* is naive (that
   is, ``dt.tzinfo`` is ``None``), it is assumed to be in local time. In this
   case, a positive or zero value for *isdst* causes ``localtime`` to presume
   initially that summer time (for example, Daylight Saving Time) is or is not
   (respectively) in effect for the specified time. A negative value for
   *isdst* causes ``localtime`` to attempt to divine whether summer time
   is in effect for the specified time.

   .. versionadded:: 3.3

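As a minimal sketch of the behaviour described for :func:`localtime`, the
following converts an aware UTC ``datetime`` to the local zone and also calls
the function without arguments. The sample date is arbitrary, and *isdst* is
deliberately left at its default.

```python
from datetime import datetime, timezone
from email.utils import localtime

# An aware UTC datetime is converted to the system's local time zone.
utc_dt = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)
local_dt = localtime(utc_dt)

# The result is aware and denotes the same instant as the input.
print(local_dt.tzinfo is not None)
print(local_dt == utc_dt)

# Without arguments, the current local time is returned as an aware datetime.
now = localtime()
print(now.utcoffset())
```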

.. function:: make_msgid(idstring=None, domain=None)

   Returns a string suitable for an :rfc:`2822`\ -compliant
   :mailheader:`Message-ID` header. Optional *idstring*, if given, is a string
   used to strengthen the uniqueness of the message id. Optional *domain*, if
   given, provides the portion of the msgid after the '@'. The default is the
   local hostname. It is not normally necessary to override this default, but
   doing so may be useful in certain cases, such as constructing a distributed
   system that uses a consistent domain name across multiple hosts.

   .. versionchanged:: 3.2
      Added the *domain* keyword.

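A short sketch of :func:`make_msgid` with both optional arguments. The values
``"orders"`` and ``"mail.example.com"`` are placeholders chosen for
illustration; the leading part of the id varies from call to call.

```python
from email.utils import make_msgid

# Supply an extra uniqueness token and an explicit domain (both hypothetical).
msgid = make_msgid(idstring="orders", domain="mail.example.com")

# The result is wrapped in angle brackets and ends with the given domain.
print(msgid)
print(msgid.endswith("@mail.example.com>"))
```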

The remaining functions are part of the legacy (``Compat32``) email API. There
is no need to directly use these with the new API, since the parsing and
formatting they provide is done automatically by the header parsing machinery
of the new API.


.. function:: quote(str)

   Return a new string with backslashes in *str* replaced by two backslashes, and
   double quotes replaced by backslash-double quote.


.. function:: unquote(str)

   Return a new string which is an *unquoted* version of *str*. If *str* ends and
   begins with double quotes, they are stripped off. Likewise if *str* ends and
   begins with angle brackets, they are stripped off.

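The escaping rules of :func:`quote` and the stripping rules of
:func:`unquote` can be illustrated with a small round trip. The sample strings
are invented for this sketch.

```python
from email.utils import quote, unquote

# The raw text contains double quotes and one backslash.
raw = 'Giant "Tiny" Cat\\Dog'

# quote() doubles backslashes and escapes double quotes with a backslash.
quoted = quote(raw)
print(quoted)

# unquote() strips one pair of enclosing double quotes and undoes the escapes.
restored = unquote('"' + quoted + '"')
print(restored == raw)

# Enclosing angle brackets are stripped as well.
bare_address = unquote("<user@example.com>")
print(bare_address)
```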

.. function:: parseaddr(address)

   Parse address -- which should be the value of some address-containing field such
   as :mailheader:`To` or :mailheader:`Cc` -- into its constituent *realname* and
   *email address* parts. Returns a tuple of that information, unless the parse
   fails, in which case a 2-tuple of ``('', '')`` is returned.

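A brief sketch of the three outcomes described for :func:`parseaddr`: an
address with a real name, a bare address, and a value that cannot be parsed.
The addresses are examples only.

```python
from email.utils import parseaddr

# A display name followed by an angle-bracketed address.
named = parseaddr("Jane Doe <jane@example.com>")
print(named)   # ('Jane Doe', 'jane@example.com')

# A bare address yields an empty real name.
bare = parseaddr("jane@example.com")
print(bare)    # ('', 'jane@example.com')

# An empty value cannot be parsed, so both parts come back empty.
failed = parseaddr("")
print(failed)  # ('', '')
```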

.. function:: formataddr(pair, charset='utf-8')

   The inverse of :func:`parseaddr`, this takes a 2-tuple of the form ``(realname,
   email_address)`` and returns the string value suitable for a :mailheader:`To` or
   :mailheader:`Cc` header. If the first element of *pair* is false, then the
   second element is returned unmodified.

   Optional *charset* is the character set that will be used in the :rfc:`2047`
   encoding of the ``realname`` if the ``realname`` contains non-ASCII
   characters. Can be an instance of :class:`str` or a
   :class:`~email.charset.Charset`. Defaults to ``utf-8``.

   .. versionchanged:: 3.3
      Added the *charset* option.

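The following sketch exercises :func:`formataddr` with a plain name, an empty
name, and a name containing non-ASCII characters. The names and addresses are
made up, and the exact encoded-word produced for the last case is not shown
because it depends on the chosen *charset*.

```python
from email.utils import formataddr, parseaddr

# A plain ASCII real name is placed before the angle-bracketed address.
simple = formataddr(("Jane Doe", "jane@example.com"))
print(simple)        # Jane Doe <jane@example.com>

# A false first element returns the address unmodified.
address_only = formataddr(("", "jane@example.com"))
print(address_only)  # jane@example.com

# A non-ASCII real name is RFC 2047 encoded using the default utf-8 charset.
encoded = formataddr(("Zo\u00eb M\u00fcller", "zoe@example.com"))
print(encoded)

# For a simple pair, parseaddr() recovers the original 2-tuple.
print(parseaddr(simple) == ("Jane Doe", "jane@example.com"))
```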

.. function:: getaddresses(fieldvalues)

   Return a list of 2-tuples of the form returned by ``parseaddr()``.
   *fieldvalues* is a sequence of header field values as might be returned by
   :meth:`Message.get_all <email.message.Message.get_all>`. Here's a simple
   example that gets all the recipients of a message::

      from email.utils import getaddresses

      tos = msg.get_all('to', [])
      ccs = msg.get_all('cc', [])
      resent_tos = msg.get_all('resent-to', [])
      resent_ccs = msg.get_all('resent-cc', [])
      all_recipients = getaddresses(tos + ccs + resent_tos + resent_ccs)

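For a variant that runs on its own, the sketch below builds a small message
from a string first. The message text and addresses are invented for
illustration, and :func:`email.message_from_string` is used with its default
(legacy) policy.

```python
from email import message_from_string
from email.utils import getaddresses

# A tiny hand-written message with two To recipients and one Cc recipient.
msg = message_from_string(
    "To: Alice <alice@example.com>, bob@example.com\n"
    "Cc: Carol <carol@example.com>\n"
    "\n"
    "body\n"
)

# Collect the raw header values and split them into (realname, address) pairs.
recipients = getaddresses(msg.get_all("to", []) + msg.get_all("cc", []))
for realname, address in recipients:
    print(repr(realname), address)
```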

.. function:: parsedate(date)

   Attempts to parse a date according to the rules in :rfc:`2822`. However, some
   mailers don't follow that format as specified, so :func:`parsedate` tries to
   guess correctly in such cases. *date* is a string containing an :rfc:`2822`
   date, such as ``"Mon, 20 Nov 1995 19:12:08 -0500"``. If it succeeds in parsing
   the date, :func:`parsedate` returns a 9-tuple that can be passed directly to
   :func:`time.mktime`; otherwise ``None`` will be returned. Note that indexes 6,
   7, and 8 of the result tuple are not usable.


.. function:: parsedate_tz(date)

   Performs the same function as :func:`parsedate`, but returns either ``None`` or
   a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
   :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC
   (which is the official term for Greenwich Mean Time) [#]_. If the input string
   has no timezone, the last element of the tuple returned is ``0``, which represents
   UTC. Note that indexes 6, 7, and 8 of the result tuple are not usable.

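A small sketch comparing :func:`parsedate` and :func:`parsedate_tz` on the
sample date used above. Only the first six fields and the offset are
inspected, since indexes 6, 7, and 8 are documented as not usable.

```python
from email.utils import parsedate, parsedate_tz

date_str = "Mon, 20 Nov 1995 19:12:08 -0500"

# parsedate() returns a 9-tuple; the first six fields hold the date and time.
plain = parsedate(date_str)
print(plain[:6])      # (1995, 11, 20, 19, 12, 8)

# parsedate_tz() adds a tenth element: the UTC offset in seconds.
with_tz = parsedate_tz(date_str)
print(with_tz[9])     # -18000, that is, five hours west of UTC

# A string that does not look like a date yields None.
unparsed = parsedate("not a date")
print(unparsed)       # None
```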

.. function:: parsedate_to_datetime(date)

   The inverse of :func:`format_datetime`. Performs the same function as
   :func:`parsedate`, but on success returns a :class:`~datetime.datetime`;
   otherwise ``ValueError`` is raised if *date* contains an invalid value such
   as an hour greater than 23 or a timezone offset not between -24 and 24 hours.
   If the input date has a timezone of ``-0000``, the ``datetime`` will be a naive
   ``datetime``, and if the date conforms to the RFCs it will represent a
   time in UTC but with no indication of the actual source timezone of the
   message the date comes from. If the input date has any other valid timezone
   offset, the ``datetime`` will be an aware ``datetime`` with the
   corresponding :class:`~datetime.timezone` :class:`~datetime.tzinfo`.

   .. versionadded:: 3.3

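The aware, naive, and error cases described for
:func:`parsedate_to_datetime` can be sketched as follows. The date strings are
illustrative; the last one uses an out-of-range hour to trigger the error
path.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

# A numeric offset other than -0000 gives an aware datetime.
aware = parsedate_to_datetime("Mon, 20 Nov 1995 19:12:08 -0500")
print(aware.utcoffset() == timedelta(hours=-5))

# The special -0000 offset gives a naive datetime.
naive = parsedate_to_datetime("Mon, 20 Nov 1995 19:12:08 -0000")
print(naive.tzinfo is None)

# An hour greater than 23 is rejected with ValueError.
try:
    parsedate_to_datetime("Mon, 20 Nov 1995 25:12:08 -0500")
    raised = False
except ValueError:
    raised = True
print(raised)
```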

.. function:: mktime_tz(tuple)

   Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC
   timestamp (seconds since the Epoch). If the timezone item in the
   tuple is ``None``, assume local time.

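As a sketch of how :func:`mktime_tz` combines with :func:`parsedate_tz`, the
two date strings below name the same instant (the Epoch) in different zones
and therefore produce the same timestamp. The dates were chosen so the result
is easy to check by hand.

```python
from email.utils import mktime_tz, parsedate_tz

# Midnight UTC on 1 January 1970 is the Epoch itself.
epoch_utc = mktime_tz(parsedate_tz("Thu, 01 Jan 1970 00:00:00 +0000"))
print(epoch_utc)       # 0

# The same instant written with a +0100 offset gives the same timestamp.
epoch_plus_one = mktime_tz(parsedate_tz("Thu, 01 Jan 1970 01:00:00 +0100"))
print(epoch_plus_one)  # 0
```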

.. function:: formatdate(timeval=None, localtime=False, usegmt=False)

   Returns a date string as per :rfc:`2822`, e.g.::

      Fri, 09 Nov 2001 01:08:47 -0000

   Optional *timeval* if given is a floating point time value as accepted by
   :func:`time.gmtime` and :func:`time.localtime`, otherwise the current time is
   used.

   Optional *localtime* is a flag that when ``True``, interprets *timeval*, and
   returns a date relative to the local timezone instead of UTC, properly taking
   daylight saving time into account. The default is ``False`` meaning UTC is
   used.

   Optional *usegmt* is a flag that when ``True``, outputs a date string with the
   timezone as an ASCII string ``GMT``, rather than a numeric ``-0000``. This is
   needed for some protocols (such as HTTP). This only applies when *localtime* is
   ``False``. The default is ``False``.

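A minimal sketch of the *usegmt* and *localtime* flags of
:func:`formatdate`, using the timestamp ``0`` so that the UTC results are
predictable. The local-time result depends on the system time zone, so no
expected value is shown for it.

```python
from email.utils import formatdate

# Default: UTC with the numeric -0000 zone.
utc_date = formatdate(0)
print(utc_date)    # Thu, 01 Jan 1970 00:00:00 -0000

# usegmt=True replaces the numeric zone with the string GMT.
http_date = formatdate(0, usegmt=True)
print(http_date)   # Thu, 01 Jan 1970 00:00:00 GMT

# localtime=True renders the same instant relative to the local zone.
local_date = formatdate(0, localtime=True)
print(local_date)
```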

.. function:: format_datetime(dt, usegmt=False)

   Like ``formatdate``, but the input is a :class:`~datetime.datetime` instance.
   If it is a naive datetime, it is assumed to be "UTC with no information about
   the source timezone", and the conventional ``-0000`` is used for the timezone.
   If it is an aware ``datetime``, then the numeric timezone offset is used.
   If it is an aware ``datetime`` with offset zero, then *usegmt* may be set to
   ``True``, in which case the string ``GMT`` is used instead of the numeric
   timezone offset. This provides a way to generate standards conformant HTTP
   date headers.

   .. versionadded:: 3.3

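The three cases described for :func:`format_datetime` can be sketched with
the same wall-clock time as the :func:`formatdate` example. Using
``timezone.utc`` for the aware value is an assumption made for this
illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# A naive datetime is rendered with the conventional -0000 zone.
naive_text = format_datetime(datetime(2001, 11, 9, 1, 8, 47))
print(naive_text)  # Fri, 09 Nov 2001 01:08:47 -0000

# An aware datetime is rendered with its numeric offset.
aware_dt = datetime(2001, 11, 9, 1, 8, 47, tzinfo=timezone.utc)
aware_text = format_datetime(aware_dt)
print(aware_text)  # Fri, 09 Nov 2001 01:08:47 +0000

# With a zero offset, usegmt=True produces an HTTP-style date.
gmt_text = format_datetime(aware_dt, usegmt=True)
print(gmt_text)    # Fri, 09 Nov 2001 01:08:47 GMT
```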

.. function:: decode_rfc2231(s)

   Decode the string *s* according to :rfc:`2231`.


.. function:: encode_rfc2231(s, charset=None, language=None)

   Encode the string *s* according to :rfc:`2231`. Optional *charset* and
   *language*, if given, are the character set name and language name to use. If
   neither is given, *s* is returned as-is. If *charset* is given but *language*
   is not, the string is encoded using the empty string for *language*.

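A hedged sketch of :func:`encode_rfc2231` and :func:`decode_rfc2231` used
together. The file name is invented, and the observation that
:func:`decode_rfc2231` yields a three-element sequence whose value part is
still percent-encoded reflects the CPython implementation rather than anything
stated above.

```python
from email.utils import decode_rfc2231, encode_rfc2231

# Encode a non-ASCII value with an explicit charset and language.
encoded = encode_rfc2231("r\u00e9sum\u00e9.pdf", charset="utf-8", language="en")
print(encoded)

# Omitting the language leaves that field empty between the two apostrophes.
no_language = encode_rfc2231("r\u00e9sum\u00e9.pdf", charset="utf-8")
print(no_language)

# Decoding splits the string into charset, language, and value parts.
charset, language, value = decode_rfc2231(encoded)
print(charset, language, value)
```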

.. function:: collapse_rfc2231_value(value, errors='replace', fallback_charset='us-ascii')

   When a header parameter is encoded in :rfc:`2231` format,
   :meth:`Message.get_param <email.message.Message.get_param>` may return a
   3-tuple containing the character set,
   language, and value. :func:`collapse_rfc2231_value` turns this into a unicode
   string. Optional *errors* is passed to the *errors* argument of :class:`str`'s
   :func:`~str.encode` method; it defaults to ``'replace'``. Optional
   *fallback_charset* specifies the character set to use if the one in the
   :rfc:`2231` header is not known by Python; it defaults to ``'us-ascii'``.

   For convenience, if the *value* passed to :func:`collapse_rfc2231_value` is not
   a tuple, it should be a string and it is returned unquoted.

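The interaction with ``Message.get_param`` can be sketched as follows. The
header and file name are invented for illustration, and the message is parsed
with the default (legacy) policy of :func:`email.message_from_string`.

```python
from email import message_from_string
from email.utils import collapse_rfc2231_value

# A message whose filename parameter uses the RFC 2231 extended syntax.
msg = message_from_string(
    "Content-Disposition: attachment; "
    "filename*=utf-8'en'r%C3%A9sum%C3%A9.pdf\n"
    "\n"
    "body\n"
)

# get_param() returns a (charset, language, value) 3-tuple for such values.
raw_param = msg.get_param("filename", header="content-disposition")
print(isinstance(raw_param, tuple))

# collapse_rfc2231_value() turns the tuple into an ordinary string.
filename = collapse_rfc2231_value(raw_param)
print(filename)

# A plain string is simply returned unquoted.
plain = collapse_rfc2231_value('"notes.txt"')
print(plain)
```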

.. function:: decode_params(params)

   Decode parameters list according to :rfc:`2231`. *params* is a sequence of
   2-tuples containing elements of the form ``(content-type, string-value)``.


.. rubric:: Footnotes

.. [#] Note that the sign of the timezone offset is the opposite of the sign of the
   ``time.timezone`` variable for the same timezone; the latter variable follows
   the POSIX standard while this module follows :rfc:`2822`.