:mod:`email.utils`: Miscellaneous utilities
-------------------------------------------

.. module:: email.utils
   :synopsis: Miscellaneous email package utilities.

**Source code:** :source:`Lib/email/utils.py`

--------------

There are a couple of useful utilities provided in the :mod:`email.utils`
module:

.. function:: localtime(dt=None, isdst=-1)

   Return the local time as an aware :class:`~datetime.datetime` object.  If
   called without arguments, return the current time.  Otherwise, *dt* should
   be a :class:`~datetime.datetime` instance, which is converted to the local
   time zone according to the system time zone database.  If *dt* is naive
   (that is, ``dt.tzinfo`` is ``None``), it is assumed to be in local time.  In
   this case, a positive or zero value for *isdst* causes ``localtime`` to
   presume initially that summer time (for example, Daylight Saving Time) is or
   is not (respectively) in effect for the specified time.  A negative value
   for *isdst* causes ``localtime`` to attempt to determine whether summer time
   is in effect for the specified time.

   .. versionadded:: 3.3
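
   As a minimal sketch of the behaviour described above (the date used here is
   arbitrary and chosen only for illustration), an aware UTC ``datetime`` is
   converted to the same instant expressed in the local time zone::

      from datetime import datetime, timezone
      from email.utils import localtime

      # With no argument, the current time is returned as an aware datetime.
      now = localtime()

      # An aware datetime is converted to the local zone; the instant is unchanged.
      utc_noon = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
      local_noon = localtime(utc_noon)

   Because aware datetimes compare by instant, ``local_noon == utc_noon`` holds
   regardless of the local UTC offset.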


.. function:: make_msgid(idstring=None, domain=None)

   Return a string suitable for an :rfc:`2822`\ -compliant
   :mailheader:`Message-ID` header.  Optional *idstring*, if given, is a string
   used to strengthen the uniqueness of the message id.  Optional *domain*, if
   given, provides the portion of the msgid after the '@'.  The default is the
   local hostname.  It is not normally necessary to override this default, but
   doing so may be useful in certain cases, such as constructing a distributed
   system that uses a consistent domain name across multiple hosts.

   .. versionchanged:: 3.2
      Added the *domain* keyword.
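
   A short sketch of overriding both optional arguments; the *idstring*
   ``"orders"`` and the domain ``mail.example.com`` are placeholder values
   chosen for illustration::

      from email.utils import make_msgid

      # Both ids share the same suffix but differ in their generated prefix.
      first = make_msgid(idstring="orders", domain="mail.example.com")
      second = make_msgid(idstring="orders", domain="mail.example.com")

   Each result is wrapped in angle brackets and ends with
   ``.orders@mail.example.com>``.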


The remaining functions are part of the legacy (``Compat32``) email API.  There
is no need to directly use these with the new API, since the parsing and
formatting they provide is done automatically by the header parsing machinery
of the new API.


.. function:: quote(str)

   Return a new string with backslashes in *str* replaced by two backslashes, and
   double quotes replaced by backslash-double quote.


.. function:: unquote(str)

   Return a new string which is an *unquoted* version of *str*.  If *str* ends and
   begins with double quotes, they are stripped off.  Likewise if *str* ends and
   begins with angle brackets, they are stripped off.
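
   The following sketch exercises :func:`quote` and :func:`unquote` on a few
   made-up strings; raw string literals are used so the backslashes are easy
   to read::

      from email.utils import quote, unquote

      escaped = quote(r'Bob "Bobby" \ Smith')   # r'Bob \"Bobby\" \\ Smith'
      plain = unquote('"Jane Doe"')             # 'Jane Doe'
      address = unquote('<jane@example.com>')   # 'jane@example.com'
      untouched = unquote('"unbalanced')        # no closing quote: unchanged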


.. function:: parseaddr(address)

   Parse *address*, which should be the value of some address-containing field
   such as :mailheader:`To` or :mailheader:`Cc`, into its constituent
   *realname* and *email address* parts.  Returns a 2-tuple of that
   information, unless the parse fails, in which case a 2-tuple of
   ``('', '')`` is returned.


.. function:: formataddr(pair, charset='utf-8')

   The inverse of :func:`parseaddr`, this takes a 2-tuple of the form ``(realname,
   email_address)`` and returns the string value suitable for a :mailheader:`To` or
   :mailheader:`Cc` header.  If the first element of *pair* is false, then the
   second element is returned unmodified.

   Optional *charset* is the character set that will be used in the :rfc:`2047`
   encoding of the ``realname`` if the ``realname`` contains non-ASCII
   characters.  Can be an instance of :class:`str` or a
   :class:`~email.charset.Charset`.  Defaults to ``utf-8``.

   .. versionchanged:: 3.3
      Added the *charset* option.
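
   A minimal round trip through :func:`parseaddr` and :func:`formataddr`; the
   names and addresses are invented for illustration::

      from email.utils import formataddr, parseaddr

      realname, address = parseaddr("Jane Doe <jane@example.com>")
      # realname == 'Jane Doe', address == 'jane@example.com'

      rebuilt = formataddr((realname, address))       # 'Jane Doe <jane@example.com>'
      bare = formataddr(("", "jane@example.com"))     # 'jane@example.com'

      # A non-ASCII realname is RFC 2047 encoded using *charset* (utf-8 here).
      encoded = formataddr(("Zoë Doe", "zoe@example.com"))

      failed = parseaddr("")                          # ('', '')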


.. function:: getaddresses(fieldvalues)

   This function returns a list of 2-tuples of the form returned by
   :func:`parseaddr`.  *fieldvalues* is a sequence of header field values as
   might be returned by
   :meth:`Message.get_all <email.message.Message.get_all>`.  Here's a simple
   example that gets all the recipients of a message::

      from email.utils import getaddresses

      # msg is an existing email.message.Message instance
      tos = msg.get_all('to', [])
      ccs = msg.get_all('cc', [])
      resent_tos = msg.get_all('resent-to', [])
      resent_ccs = msg.get_all('resent-cc', [])
      all_recipients = getaddresses(tos + ccs + resent_tos + resent_ccs)


.. function:: parsedate(date)

   Attempts to parse a date according to the rules in :rfc:`2822`.  However, some
   mailers don't follow that format as specified, so :func:`parsedate` tries to
   guess correctly in such cases.  *date* is a string containing an :rfc:`2822`
   date, such as  ``"Mon, 20 Nov 1995 19:12:08 -0500"``.  If it succeeds in parsing
   the date, :func:`parsedate` returns a 9-tuple that can be passed directly to
   :func:`time.mktime`; otherwise ``None`` will be returned.  Note that indexes 6,
   7, and 8 of the result tuple are not usable.
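
   A brief sketch using the sample date from the text above; only the first six
   fields of the result are inspected, since the remaining three are not
   usable::

      import time
      from email.utils import parsedate

      parts = parsedate("Mon, 20 Nov 1995 19:12:08 -0500")
      fields = parts[:6]                  # (1995, 11, 20, 19, 12, 8)

      # The 9-tuple is interpreted by time.mktime() as local time; the
      # "-0500" offset in the input is ignored by parsedate().
      local_timestamp = time.mktime(parts)

      unparsable = parsedate("not a date")   # None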


.. function:: parsedate_tz(date)

   Performs the same function as :func:`parsedate`, but returns either ``None`` or
   a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
   :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC
   (which is the official term for Greenwich Mean Time) [#]_.  If the input string
   has no timezone, the last element of the tuple returned is ``None``.  Note that
   indexes 6, 7, and 8 of the result tuple are not usable.


.. function:: parsedate_to_datetime(date)

   The inverse of :func:`format_datetime`.  Performs the same function as
   :func:`parsedate`, but on success returns a :class:`~datetime.datetime`.  If
   the input date has a timezone of ``-0000``, the ``datetime`` will be a naive
   ``datetime``, and if the date conforms to the RFCs it will represent a
   time in UTC but with no indication of the actual source timezone of the
   message the date comes from.  If the input date has any other valid timezone
   offset, the ``datetime`` will be an aware ``datetime`` with the
   corresponding :class:`~datetime.timezone` :class:`~datetime.tzinfo`.

   .. versionadded:: 3.3
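
   The two cases described above can be sketched as follows, reusing the sample
   date from :func:`parsedate` with two different offsets::

      from datetime import timedelta
      from email.utils import parsedate_to_datetime

      # A real offset yields an aware datetime carrying that offset.
      aware = parsedate_to_datetime("Mon, 20 Nov 1995 19:12:08 -0500")
      offset = aware.utcoffset()          # timedelta(hours=-5)

      # "-0000" yields a naive datetime (tzinfo is None).
      naive = parsedate_to_datetime("Mon, 20 Nov 1995 19:12:08 -0000")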


.. function:: mktime_tz(tuple)

   Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC
   timestamp (seconds since the Epoch).  If the timezone item in the
   tuple is ``None``, assume local time.
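
   A small sketch combining :func:`parsedate_tz` and :func:`mktime_tz` on the
   sample date used earlier in this section::

      from email.utils import mktime_tz, parsedate_tz

      parsed = parsedate_tz("Mon, 20 Nov 1995 19:12:08 -0500")
      offset_seconds = parsed[9]        # -18000, that is, five hours west of UTC

      # 19:12:08 at -0500 is 00:12:08 UTC on 21 Nov 1995.
      timestamp = mktime_tz(parsed)     # 816912728

   Since the offset is taken from the tuple, the result does not depend on the
   time zone of the machine running the code.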


.. function:: formatdate(timeval=None, localtime=False, usegmt=False)

   Returns a date string as per :rfc:`2822`, e.g.::

      Fri, 09 Nov 2001 01:08:47 -0000

   Optional *timeval*, if given, is a floating point time value as accepted by
   :func:`time.gmtime` and :func:`time.localtime`; otherwise the current time is
   used.

   Optional *localtime* is a flag that, when ``True``, interprets *timeval* and
   returns a date relative to the local timezone instead of UTC, properly taking
   daylight saving time into account.  The default is ``False``, meaning UTC is
   used.

   Optional *usegmt* is a flag that, when ``True``, outputs a date string with the
   timezone as an ASCII string ``GMT``, rather than a numeric ``-0000``.  This is
   needed for some protocols (such as HTTP).  This only applies when *localtime* is
   ``False``.  The default is ``False``.
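
   The flags can be compared on a fixed timestamp.  In this sketch the value
   ``1005268127`` was chosen because it corresponds to the sample date shown
   above::

      from email.utils import formatdate

      timeval = 1005268127

      default = formatdate(timeval)                 # 'Fri, 09 Nov 2001 01:08:47 -0000'
      http_style = formatdate(timeval, usegmt=True) # 'Fri, 09 Nov 2001 01:08:47 GMT'

      # The local form depends on the time zone of the machine running the code.
      local = formatdate(timeval, localtime=True)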


.. function:: format_datetime(dt, usegmt=False)

   Like ``formatdate``, but the input is a :class:`~datetime.datetime`
   instance.  If it is a naive datetime, it is assumed to be "UTC with no
   information about the source timezone", and the conventional ``-0000`` is
   used for the timezone.  If it is an aware ``datetime``, then the numeric
   timezone offset is used.  If it is an aware ``datetime`` with offset zero,
   then *usegmt* may be set to ``True``, in which case the string ``GMT`` is
   used instead of the numeric timezone offset.  This provides a way to
   generate standards conformant HTTP date headers.

   .. versionadded:: 3.3
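
   The naive, aware, and *usegmt* cases can be sketched with one instant
   expressed three ways (the date is the sample used for :func:`formatdate`)::

      from datetime import datetime, timedelta, timezone
      from email.utils import format_datetime

      naive = datetime(2001, 11, 9, 1, 8, 47)
      utc = naive.replace(tzinfo=timezone.utc)
      eastern = datetime(2001, 11, 8, 20, 8, 47,
                         tzinfo=timezone(timedelta(hours=-5)))

      from_naive = format_datetime(naive)           # 'Fri, 09 Nov 2001 01:08:47 -0000'
      from_utc = format_datetime(utc)               # 'Fri, 09 Nov 2001 01:08:47 +0000'
      as_gmt = format_datetime(utc, usegmt=True)    # 'Fri, 09 Nov 2001 01:08:47 GMT'
      from_eastern = format_datetime(eastern)       # 'Thu, 08 Nov 2001 20:08:47 -0500'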


.. function:: decode_rfc2231(s)

   Decode the string *s* according to :rfc:`2231`.


.. function:: encode_rfc2231(s, charset=None, language=None)

   Encode the string *s* according to :rfc:`2231`.  Optional *charset* and
   *language*, if given, are the character set name and language name to use.  If
   neither is given, *s* is returned as-is.  If *charset* is given but *language*
   is not, the string is encoded using the empty string for *language*.
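
   As an illustrative sketch, a made-up non-ASCII file name is encoded and then
   split back into its parts with :func:`decode_rfc2231`::

      from email.utils import decode_rfc2231, encode_rfc2231

      encoded = encode_rfc2231("naïve file.txt", charset="utf-8", language="en")
      # encoded == "utf-8'en'na%C3%AFve%20file.txt"

      # decode_rfc2231() only splits on the single quotes; the value part
      # stays percent-encoded.
      charset, language, value = decode_rfc2231(encoded)

      no_language = encode_rfc2231("naïve file.txt", charset="utf-8")
      # no_language == "utf-8''na%C3%AFve%20file.txt"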


.. function:: collapse_rfc2231_value(value, errors='replace', fallback_charset='us-ascii')

   When a header parameter is encoded in :rfc:`2231` format,
   :meth:`Message.get_param <email.message.Message.get_param>` may return a
   3-tuple containing the character set,
   language, and value.  :func:`collapse_rfc2231_value` turns this into a unicode
   string.  Optional *errors* is passed to the *errors* argument of :class:`str`'s
   :meth:`~str.encode` method; it defaults to ``'replace'``.  Optional
   *fallback_charset* specifies the character set to use if the one in the
   :rfc:`2231` header is not known by Python; it defaults to ``'us-ascii'``.

   For convenience, if the *value* passed to :func:`collapse_rfc2231_value` is not
   a tuple, it should be a string and it is returned unquoted.
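
   A minimal sketch of both input forms.  The 3-tuple is written by hand here
   to mimic what ``get_param`` may return for a UTF-8 encoded file name, with
   each byte of the value held as one character; that shape is an assumption
   made for illustration::

      from email.utils import collapse_rfc2231_value

      # (charset, language, value); "\xc3\xaf" are the UTF-8 bytes of "ï".
      param = ("utf-8", "", "na\xc3\xafve.txt")
      filename = collapse_rfc2231_value(param)        # 'naïve.txt'

      # A plain string is simply unquoted.
      plain = collapse_rfc2231_value('"report.txt"')  # 'report.txt'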


.. function:: decode_params(params)

   Decode parameters list according to :rfc:`2231`.  *params* is a sequence of
   2-tuples containing elements of the form ``(content-type, string-value)``.


.. rubric:: Footnotes

.. [#] Note that the sign of the timezone offset is the opposite of the sign of the
   ``time.timezone`` variable for the same timezone; the latter variable follows
   the POSIX standard while this module follows :rfc:`2822`.