:mod:`email`: Policy Objects
----------------------------

.. module:: email.policy
   :synopsis: Controlling the parsing and generating of messages

.. versionadded:: 3.3


The :mod:`email` package's prime focus is the handling of email messages as
described by the various email and MIME RFCs. However, the general format of
email messages (a block of header fields each consisting of a name followed by
a colon followed by a value, the whole block followed by a blank line and an
arbitrary 'body') is a format that has found utility outside of the realm of
email. Some of these uses conform fairly closely to the main RFCs, some do
not. And even when working with email, there are times when it is desirable to
break strict compliance with the RFCs.

Policy objects give the email package the flexibility to handle all these
disparate use cases.

A :class:`Policy` object encapsulates a set of attributes and methods that
control the behavior of various components of the email package during use.
:class:`Policy` instances can be passed to various classes and methods in the
email package to alter the default behavior. The settable values and their
defaults are described below.

There is a default policy used by all classes in the email package. This
policy is named :class:`Compat32`, with a corresponding pre-defined instance
named :const:`compat32`. It provides for complete backward compatibility (in
some cases, including bug compatibility) with the pre-Python 3.3 version of the
email package.

The first part of this documentation covers the features of :class:`Policy`, an
:term:`abstract base class` that defines the features that are common to all
policy objects, including :const:`compat32`. This includes certain hook
methods that are called internally by the email package, which a custom policy
could override to obtain different behavior.

When a :class:`~email.message.Message` object is created, it acquires a policy.
By default this will be :const:`compat32`, but a different policy can be
specified. If the ``Message`` is created by a :mod:`~email.parser`, a policy
passed to the parser will be the policy used by the ``Message`` it creates. If
the ``Message`` is created by the program, then the policy can be specified
when it is created. When a ``Message`` is passed to a :mod:`~email.generator`,
the generator uses the policy from the ``Message`` by default, but you can also
pass a specific policy to the generator that will override the one stored on
the ``Message`` object.
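
The three ways a ``Message`` can end up with a policy, as described above, can
be sketched as follows. This is only an illustration: the message text and the
variable names (``plain``, ``smtp_like``, ``parsed``) are made up for the
example.

```python
import io
from email import message_from_string
from email.generator import Generator
from email.message import Message
from email.policy import compat32

# A Message created directly uses compat32 unless another policy is given.
plain = Message()
print(plain.policy is compat32)

# A policy handed to the parser ends up on the Message the parser creates.
smtp_like = compat32.clone(linesep='\r\n')
parsed = message_from_string('Subject: hi\n\nbody\n', policy=smtp_like)
print(parsed.policy is smtp_like)

# By default, serialization follows the policy stored on the Message ...
stored_policy_text = parsed.as_string()

# ... but a policy passed to the generator overrides the stored one.
buf = io.StringIO()
Generator(buf, policy=compat32).flatten(parsed)
override_text = buf.getvalue()

print(repr(stored_policy_text))
print(repr(override_text))
```

The first serialization should use ``\r\n`` line endings because that is what
the stored policy asks for, while the second should fall back to ``\n``.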

:class:`Policy` instances are immutable, but they can be cloned: the
:meth:`~Policy.clone` method accepts the same keyword arguments as the class
constructor and returns a new :class:`Policy` instance that is a copy of the
original but with the specified attribute values changed.
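
A minimal sketch of what immutability and cloning look like in practice. The
names ``strict`` and ``rejected`` are only for illustration.

```python
from email.policy import compat32

# clone() returns a new policy; the original instance is left untouched.
strict = compat32.clone(raise_on_defect=True)
print(strict.raise_on_defect, compat32.raise_on_defect)

# Assigning to an attribute of an existing policy is refused.
rejected = False
try:
    strict.linesep = '\r\n'
except AttributeError as exc:
    rejected = True
    print('read-only:', exc)
```

Because every change produces a new object, a policy can be shared freely
between messages, parsers and generators.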

As an example, the following code could be used to read an email message from a
file on disk and pass it to the system ``sendmail`` program on a Unix system::

    >>> from email import message_from_binary_file
    >>> from email.generator import BytesGenerator
    >>> from email.utils import getaddresses
    >>> from subprocess import Popen, PIPE
    >>> with open('mymsg.txt', 'rb') as f:
    ...     msg = message_from_binary_file(f)
    >>> # Under the default policy msg['To'] is a plain string, so parse it.
    >>> to_addr = getaddresses([msg['To']])[0][1]
    >>> p = Popen(['sendmail', to_addr], stdin=PIPE)
    >>> g = BytesGenerator(p.stdin, policy=msg.policy.clone(linesep='\r\n'))
    >>> g.flatten(msg)
    >>> p.stdin.close()
    >>> rc = p.wait()

Here we are telling :class:`~email.generator.BytesGenerator` to use the
RFC-correct line separator characters when creating the binary string to feed
into ``sendmail's`` ``stdin``, where the default policy would use ``\n`` line
separators.

Some email package methods accept a *policy* keyword argument, allowing the
policy to be overridden for that method. For example, the following code uses
the :meth:`~email.message.Message.as_string` method of the *msg* object from
the previous example and writes the message to a file using the native line
separators for the platform on which it is running::

    >>> import os
    >>> # Text mode with newline='' so the separators are written unchanged.
    >>> with open('converted.txt', 'w', newline='') as f:
    ...     f.write(msg.as_string(policy=msg.policy.clone(linesep=os.linesep)))

Policy objects can also be combined using the addition operator, producing a
policy object whose settings are a combination of the non-default values of the
summed objects::

    >>> from email.policy import compat32
    >>> compat_SMTP = compat32.clone(linesep='\r\n')
    >>> compat_strict = compat32.clone(raise_on_defect=True)
    >>> compat_strict_SMTP = compat_SMTP + compat_strict

This operation is not commutative; that is, the order in which the objects are
added matters. To illustrate::

    >>> policy100 = compat32.clone(max_line_length=100)
    >>> policy80 = compat32.clone(max_line_length=80)
    >>> apolicy = policy100 + policy80
    >>> apolicy.max_line_length
    80
    >>> apolicy = policy80 + policy100
    >>> apolicy.max_line_length
    100
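
To show what a combined policy does once it is actually used, the sketch below
parses a deliberately defective message twice. The input text is a made-up
example (a ``multipart`` message that declares no boundary); the variable names
are illustrative only.

```python
from email import errors, message_from_string
from email.policy import compat32

compat_SMTP = compat32.clone(linesep='\r\n')
compat_strict = compat32.clone(raise_on_defect=True)
compat_strict_SMTP = compat_SMTP + compat_strict

# Hypothetical defective input: multipart content type without a boundary.
broken = 'Content-Type: multipart/mixed\n\nno boundary here\n'

# With raise_on_defect left at False, defects are collected on the message.
lenient = message_from_string(broken, policy=compat_SMTP)
recorded = [type(d).__name__ for d in lenient.defects]
print(recorded)

# The combined policy carries both non-default settings, so parsing raises.
raised = None
try:
    message_from_string(broken, policy=compat_strict_SMTP)
except errors.MessageDefect as exc:
    raised = type(exc).__name__
print(raised)
```

The combined policy keeps the ``\r\n`` line separator from ``compat_SMTP`` and
the strict defect handling from ``compat_strict``.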


.. class:: Policy(**kw)

   This is the :term:`abstract base class` for all policy classes. It provides
   default implementations for a couple of trivial methods, as well as the
   implementation of the immutability property, the :meth:`clone` method, and
   the constructor semantics.

   The constructor of a policy class can be passed various keyword arguments.
   The arguments that may be specified are any non-method properties on this
   class, plus any additional non-method properties on the concrete class. A
   value specified in the constructor will override the default value for the
   corresponding attribute.

   This class defines the following properties, and thus values for the
   following may be passed in the constructor of any policy class:

   .. attribute:: max_line_length

      The maximum length of any line in the serialized output, not counting the
      end of line character(s). Default is 78, per :rfc:`5322`. A value of
      ``0`` or :const:`None` indicates that no line wrapping should be
      done at all.

   .. attribute:: linesep

      The string to be used to terminate lines in serialized output. The
      default is ``\n`` because that's the internal end-of-line discipline used
      by Python, though ``\r\n`` is required by the RFCs.

   .. attribute:: cte_type

      Controls the type of Content Transfer Encodings that may be or are
      required to be used. The possible values are:

      ========  ===============================================================
      ``7bit``  all data must be "7 bit clean" (ASCII-only). This means that
                where necessary data will be encoded using either
                quoted-printable or base64 encoding.

      ``8bit``  data is not constrained to be 7 bit clean. Data in headers is
                still required to be ASCII-only and so will be encoded (see
                :meth:`fold_binary` below for an exception), but body parts may
                use the ``8bit`` CTE.
      ========  ===============================================================

      A ``cte_type`` value of ``8bit`` only works with ``BytesGenerator``, not
      ``Generator``, because strings cannot contain binary data. If a
      ``Generator`` is operating under a policy that specifies
      ``cte_type=8bit``, it will act as if ``cte_type`` is ``7bit``.

   .. attribute:: raise_on_defect

      If :const:`True`, any defects encountered will be raised as errors. If
      :const:`False` (the default), defects will be passed to the
      :meth:`register_defect` method.

   The following :class:`Policy` method is intended to be called by code using
   the email library to create policy instances with custom settings:

   .. method:: clone(**kw)

      Return a new :class:`Policy` instance whose attributes have the same
      values as the current instance, except where those attributes are
      given new values by the keyword arguments.

   The remaining :class:`Policy` methods are called by the email package code,
   and are not intended to be called by an application using the email package.
   A custom policy must implement all of these methods.

   .. method:: handle_defect(obj, defect)

      Handle a *defect* found on *obj*. When the email package calls this
      method, *defect* will always be a subclass of
      :class:`~email.errors.Defect`.

      The default implementation checks the :attr:`raise_on_defect` flag. If
      it is ``True``, *defect* is raised as an exception. If it is ``False``
      (the default), *obj* and *defect* are passed to :meth:`register_defect`.

   .. method:: register_defect(obj, defect)

      Register a *defect* on *obj*. In the email package, *defect* will always
      be a subclass of :class:`~email.errors.Defect`.

      The default implementation calls the ``append`` method of the ``defects``
      attribute of *obj*. When the email package calls :meth:`handle_defect`,
      *obj* will normally have a ``defects`` attribute that has an ``append``
      method. Custom object types used with the email package (for example,
      custom ``Message`` objects) should also provide such an attribute,
      otherwise defects in parsed messages will raise unexpected errors.

   .. method:: header_source_parse(sourcelines)

      The email package calls this method with a list of strings, each string
      ending with the line separation characters found in the source being
      parsed. The first line includes the field header name and separator.
      All whitespace in the source is preserved. The method should return the
      ``(name, value)`` tuple that is to be stored in the ``Message`` to
      represent the parsed header.

      If an implementation wishes to retain compatibility with the existing
      email package policies, *name* should be the case preserved name (all
      characters up to the '``:``' separator), while *value* should be the
      unfolded value (all line separator characters removed, but whitespace
      kept intact), stripped of leading whitespace.

      *sourcelines* may contain surrogateescaped binary data.

      There is no default implementation.

   .. method:: header_store_parse(name, value)

      The email package calls this method with the name and value provided by
      the application program when the application program is modifying a
      ``Message`` programmatically (as opposed to a ``Message`` created by a
      parser). The method should return the ``(name, value)`` tuple that is to
      be stored in the ``Message`` to represent the header.

      If an implementation wishes to retain compatibility with the existing
      email package policies, the *name* and *value* should be strings or
      string subclasses that do not change the content of the passed in
      arguments.

      There is no default implementation.

   .. method:: header_fetch_parse(name, value)

      The email package calls this method with the *name* and *value* currently
      stored in the ``Message`` when that header is requested by the
      application program, and whatever the method returns is what is passed
      back to the application as the value of the header being retrieved.
      Note that there may be more than one header with the same name stored in
      the ``Message``; the method is passed the specific name and value of the
      header destined to be returned to the application.

      *value* may contain surrogateescaped binary data. There should be no
      surrogateescaped binary data in the value returned by the method.

      There is no default implementation.

   .. method:: fold(name, value)

      The email package calls this method with the *name* and *value* currently
      stored in the ``Message`` for a given header. The method should return a
      string that represents that header "folded" correctly (according to the
      policy settings) by composing the *name* with the *value* and inserting
      :attr:`linesep` characters at the appropriate places. See :rfc:`5322`
      for a discussion of the rules for folding email headers.

      *value* may contain surrogateescaped binary data. There should be no
      surrogateescaped binary data in the string returned by the method.

   .. method:: fold_binary(name, value)

      The same as :meth:`fold`, except that the returned value should be a
      bytes object rather than a string.

      *value* may contain surrogateescaped binary data. These could be
      converted back into binary data in the returned bytes object.
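
The hook methods of :class:`Policy` described above can be overridden in a
subclass. The sketch below is one possible custom policy, not something the
email package provides: ``RecordingPolicy`` and ``seen_defects`` are
hypothetical names, and the class builds on :class:`Compat32` so that only the
two hooks of interest need to be written.

```python
from email import message_from_string
from email.policy import Compat32

# Policy instances are immutable, so the log lives outside the policy object.
seen_defects = []


class RecordingPolicy(Compat32):
    """Hypothetical policy: logs defects and unfolds fetched header values."""

    def register_defect(self, obj, defect):
        # Keep the default behaviour (append to obj.defects) and log as well.
        seen_defects.append(type(defect).__name__)
        super().register_defect(obj, defect)

    def header_fetch_parse(self, name, value):
        value = super().header_fetch_parse(name, value)
        if isinstance(value, str):
            # Drop the line breaks that folding left inside the stored value.
            value = ''.join(value.splitlines())
        return value


# Made-up input: a folded Subject header and a multipart type with no boundary.
source = 'Content-Type: multipart/mixed\nSubject: hello\n world\n\nbody\n'
msg = message_from_string(source, policy=RecordingPolicy())

print(msg['Subject'])
print(seen_defects)
```

Both hooks are called by the email package itself: ``register_defect`` while
the parser runs, and ``header_fetch_parse`` each time the application reads a
header from the message.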


.. class:: Compat32(**kw)

   This concrete :class:`Policy` is the backward compatibility policy. It
   replicates the behavior of the email package in Python 3.2. The
   :mod:`~email.policy` module also defines an instance of this class,
   :const:`compat32`, that is used as the default policy. Thus the default
   behavior of the email package is to maintain compatibility with Python 3.2.

   The class provides the following concrete implementations of the
   abstract methods of :class:`Policy`:

   .. method:: header_source_parse(sourcelines)

      The name is parsed as everything up to the '``:``' and returned
      unmodified. The value is determined by stripping leading whitespace off
      the remainder of the first line, joining all subsequent lines together,
      and stripping any trailing carriage return or linefeed characters.

   .. method:: header_store_parse(name, value)

      The name and value are returned unmodified.

   .. method:: header_fetch_parse(name, value)

      If the value contains binary data, it is converted into a
      :class:`~email.header.Header` object using the ``unknown-8bit`` charset.
      Otherwise it is returned unmodified.

   .. method:: fold(name, value)

      Headers are folded using the :class:`~email.header.Header` folding
      algorithm, which preserves existing line breaks in the value, and wraps
      each resulting line to the ``max_line_length``. Non-ASCII binary data is
      CTE encoded using the ``unknown-8bit`` charset.

   .. method:: fold_binary(name, value)

      Headers are folded using the :class:`~email.header.Header` folding
      algorithm, which preserves existing line breaks in the value, and wraps
      each resulting line to the ``max_line_length``. If ``cte_type`` is
      ``7bit``, non-ASCII binary data is CTE encoded using the ``unknown-8bit``
      charset. Otherwise the original source header is used, with its existing
      line breaks and any (RFC invalid) binary data it may contain.
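
Although these methods are meant to be called by the email package rather than
by applications, calling them directly on :const:`compat32` is a convenient way
to see what the descriptions above mean. The header values used here are
made-up examples, and the variable names are illustrative only.

```python
from email.policy import compat32

# header_source_parse: the name is everything before the colon; the value
# keeps its internal line break and loses only the trailing one.
name, value = compat32.header_source_parse(['Subject: hello\r\n', ' world\r\n'])
print((name, value))

# fold and fold_binary end the header with the policy's line separator.
smtp = compat32.clone(linesep='\r\n')
folded_text = smtp.fold('Subject', 'hello')
folded_bytes = smtp.fold_binary('Subject', 'hello')
print(repr(folded_text), repr(folded_bytes))

# A value carrying one surrogateescaped non-ASCII byte, as a parser fed
# binary input might store it.
raw = b'caf\xe9'.decode('ascii', 'surrogateescape')

# Under the default cte_type of 8bit the original byte is passed through;
# under 7bit it is encoded with the unknown-8bit charset instead.
as_8bit = compat32.fold_binary('Subject', raw)
as_7bit = compat32.clone(cte_type='7bit').fold_binary('Subject', raw)
print(as_8bit)
print(as_7bit)
```

The last two calls differ only in ``cte_type``, which is what decides whether
:meth:`fold_binary` may emit the RFC-invalid byte unchanged.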