:mod:`email`: Policy Objects
----------------------------

.. module:: email.policy
   :synopsis: Controlling the parsing and generating of messages

.. versionadded:: 3.3


The :mod:`email` package's prime focus is the handling of email messages as
described by the various email and MIME RFCs. However, the general format of
email messages (a block of header fields each consisting of a name followed by
a colon followed by a value, the whole block followed by a blank line and an
arbitrary 'body') is a format that has found utility outside of the realm of
email. Some of these uses conform fairly closely to the main RFCs, some do
not. And even when working with email, there are times when it is desirable to
break strict compliance with the RFCs.

Policy objects give the email package the flexibility to handle all these
disparate use cases.

A :class:`Policy` object encapsulates a set of attributes and methods that
control the behavior of various components of the email package during use.
:class:`Policy` instances can be passed to various classes and methods in the
email package to alter the default behavior. The settable values and their
defaults are described below.

There is a default policy used by all classes in the email package. This
policy is named :class:`Compat32`, with a corresponding pre-defined instance
named :const:`compat32`. It provides for complete backward compatibility (in
some cases, including bug compatibility) with the pre-Python 3.3 version of the
email package.

The first part of this documentation covers the features of :class:`Policy`, an
:term:`abstract base class` that defines the features that are common to all
policy objects, including :const:`compat32`. This includes certain hook
methods that are called internally by the email package, which a custom policy
could override to obtain different behavior.

When a :class:`~email.message.Message` object is created, it acquires a policy.
By default this will be :const:`compat32`, but a different policy can be
specified. If the ``Message`` is created by a :mod:`~email.parser`, a policy
passed to the parser will be the policy used by the ``Message`` it creates. If
the ``Message`` is created by the program, then the policy can be specified
when it is created. When a ``Message`` is passed to a :mod:`~email.generator`,
the generator uses the policy from the ``Message`` by default, but you can also
pass a specific policy to the generator that will override the one stored on
the ``Message`` object.

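The ways a message acquires its policy can be sketched as follows. This is a
minimal illustration, assuming Python 3.3 or later; the variable names and the
sample message text are invented for the example.

```python
from email import message_from_string
from email.message import Message
from email.policy import compat32

# A policy that differs from the default only in raise_on_defect.
strict = compat32.clone(raise_on_defect=True)

# A message built by the program uses compat32 unless told otherwise.
default_msg = Message()
built_msg = Message(policy=strict)

# A message built by a parser inherits the policy given to the parser.
parsed_msg = message_from_string('Subject: hi\n\nbody\n', policy=strict)

print(default_msg.policy is compat32)
print(built_msg.policy is strict)
print(parsed_msg.policy is strict)
```

In each case the policy is available afterwards as the ``policy`` attribute of
the ``Message``.
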
:class:`Policy` instances are immutable, but they can be cloned, accepting the
same keyword arguments as the class constructor and returning a new
:class:`Policy` instance that is a copy of the original but with the specified
attribute values changed.

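A short sketch of what immutability and cloning look like in practice. It
assumes that a rejected assignment surfaces as :exc:`AttributeError`, which is
how the package behaves as far as we know; the names ``wide`` and
``assignment_error`` are illustrative only.

```python
from email.policy import compat32

# clone() returns a new policy; the original is left untouched.
wide = compat32.clone(max_line_length=100)

# Direct assignment is rejected because policy instances are immutable.
try:
    compat32.max_line_length = 100
except AttributeError as exc:
    assignment_error = exc
else:
    assignment_error = None

print(compat32.max_line_length, wide.max_line_length)
print(type(assignment_error).__name__)
```

Attributes that are not named in the call to ``clone`` keep the values they
had on the original policy.
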
As an example, the following code could be used to read an email message from a
file on disk and pass it to the system ``sendmail`` program on a Unix system::

   >>> from email import message_from_binary_file
   >>> from email.generator import BytesGenerator
   >>> from email.utils import getaddresses
   >>> from subprocess import Popen, PIPE
   >>> with open('mymsg.txt', 'rb') as f:
   ...     msg = message_from_binary_file(f)
   >>> # Under the default policy msg['To'] is a plain string, so parse it.
   >>> recipient = getaddresses([msg['To']])[0][1]
   >>> p = Popen(['sendmail', recipient], stdin=PIPE)
   >>> g = BytesGenerator(p.stdin, policy=msg.policy.clone(linesep='\r\n'))
   >>> g.flatten(msg)
   >>> p.stdin.close()
   >>> rc = p.wait()

Here we are telling :class:`~email.generator.BytesGenerator` to use the
RFC-correct line separator characters when creating the binary string to feed
into ``sendmail``'s ``stdin``, where the default policy would use ``\n`` line
separators.

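The effect of the ``linesep`` override can be checked without ``sendmail`` by
flattening into an in-memory buffer. The helper ``flatten`` below is a
hypothetical convenience function written for this sketch, not part of the
email package, and the message text is made up.

```python
from email import message_from_bytes
from email.generator import BytesGenerator
from io import BytesIO

msg = message_from_bytes(b'Subject: hi\n\nbody\n')

def flatten(message, policy=None):
    """Serialize message to bytes, optionally overriding its policy."""
    buf = BytesIO()
    BytesGenerator(buf, policy=policy).flatten(message)
    return buf.getvalue()

# With no override the generator falls back to the message's own policy.
default_bytes = flatten(msg)
# With an override every line ends in CRLF, as the RFCs require.
smtp_bytes = flatten(msg, msg.policy.clone(linesep='\r\n'))

print(default_bytes)
print(smtp_bytes)
```

Only the line endings differ between the two results; the header and body
content are the same.
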
Some email package methods accept a *policy* keyword argument, allowing the
policy to be overridden for that method. For example, the following code uses
the :meth:`~email.message.Message.as_string` method of the *msg* object from
the previous example and writes the message to a file using the native line
separators for the platform on which it is running::

   >>> import os
   >>> # newline='' stops the text layer from translating line endings again.
   >>> with open('converted.txt', 'w', newline='') as f:
   ...     f.write(msg.as_string(policy=msg.policy.clone(linesep=os.linesep)))

Policy objects can also be combined using the addition operator, producing a
policy object whose settings are a combination of the non-default values of the
summed objects::

   >>> from email.policy import compat32
   >>> compat_SMTP = compat32.clone(linesep='\r\n')
   >>> compat_strict = compat32.clone(raise_on_defect=True)
   >>> compat_strict_SMTP = compat_SMTP + compat_strict

This operation is not commutative; that is, the order in which the objects are
added matters. To illustrate::

   >>> policy100 = compat32.clone(max_line_length=100)
   >>> policy80 = compat32.clone(max_line_length=80)
   >>> apolicy = policy100 + policy80
   >>> apolicy.max_line_length
   80
   >>> apolicy = policy80 + policy100
   >>> apolicy.max_line_length
   100


.. class:: Policy(**kw)

   This is the :term:`abstract base class` for all policy classes. It provides
   default implementations for a couple of trivial methods, as well as the
   implementation of the immutability property, the :meth:`clone` method, and
   the constructor semantics.

   The constructor of a policy class can be passed various keyword arguments.
   The arguments that may be specified are any non-method properties on this
   class, plus any additional non-method properties on the concrete class. A
   value specified in the constructor will override the default value for the
   corresponding attribute.

   This class defines the following properties, and thus values for the
   following may be passed in the constructor of any policy class:

   .. attribute:: max_line_length

      The maximum length of any line in the serialized output, not counting the
      end of line character(s). Default is 78, per :rfc:`5322`. A value of
      ``0`` or :const:`None` indicates that no line wrapping should be
      done at all.

   .. attribute:: linesep

      The string to be used to terminate lines in serialized output. The
      default is ``\n`` because that's the internal end-of-line discipline used
      by Python, though ``\r\n`` is required by the RFCs.

   .. attribute:: cte_type

      Controls the type of Content Transfer Encodings that may be or are
      required to be used. The possible values are:

      ========  ===============================================================
      ``7bit``  all data must be "7 bit clean" (ASCII-only). This means that
                where necessary data will be encoded using either
                quoted-printable or base64 encoding.

      ``8bit``  data is not constrained to be 7 bit clean. Data in headers is
                still required to be ASCII-only and so will be encoded (see
                :meth:`fold_binary` below for an exception), but body parts
                may use the ``8bit`` CTE.
      ========  ===============================================================

      A ``cte_type`` value of ``8bit`` only works with ``BytesGenerator``, not
      ``Generator``, because strings cannot contain binary data. If a
      ``Generator`` is operating under a policy that specifies
      ``cte_type=8bit``, it will act as if ``cte_type`` is ``7bit``.

   .. attribute:: raise_on_defect

      If :const:`True`, any defects encountered will be raised as errors. If
      :const:`False` (the default), defects will be passed to the
      :meth:`register_defect` method.

   The following :class:`Policy` method is intended to be called by code using
   the email library to create policy instances with custom settings:

   .. method:: clone(**kw)

      Return a new :class:`Policy` instance whose attributes have the same
      values as the current instance, except where those attributes are
      given new values by the keyword arguments.

   The remaining :class:`Policy` methods are called by the email package code,
   and are not intended to be called by an application using the email package.
   A custom policy must implement all of these methods.

   .. method:: handle_defect(obj, defect)

      Handle a *defect* found on *obj*. When the email package calls this
      method, *defect* will always be a subclass of
      :class:`~email.errors.Defect`.

      The default implementation checks the :attr:`raise_on_defect` flag. If
      it is ``True``, *defect* is raised as an exception. If it is ``False``
      (the default), *obj* and *defect* are passed to :meth:`register_defect`.

   .. method:: register_defect(obj, defect)

      Register a *defect* on *obj*. In the email package, *defect* will always
      be a subclass of :class:`~email.errors.Defect`.

      The default implementation calls the ``append`` method of the ``defects``
      attribute of *obj*. When the email package calls :meth:`handle_defect`,
      *obj* will normally have a ``defects`` attribute that has an ``append``
      method. Custom object types used with the email package (for example,
      custom ``Message`` objects) should also provide such an attribute,
      otherwise defects in parsed messages will raise unexpected errors.

   .. method:: header_source_parse(sourcelines)

      The email package calls this method with a list of strings, each string
      ending with the line separation characters found in the source being
      parsed. The first line includes the header field name and separator.
      All whitespace in the source is preserved. The method should return the
      ``(name, value)`` tuple that is to be stored in the ``Message`` to
      represent the parsed header.

      If an implementation wishes to retain compatibility with the existing
      email package policies, *name* should be the case-preserved name (all
      characters up to the '``:``' separator), while *value* should be the
      unfolded value (all line separator characters removed, but whitespace
      kept intact), stripped of leading whitespace.

      *sourcelines* may contain surrogateescaped binary data.

      There is no default implementation.

   .. method:: header_store_parse(name, value)

      The email package calls this method with the name and value provided by
      the application program when the application program is modifying a
      ``Message`` programmatically (as opposed to a ``Message`` created by a
      parser). The method should return the ``(name, value)`` tuple that is to
      be stored in the ``Message`` to represent the header.

      If an implementation wishes to retain compatibility with the existing
      email package policies, the *name* and *value* should be strings or
      string subclasses that do not change the content of the passed in
      arguments.

      There is no default implementation.

   .. method:: header_fetch_parse(name, value)

      The email package calls this method with the *name* and *value* currently
      stored in the ``Message`` when that header is requested by the
      application program, and whatever the method returns is what is passed
      back to the application as the value of the header being retrieved.
      Note that there may be more than one header with the same name stored in
      the ``Message``; the method is passed the specific name and value of the
      header destined to be returned to the application.

      *value* may contain surrogateescaped binary data. There should be no
      surrogateescaped binary data in the value returned by the method.

      There is no default implementation.

   .. method:: fold(name, value)

      The email package calls this method with the *name* and *value* currently
      stored in the ``Message`` for a given header. The method should return a
      string that represents that header "folded" correctly (according to the
      policy settings) by composing the *name* with the *value* and inserting
      :attr:`linesep` characters at the appropriate places. See :rfc:`5322`
      for a discussion of the rules for folding email headers.

      *value* may contain surrogateescaped binary data. There should be no
      surrogateescaped binary data in the string returned by the method.

   .. method:: fold_binary(name, value)

      The same as :meth:`fold`, except that the returned value should be a
      bytes object rather than a string.

      *value* may contain surrogateescaped binary data. These could be
      converted back into binary data in the returned bytes object.


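The interaction between ``raise_on_defect``, ``handle_defect`` and
``register_defect`` described above can be sketched as follows. ``Holder`` and
``LoggingPolicy`` are hypothetical names invented for this example, and the
sketch assumes that the defect base class is importable as
``email.errors.MessageDefect``, which is the name we know from the ``errors``
module.

```python
from email import errors
from email.policy import Compat32

class Holder:
    """Hypothetical stand-in for a message: it only needs a defects list."""
    def __init__(self):
        self.defects = []

logged = []

class LoggingPolicy(Compat32):
    """Hypothetical policy that records every defect before registering it."""
    def register_defect(self, obj, defect):
        logged.append(type(defect).__name__)
        super().register_defect(obj, defect)

defect = errors.FirstHeaderLineIsContinuationDefect()

# Lenient (default) behavior: the defect is registered on the object.
lenient_obj = Holder()
LoggingPolicy().handle_defect(lenient_obj, defect)

# Strict behavior: the defect is raised instead of being registered.
strict_obj = Holder()
try:
    LoggingPolicy(raise_on_defect=True).handle_defect(strict_obj, defect)
except errors.MessageDefect as exc:
    raised = exc
else:
    raised = None

print(lenient_obj.defects, logged, raised is defect)
```

Because policy instances are immutable, the log is kept in a module-level list
rather than on the policy object itself.
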
.. class:: Compat32(**kw)

   This concrete :class:`Policy` is the backward compatibility policy. It
   replicates the behavior of the email package in Python 3.2. The
   :mod:`~email.policy` module also defines an instance of this class,
   :const:`compat32`, that is used as the default policy. Thus the default
   behavior of the email package is to maintain compatibility with Python 3.2.

   The class provides the following concrete implementations of the
   abstract methods of :class:`Policy`:

   .. method:: header_source_parse(sourcelines)

      The name is parsed as everything up to the '``:``' and returned
      unmodified. The value is determined by stripping leading whitespace off
      the remainder of the first line, joining all subsequent lines together,
      and stripping any trailing carriage return or linefeed characters.

   .. method:: header_store_parse(name, value)

      The name and value are returned unmodified.

   .. method:: header_fetch_parse(name, value)

      If the value contains binary data, it is converted into a
      :class:`~email.header.Header` object using the ``unknown-8bit`` charset.
      Otherwise it is returned unmodified.

   .. method:: fold(name, value)

      Headers are folded using the :class:`~email.header.Header` folding
      algorithm, which preserves existing line breaks in the value, and wraps
      each resulting line to the ``max_line_length``. Non-ASCII binary data are
      CTE encoded using the ``unknown-8bit`` charset.

   .. method:: fold_binary(name, value)

      Headers are folded using the :class:`~email.header.Header` folding
      algorithm, which preserves existing line breaks in the value, and wraps
      each resulting line to the ``max_line_length``. If ``cte_type`` is
      ``7bit``, non-ASCII binary data is CTE encoded using the ``unknown-8bit``
      charset. Otherwise the original source header is used, with its existing
      line breaks and any (RFC invalid) binary data it may contain.
308 line breaks and and any (RFC invalid) binary data it may contain.