Blame - Doc/lib/emailheaders.tex - platform/external/python/cpython3

blob: 8d5964b52af21abce61e37914cf794fb21983fe5 [file] [log] [blame]

Barry Warsaw	5b9da89	2002-10-01 01:05:52 +0000	[diff] [blame]	1	\declaremodule{standard}{email.Header}
				2	\modulesynopsis{Representing non-ASCII headers}
				3
				4	\rfc{2822} is the base standard that describes the format of email
				5	messages. It derives from the older \rfc{822} standard which came
Barry Warsaw	5db478f	2002-10-01 04:33:16 +0000	[diff] [blame]	6	into widespread use at a time when most email was composed of \ASCII{}
Barry Warsaw	5b9da89	2002-10-01 01:05:52 +0000	[diff] [blame]	7	characters only. \rfc{2822} is a specification written assuming email
				8	contains only 7-bit \ASCII{} characters.
				9
				10	Of course, as email has been deployed worldwide, it has become
				11	internationalized, such that language-specific character sets can now
				12	be used in email messages. The base standard still requires email
				13	messages to be transferred using only 7-bit \ASCII{} characters, so a
				14	slew of RFCs have been written describing how to encode email
				15	containing non-\ASCII{} characters into \rfc{2822}-compliant format.
				16	These RFCs include \rfc{2045}, \rfc{2046}, \rfc{2047}, and \rfc{2231}.
				17	The \module{email} package supports these standards in its
				18	\module{email.Header} and \module{email.Charset} modules.
				19
				20	If you want to include non-\ASCII{} characters in your email headers,
				21	say in the \mailheader{Subject} or \mailheader{To} fields, you should
Barry Warsaw	5db478f	2002-10-01 04:33:16 +0000	[diff] [blame]	22	use the \class{Header} class and assign the field in the
				23	\class{Message} object to an instance of \class{Header} instead of
				24	using a string for the header value. For example:
Barry Warsaw	5b9da89	2002-10-01 01:05:52 +0000	[diff] [blame]	25
				26	\begin{verbatim}
				27	>>> from email.Message import Message
				28	>>> from email.Header import Header
				29	>>> msg = Message()
				30	>>> h = Header('p\xf6stal', 'iso-8859-1')
				31	>>> msg['Subject'] = h
				32	>>> print msg.as_string()
				33	Subject: =?iso-8859-1?q?p=F6stal?=
				34
				35
				36	\end{verbatim}
				37
				38	Notice that we wanted the \mailheader{Subject} field to contain a
				39	non-\ASCII{} character. We did this by creating a \class{Header}
				40	instance and passing in the character set that the byte string was
				41	encoded in. When the subsequent \class{Message} instance was
				42	flattened, the \mailheader{Subject} field was properly \rfc{2047}
				43	encoded. MIME-aware mail readers would show this header using the
				44	embedded ISO-8859-1 character.
				45
				46	\versionadded{2.2.2}
				47
				48	Here is the \class{Header} class description:
				49
				50	\begin{classdesc}{Header}{\optional{s\optional{, charset\optional{,
Barry Warsaw	d1adc8a	2002-12-30 19:17:37 +0000	[diff] [blame]	51	maxlinelen\optional{, header_name\optional{, continuation_ws\optional{,
				52	errors}}}}}}}
Barry Warsaw	5db478f	2002-10-01 04:33:16 +0000	[diff] [blame]	53	Create a MIME-compliant header that can contain strings in different
				54	character sets.
Barry Warsaw	5b9da89	2002-10-01 01:05:52 +0000	[diff] [blame]	55
				56	Optional \var{s} is the initial header value. If \code{None} (the
				57	default), the initial header value is not set. You can later append
				58	to the header with \method{append()} method calls. \var{s} may be a
				59	byte string or a Unicode string, but see the \method{append()}
				60	documentation for semantics.
				61
				62	Optional \var{charset} serves two purposes: it has the same meaning as
				63	the \var{charset} argument to the \method{append()} method. It also
				64	sets the default character set for all subsequent \method{append()}
				65	calls that omit the \var{charset} argument. If \var{charset} is not
				66	provided in the constructor (the default), the \code{us-ascii}
				67	character set is used both as \var{s}'s initial charset and as the
				68	default for subsequent \method{append()} calls.
				69
				70	The maximum line length can be specified explicitly via
				71	\var{maxlinelen}. To split the first line at a shorter length (to
				72	account for the field name, which isn't included in \var{s},
				73	e.g. \mailheader{Subject}), pass in the name of the field in
				74	\var{header_name}. The default \var{maxlinelen} is 76, and the
				75	default value for \var{header_name} is \code{None}, meaning it is not
				76	taken into account for the first line of a long, split header.
				77
Barry Warsaw	5db478f	2002-10-01 04:33:16 +0000	[diff] [blame]	78	Optional \var{continuation_ws} must be \rfc{2822}-compliant folding
Barry Warsaw	5b9da89	2002-10-01 01:05:52 +0000	[diff] [blame]	79	whitespace, and is usually either a space or a hard tab character.
				80	This character will be prepended to continuation lines.
				81
Barry Warsaw	d1adc8a	2002-12-30 19:17:37 +0000	[diff] [blame]	82	Optional \var{errors} is passed straight through to the
				83	\method{append()} method.
				84	\end{classdesc}
				85
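As a rough illustration of \var{maxlinelen} and \var{header_name}, the
following sketch folds a long \ASCII{} value. The sample text and the
limit of 40 are arbitrary choices made for this example, and the
fallback import assumes a later release in which the module is spelled
\module{email.header}. Exact fold points may differ between versions.

\begin{verbatim}
try:
    from email.Header import Header      # spelling used in this document
except ImportError:
    from email.header import Header      # assumption: later lowercase spelling

# Twenty short words: far too long for a single 40-column line.
text = ' '.join(['word'] * 20)

# header_name reserves room on the first line for the field name.
h = Header(text, maxlinelen=40, header_name='Subject')
folded = h.encode()

# Each physical line of the folded header value.
lines = folded.split('\n')
\end{verbatim}

Each entry in \code{lines} should then be no longer than the requested
\var{maxlinelen}.
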
				86	\begin{methoddesc}[Header]{append}{s\optional{, charset\optional{, errors}}}
Barry Warsaw	5b9da89	2002-10-01 01:05:52 +0000	[diff] [blame]	87	Append the string \var{s} to the MIME header.
				88
				89	Optional \var{charset}, if given, should be a \class{Charset} instance
				90	(see \refmodule{email.Charset}) or the name of a character set, which
				91	will be converted to a \class{Charset} instance. A value of
				92	\code{None} (the default) means that the \var{charset} given in the
				93	constructor is used.
				94
				95	\var{s} may be a byte string or a Unicode string. If it is a byte
Barry Warsaw	5db478f	2002-10-01 04:33:16 +0000	[diff] [blame]	96	string (i.e. \code{isinstance(s, str)} is true), then
Barry Warsaw	5b9da89	2002-10-01 01:05:52 +0000	[diff] [blame]	97	\var{charset} is the encoding of that byte string, and a
				98	\exception{UnicodeError} will be raised if the string cannot be
				99	decoded with that character set.
				100
				101	If \var{s} is a Unicode string, then \var{charset} is a hint
				102	specifying the character set of the characters in the string. In this
				103	case, when producing an \rfc{2822}-compliant header using \rfc{2047}
				104	rules, the Unicode string will be encoded using the following charsets
				105	in order: \code{us-ascii}, the \var{charset} hint, \code{utf-8}. The
				106	first character set to not provoke a \exception{UnicodeError} is used.
Barry Warsaw	d1adc8a	2002-12-30 19:17:37 +0000	[diff] [blame]	107
				108	Optional \var{errors} is passed through to any \function{unicode()} or
				109	\function{ustr.encode()} call, and defaults to ``strict''.
Barry Warsaw	5b9da89	2002-10-01 01:05:52 +0000	[diff] [blame]	110	\end{methoddesc}
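
A minimal sketch of building one header from chunks in different
character sets with \method{append()}. The sample strings are
illustrative only, and the fallback import assumes a later release in
which the module is spelled \module{email.header}.

\begin{verbatim}
try:
    from email.Header import Header      # spelling used in this document
except ImportError:
    from email.header import Header      # assumption: later lowercase spelling

# Start with a plain ASCII chunk; omitting charset means us-ascii.
h = Header('Hello')

# Append a chunk that needs ISO-8859-1 (o-umlaut is \xf6 in that charset).
h.append('p\xf6stal', 'iso-8859-1')

# The ASCII chunk stays readable; the second chunk is RFC 2047 encoded.
encoded = h.encode()
\end{verbatim}
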
				111
				112	\begin{methoddesc}[Header]{encode}{}
				113	Encode a message header into an RFC-compliant format, possibly
				114	wrapping long lines and encapsulating non-\ASCII{} parts in base64 or
				115	quoted-printable encodings.
				116	\end{methoddesc}
				117
				118	The \class{Header} class also provides a number of methods to support
				119	standard operators and built-in functions.
				120
				121	\begin{methoddesc}[Header]{__str__}{}
				122	A synonym for \method{Header.encode()}. Useful for
Barry Warsaw	5db478f	2002-10-01 04:33:16 +0000	[diff] [blame]	123	\code{str(aHeader)}.
Barry Warsaw	5b9da89	2002-10-01 01:05:52 +0000	[diff] [blame]	124	\end{methoddesc}
				125
				126	\begin{methoddesc}[Header]{__unicode__}{}
				127	A helper for the built-in \function{unicode()} function. Returns the
				128	header as a Unicode string.
				129	\end{methoddesc}
				130
				131	\begin{methoddesc}[Header]{__eq__}{other}
				132	This method allows you to compare two \class{Header} instances for equality.
				133	\end{methoddesc}
				134
				135	\begin{methoddesc}[Header]{__ne__}{other}
				136	This method allows you to compare two \class{Header} instances for inequality.
				137	\end{methoddesc}
				138
				139	The \module{email.Header} module also provides the following
				140	convenience functions.
				141
				142	\begin{funcdesc}{decode_header}{header}
				143	Decode a message header value without converting the character set.
				144	The header value is in \var{header}.
				145
				146	This function returns a list of \code{(decoded_string, charset)} pairs
				147	containing each of the decoded parts of the header. \var{charset} is
				148	\code{None} for non-encoded parts of the header, otherwise a lower
				149	case string containing the name of the character set specified in the
				150	encoded string.
				151
				152	Here's an example:
				153
				154	\begin{verbatim}
				155	>>> from email.Header import decode_header
				156	>>> decode_header('=?iso-8859-1?q?p=F6stal?=')
				157	[('p\xf6stal', 'iso-8859-1')]
				158	\end{verbatim}
				159	\end{funcdesc}
				160
				161	\begin{funcdesc}{make_header}{decoded_seq\optional{, maxlinelen\optional{,
				162	header_name\optional{, continuation_ws}}}}
				163	Create a \class{Header} instance from a sequence of pairs as returned
				164	by \function{decode_header()}.
				165
				166	\function{decode_header()} takes a header value string and returns a
				167	sequence of pairs of the format \code{(decoded_string, charset)} where
				168	\var{charset} is the name of the character set.
				169
				170	This function takes one of those sequences of pairs and returns a
				171	\class{Header} instance. Optional \var{maxlinelen},
				172	\var{header_name}, and \var{continuation_ws} are as in the
				173	\class{Header} constructor.
				174	\end{funcdesc}
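
The two functions can be combined into a round trip. The sketch below
reuses the encoded value from the earlier example. The variable names
are arbitrary, and the fallback import assumes a later release in which
the module is spelled \module{email.header}.

\begin{verbatim}
try:
    from email.Header import decode_header, make_header
except ImportError:
    # assumption: later lowercase spelling of the module name
    from email.header import decode_header, make_header

raw = '=?iso-8859-1?q?p=F6stal?='

# Split the encoded value into (decoded_string, charset) pairs ...
pairs = decode_header(raw)

# ... and build a Header instance back from those pairs.
h = make_header(pairs)
roundtrip = h.encode()
\end{verbatim}

Re-encoding the rebuilt header should reproduce the original encoded
value.
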