Merge p3yk branch with the trunk up to revision 45595. This breaks a fair
number of tests, all because of the codecs/_multibytecodecs issue described
here (it's not a Py3K issue, just something Py3K discovers):

http://mail.python.org/pipermail/python-dev/2006-April/064051.html

Hye-Shik Chang promised to look for a fix, so no need to fix it here. The
tests that are expected to break are:

test_codecencodings_cn
test_codecencodings_hk
test_codecencodings_jp
test_codecencodings_kr
test_codecencodings_tw
test_codecs
test_multibytecodec

This merge fixes an actual test failure (test_weakref) in this branch,
though, so I believe merging is the right thing to do anyway.

commit: 49fd7fa4431da299196d74087df4a04f99f9c46f
author: Thomas Wouters <thomas@python.org> Fri Apr 21 10:40:58 2006 +0000
committer: Thomas Wouters <thomas@python.org> Fri Apr 21 10:40:58 2006 +0000
tree: 35ace5fe78d3d52c7a9ab356ab9f6dbf8d4b71f4
parent: 9ada3d6e29d5165dadacbe6be07bcd35cfbef59d
diff --git a/Doc/lib/libcodecs.tex b/Doc/lib/libcodecs.tex
index 1806ef0..8a2417e 100644
--- a/Doc/lib/libcodecs.tex
+++ b/Doc/lib/libcodecs.tex

@@ -112,6 +112,7 @@
 
 Raises a \exception{LookupError} in case the encoding cannot be found or the
 codec doesn't support an incremental encoder.
+\versionadded{2.5}
 \end{funcdesc}
 
 \begin{funcdesc}{getincrementaldecoder}{encoding}
@@ -120,6 +121,7 @@
 
 Raises a \exception{LookupError} in case the encoding cannot be found or the
 codec doesn't support an incremental decoder.
+\versionadded{2.5}
 \end{funcdesc}
 
 \begin{funcdesc}{getreader}{encoding}
@@ -150,7 +152,7 @@
 continue. The encoder will encode the replacement and continue encoding
 the original input at the specified position. Negative position values
 will be treated as being relative to the end of the input string. If the
-resulting position is out of bound an IndexError will be raised.
+resulting position is out of bound an \exception{IndexError} will be raised.
 
 Decoding and translating works similar, except \exception{UnicodeDecodeError}
 or \exception{UnicodeTranslateError} will be passed to the handler and
@@ -229,12 +231,14 @@
 Uses an incremental encoder to iteratively encode the input provided by
 \var{iterable}. This function is a generator. \var{errors} (as well as
 any other keyword argument) is passed through to the incremental encoder.
+\versionadded{2.5}
 \end{funcdesc}
 
 \begin{funcdesc}{iterdecode}{iterable, encoding\optional{, errors}}
 Uses an incremental decoder to iteratively decode the input provided by
 \var{iterable}. This function is a generator. \var{errors} (as well as
 any other keyword argument) is passed through to the incremental encoder.
+\versionadded{2.5}
 \end{funcdesc}
 
 The module also provides the following constants which are useful
@@ -355,6 +359,8 @@
 
 \subsubsection{IncrementalEncoder Objects \label{incremental-encoder-objects}}
 
+\versionadded{2.5}
+
 The \class{IncrementalEncoder} class is used for encoding an input in multiple
 steps. It defines the following methods which every incremental encoder must
 define in order to be compatible to the Python codec registry.
@@ -437,6 +443,10 @@
   Decodes \var{object} (taking the current state of the decoder into account)
   and returns the resulting decoded object. If this is the last call to
   \method{decode} \var{final} must be true (the default is false).
+  If \var{final} is true the decoder must decode the input completely and must
+  flush all buffers. If this isn't possible (e.g. because of incomplete byte
+  sequences at the end of the input) it must initiate error handling just like
+  in the stateless case (which might raise an exception).
 \end{methoddesc}
 
 \begin{methoddesc}{reset}{}
@@ -690,10 +700,10 @@
 The simplest method is to map the codepoints 0-255 to the bytes
 \code{0x0}-\code{0xff}. This means that a unicode object that contains 
 codepoints above \code{U+00FF} can't be encoded with this method (which 
-is called \code{'latin-1'} or \code{'iso-8859-1'}). unicode.encode() will 
-raise a UnicodeEncodeError that looks like this: \samp{UnicodeEncodeError:
-'latin-1' codec can't encode character u'\e u1234' in position 3: ordinal
-not in range(256)}.
+is called \code{'latin-1'} or \code{'iso-8859-1'}).
+\function{unicode.encode()} will raise a \exception{UnicodeEncodeError}
+that looks like this: \samp{UnicodeEncodeError: 'latin-1' codec can't
+encode character u'\e u1234' in position 3: ordinal not in range(256)}.
 
 There's another group of encodings (the so called charmap encodings)
 that choose a different subset of all unicode code points and how
@@ -1220,7 +1230,7 @@
 
 \lineiv{rot_13}
          {rot13}
-         {byte string}
+         {Unicode string}
          {Returns the Caesar-cypher encryption of the operand}
 
 \lineiv{string_escape}
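
Note on the documented behaviour: the hunks above describe the incremental codec API (getincrementaldecoder, iterencode, iterdecode) and the new wording for the final argument of decode. The following is a minimal sketch of that behaviour, not part of the commit. It assumes a current Python 3 interpreter, where bytes and str are separate types (the documentation in the diff targets Python 2.5), and the variable names are illustrative only.

```python
import codecs

# Assumption: Python 3 semantics. The euro sign is three bytes in UTF-8.
data = "\u20ac".encode("utf-8")

# Feed the bytes one at a time; the decoder buffers incomplete sequences
# and returns an empty string until a full character is available.
decoder = codecs.getincrementaldecoder("utf-8")()
parts = [decoder.decode(data[i:i + 1]) for i in range(len(data))]

# With final=True the decoder must flush its buffers; nothing is pending here.
tail = decoder.decode(b"", final=True)

# If the input ends inside a byte sequence, final=True triggers error
# handling, which raises with the default 'strict' error mode.
truncated = codecs.getincrementaldecoder("utf-8")()
truncated.decode(data[:2])
try:
    truncated.decode(b"", final=True)
    raised = False
except UnicodeDecodeError:
    raised = True

# iterencode/iterdecode wrap the same machinery as generators.
encoded = list(codecs.iterencode(["a", "\u20ac"], "utf-8"))
decoded = list(codecs.iterdecode([data[:1], data[1:]], "utf-8"))
```

In this sketch the split input decodes to the same text as the unsplit input, and only the truncated input reaches the error handler, which matches the "just like in the stateless case" wording added to the decode description.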