Merged revisions 74861-74863,74876,74896,74930,74933,74952-74953,75015,75019,75260-75263,75265-75266,75289 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
........
r74861 | benjamin.peterson | 2009-09-17 05:18:28 +0200 (Thu, 17 Sep 2009) | 1 line
pep 8 defaults
........
r74862 | brett.cannon | 2009-09-17 05:24:45 +0200 (Thu, 17 Sep 2009) | 1 line
Note in the intro to Extending... that ctypes can be a simpler, more portable solution than custom C code.
........
r74863 | benjamin.peterson | 2009-09-17 05:27:33 +0200 (Thu, 17 Sep 2009) | 1 line
rationalize a bit
........
r74876 | georg.brandl | 2009-09-17 18:15:53 +0200 (Thu, 17 Sep 2009) | 1 line
#6932: remove paragraph that advises relying on __del__ being called.
........
r74896 | georg.brandl | 2009-09-18 09:22:41 +0200 (Fri, 18 Sep 2009) | 1 line
#6936: for interactive use, quit() is just fine.
........
r74930 | georg.brandl | 2009-09-18 23:21:41 +0200 (Fri, 18 Sep 2009) | 1 line
#6925: rewrite docs for locals() and vars() a bit.
........
r74933 | georg.brandl | 2009-09-18 23:35:59 +0200 (Fri, 18 Sep 2009) | 1 line
#6930: clarify description about byteorder handling in UTF decoder routines.
........
r74952 | georg.brandl | 2009-09-19 12:42:34 +0200 (Sat, 19 Sep 2009) | 1 line
#6946: fix duplicate index entries for datetime classes.
........
r74953 | georg.brandl | 2009-09-19 14:04:16 +0200 (Sat, 19 Sep 2009) | 1 line
Fix references to threading.enumerate().
........
r75015 | georg.brandl | 2009-09-22 12:55:08 +0200 (Tue, 22 Sep 2009) | 1 line
Fix encoding name.
........
r75019 | vinay.sajip | 2009-09-22 19:23:41 +0200 (Tue, 22 Sep 2009) | 1 line
Fixed a typo, and added sections on optimization and using arbitrary objects as messages.
........
r75260 | andrew.kuchling | 2009-10-05 23:24:20 +0200 (Mon, 05 Oct 2009) | 1 line
Wording fix
........
r75261 | andrew.kuchling | 2009-10-05 23:24:35 +0200 (Mon, 05 Oct 2009) | 1 line
Fix markup
........
r75262 | andrew.kuchling | 2009-10-05 23:25:03 +0200 (Mon, 05 Oct 2009) | 1 line
Document 'skip' parameter to constructor
........
r75263 | andrew.kuchling | 2009-10-05 23:25:35 +0200 (Mon, 05 Oct 2009) | 1 line
Note side benefit of socket.create_connection()
........
r75265 | andrew.kuchling | 2009-10-06 00:31:11 +0200 (Tue, 06 Oct 2009) | 1 line
Reword sentence
........
r75266 | andrew.kuchling | 2009-10-06 00:32:48 +0200 (Tue, 06 Oct 2009) | 1 line
Use standard comma punctuation; reword some sentences in the docs
........
r75289 | mark.dickinson | 2009-10-08 22:02:25 +0200 (Thu, 08 Oct 2009) | 2 lines
Issue #7051: Clarify behaviour of 'g' and 'G'-style formatting.
........
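Illustration for r74933 (not part of the merged changes): a minimal Python
sketch of the byte order mark handling that the reworded decoder docs
describe. It assumes Python 3 and assumes that the "utf-16", "utf-16-le" and
"utf-16-be" codecs correspond to the byteorder values 0, -1 and 1 of the C
decoder; the variable names are made up for this example.

```python
import codecs

# "h" in UTF-16 little endian, preceded by the little-endian BOM (FF FE).
data = codecs.BOM_UTF16_LE + "h".encode("utf-16-le")

# No fixed byte order (assumed to match *byteorder == 0): the BOM selects
# the byte order and is not copied into the result.
auto = data.decode("utf-16")

# Fixed little endian (assumed to match *byteorder == -1): the BOM is
# decoded like any other code unit and appears as U+FEFF.
fixed_le = data.decode("utf-16-le")

# Fixed big endian (assumed to match *byteorder == 1): the same two BOM
# bytes read in the opposite order decode to U+FFFE.
fixed_be_bom = data[:2].decode("utf-16-be")

print(repr(auto), repr(fixed_le), repr(fixed_be_bom))
```

With a fixed byte order the caller is responsible for stripping a leading
U+FEFF, which is why the docs now spell out that the mark is copied to the
output in that case.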
diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index 1249ed7..4ab1c21 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -414,10 +414,13 @@
*byteorder == 0: native order
*byteorder == 1: big endian
- and then switches if the first four bytes of the input data are a byte order mark
- (BOM) and the specified byte order is native order. This BOM is not copied into
- the resulting Unicode string. After completion, *\*byteorder* is set to the
- current byte order at the end of input data.
+ If ``*byteorder`` is zero, and the first four bytes of the input data are a
+ byte order mark (BOM), the decoder switches to this byte order and the BOM is
+ not copied into the resulting Unicode string. If ``*byteorder`` is ``-1`` or
+ ``1``, any byte order mark is copied to the output.
+
+ After completion, *\*byteorder* is set to the current byte order at the end
+ of input data.
In a narrow build codepoints outside the BMP will be decoded as surrogate pairs.
@@ -442,8 +445,7 @@
.. cfunction:: PyObject* PyUnicode_EncodeUTF32(const Py_UNICODE *s, Py_ssize_t size, const char *errors, int byteorder)
Return a Python bytes object holding the UTF-32 encoded value of the Unicode
- data in *s*. If *byteorder* is not ``0``, output is written according to the
- following byte order::
+ data in *s*. Output is written according to the following byte order::
byteorder == -1: little endian
byteorder == 0: native byte order (writes a BOM mark)
@@ -487,10 +489,14 @@
*byteorder == 0: native order
*byteorder == 1: big endian
- and then switches if the first two bytes of the input data are a byte order mark
- (BOM) and the specified byte order is native order. This BOM is not copied into
- the resulting Unicode string. After completion, *\*byteorder* is set to the
- current byte order at the.
+ If ``*byteorder`` is zero, and the first two bytes of the input data are a
+ byte order mark (BOM), the decoder switches to this byte order and the BOM is
+ not copied into the resulting Unicode string. If ``*byteorder`` is ``-1`` or
+ ``1``, any byte order mark is copied to the output (where it will result in
+ either a ``\ufeff`` or a ``\ufffe`` character).
+
+ After completion, *\*byteorder* is set to the current byte order at the end
+ of input data.
If *byteorder* is *NULL*, the codec starts in native order mode.
@@ -520,8 +526,7 @@
.. cfunction:: PyObject* PyUnicode_EncodeUTF16(const Py_UNICODE *s, Py_ssize_t size, const char *errors, int byteorder)
Return a Python string object holding the UTF-16 encoded value of the Unicode
- data in *s*. If *byteorder* is not ``0``, output is written according to the
- following byte order::
+ data in *s*. Output is written according to the following byte order::
byteorder == -1: little endian
byteorder == 0: native byte order (writes a BOM mark)