Fix issue #15899: Make the unicode.rst doctests pass. Patch by Chris Jerdonek.

commit: 2fd8bdbc9d904a7cb9f7cd676322eee03d92ee0c [log] [tgz]
author: Senthil Kumaran <senthil@uthcode.com> Tue Sep 11 03:17:52 2012 -0700
committer: Senthil Kumaran <senthil@uthcode.com> Tue Sep 11 03:17:52 2012 -0700
tree: c6c03e3f3d0a42d4ff6c93d76bebe1b5dc2d2289
parent: c8754a13e607ebc70f12a10297c76dc574a91d5b [diff] [blame]
diff --git a/Doc/howto/unicode.rst b/Doc/howto/unicode.rst
index 045fd33..bcd29be 100644
--- a/Doc/howto/unicode.rst
+++ b/Doc/howto/unicode.rst

@@ -257,13 +257,13 @@
 'REPLACEMENT CHARACTER'), or 'ignore' (just leave the character out of the
 Unicode result).  The following examples show the differences::
 
-    >>> b'\x80abc'.decode("utf-8", "strict")
+    >>> b'\x80abc'.decode("utf-8", "strict")  #doctest: +NORMALIZE_WHITESPACE
     Traceback (most recent call last):
-      File "<stdin>", line 1, in ?
-    UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0:
-                        unexpected code byte
+        ...
+    UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0:
+      invalid start byte
     >>> b'\x80abc'.decode("utf-8", "replace")
-    '?abc'
+    '�abc'
     >>> b'\x80abc'.decode("utf-8", "ignore")
     'abc'
 
@@ -301,11 +301,11 @@
     >>> u = chr(40960) + 'abcd' + chr(1972)
     >>> u.encode('utf-8')
     b'\xea\x80\x80abcd\xde\xb4'
-    >>> u.encode('ascii')
+    >>> u.encode('ascii')  #doctest: +NORMALIZE_WHITESPACE
     Traceback (most recent call last):
-      File "<stdin>", line 1, in ?
+        ...
     UnicodeEncodeError: 'ascii' codec can't encode character '\ua000' in
-                        position 0: ordinal not in range(128)
+      position 0: ordinal not in range(128)
     >>> u.encode('ascii', 'ignore')
     b'abcd'
     >>> u.encode('ascii', 'replace')
@@ -331,12 +331,11 @@
 not four::
 
     >>> s = "a\xac\u1234\u20ac\U00008000"
-              ^^^^ two-digit hex escape
-                   ^^^^^ four-digit Unicode escape
-                              ^^^^^^^^^^ eight-digit Unicode escape
-    >>> for c in s:  print(ord(c), end=" ")
-    ...
-    97 172 4660 8364 32768
+    ... #     ^^^^ two-digit hex escape
+    ... #         ^^^^^^ four-digit Unicode escape
+    ... #                     ^^^^^^^^^^ eight-digit Unicode escape
+    >>> [ord(c) for c in s]
+    [97, 172, 4660, 8364, 32768]
 
 Using escape sequences for code points greater than 127 is fine in small doses,
 but becomes an annoyance if you're using many accented characters, as you would
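
Note on the doctest changes above: the patched examples replace the "File ..." traceback line with an indented "..." and wrap the long exception message onto a second line, which is why they carry the +NORMALIZE_WHITESPACE directive. The sketch below is not part of the commit; it is a minimal illustration, assuming a recent Python 3, that feeds a copy of the patched "strict" example to the doctest module with and without the directive. The names SAMPLE and run_sample are hypothetical helpers introduced only for this illustration.

```python
import doctest

# Hypothetical sample mirroring the patched "strict" example from the diff.
# A raw string keeps the backslash escapes intact for doctest to compile.
SAMPLE = r'''
>>> b'\x80abc'.decode("utf-8", "strict")  #doctest: +NORMALIZE_WHITESPACE
Traceback (most recent call last):
    ...
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0:
  invalid start byte
>>> b'\x80abc'.decode("utf-8", "ignore")
'abc'
'''


def run_sample(text):
    """Parse and run the examples in text, returning doctest's TestResults."""
    test = doctest.DocTestParser().get_doctest(text, {}, "sample", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts are of interest here.
    return runner.run(test, out=lambda report: None)


# Same examples, once with the directive and once with it stripped out.
with_flag = run_sample(SAMPLE)
without_flag = run_sample(SAMPLE.replace("  #doctest: +NORMALIZE_WHITESPACE", ""))

print("failures with directive:   ", with_flag.failed)
print("failures without directive:", without_flag.failed)
```

With the directive, the wrapped message should compare equal to the single-line message Python actually raises; without it, the line break in the expected output is expected to make the first example fail. The traceback body needs no directive because doctest ignores the stack lines between the header and the exception line.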