Merged revisions 76259,76326,76376-76377,76430,76471,76517 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk
................
r76259 | georg.brandl | 2009-11-14 05:50:51 -0600 (Sat, 14 Nov 2009) | 1 line
Fix terminology.
................
r76326 | georg.brandl | 2009-11-16 10:44:05 -0600 (Mon, 16 Nov 2009) | 1 line
#7302: fix link.
................
r76376 | georg.brandl | 2009-11-18 13:39:14 -0600 (Wed, 18 Nov 2009) | 1 line
upcase Python
................
r76377 | georg.brandl | 2009-11-18 14:05:15 -0600 (Wed, 18 Nov 2009) | 1 line
Fix markup.
................
r76430 | r.david.murray | 2009-11-20 07:29:43 -0600 (Fri, 20 Nov 2009) | 2 lines
Issue 7363: fix indentation in socketserver udpserver example.
................
r76471 | georg.brandl | 2009-11-23 13:53:19 -0600 (Mon, 23 Nov 2009) | 1 line
#7345: fix arguments of formatyear().
................
r76517 | benjamin.peterson | 2009-11-25 12:16:46 -0600 (Wed, 25 Nov 2009) | 29 lines
Merged revisions 76160-76161,76250,76252,76447,76506 via svnmerge from
svn+ssh://pythondev@svn.python.org/sandbox/trunk/2to3/lib2to3
........
r76160 | benjamin.peterson | 2009-11-08 18:53:48 -0600 (Sun, 08 Nov 2009) | 1 line
undeprecate the -p option; it's useful for converting python3 sources
........
r76161 | benjamin.peterson | 2009-11-08 19:05:37 -0600 (Sun, 08 Nov 2009) | 1 line
simplify condition
........
r76250 | benjamin.peterson | 2009-11-13 16:56:48 -0600 (Fri, 13 Nov 2009) | 1 line
fix handling of a utf-8 bom #7313
........
r76252 | benjamin.peterson | 2009-11-13 16:58:36 -0600 (Fri, 13 Nov 2009) | 1 line
remove pdb turd
........
r76447 | benjamin.peterson | 2009-11-22 18:17:40 -0600 (Sun, 22 Nov 2009) | 1 line
#7375 fix nested transformations in fix_urllib
........
r76506 | benjamin.peterson | 2009-11-24 18:34:31 -0600 (Tue, 24 Nov 2009) | 1 line
use generator expressions in any()
........
................
diff --git a/Lib/lib2to3/pgen2/tokenize.py b/Lib/lib2to3/pgen2/tokenize.py
index 4585ca3..7ae0280 100644
--- a/Lib/lib2to3/pgen2/tokenize.py
+++ b/Lib/lib2to3/pgen2/tokenize.py
@@ -283,9 +283,13 @@
             # This behaviour mimics the Python interpreter
             raise SyntaxError("unknown encoding: " + encoding)
-        if bom_found and codec.name != 'utf-8':
-            # This behaviour mimics the Python interpreter
-            raise SyntaxError('encoding problem: utf-8')
+        if bom_found:
+            if codec.name != 'utf-8':
+                # This behaviour mimics the Python interpreter
+                raise SyntaxError('encoding problem: utf-8')
+            else:
+                # Allow it to be properly encoded and decoded.
+                encoding = 'utf-8-sig'
         return encoding
     first = read_or_stop()
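
The effect of the tokenize.py hunk above (r76250, #7313) can be illustrated with a short sketch. It is not part of the commit. It assumes Python 3 and uses the standard library's tokenize.detect_encoding as a stand-in for lib2to3.pgen2.tokenize, on the assumption that the two follow the same BOM rule; the sample source bytes are made up for illustration.

```python
import codecs
import io
import tokenize

# Hypothetical source file: a UTF-8 BOM followed by a utf-8 coding cookie.
source = codecs.BOM_UTF8 + b"# -*- coding: utf-8 -*-\nx = 1\n"

# Plain 'utf-8' keeps the BOM as a U+FEFF character at the start of the text.
assert source.decode("utf-8").startswith("\ufeff")

# 'utf-8-sig' strips the BOM on decode and writes it back on encode,
# so the file round-trips unchanged.
text = source.decode("utf-8-sig")
assert not text.startswith("\ufeff")
assert text.encode("utf-8-sig") == source

# Stand-in for the patched function: BOM plus a utf-8 cookie is reported
# as 'utf-8-sig'.
encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)
print(encoding)

# A BOM combined with a non-utf-8 cookie is still rejected.
bad = codecs.BOM_UTF8 + b"# -*- coding: latin-1 -*-\nx = 1\n"
rejected = False
try:
    tokenize.detect_encoding(io.BytesIO(bad).readline)
except SyntaxError:
    rejected = True
print(rejected)
```

Returning 'utf-8-sig' rather than 'utf-8' matters for a tool that rewrites files: decoding drops the BOM from the text being transformed, and encoding with the same name restores it on output.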