enhance SourceManager to detect various UTF BOMs and emit a fatal error
about them instead of producing tons of garbage from the lexer.
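
As a rough illustration of the kind of check described above, here is a
minimal sketch of BOM sniffing on the first bytes of a buffer. The function
name detectUnsupportedBOM and the table layout are hypothetical and are not
taken from the actual patch; only the "UTF-16 (LE)" label is known to match
the diagnostic text grepped for in the test. The UTF-8 BOM (EF BB BF) is
deliberately left out of this sketch.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper, NOT the code from this commit.
// Returns a label for a UTF-16/UTF-32 byte order mark found at the start of
// the buffer, or nullptr when none of them is present.
const char *detectUnsupportedBOM(const unsigned char *buf, std::size_t size) {
  struct Entry {
    const char *name;
    unsigned char bytes[4];
    std::size_t len;
  };
  // UTF-32 is checked before UTF-16 because the UTF-32 (LE) BOM
  // (FF FE 00 00) starts with the UTF-16 (LE) BOM (FF FE).
  static const Entry table[] = {
      {"UTF-32 (BE)", {0x00, 0x00, 0xFE, 0xFF}, 4},
      {"UTF-32 (LE)", {0xFF, 0xFE, 0x00, 0x00}, 4},
      {"UTF-16 (BE)", {0xFE, 0xFF}, 2},
      {"UTF-16 (LE)", {0xFF, 0xFE}, 2},
  };
  for (const Entry &e : table)
    if (size >= e.len && std::memcmp(buf, e.bytes, e.len) == 0)
      return e.name;
  return nullptr;
}
```

A caller could append " byte order mark detected" to the returned label and
report it as a fatal error before handing the buffer to the lexer.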

It would be even better for SourceManager to dynamically transcode (e.g.
from UTF-16 to UTF-8).
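
For reference, the transcoding idea could look roughly like the sketch below.
This is not part of the commit: utf16leToUtf8 is a hypothetical name, the
input is assumed to be little-endian with the BOM already stripped, and
malformed input (odd length, unpaired surrogates) is simply rejected.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper, NOT part of this commit.
// Appends the UTF-8 encoding of a UTF-16 (LE) byte sequence to `out`.
// Returns false on an odd byte count or an unpaired surrogate.
bool utf16leToUtf8(const unsigned char *data, std::size_t size,
                   std::string &out) {
  if (size % 2 != 0)
    return false;
  for (std::size_t i = 0; i < size; i += 2) {
    // Assemble one little-endian 16-bit code unit.
    std::uint32_t cp = data[i] | (std::uint32_t(data[i + 1]) << 8);
    if (cp >= 0xD800 && cp <= 0xDBFF) {
      // High surrogate: a low surrogate must follow.
      if (i + 4 > size)
        return false;
      std::uint32_t lo = data[i + 2] | (std::uint32_t(data[i + 3]) << 8);
      if (lo < 0xDC00 || lo > 0xDFFF)
        return false;
      cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
      i += 2; // the pair consumed two code units
    } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
      return false; // low surrogate without a preceding high surrogate
    }
    // Emit the code point as 1 to 4 UTF-8 bytes.
    if (cp < 0x80) {
      out += static_cast<char>(cp);
    } else if (cp < 0x800) {
      out += static_cast<char>(0xC0 | (cp >> 6));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
      out += static_cast<char>(0xE0 | (cp >> 12));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
      out += static_cast<char>(0xF0 | (cp >> 18));
      out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
      out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
      out += static_cast<char>(0x80 | (cp & 0x3F));
    }
  }
  return true;
}
```

A real implementation would also need to handle big-endian input and decide
how source locations map back to the original file, which this sketch ignores.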


git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@101924 91177308-0d34-0410-b5e6-96231b3b80d8
diff --git a/test/Lexer/utf-16.c b/test/Lexer/utf-16.c
new file mode 100644
index 0000000..7c14e39
--- /dev/null
+++ b/test/Lexer/utf-16.c
@@ -0,0 +1,4 @@
+// RUN: not %clang -xc %s.txt -fsyntax-only 2>&1 | grep 'UTF-16 (LE) byte order mark detected'
+// rdar://7876588
+
+// This test verifies that clang gives a decent error for UTF-16 source files.
\ No newline at end of file