[PCH] Fixed preamble breaking with BOM presence (and particularly, fluctuating BOM presence)
This patch fixes broken preamble-skipping when the preamble region includes a byte order mark (BOM). Previously, parsing would fail if preamble PCH generation was enabled and a BOM was present.
This also fixes preamble invalidation when a BOM appears or disappears. This may seem to be an obscure edge case, but it happens regularly with IDEs that pass buffer overrides that never (or always) have a BOM, yet the underlying file from the initial parse that generated a PCH might (or might not) have a BOM.
I've included a test case for these scenarios.
Differential Revision: https://reviews.llvm.org/D37491
llvm-svn: 313796
diff --git a/clang/lib/Lex/Lexer.cpp b/clang/lib/Lex/Lexer.cpp
index 928c24d..b7f97a5 100644
--- a/clang/lib/Lex/Lexer.cpp
+++ b/clang/lib/Lex/Lexer.cpp
@@ -552,9 +552,9 @@
} // end anonymous namespace
-std::pair<unsigned, bool> Lexer::ComputePreamble(StringRef Buffer,
- const LangOptions &LangOpts,
- unsigned MaxLines) {
+PreambleBounds Lexer::ComputePreamble(StringRef Buffer,
+ const LangOptions &LangOpts,
+ unsigned MaxLines) {
// Create a lexer starting at the beginning of the file. Note that we use a
// "fake" file source location at offset 1 so that the lexer will track our
// position within the file.
@@ -688,7 +688,7 @@
else
End = TheTok.getLocation();
- return std::make_pair(End.getRawEncoding() - StartLoc.getRawEncoding(),
+ return PreambleBounds(End.getRawEncoding() - FileLoc.getRawEncoding(),
TheTok.isAtStartOfLine());
}
@@ -1394,9 +1394,9 @@
// Helper methods for lexing.
//===----------------------------------------------------------------------===//
-/// \brief Routine that indiscriminately skips bytes in the source file.
-void Lexer::SkipBytes(unsigned Bytes, bool StartOfLine) {
- BufferPtr += Bytes;
+/// \brief Routine that indiscriminately sets the offset into the source file.
+void Lexer::SetByteOffset(unsigned Offset, bool StartOfLine) {
+ BufferPtr = BufferStart + Offset;
if (BufferPtr > BufferEnd)
BufferPtr = BufferEnd;
// FIXME: What exactly does the StartOfLine bit mean? There are two