Add a new mode to the lexer which enables it to return all characters,
even whitespace, as tokens from the file.  This is enabled with
L->SetKeepWhitespaceMode(true) on a raw lexer.  In this mode, you too
can use clang as a really complex version of 'cat' with code like this:

  Lexer RawLex(SourceLocation::getFileLoc(SM.getMainFileID(), 0),
               PP.getLangOptions(), File.first, File.second);
  
  RawLex.SetKeepWhitespaceMode(true);
  
  Token RawTok;
  RawLex.LexFromRawLexer(RawTok);
  while (RawTok.isNot(tok::eof)) {
    std::cout << PP.getSpelling(RawTok);
    RawLex.LexFromRawLexer(RawTok);
  }

This will emit exactly the input file, with no canonicalization or other
translation.  Realistic clients actually do something with the tokens of
course :)

llvm-svn: 57401
2 files changed
tree: 4f96812b667849bef915501a85b51458762b2c42
  1. clang/
  2. llvm/