[clang-format] Add basic support for formatting C# files
Summary:
This revision adds basic support for formatting C# files with clang-format. I know the barrier to entry is high here, so I'm sending this revision in to test the waters on whether this is something we'd consider landing.
Tracking in Bugzilla as:
https://bugs.llvm.org/show_bug.cgi?id=40850
Justification:
C# code just looks ugly in comparison to the C++ code in our source tree which is clang-formatted.
I've struggled to get a clean and consistent style out of Visual Studio's reformatting. I want to format our C# code on save, as I do now for C++, and I want it to follow the style defined in our .clang-format file so that it is as consistent as it can be with our C++ (braces, breaking, spaces, indentation, etc.).
Using clang-format without this patch leaves the code in a bad state; sometimes, when BreakStringLiterals is set, the formatted code fails to compile.
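One plausible way formatting can break the build is that a verbatim literal such as @"C:\dir" is lexed as a separate `@` token followed by a string token, so a string-breaking pass may treat it like an ordinary literal. The following is a minimal sketch of folding the prefix into the literal. The `Token` struct and `mergePrefixedString` helper are hypothetical simplifications for illustration, not clang-format's actual FormatToken API:

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified token; the real FormatToken carries far more state.
struct Token {
  std::string Text;
  bool IsStringLiteral = false;
};

// If the last two tokens are a `@` or `$` prefix followed by a string literal,
// fold them into a single string-literal token. Returns true on a merge.
bool mergePrefixedString(std::vector<Token> &Tokens) {
  if (Tokens.size() < 2)
    return false;
  Token &Prefix = Tokens[Tokens.size() - 2];
  const Token &String = Tokens.back();
  if ((Prefix.Text != "@" && Prefix.Text != "$") || !String.IsStringLiteral)
    return false;
  Prefix.Text += String.Text;    // e.g. `@` + `"C:\dir"` becomes `@"C:\dir"`
  Prefix.IsStringLiteral = true; // later passes now see one literal
  Tokens.pop_back();             // drop the now-redundant string token
  return true;
}
```

Once the prefix and the literal are a single token, a later pass can recognise it as a C# string and decline to split it.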
C# is mostly handled like Java, except that instead of JavaAnnotations I try to reuse TT_AttributeSquare.
Arguably the most valuable part is the new Language, which lets the C# configuration be partitioned within a common .clang-format file, with auto-detection on the .cs extension. Other C#-specific styles could be added later if this is accepted, in particular how `{ set;get }` is formatted.
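To illustrate the auto-detection idea, here is a minimal sketch of choosing a language from the file extension. The `Language` enum and `guessLanguage` function are hypothetical names for this example, and the case-insensitive match is an assumption rather than a statement about what the patch does:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-in for the formatter's language kinds.
enum class Language { Unknown, Cpp, CSharp };

// Pick a language from the file extension, ignoring case (an assumption).
Language guessLanguage(const std::string &FileName) {
  const auto Dot = FileName.rfind('.');
  if (Dot == std::string::npos)
    return Language::Unknown;
  std::string Ext = FileName.substr(Dot);
  std::transform(Ext.begin(), Ext.end(), Ext.begin(), [](unsigned char C) {
    return static_cast<char>(std::tolower(C));
  });
  if (Ext == ".cs")
    return Language::CSharp;
  if (Ext == ".cpp" || Ext == ".h")
    return Language::Cpp;
  return Language::Unknown;
}
```

With a mapping like this, a file such as Program.cs would pick up the C# section of a shared configuration while main.cpp keeps using the C++ one.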
Reviewers: djasper, klimek, krasimir, benhamilton, JonasToth
Reviewed By: klimek
Subscribers: llvm-commits, mgorny, jdoerfert, cfe-commits
Tags: #clang, #clang-tools-extra
Differential Revision: https://reviews.llvm.org/D58404
llvm-svn: 356662
diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index da755e3..3a1dcef 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -66,6 +66,21 @@
return;
if (tryMergeLessLess())
return;
+
+ if (Style.isCSharp()) {
+ if (tryMergeCSharpKeywordVariables())
+ return;
+ if (tryMergeCSharpVerbatimStringLiteral())
+ return;
+ if (tryMergeCSharpDoubleQuestion())
+ return;
+ if (tryMergeCSharpNullConditionals())
+ return;
+ static const tok::TokenKind JSRightArrow[] = {tok::equal, tok::greater};
+ if (tryMergeTokens(JSRightArrow, TT_JsFatArrow))
+ return;
+ }
+
if (tryMergeNSStringLiteral())
return;
@@ -142,6 +157,100 @@
return true;
}
+// Search for verbatim or interpolated string literals @"ABC" or
+// $"aaaaa{abc}aaaaa" and mark the token as TT_CSharpStringLiteral, to
+// prevent splitting of @, $ and ".
+bool FormatTokenLexer::tryMergeCSharpVerbatimStringLiteral() {
+ if (Tokens.size() < 2)
+ return false;
+ auto &At = *(Tokens.end() - 2);
+ auto &String = *(Tokens.end() - 1);
+
+ // Look for $"aaaaaa" @"aaaaaa".
+ if (!(At->is(tok::at) || At->TokenText == "$") ||
+ !String->is(tok::string_literal))
+ return false;
+
+ if (Tokens.size() > 2 && At->is(tok::at)) {
+ auto &Dollar = *(Tokens.end() - 3);
+ if (Dollar->TokenText == "$") {
+ // This looks like $@"aaaaa" so we need to combine all 3 tokens.
+ Dollar->Tok.setKind(tok::string_literal);
+ Dollar->TokenText =
+ StringRef(Dollar->TokenText.begin(),
+ String->TokenText.end() - Dollar->TokenText.begin());
+ Dollar->ColumnWidth += (At->ColumnWidth + String->ColumnWidth);
+ Dollar->Type = TT_CSharpStringLiteral;
+ Tokens.erase(Tokens.end() - 2);
+ Tokens.erase(Tokens.end() - 1);
+ return true;
+ }
+ }
+
+ // Convert back into just a string_literal.
+ At->Tok.setKind(tok::string_literal);
+ At->TokenText = StringRef(At->TokenText.begin(),
+ String->TokenText.end() - At->TokenText.begin());
+ At->ColumnWidth += String->ColumnWidth;
+ At->Type = TT_CSharpStringLiteral;
+ Tokens.erase(Tokens.end() - 1);
+ return true;
+}
+
+bool FormatTokenLexer::tryMergeCSharpDoubleQuestion() {
+ if (Tokens.size() < 2)
+ return false;
+ auto &FirstQuestion = *(Tokens.end() - 2);
+ auto &SecondQuestion = *(Tokens.end() - 1);
+ if (!FirstQuestion->is(tok::question) || !SecondQuestion->is(tok::question))
+ return false;
+ FirstQuestion->Tok.setKind(tok::question);
+ FirstQuestion->TokenText = StringRef(FirstQuestion->TokenText.begin(),
+ SecondQuestion->TokenText.end() -
+ FirstQuestion->TokenText.begin());
+ FirstQuestion->ColumnWidth += SecondQuestion->ColumnWidth;
+ FirstQuestion->Type = TT_CSharpNullCoalescing;
+ Tokens.erase(Tokens.end() - 1);
+ return true;
+}
+
+bool FormatTokenLexer::tryMergeCSharpKeywordVariables() {
+ if (Tokens.size() < 2)
+ return false;
+ auto &At = *(Tokens.end() - 2);
+ auto &Keyword = *(Tokens.end() - 1);
+ if (!At->is(tok::at))
+ return false;
+ if (!Keywords.isCSharpKeyword(*Keyword))
+ return false;
+
+ At->Tok.setKind(tok::identifier);
+ At->TokenText = StringRef(At->TokenText.begin(),
+ Keyword->TokenText.end() - At->TokenText.begin());
+ At->ColumnWidth += Keyword->ColumnWidth;
+ At->Type = Keyword->Type;
+ Tokens.erase(Tokens.end() - 1);
+ return true;
+}
+
+// In C# merge the Identifier and the ? together e.g. arg?.
+bool FormatTokenLexer::tryMergeCSharpNullConditionals() {
+ if (Tokens.size() < 2)
+ return false;
+ auto &Identifier = *(Tokens.end() - 2);
+ auto &Question = *(Tokens.end() - 1);
+ if (!Identifier->isOneOf(tok::r_square, tok::identifier) ||
+ !Question->is(tok::question))
+ return false;
+ Identifier->TokenText =
+ StringRef(Identifier->TokenText.begin(),
+ Question->TokenText.end() - Identifier->TokenText.begin());
+ Identifier->ColumnWidth += Question->ColumnWidth;
+ Identifier->Type = Identifier->Type;
+ Tokens.erase(Tokens.end() - 1);
+ return true;
+}
+
bool FormatTokenLexer::tryMergeLessLess() {
// Merge X,less,less,Y into X,lessless,Y unless X or Y is less.
if (Tokens.size() < 3)