[clang][Parse] Diagnose useless null statements / empty init-statements

Summary:
clang has `-Wextra-semi` (D43162), which is not dictated by the currently selected standard.
While that is great, there is at least one more source of need-less semis - 'null statements'.
Sometimes, they are needed:
```
for(int x = 0; continueToDoWork(x); x++)
  ; // Ugly code, but the semi is needed here.
```

But sometimes they are just there for no reason:
```
switch(X) {
case 0:
  return -2345;
case 5:
  return 0;
default:
  return 42;
}; // <- oops

;;;;;;;;;;; <- OOOOPS, still not diagnosed. Clearly this is junk.
```

Additionally:
```
if(; // <- empty init-statement
   true)
  ;

switch (; // empty init-statement
        x) {
  ...
}

for (; // <- empty init-statement
     int y : S())
  ;
}

As usual, things may or may not go sideways in the presence of macros.
While evaluating this diag on my codebase of interest, it was unsurprisingly
discovered that Google Test macros are *very* prone to this.
And it seems many issues are deep within the GTest itself, not
in the snippets passed from the codebase that uses GTest.

So after some thought, i decided not do issue a diagnostic if the semi
is within *any* macro, be it either from the normal header, or system header.

Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=39111 | PR39111 ]]

Reviewers: rsmith, aaron.ballman, efriedma

Reviewed By: aaron.ballman

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D52695

llvm-svn: 347339
diff --git a/clang/lib/Parse/ParseStmt.cpp b/clang/lib/Parse/ParseStmt.cpp
index 8461ee1..992997e 100644
--- a/clang/lib/Parse/ParseStmt.cpp
+++ b/clang/lib/Parse/ParseStmt.cpp
@@ -930,6 +930,34 @@
 
 }
 
+/// Consume any extra semi-colons resulting in null statements,
+/// returning true if any tok::semi were consumed.
+bool Parser::ConsumeNullStmt(StmtVector &Stmts) {
+  if (!Tok.is(tok::semi))
+    return false;
+
+  SourceLocation StartLoc = Tok.getLocation();
+  SourceLocation EndLoc;
+
+  while (Tok.is(tok::semi) && !Tok.hasLeadingEmptyMacro() &&
+         Tok.getLocation().isValid() && !Tok.getLocation().isMacroID()) {
+    EndLoc = Tok.getLocation();
+
+    // Don't just ConsumeToken() this tok::semi, do store it in AST.
+    StmtResult R = ParseStatementOrDeclaration(Stmts, ACK_Any);
+    if (R.isUsable())
+      Stmts.push_back(R.get());
+  }
+
+  // Did not consume any extra semi.
+  if (EndLoc.isInvalid())
+    return false;
+
+  Diag(StartLoc, diag::warn_null_statement)
+      << FixItHint::CreateRemoval(SourceRange(StartLoc, EndLoc));
+  return true;
+}
+
 /// ParseCompoundStatementBody - Parse a sequence of statements and invoke the
 /// ActOnCompoundStmt action.  This expects the '{' to be the current token, and
 /// consume the '}' at the end of the block.  It does not manipulate the scope
@@ -992,6 +1020,9 @@
       continue;
     }
 
+    if (ConsumeNullStmt(Stmts))
+      continue;
+
     StmtResult R;
     if (Tok.isNot(tok::kw___extension__)) {
       R = ParseStatementOrDeclaration(Stmts, ACK_Any);
@@ -1588,10 +1619,15 @@
   ParsedAttributesWithRange attrs(AttrFactory);
   MaybeParseCXX11Attributes(attrs);
 
+  SourceLocation EmptyInitStmtSemiLoc;
+
   // Parse the first part of the for specifier.
   if (Tok.is(tok::semi)) {  // for (;
     ProhibitAttributes(attrs);
     // no first part, eat the ';'.
+    SourceLocation SemiLoc = Tok.getLocation();
+    if (!Tok.hasLeadingEmptyMacro() && !SemiLoc.isMacroID())
+      EmptyInitStmtSemiLoc = SemiLoc;
     ConsumeToken();
   } else if (getLangOpts().CPlusPlus && Tok.is(tok::identifier) &&
              isForRangeIdentifier()) {
@@ -1723,6 +1759,11 @@
                    : diag::ext_for_range_init_stmt)
               << (FirstPart.get() ? FirstPart.get()->getSourceRange()
                                   : SourceRange());
+          if (EmptyInitStmtSemiLoc.isValid()) {
+            Diag(EmptyInitStmtSemiLoc, diag::warn_empty_init_statement)
+                << /*for-loop*/ 2
+                << FixItHint::CreateRemoval(EmptyInitStmtSemiLoc);
+          }
         }
       } else {
         ExprResult SecondExpr = ParseExpression();