Allow additional file suffixes/extensions considered as source in main include grouping

Summary:
By additional regex match, grouping of main include can be enabled in files that are not normally considered as a C/C++ source code.
For example, this might be useful in templated code, where template implementations are being held in *Impl.hpp files.
On the occassion, 'assume-filename' option description was reworded as it was misleading. It has nothing to do with `style=file` option and it does not influence sourced style filename.

Reviewers: rsmith, ioeric, krasimir, sylvestre.ledru, MyDeveloperDay

Reviewed By: MyDeveloperDay

Subscribers: MyDeveloperDay, cfe-commits

Patch by:  furdyna

Tags: #clang

Differential Revision: https://reviews.llvm.org/D67750
diff --git a/clang/docs/ClangFormat.rst b/clang/docs/ClangFormat.rst
index 7fe050e..1138b23 100644
--- a/clang/docs/ClangFormat.rst
+++ b/clang/docs/ClangFormat.rst
@@ -31,9 +31,9 @@
   Clang-format options:
 
     --Werror                   - If set, changes formatting warnings to errors
-    --assume-filename=<string> - When reading from stdin, clang-format assumes this
-                                 filename to look for a style config file (with
-                                 -style=file) and to determine the language.
+    --assume-filename=<string> - Override filename used to determine the language.
+                                 When reading from stdin, clang-format assumes this
+                                 filename to determine the language.
     --cursor=<uint>            - The position of the cursor when invoking
                                  clang-format from an editor integration
     --dry-run                  - If set, do not actually make the formatting changes
diff --git a/clang/docs/ClangFormatStyleOptions.rst b/clang/docs/ClangFormatStyleOptions.rst
index e0ad965..1a9c5b8 100644
--- a/clang/docs/ClangFormatStyleOptions.rst
+++ b/clang/docs/ClangFormatStyleOptions.rst
@@ -1581,6 +1581,26 @@
   For example, if configured to "(_test)?$", then a header a.h would be seen
   as the "main" include in both a.cc and a_test.cc.
 
+**IncludeIsMainSourceRegex** (``std::string``)
+  Specify a regular expression for files being formatted
+  that are allowed to be considered "main" in the
+  file-to-main-include mapping.
+
+  By default, clang-format considers files as "main" only when they end
+  with: ``.c``, ``.cc``, ``.cpp``, ``.c++``, ``.cxx``, ``.m`` or ``.mm``
+  extensions.
+  For these files a guessing of "main" include takes place
+  (to assign category 0, see above). This config option allows for
+  additional suffixes and extensions for files to be considered as "main".
+
+  For example, if this option is configured to ``(Impl\.hpp)$``,
+  then a file ``ClassImpl.hpp`` is considered "main" (in addition to
+  ``Class.c``, ``Class.cc``, ``Class.cpp`` and so on) and "main
+  include file" logic will be executed (with *IncludeIsMainRegex* setting
+  also being respected in later phase). Without this option set,
+  ``ClassImpl.hpp`` would not have the main include file put on top
+  before any other include.
+
 **IndentCaseLabels** (``bool``)
   Indent case labels one level from the switch statement.
 
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 3db1603..1139116 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -292,6 +292,24 @@
 - clang-format gets a new option called ``--dry-run`` or ``-n`` to emit a
   warning.
 
+- Option *IncludeIsMainSourceRegex* added to allow for additional
+  suffixes and file extensions to be considered as a source file
+  for execution of logic that looks for "main *include* file" to put
+  it on top.
+
+  By default, clang-format considers *source* files as "main" only when
+  they end with: ``.c``, ``.cc``, ``.cpp``, ``.c++``, ``.cxx``,
+  ``.m`` or ``.mm`` extensions. This config option allows to
+  extend this set of source files considered as "main".
+            
+  For example, if this option is configured to ``(Impl\.hpp)$``,
+  then a file ``ClassImpl.hpp`` is considered "main" (in addition to
+  ``Class.c``, ``Class.cc``, ``Class.cpp`` and so on) and "main
+  include file" logic will be executed (with *IncludeIsMainRegex* setting
+  also being respected in later phase). Without this option set,
+  ``ClassImpl.hpp`` would not have the main include file put on top
+  before any other include.
+
 libclang
 --------
 
diff --git a/clang/include/clang/Format/Format.h b/clang/include/clang/Format/Format.h
index 6c6b592..329579b 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -2087,6 +2087,10 @@
            ForEachMacros == R.ForEachMacros &&
            IncludeStyle.IncludeBlocks == R.IncludeStyle.IncludeBlocks &&
            IncludeStyle.IncludeCategories == R.IncludeStyle.IncludeCategories &&
+           IncludeStyle.IncludeIsMainRegex ==
+               R.IncludeStyle.IncludeIsMainRegex &&
+           IncludeStyle.IncludeIsMainSourceRegex ==
+               R.IncludeStyle.IncludeIsMainSourceRegex &&
            IndentCaseLabels == R.IndentCaseLabels &&
            IndentGotoLabels == R.IndentGotoLabels &&
            IndentPPDirectives == R.IndentPPDirectives &&
diff --git a/clang/include/clang/Tooling/Inclusions/IncludeStyle.h b/clang/include/clang/Tooling/Inclusions/IncludeStyle.h
index 9cade29..2648297 100644
--- a/clang/include/clang/Tooling/Inclusions/IncludeStyle.h
+++ b/clang/include/clang/Tooling/Inclusions/IncludeStyle.h
@@ -120,6 +120,26 @@
   /// For example, if configured to "(_test)?$", then a header a.h would be seen
   /// as the "main" include in both a.cc and a_test.cc.
   std::string IncludeIsMainRegex;
+
+  /// Specify a regular expression for files being formatted
+  /// that are allowed to be considered "main" in the
+  /// file-to-main-include mapping.
+  ///
+  /// By default, clang-format considers files as "main" only when they end
+  /// with: ``.c``, ``.cc``, ``.cpp``, ``.c++``, ``.cxx``, ``.m`` or ``.mm``
+  /// extensions.
+  /// For these files a guessing of "main" include takes place
+  /// (to assign category 0, see above). This config option allows for
+  /// additional suffixes and extensions for files to be considered as "main".
+  ///
+  /// For example, if this option is configured to ``(Impl\.hpp)$``,
+  /// then a file ``ClassImpl.hpp`` is considered "main" (in addition to
+  /// ``Class.c``, ``Class.cc``, ``Class.cpp`` and so on) and "main
+  /// include file" logic will be executed (with *IncludeIsMainRegex* setting
+  /// also being respected in later phase). Without this option set,
+  /// ``ClassImpl.hpp`` would not have the main include file put on top
+  /// before any other include.
+  std::string IncludeIsMainSourceRegex;
 };
 
 } // namespace tooling
diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index 50a6877..6095859 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -476,6 +476,8 @@
     IO.mapOptional("IncludeBlocks", Style.IncludeStyle.IncludeBlocks);
     IO.mapOptional("IncludeCategories", Style.IncludeStyle.IncludeCategories);
     IO.mapOptional("IncludeIsMainRegex", Style.IncludeStyle.IncludeIsMainRegex);
+    IO.mapOptional("IncludeIsMainSourceRegex",
+                   Style.IncludeStyle.IncludeIsMainSourceRegex);
     IO.mapOptional("IndentCaseLabels", Style.IndentCaseLabels);
     IO.mapOptional("IndentGotoLabels", Style.IndentGotoLabels);
     IO.mapOptional("IndentPPDirectives", Style.IndentPPDirectives);
diff --git a/clang/lib/Tooling/Inclusions/HeaderIncludes.cpp b/clang/lib/Tooling/Inclusions/HeaderIncludes.cpp
index e746bbb..37a0816 100644
--- a/clang/lib/Tooling/Inclusions/HeaderIncludes.cpp
+++ b/clang/lib/Tooling/Inclusions/HeaderIncludes.cpp
@@ -184,6 +184,10 @@
                FileName.endswith(".cpp") || FileName.endswith(".c++") ||
                FileName.endswith(".cxx") || FileName.endswith(".m") ||
                FileName.endswith(".mm");
+  if (!Style.IncludeIsMainSourceRegex.empty()) {
+    llvm::Regex MainFileRegex(Style.IncludeIsMainSourceRegex);
+    IsMainFile |= MainFileRegex.match(FileName);
+  }
 }
 
 int IncludeCategoryManager::getIncludePriority(StringRef IncludeName,
@@ -364,6 +368,5 @@
   return Result;
 }
 
-
 } // namespace tooling
 } // namespace clang
diff --git a/clang/tools/clang-format/ClangFormat.cpp b/clang/tools/clang-format/ClangFormat.cpp
index aa6890e..faf6451 100644
--- a/clang/tools/clang-format/ClangFormat.cpp
+++ b/clang/tools/clang-format/ClangFormat.cpp
@@ -75,9 +75,9 @@
 
 static cl::opt<std::string> AssumeFileName(
     "assume-filename",
-    cl::desc("When reading from stdin, clang-format assumes this\n"
-             "filename to look for a style config file (with\n"
-             "-style=file) and to determine the language."),
+    cl::desc("Override filename used to determine the language.\n"
+             "When reading from stdin, clang-format assumes this\n"
+             "filename to determine the language."),
     cl::init("<stdin>"), cl::cat(ClangFormatCategory));
 
 static cl::opt<bool> Inplace("i",
diff --git a/clang/unittests/Format/FormatTest.cpp b/clang/unittests/Format/FormatTest.cpp
index cef2b0b..01154e2 100644
--- a/clang/unittests/Format/FormatTest.cpp
+++ b/clang/unittests/Format/FormatTest.cpp
@@ -12847,6 +12847,8 @@
               IncludeStyle.IncludeCategories, ExpectedCategories);
   CHECK_PARSE("IncludeIsMainRegex: 'abc$'", IncludeStyle.IncludeIsMainRegex,
               "abc$");
+  CHECK_PARSE("IncludeIsMainSourceRegex: 'abc$'",
+              IncludeStyle.IncludeIsMainSourceRegex, "abc$");
 
   Style.RawStringFormats.clear();
   std::vector<FormatStyle::RawStringFormat> ExpectedRawStringFormats = {
diff --git a/clang/unittests/Format/SortIncludesTest.cpp b/clang/unittests/Format/SortIncludesTest.cpp
index 0c3a95a..7a073a3 100644
--- a/clang/unittests/Format/SortIncludesTest.cpp
+++ b/clang/unittests/Format/SortIncludesTest.cpp
@@ -466,6 +466,57 @@
                  "a.cc"));
 }
 
+TEST_F(SortIncludesTest, LeavesMainHeaderFirstInAdditionalExtensions) {
+  Style.IncludeIsMainRegex = "([-_](test|unittest))?|(Impl)?$";
+  EXPECT_EQ("#include \"b.h\"\n"
+            "#include \"c.h\"\n"
+            "#include \"llvm/a.h\"\n",
+            sort("#include \"llvm/a.h\"\n"
+                 "#include \"c.h\"\n"
+                 "#include \"b.h\"\n",
+                 "a_test.xxx"));
+  EXPECT_EQ("#include \"b.h\"\n"
+            "#include \"c.h\"\n"
+            "#include \"llvm/a.h\"\n",
+            sort("#include \"llvm/a.h\"\n"
+                 "#include \"c.h\"\n"
+                 "#include \"b.h\"\n",
+                 "aImpl.hpp"));
+
+  // .cpp extension is considered "main" by default
+  EXPECT_EQ("#include \"llvm/a.h\"\n"
+            "#include \"b.h\"\n"
+            "#include \"c.h\"\n",
+            sort("#include \"llvm/a.h\"\n"
+                 "#include \"c.h\"\n"
+                 "#include \"b.h\"\n",
+                 "aImpl.cpp"));
+  EXPECT_EQ("#include \"llvm/a.h\"\n"
+            "#include \"b.h\"\n"
+            "#include \"c.h\"\n",
+            sort("#include \"llvm/a.h\"\n"
+                 "#include \"c.h\"\n"
+                 "#include \"b.h\"\n",
+                 "a_test.cpp"));
+
+  // Allow additional filenames / extensions
+  Style.IncludeIsMainSourceRegex = "(Impl\\.hpp)|(\\.xxx)$";
+  EXPECT_EQ("#include \"llvm/a.h\"\n"
+            "#include \"b.h\"\n"
+            "#include \"c.h\"\n",
+            sort("#include \"llvm/a.h\"\n"
+                 "#include \"c.h\"\n"
+                 "#include \"b.h\"\n",
+                 "a_test.xxx"));
+  EXPECT_EQ("#include \"llvm/a.h\"\n"
+            "#include \"b.h\"\n"
+            "#include \"c.h\"\n",
+            sort("#include \"llvm/a.h\"\n"
+                 "#include \"c.h\"\n"
+                 "#include \"b.h\"\n",
+                 "aImpl.hpp"));
+}
+
 TEST_F(SortIncludesTest, RecognizeMainHeaderInAllGroups) {
   Style.IncludeIsMainRegex = "([-_](test|unittest))?$";
   Style.IncludeBlocks = tooling::IncludeStyle::IBS_Merge;