Use trigrams to speed up SpecialCaseList.

Summary:
it's often the case when the rules in the SpecialCaseList
are of the form hel.o*bar. That gives us a chance to build
trigram index to quickly discard 99% of inputs without
running a full regex. A similar idea was used in Google Code Search
as described in the blog post:
https://swtch.com/~rsc/regexp/regexp4.html

The check is defeated, if there's at least one regex
more complicated than that. In this case, all inputs
will go through the regex. That said, the real-world
rules are often simple or can be simplied. That considerably
speeds up compiling Chromium with CFI and UBSan.

As measured on Chromium's content_message_generator.cc:

before, CFI: 44 s
after, CFI: 23 s
after, CFI, no blacklist: 23 s (~1% slower, but 3 runs were unable to show the difference)
after, regular compilation to bitcode: 23 s

Reviewers: pcc

Subscribers: mgorny, llvm-commits

Differential Revision: https://reviews.llvm.org/D27188

llvm-svn: 288303
diff --git a/llvm/lib/Support/SpecialCaseList.cpp b/llvm/lib/Support/SpecialCaseList.cpp
index 62dcafe..df524b3 100644
--- a/llvm/lib/Support/SpecialCaseList.cpp
+++ b/llvm/lib/Support/SpecialCaseList.cpp
@@ -15,6 +15,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "llvm/Support/SpecialCaseList.h"
+#include "llvm/Support/TrigramIndex.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringSet.h"
@@ -33,10 +34,15 @@
 /// literal strings than Regex.
 struct SpecialCaseList::Entry {
   StringSet<> Strings;
+  TrigramIndex Trigrams;
   std::unique_ptr<Regex> RegEx;
 
   bool match(StringRef Query) const {
-    return Strings.count(Query) || (RegEx && RegEx->match(Query));
+    if (Strings.count(Query))
+      return true;
+    if (Trigrams.isDefinitelyOut(Query))
+      return false;
+    return RegEx && RegEx->match(Query);
   }
 };
 
@@ -104,10 +110,12 @@
     StringRef Category = SplitRegexp.second;
 
     // See if we can store Regexp in Strings.
+    auto &Entry = Entries[Prefix][Category];
     if (Regex::isLiteralERE(Regexp)) {
-      Entries[Prefix][Category].Strings.insert(Regexp);
+      Entry.Strings.insert(Regexp);
       continue;
     }
+    Entry.Trigrams.insert(Regexp);
 
     // Replace * with .*
     for (size_t pos = 0; (pos = Regexp.find('*', pos)) != std::string::npos;