[llvm-exegesis] (+final perf overview) InstructionBenchmarkClustering::rangeQuery(): reserve for the upper bound of Neighbors

Summary:
As it was pointed out in D54388+D54390, the maximal size of `Neighbors` is known,
it will contain at most Points_.size() minus one (the center of the cluster)

While that is the upper bound, meaning in the most cases, the actual count
will be much smaller, since D54390 made the allocation persistent,
we no longer have to worry about overly-optimistically `reserve()`ing.

Old: (D54393)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       6553.167456      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.21% )
...
            6.5547 +- 0.0134 seconds time elapsed  ( +-  0.20% )
```
New:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       6315.057872      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.24% )
...
            6.3187 +- 0.0160 seconds time elapsed  ( +-  0.25% )
```
And that is another -~4%.


Since this is the last (as of this moment) patch in this patch series,
it is a good time to summarize:
Old: (svn trunk, as stated in D54381)
```
$ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null

real    0m24.884s
user    0m24.099s
sys     0m0.785s
```
So these patches, on a given benchmark,
has decreased llvm-exegesis analysis time by 74.62%.

There surely is more room for further improvements.
D54514 may improve thins by -11.5% more (relative to this patch).
Parallelization may improve things further significantly, too.


Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet, MaskRay

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54415

llvm-svn: 347204
diff --git a/llvm/tools/llvm-exegesis/lib/Clustering.cpp b/llvm/tools/llvm-exegesis/lib/Clustering.cpp
index 360966a..b2cd97c 100644
--- a/llvm/tools/llvm-exegesis/lib/Clustering.cpp
+++ b/llvm/tools/llvm-exegesis/lib/Clustering.cpp
@@ -34,8 +34,9 @@
 // Finds the points at distance less than sqrt(EpsilonSquared) of Q (not
 // including Q).
 void InstructionBenchmarkClustering::rangeQuery(
-    const size_t Q, llvm::SmallVectorImpl<size_t> &Neighbors) const {
+    const size_t Q, std::vector<size_t> &Neighbors) const {
   Neighbors.clear();
+  Neighbors.reserve(Points_.size() - 1); // The Q itself isn't a neighbor.
   const auto &QMeasurements = Points_[Q].Measurements;
   for (size_t P = 0, NumPoints = Points_.size(); P < NumPoints; ++P) {
     if (P == Q)
@@ -91,7 +92,7 @@
 }
 
 void InstructionBenchmarkClustering::dbScan(const size_t MinPts) {
-  llvm::SmallVector<size_t, 0> Neighbors; // Persistent buffer to avoid allocs.
+  std::vector<size_t> Neighbors; // Persistent buffer to avoid allocs.
   for (size_t P = 0, NumPoints = Points_.size(); P < NumPoints; ++P) {
     if (!ClusterIdForPoint_[P].isUndef())
       continue; // Previously processed in inner loop.
@@ -136,6 +137,8 @@
       }
     }
   }
+  // assert(Neighbors.capacity() == (Points_.size() - 1));
+  // ^ True, but it is not quaranteed to be true in all the cases.
 
   // Add noisy points to noise cluster.
   for (size_t P = 0, NumPoints = Points_.size(); P < NumPoints; ++P) {