Specify the default values of the cache parameters

If the parameters of the target cache (i.e., cache level sizes and cache level
associativities) are not specified or have invalid values, we set all
parameters of the macro-kernel to one and do not perform data-layout
optimizations of the matrix multiplication. This patch specifies default values
for the cache parameters so that the pattern matching optimizations can be
applied even in this case. Since there are no typical values of these
parameters, we use the parameters of the Intel Core i7-3820 SandyBridge, which
also help to attain high performance on the IBM POWER System S822 and the IBM
Power 730 Express server.
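
As an illustration, the following standalone sketch mirrors the arithmetic in
the diff below with the new defaults (32768-byte 8-way first level, 262144-byte
8-way second level). The function name computeMacroKernelParams, the struct,
and the sample values Mr = 4, Nr = 8, and an Nc quotient of 256 are assumptions
made for this example only; they are not taken from the patch. The literal 8 is
copied from the patch as-is.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical standalone mirror of the macro-kernel computation in the patch.
struct MacroKernelParams {
  int Mc, Nc, Kc;
};

// Cache arguments default to the values this patch introduces.
// Mr, Nr and NcQuotient have no default here; callers must supply them.
MacroKernelParams computeMacroKernelParams(int Mr, int Nr, int NcQuotient,
                                           int L1Size = 32768,
                                           int L2Size = 262144,
                                           int L1Assoc = 8, int L2Assoc = 8) {
  // Same validity checks as the patch: fall back to {1, 1, 1}.
  if (!(Mr > 0 && Nr > 0 && L1Size > 0 && L2Size > 0 && L1Assoc > 2 &&
        L2Assoc > 2))
    return {1, 1, 1};
  if (NcQuotient <= 0)
    return {1, 1, 1};
  int Car = static_cast<int>(
      std::floor((L1Assoc - 1) / (1 + static_cast<double>(Nr) / Mr)));
  int Kc = (Car * L1Size) / (Mr * L1Assoc * 8);
  double Cac = static_cast<double>(Kc * 8 * L2Assoc) / L2Size;
  int Mc = static_cast<int>(std::floor((L2Assoc - 2) / Cac));
  int Nc = NcQuotient * Nr;
  return {Mc, Nc, Kc};
}
```

With the assumed Mr = 4, Nr = 8, and quotient 256, the default cache parameters
give Car = 2, Kc = 256, Cac = 0.0625, and therefore {Mc, Nc, Kc} =
{96, 2048, 256}. Passing an associativity of 2 or less for either level yields
the {1, 1, 1} fallback.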

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D28090

llvm-svn: 290518
diff --git a/polly/lib/Transform/ScheduleOptimizer.cpp b/polly/lib/Transform/ScheduleOptimizer.cpp
index 017cce9..13bfaee 100644
--- a/polly/lib/Transform/ScheduleOptimizer.cpp
+++ b/polly/lib/Transform/ScheduleOptimizer.cpp
@@ -134,16 +134,33 @@
              "instructions per clock cycle."),
     cl::Hidden, cl::init(1), cl::ZeroOrMore, cl::cat(PollyCategory));
 
-static cl::list<int>
-    CacheLevelAssociativity("polly-target-cache-level-associativity",
-                            cl::desc("The associativity of each cache level."),
-                            cl::Hidden, cl::ZeroOrMore, cl::CommaSeparated,
-                            cl::cat(PollyCategory));
+// This option, along with --polly-target-2nd-cache-level-associativity,
+// --polly-target-1st-cache-level-size, and --polly-target-2nd-cache-level-size
+// represent the parameters of the target cache, which do not have typical
+// values that can be used by default. However, to apply the pattern matching
+// optimizations, we use the values of the parameters of Intel Core i7-3820
+// SandyBridge in case the parameters are not specified. Such an approach helps
+// also to attain the high-performance on IBM POWER System S822 and IBM Power
+// 730 Express server.
+static cl::opt<int> FirstCacheLevelAssociativity(
+    "polly-target-1st-cache-level-associativity",
+    cl::desc("The associativity of the first cache level."), cl::Hidden,
+    cl::init(8), cl::ZeroOrMore, cl::cat(PollyCategory));
 
-static cl::list<int> CacheLevelSizes(
-    "polly-target-cache-level-sizes",
-    cl::desc("The size of each cache level specified in bytes."), cl::Hidden,
-    cl::ZeroOrMore, cl::CommaSeparated, cl::cat(PollyCategory));
+static cl::opt<int> SecondCacheLevelAssociativity(
+    "polly-target-2nd-cache-level-associativity",
+    cl::desc("The associativity of the second cache level."), cl::Hidden,
+    cl::init(8), cl::ZeroOrMore, cl::cat(PollyCategory));
+
+static cl::opt<int> FirstCacheLevelSize(
+    "polly-target-1st-cache-level-size",
+    cl::desc("The size of the first cache level specified in bytes."),
+    cl::Hidden, cl::init(32768), cl::ZeroOrMore, cl::cat(PollyCategory));
+
+static cl::opt<int> SecondCacheLevelSize(
+    "polly-target-2nd-cache-level-size",
+    cl::desc("The size of the second level specified in bytes."), cl::Hidden,
+    cl::init(262144), cl::ZeroOrMore, cl::cat(PollyCategory));
 
 static cl::opt<int> FirstLevelDefaultTileSize(
     "polly-default-tile-size",
@@ -612,21 +629,20 @@
   // degree of a cache level is greater than two. Otherwise, another algorithm
   // for determination of the parameters should be used.
   if (!(MicroKernelParams.Mr > 0 && MicroKernelParams.Nr > 0 &&
-        CacheLevelSizes.size() >= 2 && CacheLevelAssociativity.size() >= 2 &&
-        CacheLevelSizes[0] > 0 && CacheLevelSizes[1] > 0 &&
-        CacheLevelAssociativity[0] > 2 && CacheLevelAssociativity[1] > 2))
+        FirstCacheLevelSize > 0 && SecondCacheLevelSize > 0 &&
+        FirstCacheLevelAssociativity > 2 && SecondCacheLevelAssociativity > 2))
     return {1, 1, 1};
   // The quotient should be greater than zero.
   if (PollyPatternMatchingNcQuotient <= 0)
     return {1, 1, 1};
   int Car = floor(
-      (CacheLevelAssociativity[0] - 1) /
+      (FirstCacheLevelAssociativity - 1) /
       (1 + static_cast<double>(MicroKernelParams.Nr) / MicroKernelParams.Mr));
-  int Kc = (Car * CacheLevelSizes[0]) /
-           (MicroKernelParams.Mr * CacheLevelAssociativity[0] * 8);
-  double Cac = static_cast<double>(Kc * 8 * CacheLevelAssociativity[1]) /
-               CacheLevelSizes[1];
-  int Mc = floor((CacheLevelAssociativity[1] - 2) / Cac);
+  int Kc = (Car * FirstCacheLevelSize) /
+           (MicroKernelParams.Mr * FirstCacheLevelAssociativity * 8);
+  double Cac = static_cast<double>(Kc * 8 * SecondCacheLevelAssociativity) /
+               SecondCacheLevelSize;
+  int Mc = floor((SecondCacheLevelAssociativity - 2) / Cac);
   int Nc = PollyPatternMatchingNcQuotient * MicroKernelParams.Nr;
   return {Mc, Nc, Kc};
 }