[llvm-mca] Add fields "Total uOps" and "uOps Per Cycle" to the report generated by the SummaryView.

This patch adds two new fields to the perf report generated by the SummaryView.
Fields are now logically organized into two small groups; only the second group
contains throughput indicators.

Example:
```
Iterations:        100
Instructions:      300
Total Cycles:      414
Total uOps:        700

Dispatch Width:    4
uOps Per Cycle:    1.69
IPC:               0.72
Block RThroughput: 4.0
```

This patch also updates the docs for llvm-mca.
Due to the nature of this change, several tests in the tools/llvm-mca directory
were affected, and had to be updated using script `update_mca_test_checks.py`.

llvm-svn: 340946
diff --git a/llvm/tools/llvm-mca/Views/SummaryView.cpp b/llvm/tools/llvm-mca/Views/SummaryView.cpp
index 4a147bb..026742a 100644
--- a/llvm/tools/llvm-mca/Views/SummaryView.cpp
+++ b/llvm/tools/llvm-mca/Views/SummaryView.cpp
@@ -63,7 +63,9 @@
   unsigned Iterations = Source.getNumIterations();
   unsigned Instructions = Source.size();
   unsigned TotalInstructions = Instructions * Iterations;
+  unsigned TotalUOps = NumMicroOps * Iterations;
   double IPC = (double)TotalInstructions / TotalCycles;
+  double UOpsPerCycle = (double)TotalUOps / TotalCycles;
   double BlockRThroughput = computeBlockRThroughput(
       SM, DispatchWidth, NumMicroOps, ProcResourceUsage);
 
@@ -72,10 +74,12 @@
   TempStream << "Iterations:        " << Iterations;
   TempStream << "\nInstructions:      " << TotalInstructions;
   TempStream << "\nTotal Cycles:      " << TotalCycles;
+  TempStream << "\nTotal uOps:        " << TotalUOps << '\n';
   TempStream << "\nDispatch Width:    " << DispatchWidth;
-  TempStream << "\nIPC:               " << format("%.2f", IPC);
-
-  // Round to the block reciprocal throughput to the nearest tenth.
+  TempStream << "\nuOps Per Cycle:    "
+             << format("%.2f", floor((UOpsPerCycle * 100) + 0.5) / 100);
+  TempStream << "\nIPC:               "
+             << format("%.2f", floor((IPC * 100) + 0.5) / 100);
   TempStream << "\nBlock RThroughput: "
              << format("%.1f", floor((BlockRThroughput * 10) + 0.5) / 10)
              << '\n';