[llvm-mca] LLVM Machine Code Analyzer.

llvm-mca is an LLVM-based performance analysis tool that can be used to
statically measure the performance of code, and to help triage potential
problems with target scheduling models.

llvm-mca uses information that is already available in LLVM (e.g. scheduling
models) to statically measure the performance of machine code on a specific CPU.
Performance is measured in terms of throughput as well as processor resource
consumption. The tool currently works for processors with an out-of-order
backend for which a scheduling model is available in LLVM.

The main goal of this tool is not just to predict the performance of the code
when run on the target, but also to help diagnose potential performance
issues.

Given an assembly code sequence, llvm-mca estimates the IPC (instructions per
cycle) as well as hardware resource pressure. The analysis and reporting style
were mostly inspired by the IACA tool from Intel.
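
For readers unfamiliar with the two metrics, the following is a minimal
sketch of how they can be computed from simulation totals. It is not the code
in this patch: the InstrUsage struct and both function names are hypothetical
and exist only for illustration. It assumes each instruction records the
average number of cycles it occupies each resource during one iteration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record (not part of llvm-mca): average cycles that one
// instruction spends on each hardware resource during a single iteration.
struct InstrUsage {
  std::vector<double> CyclesPerResource;
};

// Resource pressure per iteration: for every resource, sum the cycles
// consumed by all instructions of the analyzed block.
std::vector<double> pressurePerIteration(const std::vector<InstrUsage> &Block,
                                         std::size_t NumResources) {
  std::vector<double> Pressure(NumResources, 0.0);
  for (const InstrUsage &I : Block)
    for (std::size_t R = 0;
         R < NumResources && R < I.CyclesPerResource.size(); ++R)
      Pressure[R] += I.CyclesPerResource[R];
  return Pressure;
}

// IPC: total instructions executed divided by total simulated cycles.
// Returns 0 when no cycles were simulated, to avoid dividing by zero.
double instructionsPerCycle(unsigned NumInstructions, unsigned NumCycles) {
  return NumCycles == 0
             ? 0.0
             : static_cast<double>(NumInstructions) / NumCycles;
}
```

With two resources standing in for JFPU0 and JFPU1, a block of two adds (1cy
each on the first resource) and two shuffles (1cy each on the second) yields a
pressure of 2cy per iteration on both, which matches the example table in the
ResourcePressureView.h header comment.
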
This patch is related to the RFC on llvm-dev visible at this link:
http://lists.llvm.org/pipermail/llvm-dev/2018-March/121490.html
Differential Revision: https://reviews.llvm.org/D43951
llvm-svn: 326998
diff --git a/llvm/tools/llvm-mca/ResourcePressureView.h b/llvm/tools/llvm-mca/ResourcePressureView.h
new file mode 100644
index 0000000..7017f0b
--- /dev/null
+++ b/llvm/tools/llvm-mca/ResourcePressureView.h
@@ -0,0 +1,113 @@
+//===--------------------- ResourcePressureView.h ---------------*- C++ -*-===//
+//
+// The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+/// \file
+///
+/// This file defines class ResourcePressureView.
+/// Class ResourcePressureView observes hardware events generated by
+/// the Backend object and collects statistics related to resource usage at
+/// instruction granularity.
+/// Resource pressure information is then printed out to a stream in the
+/// form of a table like the one from the example below:
+///
+/// Resources:
+/// [0] - JALU0
+/// [1] - JALU1
+/// [2] - JDiv
+/// [3] - JFPM
+/// [4] - JFPU0
+/// [5] - JFPU1
+/// [6] - JLAGU
+/// [7] - JSAGU
+/// [8] - JSTC
+/// [9] - JVIMUL
+///
+/// Resource pressure per iteration:
+/// [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
+/// 0.00 0.00 0.00 0.00 2.00 2.00 0.00 0.00 0.00 0.00
+///
+/// Resource pressure by instruction:
+/// [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] Instructions:
+/// - - - - - 1.00 - - - - vpermilpd $1, %xmm0,
+/// %xmm1
+/// - - - - 1.00 - - - - - vaddps %xmm0, %xmm1,
+/// %xmm2
+/// - - - - - 1.00 - - - - vmovshdup %xmm2, %xmm3
+/// - - - - 1.00 - - - - - vaddss %xmm2, %xmm3,
+/// %xmm4
+///
+/// In this example, we have AVX code executed on AMD Jaguar (btver2).
+/// Both shuffles and vector floating point add operations on XMM registers have
+/// a reciprocal throughput of 1cy.
+/// Each add is issued to pipeline JFPU0, while each shuffle is issued to
+/// pipeline JFPU1. The overall pressure per iteration is reported by two
+/// tables: the first smaller table is the resource pressure per iteration;
+/// the second table reports resource pressure per instruction. Values are the
+/// average resource cycles consumed by an instruction.
+/// Every vector add from the example uses resource JFPU0 for an average of 1cy
+/// per iteration. Consequently, the resource pressure on JFPU0 is 2cy per
+/// iteration.
+///
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TOOLS_LLVM_MCA_RESOURCEPRESSUREVIEW_H
+#define LLVM_TOOLS_LLVM_MCA_RESOURCEPRESSUREVIEW_H
+
+#include "HWEventListener.h"
+#include "SourceMgr.h"
+#include "llvm/MC/MCInstPrinter.h"
+#include "llvm/MC/MCSubtargetInfo.h"
+#include <map>
+
+namespace mca {
+
+class Backend;
+
+/// This class collects resource pressure statistics and it is able to print
+/// out all the collected information as a table to an output stream.
+class ResourcePressureView : public HWEventListener {
+ const llvm::MCSubtargetInfo &STI;
+ llvm::MCInstPrinter &MCIP;
+ const SourceMgr &Source;
+
+ // Map to quickly get a resource's vector index from its mask.
+ std::map<uint64_t, unsigned> Resource2VecIndex;
+
+ // Table of resources used by instructions.
+ std::vector<unsigned> ResourceUsage;
+ unsigned NumResourceUnits;
+
+ const llvm::MCInst &GetMCInstFromIndex(unsigned Index) const;
+ void printResourcePressurePerIteration(llvm::raw_ostream &OS,
+ unsigned Executions) const;
+ void printResourcePressurePerInstruction(llvm::raw_ostream &OS,
+ unsigned Executions) const;
+ void initialize(const llvm::ArrayRef<uint64_t> ProcResourceMasks);
+
+public:
+ ResourcePressureView(const llvm::MCSubtargetInfo &ST,
+ llvm::MCInstPrinter &Printer, const SourceMgr &SM,
+ const llvm::ArrayRef<uint64_t> ProcResourceMasks)
+ : STI(ST), MCIP(Printer), Source(SM) {
+ initialize(ProcResourceMasks);
+ }
+
+ void onInstructionIssued(
+ unsigned Index,
+ const llvm::ArrayRef<std::pair<ResourceRef, unsigned>> &Used) override;
+
+ void printResourcePressure(llvm::raw_ostream &OS, unsigned Cycles) const {
+ unsigned Executions = Source.getNumIterations();
+ printResourcePressurePerIteration(OS, Executions);
+ printResourcePressurePerInstruction(OS, Executions);
+ }
+};
+
+} // namespace mca
+
+#endif