[llvm-mca][BtVer2] Teach how to identify dependency-breaking idioms.

This patch teaches llvm-mca how to identify dependency-breaking instructions on
btver2.

An example of a dependency-breaking instruction is the zero-idiom XOR (e.g.
`XOR %eax, %eax`), which always generates zero regardless of the actual value of
the input register operands.
Dependency-breaking instructions don't have to wait on their input register
operands before executing, because the result does not depend on the value of
the inputs.

Not all dependency-breaking idioms are also zero-latency instructions. For
example, `PCMPEQD %xmm1, %xmm1` is independent of the value of XMM1, and it
always generates a vector of all-ones.
That instruction is not eliminated at the register renaming stage, and its
opcode is still issued to a pipeline for execution. So, its latency is not zero.
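
The effect on timing can be illustrated with a small, self-contained sketch.
This is not llvm-mca code: `ToyInst` and `computeStartCycles` are hypothetical
names, and the model assumes unlimited execution resources so that only register
dependencies matter.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

// Hypothetical toy instruction; not an llvm-mca data structure.
struct ToyInst {
  int Dest;                // destination register index
  std::vector<int> Srcs;   // source register indices
  unsigned Latency;        // cycles from start until Dest is ready
  bool DependencyBreaking; // result does not depend on the values of Srcs
};

// Returns the cycle at which each instruction can start executing.
// Assumption: unlimited resources, so only register readiness limits the start.
std::vector<unsigned> computeStartCycles(const std::vector<ToyInst> &Code) {
  constexpr int NumRegs = 16;
  std::array<unsigned, NumRegs> ReadyAt{}; // cycle at which each register is ready
  std::vector<unsigned> Start;
  for (const ToyInst &I : Code) {
    assert(I.Dest >= 0 && I.Dest < NumRegs && "register index out of range");
    unsigned Cycle = 0;
    // A dependency-breaking instruction does not wait on its register inputs.
    if (!I.DependencyBreaking)
      for (int R : I.Srcs)
        Cycle = std::max(Cycle, ReadyAt.at(R));
    Start.push_back(Cycle);
    // The result is still produced by a pipeline, so the latency is paid.
    ReadyAt[I.Dest] = Cycle + I.Latency;
  }
  return Start;
}
```

For instance, if a 3-cycle multiply writes register 0 and is followed by a
1-cycle dependency-breaking compare of register 0 with itself, the compare
starts at cycle 0 in this model, and a later reader of register 0 starts at
cycle 1 rather than cycle 4.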

This patch adds a new method named isDependencyBreaking() to the MCInstrAnalysis
interface. That method takes as input an MCSubtargetInfo and an instruction
(i.e. an MCInst).
The default implementation of isDependencyBreaking() conservatively returns
false for all instructions. Targets may override the default behavior for
specific CPUs, and return a value which better matches the subtarget behavior.
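
The shape of this hook can be sketched without any LLVM headers. The types
below (`SketchInst`, `SketchSubtarget`, `SketchInstrAnalysis`) are hypothetical
stand-ins for MCInst, MCSubtargetInfo and MCInstrAnalysis, and the two opcodes
are invented for illustration; only the CPU name "btver2" comes from this patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; the real LLVM classes have a richer interface.
enum SketchOpcode : unsigned { SK_ADD, SK_XOR };

struct SketchInst {
  unsigned Opcode;
  unsigned Regs[3]; // register operands: destination, then the two sources
};

struct SketchSubtarget {
  std::string CPU;
};

class SketchInstrAnalysis {
public:
  virtual ~SketchInstrAnalysis() = default;
  // Conservative default: no instruction is assumed to break dependencies.
  virtual bool isDependencyBreaking(const SketchSubtarget &,
                                    const SketchInst &) const {
    return false;
  }
};

class SketchX86Analysis : public SketchInstrAnalysis {
public:
  // Target override: only answers "true" for a CPU it knows about.
  bool isDependencyBreaking(const SketchSubtarget &STI,
                            const SketchInst &Inst) const override {
    if (STI.CPU != "btver2")
      return false;
    switch (Inst.Opcode) {
    default:
      return false;
    case SK_XOR:
      // Idiom only when both source operands name the same register.
      return Inst.Regs[1] == Inst.Regs[2];
    }
  }
};
```

A caller holding a reference to the base class gets the conservative answer
unless the target override recognizes both the CPU and the idiom.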

In the future, we should teach TableGen how to automatically generate the body
of isDependencyBreaking() from scheduling predicate definitions. This would
allow us to expose the knowledge about dependency-breaking instructions to the
machine schedulers (and, potentially, other codegen passes).

Differential Revision: https://reviews.llvm.org/D49310

llvm-svn: 338372
diff --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
index d030f26..f1d15e6 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
@@ -307,10 +307,84 @@
 public:
   X86MCInstrAnalysis(const MCInstrInfo *MCII) : MCInstrAnalysis(MCII) {}
 
+  bool isDependencyBreaking(const MCSubtargetInfo &STI,
+                            const MCInst &Inst) const override;
   bool clearsSuperRegisters(const MCRegisterInfo &MRI, const MCInst &Inst,
                             APInt &Mask) const override;
 };
 
+bool X86MCInstrAnalysis::isDependencyBreaking(const MCSubtargetInfo &STI,
+                                              const MCInst &Inst) const {
+  if (STI.getCPU() == "btver2") {
+    // Reference: Agner Fog's microarchitecture.pdf - Section 20 "AMD Bobcat and
+    // Jaguar pipeline", subsection 8 "Dependency-breaking instructions".
+    switch (Inst.getOpcode()) {
+    default:
+      return false;
+    case X86::SUB32rr:
+    case X86::SUB64rr:
+    case X86::SBB32rr:
+    case X86::SBB64rr:
+    case X86::XOR32rr:
+    case X86::XOR64rr:
+    case X86::XORPSrr:
+    case X86::XORPDrr:
+    case X86::VXORPSrr:
+    case X86::VXORPDrr:
+    case X86::ANDNPSrr:
+    case X86::VANDNPSrr:
+    case X86::ANDNPDrr:
+    case X86::VANDNPDrr:
+    case X86::PXORrr:
+    case X86::VPXORrr:
+    case X86::PANDNrr:
+    case X86::VPANDNrr:
+    case X86::PSUBBrr:
+    case X86::PSUBWrr:
+    case X86::PSUBDrr:
+    case X86::PSUBQrr:
+    case X86::VPSUBBrr:
+    case X86::VPSUBWrr:
+    case X86::VPSUBDrr:
+    case X86::VPSUBQrr:
+    case X86::PCMPEQBrr:
+    case X86::PCMPEQWrr:
+    case X86::PCMPEQDrr:
+    case X86::PCMPEQQrr:
+    case X86::VPCMPEQBrr:
+    case X86::VPCMPEQWrr:
+    case X86::VPCMPEQDrr:
+    case X86::VPCMPEQQrr:
+    case X86::PCMPGTBrr:
+    case X86::PCMPGTWrr:
+    case X86::PCMPGTDrr:
+    case X86::PCMPGTQrr:
+    case X86::VPCMPGTBrr:
+    case X86::VPCMPGTWrr:
+    case X86::VPCMPGTDrr:
+    case X86::VPCMPGTQrr:
+    case X86::MMX_PXORirr:
+    case X86::MMX_PANDNirr:
+    case X86::MMX_PSUBBirr:
+    case X86::MMX_PSUBDirr:
+    case X86::MMX_PSUBQirr:
+    case X86::MMX_PSUBWirr:
+    case X86::MMX_PCMPGTBirr:
+    case X86::MMX_PCMPGTDirr:
+    case X86::MMX_PCMPGTWirr:
+    case X86::MMX_PCMPEQBirr:
+    case X86::MMX_PCMPEQDirr:
+    case X86::MMX_PCMPEQWirr:
+      return Inst.getOperand(1).getReg() == Inst.getOperand(2).getReg();
+    case X86::CMP32rr:
+    case X86::CMP64rr:
+      return Inst.getOperand(0).getReg() == Inst.getOperand(1).getReg();
+    }
+  }
+
+  return false;
+}
+
 bool X86MCInstrAnalysis::clearsSuperRegisters(const MCRegisterInfo &MRI,
                                               const MCInst &Inst,
                                               APInt &Mask) const {