[BPF] Enable relocation location for load/store/shifts

Previously, a BTF field relocation was always attached to an
assignment like
   r1 = 4
which is converted from an ld_imm64 instruction.

This patch adds an optimization so that the relocated
instruction may also be a load, store or shift. Specifically,
besides BPF_MOV, the following insns may now carry a relocation:
  LDB, LDH, LDW, LDD, STB, STH, STW, STD,
  LDB32, LDH32, LDW32, STB32, STH32, STW32,
  SLL, SRL, SRA

To accomplish this, a few BPF target-specific, codegen-only
instructions are introduced (CORE_MEM, CORE_ALU32_MEM and
CORE_SHIFT). They are generated in the backend BPF
SimplifyPatchable phase, which runs early in llc while SSA
form is still available. The new codegen-only instructions are
converted to real instructions at the codegen and BTF emission
stage.
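
As a rough illustration of that last conversion step, here is a
toy model of the operand layout the lowering code in this patch
reads: operand 0 is the value register, operand 1 carries the
opcode of the real instruction, operand 2 is the base register,
and operand 3 names the relocation global. The types, the
function name and the opcode constant below are hypothetical
stand-ins, not LLVM classes, and the lookup is simplified (no
AmaAttr check, no immediate form of operand 0).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical opcode value, for illustration only.
const unsigned kHypotheticalLoadWord = 1;

// Toy stand-in for a codegen-only instruction.
struct PseudoInsn {
  unsigned ValueReg;       // operand 0: register loaded into or stored from
  unsigned RealOpcode;     // operand 1: opcode of the real instruction
  unsigned BaseReg;        // operand 2: base address register
  std::string RelocGlobal; // operand 3: name of the relocation global
};

// Toy stand-in for the emitted instruction.
struct RealInsn {
  unsigned Opcode;
  unsigned ValueReg;
  unsigned BaseReg;
  uint32_t Imm;
};

// Replace the global operand with its patched immediate and emit the
// real opcode stored in the pseudo instruction. Returns false when no
// patched immediate is known for the global.
bool lowerPseudo(const PseudoInsn &P,
                 const std::map<std::string, uint32_t> &PatchImms,
                 RealInsn &Out) {
  auto It = PatchImms.find(P.RelocGlobal);
  if (It == PatchImms.end())
    return false;
  Out.Opcode = P.RealOpcode;
  Out.ValueReg = P.ValueReg;
  Out.BaseReg = P.BaseReg;
  Out.Imm = It->second;
  return true;
}
```

Carrying the real opcode as an immediate operand lets a single
pseudo instruction stand in for a whole family of loads and
stores.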

Note that, as revealed by a few tests, this optimization might
actually generate more relocations:
Scenario 1:
  if (...) {
    ... __builtin_preserve_field_info(arg->b2, 0) ...
  } else {
    ... __builtin_preserve_field_info(arg->b2, 0) ...
  }
  The compiler could apply CSE and emit only one relocation. But
  if both of the above are translated into codegen-internal
  instructions, the compiler will not be able to do that.
Scenario 2:
  offset = ... __builtin_preserve_field_info(arg->b2, 0) ...
  ...
  ...  offset ...
  ...  offset ...
  ...  offset ...
  For whatever reason, the compiler might temporarily
  copy-propagate the right-hand side of the "offset" assignment, as in
  ...  __builtin_preserve_field_info(arg->b2, 0) ...
  ...  __builtin_preserve_field_info(arg->b2, 0) ...
  and CSE will be able to deduplicate them later.
  But if these intrinsics are converted to BPF pseudo instructions,
  they can no longer be deduplicated.
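
Both scenarios come down to the same counting argument, which
can be sketched with a deliberately simplified model. It assumes
that each use of the intrinsic is identified by a string key,
that CSE merges every use with an identical key, and that a use
fused into a pseudo instruction is never merged. The function
names are hypothetical and nothing here is LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Uses left as plain values: identical expressions can be merged by
// CSE, so the model counts one relocation per distinct key.
std::size_t relocsWithCSE(const std::vector<std::string> &Uses) {
  return std::set<std::string>(Uses.begin(), Uses.end()).size();
}

// Uses fused into their own pseudo instructions: nothing is merged,
// so the model counts one relocation per use.
std::size_t relocsWhenFused(const std::vector<std::string> &Uses) {
  return Uses.size();
}
```

Under this model, the two identical uses in Scenario 1 count as
one relocation with CSE and as two once fused.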

I do not expect a big difference in instruction count.
The count may actually drop, since the relocation now sits
deeper in the insn dependency chain.
For example, for test offset-reloc-fieldinfo-2.ll, this patch
generates 7 instead of 6 relocations in non-alu32 mode, but it
reduces the instruction count from 29 to 26.

Differential Revision: https://reviews.llvm.org/D71790
diff --git a/llvm/lib/Target/BPF/BTFDebug.cpp b/llvm/lib/Target/BPF/BTFDebug.cpp
index bdc7ce7..86e625b 100644
--- a/llvm/lib/Target/BPF/BTFDebug.cpp
+++ b/llvm/lib/Target/BPF/BTFDebug.cpp
@@ -937,9 +937,8 @@
 }
 
 /// Generate a struct member field relocation.
-void BTFDebug::generateFieldReloc(const MachineInstr *MI,
-                                   const MCSymbol *ORSym, DIType *RootTy,
-                                   StringRef AccessPattern) {
+void BTFDebug::generateFieldReloc(const MCSymbol *ORSym, DIType *RootTy,
+                                  StringRef AccessPattern) {
   unsigned RootId = populateStructType(RootTy);
   size_t FirstDollar = AccessPattern.find_first_of('$');
   size_t FirstColon = AccessPattern.find_first_of(':');
@@ -959,33 +958,8 @@
   FieldRelocTable[SecNameOff].push_back(FieldReloc);
 }
 
-void BTFDebug::processLDimm64(const MachineInstr *MI) {
-  // If the insn is an LD_imm64, the following two cases
-  // will generate an .BTF.ext record.
-  //
-  // If the insn is "r2 = LD_imm64 @__BTF_...",
-  // add this insn into the .BTF.ext FieldReloc subsection.
-  // Relocation looks like:
-  //  . SecName:
-  //    . InstOffset
-  //    . TypeID
-  //    . OffSetNameOff
-  // Later, the insn is replaced with "r2 = <offset>"
-  // where "<offset>" equals to the offset based on current
-  // type definitions.
-  //
-  // If the insn is "r2 = LD_imm64 @VAR" and VAR is
-  // a patchable external global, add this insn into the .BTF.ext
-  // ExternReloc subsection.
-  // Relocation looks like:
-  //  . SecName:
-  //    . InstOffset
-  //    . ExternNameOff
-  // Later, the insn is replaced with "r2 = <value>" or
-  // "LD_imm64 r2, <value>" where "<value>" = 0.
-
+void BTFDebug::processReloc(const MachineOperand &MO) {
   // check whether this is a candidate or not
-  const MachineOperand &MO = MI->getOperand(1);
   if (MO.isGlobal()) {
     const GlobalValue *GVal = MO.getGlobal();
     auto *GVar = dyn_cast<GlobalVariable>(GVal);
@@ -995,7 +969,7 @@
 
       MDNode *MDN = GVar->getMetadata(LLVMContext::MD_preserve_access_index);
       DIType *Ty = dyn_cast<DIType>(MDN);
-      generateFieldReloc(MI, ORSym, Ty, GVar->getName());
+      generateFieldReloc(ORSym, Ty, GVar->getName());
     }
   }
 }
@@ -1020,8 +994,25 @@
       return;
   }
 
-  if (MI->getOpcode() == BPF::LD_imm64)
-    processLDimm64(MI);
+  if (MI->getOpcode() == BPF::LD_imm64) {
+    // If the insn is "r2 = LD_imm64 @<an AmaAttr global>",
+    // add this insn into the .BTF.ext FieldReloc subsection.
+    // Relocation looks like:
+    //  . SecName:
+    //    . InstOffset
+    //    . TypeID
+    //    . OffSetNameOff
+    //    . RelocType
+    // Later, the insn is replaced with "r2 = <offset>"
+    // where "<offset>" equals to the offset based on current
+    // type definitions.
+    processReloc(MI->getOperand(1));
+  } else if (MI->getOpcode() == BPF::CORE_MEM ||
+             MI->getOpcode() == BPF::CORE_ALU32_MEM ||
+             MI->getOpcode() == BPF::CORE_SHIFT) {
+    // relocation insn is a load, store or shift insn.
+    processReloc(MI->getOperand(3));
+  }
 
   // Skip this instruction if no DebugLoc or the DebugLoc
   // is the same as the previous instruction.
@@ -1148,6 +1139,25 @@
         return true;
       }
     }
+  } else if (MI->getOpcode() == BPF::CORE_MEM ||
+             MI->getOpcode() == BPF::CORE_ALU32_MEM ||
+             MI->getOpcode() == BPF::CORE_SHIFT) {
+    const MachineOperand &MO = MI->getOperand(3);
+    if (MO.isGlobal()) {
+      const GlobalValue *GVal = MO.getGlobal();
+      auto *GVar = dyn_cast<GlobalVariable>(GVal);
+      if (GVar && GVar->hasAttribute(BPFCoreSharedInfo::AmaAttr)) {
+        uint32_t Imm = PatchImms[GVar->getName().str()];
+        OutMI.setOpcode(MI->getOperand(1).getImm());
+        if (MI->getOperand(0).isImm())
+          OutMI.addOperand(MCOperand::createImm(MI->getOperand(0).getImm()));
+        else
+          OutMI.addOperand(MCOperand::createReg(MI->getOperand(0).getReg()));
+        OutMI.addOperand(MCOperand::createReg(MI->getOperand(2).getReg()));
+        OutMI.addOperand(MCOperand::createImm(Imm));
+        return true;
+      }
+    }
   }
   return false;
 }