[ARM] DLS/LE low-overhead loop code generation

Introduce three pseudo instructions to be used during DAG ISel to
represent v8.1-m low-overhead loops. One maps to set_loop_iterations
while loop_decrement_reg is lowered to two, so that we can separate
the decrement and branching operations. The pseudo instructions are
expanded pre-emission, where we can still decide whether we actually
want to generate a low-overhead loop, in a new pass:
ARMLowOverheadLoops. The pass currently bails, reverting to an sub,
icmp and br, in the cases where a call or stack spill/restore happens
between the decrement and branching instructions, or if the loop is
too large.

Differential Revision: https://reviews.llvm.org/D63476

llvm-svn: 364288
diff --git a/llvm/lib/Target/ARM/ARM.h b/llvm/lib/Target/ARM/ARM.h
index f8dd298..bf8ed65 100644
--- a/llvm/lib/Target/ARM/ARM.h
+++ b/llvm/lib/Target/ARM/ARM.h
@@ -35,7 +35,7 @@
 class MCInst;
 class PassRegistry;
 
-
+FunctionPass *createARMLowOverheadLoopsPass();
 Pass *createARMParallelDSPPass();
 FunctionPass *createARMISelDag(ARMBaseTargetMachine &TM,
                                CodeGenOpt::Level OptLevel);
@@ -66,6 +66,7 @@
 void initializeThumb2SizeReducePass(PassRegistry &);
 void initializeThumb2ITBlockPass(PassRegistry &);
 void initializeMVEVPTBlockPass(PassRegistry &);
+void initializeARMLowOverheadLoopsPass(PassRegistry &);
 
 } // end namespace llvm