Change Thumb2 jumptable codegen to one that uses two level jumps:

Before:
      adr r12, #LJTI3_0_0
      ldr pc, [r12, +r0, lsl #2]
LJTI3_0_0:
      .long    LBB3_24
      .long    LBB3_30
      .long    LBB3_31
      .long    LBB3_32

After:
      adr r12, #LJTI3_0_0
      add pc, r12, +r0, lsl #2
LJTI3_0_0:
      b.w    LBB3_24
      b.w    LBB3_30
      b.w    LBB3_31
      b.w    LBB3_32

This has several advantages.
1. This will make it easier to optimize this to a TBB / TBH instruction +
   (smaller) table.
2. This eliminate the need for ugly asm printer hack to force the address
   into thumb addresses (bit 0 is one).
3. Same codegen for pic and non-pic.
4. This eliminate the need to align the table so constantpool island pass
   won't have to over-estimate the size.

Based on my calculation, the later is probably slightly faster as well since
ldr pc with shifter address is very slow. That is, it should be a win as long
as the HW implementation can do a reasonable job of branch predict the second
branch.

llvm-svn: 77024
diff --git a/llvm/lib/Target/ARM/ARMInstrInfo.td b/llvm/lib/Target/ARM/ARMInstrInfo.td
index 611c42f..b4fb8a7 100644
--- a/llvm/lib/Target/ARM/ARMInstrInfo.td
+++ b/llvm/lib/Target/ARM/ARMInstrInfo.td
@@ -33,6 +33,9 @@
 def SDT_ARMBrJT    : SDTypeProfile<0, 3,
                                   [SDTCisPtrTy<0>, SDTCisVT<1, i32>,
                                    SDTCisVT<2, i32>]>;
+def SDT_ARMBr2JT   : SDTypeProfile<0, 4,
+                                  [SDTCisPtrTy<0>, SDTCisVT<1, i32>,
+                                   SDTCisVT<2, i32>, SDTCisVT<3, i32>]>;
 
 def SDT_ARMCmp     : SDTypeProfile<0, 2, [SDTCisSameAs<0, 1>]>;
 
@@ -72,6 +75,9 @@
 def ARMbrjt          : SDNode<"ARMISD::BR_JT", SDT_ARMBrJT,
                               [SDNPHasChain]>;
 
+def ARMbr2jt         : SDNode<"ARMISD::BR2_JT", SDT_ARMBr2JT,
+                              [SDNPHasChain]>;
+
 def ARMcmp           : SDNode<"ARMISD::CMP", SDT_ARMCmp,
                               [SDNPOutFlag]>;
 
@@ -205,6 +211,9 @@
 def jtblock_operand : Operand<i32> {
   let PrintMethod = "printJTBlockOperand";
 }
+def jt2block_operand : Operand<i32> {
+  let PrintMethod = "printJT2BlockOperand";
+}
 
 // Local PC labels.
 def pclabel : Operand<i32> {