ARM FastISel integer sext/zext improvements

My recent ARM FastISel patch exposed this bug:
  http://llvm.org/bugs/show_bug.cgi?id=16178
The root cause is that it can't select integer sext/zext pre-ARMv6 and
asserts out.

The current integer sext/zext code doesn't handle other cases gracefully
either, so this patch makes it handle all sext and zext from i1/i8/i16
to i8/i16/i32, with and without ARMv6, both in Thumb and ARM mode. This
should fix the bug as well as make FastISel faster because it bails to
SelectionDAG less often. See fastisel-ext.patch for this.

fastisel-ext-tests.patch changes current tests to always use reg-imm AND
for 8-bit zext instead of UXTB. This simplifies code since it is
supported on ARMv4t and later, and at least on A15 both should perform
exactly the same (both have exec 1 uop 1, type I).

2013-05-31-char-shift-crash.ll is a bitcode version of the above bug
16178 repro.

fast-isel-ext.ll tests all sext/zext combinations that ARM FastISel
should now handle.

Note that my ARM FastISel enabling patch was reverted due to a separate
failure when dealing with MCJIT, I'll fix this second failure and then
turn FastISel on again for non-iOS ARM targets.

I've tested "make check-all" on my x86 box, and "lnt test-suite" on A15
hardware.

llvm-svn: 183551
diff --git a/llvm/test/CodeGen/ARM/2013-05-31-char-shift-crash.ll b/llvm/test/CodeGen/ARM/2013-05-31-char-shift-crash.ll
new file mode 100644
index 0000000..0130f7a
--- /dev/null
+++ b/llvm/test/CodeGen/ARM/2013-05-31-char-shift-crash.ll
@@ -0,0 +1,21 @@
+; RUN: llc < %s -O0 -mtriple=armv4t--linux-eabi-android
+; RUN: llc < %s -O0 -mtriple=armv4t-unknown-linux
+; RUN: llc < %s -O0 -mtriple=armv5-unknown-linux
+
+; See http://llvm.org/bugs/show_bug.cgi?id=16178
+; ARMFastISel used to fail emitting sext/zext in pre-ARMv6.
+
+; Function Attrs: nounwind
+define arm_aapcscc void @f2(i8 signext %a) #0 {
+entry:
+  %a.addr = alloca i8, align 1
+  store i8 %a, i8* %a.addr, align 1
+  %0 = load i8* %a.addr, align 1
+  %conv = sext i8 %0 to i32
+  %shr = ashr i32 %conv, 56
+  %conv1 = trunc i32 %shr to i8
+  call arm_aapcscc void @f1(i8 signext %conv1)
+  ret void
+}
+
+declare arm_aapcscc void @f1(i8 signext) #1