[ARM] Add ARMISD::VLD1DUP to match vld1_dup more consistently.
Currently, there are substantial problems forming vld1_dup even if the
VDUP survives legalization. The lack of an actual node
leads to terrible results: not only can we not form post-increment vld1_dup
instructions, but we form scalar pre-increment and post-increment
loads which force the loaded value into a GPR. This patch fixes that
by combining the vdup+load into an ARMISD node before DAGCombine
messes it up.
Also includes a crash fix for vld2_dup (see testcase @vld2dupi8_postinc_variable).
Recommiting with fix to avoid forming vld1dup if the type of the load
doesn't match the type of the vdup (see
https://llvm.org/bugs/show_bug.cgi?id=31404).
Differential Revision: https://reviews.llvm.org/D27694
llvm-svn: 289972
diff --git a/llvm/lib/Target/ARM/ARMISelLowering.h b/llvm/lib/Target/ARM/ARMISelLowering.h
index 7583b53..6348edb 100644
--- a/llvm/lib/Target/ARM/ARMISelLowering.h
+++ b/llvm/lib/Target/ARM/ARMISelLowering.h
@@ -190,7 +190,8 @@
MEMCPY,
// Vector load N-element structure to all lanes:
- VLD2DUP = ISD::FIRST_TARGET_MEMORY_OPCODE,
+ VLD1DUP = ISD::FIRST_TARGET_MEMORY_OPCODE,
+ VLD2DUP,
VLD3DUP,
VLD4DUP,
@@ -202,6 +203,7 @@
VLD2LN_UPD,
VLD3LN_UPD,
VLD4LN_UPD,
+ VLD1DUP_UPD,
VLD2DUP_UPD,
VLD3DUP_UPD,
VLD4DUP_UPD,