[PPC64LE] More vector swap optimization TLC
This makes one substantive change and a few stylistic changes to the
VSX swap optimization pass.
The substantive change is to permit LXSDX and LXSSPX instructions to
participate in swap optimization computations. The previous change to
insert a swap following a SUBREG_TO_REG widening operation makes this
almost trivial.
I experimented with also permitting STXSDX and STXSSPX instructions.
This can be done using similar techniques: we could insert a swap
prior to a narrowing COPY operation, and then permit these stores to
participate. I prototyped this, but discovered that the pattern of a
narrowing COPY followed by an STXSDX does not occur in any of our
test-suite code. So instead, I added commentary indicating that this
could be done.
Other TLC:
- I changed SH_COPYSCALAR to SH_COPYWIDEN to more clearly indicate
the direction of the copy.
- I factored the insertion of swap instructions into a separate
function.
Finally, I added a new test case to check that the scalar-to-vector
loads are working properly with swap optimization.
llvm-svn: 242838
diff --git a/llvm/test/CodeGen/PowerPC/swaps-le-6.ll b/llvm/test/CodeGen/PowerPC/swaps-le-6.ll
new file mode 100644
index 0000000..365aeee
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/swaps-le-6.ll
@@ -0,0 +1,44 @@
+; RUN: llc -mcpu=pwr8 -mtriple=powerpc64le-unknown-linux-gnu -O3 < %s | FileCheck %s
+
+; These tests verify that VSX swap optimization works when loading a scalar
+; into a vector register.
+
+
+@x = global <2 x double> <double 9.970000e+01, double -1.032220e+02>, align 16
+@z = global <2 x double> <double 2.332000e+01, double 3.111111e+01>, align 16
+@y = global double 1.780000e+00, align 8
+
+define void @bar0() {
+entry:
+ %0 = load <2 x double>, <2 x double>* @x, align 16
+ %1 = load double, double* @y, align 8
+ %vecins = insertelement <2 x double> %0, double %1, i32 0
+ store <2 x double> %vecins, <2 x double>* @z, align 16
+ ret void
+}
+
+; CHECK-LABEL: @bar0
+; CHECK-DAG: lxvd2x [[REG1:[0-9]+]]
+; CHECK-DAG: lxsdx [[REG2:[0-9]+]]
+; CHECK: xxswapd [[REG3:[0-9]+]], [[REG2]]
+; CHECK: xxspltd [[REG4:[0-9]+]], [[REG3]], 1
+; CHECK: xxpermdi [[REG5:[0-9]+]], [[REG4]], [[REG1]], 1
+; CHECK: stxvd2x [[REG5]]
+
+define void @bar1() {
+entry:
+ %0 = load <2 x double>, <2 x double>* @x, align 16
+ %1 = load double, double* @y, align 8
+ %vecins = insertelement <2 x double> %0, double %1, i32 1
+ store <2 x double> %vecins, <2 x double>* @z, align 16
+ ret void
+}
+
+; CHECK-LABEL: @bar1
+; CHECK-DAG: lxvd2x [[REG1:[0-9]+]]
+; CHECK-DAG: lxsdx [[REG2:[0-9]+]]
+; CHECK: xxswapd [[REG3:[0-9]+]], [[REG2]]
+; CHECK: xxspltd [[REG4:[0-9]+]], [[REG3]], 1
+; CHECK: xxmrghd [[REG5:[0-9]+]], [[REG1]], [[REG4]]
+; CHECK: stxvd2x [[REG5]]
+