[x86] Teach the AVX1 path of the new vector shuffle lowering one more
trick that I missed.
VPERMILPS has a non-immediate memory operand mode that allows it to do
asymmetric shuffles in the two 128-bit lanes. Use this rather than two
shuffles and a blend.
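
To illustrate the lane behavior described above, here is a minimal sketch of
how a variable VPERMILPS control vector maps to a shuffle mask. It assumes the
documented semantics that only the low two bits of each 32-bit control element
are used and that each element selects within its own 128-bit lane. The helper
name decodeVariablePermilps is hypothetical; this is not the DecodeVPERMILPMask
routine touched by this patch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only (not LLVM code).
// Given one 32-bit control element per destination float, return the
// source element index that each destination element reads from.
std::vector<int> decodeVariablePermilps(const std::vector<uint32_t> &control) {
  std::vector<int> mask;
  mask.reserve(control.size());
  for (std::size_t i = 0; i < control.size(); ++i) {
    // Each 128-bit lane holds four floats; selection never crosses lanes.
    int laneBase = static_cast<int>(i / 4) * 4;
    // Only the low two bits of the control element pick the source float.
    mask.push_back(laneBase + static_cast<int>(control[i] & 0x3));
  }
  return mask;
}
```

With the control vector {3, 2, 1, 0, 0, 0, 1, 1}, the low lane is reversed
while the high lane uses a different pattern, which the immediate form cannot
express because it applies the same four selectors to both lanes.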
However, it turns out the variable shuffle path to VPERMILPS (and
VPERMILPD, although that one offers no functional difference from the
immediate operand other than variability) wasn't even plumbed through
codegen. Add that plumbing so that we can reasonably emit
a variable-masked VPERMILP instruction. Also plumb basic comment parsing
and printing through so that the tests are reasonable.
There are still a few tests which don't show the shuffle pattern. These
are tests with undef lanes. I'll teach the shuffle decoding and printing
to handle undef mask entries in a follow-up. I've checked the masks in
those tests by hand and they seem reasonable.
llvm-svn: 218300
diff --git a/llvm/lib/Target/X86/X86MCInstLower.cpp b/llvm/lib/Target/X86/X86MCInstLower.cpp
index ded84fc..5665a01 100644
--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -1022,15 +1022,19 @@
case X86::PSHUFBrm:
case X86::VPSHUFBrm:
- // Lower PSHUFB normally but add a comment if we can find a constant
- // shuffle mask. We won't be able to do this at the MC layer because the
- // mask isn't an immediate.
+ case X86::VPERMILPSrm:
+ case X86::VPERMILPDrm:
+ case X86::VPERMILPSYrm:
+ case X86::VPERMILPDYrm:
+ // Lower PSHUFB and VPERMILP normally but add a comment if we can find
+ // a constant shuffle mask. We won't be able to do this at the MC layer
+ // because the mask isn't an immediate.
std::string Comment;
raw_string_ostream CS(Comment);
SmallVector<int, 16> Mask;
- assert(MI->getNumOperands() >= 6 &&
- "Wrong number of operands for PSHUFBrm or VPSHUFBrm");
+ // All of these instructions accept a constant pool operand as their fifth.
+ assert(MI->getNumOperands() > 5 && "We should always have at least 5 operands!");
const MachineOperand &DstOp = MI->getOperand(0);
const MachineOperand &SrcOp = MI->getOperand(1);
const MachineOperand &MaskOp = MI->getOperand(5);
@@ -1061,7 +1065,18 @@
assert(MaskTy == C->getType() &&
"Expected a constant of the same type!");
- DecodePSHUFBMask(C, Mask);
+ switch (MI->getOpcode()) {
+ case X86::PSHUFBrm:
+ case X86::VPSHUFBrm:
+ DecodePSHUFBMask(C, Mask);
+ break;
+ case X86::VPERMILPSrm:
+ case X86::VPERMILPDrm:
+ case X86::VPERMILPSYrm:
+ case X86::VPERMILPDYrm:
+ DecodeVPERMILPMask(C, Mask);
+ }
+
assert(Mask.size() == MaskTy->getVectorNumElements() &&
"Shuffle mask has a different size than its type!");
}