[x86] avoid code explosion from LoopVectorizer for gather loop (PR27826)
By making pointer extraction from a vector more expensive in the cost model,
we avoid the vectorization of a loop that is very likely to be memory-bound:
https://llvm.org/bugs/show_bug.cgi?id=27826
There are still bugs related to this, so we may need a more general solution
to avoid vectorizing obviously memory-bound loops when we don't have HW gather
support.
Differential Revision: http://reviews.llvm.org/D20601
llvm-svn: 270729
diff --git a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
index 508fbe0..1baa49c 100644
--- a/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
+++ b/llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -963,6 +963,8 @@
int X86TTIImpl::getVectorInstrCost(unsigned Opcode, Type *Val, unsigned Index) {
assert(Val->isVectorTy() && "This must be a vector type");
+ Type *ScalarType = Val->getScalarType();
+
if (Index != -1U) {
// Legalize the type.
std::pair<int, MVT> LT = TLI->getTypeLegalizationCost(DL, Val);
@@ -976,11 +978,17 @@
Index = Index % Width;
// Floating point scalars are already located in index #0.
- if (Val->getScalarType()->isFloatingPointTy() && Index == 0)
+ if (ScalarType->isFloatingPointTy() && Index == 0)
return 0;
}
- return BaseT::getVectorInstrCost(Opcode, Val, Index);
+ // Add to the base cost if we know that the extracted element of a vector is
+ // destined to be moved to and used in the integer register file.
+ int RegisterFileMoveCost = 0;
+ if (Opcode == Instruction::ExtractElement && ScalarType->isPointerTy())
+ RegisterFileMoveCost = 1;
+
+ return BaseT::getVectorInstrCost(Opcode, Val, Index) + RegisterFileMoveCost;
}
int X86TTIImpl::getScalarizationOverhead(Type *Ty, bool Insert, bool Extract) {