[LoopVectorize] Don't preserve nsw/nuw flags on shrunken ops.
If we're shrinking a binary operation, it may be the case that the new
operations wraps where the old didn't. If this happens, the behavior
should be well-defined. So, we can't always carry wrapping flags with us
when we shrink operations.
If we do, we get incorrect optimizations in cases like:
void foo(const unsigned char *from, unsigned char *to, int n) {
for (int i = 0; i < n; i++)
to[i] = from[i] - 128;
}
which gets optimized to:
void foo(const unsigned char *from, unsigned char *to, int n) {
for (int i = 0; i < n; i++)
to[i] = from[i] | 128;
}
Because:
- InstCombine turned `sub i32 %from.i, 128` into
`add nuw nsw i32 %from.i, 128`.
- LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a
vector full of `i8 128`s
- InstCombine took advantage of the fact that the newly-shrunken add
"couldn't wrap", and changed the `add` to an `or`.
InstCombine seems happy to figure out whether we can add nuw/nsw on its
own, so I just decided to drop the flags. There are already a number of
places in LoopVectorize where we rely on InstCombine to clean up.
llvm-svn: 305053
diff --git a/llvm/lib/IR/Instruction.cpp b/llvm/lib/IR/Instruction.cpp
index 0006610..3dd653d 100644
--- a/llvm/lib/IR/Instruction.cpp
+++ b/llvm/lib/IR/Instruction.cpp
@@ -216,10 +216,10 @@
copyFastMathFlags(I->getFastMathFlags());
}
-void Instruction::copyIRFlags(const Value *V) {
+void Instruction::copyIRFlags(const Value *V, bool IncludeWrapFlags) {
// Copy the wrapping flags.
- if (auto *OB = dyn_cast<OverflowingBinaryOperator>(V)) {
- if (isa<OverflowingBinaryOperator>(this)) {
+ if (IncludeWrapFlags && isa<OverflowingBinaryOperator>(this)) {
+ if (auto *OB = dyn_cast<OverflowingBinaryOperator>(V)) {
setHasNoSignedWrap(OB->hasNoSignedWrap());
setHasNoUnsignedWrap(OB->hasNoUnsignedWrap());
}