[SimplifyCFG] don't sink common insts too soon (PR34603)
This should fix PR34603:
https://bugs.llvm.org/show_bug.cgi?id=34603
...by preventing SimplifyCFG from sinking (and thereby altering) redundant instructions before early-cse has a chance to run.
This changes the default (canonical-forming) behavior of SimplifyCFG, so the sinking transform
now runs only later in the optimization pipeline.
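
For readers unfamiliar with the transform, here is a rough source-level analogy of what sinking common instructions does. It is only a sketch: the real transform operates on LLVM IR, and the function names below are hypothetical and do not come from the bug report or the patch.

```cpp
// Hypothetical source-level analogy; the real transform operates on LLVM IR.

// Before sinking: each arm of the branch ends with the same operation
// (a multiply by b), applied to a different operand.
int beforeSinking(bool c, int a, int b) {
  int r;
  if (c)
    r = (a + 1) * b;
  else
    r = (a + 2) * b;
  return r;
}

// After sinking: the common multiply appears once in the join block, and the
// differing operand is selected first (a phi or select in IR terms).
int afterSinking(bool c, int a, int b) {
  int t = c ? (a + 1) : (a + 2);
  return t * b;
}
```

Both functions compute the same result. In the sunk form the multiply no longer has the same operands as any identical expression computed earlier, which is the kind of rewrite that can hide a redundancy from early-cse if it happens too soon.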
Differential Revision: https://reviews.llvm.org/D38566
llvm-svn: 320749
diff --git a/llvm/lib/Passes/PassBuilder.cpp b/llvm/lib/Passes/PassBuilder.cpp
index 56eba69..d33c4df 100644
--- a/llvm/lib/Passes/PassBuilder.cpp
+++ b/llvm/lib/Passes/PassBuilder.cpp
@@ -747,21 +747,24 @@
// Cleanup after the loop optimization passes.
OptimizePM.addPass(InstCombinePass());
-
// Now that we've formed fast to execute loop structures, we do further
// optimizations. These are run afterward as they might block doing complex
// analyses and transforms such as what are needed for loop vectorization.
+ // Cleanup after loop vectorization, etc. Simplification passes like CVP and
+ // GVN, loop transforms, and others have already run, so it's now better to
+ // convert to more optimized IR using more aggressive simplify CFG options.
+ // The extra sinking transform can create larger basic blocks, so do this
+ // before SLP vectorization.
+ OptimizePM.addPass(SimplifyCFGPass(SimplifyCFGOptions().
+ forwardSwitchCondToPhi(true).
+ convertSwitchToLookupTable(true).
+ needCanonicalLoops(false).
+ sinkCommonInsts(true)));
+
// Optimize parallel scalar instruction chains into SIMD instructions.
OptimizePM.addPass(SLPVectorizerPass());
- // Cleanup after all of the vectorizers. Simplification passes like CVP and
- // GVN, loop transforms, and others have already run, so it's now better to
- // convert to more optimized IR using more aggressive simplify CFG options.
- OptimizePM.addPass(SimplifyCFGPass(SimplifyCFGOptions().
- forwardSwitchCondToPhi(true).
- convertSwitchToLookupTable(true).
- needCanonicalLoops(false)));
OptimizePM.addPass(InstCombinePass());
// Unroll small loops to hide loop backedge latency and saturate any parallel