Move early tail duplication earlier.

This fixes the issue noted in PR10251, where early tail duplication of
basic blocks with indirectbr would cause a block to be duplicated into a
loop preheader and then into its predecessors, creating phi nodes with
identical operands just before register allocation.

This reduces the size of jsinterp.o (__TEXT goes from 163568 to 126656)
and helps performance slightly: 1.005x faster on sunspider (JITs still
enabled). The result on webkit with the JIT disabled is more
significant: 1.021x faster.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134372 91177308-0d34-0410-b5e6-96231b3b80d8
diff --git a/lib/CodeGen/LLVMTargetMachine.cpp b/lib/CodeGen/LLVMTargetMachine.cpp
index b98fbed..0255b28 100644
--- a/lib/CodeGen/LLVMTargetMachine.cpp
+++ b/lib/CodeGen/LLVMTargetMachine.cpp
@@ -388,6 +388,12 @@
   // Expand pseudo-instructions emitted by ISel.
   PM.add(createExpandISelPseudosPass());
 
+  // Pre-ra tail duplication.
+  if (OptLevel != CodeGenOpt::None && !DisableEarlyTailDup) {
+    PM.add(createTailDuplicatePass(true));
+    printAndVerify(PM, "After Pre-RegAlloc TailDuplicate");
+  }
+
   // Optimize PHIs before DCE: removing dead PHI cycles may make more
   // instructions dead.
   if (OptLevel != CodeGenOpt::None)
@@ -416,12 +422,6 @@
     printAndVerify(PM, "After codegen peephole optimization pass");
   }
 
-  // Pre-ra tail duplication.
-  if (OptLevel != CodeGenOpt::None && !DisableEarlyTailDup) {
-    PM.add(createTailDuplicatePass(true));
-    printAndVerify(PM, "After Pre-RegAlloc TailDuplicate");
-  }
-
   // Run pre-ra passes.
   if (addPreRegAlloc(PM, OptLevel))
     printAndVerify(PM, "After PreRegAlloc passes");