More compile-time tuning
Another round of compile-time tuning, this time yielding roughly a
3% reduction in total compile time (which means about double that
for the Quick Compile portion).
The primary improvements are: skipping the basic block combine
optimization pass when using Quick (because we already have big
blocks), combining the null check elimination and type inference
passes, and limiting the expensive local value numbering analysis to
only those blocks which might benefit from it.
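
To illustrate the last point, here is a minimal sketch of gating an expensive per-block analysis behind a cheap pre-scan. It is not the code in this CL: the `Insn`, `Block`, and `LvnStats` types, the `use_lvn` flag, and the "block contains a memory access" heuristic are all assumptions made for the example, and the value numbering is a toy that assumes operands are SSA names.

```cpp
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

// Hypothetical, simplified IR for illustration only (not ART's MIR types).
enum class Op { kMove, kAdd, kArrayGet, kFieldGet };

struct Insn {
  Op op;
  int src1;  // Assumed to be SSA names, so they are never redefined.
  int src2;
};

struct Block {
  std::vector<Insn> insns;
  bool use_lvn = false;  // Hypothetical flag: block may benefit from LVN.
};

// Cheap pre-scan. Assumption: only blocks with at least one memory access
// are worth the cost of local value numbering.
void MarkLvnCandidates(std::vector<Block>& blocks) {
  for (Block& bb : blocks) {
    for (const Insn& insn : bb.insns) {
      if (insn.op == Op::kArrayGet || insn.op == Op::kFieldGet) {
        bb.use_lvn = true;
        break;
      }
    }
  }
}

// Toy local value numbering: counts instructions whose (op, src1, src2)
// triple has already been seen earlier in the same block.
std::size_t CountRedundant(const Block& bb) {
  std::map<std::tuple<Op, int, int>, int> value_numbers;
  std::size_t redundant = 0;
  for (const Insn& insn : bb.insns) {
    auto key = std::make_tuple(insn.op, insn.src1, insn.src2);
    int next_number = static_cast<int>(value_numbers.size());
    if (!value_numbers.emplace(key, next_number).second) {
      ++redundant;  // Same expression already has a value number.
    }
  }
  return redundant;
}

struct LvnStats {
  std::size_t blocks_analyzed = 0;
  std::size_t redundant = 0;
};

// Runs the expensive analysis only on blocks flagged by the pre-scan.
LvnStats RunGatedLvn(std::vector<Block>& blocks) {
  MarkLvnCandidates(blocks);
  LvnStats stats;
  for (const Block& bb : blocks) {
    if (!bb.use_lvn) {
      continue;  // Skip blocks that cannot benefit under our assumption.
    }
    ++stats.blocks_analyzed;
    stats.redundant += CountRedundant(bb);
  }
  return stats;
}
```

The trade-off in a scheme like this is that unflagged blocks lose whatever the analysis might have found in them, so the pre-scan heuristic has to be conservative about what "might benefit" means.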
Following this CL, the actual compile phase consumes roughly 60%
of the total dex2oat time on the host, and 55% on the target. (Note:
I'm subtracting out the Dedupe time here, which the timing logger
normally counts against the compiler.)
A sample breakdown of the compilation time follows (taken on
PlusOne.apk with a Nexus 4):
39.00% -> MIR2LIR: 1374.90 (Note: includes local optimization & scheduling)
10.25% -> MIROpt:SSATransform: 361.31
8.45% -> BuildMIRGraph: 297.80
7.55% -> Assemble: 266.16
6.87% -> MIROpt:NCE_TypeInference: 242.22
5.56% -> Dedupe: 196.15
3.45% -> MIROpt:BBOpt: 121.53
3.20% -> RegisterAllocation: 112.69
3.00% -> PcMappingTable: 105.65
2.90% -> GcMap: 102.22
2.68% -> Launchpads: 94.50
1.16% -> MIROpt:InitRegLoc: 40.94
1.16% -> Cleanup: 40.93
1.10% -> MIROpt:CodeLayout: 38.80
0.97% -> MIROpt:ConstantProp: 34.35
0.96% -> MIROpt:UseCount: 33.75
0.86% -> MIROpt:CheckFilters: 30.28
0.44% -> SpecialMIR2LIR: 15.53
0.44% -> MIROpt:BBCombine: 15.41
(cherry pick of 9e8e234af4430abe8d144414e272cd72d215b5f3)
Change-Id: I86c665fa7e88b75eb75629a99fd292ff8c449969
diff --git a/compiler/dex/frontend.cc b/compiler/dex/frontend.cc
index e53d636..197bba5 100644
--- a/compiler/dex/frontend.cc
+++ b/compiler/dex/frontend.cc
@@ -253,15 +253,15 @@
cu.mir_graph->InlineMethod(code_item, access_flags, invoke_type, class_def_idx, method_idx,
class_loader, dex_file);
+ cu.NewTimingSplit("MIROpt:CheckFilters");
#if !defined(ART_USE_PORTABLE_COMPILER)
if (cu.mir_graph->SkipCompilation(Runtime::Current()->GetCompilerFilter())) {
return NULL;
}
#endif
- cu.NewTimingSplit("MIROpt:CodeLayout");
-
/* Do a code layout pass */
+ cu.NewTimingSplit("MIROpt:CodeLayout");
cu.mir_graph->CodeLayout();
/* Perform SSA transformation for the whole method */
@@ -272,18 +272,23 @@
cu.NewTimingSplit("MIROpt:ConstantProp");
cu.mir_graph->PropagateConstants();
+ cu.NewTimingSplit("MIROpt:InitRegLoc");
+ cu.mir_graph->InitRegLocations();
+
/* Count uses */
+ cu.NewTimingSplit("MIROpt:UseCount");
cu.mir_graph->MethodUseCount();
- /* Perform null check elimination */
- cu.NewTimingSplit("MIROpt:NullCheckElimination");
- cu.mir_graph->NullCheckElimination();
+ /* Perform null check elimination and type inference*/
+ cu.NewTimingSplit("MIROpt:NCE_TypeInference");
+ cu.mir_graph->NullCheckEliminationAndTypeInference();
/* Combine basic blocks where possible */
- cu.NewTimingSplit("MIROpt:BBOpt");
+ cu.NewTimingSplit("MIROpt:BBCombine");
cu.mir_graph->BasicBlockCombine();
/* Do some basic block optimizations */
+ cu.NewTimingSplit("MIROpt:BBOpt");
cu.mir_graph->BasicBlockOptimization();
if (cu.enable_debug & (1 << kDebugDumpCheckStats)) {
@@ -294,8 +299,8 @@
cu.mir_graph->ShowOpcodeStats();
}
- /* Set up regLocation[] array to describe values - one for each ssa_name. */
- cu.mir_graph->BuildRegLocations();
+ /* Reassociate sreg names with original Dalvik vreg names. */
+ cu.mir_graph->RemapRegLocations();
CompiledMethod* result = NULL;
@@ -323,8 +328,9 @@
cu.cg->Materialize();
- cu.NewTimingSplit("Cleanup");
+ cu.NewTimingSplit("Dedupe"); /* deduping takes up the vast majority of time in GetCompiledMethod(). */
result = cu.cg->GetCompiledMethod();
+ cu.NewTimingSplit("Cleanup");
if (result) {
VLOG(compiler) << "Compiled " << PrettyMethod(method_idx, dex_file);