Compiler: replace DOM traversal computation

The old trace JIT used a few recursive graph-walking algorithms,
which was perfectly reasonable given that the graph size was capped
at a few dozen nodes.  These were replaced with iterative walk-order
computations, or at least I thought they all were.  I missed one of
them, which caused a stack overflow when compiling a pathologically
large method.
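
For illustration only, here is a minimal sketch of the kind of rewrite
described above: a post-order walk over a dominator tree that uses an
explicit work stack in place of recursion.  The DomTree alias and
PostOrderDomTraversal are hypothetical names chosen for this sketch;
they are not the MIRGraph code touched by this change.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dominator tree: children[i] lists the blocks
// immediately dominated by block i. Not the real MIRGraph representation.
using DomTree = std::vector<std::vector<std::size_t>>;

// Post-order walk using an explicit work stack instead of recursion, so the
// traversal depth is limited by heap memory rather than thread stack size.
std::vector<std::size_t> PostOrderDomTraversal(const DomTree& children,
                                               std::size_t root) {
  assert(root < children.size());
  std::vector<std::size_t> order;
  order.reserve(children.size());
  // Each entry is (block id, index of the next child to visit).
  std::vector<std::pair<std::size_t, std::size_t>> work_stack;
  work_stack.emplace_back(root, 0);
  while (!work_stack.empty()) {
    std::pair<std::size_t, std::size_t>& top = work_stack.back();
    const std::vector<std::size_t>& kids = children[top.first];
    if (top.second < kids.size()) {
      // Descend into the next unvisited child. The push may reallocate the
      // stack, so `top` must not be used after this statement.
      std::size_t child = kids[top.second++];
      work_stack.emplace_back(child, 0);
    } else {
      // All children are done: emit this block and pop it.
      order.push_back(top.first);
      work_stack.pop_back();
    }
  }
  return order;
}
```

Because the pending work lives in a heap-allocated vector, a degenerate
chain of many thousands of blocks does not deepen the call stack.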

Also rename some arena_allocator items for consistency and clarity,
add more detailed memory usage logging, and rework the allocator to
waste less space when an allocation doesn't fit and a new block must
be allocated.

Change-Id: I4d84dded3c47819eefa0de90ebb821dd12eb8be8
diff --git a/src/compiler/dex/frontend.cc b/src/compiler/dex/frontend.cc
index b212e5b..ca751ab 100644
--- a/src/compiler/dex/frontend.cc
+++ b/src/compiler/dex/frontend.cc
@@ -101,6 +101,7 @@
   //(1 << kDebugDumpCheckStats) |
   //(1 << kDebugDumpBitcodeFile) |
   //(1 << kDebugVerifyBitcode) |
+  //(1 << kDebugShowSummaryMemoryUsage) |
   0;
 
 static CompiledMethod* CompileMethod(CompilerDriver& compiler,
@@ -249,6 +250,11 @@
     }
   }
 
+  if (cu->enable_debug & (1 << kDebugShowSummaryMemoryUsage)) {
+    LOG(INFO) << "MEMINFO " << cu->arena.BytesAllocated() << " " << cu->mir_graph->GetNumBlocks()
+              << " " << PrettyMethod(method_idx, dex_file);
+  }
+
   return result;
 }