ARM64 asm for region space array allocation

Wrote the region space TLAB array and resolved-array allocators in
assembly. The speedup comes from two changes combined: checking the
mark bit and having an assembly fast path.
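
For illustration only, the kind of bump-pointer fast path that such an
allocator implements can be sketched in C++ rather than ARM64 assembly.
This is a minimal sketch, not the code in this change: the Tlab struct,
AllocArrayFastPath, the 16-byte header and the 8-byte alignment are all
assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical thread-local allocation buffer: [pos, end) is free space.
struct Tlab {
  uint8_t* pos;
  uint8_t* end;
};

constexpr uint64_t kAlignment = 8;    // assumed object alignment
constexpr uint64_t kHeaderSize = 16;  // assumed: class, lock word, length, padding

// Returns the new array, or nullptr when the caller must fall back to the
// slow path (negative length or not enough room left in the TLAB).
// Assumes the TLAB memory is already zeroed, so elements are not cleared here.
inline void* AllocArrayFastPath(Tlab& tlab, uint32_t klass, int32_t length,
                                uint32_t component_size) {
  assert(component_size == 1 || component_size == 2 ||
         component_size == 4 || component_size == 8);
  if (length < 0) {
    return nullptr;
  }
  // 64-bit arithmetic: header + length * component_size cannot overflow.
  uint64_t size = kHeaderSize + static_cast<uint64_t>(length) * component_size;
  size = (size + kAlignment - 1) & ~(kAlignment - 1);  // round up to alignment
  const uint64_t remaining = static_cast<uint64_t>(tlab.end - tlab.pos);
  if (size > remaining) {
    return nullptr;
  }
  uint8_t* obj = tlab.pos;
  tlab.pos += size;  // bump the pointer; no locking, the buffer is thread-local
  const uint32_t lock_word = 0;
  std::memcpy(obj, &klass, sizeof(klass));              // class reference
  std::memcpy(obj + 4, &lock_word, sizeof(lock_word));  // lock word
  std::memcpy(obj + 8, &length, sizeof(length));        // array length
  return obj;
}
```

The fast path only handles the case where the TLAB has room; everything
else returns to the caller, which would then call the runtime.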

Added resolved and initialized entrypoints for the region TLAB object
allocator.

N6P (960000 kHz) EAAC benchmark (average of 50 samples):
CC: 1442.309524 -> 1314 (~9% improvement)
CMS: 1382.32

The number of read barrier slow paths reaching C++ code drops from 5M
to 2.5M.
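
One plausible reading of how a mark bit check keeps calls out of C++,
sketched under assumptions: if the reference's lock word already has
the mark bit set, the reference is returned directly and the slow path
is never entered. Obj, kMarkBit (bit 29 is an assumption), MarkReference
and the counting MarkSlowPath stand-in are hypothetical names for this
example, not the runtime's actual code.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical object header: class reference followed by a lock word.
struct Obj {
  uint32_t klass;
  std::atomic<uint32_t> lock_word;
};

constexpr uint32_t kMarkBit = 1u << 29;  // assumed bit position

// Counts how often the (stand-in) slow path is reached.
int g_slow_path_calls = 0;

// Stand-in for the C++ slow path. A real collector would do the marking
// or copying work here; this sketch only records the call and sets the bit.
Obj* MarkSlowPath(Obj* ref) {
  ++g_slow_path_calls;
  ref->lock_word.fetch_or(kMarkBit, std::memory_order_relaxed);
  return ref;
}

// Fast path: null and already-marked references never reach MarkSlowPath.
inline Obj* MarkReference(Obj* ref) {
  if (ref == nullptr) {
    return nullptr;
  }
  if ((ref->lock_word.load(std::memory_order_relaxed) & kMarkBit) != 0) {
    return ref;  // already marked, nothing to do
  }
  return MarkSlowPath(ref);
}
```

With this shape, only the first visit to an object pays for the slow
path; later visits return after one load and one test.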

Bug: 30162165
Bug: 12687968

Test: With CC: N6P boot, run EAAC, test-art-target

Change-Id: I51515b11ef3f795f57eb72fe0f5759618fef5084
5 files changed