Assembly region TLAB allocation fast path for arm.

This is for the CC collector.

Share the common fast path code with the tlab fast path code.

Speedup (on N5):
        BinaryTrees:  2291 ->  902 ms (-60%)
        MemAllocTest: 2137 -> 1845 ms (-14%)

Bug: 9986565
Bug: 12687968

Change-Id: Ica63094ec2f85eaa4fd04d202a20090399275d85
8 files changed