ARM Baker's read barrier fast path implementation.

Introduce an ARM fast path implementation in Optimizing for
Baker's read barriers (for both heap reference loads and GC
root loads).  The marking phase of the read barrier is
performed by a slow path, invoking the runtime entry point
artReadBarrierMark.

Other read barrier algorithms continue to use the original
slow path based implementation, which has been renamed as
GenerateReadBarrierSlow/GenerateReadBarrierForRootSlow.

Bug: 12687968
Change-Id: Ie7ee85b1b4c0564148270cebdd3cbd4c3da51b3a
6 files changed