[MIPS] Fast android_memset for mips64, mipsr6

Fix broken mips64 build by replacing mips32r2-only android_memset.S.
Use pairs of 64-bit stores, which the hardware bonds, to fill 128 bits
per cycle.
Rely on the hardware's automatic cache prefetching; software cache
prefetching is counterproductive on next-generation MIPS cores.
The new implementation is written in C and also works on non-MIPS
architectures.
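
For illustration only, here is a minimal sketch of the paired-store idea
in portable C. It is not the code in this change: the name
sketch_memset32 is hypothetical, and the byte-count size argument and the
4-byte alignment of dst are assumptions. Whether the two adjacent 64-bit
stores are actually bonded depends on the compiler and the core.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fill `size` bytes at `dst` with a 32-bit pattern.
 * Assumes `size` is a byte count; any trailing partial word is ignored. */
void sketch_memset32(uint32_t *dst, uint32_t value, size_t size)
{
    unsigned char *p = (unsigned char *)dst;
    size_t n = size & ~(size_t)3;                 /* whole 32-bit words only */
    uint64_t v64 = ((uint64_t)value << 32) | value; /* same on either endianness */

    /* Head: single-word stores until p is 8-byte aligned. */
    while (n >= 4 && ((uintptr_t)p & 7) != 0) {
        memcpy(p, &value, 4);
        p += 4;
        n -= 4;
    }

    /* Body: two adjacent 64-bit stores per iteration (128 bits), with no
     * software prefetch; the hardware prefetcher is left to do its job. */
    while (n >= 16) {
        memcpy(p, &v64, 8);
        memcpy(p + 8, &v64, 8);
        p += 16;
        n -= 16;
    }

    /* Tail: remaining whole words. */
    while (n >= 4) {
        memcpy(p, &value, 4);
        p += 4;
        n -= 4;
    }
}
```

The fixed-size memcpy calls are normally lowered to plain store
instructions by an optimizing compiler, which keeps the sketch free of
aliasing and alignment undefined behavior on any architecture.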

Change-Id: Id7153a8fe11538fe25287e101375661b0e99e2a2
3 files changed