math: new log2f

from https://github.com/ARM-software/optimized-routines,
commit 04884bd04eac4b251da4026900010ea7d8850edc

code size change: +177 bytes.
benchmark on x86_64 (before, after, speedup):

-Os:
 log2f rthruput:  11.38 ns/call  5.99 ns/call 1.90x
  log2f latency:  35.01 ns/call 22.57 ns/call 1.55x
-O3:
 log2f rthruput:  10.82 ns/call  5.58 ns/call 1.94x
  log2f latency:  35.13 ns/call 21.04 ns/call 1.67x
3 files changed