Update micro_bench.

Move the code to C++ to access the cpuset CPU_* macros (these
macros are defined in sched.h under __USE_GNU, which is not
defined for the thumb C compiler). The C++ code is also slightly
easier to read.
Add code to set the priority of the process to the highest value.
Add code to lock the process to a single CPU.
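
A minimal sketch of the priority and CPU locking steps, assuming Linux
and the standard setpriority() and sched_setaffinity() calls. The helper
names RaisePriority and LockToCpu are hypothetical and are not taken
from the change itself; the nice value of -20 is likewise an assumption
about what "highest value" means here.

```cpp
#include <sched.h>         // sched_setaffinity, cpu_set_t, CPU_ZERO, CPU_SET
#include <sys/resource.h>  // setpriority, PRIO_PROCESS

// Hypothetical helper: request the highest scheduling priority (nice -20)
// for the calling process. Returns false on failure, which is common when
// the process lacks root or CAP_SYS_NICE.
bool RaisePriority() {
  return setpriority(PRIO_PROCESS, 0, -20) == 0;
}

// Hypothetical helper: restrict the calling process to a single CPU so
// that measurements are not disturbed by migration between cores.
bool LockToCpu(int cpu) {
  cpu_set_t cpuset;
  CPU_ZERO(&cpuset);      // start from an empty CPU set
  CPU_SET(cpu, &cpuset);  // allow only the requested CPU
  return sched_setaffinity(0, sizeof(cpuset), &cpuset) == 0;
}
```

Both helpers report failure instead of aborting, so a benchmark could
still run (with noisier results) when it is not allowed to change its
priority or affinity.
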
Add the ability to compute average and standard deviation over
a number of iterations.
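
The average and standard deviation computation could look roughly like
the sketch below. The Stats struct and ComputeStats name are
hypothetical, and the use of the population (divide by N) standard
deviation is an assumption; the change description does not say which
form is used.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical result type for a set of per-iteration timings.
struct Stats {
  double mean;
  double std_dev;
};

// Computes the mean and the population standard deviation of the samples.
// Assumption: dividing by N rather than N - 1.
Stats ComputeStats(const std::vector<double>& samples) {
  assert(!samples.empty());
  double sum = 0.0;
  for (double s : samples) sum += s;
  const double mean = sum / samples.size();

  double squared_diffs = 0.0;
  for (double s : samples) squared_diffs += (s - mean) * (s - mean);
  return {mean, std::sqrt(squared_diffs / samples.size())};
}
```

For the samples 2, 4, 4, 4, 5, 5, 7, 9 this yields a mean of 5 and a
standard deviation of 2.
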
Change the timing code to use nanosecond resolution timing.
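
One way to get nanosecond resolution timestamps is clock_gettime(), as
sketched below. The NanoTime name and the choice of CLOCK_MONOTONIC are
assumptions made for illustration, not details confirmed by this change.

```cpp
#include <cstdint>
#include <time.h>  // clock_gettime, CLOCK_MONOTONIC, struct timespec

// Hypothetical helper: current monotonic time in nanoseconds.
uint64_t NanoTime() {
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  // Combine whole seconds and the nanosecond remainder into one value.
  return static_cast<uint64_t>(ts.tv_sec) * 1000000000ULL +
         static_cast<uint64_t>(ts.tv_nsec);
}
```

A benchmark loop would call NanoTime() before and after the operation
under test and record the difference as one sample.
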
Add options to allow modification of the alignment of the src/dst
pointers for memcpy and the dst pointer for memset.
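
As an illustration of pointer alignment control, the sketch below rounds
a pointer up to a requested power-of-two boundary inside an
over-allocated buffer. The AlignPointer helper is hypothetical; the
actual options and their semantics are defined by the change, not by
this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns the first address at or after `buf` that
// is a multiple of `alignment`. `alignment` must be a power of two, and
// the caller must allocate at least `alignment` extra bytes.
uint8_t* AlignPointer(uint8_t* buf, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  uintptr_t addr = reinterpret_cast<uintptr_t>(buf);
  const uintptr_t mask = static_cast<uintptr_t>(alignment) - 1;
  addr = (addr + mask) & ~mask;  // round up to the next boundary
  return reinterpret_cast<uint8_t*>(addr);
}
```

The aligned pointer can then be passed as the src or dst argument of
memcpy, or as the dst argument of memset.
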
Add an option to change the size of the data being copied in each
iteration.

Change-Id: Ib7c50ed4463f94e638eb81690fe8fe0d0bc3ea80
3 files changed