# Bionic Benchmarks

[TOC]

## libc benchmarks (bionic-benchmarks)

`bionic-benchmarks` is a command line tool for measuring the runtimes of libc functions. It is built
on top of [Google Benchmark](https://github.com/google/benchmark) with some additions to organize
tests into suites.

### Device benchmarks

    $ mmma bionic/benchmarks
    $ adb root
    $ adb sync data
    $ adb shell /data/benchmarktest/bionic-benchmarks/bionic-benchmarks
    $ adb shell /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks

By default, `bionic-benchmarks` runs all of the benchmarks in alphabetical order. Pass
`--benchmark_filter=getpid` to run just the benchmarks with "getpid" in their name.

### Host benchmarks

See the `benchmarks/run-on-host.sh` script. The host benchmarks can be run with 32-bit or 64-bit
Bionic, or the host glibc.

### XML suites

Suites are stored in the `suites/` directory and can be chosen with the command line flag
`--bionic_xml`.

To choose a specific XML file, use the `--bionic_xml=FILE.XML` option. By default, this option
searches for the XML file in the `suites/` directory. If the file doesn't exist in that directory,
it is looked up relative to the current directory. If the option specifies the full path to an XML
file, such as `/data/nativetest/suites/example.xml`, that path is used as-is.

If no XML file is specified through the command-line option, the default is to use `suites/full.xml`.
However, for the host bionic benchmarks (`bionic-benchmarks-glibc`), the default is to use
`suites/host.xml`.

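The lookup order described above can be sketched as a small shell function. This is only an
illustration of the documented behavior, not the code `bionic-benchmarks` uses; the function name
`resolve_suite` and its optional second parameter are hypothetical.

```shell
# Hypothetical helper mirroring the documented --bionic_xml lookup order.
# Usage: resolve_suite FILE.XML [SUITES_DIR]
resolve_suite() {
  file=$1
  suites_dir=${2:-suites}   # assumed location of the suites/ directory
  case $file in
    /*)
      # A full path is used as-is.
      printf '%s\n' "$file"
      return 0
      ;;
  esac
  if [ -f "$suites_dir/$file" ]; then
    # First choice: the file inside the suites/ directory.
    printf '%s\n' "$suites_dir/$file"
  else
    # Otherwise the name is taken relative to the current directory.
    printf '%s\n' "./$file"
  fi
}
```

For example, `resolve_suite full.xml` prints `suites/full.xml` when that file exists, and
`./full.xml` otherwise.
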
### XML suite format

The format for a benchmark is:

```
<fn>
    <name>BM_sample_benchmark</name>
    <cpu><optional_cpu_to_lock></cpu>
    <iterations><optional_iterations_to_run></iterations>
    <args><space separated list of function args|shorthand></args>
</fn>
```

XML-specified values for iterations and cpu take precedence over those specified on the command
line (via `--bionic_iterations` and `--bionic_cpu`, respectively).

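As a concrete illustration, the snippet below writes a small suite file from the shell. It is a
sketch that assumes a suite file is simply a sequence of `<fn>` elements in the format above. The
benchmark names and the `32 8 8` arguments are borrowed from the `--bionic_extra` examples in this
README; the file name, the cpu number, and the iteration count are made-up values.

```shell
# Hypothetical suite file; the path, <cpu> and <iterations> values are illustrative only.
suite=${TMPDIR:-/tmp}/example-suite.xml
cat > "$suite" <<'EOF'
<fn>
    <name>BM_string_memcpy</name>
    <args>AT_COMMON_SIZES</args>
</fn>
<fn>
    <name>BM_string_memcmp</name>
    <cpu>4</cpu>
    <iterations>1000</iterations>
    <args>32 8 8</args>
</fn>
EOF
```

Because a full path is used as-is, such a file could then be selected with `--bionic_xml` and its
full path on the machine where the benchmarks run.
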
To make small changes in runs, you can also schedule benchmarks by passing in their name and a
space-separated list of arguments via the `--bionic_extra` command line flag, e.g.
`--bionic_extra="BM_string_memcpy AT_COMMON_SIZES"` or `--bionic_extra="BM_string_memcmp 32 8 8"`.

Note that a benchmark runs normally if extra arguments are passed in, but fails with a segfault if
too few are passed in.

### Shorthand

For the sake of brevity, multiple runs can be scheduled in one XML element by putting one of the
following in the args field:

    NUM_PROPS
    MATH_COMMON
    AT_ALIGNED_<ONE|TWO>BUF
    AT_<any power of two between 2 and 16384>_ALIGNED_<ONE|TWO>BUF
    AT_COMMON_SIZES

Definitions for these can be found in `bionic_benchmarks.cpp`, and example usages can be found in
the `suites/` directory.

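To make the `AT_<power of two>_ALIGNED_<ONE|TWO>BUF` pattern concrete, the following sketch prints
every name it covers. It assumes the range 2 to 16384 is inclusive at both ends; the function name
`list_aligned_shorthands` is hypothetical and not part of `bionic-benchmarks`.

```shell
# Print each AT_<N>_ALIGNED_<ONE|TWO>BUF name for N = 2, 4, 8, ..., 16384.
# Assumes both endpoints of the range are valid.
list_aligned_shorthands() {
  n=2
  while [ "$n" -le 16384 ]; do
    for bufs in ONE TWO; do
      printf 'AT_%s_ALIGNED_%sBUF\n' "$n" "$bufs"
    done
    n=$((n * 2))   # next power of two
  done
}
```
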
### Unit Tests

`bionic-benchmarks` also has its own set of unit tests, which can be run from the binary in
`/data/nativetest[64]/bionic-benchmarks-tests`.

## Process startup time (bionic-spawn-benchmarks)

The `spawn/` subdirectory has a few benchmarks measuring the time used to start simple programs
(e.g. Toybox's `true` and `sh -c true`). Run it on a device like so:

    m bionic-spawn-benchmarks
    adb root
    adb sync data
    adb shell /data/benchmarktest/bionic-spawn-benchmarks/bionic-spawn-benchmarks
    adb shell /data/benchmarktest64/bionic-spawn-benchmarks/bionic-spawn-benchmarks

Google Benchmark reports both a real-time figure ("Time") and a CPU usage figure. For these
benchmarks, the CPU measurement only counts time spent in the thread calling `posix_spawn`, not that
spent in the spawned process. The real-time figure is probably more useful, and it is the figure
used to determine the iteration count.

Locking the CPU frequency seems to improve the results of these benchmarks significantly, and it
reduces variability.

## Google Benchmark notes

### Repetitions

Google Benchmark uses two settings to control how many times to run each benchmark, "iterations" and
"repetitions". By default, the repetition count is one. Google Benchmark runs the benchmark a few
times to determine a sufficiently-large iteration count.

Google Benchmark can optionally run a benchmark repeatedly and report statistics (median, mean,
standard deviation) for the runs. To do so, pass the `--benchmark_repetitions` option, e.g.:

    # ./bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll --benchmark_repetitions=4
    ...
    -------------------------------------------------------------------
    Benchmark                            Time          CPU   Iterations
    -------------------------------------------------------------------
    BM_stdlib_strtoll                 27.7 ns      27.7 ns     25290525
    BM_stdlib_strtoll                 27.7 ns      27.7 ns     25290525
    BM_stdlib_strtoll                 27.7 ns      27.7 ns     25290525
    BM_stdlib_strtoll                 27.8 ns      27.7 ns     25290525
    BM_stdlib_strtoll_mean            27.7 ns      27.7 ns            4
    BM_stdlib_strtoll_median          27.7 ns      27.7 ns            4
    BM_stdlib_strtoll_stddev         0.023 ns     0.023 ns            4

There are 4 runs, each with 25290525 iterations. Measurements for the individual runs can be
suppressed if they aren't needed:

    # ./bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll --benchmark_repetitions=4 --benchmark_report_aggregates_only
    ...
    -------------------------------------------------------------------
    Benchmark                            Time          CPU   Iterations
    -------------------------------------------------------------------
    BM_stdlib_strtoll_mean            27.8 ns      27.7 ns            4
    BM_stdlib_strtoll_median          27.7 ns      27.7 ns            4
    BM_stdlib_strtoll_stddev         0.043 ns     0.043 ns            4

### CPU frequencies

To get consistent results between runs, it can sometimes be helpful to restrict a benchmark to
specific cores, or to lock cores at specific frequencies. Some phones have a big.LITTLE core setup,
or at least allow some cores to run at higher frequencies than others.

A core can be selected for `bionic-benchmarks` using the `--bionic_cpu` option or using the
`taskset` utility. For example, a Pixel 3 device has 4 Kryo 385 Silver cores followed by 4 Gold
cores:

    blueline:/ # /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll --bionic_cpu=0
    ...
    ------------------------------------------------------------
    Benchmark                     Time          CPU   Iterations
    ------------------------------------------------------------
    BM_stdlib_strtoll          64.2 ns      63.6 ns     11017493

    blueline:/ # /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll --bionic_cpu=4
    ...
    ------------------------------------------------------------
    Benchmark                     Time          CPU   Iterations
    ------------------------------------------------------------
    BM_stdlib_strtoll          21.8 ns      21.7 ns     33167103

A similar result can be achieved using `taskset`. The first parameter is a bitmask of core numbers
to pass to `sched_setaffinity`:

    blueline:/ # taskset f /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll
    ...
    ------------------------------------------------------------
    Benchmark                     Time          CPU   Iterations
    ------------------------------------------------------------
    BM_stdlib_strtoll          64.3 ns      63.6 ns     10998697

    blueline:/ # taskset f0 /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --benchmark_filter=BM_stdlib_strtoll
    ...
    ------------------------------------------------------------
    Benchmark                     Time          CPU   Iterations
    ------------------------------------------------------------
    BM_stdlib_strtoll          21.3 ns      21.2 ns     33094801

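The masks `f` and `f0` used with `taskset` have one bit set per core, with bit 0 standing for core
0. A minimal sketch for deriving such a mask from a list of core numbers follows; the function name
`cpu_mask` is hypothetical.

```shell
# cpu_mask CORE...: print the hexadecimal affinity mask with one bit set per listed core.
cpu_mask() {
  mask=0
  for core in "$@"; do
    mask=$((mask | (1 << core)))   # set the bit for this core
  done
  printf '%x\n' "$mask"
}

cpu_mask 0 1 2 3   # prints f  (the four Silver cores in the Pixel 3 example)
cpu_mask 4 5 6 7   # prints f0 (the four Gold cores)
```
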
To lock the CPU frequency, use the sysfs interface at `/sys/devices/system/cpu/cpu*/cpufreq/`.
Changing the scaling governor to `performance` suppresses the warning that Google Benchmark
otherwise prints:

    ***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.

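One way to change the governor on every core is to write to each core's `scaling_governor` file.
The sketch below assumes a root shell on the device and a kernel that offers the `performance`
governor (the `scaling_available_governors` file in the same directory lists what is available).
The function name `set_governor` and its optional second parameter are hypothetical.

```shell
# set_governor GOVERNOR [CPU_ROOT]: write GOVERNOR to every core's scaling_governor file.
# CPU_ROOT defaults to the real sysfs tree, where writing requires root.
set_governor() {
  governor=$1
  cpu_root=${2:-/sys/devices/system/cpu}
  for f in "$cpu_root"/cpu*/cpufreq/scaling_governor; do
    [ -e "$f" ] || continue            # nothing to do if the glob matched no file
    printf '%s\n' "$governor" > "$f"
  done
}
```

Running `set_governor performance` before the benchmarks would then apply the governor to all
cores that expose a cpufreq directory.
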
Some devices have a `perf-setup.sh` script that locks CPU and GPU frequencies. Some TradeFed
benchmarks appear to be using the script. For more information:

 * run `get_build_var BOARD_PERFSETUP_SCRIPT`
 * run `m perf-setup.sh` to install the script into `${OUT}/data/local/tmp/perf-setup.sh`
 * see: https://android.googlesource.com/platform/platform_testing/+/refs/heads/master/scripts/perf-setup/