==========================
Source-based Code Coverage
==========================

.. contents::
   :local:

Introduction
============

This document explains how to use clang's source-based code coverage feature.
It's called "source-based" because it operates on AST and preprocessor
information directly. This allows it to generate very precise coverage data.

Clang ships two other code coverage implementations:

* :doc:`SanitizerCoverage` - A low-overhead tool meant for use alongside the
  various sanitizers. It can provide up to edge-level coverage.

* gcov - A GCC-compatible coverage implementation which operates on DebugInfo.

From this point onwards "code coverage" will refer to the source-based kind.

The code coverage workflow
==========================

The code coverage workflow consists of three main steps:

* Compiling with coverage enabled.

* Running the instrumented program.

* Creating coverage reports.

The next few sections work through a complete, copy-'n-paste friendly example
based on this program:

.. code-block:: cpp

    % cat <<EOF > foo.cc
    #define BAR(x) ((x) || (x))
    template <typename T> void foo(T x) {
      for (unsigned I = 0; I < 10; ++I) { BAR(I); }
    }
    int main() {
      foo<int>(0);
      foo<float>(0);
      return 0;
    }
    EOF

Compiling with coverage enabled
===============================

To compile code with coverage enabled, pass ``-fprofile-instr-generate
-fcoverage-mapping`` to the compiler:

.. code-block:: console

    # Step 1: Compile with coverage enabled.
    % clang++ -fprofile-instr-generate -fcoverage-mapping foo.cc -o foo

Note that linking together code with and without coverage instrumentation is
supported. Uninstrumented code simply won't be accounted for in reports.

Running the instrumented program
================================

The next step is to run the instrumented program. When the program exits it
will write a **raw profile** to the path specified by the ``LLVM_PROFILE_FILE``
environment variable. If that variable is not set, the profile is written
to ``default.profraw`` in the current directory of the program. If
``LLVM_PROFILE_FILE`` contains a path to a non-existent directory, the missing
directory structure will be created. Additionally, the following special
**pattern strings** are rewritten:

* "%p" expands out to the process ID.

* "%h" expands out to the hostname of the machine running the program.

* "%Nm" expands out to the instrumented binary's signature. When this pattern
  is specified, the runtime creates a pool of N raw profiles which are used for
  on-line profile merging. The runtime takes care of selecting a raw profile
  from the pool, locking it, and updating it before the program exits. If N is
  not specified (i.e. the pattern is "%m"), it's assumed that ``N = 1``. N must
  be between 1 and 9. The merge pool specifier can only occur once per filename
  pattern.

.. code-block:: console

    # Step 2: Run the program.
    % LLVM_PROFILE_FILE="foo.profraw" ./foo

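To make the "%p" and "%h" substitutions concrete, the following sketch
re-implements them in plain C++. It is only an illustration and not the
runtime's own code: ``expand_pattern`` is a hypothetical helper, the process ID
and hostname are passed in as parameters so that the result is predictable, and
the "%m" merge pool specifier is deliberately copied through unchanged.

.. code-block:: cpp

    #include <string>

    // Hypothetical helper for illustration only; the profiling runtime
    // performs this substitution internally when it reads LLVM_PROFILE_FILE.
    std::string expand_pattern(const std::string &pattern, long pid,
                               const std::string &hostname) {
      std::string out;
      for (std::string::size_type i = 0; i < pattern.size(); ++i) {
        if (pattern[i] == '%' && i + 1 < pattern.size()) {
          char spec = pattern[i + 1];
          if (spec == 'p') { // "%p": process ID
            out += std::to_string(pid);
            ++i;
            continue;
          }
          if (spec == 'h') { // "%h": hostname
            out += hostname;
            ++i;
            continue;
          }
        }
        // Everything else, including "%m" and "%Nm", is copied unchanged.
        out += pattern[i];
      }
      return out;
    }

Under these assumptions, ``expand_pattern("foo-%p-%h.profraw", 1234,
"buildbot")`` yields ``foo-1234-buildbot.profraw``, which is the kind of
per-process file name the pattern strings are meant to produce.
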
Creating coverage reports
=========================

Raw profiles have to be **indexed** before they can be used to generate
coverage reports. This is done using the "merge" tool in ``llvm-profdata``
(which can combine multiple raw profiles and index them at the same time):

.. code-block:: console

    # Step 3(a): Index the raw profile.
    % llvm-profdata merge -sparse foo.profraw -o foo.profdata

There are multiple different ways to render coverage reports. The simplest
option is to generate a line-oriented report:

.. code-block:: console

    # Step 3(b): Create a line-oriented coverage report.
    % llvm-cov show ./foo -instr-profile=foo.profdata

This report includes a summary view as well as dedicated sub-views for
templated functions and their instantiations. For our example program, we get
distinct views for ``foo<int>(...)`` and ``foo<float>(...)``. If
``-show-line-counts-or-regions`` is enabled, ``llvm-cov`` displays sub-line
region counts (even in macro expansions):

.. code-block:: none

        1|     20|#define BAR(x) ((x) || (x))
                                 ^20     ^2
        2|      2|template <typename T> void foo(T x) {
        3|     22|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
                                         ^22     ^20  ^20^20
        4|      2|}
    ------------------
    | void foo<int>(int):
    |     2|      1|template <typename T> void foo(T x) {
    |     3|     11|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
    |                                      ^11     ^10  ^10^10
    |     4|      1|}
    ------------------
    | void foo<float>(float):
    |     2|      1|template <typename T> void foo(T x) {
    |     3|     11|  for (unsigned I = 0; I < 10; ++I) { BAR(I); }
    |                                      ^11     ^10  ^10^10
    |     4|      1|}
    ------------------

To generate a file-level summary of coverage statistics instead of a
line-oriented report, try:

.. code-block:: console

    # Step 3(c): Create a coverage summary.
    % llvm-cov report ./foo -instr-profile=foo.profdata
    Filename               Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover
    ------------------------------------------------------------------------------------------------------------------------------------------
    /tmp/foo.cc                 13                 0   100.00%           3                 0   100.00%          13                 0   100.00%
    ------------------------------------------------------------------------------------------------------------------------------------------
    TOTAL                       13                 0   100.00%           3                 0   100.00%          13                 0   100.00%

The ``llvm-cov`` tool supports specifying a custom demangler, writing out
reports in a directory structure, and generating HTML reports. For the full
list of options, please refer to the `command guide
<http://llvm.org/docs/CommandGuide/llvm-cov.html>`_.

A few final notes:

* The ``-sparse`` flag is optional but can result in dramatically smaller
  indexed profiles. This option should not be used if the indexed profile will
  be reused for PGO.

* Raw profiles can be discarded after they are indexed. Advanced use of the
  profile runtime library allows an instrumented program to merge profiling
  information directly into an existing raw profile on disk. The details are
  out of scope.

* The ``llvm-profdata`` tool can be used to merge together multiple raw or
  indexed profiles. To combine profiling data from multiple runs of a program,
  try e.g.:

  .. code-block:: console

      % llvm-profdata merge -sparse foo1.profraw foo2.profdata -o foo3.profdata

Exporting coverage data
=======================

Coverage data can be exported into JSON using the ``llvm-cov export``
sub-command. There is a comprehensive reference which defines the structure of
the exported data at a high level in the llvm-cov source code.

Interpreting reports
====================

There are four statistics tracked in a coverage summary:

* Function coverage is the percentage of functions which have been executed at
  least once. A function is considered to be executed if any of its
  instantiations are executed.

* Instantiation coverage is the percentage of function instantiations which
  have been executed at least once. Template functions and static inline
  functions from headers are two kinds of functions which may have multiple
  instantiations.

* Line coverage is the percentage of code lines which have been executed at
  least once. Only executable lines within function bodies are considered to be
  code lines.

* Region coverage is the percentage of code regions which have been executed at
  least once. A code region may span multiple lines (e.g. in a large function
  body with no control flow). However, it's also possible for a single line to
  contain multiple code regions (e.g. in "return x || y && z").

Of these four statistics, function coverage is usually the least granular while
region coverage is the most granular. The project-wide totals for each
statistic are listed in the summary.

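Each percentage in the summary follows from a total count and a missed count.
As a rough sketch of that arithmetic (``percent_covered`` is a made-up name and
not part of ``llvm-cov``, and returning 0 for an empty total is an assumption
made for this example):

.. code-block:: cpp

    // Hypothetical helper, not an llvm-cov API: percentage of items
    // (functions, instantiations, lines, or regions) executed at least once.
    // Precondition: missed <= total.
    double percent_covered(unsigned total, unsigned missed) {
      if (total == 0)
        return 0.0; // Assumption: report 0% when there is nothing to cover.
      return 100.0 * (total - missed) / total;
    }

For the example program, 13 regions with 0 missed regions gives
``percent_covered(13, 0) == 100.0``, in line with the ``100.00%`` shown in the
summary.
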
Format compatibility guarantees
===============================

* There are no backwards or forwards compatibility guarantees for the raw
  profile format. Raw profiles may be dependent on the specific compiler
  revision used to generate them. It's inadvisable to store raw profiles for
  long periods of time.

* Tools must retain **backwards** compatibility with indexed profile formats.
  These formats are not forwards-compatible: i.e., a tool which uses format
  version X will not be able to understand format version (X+k).

* Tools must also retain **backwards** compatibility with the format of the
  coverage mappings emitted into instrumented binaries. These formats are not
  forwards-compatible.

* The JSON coverage export format has a (major, minor, patch) version triple.
  Only a major version increment indicates a backwards-incompatible change. A
  minor version increment is for added functionality, and patch version
  increments are for bugfixes.

Using the profiling runtime without static initializers
=======================================================

By default the compiler runtime uses a static initializer to determine the
profile output path and to register a writer function. To collect profiles
without using static initializers, do this manually:

* Export an ``int __llvm_profile_runtime`` symbol from each instrumented shared
  library and executable. When the linker finds a definition of this symbol, it
  knows to skip loading the object which contains the profiling runtime's
  static initializer.

* Forward-declare ``void __llvm_profile_initialize_file(void)`` and call it
  once from each instrumented executable. This function parses
  ``LLVM_PROFILE_FILE``, sets the output path, and truncates any existing files
  at that path. To get the same behavior without truncating existing files,
  pass a filename pattern string to
  ``void __llvm_profile_set_filename(char *)``. These calls can be placed
  anywhere so long as they precede all calls to
  ``__llvm_profile_write_file``.

* Forward-declare ``int __llvm_profile_write_file(void)`` and call it to write
  out a profile. This function returns 0 when it succeeds, and a non-zero value
  otherwise. Calling this function multiple times appends profile data to an
  existing on-disk raw profile.

In C++ files, declare these as ``extern "C"``.

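The steps above might be put together as in the following sketch. The three
runtime symbols are the ones named in this section; ``write_profile_now`` is a
hypothetical wrapper introduced only for this example. The snippet is meant to
be built into a binary compiled with ``-fprofile-instr-generate
-fcoverage-mapping``, since the profiling runtime supplies the two functions.

.. code-block:: cpp

    extern "C" {
    // Defining this symbol lets the linker skip the runtime object that
    // contains the static initializer.
    int __llvm_profile_runtime = 0;

    // Provided by the profiling runtime.
    void __llvm_profile_initialize_file(void);
    int __llvm_profile_write_file(void);
    }

    // Hypothetical wrapper: set the output path from LLVM_PROFILE_FILE, then
    // write the profile. Returns 0 on success and a non-zero value otherwise,
    // mirroring __llvm_profile_write_file.
    inline int write_profile_now() {
      __llvm_profile_initialize_file();
      return __llvm_profile_write_file();
    }

An instrumented executable could call ``write_profile_now()`` once, for
instance just before returning from ``main``. A program that writes profiles
more than once would presumably call ``__llvm_profile_initialize_file`` a
single time and ``__llvm_profile_write_file`` for each later write, because the
initialization step truncates existing files.
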
Collecting coverage reports for the llvm project
================================================

To prepare a coverage report for llvm (and any of its sub-projects), add
``-DLLVM_BUILD_INSTRUMENTED_COVERAGE=On`` to the cmake configuration. Raw
profiles will be written to ``$BUILD_DIR/profiles/``. To prepare an HTML
report, run ``llvm/utils/prepare-code-coverage-artifact.py``.

To specify an alternate directory for raw profiles, use
``-DLLVM_PROFILE_DATA_DIR``. To change the size of the profile merge pool, use
``-DLLVM_PROFILE_MERGE_POOL_SIZE``.

Drawbacks and limitations
=========================

* Code coverage does not handle unpredictable changes in control flow or stack
  unwinding in the presence of exceptions precisely. Consider the following
  function:

  .. code-block:: cpp

    int f() {
      may_throw();
      return 0;
    }

  If the call to ``may_throw()`` propagates an exception into ``f``, the code
  coverage tool may mark the ``return`` statement as executed even though it is
  not. A call to ``longjmp()`` can have similar effects.
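
A self-contained variant of ``f`` makes this scenario easier to reproduce. This
is only a sketch: the body of ``may_throw`` and the ``call_f`` driver are
assumptions added for illustration, since the text above does not define them.

.. code-block:: cpp

    #include <stdexcept>

    // Assumed implementation: always throws, so control never returns to f().
    void may_throw() { throw std::runtime_error("unwinding through f"); }

    int f() {
      may_throw();
      return 0; // Never executed here, yet a report may count it as executed.
    }

    // Hypothetical driver: returns true if f() exited via an exception.
    bool call_f() {
      try {
        f();
      } catch (const std::runtime_error &) {
        return true;
      }
      return false;
    }

Calling ``call_f()`` from ``main`` in a binary built with coverage enabled, and
then viewing the result with ``llvm-cov show``, is one way to check how the
``return 0;`` line is reported by a given toolchain.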