.. role:: raw-html(raw)
   :format: html

=================================
LLVM Code Coverage Mapping Format
=================================

.. contents::
   :local:

Introduction
============

LLVM's code coverage mapping format is used to provide code coverage
analysis using LLVM's and Clang's instrumentation-based profiling
(Clang's ``-fprofile-instr-generate`` option).

This document is aimed at those who use LLVM's code coverage mapping to provide
code coverage analysis for their own programs, and at those who would like
to know how it works under the hood. Prior knowledge of how Clang's
profile-guided optimization works is useful, but not required.

We start by showing how to use LLVM and Clang for code coverage analysis,
then we briefly describe LLVM's code coverage mapping format and the
way that Clang and LLVM's code coverage tool work with this format. After
the basics are down, more advanced features of the coverage mapping format
are discussed, such as the data structures, LLVM IR representation and
the binary encoding.

Quick Start
===========

Here is a short walkthrough that shows how to generate a code coverage overview
for a sample source file called *test.c*.

* First, compile an instrumented version of your program using Clang's
  ``-fprofile-instr-generate`` option with the additional ``-fcoverage-mapping``
  option:

  ``clang -o test -fprofile-instr-generate -fcoverage-mapping test.c``
* Then, run the instrumented binary. The runtime will produce a file called
  *default.profraw* containing the raw profile instrumentation data:

  ``./test``
* After that, merge the profile data using the *llvm-profdata* tool:

  ``llvm-profdata merge -o test.profdata default.profraw``
* Finally, run LLVM's code coverage tool (*llvm-cov*) to produce the code
  coverage overview for the sample source file:

  ``llvm-cov show ./test -instr-profile=test.profdata test.c``

High Level Overview
===================

LLVM's code coverage mapping format is designed to be a self-contained
data format that can be embedded into the LLVM IR and object files.
It's described in this document as a **mapping** format because its goal is
to store the data that is required for a code coverage tool to map between
the specific source ranges in a file and the execution counts obtained
after running the instrumented version of the program.

The mapping data is used in two places in the code coverage process:

1. When Clang compiles a source file with ``-fcoverage-mapping``, it
   generates the mapping information that describes the mapping between the
   source ranges and the profiling instrumentation counters.
   This information gets embedded into the LLVM IR and conveniently
   ends up in the final executable file when the program is linked.

2. It is also used by *llvm-cov* - the mapping information is extracted from an
   object file and is used to associate the execution counts (the values of the
   profile instrumentation counters) with the source ranges in a file.
   After that, the tool is able to generate various code coverage reports
   for the program.

The coverage mapping format aims to be a "universal format" that would be
suitable for usage by any frontend, and not just by Clang. It also aims to
give the frontend the possibility of generating minimal coverage mapping
data in order to reduce the size of the IR and object files - for example,
instead of emitting mapping information for each statement in a function, the
frontend is allowed to group the statements with the same execution count into
regions of code, and emit the mapping information only for those regions.

Advanced Concepts
=================

The remainder of this guide is meant to give you insight into the way the
coverage mapping format works.

The coverage mapping format operates on a per-function level as the
profile instrumentation counters are associated with a specific function.
For each function that requires code coverage, the frontend has to create
coverage mapping data that can map between the source code ranges and
the profile instrumentation counters for that function.

Mapping Region
--------------

The function's coverage mapping data contains an array of mapping regions.
A mapping region stores the `source code range`_ that is covered by this region,
the `file id <coverage file id_>`_, the `coverage mapping counter`_ and
the region's kind.
There are several kinds of mapping regions:

* Code regions associate portions of source code and `coverage mapping
  counters`_. They make up the majority of the mapping regions. They are used
  by the code coverage tool to compute the execution counts for lines,
  highlight the regions of code that were never executed, and to obtain
  the various code coverage statistics for a function.
  For example:

  :raw-html:`<pre class='highlight' style='line-height:initial;'><span>int main(int argc, const char *argv[]) </span><span style='background-color:#4A789C'>{    </span> <span class='c1'>// Code Region from 1:40 to 9:2</span>
  <span style='background-color:#4A789C'>                                            </span>
  <span style='background-color:#4A789C'>  if (argc > 1) </span><span style='background-color:#85C1F5'>{                         </span>   <span class='c1'>// Code Region from 3:17 to 5:4</span>
  <span style='background-color:#85C1F5'>    printf("%s\n", argv[1]);              </span>
  <span style='background-color:#85C1F5'>  }</span><span style='background-color:#4A789C'> else </span><span style='background-color:#F6D55D'>{                                </span>   <span class='c1'>// Code Region from 5:10 to 7:4</span>
  <span style='background-color:#F6D55D'>    printf("\n");                         </span>
  <span style='background-color:#F6D55D'>  }</span><span style='background-color:#4A789C'>                                         </span>
  <span style='background-color:#4A789C'>  return 0;                                 </span>
  <span style='background-color:#4A789C'>}</span>
  </pre>`
* Skipped regions are used to represent source ranges that were skipped
  by Clang's preprocessor. They don't associate with
  `coverage mapping counters`_, as the frontend knows that they are never
  executed. They are used by the code coverage tool to mark the skipped lines
  inside a function as non-code lines that don't have execution counts.
  For example:

  :raw-html:`<pre class='highlight' style='line-height:initial;'><span>int main() </span><span style='background-color:#4A789C'>{               </span> <span class='c1'>// Code Region from 1:12 to 6:2</span>
  <span style='background-color:#85C1F5'>#ifdef DEBUG             </span>   <span class='c1'>// Skipped Region from 2:1 to 4:2</span>
  <span style='background-color:#85C1F5'>  printf("Hello world"); </span>
  <span style='background-color:#85C1F5'>#</span><span style='background-color:#4A789C'>endif                     </span>
  <span style='background-color:#4A789C'>  return 0;                </span>
  <span style='background-color:#4A789C'>}</span>
  </pre>`
* Expansion regions are used to represent Clang's macro expansions. They
  have an additional property - *expanded file id*. This property can be
  used by the code coverage tool to find the mapping regions that are created
  as a result of this macro expansion, by checking if their file id matches the
  expanded file id. They don't associate with `coverage mapping counters`_,
  as the code coverage tool can determine the execution count for this region
  by looking up the execution count of the first region with a corresponding
  file id.
  For example:

  :raw-html:`<pre class='highlight' style='line-height:initial;'><span>int func(int x) </span><span style='background-color:#4A789C'>{                             </span>
  <span style='background-color:#4A789C'>  #define MAX(x,y) </span><span style='background-color:#85C1F5'>((x) > (y)? </span><span style='background-color:#F6D55D'>(x)</span><span style='background-color:#85C1F5'> : </span><span style='background-color:#F4BA70'>(y)</span><span style='background-color:#85C1F5'>)</span><span style='background-color:#4A789C'>     </span>
  <span style='background-color:#4A789C'>  return </span><span style='background-color:#7FCA9F'>MAX</span><span style='background-color:#4A789C'>(x, 42);                          </span> <span class='c1'>// Expansion Region from 3:10 to 3:13</span>
  <span style='background-color:#4A789C'>}</span>
  </pre>`

.. _source code range:

Source Range:
^^^^^^^^^^^^^

The source range record contains the starting and ending location of a certain
mapping region. Both locations include the line and the column numbers.

.. _coverage file id:

File ID:
^^^^^^^^

The file id is an integer value that tells us
in which source file or macro expansion this region is located.
It enables Clang to produce mapping information for the code
defined inside macros, as this example demonstrates:

:raw-html:`<pre class='highlight' style='line-height:initial;'><span>void func(const char *str) </span><span style='background-color:#4A789C'>{        </span> <span class='c1'>// Code Region from 1:28 to 6:2 with file id 0</span>
<span style='background-color:#4A789C'>  #define PUT </span><span style='background-color:#85C1F5'>printf("%s\n", str)</span><span style='background-color:#4A789C'>   </span> <span class='c1'>// 2 Code Regions from 2:15 to 2:34 with file ids 1 and 2</span>
<span style='background-color:#4A789C'>  if(*str)                          </span>
<span style='background-color:#4A789C'>    </span><span style='background-color:#F6D55D'>PUT</span><span style='background-color:#4A789C'>;                            </span> <span class='c1'>// Expansion Region from 4:5 to 4:8 with file id 0 that expands a macro with file id 1</span>
<span style='background-color:#4A789C'>  </span><span style='background-color:#F6D55D'>PUT</span><span style='background-color:#4A789C'>;                              </span> <span class='c1'>// Expansion Region from 5:3 to 5:6 with file id 0 that expands a macro with file id 2</span>
<span style='background-color:#4A789C'>}</span>
</pre>`

.. _coverage mapping counter:
.. _coverage mapping counters:

Counter:
^^^^^^^^

A coverage mapping counter can represent a reference to the profile
instrumentation counter. The execution count for a region with such a counter
is determined by looking up the value of the corresponding profile
instrumentation counter.

It can also represent a binary arithmetical expression that operates on
coverage mapping counters or other expressions.
The execution count for a region with an expression counter is determined by
evaluating the expression's arguments and then adding them together or
subtracting them from one another.
In the example below, a subtraction expression is used to compute the execution
count for the compound statement that follows the *else* keyword:

:raw-html:`<pre class='highlight' style='line-height:initial;'><span>int main(int argc, const char *argv[]) </span><span style='background-color:#4A789C'>{   </span> <span class='c1'>// Region's counter is a reference to the profile counter #0</span>
<span style='background-color:#4A789C'>                                           </span>
<span style='background-color:#4A789C'>  if (argc > 1) </span><span style='background-color:#85C1F5'>{                        </span>   <span class='c1'>// Region's counter is a reference to the profile counter #1</span>
<span style='background-color:#85C1F5'>    printf("%s\n", argv[1]);             </span><span>   </span>
<span style='background-color:#85C1F5'>  }</span><span style='background-color:#4A789C'> else </span><span style='background-color:#F6D55D'>{                               </span>   <span class='c1'>// Region's counter is an expression (reference to the profile counter #0 - reference to the profile counter #1)</span>
<span style='background-color:#F6D55D'>    printf("\n");                        </span>
<span style='background-color:#F6D55D'>  }</span><span style='background-color:#4A789C'>                                        </span>
<span style='background-color:#4A789C'>  return 0;                                </span>
<span style='background-color:#4A789C'>}</span>
</pre>`

Finally, a coverage mapping counter can also represent an execution count
of zero. The zero counter is used to provide coverage mapping for
unreachable statements and expressions, as in the example below:

:raw-html:`<pre class='highlight' style='line-height:initial;'><span>int main() </span><span style='background-color:#4A789C'>{                  </span>
<span style='background-color:#4A789C'>  return 0;                   </span>
<span style='background-color:#4A789C'>  </span><span style='background-color:#85C1F5'>printf("Hello world!\n")</span><span style='background-color:#4A789C'>;   </span> <span class='c1'>// Unreachable region's counter is zero</span>
<span style='background-color:#4A789C'>}</span>
</pre>`

The zero counters allow the code coverage tool to display proper line execution
counts for the unreachable lines and highlight the unreachable code.
Without them, the tool would think that those lines and regions were still
executed, as it doesn't possess the frontend's knowledge.

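The three kinds of counter described in this section (a reference, an
expression and zero) can be modelled in a few lines. The following is a minimal
Python sketch for illustration only, not LLVM's implementation: the class names
``Zero``, ``Ref`` and ``Expr`` and the sample execution counts are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Zero:
    """Counter for unreachable code: always evaluates to 0."""


@dataclass(frozen=True)
class Ref:
    """Reference to a profile instrumentation counter by index."""
    index: int


@dataclass(frozen=True)
class Expr:
    """Binary expression over two counters; op is '+' or '-'."""
    op: str
    lhs: "Counter"
    rhs: "Counter"


Counter = Union[Zero, Ref, Expr]


def evaluate(counter: Counter, profile_counts: List[int]) -> int:
    """Compute the execution count that a coverage mapping counter stands for."""
    if isinstance(counter, Zero):
        return 0
    if isinstance(counter, Ref):
        return profile_counts[counter.index]
    lhs = evaluate(counter.lhs, profile_counts)
    rhs = evaluate(counter.rhs, profile_counts)
    return lhs + rhs if counter.op == "+" else lhs - rhs


# Hypothetical profile: main ran 5 times and the "if" branch was taken 3 times.
profile_counts = [5, 3]
else_region = Expr("-", Ref(0), Ref(1))  # counter #0 - counter #1
print(evaluate(else_region, profile_counts))
print(evaluate(Zero(), profile_counts))
```

With these made-up counts the *else* region evaluates to 2 without needing a
profile counter of its own, which is the size saving that expression counters
are meant to provide.
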
LLVM IR Representation
======================

The coverage mapping data is stored in the LLVM IR using a single global
constant structure variable called *__llvm_coverage_mapping*
with the *__llvm_covmap* section specifier.

For example, let’s consider a C file and how it gets compiled to LLVM IR:

.. _coverage mapping sample:

.. code-block:: c

  int foo() {
    return 42;
  }
  int bar() {
    return 13;
  }

The coverage mapping variable generated by Clang has 3 fields:

* Coverage mapping header.

* An array of function records.

* Coverage mapping data, which is an array of bytes. Zero padding is added at
  the end to force 8-byte alignment.

.. code-block:: llvm

  @__llvm_coverage_mapping = internal constant { { i32, i32, i32, i32 }, [2 x { i64, i32, i64 }], [40 x i8] }
  {
    { i32, i32, i32, i32 } ; Coverage map header
    {
      i32 2,  ; The number of function records
      i32 20, ; The length of the string that contains the encoded translation unit filenames
      i32 20, ; The length of the string that contains the encoded coverage mapping data
      i32 2,  ; Coverage mapping format version (version 3 is encoded as 2)
    },
    [2 x { i64, i32, i64 }] [ ; Function records
     { i64, i32, i64 } {
       i64 0x5cf8c24cdb18bdac, ; Function's name MD5
       i32 9, ; Function's encoded coverage mapping data string length
       i64 0  ; Function's structural hash
     },
     { i64, i32, i64 } {
       i64 0xe413754a191db537, ; Function's name MD5
       i32 9, ; Function's encoded coverage mapping data string length
       i64 0  ; Function's structural hash
     }],
   [40 x i8] c"..." ; Encoded data (dissected later)
  }, section "__llvm_covmap", align 8

The current version of the format is version 3. The only difference from
version 2 is that a special encoding for column end locations was introduced to
indicate gap regions.

The function record layout has evolved since version 1. In version 1, the
function record for *foo* is defined as follows:

.. code-block:: llvm

  { i8*, i32, i32, i64 } { i8* getelementptr inbounds ([3 x i8]* @__profn_foo, i32 0, i32 0), ; Function's name
    i32 3, ; Function's name length
    i32 9, ; Function's encoded coverage mapping data string length
    i64 0  ; Function's structural hash
  }


Coverage Mapping Header:
------------------------

The coverage mapping header has the following fields:

* The number of function records.

* The length of the string in the third field of *__llvm_coverage_mapping* that
  contains the encoded translation unit filenames.

* The length of the string in the third field of *__llvm_coverage_mapping* that
  contains the encoded coverage mapping data.

* The format version. The current version is 3 (encoded as a 2).

.. _function records:

Function record:
----------------

A function record is a structure of the following type:

.. code-block:: llvm

  { i64, i32, i64 }

It contains the MD5 of the function's name, the length of the encoded mapping
data for that function, and the function's structural hash value.

Encoded data:
-------------

The encoded data is stored in a single string that contains
the encoded filenames used by this translation unit and the encoded coverage
mapping data for each function in this translation unit.

The encoded data has the following structure:

``[filenames, coverageMappingDataForFunctionRecord0, coverageMappingDataForFunctionRecord1, ..., padding]``

If necessary, the encoded data is padded with zeroes so that the size
of the data string is rounded up to the nearest multiple of 8 bytes.

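As a quick check of the padding rule, the sizes from the sample IR can be added
up. This is a small Python sketch; the helper name ``padded_size`` is
hypothetical, while the lengths (20 bytes of filenames and 9 bytes for each of
the two functions) are taken from the sample.

```python
def padded_size(size: int, alignment: int = 8) -> int:
    """Round size up to the nearest multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


filenames_len = 20           # encoded filenames length from the header
function_data_lens = [9, 9]  # per-function lengths from the function records

unpadded = filenames_len + sum(function_data_lens)
total = padded_size(unpadded)
print(unpadded, total, total - unpadded)
```

The 38 bytes of payload are rounded up to 40, which agrees with the
``[40 x i8]`` type of the encoded data in the sample and leaves two bytes of
zero padding.
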
Dissecting the sample:
^^^^^^^^^^^^^^^^^^^^^^

Here's an overview of the encoded data that was stored in the
IR for the `coverage mapping sample`_ that was shown earlier:

* The IR contains the following string constant that represents the encoded
  coverage mapping data for the sample translation unit:

  .. code-block:: llvm

    c"\01\12/Users/alex/test.c\01\00\00\01\01\01\0C\02\02\01\00\00\01\01\04\0C\02\02\00\00"

* The string contains values that are encoded in the LEB128 format, which is
  used throughout for storing integers. It also contains a string value.

* The length of the substring that contains the encoded translation unit
  filenames is the value of the second field in the coverage mapping header,
  which is 20, thus the filenames are encoded in this string:

  .. code-block:: llvm

    c"\01\12/Users/alex/test.c"

  This string contains the following data:

  * Its first byte has a value of ``0x01``. It stores the number of filenames
    contained in this string.
  * Its second byte stores the length of the first filename in this string.
  * The remaining 18 bytes are used to store the first filename.

* The length of the substring that contains the encoded coverage mapping data
  for the first function is the value of the second field in the first
  structure in the array of `function records`_ stored in the
  second field of the *__llvm_coverage_mapping* structure, which is 9.
  Therefore, the coverage mapping for the first function record is encoded
  in this string:

  .. code-block:: llvm

    c"\01\00\00\01\01\01\0C\02\02"

  This string consists of the following bytes:

  +----------+-------------------------------------------------------------------------------------------------------------------------+
  | ``0x01`` | The number of file ids used by this function. There is only one file id used by the mapping data in this function.      |
  +----------+-------------------------------------------------------------------------------------------------------------------------+
  | ``0x00`` | An index into the filenames array which corresponds to the file "/Users/alex/test.c".                                   |
  +----------+-------------------------------------------------------------------------------------------------------------------------+
  | ``0x00`` | The number of counter expressions used by this function. This function doesn't use any expressions.                     |
  +----------+-------------------------------------------------------------------------------------------------------------------------+
  | ``0x01`` | The number of mapping regions that are stored in an array for the function's file id #0.                                |
  +----------+-------------------------------------------------------------------------------------------------------------------------+
  | ``0x01`` | The coverage mapping counter for the first region in this function. The value of 1 tells us that it's a coverage        |
  |          | mapping counter that is a reference to the profile instrumentation counter with an index of 0.                          |
  +----------+-------------------------------------------------------------------------------------------------------------------------+
  | ``0x01`` | The starting line of the first mapping region in this function.                                                         |
  +----------+-------------------------------------------------------------------------------------------------------------------------+
  | ``0x0C`` | The starting column of the first mapping region in this function.                                                       |
  +----------+-------------------------------------------------------------------------------------------------------------------------+
  | ``0x02`` | The ending line of the first mapping region in this function.                                                           |
  +----------+-------------------------------------------------------------------------------------------------------------------------+
  | ``0x02`` | The ending column of the first mapping region in this function.                                                         |
  +----------+-------------------------------------------------------------------------------------------------------------------------+

* The length of the substring that contains the encoded coverage mapping data
  for the second function record is also 9. It's structured like the mapping data
  for the first function record.

* The two trailing bytes are zeroes and are used to pad the coverage mapping
  data to give it 8-byte alignment.

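The same dissection can be reproduced mechanically. The sketch below is
illustrative Python rather than how *llvm-cov* reads the data: it takes the
sample string as a bytes literal and slices it using the lengths from the
header and the function records. It reads the two leading LEB128 values as
single bytes, which only works because every value in this sample is below 128.

```python
# The sample string constant from the IR, written as Python bytes.
data = (
    b"\x01\x12/Users/alex/test.c"            # encoded filenames
    b"\x01\x00\x00\x01\x01\x01\x0c\x02\x02"  # mapping data for the first record
    b"\x01\x00\x00\x01\x01\x04\x0c\x02\x02"  # mapping data for the second record
    b"\x00\x00"                              # zero padding
)

filenames_len = 20           # second field of the coverage mapping header
function_data_lens = [9, 9]  # length field of each function record

offset = 0
filenames = data[offset:offset + filenames_len]
offset += filenames_len

per_function = []
for length in function_data_lens:
    per_function.append(data[offset:offset + length])
    offset += length

padding = data[offset:]

# Single-byte reads are valid here only because both values are below 128.
num_filenames = filenames[0]
name_len = filenames[1]
name = filenames[2:2 + name_len].decode("ascii")

print(num_filenames, name)
print([chunk.hex() for chunk in per_function])
print(padding)
```
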
Encoding
========

The per-function coverage mapping data is encoded as a stream of bytes,
with a simple structure. The structure consists of the encoding
`types <cvmtypes_>`_ like variable-length unsigned integers, that
are used to encode `File ID Mapping`_, `Counter Expressions`_ and
the `Mapping Regions`_.

The format of the structure follows:

``[file id mapping, counter expressions, mapping regions]``

The translation unit filenames are encoded using the same encoding
`types <cvmtypes_>`_ as the per-function coverage mapping data, with the
following structure:

``[numFilenames : LEB128, filename0 : string, filename1 : string, ...]``

.. _cvmtypes:

Types
-----

This section describes the basic types that are used by the encoding format
and can appear after ``:`` in the ``[foo : type]`` description.

.. _LEB128:

LEB128
^^^^^^

LEB128 is an unsigned integer value that is encoded using DWARF's LEB128
encoding, optimizing for the case where values are small
(1 byte for values less than 128).

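For readers unfamiliar with the encoding, unsigned LEB128 stores an integer in
groups of 7 bits, least significant group first, and sets the high bit of every
byte except the last. A minimal Python sketch follows; the function names
``encode_uleb128`` and ``decode_uleb128`` are chosen here for illustration and
are not LLVM APIs.

```python
from typing import Tuple


def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F           # low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)          # last byte: high bit clear
            return bytes(out)


def decode_uleb128(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one unsigned LEB128 value; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


print(encode_uleb128(127).hex())
print(encode_uleb128(128).hex())
print(decode_uleb128(bytes([0xE5, 0x8E, 0x26])))
```

Values below 128 occupy a single byte, which is why nearly every field in the
dissected sample is one byte long.
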
.. _CoverageStrings:

Strings
^^^^^^^

``[length : LEB128, characters...]``

String values are encoded with a `LEB value <LEB128_>`_ for the length
of the string and a sequence of bytes for its characters.

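Combined with the ``[numFilenames : LEB128, filename0 : string, ...]``
structure given earlier, the string type is enough to read the translation
unit's filenames. The Python sketch below uses hypothetical helper names and
assumes the filename bytes can be decoded as UTF-8, which the format
description does not state.

```python
from typing import List, Tuple


def read_uleb128(data: bytes, offset: int) -> Tuple[int, int]:
    """Decode one unsigned LEB128 value; return (value, next_offset)."""
    result = shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte < 0x80:
            return result, offset


def read_string(data: bytes, offset: int) -> Tuple[bytes, int]:
    """Read [length : LEB128, characters...]; return (raw_bytes, next_offset)."""
    length, offset = read_uleb128(data, offset)
    end = offset + length
    if end > len(data):
        raise ValueError("string runs past the end of the data")
    return data[offset:end], end


def read_filenames(data: bytes, offset: int = 0) -> Tuple[List[str], int]:
    """Read [numFilenames : LEB128, filename0 : string, ...]."""
    count, offset = read_uleb128(data, offset)
    names = []
    for _ in range(count):
        raw, offset = read_string(data, offset)
        names.append(raw.decode("utf-8"))  # assumption: names are UTF-8
    return names, offset


print(read_filenames(b"\x01\x12/Users/alex/test.c"))
```
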
.. _file id mapping:

File ID Mapping
---------------

``[numIndices : LEB128, filenameIndex0 : LEB128, filenameIndex1 : LEB128, ...]``

File id mapping in a function's coverage mapping stream
contains the indices into the translation unit's filenames array.

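In other words, position *i* in this list is file id *i*, and the value stored
there selects an entry of the translation unit's filenames array. A short
Python sketch, using hypothetical helper names and the first function's data
from the sample:

```python
from typing import List, Tuple


def read_uleb128(data: bytes, offset: int) -> Tuple[int, int]:
    """Decode one unsigned LEB128 value; return (value, next_offset)."""
    result = shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte < 0x80:
            return result, offset


def read_file_id_mapping(data: bytes, offset: int = 0) -> Tuple[List[int], int]:
    """Read [numIndices : LEB128, filenameIndex0 : LEB128, ...]."""
    count, offset = read_uleb128(data, offset)
    indices = []
    for _ in range(count):
        index, offset = read_uleb128(data, offset)
        indices.append(index)
    return indices, offset


# Filenames array and first function's mapping data, both from the sample.
tu_filenames = ["/Users/alex/test.c"]
function_data = bytes([0x01, 0x00, 0x00, 0x01, 0x01, 0x01, 0x0C, 0x02, 0x02])

indices, next_offset = read_file_id_mapping(function_data)
file_id_to_name = {file_id: tu_filenames[i] for file_id, i in enumerate(indices)}
print(file_id_to_name, next_offset)
```

The returned offset points at the counter expressions, which come next in the
per-function stream.
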
|  | 459 | Counter | 
|  | 460 | ------- | 
|  | 461 |  | 
|  | 462 | ``[value : LEB128]`` | 
|  | 463 |  | 
|  | 464 | A `coverage mapping counter`_ is stored in a single `LEB value <LEB128_>`_. | 
|  | 465 | It is composed of two things --- the `tag <counter-tag_>`_ | 
|  | 466 | which is stored in the lowest 2 bits, and the `counter data`_ which is stored | 
|  | 467 | in the remaining bits. | 
|  | 468 |  | 
|  | 469 | .. _counter-tag: | 
|  | 470 |  | 
|  | 471 | Tag: | 
|  | 472 | ^^^^ | 
|  | 473 |  | 
|  | 474 | The counter's tag encodes the counter's kind | 
|  | 475 | and, if the counter is an expression, the expression's kind. | 
|  | 476 | The possible tag values are: | 
|  | 477 |  | 
|  | 478 | * 0 - The counter is zero. | 
|  | 479 |  | 
|  | 480 | * 1 - The counter is a reference to the profile instrumentation counter. | 
|  | 481 |  | 
|  | 482 | * 2 - The counter is a subtraction expression. | 
|  | 483 |  | 
|  | 484 | * 3 - The counter is an addition expression. | 
|  | 485 |  | 
|  | 486 | .. _counter data: | 
|  | 487 |  | 
|  | 488 | Data: | 
|  | 489 | ^^^^^ | 
|  | 490 |  | 
|  | 491 | The counter's data is interpreted in the following manner: | 
|  | 492 |  | 
|  | 493 | * When the counter is a reference to the profile instrumentation counter, | 
|  | 494 | then the counter's data is the id of the profile counter. | 
|  | 495 | * When the counter is an expression, then the counter's data | 
|  | 496 | is the index into the array of counter expressions. | 
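
The split between tag and data can be sketched in Python as shown below. It
operates on a value that has already been decoded from its LEB128 form. The
names ``CounterTag``, ``Counter``, ``unpack_counter`` and ``pack_counter`` are
chosen for this example only.

.. code-block:: python

    from enum import IntEnum
    from typing import NamedTuple


    class CounterTag(IntEnum):
        ZERO = 0
        COUNTER_REFERENCE = 1
        SUBTRACTION_EXPRESSION = 2
        ADDITION_EXPRESSION = 3


    class Counter(NamedTuple):
        tag: CounterTag
        data: int  # profile counter id, or index into the expression array


    def unpack_counter(value: int) -> Counter:
        """Split an encoded counter into its tag (low 2 bits) and data."""
        return Counter(CounterTag(value & 0x3), value >> 2)


    def pack_counter(tag: CounterTag, data: int) -> int:
        """Inverse of unpack_counter."""
        return (data << 2) | int(tag)

For instance, the encoded value 5 (binary ``101``) has tag 1 and data 1, so it
refers to the profile instrumentation counter with id 1.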
|  | 497 |  | 
|  | 498 | .. _Counter Expressions: | 
|  | 499 |  | 
|  | 500 | Counter Expressions | 
|  | 501 | ------------------- | 
|  | 502 |  | 
|  | 503 | ``[numExpressions : LEB128, expr0LHS : LEB128, expr0RHS : LEB128, expr1LHS : LEB128, expr1RHS : LEB128, ...]`` | 
|  | 504 |  | 
|  | 505 | Each counter expression consists of two counters (its left-hand and | 
|  | 506 | right-hand operands), because it represents a binary arithmetic operation. | 
|  | 507 | The expression's kind is determined from the `tag <counter-tag_>`_ of the | 
|  | 508 | counter that references this expression. | 
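
To show how the referencing counter's tag selects the operation, here is a
small Python sketch that evaluates an encoded counter. It assumes the
expression array has already been decoded into a list of ``(lhs, rhs)`` pairs
of encoded counter values, and that ``profile_counts`` holds the execution
counts indexed by profile counter id. Both names, like ``evaluate_counter``,
are hypothetical.

.. code-block:: python

    from typing import List, Tuple


    def evaluate_counter(counter: int,
                         expressions: List[Tuple[int, int]],
                         profile_counts: List[int]) -> int:
        """Compute the execution count denoted by an encoded counter."""
        tag = counter & 0x3   # lowest 2 bits
        data = counter >> 2   # remaining bits
        if tag == 0:          # the counter is zero
            return 0
        if tag == 1:          # reference to a profile instrumentation counter
            return profile_counts[data]
        # Tags 2 and 3 reference an expression; its operands are counters too.
        lhs, rhs = expressions[data]
        left = evaluate_counter(lhs, expressions, profile_counts)
        right = evaluate_counter(rhs, expressions, profile_counts)
        return left - right if tag == 2 else left + right


    # Expression 0 has operands "profile counter 0" and "profile counter 1".
    expressions = [(0b001, 0b101)]
    profile_counts = [10, 4]

    difference = evaluate_counter(0b010, expressions, profile_counts)  # 10 - 4
    total = evaluate_counter(0b011, expressions, profile_counts)       # 10 + 4

Note that the same expression entry yields a subtraction or an addition
depending only on the tag of the counter that references it.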
|  | 509 |  | 
|  | 510 | .. _Mapping Regions: | 
|  | 511 |  | 
|  | 512 | Mapping Regions | 
|  | 513 | --------------- | 
|  | 514 |  | 
|  | 515 | ``[numRegionArrays : LEB128, regionsForFile0, regionsForFile1, ...]`` | 
|  | 516 |  | 
|  | 517 | The mapping regions are stored in an array of sub-arrays where every | 
|  | 518 | region in a particular sub-array has the same file id. | 
|  | 519 |  | 
|  | 520 | The file id for a sub-array of regions is the index of that | 
|  | 521 | sub-array in the main array. For example, the first sub-array has a file id | 
|  | 522 | of 0. | 
|  | 523 |  | 
|  | 524 | Sub-Array of Regions | 
|  | 525 | ^^^^^^^^^^^^^^^^^^^^ | 
|  | 526 |  | 
|  | 527 | ``[numRegions : LEB128, region0, region1, ...]`` | 
|  | 528 |  | 
|  | 529 | The mapping regions for a specific file id are stored in an array that is | 
|  | 530 | sorted in ascending order by each region's starting location. | 
|  | 531 |  | 
|  | 532 | Mapping Region | 
|  | 533 | ^^^^^^^^^^^^^^ | 
|  | 534 |  | 
|  | 535 | ``[header, source range]`` | 
|  | 536 |  | 
|  | 537 | The mapping region record contains two sub-records --- | 
|  | 538 | the `header`_, which stores the counter and/or the region's kind, | 
|  | 539 | and the `source range`_ that contains the starting and ending | 
|  | 540 | location of this region. | 
|  | 541 |  | 
|  | 542 | .. _header: | 
|  | 543 |  | 
|  | 544 | Header | 
|  | 545 | ^^^^^^ | 
|  | 546 |  | 
|  | 547 | ``[counter]`` | 
|  | 548 |  | 
|  | 549 | or | 
|  | 550 |  | 
|  | 551 | ``[pseudo-counter]`` | 
|  | 552 |  | 
|  | 553 | The header encodes the region's counter and the region's kind. | 
|  | 554 |  | 
|  | 555 | The value of the counter's tag distinguishes between counters and | 
|  | 556 | pseudo-counters: if the tag is zero, then this header contains a | 
|  | 557 | pseudo-counter; otherwise, this header contains an ordinary counter. | 
|  | 558 |  | 
|  | 559 | Counter: | 
|  | 560 | """""""" | 
|  | 561 |  | 
|  | 562 | A mapping region whose header has a counter with a non-zero tag is | 
|  | 563 | a code region. | 
|  | 564 |  | 
|  | 565 | Pseudo-Counter: | 
|  | 566 | """"""""""""""" | 
|  | 567 |  | 
|  | 568 | ``[value : LEB128]`` | 
|  | 569 |  | 
|  | 570 | A pseudo-counter is stored in a single `LEB value <LEB128_>`_, just like | 
|  | 571 | the ordinary counter. It has the following interpretation: | 
|  | 572 |  | 
|  | 573 | * bits 0-1: tag, which is always 0. | 
|  | 574 |  | 
|  | 575 | * bit 2: expansionRegionTag. If this bit is set, then this mapping region | 
|  | 576 | is an expansion region. | 
|  | 577 |  | 
|  | 578 | * remaining bits: data. If this region is an expansion region, then the data | 
|  | 579 | contains the expanded file id of that region. | 
|  | 580 |  | 
|  | 581 | Otherwise, the data contains the region's kind. The possible region | 
|  | 582 | kind values are: | 
|  | 583 |  | 
|  | 584 | * 0 - This mapping region is a code region with a counter of zero. | 
|  | 585 | * 2 - This mapping region is a skipped region. | 
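
The header rules above can be summarized with a short Python sketch that
classifies an already decoded header value. The function name
``classify_header`` and the returned labels are illustrative; only the bit
layout and the two region kind values listed above are taken from this
document.

.. code-block:: python

    from typing import Optional, Tuple


    def classify_header(value: int) -> Tuple[str, Optional[int]]:
        """Return (region kind, payload) for a mapping region header."""
        tag = value & 0x3                  # bits 0-1
        if tag != 0:
            # Ordinary counter: a code region whose count is this counter.
            return ("code", value)
        if value & 0x4:                    # bit 2: expansionRegionTag
            return ("expansion", value >> 3)   # payload: expanded file id
        kind = value >> 3                  # remaining bits: region kind
        if kind == 0:
            return ("code", 0)             # code region with a zero counter
        if kind == 2:
            return ("skipped", None)
        raise ValueError("region kind %d is not covered by this sketch" % kind)

For example, a header value of 16 (kind 2 in the bits above bit 2) describes a
skipped region, and a header value of 12 describes an expansion region whose
expanded file id is 1.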
|  | 586 |  | 
|  | 587 | .. _source range: | 
|  | 588 |  | 
|  | 589 | Source Range | 
|  | 590 | ^^^^^^^^^^^^ | 
|  | 591 |  | 
|  | 592 | ``[deltaLineStart : LEB128, columnStart : LEB128, numLines : LEB128, columnEnd : LEB128]`` | 
|  | 593 |  | 
|  | 594 | The source range record contains the following fields: | 
|  | 595 |  | 
|  | 596 | * *deltaLineStart*: The difference between the starting line of the | 
|  | 597 | current mapping region and the starting line of the previous mapping region. | 
|  | 598 |  | 
|  | 599 | If the current mapping region is the first region in the current | 
|  | 600 | sub-array, then it stores the starting line of that region. | 
|  | 601 |  | 
|  | 602 | * *columnStart*: The starting column of the mapping region. | 
|  | 603 |  | 
|  | 604 | * *numLines*: The difference between the ending line and the starting line | 
|  | 605 | of the current mapping region. | 
|  | 606 |  | 
| Vedant Kumar | ad8f637 | 2017-09-18 23:37:28 +0000 | [diff] [blame] | 607 | * *columnEnd*: The ending column of the mapping region. If the high bit is set, | 
|  | 608 | the current mapping region is a gap area. A count for a gap area is only used | 
|  | 609 | as the line execution count if there are no other regions on a line. |
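
As a sketch of how a reader might turn these fields into absolute positions,
consider the Python code below. The names ``SourceRange`` and
``resolve_source_range`` are hypothetical. The sketch also assumes that
*columnEnd* is treated as a 32-bit value, so that its high bit is bit 31.

.. code-block:: python

    from typing import NamedTuple

    # Assumption: columnEnd is a 32-bit field, so the gap flag is bit 31.
    GAP_BIT = 1 << 31


    class SourceRange(NamedTuple):
        line_start: int
        column_start: int
        line_end: int
        column_end: int
        is_gap: bool


    def resolve_source_range(delta_line_start: int,
                             column_start: int,
                             num_lines: int,
                             column_end: int,
                             previous_line_start: int = 0) -> SourceRange:
        """Convert the delta-encoded fields into absolute line numbers.

        For the first region of a sub-array, leave previous_line_start at 0
        so that deltaLineStart is taken as the starting line itself.
        """
        line_start = previous_line_start + delta_line_start
        is_gap = bool(column_end & GAP_BIT)
        return SourceRange(line_start,
                           column_start,
                           line_start + num_lines,
                           column_end & ~GAP_BIT,  # strip the gap flag
                           is_gap)


    first = resolve_source_range(3, 5, 2, 1)
    second = resolve_source_range(1, 2, 0, GAP_BIT | 10,
                                  previous_line_start=first.line_start)

Here ``first`` spans from line 3, column 5 to line 5, column 1, and ``second``
is a gap area on line 4 that ends at column 10.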