Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 1 | <?xml version="1.0" encoding="ISO-8859-1" ?>
|
| 2 | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
|
| 3 | <html xmlns="http://www.w3.org/1999/xhtml" lang="en">
|
| 4 | <head>
|
| 5 | <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
|
| 6 | <link rel="stylesheet" href=".resources/doc.css" charset="ISO-8859-1" type="text/css" />
|
Marc R. Hoffmann | ae8c2a8 | 2010-10-05 16:20:10 +0000 | [diff] [blame] | 7 | <link rel="stylesheet" href="../coverage/.resources/prettify.css" charset="ISO-8859-1" type="text/css" />
|
Marc R. Hoffmann | d7d2f75 | 2010-05-06 21:12:31 +0000 | [diff] [blame] | 8 | <link rel="shortcut icon" href=".resources/report.gif" type="image/gif" />
|
Marc R. Hoffmann | ae8c2a8 | 2010-10-05 16:20:10 +0000 | [diff] [blame] | 9 | <script type="text/javascript" src="../coverage/.resources/prettify.js"></script>
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 10 | <title>JaCoCo - Control Flow Analysis</title>
|
| 11 | </head>
|
Marc R. Hoffmann | ae8c2a8 | 2010-10-05 16:20:10 +0000 | [diff] [blame] | 12 | <body onload="prettyPrint()">
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 13 |
|
| 14 | <div class="breadcrumb">
|
Marc R. Hoffmann | d7d2f75 | 2010-05-06 21:12:31 +0000 | [diff] [blame] | 15 | <a href="../index.html" class="el_report">JaCoCo</a> >
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 16 | <a href="index.html" class="el_group">Documentation</a> >
|
| 17 | <span class="el_source">Control Flow Analysis</span>
|
| 18 | </div>
|
| 19 | <div id="content">
|
| 20 |
|
| 21 | <h1>Control Flow Analysis for Java Methods</h1>
|
| 22 |
|
| 23 | <p style="font-weight:bold;">
|
| 24 | DRAFT - This document does not reflect the current JaCoCo implementation.
|
| 25 | </p>
|
| 26 |
|
| 27 | <p class="hint">
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 28 | Implementing a coverage tool that supports statement (C0) as well as branch
|
| 29 | coverage coverage (C1) requires detailed analysis of the internal control flow
|
| 30 | of Java methods. Due to the architecture of JaCoCo this analysis happens on
|
| 31 | compiled class files (bytecode). This document defines graph structures for
|
| 32 | control flow analysis of Java bytecode and discusses strategies for probe
|
| 33 | insertion. Marc R. Hoffmann, November 2010
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 34 | </p>
|
| 35 |
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 36 | <h2>Control Flow Graphs for Java Bytecode</h2>
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 37 |
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 38 | <p>
|
| 39 | As an starting point we take the following example method that contains a
|
| 40 | single branching point:
|
Marc R. Hoffmann | ae8c2a8 | 2010-10-05 16:20:10 +0000 | [diff] [blame] | 41 | </p>
|
| 42 |
|
Marc R. Hoffmann | 0fd9c83 | 2011-03-16 20:42:40 +0000 | [diff] [blame] | 43 | <pre class="source lang-java linenums">
|
| 44 | public static void example() {
|
| 45 | a();
|
| 46 | if (cond()) {
|
| 47 | b();
|
| 48 | } else {
|
| 49 | c();
|
| 50 | }
|
| 51 | d();
|
| 52 | }
|
Marc R. Hoffmann | ae8c2a8 | 2010-10-05 16:20:10 +0000 | [diff] [blame] | 53 | </pre>
|
| 54 |
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 55 | <p>
|
| 56 | A Java compiler will create the following bytecode from this example method.
|
| 57 | Java bytecode is a linear sequence of instructions. Control flow is
|
| 58 | implemented with <i>jump</i> instructions like the conditional
|
| 59 | <code>IFEQ</code> or the unconditional <code>GOTO</code> opcode. The jump
|
| 60 | targets are technically relative offsets to the target instruction. For better
|
| 61 | readability we use symbolic labels (<code>L1</code>, <code>L2</code>) instead
|
| 62 | (also the ASM API uses such symbolic labels):
|
| 63 | </p>
|
Marc R. Hoffmann | ae8c2a8 | 2010-10-05 16:20:10 +0000 | [diff] [blame] | 64 |
|
Marc R. Hoffmann | 0fd9c83 | 2011-03-16 20:42:40 +0000 | [diff] [blame] | 65 | <pre class="source linenums">
|
| 66 | public static example()V
|
| 67 | INVOKESTATIC a()V
|
| 68 | INVOKESTATIC cond()Z
|
| 69 | IFEQ L1
|
| 70 | INVOKESTATIC b()V
|
| 71 | GOTO L2
|
| 72 | INVOKESTATIC c()V
|
| 73 | INVOKESTATIC d()V
|
| 74 | RETURN
|
Marc R. Hoffmann | ae8c2a8 | 2010-10-05 16:20:10 +0000 | [diff] [blame] | 75 | </pre>
|
| 76 |
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 77 | <p>
|
| 78 | The possible control flow in the bytecode above can be represented by a graph.
|
| 79 | The nodes are byte code instruction, the edged of the graph represent the
|
| 80 | possible control flow between the instructions:
|
| 81 | </p>
|
| 82 |
|
| 83 | <img src=".resources/flow-1.png" alt="Bytecode contol flow"/>
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 84 |
|
Marc R. Hoffmann | 66234cf | 2010-10-28 15:24:37 +0000 | [diff] [blame] | 85 | <h3>Flow Edges</h3>
|
| 86 |
|
| 87 | <p>
|
| 88 | The control flow graph of a Java method defined by Java byte code may have
|
| 89 | the following Edges. Each edge connects a source instruction with a target
|
| 90 | instruction. In some cases the source instruction or the target instruction
|
| 91 | does not exist (virtual edges for method entry and exit) or cannot be
|
| 92 | exactly specified (exception handlers).
|
| 93 | </p>
|
| 94 |
|
| 95 | <table class="coverage">
|
| 96 | <thead>
|
| 97 | <tr>
|
| 98 | <td>Type</td>
|
| 99 | <td>Source</td>
|
| 100 | <td>Target</td>
|
| 101 | <td>Remarks</td>
|
| 102 | </tr>
|
| 103 | </thead>
|
| 104 | <tbody>
|
| 105 | <tr>
|
| 106 | <td>ENTRY</td>
|
| 107 | <td>-</td>
|
| 108 | <td>First instruction in method</td>
|
| 109 | <td></td>
|
| 110 | </tr>
|
| 111 | <tr>
|
| 112 | <td>SEQUENCE</td>
|
| 113 | <td>Instruction, except <code>GOTO</code>, <code>xRETURN</code>,
|
| 114 | <code>THROW</code>, <code>TABLESWITCH</code> and <code>LOOKUPSWITCH</code></td>
|
| 115 | <td>Subsequent instruction</td>
|
| 116 | <td></td>
|
| 117 | </tr>
|
| 118 | <tr>
|
| 119 | <td>JUMP</td>
|
| 120 | <td><code>GOTO</code>, <code>IFx</code>, <code>TABLESWITCH</code> or
|
| 121 | <code>LOOKUPSWITCH</code> instruction</td>
|
| 122 | <td>Target instruction</td>
|
| 123 | <td><code>TABLESWITCH</code> and <code>LOOKUPSWITCH</code> will define
|
| 124 | multiple edges.</td>
|
| 125 | </tr>
|
| 126 | <tr>
|
| 127 | <td>EXHANDLER</td>
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 128 | <td>Any instruction in handler scope</td>
|
Marc R. Hoffmann | 66234cf | 2010-10-28 15:24:37 +0000 | [diff] [blame] | 129 | <td>Target instruction</td>
|
| 130 | <td></td>
|
| 131 | </tr>
|
| 132 | <tr>
|
| 133 | <td>EXIT</td>
|
| 134 | <td><code>xRETURN</code> or <code>THROW</code> instruction</td>
|
| 135 | <td>-</td>
|
| 136 | <td></td>
|
| 137 | </tr>
|
| 138 | <tr>
|
| 139 | <td>EXEXIT</td>
|
| 140 | <td>Any instruction</td>
|
| 141 | <td>-</td>
|
| 142 | <td>Unhandled exception.</td>
|
| 143 | </tr>
|
| 144 | </tbody>
|
| 145 | </table>
|
| 146 |
|
Marc R. Hoffmann | ecf7450 | 2010-12-11 02:54:22 +0000 | [diff] [blame] | 147 | <p>
|
| 148 | For the first implementation approach we ignore edges caused by exceptions
|
| 149 | and the the method entry. This means we consider SEQUENCE, JUMP and EXIT.
|
| 150 | </p>
|
| 151 |
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 152 | <h2>Probe Insertion Strategy</h2>
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 153 |
|
Marc R. Hoffmann | ecf7450 | 2010-12-11 02:54:22 +0000 | [diff] [blame] | 154 | <p>
|
| 155 | Probes are additional instructions that can be inserted between existing
|
| 156 | instructions. Probes record the fact that they have been executed. One can
|
| 157 | think probes are placed on edges of the control flow graph. Therefore if a
|
| 158 | probe has been executed we know that the corresponding edge has been visited.
|
| 159 | From this edge we can conclude to other preceding nodes and edges:
|
| 160 | </p>
|
| 161 |
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 162 | <ul>
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 163 | <li>If a edge has been visited, we know that the source node of the this edge
|
| 164 | has been executed.</li>
|
| 165 | <li>If a node has been executed and the node is the target of only one edge
|
| 166 | we know that this edge has been visited.</li>
|
| 167 | </ul>
|
| 168 |
|
| 169 | <p>
|
| 170 | With this observations we only need probes at the following edges:
|
| 171 | </p>
|
| 172 |
|
| 173 | <ul>
|
| 174 | <li>At every EXIT.</li>
|
| 175 | <li>At every edge where the target instruction is the target of more than one
|
| 176 | edge.</li>
|
| 177 | </ul>
|
| 178 |
|
| 179 | <p>
|
Marc R. Hoffmann | ecf7450 | 2010-12-11 02:54:22 +0000 | [diff] [blame] | 180 | Given the example method above we see that <code>INVOKE d()</code> is the only
|
| 181 | node with more than one incoming edge. So we need to place probes on those
|
| 182 | edges and another probe on the only exit node:
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 183 | </p>
|
| 184 |
|
Marc R. Hoffmann | ff410a6 | 2010-11-14 20:27:07 +0000 | [diff] [blame] | 185 | <img src=".resources/flow-2.png" alt="Probe positions"/>
|
| 186 |
|
Marc R. Hoffmann | ecf7450 | 2010-12-11 02:54:22 +0000 | [diff] [blame] | 187 | <table class="coverage">
|
| 188 | <thead>
|
| 189 | <tr>
|
| 190 | <td>Type</td>
|
| 191 | <td>Before</td>
|
| 192 | <td>After</td>
|
| 193 | </tr>
|
| 194 | </thead>
|
| 195 | <tbody>
|
| 196 | <tr>
|
| 197 | <td>SEQUENCE</td>
|
| 198 | <td><img src=".resources/flow-3a.png" alt="SEQUENCE"/></td>
|
| 199 | <td><img src=".resources/flow-3b.png" alt="SEQUENCE with Probe"/></td>
|
| 200 | </tr>
|
| 201 | <tr>
|
| 202 | <td>JUMP (unconditional)</td>
|
| 203 | <td><img src=".resources/flow-4a.png" alt="JUMP"/></td>
|
| 204 | <td><img src=".resources/flow-4b.png" alt="JUMP with Probe"/></td>
|
| 205 | </tr>
|
| 206 | <tr>
|
| 207 | <td>JUMP (conditional)</td>
|
| 208 | <td><img src=".resources/flow-5a.png" alt="JUMP"/></td>
|
| 209 | <td><img src=".resources/flow-5b.png" alt="JUMP with Probe"/></td>
|
| 210 | </tr>
|
| 211 | </tbody>
|
| 212 | </table>
|
Marc R. Hoffmann | ff410a6 | 2010-11-14 20:27:07 +0000 | [diff] [blame] | 213 |
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 214 | <h2>Coverage Analysis</h2>
|
| 215 |
|
Marc R. Hoffmann | ecf7450 | 2010-12-11 02:54:22 +0000 | [diff] [blame] | 216 | <p>
|
| 217 | The execution status of all other edges and instructions can be derived from
|
| 218 | the status of this probes by recursively applying the rules above.
|
| 219 | </p>
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 220 |
|
| 221 | <h2>Probe Implementation</h2>
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 222 |
|
| 223 | <p>
|
| 224 | Code coverage analysis is a runtime metric that provides execution details
|
| 225 | of the software under test. This requires detailed recording about the
|
Marc R. Hoffmann | ae8c2a8 | 2010-10-05 16:20:10 +0000 | [diff] [blame] | 226 | instructions (instruction coverage) that have been executed. For branch
|
| 227 | coverage also the outcome of decisions has to be recorded. In any case
|
| 228 | execution data is collected by so called probes:
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 229 | </p>
|
| 230 |
|
| 231 | <p class="hint">
|
Marc R. Hoffmann | 102e837 | 2010-04-27 15:46:23 +0000 | [diff] [blame] | 232 | A <b>probe</b> is a sequence of bytecode instructions that can be inserted
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 233 | into a Java method. When the probe is executed, this fact is recorded and can
|
| 234 | be reported by the coverage runtime.
|
| 235 | </p>
|
| 236 |
|
| 237 | <p>
|
| 238 | The only purpose of the probe is to record that it has been executed at least
|
| 239 | once. The probe does not record the number of times it has been called or
|
| 240 | collect any timing information. The latter is out of scope for code coverage
|
| 241 | analysis and more in the objective of a performance analysis tool. Typically
|
| 242 | multiple probes needs to be inserted into each method, therefore probes needs
|
| 243 | to be identified. Also the probe implementation and the storage mechanism it
|
| 244 | depends on needs to be thread safe as multi-threaded execution is a common
|
| 245 | scenario for java applications (albeit not for plain unit tests). Probes must
|
| 246 | not have any side effects on the original code of the method. Also they should
|
| 247 | add minimal overhead.
|
| 248 | </p>
|
| 249 |
|
| 250 | <p>
|
| 251 | So to summarize the requirements for execution probes:
|
| 252 | </p>
|
| 253 |
|
| 254 | <ul>
|
| 255 | <li>Record execution</li>
|
| 256 | <li>Identification for different probes</li>
|
| 257 | <li>Thread safe</li>
|
| 258 | <li>No side effects on application code</li>
|
| 259 | <li>Minimal runtime overhead</li>
|
| 260 | </ul>
|
| 261 |
|
| 262 | <p>
|
| 263 | JaCoCo implements probes with a <code>boolean[]</code> array instance per
|
| 264 | class. Each probe corresponds to a entry in this array. Whenever the probe is
|
Marc R. Hoffmann | 102e837 | 2010-04-27 15:46:23 +0000 | [diff] [blame] | 265 | executed the entry is set to <code>true</code> with the following four
|
| 266 | bytecode instructions:
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 267 | </p>
|
| 268 |
|
| 269 | <pre class="source">
|
| 270 | ALOAD probearray
|
| 271 | xPUSH probeid
|
| 272 | ICONST_1
|
| 273 | BASTORE
|
| 274 | </pre>
|
| 275 |
|
| 276 | <p>
|
| 277 | Note that this probe code is thread safe, does not modify the operand stack
|
| 278 | or modify local variables and is also thread safe. It does also not leave the
|
| 279 | method though an external call. The only prerequisite is that the probe array
|
| 280 | is available as a local variable. For this at the beginning of each method
|
| 281 | additional instrumentation code needs to be added to obtain the array instance
|
Marc R. Hoffmann | 9772829 | 2010-04-27 20:29:23 +0000 | [diff] [blame] | 282 | associated with the belonging class. To avoid code duplication the
|
| 283 | initialization is delegated to a static private method
|
| 284 | <code>$jacocoinit()</code> which is added to every non-interface class.
|
Marc R. Hoffmann | 102e837 | 2010-04-27 15:46:23 +0000 | [diff] [blame] | 285 | </p>
|
| 286 |
|
| 287 | <p>
|
| 288 | The size of the probe code above depends on the position of the probe array
|
| 289 | variable and the value of the probe identifier as different opcodes can be
|
| 290 | used. As calculated in the table below the overhead per probe ranges between 4
|
| 291 | and 7 bytes of additional bytecode:
|
| 292 | </p>
|
| 293 |
|
| 294 | <table class="coverage">
|
| 295 | <thead>
|
| 296 | <tr>
|
| 297 | <td>Possible Opcodes</td>
|
| 298 | <td>Min. Size [bytes]</td>
|
| 299 | <td>Max. Size [bytes]</td>
|
Marc R. Hoffmann | 102e837 | 2010-04-27 15:46:23 +0000 | [diff] [blame] | 300 | </tr>
|
| 301 | </thead>
|
Marc R. Hoffmann | 102e837 | 2010-04-27 15:46:23 +0000 | [diff] [blame] | 302 | <tfoot>
|
| 303 | <tr>
|
| 304 | <td>Total:</td>
|
| 305 | <td>4</td>
|
| 306 | <td>7</td>
|
Marc R. Hoffmann | 102e837 | 2010-04-27 15:46:23 +0000 | [diff] [blame] | 307 | </tr>
|
| 308 | </tfoot>
|
Marc R. Hoffmann | 9772829 | 2010-04-27 20:29:23 +0000 | [diff] [blame] | 309 | <tbody>
|
| 310 | <tr>
|
| 311 | <td><code>ALOAD_x</code>, <code>ALOAD</code> <sup>1</sup></td>
|
| 312 | <td>1</td>
|
| 313 | <td>2</td>
|
| 314 | </tr>
|
| 315 | <tr>
|
| 316 | <td><code>ICONST_x</code>, <code>BIPUSH</code>, <code>SIPUSH</code>, <code>LDC</code>, <code>LDC_W</code> <sup>2</sup></td>
|
| 317 | <td>1</td>
|
| 318 | <td>3</td>
|
| 319 | </tr>
|
| 320 | <tr>
|
| 321 | <td><code>ICONST_1</code></td>
|
| 322 | <td>1</td>
|
| 323 | <td>1</td>
|
| 324 | </tr>
|
| 325 | <tr>
|
| 326 | <td><code>BASTORE</code></td>
|
| 327 | <td>1</td>
|
| 328 | <td>1</td>
|
| 329 | </tr>
|
| 330 | </tbody>
|
Marc R. Hoffmann | 102e837 | 2010-04-27 15:46:23 +0000 | [diff] [blame] | 331 | </table>
|
| 332 |
|
| 333 | <p>
|
Marc R. Hoffmann | 9772829 | 2010-04-27 20:29:23 +0000 | [diff] [blame] | 334 | <sup>1</sup> The probe array is the first variable after the arguments.
|
| 335 | If the method arguments do not consume more that 3 slots the 1-byte opcode can
|
| 336 | be used.<br/>
|
| 337 | <sup>2</sup> 1-byte opcodes for ids 0 to 5, 2-byte opcode for ids up to 127,
|
| 338 | 3-byte opcode for ids up to 32767. Ids values of 32768 or more require an
|
| 339 | additional constant pool entry. For normal class files it is very unlikely to
|
| 340 | require more than 32,000 probes.
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 341 | </p>
|
| 342 |
|
| 343 | <ul>
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 344 | <li>Limitation: Only proves that the probe itself has been executed,
|
| 345 | assumptions about the surrounding application code is interpolation</li>
|
| 346 | <li>Probe in every edge of the control flow graph</li>
|
| 347 | <li>Every exit path known (branch coverage)</li>
|
| 348 | <li>Block entry known (exceptions within blocks)</li>
|
| 349 | </ul>
|
| 350 |
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 351 | <h2>Refernces</h2>
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 352 |
|
| 353 | <ul>
|
Marc R. Hoffmann | b637242 | 2010-11-14 17:30:00 +0000 | [diff] [blame] | 354 | <li>ASM</li>
|
| 355 | <li><a href="http://andrei.gmxhome.de/bytecode/index.html">Bytecode Outline Plug-In</a> by Andrei Loskutov</li>
|
| 356 | <li><a href="http://en.wikipedia.org/wiki/Glossary_of_graph_theory">Wikipedia: Glossary of Graph Theory</a></li>
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 357 | </ul>
|
| 358 |
|
Marc R. Hoffmann | 096bd1d | 2010-04-27 04:21:39 +0000 | [diff] [blame] | 359 |
|
| 360 | </div>
|
| 361 | <div class="footer">
|
| 362 | <div class="versioninfo"><a href="@jacoco.home.url@">JaCoCo</a> @qualified.bundle.version@</div>
|
| 363 | <a href="license.html">Copyright</a> © @copyright.years@ Mountainminds GmbH & Co. KG and Contributors
|
| 364 | </div>
|
| 365 |
|
| 366 | </body>
|
| 367 | </html> |