blob: f6c430e253ad402ded7968a6300ba0c73593bb04 [file] [log] [blame]
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +00001<?xml version="1.0" encoding="ISO-8859-1" ?>
2<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
3<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
4<head>
5 <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
6 <link rel="stylesheet" href=".resources/doc.css" charset="ISO-8859-1" type="text/css" />
Marc R. Hoffmannae8c2a82010-10-05 16:20:10 +00007 <link rel="stylesheet" href="../coverage/.resources/prettify.css" charset="ISO-8859-1" type="text/css" />
Marc R. Hoffmannd7d2f752010-05-06 21:12:31 +00008 <link rel="shortcut icon" href=".resources/report.gif" type="image/gif" />
Marc R. Hoffmannae8c2a82010-10-05 16:20:10 +00009 <script type="text/javascript" src="../coverage/.resources/prettify.js"></script>
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +000010 <title>JaCoCo - Control Flow Analysis</title>
11</head>
Marc R. Hoffmannae8c2a82010-10-05 16:20:10 +000012<body onload="prettyPrint()">
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +000013
14<div class="breadcrumb">
Marc R. Hoffmannd7d2f752010-05-06 21:12:31 +000015 <a href="../index.html" class="el_report">JaCoCo</a> &gt;
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +000016 <a href="index.html" class="el_group">Documentation</a> &gt;
17 <span class="el_source">Control Flow Analysis</span>
18</div>
19<div id="content">
20
21<h1>Control Flow Analysis for Java Methods</h1>
22
23<p style="font-weight:bold;">
24 DRAFT - This document does not reflect the current JaCoCo implementation.
25</p>
26
27<p class="hint">
Marc R. Hoffmannb6372422010-11-14 17:30:00 +000028 Implementing a coverage tool that supports statement (C0) as well as branch
29 coverage coverage (C1) requires detailed analysis of the internal control flow
30 of Java methods. Due to the architecture of JaCoCo this analysis happens on
31 compiled class files (bytecode). This document defines graph structures for
32 control flow analysis of Java bytecode and discusses strategies for probe
33 insertion. Marc R. Hoffmann, November 2010
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +000034</p>
35
Marc R. Hoffmannb6372422010-11-14 17:30:00 +000036<h2>Control Flow Graphs for Java Bytecode</h2>
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +000037
Marc R. Hoffmannb6372422010-11-14 17:30:00 +000038<p>
39 As an starting point we take the following example method that contains a
40 single branching point:
Marc R. Hoffmannae8c2a82010-10-05 16:20:10 +000041</p>
42
43<pre class="source lang-java">
Marc R. Hoffmannb6372422010-11-14 17:30:00 +000044<span class="nr"> 1</span>public static void example() {
Marc R. Hoffmannae8c2a82010-10-05 16:20:10 +000045<span class="nr"> 2</span> a();
Marc R. Hoffmannb6372422010-11-14 17:30:00 +000046<span class="nr"> 3</span> if (cond()) {
Marc R. Hoffmannae8c2a82010-10-05 16:20:10 +000047<span class="nr"> 4</span> b();
Marc R. Hoffmannb6372422010-11-14 17:30:00 +000048<span class="nr"> 5</span> } else {
49<span class="nr"> 6</span> c();
50<span class="nr"> 7</span> }
51<span class="nr"> 8</span> d();
52<span class="nr"> 9</span>}
Marc R. Hoffmannae8c2a82010-10-05 16:20:10 +000053</pre>
54
Marc R. Hoffmannb6372422010-11-14 17:30:00 +000055<p>
56 A Java compiler will create the following bytecode from this example method.
57 Java bytecode is a linear sequence of instructions. Control flow is
58 implemented with <i>jump</i> instructions like the conditional
59 <code>IFEQ</code> or the unconditional <code>GOTO</code> opcode. The jump
60 targets are technically relative offsets to the target instruction. For better
61 readability we use symbolic labels (<code>L1</code>, <code>L2</code>) instead
62 (also the ASM API uses such symbolic labels):
63</p>
Marc R. Hoffmannae8c2a82010-10-05 16:20:10 +000064
65<pre class="source">
Marc R. Hoffmannb6372422010-11-14 17:30:00 +000066<span class="nr"> </span>public static example()V
67<span class="nr"> </span> INVOKESTATIC a()V
68<span class="nr"> </span> INVOKESTATIC cond()Z
69<span class="nr"> </span> IFEQ L1
70<span class="nr"> </span> INVOKESTATIC b()V
71<span class="nr"> </span> GOTO L2
72<span class="nr"> L1:</span> INVOKESTATIC c()V
73<span class="nr"> L2:</span> INVOKESTATIC d()V
74<span class="nr"> </span> RETURN
Marc R. Hoffmannae8c2a82010-10-05 16:20:10 +000075</pre>
76
Marc R. Hoffmannb6372422010-11-14 17:30:00 +000077<p>
78 The possible control flow in the bytecode above can be represented by a graph.
79 The nodes are byte code instruction, the edged of the graph represent the
80 possible control flow between the instructions:
81</p>
82
83<img src=".resources/flow-1.png" alt="Bytecode contol flow"/>
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +000084
Marc R. Hoffmann66234cf2010-10-28 15:24:37 +000085<h3>Flow Edges</h3>
86
87<p>
88 The control flow graph of a Java method defined by Java byte code may have
89 the following Edges. Each edge connects a source instruction with a target
90 instruction. In some cases the source instruction or the target instruction
91 does not exist (virtual edges for method entry and exit) or cannot be
92 exactly specified (exception handlers).
93</p>
94
95<table class="coverage">
96 <thead>
97 <tr>
98 <td>Type</td>
99 <td>Source</td>
100 <td>Target</td>
101 <td>Remarks</td>
102 </tr>
103 </thead>
104 <tbody>
105 <tr>
106 <td>ENTRY</td>
107 <td>-</td>
108 <td>First instruction in method</td>
109 <td></td>
110 </tr>
111 <tr>
112 <td>SEQUENCE</td>
113 <td>Instruction, except <code>GOTO</code>, <code>xRETURN</code>,
114 <code>THROW</code>, <code>TABLESWITCH</code> and <code>LOOKUPSWITCH</code></td>
115 <td>Subsequent instruction</td>
116 <td></td>
117 </tr>
118 <tr>
119 <td>JUMP</td>
120 <td><code>GOTO</code>, <code>IFx</code>, <code>TABLESWITCH</code> or
121 <code>LOOKUPSWITCH</code> instruction</td>
122 <td>Target instruction</td>
123 <td><code>TABLESWITCH</code> and <code>LOOKUPSWITCH</code> will define
124 multiple edges.</td>
125 </tr>
126 <tr>
127 <td>EXHANDLER</td>
Marc R. Hoffmannb6372422010-11-14 17:30:00 +0000128 <td>Any instruction in handler scope</td>
Marc R. Hoffmann66234cf2010-10-28 15:24:37 +0000129 <td>Target instruction</td>
130 <td></td>
131 </tr>
132 <tr>
133 <td>EXIT</td>
134 <td><code>xRETURN</code> or <code>THROW</code> instruction</td>
135 <td>-</td>
136 <td></td>
137 </tr>
138 <tr>
139 <td>EXEXIT</td>
140 <td>Any instruction</td>
141 <td>-</td>
142 <td>Unhandled exception.</td>
143 </tr>
144 </tbody>
145</table>
146
Marc R. Hoffmannb6372422010-11-14 17:30:00 +0000147<h2>Probe Insertion Strategy</h2>
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +0000148
Marc R. Hoffmannb6372422010-11-14 17:30:00 +0000149<ul>
150 <li>Probes are additional instructions that can be inserted between existing
151 instructions. Probes record the fact that they have been executed. One can
152 think probes are placed on edges of the control flow graph. Therefore if
153 a probe has been executed we know that the corresponding edge has been
154 visited.</li>
155 <li>If a edge has been visited, we know that the source node of the this edge
156 has been executed.</li>
157 <li>If a node has been executed and the node is the target of only one edge
158 we know that this edge has been visited.</li>
159</ul>
160
161<p>
162 With this observations we only need probes at the following edges:
163</p>
164
165<ul>
166 <li>At every EXIT.</li>
167 <li>At every edge where the target instruction is the target of more than one
168 edge.</li>
169</ul>
170
171<p>
172 The execution status of all other edges and instructions can be derived from
173 the status of this probes by recursively applying the rules above.
174</p>
175
176<h2>Coverage Analysis</h2>
177
178
179<h2>Probe Implementation</h2>
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +0000180
181<p>
182 Code coverage analysis is a runtime metric that provides execution details
183 of the software under test. This requires detailed recording about the
Marc R. Hoffmannae8c2a82010-10-05 16:20:10 +0000184 instructions (instruction coverage) that have been executed. For branch
185 coverage also the outcome of decisions has to be recorded. In any case
186 execution data is collected by so called probes:
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +0000187</p>
188
189<p class="hint">
Marc R. Hoffmann102e8372010-04-27 15:46:23 +0000190 A <b>probe</b> is a sequence of bytecode instructions that can be inserted
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +0000191 into a Java method. When the probe is executed, this fact is recorded and can
192 be reported by the coverage runtime.
193</p>
194
195<p>
196 The only purpose of the probe is to record that it has been executed at least
197 once. The probe does not record the number of times it has been called or
198 collect any timing information. The latter is out of scope for code coverage
199 analysis and more in the objective of a performance analysis tool. Typically
200 multiple probes needs to be inserted into each method, therefore probes needs
201 to be identified. Also the probe implementation and the storage mechanism it
202 depends on needs to be thread safe as multi-threaded execution is a common
203 scenario for java applications (albeit not for plain unit tests). Probes must
204 not have any side effects on the original code of the method. Also they should
205 add minimal overhead.
206</p>
207
208<p>
209 So to summarize the requirements for execution probes:
210</p>
211
212<ul>
213 <li>Record execution</li>
214 <li>Identification for different probes</li>
215 <li>Thread safe</li>
216 <li>No side effects on application code</li>
217 <li>Minimal runtime overhead</li>
218</ul>
219
220<p>
221 JaCoCo implements probes with a <code>boolean[]</code> array instance per
222 class. Each probe corresponds to a entry in this array. Whenever the probe is
Marc R. Hoffmann102e8372010-04-27 15:46:23 +0000223 executed the entry is set to <code>true</code> with the following four
224 bytecode instructions:
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +0000225</p>
226
227<pre class="source">
228ALOAD probearray
229xPUSH probeid
230ICONST_1
231BASTORE
232</pre>
233
234<p>
235 Note that this probe code is thread safe, does not modify the operand stack
236 or modify local variables and is also thread safe. It does also not leave the
237 method though an external call. The only prerequisite is that the probe array
238 is available as a local variable. For this at the beginning of each method
239 additional instrumentation code needs to be added to obtain the array instance
Marc R. Hoffmann97728292010-04-27 20:29:23 +0000240 associated with the belonging class. To avoid code duplication the
241 initialization is delegated to a static private method
242 <code>$jacocoinit()</code> which is added to every non-interface class.
Marc R. Hoffmann102e8372010-04-27 15:46:23 +0000243</p>
244
245<p>
246 The size of the probe code above depends on the position of the probe array
247 variable and the value of the probe identifier as different opcodes can be
248 used. As calculated in the table below the overhead per probe ranges between 4
249 and 7 bytes of additional bytecode:
250</p>
251
252<table class="coverage">
253 <thead>
254 <tr>
255 <td>Possible Opcodes</td>
256 <td>Min. Size [bytes]</td>
257 <td>Max. Size [bytes]</td>
Marc R. Hoffmann102e8372010-04-27 15:46:23 +0000258 </tr>
259 </thead>
Marc R. Hoffmann102e8372010-04-27 15:46:23 +0000260 <tfoot>
261 <tr>
262 <td>Total:</td>
263 <td>4</td>
264 <td>7</td>
Marc R. Hoffmann102e8372010-04-27 15:46:23 +0000265 </tr>
266 </tfoot>
Marc R. Hoffmann97728292010-04-27 20:29:23 +0000267 <tbody>
268 <tr>
269 <td><code>ALOAD_x</code>, <code>ALOAD</code> <sup>1</sup></td>
270 <td>1</td>
271 <td>2</td>
272 </tr>
273 <tr>
274 <td><code>ICONST_x</code>, <code>BIPUSH</code>, <code>SIPUSH</code>, <code>LDC</code>, <code>LDC_W</code> <sup>2</sup></td>
275 <td>1</td>
276 <td>3</td>
277 </tr>
278 <tr>
279 <td><code>ICONST_1</code></td>
280 <td>1</td>
281 <td>1</td>
282 </tr>
283 <tr>
284 <td><code>BASTORE</code></td>
285 <td>1</td>
286 <td>1</td>
287 </tr>
288 </tbody>
Marc R. Hoffmann102e8372010-04-27 15:46:23 +0000289</table>
290
291<p>
Marc R. Hoffmann97728292010-04-27 20:29:23 +0000292 <sup>1</sup> The probe array is the first variable after the arguments.
293 If the method arguments do not consume more that 3 slots the 1-byte opcode can
294 be used.<br/>
295 <sup>2</sup> 1-byte opcodes for ids 0 to 5, 2-byte opcode for ids up to 127,
296 3-byte opcode for ids up to 32767. Ids values of 32768 or more require an
297 additional constant pool entry. For normal class files it is very unlikely to
298 require more than 32,000 probes.
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +0000299</p>
300
301<ul>
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +0000302 <li>Limitation: Only proves that the probe itself has been executed,
303 assumptions about the surrounding application code is interpolation</li>
304 <li>Probe in every edge of the control flow graph</li>
305 <li>Every exit path known (branch coverage)</li>
306 <li>Block entry known (exceptions within blocks)</li>
307</ul>
308
Marc R. Hoffmannb6372422010-11-14 17:30:00 +0000309<h2>Refernces</h2>
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +0000310
311<ul>
Marc R. Hoffmannb6372422010-11-14 17:30:00 +0000312 <li>ASM</li>
313 <li><a href="http://andrei.gmxhome.de/bytecode/index.html">Bytecode Outline Plug-In</a> by Andrei Loskutov</li>
314 <li><a href="http://en.wikipedia.org/wiki/Glossary_of_graph_theory">Wikipedia: Glossary of Graph Theory</a></li>
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +0000315</ul>
316
Marc R. Hoffmann096bd1d2010-04-27 04:21:39 +0000317
318</div>
319<div class="footer">
320 <div class="versioninfo"><a href="@jacoco.home.url@">JaCoCo</a> @qualified.bundle.version@</div>
321 <a href="license.html">Copyright</a> &copy; @copyright.years@ Mountainminds GmbH &amp; Co. KG and Contributors
322</div>
323
324</body>
325</html>