Blame - org.jacoco.doc/docroot/doc/flow.html - platform/external/jacoco

blob: b73c070ad17c224acc49f52f96521851870bb2f4 [file] [log] [blame]

Marc R. Hoffmann	e571f3f	2012-05-13 12:18:02 +0000	[diff] [blame]	1	<?xml version="1.0" encoding="UTF-8" ?>
Evgeny Mandrikov	82a92ca	2012-01-15 20:25:48 +0000	[diff] [blame]	2	<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
				3	<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
				4	<head>
Marc R. Hoffmann	e571f3f	2012-05-13 12:18:02 +0000	[diff] [blame]	5	<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
Evgeny Mandrikov	8b21298	2016-06-12 17:55:49 +0200	[diff] [blame]	6	<link rel="stylesheet" href="resources/doc.css" charset="UTF-8" type="text/css" />
Evgeny Mandrikov	28d5985	2016-06-11 12:41:17 +0200	[diff] [blame]	7	<link rel="stylesheet" href="../coverage/jacoco-resources/prettify.css" charset="UTF-8" type="text/css" />
Evgeny Mandrikov	8b21298	2016-06-12 17:55:49 +0200	[diff] [blame]	8	<link rel="shortcut icon" href="resources/report.gif" type="image/gif" />
Evgeny Mandrikov	28d5985	2016-06-11 12:41:17 +0200	[diff] [blame]	9	<script type="text/javascript" src="../coverage/jacoco-resources/prettify.js"></script>
Evgeny Mandrikov	82a92ca	2012-01-15 20:25:48 +0000	[diff] [blame]	10	<title>JaCoCo - Control Flow Analysis</title>
				11	</head>
				12	<body onload="prettyPrint()">
				13
				14	<div class="breadcrumb">
				15	<a href="../index.html" class="el_report">JaCoCo</a> >
				16	<a href="index.html" class="el_group">Documentation</a> >
				17	<span class="el_source">Control Flow Analysis</span>
				18	</div>
				19	<div id="content">
				20
				21	<h1>Control Flow Analysis for Java Methods</h1>
				22
				23	<p class="hint">
				24	Implementing a coverage tool that supports statement (C0) as well as branch
				25	coverage coverage (C1) requires detailed analysis of the internal control flow
				26	of Java methods. Due to the architecture of JaCoCo this analysis happens on
				27	the bytecode of compiled class files. This document describes JaCoCo's
				28	strategies for inserting probes into the control flow at runtime and analyzing
				29	the actual code coverage. Marc R. Hoffmann, November 2011
				30	</p>
				31
				32	<h2>Control Flow Graphs for Java Bytecode</h2>
				33
				34	<p>
				35	As an starting point we take the following example method that contains a
				36	single branching point:
				37	</p>
				38
				39	<pre class="source lang-java linenums">
				40	public static void example() {
				41	a();
				42	if (cond()) {
				43	b();
				44	} else {
				45	c();
				46	}
				47	d();
				48	}
				49	</pre>
				50
				51	<p>
				52	A Java compiler will create the following bytecode from this example method.
				53	Java bytecode is a linear sequence of instructions. Control flow is
				54	implemented with <i>jump</i> instructions like the conditional
				55	<code>IFEQ</code> or the unconditional <code>GOTO</code> opcode. The jump
				56	targets are technically relative offsets to the target instruction. For better
				57	readability we use symbolic labels (<code>L1</code>, <code>L2</code>) instead
				58	(also the ASM API uses such symbolic labels):
				59	</p>
				60
				61	<pre class="source linenums">
				62	public static example()V
				63	INVOKESTATIC a()V
				64	INVOKESTATIC cond()Z
				65	IFEQ L1
				66	INVOKESTATIC b()V
				67	GOTO L2
				68	L1: INVOKESTATIC c()V
				69	L2: INVOKESTATIC d()V
				70	RETURN
				71	</pre>
				72
				73	<p>
				74	The possible control flow in the bytecode above can be represented by a graph.
				75	The nodes are byte code instruction, the edged of the graph represent the
				76	possible control flow between the instructions. The control flow of the
				77	example is shown in the left box of this diagram:
				78	</p>
				79
Evgeny Mandrikov	8b21298	2016-06-12 17:55:49 +0200	[diff] [blame]	80	<img src="resources/flow-example.png" alt="Bytecode Control Flow"/>
Evgeny Mandrikov	82a92ca	2012-01-15 20:25:48 +0000	[diff] [blame]	81
				82
				83	<h3>Flow Edges</h3>
				84
				85	<p>
				86	The control flow graph of a Java method defined by Java byte code may have
				87	the following Edges. Each edge connects a source instruction with a target
				88	instruction. In some cases the source instruction or the target instruction
				89	does not exist (virtual edges for method entry and exit) or cannot be
				90	exactly specified (exception handlers).
				91	</p>
				92
				93	<table class="coverage">
				94	<thead>
				95	<tr>
				96	<td>Type</td>
				97	<td>Source</td>
				98	<td>Target</td>
				99	<td>Remarks</td>
				100	</tr>
				101	</thead>
				102	<tbody>
				103	<tr>
				104	<td>ENTRY</td>
				105	<td>-</td>
				106	<td>First instruction in method</td>
				107	<td></td>
				108	</tr>
				109	<tr>
				110	<td>SEQUENCE</td>
				111	<td>Instruction, except <code>GOTO</code>, <code>xRETURN</code>,
				112	<code>THROW</code>, <code>TABLESWITCH</code> and <code>LOOKUPSWITCH</code></td>
				113	<td>Subsequent instruction</td>
				114	<td></td>
				115	</tr>
				116	<tr>
				117	<td>JUMP</td>
				118	<td><code>GOTO</code>, <code>IFx</code>, <code>TABLESWITCH</code> or
				119	<code>LOOKUPSWITCH</code> instruction</td>
				120	<td>Target instruction</td>
				121	<td><code>TABLESWITCH</code> and <code>LOOKUPSWITCH</code> will define
				122	multiple edges.</td>
				123	</tr>
				124	<tr>
				125	<td>EXHANDLER</td>
				126	<td>Any instruction in handler scope</td>
				127	<td>Target instruction</td>
				128	<td></td>
				129	</tr>
				130	<tr>
				131	<td>EXIT</td>
				132	<td><code>xRETURN</code> or <code>THROW</code> instruction</td>
				133	<td>-</td>
				134	<td></td>
				135	</tr>
				136	<tr>
				137	<td>EXEXIT</td>
				138	<td>Any instruction</td>
				139	<td>-</td>
				140	<td>Unhandled exception.</td>
				141	</tr>
				142	</tbody>
				143	</table>
				144
				145	<p>
				146	The current JaCoCo implementation ignores edges caused by implicit exceptions
				147	and the the method entry. This means we consider SEQUENCE, JUMP, EXIT.
				148	</p>
				149
				150
				151	<h2>Probe Insertion Strategy</h2>
				152
				153	<p>
				154	Probes are additional instructions that can be inserted between existing
				155	instructions. They do not change the behavior of the method but record the
				156	fact that they have been executed. One can think probes are placed on edges of
				157	the control flow graph. Theoretically we could insert a probe at every edge of
				158	the control flow graph. As a probe implementation itself requires multiple
				159	bytecode instructions this would increase the size of the class files several
				160	times and significantly slow down execution speed of the instrumented classes.
				161	Fortunately this is not required, in fact we only need a few probes per method
				162	depending on the control flow of the method. For example a method without any
				163	branches requires a single probe only. The reason for this is that starting
				164	from a certain probe we can back-trace the execution path and typically get
				165	coverage information for multiple instructions.
				166	</p>
				167
				168	<p>
				169	If a probe has been executed we know that the corresponding edge has been
				170	visited. From this edge we can conclude to other preceding nodes and edges:
				171	</p>
				172
				173	<ul>
				174	<li>If a edge has been visited, we know that the source node of the this edge
				175	has been executed.</li>
				176	<li>If a node has been executed and the node is the target of only one edge
				177	we know that this edge has been visited.</li>
				178	</ul>
				179
				180	<p>
				181	Recursively applying these rules allows to determine the execution status of
				182	all instructions of a method – given that we have probes at the right
				183	positions. Therefore JaCoCo inserts probes
				184	</p>
				185
				186	<ul>
				187	<li>at every method exit (return or throws) and</li>
				188	<li>at every edge where the target instruction is the target of more than one
				189	edge.</li>
				190	</ul>
				191
				192	<p>
				193	We recall that a probe is simply a small sequence of additional instructions
				194	that needs to be inserted at a control flow edge. The following table
				195	illustrates how this extra instructions are added in case of different edge
				196	types.
				197	</p>
				198
				199	<table class="coverage">
				200	<thead>
				201	<tr>
				202	<td>Type</td>
				203	<td>Before</td>
				204	<td>After</td>
				205	<td>Remarks</td>
				206	</tr>
				207	</thead>
				208	<tbody>
				209	<tr>
				210	<td>SEQUENCE</td>
Evgeny Mandrikov	8b21298	2016-06-12 17:55:49 +0200	[diff] [blame]	211	<td><img src="resources/flow-sequence.png" alt="Sequence"/></td>
				212	<td><img src="resources/flow-sequence-probe.png" alt="Sequence with Probe"/></td>
Evgeny Mandrikov	82a92ca	2012-01-15 20:25:48 +0000	[diff] [blame]	213	<td>
				214	In case of a simple sequence the probe is simply inserted between the
				215	two instructions.
				216	</td>
				217	</tr>
				218	<tr>
				219	<td>JUMP (unconditional)</td>
Evgeny Mandrikov	8b21298	2016-06-12 17:55:49 +0200	[diff] [blame]	220	<td><img src="resources/flow-goto.png" alt="Unconditional Jump"/></td>
				221	<td><img src="resources/flow-goto-probe.png" alt="Unconditional Jump with Probe"/></td>
Evgeny Mandrikov	82a92ca	2012-01-15 20:25:48 +0000	[diff] [blame]	222	<td>
				223	As an unconditional jump is executed in any case, we can also insert the
				224	probe just before the GOTO instruction.
				225	</td>
				226	</tr>
				227	<tr>
				228	<td>JUMP (conditional)</td>
Evgeny Mandrikov	8b21298	2016-06-12 17:55:49 +0200	[diff] [blame]	229	<td><img src="resources/flow-cond.png" alt="Conditional Jump"/></td>
				230	<td><img src="resources/flow-cond-probe.png" alt="Conditional Jump with Probe"/></td>
Evgeny Mandrikov	82a92ca	2012-01-15 20:25:48 +0000	[diff] [blame]	231	<td>
Marc R. Hoffmann	99f7409	2012-06-21 18:46:52 +0000	[diff] [blame]	232	Adding a probe to an conditional jump is little bit more tricky. We
				233	invert the semantic of the opcode and add the probe right after the
				234	conditional jump. With a subsequent <code>GOTO</code> instruction we
				235	jump to the original target. Note that this approach will not introduce
				236	a backward jump, which would cause trouble with the Java verifier if we
				237	have an uninitialized object on the stack.
Evgeny Mandrikov	82a92ca	2012-01-15 20:25:48 +0000	[diff] [blame]	238	</td>
				239	</tr>
				240	<tr>
				241	<td>EXIT</td>
Evgeny Mandrikov	8b21298	2016-06-12 17:55:49 +0200	[diff] [blame]	242	<td><img src="resources/flow-exit.png" alt="Exit"/></td>
				243	<td><img src="resources/flow-exit-probe.png" alt="Exit with Probe"/></td>
Evgeny Mandrikov	82a92ca	2012-01-15 20:25:48 +0000	[diff] [blame]	244	<td>
				245	As is is the nature of RETURN and THROW statements to actually leave the
				246	method we add the probe right before these statements.
				247	</td>
				248	</tr>
				249	</tbody>
				250	</table>
				251
				252	<p>
				253	Now let's see how this rules apply to the example snippet above. We see that
				254	<code>INVOKE d()</code> instruction is the only node with more than one
				255	incoming edge. So we need to place probes on those edges and another probe on
				256	the only exit node. The result is shown the the right box of the diagram
				257	above.
				258	</p>
				259
Marc R. Hoffmann	8731145	2015-02-16 09:09:25 +0100	[diff] [blame]	260	<h2>Additional Probes Between Lines</h2>
				261
				262	<p>
				263	The probe insertion strategy described so far does not consider implicit
				264	exceptions thrown for example from invoked methods. If the control flow
				265	between two probes is interrupted by a exception not explicitly created with
				266	a <code>throw</code> statement all instruction in between are considered as
				267	not covered. This leads to unexpected results especially when the the block of
				268	instructions spans multiple lines of source code.
				269	</p>
				270
				271	<p>
				272	Therefore JaCoCo adds an additional probe between the instructions of two
				273	lines whenever the subsequent line contains at least one method invocation.
				274	This limits the effect of implicit exceptions from method invocations to
				275	single lines of source. The approach only works for class files compiled with
				276	debug information (line numbers) and does not consider implicit exceptions
				277	from other instructions than method invocations (e.g.
				278	<code>NullPointerException</code> or <code>ArrayIndexOutOfBoundsException</code>).
				279	</p>
				280
Evgeny Mandrikov	82a92ca	2012-01-15 20:25:48 +0000	[diff] [blame]	281	<h2>Probe Implementation</h2>
				282
				283	<p>
				284	Code coverage analysis is a runtime metric that provides execution details
				285	of the software under test. This requires detailed recording about the
				286	instructions (instruction coverage) that have been executed. For branch
				287	coverage also the outcome of decisions has to be recorded. In any case
				288	execution data is collected by so called probes:
				289	</p>
				290
				291	<p class="hint">
				292	A <b>probe</b> is a sequence of bytecode instructions that can be inserted
				293	into a Java method. When the probe is executed, this fact is recorded and can
				294	be reported by the coverage runtime. The probe must not change the behavior
				295	of the original code.
				296	</p>
				297
				298	<p>
				299	The only purpose of the probe is to record that it has been executed at least
				300	once. The probe does not record the number of times it has been called or
				301	collect any timing information. The latter is out of scope for code coverage
				302	analysis and more in the objective of a performance analysis tool. Typically
				303	multiple probes needs to be inserted into each method, therefore probes needs
				304	to be identified. Also the probe implementation and the storage mechanism it
				305	depends on needs to be thread safe as multi-threaded execution is a common
				306	scenario for java applications (albeit not for plain unit tests). Probes must
				307	not have any side effects on the original code of the method. Also they should
				308	add minimal overhead.
				309	</p>
				310
				311	<p>
				312	So to summarize the requirements for execution probes:
				313	</p>
				314
				315	<ul>
				316	<li>Record execution</li>
				317	<li>Identification for different probes</li>
				318	<li>Thread safe</li>
				319	<li>No side effects on application code</li>
				320	<li>Minimal runtime overhead</li>
				321	</ul>
				322
				323	<p>
				324	JaCoCo implements probes with a <code>boolean[]</code> array instance per
				325	class. Each probe corresponds to a entry in this array. Whenever the probe is
				326	executed the entry is set to <code>true</code> with the following four
				327	bytecode instructions:
				328	</p>
				329
				330	<pre class="source">
				331	ALOAD probearray
				332	xPUSH probeid
				333	ICONST_1
				334	BASTORE
				335	</pre>
				336
				337	<p>
				338	Note that this probe code is thread safe, does not modify the operand stack
				339	or modify local variables and is also thread safe. It does also not leave the
				340	method though an external call. The only prerequisite is that the probe array
				341	is available as a local variable. For this at the beginning of each method
				342	additional instrumentation code needs to be added to obtain the array instance
				343	associated with the belonging class. To avoid code duplication the
				344	initialization is delegated to a static private method
				345	<code>$jacocoinit()</code> which is added to every non-interface class.
				346	</p>
				347
				348	<p>
				349	The size of the probe code above depends on the position of the probe array
				350	variable and the value of the probe identifier as different opcodes can be
				351	used. As calculated in the table below the overhead per probe ranges between 4
				352	and 7 bytes of additional bytecode:
				353	</p>
				354
				355	<table class="coverage">
				356	<thead>
				357	<tr>
				358	<td>Possible Opcodes</td>
				359	<td>Min. Size [bytes]</td>
				360	<td>Max. Size [bytes]</td>
				361	</tr>
				362	</thead>
				363	<tfoot>
				364	<tr>
				365	<td>Total:</td>
				366	<td>4</td>
				367	<td>7</td>
				368	</tr>
				369	</tfoot>
				370	<tbody>
				371	<tr>
				372	<td><code>ALOAD_x</code>, <code>ALOAD</code> <sup>1</sup></td>
				373	<td>1</td>
				374	<td>2</td>
				375	</tr>
				376	<tr>
				377	<td><code>ICONST_x</code>, <code>BIPUSH</code>, <code>SIPUSH</code>, <code>LDC</code>, <code>LDC_W</code> <sup>2</sup></td>
				378	<td>1</td>
				379	<td>3</td>
				380	</tr>
				381	<tr>
				382	<td><code>ICONST_1</code></td>
				383	<td>1</td>
				384	<td>1</td>
				385	</tr>
				386	<tr>
				387	<td><code>BASTORE</code></td>
				388	<td>1</td>
				389	<td>1</td>
				390	</tr>
				391	</tbody>
				392	</table>
				393
				394	<p>
				395	<sup>1</sup> The probe array is the first variable after the arguments.
				396	If the method arguments do not consume more that 3 slots the 1-byte opcode can
				397	be used.<br/>
				398	<sup>2</sup> 1-byte opcodes for ids 0 to 5, 2-byte opcode for ids up to 127,
				399	3-byte opcode for ids up to 32767. Ids values of 32768 or more require an
				400	additional constant pool entry. For normal class files it is very unlikely to
				401	require more than 32,000 probes.
				402	</p>
				403
				404	<h2>Performance</h2>
				405
				406	<p>
				407	The control flow analysis and probe insertion strategy described in this
				408	document allows to efficiently record instruction and branch coverage. In
				409	total classes instrumented with JaCoCo increase their size by about 30%. Due
				410	to the fact that probe execution does not require any method calls, only local
				411	instructions, the observed execution time overhead for instrumented
				412	applications typically is less than 10%.
				413	</p>
				414
				415	<h2>References</h2>
				416
				417	<ul>
				418	<li><a href="http://asm.objectweb.org/">ASM byte code library</a> by Eric Bruneton at al.</li>
				419	<li><a href="http://andrei.gmxhome.de/bytecode/index.html">Bytecode Outline Plug-In</a> by Andrei Loskutov</li>
				420	<li><a href="http://en.wikipedia.org/wiki/Glossary_of_graph_theory">Wikipedia: Glossary of Graph Theory</a></li>
				421	</ul>
				422
				423	</div>
				424	<div class="footer">
				425	<div class="right"><a href="@jacoco.home.url@">JaCoCo</a> @qualified.bundle.version@</div>
				426	<a href="license.html">Copyright</a> © @copyright.years@ Mountainminds GmbH & Co. KG and Contributors
				427	</div>
				428
				429	</body>
				430	</html>