<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
  <link rel="stylesheet" href=".resources/doc.css" charset="ISO-8859-1" type="text/css" />
  <link rel="stylesheet" href="../coverage/.resources/prettify.css" charset="ISO-8859-1" type="text/css" />
  <script type="text/javascript" src="../coverage/.resources/prettify.js"></script>
  <title>JaCoCo - Implementation Design</title>
</head>
<body onload="prettyPrint()">

<div class="breadcrumb">
  <a href="../index.html" class="el_session">JaCoCo</a> &gt;
  <a href="index.html" class="el_group">Documentation</a> &gt;
  <span class="el_source">Implementation Design</span>
</div>
<div id="content">

<h1>Implementation Design</h1>

<p>
  This is an unordered list of implementation design decisions. Each topic tries
  to follow this structure:
</p>

<ul>
  <li>Problem statement</li>
  <li>Proposed Solution</li>
  <li>Alternatives and Discussion</li>
</ul>


<h2>Coverage Analysis Mechanism</h2>

<p class="intro">
  Coverage information has to be collected at runtime. For this purpose JaCoCo
  creates instrumented versions of the original class definitions. The
  instrumentation process happens on-the-fly during class loading using
  so-called Java agents.
</p>

<p>
  There are several different approaches to collecting coverage information, and
  for each approach different implementation techniques are known. The following
  overview lists them; the techniques used by JaCoCo are highlighted and marked
  with an asterisk:
</p>

<ul>
  <li>Runtime Profiling
    <ul>
      <li>Java Virtual Machine Profiler Interface (JVMPI), until Java 1.4</li>
      <li>Java Virtual Machine Tool Interface (JVMTI), since Java 1.5</li>
    </ul>
  </li>
  <li><span class="high">Instrumentation*</span>
    <ul>
      <li>Java Source Instrumentation</li>
      <li><span class="high">Byte Code Instrumentation*</span>
        <ul>
          <li>Offline
            <ul>
              <li>Replace Original Classes In-Place</li>
              <li>Inject Instrumented Classes into the Class Path</li>
            </ul>
          </li>
          <li><span class="high">On-The-Fly*</span>
            <ul>
              <li>Special Classloader Implementations or Framework Specific Hooks</li>
              <li><span class="high">Java Agent*</span></li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<p>
  Byte code instrumentation is very fast, can be implemented in pure Java and
  works with every Java VM. On-the-fly instrumentation with the Java agent
  hook can be added to the JVM without any modification of the target
  application.
</p>
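<p>
  The Java agent hook mentioned above can be sketched as follows. This is a
  minimal illustration based on the standard
  <code>java.lang.instrument</code> API, not JaCoCo's actual agent: the class
  names <code>SketchAgent</code> and <code>SketchTransformer</code>, the
  filter on <code>java/</code> classes and the placeholder
  <code>instrument()</code> method are assumptions made for this example.
</p>

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical agent class. A real agent class would be public and named in
// the Premain-Class attribute of the agent JAR manifest; it is package-private
// here only to keep the sketch independent of the file name.
class SketchAgent {

    // Called by the JVM before main() when started with -javaagent:<jar>.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new SketchTransformer());
    }

    static class SketchTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
                byte[] classfileBuffer) {
            // Assumption for this sketch: leave JRE classes untouched.
            if (className == null || className.startsWith("java/")) {
                return null; // null tells the JVM to keep the original bytes
            }
            return instrument(classfileBuffer);
        }

        // Placeholder: a real agent would insert probes into the byte code here.
        static byte[] instrument(byte[] original) {
            return original.clone();
        }
    }
}
```

<p>
  Because the transformer sees every class definition as it is loaded, the
  target application itself does not need to be rebuilt or repackaged.
</p>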

<p>
  The Java agent hook requires a Java 1.5 or later JVM. Class files compiled
  with debug information (line numbers) allow for source code highlighting.
  Unfortunately, some Java language constructs are compiled to byte code that
  produces unexpected highlighting results, especially implicitly generated
  code such as default constructors or the control structures for finally
  blocks.
</p>


<h2>Instrumentation Approach</h2>

<p class="intro">
  Instrumentation means inserting probes at certain check points in the Java
  byte code. A probe is a generated piece of byte code that records the fact
  that it has been executed. JaCoCo inserts probes at the end of every basic
  block.
</p>

<p>
  A basic block is a piece of byte code that has a single entry point (the first
  byte code instruction) and a single exit point (like <code>jump</code>,
  <code>throw</code> or <code>return</code>). A basic block must not contain jump
  targets except the entry point. One can think of basic blocks as the nodes in
  a control flow graph of a method. Using basic block boundaries to insert code
  coverage probes has been used very successfully by
  <a href="http://emma.sourceforge.net/">EMMA</a>.
</p>
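<p>
  As a rough source-level analogy, the effect of basic block probes might look
  like the sketch below. JaCoCo works on byte code, not on source code, so the
  class <code>ProbeSketch</code>, the static <code>probes</code> array and the
  exact probe positions are assumptions chosen only to illustrate the idea.
</p>

```java
class ProbeSketch {

    // One flag per basic block; hypothetical storage for this sketch only.
    static final boolean[] probes = new boolean[3];

    static int abs(int x) {
        probes[0] = true;     // end of block 0, just before the conditional jump
        if (x < 0) {
            x = -x;
            probes[1] = true; // end of block 1, the negation branch
        }
        probes[2] = true;     // end of block 2, just before the return
        return x;
    }
}
```

<p>
  Calling <code>abs()</code> only with non-negative values leaves
  <code>probes[1]</code> unset, which is exactly the information a coverage
  report needs to mark the negation as not executed.
</p>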

<p>
  Basic block instrumentation works regardless of whether the class files have been
  compiled with debug information for source lines. Source code highlighting
  will of course not be possible without this debug information, but percentages
  on method level can still be calculated. Basic block probes result in
  reasonable overhead regarding class file size and performance. Partial line
  coverage can occur if a line contains more than one statement or a statement
  gets compiled into byte code forming more than one basic block (e.g. boolean
  assignments). Calculating basic blocks relies on the Java byte code only;
  therefore JaCoCo is independent of the source language and should also work
  with other Java VM based languages like
  <a href="http://www.scala-lang.org/">Scala</a>.
</p>

<p>
  The major drawback of this approach is that basic blocks are actually much
  smaller in the Java VM: nearly every byte code instruction (especially
  method invocations) can result in an exception. In this case the block is
  left somewhere in the middle without hitting the probe, which leads to
  unexpected results, for example in negative tests. A possible solution would
  be to add exception handlers that trigger special probes.
</p>
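<p>
  The effect of an exception on a basic block probe can be illustrated with
  the following source-level sketch. The class
  <code>ExceptionProbeSketch</code> and its single probe are hypothetical and
  only mimic what the instrumented byte code would do.
</p>

```java
class ExceptionProbeSketch {

    // Single probe for the only basic block of parseAndDouble().
    static final boolean[] probes = new boolean[1];

    static int parseAndDouble(String s) {
        int value = Integer.parseInt(s); // may throw NumberFormatException
        int result = value * 2;
        probes[0] = true;                // end of block, skipped if parseInt throws
        return result;
    }
}
```

<p>
  A negative test that passes an invalid string executes the first
  instructions of the block, yet the probe stays unset, so the method would be
  reported as not covered.
</p>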

<h2>Coverage Agent Isolation</h2>

<p class="intro">
  The Java agent is loaded by the application class loader. Therefore the
  classes of the agent live in the same name space as the application classes,
  which can result in clashes, especially with the third-party library ASM. The
  JaCoCo build therefore moves all agent classes into a unique package.
</p>

<p>
  The JaCoCo build renames all classes contained in
  <code>jacocoagent.jar</code> into classes with an
  <code>org.jacoco.agent.rt_&lt;randomid&gt;</code> prefix, including the
  required ASM library classes. The identifier is created from a random number.
  As the agent does not provide any API, no one should be affected by this
  renaming. This trick also allows JaCoCo's own tests to be verified with
  JaCoCo.
</p>


<h2>Minimal Java Version</h2>

<p class="intro">
  JaCoCo requires Java 1.5.
</p>

<p>
  The Java agent mechanism used for on-the-fly instrumentation became available
  with Java 1.5 VMs. Coding and testing at the Java 1.5 language level is more
  efficient, less error-prone, and more fun than with older versions.
  JaCoCo can still be run against Java code compiled for these older versions.
</p>


<h2>Byte Code Manipulation</h2>

<p class="intro">
  Instrumentation requires mechanisms to modify and generate Java byte code.
  JaCoCo uses the ASM library for this purpose internally.
</p>

<p>
  Implementing the Java byte code specification would be an extensive and
  error-prone task. Therefore an existing library should be used. The
  <a href="http://asm.objectweb.org/">ASM</a> library is lightweight, easy to
  use and very efficient in terms of memory and CPU usage. It is actively
  maintained and includes a huge regression test suite. Its simplified BSD
  license is approved by the Eclipse Foundation for usage with EPL products.
</p>

<h2>Java Class Identity</h2>

<p class="intro">
  Each class loaded at runtime needs a unique identity to associate coverage data with.
  JaCoCo creates such identities from a CRC64 hash code of the raw class definition.
</p>

<p>
  In multi-classloader environments the plain name of a class does not
  unambiguously identify a class. For example, OSGi allows different
  versions of the same class to be loaded within the same VM. In complex
  deployment scenarios the actual version of the test target might be different
  from the current development version. A code coverage report should guarantee
  that the presented figures are extracted from a valid test target. A hash code
  of the class definition makes it possible to differentiate between classes
  and versions of classes. The CRC64 hash computation is simple and fast,
  resulting in a small 64-bit identifier.
</p>
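<p>
  A table-driven CRC64 over the raw class bytes could be sketched as shown
  below. The text above does not state which CRC64 variant is used, so the
  polynomial (the reflected form of x<sup>64</sup> + x<sup>4</sup> +
  x<sup>3</sup> + x + 1), the zero initial value and the class name
  <code>Crc64Sketch</code> are assumptions for illustration only.
</p>

```java
class Crc64Sketch {

    // Assumed polynomial for this sketch: reflected x^64 + x^4 + x^3 + x + 1.
    private static final long POLY = 0xD800000000000000L;

    private static final long[] TABLE = new long[256];

    static {
        // Precompute the CRC of every possible byte value.
        for (int i = 0; i < 256; i++) {
            long v = i;
            for (int bit = 0; bit < 8; bit++) {
                v = (v & 1) == 1 ? (v >>> 1) ^ POLY : v >>> 1;
            }
            TABLE[i] = v;
        }
    }

    // Returns a 64-bit identifier for the given raw class definition.
    static long checksum(byte[] classDefinition) {
        long sum = 0;
        for (byte b : classDefinition) {
            int index = ((int) sum ^ b) & 0xFF;
            sum = (sum >>> 8) ^ TABLE[index];
        }
        return sum;
    }
}
```

<p>
  Two byte-identical class definitions always map to the same identifier,
  independent of the class loader that loads them, while recompiled or
  otherwise modified definitions will almost always map to a different one.
</p>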

<p>
  The same class definition might be loaded by different class loaders, which
  results in different classes for the Java runtime system. For coverage
  analysis this distinction should be irrelevant. Class definitions might be
  altered by other instrumentation based technologies (e.g. AspectJ). In this
  case the hash code will change and the identity gets lost. On the other hand,
  code coverage analysis based on classes that have been altered in some way
  will produce unexpected results. The CRC64 code might produce so-called
  <i>collisions</i>, i.e. the same hash code for two different classes.
  Although CRC64 is not cryptographically strong and collision examples can be
  easily computed, for regular class files the collision probability is very
  low.
</p>

<h2>Coverage Runtime Dependency</h2>

<p class="intro">
  Instrumented code typically gets a dependency on a coverage runtime which is
  responsible for collecting and storing execution data. JaCoCo uses JRE types
  only in generated instrumentation code.
</p>

<p>
  Making a runtime library available to all instrumented classes can be a
  painful or impossible task in frameworks that use their own class loading
  mechanisms. Since Java 1.6 <code>java.lang.instrument.Instrumentation</code>
  has an API to extend the bootstrap loader. As our minimum target is Java 1.5,
  JaCoCo decouples the instrumented classes and the coverage runtime through
  official JRE API types only. The instrumented classes communicate with the
  runtime through the <code>Object.equals(Object)</code> method. An instrumented
  class can retrieve its probe array instance with the following code. Note
  that only JRE APIs are used:
</p>


<pre class="source lang-java">
<span class="nr">  1</span>Object access = ...                          // Retrieve instance
<span class="nr">  2</span>
<span class="nr">  3</span>Object[] args = new Object[3];
<span class="nr">  4</span>args[0] = Long.valueOf(8060044182221863588L); // class id
<span class="nr">  5</span>args[1] = "com/example/MyClass";              // class name
<span class="nr">  6</span>args[2] = Integer.valueOf(24);                // probe count
<span class="nr">  7</span>
<span class="nr">  8</span>access.equals(args);
<span class="nr">  9</span>
<span class="nr"> 10</span>boolean[] probes = (boolean[]) args[0];
</pre>

<p>
  The trickiest part takes place in line 1 and is not shown in the snippet
  above. The object instance providing access to the coverage runtime through
  its <code>equals()</code> method has to be obtained. Different approaches have
  been implemented and tested so far:
</p>

<ul>
  <li><b><code>SystemPropertiesRuntime</code></b>: This approach stores the
    object instance under a system property. This solution breaks the contract
    that system properties must only contain <code>java.lang.String</code>
    values and therefore causes trouble in applications that rely on this
    definition (e.g. Ant).</li>
  <li><b><code>LoggerRuntime</code></b>: Here we use a shared
    <code>java.util.logging.Logger</code> and communicate through the logging
    parameter array instead of an <code>equals()</code> method. The coverage
    runtime registers a custom <code>Handler</code> to receive the parameter
    array. This approach might break environments that install their own log
    managers (e.g. Glassfish).</li>
  <li><b><code>ModifiedSystemClassRuntime</code></b>: This approach adds a
    public static field to an existing JRE class through instrumentation. Unlike
    the other methods above, this is only possible in environments where a Java
    agent is active.</li>
</ul>

<p>
  The current JaCoCo Java agent implementation uses the
  <code>ModifiedSystemClassRuntime</code>, adding a field to the class
  <code>java.sql.Types</code>.
</p>
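<p>
  To make the <code>equals()</code> protocol more concrete, the following
  sketch shows what the runtime side could look like when the access object is
  published through a system property, as in the
  <code>SystemPropertiesRuntime</code> approach. The class names
  <code>RuntimeAccessSketch</code> and <code>Access</code>, the property key
  and the map-based probe store are assumptions for this example and do not
  reproduce JaCoCo's implementation.
</p>

```java
import java.util.HashMap;
import java.util.Map;

class RuntimeAccessSketch {

    // Hypothetical property key under which the access object is published.
    static final String KEY = "sketch.coverage.runtime.access";

    // Runtime-side object: its equals() method is used as a data channel.
    static class Access {
        private final Map<Long, boolean[]> store = new HashMap<Long, boolean[]>();

        @Override
        public synchronized boolean equals(Object obj) {
            if (obj instanceof Object[]) {
                Object[] args = (Object[]) obj;
                Long classId = (Long) args[0];
                int probeCount = (Integer) args[2];
                boolean[] probes = store.get(classId);
                if (probes == null) {
                    probes = new boolean[probeCount];
                    store.put(classId, probes);
                }
                args[0] = probes; // hand the probe array back to the caller
            }
            return super.equals(obj);
        }
    }

    // Runtime startup: note that a non-String value is stored on purpose.
    static void startup() {
        System.getProperties().put(KEY, new Access());
    }

    // What an instrumented class would do, using JRE types only.
    static boolean[] getProbes(long classId, String className, int probeCount) {
        Object access = System.getProperties().get(KEY);
        Object[] args = { Long.valueOf(classId), className, Integer.valueOf(probeCount) };
        access.equals(args);
        return (boolean[]) args[0];
    }
}
```

<p>
  The instrumented side never references a type of the coverage runtime: it
  only sees <code>Object</code>, <code>Object[]</code> and
  <code>boolean[]</code>, so no additional library has to be visible to its
  class loader.
</p>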


<h2>Memory Usage</h2>

<p class="intro">
  Coverage analysis for huge projects with several thousand classes or several
  hundred thousand lines of code should be possible. To allow this with
  reasonable memory usage, the coverage analysis is based on streaming patterns
  and "depth first" traversals.
</p>

<p>
  The complete data tree of a huge coverage report is too big to fit into a
  reasonable heap memory configuration. Therefore the coverage analysis and
  report generation are implemented as "depth first" traversals, which means
  that at any point in time only the following data has to be held in working
  memory:
</p>

<ul>
  <li>A single class which is currently processed.</li>
  <li>The summary information of all parents of this class (package, groups).</li>
</ul>
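<p>
  The idea of keeping only one class plus the parent summaries in memory can
  be sketched as follows. The types <code>DepthFirstSketch</code> and
  <code>Counter</code>, and the representation of a class as a plain probe
  array, are simplifications assumed for this example.
</p>

```java
class DepthFirstSketch {

    // Summary information that survives after a class has been processed.
    static final class Counter {
        int covered;
        int total;

        void add(Counter other) {
            covered += other.covered;
            total += other.total;
        }
    }

    // Detailed analysis of a single class; its input can be discarded afterwards.
    static Counter analyzeClass(boolean[] probes) {
        Counter counter = new Counter();
        for (boolean hit : probes) {
            counter.total++;
            if (hit) {
                counter.covered++;
            }
        }
        return counter;
    }

    // Streams over the classes of a package and keeps only the running summary.
    static Counter analyzePackage(Iterable<boolean[]> classes) {
        Counter packageSummary = new Counter();
        for (boolean[] probes : classes) {
            packageSummary.add(analyzeClass(probes));
        }
        return packageSummary;
    }
}
```

<p>
  Because <code>analyzePackage()</code> accepts any <code>Iterable</code>, the
  class data can be produced lazily, for example while reading class files one
  at a time.
</p>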

<h2>Java Element Identifiers</h2>

<p class="intro">
  The Java language and the Java VM use different String representation formats
  for Java elements. For example while a type reference in Java reads like
  <code>java.lang.Object</code>, the VM references the same type as
  <code>Ljava/lang/Object;</code>. The JaCoCo API is based on VM identifiers only.
</p>

<p>
  Using VM identifiers directly does not cause any transformation overhead at
  runtime. There are several programming languages based on the Java VM that
  might use different notations. Specific transformations should therefore only
  happen at the user interface level, for example during report generation.
</p>
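<p>
  A transformation at the user interface level might look like the following
  sketch for the Java language. The helper class <code>VmNames</code> and its
  methods are hypothetical; the sketch handles only plain class names and
  object type descriptors and leaves nested class separators
  (<code>$</code>) untouched.
</p>

```java
class VmNames {

    // Converts a VM class name such as "java/lang/Object" to "java.lang.Object".
    static String toJavaName(String vmName) {
        return vmName.replace('/', '.');
    }

    // Converts an object type descriptor such as "Ljava/lang/Object;"
    // to "java.lang.Object". Other descriptor kinds are rejected.
    static String descriptorToJavaName(String descriptor) {
        if (descriptor.length() > 2 && descriptor.startsWith("L") && descriptor.endsWith(";")) {
            return toJavaName(descriptor.substring(1, descriptor.length() - 1));
        }
        throw new IllegalArgumentException("Not an object type descriptor: " + descriptor);
    }
}
```

<p>
  A report generator for another JVM language could supply its own variant of
  these helpers without any change to the data collected at runtime.
</p>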

<h2>Modularization of the JaCoCo implementation</h2>

<p class="intro">
  JaCoCo is implemented in several modules providing different functionality.
  These modules are provided as OSGi bundles with proper manifest files, but
  there are no dependencies on OSGi itself.
</p>

<p>
  Using OSGi bundles allows well-defined dependencies at development time and
  at runtime in OSGi containers. As there are no dependencies on OSGi, the
  bundles can also be used like regular JAR files.
</p>

</div>
<div class="footer">
  <div class="versioninfo"><a href="@jacoco.home.url@">JaCoCo</a> @qualified.bundle.version@</div>
  <a href="license.html">Copyright</a> &copy; @copyright.years@ Mountainminds GmbH &amp; Co. KG and Contributors
</div>

</body>
</html>