<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
 <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
 <link rel="stylesheet" href=".resources/doc.css" charset="ISO-8859-1" type="text/css" />
 <title>JaCoCo - Implementation Design</title>
</head>
<body>

<div class="breadcrumb">
 <a href="../index.html" class="el_session">JaCoCo</a> &gt;
 <a href="index.html" class="el_group">Documentation</a> &gt;
 <span class="el_source">Implementation Design</span>
</div>
<div id="content">

<h1>Implementation Design</h1>

<p>
 This is an unordered list of implementation design decisions. Each topic tries
 to follow this structure:
</p>

<ul>
 <li>Problem statement</li>
 <li>Proposed Solution</li>
 <li>Alternatives and Discussion</li>
</ul>


<h2>Coverage Analysis Mechanism</h2>

<p class="intro">
 Coverage information has to be collected at runtime. For this purpose JaCoCo
 creates instrumented versions of the original class definitions. The
 instrumentation process happens on-the-fly during class loading using
 so-called Java agents.
</p>

<p>
 There are several different approaches to collecting coverage information. For
 each approach different implementation techniques are known. The following
 outline gives an overview, with the techniques used by JaCoCo highlighted:
</p>

<ul>
 <li>Runtime Profiling
  <ul>
   <li>Java Virtual Machine Profiler Interface (JVMPI), until Java 1.4</li>
   <li>Java Virtual Machine Tool Interface (JVMTI), since Java 1.5</li>
  </ul>
 </li>
 <li><span class="high">Instrumentation*</span>
  <ul>
   <li>Java Source Instrumentation</li>
   <li><span class="high">Byte Code Instrumentation*</span>
    <ul>
     <li>Offline
      <ul>
       <li>Replace Original Classes In-Place</li>
       <li>Inject Instrumented Classes into the Class Path</li>
      </ul>
     </li>
     <li><span class="high">On-The-Fly*</span>
      <ul>
       <li>Special Classloader Implementations or Framework Specific Hooks</li>
       <li><span class="high">Java Agent*</span></li>
      </ul>
     </li>
    </ul>
   </li>
  </ul>
 </li>
</ul>

<p>
 Byte code instrumentation is very fast, can be implemented in pure Java and
 works with every Java VM. On-the-fly instrumentation with the Java agent
 hook can be added to the JVM without any modification of the target
 application.
</p>

<p>
 The Java agent hook requires a JVM of version 1.5 or later. Class files compiled with
 debug information (line numbers) allow for source code highlighting. Unfortunately,
 some Java language constructs get compiled to byte code that produces
 unexpected highlighting results, especially in case of implicitly generated
 code like default constructors or control structures for finally statements.
</p>
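
<p>
 The Java agent hook mentioned above can be sketched as follows, assuming only
 the standard <code>java.lang.instrument</code> API. The names
 <code>SketchAgent</code> and <code>PassThroughTransformer</code> are
 hypothetical and not part of JaCoCo. The transformer leaves every class
 unchanged; a coverage tool would return instrumented byte code instead.
</p>

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical agent class for illustration only. In a real agent JAR this
// class would be public and live in its own source file.
class SketchAgent {

    // Called by the JVM before the application's main method is started.
    public static void premain(String options, Instrumentation inst) {
        inst.addTransformer(new PassThroughTransformer());
    }

    // Sees the raw definition of every class while it is being loaded.
    static class PassThroughTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
                byte[] classfileBuffer) {
            // A coverage tool would return an instrumented copy of
            // classfileBuffer here. Returning null tells the JVM to keep the
            // original class definition.
            return null;
        }
    }
}
```

<p>
 Such an agent is packaged in a JAR whose manifest names the class in its
 <code>Premain-Class</code> attribute and is activated with the
 <code>-javaagent</code> JVM option, so the target application itself does not
 have to be modified.
</p>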


<h2>Instrumentation Approach</h2>

<p class="intro">
 Instrumentation means inserting probes at certain check points in the Java
 byte code. A probe is a generated piece of byte code that records the fact
 that it has been executed. JaCoCo inserts probes at the end of every basic
 block.
</p>

<p>
 A basic block is a piece of byte code that has a single entry point (the first
 byte code instruction) and a single exit point (like <code>jump</code>,
 <code>throw</code> or <code>return</code>). A basic block must not contain jump
 targets except the entry point. One can think of basic blocks as the nodes in
 a control flow graph of a method. Inserting code coverage probes at basic
 block boundaries has been used very successfully by
 <a href="http://emma.sourceforge.net/">EMMA</a>.
</p>

<p>
 Basic block instrumentation works regardless of whether the class files have been
 compiled with debug information for source lines. Source code highlighting
 will of course not be possible without this debug information, but percentages
 on method level can still be calculated. Basic block probes result in
 reasonable overhead regarding class file size and performance. Partial line
 coverage can occur if a line contains more than one statement or a statement
 gets compiled into byte code forming more than one basic block (e.g. boolean
 assignments). Calculating basic blocks relies on the Java byte code only; therefore
 JaCoCo is independent of the source language and should also work with other
 Java VM based languages like <a href="http://www.scala-lang.org/">Scala</a>.
</p>

<p>
 The major drawback of this approach is that basic blocks are
 actually much smaller in the Java VM: nearly every byte code instruction
 (especially method invocations) can result in an exception. In this case the
 block is left somewhere in the middle without hitting the probe, which leads
 to unexpected results, for example in case of negative tests. A possible
 solution would be to add exception handlers that trigger special probes.
</p>
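
<p>
 The effect of probes, and of the drawback just described, can be illustrated
 with a simplified source-level sketch. Real probes are generated byte code,
 not source statements; the class <code>ProbeSketch</code>, the
 <code>probes</code> array and the probe placement shown here are assumptions
 made for illustration only.
</p>

```java
// Hypothetical source-level view of instrumented methods. Only the probes at
// the end of the returning blocks are shown.
class ProbeSketch {

    // One flag per probe; true means the probe has been executed.
    static final boolean[] probes = new boolean[3];

    static int abs(int x) {
        if (x < 0) {
            int negated = -x;
            probes[0] = true; // end of the block for negative input
            return negated;
        }
        probes[1] = true; // end of the block for non-negative input
        return x;
    }

    // Illustrates the drawback: if the division throws, the block is left
    // before its probe is reached.
    static int divide(int a, int b) {
        int result = a / b; // throws ArithmeticException for b == 0
        probes[2] = true;
        return result;
    }
}
```

<p>
 Calling <code>divide(1, 0)</code> executes part of the method, yet
 <code>probes[2]</code> stays <code>false</code>, so the block would be
 reported as not covered.
</p>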

<h2>Coverage Agent Isolation</h2>

<p class="intro">
 The Java agent is loaded by the application class loader. Therefore the
 classes of the agent live in the same namespace as the application classes,
 which can result in clashes, especially with the third party library ASM. The
 JaCoCo build therefore moves all agent classes into a unique package.
</p>

<p>
 The JaCoCo build renames all classes contained in the
 <code>jacocoagent.jar</code> into classes with a
 <code>org.jacoco.agent.rt_&lt;randomid&gt;</code> prefix, including the
 required ASM library classes. The identifier is created from a random number.
 As the agent does not provide any API, no one should be affected by this
 renaming. This trick also allows JaCoCo's own tests to be verified with
 JaCoCo.
</p>


<h2>Minimal Java Version</h2>

<p class="intro">
 JaCoCo requires Java 1.5.
</p>

<p>
 The Java agent mechanism used for on-the-fly instrumentation became available
 with Java 1.5 VMs. Coding and testing with Java 1.5 language level is more
 efficient, less error-prone &ndash; and more fun than with older versions.
 JaCoCo can still be run against Java code compiled for these older versions.
</p>


<h2>Byte Code Manipulation</h2>

<p class="intro">
 Instrumentation requires mechanisms to modify and generate Java byte code.
 JaCoCo uses the ASM library for this purpose internally.
</p>

<p>
 Implementing the Java byte code specification would be an extensive and
 error-prone task. Therefore an existing library should be used. The
 <a href="http://asm.objectweb.org/">ASM</a> library is lightweight, easy to
 use and very efficient in terms of memory and CPU usage. It is actively
 maintained and includes a huge regression test suite. Its simplified BSD
 license is approved by the Eclipse Foundation for usage with EPL products.
</p>

<h2>Java Class Identity</h2>

<p class="intro">
 Each class loaded at runtime needs a unique identity to associate coverage data with.
 JaCoCo creates such identities from a CRC64 hash code of the raw class definition.
</p>

<p>
 In multi-classloader environments the plain name of a class does not
 unambiguously identify a class. For example, OSGi allows different
 versions of the same class to be loaded within the same VM. In complex
 deployment scenarios the actual version of the test target might be different
 from the current development version. A code coverage report should guarantee that
 the presented figures are extracted from a valid test target. A hash code of
 the class definitions makes it possible to differentiate between classes and versions of
 classes. The CRC64 hash computation is simple and fast, resulting in a small 64
 bit identifier.
</p>

<p>
 The same class definition might be loaded by different class loaders, which will result
 in different classes for the Java runtime system. For coverage analysis this
 distinction should be irrelevant. Class definitions might be altered by other
 instrumentation based technologies (e.g. AspectJ). In this case the hash code
 will change and identity gets lost. On the other hand code coverage analysis
 based on classes that have been somehow altered will produce unexpected
 results. The CRC64 code might produce so-called <i>collisions</i>, i.e.
 creating the same hash code for two different classes. Although CRC64 is not
 cryptographically strong and collision examples can be easily computed, for
 regular class files the collision probability is very low.
</p>
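
<p>
 Deriving a 64 bit identifier from a raw class definition might look like the
 following sketch. The text does not state which CRC64 parameters JaCoCo uses;
 the polynomial, the zero initial value and the class name
 <code>Crc64Sketch</code> are assumptions chosen for illustration.
</p>

```java
// Bitwise CRC-64 over the bytes of a class definition (illustrative variant).
class Crc64Sketch {

    // Assumed polynomial: x^64 + x^4 + x^3 + x + 1 in reflected bit order.
    private static final long POLY = 0xD800000000000000L;

    static long checksum(byte[] classDefinition) {
        long crc = 0L; // assumed initial value
        for (byte b : classDefinition) {
            crc ^= (b & 0xFFL);
            // Process the byte one bit at a time, least significant bit first.
            for (int i = 0; i < 8; i++) {
                if ((crc & 1L) != 0) {
                    crc = (crc >>> 1) ^ POLY;
                } else {
                    crc >>>= 1;
                }
            }
        }
        return crc;
    }
}
```

<p>
 Identical byte arrays always yield the same identifier regardless of the class
 loader that defined the class, while any modification of the bytes will
 usually change it.
</p>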

<h2>Coverage Runtime Dependency</h2>

<p class="intro">
 Instrumented code typically has a dependency on a coverage runtime which is
 responsible for collecting and storing execution data. JaCoCo uses JRE types
 only in generated instrumentation code.
</p>

<p>
 Making a runtime library available to all instrumented classes can be a
 painful or impossible task in frameworks that use their own class loading
 mechanisms. Since Java 1.6 <code>java.lang.instrument.Instrumentation</code>
 has an API to extend the bootstrap loader. As our minimum target is Java 1.5,
 JaCoCo decouples the instrumented classes and the coverage runtime through
 official JRE API types only. The instrumented classes communicate through the
 <code>Object.equals(Object)</code> method with the runtime. An instrumented
 class can retrieve its probe array instance with the following code. Note
 that only JRE APIs are used:
</p>


<pre class="source">
<span class="nr">  1</span>Object access = ... // Retrieve instance
<span class="nr">  2</span>
<span class="nr">  3</span>Object[] args = new Object[3];
<span class="nr">  4</span>args[0] = Long.valueOf(8060044182221863588L); // class id
<span class="nr">  5</span>args[1] = "com/example/MyClass"; // class name
<span class="nr">  6</span>args[2] = Integer.valueOf(24); // probe count
<span class="nr">  7</span>
<span class="nr">  8</span>access.equals(args); // the runtime replaces args[0] with the probe array
<span class="nr">  9</span>
<span class="nr"> 10</span>boolean[] probes = (boolean[]) args[0];
</pre>

<p>
 The trickiest part takes place in line 1 and is not shown in the snippet
 above. The object instance providing access to the coverage runtime through
 its <code>equals()</code> method has to be obtained. Different approaches have
 been implemented and tested so far:
</p>

<ul>
 <li><b><code>SystemPropertiesRuntime</code></b>: This approach stores the
 object instance under a system property. This solution breaks the contract
 that system properties must only contain <code>java.lang.String</code>
 values and therefore causes trouble in applications that rely on this
 definition (e.g. Ant).</li>
 <li><b><code>LoggerRuntime</code></b>: Here we use a shared
 <code>java.util.logging.Logger</code> and communicate through the logging
 parameter array instead of an <code>equals()</code> method. The coverage
 runtime registers a custom <code>Handler</code> to receive the parameter
 array. This approach might break environments that install their own log
 managers (e.g. Glassfish).</li>
 <li><b><code>ModifiedSystemClassRuntime</code></b>: This approach adds a
 public static field to an existing JRE class through instrumentation. Unlike
 the other methods above this is only possible for environments where a Java
 agent is active.</li>
</ul>

<p>
 The current JaCoCo Java agent implementation uses the
 <code>ModifiedSystemClassRuntime</code>, adding a field to the class
 <code>java.sql.Types</code>.
</p>
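
<p>
 The runtime side of the <code>equals()</code> based protocol is not shown in
 this section. One possible shape is sketched below; the class
 <code>RuntimeAccessSketch</code> and its internal map are hypothetical and
 only mirror the argument layout used by the instrumented code (class id,
 class name, probe count).
</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical runtime object whose equals() method hands out probe arrays.
class RuntimeAccessSketch {

    // One probe array per class id, kept for later collection of the data.
    private final Map<Long, boolean[]> store = new HashMap<Long, boolean[]>();

    @Override
    public boolean equals(Object obj) {
        if (obj instanceof Object[]) {
            Object[] args = (Object[]) obj;
            Long classId = (Long) args[0];
            int probeCount = ((Integer) args[2]).intValue();
            boolean[] probes = store.get(classId);
            if (probes == null) {
                probes = new boolean[probeCount];
                store.put(classId, probes);
            }
            // Hand the array back through the argument array, so the caller
            // needs no type beyond the JRE API.
            args[0] = probes;
        }
        return super.equals(obj);
    }

    @Override
    public int hashCode() {
        return super.hashCode();
    }
}
```

<p>
 Because the caller only invokes <code>Object.equals(Object)</code> and reads
 back an <code>Object[]</code>, the instrumented class does not need to
 reference any coverage runtime type.
</p>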


<h2>Memory Usage</h2>

<p class="intro">
 Coverage analysis for huge projects with several thousand classes or hundreds
 of thousands of lines of code should be possible. To allow this with reasonable
 memory usage the coverage analysis is based on streaming patterns and
 "depth first" traversals.
</p>

<p>
 The complete data tree of a huge coverage report is too big to fit into a
 reasonable heap memory configuration. Therefore the coverage analysis and
 report generation are implemented as "depth first" traversals. This means that
 at any point in time only the following data has to be held in working memory:
</p>

<ul>
 <li>A single class which is currently processed.</li>
 <li>The summary information of all parents of this class (package, groups).</li>
</ul>
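
<p>
 The memory behaviour described above can be sketched as a streaming
 aggregation. The types <code>StreamingReportSketch</code> and
 <code>Counter</code> and the input format (one <code>int[]</code> of covered
 and total lines per class) are hypothetical simplifications, not JaCoCo's
 data model.
</p>

```java
// Processes classes one at a time and keeps only the parent summary.
class StreamingReportSketch {

    // Summary counters, as they would be kept for a package or group.
    static class Counter {
        int covered;
        int total;

        void add(Counter other) {
            covered += other.covered;
            total += other.total;
        }
    }

    // Each element of classes is {coveredLines, totalLines} of one class.
    static Counter summarizePackage(Iterable<int[]> classes) {
        Counter pkg = new Counter();
        for (int[] data : classes) {
            // Detail data exists for the current class only.
            Counter cls = new Counter();
            cls.covered = data[0];
            cls.total = data[1];
            // A report generator would write the output for this class here.
            pkg.add(cls); // keep the summary, drop the details
        }
        return pkg;
    }
}
```

<p>
 With this structure the peak memory usage depends on the largest single class
 and the depth of the parent hierarchy, not on the total number of classes.
</p>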

<h2>Java Element Identifiers</h2>

<p class="intro">
 The Java language and the Java VM use different String representation formats
 for Java elements. For example while a type reference in Java reads like
 <code>java.lang.Object</code>, the VM references the same type as
 <code>Ljava/lang/Object;</code>. The JaCoCo API is based on VM identifiers only.
</p>

<p>
 Using VM identifiers directly does not cause any transformation overhead at
 runtime. There are several programming languages based on the Java VM that
 might use different notations. Specific transformations should therefore only
 happen at the user interface level, for example during report generation.
</p>
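
<p>
 A transformation at the user interface level could be as small as the
 following sketch. The helper class <code>VmNameSketch</code> is hypothetical;
 it handles plain object type descriptors only and ignores arrays, primitive
 types and nested class names.
</p>

```java
// Converts VM identifiers into Java language notation for display.
class VmNameSketch {

    // "java/lang/Object" becomes "java.lang.Object".
    static String toJavaName(String vmName) {
        return vmName.replace('/', '.');
    }

    // "Ljava/lang/Object;" becomes "java.lang.Object".
    static String descriptorToJavaName(String descriptor) {
        if (!descriptor.startsWith("L") || !descriptor.endsWith(";")) {
            throw new IllegalArgumentException(
                    "Not an object type descriptor: " + descriptor);
        }
        return toJavaName(descriptor.substring(1, descriptor.length() - 1));
    }
}
```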

<h2>Modularization of the JaCoCo implementation</h2>

<p class="intro">
 JaCoCo is implemented in several modules providing different functionality.
 These modules are provided as OSGi bundles with proper manifest files. But
 there are no dependencies on OSGi itself.
</p>

<p>
 Using OSGi bundles allows well defined dependencies at development time and
 at runtime in OSGi containers. As there are no dependencies on OSGi, the
 bundles can also be used like regular JAR files.
</p>

</div>
<div class="footer">
 <div class="versioninfo"><a href="@HOMEURL@">JaCoCo</a> @VERSION@</div>
 <a href="license.html">Copyright</a> &copy; 2009, 2010 Mountainminds GmbH &amp; Co. KG and Contributors
</div>

</body>
</html>