<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
  <link rel="stylesheet" href=".resources/doc.css" charset="ISO-8859-1" type="text/css" />
  <title>JaCoCo - Implementation Design</title>
</head>
<body>

<div class="breadcrumb">
  <a href="../index.html" class="el_session">JaCoCo</a> &gt;
  <a href="index.html" class="el_group">Documentation</a> &gt;
  <span class="el_source">Implementation Design</span>
</div>
<div id="content">

<h1>Implementation Design</h1>

<p>
  This is an unordered list of implementation design decisions. Each topic tries
  to follow this structure:
</p>

<ul>
  <li>Problem statement</li>
  <li>Proposed Solution</li>
  <li>Alternatives and Discussion</li>
</ul>


<h2>Coverage Analysis Mechanism</h2>

<p class="intro">
  Coverage information has to be collected at runtime. For this purpose JaCoCo
  creates instrumented versions of the original class definitions. The
  instrumentation process happens on-the-fly during class loading using
  so-called Java agents.
</p>

<p>
  There are several different approaches to collecting coverage information, and
  for each approach different implementation techniques are known. The following
  outline gives an overview, with the techniques used by JaCoCo highlighted and
  marked with an asterisk:
</p>

<ul>
  <li>Runtime Profiling
    <ul>
      <li>Java Virtual Machine Profiler Interface (JVMPI), until Java 1.4</li>
      <li>Java Virtual Machine Tool Interface (JVMTI), since Java 1.5</li>
    </ul>
  </li>
  <li><span class="high">Instrumentation*</span>
    <ul>
      <li>Java Source Instrumentation</li>
      <li><span class="high">Byte Code Instrumentation*</span>
        <ul>
          <li>Offline
            <ul>
              <li>Replace Original Classes In-Place</li>
              <li>Inject Instrumented Classes into the Class Path</li>
            </ul>
          </li>
          <li><span class="high">On-The-Fly*</span>
            <ul>
              <li>Special Classloader Implementations or Framework Specific Hooks</li>
              <li><span class="high">Java Agent*</span></li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<p>
  Byte code instrumentation is very fast, can be implemented in pure Java and
  works with every Java VM. On-the-fly instrumentation with the Java agent
  hook can be added to the JVM without any modification of the target
  application.
</p>
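
<p>
  The Java agent hook mentioned above can be sketched as follows. This is a
  minimal illustration, not JaCoCo's actual agent: the class names
  <code>AgentSketch</code> and <code>CountingTransformer</code> are
  hypothetical, and the transformer only counts the classes it is offered
  instead of instrumenting them. It relies only on the standard
  <code>java.lang.instrument</code> API.
</p>

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical agent class, to be named in the Premain-Class manifest attribute. */
final class AgentSketch {

    /** Transformer that counts the classes it sees and leaves them unchanged. */
    static final class CountingTransformer implements ClassFileTransformer {
        final AtomicInteger seen = new AtomicInteger();

        @Override
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
                byte[] classfileBuffer) {
            seen.incrementAndGet();
            // A coverage tool would return the instrumented class bytes here.
            // Returning null tells the JVM to keep the original definition.
            return null;
        }
    }

    /** Invoked by the JVM before the application's main method. */
    public static void premain(String options, Instrumentation inst) {
        inst.addTransformer(new CountingTransformer());
    }
}
```

<p>
  Packaged in a JAR whose manifest names the class in the
  <code>Premain-Class</code> attribute, such an agent is activated with the
  <code>-javaagent</code> JVM option, so the target application itself does not
  have to be modified.
</p>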

<p>
  The Java agent hook requires a JVM of version 1.5 or later. Class files compiled with
  debug information (line numbers) allow for source code highlighting. Unfortunately,
  some Java language constructs get compiled to byte code that produces
  unexpected highlighting results, especially in the case of implicitly generated
  code like default constructors or control structures for finally statements.
</p>


<h2>Instrumentation Approach</h2>

<p class="intro">
  Instrumentation means inserting probes at certain check points in the Java
  byte code. A probe is a generated piece of byte code that records the fact
  that it has been executed. JaCoCo inserts probes at the end of every basic
  block.
</p>

<p>
  A basic block is a piece of byte code that has a single entry point (the first
  byte code instruction) and a single exit point (like <code>jump</code>,
  <code>throw</code> or <code>return</code>). A basic block must not contain jump
  targets except the entry point. One can think of basic blocks as the nodes in
  a control flow graph of a method. Inserting code coverage probes at basic
  block boundaries has been used very successfully by
  <a href="http://emma.sourceforge.net/">EMMA</a>.
</p>

<p>
  Basic block instrumentation works regardless of whether the class files have been
  compiled with debug information for source lines. Source code highlighting
  is of course not possible without this debug information, but percentages
  at method level can still be calculated. Basic block probes result in
  reasonable overhead regarding class file size and performance. Partial line
  coverage can occur if a line contains more than one statement or if a statement
  gets compiled into byte code forming more than one basic block (e.g. boolean
  assignments). Calculating basic blocks relies on the Java byte code only;
  JaCoCo is therefore independent of the source language and should also work with other
  Java VM based languages like <a href="http://www.scala-lang.org/">Scala</a>.
</p>

<p>
  The major drawback of this approach is that basic blocks are
  actually much smaller in the Java VM: nearly every byte code instruction
  (especially method invocations) can result in an exception. In this case the
  block is left somewhere in the middle without hitting the probe, which leads
  to unexpected results, for example in negative tests. A possible
  solution would be to add exception handlers that trigger special probes.
</p>
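
<p>
  The effect of probes and the exception problem can be pictured at source
  level. The sketch below is an illustration only: real probes are generated
  as byte code, and the class <code>ProbeSketch</code>, its
  <code>probes</code> array and the method <code>abs</code> are hypothetical
  names. Each assignment to the array stands for a probe at the end of a basic
  block.
</p>

```java
import java.util.Arrays;

/** Hand-written picture of a method with one probe per basic block. */
final class ProbeSketch {

    // One flag per probe; a flag is set once its block has been fully executed.
    static final boolean[] probes = new boolean[3];

    static void reset() {
        Arrays.fill(probes, false);
    }

    static int abs(String text) {
        int value = Integer.parseInt(text); // may throw and leave the block early
        probes[0] = true;                   // probe at the end of the first block
        if (value < 0) {
            probes[1] = true;               // probe before the first return
            return -value;
        }
        probes[2] = true;                   // probe before the second return
        return value;
    }
}
```

<p>
  If <code>Integer.parseInt</code> throws, the first block is left before its
  probe is reached, so the call itself is not recorded as covered although it
  was executed.
</p>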

<h2>Coverage Agent Isolation</h2>

<p class="intro">
  The Java agent is loaded by the application class loader. Therefore the
  classes of the agent live in the same namespace as the application classes,
  which can result in clashes, especially with the third party library ASM. The
  JaCoCo build therefore moves all agent classes into a unique package.
</p>

<p>
  The JaCoCo build renames all classes contained in the
  <code>jacocoagent.jar</code> into classes with a
  <code>org.jacoco.&lt;randomid&gt;</code> prefix, including the required ASM
  library classes. The identifier is created from a random number. As the agent
  does not provide any API, no one should be affected by this renaming. This
  trick also allows JaCoCo's own tests to be verified with JaCoCo.
</p>


<h2>Minimal Java Version</h2>

<p class="intro">
  JaCoCo requires Java 1.5.
</p>

<p>
  The Java agent mechanism used for on-the-fly instrumentation became available
  with Java 1.5 VMs. Coding and testing with Java 1.5 language level is more
  efficient, less error-prone &ndash; and more fun than with older versions.
  JaCoCo can still be run against Java code compiled for these older versions.
</p>


<h2>Byte Code Manipulation</h2>

<p class="intro">
  Instrumentation requires mechanisms to modify and generate Java byte code.
  JaCoCo uses the ASM library for this purpose internally.
</p>

<p>
  Implementing the Java byte code specification would be an extensive and
  error-prone task. Therefore an existing library should be used. The
  <a href="http://asm.objectweb.org/">ASM</a> library is lightweight, easy to
  use and very efficient in terms of memory and CPU usage. It is actively
  maintained and includes a huge regression test suite. Its simplified BSD
  license is approved by the Eclipse Foundation for usage with EPL products.
</p>

<h2>Java Class Identity</h2>

<p class="intro">
  Each class loaded at runtime needs a unique identity to associate coverage data with.
  JaCoCo creates such identities from a CRC64 hash code of the raw class definition.
</p>

<p>
  In multi-classloader environments the plain name of a class does not
  unambiguously identify a class. For example, OSGi allows different
  versions of the same class to be loaded within the same VM. In complex
  deployment scenarios the actual version of the test target might be different
  from the current development version. A code coverage report should guarantee that
  the presented figures are extracted from a valid test target. A hash code of
  the class definitions makes it possible to differentiate between classes and
  versions of classes. The CRC64 hash computation is simple and fast, resulting
  in a small 64-bit identifier.
</p>

<p>
  The same class definition might be loaded by several class loaders, which results
  in different classes for the Java runtime system. For coverage analysis this
  distinction should be irrelevant. Class definitions might be altered by other
  instrumentation based technologies (e.g. AspectJ). In this case the hash code
  changes and the identity gets lost. On the other hand, code coverage analysis
  based on classes that have been altered in some way will produce unexpected
  results. The CRC64 code might produce so-called <i>collisions</i>, i.e.
  the same hash code for two different classes. Although CRC64 is not
  cryptographically strong and collision examples can easily be computed, for
  regular class files the collision probability is very low.
</p>
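
<p>
  As an illustration of how a 64-bit identifier can be derived from the raw
  bytes of a class definition, here is a minimal bitwise CRC-64 sketch. The
  class name <code>Crc64Sketch</code> is hypothetical, and the polynomial
  (the reflected ECMA-182 polynomial), the zero start value and the missing
  final XOR are assumptions made for this example; JaCoCo's own parameters may
  differ.
</p>

```java
/** Bitwise CRC-64 over a byte array; parameters are assumptions of this sketch. */
final class Crc64Sketch {

    // Reflected form of the ECMA-182 polynomial (assumed for illustration).
    private static final long POLY = 0xC96C5795D7870F42L;

    /** Returns a 64-bit checksum of the given bytes, e.g. a raw class file. */
    static long checksum(byte[] data) {
        long crc = 0L;
        for (byte b : data) {
            crc ^= (b & 0xFF);
            // Process the byte bit by bit, least significant bit first.
            for (int bit = 0; bit < 8; bit++) {
                if ((crc & 1L) != 0) {
                    crc = (crc >>> 1) ^ POLY;
                } else {
                    crc >>>= 1;
                }
            }
        }
        return crc;
    }
}
```

<p>
  Identical class definitions always yield the same identifier regardless of
  the class loader that loads them, while any modification of the bytes, for
  example by another instrumentation tool, is very likely to change it.
</p>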

<h2>Coverage Runtime Dependency</h2>

<p class="intro">
  Instrumented code typically gets a dependency on a coverage runtime which is
  responsible for collecting and storing execution data. JaCoCo uses only JRE
  types and interfaces in generated instrumentation code.
</p>

<p>
  Making a runtime library available to all instrumented classes can be a
  painful or impossible task in frameworks that use their own class loading
  mechanisms. Since Java 1.6 <code>java.lang.instrument.Instrumentation</code>
  has an API to extend the bootstrap loader. As our minimum target is Java 1.5,
  JaCoCo decouples the instrumented classes and the coverage runtime through
  official JRE API types only. Different approaches have been implemented and
  tested so far:
</p>

<ul>
  <li><b><code>SystemPropertiesRuntime</code></b>: This approach stores a
    <code>java.util.Map</code> instance under a system property containing the
    execution data. This solution breaks the contract that system properties
    must only contain <code>java.lang.String</code> values and therefore causes
    trouble in applications that rely on this definition (e.g. Ant).</li>
  <li><b><code>LoggerRuntime</code></b>: Here we use a shared
    <code>java.util.logging.Logger</code> instance to report coverage data to.
    The coverage runtime registers a custom <code>Handler</code> to receive the
    data. This approach might break environments that install their own log
    managers (e.g. Glassfish).</li>
  <li><b><code>ModifiedSystemClassRuntime</code></b>: This approach adds
    additional APIs to existing JRE classes through instrumentation. Unlike the
    other methods above this is only possible for environments where a Java
    agent is active.</li>
</ul>
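
<p>
  The contract problem of the first approach can be reproduced with plain JRE
  calls. The following sketch is not the <code>SystemPropertiesRuntime</code>
  implementation; the class name, the property key and the map layout are
  hypothetical. It only shows that a non-String value can be stored in the
  system properties and what string-based readers then observe.
</p>

```java
import java.util.HashMap;
import java.util.Map;

/** Stores a Map in the system properties, using JRE types only. */
final class SystemPropertiesSketch {

    // Hypothetical property key, chosen for this example.
    static final String KEY = "example.coverage.data";

    /** Publishes an empty map (class id to probe flags) under the key. */
    static void install() {
        // Properties is a Hashtable<Object, Object>, so put() accepts any value.
        System.getProperties().put(KEY, new HashMap<Long, boolean[]>());
    }

    /** Looks the map up again, as instrumented code could do. */
    @SuppressWarnings("unchecked")
    static Map<Long, boolean[]> lookup() {
        return (Map<Long, boolean[]>) System.getProperties().get(KEY);
    }

    /** Removes the entry again. */
    static void uninstall() {
        System.getProperties().remove(KEY);
    }
}
```

<p>
  After <code>install()</code>, <code>System.getProperty(KEY)</code> returns
  <code>null</code> because the stored value is not a
  <code>java.lang.String</code>, while code that iterates over all entries and
  casts them to strings fails. This is the kind of trouble described for the
  first approach.
</p>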

<p>
  The current JaCoCo Java agent implementation uses the
  <code>ModifiedSystemClassRuntime</code>, adding APIs to the class
  <code>java.sql.Types</code>.
</p>


<h2>Memory Usage</h2>

<p class="intro">
  Coverage analysis for huge projects with several thousand classes or hundreds
  of thousands of lines of code should be possible. To allow this with reasonable
  memory usage the coverage analysis is based on streaming patterns and
  "depth first" traversals.
</p>

<p>
  The complete data tree of a huge coverage report is too big to fit into a
  reasonable heap memory configuration. Therefore the coverage analysis and
  report generation are implemented as "depth first" traversals, which means that
  at any point in time only the following data has to be held in working memory:
</p>

<ul>
  <li>The single class that is currently being processed.</li>
  <li>The summary information of all parents of this class (package, groups).</li>
</ul>
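
<p>
  The streaming idea can be sketched in a few lines. This is a strongly
  simplified illustration with hypothetical names
  (<code>StreamingSummarySketch</code>, <code>Summary</code>); it assumes that
  each class is delivered as an array of probe flags and that there is a single
  parent node. Only the class currently being processed and the running totals
  of its parent are referenced at any time.
</p>

```java
import java.util.Iterator;

/** Folds per-class data into a parent summary, one class at a time. */
final class StreamingSummarySketch {

    /** Running totals for a parent node such as a package or group. */
    static final class Summary {
        int classes;
        int coveredProbes;
        int totalProbes;
    }

    /**
     * Consumes the classes of one parent node. Each probe array is only
     * referenced while it is processed and can be discarded afterwards.
     */
    static Summary summarize(Iterator<boolean[]> classProbes) {
        Summary parent = new Summary();
        while (classProbes.hasNext()) {
            boolean[] probes = classProbes.next();
            for (boolean hit : probes) {
                if (hit) {
                    parent.coveredProbes++;
                }
            }
            parent.totalProbes += probes.length;
            parent.classes++;
        }
        return parent;
    }
}
```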

<h2>Java Element Identifiers</h2>

<p class="intro">
  The Java language and the Java VM use different String representation formats
  for Java elements. For example, while a type reference in Java reads like
  <code>java.lang.Object</code>, the VM references the same type as
  <code>Ljava/lang/Object;</code>. The JaCoCo API is based on VM identifiers only.
</p>

<p>
  Using VM identifiers directly does not cause any transformation overhead at
  runtime. There are several programming languages based on the Java VM that
  might use different notations. Specific transformations should therefore only
  happen at the user interface level, for example during report generation.
</p>
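
<p>
  A transformation of this kind, as it might be applied during report
  generation, could look like the following sketch. The class
  <code>VmNamesSketch</code> and its method are hypothetical and not part of
  the JaCoCo API; the sketch handles VM type descriptors for object types,
  primitive types and arrays, and leaves the <code>$</code> of nested class
  names untouched.
</p>

```java
/** Converts VM type descriptors into Java language notation. */
final class VmNamesSketch {

    /** Example: "Ljava/lang/Object;" becomes "java.lang.Object". */
    static String toJavaName(String descriptor) {
        // Count leading '[' characters; each one is an array dimension.
        int dims = 0;
        while (descriptor.charAt(dims) == '[') {
            dims++;
        }
        String element = descriptor.substring(dims);
        String name;
        switch (element.charAt(0)) {
            case 'L': // object type: strip 'L' and ';', replace '/' by '.'
                name = element.substring(1, element.length() - 1).replace('/', '.');
                break;
            case 'Z': name = "boolean"; break;
            case 'B': name = "byte"; break;
            case 'C': name = "char"; break;
            case 'S': name = "short"; break;
            case 'I': name = "int"; break;
            case 'J': name = "long"; break;
            case 'F': name = "float"; break;
            case 'D': name = "double"; break;
            default:
                throw new IllegalArgumentException("Unknown descriptor: " + descriptor);
        }
        StringBuilder result = new StringBuilder(name);
        for (int i = 0; i < dims; i++) {
            result.append("[]");
        }
        return result.toString();
    }
}
```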

<h2>Modularization of the JaCoCo implementation</h2>

<p class="intro">
  JaCoCo is implemented in several modules providing different functionality.
  These modules are provided as OSGi bundles with proper manifest files, but
  there are no dependencies on OSGi itself.
</p>

<p>
  Using OSGi bundles allows well defined dependencies at development time and
  at runtime in OSGi containers. As there are no dependencies on OSGi, the
  bundles can also be used like regular JAR files.
</p>

</div>
<div class="footer">
  <div class="versioninfo"><a href="@HOMEURL@">JaCoCo</a> @VERSION@</div>
  <a href="license.html">Copyright</a> &copy; 2009, 2010 Mountainminds GmbH &amp; Co. KG and Contributors
</div>

</body>
</html>