<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
  <link rel="stylesheet" href=".resources/doc.css" charset="ISO-8859-1" type="text/css" />
  <title>JaCoCo - Implementation Design</title>
</head>
<body>

<div class="breadcrumb">
  <a href="../index.html" class="el_session">JaCoCo</a> &gt;
  <a href="index.html" class="el_group">Documentation</a> &gt;
  <span class="el_source">Implementation Design</span>
</div>

<h1>Implementation Design</h1>

<p>
  This is an unordered list of implementation design decisions. Each topic tries
  to follow this structure:
</p>

<ul>
  <li>Problem statement</li>
  <li>Proposed Solution</li>
  <li>Alternatives and Discussion</li>
</ul>


<h2>Coverage Analysis Mechanism</h2>

<p class="intro">
  Coverage information has to be collected at runtime. For this purpose JaCoCo
  creates instrumented versions of the original class definitions. The
  instrumentation process happens on the fly during class loading using
  so-called Java agents.
</p>

<p>
  There are several different approaches to collecting coverage information. For
  each approach different implementation techniques are known. The following
  overview lists them, with the techniques used by JaCoCo highlighted:
</p>

<ul>
  <li>Runtime Profiling
    <ul>
      <li>Java Virtual Machine Profiler Interface (JVMPI), until Java 1.4</li>
      <li>Java Virtual Machine Tool Interface (JVMTI), since Java 1.5</li>
    </ul>
  </li>
  <li><span class="high">Instrumentation*</span>
    <ul>
      <li>Java Source Instrumentation</li>
      <li><span class="high">Byte Code Instrumentation*</span>
        <ul>
          <li>Offline
            <ul>
              <li>Replace Original Classes In-Place</li>
              <li>Inject Instrumented Classes into the Class Path</li>
            </ul>
          </li>
          <li><span class="high">On-The-Fly*</span>
            <ul>
              <li>Special Classloader Implementations or Framework Specific Hooks</li>
              <li><span class="high">Java Agent*</span></li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<p>
  Byte code instrumentation is very fast, can be implemented in pure Java and
  works with every Java VM. On-the-fly instrumentation with the Java agent
  hook can be added to the JVM without any modification of the target
  application.
</p>
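
<p>
  The Java agent hook can be illustrated with a minimal sketch. The class
  names <code>SketchAgent</code> and <code>CountingTransformer</code> are
  hypothetical and not part of JaCoCo; only <code>premain</code>,
  <code>Instrumentation</code> and <code>ClassFileTransformer</code> come from
  the standard <code>java.lang.instrument</code> API. The transformer here only
  counts the classes it is offered; a real coverage agent would return
  instrumented bytes instead.
</p>

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical agent class, for illustration only.
final class SketchAgent {

    // Sees every class definition before the VM defines the class.
    static final class CountingTransformer implements ClassFileTransformer {
        final AtomicInteger seen = new AtomicInteger();

        @Override
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
                byte[] classfileBuffer) {
            seen.incrementAndGet();
            // A coverage tool would return the instrumented class bytes here.
            // Returning null tells the VM to keep the original definition.
            return null;
        }
    }

    // Called by the JVM before the application's main method when the JAR
    // is passed with -javaagent and names this class as Premain-Class.
    public static void premain(String options, Instrumentation inst) {
        inst.addTransformer(new CountingTransformer());
    }
}
```

<p>
  Such an agent is packaged as a JAR whose manifest contains a
  <code>Premain-Class</code> entry and is activated with the
  <code>-javaagent:</code> JVM option, so the target application itself stays
  unmodified.
</p>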

<p>
  The Java agent hook requires a JVM of version 1.5 or later. Class files
  compiled with debug information (line numbers) allow for source code
  highlighting. Unfortunately, some Java language constructs get compiled to
  byte code that produces unexpected highlighting results, especially in the
  case of implicitly generated code like default constructors or control
  structures for finally statements.
</p>


<h2>Instrumentation Approach</h2>

<p class="intro">
  Instrumentation means inserting probes at certain check points in the Java
  byte code. A probe is a generated piece of byte code that records the fact
  that it has been executed. JaCoCo inserts probes at the end of every basic
  block.
</p>

<p>
  A basic block is a piece of byte code that has a single entry point (the first
  byte code instruction) and a single exit point (like <code>jump</code>,
  <code>throw</code> or <code>return</code>). A basic block must not contain jump
  targets except the entry point. One can think of basic blocks as the nodes in
  a control flow graph of a method. Using basic block boundaries to insert code
  coverage probes has been used very successfully by
  <a href="http://emma.sourceforge.net/">EMMA</a>.
</p>
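
<p>
  The effect of basic block probes can be pictured at source level. The
  following is only a sketch: real probes are inserted into byte code, not
  source code, and the class <code>ProbeSketch</code>, its
  <code>probes</code> array and the method name are assumptions made for
  illustration.
</p>

```java
// Source-level picture of a method after basic block instrumentation.
final class ProbeSketch {

    // One slot per probe; a probe records that it has been executed.
    static final boolean[] probes = new boolean[3];

    // Original method:
    //   static int abs(int x) { if (x < 0) { return -x; } return x; }
    static int absInstrumented(int x) {
        // Block 0: the comparison, which ends with a conditional jump.
        probes[0] = true;
        if (x < 0) {
            // Block 1: the negative branch, which ends with a return.
            probes[1] = true;
            return -x;
        }
        // Block 2: the fall-through branch, which ends with a return.
        probes[2] = true;
        return x;
    }
}
```

<p>
  After a call with a positive argument only the probes of block 0 and
  block 2 are set, so the negative branch shows up as not covered.
</p>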

<p>
  Basic block instrumentation works regardless of whether the class files have been
  compiled with debug information for source lines. Source code highlighting
  will of course not be possible without this debug information, but percentages
  on method level can still be calculated. Basic block probes result in
  reasonable overhead regarding class file size and performance. Partial line
  coverage can occur if a line contains more than one statement or a statement
  gets compiled into byte code forming more than one basic block (e.g. boolean
  assignments). Calculating basic blocks relies on the Java byte code only;
  therefore JaCoCo is independent of the source language and should also work
  with other Java VM based languages like
  <a href="http://www.scala-lang.org/">Scala</a>.
</p>

<p>
  The major drawback of this approach is that basic blocks are actually much
  smaller in the Java VM: nearly every byte code instruction (especially method
  invocations) can result in an exception. In this case the block is left
  somewhere in the middle without hitting the probe, which leads to unexpected
  results, for example in negative tests. A possible solution would be to add
  exception handlers that trigger special probes.
</p>
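
<p>
  The problem and the possible solution can again be sketched at source level.
  All names below (<code>ExceptionProbeSketch</code>, <code>probes</code>, the
  two methods) are hypothetical, and the handler variant only illustrates the
  idea mentioned above; it is not a description of what JaCoCo does.
</p>

```java
// Shows how an exception leaves a basic block before its probe is hit.
final class ExceptionProbeSketch {

    // probes[0]: regular probe at the end of the block.
    // probes[1]: special probe triggered by an exception handler.
    static final boolean[] probes = new boolean[2];

    static int parseInstrumented(String s) {
        int value = Integer.parseInt(s); // may throw and leave the block early
        int doubled = value * 2;
        probes[0] = true;                // skipped if parseInt threw
        return doubled;
    }

    // Possible mitigation: a handler that records the abnormal exit.
    static int parseWithHandlerProbe(String s) {
        try {
            int value = Integer.parseInt(s);
            int doubled = value * 2;
            probes[0] = true;
            return doubled;
        } catch (RuntimeException e) {
            probes[1] = true;            // the block was entered but not finished
            throw e;                     // behaviour of the method is unchanged
        }
    }
}
```

<p>
  In the first variant a test that deliberately passes an invalid string
  executes part of the block, yet no probe records it. In the second variant
  the special probe makes that partial execution visible.
</p>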

<h2>Coverage Agent Isolation</h2>

<p class="intro">
  The Java agent is loaded by the application class loader. Therefore the
  classes of the agent live in the same name space as the application classes,
  which can result in clashes, especially with the third party library ASM. The
  JaCoCo build therefore moves all agent classes into a unique package.
</p>

<p>
  The JaCoCo build renames all classes contained in the
  <code>jacocoagent.jar</code> into classes with a
  <code>org.jacoco.&lt;randomid&gt;</code> prefix, including the required ASM
  library classes. The identifier is created from a random number. As the agent
  does not provide any API, no one should be affected by this renaming. This
  trick also allows JaCoCo's own tests to be verified with JaCoCo.
</p>


<h2>Minimal Java Version</h2>

<p class="intro">
  JaCoCo requires Java 1.5.
</p>

<p>
  The Java agent mechanism used for on-the-fly instrumentation became available
  with Java 1.5 VMs. Coding and testing with Java 1.5 language level is more
  efficient, less error-prone &ndash; and more fun than with older versions.
  JaCoCo can still be run against Java code compiled for these older versions.
</p>


<h2>Byte Code Manipulation</h2>

<p class="intro">
  Instrumentation requires mechanisms to modify and generate Java byte code.
  JaCoCo uses the ASM library for this purpose internally.
</p>

<p>
  Implementing the Java byte code specification would be an extensive and
  error-prone task. Therefore an existing library should be used. The
  <a href="http://asm.objectweb.org/">ASM</a> library is lightweight, easy to
  use and very efficient in terms of memory and CPU usage. It is actively
  maintained and includes a huge regression test suite. Its simplified BSD
  license is approved by the Eclipse Foundation for usage with EPL products.
</p>

<h2>Java Class Identity</h2>

<p class="intro">
  Each class loaded at runtime needs a unique identity to associate coverage data with.
  JaCoCo creates such identities from a CRC64 hash code of the raw class definition.
</p>

<p>
  In multi-classloader environments the plain name of a class does not
  unambiguously identify a class. For example, OSGi allows different
  versions of the same class to be loaded within the same VM. In complex
  deployment scenarios the actual version of the test target might be different
  from the current development version. A code coverage report should guarantee
  that the presented figures are extracted from a valid test target. A hash code
  of the class definitions makes it possible to differentiate between classes
  and versions of classes. The CRC64 hash computation is simple and fast,
  resulting in a small 64 bit identifier.
</p>

<p>
  The same class definition might be loaded by different class loaders, which
  will result in different classes for the Java runtime system. For coverage
  analysis this distinction should be irrelevant. Class definitions might be
  altered by other instrumentation based technologies (e.g. AspectJ). In this
  case the hash code will change and the identity gets lost. On the other hand,
  code coverage analysis based on classes that have been somehow altered will
  produce unexpected results. The CRC64 code might produce so-called
  <i>collisions</i>, i.e. creating the same hash code for two different classes.
  Although CRC64 is not cryptographically strong and collision examples can be
  easily computed, for regular class files the collision probability is very
  low.
</p>
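
<p>
  A table-driven CRC64 over the raw class bytes might look like the following
  sketch. The class name <code>Crc64Sketch</code> is hypothetical, and the
  polynomial (the reflected ECMA-182 polynomial) and the initial and final
  inversion are assumptions chosen for illustration; this document does not
  state which CRC64 parameters JaCoCo uses.
</p>

```java
// Minimal CRC64 sketch; parameters are assumptions, not JaCoCo's.
final class Crc64Sketch {

    // Reflected form of the ECMA-182 polynomial (assumed for this example).
    private static final long POLY = 0xC96C5795D7870F42L;

    private static final long[] TABLE = new long[256];

    static {
        // Precompute the checksum contribution of every possible byte value.
        for (int i = 0; i < 256; i++) {
            long crc = i;
            for (int bit = 0; bit < 8; bit++) {
                crc = (crc & 1L) == 1L ? (crc >>> 1) ^ POLY : crc >>> 1;
            }
            TABLE[i] = crc;
        }
    }

    // Computes a 64 bit identifier from the raw class definition bytes.
    static long checksum(byte[] classBytes) {
        long crc = ~0L;
        for (byte b : classBytes) {
            crc = TABLE[(int) ((crc ^ b) & 0xFF)] ^ (crc >>> 8);
        }
        return ~crc;
    }
}
```

<p>
  Because the identifier depends only on the bytes, the same definition loaded
  by two class loaders maps to one identity, while any change to the
  definition, for example by another instrumenting tool, very likely yields a
  different one.
</p>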

<h2>Coverage Runtime Dependency</h2>

<p class="intro">
  Instrumented code typically has a dependency on a coverage runtime which is
  responsible for collecting and storing execution data. JaCoCo uses JRE types
  and interfaces only in generated instrumentation code.
</p>

<p>
  Making a runtime library available to all instrumented classes can be a
  painful or impossible task in frameworks that use their own class loading
  mechanisms. Therefore JaCoCo decouples the instrumented classes and the
  coverage runtime through official JRE API types. Currently two approaches have
  been implemented:
</p>

<ul>
  <li>By default we use a shared <code>java.util.logging.Logger</code> instance
    to report coverage data to. The coverage runtime registers a custom
    <code>Handler</code> to receive the data. The problem with this approach is
    that the logging framework removes all handlers during shutdown. This may
    break classes that get initialized during JVM shutdown.</li>
  <li>Another approach was to store a <code>java.util.Map</code> instance
    under a system property. This solution breaks the contract that system
    properties must only contain <code>java.lang.String</code> values and has
    therefore caused trouble in certain environments.</li>
</ul>
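
<p>
  The logger based decoupling described in the first item can be sketched as
  follows. The channel name, the class <code>LoggerChannelSketch</code>, the
  parameter layout and the method names are all assumptions made for this
  example and are not taken from JaCoCo; only the
  <code>java.util.logging</code> types are real JRE API. The instrumented side
  passes an <code>Object[]</code> through the logger, and the handler writes
  the probe array back into that same array.
</p>

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical sketch of a runtime reached only through JRE types.
final class LoggerChannelSketch {

    // Hypothetical channel name shared by runtime and instrumented code.
    static final String CHANNEL = "example.coverage.channel";

    // Strong reference, so the logger and its handler are not collected.
    static final Logger LOGGER = Logger.getLogger(CHANNEL);

    // Runtime side: owns the probe arrays, keyed by class identifier.
    static final class RuntimeHandler extends Handler {
        final Map<Long, boolean[]> store = new ConcurrentHashMap<>();

        @Override
        public void publish(LogRecord record) {
            Object[] params = record.getParameters();
            Long classId = (Long) params[0];
            final int probeCount = (Integer) params[1];
            // Hand the probe array back through the caller's own array.
            params[0] = store.computeIfAbsent(classId, id -> new boolean[probeCount]);
        }

        @Override
        public void flush() {
        }

        @Override
        public void close() {
        }
    }

    static void startRuntime(RuntimeHandler handler) {
        LOGGER.setUseParentHandlers(false); // keep records off the console
        LOGGER.addHandler(handler);
    }

    // Instrumented side: refers to JRE types only, never to the runtime class.
    static boolean[] requestProbes(long classId, int probeCount) {
        Object[] args = { Long.valueOf(classId), Integer.valueOf(probeCount) };
        Logger.getLogger(CHANNEL).log(Level.INFO, "probes", args);
        return (boolean[]) args[0];
    }
}
```

<p>
  If the handler has been removed, as happens during JVM shutdown,
  <code>args[0]</code> still holds the <code>Long</code> and the cast fails,
  which mirrors the shutdown problem mentioned in the first item.
</p>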


<h2>Memory Usage</h2>

<p class="intro">
  Coverage analysis for huge projects with several thousand classes or hundreds
  of thousands of lines of code should be possible. To allow this with
  reasonable memory usage the coverage analysis is based on streaming patterns
  and "depth first" traversals.
</p>

<p>
  The complete data tree of a huge coverage report is too big to fit into a
  reasonable heap memory configuration. Therefore the coverage analysis and
  report generation are implemented as "depth first" traversals, which means
  that at any point in time only the following data has to be held in working
  memory:
</p>

<ul>
  <li>A single class which is currently processed.</li>
  <li>The summary information of all parents of this class (package, groups).</li>
</ul>
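
<p>
  The memory profile listed above can be sketched with a small visitor. The
  class <code>DepthFirstReportSketch</code>, its methods and the output format
  are hypothetical and only illustrate the pattern: each class is written to
  the output as soon as it is visited, and only running counters for the
  current package and the enclosing group are kept.
</p>

```java
import java.util.function.Consumer;

// Hypothetical streaming report: class details are emitted and forgotten.
final class DepthFirstReportSketch {

    // Summary information kept for a parent element.
    static final class Counter {
        int covered;
        int total;

        void add(int c, int t) {
            covered += c;
            total += t;
        }
    }

    private final Consumer<String> sink;     // receives finished report lines
    private final Counter group = new Counter();
    private Counter currentPackage;

    DepthFirstReportSketch(Consumer<String> sink) {
        this.sink = sink;
    }

    void enterPackage() {
        currentPackage = new Counter();
    }

    // The class data is written out immediately; only the sums survive.
    void visitClass(String name, int covered, int total) {
        sink.accept(name + " " + covered + "/" + total);
        currentPackage.add(covered, total);
    }

    // Emits the package summary and folds it into the group summary.
    void leavePackage(String name) {
        sink.accept(name + " " + currentPackage.covered + "/" + currentPackage.total);
        group.add(currentPackage.covered, currentPackage.total);
        currentPackage = null;
    }

    Counter groupTotal() {
        return group;
    }
}
```

<p>
  In practice the sink would write to a file or stream, so the memory needed
  does not grow with the number of classes in the report.
</p>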

<h2>Java Element Identifiers</h2>

<p class="intro">
  The Java language and the Java VM use different String representation formats
  for Java elements. For example while a type reference in Java reads like
  <code>java.lang.Object</code>, the VM references the same type as
  <code>Ljava/lang/Object;</code>. The JaCoCo API is based on VM identifiers only.
</p>

<p>
  Using VM identifiers directly does not cause any transformation overhead at
  runtime. There are several programming languages based on the Java VM that
  might use different notations. Specific transformations should therefore only
  happen at the user interface level, for example during report generation.
</p>
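
<p>
  A transformation at the user interface level could look like the following
  sketch, which turns a VM type descriptor into Java language notation. The
  class <code>VmNameSketch</code> and its method are hypothetical and not part
  of the JaCoCo API; nested class names containing <code>$</code> are left
  unchanged for simplicity.
</p>

```java
// Hypothetical helper that renders VM type descriptors for a Java report.
final class VmNameSketch {

    // Converts e.g. "Ljava/lang/Object;" to "java.lang.Object".
    static String toJavaName(String descriptor) {
        // Leading '[' characters denote array dimensions.
        int dims = 0;
        while (descriptor.charAt(dims) == '[') {
            dims++;
        }
        String element = descriptor.substring(dims);
        String name;
        switch (element.charAt(0)) {
        case 'L': // object type: strip 'L' and ';', then switch separators
            name = element.substring(1, element.length() - 1).replace('/', '.');
            break;
        case 'B': name = "byte"; break;
        case 'C': name = "char"; break;
        case 'D': name = "double"; break;
        case 'F': name = "float"; break;
        case 'I': name = "int"; break;
        case 'J': name = "long"; break;
        case 'S': name = "short"; break;
        case 'Z': name = "boolean"; break;
        case 'V': name = "void"; break;
        default:
            throw new IllegalArgumentException("Not a type descriptor: " + descriptor);
        }
        StringBuilder result = new StringBuilder(name);
        for (int i = 0; i < dims; i++) {
            result.append("[]");
        }
        return result.toString();
    }
}
```

<p>
  A report generator for another JVM language could plug in a different
  rendering here without touching the analysis, which keeps working on VM
  identifiers.
</p>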

<h2>Modularization of the JaCoCo implementation</h2>

<p class="intro">
  JaCoCo is implemented in several modules providing different functionality.
  These modules are provided as OSGi bundles with proper manifest files, but
  there are no dependencies on OSGi itself.
</p>

<p>
  Using OSGi bundles allows well defined dependencies at development time and
  at runtime in OSGi containers. As there are no dependencies on OSGi, the
  bundles can also be used like regular JAR files.
</p>

<div class="footer">
  <div class="versioninfo"><a href="@HOMEURL@">JaCoCo</a> @VERSION@</div>
  <a href="license.html">Copyright</a> &copy; 2009, 2010 Mountainminds GmbH &amp; Co. KG and Contributors
</div>

</body>
</html>