<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
  <link rel="stylesheet" href="book.css" charset="ISO-8859-1" type="text/css" />
  <title>JaCoCo - Implementation Design</title>
</head>
<body>

<h1>JaCoCo - Implementation Design</h1>

<p>
  This is an unordered list of implementation design decisions. Each topic tries
  to follow this structure:
</p>

<ul>
  <li>Problem statement</li>
  <li>Proposed Solution</li>
  <li>Alternatives and Discussion</li>
</ul>


<h2>Coverage Analysis Mechanism</h2>

<p class="Note">
  Coverage information has to be collected at runtime. For this purpose JaCoCo
  creates instrumented versions of the original class definitions. The
  instrumentation process happens on-the-fly during class loading using
  so-called Java agents.
</p>

<p>
  There are several different approaches to collecting coverage information.
  For each approach different implementation techniques are known. The
  following outline gives an overview, with the techniques used by JaCoCo
  highlighted:
</p>

<ul>
  <li>Runtime Profiling
    <ul>
      <li>Java Virtual Machine Profiler Interface (JVMPI), until Java 1.4</li>
      <li>Java Virtual Machine Tool Interface (JVMTI), since Java 1.5</li>
    </ul>
  </li>
  <li><span class="high">Instrumentation*</span>
    <ul>
      <li>Java Source Instrumentation</li>
      <li><span class="high">Byte Code Instrumentation*</span>
        <ul>
          <li>Offline
            <ul>
              <li>Replace Original Classes In-Place</li>
              <li>Inject Instrumented Classes into the Class Path</li>
            </ul>
          </li>
          <li><span class="high">On-The-Fly*</span>
            <ul>
              <li>Special Classloader Implementations or Framework Specific Hooks</li>
              <li><span class="high">Java Agent*</span></li>
            </ul>
          </li>
        </ul>
      </li>
    </ul>
  </li>
</ul>

<p>
  Byte code instrumentation is very fast, can be implemented in pure Java and
  works with every Java VM. On-the-fly instrumentation with the Java agent
  hook can be added to the JVM without any modification of the target
  application.
</p>
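
<p>
  As a rough illustration of the Java agent hook, the following sketch
  registers a class file transformer through the standard
  <code>java.lang.instrument</code> API. The names
  <code>SketchCoverageAgent</code> and <code>SketchTransformer</code>, the
  <code>java/</code> filter and the <code>instrument</code> placeholder are
  assumptions made for this example; they are not taken from the JaCoCo code
  base.
</p>

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical agent class, for illustration only.
class SketchCoverageAgent {

    // Called by the JVM before the application's main() when the JVM is
    // started with -javaagent and the agent JAR names this class as
    // Premain-Class in its manifest.
    public static void premain(String options, Instrumentation inst) {
        inst.addTransformer(new SketchTransformer());
    }

    static class SketchTransformer implements ClassFileTransformer {

        @Override
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
                byte[] classfileBuffer) {
            // Assumed filter: leave JRE classes alone.
            if (className == null || className.startsWith("java/")) {
                return null; // null means "no transformation performed"
            }
            return instrument(classfileBuffer);
        }

        // Placeholder: a real agent would insert probes here with a byte
        // code library. This sketch returns an unmodified copy.
        byte[] instrument(byte[] original) {
            return original.clone();
        }
    }
}
```

<p>
  Because the transformer sees every class definition as it is loaded, the
  target application itself needs no changes; only the JVM start-up options do.
</p>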

<p>
  The Java agent hook requires a JVM of version 1.5 or later. For reporting,
  class files compiled with debug information (line numbers) allow a good
  mapping back to source level. However, some Java language constructs are
  compiled in a way that leads to unexpected coverage highlighting, especially
  in the case of implicitly generated code like default constructors or control
  structures for finally statements.
</p>


<h2>Instrumentation Approach</h2>

<p class="Note">
  Instrumentation means inserting probes at certain check points in the Java
  byte code. A probe is a generated piece of byte code that records the fact
  that it has been executed. JaCoCo inserts probes at the end of every basic
  block.
</p>

<p>
  A basic block is a piece of byte code that has a single entry point (the first
  byte code instruction) and a single exit point (like <code>jump</code>,
  <code>throw</code> or <code>return</code>). A basic block must not contain jump
  targets except the entry point. One can think of basic blocks as the nodes in
  a control flow graph of a method. Using basic block boundaries to insert code
  coverage probes has proven very successful in
  <a href="http://emma.sourceforge.net/">EMMA</a>.
</p>

<p>
  Basic block instrumentation works regardless of whether the class files have
  been compiled with debug information for source lines. Source code
  highlighting will of course not be possible without this debug information,
  but percentages on method level can still be calculated. Basic block probes
  result in reasonable overhead regarding class file size and execution time.
  Because multi-condition statements, for example, form several basic blocks,
  partial line coverage can be detected. Determining basic blocks relies on the
  Java byte code only; therefore JaCoCo is independent of the source language
  and should also work with other Java VM based languages like
  <a href="http://www.scala-lang.org/">Scala</a>.
</p>
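
<p>
  The idea of basic block probes and partial line coverage can be pictured
  with a source-level analogy. Real probes are inserted into byte code, not
  source code; the class <code>ProbeSketch</code>, the <code>probes</code>
  array and the block numbering below are assumptions made for this
  illustration only.
</p>

```java
// Source-level analogy of basic block probes (illustration only).
class ProbeSketch {

    // One flag per basic block; a hypothetical layout.
    static final boolean[] probes = new boolean[4];

    // Instrumented analogue of the one-line method:
    //     return a > 0 && b > 0;
    // The && operator compiles to conditional jumps, so this single source
    // line is split into several basic blocks.
    static boolean bothPositive(int a, int b) {
        probes[0] = true;          // block 1: tests a > 0, ends with a jump
        if (a > 0) {
            probes[1] = true;      // block 2: tests b > 0, ends with a jump
            if (b > 0) {
                probes[2] = true;  // block 3: ends with "return true"
                return true;
            }
        }
        probes[3] = true;          // block 4: ends with "return false"
        return false;
    }

    // Number of blocks whose probe has been hit so far.
    static int coveredBlocks() {
        int count = 0;
        for (boolean hit : probes) {
            if (hit) {
                count++;
            }
        }
        return count;
    }
}
```

<p>
  A test that only calls <code>bothPositive(1, -1)</code> leaves the probe of
  block 3 untouched, so the original source line would be reported as
  partially covered.
</p>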

<p>
  The major drawback of this approach is the fact that basic blocks are
  actually much smaller in the Java VM: nearly every byte code instruction
  (especially method invocations) can result in an exception. In this case the
  block is left somewhere in the middle without hitting the probe, which leads
  to unexpected results, for example in negative tests. A possible solution
  would be to add exception handlers that trigger special probes.
</p>
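
<p>
  The exception problem can be reproduced with a small source-level analogy.
  The class <code>ExceptionProbeSketch</code> and its <code>probeHit</code>
  flag are hypothetical and only stand in for a probe placed at the end of a
  basic block.
</p>

```java
// Source-level analogy of a probe that is skipped by an exception.
class ExceptionProbeSketch {

    // Stands in for the probe at the end of the method's only basic block.
    static boolean probeHit = false;

    static int lengthOf(String s) {
        int n = s.length();  // throws NullPointerException when s is null
        probeHit = true;     // end-of-block probe, not reached on exception
        return n;
    }

    // A negative test: it expects the exception and therefore passes,
    // yet the block it exercised is not recorded as executed.
    static boolean probeHitAfterNegativeTest() {
        probeHit = false;
        try {
            lengthOf(null);
        } catch (NullPointerException expected) {
            // expected by the negative test
        }
        return probeHit;
    }
}
```

<p>
  The call <code>s.length()</code> was executed by the test, but since the
  block was left before its end, a report based on this probe would show the
  line as not covered.
</p>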

<h2>Coverage Agent Isolation</h2>

<p class="Note">
  The Java agent is loaded by the application class loader. Therefore the
  classes of the agent live in the same namespace as the application classes,
  which can result in clashes, especially with the third party library ASM. The
  JaCoCo build therefore moves all agent classes into a unique package.
</p>

<p>
  The JaCoCo build renames all classes contained in the
  <code>jacocoagent.jar</code> into classes with a
  <code>org.jacoco.&lt;randomid&gt;</code> prefix, including the required ASM
  library classes. The identifier is created from a random number. As the agent
  does not provide any API, no one should be affected by this renaming. This
  trick also allows JaCoCo's own tests to be verified with JaCoCo.
</p>


<h2>Minimal Java Version</h2>

<p class="Note">
  JaCoCo requires Java 1.5.
</p>

<p>
  The Java agent mechanism used for on-the-fly instrumentation became available
  with Java 1.5 VMs. Coding and testing with Java 1.5 language level is more
  efficient, less error-prone &ndash; and more fun. JaCoCo can still be run
  against Java code compiled for older versions.
</p>


<h2>Byte Code Manipulation</h2>

<p class="Note">
  Instrumentation requires mechanisms to modify and generate Java byte code.
  JaCoCo uses the ASM library for this purpose.
</p>

<p>
  Implementing the Java byte code specification would be an extensive and
  error-prone task. Therefore an existing library should be used. The
  <a href="http://asm.objectweb.org/">ASM</a> library is lightweight, easy to
  use and very efficient in terms of memory and CPU usage. It is actively
  maintained and includes a huge regression test suite. Its simplified BSD
  license is approved by the Eclipse Foundation for usage with EPL products.
</p>

<h2>Java Class Identity</h2>

<p class="Note">
  Each class loaded at runtime needs a unique identity to associate coverage data with.
  JaCoCo creates such identities from a CRC64 hash code of the raw class definition.
</p>

<p>
  In multi-classloader environments the plain name of a class does not
  unambiguously identify a class. For example OSGi allows different versions of
  the same class to be loaded within the same VM. In complex deployment
  scenarios the actual version of the test target might be different from the
  current development version. A code coverage report should guarantee that
  the presented figures are extracted from a valid test target. A hash code of
  the class definitions makes it possible to differentiate between classes and
  versions of a class. The CRC64 hash computation is simple and fast, resulting
  in a small 64 bit identifier.
</p>
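
<p>
  A table-driven CRC64 over the raw class bytes could look like the following
  sketch. The text above does not name the CRC64 variant, so the polynomial
  constant (the reflected form of the ISO 3309 CRC-64 polynomial) and the
  class name <code>Crc64Sketch</code> are assumptions made for this example.
</p>

```java
// Sketch of a table-driven CRC64 over raw class file bytes.
// Assumption: reflected CRC-64/ISO polynomial; the variant actually used
// is not specified in the surrounding text.
class Crc64Sketch {

    private static final long POLY_REFLECTED = 0xd800000000000000L;

    // Precomputed CRC contribution for every possible byte value.
    private static final long[] TABLE = new long[256];

    static {
        for (int i = 0; i < 256; i++) {
            long v = i;
            for (int bit = 0; bit < 8; bit++) {
                v = (v & 1L) == 1L ? (v >>> 1) ^ POLY_REFLECTED : v >>> 1;
            }
            TABLE[i] = v;
        }
    }

    // Returns a 64 bit identifier for the given class definition.
    static long checksum(byte[] classBytes) {
        long sum = 0L;
        for (byte b : classBytes) {
            int index = ((int) sum ^ b) & 0xff;
            sum = (sum >>> 8) ^ TABLE[index];
        }
        return sum;
    }
}
```

<p>
  The identifier depends only on the bytes of the class definition, so two
  versions of a class with the same name yield different identifiers as long
  as their class files differ and no collision occurs.
</p>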

<p>
  The same class definition might be loaded by different class loaders, which
  results in different classes for the Java runtime system. For coverage
  analysis this distinction should be irrelevant. Class definitions might be
  altered by other instrumentation based technologies (e.g. AspectJ). In this
  case the hash code will change and the identity gets lost. On the other hand,
  code coverage analysis based on classes that have been somehow altered will
  produce unexpected results. The CRC64 hash code might produce so-called
  <i>collisions</i>, i.e. the same hash code for two different classes. Although
  CRC64 is not cryptographically strong and collision examples can be easily
  computed, for regular class files the collision probability is very low.
</p>

<h2>Coverage Runtime Dependency</h2>

<p class="Note">
  Instrumented code typically gets a dependency on a coverage runtime which is
  responsible for collecting and storing execution data. JaCoCo uses only JRE
  types and interfaces in generated instrumentation code.
</p>

<p>
  Making a runtime library available to all instrumented classes can be a
  painful or impossible task in frameworks that use their own class loading
  mechanisms. Therefore JaCoCo decouples the instrumented classes and the
  coverage runtime through official JRE API types.
</p>
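
<p>
  The text does not say which JRE types are used for this decoupling. One
  conceivable way, shown purely as an assumption, is to publish a
  <code>java.util.Map</code> under a system property key so that instrumented
  code needs nothing beyond <code>java.lang</code> and <code>java.util</code>.
  The class <code>JreOnlyRuntimeSketch</code>, the key name and both methods
  are hypothetical.
</p>

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: runtime and instrumented code share data through
// JRE types only. This is not a description of JaCoCo's actual mechanism.
class JreOnlyRuntimeSketch {

    // Assumed key under which the runtime publishes its data store.
    static final String KEY = "sketch.coverage.runtime";

    // Runtime side: publish a plain java.util.Map in the system properties.
    static void startRuntime() {
        Map<Long, boolean[]> store = new HashMap<Long, boolean[]>();
        System.getProperties().put(KEY, store);
    }

    // Instrumented-class side: look the map up again using JRE API only.
    // No class of the coverage runtime is referenced here.
    @SuppressWarnings("unchecked")
    static boolean[] getProbes(long classId, int probeCount) {
        Map<Long, boolean[]> store =
                (Map<Long, boolean[]>) System.getProperties().get(KEY);
        boolean[] probes = store.get(classId);
        if (probes == null) {
            probes = new boolean[probeCount];
            store.put(classId, probes);
        }
        return probes;
    }
}
```

<p>
  Since the lookup involves no runtime library class, it would also work for
  classes defined by a framework class loader that cannot see the coverage
  runtime.
</p>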

<h2>Memory Usage</h2>

<p class="Note">

</p>

<p>
  TODO: Streaming, depth first
</p>

<h2>Java Element Identifiers</h2>

<p class="Note">
  The Java language and the Java VM use different String representation formats
  for Java elements. For example while a type reference in Java reads like
  <code>java.lang.Object</code>, the VM references the same type as
  <code>Ljava/lang/Object;</code>. The JaCoCo API is based on VM identifiers only.
</p>

<p>
  Using VM identifiers directly does not cause any transformation overhead at
  runtime. There are several programming languages based on the Java VM that
  might use different notations. Specific transformations should therefore only
  happen at the user interface level, for example during report generation.
</p>
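
<p>
  A transformation at the user interface level might be as small as the
  following helper, which turns an object type descriptor into Java notation
  for a report. The class <code>VmNameSketch</code> and its method are
  hypothetical, and the sketch deliberately ignores primitive and array
  descriptors.
</p>

```java
// Hypothetical report-side helper: VM notation in, Java notation out.
class VmNameSketch {

    // Converts an object type descriptor such as "Ljava/lang/Object;" into
    // the Java notation "java.lang.Object". Primitive and array descriptors
    // are not handled in this sketch.
    static String toJavaName(String descriptor) {
        if (!descriptor.startsWith("L") || !descriptor.endsWith(";")) {
            throw new IllegalArgumentException(
                    "Not an object type descriptor: " + descriptor);
        }
        // Drop the leading 'L' and trailing ';', then switch separators.
        return descriptor
                .substring(1, descriptor.length() - 1)
                .replace('/', '.');
    }
}
```

<p>
  A report generator for another JVM language could substitute its own
  notation here without touching the collected data.
</p>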

<h2>Modularization of the JaCoCo implementation</h2>

<p class="Note">
  JaCoCo is implemented in several modules providing different functionality.
  These modules are provided as OSGi bundles with proper manifest files.
  However, there are no dependencies on OSGi itself.
</p>

<p>
  Using OSGi bundles allows well-defined dependencies at development time and
  at runtime in OSGi containers. As there are no dependencies on OSGi, the
  bundles can also be used as regular JAR files.
</p>


<hr/>
<div style="float:right">@VERSION@</div>
<div>Copyright &copy; 2009 Mountainminds GmbH &amp; Co. KG, Marc R. Hoffmann</div>

</body>
</html>