<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
<link rel="stylesheet" href=".resources/doc.css" charset="ISO-8859-1" type="text/css" />
<title>JaCoCo - Implementation Design</title>
</head>
<body>
<div class="breadcrumb">
<a href="../index.html" class="el_session">JaCoCo</a> &gt;
<a href="index.html" class="el_group">Documentation</a> &gt;
<span class="el_source">Implementation Design</span>
</div>
<h1>Implementation Design</h1>
<p>
This is an unordered list of implementation design decisions. Each topic tries
to follow this structure:
</p>
<ul>
<li>Problem statement</li>
<li>Proposed Solution</li>
<li>Alternatives and Discussion</li>
</ul>
<h2>Coverage Analysis Mechanism</h2>
<p class="intro">
Coverage information has to be collected at runtime. For this purpose JaCoCo
creates instrumented versions of the original class definitions. The
instrumentation process happens on-the-fly during class loading, using
so-called Java agents.
</p>
<p>
There are several different approaches to collecting coverage information. For
each approach different implementation techniques are known. The following
overview lists them, with the techniques used by JaCoCo highlighted:
</p>
<ul>
<li>Runtime Profiling
<ul>
<li>Java Virtual Machine Profiler Interface (JVMPI), until Java 1.4</li>
<li>Java Virtual Machine Tool Interface (JVMTI), since Java 1.5</li>
</ul>
</li>
<li><span class="high">Instrumentation*</span>
<ul>
<li>Java Source Instrumentation</li>
<li><span class="high">Byte Code Instrumentation*</span>
<ul>
<li>Offline
<ul>
<li>Replace Original Classes In-Place</li>
<li>Inject Instrumented Classes into the Class Path</li>
</ul>
</li>
<li><span class="high">On-The-Fly*</span>
<ul>
<li>Special Classloader Implementations or Framework Specific Hooks</li>
<li><span class="high">Java Agent*</span></li>
</ul>
</li>
</ul>
</li>
</ul>
</li>
</ul>
<p>
Byte code instrumentation is very fast, can be implemented in pure Java and
works with every Java VM. On-the-fly instrumentation with the Java agent
hook can be added to the JVM without any modification of the target
application.
</p>
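<p>
To illustrate the Java agent hook, the following is a minimal sketch and not
JaCoCo's actual agent. The class names <code>SketchAgent</code> and
<code>PassThroughTransformer</code> and the filtering rule are assumptions made
for this example; only the <code>premain</code> entry point and the
<code>ClassFileTransformer</code> interface come from the
<code>java.lang.instrument</code> package. The transformer returns an unchanged
copy where a coverage tool would insert its probes.
</p>

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

/** Hypothetical agent class; names and filtering rule are illustrative only. */
public final class SketchAgent {

    /** Called by the JVM before the application's main method when started with -javaagent. */
    public static void premain(String options, Instrumentation inst) {
        inst.addTransformer(new PassThroughTransformer());
    }

    /** Sees the raw bytes of every class while it is being loaded. */
    static final class PassThroughTransformer implements ClassFileTransformer {

        @Override
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
                byte[] classfileBuffer) {
            if (className == null || className.startsWith("java/")) {
                return null; // null tells the JVM to keep the original definition
            }
            // A real coverage tool would return an instrumented definition here.
            return instrument(classfileBuffer);
        }

        private byte[] instrument(byte[] original) {
            return original.clone(); // placeholder: no probes are inserted
        }
    }
}
```

<p>
Such a class would be packaged in a JAR file whose manifest names it in the
<code>Premain-Class</code> attribute and enabled with the
<code>-javaagent</code> option of the JVM, which is why the target application
itself does not need to change.
</p>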
<p>
The Java agent hook requires a Java 1.5 or later JVM. Class files compiled with
debug information (line numbers) allow for source code highlighting.
Unfortunately, some Java language constructs get compiled to byte code that
produces unexpected highlighting results, especially in the case of implicitly
generated code like default constructors or control structures for finally
statements.
</p>
<h2>Instrumentation Approach</h2>
<p class="intro">
Instrumentation means inserting probes at certain check points in the Java
byte code. A probe is a generated piece of byte code that records the fact
that it has been executed. JaCoCo inserts probes at the end of every basic
block.
</p>
<p>
A basic block is a piece of byte code that has a single entry point (the first
byte code instruction) and a single exit point (like <code>jump</code>,
<code>throw</code> or <code>return</code>). A basic block must not contain jump
targets except the entry point. One can think of basic blocks as the nodes in
a control flow graph of a method. Using basic block boundaries to insert code
coverage probes has been used very successfully by
<a href="http://emma.sourceforge.net/">EMMA</a>.
</p>
<p>
Basic block instrumentation works regardless of whether the class files have been
compiled with debug information for source lines. Source code highlighting
is of course not possible without this debug information, but percentages
on method level can still be calculated. Basic block probes result in
reasonable overhead regarding class file size and performance. Partial line
coverage can occur if a line contains more than one statement or a statement
gets compiled into byte code forming more than one basic block (e.g. boolean
assignments). Calculating basic blocks relies on the Java byte code only;
therefore JaCoCo is independent of the source language and should also work
with other Java VM based languages like
<a href="http://www.scala-lang.org/">Scala</a>.
</p>
<p>
The major drawback of this approach is that basic blocks are actually much
smaller in the Java VM: nearly every byte code instruction (especially method
invocations) can result in an exception. In this case the block is left
somewhere in the middle without hitting the probe, which leads to unexpected
results, for example in negative tests. A possible solution would be to add
exception handlers that trigger special probes.
</p>
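<p>
The effect can be pictured at source level. The sketch below is an
illustration only: JaCoCo works on byte code, and the class
<code>ProbeSketch</code>, the <code>probes</code> array and the probe positions
are assumptions chosen for this example. Each probe is modelled as a flag that
is set once the end of its block has been reached.
</p>

```java
/** Source-level picture of instrumented byte code; all names are illustrative. */
final class ProbeSketch {

    /** One flag per probe; executing a probe simply sets its flag. */
    static final boolean[] probes = new boolean[3];

    static int parse(String text) {
        int value = Integer.parseInt(text); // may throw and leave the block early
        probes[0] = true;                   // probe at the end of the first block
        if (value < 0) {
            value = -value;
            probes[1] = true;               // probe at the end of the negative branch
        }
        probes[2] = true;                   // probe before the return
        return value;
    }
}
```

<p>
Calling <code>ProbeSketch.parse("abc")</code> executes the call to
<code>Integer.parseInt</code>, but the resulting
<code>NumberFormatException</code> leaves the method before the first probe, so
all flags stay <code>false</code> although part of the first block has run.
</p>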
<h2>Coverage Agent Isolation</h2>
<p class="intro">
The Java agent is loaded by the application class loader. Therefore the
classes of the agent live in the same name space as the application classes,
which can result in clashes, especially with the third party library ASM. The
JaCoCo build therefore moves all agent classes into a unique package.
</p>
<p>
The JaCoCo build renames all classes contained in the
<code>jacocoagent.jar</code> into classes with a
<code>org.jacoco.&lt;randomid&gt;</code> prefix, including the required ASM
library classes. The identifier is created from a random number. As the agent
does not provide any API, no one should be affected by this renaming. This
trick also allows JaCoCo's own tests to be verified with JaCoCo.
</p>
<h2>Minimal Java Version</h2>
<p class="intro">
JaCoCo requires Java 1.5.
</p>
<p>
The Java agent mechanism used for on-the-fly instrumentation became available
with Java 1.5 VMs. Coding and testing with Java 1.5 language level is more
efficient, less error-prone, and more fun than with older versions.
JaCoCo can still be run against Java code compiled for these older versions.
</p>
<h2>Byte Code Manipulation</h2>
<p class="intro">
Instrumentation requires mechanisms to modify and generate Java byte code.
JaCoCo uses the ASM library for this purpose internally.
</p>
<p>
Implementing the Java byte code specification would be an extensive and
error-prone task. Therefore an existing library should be used. The
<a href="http://asm.objectweb.org/">ASM</a> library is lightweight, easy to
use and very efficient in terms of memory and CPU usage. It is actively
maintained and includes a huge regression test suite. Its simplified BSD
license is approved by the Eclipse Foundation for usage with EPL products.
</p>
<h2>Java Class Identity</h2>
<p class="intro">
Each class loaded at runtime needs a unique identity to associate coverage data with.
JaCoCo creates such identities from a CRC64 hash code of the raw class definition.
</p>
<p>
In multi-classloader environments the plain name of a class does not
unambiguously identify a class. For example, OSGi allows different
versions of the same class to be loaded within the same VM. In complex
deployment scenarios the actual version of the test target might differ
from the current development version. A code coverage report should guarantee
that the presented figures are extracted from a valid test target. A hash code
of the class definitions makes it possible to differentiate between classes and
versions of classes. The CRC64 hash computation is simple and fast, resulting
in a small 64-bit identifier.
</p>
<p>
The same class definition might be loaded by several class loaders, which
results in different classes for the Java runtime system. For coverage analysis
this distinction should be irrelevant. Class definitions might be altered by
other instrumentation-based technologies (e.g. AspectJ). In this case the hash
code changes and the identity gets lost. On the other hand, code coverage
analysis based on classes that have been altered in some way will produce
unexpected results. The CRC64 code might produce so-called <i>collisions</i>,
i.e. the same hash code for two different classes. Although CRC64 is not
cryptographically strong and collision examples can easily be computed, the
collision probability for regular class files is very low.
</p>
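<p>
As a rough illustration of how such an identifier can be derived from the raw
class bytes, the following sketch implements a table-driven CRC-64. The class
name <code>Crc64Sketch</code> and the choice of the ISO 3309 polynomial are
assumptions made for this example; this text does not specify which CRC64
variant JaCoCo uses.
</p>

```java
/** Table-driven CRC-64 sketch; the polynomial is an assumption for illustration. */
final class Crc64Sketch {

    // Reflected form of the ISO 3309 polynomial x^64 + x^4 + x^3 + x + 1.
    private static final long POLY = 0xD800000000000000L;

    // Precomputed remainder for every possible byte value.
    private static final long[] TABLE = new long[256];

    static {
        for (int i = 0; i < 256; i++) {
            long v = i;
            for (int bit = 0; bit < 8; bit++) {
                v = (v & 1) != 0 ? (v >>> 1) ^ POLY : v >>> 1;
            }
            TABLE[i] = v;
        }
    }

    /** Returns a 64-bit identifier for the given raw class definition. */
    static long checksum(byte[] classDefinition) {
        long sum = 0;
        for (byte b : classDefinition) {
            sum = (sum >>> 8) ^ TABLE[((int) sum ^ b) & 0xFF];
        }
        return sum;
    }
}
```

<p>
Two byte-identical class definitions always yield the same identifier,
independent of the class loader that defines them, while any changed byte is
very likely to yield a different one.
</p>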
<h2>Coverage Runtime Dependency</h2>
<p class="intro">
Instrumented code typically gets a dependency on a coverage runtime which is
responsible for collecting and storing execution data. JaCoCo uses only JRE
types and interfaces in generated instrumentation code.
</p>
<p>
Making a runtime library available to all instrumented classes can be a
painful or impossible task in frameworks that use their own class loading
mechanisms. Therefore JaCoCo decouples the instrumented classes and the
coverage runtime through official JRE API types. Currently two approaches have
been implemented:
</p>
<ul>
<li>By default we use a shared <code>java.util.logging.Logger</code> instance
to report coverage data to. The coverage runtime registers a custom
<code>Handler</code> to receive the data. The problem with this approach is
that the logging framework removes all handlers during shutdown. This may
break classes that get initialized during JVM shutdown.</li>
<li>Another approach was to store a <code>java.util.Map</code> instance
under a system property. This solution breaks the contract that system
properties must only contain <code>java.lang.String</code> values and has
therefore caused trouble in certain environments.</li>
</ul>
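<p>
The logger-based decoupling can be sketched as follows. This is a simplified
illustration, not JaCoCo's runtime: the channel name, the layout of the
argument array and the class <code>LoggerChannelSketch</code> are assumptions
made for this example. The point is that the requesting side references
nothing but <code>java.util.logging</code> types.
</p>

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Sketch of passing data through JRE logging types; names and layout are assumed. */
final class LoggerChannelSketch {

    static final String CHANNEL = "example-coverage-channel"; // hypothetical name

    // Keep a strong reference: the LogManager holds loggers only weakly.
    private static final Logger LOGGER = Logger.getLogger(CHANNEL);

    // Probe arrays collected by the runtime, keyed by class name.
    static final Map<String, boolean[]> STORE = new ConcurrentHashMap<>();

    /** Runtime side: answer every request by placing a probe array in slot 0. */
    static void startRuntime() {
        LOGGER.setUseParentHandlers(false); // keep requests off the console
        LOGGER.setLevel(Level.ALL);
        LOGGER.addHandler(new Handler() {
            @Override
            public void publish(LogRecord record) {
                Object[] args = record.getParameters();
                String className = (String) args[0];
                int probeCount = (Integer) args[1];
                args[0] = STORE.computeIfAbsent(className, k -> new boolean[probeCount]);
            }

            @Override
            public void flush() {
            }

            @Override
            public void close() {
            }
        });
    }

    /** Instrumented-class side; fails with a ClassCastException if no runtime is registered. */
    static boolean[] requestProbes(String className, int probeCount) {
        Object[] args = { className, probeCount };
        Logger.getLogger(CHANNEL).log(Level.INFO, "probes", args);
        return (boolean[]) args[0];
    }
}
```

<p>
Because the log record carries the caller's argument array by reference, the
handler can hand the probe array back through it. If the handler has been
removed, as happens during shutdown, slot 0 still holds the class name and the
request fails, which mirrors the problem described in the first list item.
</p>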
<h2>Memory Usage</h2>
<p class="intro">
Coverage analysis for huge projects with several thousand classes or hundreds
of thousands of lines of code should be possible. To allow this with reasonable
memory usage the coverage analysis is based on streaming patterns and
"depth first" traversals.
</p>
<p>
The complete data tree of a huge coverage report is too big to fit into a
reasonable heap memory configuration. Therefore the coverage analysis and
report generation are implemented as "depth first" traversals, which means that
at any point in time only the following data has to be held in working memory:
</p>
<ul>
<li>A single class which is currently processed.</li>
<li>The summary information of all parents of this class (package, groups).</li>
</ul>
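<p>
A minimal sketch of this idea is shown below. The visitor-style methods, the
<code>Counter</code> layout and the class name
<code>StreamingSummarySketch</code> are hypothetical and not JaCoCo's reporting
API. Each class is written out and folded into its parent as soon as it is
visited, so only the chain of currently open parents stays in memory.
</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

/** Streaming summary sketch; visitor methods and counter layout are hypothetical. */
final class StreamingSummarySketch {

    /** Covered and total line counts of one open group or package. */
    private static final class Counter {
        final String name;
        int covered;
        int total;

        Counter(String name) {
            this.name = name;
        }
    }

    // Only the currently open parents are held in memory.
    private final Deque<Counter> openParents = new ArrayDeque<>();
    private final Consumer<String> out;

    StreamingSummarySketch(Consumer<String> out) {
        this.out = out;
    }

    void enterGroup(String name) {
        openParents.push(new Counter(name));
    }

    /** A class is emitted, folded into its direct parent and then forgotten. */
    void visitClass(String name, int covered, int total) {
        out.accept(name + " " + covered + "/" + total);
        add(openParents.peek(), covered, total);
    }

    /** Closing a group emits its summary and folds it into the next parent, if any. */
    void leaveGroup() {
        Counter done = openParents.pop();
        out.accept(done.name + " " + done.covered + "/" + done.total);
        add(openParents.peek(), done.covered, done.total);
    }

    private static void add(Counter parent, int covered, int total) {
        if (parent != null) {
            parent.covered += covered;
            parent.total += total;
        }
    }
}
```

<p>
With this structure the memory needed grows with the nesting depth of groups
and packages, not with the number of classes in the report.
</p>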
<h2>Java Element Identifiers</h2>
<p class="intro">
The Java language and the Java VM use different String representation formats
for Java elements. For example, while a type reference in Java reads like
<code>java.lang.Object</code>, the VM references the same type as
<code>Ljava/lang/Object;</code>. The JaCoCo API is based on VM identifiers only.
</p>
<p>
Using VM identifiers directly does not cause any transformation overhead at
runtime. There are several programming languages based on the Java VM that
might use different notations. Specific transformations should therefore only
happen at the user interface level, for example during report generation.
</p>
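<p>
A transformation at the user interface level could look like the following
sketch. The helper class <code>VmNameSketch</code> and its method names are
hypothetical. It handles plain class types only; arrays, primitive types and
nested class names are deliberately left out.
</p>

```java
/** Converts VM notation to Java notation for display; a sketch for class types only. */
final class VmNameSketch {

    /** Turns an internal name like "java/lang/Object" into "java.lang.Object". */
    static String javaNameOfInternalName(String vmName) {
        return vmName.replace('/', '.');
    }

    /** Turns a type descriptor like "Ljava/lang/Object;" into "java.lang.Object". */
    static String javaNameOfDescriptor(String descriptor) {
        if (!descriptor.startsWith("L") || !descriptor.endsWith(";")) {
            throw new IllegalArgumentException("not a class type descriptor: " + descriptor);
        }
        return javaNameOfInternalName(descriptor.substring(1, descriptor.length() - 1));
    }
}
```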
<h2>Modularization of the JaCoCo implementation</h2>
<p class="intro">
JaCoCo is implemented in several modules providing different functionality.
These modules are provided as OSGi bundles with proper manifest files. But
there are no dependencies on OSGi itself.
</p>
<p>
Using OSGi bundles allows well defined dependencies at development time and
at runtime in OSGi containers. As there are no dependencies on OSGi, the
bundles can also be used like regular JAR files.
</p>
<div class="footer">
<div class="versioninfo"><a href="@HOMEURL@">JaCoCo</a> @VERSION@</div>
<a href="license.html">Copyright</a> &copy; 2009 Mountainminds GmbH &amp; Co. KG and Contributors
</div>
</body>
</html>