Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 1 | <?xml version="1.0" encoding="ISO-8859-1" ?>
|
| 2 | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
|
| 3 | <html xmlns="http://www.w3.org/1999/xhtml" lang="en">
|
| 4 | <head>
|
| 5 | <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
|
| 6 | <link rel="stylesheet" href="book.css" charset="ISO-8859-1" type="text/css" />
|
| 7 | <title>JaCoCo - Implementation Design</title>
|
| 8 | </head>
|
| 9 | <body>
|
| 10 |
|
| 11 | <h1>JaCoCo - Implementation Design</h1>
|
| 12 |
|
| 13 | <p>
|
| 14 | This is an unordered list of implementation design decisions. Each topic tries
|
| 15 | to follow this structure:
|
| 16 | </p>
|
| 17 |
|
| 18 | <ul>
|
| 19 | <li>Problem statement</li>
|
| 20 | <li>Proposed Solution</li>
|
| 21 | <li>Alternatives and Discussion</li>
|
| 22 | </ul>
|
| 23 |
|
| 24 |
|
| 25 | <h2>Coverage Analysis Mechanism</h2>
|
| 26 |
|
| 27 | <p class="Note">
|
| 28 | Coverage information has to be collected at runtime. For this purpose JaCoCo
|
| 29 | creates instrumented versions of the original class definitions. The
|
| 30 | instrumentation process happens on-the-fly during class loading using
|
| 31 | so-called Java agents.
|
| 32 | </p>
|
| 33 |
|
| 34 | <p>
|
| 35 | There are several different approaches to collect coverage information. For
|
| 36 | each approach different implementation techniques are known. The following
|
| 37 | diagram gives an overview with the techniques used by JaCoCo highlighted:
|
| 38 | </p>
|
| 39 |
|
| 40 | <ul>
|
| 41 | <li>Runtime Profiling
|
| 42 | <ul>
|
| 43 | <li>Java Virtual Machine Profiler Interface (JVMPI), until Java 1.4</li>
|
| 44 | <li>Java Virtual Machine Tool Interface (JVMTI), since Java 1.5</li>
|
| 45 | </ul>
|
| 46 | </li>
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 47 | <li><span class="high">Instrumentation*</span>
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 48 | <ul>
|
| 49 | <li>Java Source Instrumentation</li>
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 50 | <li><span class="high">Byte Code Instrumentation*</span>
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 51 | <ul>
|
| 52 | <li>Offline
|
| 53 | <ul>
|
| 54 | <li>Replace Original Classes In-Place</li>
|
| 55 | <li>Inject Instrumented Classes into the Class Path</li>
|
| 56 | </ul>
|
| 57 | </li>
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 58 | <li><span class="high">On-The-Fly*</span>
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 59 | <ul>
|
| 60 | <li>Special Classloader Implementations or Framework Specific Hooks</li>
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 61 | <li><span class="high">Java Agent*</span></li>
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 62 | </ul>
|
| 63 | </li>
|
| 64 | </ul>
|
| 65 | </li>
|
| 66 | </ul>
|
| 67 | </li>
|
| 68 | </ul>
|
| 69 |
|
| 70 | <p>
|
| 71 | Byte code instrumentation is very fast, can be implemented in pure Java and
|
| 72 | works with every Java VM. On-the-fly instrumentation with the Java agent
|
| 73 | hook can be added to the JVM without any modification of the target
|
| 74 | application.
|
| 75 | </p>
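<p>
  The Java agent hook mentioned above can be sketched as follows. This is a
  minimal illustration, not JaCoCo's actual agent: the class names
  <code>CoverageAgentSketch</code> and <code>CountingTransformer</code> are
  hypothetical, and the transformer only counts class definitions instead of
  instrumenting them. It assumes the agent JAR declares the class in its
  <code>Premain-Class</code> manifest attribute and that the JVM is started
  with the <code>-javaagent</code> option.
</p>

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical agent class; the name is illustrative, not JaCoCo's. */
final class CoverageAgentSketch {

    /** Number of class definitions passed to the transformer. */
    static final AtomicInteger SEEN = new AtomicInteger();

    /** A real coverage agent would rewrite the class bytes here; this one only counts. */
    static final class CountingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
                byte[] classfileBuffer) {
            SEEN.incrementAndGet();
            // Returning null tells the JVM to keep the original class definition.
            return null;
        }
    }

    /** Called by the JVM before the application's main method when started with -javaagent:<jar>. */
    public static void premain(String options, Instrumentation inst) {
        inst.addTransformer(new CountingTransformer());
    }
}
```

<p>
  Because the transformer is registered through the JVM itself, the target
  application needs no code changes; only the JVM command line differs.
</p>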
|
| 76 |
|
| 77 | <p>
|
| 78 | The Java agent hook requires a Java 1.5 JVM or later. For reporting, class files
|
| 79 | compiled with debug information (line numbers) allow a good mapping back to
|
| 80 | source level. However, some Java language constructs are compiled in such a way
|
| 81 | that coverage highlighting leads to unexpected results, especially
|
| 82 | in case of implicitly generated code like default constructors or control
|
| 83 | structures for finally statements.
|
| 84 | </p>
|
| 85 |
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 86 |
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 87 | <h2>Instrumentation Approach</h2>
|
| 88 |
|
| 89 | <p class="Note">
|
Marc R. Hoffmann | 872290a | 2009-07-06 15:33:15 +0000 | [diff] [blame^] | 90 | Instrumentation means inserting probes at certain check points in the Java
|
| 91 | byte code. A probe is a generated piece of byte code that records the fact that it
|
| 92 | has been executed. JaCoCo inserts probes at the end of every basic block.
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 93 | </p>
|
| 94 |
|
| 95 | <p>
|
Marc R. Hoffmann | 872290a | 2009-07-06 15:33:15 +0000 | [diff] [blame^] | 96 | A basic block is a piece of byte code that has a single entry point (the first
|
| 97 | byte code instruction) and a single exit point (like <code>jump</code>,
|
| 98 | <code>throw</code> or <code>return</code>). A basic block must not contain jump
|
| 99 | targets except the entry point. One can think of basic blocks as the nodes in
|
| 100 | a control flow graph of a method. Using basic block boundaries to insert code
|
| 101 | coverage probes has been proven very successfully by
|
| 102 | <a href="http://emma.sourceforge.net/">EMMA</a>.
|
| 103 | </p>
|
| 104 |
|
| 105 | <p>
|
| 106 | Basic block instrumentation works regardless of whether the class files have been
|
| 107 | compiled with debug information for source lines. Source code highlighting
|
| 108 | will of course not be possible without this debug information, but percentages
|
| 109 | on method level can still be calculated. Basic block probes result in
|
| 110 | reasonable overhead in class file size and execution time. Because
|
| 111 | multi-condition statements, for example, form several basic blocks, partial line coverage is
|
| 112 | possible. Calculating basic blocks relies on the Java byte code only, therefore
|
| 113 | JaCoCo is independent of the source language and should also work with other
|
| 114 | Java VM based languages like <a href="http://www.scala-lang.org/">Scala</a>.
|
| 115 | </p>
|
| 116 |
|
| 117 | <p>
|
| 118 | The major drawback of this approach is the fact that basic blocks are
|
| 119 | actually much smaller in the Java VM: nearly every byte code instruction
|
| 120 | (especially method invocations) can result in an exception. In this case the
|
| 121 | block is left somewhere in the middle without hitting the probe, which leads
|
| 122 | to unexpected results, for example in case of negative tests. A possible
|
| 123 | solution would be to add exception handlers that trigger special probes.
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 124 | </p>
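<p>
  The effect of probes at the end of basic blocks, including the exception
  drawback, can be illustrated at source level. This is a hand-written sketch:
  real probes are inserted into the byte code, and both the class
  <code>ProbeSketch</code> and the use of a <code>boolean</code> array to
  record execution are assumptions made for this example only.
</p>

```java
import java.util.Arrays;

/** Hand-written illustration of basic block probes; not generated JaCoCo code. */
final class ProbeSketch {

    /** One flag per basic block; set once the end of that block has been reached. */
    static final boolean[] PROBES = new boolean[3];

    /** Clears all probes before a new measurement. */
    static void reset() {
        Arrays.fill(PROBES, false);
    }

    static int absQuotient(int a, int b) {
        int q = a / b;        // throws ArithmeticException when b == 0
        PROBES[0] = true;     // end of block 0, just before the conditional jump
        if (q < 0) {
            q = -q;
            PROBES[1] = true; // end of block 1, the "negative" branch
        }
        PROBES[2] = true;     // end of block 2, just before the return
        return q;
    }

    public static void main(String[] args) {
        try {
            absQuotient(1, 0);
        } catch (ArithmeticException e) {
            // The method was entered, but block 0 was left before its probe was hit.
            System.out.println("block 0 recorded: " + PROBES[0]);
        }
    }
}
```

<p>
  In the division-by-zero case the first block is left in the middle, so its
  probe is never set although part of the block was executed.
</p>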
|
| 125 |
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 126 | <h2>Coverage Agent Isolation</h2>
|
| 127 |
|
| 128 | <p class="Note">
|
| 129 | The Java agent is loaded by the application class loader. Therefore the
|
| 130 | classes of the agent live in the same name space as the application classes,
|
| 131 | which can result in clashes, especially with the third party library ASM. The
|
| 132 | JaCoCo build therefore moves all agent classes into a unique package.
|
| 133 | </p>
|
| 134 |
|
| 135 | <p>
|
| 136 | The JaCoCo build renames all classes contained in the
|
| 137 | <code>jacocoagent.jar</code> into classes with a
|
Marc R. Hoffmann | 0948cb9 | 2009-07-06 09:15:28 +0000 | [diff] [blame] | 138 | <code>org.jacoco.&lt;randomid&gt;</code> prefix, including the required ASM
|
| 139 | library classes. The identifier is created from a random number. As the agent
|
| 140 | does not provide any API, no one should be affected by this renaming. This
|
| 141 | trick also allows JaCoCo's own tests to be verified with JaCoCo.
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 142 | </p>
|
| 143 |
|
| 144 |
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 145 | <h2>Minimal Java Version</h2>
|
| 146 |
|
| 147 | <p class="Note">
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 148 | JaCoCo requires Java 1.5.
|
| 149 | </p>
|
| 150 |
|
| 151 | <p>
|
| 152 | The Java agent mechanism used for on-the-fly instrumentation became available
|
| 153 | with Java 1.5 VMs. Coding and testing with Java 1.5 language level is more
|
| 154 | efficient, less error-prone, and more fun. JaCoCo can still be
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 155 | run against Java code compiled for older versions.
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 156 | </p>
|
| 157 |
|
| 158 |
|
| 159 | <h2>Byte Code Manipulation</h2>
|
| 160 |
|
| 161 | <p class="Note">
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 162 | Instrumentation requires mechanisms to modify and generate Java byte code.
|
| 163 | JaCoCo uses the ASM library for this purpose.
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 164 | </p>
|
| 165 |
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 166 | <p>
|
| 167 | Implementing the Java byte code specification would be an extensive and
|
| 168 | error-prone task. Therefore an existing library should be used. The
|
| 169 | <a href="http://asm.objectweb.org/">ASM</a> library is lightweight, easy to
|
| 170 | use and very efficient in terms of memory and CPU usage. It is actively
|
| 171 | maintained and includes a huge regression test suite. Its simplified BSD
|
| 172 | license is approved by the Eclipse Foundation for usage with EPL products.
|
| 173 | </p>
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 174 |
|
| 175 | <h2>Java Class Identity</h2>
|
| 176 |
|
| 177 | <p class="Note">
|
| 178 | Each class loaded at runtime needs a unique identity to associate coverage data with.
|
| 179 | JaCoCo creates such identities by a CRC64 hash code of the raw class definition.
|
| 180 | </p>
|
| 181 |
|
| 182 | <p>
|
| 183 | In multi-classloader environments the plain name of a class does not
|
| 184 | unambiguously identify a class. For example, OSGi allows different
|
| 185 | versions of the same class to be loaded within the same VM. In complex
|
| 186 | deployment scenarios the actual version of the test target might be different
|
| 187 | from the current development version. A code coverage report should guarantee that
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 188 | the presented figures are extracted from a valid test target. A hash code of
|
| 189 | the class definitions makes it possible to differentiate between classes and versions of a
|
| 190 | class. The CRC64 hash computation is simple and fast, resulting in a small
|
| 191 | 64-bit identifier.
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 192 | </p>
|
| 193 |
|
| 194 | <p>
|
| 195 | The same class definition might be loaded by multiple class loaders, which will result
|
| 196 | in different classes for the Java runtime system. For coverage analysis this
|
| 197 | distinction should be irrelevant. Class definitions might be altered by other
|
| 198 | instrumentation based technologies (e.g. AspectJ). In this case the hash code
|
| 199 | will change and identity gets lost. On the other hand code coverage analysis
|
| 200 | based on classes that have been somehow altered will produce unexpected
|
| 201 | results. The CRC64 hash code might produce so-called <i>collisions</i>, i.e.
|
| 202 | creating the same hash code for two different classes. Although CRC64 is not
|
| 203 | cryptographically strong and collision examples can be easily computed, for
|
| 204 | regular class files the collision probability is very low.
|
| 205 | </p>
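<p>
  Deriving a 64-bit identity from the raw class definition can be sketched as
  below. The text does not name a CRC64 variant, so this example assumes the
  ECMA-182 generator polynomial with an initial value of zero; the class
  <code>Crc64Sketch</code> and its method names are hypothetical and may differ
  from what JaCoCo actually computes.
</p>

```java
/** Bitwise CRC-64 sketch; polynomial and initial value are assumptions. */
final class Crc64Sketch {

    /** ECMA-182 generator polynomial (assumed; the document names no polynomial). */
    private static final long POLY = 0x42F0E1EBA9EA3693L;

    /** Computes a 64-bit checksum over the raw bytes of a class definition. */
    static long checksum(byte[] classDefinition) {
        long crc = 0L;
        for (byte b : classDefinition) {
            // Feed the next byte into the top eight bits of the register.
            crc ^= ((long) (b & 0xFF)) << 56;
            for (int i = 0; i < 8; i++) {
                // A negative value means the most significant bit is set.
                crc = (crc < 0) ? (crc << 1) ^ POLY : (crc << 1);
            }
        }
        return crc;
    }

    public static void main(String[] args) {
        // Stand-ins for two versions of the same class file.
        byte[] version1 = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 1};
        byte[] version2 = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 2};
        System.out.println(Long.toHexString(checksum(version1)));
        System.out.println(Long.toHexString(checksum(version2)));
    }
}
```

<p>
  Identical definitions always yield the same identifier regardless of the
  class loader, while any altered definition is very likely to yield a
  different one.
</p>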
|
| 206 |
|
| 207 | <h2>Coverage Runtime Dependency</h2>
|
| 208 |
|
| 209 | <p class="Note">
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 210 | Instrumented code typically gets a dependency on a coverage runtime, which is
|
| 211 | responsible for collecting and storing execution data. JaCoCo uses JRE types
|
| 212 | and interfaces only in generated instrumentation code.
|
| 213 | </p>
|
| 214 |
|
| 215 | <p>
|
| 216 | Making a runtime library available to all instrumented classes can be a
|
| 217 | painful or impossible task in frameworks that use their own class loading
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 218 | mechanisms. Therefore JaCoCo decouples the instrumented classes and the
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 219 | coverage runtime through official JRE API types.
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 220 | </p>
|
| 221 |
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 222 | <h2>Memory Usage</h2>
|
| 223 |
|
| 224 | <p class="Note">
|
| 225 |
|
| 226 | </p>
|
| 227 |
|
| 228 | <p>
|
Marc R. Hoffmann | 872290a | 2009-07-06 15:33:15 +0000 | [diff] [blame^] | 229 | TODO: Streaming, Deep first
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 230 | </p>
|
| 231 |
|
| 232 | <h2>Java Element Identifiers</h2>
|
| 233 |
|
| 234 | <p class="Note">
|
| 235 | The Java language and the Java VM use different String representation formats
|
| 236 | for Java elements. For example while a type reference in Java reads like
|
| 237 | <code>java.lang.Object</code>, the VM references the same type as
|
| 238 | <code>Ljava/lang/Object;</code>. The JaCoCo API is based on VM identifiers only.
|
| 239 | </p>
|
| 240 |
|
| 241 | <p>
|
| 242 | Using VM identifiers directly does not cause any transformation overhead at
|
| 243 | runtime. There are several programming languages based on the Java VM that
|
| 244 | might use different notations. Specific transformations should therefore only
|
| 245 | happen at the user interface level, for example during report generation.
|
| 246 | </p>
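<p>
  A transformation at the user interface level could look like the following
  sketch, which turns a VM type descriptor into Java notation. The class
  <code>VmNames</code> and its method are hypothetical helpers, not part of the
  JaCoCo API, and only plain object types are handled (no arrays or
  primitives).
</p>

```java
/** Hypothetical report-side helper; not part of the JaCoCo API. */
final class VmNames {

    /** Converts a VM object type descriptor such as Ljava/lang/Object; to Java notation. */
    static String toJavaName(String descriptor) {
        boolean objectType = descriptor.length() >= 3
                && descriptor.charAt(0) == 'L'
                && descriptor.endsWith(";");
        if (!objectType) {
            throw new IllegalArgumentException("Not an object type descriptor: " + descriptor);
        }
        // Drop the leading 'L' and the trailing ';', then switch the separators.
        return descriptor.substring(1, descriptor.length() - 1).replace('/', '.');
    }

    public static void main(String[] args) {
        System.out.println(toJavaName("Ljava/lang/Object;"));
    }
}
```

<p>
  Keeping this conversion out of the core API means a report for another JVM
  language can apply its own notation to the same VM identifiers.
</p>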
|
| 247 |
|
| 248 | <h2>Modularization of the JaCoCo implementation</h2>
|
| 249 |
|
| 250 | <p class="Note">
|
| 251 | JaCoCo is implemented in several modules providing different functionality.
|
| 252 | These modules are provided as OSGi bundles with proper manifest files. However,
|
| 253 | there are no dependencies on OSGi itself.
|
| 254 | </p>
|
| 255 |
|
| 256 | <p>
|
| 257 | Using OSGi bundles allows well-defined dependencies at development time and
|
| 258 | at runtime in OSGi containers. As there are no dependencies on OSGi, the
|
| 259 | bundles can also be used as regular JAR files.
|
| 260 | </p>
|
| 261 |
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 262 |
|
| 263 | <hr/>
|
| 264 | <div style="float:right">@VERSION@</div>
|
| 265 | <div>Copyright © 2009 Mountainminds GmbH &amp; Co. KG, Marc R. Hoffmann</div>
|
| 266 |
|
| 267 | </body>
|
| 268 | </html> |