Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 1 | <?xml version="1.0" encoding="ISO-8859-1" ?>
|
| 2 | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
|
| 3 | <html xmlns="http://www.w3.org/1999/xhtml" lang="en">
|
| 4 | <head>
|
| 5 | <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 6 | <link rel="stylesheet" href=".resources/doc.css" charset="ISO-8859-1" type="text/css" />
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 7 | <title>JaCoCo - Implementation Design</title>
|
| 8 | </head>
|
| 9 | <body>
|
| 10 |
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 11 | <div class="breadcrumb">
|
| 12 | <a href="../index.html" class="el_session">JaCoCo</a> >
|
| 13 | <a href="index.html" class="el_group">Documentation</a> >
|
| 14 | <span class="el_source">Implementation Design</span>
|
| 15 | </div>
|
| 16 |
|
| 17 | <h1>Implementation Design</h1>
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 18 |
|
| 19 | <p>
|
| 20 | This is an unordered list of implementation design decisions. Each topic tries
|
| 21 | to follow this structure:
|
| 22 | </p>
|
| 23 |
|
| 24 | <ul>
|
| 25 | <li>Problem statement</li>
|
| 26 | <li>Proposed Solution</li>
|
| 27 | <li>Alternatives and Discussion</li>
|
| 28 | </ul>
|
| 29 |
|
| 30 |
|
| 31 | <h2>Coverage Analysis Mechanism</h2>
|
| 32 |
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 33 | <p class="intro">
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 34 | Coverage information has to be collected at runtime. For this purpose JaCoCo
|
| 35 | creates instrumented versions of the original class definitions. The
|
| 36 | instrumentation process happens on-the-fly during class loading using
|
| 37 | so-called Java agents.
|
| 38 | </p>
|
| 39 |
|
| 40 | <p>
|
| 41 | There are several different approaches to collect coverage information. For
|
| 42 | each approach different implementation techniques are known. The following
|
| 43 | outline gives an overview, with the techniques used by JaCoCo highlighted:
|
| 44 | </p>
|
| 45 |
|
| 46 | <ul>
|
| 47 | <li>Runtime Profiling
|
| 48 | <ul>
|
| 49 | <li>Java Virtual Machine Profiler Interface (JVMPI), until Java 1.4</li>
|
| 50 | <li>Java Virtual Machine Tool Interface (JVMTI), since Java 1.5</li>
|
| 51 | </ul>
|
| 52 | </li>
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 53 | <li><span class="high">Instrumentation*</span>
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 54 | <ul>
|
| 55 | <li>Java Source Instrumentation</li>
|
Marc R. Hoffmann | c4b2078 | 2009-10-02 13:28:46 +0000 | [diff] [blame] | 56 | <li><span class="high">Byte Code Instrumentation*</span>
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 57 | <ul>
|
| 58 | <li>Offline
|
| 59 | <ul>
|
| 60 | <li>Replace Original Classes In-Place</li>
|
| 61 | <li>Inject Instrumented Classes into the Class Path</li>
|
| 62 | </ul>
|
| 63 | </li>
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 64 | <li><span class="high">On-The-Fly*</span>
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 65 | <ul>
|
| 66 | <li>Special Classloader Implementations or Framework Specific Hooks</li>
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 67 | <li><span class="high">Java Agent*</span></li>
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 68 | </ul>
|
| 69 | </li>
|
| 70 | </ul>
|
| 71 | </li>
|
| 72 | </ul>
|
| 73 | </li>
|
| 74 | </ul>
|
| 75 |
|
| 76 | <p>
|
| 77 | Byte code instrumentation is very fast, can be implemented in pure Java and
|
| 78 | works with every Java VM. On-the-fly instrumentation with the Java agent
|
| 79 | hook can be added to the JVM without any modification of the target
|
| 80 | application.
|
| 81 | </p>
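<p>
  As a rough illustration of the Java agent hook, the following minimal sketch
  registers a class file transformer through the standard
  <code>java.lang.instrument</code> API. The class names
  <code>SketchAgent</code> and <code>PassThroughTransformer</code> are
  hypothetical and not part of JaCoCo; the transformer leaves every class
  unchanged, where a coverage tool would return instrumented byte code instead.
</p>

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical agent class; the name is an assumption for this sketch.
class SketchAgent {

    // Called by the JVM before main() when started with -javaagent:<jar>.
    // The agent JAR manifest has to name this class as its Premain-Class.
    public static void premain(String options, Instrumentation inst) {
        inst.addTransformer(new PassThroughTransformer());
    }
}

// Hypothetical transformer that sees every class definition as it is loaded.
class PassThroughTransformer implements ClassFileTransformer {

    @Override
    public byte[] transform(ClassLoader loader, String className,
            Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
            byte[] classfileBuffer) {
        // A coverage agent would return an instrumented copy of
        // classfileBuffer here. Returning null keeps the class unchanged.
        return null;
    }
}
```

<p>
  Because the hook is attached with a JVM option, the target application does
  not need to be modified or recompiled.
</p>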
|
| 82 |
|
| 83 | <p>
|
| 84 | The Java agent hook requires a JVM of version 1.5 or later. For reporting,
|
| 85 | class files compiled with debug information (line numbers) allow a good
|
| 86 | mapping back to source level. However, some Java language constructs are
|
| 87 | compiled in a way that leads to unexpected coverage highlighting, especially
|
| 88 | in the case of implicitly generated code like default constructors or control
|
| 89 | structures for finally statements.
|
| 90 | </p>
|
| 91 |
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 92 |
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 93 | <h2>Instrumentation Approach</h2>
|
| 94 |
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 95 | <p class="intro">
|
Marc R. Hoffmann | 872290a | 2009-07-06 15:33:15 +0000 | [diff] [blame] | 96 | Instrumentation means inserting probes at certain check points in the Java
|
Marc R. Hoffmann | c4b2078 | 2009-10-02 13:28:46 +0000 | [diff] [blame] | 97 | byte code. A probe is a generated piece of byte code that records the fact
|
| 98 | that it has been executed. JaCoCo inserts probes at the end of every basic
|
| 99 | block.
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 100 | </p>
|
| 101 |
|
| 102 | <p>
|
Marc R. Hoffmann | 872290a | 2009-07-06 15:33:15 +0000 | [diff] [blame] | 103 | A basic block is a piece of byte code that has a single entry point (the first
|
| 104 | byte code instruction) and a single exit point (like <code>jump</code>,
|
| 105 | <code>throw</code> or <code>return</code>). A basic block must not contain jump
|
| 106 | targets except the entry point. One can think of basic blocks as the nodes in
|
| 107 | a control flow graph of a method. Using basic block boundaries to insert code
|
| 108 | coverage probes has proven very successful in
|
| 109 | <a href="http://emma.sourceforge.net/">EMMA</a>.
|
| 110 | </p>
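<p>
  To make the idea of probes more concrete, here is a source-level sketch of
  what an instrumented method could look like. Real instrumentation works on
  byte code, not on source; the <code>probes</code> array and the class name
  <code>ProbeSketch</code> are assumptions made for illustration only.
</p>

```java
// Hypothetical source-level view of a method after instrumentation.
class ProbeSketch {

    // One flag per probe; true means the probe has been executed.
    static final boolean[] probes = new boolean[3];

    static int abs(int x) {
        if (x < 0) {
            x = -x;
            probes[0] = true; // end of the block for the negative branch
        } else {
            probes[1] = true; // end of the block for the non-negative branch
        }
        probes[2] = true;     // end of the block that returns the result
        return x;
    }
}
```

<p>
  After a single call with a non-negative argument, only the second and third
  flags are set, so the negative branch shows up as not covered.
</p>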
|
| 111 |
|
| 112 | <p>
|
| 113 | Basic block instrumentation works regardless of whether the class files have
|
| 114 | been compiled with debug information for source lines. Source code highlighting
|
| 115 | will of course not be possible without this debug information, but percentages
|
| 116 | on method level can still be calculated. Basic block probes result in
|
| 117 | reasonable overhead in class file size and execution time. Because
|
| 118 | multi-condition statements, for example, form several basic blocks, partial line
|
| 119 | coverage is possible. Calculating basic blocks relies on the Java byte code only;
|
| 120 | therefore JaCoCo is independent of the source language and should also work with
|
| 121 | other Java VM based languages like <a href="http://www.scala-lang.org/">Scala</a>.
|
| 122 | </p>
|
| 123 |
|
| 124 | <p>
|
| 125 | The major drawback of this approach is the fact that basic blocks are
|
| 126 | actually much smaller in the Java VM: nearly every byte code instruction
|
| 127 | (especially method invocations) can result in an exception. In this case the
|
| 128 | block is left somewhere in the middle without hitting the probe, which leads
|
| 129 | to unexpected results, for example in negative tests. A possible
|
| 130 | solution would be to add exception handlers that trigger special probes.
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 131 | </p>
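<p>
  The effect of an exception on a block-end probe can be sketched at source
  level as follows. The class <code>ExceptionSketch</code> and its
  <code>probes</code> array are hypothetical; the point is only that the probe
  sits after the statement that may throw.
</p>

```java
// Hypothetical instrumented method whose single block can be left early.
class ExceptionSketch {

    static final boolean[] probes = new boolean[1];

    static int parseAndDouble(String s) {
        int value = Integer.parseInt(s); // may throw NumberFormatException
        int doubled = value * 2;
        probes[0] = true;                // only reached if nothing threw
        return doubled;
    }
}
```

<p>
  A negative test that passes an invalid string executes the first statement,
  yet the block is reported as not covered because the probe is never reached.
</p>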
|
| 132 |
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 133 | <h2>Coverage Agent Isolation</h2>
|
| 134 |
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 135 | <p class="intro">
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 136 | The Java agent is loaded by the application class loader. Therefore the
|
| 137 | classes of the agent live in the same name space as the application classes,
|
| 138 | which can result in clashes, especially with the third party library ASM. The
|
| 139 | JaCoCo build therefore moves all agent classes into a unique package.
|
| 140 | </p>
|
| 141 |
|
| 142 | <p>
|
| 143 | The JaCoCo build renames all classes contained in the
|
| 144 | <code>jacocoagent.jar</code> into classes with a
|
Marc R. Hoffmann | 0948cb9 | 2009-07-06 09:15:28 +0000 | [diff] [blame] | 145 | <code>org.jacoco.&lt;randomid&gt;</code> prefix, including the required ASM
|
| 146 | library classes. The identifier is created from a random number. As the agent
|
| 147 | does not provide any API, no one should be affected by this renaming. This
|
| 148 | trick also allows JaCoCo's own tests to be verified with JaCoCo.
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 149 | </p>
|
| 150 |
|
| 151 |
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 152 | <h2>Minimal Java Version</h2>
|
| 153 |
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 154 | <p class="intro">
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 155 | JaCoCo requires Java 1.5.
|
| 156 | </p>
|
| 157 |
|
| 158 | <p>
|
| 159 | The Java agent mechanism used for on-the-fly instrumentation became available
|
| 160 | with Java 1.5 VMs. Coding and testing with Java 1.5 language level is more
|
| 161 | efficient, less error-prone, and more fun. JaCoCo can still be
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 162 | run against Java code compiled for older versions.
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 163 | </p>
|
| 164 |
|
| 165 |
|
| 166 | <h2>Byte Code Manipulation</h2>
|
| 167 |
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 168 | <p class="intro">
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 169 | Instrumentation requires mechanisms to modify and generate Java byte code.
|
| 170 | JaCoCo uses the ASM library for this purpose.
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 171 | </p>
|
| 172 |
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 173 | <p>
|
| 174 | Implementing the Java byte code specification would be an extensive and
|
| 175 | error-prone task. Therefore an existing library should be used. The
|
| 176 | <a href="http://asm.objectweb.org/">ASM</a> library is lightweight, easy to
|
| 177 | use and very efficient in terms of memory and CPU usage. It is actively
|
| 178 | maintained and includes a huge regression test suite. Its simplified BSD
|
| 179 | license is approved by the Eclipse Foundation for usage with EPL products.
|
| 180 | </p>
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 181 |
|
| 182 | <h2>Java Class Identity</h2>
|
| 183 |
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 184 | <p class="intro">
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 185 | Each class loaded at runtime needs a unique identity to associate coverage data with.
|
| 186 | JaCoCo creates such identities by a CRC64 hash code of the raw class definition.
|
| 187 | </p>
|
| 188 |
|
| 189 | <p>
|
| 190 | In multi-classloader environments the plain name of a class does not
|
| 191 | unambiguously identify a class. For example, OSGi allows different
|
| 192 | versions of the same class to be loaded within the same VM. In complex
|
| 193 | deployment scenarios the actual version of the test target might be different
|
| 194 | from the current development version. A code coverage report should guarantee that
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 195 | the presented figures are extracted from a valid test target. A hash code of
|
| 196 | the class definitions allows differentiating between classes and versions of a
|
| 197 | class. The CRC64 hash computation is simple and fast, resulting in a small
|
| 198 | 64-bit identifier.
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 199 | </p>
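<p>
  A class identifier of this kind could be computed roughly as in the
  following sketch, which runs a table-driven CRC64 over the raw class file
  bytes. The polynomial constant, the initial value of zero and the class name
  <code>Crc64Sketch</code> are assumptions for illustration and may differ
  from what JaCoCo actually uses.
</p>

```java
// Hypothetical helper; parameters of the checksum are assumptions.
class Crc64Sketch {

    // Assumed polynomial in reflected form.
    private static final long POLY = 0xD800000000000000L;

    // Lookup table with the checksum contribution of every byte value.
    private static final long[] TABLE = new long[256];

    static {
        for (int i = 0; i < 256; i++) {
            long crc = i;
            for (int bit = 0; bit < 8; bit++) {
                crc = (crc & 1) == 1 ? (crc >>> 1) ^ POLY : crc >>> 1;
            }
            TABLE[i] = crc;
        }
    }

    // Returns a 64-bit identifier for the given raw class definition.
    static long checksum(byte[] classDefinition) {
        long crc = 0;
        for (byte b : classDefinition) {
            crc = TABLE[(int) ((crc ^ b) & 0xFF)] ^ (crc >>> 8);
        }
        return crc;
    }
}
```

<p>
  Two byte-identical class files always yield the same identifier, regardless
  of which class loader defines them, while a recompiled or altered class file
  will almost always yield a different one.
</p>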
|
| 200 |
|
| 201 | <p>
|
| 202 | The same class definition might be loaded by several class loaders, which will result
|
| 203 | in different classes for the Java runtime system. For coverage analysis this
|
| 204 | distinction should be irrelevant. Class definitions might be altered by other
|
| 205 | instrumentation based technologies (e.g. AspectJ). In this case the hash code
|
| 206 | will change and the identity gets lost. On the other hand, code coverage analysis
|
| 207 | based on classes that have been somehow altered will produce unexpected
|
| 208 | results. The CRC64 hash code might produce so-called <i>collisions</i>, i.e.
|
| 209 | creating the same hash code for two different classes. Although CRC64 is not
|
| 210 | cryptographically strong and collision examples can be easily computed, for
|
| 211 | regular class files the collision probability is very low.
|
| 212 | </p>
|
| 213 |
|
| 214 | <h2>Coverage Runtime Dependency</h2>
|
| 215 |
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 216 | <p class="intro">
|
Marc R. Hoffmann | e52a0ef | 2009-06-16 20:28:45 +0000 | [diff] [blame] | 217 | Instrumented code typically gets a dependency on a coverage runtime which is
|
| 218 | responsible for collecting and storing execution data. JaCoCo uses only JRE
|
| 219 | types and interfaces in generated instrumentation code.
|
| 220 | </p>
|
| 221 |
|
| 222 | <p>
|
| 223 | Making a runtime library available to all instrumented classes can be a
|
| 224 | painful or impossible task in frameworks that use their own class loading
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 225 | mechanisms. Therefore JaCoCo decouples the instrumented classes and the
|
Marc R. Hoffmann | 347cfed | 2009-09-07 19:15:54 +0000 | [diff] [blame] | 226 | coverage runtime through official JRE API types. Currently two approaches have
|
Marc R. Hoffmann | 402370f | 2009-08-10 14:02:23 +0000 | [diff] [blame] | 227 | been implemented:
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 228 | </p>
|
| 229 |
|
Marc R. Hoffmann | 402370f | 2009-08-10 14:02:23 +0000 | [diff] [blame] | 230 | <ul>
|
Marc R. Hoffmann | 347cfed | 2009-09-07 19:15:54 +0000 | [diff] [blame] | 231 | <li>By default we use a shared <code>java.util.logging.Logger</code> instance
|
| 232 | to report coverage data to. The coverage runtime registers a custom
|
| 233 | <code>Handler</code> to receive the data. The problem with this approach is
|
| 234 | that the logging framework removes all handlers during shutdown. This may
|
| 235 | break classes that get initialized during JVM shutdown.</li>
|
Marc R. Hoffmann | 402370f | 2009-08-10 14:02:23 +0000 | [diff] [blame] | 236 | <li>Another approach was to store a <code>java.util.Map</code> instance
|
| 237 | under a system property. This solution breaks the contract that system
|
| 238 | properties must only contain <code>java.lang.String</code> values and has
|
| 239 | therefore caused trouble in certain environments.</li>
|
| 240 | </ul>
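<p>
  The logger-based decoupling described in the first item can be sketched with
  plain <code>java.util.logging</code> types. In this sketch the instrumented
  code passes an empty parameter array through a shared logger, and the
  handler registered by the runtime fills it with the probe array. The channel
  name, the message key and the class name <code>LoggerRuntimeSketch</code>
  are assumptions, not JaCoCo's actual values.
</p>

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical runtime that hands out probe arrays through a JRE logger.
class LoggerRuntimeSketch {

    static final String CHANNEL = "coverage-sketch-channel"; // assumed name
    static final boolean[] PROBES = new boolean[4];

    // A strong reference keeps the logger and its handler alive.
    static final Logger LOGGER = Logger.getLogger(CHANNEL);

    // Runtime side: register a handler that answers requests.
    static void startup() {
        LOGGER.setUseParentHandlers(false); // do not print to the console
        LOGGER.setLevel(Level.ALL);
        LOGGER.addHandler(new Handler() {
            @Override
            public void publish(LogRecord record) {
                // The caller's array is passed by reference, so writing
                // into it returns the probe array to the caller.
                record.getParameters()[0] = PROBES;
            }

            @Override
            public void flush() {
            }

            @Override
            public void close() {
            }
        });
    }

    // Instrumented-code side: uses JRE types only, no runtime classes.
    static boolean[] requestProbes() {
        Object[] args = new Object[1];
        Logger.getLogger(CHANNEL).log(Level.INFO, "probes", args);
        return (boolean[]) args[0];
    }
}
```

<p>
  If the handler has already been removed, for example during JVM shutdown,
  the parameter array stays empty, which corresponds to the failure mode
  mentioned above.
</p>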
|
| 241 |
|
| 242 |
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 243 | <h2>Memory Usage</h2>
|
| 244 |
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 245 | <p class="intro">
|
Marc R. Hoffmann | 58d7621 | 2009-10-08 15:40:46 +0000 | [diff] [blame^] | 246 | Coverage analysis for huge projects with several thousand classes or hundreds
|
| 247 | of thousands of lines of code should be possible. To allow this with reasonable
|
| 248 | memory usage the coverage analysis is based on streaming patterns and
|
| 249 | "depth first" traversals.
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 250 | </p>
|
| 251 |
|
| 252 | <p>
|
Marc R. Hoffmann | 58d7621 | 2009-10-08 15:40:46 +0000 | [diff] [blame^] | 253 | The complete data tree of a huge coverage report is too big to fit into a
|
| 254 | reasonable heap memory configuration. Therefore the coverage analysis and
|
| 255 | report generation are implemented as "depth first" traversals, which means that
|
| 256 | at any point in time only the following data has to be held in main
|
| 257 | memory:
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 258 | </p>
|
| 259 |
|
Marc R. Hoffmann | 58d7621 | 2009-10-08 15:40:46 +0000 | [diff] [blame^] | 260 | <ul>
|
| 261 | <li>A single class which is currently processed.</li>
|
| 262 | <li>The summary information of all parents of this class (package, groups).</li>
|
| 263 | </ul>
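<p>
  The memory behaviour listed above can be illustrated with a small streaming
  sketch. Classes arrive one at a time, grouped by package, and only the
  current class plus the running summaries of its package and of the enclosing
  group are kept. All type and method names here are hypothetical and much
  simpler than a real report model.
</p>

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical streaming report; names and structure are assumptions.
class StreamingReportSketch {

    // Coverage figures of a single class.
    static final class ClassCoverage {
        final String packageName;
        final int coveredBlocks;
        final int totalBlocks;

        ClassCoverage(String packageName, int coveredBlocks, int totalBlocks) {
            this.packageName = packageName;
            this.coveredBlocks = coveredBlocks;
            this.totalBlocks = totalBlocks;
        }
    }

    // Running totals for a parent element (package or group).
    static final class Summary {
        int covered;
        int total;

        void add(ClassCoverage c) {
            covered += c.coveredBlocks;
            total += c.totalBlocks;
        }
    }

    // Consumes classes grouped by package and writes one line per package.
    static Summary report(Iterator<ClassCoverage> classes, List<String> output) {
        Summary group = new Summary();
        Summary pkg = new Summary();
        String currentPackage = null;
        while (classes.hasNext()) {
            ClassCoverage c = classes.next(); // the only class held in memory
            if (currentPackage != null && !currentPackage.equals(c.packageName)) {
                // The previous package is complete and can be written out.
                output.add(currentPackage + ": " + pkg.covered + "/" + pkg.total);
                pkg = new Summary();
            }
            currentPackage = c.packageName;
            pkg.add(c);
            group.add(c);
        }
        if (currentPackage != null) {
            output.add(currentPackage + ": " + pkg.covered + "/" + pkg.total);
        }
        return group;
    }
}
```

<p>
  Memory use then depends on the nesting depth of the report, not on the total
  number of classes.
</p>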
|
| 264 |
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 265 | <h2>Java Element Identifiers</h2>
|
| 266 |
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 267 | <p class="intro">
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 268 | The Java language and the Java VM use different String representation formats
|
| 269 | for Java elements. For example while a type reference in Java reads like
|
| 270 | <code>java.lang.Object</code>, the VM references the same type as
|
| 271 | <code>Ljava/lang/Object;</code>. The JaCoCo API is based on VM identifiers only.
|
| 272 | </p>
|
| 273 |
|
| 274 | <p>
|
| 275 | Using VM identifiers directly does not cause any transformation overhead at
|
| 276 | runtime. There are several programming languages based on the Java VM that
|
| 277 | might use different notations. Specific transformations should therefore only
|
| 278 | happen at the user interface level, for example during report generation.
|
| 279 | </p>
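<p>
  A transformation at the user interface level could look like the following
  sketch, which turns a VM type descriptor or internal name into Java source
  notation. The helper <code>VmNameSketch.toJavaName</code> is hypothetical
  and only handles plain object types, not arrays or primitives.
</p>

```java
// Hypothetical display helper for report generation.
class VmNameSketch {

    // Converts "Ljava/lang/Object;" or "java/lang/Object" to "java.lang.Object".
    static String toJavaName(String vmName) {
        String name = vmName;
        if (name.startsWith("L") && name.endsWith(";")) {
            // Strip the descriptor markers around the internal name.
            name = name.substring(1, name.length() - 1);
        }
        return name.replace('/', '.');
    }
}
```

<p>
  Keeping such conversions out of the core API means a report for another JVM
  language can apply its own notation without touching the analysis code.
</p>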
|
| 280 |
|
| 281 | <h2>Modularization of the JaCoCo implementation</h2>
|
| 282 |
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 283 | <p class="intro">
|
Marc R. Hoffmann | 5267b6c | 2009-07-05 16:34:27 +0000 | [diff] [blame] | 284 | JaCoCo is implemented in several modules providing different functionality.
|
| 285 | These modules are provided as OSGi bundles with proper manifest files. However,
|
| 286 | there are no dependencies on OSGi itself.
|
| 287 | </p>
|
| 288 |
|
| 289 | <p>
|
| 290 | Using OSGi bundles allows well-defined dependencies at development time and
|
| 291 | at runtime in OSGi containers. As there are no dependencies on OSGi, the
|
| 292 | bundles can also be used as regular JAR files.
|
| 293 | </p>
|
| 294 |
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 295 | <div class="footer">
|
Marc R. Hoffmann | afe929b | 2009-08-05 09:19:00 +0000 | [diff] [blame] | 296 | <div class="versioninfo"><a href="@HOMEURL@">JaCoCo</a> @VERSION@</div>
|
Marc R. Hoffmann | 1588849 | 2009-07-30 11:46:53 +0000 | [diff] [blame] | 297 | <a href="license.html">Copyright</a> &copy; 2009 Mountainminds GmbH &amp; Co. KG and Contributors
|
| 298 | </div>
|
Marc R. Hoffmann | a2af15d | 2009-06-07 21:15:05 +0000 | [diff] [blame] | 299 |
|
| 300 | </body>
|
| 301 | </html> |