Initial Contribution
diff --git a/docs/dalvik-bytecode.css b/docs/dalvik-bytecode.css
new file mode 100644
index 0000000..e4a5caa
--- /dev/null
+++ b/docs/dalvik-bytecode.css
@@ -0,0 +1,165 @@
+h1 {
+ font-family: serif;
+ color: #222266;
+}
+
+h2 {
+ font-family: serif;
+ border-top-style: solid;
+ border-top-width: 2px;
+ border-color: #ccccdd;
+ padding-top: 12px;
+ margin-top: 48px;
+ margin-bottom: 2px;
+ color: #222266;
+}
+
+@media print {
+ table {
+ font-size: 8pt;
+ }
+}
+
+@media screen {
+ table {
+ font-size: 10pt;
+ }
+}
+
+
+/* general for all tables */
+
+table {
+ border-collapse: collapse;
+ margin-top: 12px;
+}
+
+table th {
+ font-family: sans-serif;
+ background: #aabbff;
+}
+
+table td {
+ font-family: sans-serif;
+ border-top-style: solid;
+ border-bottom-style: solid;
+ border-width: 1px;
+ border-color: #aaaaff;
+ padding-top: 4px;
+ padding-bottom: 4px;
+ padding-left: 4px;
+ padding-right: 6px;
+ background: #eeeeff;
+}
+
+table td p {
+ margin-top: 4pt;
+ margin-bottom: 0pt;
+}
+
+
+
+/* opcodes table */
+
+table.instruc {
+ margin-top: 24px;
+ margin-bottom: 24px;
+ margin-left: 48px;
+ margin-right: 48px;
+}
+
+table.instruc td {
+ font-family: sans-serif;
+ border-top-style: solid;
+ border-bottom-style: solid;
+ border-width: 1px;
+ padding-top: 4px;
+ padding-bottom: 4px;
+ padding-left: 2px;
+ padding-right: 2px;
+}
+
+table.instruc td:first-child {
+ font-family: monospace;
+ font-size: 90%;
+ vertical-align: top;
+ width: 12%;
+}
+
+table.instruc td:first-child + td {
+ font-family: monospace;
+ font-size: 90%;
+ vertical-align: top;
+ width: 23%;
+}
+
+table.instruc td:first-child + td i {
+ font-family: sans-serif;
+ font-size: 90%;
+}
+
+table.instruc td:first-child + td + td {
+ vertical-align: top;
+ width: 28%;
+}
+
+table.instruc td:first-child + td + td + td {
+ vertical-align: top;
+ width: 37%;
+}
+
+
+/* supplemental opcode format table */
+
+table.supplement {
+ margin-top: 24px;
+ margin-bottom: 24px;
+ margin-left: 48px;
+ margin-right: 48px;
+}
+
+table.supplement td:first-child {
+ font-family: monospace;
+ vertical-align: top;
+ width: 20%;
+}
+
+table.supplement td:first-child + td {
+ font-family: monospace;
+ vertical-align: top;
+ width: 20%;
+}
+
+table.supplement td:first-child + td + td {
+ font-family: sans-serif;
+ vertical-align: top;
+ width: 60%;
+}
+
+
+/* math details table */
+
+table.math {
+ margin-top: 24px;
+ margin-bottom: 24px;
+ margin-left: 48px;
+ margin-right: 48px;
+}
+
+table.math td:first-child {
+ font-family: monospace;
+ vertical-align: top;
+ width: 10%;
+}
+
+table.math td:first-child + td {
+ font-family: monospace;
+ vertical-align: top;
+ width: 30%;
+}
+
+table.math td:first-child + td + td {
+ font-family: sans-serif;
+ vertical-align: top;
+ width: 60%;
+}
diff --git a/docs/dalvik-bytecode.html b/docs/dalvik-bytecode.html
new file mode 100644
index 0000000..fc3cf0b
--- /dev/null
+++ b/docs/dalvik-bytecode.html
@@ -0,0 +1,1485 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+
+<html>
+
+<head>
+<title>Bytecode for the Dalvik VM</title>
+<link rel="stylesheet" href="dalvik-bytecode.css" />
+</head>
+
+<body>
+
+<h1>Bytecode for the Dalvik VM</h1>
+<p>Copyright © 2007 The Android Open Source Project</p>
+
+<h2>General Design</h2>
+
+<ul>
+<li>The machine model and calling conventions are meant to approximately
+ imitate common real architectures and C-style calling conventions:
+ <ul>
+ <li>The VM is register-based, and frames are fixed in size upon creation.
+ Each frame consists of a particular number of registers (specified by
+ the method) as well as any adjunct data needed to execute the method,
+ such as (but not limited to) the program counter and a reference to the
+ <code>.dex</code> file that contains the method.
+ </li>
+ <li>The <i>N</i> arguments to a method land in the last <i>N</i> registers
+ of the method's invocation frame.
+ </li>
+ <li>Registers are 32 bits wide. Adjacent register pairs are used for 64-bit
+ values.
+ </li>
+ <li>In terms of bitwise representation, <code>(Object) null == (int)
+ 0</code>.
+ </li>
+ </ul></li>
+<li>The storage unit in the instruction stream is a 16-bit unsigned quantity.
+ Some bits in some instructions are ignored or must be zero.
+</li>
+<li>Instructions aren't gratuitously limited to a particular type. For
+ example, instructions that move 32-bit register values without interpretation
+ don't have to specify whether they are moving ints or floats.
+</li>
+<li>There are separately enumerated and indexed constant pools for
+ references to strings, types, fields, and methods.
+</li>
+<li>Bitwise literal data is represented in-line in the instruction stream.</li>
+<li>Because, in practice, it is uncommon for a method to need more than
+ 16 registers, and because needing more than eight registers <i>is</i>
+ reasonably common, many instructions may only address the first 16
+ registers. When reasonably possible, instructions allow references to
+ up to the first 256 registers. In cases where an instruction variant isn't
+ available to address a desired register, it is expected that the register
+ contents get moved from the original register to a low register (before the
+ operation) and/or moved from a low result register to a high register
+ (after the operation).
+</li>
+<li>When installed on a running system, some instructions may be altered,
+ changing their format, as an install-time static linking optimization.
+ This is to allow for faster execution once linkage is known.
+ See the associated
+ <a href="instruction-formats.html">instruction formats document</a>
+ for the suggested variants. The word "suggested" is used advisedly;
+ it is not mandatory to implement these.
+</li>
+<li>Human-syntax and mnemonics:
+ <ul>
+ <li>Dest-then-source ordering for arguments.</li>
+ <li>Some opcodes have a disambiguating suffix with respect to the type(s)
+ they operate on: Type-general 64-bit opcodes
+ are suffixed with <code>-wide</code>.
+ Type-specific opcodes are suffixed with their type (or a
+ straightforward abbreviation), one of: <code>-boolean</code>
+ <code>-byte</code> <code>-char</code> <code>-short</code>
+ <code>-int</code> <code>-long</code> <code>-float</code>
+ <code>-double</code> <code>-object</code> <code>-string</code>
+ <code>-class</code> <code>-void</code>. Type-general 32-bit opcodes
+ are unmarked.
+ </li>
+ <li>Some opcodes have a disambiguating suffix to distinguish
+ otherwise-identical operations that have different instruction layouts
+ or options. These suffixes are separated from the main names with a slash
+ ("<code>/</code>") and exist mainly to provide a one-to-one
+ mapping with static constants in the code that generates and interprets
+ executables (that is, to reduce ambiguity for humans).
+ </li>
+ </ul>
+</li>
+<li>See the <a href="instruction-formats.html">instruction formats
+ document</a> for more details about the various instruction formats
+ (listed under "Op &amp; Format") as well as details about the opcode
+ syntax.
+</li>
+</ul>
+
+<h2>Summary of Instruction Set</h2>
+
+<table class="instruc">
+<thead>
+<tr>
+ <th>Op &amp; Format</th>
+ <th>Mnemonic / Syntax</th>
+ <th>Arguments</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>00 10x</td>
+ <td>nop</td>
+ <td> </td>
+ <td>Waste cycles.</td>
+</tr>
+<tr>
+ <td>01 12x</td>
+ <td>move vA, vB</td>
+ <td><code>A:</code> destination register (4 bits)<br/>
+ <code>B:</code> source register (4 bits)</td>
+ <td>Move the contents of one non-object register to another.</td>
+</tr>
+<tr>
+ <td>02 22x</td>
+ <td>move/from16 vAA, vBBBB</td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> source register (16 bits)</td>
+ <td>Move the contents of one non-object register to another.</td>
+</tr>
+<tr>
+ <td>03 32x</td>
+ <td>move/16 vAAAA, vBBBB</td>
+ <td><code>A:</code> destination register (16 bits)<br/>
+ <code>B:</code> source register (16 bits)</td>
+ <td>Move the contents of one non-object register to another.</td>
+</tr>
+<tr>
+ <td>04 12x</td>
+ <td>move-wide vA, vB</td>
+ <td><code>A:</code> destination register pair (4 bits)<br/>
+ <code>B:</code> source register pair (4 bits)</td>
+ <td>Move the contents of one register-pair to another.
+ <p><b>Note:</b>
+ It is legal to move from <code>v<i>N</i></code> to either
+ <code>v<i>N-1</i></code> or <code>v<i>N+1</i></code>, so implementations
+ must arrange for both halves of a register pair to be read before
+ anything is written.</p>
+ </td>
+</tr>
+<tr>
+ <td>05 22x</td>
+ <td>move-wide/from16 vAA, vBBBB</td>
+ <td><code>A:</code> destination register pair (8 bits)<br/>
+ <code>B:</code> source register pair (16 bits)</td>
+ <td>Move the contents of one register-pair to another.
+ <p><b>Note:</b>
+ Implementation considerations are the same as <code>move-wide</code>,
+ above.</p>
+ </td>
+</tr>
+<tr>
+ <td>06 32x</td>
+ <td>move-wide/16 vAAAA, vBBBB</td>
+ <td><code>A:</code> destination register pair (16 bits)<br/>
+ <code>B:</code> source register pair (16 bits)</td>
+ <td>Move the contents of one register-pair to another.
+ <p><b>Note:</b>
+ Implementation considerations are the same as <code>move-wide</code>,
+ above.</p>
+ </td>
+</tr>
+<tr>
+ <td>07 12x</td>
+ <td>move-object vA, vB</td>
+ <td><code>A:</code> destination register (4 bits)<br/>
+ <code>B:</code> source register (4 bits)</td>
+ <td>Move the contents of one object-bearing register to another.</td>
+</tr>
+<tr>
+ <td>08 22x</td>
+ <td>move-object/from16 vAA, vBBBB</td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> source register (16 bits)</td>
+ <td>Move the contents of one object-bearing register to another.</td>
+</tr>
+<tr>
+ <td>09 32x</td>
+ <td>move-object/16 vAAAA, vBBBB</td>
+ <td><code>A:</code> destination register (16 bits)<br/>
+ <code>B:</code> source register (16 bits)</td>
+ <td>Move the contents of one object-bearing register to another.</td>
+</tr>
+<tr>
+ <td>0a 11x</td>
+ <td>move-result vAA</td>
+ <td><code>A:</code> destination register (8 bits)</td>
+ <td>Move the single-word non-object result of the most recent
+ <code>invoke-<i>kind</i></code> into the indicated register.
+ This must be done as the instruction immediately after an
+ <code>invoke-<i>kind</i></code> whose (single-word, non-object) result
+ is not to be ignored; anywhere else is invalid.</td>
+</tr>
+<tr>
+ <td>0b 11x</td>
+ <td>move-result-wide vAA</td>
+ <td><code>A:</code> destination register pair (8 bits)</td>
+ <td>Move the double-word result of the most recent
+ <code>invoke-<i>kind</i></code> into the indicated register pair.
+ This must be done as the instruction immediately after an
+ <code>invoke-<i>kind</i></code> whose (double-word) result
+ is not to be ignored; anywhere else is invalid.</td>
+</tr>
+<tr>
+ <td>0c 11x</td>
+ <td>move-result-object vAA</td>
+ <td><code>A:</code> destination register (8 bits)</td>
+ <td>Move the object result of the most recent <code>invoke-<i>kind</i></code>
+ into the indicated register. This must be done as the instruction
+ immediately after an <code>invoke-<i>kind</i></code> or
+ <code>filled-new-array</code>
+ whose (object) result is not to be ignored; anywhere else is invalid.</td>
+</tr>
+<tr>
+ <td>0d 11x</td>
+ <td>move-exception vAA</td>
+ <td><code>A:</code> destination register (8 bits)</td>
+ <td>Save a just-caught exception into the given register. This must
+ be the first instruction of any exception handler whose caught
+ exception is not to be ignored, and this instruction may <i>only</i>
+ ever occur as the first instruction of an exception handler; anywhere
+ else is invalid.</td>
+</tr>
+<tr>
+ <td>0e 10x</td>
+ <td>return-void</td>
+ <td> </td>
+ <td>Return from a <code>void</code> method.</td>
+</tr>
+<tr>
+ <td>0f 11x</td>
+ <td>return vAA</td>
+ <td><code>A:</code> return value register (8 bits)</td>
+ <td>Return from a single-width (32-bit) non-object value-returning
+ method.
+ </td>
+</tr>
+<tr>
+ <td>10 11x</td>
+ <td>return-wide vAA</td>
+ <td><code>A:</code> return value register-pair (8 bits)</td>
+ <td>Return from a double-width (64-bit) value-returning method.</td>
+</tr>
+<tr>
+ <td>11 11x</td>
+ <td>return-object vAA</td>
+ <td><code>A:</code> return value register (8 bits)</td>
+ <td>Return from an object-returning method.</td>
+</tr>
+<tr>
+ <td>12 11n</td>
+ <td>const/4 vA, #+B</td>
+ <td><code>A:</code> destination register (4 bits)<br/>
+ <code>B:</code> signed int (4 bits)</td>
+ <td>Move the given literal value (sign-extended to 32 bits) into
+ the specified register.</td>
+</tr>
+<tr>
+ <td>13 21s</td>
+ <td>const/16 vAA, #+BBBB</td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> signed int (16 bits)</td>
+ <td>Move the given literal value (sign-extended to 32 bits) into
+ the specified register.</td>
+</tr>
+<tr>
+ <td>14 31i</td>
+ <td>const vAA, #+BBBBBBBB</td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> arbitrary 32-bit constant</td>
+ <td>Move the given literal value into the specified register.</td>
+</tr>
+<tr>
+ <td>15 21h</td>
+ <td>const/high16 vAA, #+BBBB0000</td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> signed int (16 bits)</td>
+ <td>Move the given literal value (right-zero-extended to 32 bits) into
+ the specified register.</td>
+</tr>
+<tr>
+ <td>16 21s</td>
+ <td>const-wide/16 vAA, #+BBBB</td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> signed int (16 bits)</td>
+ <td>Move the given literal value (sign-extended to 64 bits) into
+ the specified register-pair.</td>
+</tr>
+<tr>
+ <td>17 31i</td>
+ <td>const-wide/32 vAA, #+BBBBBBBB</td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> signed int (32 bits)</td>
+ <td>Move the given literal value (sign-extended to 64 bits) into
+ the specified register-pair.</td>
+</tr>
+<tr>
+ <td>18 51l</td>
+ <td>const-wide vAA, #+BBBBBBBBBBBBBBBB</td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> arbitrary double-width (64-bit) constant</td>
+ <td>Move the given literal value into
+ the specified register-pair.</td>
+</tr>
+<tr>
+ <td>19 21h</td>
+ <td>const-wide/high16 vAA, #+BBBB000000000000</td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> signed int (16 bits)</td>
+ <td>Move the given literal value (right-zero-extended to 64 bits) into
+ the specified register-pair.</td>
+</tr>
+<tr>
+ <td>1a 21c</td>
+ <td>const-string vAA, string@BBBB</td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> string index</td>
+ <td>Move a reference to the string specified by the given index into the
+ specified register.</td>
+</tr>
+<tr>
+ <td>1b 31c</td>
+ <td>const-string/jumbo vAA, string@BBBBBBBB</td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> string index</td>
+ <td>Move a reference to the string specified by the given index into the
+ specified register.</td>
+</tr>
+<tr>
+ <td>1c 21c</td>
+ <td>const-class vAA, type@BBBB</td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> type index</td>
+ <td>Move a reference to the class specified by the given index into the
+ specified register. In the case where the indicated type is primitive,
+ this will store a reference to the primitive type's degenerate
+ class.</td>
+</tr>
+<tr>
+ <td>1d 11x</td>
+ <td>monitor-enter vAA</td>
+ <td><code>A:</code> reference-bearing register (8 bits)</td>
+ <td>Acquire the monitor for the indicated object.</td>
+</tr>
+<tr>
+ <td>1e 11x</td>
+ <td>monitor-exit vAA</td>
+ <td><code>A:</code> reference-bearing register (8 bits)</td>
+ <td>Release the monitor for the indicated object.
+ <p><b>Note:</b>
+ If this instruction needs to throw an exception, it must do
+ so as if the pc has already advanced past the instruction.
+ It may be useful to think of this as the instruction successfully
+ executing (in a sense), and the exception getting thrown <i>after</i>
+ the instruction but <i>before</i> the next one gets a chance to
+ run. This definition makes it possible for a method to use
+ a monitor cleanup catch-all (e.g., <code>finally</code>) block as
+ the monitor cleanup for that block itself, as a way to handle the
+ arbitrary exceptions that might get thrown due to the historical
+ implementation of <code>Thread.stop()</code>, while still managing
+ to have proper monitor hygiene.</p>
+ </td>
+</tr>
+<tr>
+ <td>1f 21c</td>
+ <td>check-cast vAA, type@BBBB</td>
+ <td><code>A:</code> reference-bearing register (8 bits)<br/>
+ <code>B:</code> type index (16 bits)</td>
+ <td>Throw if the reference in the given register cannot be cast to the
+ indicated type. The type must be a reference type (not a primitive
+ type).</td>
+</tr>
+<tr>
+ <td>20 22c</td>
+ <td>instance-of vA, vB, type@CCCC</td>
+ <td><code>A:</code> destination register (4 bits)<br/>
+ <code>B:</code> reference-bearing register (4 bits)<br/>
+ <code>C:</code> type index (16 bits)</td>
+ <td>Store in the given destination register <code>1</code>
+ if the indicated reference is an instance of the given type,
+ or <code>0</code> if not. The type must be a
+ reference type (not a primitive type).</td>
+</tr>
+<tr>
+ <td>21 12x</td>
+ <td>array-length vA, vB</td>
+ <td><code>A:</code> destination register (4 bits)<br/>
+ <code>B:</code> array reference-bearing register (4 bits)</td>
+ <td>Store in the given destination register the length of the indicated
+ array, in entries.</td>
+</tr>
+<tr>
+ <td>22 21c</td>
+ <td>new-instance vAA, type@BBBB</td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> type index</td>
+ <td>Construct a new instance of the indicated type, storing a
+ reference to it in the destination. The type must refer to a
+ non-array class.</td>
+</tr>
+<tr>
+ <td>23 22c</td>
+ <td>new-array vA, vB, type@CCCC</td>
+ <td><code>A:</code> destination register (4 bits)<br/>
+ <code>B:</code> size register<br/>
+ <code>C:</code> type index</td>
+ <td>Construct a new array of the indicated type and size. The type
+ must be an array type.</td>
+</tr>
+<tr>
+ <td>24 35c</td>
+ <td>filled-new-array {vD, vE, vF, vG, vA}, type@CCCC</td>
+ <td><code>B:</code> array size and argument word count (4 bits)<br/>
+ <code>C:</code> type index (16 bits)<br/>
+ <code>D..G, A:</code> argument registers (4 bits each)</td>
+ <td>Construct an array of the given type and size, filling it with the
+ supplied contents. The type must be an array type. The array's
+ contents must be single-word (that is,
+ no arrays of <code>long</code> or <code>double</code>). The constructed
+ instance is stored as a "result" in the same way that the method invocation
+ instructions store their results, so the constructed instance must
+ be moved to a register with a subsequent
+ <code>move-result-object</code> instruction (if it is to be used).</td>
+</tr>
+<tr>
+ <td>25 3rc</td>
+ <td>filled-new-array/range {vCCCC .. vNNNN}, type@BBBB</td>
+ <td><code>A:</code> array size and argument word count (8 bits)<br/>
+ <code>B:</code> type index (16 bits)<br/>
+ <code>C:</code> first argument register (16 bits)<br/>
+ <code>N = A + C - 1</code></td>
+ <td>Construct an array of the given type and size, filling it with
+ the supplied contents. Clarifications and restrictions are the same
+ as <code>filled-new-array</code>, described above.</td>
+</tr>
+<tr>
+ <td>26 31t</td>
+ <td>fill-array-data vAA, +BBBBBBBB <i>(with supplemental data as specified
+ below in "<code>fill-array-data</code> Format")</i></td>
+ <td><code>A:</code> array reference (8 bits)<br/>
+ <code>B:</code> signed "branch" offset to table data (32 bits)</td>
+ <td>Fill the given array with the indicated data. The reference must be
+ to an array of primitives, and the data table must match it in type and
+ size.
+ <p><b>Note:</b>
+ The address of the table, measured in 16-bit code units, is guaranteed
+ to be even (that is, 4-byte aligned). If the code size of the method is
+ otherwise odd, then an extra code unit, whose value is the same as a
+ <code>nop</code>, is inserted between the main code and the table.</p>
+ </td>
+</tr>
+<tr>
+ <td>27 11x</td>
+ <td>throw vAA</td>
+ <td><code>A:</code> exception-bearing register (8 bits)<br/></td>
+ <td>Throw the indicated exception.</td>
+</tr>
+<tr>
+ <td>28 10t</td>
+ <td>goto +AA</td>
+ <td><code>A:</code> signed branch offset (8 bits)</td>
+ <td>Unconditionally jump to the indicated instruction.
+ <p><b>Note:</b>
+ The branch offset may not be <code>0</code>. (A spin
+ loop may be legally constructed either with <code>goto/32</code> or
+ by including a <code>nop</code> as a target before the branch.)</p>
+ </td>
+</tr>
+<tr>
+ <td>29 20t</td>
+ <td>goto/16 +AAAA</td>
+ <td><code>A:</code> signed branch offset (16 bits)<br/></td>
+ <td>Unconditionally jump to the indicated instruction.
+ <p><b>Note:</b>
+ The branch offset may not be <code>0</code>. (A spin
+ loop may be legally constructed either with <code>goto/32</code> or
+ by including a <code>nop</code> as a target before the branch.)</p>
+ </td>
+</tr>
+<tr>
+ <td>2a 30t</td>
+ <td>goto/32 +AAAAAAAA</td>
+ <td><code>A:</code> signed branch offset (32 bits)<br/></td>
+ <td>Unconditionally jump to the indicated instruction.</td>
+</tr>
+<tr>
+ <td>2b 31t</td>
+ <td>packed-switch vAA, +BBBBBBBB <i>(with supplemental data as
+ specified below in "<code>packed-switch</code> Format")</i></td>
+ <td><code>A:</code> register to test<br/>
+ <code>B:</code> signed "branch" offset to table data (32 bits)</td>
+ <td>Jump to a new instruction based on the value in the
+ given register, using a table of offsets corresponding to each value
+ in a particular integral range, or fall through to the next
+ instruction if there is no match.
+ <p><b>Note:</b>
+ The address of the table, measured in 16-bit code units, is
+ guaranteed to be even (that is, 4-byte aligned). If the code size
+ of the method is otherwise odd, then an extra code unit, whose
+ value is the same as a <code>nop</code>, is inserted between the
+ main code and the table.</p>
+ </td>
+</tr>
+<tr>
+ <td>2c 31t</td>
+ <td>sparse-switch vAA, +BBBBBBBB <i>(with supplemental data as
+ specified below in "<code>sparse-switch</code> Format")</i></td>
+ <td><code>A:</code> register to test<br/>
+ <code>B:</code> signed "branch" offset to table data (32 bits)</td>
+ <td>Jump to a new instruction based on the value in the given
+ register, using an ordered table of value-offset pairs, or fall
+ through to the next instruction if there is no match.
+ <p><b>Note:</b>
+ Alignment and padding considerations are identical to
+ <code>packed-switch</code>, above.</p>
+ </td>
+</tr>
+<tr>
+ <td>2d..31 23x</td>
+ <td>cmp<i>kind</i> vAA, vBB, vCC<br/>
+ 2d: cmpl-float <i>(lt bias)</i><br/>
+ 2e: cmpg-float <i>(gt bias)</i><br/>
+ 2f: cmpl-double <i>(lt bias)</i><br/>
+ 30: cmpg-double <i>(gt bias)</i><br/>
+ 31: cmp-long
+ </td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> first source register or pair<br/>
+ <code>C:</code> second source register or pair</td>
+ <td>Perform the indicated floating point or <code>long</code> comparison,
+ storing <code>0</code> if the two source values are equal, <code>1</code>
+ if the first source (<code>B</code>) is larger, or <code>-1</code> if the
+ second source (<code>C</code>) is larger. The "bias" listed for the floating
+ point operations indicates how <code>NaN</code> comparisons are treated:
+ "gt bias" instructions return <code>1</code> for <code>NaN</code>
+ comparisons, and "lt bias" instructions return
+ <code>-1</code>.
+ <p>For example, to check whether floating point
+ <code>a &lt; b</code>, it is advisable to use
+ <code>cmpg-float</code>; a result of <code>-1</code> indicates that
+ the test was true, and the other values indicate it was false, either
+ due to a valid comparison or because one of the values was
+ <code>NaN</code>.</p>
+ </td>
+</tr>
+<tr>
+ <td>32..37 22t</td>
+ <td>if-<i>test</i> vA, vB, +CCCC<br/>
+ 32: if-eq<br/>
+ 33: if-ne<br/>
+ 34: if-lt<br/>
+ 35: if-ge<br/>
+ 36: if-gt<br/>
+ 37: if-le<br/>
+ </td>
+ <td><code>A:</code> first register to test (4 bits)<br/>
+ <code>B:</code> second register to test (4 bits)<br/>
+ <code>C:</code> signed branch offset (16 bits)</td>
+ <td>Branch to the given destination if the given two registers' values
+ compare as specified.
+ <p><b>Note:</b>
+ The branch offset may not be <code>0</code>. (A spin
+ loop may be legally constructed either by branching around a
+ backward <code>goto</code> or by including a <code>nop</code> as
+ a target before the branch.)</p>
+ </td>
+</tr>
+<tr>
+ <td>38..3d 21t</td>
+ <td>if-<i>test</i>z vAA, +BBBB<br/>
+ 38: if-eqz<br/>
+ 39: if-nez<br/>
+ 3a: if-ltz<br/>
+ 3b: if-gez<br/>
+ 3c: if-gtz<br/>
+ 3d: if-lez<br/>
+ </td>
+ <td><code>A:</code> register to test (8 bits)<br/>
+ <code>B:</code> signed branch offset (16 bits)</td>
+ <td>Branch to the given destination if the given register's value compares
+ with 0 as specified.
+ <p><b>Note:</b>
+ The branch offset may not be <code>0</code>. (A spin
+ loop may be legally constructed either by branching around a
+ backward <code>goto</code> or by including a <code>nop</code> as
+ a target before the branch.)</p>
+ </td>
+</tr>
+<tr>
+ <td>3e..43 10x</td>
+ <td><i>(unused)</i></td>
+ <td> </td>
+ <td><i>(unused)</i></td>
+</tr>
+<tr>
+ <td>44..51 23x</td>
+ <td><i>arrayop</i> vAA, vBB, vCC<br/>
+ 44: aget<br/>
+ 45: aget-wide<br/>
+ 46: aget-object<br/>
+ 47: aget-boolean<br/>
+ 48: aget-byte<br/>
+ 49: aget-char<br/>
+ 4a: aget-short<br/>
+ 4b: aput<br/>
+ 4c: aput-wide<br/>
+ 4d: aput-object<br/>
+ 4e: aput-boolean<br/>
+ 4f: aput-byte<br/>
+ 50: aput-char<br/>
+ 51: aput-short
+ </td>
+ <td><code>A:</code> value register or pair; may be source or dest
+ (8 bits)<br/>
+ <code>B:</code> array register (8 bits)<br/>
+ <code>C:</code> index register (8 bits)</td>
+ <td>Perform the identified array operation at the identified index of
+ the given array, loading or storing into the value register.</td>
+</tr>
+<tr>
+ <td>52..5f 22c</td>
+ <td>i<i>instanceop</i> vA, vB, field@CCCC<br/>
+ 52: iget<br/>
+ 53: iget-wide<br/>
+ 54: iget-object<br/>
+ 55: iget-boolean<br/>
+ 56: iget-byte<br/>
+ 57: iget-char<br/>
+ 58: iget-short<br/>
+ 59: iput<br/>
+ 5a: iput-wide<br/>
+ 5b: iput-object<br/>
+ 5c: iput-boolean<br/>
+ 5d: iput-byte<br/>
+ 5e: iput-char<br/>
+ 5f: iput-short
+ </td>
+ <td><code>A:</code> value register or pair; may be source or dest
+ (4 bits)<br/>
+ <code>B:</code> object register (4 bits)<br/>
+ <code>C:</code> instance field reference index (16 bits)</td>
+ <td>Perform the identified object instance field operation with
+ the identified field, loading or storing into the value register.
+ <p><b>Note:</b> These opcodes are reasonable candidates for static linking,
+ altering the field argument to be a more direct offset.</p>
+ </td>
+</tr>
+<tr>
+ <td>60..6d 21c</td>
+ <td>s<i>staticop</i> vAA, field@BBBB<br/>
+ 60: sget<br/>
+ 61: sget-wide<br/>
+ 62: sget-object<br/>
+ 63: sget-boolean<br/>
+ 64: sget-byte<br/>
+ 65: sget-char<br/>
+ 66: sget-short<br/>
+ 67: sput<br/>
+ 68: sput-wide<br/>
+ 69: sput-object<br/>
+ 6a: sput-boolean<br/>
+ 6b: sput-byte<br/>
+ 6c: sput-char<br/>
+ 6d: sput-short
+ </td>
+ <td><code>A:</code> value register or pair; may be source or dest
+ (8 bits)<br/>
+ <code>B:</code> static field reference index (16 bits)</td>
+ <td>Perform the identified object static field operation with the identified
+ static field, loading or storing into the value register.
+ <p><b>Note:</b> These opcodes are reasonable candidates for static linking,
+ altering the field argument to be a more direct offset.</p>
+ </td>
+</tr>
+<tr>
+ <td>6e..72 35c</td>
+ <td>invoke-<i>kind</i> {vD, vE, vF, vG, vA}, meth@CCCC<br/>
+ 6e: invoke-virtual<br/>
+ 6f: invoke-super<br/>
+ 70: invoke-direct<br/>
+ 71: invoke-static<br/>
+ 72: invoke-interface
+ </td>
+ <td><code>B:</code> argument word count (4 bits)<br/>
+ <code>C:</code> method index (16 bits)<br/>
+ <code>D..G, A:</code> argument registers (4 bits each)</td>
+ <td>Call the indicated method. The result (if any) may be stored
+ with an appropriate <code>move-result*</code> variant as the immediately
+ subsequent instruction.
+ <p><code>invoke-virtual</code> is used to invoke a normal virtual
+ method (a method that is not <code>private</code>, <code>static</code>,
+ or <code>final</code>, and is not a constructor).</p>
+ <p><code>invoke-super</code> is used to invoke the closest superclass's
+ virtual method (as opposed to the one with the same <code>method_id</code>
+ in the calling class).</p>
+ <p><code>invoke-direct</code> is used to invoke a non-<code>static</code>
+ direct method (that is, an instance method that is by its nature
+ non-overridable, namely either a <code>private</code> instance method
+ or a constructor).</p>
+ <p><code>invoke-static</code> is used to invoke a <code>static</code>
+ method (which is always considered a direct method).</p>
+ <p><code>invoke-interface</code> is used to invoke an
+ <code>interface</code> method, that is, on an object whose concrete
+ class isn't known, using a <code>method_id</code> that refers to
+ an <code>interface</code>.</p>
+ <p><b>Note:</b> These opcodes are reasonable candidates for static linking,
+ altering the method argument to be a more direct offset
+ (or pair thereof).</p>
+ </td>
+</tr>
+<tr>
+ <td>73 10x</td>
+ <td><i>(unused)</i></td>
+ <td> </td>
+ <td><i>(unused)</i></td>
+</tr>
+<tr>
+ <td>74..78 3rc</td>
+ <td>invoke-<i>kind</i>/range {vCCCC .. vNNNN}, meth@BBBB<br/>
+ 74: invoke-virtual/range<br/>
+ 75: invoke-super/range<br/>
+ 76: invoke-direct/range<br/>
+ 77: invoke-static/range<br/>
+ 78: invoke-interface/range
+ </td>
+ <td><code>A:</code> argument word count (8 bits)<br/>
+ <code>B:</code> method index (16 bits)<br/>
+ <code>C:</code> first argument register (16 bits)<br/>
+ <code>N = A + C - 1</code></td>
+ <td>Call the indicated method. See first <code>invoke-<i>kind</i></code>
+ description above for details, caveats, and suggestions.
+ </td>
+</tr>
+<tr>
+ <td>79..7a 10x</td>
+ <td><i>(unused)</i></td>
+ <td> </td>
+ <td><i>(unused)</i></td>
+</tr>
+<tr>
+ <td>7b..8f 12x</td>
+ <td><i>unop</i> vA, vB<br/>
+ 7b: neg-int<br/>
+ 7c: not-int<br/>
+ 7d: neg-long<br/>
+ 7e: not-long<br/>
+ 7f: neg-float<br/>
+ 80: neg-double<br/>
+ 81: int-to-long<br/>
+ 82: int-to-float<br/>
+ 83: int-to-double<br/>
+ 84: long-to-int<br/>
+ 85: long-to-float<br/>
+ 86: long-to-double<br/>
+ 87: float-to-int<br/>
+ 88: float-to-long<br/>
+ 89: float-to-double<br/>
+ 8a: double-to-int<br/>
+ 8b: double-to-long<br/>
+ 8c: double-to-float<br/>
+ 8d: int-to-byte<br/>
+ 8e: int-to-char<br/>
+ 8f: int-to-short
+ </td>
+ <td><code>A:</code> destination register or pair (4 bits)<br/>
+ <code>B:</code> source register or pair (4 bits)</td>
+ <td>Perform the identified unary operation on the source register,
+ storing the result in the destination register.</td>
+</tr>
+
+<tr>
+ <td>90..af 23x</td>
+ <td><i>binop</i> vAA, vBB, vCC<br/>
+ 90: add-int<br/>
+ 91: sub-int<br/>
+ 92: mul-int<br/>
+ 93: div-int<br/>
+ 94: rem-int<br/>
+ 95: and-int<br/>
+ 96: or-int<br/>
+ 97: xor-int<br/>
+ 98: shl-int<br/>
+ 99: shr-int<br/>
+ 9a: ushr-int<br/>
+ 9b: add-long<br/>
+ 9c: sub-long<br/>
+ 9d: mul-long<br/>
+ 9e: div-long<br/>
+ 9f: rem-long<br/>
+ a0: and-long<br/>
+ a1: or-long<br/>
+ a2: xor-long<br/>
+ a3: shl-long<br/>
+ a4: shr-long<br/>
+ a5: ushr-long<br/>
+ a6: add-float<br/>
+ a7: sub-float<br/>
+ a8: mul-float<br/>
+ a9: div-float<br/>
+ aa: rem-float<br/>
+ ab: add-double<br/>
+ ac: sub-double<br/>
+ ad: mul-double<br/>
+ ae: div-double<br/>
+ af: rem-double
+ </td>
+ <td><code>A:</code> destination register or pair (8 bits)<br/>
+ <code>B:</code> first source register or pair (8 bits)<br/>
+ <code>C:</code> second source register or pair (8 bits)</td>
+ <td>Perform the identified binary operation on the two source registers,
+ storing the result in the destination register.</td>
+</tr>
+<tr>
+ <td>b0..cf 12x</td>
+ <td><i>binop</i>/2addr vA, vB<br/>
+ b0: add-int/2addr<br/>
+ b1: sub-int/2addr<br/>
+ b2: mul-int/2addr<br/>
+ b3: div-int/2addr<br/>
+ b4: rem-int/2addr<br/>
+ b5: and-int/2addr<br/>
+ b6: or-int/2addr<br/>
+ b7: xor-int/2addr<br/>
+ b8: shl-int/2addr<br/>
+ b9: shr-int/2addr<br/>
+ ba: ushr-int/2addr<br/>
+ bb: add-long/2addr<br/>
+ bc: sub-long/2addr<br/>
+ bd: mul-long/2addr<br/>
+ be: div-long/2addr<br/>
+ bf: rem-long/2addr<br/>
+ c0: and-long/2addr<br/>
+ c1: or-long/2addr<br/>
+ c2: xor-long/2addr<br/>
+ c3: shl-long/2addr<br/>
+ c4: shr-long/2addr<br/>
+ c5: ushr-long/2addr<br/>
+ c6: add-float/2addr<br/>
+ c7: sub-float/2addr<br/>
+ c8: mul-float/2addr<br/>
+ c9: div-float/2addr<br/>
+ ca: rem-float/2addr<br/>
+ cb: add-double/2addr<br/>
+ cc: sub-double/2addr<br/>
+ cd: mul-double/2addr<br/>
+ ce: div-double/2addr<br/>
+ cf: rem-double/2addr
+ </td>
+ <td><code>A:</code> destination and first source register or pair
+ (4 bits)<br/>
+ <code>B:</code> second source register or pair (4 bits)</td>
+ <td>Perform the identified binary operation on the two source registers,
+ storing the result in the first source register.</td>
+</tr>
+<tr>
+ <td>d0..d7 22s</td>
+ <td><i>binop</i>/lit16 vA, vB, #+CCCC<br/>
+ d0: add-int/lit16<br/>
+ d1: rsub-int (reverse subtract)<br/>
+ d2: mul-int/lit16<br/>
+ d3: div-int/lit16<br/>
+ d4: rem-int/lit16<br/>
+ d5: and-int/lit16<br/>
+ d6: or-int/lit16<br/>
+ d7: xor-int/lit16
+ </td>
+ <td><code>A:</code> destination register (4 bits)<br/>
+ <code>B:</code> source register (4 bits)<br/>
+ <code>C:</code> signed int constant (16 bits)</td>
+ <td>Perform the indicated binary op on the indicated register (first
+ argument) and literal value (second argument), storing the result in
+ the destination register.
+ <p><b>Note:</b>
+ <code>rsub-int</code> does not have a suffix since this version is the
+ main opcode of its family. Also, see below for details on its semantics.
+ </p>
+ </td>
+</tr>
+<tr>
+ <td>d8..e2 22b</td>
+ <td><i>binop</i>/lit8 vAA, vBB, #+CC<br/>
+ d8: add-int/lit8<br/>
+ d9: rsub-int/lit8<br/>
+ da: mul-int/lit8<br/>
+ db: div-int/lit8<br/>
+ dc: rem-int/lit8<br/>
+ dd: and-int/lit8<br/>
+ de: or-int/lit8<br/>
+ df: xor-int/lit8<br/>
+ e0: shl-int/lit8<br/>
+ e1: shr-int/lit8<br/>
+ e2: ushr-int/lit8
+ </td>
+ <td><code>A:</code> destination register (8 bits)<br/>
+ <code>B:</code> source register (8 bits)<br/>
+ <code>C:</code> signed int constant (8 bits)</td>
+ <td>Perform the indicated binary op on the indicated register (first
+ argument) and literal value (second argument), storing the result
+ in the destination register.
+ <p><b>Note:</b> See below for details on the semantics of
+ <code>rsub-int</code>.</p>
+ </td>
+</tr>
+<tr>
+ <td>e3..ff 10x</td>
+ <td><i>(unused)</i></td>
+ <td> </td>
+ <td><i>(unused)</i></td>
+</tr>
+</tbody>
+</table>
+
+<h2><code>packed-switch</code> Format</h2>
+
+<table class="supplement">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>ident</td>
+ <td>ushort = 0x0100</td>
+ <td>identifying pseudo-opcode</td>
+</tr>
+<tr>
+ <td>size</td>
+ <td>ushort</td>
+ <td>number of entries in the table</td>
+</tr>
+<tr>
+ <td>first_key</td>
+ <td>int</td>
+ <td>first (and lowest) switch case value</td>
+</tr>
+<tr>
+ <td>targets</td>
+ <td>int[]</td>
+ <td>list of <code>size</code> relative branch targets. The targets are
+ relative to the address of the switch opcode, not of this table.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<p><b>Note:</b> The total number of code units for an instance of this
+table is <code>(size * 2) + 4</code>.</p>
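+<p>As a rough illustration of the layout above, the following Python sketch
+decodes a packed-switch table from a list of 16-bit code units and resolves a
+key to an absolute branch address. The function names, the low-half-first
+ordering of the two code units that make up each <code>int</code>, and the use
+of <code>None</code> to mean "no matching case" are assumptions of this
+example, not part of the format definition.</p>

```python
def parse_packed_switch(units):
    """Decode a packed-switch table given as a list of 16-bit code units."""
    if units[0] != 0x0100:
        raise ValueError("not a packed-switch table")
    size = units[1]

    def read_int(index):
        # Assumption: each int occupies two code units, low half first.
        value = units[index] | (units[index + 1] << 16)
        return value - (1 << 32) if value & 0x80000000 else value

    first_key = read_int(2)
    targets = [read_int(4 + 2 * i) for i in range(size)]
    total_units = (size * 2) + 4
    return first_key, targets, total_units


def packed_switch_target(units, switch_address, key):
    """Return the absolute branch address for key, or None if no case matches."""
    first_key, targets, _ = parse_packed_switch(units)
    index = key - first_key
    if 0 <= index < len(targets):
        # Targets are relative to the switch opcode, not to the table.
        return switch_address + targets[index]
    return None
```

+<p>Because the keys are consecutive, the lookup is a single subtraction and
+bounds check rather than a search.</p>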
+
+<h2><code>sparse-switch</code> Format</h2>
+
+<table class="supplement">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>ident</td>
+ <td>ushort = 0x0200</td>
+ <td>identifying pseudo-opcode</td>
+</tr>
+<tr>
+ <td>size</td>
+ <td>ushort</td>
+ <td>number of entries in the table</td>
+</tr>
+<tr>
+ <td>keys</td>
+ <td>int[]</td>
+ <td>list of <code>size</code> key values, sorted low-to-high</td>
+</tr>
+<tr>
+ <td>targets</td>
+ <td>int[]</td>
+ <td>list of <code>size</code> relative branch targets, each corresponding
+ to the key value at the same index. The targets are
+ relative to the address of the switch opcode, not of this table.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<p><b>Note:</b> The total number of code units for an instance of this
+table is <code>(size * 4) + 2</code>.</p>
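+<p>Since the keys are sorted low-to-high, a reader of this table can locate a
+case with a binary search. The sketch below assumes the keys and targets have
+already been decoded into two Python lists; the function names and the
+<code>None</code> return for "no matching case" are illustrative choices
+only.</p>

```python
from bisect import bisect_left


def sparse_switch_target(keys, targets, switch_address, key):
    """Return the absolute branch address for key, or None if no case matches."""
    index = bisect_left(keys, key)
    if index < len(keys) and keys[index] == key:
        # Targets are relative to the switch opcode, not to the table.
        return switch_address + targets[index]
    return None


def sparse_switch_units(size):
    """Total 16-bit code units: ident, size, then size keys and size targets."""
    return (size * 4) + 2
```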
+
+<h2><code>fill-array-data</code> Format</h2>
+
+<table class="supplement">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>ident</td>
+ <td>ushort = 0x0300</td>
+ <td>identifying pseudo-opcode</td>
+</tr>
+<tr>
+ <td>element_width</td>
+ <td>ushort</td>
+ <td>number of bytes in each element</td>
+</tr>
+<tr>
+ <td>size</td>
+ <td>uint</td>
+ <td>number of elements in the table</td>
+</tr>
+<tr>
+ <td>data</td>
+ <td>ubyte[]</td>
+ <td>data values</td>
+</tr>
+</tbody>
+</table>
+
+<p><b>Note:</b> The total number of code units for an instance of this
+table is <code>(size * element_width + 1) / 2 + 4</code>.</p>
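+<p>The size formula can be checked with a short Python sketch. The
+<code>+ 1</code> rounds an odd byte count up to a whole 16-bit code unit, and
+the <code>+ 4</code> covers the header (<code>ident</code>,
+<code>element_width</code>, and the two code units of <code>size</code>). The
+function name is an assumption of this example.</p>

```python
def fill_array_data_units(size, element_width):
    """Total 16-bit code units occupied by a fill-array-data table."""
    data_bytes = size * element_width
    # Round the data up to whole code units, then add the 4-unit header.
    return (data_bytes + 1) // 2 + 4
```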
+
+
+<h2>Mathematical Operation Details</h2>
+
+<p><b>Note:</b> Floating point operations must follow IEEE 754 rules, using
+round-to-nearest and gradual underflow, except where stated otherwise.</p>
+
+<table class="math">
+<thead>
+<tr>
+ <th>Opcode</th>
+ <th>C Semantics</th>
+ <th>Notes</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>neg-int</td>
+ <td>int32 a;<br/>
+ int32 result = -a;
+ </td>
+ <td>Unary twos-complement.</td>
+</tr>
+<tr>
+ <td>not-int</td>
+ <td>int32 a;<br/>
+ int32 result = ~a;
+ </td>
+ <td>Unary ones-complement.</td>
+</tr>
+<tr>
+ <td>neg-long</td>
+ <td>int64 a;<br/>
+ int64 result = -a;
+ </td>
+ <td>Unary twos-complement.</td>
+</tr>
+<tr>
+ <td>not-long</td>
+ <td>int64 a;<br/>
+ int64 result = ~a;
+ </td>
+ <td>Unary ones-complement.</td>
+</tr>
+<tr>
+ <td>neg-float</td>
+ <td>float a;<br/>
+ float result = -a;
+ </td>
+ <td>Floating point negation.</td>
+</tr>
+<tr>
+ <td>neg-double</td>
+ <td>double a;<br/>
+ double result = -a;
+ </td>
+ <td>Floating point negation.</td>
+</tr>
+<tr>
+ <td>int-to-long</td>
+ <td>int32 a;<br/>
+ int64 result = (int64) a;
+ </td>
+ <td>Sign extension of <code>int32</code> into <code>int64</code>.</td>
+</tr>
+<tr>
+ <td>int-to-float</td>
+ <td>int32 a;<br/>
+ float result = (float) a;
+ </td>
+ <td>Conversion of <code>int32</code> to <code>float</code>, using
+ round-to-nearest. This loses precision for some values.
+ </td>
+</tr>
+<tr>
+ <td>int-to-double</td>
+ <td>int32 a;<br/>
+ double result = (double) a;
+ </td>
+ <td>Conversion of <code>int32</code> to <code>double</code>.</td>
+</tr>
+<tr>
+ <td>long-to-int</td>
+ <td>int64 a;<br/>
+ int32 result = (int32) a;
+ </td>
+ <td>Truncation of <code>int64</code> into <code>int32</code>.</td>
+</tr>
+<tr>
+ <td>long-to-float</td>
+ <td>int64 a;<br/>
+ float result = (float) a;
+ </td>
+ <td>Conversion of <code>int64</code> to <code>float</code>, using
+ round-to-nearest. This loses precision for some values.
+ </td>
+</tr>
+<tr>
+ <td>long-to-double</td>
+ <td>int64 a;<br/>
+ double result = (double) a;
+ </td>
+ <td>Conversion of <code>int64</code> to <code>double</code>, using
+ round-to-nearest. This loses precision for some values.
+ </td>
+</tr>
+<tr>
+ <td>float-to-int</td>
+ <td>float a;<br/>
+ int32 result = (int32) a;
+ </td>
+ <td>Conversion of <code>float</code> to <code>int32</code>, using
+ round-toward-zero. <code>NaN</code> and <code>-0.0</code> (negative zero)
+ convert to the integer <code>0</code>. Infinities and values with
+ too large a magnitude to be represented get converted to either
+ <code>0x7fffffff</code> or <code>-0x80000000</code> depending on sign.
+ </td>
+</tr>
+<tr>
+ <td>float-to-long</td>
+ <td>float a;<br/>
+ int64 result = (int64) a;
+ </td>
+ <td>Conversion of <code>float</code> to <code>int64</code>, using
+ round-toward-zero. The same special case rules as for
+ <code>float-to-int</code> apply here, except that out-of-range values
+ get converted to either <code>0x7fffffffffffffff</code> or
+ <code>-0x8000000000000000</code> depending on sign.
+ </td>
+</tr>
+<tr>
+ <td>float-to-double</td>
+ <td>float a;<br/>
+ double result = (double) a;
+ </td>
+ <td>Conversion of <code>float</code> to <code>double</code>, preserving
+ the value exactly.
+ </td>
+</tr>
+<tr>
+ <td>double-to-int</td>
+ <td>double a;<br/>
+ int32 result = (int32) a;
+ </td>
+ <td>Conversion of <code>double</code> to <code>int32</code>, using
+ round-toward-zero. The same special case rules as for
+ <code>float-to-int</code> apply here.
+ </td>
+</tr>
+<tr>
+ <td>double-to-long</td>
+ <td>double a;<br/>
+ int64 result = (int64) a;
+ </td>
+ <td>Conversion of <code>double</code> to <code>int64</code>, using
+ round-toward-zero. The same special case rules as for
+ <code>float-to-long</code> apply here.
+ </td>
+</tr>
+<tr>
+ <td>double-to-float</td>
+ <td>double a;<br/>
+ float result = (float) a;
+ </td>
+ <td>Conversion of <code>double</code> to <code>float</code>, using
+ round-to-nearest. This loses precision for some values.
+ </td>
+</tr>
+<tr>
+ <td>int-to-byte</td>
+ <td>int32 a;<br/>
+ int32 result = (a &lt;&lt; 24) &gt;&gt; 24;
+ </td>
+ <td>Truncation of <code>int32</code> to <code>int8</code>, sign
+ extending the result.
+ </td>
+</tr>
+<tr>
+ <td>int-to-char</td>
+ <td>int32 a;<br/>
+ int32 result = a &amp; 0xffff;
+ </td>
+ <td>Truncation of <code>int32</code> to <code>uint16</code>, without
+ sign extension.
+ </td>
+</tr>
+<tr>
+ <td>int-to-short</td>
+ <td>int32 a;<br/>
+ int32 result = (a &lt;&lt; 16) &gt;&gt; 16;
+ </td>
+ <td>Truncation of <code>int32</code> to <code>int16</code>, sign
+ extending the result.
+ </td>
+</tr>
+<tr>
+ <td>add-int</td>
+ <td>int32 a, b;<br/>
+ int32 result = a + b;
+ </td>
+ <td>Twos-complement addition.</td>
+</tr>
+<tr>
+ <td>sub-int</td>
+ <td>int32 a, b;<br/>
+ int32 result = a - b;
+ </td>
+ <td>Twos-complement subtraction.</td>
+</tr>
+<tr>
+ <td>rsub-int</td>
+ <td>int32 a, b;<br/>
+ int32 result = b - a;
+ </td>
+ <td>Twos-complement reverse subtraction.</td>
+</tr>
+<tr>
+ <td>mul-int</td>
+ <td>int32 a, b;<br/>
+ int32 result = a * b;
+ </td>
+ <td>Twos-complement multiplication.</td>
+</tr>
+<tr>
+ <td>div-int</td>
+ <td>int32 a, b;<br/>
+ int32 result = a / b;
+ </td>
+ <td>Twos-complement division, rounded towards zero (that is, truncated to
+ integer). This throws <code>ArithmeticException</code> if
+ <code>b == 0</code>.
+ </td>
+</tr>
+<tr>
+ <td>rem-int</td>
+ <td>int32 a, b;<br/>
+ int32 result = a % b;
+ </td>
+ <td>Twos-complement remainder after division. The sign of the result
+ is the same as that of <code>a</code>, and it is more precisely
+ defined as <code>result == a - (a / b) * b</code>. This throws
+ <code>ArithmeticException</code> if <code>b == 0</code>.
+ </td>
+</tr>
+<tr>
+ <td>and-int</td>
+ <td>int32 a, b;<br/>
+ int32 result = a &amp; b;
+ </td>
+ <td>Bitwise AND.</td>
+</tr>
+<tr>
+ <td>or-int</td>
+ <td>int32 a, b;<br/>
+ int32 result = a | b;
+ </td>
+ <td>Bitwise OR.</td>
+</tr>
+<tr>
+ <td>xor-int</td>
+ <td>int32 a, b;<br/>
+ int32 result = a ^ b;
+ </td>
+ <td>Bitwise XOR.</td>
+</tr>
+<tr>
+ <td>shl-int</td>
+ <td>int32 a, b;<br/>
+ int32 result = a &lt;&lt; (b &amp; 0x1f);
+ </td>
+ <td>Bitwise shift left (with masked argument).</td>
+</tr>
+<tr>
+ <td>shr-int</td>
+ <td>int32 a, b;<br/>
+ int32 result = a &gt;&gt; (b &amp; 0x1f);
+ </td>
+ <td>Bitwise signed shift right (with masked argument).</td>
+</tr>
+<tr>
+ <td>ushr-int</td>
+ <td>uint32 a, b;<br/>
+ int32 result = a &gt;&gt; (b &amp; 0x1f);
+ </td>
+ <td>Bitwise unsigned shift right (with masked argument).</td>
+</tr>
+<tr>
+ <td>add-long</td>
+ <td>int64 a, b;<br/>
+ int64 result = a + b;
+ </td>
+ <td>Twos-complement addition.</td>
+</tr>
+<tr>
+ <td>sub-long</td>
+ <td>int64 a, b;<br/>
+ int64 result = a - b;
+ </td>
+ <td>Twos-complement subtraction.</td>
+</tr>
+<tr>
+ <td>mul-long</td>
+ <td>int64 a, b;<br/>
+ int64 result = a * b;
+ </td>
+ <td>Twos-complement multiplication.</td>
+</tr>
+<tr>
+ <td>div-long</td>
+ <td>int64 a, b;<br/>
+ int64 result = a / b;
+ </td>
+ <td>Twos-complement division, rounded towards zero (that is, truncated to
+ integer). This throws <code>ArithmeticException</code> if
+ <code>b == 0</code>.
+ </td>
+</tr>
+<tr>
+ <td>rem-long</td>
+ <td>int64 a, b;<br/>
+ int64 result = a % b;
+ </td>
+ <td>Twos-complement remainder after division. The sign of the result
+ is the same as that of <code>a</code>, and it is more precisely
+ defined as <code>result == a - (a / b) * b</code>. This throws
+ <code>ArithmeticException</code> if <code>b == 0</code>.
+ </td>
+</tr>
+<tr>
+ <td>and-long</td>
+ <td>int64 a, b;<br/>
+ int64 result = a &amp; b;
+ </td>
+ <td>Bitwise AND.</td>
+</tr>
+<tr>
+ <td>or-long</td>
+ <td>int64 a, b;<br/>
+ int64 result = a | b;
+ </td>
+ <td>Bitwise OR.</td>
+</tr>
+<tr>
+ <td>xor-long</td>
+ <td>int64 a, b;<br/>
+ int64 result = a ^ b;
+ </td>
+ <td>Bitwise XOR.</td>
+</tr>
+<tr>
+ <td>shl-long</td>
+ <td>int64 a, b;<br/>
+ int64 result = a &lt;&lt; (b &amp; 0x3f);
+ </td>
+ <td>Bitwise shift left (with masked argument).</td>
+</tr>
+<tr>
+ <td>shr-long</td>
+ <td>int64 a, b;<br/>
+ int64 result = a &gt;&gt; (b &amp; 0x3f);
+ </td>
+ <td>Bitwise signed shift right (with masked argument).</td>
+</tr>
+<tr>
+ <td>ushr-long</td>
+ <td>uint64 a, b;<br/>
+ int64 result = a &gt;&gt; (b &amp; 0x3f);
+ </td>
+ <td>Bitwise unsigned shift right (with masked argument).</td>
+</tr>
+<tr>
+ <td>add-float</td>
+ <td>float a, b;<br/>
+ float result = a + b;
+ </td>
+ <td>Floating point addition.</td>
+</tr>
+<tr>
+ <td>sub-float</td>
+ <td>float a, b;<br/>
+ float result = a - b;
+ </td>
+ <td>Floating point subtraction.</td>
+</tr>
+<tr>
+ <td>mul-float</td>
+ <td>float a, b;<br/>
+ float result = a * b;
+ </td>
+ <td>Floating point multiplication.</td>
+</tr>
+<tr>
+ <td>div-float</td>
+ <td>float a, b;<br/>
+ float result = a / b;
+ </td>
+ <td>Floating point division.</td>
+</tr>
+<tr>
+ <td>rem-float</td>
+ <td>float a, b;<br/>
+ float result = a % b;
+ </td>
+ <td>Floating point remainder after division. This function differs from
+ IEEE 754 remainder and is defined as
+ <code>result == a - roundTowardZero(a / b) * b</code>.
+ </td>
+</tr>
+<tr>
+ <td>add-double</td>
+ <td>double a, b;<br/>
+ double result = a + b;
+ </td>
+ <td>Floating point addition.</td>
+</tr>
+<tr>
+ <td>sub-double</td>
+ <td>double a, b;<br/>
+ double result = a - b;
+ </td>
+ <td>Floating point subtraction.</td>
+</tr>
+<tr>
+ <td>mul-double</td>
+ <td>double a, b;<br/>
+ double result = a * b;
+ </td>
+ <td>Floating point multiplication.</td>
+</tr>
+<tr>
+ <td>div-double</td>
+ <td>double a, b;<br/>
+ double result = a / b;
+ </td>
+ <td>Floating point division.</td>
+</tr>
+<tr>
+ <td>rem-double</td>
+ <td>double a, b;<br/>
+ double result = a % b;
+ </td>
+ <td>Floating point remainder after division. This function differs from
+ IEEE 754 remainder and is defined as
+ <code>result == a - roundTowardZero(a / b) * b</code>.
+ </td>
+</tr>
+</tbody>
+</table>
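+<p>A few of the rules in the table are easy to get wrong when emulating them
+in a language with different integer semantics. The Python sketch below models
+three of them: the saturating <code>float-to-int</code> conversion, the masked
+shift count of <code>shl-int</code>, and the sign rule of
+<code>rem-int</code>. The helper names are assumptions of this example, Python
+floats stand in for both <code>float</code> and <code>double</code>, and
+<code>ZeroDivisionError</code> stands in for
+<code>ArithmeticException</code>.</p>

```python
import math

INT32_MIN, INT32_MAX = -0x80000000, 0x7fffffff


def to_int32(value):
    """Wrap an arbitrary Python int to a signed 32-bit value."""
    value &= 0xffffffff
    return value - (1 << 32) if value & 0x80000000 else value


def float_to_int(a):
    """Round toward zero; NaN gives 0; out-of-range values saturate."""
    if math.isnan(a):
        return 0
    if a >= INT32_MAX:
        return INT32_MAX
    if a <= INT32_MIN:
        return INT32_MIN
    return math.trunc(a)


def shl_int(a, b):
    """Shift left using only the low five bits of the shift count."""
    return to_int32(a << (b & 0x1f))


def rem_int(a, b):
    """Remainder whose sign follows a: a - (a / b) * b with truncating division."""
    if b == 0:
        raise ZeroDivisionError("stand-in for ArithmeticException")
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    return to_int32(a - quotient * b)
```

+<p>Note that Python's own <code>%</code> operator takes the sign of the
+divisor, which is why <code>rem_int</code> builds the truncated quotient
+explicitly instead of using it.</p>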
+
+</body>
+</html>
diff --git a/docs/debugmon.html b/docs/debugmon.html
new file mode 100644
index 0000000..cf56ef5
--- /dev/null
+++ b/docs/debugmon.html
@@ -0,0 +1,736 @@
+<HTML>
+
+
+<head>
+ <title>Dalvik VM Debug Monitor</title>
+ <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
+ <link href="http://www.google.com/favicon.ico" type="image/x-icon"
+ rel="shortcut icon">
+ <link href="../android.css" type="text/css" rel="stylesheet">
+ <script language="JavaScript1.2" type="text/javascript">
+function highlight(name) {
+ if (document.getElementById) {
+ tags = [ 'span', 'div', 'tr', 'td' ];
+ for (i in tags) {
+ elements = document.getElementsByTagName(tags[i]);
+ if (elements) {
+ for (j = 0; j < elements.length; j++) {
+ elementName = elements[j].getAttribute("id");
+ if (elementName == name) {
+ elements[j].style.backgroundColor = "#C0F0C0";
+ } else if (elementName && elementName.indexOf("rev") == 0) {
+ elements[j].style.backgroundColor = "#FFFFFF";
+ }
+ }
+ }
+ }
+ }
+}
+ </script>
+</head>
+<body onload="prettyPrint()">
+
+<h1><a name="My_Project_"></a>Dalvik VM<br>Debug Monitor</h1>
+
+<!-- Status is one of: Draft, Current, Needs Update, Obsolete -->
+<p style="text-align:center"><strong>Status:</strong> <em>Draft</em>
+<small>(as of March 6, 2007)</small></p>
+<address>
+[authors]
+</address>
+<address>
+
+<!-- last modified date can be different to the "Status date." It automatically
+updates
+whenever the file is modified. -->
+<i>Modified:</i>
+ <!-- this script automatically sets the modified date; you don't need to modify
+it -->
+ <script type=text/javascript>
+ <!--
+ var lm = new Date(document.lastModified);
+ document.write(lm.toDateString());
+ //-->
+ </script>
+</address>
+
+<p><br>
+<HR>
+
+<h2>Introduction</h2>
+
+<p>It's extremely useful to be able to monitor the live state of the
+VM. For Android, we need to monitor multiple VMs running on a device
+connected through USB or a wireless network connection. This document
+describes a debug monitor server that interacts with multiple VMs, and
+an API that VMs and applications can use to provide information
+to the monitor.
+
+<p>Some things we can monitor with the Dalvik Debug Monitor ("DDM"):
+<ul>
+ <li> Thread states. Track thread creation/exit, busy/idle status.
+ <li> Overall heap status, useful for a heap bitmap display or
+ fragmentation analysis.
+</ul>
+
+<p>It is possible for something other than a VM to act as a DDM client, but
+that is a secondary goal. Examples include "logcat" log extraction
+and system monitors for virtual memory usage and load average.
+
+<p>It's also possible for the DDM server to be run on the device, with
+the information presented through the device UI. However, the initial goal
+is to provide a display tool that takes advantage of desktop tools and
+screen real estate.
+
+<p>This work is necessary because we are unable to use standard JVMTI-based
+tools with Dalvik. JVMTI relies on bytecode insertion, which is not
+currently possible because Dalvik doesn't support Java bytecode.
+
+<p>The DDM server is written in the Java programming language
+for portability. It uses a desktop
+UI toolkit (SWT) for its interface.
+
+
+<h2>Protocol</h2>
+
+<p>To take advantage of existing infrastructure we are piggy-backing the
+DDM protocol on top of JDWP (the Java Debug Wire Protocol, normally spoken
+between a VM and a debugger). To a
+non-DDM client, the DDM server just looks like a debugger.
+
+<p>The JDWP protocol is very close to what we want to use. In particular:
+<ul>
+ <li>It explicitly allows for vendor-defined packets, so there is no
+ need to "bend" the JDWP spec.
+ <li>Events may be posted from the VM at arbitrary points. Such
+ events do not elicit a response from the debugger, meaning the client
+ can post data and immediately resume work without worrying about the
+ eventual response.
+ <li>The basic protocol is stateless and asynchronous. Request packets
+ from the debugger side include a serial number, which the VM includes
+ in the response packet. This allows multiple simultaneous
+ conversations, which means the DDM traffic can be interleaved with
+ debugger traffic.
+</ul>
+
+<p>There are a few issues with using JDWP for our purposes:
+<ul>
+ <li>The VM only expects one connection from a debugger, so you couldn't
+ attach the monitor and a debugger at the same time. This will be
+ worked around by connecting the debugger to the monitor and passing the
+ traffic through. (We're already doing the pass-through with "jdwpspy",
+ though it requires some management of our request IDs.) This should
+ be more convenient than the current "guess the port
+ number" system when we're attached to a device.
+ <li>The VM behaves differently when a debugger is attached. It will
+ run more slowly, and any objects passed to the monitor or debugger are
+ immune to GC. We can work around this by not enabling the slow path
+ until non-DDM traffic is observed. We also want to have a "debugger
+ has connected/disconnected" message that allows the VM to release
+ debugger-related resources without dropping the net connection.
+ <li>Non-DDM VMs should not freak out when DDM connects. There are
+ no guarantees here for 3rd-party VMs (e.g. a certain mainstream VM,
+ which crashes instantly), but our older JamVM can be
+ configured to reject the "hello" packet.
+</ul>
+
+
+<h3>Connection Establishment</h3>
+
+<p>There are two basic approaches: have the server contact the VMs, and
+have the VMs contact the server. The former is less "precise" than the
+latter, because you have to scan for the clients, but it has some
+advantages.
+
+<p>There are three interesting scenarios:
+<ol>
+ <li>The DDM server is started, then the USB-attached device is booted
+ or the simulator is launched.
+ <li>The device or simulator is already running when the DDM server
+ is started.
+ <li>The DDM server is running when an already-started device is
+ attached to USB.
+</ol>
+<p>If we have the VMs connect to the DDM server on startup, we only handle
+case #1. If the DDM server scans for VMs when it starts, we only handle
+case #2. Neither handles case #3, which is probably the most important
+of the bunch as the device matures.
+<p>The plan is to have a drop-down menu with two entries,
+"scan workstation" and "scan device".
+The former causes the DDM server to search for VMs on "localhost"; the
+latter causes it to search for VMs on the other side of an ADB connection.
+The DDM server will scan for VMs every few seconds, either checking a
+range of known VM ports (e.g. 8000-8040) or interacting with some sort
+of process database on the device. Changing modes causes all existing
+connections to be dropped.
+<p>When the DDM server first starts, it will try to execute "adb usb"
+to ensure that the ADB server is running. (Note it will be necessary
+to launch the DDM server from a shell with "adb" in the path.) If this
+fails, talking to the device will still be possible so long as the ADB
+daemon is already running.
+
+<h4>Connecting a Debugger</h4>
+
+<p>With the DDM server sitting on the JDWP port of all VMs, it will be
+necessary to connect the debugger through the DDM server. Each VM being
+debugged will have a separate port being listened to by the DDM server,
+allowing you to connect a debugger to one or more VMs simultaneously.
+
+<p>In the common case, however, the developer will only want to debug
+a single VM. One port (say 8700) will be listened to by the DDM server,
+and anything connecting to it will be connected to the "current VM"
+(selected in the UI). This should allow developers to focus on a
+single application, which may otherwise shift around in the ordering, without
+having to adjust their IDE settings to a different port every time they
+restart the device.
+
+
+<h3>Packet Format</h3>
+
+<p>Information is sent in chunks. Each chunk starts with:
+<pre>
+u4 type
+u4 length
+</pre>
+and contains a variable amount of type-specific data.
+Unrecognized types cause an empty response from the client and
+are quietly ignored by the server. [Should probably return an error;
+need an "error" chunk type and a handler on the server side.]
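+<p>A minimal Python sketch of this framing follows. It assumes the type is
+four ASCII characters packed into the <code>u4</code> (as names such as HELO
+suggest), that <code>length</code> counts only the type-specific data and not
+the 8-byte header, and that values are big-endian. The function names are
+hypothetical.</p>

```python
import struct


def pack_chunk(chunk_type, payload):
    """Build a chunk: u4 type, u4 length, then the payload (big-endian)."""
    if len(chunk_type) != 4:
        raise ValueError("chunk type must be four bytes")
    return struct.pack(">4sI", chunk_type, len(payload)) + payload


def unpack_chunks(buffer):
    """Split a buffer holding one or more back-to-back chunks."""
    chunks = []
    offset = 0
    while offset < len(buffer):
        chunk_type, length = struct.unpack_from(">4sI", buffer, offset)
        offset += 8
        chunks.append((chunk_type, buffer[offset:offset + length]))
        offset += length
    return chunks
```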
+
+<p>The same chunk type may have different meanings when sent in different
+directions. For example, the same type may be used for both a query and
+a response to the query. For sanity the type must always be used in
+related transactions.
+
+<p>This is somewhat redundant with the JDWP framing, which includes a
+4-byte length and a two-byte type code ("command set" and "command"; a
+range of command set values is designated for "vendor-defined commands
+and extensions"). Using the chunk format allows us to remain independent
+of the underlying transport, avoids intrusive integration
+with JDWP client code, and provides a way to send multiple chunks in a
+single transmission unit. [I'm taking the multi-chunk packets into
+account in the design, but do not plan to implement them unless the need
+arises.]
+
+<p>Because we may be sending data over a slow USB link, the chunks may be
+compressed. Compressed chunks are written as a chunk type that
+indicates the compression, followed by the compressed length, followed
+by the original chunk type and the uncompressed length. For zlib's deflate
+algorithm, the chunk type is "ZLIB".
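+<p>One possible reading of the compressed layout is sketched below in Python.
+The text does not say whether the compressed length covers the inner type and
+uncompressed length fields; this example assumes it does, and also assumes
+big-endian values and four-character ASCII type codes. The function names are
+hypothetical.</p>

```python
import struct
import zlib


def compress_chunk(chunk_type, payload):
    """Wrap a chunk's payload in a ZLIB chunk."""
    compressed = zlib.compress(payload)
    # Assumption: the compressed length covers the inner type, the
    # uncompressed length, and the deflated bytes that follow.
    body = struct.pack(">4sI", chunk_type, len(payload)) + compressed
    return struct.pack(">4sI", b"ZLIB", len(body)) + body


def decompress_chunk(data):
    """Recover (original type, original payload) from a ZLIB chunk."""
    outer_type, outer_length = struct.unpack_from(">4sI", data, 0)
    if outer_type != b"ZLIB":
        raise ValueError("not a compressed chunk")
    inner_type, original_length = struct.unpack_from(">4sI", data, 8)
    payload = zlib.decompress(data[16:8 + outer_length])
    if len(payload) != original_length:
        raise ValueError("uncompressed length mismatch")
    return inner_type, payload
```

+<p>Carrying the uncompressed length lets the receiver size its buffer before
+inflating and verify the result afterwards.</p>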
+
+<p>Following the JDWP model, packets sent from the server to the client
+are always acknowledged, but packets sent from client to server never are.
+The JDWP error code field is always set to "no error"; failure responses
+from specific requests must be encoded into the DDM messages.
+
+<p>In what follows "u4" is an unsigned 32-bit value and "u1" is an
+unsigned 8-bit value. Values are written in big-endian order to match
+JDWP.
+
+
+<h3>Initial Handshake</h3>
+
+<p>After the JDWP handshake, the server sends a HELO chunk to the client.
+If the client's JDWP layer rejects it, the server assumes that the client
+is not a DDM-aware VM, and does not send it any further DDM queries.
+<p>On the client side, upon seeing a HELO it can know that a DDM server
+is attached and prepare accordingly. The VM should not assume that a
+debugger is attached until a non-DDM packet arrives.
+
+<h4>Chunk HELO (server --> client)</h4>
+<p>Basic "hello" message.
+<pre>
+u4 DDM server protocol version
+</pre>
+
+
+<h4>Chunk HELO (client --> server, reply only)</h4>
+<p>Information about the client. Must be sent in response to the HELO message.
+<pre>
+u4 DDM client protocol version
+u4 pid
+u4 VM ident string len (in 16-bit units)
+u4 application name len (in 16-bit units)
+var VM ident string (UTF-16)
+var application name (UTF-16)
+</pre>
+
+<p>If the client does not wish to speak to the DDM server, it should respond
+with a JDWP error packet. This is the same behavior you'd get from a VM
+that doesn't support DDM.
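+<p>The HELO reply payload could be built and parsed along the following
+lines. This Python sketch assumes big-endian integers and UTF-16 text in
+big-endian order without a byte-order mark; the function names and the sample
+values in the usage note are hypothetical.</p>

```python
import struct


def build_helo_reply(version, pid, vm_ident, app_name):
    """Encode the client's HELO reply payload."""
    vm_bytes = vm_ident.encode("utf-16-be")
    app_bytes = app_name.encode("utf-16-be")
    # String lengths are expressed in 16-bit units, not bytes.
    header = struct.pack(">IIII", version, pid,
                         len(vm_bytes) // 2, len(app_bytes) // 2)
    return header + vm_bytes + app_bytes


def parse_helo_reply(payload):
    """Decode a HELO reply payload into (version, pid, vm_ident, app_name)."""
    version, pid, vm_len, app_len = struct.unpack_from(">IIII", payload, 0)
    offset = 16
    vm_ident = payload[offset:offset + vm_len * 2].decode("utf-16-be")
    offset += vm_len * 2
    app_name = payload[offset:offset + app_len * 2].decode("utf-16-be")
    return version, pid, vm_ident, app_name
```

+<p>For example, <code>parse_helo_reply(build_helo_reply(1, 1234, "Dalvik",
+"com.example.app"))</code> returns the same four values that were passed
+in.</p>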
+
+
+<h3>Debugger Management</h3>
+<p>VMs usually prepare for debugging when a JDWP connection is established,
+and release debugger-related resources when the connection drops. We want
+to open the JDWP connection early and hold it open after the debugger
+disconnects.
+<p>The VM can tell when a debugger attaches, because it will start seeing
+non-DDM JDWP traffic, but it can't identify the disconnect. For this reason,
+we need to send a packet to the client when the debugger disconnects.
+<p>If the DDM server is talking to a non-DDM-aware client, it will be
+necessary to drop and re-establish the connection when the debugger goes away.
+(This also works with DDM-aware clients; this packet is an optimization.)
+
+<h4>Chunk DBGD (server --> client)</h4>
+<p>Debugger has disconnected. The client responds with a DBGD to acknowledge
+receipt. The request carries no data, and the reply needs none.
+
+
+<h3>VM Info</h3>
+<p>Update the server's info about the client.
+
+<h4>Chunk APNM (client --> server)</h4>
+
+<p>If a VM's application name changes -- possible in our environment because
+of the "pre-initialized" app processes -- it must send up one of these.
+<pre>
+u4 application name len (in 16-bit chars)
+var application name (UTF-16)
+</pre>
+
+<h4>Chunk WAIT (client --> server)</h4>
+
+<p>This tells DDMS that one or more threads are waiting on an external
+event. The simplest use is to tell DDMS that the VM is waiting for a
+debugger to attach.
+<pre>
+u1 reason (0 = wait for debugger)
+</pre>
+If DDMS is attached, the client VM sends this up when waitForDebugger()
+is called. If waitForDebugger() is called before DDMS attaches, the WAIT
+chunk will be sent up at about the same time as the HELO response.
+
+
+<h3>Thread Status</h3>
+
+<p>The client can send updates when a thread's status changes, or
+periodically send thread state info, e.g. twice per
+second to allow a "blinkenlights" display of thread activity.
+
+<h4>Chunk THEN (server --> client)</h4>
+
+<p>Enable thread creation/death notification.
+<pre>
+u1 boolean (true=enable, false=disable)
+</pre>
+<p>The response is empty. The client generates THCR packets for all
+known threads. (Note the THCR packets may arrive before the THEN
+response.)
+
+<h4>Chunk THCR (client --> server)</h4>
+<p>Thread Creation notification.
+<pre>
+u4 VM-local thread ID (usually a small int)
+u4 thread name len (in 16-bit chars)
+var thread name (UTF-16)
+</pre>
+
+<h4>Chunk THDE (client --> server)</h4>
+<p>Thread Death notification.
+<pre>
+u4 VM-local thread ID
+</pre>
+
+<h4>Chunk THST (server --> client)</h4>
+
+<p>Enable periodic thread activity updates.
+Threads in THCR messages are assumed to be in the "initializing" state. A
+THST message should follow closely on the heels of THCR.
+<pre>
+u4 interval, in msec
+</pre>
+<p>An interval of 0 disables the updates. Updates are sent periodically,
+rather than every time the thread state changes, to reduce the amount
+of data that must be sent for an actively running VM.
+
+<h4>Chunk THST (client --> server)</h4>
+<p>Thread Status, describing the state of one or more threads. This is
+most useful when creation/death notifications are enabled first. The
+overall layout is:
+<pre>
+u4 count
+var thread data
+</pre>
+Then, for every thread:
+<pre>
+u4 VM-local thread ID
+u1 thread state
+u1 suspended
+</pre>
+<p>"thread state" must be one of:
+<ul> <!-- don't use ol, we may need (-1) or sparse -->
+ <li> 1 - running (now executing or ready to do so)
+ <li> 2 - sleeping (in Thread.sleep())
+ <li> 3 - monitor (blocked on a monitor lock)
+ <li> 4 - waiting (in Object.wait())
+ <li> 5 - initializing
+ <li> 6 - starting
+ <li> 7 - native (executing native code)
+ <li> 8 - vmwait (waiting on a VM resource)
+</ul>
+<p>"suspended" will be 0 if the thread is running, 1 if not.
+<p>[Any reason not to make "suspended" be the high bit of "thread state"?
+Do we need to differentiate suspend-by-GC from suspend-by-debugger?]
+<p>[We might be able to send the currently-executing method. This is a
+little risky in a running VM, and increases the size of the messages
+considerably, but might be handy.]
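+<p>A server-side decoder for the client's THST payload might look like the
+Python sketch below. It assumes big-endian values and tightly packed 6-byte
+per-thread records; the function name, the <code>"unknown"</code> fallback
+for unlisted state values, and the tuple layout of the result are choices made
+for this example.</p>

```python
import struct

THREAD_STATES = {
    1: "running", 2: "sleeping", 3: "monitor", 4: "waiting",
    5: "initializing", 6: "starting", 7: "native", 8: "vmwait",
}


def parse_thst(payload):
    """Decode a THST payload into (thread_id, state_name, suspended) tuples."""
    (count,) = struct.unpack_from(">I", payload, 0)
    threads = []
    offset = 4
    for _ in range(count):
        # Each record is u4 thread ID, u1 thread state, u1 suspended.
        thread_id, state, suspended = struct.unpack_from(">IBB", payload, offset)
        offset += 6
        threads.append((thread_id, THREAD_STATES.get(state, "unknown"),
                        bool(suspended)))
    return threads
```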
+
+
+<h3>Heap Status</h3>
+
+<p>The client sends what amounts to a color-coded bitmap to the server,
+indicating which stretches of memory are free and which are in use. For
+compactness the bitmap is run-length encoded, and based on multi-byte
+"allocation units" rather than byte counts.
+
+<p>In the future the server will be able to correlate the bitmap with more
+detailed object data, so enough information is provided to associate the
+bitmap data with virtual addresses.
+
+<p>Heaps may be broken into segments within the VM, and due to memory
+constraints it may be desirable to send the bitmap in smaller pieces,
+so the protocol allows the heap data to be sent in several chunks.
+To avoid ambiguity, the client is required
+to send explicit "start" and "end" messages during an update.
+
+<p>All messages include a "heap ID" that can be used to differentiate
+between multiple independent virtual heaps or perhaps a native heap. The
+client is allowed to send information about different heaps simultaneously,
+so all heap-specific information is tagged with a "heap ID".
+
+<h4>Chunk HPIF (server --> client)</h4>
+<p>Request heap info.
+<pre>
+u1 when to send
+</pre>
+<p>The "when" values are:
+<pre>
+0: never
+1: immediately
+2: at the next GC
+3: at every GC
+</pre>
+
+<h4>Chunk HPIF (client --> server, reply only)</h4>
+<p>Heap Info. General information about the heap, suitable for a summary
+display.
+<pre>
+u4 number of heaps
+</pre>
+For each heap:
+<pre>
+u4 heap ID
+u8 timestamp in ms since Unix epoch
+u1 capture reason (same as 'when' value from server)
+u4 max heap size in bytes (-Xmx)
+u4 current heap size in bytes
+u4 current number of bytes allocated
+u4 current number of objects allocated
+</pre>
+<p>[We can get some of this from HPSG, more from HPSO.]
+<p>[Do we need a "heap overhead" stat here, indicating how much goes to
+waste? e.g. (8 bytes per object * number of objects)]
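+<p>As an illustration only, the per-heap records of the HPIF reply could be
+unpacked on the server with a ByteBuffer. This is a minimal sketch: the
+HeapInfo and HpifParser names are hypothetical, and big-endian field order
+is an assumption rather than something the chunk layout itself mandates.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for one per-heap record of an HPIF reply. */
class HeapInfo {
    int heapId;
    long timestampMs;      // u8, ms since Unix epoch
    int reason;            // u1, same values as the "when" request field
    long maxSizeBytes;     // u4 fields are widened to long to stay unsigned
    long currentSizeBytes;
    long bytesAllocated;
    long objectsAllocated;
}

class HpifParser {
    /** Parses the body of an HPIF reply chunk (assumes big-endian fields). */
    static List<HeapInfo> parse(byte[] body) {
        ByteBuffer buf = ByteBuffer.wrap(body).order(ByteOrder.BIG_ENDIAN);
        int heapCount = buf.getInt();
        List<HeapInfo> heaps = new ArrayList<>();
        for (int i = 0; i < heapCount; i++) {
            HeapInfo info = new HeapInfo();
            info.heapId = buf.getInt();
            info.timestampMs = buf.getLong();
            info.reason = buf.get() & 0xff;
            info.maxSizeBytes = buf.getInt() & 0xffffffffL;
            info.currentSizeBytes = buf.getInt() & 0xffffffffL;
            info.bytesAllocated = buf.getInt() & 0xffffffffL;
            info.objectsAllocated = buf.getInt() & 0xffffffffL;
            heaps.add(info);
        }
        return heaps;
    }
}
```

+<p>A truncated body makes the ByteBuffer reads throw
+BufferUnderflowException, which a caller could treat as a malformed chunk.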
+
+<h4>Chunk HPSG (server --> client)</h4>
+<p>Request transmission of heap segment data.
+<pre>
+u1 when to send
+u1 what to send
+</pre>
+<p>The "when" value is 0 to disable transmission or 1 to send during a GC.
+Other values are currently undefined. (They could later be used to pick
+the point in the GC at which the data is sent, or to request periodic
+transmissions.)
+<p>The "what" field is currently 0 for HPSG and 1 for HPSO.
+<p>No reply is expected.
+
+<h4>Chunk NHSG (server --> client)</h4>
+<p>Request transmission of native heap segment data.
+<pre>
+u1 when to send
+u1 what to send
+</pre>
+<p>The "when" to send will be zero to disable transmission, 1 to send
+during a GC. Other values are currently undefined.
+<p>The "what" field is currently ignored.
+<p>No reply is expected.
+
+<h4>Chunk HPST/NHST (client --> server)</h4>
+<p>This is a Heap Start message. It tells the server to discard any
+existing notion of what the client's heap looks like, and prepare for
+new information. HPST indicates a virtual heap dump and must be followed
+by zero or more HPSG/HPSO messages and an HPEN. NHST indicates a native
+heap dump and must be followed by zero or more NHSG messages and an NHEN.
+
+<p>The only data item is:
+<pre>
+u4 heap ID
+</pre>
+
+<h4>Chunk HPEN/NHEN (client --> server)</h4>
+<p>Heap End, indicating that all information about the heap has been sent.
+An HPST will be paired with an HPEN, and an NHST will be paired with an NHEN.
+
+<p>The only data item is:
+<pre>
+u4 heap ID
+</pre>
+
+<h4>Chunk HPSG (client --> server)</h4>
+<p>Heap segment data. Each chunk describes all or part of a contiguous
+stretch of heap memory.
+<pre>
+u4 heap ID
+u1 size of allocation unit, in bytes (e.g. 8 bytes)
+u4 virtual address of segment start
+u4 offset of this piece (relative to the virtual address)
+u4 length of piece, in allocation units
+var usage data
+</pre>
+<p>The "usage data" indicates the status of each allocation unit. The data
+is a stream of pairs of bytes, where the first byte indicates the state
+of the allocation unit, and the second byte indicates the number of
+consecutive allocation units with the same state.
+<p>The bits in the "state" byte have the following meaning:
+<pre>
++---------------------------------------+
+|  7 |  6 |  5 |  4 |  3 |  2 |  1 |  0 |
++---------------------------------------+
+|  P | U0 | K2 | K1 | K0 | S2 | S1 | S0 |
++---------------------------------------+
+</pre>
+<ul>
+ <li>'S': solidity
+ <ul>
+ <li>0=free
+ <li>1=has hard reference
+ <li>2=has soft reference
+ <li>3=has weak reference
+ <li>4=has phantom reference
+ <li>5=pending finalization
+ <li>6=marked, about to be swept
+ </ul>
+ <li>'K': kind
+ <ul>
+ <li>0=object
+ <li>1=class object
+ <li>2=array of byte/boolean
+ <li>3=array of char/short
+ <li>4=array of Object/int/float
+ <li>5=array of long/double
+ </ul>
+ <li>'P': partial flag (not used for HPSG)
+ <li>'U': unused, must be zero
+</ul>
+
+<p>The use of the various 'S' types depends on when the information is
+sent. The current plan is to send it either immediately after a GC,
+or between the "mark" and "sweep" phases of the GC. For a fancy generational
+collector, we may just want to send it up periodically.
+
+<p>The run-length byte indicates the number of allocation units minus one, so a
+length of 255 means there are 256 consecutive units with this state. In
+some cases, e.g. arrays of bytes, the actual size of the data is rounded
+up to the nearest allocation unit.
+<p>For HPSG, the runs do not end at object boundaries. It is not possible
+to tell from this bitmap whether a run contains one or several objects.
+(But see HPSO, below.)
+<p>[If we find that we have many long runs, we can overload the 'P' flag
+or dedicate the 'U' flag to indicate that we have a 16-bit length instead
+of 8-bit. We can also use a variable-width integer scheme for the length,
+encoding 1-128 in one byte, 1-16384 in two bytes, etc.]
+<p>[Alternate plan for 'K': array of byte, array of char, array of Object,
+array of miscellaneous primitive type]
+<p>To parse the data, the server runs through the usage data until either
+(a) the end of the chunk is reached, or (b) all allocation units have been
+accounted for. (If these two things don't happen at the same time, the
+chunk is rejected.)
+<p>Example: suppose a VM has a heap at 0x10000 that is 0x2000 bytes long
+(with an 8-byte allocation unit size, that's 0x0400 units long).
+The client could send one chunk (allocSize=8, virtAddr=0x10000, offset=0,
+length=0x0400) or two (allocSize=8, virtAddr=0x10000, offset=0, length=0x300;
+then allocSize=8, virtAddr=0x10000, offset=0x300, length=0x100).
+<p>The client must encode the entire heap, including all free space at
+the end, or the server will not have an accurate impression of the amount
+of memory in the heap. This refers to the current heap size, not the
+maximum heap size.
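+<p>The parsing rule for the usage data can be sketched as follows. The
+HpsgDecoder class and its method names are hypothetical; the sketch takes
+only the "usage data" bytes and the piece length from the chunk header, and
+rejects data that does not end exactly when every allocation unit has been
+accounted for.

```java
import java.util.Arrays;

class HpsgDecoder {
    /** Bits 0-2 of the state byte: solidity. */
    static int solidity(int state) { return state & 0x07; }

    /** Bits 3-5 of the state byte: kind. */
    static int kind(int state) { return (state >> 3) & 0x07; }

    /** Bit 7 of the state byte: partial flag (meaningful for HPSO only). */
    static boolean partial(int state) { return (state & 0x80) != 0; }

    /**
     * Expands run-length usage data into one state byte per allocation unit.
     * Throws if the data and the unit count do not run out at the same time.
     */
    static byte[] unpack(byte[] usage, int unitCount) {
        byte[] units = new byte[unitCount];
        int filled = 0;
        int pos = 0;
        while (pos < usage.length && filled < unitCount) {
            if (pos + 1 >= usage.length) {
                throw new IllegalArgumentException("truncated state/length pair");
            }
            byte state = usage[pos];
            int run = (usage[pos + 1] & 0xff) + 1; // stored as length minus one
            if (run > unitCount - filled) {
                throw new IllegalArgumentException("run extends past end of piece");
            }
            Arrays.fill(units, filled, filled + run, state);
            filled += run;
            pos += 2;
        }
        if (pos != usage.length || filled != unitCount) {
            throw new IllegalArgumentException("usage data does not match piece length");
        }
        return units;
    }
}
```

+<p>Expanding every unit keeps the sketch short; a real viewer might prefer to
+keep the runs and draw them directly.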
+
+<h4>Chunk HPSO (client --> server)</h4>
+<p>This is essentially identical to HPSG, but the runs are terminated at
+object boundaries. If an object is larger than 256 allocation units, the
+"partial" flag is set in all runs except the last.
+<p>The resulting unpacked bitmap is identical, but the object boundary
+information can be used to gain insights into heap layout.
+<p>[Do we want to have a separate message for this? Maybe just include
+a "variant" flag in the HPST packet. Another possible form of output
+would be one that indicates the age, in generations, of each block of
+memory. That would provide a quick visual indication of "permanent vs.
+transient residents", perhaps with a 16-level grey scale.]
+
+<h4>Chunk NHSG (client --> server)</h4>
+<p>Native heap segment data. Each chunk describes all or part of a
+contiguous stretch of native heap memory. The format is the same as
+for HPSG, except that only solidity values 0 (= free) and 1 (= hard
+reference) are used, and the kind value is always 0 for free chunks
+and 7 for allocated chunks, indicating a non-VM object.
+<pre>
+u4 heap ID
+u1 size of allocation unit, in bytes (e.g. 8 bytes)
+u4 virtual address of segment start
+u4 offset of this piece (relative to the virtual address)
+u4 length of piece, in allocation units
+var usage data
+</pre>
+
+<h3>Generic Replies</h3>
+
+<p>The client-side chunk handlers need a common way to report simple success
+or failure. By convention, an empty reply packet indicates success.
+
+<h4>Chunk FAIL (client --> server, reply only)</h4>
+<p>The chunk includes a machine-readable error code and a
+human-readable error message. Server code can associate the failure
+with the original request by comparing the JDWP packet ID.
+<p>This allows a standard way of, for example, rejecting badly-formed
+request packets.
+<pre>
+u4 error code
+u4 error message len (in 16-bit chars)
+var error message (UTF-16)
+</pre>
+
+<h3>Miscellaneous</h3>
+
+<h4>Chunk EXIT (server --> client)</h4>
+<p>Cause the client to exit with the specified status, using System.exit().
+Useful for certain kinds of testing.
+<pre>
+u4 exit status
+</pre>
+
+<h4>Chunk DTRC (server --> client)</h4>
+<p>[TBD] start/stop dmtrace; can send the results back over the wire. For
+size reasons we probably need "sending", "data", "key", "finished" as
+4 separate chunks/packets rather than one glob.
+
+
+<h2>Client API</h2>
+
+<p>The API is written in the Java programming language
+for convenience. The code is free to call native methods if appropriate.
+
+<h3>Chunk Handler API</h3>
+
+<p>The basic idea is that arbitrary code can register handlers for
+specific chunk types. When a DDM chunk with that type arrives, the
+appropriate handler is invoked. The handler's return value provides the
+response to the server.
+
+<p>There are two packages. android.ddm lives in the "framework" library,
+and has all of the chunk handlers and registration code. It can freely
+use Android classes. org.apache.harmony.dalvik.ddmc lives in the "core"
+library, and has
+some base classes and features that interact with the VM. Nothing should
+need to modify the org.apache.harmony.dalvik.ddmc classes.
+
+<p>The DDM classes pass chunks of data around with a simple class:
+
+<pre class=prettyprint>
+class Chunk {
+ int type;
+ byte[] data;
+ int offset, length;
+}
+</pre>
+
+<p>The chunk handlers accept and return them:
+<pre class=prettyprint>
+public Chunk handleChunk(Chunk request)
+</pre>
+<p>The code is free to parse the chunk and generate a response in any
+way it chooses. Big-endian byte ordering is recommended but not mandatory.
+<p>Chunk handlers will be notified when a DDM server connects or disconnects,
+so that they can perform setup and cleanup operations:
+<pre class=prettyprint>
+public void connected()
+public void disconnected()
+</pre>
+
+<p>The handleChunk() method processes the request, formulates a response, and returns it.
+If the method returns null, an empty JDWP success message will be returned.
+<p>The request/response interaction is essentially asynchronous in the
+protocol. The packets are linked together with the JDWP message ID.
+<p>[We could use ByteBuffer here instead of byte[], but it doesn't gain
+us much. Wrapping a ByteBuffer around an array is easy. We don't want
+to pass the full packet in because we could have multiple chunks in one
+request packet. The DDM code needs to collect and aggregate the responses
+to all chunks into a single JDWP response packet. Parties wanting to
+write multiple chunks in response to a single chunk should send a null
+response back and use "sendChunk()" to send the data independently.]
+
+<h3>Unsolicited event API</h3>
+
+<p>If a piece of code wants to send a chunk of data to the server at some
+arbitrary time, it may do so with a method provided by
+org.apache.harmony.dalvik.ddmc.DdmServer:
+
+<pre class=prettyprint>
+public static void sendChunk(Chunk chunk)
+</pre>
+
+<p>There is no response or status code. No exceptions are thrown.
+
+
+<h2>Server API</h2>
+
+<p>This is similar to the client side in many ways, but makes extensive
+use of ByteBuffer in a perhaps misguided attempt to use java.nio.channels
+and avoid excessive thread creation and unnecessary data copying.
+
+<p>Upon receipt of a packet, the server will identify it as one of:
+<ol>
+ <li>Message to be passed through to the debugger
+ <li>Response to an earlier request
+ <li>Unsolicited event packet
+</ol>
+<p>To handle (2), when messages are sent from the server to the client,
+the message must be paired with a callback method. The response might be
+delayed for a while -- or might never arrive -- so the server can't block
+waiting for responses from the client.
+<p>The chunk handlers look like this:
+<pre class=prettyprint>
+public void handleChunk(Client client, int type,
+ ByteBuffer data, boolean isReply, int msgId)
+</pre>
+<p>The arguments are:
+<dl>
+ <dt>client
+ <dd>An object representing the client VM that sent us the packet.
+ <dt>type
+ <dd>The 32-bit chunk type.
+ <dt>data
+ <dd>The data. The data's length can be determined by calling data.limit().
+ <dt>isReply
+ <dd>Set to "true" if this was a reply to a message we sent earlier,
+ "false" if the client sent this unsolicited.
+ <dt>msgId
+ <dd>The JDWP message ID. Useful for connecting replies with requests.
+</dl>
+<p>If a handler doesn't like the contents of a packet, it should log an
+error message and return. If the handler doesn't recognize the packet at
+all, it can call the superclass' handleUnknownChunk() method.
+
+<p>As with the client, the server code can be notified when clients
+connect or disconnect. This allows the handler to send initialization
+code immediately after a connect, or clean up after a disconnect.
+<p>Data associated with a client can be stored in a ClientData object,
+which acts as a general per-client dumping ground for VM and UI state.
+
+
+<P><BR>
+
+<HR>
+
+<address>Copyright © 2007 The Android Open Source Project</address>
+
+</body>
+</HTML>
diff --git a/docs/dex-format.css b/docs/dex-format.css
new file mode 100644
index 0000000..17e935f
--- /dev/null
+++ b/docs/dex-format.css
@@ -0,0 +1,387 @@
+h1 {
+ font-family: serif;
+ border-top-style: solid;
+ border-top-width: 5px;
+ padding-top: 9pt;
+ margin-top: 40pt;
+ color: #222266;
+}
+
+h1.title {
+ border: none;
+}
+
+h2 {
+ font-family: serif;
+ border-top-style: solid;
+ border-top-width: 2px;
+ border-color: #ccccdd;
+ padding-top: 9pt;
+ margin-top: 40pt;
+ margin-bottom: 2pt;
+ color: #222266;
+}
+
+h3 {
+ font-family: serif;
+ font-weight: bold;
+ margin-top: 20pt;
+ margin-bottom: 2pt;
+ color: #222266;
+}
+
+h4 {
+ font-family: serif;
+ font-style: italic;
+ margin-top: 2pt;
+ margin-bottom: 2pt;
+ color: #666688;
+}
+
+@media print {
+ table {
+ font-size: 8pt;
+ }
+}
+
+@media screen {
+ table {
+ font-size: 10pt;
+ }
+}
+
+pre {
+ background: #eeeeff;
+ border-color: #aaaaff;
+ border-style: solid;
+ border-width: 1px;
+ margin-left: 40pt;
+ margin-right: 40pt;
+ padding: 6pt;
+}
+
+table {
+ border-collapse: collapse;
+ margin-top: 10pt;
+ margin-left: 40pt;
+ margin-right: 40pt;
+}
+
+table th {
+ font-family: sans-serif;
+ background: #aabbff;
+}
+
+table td {
+ font-family: sans-serif;
+ border-top-style: solid;
+ border-bottom-style: solid;
+ border-width: 1px;
+ border-color: #aaaaff;
+ padding-top: 3pt;
+ padding-bottom: 3pt;
+ padding-left: 3pt;
+ padding-right: 4pt;
+ background: #eeeeff;
+}
+
+table p {
+ margin-bottom: 0pt;
+}
+
+/* for the bnf syntax sections */
+
+table.bnf {
+ background: #eeeeff;
+ border-color: #aaaaff;
+ border-style: solid;
+ border-width: 1px;
+ margin-top: 3pt;
+ margin-bottom: 3pt;
+ padding-top: 2pt;
+ padding-bottom: 6pt;
+ padding-left: 6pt;
+ padding-right: 6pt;
+}
+
+table.bnf td {
+ border: none;
+ padding-left: 6pt;
+ padding-right: 6pt;
+ padding-top: 1pt;
+ padding-bottom: 1pt;
+}
+
+table.bnf td:first-child {
+ padding-right: 0pt;
+ width: 8pt;
+}
+
+table.bnf td:first-child td {
+ padding-left: 0pt;
+}
+
+table.bnf td.def {
+ padding-top: 6pt;
+}
+
+table.bnf td.bar {
+ padding-left: 15pt;
+}
+
+table.bnf code {
+ font-weight: bold;
+}
+
+
+/* for the type name guide */
+
+table.guide {
+ margin-top: 20pt;
+ margin-bottom: 20pt;
+}
+
+table.guide td:first-child {
+ font-family: monospace;
+ width: 15%;
+}
+
+table.guide td:first-child + td {
+ font-family: sans-serif;
+ width: 85%;
+}
+
+
+/* for the LEB128 example tables */
+
+table.leb128Bits {
+ margin-top: 20pt;
+ margin-bottom: 20pt;
+}
+
+table.leb128Bits td {
+ border-left: solid #aaaaff 1px;
+ border-right: solid #aaaaff 1px;
+}
+
+table.leb128Bits td.start1 {
+ border-left: none;
+}
+
+table.leb128Bits td.start2 {
+ border-left: solid #000 2px;
+}
+
+table.leb128Bits td.end2 {
+ border-right: none;
+}
+
+table.leb128 {
+ margin-top: 20pt;
+ margin-bottom: 20pt;
+}
+
+table.leb128 td:first-child {
+ font-family: monospace;
+ text-align: center;
+ width: 31%;
+}
+
+table.leb128 td:first-child + td {
+ font-family: monospace;
+ text-align: center;
+ width: 23%;
+}
+
+table.leb128 td:first-child + td + td {
+ font-family: monospace;
+ text-align: center;
+ width: 23%;
+}
+
+table.leb128 td:first-child + td + td + td {
+ font-family: monospace;
+ text-align: center;
+ width: 23%;
+}
+
+
+/* for the general format tables */
+
+table.format {
+ margin-top: 20pt;
+ margin-bottom: 20pt;
+}
+
+table.format td:first-child {
+ font-family: monospace;
+ width: 20%;
+}
+
+table.format td:first-child + td {
+ font-family: monospace;
+ width: 20%;
+}
+
+table.format td:first-child + td + td {
+ width: 60%;
+}
+
+table.format td i {
+ font-family: sans-serif;
+}
+
+
+/* for the type code table */
+
+table.typeCodes {
+ margin-top: 20pt;
+ margin-bottom: 20pt;
+}
+
+table.typeCodes td:first-child {
+ font-family: monospace;
+ width: 30%;
+}
+
+table.typeCodes td:first-child + td {
+ font-family: monospace;
+ width: 30%;
+}
+
+table.typeCodes td:first-child + td + td {
+ font-family: monospace;
+ width: 10%;
+}
+
+table.typeCodes td:first-child + td + td + td {
+ font-family: monospace;
+ width: 30%;
+}
+
+table.typeCodes td i {
+ font-family: sans-serif;
+}
+
+
+/* for the access flags table */
+
+table.accessFlags {
+ margin-top: 20pt;
+ margin-bottom: 20pt;
+}
+
+table.accessFlags td:first-child {
+ font-family: monospace;
+ width: 10%;
+}
+
+table.accessFlags td:first-child + td {
+ font-family: monospace;
+ width: 6%;
+}
+
+table.accessFlags td:first-child + td + td {
+ width: 28%;
+}
+
+table.accessFlags td:first-child + td + td + td {
+ width: 28%;
+}
+
+table.accessFlags td:first-child + td + td + td + td {
+ width: 28%;
+}
+
+table.accessFlags i {
+ font-family: sans-serif;
+}
+
+
+/* for the descriptor table */
+
+table.descriptor {
+ margin-top: 20pt;
+ margin-bottom: 20pt;
+}
+
+table.descriptor td:first-child {
+ font-family: monospace;
+ width: 25%;
+}
+
+table.descriptor td:first-child + td {
+ font-family: sans-serif;
+ width: 75%;
+}
+
+
+/* for the debug bytecode table */
+
+table.debugByteCode {
+ margin-top: 20pt;
+ margin-bottom: 20pt;
+}
+
+table.debugByteCode td:first-child {
+ font-family: monospace;
+ width: 20%;
+}
+
+table.debugByteCode td:first-child + td {
+ font-family: monospace;
+ width: 5%;
+}
+
+table.debugByteCode td:first-child + td + td{
+ font-family: monospace;
+ width: 15%;
+}
+
+table.debugByteCode td:first-child + td + td + td {
+ width: 25%;
+}
+
+table.debugByteCode td:first-child + td + td + td + td {
+ width: 35%;
+}
+
+table.debugByteCode i {
+ font-family: sans-serif;
+}
+
+
+/* for the encoded value table */
+
+table.encodedValue {
+ margin-top: 20pt;
+ margin-bottom: 20pt;
+}
+
+table.encodedValue td:first-child {
+ font-family: monospace;
+ width: 12%;
+}
+
+table.encodedValue td:first-child + td {
+ font-family: monospace;
+ width: 10%;
+}
+
+table.encodedValue td:first-child + td + td {
+ font-family: monospace;
+ width: 15%;
+}
+
+table.encodedValue td:first-child + td + td + td {
+ font-family: monospace;
+ width: 15%;
+}
+
+table.encodedValue td:first-child + td + td + td + td {
+ width: 48%;
+}
+
+table.encodedValue td i {
+ font-family: sans-serif;
+}
diff --git a/docs/dex-format.html b/docs/dex-format.html
new file mode 100644
index 0000000..88a7fb0
--- /dev/null
+++ b/docs/dex-format.html
@@ -0,0 +1,3043 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+
+<html>
+
+<head>
+<title>.dex — Dalvik Executable Format</title>
+<link rel="stylesheet" href="dex-format.css" />
+</head>
+
+<body>
+
+<h1 class="title"><code>.dex</code> — Dalvik Executable Format</h1>
+<p>Copyright © 2007 The Android Open Source Project
+
+<p>This document describes the layout and contents of <code>.dex</code>
+files, which are used to hold a set of class definitions and their associated
+adjunct data.</p>
+
+<h1>Guide To Types</h1>
+
+<table class="guide">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>byte</td>
+ <td>8-bit signed int</td>
+</tr>
+<tr>
+ <td>ubyte</td>
+ <td>8-bit unsigned int</td>
+</tr>
+<tr>
+ <td>short</td>
+ <td>16-bit signed int, little-endian</td>
+</tr>
+<tr>
+ <td>ushort</td>
+ <td>16-bit unsigned int, little-endian</td>
+</tr>
+<tr>
+ <td>int</td>
+ <td>32-bit signed int, little-endian</td>
+</tr>
+<tr>
+ <td>uint</td>
+ <td>32-bit unsigned int, little-endian</td>
+</tr>
+<tr>
+ <td>long</td>
+ <td>64-bit signed int, little-endian</td>
+</tr>
+<tr>
+ <td>ulong</td>
+ <td>64-bit unsigned int, little-endian</td>
+</tr>
+<tr>
+ <td>sleb128</td>
+ <td>signed LEB128, variable-length (see below)</td>
+</tr>
+<tr>
+ <td>uleb128</td>
+ <td>unsigned LEB128, variable-length (see below)</td>
+</tr>
+<tr>
+ <td>uleb128p1</td>
+ <td>unsigned LEB128 plus <code>1</code>, variable-length (see below)</td>
+</tr>
+</tbody>
+</table>
+
+<h3>LEB128</h3>
+
+<p>LEB128 ("<b>L</b>ittle-<b>E</b>ndian <b>B</b>ase <b>128</b>") is a
+variable-length encoding for
+arbitrary signed or unsigned integer quantities. The format was
+borrowed from the <a href="http://dwarfstd.org/Dwarf3Std.php">DWARF3</a>
+specification. In a <code>.dex</code> file, LEB128 is only ever used to
+encode 32-bit quantities.</p>
+
+<p>Each LEB128 encoded value consists of one to five
+bytes, which together represent a single 32-bit value. Each
+byte has its most significant bit set except for the final byte in the
+sequence, which has its most significant bit clear. The remaining
+seven bits of each byte are payload, with the least significant seven
+bits of the quantity in the first byte, the next seven in the second
+byte and so on. In the case of a signed LEB128 (<code>sleb128</code>),
+the most significant payload bit of the final byte in the sequence is
+sign-extended to produce the final value. In the unsigned case
+(<code>uleb128</code>), any bits not explicitly represented are
+interpreted as <code>0</code>.</p>
+
+<table class="leb128Bits">
+<thead>
+<tr><th colspan="16">Bitwise diagram of a two-byte LEB128 value</th></tr>
+<tr>
+ <th colspan="8">First byte</th>
+ <th colspan="8">Second byte</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td class="start1"><code>1</code></td>
+ <td>bit<sub>6</sub></td>
+ <td>bit<sub>5</sub></td>
+ <td>bit<sub>4</sub></td>
+ <td>bit<sub>3</sub></td>
+ <td>bit<sub>2</sub></td>
+ <td>bit<sub>1</sub></td>
+ <td>bit<sub>0</sub></td>
+ <td class="start2"><code>0</code></td>
+ <td>bit<sub>13</sub></td>
+ <td>bit<sub>12</sub></td>
+ <td>bit<sub>11</sub></td>
+ <td>bit<sub>10</sub></td>
+ <td>bit<sub>9</sub></td>
+ <td>bit<sub>8</sub></td>
+ <td class="end2">bit<sub>7</sub></td>
+</tr>
+</tbody>
+</table>
+
+<p>The variant <code>uleb128p1</code> is used to represent a signed
+value, where the representation is of the value <i>plus one</i> encoded
+as a <code>uleb128</code>. This makes the encoding of <code>-1</code>
+(alternatively thought of as the unsigned value <code>0xffffffff</code>)
+— but no other negative number — a single byte, and is
+useful in exactly those cases where the represented number must either
+be non-negative or <code>-1</code> (or <code>0xffffffff</code>),
+and where no other negative values are allowed (or where large unsigned
+values are unlikely to be needed).</p>
+
+<p>Here are some examples of the formats:</p>
+
+<table class="leb128">
+<thead>
+<tr>
+ <th>Encoded Sequence</th>
+ <th>As <code>sleb128</code></th>
+ <th>As <code>uleb128</code></th>
+ <th>As <code>uleb128p1</code></th>
+</tr>
+</thead>
+<tbody>
+ <tr><td>00</td><td>0</td><td>0</td><td>-1</td></tr>
+ <tr><td>01</td><td>1</td><td>1</td><td>0</td></tr>
+ <tr><td>7f</td><td>-1</td><td>127</td><td>126</td></tr>
+ <tr><td>80 7f</td><td>-128</td><td>16256</td><td>16255</td></tr>
+</tbody>
+</table>
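+<p>The three variants can be decoded with a few lines each. The following is
+a minimal sketch, not a reference implementation: the <code>Leb128</code>
+class and its method names are hypothetical, values are limited to 32 bits
+as in a <code>.dex</code> file, and a <code>uleb128</code> result with bit 31
+set comes back as a negative Java <code>int</code>.</p>

```java
class Leb128 {
    /** Decodes a uleb128 starting at pos; missing high bits are zero. */
    static int readUnsigned(byte[] in, int pos) {
        int result = 0;
        int shift = 0;
        for (int i = 0; i < 5; i++) {
            int b = in[pos + i] & 0xff;
            result |= (b & 0x7f) << shift;   // low seven bits are payload
            if ((b & 0x80) == 0) {
                return result;               // high bit clear: final byte
            }
            shift += 7;
        }
        throw new IllegalArgumentException("LEB128 value longer than five bytes");
    }

    /** Decodes an sleb128 starting at pos, sign-extending the final byte. */
    static int readSigned(byte[] in, int pos) {
        int result = 0;
        int shift = 0;
        for (int i = 0; i < 5; i++) {
            int b = in[pos + i] & 0xff;
            result |= (b & 0x7f) << shift;
            shift += 7;
            if ((b & 0x80) == 0) {
                // Bit 6 of the final byte is the sign bit of the payload.
                if (shift < 32 && (b & 0x40) != 0) {
                    result |= -1 << shift;
                }
                return result;
            }
        }
        throw new IllegalArgumentException("LEB128 value longer than five bytes");
    }

    /** Decodes a uleb128p1: the stored value is the real value plus one. */
    static int readUnsignedPlusOne(byte[] in, int pos) {
        return readUnsigned(in, pos) - 1;
    }
}
```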
+
+<h1>Overall File Layout</h1>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>header</td>
+ <td>header_item</td>
+ <td>the header</td>
+</tr>
+<tr>
+ <td>string_ids</td>
+ <td>string_id_item[]</td>
+ <td>string identifiers list. These are identifiers for all the strings
+ used by this file, either for internal naming (e.g., type descriptors)
+ or as constant objects referred to by code. This list must be sorted
+ by string contents, using UTF-16 code point values (not in a
+ locale-sensitive manner).
+ </td>
+</tr>
+<tr>
+ <td>type_ids</td>
+ <td>type_id_item[]</td>
+ <td>type identifiers list. These are identifiers for all types (classes,
+ arrays, or primitive types) referred to by this file, whether defined
+ in the file or not. This list must be sorted by <code>string_id</code>
+ index.
+ </td>
+</tr>
+<tr>
+ <td>proto_ids</td>
+ <td>proto_id_item[]</td>
+ <td>method prototype identifiers list. These are identifiers for all
+ prototypes referred to by this file. This list must be sorted in
+ return-type (by <code>type_id</code> index) major order, and then
+ by arguments (also by <code>type_id</code> index).
+ </td>
+</tr>
+<tr>
+ <td>field_ids</td>
+ <td>field_id_item[]</td>
+ <td>field identifiers list. These are identifiers for all fields
+ referred to by this file, whether defined in the file or not. This
+ list must be sorted, where the defining type (by <code>type_id</code>
+ index) is the major order, field name (by <code>string_id</code> index)
+ is the intermediate order, and type (by <code>type_id</code> index)
+ is the minor order.
+ </td>
+</tr>
+<tr>
+ <td>method_ids</td>
+ <td>method_id_item[]</td>
+ <td>method identifiers list. These are identifiers for all methods
+ referred to by this file, whether defined in the file or not. This
+ list must be sorted, where the defining type (by <code>type_id</code>
+ index) is the major order, method name (by <code>string_id</code>
+ index) is the intermediate order, and method
+ prototype (by <code>proto_id</code> index) is the minor order.
+ </td>
+</tr>
+<tr>
+ <td>class_defs</td>
+ <td>class_def_item[]</td>
+ <td>class definitions list. The classes must be ordered such that a given
+ class's superclass and implemented interfaces appear in the
+ list earlier than the referring class.
+ </td>
+</tr>
+<tr>
+ <td>data</td>
+ <td>ubyte[]</td>
+ <td>data area, containing all the support data for the tables listed above.
+ Different items have different alignment requirements, and
+ padding bytes are inserted before each item if necessary to achieve
+ proper alignment.
+ </td>
+</tr>
+<tr>
+ <td>link_data</td>
+ <td>ubyte[]</td>
+ <td>data used in statically linked files. The format of the data in
+ this section is left unspecified by this document;
+ this section is empty in unlinked files, and runtime implementations
+ may use it as they see fit.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h1>Bitfield, String, and Constant Definitions</h1>
+
+<h2><code>DEX_FILE_MAGIC</code></h2>
+<h4>embedded in <code>header_item</code></h4>
+
+<p>The constant array/string <code>DEX_FILE_MAGIC</code> is the list of
+bytes that must appear at the beginning of a <code>.dex</code> file
+in order for it to be recognized as such. The value intentionally
+contains a newline (<code>"\n"</code> or <code>0x0a</code>) and a
+null byte (<code>"\0"</code> or <code>0x00</code>) in order to help
+in the detection of certain forms of corruption. The value also
+encodes a format version number as three decimal digits, which is
+expected to increase monotonically over time as the format evolves.</p>
+
+<pre>
+ubyte[8] DEX_FILE_MAGIC = { 0x64 0x65 0x78 0x0a 0x30 0x33 0x35 0x00 }
+ = "dex\n035\0"
+</pre>
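+<p>A reader could validate the magic and extract the version along these
+lines. This is only a sketch; the <code>DexMagic</code> class and the
+convention of returning <code>-1</code> for a bad header are assumptions,
+not part of the format.</p>

```java
class DexMagic {
    /**
     * Returns the three-digit format version encoded in the first eight
     * bytes of a file, or -1 if they are not a well-formed DEX_FILE_MAGIC.
     */
    static int parseVersion(byte[] header) {
        if (header.length < 8) {
            return -1;
        }
        // "dex\n" prefix and the trailing null byte must match exactly.
        if (header[0] != 'd' || header[1] != 'e' || header[2] != 'x'
                || header[3] != '\n' || header[7] != 0) {
            return -1;
        }
        int version = 0;
        for (int i = 4; i < 7; i++) {
            int digit = header[i] - '0';
            if (digit < 0 || digit > 9) {
                return -1;
            }
            version = version * 10 + digit;
        }
        return version;
    }
}
```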
+
+<p><b>Note:</b> At least a couple of earlier versions of the format have
+been used in widely-available public software releases. For example,
+version <code>009</code> was used for the M3 releases of the
+Android platform (November-December 2007),
+and version <code>013</code> was used for the M5 releases of the Android
+platform (February-March 2008). In several respects, these earlier versions
+of the format differ significantly from the version described in this
+document.</p>
+
+<h2><code>ENDIAN_CONSTANT</code> and <code>REVERSE_ENDIAN_CONSTANT</code></h2>
+<h4>embedded in <code>header_item</code></h4>
+
+<p>The constant <code>ENDIAN_CONSTANT</code> is used to indicate the
+endianness of the file in which it is found. Although the standard
+<code>.dex</code> format is little-endian, implementations may choose
+to perform byte-swapping. Should an implementation come across a
+header whose <code>endian_tag</code> is <code>REVERSE_ENDIAN_CONSTANT</code>
+instead of <code>ENDIAN_CONSTANT</code>, it would know that the file
+has been byte-swapped from the expected form.</p>
+
+<pre>
+uint ENDIAN_CONSTANT = 0x12345678;
+uint REVERSE_ENDIAN_CONSTANT = 0x78563412;
+</pre>
+
+<h2><code>NO_INDEX</code></h2>
+<h4>embedded in <code>class_def_item</code> and
+<code>debug_info_item</code></h4>
+
+<p>The constant <code>NO_INDEX</code> is used to indicate that
+an index value is absent.</p>
+
+<p><b>Note:</b> This value isn't defined to be
+<code>0</code>, because that is in fact typically a valid index.</p>
+
+<p><b>Also Note:</b> The chosen value for <code>NO_INDEX</code> is
+representable as a single byte in the <code>uleb128p1</code> encoding.</p>
+
+<pre>
+uint NO_INDEX = 0xffffffff; // == -1 if treated as a signed int
+</pre>
+
+<h2><code>access_flags</code> Definitions</h2>
+<h4>embedded in <code>class_def_item</code>,
+<code>field_item</code>, <code>method_item</code>, and
+<code>InnerClass</code></h4>
+
+<p>Bitfields of these flags are used to indicate the accessibility and
+overall properties of classes and class members.</p>
+
+<table class="accessFlags">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Value</th>
+ <th>For Classes (and <code>InnerClass</code> annotations)</th>
+ <th>For Fields</th>
+ <th>For Methods</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>ACC_PUBLIC</td>
+ <td>0x1</td>
+ <td><code>public</code>: visible everywhere</td>
+ <td><code>public</code>: visible everywhere</td>
+ <td><code>public</code>: visible everywhere</td>
+</tr>
+<tr>
+ <td>ACC_PRIVATE</td>
+ <td>0x2</td>
+ <td><sup>*</sup>
+ <code>private</code>: only visible to defining class
+ </td>
+ <td><code>private</code>: only visible to defining class</td>
+ <td><code>private</code>: only visible to defining class</td>
+</tr>
+<tr>
+ <td>ACC_PROTECTED</td>
+ <td>0x4</td>
+ <td><sup>*</sup>
+ <code>protected</code>: visible to package and subclasses
+ </td>
+ <td><code>protected</code>: visible to package and subclasses</td>
+ <td><code>protected</code>: visible to package and subclasses</td>
+</tr>
+<tr>
+ <td>ACC_STATIC</td>
+ <td>0x8</td>
+ <td><sup>*</sup>
+ <code>static</code>: is not constructed with an outer
+ <code>this</code> reference</td>
+ <td><code>static</code>: global to defining class</td>
+ <td><code>static</code>: does not take a <code>this</code> argument</td>
+</tr>
+<tr>
+ <td>ACC_FINAL</td>
+ <td>0x10</td>
+ <td><code>final</code>: not subclassable</td>
+ <td><code>final</code>: immutable after construction</td>
+ <td><code>final</code>: not overridable</td>
+</tr>
+<tr>
+ <td>ACC_SYNCHRONIZED</td>
+ <td>0x20</td>
+ <td> </td>
+ <td> </td>
+ <td><code>synchronized</code>: associated lock automatically acquired
+ around call to this method. <b>Note:</b> This is only valid to set when
+ <code>ACC_NATIVE</code> is also set.</td>
+</tr>
+<tr>
+ <td>ACC_VOLATILE</td>
+ <td>0x40</td>
+ <td> </td>
+ <td><code>volatile</code>: special access rules to help with thread
+ safety</td>
+ <td> </td>
+</tr>
+<tr>
+ <td>ACC_BRIDGE</td>
+ <td>0x40</td>
+ <td> </td>
+ <td> </td>
+ <td>bridge method, added automatically by compiler as a type-safe
+ bridge</td>
+</tr>
+<tr>
+ <td>ACC_TRANSIENT</td>
+ <td>0x80</td>
+ <td> </td>
+ <td><code>transient</code>: not to be saved by default serialization</td>
+ <td> </td>
+</tr>
+<tr>
+ <td>ACC_VARARGS</td>
+ <td>0x80</td>
+ <td> </td>
+ <td> </td>
+ <td>last argument should be treated as a "rest" argument by compiler</td>
+</tr>
+<tr>
+ <td>ACC_NATIVE</td>
+ <td>0x100</td>
+ <td> </td>
+ <td> </td>
+ <td><code>native</code>: implemented in native code</td>
+</tr>
+<tr>
+ <td>ACC_INTERFACE</td>
+ <td>0x200</td>
+ <td><code>interface</code>: multiply-implementable abstract class</td>
+ <td> </td>
+ <td> </td>
+</tr>
+<tr>
+ <td>ACC_ABSTRACT</td>
+ <td>0x400</td>
+ <td><code>abstract</code>: not directly instantiable</td>
+ <td> </td>
+ <td><code>abstract</code>: unimplemented by this class</td>
+</tr>
+<tr>
+ <td>ACC_STRICT</td>
+ <td>0x800</td>
+ <td> </td>
+ <td> </td>
+ <td><code>strictfp</code>: strict rules for floating-point arithmetic</td>
+</tr>
+<tr>
+ <td>ACC_SYNTHETIC</td>
+ <td>0x1000</td>
+ <td>not directly defined in source code</td>
+ <td>not directly defined in source code</td>
+ <td>not directly defined in source code</td>
+</tr>
+<tr>
+ <td>ACC_ANNOTATION</td>
+ <td>0x2000</td>
+ <td>declared as an annotation class</td>
+ <td> </td>
+ <td> </td>
+</tr>
+<tr>
+ <td>ACC_ENUM</td>
+ <td>0x4000</td>
+ <td>declared as an enumerated type</td>
+ <td>declared as an enumerated value</td>
+ <td> </td>
+</tr>
+<tr>
+ <td><i>(unused)</i></td>
+ <td>0x8000</td>
+ <td> </td>
+ <td> </td>
+ <td> </td>
+</tr>
+<tr>
+ <td>ACC_CONSTRUCTOR</td>
+ <td>0x10000</td>
+ <td> </td>
+ <td> </td>
+ <td>constructor method (class or instance initializer)</td>
+</tr>
+<tr>
+ <td>ACC_DECLARED_<br/>SYNCHRONIZED</td>
+ <td>0x20000</td>
+ <td> </td>
+ <td> </td>
+ <td>declared <code>synchronized</code>. <b>Note:</b> This has no effect on
+ execution (other than in reflection of this flag, per se).
+ </td>
+</tr>
+</tbody>
+</table>
+
+<p><sup>*</sup> Only allowed to be on for <code>InnerClass</code> annotations,
+and must not ever be on in a <code>class_def_item</code>.</p>
+
+<h2>MUTF-8 (Modified UTF-8) Encoding</h2>
+
+<p>As a concession to easier legacy support, the <code>.dex</code> format
+encodes its string data in a de facto standard modified UTF-8 form, hereafter
+referred to as MUTF-8. This form is identical to standard UTF-8, except:</p>
+
+<ul>
+ <li>Only the one-, two-, and three-byte encodings are used.</li>
+ <li>Code points in the range <code>U+10000</code> …
+ <code>U+10ffff</code> are encoded as a surrogate pair, each of
+ which is represented as a three-byte encoded value.</li>
+ <li>The code point <code>U+0000</code> is encoded in two-byte form.</li>
+ <li>A plain null byte (value <code>0</code>) indicates the end of
+ a string, as is the standard C language interpretation.</li>
+</ul>
+
+<p>The first two items above can be summarized as: MUTF-8
+is an encoding format for UTF-16, instead of being a more direct
+encoding format for Unicode characters.</p>
+
+<p>The final two items above make it simultaneously possible to include
+the code point <code>U+0000</code> in a string <i>and</i> still manipulate
+it as a C-style null-terminated string.</p>
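+
+<p>The four rules above can be sketched in a few lines of Python. This is
+only an illustration, not code from any <code>.dex</code> tool: the function
+name <code>encode_mutf8</code> is hypothetical, and the terminating
+<code>0</code> byte is deliberately left to the caller.</p>

```python
def encode_mutf8(text: str) -> bytes:
    """Encode a string as MUTF-8, without the terminating 0 byte."""
    out = bytearray()
    # Go through UTF-16 first, so that code points above U+ffff
    # arrive here as surrogate pairs (two 16-bit code units).
    utf16 = text.encode("utf-16-le", "surrogatepass")
    for i in range(0, len(utf16), 2):
        unit = utf16[i] | (utf16[i + 1] << 8)
        if unit != 0 and unit < 0x80:
            # One-byte form (never used for U+0000).
            out.append(unit)
        elif unit < 0x800:
            # Two-byte form; U+0000 lands here and becomes c0 80.
            out.append(0xC0 | (unit >> 6))
            out.append(0x80 | (unit & 0x3F))
        else:
            # Three-byte form, also used for each surrogate half.
            out.append(0xE0 | (unit >> 12))
            out.append(0x80 | ((unit >> 6) & 0x3F))
            out.append(0x80 | (unit & 0x3F))
    return bytes(out)
```

+<p>With this sketch, a string containing <code>U+0000</code> never produces
+a plain <code>0</code> byte, and a supplemental character such as
+<code>U+1f600</code> comes out as six bytes (two three-byte groups).</p>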
+
+<p>However, the special encoding of <code>U+0000</code> means that, unlike
+normal UTF-8, the result of calling the standard C function
+<code>strcmp()</code> on a pair of MUTF-8 strings does not always
+indicate the properly signed result of comparison of <i>unequal</i> strings.
+When ordering (not just equality) is a concern, the most straightforward
+way to compare MUTF-8 strings is to decode them character by character,
+and compare the decoded values. (However, more clever implementations are
+also possible.)</p>
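+
+<p>A minimal sketch of the "decode and compare" approach follows. The
+helper names are hypothetical, and the choice to compare UTF-16 code units
+(rather than full code points) is an assumption made for this example; the
+text above only requires that decoded values be compared.</p>

```python
def mutf8_units(data: bytes) -> list:
    """Decode MUTF-8 bytes (no terminator) into UTF-16 code units."""
    units, i = [], 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            units.append(b)
            i += 1
        elif (b & 0xE0) == 0xC0:
            units.append(((b & 0x1F) << 6) | (data[i + 1] & 0x3F))
            i += 2
        elif (b & 0xF0) == 0xE0:
            units.append(((b & 0x0F) << 12)
                         | ((data[i + 1] & 0x3F) << 6)
                         | (data[i + 2] & 0x3F))
            i += 3
        else:
            raise ValueError("invalid MUTF-8 lead byte")
    return units


def compare_mutf8(a: bytes, b: bytes) -> int:
    """Return -1, 0, or 1 by comparing decoded code units."""
    ua, ub = mutf8_units(a), mutf8_units(b)
    return (ua > ub) - (ua < ub)
```

+<p>For example, the encoded form of <code>U+0000</code> (bytes
+<code>c0 80</code>) sorts <i>after</i> the encoded form of
+<code>U+0001</code> under a raw byte comparison, but before it once
+decoded.</p>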
+
+<p>Please refer to <a href="http://unicode.org">The Unicode
+Standard</a> for further information about character encoding.
+MUTF-8 is actually closer to the (relatively less well-known) encoding
+<a href="http://www.unicode.org/reports/tr26/">CESU-8</a> than to UTF-8
+per se.</p>
+
+<h2><code>encoded_value</code> Encoding</h2>
+<h4>embedded in <code>annotation_element</code> and
+<code>encoded_array_item</code></h4>
+
+<p>An <code>encoded_value</code> is an encoded piece of (nearly)
+arbitrary hierarchically structured data. The encoding is meant to
+be both compact and straightforward to parse.</p>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>(value_arg << 5) | value_type</td>
+ <td>ubyte</td>
+ <td>byte indicating the type of the immediately subsequent
+ <code>value</code> along
+ with an optional clarifying argument in the high-order three bits.
+ See below for the various <code>value</code> definitions.
+ In most cases, <code>value_arg</code> encodes the length of
+ the immediately-subsequent <code>value</code> in bytes, as
+ <code>(size - 1)</code>, e.g., <code>0</code> means that
+ the value requires one byte, and <code>7</code> means it requires
+ eight bytes; however, there are exceptions as noted below.
+ </td>
+</tr>
+<tr>
+ <td>value</td>
+ <td>ubyte[]</td>
+ <td>bytes representing the value, variable in length and interpreted
+ differently for different <code>value_type</code> bytes, though
+ always little-endian. See the various value definitions below for
+ details.
+ </td>
+</tr>
+</tbody>
+</table>
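+
+<p>As a small illustration of the leading byte described above, the
+following Python sketch packs and unpacks the two fields. The function
+names are hypothetical and exist only for this example.</p>

```python
def pack_header(value_type: int, value_arg: int) -> int:
    """Combine value_type (low five bits) and value_arg (high three bits)."""
    if not 0 <= value_type <= 0x1F:
        raise ValueError("value_type must fit in five bits")
    if not 0 <= value_arg <= 7:
        raise ValueError("value_arg must fit in three bits")
    return (value_arg << 5) | value_type


def unpack_header(byte: int) -> tuple:
    """Split a header byte into (value_type, value_arg)."""
    return byte & 0x1F, byte >> 5
```

+<p>For instance, a <code>value_type</code> of <code>0x04</code> with a
+<code>value_arg</code> of <code>3</code> (a four-byte value) packs to
+<code>0x64</code>.</p>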
+
+<h3>Value Formats</h3>
+
+<table class="encodedValue">
+<thead>
+<tr>
+ <th>Type Name</th>
+ <th><code>value_type</code></th>
+ <th><code>value_arg</code> Format</th>
+ <th><code>value</code> Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>VALUE_BYTE</td>
+ <td>0x00</td>
+ <td><i>(none; must be <code>0</code>)</i></td>
+ <td>ubyte[1]</td>
+ <td>signed one-byte integer value</td>
+</tr>
+<tr>
+ <td>VALUE_SHORT</td>
+ <td>0x02</td>
+ <td>size - 1 (0…1)</td>
+ <td>ubyte[size]</td>
+ <td>signed two-byte integer value, sign-extended</td>
+</tr>
+<tr>
+ <td>VALUE_CHAR</td>
+ <td>0x03</td>
+ <td>size - 1 (0…1)</td>
+ <td>ubyte[size]</td>
+ <td>unsigned two-byte integer value, zero-extended</td>
+</tr>
+<tr>
+ <td>VALUE_INT</td>
+ <td>0x04</td>
+ <td>size - 1 (0…3)</td>
+ <td>ubyte[size]</td>
+ <td>signed four-byte integer value, sign-extended</td>
+</tr>
+<tr>
+ <td>VALUE_LONG</td>
+ <td>0x06</td>
+ <td>size - 1 (0…7)</td>
+ <td>ubyte[size]</td>
+ <td>signed eight-byte integer value, sign-extended</td>
+</tr>
+<tr>
+ <td>VALUE_FLOAT</td>
+ <td>0x10</td>
+ <td>size - 1 (0…3)</td>
+ <td>ubyte[size]</td>
+ <td>four-byte bit pattern, zero-extended <i>to the right</i>, and
+ interpreted as an IEEE754 32-bit floating point value
+ </td>
+</tr>
+<tr>
+ <td>VALUE_DOUBLE</td>
+ <td>0x11</td>
+ <td>size - 1 (0…7)</td>
+ <td>ubyte[size]</td>
+ <td>eight-byte bit pattern, zero-extended <i>to the right</i>, and
+ interpreted as an IEEE754 64-bit floating point value
+ </td>
+</tr>
+<tr>
+ <td>VALUE_STRING</td>
+ <td>0x17</td>
+ <td>size - 1 (0…3)</td>
+ <td>ubyte[size]</td>
+ <td>unsigned (zero-extended) four-byte integer value,
+ interpreted as an index into
+ the <code>string_ids</code> section and representing a string value
+ </td>
+</tr>
+<tr>
+ <td>VALUE_TYPE</td>
+ <td>0x18</td>
+ <td>size - 1 (0…3)</td>
+ <td>ubyte[size]</td>
+ <td>unsigned (zero-extended) four-byte integer value,
+ interpreted as an index into
+ the <code>type_ids</code> section and representing a reflective
+ type/class value
+ </td>
+</tr>
+<tr>
+ <td>VALUE_FIELD</td>
+ <td>0x19</td>
+ <td>size - 1 (0…3)</td>
+ <td>ubyte[size]</td>
+ <td>unsigned (zero-extended) four-byte integer value,
+ interpreted as an index into
+ the <code>field_ids</code> section and representing a reflective
+ field value
+ </td>
+</tr>
+<tr>
+ <td>VALUE_METHOD</td>
+ <td>0x1a</td>
+ <td>size - 1 (0…3)</td>
+ <td>ubyte[size]</td>
+ <td>unsigned (zero-extended) four-byte integer value,
+ interpreted as an index into
+ the <code>method_ids</code> section and representing a reflective
+ method value
+ </td>
+</tr>
+<tr>
+ <td>VALUE_ENUM</td>
+ <td>0x1b</td>
+ <td>size - 1 (0…3)</td>
+ <td>ubyte[size]</td>
+ <td>unsigned (zero-extended) four-byte integer value,
+ interpreted as an index into
+ the <code>field_ids</code> section and representing the value of
+ an enumerated type constant
+ </td>
+</tr>
+<tr>
+ <td>VALUE_ARRAY</td>
+ <td>0x1c</td>
+ <td><i>(none; must be <code>0</code>)</i></td>
+ <td>encoded_array</td>
+ <td>an array of values, in the format specified by
+ "<code>encoded_array</code> Format" below. The size
+ of the <code>value</code> is implicit in the encoding.
+ </td>
+</tr>
+<tr>
+ <td>VALUE_ANNOTATION</td>
+ <td>0x1d</td>
+ <td><i>(none; must be <code>0</code>)</i></td>
+ <td>encoded_annotation</td>
+ <td>a sub-annotation, in the format specified by
+ "<code>encoded_annotation</code> Format" below. The size
+ of the <code>value</code> is implicit in the encoding.
+ </td>
+</tr>
+<tr>
+ <td>VALUE_NULL</td>
+ <td>0x1e</td>
+ <td><i>(none; must be <code>0</code>)</i></td>
+ <td><i>(none)</i></td>
+ <td><code>null</code> reference value</td>
+</tr>
+<tr>
+ <td>VALUE_BOOLEAN</td>
+ <td>0x1f</td>
+ <td>boolean (0…1)</td>
+ <td><i>(none)</i></td>
+ <td>one-bit value; <code>0</code> for <code>false</code> and
+ <code>1</code> for <code>true</code>. The bit is represented in the
+ <code>value_arg</code>.
+ </td>
+</tr>
+</tbody>
+</table>
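+
+<p>The sketch below shows how three of the formats in the table might be
+produced: a sign-extended <code>VALUE_INT</code>, a
+<code>VALUE_FLOAT</code> that is zero-extended to the right, and a
+<code>VALUE_BOOLEAN</code> carried entirely in <code>value_arg</code>. The
+function names are hypothetical, and choosing the shortest possible
+encoding is an assumption of this example; the table only states which
+sizes are permitted.</p>

```python
import struct

VALUE_INT = 0x04
VALUE_FLOAT = 0x10
VALUE_BOOLEAN = 0x1F


def encode_int(value: int) -> bytes:
    """VALUE_INT: drop high-order bytes that sign extension restores."""
    raw = struct.pack("<i", value)  # little-endian, low byte first
    size = 4
    while size > 1:
        # The top byte is redundant if it equals the sign fill
        # implied by the byte just below it.
        sign_fill = 0xFF if raw[size - 2] & 0x80 else 0x00
        if raw[size - 1] != sign_fill:
            break
        size -= 1
    return bytes([((size - 1) << 5) | VALUE_INT]) + raw[:size]


def encode_float(value: float) -> bytes:
    """VALUE_FLOAT: drop low-order zero bytes (zero-extended to the right)."""
    raw = struct.pack("<f", value)
    start = 0
    while start < 3 and raw[start] == 0:
        start += 1
    payload = raw[start:]
    return bytes([((len(payload) - 1) << 5) | VALUE_FLOAT]) + payload


def encode_boolean(flag: bool) -> bytes:
    """VALUE_BOOLEAN: the bit lives in value_arg; no value bytes follow."""
    return bytes([(int(bool(flag)) << 5) | VALUE_BOOLEAN])
```

+<p>Under these assumptions the integer <code>128</code> needs two bytes
+(a single byte <code>0x80</code> would sign-extend to <code>-128</code>),
+while <code>1.0f</code> (bit pattern <code>0x3f800000</code>) needs only its
+two high-order bytes.</p>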
+
+<h3><code>encoded_array</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>size</td>
+ <td>uleb128</td>
+ <td>number of elements in the array</td>
+</tr>
+<tr>
+ <td>values</td>
+ <td>encoded_value[size]</td>
+ <td>a series of <code>size</code> <code>encoded_value</code> byte
+ sequences in the format specified by this section, concatenated
+ sequentially.
+ </td>
+</tr>
+</tbody>
+</table>
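+
+<p>A sketch of assembling an <code>encoded_array</code> from elements that
+have already been encoded as <code>encoded_value</code> byte sequences. Both
+function names are hypothetical; the <code>uleb128</code> helper follows the
+usual unsigned LEB128 scheme (seven payload bits per byte, high bit set on
+every byte except the last).</p>

```python
def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("uleb128 cannot represent negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encoded_array(values: list) -> bytes:
    """Prefix already-encoded values with their count."""
    return uleb128(len(values)) + b"".join(values)
```

+<p>Because each element carries its own length information, no per-element
+size or offset table is needed.</p>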
+
+<h3><code>encoded_annotation</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>type_idx</td>
+ <td>uleb128</td>
+ <td>type of the annotation. This must be a class (not array or primitive)
+ type.
+ </td>
+</tr>
+<tr>
+ <td>size</td>
+ <td>uleb128</td>
+ <td>number of name-value mappings in this annotation</td>
+</tr>
+<tr>
+ <td>elements</td>
+ <td>annotation_element[size]</td>
+ <td>elements of the annotation, represented directly in-line (not as
+ offsets). Elements must be sorted in increasing order by
+ <code>string_id</code> index.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h3><code>annotation_element</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>name_idx</td>
+ <td>uleb128</td>
+ <td>element name, represented as an index into the
+ <code>string_ids</code> section. The string must conform to the
+ syntax for <i>MemberName</i>, defined below.
+ </td>
+</tr>
+<tr>
+ <td>value</td>
+ <td>encoded_value</td>
+ <td>element value</td>
+</tr>
+</tbody>
+</table>
+
+<h2>String Syntax</h2>
+
+<p>There are several kinds of item in a <code>.dex</code> file which
+ultimately refer to a string. The following BNF-style definitions
+indicate the acceptable syntax for these strings.</p>
+
+<h3><i>SimpleName</i></h3>
+
+<p>A <i>SimpleName</i> is the basis for the syntax of the names of other
+things. The <code>.dex</code> format allows a fair amount of latitude
+here (much more than most common source languages). In brief, a simple
+name may consist of any low-ASCII alphabetic character or digit, a few
+specific low-ASCII symbols, and most non-ASCII code points that are not
+control, space, or special characters. Note that surrogate code points
+(in the range <code>U+d800</code> … <code>U+dfff</code>) are not
+considered valid name characters, per se, but Unicode supplemental
+characters <i>are</i> valid (which are represented by the final
+alternative of the rule for <i>SimpleNameChar</i>), and they should be
+represented in a file as pairs of surrogate code points in the MUTF-8
+encoding.</p>
+
+<table class="bnf">
+ <tr><td colspan="2" class="def"><i>SimpleName</i> →</td></tr>
+ <tr>
+ <td/>
+ <td><i>SimpleNameChar</i> (<i>SimpleNameChar</i>)*</td>
+ </tr>
+
+ <tr><td colspan="2" class="def"><i>SimpleNameChar</i> →</td></tr>
+ <tr>
+ <td/>
+ <td><code>'A'</code> … <code>'Z'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'a'</code> … <code>'z'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'0'</code> … <code>'9'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'$'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'-'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'_'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>U+00a1</code> … <code>U+1fff</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>U+2010</code> … <code>U+2027</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>U+2030</code> … <code>U+d7ff</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>U+e000</code> … <code>U+ffef</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>U+10000</code> … <code>U+10ffff</code></td>
+ </tr>
+</table>
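+
+<p>The grammar above translates fairly directly into a validity check. The
+following Python sketch is illustrative only; the names
+<code>is_simple_name_char</code> and <code>is_simple_name</code> are
+hypothetical, and the check operates on decoded code points rather than on
+MUTF-8 bytes.</p>

```python
# Non-ASCII ranges accepted by the SimpleNameChar rule.
_RANGES = [
    (0x00A1, 0x1FFF),
    (0x2010, 0x2027),
    (0x2030, 0xD7FF),
    (0xE000, 0xFFEF),
    (0x10000, 0x10FFFF),
]


def is_simple_name_char(ch: str) -> bool:
    """Check a single code point against SimpleNameChar."""
    if ch.isascii():
        return ch.isalnum() or ch in "$-_"
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in _RANGES)


def is_simple_name(name: str) -> bool:
    """A SimpleName is one or more SimpleNameChar code points."""
    return len(name) > 0 and all(is_simple_name_char(c) for c in name)
```

+<p>Note that a lone surrogate such as <code>U+d800</code> is rejected,
+whereas a supplemental character is accepted as a single code point.</p>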
+
+<h3><i>MemberName</i></h3>
+<h4>used by <code>field_id_item</code> and <code>method_id_item</code></h4>
+
+<p>A <i>MemberName</i> is the name of a member of a class, members being
+fields, methods, and inner classes.</p>
+
+<table class="bnf">
+ <tr><td colspan="2" class="def"><i>MemberName</i> →</td></tr>
+ <tr>
+ <td/>
+ <td><i>SimpleName</i></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'<'</code> <i>SimpleName</i> <code>'>'</code></td>
+ </tr>
+</table>
+
+<h3><i>FullClassName</i></h3>
+
+<p>A <i>FullClassName</i> is a fully-qualified class name, including an
+optional package specifier followed by a required name.</p>
+
+<table class="bnf">
+ <tr><td colspan="2" class="def"><i>FullClassName</i> →</td></tr>
+ <tr>
+ <td/>
+ <td><i>OptionalPackagePrefix</i> <i>SimpleName</i></td>
+ </tr>
+
+ <tr><td colspan="2" class="def"><i>OptionalPackagePrefix</i> →</td></tr>
+ <tr>
+ <td/>
+ <td>(<i>SimpleName</i> <code>'/'</code>)*</td>
+ </tr>
+</table>
+
+<h3><i>TypeDescriptor</i></h3>
+<h4>used by <code>type_id_item</code></h4>
+
+<p>A <i>TypeDescriptor</i> is the representation of any type, including
+primitives, classes, arrays, and <code>void</code>. See below for
+the meaning of the various versions.</p>
+
+<table class="bnf">
+ <tr><td colspan="2" class="def"><i>TypeDescriptor</i> →</td></tr>
+ <tr>
+ <td/>
+ <td><code>'V'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><i>FieldTypeDescriptor</i></td>
+ </tr>
+
+ <tr><td colspan="2" class="def"><i>FieldTypeDescriptor</i> →</td></tr>
+ <tr>
+ <td/>
+ <td><i>NonArrayFieldTypeDescriptor</i></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td>(<code>'['</code> * 1…255)
+ <i>NonArrayFieldTypeDescriptor</i></td>
+ </tr>
+
+ <tr>
+ <td colspan="2" class="def"><i>NonArrayFieldTypeDescriptor</i> →</td>
+ </tr>
+ <tr>
+ <td/>
+ <td><code>'Z'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'B'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'S'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'C'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'I'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'J'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'F'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'D'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'L'</code> <i>FullClassName</i> <code>';'</code></td>
+ </tr>
+</table>
+
+<h3><i>ShortyDescriptor</i></h3>
+<h4>used by <code>proto_id_item</code></h4>
+
+<p>A <i>ShortyDescriptor</i> is the short form representation of a method
+prototype, including return and parameter types, except that there is
+no distinction between various reference (class or array) types. Instead,
+all reference types are represented by a single <code>'L'</code> character.</p>
+
+<table class="bnf">
+ <tr><td colspan="2" class="def"><i>ShortyDescriptor</i> →</td></tr>
+ <tr>
+ <td/>
+ <td><i>ShortyReturnType</i> (<i>ShortyFieldType</i>)*</td>
+ </tr>
+
+ <tr><td colspan="2" class="def"><i>ShortyReturnType</i> →</td></tr>
+ <tr>
+ <td/>
+ <td><code>'V'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><i>ShortyFieldType</i></td>
+ </tr>
+
+ <tr><td colspan="2" class="def"><i>ShortyFieldType</i> →</td></tr>
+ <tr>
+ <td/>
+ <td><code>'Z'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'B'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'S'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'C'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'I'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'J'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'F'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'D'</code></td>
+ </tr>
+ <tr>
+ <td class="bar">|</td>
+ <td><code>'L'</code></td>
+ </tr>
+</table>
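+
+<p>Deriving a <i>ShortyDescriptor</i> from full type descriptors is
+mechanical, as the following sketch suggests. The function name
+<code>shorty</code> is hypothetical, and the inputs are assumed to be valid
+<i>TypeDescriptor</i> strings already.</p>

```python
def shorty(return_type: str, parameter_types: list) -> str:
    """Build a shorty string from a return type and parameter types."""

    def short(descriptor: str) -> str:
        # Every class ('L...;') or array ('[...') type collapses to 'L'.
        return "L" if descriptor[0] in "L[" else descriptor

    if "V" in parameter_types:
        raise ValueError("void is only valid as a return type")
    return short(return_type) + "".join(short(p) for p in parameter_types)
```

+<p>For example, a method returning <code>void</code> and taking an
+<code>int</code>, a <code>String</code>, and a <code>byte[]</code> has the
+shorty <code>VILL</code>.</p>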
+
+<h2><i>TypeDescriptor</i> Semantics</h2>
+
+<p>This is the meaning of each of the variants of <i>TypeDescriptor</i>.</p>
+
+<table class="descriptor">
+<thead>
+<tr>
+ <th>Syntax</th>
+ <th>Meaning</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>V</td>
+ <td><code>void</code>; only valid for return types</td>
+</tr>
+<tr>
+ <td>Z</td>
+ <td><code>boolean</code></td>
+</tr>
+<tr>
+ <td>B</td>
+ <td><code>byte</code></td>
+</tr>
+<tr>
+ <td>S</td>
+ <td><code>short</code></td>
+</tr>
+<tr>
+ <td>C</td>
+ <td><code>char</code></td>
+</tr>
+<tr>
+ <td>I</td>
+ <td><code>int</code></td>
+</tr>
+<tr>
+ <td>J</td>
+ <td><code>long</code></td>
+</tr>
+<tr>
+ <td>F</td>
+ <td><code>float</code></td>
+</tr>
+<tr>
+ <td>D</td>
+ <td><code>double</code></td>
+</tr>
+<tr>
+ <td>L<i>fully/qualified/Name</i>;</td>
+ <td>the class <code><i>fully.qualified.Name</i></code></td>
+</tr>
+<tr>
+ <td>[<i>descriptor</i></td>
+ <td>array of <code><i>descriptor</i></code>, usable recursively for
+ arrays-of-arrays, though it is invalid to have more than 255
+ dimensions.
+ </td>
+</tr>
+</tbody>
+</table>
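+
+<p>The table above can be turned into a small pretty-printer. This Python
+sketch is illustrative: the name <code>describe</code> is hypothetical, the
+Java-style <code>[]</code> suffix is a presentation choice made for this
+example, and the class name is not checked against the
+<i>FullClassName</i> syntax.</p>

```python
PRIMITIVES = {
    "V": "void", "Z": "boolean", "B": "byte", "S": "short", "C": "char",
    "I": "int", "J": "long", "F": "float", "D": "double",
}


def describe(descriptor: str) -> str:
    """Render a TypeDescriptor in a source-like form."""
    # Count leading '[' characters; each one adds an array dimension.
    dims = len(descriptor) - len(descriptor.lstrip("["))
    if dims > 255:
        raise ValueError("more than 255 array dimensions")
    base = descriptor[dims:]
    if base in PRIMITIVES:
        if base == "V" and dims:
            raise ValueError("arrays of void are not valid")
        name = PRIMITIVES[base]
    elif len(base) > 2 and base.startswith("L") and base.endswith(";"):
        name = base[1:-1].replace("/", ".")
    else:
        raise ValueError("not a valid type descriptor: %r" % descriptor)
    return name + "[]" * dims
```

+<p>For example, <code>[[Ljava/lang/String;</code> is rendered as
+<code>java.lang.String[][]</code>.</p>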
+
+<h1>Items and Related Structures</h1>
+
+<p>This section includes definitions for each of the top-level items that
+may appear in a <code>.dex</code> file.</p>
+
+<h2><code>header_item</code></h2>
+<h4>appears in the <code>header</code> section</h4>
+<h4>alignment: 4 bytes</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>magic</td>
+ <td>ubyte[8] = DEX_FILE_MAGIC</td>
+ <td>magic value. See discussion above under "<code>DEX_FILE_MAGIC</code>"
+ for more details.
+ </td>
+</tr>
+<tr>
+ <td>checksum</td>
+ <td>uint</td>
+ <td>adler32 checksum of the rest of the file (everything but
+ <code>magic</code> and this field); used to detect file corruption
+ </td>
+</tr>
+<tr>
+ <td>signature</td>
+ <td>ubyte[20]</td>
+ <td>SHA-1 signature (hash) of the rest of the file (everything but
+ <code>magic</code>, <code>checksum</code>, and this field); used
+ to uniquely identify files
+ </td>
+</tr>
+<tr>
+ <td>file_size</td>
+ <td>uint</td>
+ <td>size of the entire file (including the header), in bytes</td>
+</tr>
+<tr>
+ <td>header_size</td>
+ <td>uint = 0x70</td>
+ <td>size of the header (this entire section), in bytes. This allows for at
+ least a limited amount of backwards/forwards compatibility without
+ invalidating the format.
+ </td>
+</tr>
+<tr>
+ <td>endian_tag</td>
+ <td>uint = ENDIAN_CONSTANT</td>
+ <td>endianness tag. See discussion above under "<code>ENDIAN_CONSTANT</code>
+ and <code>REVERSE_ENDIAN_CONSTANT</code>" for more details.
+ </td>
+</tr>
+<tr>
+ <td>link_size</td>
+ <td>uint</td>
+ <td>size of the link section, or <code>0</code> if this file isn't
+ statically linked</td>
+</tr>
+<tr>
+ <td>link_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the link section, or
+ <code>0</code> if <code>link_size == 0</code>. The offset, if non-zero,
+ should point into the <code>link_data</code> section. The
+ format of the data pointed at is left unspecified by this document;
+ this header field (and the previous) are left as hooks for use by
+ runtime implementations.
+ </td>
+</tr>
+<tr>
+ <td>map_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the map item, or
+ <code>0</code> if this file has no map. The offset, if non-zero,
+ should point into the <code>data</code> section,
+ and the data should be in the format specified by "<code>map_list</code>"
+ below.
+ </td>
+</tr>
+<tr>
+ <td>string_ids_size</td>
+ <td>uint</td>
+ <td>count of strings in the string identifiers list</td>
+</tr>
+<tr>
+ <td>string_ids_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the string identifiers list, or
+ <code>0</code> if <code>string_ids_size == 0</code> (admittedly a
+ strange edge case). The offset, if non-zero,
+ should be to the start of the <code>string_ids</code> section.
+ </td>
+</tr>
+<tr>
+ <td>type_ids_size</td>
+ <td>uint</td>
+ <td>count of elements in the type identifiers list</td>
+</tr>
+<tr>
+ <td>type_ids_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the type identifiers list, or
+ <code>0</code> if <code>type_ids_size == 0</code> (admittedly a
+ strange edge case). The offset, if non-zero,
+ should be to the start of the <code>type_ids</code>
+ section.
+ </td>
+</tr>
+<tr>
+ <td>proto_ids_size</td>
+ <td>uint</td>
+ <td>count of elements in the prototype identifiers list</td>
+</tr>
+<tr>
+ <td>proto_ids_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the prototype identifiers list, or
+ <code>0</code> if <code>proto_ids_size == 0</code> (admittedly a
+ strange edge case). The offset, if non-zero,
+ should be to the start of the <code>proto_ids</code>
+ section.
+ </td>
+</tr>
+<tr>
+ <td>field_ids_size</td>
+ <td>uint</td>
+ <td>count of elements in the field identifiers list</td>
+</tr>
+<tr>
+ <td>field_ids_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the field identifiers list, or
+ <code>0</code> if <code>field_ids_size == 0</code>. The offset, if
+ non-zero, should be to the start of the <code>field_ids</code>
+ section.</td>
+</tr>
+<tr>
+ <td>method_ids_size</td>
+ <td>uint</td>
+ <td>count of elements in the method identifiers list</td>
+</tr>
+<tr>
+ <td>method_ids_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the method identifiers list, or
+ <code>0</code> if <code>method_ids_size == 0</code>. The offset, if
+ non-zero, should be to the start of the <code>method_ids</code>
+ section.</td>
+</tr>
+<tr>
+ <td>class_defs_size</td>
+ <td>uint</td>
+ <td>count of elements in the class definitions list</td>
+</tr>
+<tr>
+ <td>class_defs_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the class definitions list, or
+ <code>0</code> if <code>class_defs_size == 0</code> (admittedly a
+ strange edge case). The offset, if non-zero,
+ should be to the start of the <code>class_defs</code> section.
+ </td>
+</tr>
+<tr>
+ <td>data_size</td>
+ <td>uint</td>
+ <td>size of the <code>data</code> section, in bytes. Must be a whole
+ multiple of sizeof(uint).</td>
+</tr>
+<tr>
+ <td>data_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the start of the
+ <code>data</code> section.
+ </td>
+</tr>
+</tbody>
+</table>
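+
+<p>As a rough illustration of the layout above, the following Python sketch
+reads the header fields and re-checks the checksum and signature. It is not
+taken from any runtime: the names <code>parse_header</code> and
+<code>verify_header</code> are hypothetical, and the sketch assumes a
+little-endian file (it does not inspect <code>endian_tag</code>).</p>

```python
import hashlib
import struct
import zlib

HEADER_SIZE = 0x70

# The twenty uint fields that follow magic, checksum, and signature.
UINT_FIELDS = (
    "file_size", "header_size", "endian_tag", "link_size", "link_off",
    "map_off", "string_ids_size", "string_ids_off", "type_ids_size",
    "type_ids_off", "proto_ids_size", "proto_ids_off", "field_ids_size",
    "field_ids_off", "method_ids_size", "method_ids_off",
    "class_defs_size", "class_defs_off", "data_size", "data_off",
)


def parse_header(data: bytes) -> dict:
    """Split the first 0x70 bytes of a file into named fields."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file is shorter than a header_item")
    header = {
        "magic": data[0:8],
        "checksum": struct.unpack_from("<I", data, 8)[0],
        "signature": data[12:32],
    }
    header.update(zip(UINT_FIELDS, struct.unpack_from("<20I", data, 32)))
    return header


def verify_header(data: bytes) -> bool:
    """Re-compute the checksum and signature and compare."""
    header = parse_header(data)
    return (
        # adler32 covers everything after magic and checksum.
        header["checksum"] == zlib.adler32(data[12:])
        # SHA-1 covers everything after magic, checksum, and signature.
        and header["signature"] == hashlib.sha1(data[32:]).digest()
        and header["file_size"] == len(data)
        and header["header_size"] == HEADER_SIZE
    )
```

+<p>Note the order implied by the two definitions: a writer has to compute
+the signature first, because the checksum covers the signature bytes.</p>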
+
+<h2><code>map_list</code></h2>
+<h4>appears in the <code>data</code> section</h4>
+<h4>referenced from <code>header_item</code></h4>
+<h4>alignment: 4 bytes</h4>
+
+<p>This is a list of the entire contents of a file, in order. It
+contains some redundancy with respect to the <code>header_item</code>
+but is intended to be an easy form to use to iterate over an entire
+file. A given type may appear at most once in a map, but there is no
+restriction on what order types may appear in, other than the
+restrictions implied by the rest of the format (e.g., a
+<code>header</code> section must appear first, followed by a
+<code>string_ids</code> section, etc.). Additionally, the map entries must
+be ordered by initial offset and must not overlap.</p>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>size</td>
+ <td>uint</td>
+ <td>size of the list, in entries</td>
+</tr>
+<tr>
+ <td>list</td>
+ <td>map_item[size]</td>
+ <td>elements of the list</td>
+</tr>
+</tbody>
+</table>
+
+<h3><code>map_item</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>type</td>
+ <td>ushort</td>
+ <td>type of the items; see table below</td>
+</tr>
+<tr>
+ <td>unused</td>
+ <td>ushort</td>
+ <td><i>(unused)</i></td>
+</tr>
+<tr>
+ <td>size</td>
+ <td>uint</td>
+ <td>number of items to be found at the indicated offset</td>
+</tr>
+<tr>
+ <td>offset</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the items in question</td>
+</tr>
+</tbody>
+</table>
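+
+<p>Since each <code>map_item</code> is a fixed twelve bytes, walking a
+<code>map_list</code> is straightforward. The sketch below is illustrative
+only: <code>parse_map_list</code> is a hypothetical name, a little-endian
+file is assumed, and of the constraints stated above it checks only the
+offset ordering and the one-entry-per-type rule (not overlap).</p>

```python
import struct


def parse_map_list(data: bytes, map_off: int) -> list:
    """Return a list of (type, size, offset) tuples from a map_list."""
    (count,) = struct.unpack_from("<I", data, map_off)
    items = []
    for i in range(count):
        # type (ushort), unused (ushort), size (uint), offset (uint)
        type_code, _unused, size, offset = struct.unpack_from(
            "<HHII", data, map_off + 4 + i * 12
        )
        items.append((type_code, size, offset))

    offsets = [offset for _, _, offset in items]
    if offsets != sorted(offsets):
        raise ValueError("map entries are not ordered by offset")
    types = [type_code for type_code, _, _ in items]
    if len(set(types)) != len(types):
        raise ValueError("a type appears more than once in the map")
    return items
```
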
+
+
+<h3>Type Codes</h3>
+
+<table class="typeCodes">
+<thead>
+<tr>
+ <th>Item Type</th>
+ <th>Constant</th>
+ <th>Value</th>
+ <th>Item Size In Bytes</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>header_item</td>
+ <td>TYPE_HEADER_ITEM</td>
+ <td>0x0000</td>
+ <td>0x70</td>
+</tr>
+<tr>
+ <td>string_id_item</td>
+ <td>TYPE_STRING_ID_ITEM</td>
+ <td>0x0001</td>
+ <td>0x04</td>
+</tr>
+<tr>
+ <td>type_id_item</td>
+ <td>TYPE_TYPE_ID_ITEM</td>
+ <td>0x0002</td>
+ <td>0x04</td>
+</tr>
+<tr>
+ <td>proto_id_item</td>
+ <td>TYPE_PROTO_ID_ITEM</td>
+ <td>0x0003</td>
+ <td>0x0c</td>
+</tr>
+<tr>
+ <td>field_id_item</td>
+ <td>TYPE_FIELD_ID_ITEM</td>
+ <td>0x0004</td>
+ <td>0x08</td>
+</tr>
+<tr>
+ <td>method_id_item</td>
+ <td>TYPE_METHOD_ID_ITEM</td>
+ <td>0x0005</td>
+ <td>0x08</td>
+</tr>
+<tr>
+ <td>class_def_item</td>
+ <td>TYPE_CLASS_DEF_ITEM</td>
+ <td>0x0006</td>
+ <td>0x20</td>
+</tr>
+<tr>
+ <td>map_list</td>
+ <td>TYPE_MAP_LIST</td>
+ <td>0x1000</td>
+ <td>4 + (item.size * 12)</td>
+</tr>
+<tr>
+ <td>type_list</td>
+ <td>TYPE_TYPE_LIST</td>
+ <td>0x1001</td>
+ <td>4 + (item.size * 2)</td>
+</tr>
+<tr>
+ <td>annotation_set_ref_list</td>
+ <td>TYPE_ANNOTATION_SET_REF_LIST</td>
+ <td>0x1002</td>
+ <td>4 + (item.size * 4)</td>
+</tr>
+<tr>
+ <td>annotation_set_item</td>
+ <td>TYPE_ANNOTATION_SET_ITEM</td>
+ <td>0x1003</td>
+ <td>4 + (item.size * 4)</td>
+</tr>
+<tr>
+ <td>class_data_item</td>
+ <td>TYPE_CLASS_DATA_ITEM</td>
+ <td>0x2000</td>
+ <td><i>implicit; must parse</i></td>
+</tr>
+<tr>
+ <td>code_item</td>
+ <td>TYPE_CODE_ITEM</td>
+ <td>0x2001</td>
+ <td><i>implicit; must parse</i></td>
+</tr>
+<tr>
+ <td>string_data_item</td>
+ <td>TYPE_STRING_DATA_ITEM</td>
+ <td>0x2002</td>
+ <td><i>implicit; must parse</i></td>
+</tr>
+<tr>
+ <td>debug_info_item</td>
+ <td>TYPE_DEBUG_INFO_ITEM</td>
+ <td>0x2003</td>
+ <td><i>implicit; must parse</i></td>
+</tr>
+<tr>
+ <td>annotation_item</td>
+ <td>TYPE_ANNOTATION_ITEM</td>
+ <td>0x2004</td>
+ <td><i>implicit; must parse</i></td>
+</tr>
+<tr>
+ <td>encoded_array_item</td>
+ <td>TYPE_ENCODED_ARRAY_ITEM</td>
+ <td>0x2005</td>
+ <td><i>implicit; must parse</i></td>
+</tr>
+<tr>
+ <td>annotations_directory_item</td>
+ <td>TYPE_ANNOTATIONS_DIRECTORY_ITEM</td>
+ <td>0x2006</td>
+ <td><i>implicit; must parse</i></td>
+</tr>
+</tbody>
+</table>
+
+
+<h2><code>string_id_item</code></h2>
+<h4>appears in the <code>string_ids</code> section</h4>
+<h4>alignment: 4 bytes</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>string_data_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the string data for this
+ item. The offset should be to a location
+ in the <code>data</code> section, and the data should be in the
+ format specified by "<code>string_data_item</code>" below.
+ There is no alignment requirement for the offset.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h2><code>string_data_item</code></h2>
+<h4>appears in the <code>data</code> section</h4>
+<h4>alignment: none (byte-aligned)</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>utf16_size</td>
+ <td>uleb128</td>
+ <td>size of this string, in UTF-16 code units (which is the "string
+ length" in many systems). That is, this is the decoded length of
+ the string. (The encoded length is implied by the position of
+ the <code>0</code> byte.)</td>
+</tr>
+<tr>
+ <td>data</td>
+ <td>ubyte[]</td>
+ <td>a series of MUTF-8 code units (a.k.a. octets, a.k.a. bytes)
+ followed by a byte of value <code>0</code>. See
+ "MUTF-8 (Modified UTF-8) Encoding" above for details and
+ discussion about the data format.
+ <p><b>Note:</b> It is acceptable to have a string which includes
+ (the encoded form of) UTF-16 surrogate code units (that is,
+ <code>U+d800</code> … <code>U+dfff</code>)
+ either in isolation or out-of-order with respect to the usual
+ encoding of Unicode into UTF-16. It is up to higher-level uses of
+ strings to reject such invalid encodings, if appropriate.</p>
+ </td>
+</tr>
+</tbody>
+</table>
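+
+<p>Reading a <code>string_data_item</code> combines a
+<code>uleb128</code> length with MUTF-8 decoding. The following Python
+sketch shows one way to do it; the function names are hypothetical, and
+treating a mismatch between <code>utf16_size</code> and the decoded length
+as an error is a choice made for this example.</p>

```python
def read_uleb128(data: bytes, pos: int) -> tuple:
    """Return (value, position after the encoded value)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_string_data_item(data: bytes, offset: int) -> str:
    """Decode the string_data_item that starts at the given offset."""
    utf16_size, pos = read_uleb128(data, offset)
    end = data.index(0, pos)  # the plain 0 byte terminates the string
    units = []
    while pos < end:
        b = data[pos]
        if b < 0x80:
            unit, length = b, 1
        elif (b & 0xE0) == 0xC0:
            unit, length = ((b & 0x1F) << 6) | (data[pos + 1] & 0x3F), 2
        elif (b & 0xF0) == 0xE0:
            unit = (((b & 0x0F) << 12)
                    | ((data[pos + 1] & 0x3F) << 6)
                    | (data[pos + 2] & 0x3F))
            length = 3
        else:
            raise ValueError("invalid MUTF-8 lead byte")
        units.append(unit)
        pos += length
    if len(units) != utf16_size:
        raise ValueError("utf16_size does not match decoded length")
    # Re-assemble the UTF-16 code units; isolated surrogates are kept.
    raw = b"".join(unit.to_bytes(2, "little") for unit in units)
    return raw.decode("utf-16-le", "surrogatepass")
```

+<p>In keeping with the note above, this sketch passes isolated surrogate
+code units through unchanged rather than rejecting them.</p>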
+
+<h2><code>type_id_item</code></h2>
+<h4>appears in the <code>type_ids</code> section</h4>
+<h4>alignment: 4 bytes</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>descriptor_idx</td>
+ <td>uint</td>
+ <td>index into the <code>string_ids</code> list for the descriptor
+ string of this type. The string must conform to the syntax for
+ <i>TypeDescriptor</i>, defined above.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h2><code>proto_id_item</code></h2>
+<h4>appears in the <code>proto_ids</code> section</h4>
+<h4>alignment: 4 bytes</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>shorty_idx</td>
+ <td>uint</td>
+ <td>index into the <code>string_ids</code> list for the short-form
+ descriptor string of this prototype. The string must conform to the
+ syntax for <i>ShortyDescriptor</i>, defined above, and must correspond
+ to the return type and parameters of this item.
+ </td>
+</tr>
+<tr>
+ <td>return_type_idx</td>
+ <td>uint</td>
+ <td>index into the <code>type_ids</code> list for the return type
+ of this prototype
+ </td>
+</tr>
+<tr>
+ <td>parameters_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the list of parameter types
+ for this prototype, or <code>0</code> if this prototype has no
+ parameters. This offset, if non-zero, should be in the
+ <code>data</code> section, and the data there should be in the
+ format specified by <code>"type_list"</code> below. Additionally, there
+ should be no reference to the type <code>void</code> in the list.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h2><code>field_id_item</code></h2>
+<h4>appears in the <code>field_ids</code> section</h4>
+<h4>alignment: 4 bytes</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>class_idx</td>
+ <td>ushort</td>
+ <td>index into the <code>type_ids</code> list for the definer of this
+ field. This must be a class type, and not an array or primitive type.
+ </td>
+</tr>
+<tr>
+ <td>type_idx</td>
+ <td>ushort</td>
+ <td>index into the <code>type_ids</code> list for the type of
+ this field
+ </td>
+</tr>
+<tr>
+ <td>name_idx</td>
+ <td>uint</td>
+ <td>index into the <code>string_ids</code> list for the name of this
+ field. The string must conform to the syntax for <i>MemberName</i>,
+ defined above.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h2><code>method_id_item</code></h2>
+<h4>appears in the <code>method_ids</code> section</h4>
+<h4>alignment: 4 bytes</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>class_idx</td>
+ <td>ushort</td>
+ <td>index into the <code>type_ids</code> list for the definer of this
+ method. This must be a class or array type, and not a primitive type.
+ </td>
+</tr>
+<tr>
+ <td>proto_idx</td>
+ <td>ushort</td>
+ <td>index into the <code>proto_ids</code> list for the prototype of
+ this method
+ </td>
+</tr>
+<tr>
+ <td>name_idx</td>
+ <td>uint</td>
+ <td>index into the <code>string_ids</code> list for the name of this
+ method. The string must conform to the syntax for <i>MemberName</i>,
+ defined above.
+ </td>
+</tr>
+</tbody>
+</table>
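+
+<p>Both <code>field_id_item</code> and <code>method_id_item</code> share
+the same eight-byte shape (two <code>ushort</code> values followed by a
+<code>uint</code>), so a single helper can read either. This is a sketch
+under the assumption of a little-endian file; the name
+<code>read_member_id</code> is hypothetical.</p>

```python
import struct


def read_member_id(data: bytes, section_off: int, index: int) -> tuple:
    """Return (class_idx, type_idx or proto_idx, name_idx) for one entry.

    section_off is the start of the field_ids or method_ids section.
    """
    return struct.unpack_from("<HHI", data, section_off + index * 8)
```

+<p>The second element is a <code>type_idx</code> when reading from
+<code>field_ids</code> and a <code>proto_idx</code> when reading from
+<code>method_ids</code>.</p>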
+
+<h2><code>class_def_item</code></h2>
+<h4>appears in the <code>class_defs</code> section</h4>
+<h4>alignment: 4 bytes</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>class_idx</td>
+ <td>uint</td>
+ <td>index into the <code>type_ids</code> list for this class.
+ This must be a class type, and not an array or primitive type.
+ </td>
+</tr>
+<tr>
+ <td>access_flags</td>
+ <td>uint</td>
+ <td>access flags for the class (<code>public</code>, <code>final</code>,
+ etc.). See "<code>access_flags</code> Definitions" for details.
+ </td>
+</tr>
+<tr>
+ <td>superclass_idx</td>
+ <td>uint</td>
+ <td>index into the <code>type_ids</code> list for the superclass, or
+ the constant value <code>NO_INDEX</code> if this class has no
+ superclass (i.e., it is a root class such as <code>Object</code>).
+ If present, this must be a class type, and not an array or primitive type.
+ </td>
+</tr>
+<tr>
+ <td>interfaces_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the list of interfaces, or
+ <code>0</code> if there are none. This offset
+ should be in the <code>data</code> section, and the data
+ there should be in the format specified by
+ "<code>type_list</code>" below. Each of the elements of the list
+ must be a class type (not an array or primitive type), and there
+ must not be any duplicates.
+ </td>
+</tr>
+<tr>
+ <td>source_file_idx</td>
+ <td>uint</td>
+ <td>index into the <code>string_ids</code> list for the name of the
+ file containing the original source for (at least most of) this class,
+ or the special value <code>NO_INDEX</code> to represent a lack of
+ this information. The <code>debug_info_item</code> of any given method
+ may override this source file, but the expectation is that most classes
+ will only come from one source file.
+ </td>
+</tr>
+<tr>
+ <td>annotations_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the annotations structure
+ for this class, or <code>0</code> if there are no annotations on
+ this class. This offset, if non-zero, should be in the
+ <code>data</code> section, and the data there should be in
+ the format specified by "<code>annotations_directory_item</code>" below,
+ with all items referring to this class as the definer.
+ </td>
+</tr>
+<tr>
+ <td>class_data_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the associated
+ class data for this item, or <code>0</code> if there is no class
+ data for this class. (This may be the case, for example, if this class
+ is a marker interface.) The offset, if non-zero, should be in the
+ <code>data</code> section, and the data there should be in the
+ format specified by "<code>class_data_item</code>" below, with all
+ items referring to this class as the definer.
+ </td>
+</tr>
+<tr>
+ <td>static_values_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the list of initial
+ values for <code>static</code> fields, or <code>0</code> if there
+ are none (and all <code>static</code> fields are to be initialized with
+ <code>0</code> or <code>null</code>). This offset should be in the
+ <code>data</code> section, and the data there should be in the
+ format specified by "<code>encoded_array_item</code>" below. The size
+ of the array must be no larger than the number of <code>static</code>
+ fields declared by this class, and the elements correspond to the
+ <code>static</code> fields in the same order as declared in the
+ corresponding <code>static_fields</code> list. The type of each array
+ element must match the declared type of its corresponding field.
+ If there are fewer elements in the array than there are
+ <code>static</code> fields, then the leftover fields are initialized
+ with a type-appropriate <code>0</code> or <code>null</code>.
+ </td>
+</tr>
+</tbody>
+</table>
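+
+<p>As a rough illustration of the layout above, the following Python sketch
+decodes a single <code>class_def_item</code>. It assumes little-endian byte
+order and the <code>NO_INDEX</code> value (<code>0xffffffff</code>) defined
+earlier in this document; the helper name <code>parse_class_def_item</code>
+and the sample values are hypothetical.</p>

```python
import struct

NO_INDEX = 0xFFFFFFFF  # "no index" sentinel, as defined earlier in this document

# Field names in file order; every field is a 4-byte uint.
CLASS_DEF_FIELDS = (
    "class_idx", "access_flags", "superclass_idx", "interfaces_off",
    "source_file_idx", "annotations_off", "class_data_off", "static_values_off",
)

def parse_class_def_item(buf, offset=0):
    """Decode one 32-byte class_def_item (assumes little-endian byte order)."""
    values = struct.unpack_from("<8I", buf, offset)
    return dict(zip(CLASS_DEF_FIELDS, values))

# Hypothetical item: a root class with no source file and no class data.
raw = struct.pack("<8I", 5, 0x1, NO_INDEX, 0, NO_INDEX, 0, 0, 0)
item = parse_class_def_item(raw)
has_superclass = item["superclass_idx"] != NO_INDEX
```

+<p>An offset field of <code>0</code> and an index field of
+<code>NO_INDEX</code> both mean "absent", so a reader should check for the
+right sentinel on each field before following it.</p>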
+
+<h2><code>class_data_item</code></h2>
+<h4>referenced from <code>class_def_item</code></h4>
+<h4>appears in the <code>data</code> section</h4>
+<h4>alignment: none (byte-aligned)</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>static_fields_size</td>
+ <td>uleb128</td>
+ <td>the number of static fields defined in this item</td>
+</tr>
+<tr>
+ <td>instance_fields_size</td>
+ <td>uleb128</td>
+ <td>the number of instance fields defined in this item</td>
+</tr>
+<tr>
+ <td>direct_methods_size</td>
+ <td>uleb128</td>
+ <td>the number of direct methods defined in this item</td>
+</tr>
+<tr>
+ <td>virtual_methods_size</td>
+ <td>uleb128</td>
+ <td>the number of virtual methods defined in this item</td>
+</tr>
+<tr>
+ <td>static_fields</td>
+ <td>encoded_field[static_fields_size]</td>
+ <td>the defined static fields, represented as a sequence of
+ encoded elements. The fields must be sorted by
+ <code>field_idx</code> in increasing order.
+ </td>
+</tr>
+<tr>
+ <td>instance_fields</td>
+ <td>encoded_field[instance_fields_size]</td>
+ <td>the defined instance fields, represented as a sequence of
+ encoded elements. The fields must be sorted by
+ <code>field_idx</code> in increasing order.
+ </td>
+</tr>
+<tr>
+ <td>direct_methods</td>
+ <td>encoded_method[direct_methods_size]</td>
+ <td>the defined direct (any of <code>static</code>, <code>private</code>,
+ or constructor) methods, represented as a sequence of
+ encoded elements. The methods must be sorted by
+ <code>method_idx</code> in increasing order.
+ </td>
+</tr>
+<tr>
+ <td>virtual_methods</td>
+ <td>encoded_method[virtual_methods_size]</td>
+ <td>the defined virtual (none of <code>static</code>, <code>private</code>,
+ or constructor) methods, represented as a sequence of
+ encoded elements. This list should <i>not</i> include inherited
+ methods unless overridden by the class that this item represents. The
+ methods must be sorted by <code>method_idx</code> in increasing order.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<p><b>Note:</b> All elements' <code>field_id</code>s and
+<code>method_id</code>s must refer to the same defining class.</p>
+
+<h3><code>encoded_field</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>field_idx_diff</td>
+ <td>uleb128</td>
+ <td>index into the <code>field_ids</code> list for the identity of this
+ field (includes the name and descriptor), represented as a difference
+ from the index of the previous element in the list. The index of the
+ first element in a list is represented directly.
+ </td>
+</tr>
+<tr>
+ <td>access_flags</td>
+ <td>uleb128</td>
+ <td>access flags for the field (<code>public</code>, <code>final</code>,
+ etc.). See "<code>access_flags</code> Definitions" for details.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h3><code>encoded_method</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>method_idx_diff</td>
+ <td>uleb128</td>
+ <td>index into the <code>method_ids</code> list for the identity of this
+ method (includes the name and descriptor), represented as a difference
+ from the index of the previous element in the list. The index of the
+ first element in a list is represented directly.
+ </td>
+</tr>
+<tr>
+ <td>access_flags</td>
+ <td>uleb128</td>
+ <td>access flags for the method (<code>public</code>, <code>final</code>,
+ etc.). See "<code>access_flags</code> Definitions" for details.
+ </td>
+</tr>
+<tr>
+ <td>code_off</td>
+ <td>uleb128</td>
+ <td>offset from the start of the file to the code structure for this
+ method, or <code>0</code> if this method is either <code>abstract</code>
+ or <code>native</code>. The offset should be to a location in the
+ <code>data</code> section. The format of the data is specified by
+ "<code>code_item</code>" below.
+ </td>
+</tr>
+</tbody>
+</table>
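+
+<p>The difference encoding used by <code>method_idx_diff</code> (and, in the
+same way, by <code>field_idx_diff</code>) can be sketched as follows. This is
+an illustrative Python reader, not a reference implementation; the function
+names and the sample bytes are hypothetical.</p>

```python
def read_uleb128(buf, pos):
    """Return (value, next_pos) for the unsigned LEB128 value at buf[pos]."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7

def read_encoded_methods(buf, pos, count):
    """Decode `count` encoded_method entries into absolute method indices."""
    methods = []
    method_idx = 0
    for _ in range(count):
        diff, pos = read_uleb128(buf, pos)
        access_flags, pos = read_uleb128(buf, pos)
        code_off, pos = read_uleb128(buf, pos)
        method_idx += diff  # the first diff is the index itself
        methods.append((method_idx, access_flags, code_off))
    return methods, pos

# Hypothetical data: method indices 3 and 7, stored as 3 and then +4.
data = bytes([0x03, 0x01, 0x80, 0x02,    # idx 3, flags 0x1, code_off 0x100
              0x04, 0x81, 0x08, 0x00])   # idx 7, flags 0x401, code_off 0
methods, end = read_encoded_methods(data, 0, 2)
```

+<p>Because each list must be sorted by index, every difference after the
+first is positive, which is what lets the values be stored as unsigned
+<code>uleb128</code>.</p>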
+
+<h2><code>type_list</code></h2>
+<h4>referenced from <code>class_def_item</code> and
+<code>proto_id_item</code></h4>
+<h4>appears in the <code>data</code> section</h4>
+<h4>alignment: 4 bytes</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>size</td>
+ <td>uint</td>
+ <td>size of the list, in entries</td>
+</tr>
+<tr>
+ <td>list</td>
+ <td>type_item[size]</td>
+ <td>elements of the list</td>
+</tr>
+</tbody>
+</table>
+
+<h3><code>type_item</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>type_idx</td>
+ <td>ushort</td>
+ <td>index into the <code>type_ids</code> list</td>
+</tr>
+</tbody>
+</table>
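+
+<p>A <code>type_list</code> is simply a count followed by that many
+<code>ushort</code> indices. A minimal Python sketch, assuming little-endian
+byte order (the helper name and sample values are hypothetical):</p>

```python
import struct

def parse_type_list(buf, offset):
    """Return the type_idx values of the type_list at `offset`."""
    (size,) = struct.unpack_from("<I", buf, offset)
    # The type_item entries start right after the 4-byte size field.
    return list(struct.unpack_from("<%dH" % size, buf, offset + 4))

# Hypothetical list of three type indices.
raw = struct.pack("<I3H", 3, 7, 2, 9)
type_indices = parse_type_list(raw, 0)
```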
+
+<h2><code>code_item</code></h2>
+<h4>referenced from <code>encoded_method</code></h4>
+<h4>appears in the <code>data</code> section</h4>
+<h4>alignment: 4 bytes</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>registers_size</td>
+ <td>ushort</td>
+ <td>the number of registers used by this code</td>
+</tr>
+<tr>
+ <td>ins_size</td>
+ <td>ushort</td>
+ <td>the number of words of incoming arguments to the method that this
+ code is for</td>
+</tr>
+<tr>
+ <td>outs_size</td>
+ <td>ushort</td>
+ <td>the number of words of outgoing argument space required by this
+ code for method invocation
+ </td>
+</tr>
+<tr>
+ <td>tries_size</td>
+ <td>ushort</td>
+ <td>the number of <code>try_item</code>s for this instance. If non-zero,
+ then these appear as the <code>tries</code> array just after the
+ <code>insns</code> in this instance.
+ </td>
+</tr>
+<tr>
+ <td>debug_info_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the debug info (line numbers +
+ local variable info) sequence for this code, or <code>0</code> if
+ there simply is no information. The offset, if non-zero, should be
+ to a location in the <code>data</code> section. The format of
+ the data is specified by "<code>debug_info_item</code>" below.
+ </td>
+</tr>
+<tr>
+ <td>insns_size</td>
+ <td>uint</td>
+ <td>size of the instructions list, in 16-bit code units</td>
+</tr>
+<tr>
+ <td>insns</td>
+ <td>ushort[insns_size]</td>
+ <td>actual array of bytecode. The format of code in an <code>insns</code>
+ array is specified by the companion document
+ <a href="dalvik-bytecode.html">"Bytecode for the Dalvik VM"</a>. Note
+ that though this is defined as an array of <code>ushort</code>, there
+ are some internal structures that prefer four-byte alignment. Also,
+ if this happens to be in an endian-swapped file, then the swapping is
+ <i>only</i> done on individual <code>ushort</code>s and not on the
+ larger internal structures.
+ </td>
+</tr>
+<tr>
+ <td>padding</td>
+ <td>ushort <i>(optional)</i> = 0</td>
+ <td>two bytes of padding to make <code>tries</code> four-byte aligned.
+ This element is only present if <code>tries_size</code> is non-zero
+ and <code>insns_size</code> is odd.
+ </td>
+</tr>
+<tr>
+ <td>tries</td>
+ <td>try_item[tries_size] <i>(optional)</i></td>
+ <td>array indicating where in the code exceptions may be caught and
+ how to handle them. Elements of the array must be non-overlapping in
+ range and in order from low to high address. This element is only
+ present if <code>tries_size</code> is non-zero.
+ </td>
+</tr>
+<tr>
+ <td>handlers</td>
+ <td>encoded_catch_handler_list <i>(optional)</i></td>
+ <td>bytes representing a list of lists of catch types and associated
+ handler addresses. Each <code>try_item</code> has a byte-wise offset
+ into this structure. This element is only present if
+ <code>tries_size</code> is non-zero.
+ </td>
+</tr>
+</tbody>
+</table>
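+
+<p>The interaction between <code>insns_size</code>, the optional
+<code>padding</code> element, and the position of <code>tries</code> can be
+sketched as below. This Python fragment assumes little-endian byte order;
+<code>tries_offset</code> is a hypothetical helper name.</p>

```python
import struct

def tries_offset(buf, code_item_off):
    """Return the file offset of the tries array, or None if tries_size is 0."""
    (registers_size, ins_size, outs_size, tries_size,
     debug_info_off, insns_size) = struct.unpack_from("<4H2I", buf, code_item_off)
    if tries_size == 0:
        return None
    insns_start = code_item_off + 16          # fixed header: 4 ushorts + 2 uints
    insns_end = insns_start + 2 * insns_size  # insns_size counts 16-bit code units
    if insns_size % 2 == 1:
        insns_end += 2                        # optional padding ushort
    return insns_end

# Hypothetical code_item with three code units and one try_item.
header = struct.pack("<4H2I", 2, 1, 0, 1, 0, 3)
odd = header + bytes(2 * 3) + bytes(2) + bytes(8)  # insns, padding, try_item
odd_tries = tries_offset(odd, 0)
```

+<p>Since a <code>code_item</code> is four-byte aligned and its fixed header
+is 16 bytes long, the padding is needed exactly when the number of code
+units is odd.</p>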
+
+<h3><code>try_item</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>start_addr</td>
+ <td>uint</td>
+ <td>start address of the block of code covered by this entry. The address
+ is a count of 16-bit code units to the start of the first covered
+ instruction.
+ </td>
+</tr>
+<tr>
+ <td>insn_count</td>
+ <td>ushort</td>
+ <td>number of 16-bit code units covered by this entry. The last code
+ unit covered (inclusive) is <code>start_addr + insn_count - 1</code>.
+ </td>
+</tr>
+<tr>
+ <td>handler_off</td>
+ <td>ushort</td>
+ <td>offset in bytes from the start of the associated
+ <code>encoded_catch_handler_list</code> to the
+ <code>encoded_catch_handler</code> for this entry
+ </td>
+</tr>
+</tbody>
+</table>
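+
+<p>To make the inclusive range concrete, here is a small Python sketch that
+decodes <code>try_item</code>s and looks up the entry covering a given
+address. It assumes little-endian byte order; both helper names and the
+sample data are hypothetical.</p>

```python
import struct

def parse_try_item(buf, offset):
    """Return (start_addr, last_addr, handler_off) for one 8-byte try_item."""
    start_addr, insn_count, handler_off = struct.unpack_from("<IHH", buf, offset)
    # The last covered code unit is inclusive.
    return start_addr, start_addr + insn_count - 1, handler_off

def covering_try(tries, addr):
    """Return the handler_off of the entry covering `addr`, or None."""
    for start_addr, last_addr, handler_off in tries:
        if start_addr <= addr <= last_addr:
            return handler_off
    return None

# Hypothetical entries covering code units 4..9 and 20..21.
raw = struct.pack("<IHH", 4, 6, 1) + struct.pack("<IHH", 20, 2, 9)
tries = [parse_try_item(raw, 8 * i) for i in range(2)]
```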
+
+<h3><code>encoded_catch_handler_list</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>size</td>
+ <td>uleb128</td>
+ <td>size of this list, in entries</td>
+</tr>
+<tr>
+ <td>list</td>
+ <td>encoded_catch_handler[size]</td>
+ <td>actual list of handler lists, represented directly (not as offsets),
+ and concatenated sequentially</td>
+</tr>
+</tbody>
+</table>
+
+<h3><code>encoded_catch_handler</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>size</td>
+ <td>sleb128</td>
+ <td>number of catch types in this list. If non-positive, then this is
+ the negative of the number of catch types, and the catches are followed
+ by a catch-all handler. For example: A <code>size</code> of <code>0</code>
+ means that there is a catch-all but no explicitly typed catches.
+ A <code>size</code> of <code>2</code> means that there are two explicitly
+ typed catches and no catch-all. And a <code>size</code> of <code>-1</code>
+ means that there is one typed catch along with a catch-all.
+ </td>
+</tr>
+<tr>
+ <td>handlers</td>
+ <td>encoded_type_addr_pair[abs(size)]</td>
+ <td>stream of <code>abs(size)</code> encoded items, one for each caught
+ type, in the order that the types should be tested.
+ </td>
+</tr>
+<tr>
+ <td>catch_all_addr</td>
+ <td>uleb128 <i>(optional)</i></td>
+ <td>bytecode address of the catch-all handler. This element is only
+ present if <code>size</code> is non-positive.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h3><code>encoded_type_addr_pair</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>type_idx</td>
+ <td>uleb128</td>
+ <td>index into the <code>type_ids</code> list for the type of the
+ exception to catch
+ </td>
+</tr>
+<tr>
+ <td>addr</td>
+ <td>uleb128</td>
+ <td>bytecode address of the associated exception handler</td>
+</tr>
+</tbody>
+</table>
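+
+<p>The sign convention of <code>size</code> in an
+<code>encoded_catch_handler</code> is easy to get wrong, so here is an
+illustrative Python decoder. It is a sketch only; the function names and
+sample bytes are hypothetical.</p>

```python
def read_uleb128(buf, pos):
    """Return (value, next_pos) for the unsigned LEB128 value at buf[pos]."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7

def read_sleb128(buf, pos):
    """Return (value, next_pos) for the signed LEB128 value at buf[pos]."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:
            if byte & 0x40:           # sign bit of the final byte is set
                result -= 1 << shift
            return result, pos

def read_encoded_catch_handler(buf, pos):
    """Return (typed_handlers, catch_all_addr or None, next_pos)."""
    size, pos = read_sleb128(buf, pos)
    pairs = []
    for _ in range(abs(size)):
        type_idx, pos = read_uleb128(buf, pos)
        addr, pos = read_uleb128(buf, pos)
        pairs.append((type_idx, addr))
    catch_all_addr = None
    if size <= 0:                     # non-positive size: catch-all follows
        catch_all_addr, pos = read_uleb128(buf, pos)
    return pairs, catch_all_addr, pos

# Hypothetical handler: size -1, one typed catch, then a catch-all.
one_plus_all = read_encoded_catch_handler(bytes([0x7F, 0x05, 0x10, 0x20]), 0)
```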
+
+<h2><code>debug_info_item</code></h2>
+<h4>referenced from <code>code_item</code></h4>
+<h4>appears in the <code>data</code> section</h4>
+<h4>alignment: none (byte-aligned)</h4>
+
+<p>Each <code>debug_info_item</code> defines a DWARF3-inspired byte-coded
+state machine that, when interpreted, emits the positions
+table and (potentially) the local variable information for a
+<code>code_item</code>. The sequence begins with a variable-length
+header (the length of which depends on the number of method
+parameters), is followed by the state machine bytecodes, and ends
+with a <code>DBG_END_SEQUENCE</code> byte.</p>
+
+<p>The state machine consists of five registers. The
+<code>address</code> register represents the instruction offset in the
+associated <code>insns</code> array, in 16-bit code units. The
+<code>address</code> register starts at <code>0</code> at the beginning of each
+<code>debug_info</code> sequence and may only monotonically increase.
+The <code>line</code> register represents what source line number
+should be associated with the next positions table entry emitted by
+the state machine. It is initialized in the sequence header, and may
+change in positive or negative directions but must never be less than
+<code>1</code>. The <code>source_file</code> register represents the
+source file that the line number entries refer to. It is initialized to
+the value of <code>source_file_idx</code> in <code>class_def_item</code>.
+The other two variables, <code>prologue_end</code> and
+<code>epilogue_begin</code>, are boolean flags (initialized to
+<code>false</code>) that indicate whether the next position emitted
+should be considered a method prologue or epilogue. The state machine
+must also track the name and type of the last local variable live in
+each register for the <code>DBG_RESTART_LOCAL</code> code.</p>
+
+<p>The header is as follows:</p>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>line_start</td>
+ <td>uleb128</td>
+ <td>the initial value for the state machine's <code>line</code> register.
+ Does not represent an actual positions entry.
+ </td>
+</tr>
+<tr>
+ <td>parameters_size</td>
+ <td>uleb128</td>
+ <td>the number of parameter names that are encoded. There should be
+ one per method parameter, excluding an instance method's <code>this</code>,
+ if any.
+ </td>
+</tr>
+<tr>
+ <td>parameter_names</td>
+ <td>uleb128p1[parameters_size]</td>
+ <td>string index of the method parameter name. An encoded value of
+ <code>NO_INDEX</code> indicates that no name
+ is available for the associated parameter. The type descriptor
+ and signature are implied from the method descriptor and signature.
+ </td>
+</tr>
+</tbody>
+</table>
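+
+<p>A minimal Python sketch of reading this header follows. It relies on the
+<code>uleb128p1</code> encoding described earlier in this document (the
+stored value is the real value plus one, so an encoded <code>0</code> stands
+for <code>NO_INDEX</code>); the function names and sample bytes are
+hypothetical.</p>

```python
def read_uleb128(buf, pos):
    """Return (value, next_pos) for the unsigned LEB128 value at buf[pos]."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7

def read_uleb128p1(buf, pos):
    """Return (value, next_pos); a result of -1 represents NO_INDEX."""
    value, pos = read_uleb128(buf, pos)
    return value - 1, pos

def read_debug_info_header(buf, pos):
    """Return (line_start, parameter_name_indices, next_pos)."""
    line_start, pos = read_uleb128(buf, pos)
    parameters_size, pos = read_uleb128(buf, pos)
    names = []
    for _ in range(parameters_size):
        idx, pos = read_uleb128p1(buf, pos)
        names.append(None if idx == -1 else idx)  # None: no name available
    return line_start, names, pos

# Hypothetical header: line 10, two parameters, the first one unnamed.
header = read_debug_info_header(bytes([0x0A, 0x02, 0x00, 0x08]), 0)
```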
+
+<p>The byte code values are as follows:</p>
+
+<table class="debugByteCode">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Value</th>
+ <th>Format</th>
+ <th>Arguments</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>DBG_END_SEQUENCE</td>
+ <td>0x00</td>
+ <td></td>
+ <td><i>(none)</i></td>
+ <td>terminates a debug info sequence for a <code>code_item</code></td>
+</tr>
+<tr>
+ <td>DBG_ADVANCE_PC</td>
+ <td>0x01</td>
+ <td>uleb128 addr_diff</td>
+ <td><code>addr_diff</code>: amount to add to address register</td>
+ <td>advances the address register without emitting a positions entry</td>
+</tr>
+<tr>
+ <td>DBG_ADVANCE_LINE</td>
+ <td>0x02</td>
+ <td>sleb128 line_diff</td>
+ <td><code>line_diff</code>: amount to change line register by</td>
+ <td>advances the line register without emitting a positions entry</td>
+</tr>
+<tr>
+ <td>DBG_START_LOCAL</td>
+ <td>0x03</td>
+ <td>uleb128 register_num<br/>
+ uleb128p1 name_idx<br/>
+ uleb128p1 type_idx
+ </td>
+ <td><code>register_num</code>: register that will contain local<br/>
+ <code>name_idx</code>: string index of the name<br/>
+ <code>type_idx</code>: type index of the type
+ </td>
+ <td>introduces a local variable at the current address. Either
+ <code>name_idx</code> or <code>type_idx</code> may be
+ <code>NO_INDEX</code> to indicate that that value is unknown.
+ </td>
+</tr>
+<tr>
+ <td>DBG_START_LOCAL_EXTENDED</td>
+ <td>0x04</td>
+ <td>uleb128 register_num<br/>
+ uleb128p1 name_idx<br/>
+ uleb128p1 type_idx<br/>
+ uleb128p1 sig_idx
+ </td>
+ <td><code>register_num</code>: register that will contain local<br/>
+ <code>name_idx</code>: string index of the name<br/>
+ <code>type_idx</code>: type index of the type<br/>
+ <code>sig_idx</code>: string index of the type signature
+ </td>
+ <td>introduces a local with a type signature at the current address.
+ Any of <code>name_idx</code>, <code>type_idx</code>, or
+ <code>sig_idx</code> may be <code>NO_INDEX</code>
+ to indicate that that value is unknown. (If <code>sig_idx</code> is
+ <code>-1</code>, though, the same data could be represented more
+ efficiently using the opcode <code>DBG_START_LOCAL</code>.)
+ <p><b>Note:</b> See the discussion under
+ "<code>dalvik.annotation.Signature</code>" below for caveats about
+ handling signatures.</p>
+ </td>
+</tr>
+<tr>
+ <td>DBG_END_LOCAL</td>
+ <td>0x05</td>
+ <td>uleb128 register_num</td>
+ <td><code>register_num</code>: register that contained local</td>
+ <td>marks a currently-live local variable as out of scope at the current
+ address
+ </td>
+</tr>
+<tr>
+ <td>DBG_RESTART_LOCAL</td>
+ <td>0x06</td>
+ <td>uleb128 register_num</td>
+ <td><code>register_num</code>: register to restart</td>
+ <td>re-introduces a local variable at the current address. The name
+ and type are the same as the last local that was live in the specified
+ register.
+ </td>
+</tr>
+<tr>
+ <td>DBG_SET_PROLOGUE_END</td>
+ <td>0x07</td>
+ <td></td>
+ <td><i>(none)</i></td>
+ <td>sets the <code>prologue_end</code> state machine register,
+ indicating that the next position entry that is added should be
+ considered the end of a method prologue (an appropriate place for
+ a method breakpoint). The <code>prologue_end</code> register is
+ cleared by any special (<code>&gt;= 0x0a</code>) opcode.
+ </td>
+</tr>
+<tr>
+ <td>DBG_SET_EPILOGUE_BEGIN</td>
+ <td>0x08</td>
+ <td></td>
+ <td><i>(none)</i></td>
+ <td>sets the <code>epilogue_begin</code> state machine register,
+ indicating that the next position entry that is added should be
+ considered the beginning of a method epilogue (an appropriate place
+ to suspend execution before method exit).
+ The <code>epilogue_begin</code> register is cleared by any special
+ (<code>&gt;= 0x0a</code>) opcode.
+ </td>
+</tr>
+<tr>
+ <td>DBG_SET_FILE</td>
+ <td>0x09</td>
+ <td>uleb128p1 name_idx</td>
+ <td><code>name_idx</code>: string index of source file name;
+ <code>NO_INDEX</code> if unknown
+ </td>
+ <td>indicates that all subsequent line number entries make reference to this
+ source file name, instead of the default name specified in
+ <code>class_def_item</code>
+ </td>
+</tr>
+<tr>
+ <td><i>Special Opcodes</i></td>
+ <!-- When updating the range below, make sure to search for other
+ instances of 0x0a in this section. -->
+ <td>0x0a…0xff</td>
+ <td></td>
+ <td><i>(none)</i></td>
+ <td>advances the <code>line</code> and <code>address</code> registers,
+ emits a position entry, and clears <code>prologue_end</code> and
+ <code>epilogue_begin</code>. See below for description.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h3>Special Opcodes</h3>
+
+<p>Opcodes with values between <code>0x0a</code> and <code>0xff</code>
+(inclusive) move both the <code>line</code> and <code>address</code>
+registers by a small amount and then emit a new position table entry.
+The formulas for the increments are as follows:</p>
+
+<pre>
+DBG_FIRST_SPECIAL = 0x0a // the smallest special opcode
+DBG_LINE_BASE = -4 // the smallest line number increment
+DBG_LINE_RANGE = 15 // the number of line increments represented
+
+adjusted_opcode = opcode - DBG_FIRST_SPECIAL
+
+line += DBG_LINE_BASE + (adjusted_opcode % DBG_LINE_RANGE)
+address += (adjusted_opcode / DBG_LINE_RANGE)
+</pre>
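+
+<p>As a worked example, the formulas above translate directly into the
+following Python sketch. The constants are the ones given above;
+<code>decode_special</code> and <code>encode_special</code> are hypothetical
+helper names, and the encoder is only one plausible way to invert the
+formulas.</p>

```python
DBG_FIRST_SPECIAL = 0x0A  # the smallest special opcode
DBG_LINE_BASE = -4        # the smallest line number increment
DBG_LINE_RANGE = 15       # the number of line increments represented

def decode_special(opcode):
    """Return (line_diff, addr_diff) for a special opcode."""
    if not DBG_FIRST_SPECIAL <= opcode <= 0xFF:
        raise ValueError("not a special opcode")
    adjusted = opcode - DBG_FIRST_SPECIAL
    line_diff = DBG_LINE_BASE + adjusted % DBG_LINE_RANGE
    addr_diff = adjusted // DBG_LINE_RANGE
    return line_diff, addr_diff

def encode_special(line_diff, addr_diff):
    """Return the special opcode for the given deltas, or None if none fits."""
    if not DBG_LINE_BASE <= line_diff < DBG_LINE_BASE + DBG_LINE_RANGE:
        return None
    if addr_diff < 0:
        return None
    opcode = DBG_FIRST_SPECIAL + (line_diff - DBG_LINE_BASE) + DBG_LINE_RANGE * addr_diff
    return opcode if opcode <= 0xFF else None

smallest = decode_special(0x0A)   # line -4, address +0
largest = decode_special(0xFF)    # line +1, address +16
next_line = encode_special(1, 1)  # advance one line and one code unit
```

+<p>When <code>encode_special</code> finds no fitting opcode, the same change
+can be expressed with <code>DBG_ADVANCE_PC</code> and/or
+<code>DBG_ADVANCE_LINE</code> followed by a special opcode with smaller
+increments.</p>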
+
+<h2><code>annotations_directory_item</code></h2>
+<h4>referenced from <code>class_def_item</code></h4>
+<h4>appears in the <code>data</code> section</h4>
+<h4>alignment: 4 bytes</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>class_annotations_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the annotations made directly
+ on the class, or <code>0</code> if the class has no direct annotations.
+ The offset, if non-zero, should be to a location in the
+ <code>data</code> section. The format of the data is specified
+ by "<code>annotation_set_item</code>" below.
+ </td>
+</tr>
+<tr>
+ <td>fields_size</td>
+ <td>uint</td>
+ <td>count of fields annotated by this item</td>
+</tr>
+<tr>
+ <td>methods_size</td>
+ <td>uint</td>
+ <td>count of methods annotated by this item</td>
+</tr>
+<tr>
+ <td>parameters_size</td>
+ <td>uint</td>
+ <td>count of method parameter lists annotated by this item</td>
+</tr>
+<tr>
+ <td>field_annotations</td>
+ <td>field_annotation[fields_size] <i>(optional)</i></td>
+ <td>list of associated field annotations. The elements of the list must
+ be sorted in increasing order, by <code>field_idx</code>.
+ </td>
+</tr>
+<tr>
+ <td>method_annotations</td>
+ <td>method_annotation[methods_size] <i>(optional)</i></td>
+ <td>list of associated method annotations. The elements of the list must
+ be sorted in increasing order, by <code>method_idx</code>.
+ </td>
+</tr>
+<tr>
+ <td>parameter_annotations</td>
+ <td>parameter_annotation[parameters_size] <i>(optional)</i></td>
+ <td>list of associated method parameter annotations. The elements of the
+ list must be sorted in increasing order, by <code>method_idx</code>.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<p><b>Note:</b> All elements' <code>field_id</code>s and
+<code>method_id</code>s must refer to the same defining class.</p>
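+
+<p>The overall shape of an <code>annotations_directory_item</code> (one
+offset, three counts, then three arrays of index/offset pairs) can be
+sketched in Python as below. The sketch assumes little-endian byte order;
+the helper name and the sample data are hypothetical.</p>

```python
import struct

def parse_annotations_directory(buf, offset):
    """Decode an annotations_directory_item into its four parts."""
    class_annotations_off, n_fields, n_methods, n_params = \
        struct.unpack_from("<4I", buf, offset)
    pos = offset + 16
    lists = []
    for count in (n_fields, n_methods, n_params):
        # Each element is an (index, annotations_off) pair of uints.
        pairs = [struct.unpack_from("<2I", buf, pos + 8 * i) for i in range(count)]
        lists.append(pairs)
        pos += 8 * count
    fields, methods, params = lists
    return class_annotations_off, fields, methods, params

# Hypothetical directory: one annotated field, two annotated methods.
raw = struct.pack("<4I", 0, 1, 2, 0) + struct.pack("<6I", 3, 0x200, 1, 0x210, 4, 0x220)
directory = parse_annotations_directory(raw, 0)
```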
+
+<h3><code>field_annotation</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>field_idx</td>
+ <td>uint</td>
+ <td>index into the <code>field_ids</code> list for the identity of the
+ field being annotated
+ </td>
+</tr>
+<tr>
+ <td>annotations_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the list of annotations for
+ the field. The offset should be to a location in the <code>data</code>
+ section. The format of the data is specified by
+ "<code>annotation_set_item</code>" below.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h3><code>method_annotation</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>method_idx</td>
+ <td>uint</td>
+ <td>index into the <code>method_ids</code> list for the identity of the
+ method being annotated
+ </td>
+</tr>
+<tr>
+ <td>annotations_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the list of annotations for
+ the method. The offset should be to a location in the
+ <code>data</code> section. The format of the data is specified by
+ "<code>annotation_set_item</code>" below.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h3><code>parameter_annotation</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>method_idx</td>
+ <td>uint</td>
+ <td>index into the <code>method_ids</code> list for the identity of the
+ method whose parameters are being annotated
+ </td>
+</tr>
+<tr>
+ <td>annotations_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the list of annotations for
+ the method parameters. The offset should be to a location in the
+ <code>data</code> section. The format of the data is specified by
+ "<code>annotation_set_ref_list</code>" below.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h2><code>annotation_set_ref_list</code></h2>
+<h4>referenced from <code>parameter_annotation</code></h4>
+<h4>appears in the <code>data</code> section</h4>
+<h4>alignment: 4 bytes</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>size</td>
+ <td>uint</td>
+ <td>size of the list, in entries</td>
+</tr>
+<tr>
+ <td>list</td>
+ <td>annotation_set_ref_item[size]</td>
+ <td>elements of the list</td>
+</tr>
+</tbody>
+</table>
+
+<h3><code>annotation_set_ref_item</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>annotations_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to the referenced annotation set
+ or <code>0</code> if there are no annotations for this element.
+ The offset, if non-zero, should be to a location in the <code>data</code>
+ section. The format of the data is specified by
+ "<code>annotation_set_item</code>" below.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h2><code>annotation_set_item</code></h2>
+<h4>referenced from <code>annotations_directory_item</code>,
+<code>field_annotation</code>,
+<code>method_annotation</code>, and
+<code>annotation_set_ref_item</code></h4>
+<h4>appears in the <code>data</code> section</h4>
+<h4>alignment: 4 bytes</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>size</td>
+ <td>uint</td>
+ <td>size of the set, in entries</td>
+</tr>
+<tr>
+ <td>entries</td>
+ <td>annotation_off_item[size]</td>
+ <td>elements of the set. The elements must be sorted in increasing order,
+ by <code>type_idx</code>.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h3><code>annotation_off_item</code> Format</h3>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>annotation_off</td>
+ <td>uint</td>
+ <td>offset from the start of the file to an annotation.
+ The offset should be to a location in the <code>data</code> section,
+ and the format of the data at that location is specified by
+ "<code>annotation_item</code>" below.
+ </td>
+</tr>
+</tbody>
+</table>
+
+
+<h2><code>annotation_item</code></h2>
+<h4>referenced from <code>annotation_set_item</code></h4>
+<h4>appears in the <code>data</code> section</h4>
+<h4>alignment: none (byte-aligned)</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>visibility</td>
+ <td>ubyte</td>
+ <td>intended visibility of this annotation (see below)</td>
+</tr>
+<tr>
+ <td>annotation</td>
+ <td>encoded_annotation</td>
+ <td>encoded annotation contents, in the format described by
+ "<code>encoded_annotation</code> Format" under
+ "<code>encoded_value</code> Encoding" above.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h3>Visibility values</h3>
+
+<p>These are the options for the <code>visibility</code> field in an
+<code>annotation_item</code>:</p>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Value</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>VISIBILITY_BUILD</td>
+ <td>0x00</td>
+ <td>intended only to be visible at build time (e.g., during compilation
+ of other code)
+ </td>
+</tr>
+<tr>
+ <td>VISIBILITY_RUNTIME</td>
+ <td>0x01</td>
+ <td>intended to be visible at runtime</td>
+</tr>
+<tr>
+ <td>VISIBILITY_SYSTEM</td>
+ <td>0x02</td>
+ <td>intended to be visible at runtime, but only to the underlying system
+ (and not to regular user code)
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h2><code>encoded_array_item</code></h2>
+<h4>referenced from <code>class_def_item</code></h4>
+<h4>appears in the <code>data</code> section</h4>
+<h4>alignment: none (byte-aligned)</h4>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>value</td>
+ <td>encoded_array</td>
+ <td>bytes representing the encoded array value, in the format specified
+ by "<code>encoded_array</code> Format" under "<code>encoded_value</code>
+ Encoding" above.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h1>System Annotations</h1>
+
+<p>System annotations are used to represent various pieces of reflective
+information about classes (and methods and fields). This information is
+generally only accessed indirectly by client (non-system) code.</p>
+
+<p>System annotations are represented in <code>.dex</code> files as
+annotations with visibility set to <code>VISIBILITY_SYSTEM</code>.</p>
+
+<h2><code>dalvik.annotation.AnnotationDefault</code></h2>
+<h4>appears on methods in annotation interfaces</h4>
+
+<p>An <code>AnnotationDefault</code> annotation is attached to each
+annotation interface which wishes to indicate default bindings.</p>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>value</td>
+ <td>Annotation</td>
+ <td>the default bindings for this annotation, represented as an annotation
+ of this type. The annotation need not include all names defined by the
+ annotation; missing names simply do not have defaults.
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h2><code>dalvik.annotation.EnclosingClass</code></h2>
+<h4>appears on classes</h4>
+
+<p>An <code>EnclosingClass</code> annotation is attached to each class
+which is either defined as a member of another class, per se, or is
+anonymous but not defined within a method body (e.g., a synthetic
+inner class). Every class that has this annotation must also have an
+<code>InnerClass</code> annotation. Additionally, a class may not have
+both an <code>EnclosingClass</code> and an
+<code>EnclosingMethod</code> annotation.</p>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>value</td>
+ <td>Class</td>
+ <td>the class which most closely lexically scopes this class</td>
+</tr>
+</tbody>
+</table>
+
+<h2><code>dalvik.annotation.EnclosingMethod</code></h2>
+<h4>appears on classes</h4>
+
+<p>An <code>EnclosingMethod</code> annotation is attached to each class
+which is defined inside a method body. Every class that has this
+annotation must also have an <code>InnerClass</code> annotation.
+Additionally, a class may not have both an <code>EnclosingClass</code>
+and an <code>EnclosingMethod</code> annotation.</p>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>value</td>
+ <td>Method</td>
+ <td>the method which most closely lexically scopes this class</td>
+</tr>
+</tbody>
+</table>
+
+<h2><code>dalvik.annotation.InnerClass</code></h2>
+<h4>appears on classes</h4>
+
+<p>An <code>InnerClass</code> annotation is attached to each class
+which is defined in the lexical scope of another class's definition.
+Any class which has this annotation must also have <i>either</i> an
+<code>EnclosingClass</code> annotation <i>or</i> an
+<code>EnclosingMethod</code> annotation.</p>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>name</td>
+ <td>String</td>
+ <td>the originally declared simple name of this class (not including any
+ package prefix). If this class is anonymous, then the name is
+ <code>null</code>.
+ </td>
+</tr>
+<tr>
+ <td>accessFlags</td>
+ <td>int</td>
+ <td>the originally declared access flags of the class (which may differ
+ from the effective flags because of a mismatch between the execution
+ models of the source language and target virtual machine)
+ </td>
+</tr>
+</tbody>
+</table>
+
+<h2><code>dalvik.annotation.MemberClasses</code></h2>
+<h4>appears on classes</h4>
+
+<p>A <code>MemberClasses</code> annotation is attached to each class
+which declares member classes. (A member class is a direct inner class
+that has a name.)</p>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>value</td>
+ <td>Class[]</td>
+ <td>array of the member classes</td>
+</tr>
+</tbody>
+</table>
+
+<h2><code>dalvik.annotation.Signature</code></h2>
+<h4>appears on classes, fields, and methods</h4>
+
+<p>A <code>Signature</code> annotation is attached to each class,
+field, or method which is defined in terms of a more complicated type
+than is representable by a <code>type_id_item</code>. The
+<code>.dex</code> format does not define the format for signatures; it
+is merely meant to be able to represent whatever signatures a source
+language requires for successful implementation of that language's
+semantics. As such, signatures are not generally parsed (or verified)
+by virtual machine implementations. The signatures simply get handed
+off to higher-level APIs and tools (such as debuggers). Any use of a
+signature, therefore, should be written so as not to make any
+assumptions about only receiving valid signatures, explicitly guarding
+itself against the possibility of coming across a syntactically
+invalid signature.</p>
+
+<p>Because signature strings tend to have a lot of duplicated content,
+a <code>Signature</code> annotation is defined as an <i>array</i> of
+strings, where duplicated elements naturally refer to the same
+underlying data, and the signature is taken to be the concatenation of
+all the strings in the array. There are no rules about how to pull
+apart a signature into separate strings; that is entirely up to the
+tools that generate <code>.dex</code> files.</p>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>value</td>
+ <td>String[]</td>
+ <td>the signature of this class or member, as an array of strings that
+ is to be concatenated together</td>
+</tr>
+</tbody>
+</table>
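+
+<p>As a rough illustration of the concatenation rule, the Python sketch
+below joins the array elements and returns <code>None</code> rather than
+raising when the value is malformed. The function name and the checks are
+hypothetical; the <code>.dex</code> format itself leaves the signature
+syntax undefined, so a real consumer would add its own validation.</p>

```python
def join_signature(parts):
    """Concatenate the string array of a Signature annotation.

    Returns None instead of raising if the value is malformed, since
    consumers should not assume that signatures are valid.
    """
    if not isinstance(parts, (list, tuple)):
        return None
    if not all(isinstance(p, str) for p in parts):
        return None
    signature = "".join(parts)
    # Hypothetical sanity check: an empty signature carries no information.
    return signature or None


# How the pieces are split is up to the tool that wrote the file.
print(join_signature(["Ljava/util/List<", "Ljava/lang/String;", ">;"]))
```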
+
+<h2><code>dalvik.annotation.Throws</code></h2>
+<h4>appears on methods</h4>
+
+<p>A <code>Throws</code> annotation is attached to each method which is
+declared to throw one or more exception types.</p>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Name</th>
+ <th>Format</th>
+ <th>Description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>value</td>
+ <td>Class[]</td>
+ <td>the array of exception types thrown</td>
+</tr>
+</tbody>
+</table>
+
+</body>
+</html>
diff --git a/docs/dexopt.html b/docs/dexopt.html
new file mode 100644
index 0000000..7f0b4bc
--- /dev/null
+++ b/docs/dexopt.html
@@ -0,0 +1,326 @@
+<html>
+<head>
+ <title>Dalvik Optimization and Verification</title>
+</head>
+
+<body>
+<h1>Dalvik Optimization and Verification With <i>dexopt</i></h1>
+
+<p>
+The Dalvik virtual machine was designed specifically for the Android
+mobile platform. The target systems have little RAM, store data on slow
+internal flash memory, and generally have the performance characteristics
+of decade-old desktop systems. They also run Linux, which provides
+virtual memory, processes and threads, and UID-based security mechanisms.
+<p>
+These features and limitations caused us to focus on certain goals:
+
+<ul>
+ <li>Class data, notably bytecode, must be shared between multiple
+ processes to minimize total system memory usage.
+ <li>The overhead in launching a new app must be minimized to keep
+ the device responsive.
+ <li>Storing class data in individual files results in a lot of
+ redundancy, especially with respect to strings. To conserve disk
+ space we need to factor this out.
+ <li>Parsing class data fields adds unnecessary overhead during
+ class loading. Accessing data values (e.g. integers and strings)
+ directly as C types is better.
+ <li>Bytecode verification is necessary, but slow, so we want to verify
+ as much as possible outside app execution.
+ <li>Bytecode optimization (quickened instructions, method pruning) is
+ important for speed and battery life.
+ <li>For security reasons, processes may not edit shared code.
+</ul>
+
+<p>
+The typical VM implementation uncompresses individual classes from a
+compressed archive and stores them on the heap. This implies a separate
+copy of each class in every process, and slows application startup because
+the code must be uncompressed (or at least read off disk in many small
+pieces). On the other hand, having the bytecode on the local heap makes
+it easy to rewrite instructions on first use, facilitating a number of
+different optimizations.
+<p>
+The goals led us to make some fundamental decisions:
+
+<ul>
+ <li>Multiple classes are aggregated into a single "DEX" file.
+ <li>DEX files are mapped read-only and shared between processes.
+ <li>Byte ordering and word alignment are adjusted to suit the local
+ system.
+ <li>Bytecode verification is mandatory for all classes, but we want
+ to "pre-verify" whatever we can.
+ <li>Optimizations that require rewriting bytecode must be done ahead
+ of time.
+</ul>
+
+<p>
+The consequences of these decisions are explained in the following sections.
+
+
+<h2>VM Operation</h2>
+
+<p>
+Application code is delivered to the system in a <code>.jar</code>
+or <code>.apk</code> file. These are really just <code>.zip</code>
+archives with some meta-data files added. The Dalvik DEX data file
+is always called <code>classes.dex</code>.
+<p>
+The bytecode cannot be memory-mapped and executed directly from the zip
+file, because the data is compressed and the start of the file is not
+guaranteed to be word-aligned. These problems could be addressed by
+storing <code>classes.dex</code> without compression and padding out the zip
+file, but that would increase the size of the package sent across the
+data network.
+<p>
+We need to extract <code>classes.dex</code> from the zip archive before
+we can use it. While we have the file available, we might as well perform
+some of the other actions (realignment, optimization, verification) described
+earlier. This raises a new question however: who is responsible for doing
+this, and where do we keep the output?
+
+<h3>Preparation</h3>
+
+<p>
+There are at least three different ways to create a "prepared" DEX file,
+sometimes known as "ODEX" (for Optimized DEX):
+<ol>
+ <li>The VM does it "just in time". The output goes into a special
+ <code>dalvik-cache</code> directory. This works on the desktop and
+ engineering-only device builds where the permissions on the
+ <code>dalvik-cache</code> directory are not restricted. On production
+ devices, this is not allowed.
+ <li>The system installer does it when an application is first added.
+ It has the privileges required to write to <code>dalvik-cache</code>.
+ <li>The build system does it ahead of time. The relevant <code>jar</code>
+ / <code>apk</code> files are present, but the <code>classes.dex</code>
+ is stripped out. The optimized DEX is stored next to the original
+ zip archive, not in <code>dalvik-cache</code>, and is part of the
+ system image.
+</ol>
+<p>
+The <code>dalvik-cache</code> directory is more accurately
+<code>$ANDROID_DATA/data/dalvik-cache</code>. The files inside it have
+names derived from the full path of the source DEX. On the device the
+directory is owned by <code>system</code> / <code>system</code>
+and has 0771 permissions, and the optimized DEX files stored there are
+owned by <code>system</code> and the
+application's group, with 0644 permissions. DRM-locked applications will
+use 0640 permissions to prevent other user applications from examining them.
+The bottom line is that you can read your own DEX file and those of most
+other applications, but you cannot create, modify, or remove them.
+<p>
+Preparation of the DEX file for the "just in time" and "system installer"
+approaches proceeds in three steps:
+<p>
+First, the dalvik-cache file is created. This must be done in a process
+with appropriate privileges, so for the "system installer" case this is
+done within <code>installd</code>, which runs as root.
+<p>
+Second, the <code>classes.dex</code> entry is extracted from the zip
+archive. A small amount of space is left at the start of the file for
+the ODEX header.
+<p>
+Third, the file is memory-mapped for easy access and tweaked for use on
+the current system. This includes byte-swapping and structure realigning,
+but no meaningful changes to the DEX file. We also do some basic
+structure checks, such as ensuring that file offsets and data indices
+fall within valid ranges.
+<p>
+The build system uses a hairy process that involves starting the
+emulator, forcing just-in-time optimization of all relevant DEX files,
+and then extracting the results from <code>dalvik-cache</code>. The
+reasons for doing this, rather than using a tool that runs on the desktop,
+will become more apparent when the optimizations are explained.
+<p>
+Once the code is byte-swapped and aligned, we're ready to go. We append
+some pre-computed data, fill in the ODEX header at the start of the file,
+and start executing. (The header is filled in last, so that we don't
+try to use a partial file.) If we're interested in verification and
+optimization, however, we need to insert a step after the initial prep.
+
+<h3>dexopt</h3>
+
+<p>
+We want to verify and optimize all of the classes in the DEX file. The
+easiest and safest way to do this is to load all of the classes into
+the VM and run through them. Anything that fails to load is simply not
+verified or optimized. Unfortunately, this can cause allocation of some
+resources that are difficult to release (e.g. loading of native shared
+libraries), so we don't want to do it in the same virtual machine that
+we're running applications in.
+<p>
+The solution is to invoke a program called <code>dexopt</code>, which
+is really just a back door into the VM. It performs an abbreviated VM
+initialization, loads zero or more DEX files from the bootstrap class
+path, and then sets about verifying and optimizing whatever it can from
+the target DEX. On completion, the process exits, freeing all resources.
+<p>
+It is possible for multiple VMs to want the same DEX file at the same
+time. File locking is used to ensure that dexopt is only run once.
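+<p>
+The idea can be sketched in Python as follows. This is only an
+illustration: it assumes an advisory <code>flock</code>-style lock on a
+separate lock file, and the names <code>prepare_once</code> and
+<code>run_dexopt</code> are hypothetical. The VM's actual locking scheme
+is not described here.

```python
import fcntl
import os


def prepare_once(lock_path, output_path, run_dexopt):
    """Run run_dexopt(output_path) at most once across cooperating processes."""
    with open(lock_path, "w") as lock_file:
        # Blocks until any other process holding the lock releases it.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            if os.path.exists(output_path):
                return False  # another process already produced the output
            run_dexopt(output_path)
            return True
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

+<p>
+A process that loses the race blocks on the lock, then finds the output
+already present and skips the work.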
+
+
+<h2>Verification</h2>
+
+<p>
+The bytecode verification process involves scanning through the instructions
+in every method in every class in a DEX file. The goal is to identify
+illegal instruction sequences so that we don't have to check for them at
+run time. Many of the computations involved are also necessary for "exact"
+garbage collection. See
+<a href="verifier.html">Dalvik Bytecode Verifier Notes</a> for more
+information.
+<p>
+For performance reasons, the optimizer (described in the next section)
+assumes that the verifier has run successfully, and makes some potentially
+unsafe assumptions. By default, Dalvik insists upon verifying all classes,
+and only optimizes classes that have been verified. If you want to
+disable the verifier, you can use command-line flags to do so. See also
+<a href="embedded-vm-control.html">Controlling the Embedded VM</a>
+for instructions on controlling these
+features within the Android application framework.
+<p>
+Reporting of verification failures is a tricky issue. For example,
+calling a package-scope method on a class in a different package is
+illegal and will be caught by the verifier. We don't necessarily want
+to report it during verification, though -- we actually want to throw
+an exception when the method call is attempted. Checking the access
+flags on every method call is expensive, however. The
+<a href="verifier.html">Dalvik Bytecode Verifier Notes</a> document
+addresses this issue.
+<p>
+Classes that have been verified successfully have a flag set in the ODEX.
+They will not be re-verified when loaded. The Linux access permissions
+are expected to prevent tampering; if you can get around those, installing
+faulty bytecode is far from the easiest line of attack. The ODEX file has
+a 32-bit checksum, but that's chiefly present as a quick check for
+corrupted data.
+
+
+<h2>Optimization</h2>
+
+<p>
+Virtual machine interpreters typically perform certain optimizations the
+first time a piece of code is used. Constant pool references are replaced
+with pointers to internal data structures, and operations that always succeed
+or always work a certain way are replaced with simpler forms. Some of
+these require information only available at runtime; others can be inferred
+statically when certain assumptions are made.
+<p>
+The Dalvik optimizer does the following:
+<ul>
+ <li>For virtual method calls, replace the method index with a
+ vtable index.
+ <li>For instance field get/put, replace the field index with
+ a byte offset. Also, merge the boolean / byte / char / short
+ variants into a single 32-bit form (less code in the interpreter
+ means more room in the CPU I-cache).
+ <li>Replace a handful of high-volume calls, like String.length(),
+ with "inline" replacements. This skips the usual method call
+ overhead, directly switching from the interpreter to a native
+ implementation.
+ <li>Prune empty methods. The simplest example is
+ <code>Object.&lt;init&gt;</code>, which does nothing, but must be
+ called whenever any object is allocated. The instruction is
+ replaced with a new version that acts as a no-op unless a debugger
+ is attached.
+ <li>Append pre-computed data. For example, the VM wants to have a
+ hash table for lookups on class name. Instead of computing this
+ when the DEX file is loaded, we can compute it now, saving heap
+ space and computation time in every VM where the DEX is loaded.
+</ul>
+
+<p>
+All of the instruction modifications involve replacing the opcode with
+one not defined by the Dalvik specification. This allows us to freely
+mix optimized and unoptimized instructions. The set of optimized
+instructions, and their exact representation, is tied closely to the VM
+version.
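+<p>
+A toy Python sketch of this kind of rewrite is shown below. It models
+instructions as <code>(opcode, operand)</code> pairs, which is not the
+real encoding, and the name <code>invoke-virtual-quick</code> is used
+purely for illustration, since the actual set of optimized opcodes
+depends on the VM version. Calls whose target cannot be resolved are
+left in their original form, so both kinds end up mixed in one method.

```python
def quicken(instructions, vtable_index_of):
    """Replace resolvable virtual calls with an index-based form.

    instructions: list of (opcode, operand) pairs (a simplified model).
    vtable_index_of: dict mapping a method reference to its vtable index.
    """
    out = []
    for opcode, operand in instructions:
        if opcode == "invoke-virtual" and operand in vtable_index_of:
            # Illustrative opcode name; the operand is now a raw index.
            out.append(("invoke-virtual-quick", vtable_index_of[operand]))
        else:
            # Unresolved or unrelated instructions are left untouched.
            out.append((opcode, operand))
    return out
```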
+<p>
+Most of the optimizations are obvious "wins". The use of raw indices
+and offsets not only allows us to execute more quickly, it also lets us
+skip the initial symbolic resolution. Pre-computation eats up
+disk space, however, and so must be done in moderation.
+<p>
+There are a couple of potential sources of trouble with these
+optimizations. First, vtable indices and byte offsets are subject to
+change if the VM is updated. Second, if a superclass is in a different
+DEX, and that other DEX is updated, we need to ensure that our optimized
+indices and offsets are updated as well. A similar but more subtle
+problem emerges when user-defined class loaders are employed: the class
+we actually call may not be the one we expected to call.
+<p>These problems are addressed with dependency lists and some limitations
+on what can be optimized.
+
+
+<h2>Dependencies and Limitations</h2>
+
+<p>
+The optimized DEX file includes a list of dependencies on other DEX files,
+plus the CRC-32 and modification date from the originating
+<code>classes.dex</code> zip file entry. The dependency list includes the
+full path to the <code>dalvik-cache</code> file, and the file's SHA-1
+signature. The timestamps of files on the device are unreliable and
+not used. The dependency area also includes the VM version number.
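+<p>
+A staleness check over this information might look like the Python
+sketch below. The record layout, the field names, and the
+<code>read_file</code> callback are all assumptions made for the example,
+as is the choice to compare the SHA-1 against a digest of the whole
+dependency file. The real ODEX dependency area may differ on every one of
+these points.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class OdexDependencies:
    source_crc32: int      # CRC-32 of the classes.dex zip entry
    source_mod_date: int   # modification date of that zip entry
    vm_version: int
    # (dalvik-cache path, SHA-1 digest) for each DEX we depend on
    files: list = field(default_factory=list)


def is_stale(recorded, source_crc32, source_mod_date, vm_version, read_file):
    """Return True if the optimized DEX must be regenerated."""
    if (recorded.source_crc32, recorded.source_mod_date) != (source_crc32, source_mod_date):
        return True
    if recorded.vm_version != vm_version:
        return True
    for path, sha1 in recorded.files:
        # Assumption: the signature is a digest of the file contents.
        if hashlib.sha1(read_file(path)).digest() != sha1:
            return True
    return False
```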
+<p>
+An optimized DEX is dependent upon all of the DEX files in the bootstrap
+class path. DEX files that are part of the bootstrap class path depend
+upon the DEX files that appeared earlier. To ensure that nothing outside
+the dependent DEX files is available, <code>dexopt</code> only loads the
+bootstrap classes. References to classes in other DEX files fail, which
+causes class loading and/or verification to fail, and classes with
+external dependencies are simply not optimized.
+<p>
+This means that splitting code out into many separate DEX files has a
+disadvantage: virtual method calls and instance field lookups between
+non-boot DEX files can't be optimized. Because verification is pass/fail
+with class granularity, no method in a class that has any reliance on
+classes in external DEX files can be optimized. This may be a bit
+heavy-handed, but it's the only way to guarantee that nothing breaks
+when individual pieces are updated.
+<p>
+Another negative consequence: any change to a bootstrap DEX will result
+in rejection of all optimized DEX files. This makes it hard to keep
+system updates small.
+<p>
+Despite our caution, there is still a possibility that a class in a DEX
+file loaded by a user-defined class loader could ask for a bootstrap class
+(say, String) and be given a different class with the same name. If a
+class in the DEX file being processed has the same name as a class in the
+bootstrap DEX files, the class will be flagged as ambiguous and references
+to it will not be resolved during verification / optimization. The class
+linking code in the VM does additional checks to plug another hole;
+see the verbose description in the VM sources for details (vm/oo/Class.c).
+<p>
+If one of the dependencies is updated, we need to re-verify and
+re-optimize the DEX file. If we can do a just-in-time <code>dexopt</code>
+invocation, this is easy. If we have to rely on the installer daemon, or
+the DEX was shipped only in ODEX, then the VM has to reject the DEX.
+<p>
+The output of <code>dexopt</code> is byte-swapped and struct-aligned
+for the host, and contains indices and offsets that are highly VM-specific
+(both version-wise and platform-wise). For this reason it's tricky to
+write a version of <code>dexopt</code> that runs on the desktop but
+generates output suitable for a particular device. The safest way to
+invoke it is on the target device, or on an emulator for that device.
+
+
+<h2>Generated DEX</h2>
+
+<p>
+Some languages and frameworks rely on the ability to generate bytecode
+and execute it. The rather heavy <code>dexopt</code> verification and
+optimization model doesn't work well with that.
+<p>
+We intend to support this in a future release, but the exact method is
+to be determined. We may allow individual classes to be added or whole
+DEX files; may allow Java bytecode or Dalvik bytecode in instructions;
+may perform the usual set of optimizations, or use a separate interpreter
+that performs on-first-use optimizations directly on the bytecode (which
+won't be mapped read-only, since it's locally defined).
+
+<address>Copyright © 2008 The Android Open Source Project</address>
+
+</body>
+</html>
diff --git a/docs/embedded-vm-control.html b/docs/embedded-vm-control.html
new file mode 100644
index 0000000..0f39fbe
--- /dev/null
+++ b/docs/embedded-vm-control.html
@@ -0,0 +1,232 @@
+<html>
+<head>
+ <title>Controlling the Embedded VM</title>
+ <link rel=stylesheet href="android.css">
+</head>
+
+<body>
+<h1>Controlling the Embedded VM</h1>
+
+<ul>
+ <li><a href="#overview">Overview</a>
+ <li><a href="#checkjni">Extended JNI Checks</a>
+ <li><a href="#assertions">Assertions</a>
+ <li><a href="#verifier">Bytecode Verification and Optimization</a>
+ <li><a href="#execmode">Execution Mode</a>
+ <li><a href="#dp">Deadlock Prediction</a>
+ <li><a href="#stackdump">Stack Dumps</a>
+</ul>
+
+<h2><a name="overview">Overview</a></h2>
+
+<p>The Dalvik VM supports a variety of command-line arguments
+(use <code>adb shell dalvikvm -help</code> to get a summary), but
+it's not possible to pass arbitrary arguments through the
+Android application runtime. It is, however, possible to affect the
+VM behavior through certain system properties.
+
+<p>For all of the features described below, you would set the system property
+with <code>setprop</code>,
+issuing a shell command on the device like this:
+<pre>adb shell setprop &lt;name&gt; &lt;value&gt;</pre>
+
+<p>The Android runtime must be restarted before the changes will take
+effect (<code>adb shell stop; adb shell start</code>). This is because the
+settings are processed in the "zygote" process, which starts early and stays
+around "forever".
+
+<p>You could also add a line to <code>/data/local.prop</code> that looks like:
+<pre>&lt;name&gt; = &lt;value&gt;</pre>
+
+<p>Such changes will survive reboots, but will be removed by anything
+that wipes the data partition. (Hint: create a <code>local.prop</code>
+on your workstation, then <code>adb push local.prop /data</code>.)
+
+
+<h2><a name="checkjni">Extended JNI Checks</a></h2>
+
+<p>JNI, the Java Native Interface, provides a way for code written in the
+Java programming language
+to interact with native (C/C++) code. The extended JNI checks will cause
+the system to run more slowly, but they can spot a variety of nasty bugs
+before they have a chance to cause problems.
+
+<p>There are two system properties that affect this feature, which is
+enabled with the <code>-Xcheck:jni</code> command-line argument. The
+first is <code>ro.kernel.android.checkjni</code>. This is set by the
+Android build system for development builds. (It may also be set by
+the Android emulator unless the <code>-nojni</code> flag is provided on the
+emulator command line.) Because this is an "ro." property, the value cannot
+be changed once the device has started.
+
+<p>To allow toggling of the CheckJNI flag, a second
+property, <code>dalvik.vm.checkjni</code>, is also checked. The value
+of this overrides the value from <code>ro.kernel.android.checkjni</code>.
+
+<p>If neither property is defined, or <code>dalvik.vm.checkjni</code>
+is set to <code>false</code>, the <code>-Xcheck:jni</code> flag is
+not passed in, and JNI checks will be disabled.
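+
+<p>The precedence between the two properties can be summarized with the
+small Python sketch below. It is a model of the rules above, not the
+framework's code: the function name is hypothetical, and the way values
+other than <code>true</code> and <code>false</code> are interpreted
+is an assumption.

```python
def checkjni_enabled(props):
    """Decide whether -Xcheck:jni would be passed.

    props is a dict mapping property names to string values.
    """
    override = props.get("dalvik.vm.checkjni")
    if override is not None:
        # dalvik.vm.checkjni overrides ro.kernel.android.checkjni.
        # Assumption: anything other than "true" leaves checks off.
        return override == "true"
    # Assumption: an empty, "0", or "false" value counts as not set.
    return props.get("ro.kernel.android.checkjni", "") not in ("", "0", "false")
```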
+
+<p>To enable JNI checking:
+<pre>adb shell setprop dalvik.vm.checkjni true</pre>
+
+<p>For more information about JNI checks, see
+<a href="jni-tips.html">JNI Tips</a>.
+
+
+<h2><a name="assertions">Assertions</a></h2>
+
+<p>The Dalvik VM supports the Java programming language "assert" statement.
+By default assertions are off, but the <code>dalvik.vm.enableassertions</code>
+property provides a way to set the value for a <code>-ea</code> argument.
+
+<p>The argument behaves the same as it does in other desktop VMs. You
+can provide a class name, a package name (followed by "..."), or the
+special value "all".
+
+<p>For example, this:
+<pre>adb shell setprop dalvik.vm.enableassertions all</pre>
+enables assertions in all non-system classes.
+
+<p>The system property is much more limited than the full command line.
+It is not possible to specify more than one <code>-ea</code> entry, and there
+is no way to specify a <code>-da</code> entry. There is presently no
+equivalent for <code>-esa</code>/<code>-dsa</code>.
+
+
+<h2><a name="verifier">Bytecode Verification and Optimization</a></h2>
+
+<p>The system tries to pre-verify all classes in a DEX file to reduce
+class load overhead, and performs a series of optimizations to improve
+runtime performance. Both of these are done by the <code>dexopt</code>
+command, either in the build system or by the installer. On a development
+device, <code>dexopt</code> may be run the first time a DEX file is used
+and whenever it or one of its dependencies is updated ("just-in-time"
+optimization and verification).
+
+<p>There are two command-line flags that control the just-in-time
+verification and optimization,
+<code>-Xverify</code> and <code>-Xdexopt</code>. The Android framework
+configures these based on the <code>dalvik.vm.verify-bytecode</code>
+property.
+
+<p>If you set:
+<pre>adb shell setprop dalvik.vm.verify-bytecode true</pre>
+then the framework will pass <code>-Xverify:all -Xdexopt:verified</code>
+to the VM. This enables verification, and only optimizes classes that
+successfully verified. This is the safest setting, and is the default.
+<p>If <code>dalvik.vm.verify-bytecode</code> is set to <code>false</code>,
+the framework passes <code>-Xverify:none -Xdexopt:verified</code> to disable
+verification. (We could pass in <code>-Xdexopt:all</code>, but that wouldn't
+necessarily optimize more of the code, since classes that fail
+verification may well be skipped by the optimizer for the same reasons.)
+Classes will not be verified by <code>dexopt</code>, and unverified code
+will be loaded and executed.
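+
+<p>Expressed as code, the mapping from the property to the VM flags is
+roughly the following Python sketch. The function name is hypothetical,
+and treating every value other than <code>false</code> like the default
+is an assumption.

```python
def verifier_flags(verify_bytecode="true"):
    """Map dalvik.vm.verify-bytecode to the flags passed to the VM."""
    if verify_bytecode == "false":
        # Verification is off; the optimizer setting is left unchanged.
        return ["-Xverify:none", "-Xdexopt:verified"]
    # "true" is the default and the safest setting.
    return ["-Xverify:all", "-Xdexopt:verified"]
```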
+
+<p>Enabling verification will make the <code>dexopt</code> command
+take significantly longer, because the verification process is fairly slow.
+Once the verified and optimized DEX files have been prepared, verification
+incurs no additional overhead except when loading classes that failed
+to pre-verify.
+
+<p>If your DEX files are processed with verification disabled, and you
+later turn the verifier on, application loading will be noticeably
+slower (perhaps 40% or more) as classes are verified on first use.
+
+<p>For best results you should force a re-dexopt of all DEX files when
+this property changes. You can do this with:
+<pre>adb shell "rm /data/dalvik-cache/*"</pre>
+This removes the cached versions of the DEX files. Remember to
+stop and restart the runtime (<code>adb shell stop; adb shell start</code>).
+
+
+<h2><a name="execmode">Execution Mode</a></h2>
+
+<p>The current implementation of the Dalvik VM includes three distinct
+interpreter cores. These are referred to as "fast", "portable", and
+"debug". The "fast" interpreter is optimized for the current
+platform, and might consist of hand-optimized assembly routines. In
+contrast, the "portable" interpreter is written in C and expected to
+run on a broad range of platforms. The "debug" interpreter is a variant
+of "portable" that includes support for profiling and single-stepping.
+
+<p>The VM allows you to choose between "fast" and "portable" with an
+extended form of the <code>-Xint</code> argument. The value of this
+argument can be set through the <code>dalvik.vm.execution-mode</code>
+system property.
+
+<p>To select the "portable" interpreter, you would use:
+<pre>adb shell setprop dalvik.vm.execution-mode int:portable</pre>
+If the property is not specified, the most appropriate interpreter
+will be selected automatically. At some point this mechanism may allow
+selection of other modes, such as JIT compilation.
+
+<p>Not all platforms have an optimized implementation. In such cases,
+the "fast" interpreter is generated as a series of C stubs, and the
+result will be slower than the
+"portable" version. (When we have optimized versions for all popular
+architectures the naming convention will be more accurate.)
+
+<p>If profiling is enabled or a debugger is attached, the VM
+switches to the "debug" interpreter. When profiling ends or the debugger
+disconnects, the original interpreter is resumed. (The "debug" interpreter
+is substantially slower, something to keep in mind when evaluating
+profiling data.)
+
+
+<h2><a name="dp">Deadlock Prediction</a></h2>
+
+<p>If the VM is built with <code>WITH_DEADLOCK_PREDICTION</code>, the deadlock
+predictor can be enabled with the <code>-Xdeadlockpredict</code> argument.
+(The output from <code>dalvikvm -help</code> will tell you if the VM was
+built appropriately -- look for <code>deadlock_prediction</code> on the
+<code>Configured with:</code> line.)
+This feature tells the VM to keep track of the order in which object
+monitor locks are acquired. If the program attempts to acquire a set
+of locks in a different order from what was seen earlier, the VM logs
+a warning and optionally throws an exception.
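+
+<p>The lock-ordering idea can be sketched in a few lines of Python. This
+is a simplified model and not the VM's implementation: the class and
+method names are hypothetical, it follows a single thread's held locks,
+and it only notices direct inversions between two locks.

```python
class LockOrderTracker:
    """Record lock acquisition order and flag inversions."""

    def __init__(self):
        self.before = set()  # (a, b): a was held when b was acquired
        self.held = []       # locks currently held, oldest first

    def acquire(self, lock):
        """Return the held locks that were previously taken after `lock`."""
        if lock in self.held:
            # Recursive acquisition does not change the ordering.
            self.held.append(lock)
            return []
        conflicts = [h for h in self.held if (lock, h) in self.before]
        for h in self.held:
            self.before.add((h, lock))
        self.held.append(lock)
        return conflicts

    def release(self, lock):
        self.held.remove(lock)
```

+<p>A non-empty result from <code>acquire</code> corresponds to the point
+where the VM would log a warning or throw.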
+
+<p>The command-line argument is set based on the
+<code>dalvik.vm.deadlock-predict</code> property. Valid values are
+<code>off</code> to disable it (default), <code>warn</code> to log the
+problem but continue executing, <code>err</code> to cause a
+<code>dalvik.system.PotentialDeadlockError</code> to be thrown from the
+<code>monitor-enter</code> instruction, and <code>abort</code> to have
+the entire VM abort.
+
+<p>You will usually want to use:
+<pre>adb shell setprop dalvik.vm.deadlock-predict err</pre>
+unless you are keeping an eye on the logs as they scroll by.
+
+<p>Please note that this feature is deadlock prediction, not deadlock
+detection -- in the current implementation, the computations are performed
+after the lock is acquired (this simplifies the code, reducing the
+overhead added to every mutex operation). You can spot a deadlock in a
+hung process by sending a <code>kill -3</code> and examining the stack
+trace written to the log.
+
+<p>This only takes monitors into account. Native mutexes and other resources
+can also be the cause of deadlocks.
+
+
+<h2><a name="stackdump">Stack Dumps</a></h2>
+
+<p>Like other desktop VMs, when the Dalvik VM receives a SIGQUIT
+(Ctrl-\ or <code>kill -3</code>), it dumps stack traces for all threads.
+By default this goes to the Android log, but it can also be written to a file.
+
+<p>The <code>dalvik.vm.stack-trace-file</code> property allows you to
+specify the name of the file where the thread stack traces will be written.
+The file will be created (world writable) if it doesn't exist, and the
+new information will be appended to the end of the file. The filename
+is passed into the VM via the <code>-Xstacktracefile</code> argument.
+
+<p>For example:
+<pre>adb shell setprop dalvik.vm.stack-trace-file /tmp/stack-traces.txt</pre>
+
+<p>If the property is not defined, the VM will write the stack traces to
+the Android log when the signal arrives.
+
+<address>Copyright © 2008 The Android Open Source Project</address>
+
+</body></html>
diff --git a/docs/instruction-formats.css b/docs/instruction-formats.css
new file mode 100644
index 0000000..ee23c5c
--- /dev/null
+++ b/docs/instruction-formats.css
@@ -0,0 +1,129 @@
+h1 {
+ font-family: serif;
+ color: #222266;
+}
+
+h2 {
+ font-family: serif;
+ border-top-style: solid;
+ border-top-width: 2px;
+ border-color: #ccccdd;
+ padding-top: 12px;
+ margin-top: 48px;
+ margin-bottom: 2px;
+ color: #222266;
+}
+
+h3 {
+ font-family: serif;
+ color: #222266;
+}
+
+@media print {
+ table {
+ font-size: 8pt;
+ }
+}
+
+@media screen {
+ table {
+ font-size: 10pt;
+ }
+}
+
+table th {
+ font-family: sans-serif;
+ background: #aaaaff;
+}
+
+table {
+ border-collapse: collapse;
+}
+
+table td {
+ font-family: sans-serif;
+ border-top-style: solid;
+ border-bottom-style: solid;
+ border-width: 1px;
+ border-color: #aaaaff;
+ padding-top: 4px;
+ padding-bottom: 4px;
+ padding-left: 2px;
+ padding-right: 2px;
+ background: #eeeeff;
+}
+
+
+/* the mnemonic guide */
+
+table.letters {
+ margin-top: 24px;
+ margin-bottom: 24px;
+ margin-left: 48px;
+ margin-right: 48px;
+}
+
+table.letters td:first-child {
+ font-family: monospace;
+ width: 10%;
+ text-align: center;
+}
+
+table.letters td:first-child + td {
+ width: 10%;
+ text-align: center;
+}
+
+table.letters td:first-child + td + td {
+ width: 80%;
+}
+
+
+/* the formats, per se */
+
+table.format {
+ background: #aaaaaa;
+ border-collapse: collapse;
+ margin-top: 24px;
+ margin-bottom: 24px;
+ margin-left: 48px;
+ margin-right: 48px;
+}
+
+table.format td {
+ font-family: monospace;
+}
+
+table.format td + td i {
+ font-family: sans-serif;
+}
+
+table.format td sub {
+ font-family: sans-serif;
+}
+
+table.format td sub {
+ font-family: sans-serif;
+ font-style: italic;
+ font-size: 70%
+}
+
+table.format th:first-child {
+ width: 28%;
+}
+
+table.format th:first-child + th {
+ width: 5%;
+}
+
+table.format th:first-child + th + th {
+ width: 45%;
+}
+
+table.format th:first-child + th + th + th {
+ width: 22%;
+}
+
+table.format p {
+ margin-bottom: 0pt;
+}
\ No newline at end of file
diff --git a/docs/instruction-formats.html b/docs/instruction-formats.html
new file mode 100644
index 0000000..941689e
--- /dev/null
+++ b/docs/instruction-formats.html
@@ -0,0 +1,430 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+
+<html>
+
+<head>
+<title>Dalvik VM Instruction Formats</title>
+<link rel="stylesheet" href="instruction-formats.css">
+</head>
+
+<body>
+
+<h1>Dalvik VM Instruction Formats</h1>
+<p>Copyright © 2007 The Android Open Source Project
+
+<h2>Introduction and Overview</h2>
+
+<p>This document lists the instruction formats used by Dalvik bytecode
+and is meant to be used in conjunction with the
+<a href="dalvik-bytecode.html">bytecode reference document</a>.</p>
+
+<h3>Bitwise descriptions</h3>
+
+<p>The first column in the format table lists the bitwise layout of
+the format. It consists of one or more space-separated "words" each of
+which describes a 16-bit code unit. Each character in a word
+represents four bits, read from high bits to low, with vertical bars
+("<code>|</code>") interspersed to aid in reading. Uppercase letters
+in sequence from "<code>A</code>" are used to indicate fields within
+the format (which then get defined further by the syntax column). The term
+"<code>op</code>" is used to indicate the position of the eight-bit
+opcode within the format. A slashed zero ("<code>Ø</code>") is
+used to indicate that all bits should be zero in the indicated
+position.</p>
+
+<p>For example, the format "<code>B|A|<i>op</i> CCCC</code>" indicates
+that the format consists of two 16-bit code units. The first word
+consists of the opcode in the low eight bits and a pair of four-bit
+values in the high eight bits; and the second word consists of a single
+16-bit value.</p>
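+
+<p>As a rough illustration (a sketch, not code taken from the VM), the
+fields of that format could be unpacked as shown below. The names
+<code>Decoded22</code> and <code>decode22</code> are hypothetical and
+exist only for this example.</p>
+
+<pre>
+#include &lt;cstdint&gt;
+
+// Hypothetical holder for the fields of a "B|A|op CCCC" instruction.
+struct Decoded22 {
+    uint8_t  op;  // low eight bits of the first code unit
+    uint8_t  a;   // bits 8..11 of the first code unit
+    uint8_t  b;   // bits 12..15 of the first code unit
+    uint16_t c;   // the entire second code unit
+};
+
+// Unpacks two 16-bit code units laid out as "B|A|op CCCC".
+Decoded22 decode22(const uint16_t units[2]) {
+    Decoded22 d;
+    d.op = units[0] &amp; 0xff;
+    d.a  = (units[0] &gt;&gt; 8) &amp; 0x0f;
+    d.b  = (units[0] &gt;&gt; 12) &amp; 0x0f;
+    d.c  = units[1];
+    return d;
+}
+</pre>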
+
+<h3>Format IDs</h3>
+
+<p>The second column in the format table indicates the short identifier
+for the format, which is used in other documents and in code to identify
+the format.</p>
+
+<p>Format IDs consist of three characters, two digits followed by a
+letter. The first digit indicates the number of 16-bit code units in the
+format. The second digit indicates the maximum number of registers that the
+format contains (maximum, since some formats can accommodate a variable
+number of registers), with the special designation "<code>r</code>" indicating
+that a range of registers is encoded. The final letter semi-mnemonically
+indicates the type of any extra data encoded by the format. For example,
+format "<code>21t</code>" is of length two, contains one register reference,
+and additionally contains a branch target.</p>
+
+<p>Suggested static linking formats have an additional "<code>s</code>" suffix,
+making them four characters total.</p>
+
+<p>The full list of typecode letters is as follows. Note that some
+forms have different sizes, depending on the format:</p>
+
+<table class="letters">
+<thead>
+<tr>
+ <th>Mnemonic</th>
+ <th>Bit Sizes</th>
+ <th>Meaning</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>b</td>
+ <td>8</td>
+ <td>immediate signed <b>b</b>yte</td>
+</tr>
+<tr>
+ <td>c</td>
+ <td>16, 32</td>
+ <td><b>c</b>onstant pool index</td>
+</tr>
+<tr>
+ <td>f</td>
+ <td>16</td>
+ <td>inter<b>f</b>ace constants (only used in statically linked formats)
+ </td>
+</tr>
+<tr>
+ <td>h</td>
+ <td>16</td>
+ <td>immediate signed <b>h</b>at (high-order bits of a 32- or 64-bit
+ value; low-order bits are all <code>0</code>)
+ </td>
+</tr>
+<tr>
+ <td>i</td>
+ <td>32</td>
+ <td>immediate signed <b>i</b>nt, or 32-bit float</td>
+</tr>
+<tr>
+ <td>l</td>
+ <td>64</td>
+ <td>immediate signed <b>l</b>ong, or 64-bit double</td>
+</tr>
+<tr>
+ <td>m</td>
+ <td>16</td>
+ <td><b>m</b>ethod constants (only used in statically linked formats)</td>
+</tr>
+<tr>
+ <td>n</td>
+ <td>4</td>
+ <td>immediate signed <b>n</b>ibble</td>
+</tr>
+<tr>
+ <td>s</td>
+ <td>16</td>
+ <td>immediate signed <b>s</b>hort</td>
+</tr>
+<tr>
+ <td>t</td>
+ <td>8, 16, 32</td>
+ <td>branch <b>t</b>arget</td>
+</tr>
+<tr>
+ <td>x</td>
+ <td>0</td>
+ <td>no additional data</td>
+</tr>
+</tbody>
+</table>
+
+<h3>Syntax</h3>
+
+<p>The third column of the format table indicates the human-oriented
+syntax for instructions which use the indicated format. Each instruction
+starts with the named opcode and is optionally followed by one or
+more arguments, themselves separated with commas.</p>
+
+<p>Wherever an argument refers to a field from the first column, the
+letter for that field is indicated in the syntax, repeated once for
+each four bits of the field. For example, an eight-bit field labeled
+"<code>BB</code>" in the first column would also be labeled
+"<code>BB</code>" in the syntax column.</p>
+
+<p>Arguments which name a register have the form "<code>v<i>X</i></code>".
+The prefix "<code>v</code>" was chosen instead of the more common
+"<code>r</code>" exactly to avoid conflicting with (non-virtual) architectures
+on which a Dalvik virtual machine might be implemented which themselves
+use the prefix "<code>r</code>" for their registers. (That is, this
+decision makes it possible to talk about both virtual and real registers
+together without the need for circumlocution.)</p>
+
+<p>Arguments which indicate a literal value have the form
+"<code>#+<i>X</i></code>". Some formats indicate literals that only
+have non-zero bits in their high-order bits; for these, the zeroes
+are represented explicitly in the syntax, even though they do not
+appear in the bitwise representation.</p>
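+
+<p>A minimal sketch of that expansion for the 16-bit "<code>h</code>"
+literals, assuming the encoded field has already been extracted as a
+16-bit value (the function names are hypothetical):</p>
+
+<pre>
+#include &lt;cstdint&gt;
+
+// #+BBBB0000: the 16 encoded bits become the high half of a 32-bit value;
+// the low 16 bits are zero.
+int32_t expandHat32(uint16_t bbbb) {
+    return static_cast&lt;int32_t&gt;(static_cast&lt;uint32_t&gt;(bbbb) &lt;&lt; 16);
+}
+
+// #+BBBB000000000000: the 16 encoded bits become the top of a 64-bit value;
+// the low 48 bits are zero.
+int64_t expandHat64(uint16_t bbbb) {
+    return static_cast&lt;int64_t&gt;(static_cast&lt;uint64_t&gt;(bbbb) &lt;&lt; 48);
+}
+</pre>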
+
+<p>Arguments which indicate a relative instruction address offset have the
+form "<code>+<i>X</i></code>".</p>
+
+<p>Arguments which indicate a literal constant pool index have the form
+"<code><i>kind</i>@<i>X</i></code>", where "<code><i>kind</i></code>"
+indicates which constant pool is being referred to. Each opcode that
+uses such a format explicitly allows only one kind of constant; see
+the opcode reference to figure out the correspondence. The four
+kinds of constant pool are "<code>string</code>" (string pool index),
+"<code>type</code>" (type pool index), "<code>field</code>" (field
+pool index), and "<code>meth</code>" (method pool index).</p>
+
+<p>Similar to the representation of constant pool indices, there are
+also suggested (optional) forms that indicate prelinked offsets or
+indices. These prelinked values include "<code>vtaboff</code>"
+(vtable offset), "<code>fieldoff</code>" (field offset), and
+"<code>iface</code>" (interface pool index).</p>
+
+<p>In the cases where a format value isn't explicitly part of the syntax
+but instead picks a variant, each variant is listed with the prefix
+"<code>[<i>X</i>=<i>N</i>]</code>" (e.g., "<code>[B=2]</code>") to indicate
+the correspondence.</p>
+
+<h2>The Formats</h2>
+
+<table class="format">
+<thead>
+<tr>
+ <th>Format</th>
+ <th>ID</th>
+ <th>Syntax</th>
+ <th>Notable Opcodes Covered</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+ <td>ØØ|<i>op</i></td>
+ <td>10x</td>
+ <td><i><code>op</code></i></td>
+ <td> </td>
+</tr>
+<tr>
+ <td rowspan="2">B|A|<i>op</i></td>
+ <td>12x</td>
+ <td><i><code>op</code></i> vA, vB</td>
+ <td> </td>
+</tr>
+<tr>
+ <td>11n</td>
+ <td><i><code>op</code></i> vA, #+B</td>
+ <td> </td>
+</tr>
+<tr>
+ <td rowspan="2">AA|<i>op</i></td>
+ <td>11x</td>
+ <td><i><code>op</code></i> vAA</td>
+ <td> </td>
+</tr>
+<tr>
+ <td>10t</td>
+ <td><i><code>op</code></i> +AA</td>
+ <td>goto</td>
+</tr>
+<tr>
+ <td>ØØ|<i>op</i> AAAA</td>
+ <td>20t</td>
+ <td><i><code>op</code></i> +AAAA</td>
+ <td>goto/16</td>
+</tr>
+<tr>
+ <td rowspan="5">AA|<i>op</i> BBBB</td>
+ <td>22x</td>
+ <td><i><code>op</code></i> vAA, vBBBB</td>
+ <td> </td>
+</tr>
+<tr>
+ <td>21t</td>
+ <td><i><code>op</code></i> vAA, +BBBB</td>
+ <td> </td>
+</tr>
+<tr>
+ <td>21s</td>
+ <td><i><code>op</code></i> vAA, #+BBBB</td>
+ <td> </td>
+</tr>
+<tr>
+ <td>21h</td>
+ <td><i><code>op</code></i> vAA, #+BBBB0000<br/>
+ <i><code>op</code></i> vAA, #+BBBB000000000000
+ </td>
+ <td> </td>
+</tr>
+<tr>
+ <td>21c</td>
+ <td><i><code>op</code></i> vAA, type@BBBB<br/>
+ <i><code>op</code></i> vAA, field@BBBB<br/>
+ <i><code>op</code></i> vAA, string@BBBB
+ </td>
+ <td>check-cast<br/>
+ const-class<br/>
+ const-string
+ </td>
+</tr>
+<tr>
+ <td rowspan="2">AA|<i>op</i> CC|BB</td>
+ <td>23x</td>
+ <td><i><code>op</code></i> vAA, vBB, vCC</td>
+ <td> </td>
+</tr>
+<tr>
+ <td>22b</td>
+ <td><i><code>op</code></i> vAA, vBB, #+CC</td>
+ <td> </td>
+</tr>
+<tr>
+ <td rowspan="4">B|A|<i>op</i> CCCC</td>
+ <td>22t</td>
+ <td><i><code>op</code></i> vA, vB, +CCCC</td>
+ <td> </td>
+</tr>
+<tr>
+ <td>22s</td>
+ <td><i><code>op</code></i> vA, vB, #+CCCC</td>
+ <td> </td>
+</tr>
+<tr>
+ <td>22c</td>
+ <td><i><code>op</code></i> vA, vB, type@CCCC<br/>
+ <i><code>op</code></i> vA, vB, field@CCCC
+ </td>
+ <td>instance-of</td>
+</tr>
+<tr>
+ <td>22cs</td>
+ <td><i><code>op</code></i> vA, vB, fieldoff@CCCC</td>
+ <td><i>(suggested format for statically linked field access instructions of
+ format 22c)</i>
+ </td>
+</tr>
+<tr>
+ <td>ØØ|<i>op</i> AAAA<sub>lo</sub> AAAA<sub>hi</sub></td>
+ <td>30t</td>
+ <td><i><code>op</code></i> +AAAAAAAA</td>
+ <td>goto/32</td>
+</tr>
+<tr>
+ <td>ØØ|<i>op</i> AAAA BBBB</td>
+ <td>32x</td>
+ <td><i><code>op</code></i> vAAAA, vBBBB</td>
+ <td> </td>
+</tr>
+<tr>
+ <td rowspan="3">AA|<i>op</i> BBBB<sub>lo</sub> BBBB<sub>hi</sub></td>
+ <td>31i</td>
+ <td><i><code>op</code></i> vAA, #+BBBBBBBB</td>
+ <td> </td>
+</tr>
+<tr>
+ <td>31t</td>
+ <td><i><code>op</code></i> vAA, +BBBBBBBB</td>
+ <td> </td>
+</tr>
+<tr>
+ <td>31c</td>
+ <td><i><code>op</code></i> vAA, string@BBBBBBBB</td>
+ <td>const-string/jumbo</td>
+</tr>
+<tr>
+ <td>B|A|<i>op</i> CCCC G|F|E|D</td>
+ <td>35c</td>
+ <td><i>[<code>B=5</code>] <code>op</code></i> {vD, vE, vF, vG, vA},
+ meth@CCCC<br/>
+ <i>[<code>B=5</code>] <code>op</code></i> {vD, vE, vF, vG, vA},
+ type@CCCC<br/>
+ <i>[<code>B=4</code>] <code>op</code></i> {vD, vE, vF, vG},
+ <i><code>kind</code></i>@CCCC<br/>
+ <i>[<code>B=3</code>] <code>op</code></i> {vD, vE, vF},
+ <i><code>kind</code></i>@CCCC<br/>
+ <i>[<code>B=2</code>] <code>op</code></i> {vD, vE},
+ <i><code>kind</code></i>@CCCC<br/>
+ <i>[<code>B=1</code>] <code>op</code></i> {vD},
+ <i><code>kind</code></i>@CCCC<br/>
+ <i>[<code>B=0</code>] <code>op</code></i> {},
+ <i><code>kind</code></i>@CCCC
+ </td>
+ <td> </td>
+</tr>
+<tr>
+ <td>B|A|<i>op</i> CCCC G|F|E|D</td>
+ <td>35ms</td>
+
+ <td><i>[<code>B=5</code>] <code>op</code></i> {vD, vE, vF, vG, vA},
+ vtaboff@CCCC<br/>
+ <i>[<code>B=4</code>] <code>op</code></i> {vD, vE, vF, vG},
+ vtaboff@CCCC<br/>
+ <i>[<code>B=3</code>] <code>op</code></i> {vD, vE, vF},
+ vtaboff@CCCC<br/>
+ <i>[<code>B=2</code>] <code>op</code></i> {vD, vE},
+ vtaboff@CCCC<br/>
+ <i>[<code>B=1</code>] <code>op</code></i> {vD},
+ vtaboff@CCCC<br/>
+ </td>
+ <td><i>(suggested format for statically linked <code>invoke-virtual</code>
+ and <code>invoke-super</code> instructions of format 35c)</i>
+ </td>
+</tr>
+<tr>
+ <td>B|A|<i>op</i> DDCC H|G|F|E</td>
+ <td>35fs</td>
+ <td><i>[<code>B=5</code>] <code>op</code></i> vB, {vE, vF, vG, vH, vA},
+ vtaboff@CC, iface@DD<br/>
+ <i>[<code>B=4</code>] <code>op</code></i> vB, {vE, vF, vG, vH},
+ vtaboff@CC, iface@DD<br/>
+ <i>[<code>B=3</code>] <code>op</code></i> vB, {vE, vF, vG},
+ vtaboff@CC, iface@DD<br/>
+ <i>[<code>B=2</code>] <code>op</code></i> vB, {vE, vF},
+ vtaboff@CC, iface@DD<br/>
+ <i>[<code>B=1</code>] <code>op</code></i> vB, {vE},
+ vtaboff@CC, iface@DD<br/>
+ </td>
+ <td><i>(suggested format for statically linked <code>invoke-interface</code>
+ instructions of format 35c)</i>
+ </td>
+</tr>
+<tr>
+ <td>AA|<i>op</i> BBBB CCCC</td>
+ <td>3rc</td>
+ <td><i><code>op</code></i> {vCCCC .. vNNNN}, meth@BBBB<br/>
+ <i><code>op</code></i> {vCCCC .. vNNNN}, type@BBBB<br/>
+ <p><i>(where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
+ determines the count <code>0..255</code>, and <code>C</code>
+ determines the first register)</i></p>
+ </td>
+ <td> </td>
+</tr>
+<tr>
+ <td>AA|<i>op</i> BBBB CCCC</td>
+ <td>3rms</td>
+ <td><i><code>op</code></i> {vCCCC .. vNNNN}, vtaboff@BBBB<br/>
+ <p><i>(where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
+ determines the count <code>0..255</code>, and <code>C</code>
+ determines the first register)</i></p>
+ </td>
+ <td><i>(suggested format for statically linked <code>invoke-virtual</code>
+ and <code>invoke-super</code> instructions of format <code>3rc</code>)</i>
+ </td>
+</tr>
+<tr>
+ <td>AA|<i>op</i> CCBB DDDD</td>
+ <td>3rfs</td>
+ <td><i><code>op</code></i> {vDDDD .. vNNNN}, vtaboff@BB,
+ iface@CC<br/>
+ <p><i>(where <code>NNNN = DDDD+AA-1</code>, that is <code>A</code>
+ determines the count <code>0..255</code>, and <code>D</code>
+ determines the first register)</i></p>
+ </td>
+ <td><i>(suggested format for statically linked <code>invoke-interface</code>
+ instructions of format <code>3rc</code>)</i>
+ </td>
+</tr>
+<tr>
+ <td>AA|<i>op</i> BBBB<sub>lo</sub> BBBB BBBB BBBB<sub>hi</sub></td>
+ <td>51l</td>
+ <td><i><code>op</code></i> vAA, #+BBBBBBBBBBBBBBBB</td>
+ <td>const-wide</td>
+</tr>
+</tbody>
+</table>
+
+</body>
+</html>
diff --git a/docs/java-bytecode.css b/docs/java-bytecode.css
new file mode 100644
index 0000000..6075c0d
--- /dev/null
+++ b/docs/java-bytecode.css
@@ -0,0 +1,54 @@
+@media print {
+ table {
+ font-size: 8pt;
+ }
+}
+
+@media screen {
+ table {
+ font-size: 10pt;
+ }
+}
+
+h1 {
+ text-align: center;
+}
+
+table {
+ vertical-align: top;
+ border-collapse: collapse;
+ font-family: sans-serif;
+}
+
+td {
+ vertical-align: top;
+ background: #f8f8f8;
+ border-width: 0;
+}
+
+td.outer {
+ width: 25%;
+ padding: 0;
+}
+
+td.outer table {
+ width: 100%;
+}
+
+td.outer td {
+ border-width: 0;
+ background: #f8f8f8;
+ padding: 1pt;
+ padding-left: 10pt;
+ padding-right: 2pt;
+}
+
+tr.d td {
+ background: #dddddd;
+}
+
+td.outer td + td + td {
+ font-family: monospace;
+ font-weight: bold;
+ padding-right: 5pt;
+}
\ No newline at end of file
diff --git a/docs/java-bytecode.html b/docs/java-bytecode.html
new file mode 100644
index 0000000..691ae54
--- /dev/null
+++ b/docs/java-bytecode.html
@@ -0,0 +1,228 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+
+<html>
+
+<head>
+<title>Java Bytecode At A Glance</title>
+<link rel="stylesheet" href="java-bytecode.css">
+</head>
+
+<body>
+
+<h1>Java Bytecode At A Glance</h1>
+
+<table align="center">
+<tr><td class="outer"><table>
+<tr><td>0x00</td><td>0</td><td>nop</td></tr>
+<tr><td>0x01</td><td>1</td><td>aconst_null</td></tr>
+<tr class="d"><td>0x02</td><td>2</td><td>iconst_m1</td></tr>
+<tr class="d"><td>0x03</td><td>3</td><td>iconst_0</td></tr>
+<tr><td>0x04</td><td>4</td><td>iconst_1</td></tr>
+<tr><td>0x05</td><td>5</td><td>iconst_2</td></tr>
+<tr class="d"><td>0x06</td><td>6</td><td>iconst_3</td></tr>
+<tr class="d"><td>0x07</td><td>7</td><td>iconst_4</td></tr>
+<tr><td>0x08</td><td>8</td><td>iconst_5</td></tr>
+<tr><td>0x09</td><td>9</td><td>lconst_0</td></tr>
+<tr class="d"><td>0x0a</td><td>10</td><td>lconst_1</td></tr>
+<tr class="d"><td>0x0b</td><td>11</td><td>fconst_0</td></tr>
+<tr><td>0x0c</td><td>12</td><td>fconst_1</td></tr>
+<tr><td>0x0d</td><td>13</td><td>fconst_2</td></tr>
+<tr class="d"><td>0x0e</td><td>14</td><td>dconst_0</td></tr>
+<tr class="d"><td>0x0f</td><td>15</td><td>dconst_1</td></tr>
+<tr><td>0x10</td><td>16</td><td>bipush</td></tr>
+<tr><td>0x11</td><td>17</td><td>sipush</td></tr>
+<tr class="d"><td>0x12</td><td>18</td><td>ldc</td></tr>
+<tr class="d"><td>0x13</td><td>19</td><td>ldc_w</td></tr>
+<tr><td>0x14</td><td>20</td><td>ldc2_w</td></tr>
+<tr><td>0x15</td><td>21</td><td>iload</td></tr>
+<tr class="d"><td>0x16</td><td>22</td><td>lload</td></tr>
+<tr class="d"><td>0x17</td><td>23</td><td>fload</td></tr>
+<tr><td>0x18</td><td>24</td><td>dload</td></tr>
+<tr><td>0x19</td><td>25</td><td>aload</td></tr>
+<tr class="d"><td>0x1a</td><td>26</td><td>iload_0</td></tr>
+<tr class="d"><td>0x1b</td><td>27</td><td>iload_1</td></tr>
+<tr><td>0x1c</td><td>28</td><td>iload_2</td></tr>
+<tr><td>0x1d</td><td>29</td><td>iload_3</td></tr>
+<tr class="d"><td>0x1e</td><td>30</td><td>lload_0</td></tr>
+<tr class="d"><td>0x1f</td><td>31</td><td>lload_1</td></tr>
+<tr><td>0x20</td><td>32</td><td>lload_2</td></tr>
+<tr><td>0x21</td><td>33</td><td>lload_3</td></tr>
+<tr class="d"><td>0x22</td><td>34</td><td>fload_0</td></tr>
+<tr class="d"><td>0x23</td><td>35</td><td>fload_1</td></tr>
+<tr><td>0x24</td><td>36</td><td>fload_2</td></tr>
+<tr><td>0x25</td><td>37</td><td>fload_3</td></tr>
+<tr class="d"><td>0x26</td><td>38</td><td>dload_0</td></tr>
+<tr class="d"><td>0x27</td><td>39</td><td>dload_1</td></tr>
+<tr><td>0x28</td><td>40</td><td>dload_2</td></tr>
+<tr><td>0x29</td><td>41</td><td>dload_3</td></tr>
+<tr class="d"><td>0x2a</td><td>42</td><td>aload_0</td></tr>
+<tr class="d"><td>0x2b</td><td>43</td><td>aload_1</td></tr>
+<tr><td>0x2c</td><td>44</td><td>aload_2</td></tr>
+<tr><td>0x2d</td><td>45</td><td>aload_3</td></tr>
+<tr class="d"><td>0x2e</td><td>46</td><td>iaload</td></tr>
+<tr class="d"><td>0x2f</td><td>47</td><td>laload</td></tr>
+<tr><td>0x30</td><td>48</td><td>faload</td></tr>
+<tr><td>0x31</td><td>49</td><td>daload</td></tr>
+<tr class="d"><td>0x32</td><td>50</td><td>aaload</td></tr>
+</table></td>
+<td class="outer"><table>
+<tr><td>0x33</td><td>51</td><td>baload</td></tr>
+<tr><td>0x34</td><td>52</td><td>caload</td></tr>
+<tr class="d"><td>0x35</td><td>53</td><td>saload</td></tr>
+<tr class="d"><td>0x36</td><td>54</td><td>istore</td></tr>
+<tr><td>0x37</td><td>55</td><td>lstore</td></tr>
+<tr><td>0x38</td><td>56</td><td>fstore</td></tr>
+<tr class="d"><td>0x39</td><td>57</td><td>dstore</td></tr>
+<tr class="d"><td>0x3a</td><td>58</td><td>astore</td></tr>
+<tr><td>0x3b</td><td>59</td><td>istore_0</td></tr>
+<tr><td>0x3c</td><td>60</td><td>istore_1</td></tr>
+<tr class="d"><td>0x3d</td><td>61</td><td>istore_2</td></tr>
+<tr class="d"><td>0x3e</td><td>62</td><td>istore_3</td></tr>
+<tr><td>0x3f</td><td>63</td><td>lstore_0</td></tr>
+<tr><td>0x40</td><td>64</td><td>lstore_1</td></tr>
+<tr class="d"><td>0x41</td><td>65</td><td>lstore_2</td></tr>
+<tr class="d"><td>0x42</td><td>66</td><td>lstore_3</td></tr>
+<tr><td>0x43</td><td>67</td><td>fstore_0</td></tr>
+<tr><td>0x44</td><td>68</td><td>fstore_1</td></tr>
+<tr class="d"><td>0x45</td><td>69</td><td>fstore_2</td></tr>
+<tr class="d"><td>0x46</td><td>70</td><td>fstore_3</td></tr>
+<tr><td>0x47</td><td>71</td><td>dstore_0</td></tr>
+<tr><td>0x48</td><td>72</td><td>dstore_1</td></tr>
+<tr class="d"><td>0x49</td><td>73</td><td>dstore_2</td></tr>
+<tr class="d"><td>0x4a</td><td>74</td><td>dstore_3</td></tr>
+<tr><td>0x4b</td><td>75</td><td>astore_0</td></tr>
+<tr><td>0x4c</td><td>76</td><td>astore_1</td></tr>
+<tr class="d"><td>0x4d</td><td>77</td><td>astore_2</td></tr>
+<tr class="d"><td>0x4e</td><td>78</td><td>astore_3</td></tr>
+<tr><td>0x4f</td><td>79</td><td>iastore</td></tr>
+<tr><td>0x50</td><td>80</td><td>lastore</td></tr>
+<tr class="d"><td>0x51</td><td>81</td><td>fastore</td></tr>
+<tr class="d"><td>0x52</td><td>82</td><td>dastore</td></tr>
+<tr><td>0x53</td><td>83</td><td>aastore</td></tr>
+<tr><td>0x54</td><td>84</td><td>bastore</td></tr>
+<tr class="d"><td>0x55</td><td>85</td><td>castore</td></tr>
+<tr class="d"><td>0x56</td><td>86</td><td>sastore</td></tr>
+<tr><td>0x57</td><td>87</td><td>pop</td></tr>
+<tr><td>0x58</td><td>88</td><td>pop2</td></tr>
+<tr class="d"><td>0x59</td><td>89</td><td>dup</td></tr>
+<tr class="d"><td>0x5a</td><td>90</td><td>dup_x1</td></tr>
+<tr><td>0x5b</td><td>91</td><td>dup_x2</td></tr>
+<tr><td>0x5c</td><td>92</td><td>dup2</td></tr>
+<tr class="d"><td>0x5d</td><td>93</td><td>dup2_x1</td></tr>
+<tr class="d"><td>0x5e</td><td>94</td><td>dup2_x2</td></tr>
+<tr><td>0x5f</td><td>95</td><td>swap</td></tr>
+<tr><td>0x60</td><td>96</td><td>iadd</td></tr>
+<tr class="d"><td>0x61</td><td>97</td><td>ladd</td></tr>
+<tr class="d"><td>0x62</td><td>98</td><td>fadd</td></tr>
+<tr><td>0x63</td><td>99</td><td>dadd</td></tr>
+<tr><td>0x64</td><td>100</td><td>isub</td></tr>
+<tr class="d"><td>0x65</td><td>101</td><td>lsub</td></tr>
+</table></td>
+<td class="outer"><table>
+<tr><td>0x66</td><td>102</td><td>fsub</td></tr>
+<tr><td>0x67</td><td>103</td><td>dsub</td></tr>
+<tr class="d"><td>0x68</td><td>104</td><td>imul</td></tr>
+<tr class="d"><td>0x69</td><td>105</td><td>lmul</td></tr>
+<tr><td>0x6a</td><td>106</td><td>fmul</td></tr>
+<tr><td>0x6b</td><td>107</td><td>dmul</td></tr>
+<tr class="d"><td>0x6c</td><td>108</td><td>idiv</td></tr>
+<tr class="d"><td>0x6d</td><td>109</td><td>ldiv</td></tr>
+<tr><td>0x6e</td><td>110</td><td>fdiv</td></tr>
+<tr><td>0x6f</td><td>111</td><td>ddiv</td></tr>
+<tr class="d"><td>0x70</td><td>112</td><td>irem</td></tr>
+<tr class="d"><td>0x71</td><td>113</td><td>lrem</td></tr>
+<tr><td>0x72</td><td>114</td><td>frem</td></tr>
+<tr><td>0x73</td><td>115</td><td>drem</td></tr>
+<tr class="d"><td>0x74</td><td>116</td><td>ineg</td></tr>
+<tr class="d"><td>0x75</td><td>117</td><td>lneg</td></tr>
+<tr><td>0x76</td><td>118</td><td>fneg</td></tr>
+<tr><td>0x77</td><td>119</td><td>dneg</td></tr>
+<tr class="d"><td>0x78</td><td>120</td><td>ishl</td></tr>
+<tr class="d"><td>0x79</td><td>121</td><td>lshl</td></tr>
+<tr><td>0x7a</td><td>122</td><td>ishr</td></tr>
+<tr><td>0x7b</td><td>123</td><td>lshr</td></tr>
+<tr class="d"><td>0x7c</td><td>124</td><td>iushr</td></tr>
+<tr class="d"><td>0x7d</td><td>125</td><td>lushr</td></tr>
+<tr><td>0x7e</td><td>126</td><td>iand</td></tr>
+<tr><td>0x7f</td><td>127</td><td>land</td></tr>
+<tr class="d"><td>0x80</td><td>128</td><td>ior</td></tr>
+<tr class="d"><td>0x81</td><td>129</td><td>lor</td></tr>
+<tr><td>0x82</td><td>130</td><td>ixor</td></tr>
+<tr><td>0x83</td><td>131</td><td>lxor</td></tr>
+<tr class="d"><td>0x84</td><td>132</td><td>iinc</td></tr>
+<tr class="d"><td>0x85</td><td>133</td><td>i2l</td></tr>
+<tr><td>0x86</td><td>134</td><td>i2f</td></tr>
+<tr><td>0x87</td><td>135</td><td>i2d</td></tr>
+<tr class="d"><td>0x88</td><td>136</td><td>l2i</td></tr>
+<tr class="d"><td>0x89</td><td>137</td><td>l2f</td></tr>
+<tr><td>0x8a</td><td>138</td><td>l2d</td></tr>
+<tr><td>0x8b</td><td>139</td><td>f2i</td></tr>
+<tr class="d"><td>0x8c</td><td>140</td><td>f2l</td></tr>
+<tr class="d"><td>0x8d</td><td>141</td><td>f2d</td></tr>
+<tr><td>0x8e</td><td>142</td><td>d2i</td></tr>
+<tr><td>0x8f</td><td>143</td><td>d2l</td></tr>
+<tr class="d"><td>0x90</td><td>144</td><td>d2f</td></tr>
+<tr class="d"><td>0x91</td><td>145</td><td>i2b</td></tr>
+<tr><td>0x92</td><td>146</td><td>i2c</td></tr>
+<tr><td>0x93</td><td>147</td><td>i2s</td></tr>
+<tr class="d"><td>0x94</td><td>148</td><td>lcmp</td></tr>
+<tr class="d"><td>0x95</td><td>149</td><td>fcmpl</td></tr>
+<tr><td>0x96</td><td>150</td><td>fcmpg</td></tr>
+<tr><td>0x97</td><td>151</td><td>dcmpl</td></tr>
+<tr class="d"><td>0x98</td><td>152</td><td>dcmpg</td></tr>
+</table></td>
+<td class="outer"><table>
+<tr><td>0x99</td><td>153</td><td>ifeq</td></tr>
+<tr><td>0x9a</td><td>154</td><td>ifne</td></tr>
+<tr class="d"><td>0x9b</td><td>155</td><td>iflt</td></tr>
+<tr class="d"><td>0x9c</td><td>156</td><td>ifge</td></tr>
+<tr><td>0x9d</td><td>157</td><td>ifgt</td></tr>
+<tr><td>0x9e</td><td>158</td><td>ifle</td></tr>
+<tr class="d"><td>0x9f</td><td>159</td><td>if_icmpeq</td></tr>
+<tr class="d"><td>0xa0</td><td>160</td><td>if_icmpne</td></tr>
+<tr><td>0xa1</td><td>161</td><td>if_icmplt</td></tr>
+<tr><td>0xa2</td><td>162</td><td>if_icmpge</td></tr>
+<tr class="d"><td>0xa3</td><td>163</td><td>if_icmpgt</td></tr>
+<tr class="d"><td>0xa4</td><td>164</td><td>if_icmple</td></tr>
+<tr><td>0xa5</td><td>165</td><td>if_acmpeq</td></tr>
+<tr><td>0xa6</td><td>166</td><td>if_acmpne</td></tr>
+<tr class="d"><td>0xa7</td><td>167</td><td>goto</td></tr>
+<tr class="d"><td>0xa8</td><td>168</td><td>jsr</td></tr>
+<tr><td>0xa9</td><td>169</td><td>ret</td></tr>
+<tr><td>0xaa</td><td>170</td><td>tableswitch</td></tr>
+<tr class="d"><td>0xab</td><td>171</td><td>lookupswitch</td></tr>
+<tr class="d"><td>0xac</td><td>172</td><td>ireturn</td></tr>
+<tr><td>0xad</td><td>173</td><td>lreturn</td></tr>
+<tr><td>0xae</td><td>174</td><td>freturn</td></tr>
+<tr class="d"><td>0xaf</td><td>175</td><td>dreturn</td></tr>
+<tr class="d"><td>0xb0</td><td>176</td><td>areturn</td></tr>
+<tr><td>0xb1</td><td>177</td><td>return</td></tr>
+<tr><td>0xb2</td><td>178</td><td>getstatic</td></tr>
+<tr class="d"><td>0xb3</td><td>179</td><td>putstatic</td></tr>
+<tr class="d"><td>0xb4</td><td>180</td><td>getfield</td></tr>
+<tr><td>0xb5</td><td>181</td><td>putfield</td></tr>
+<tr><td>0xb6</td><td>182</td><td>invokevirtual</td></tr>
+<tr class="d"><td>0xb7</td><td>183</td><td>invokespecial</td></tr>
+<tr class="d"><td>0xb8</td><td>184</td><td>invokestatic</td></tr>
+<tr><td>0xb9</td><td>185</td><td>invokeinterface</td></tr>
+<tr><td>0xba</td><td>186</td><td><i>(unused)</i></td></tr>
+<tr class="d"><td>0xbb</td><td>187</td><td>new</td></tr>
+<tr class="d"><td>0xbc</td><td>188</td><td>newarray</td></tr>
+<tr><td>0xbd</td><td>189</td><td>anewarray</td></tr>
+<tr><td>0xbe</td><td>190</td><td>arraylength</td></tr>
+<tr class="d"><td>0xbf</td><td>191</td><td>athrow</td></tr>
+<tr class="d"><td>0xc0</td><td>192</td><td>checkcast</td></tr>
+<tr><td>0xc1</td><td>193</td><td>instanceof</td></tr>
+<tr><td>0xc2</td><td>194</td><td>monitorenter</td></tr>
+<tr class="d"><td>0xc3</td><td>195</td><td>monitorexit</td></tr>
+<tr class="d"><td>0xc4</td><td>196</td><td>wide</td></tr>
+<tr><td>0xc5</td><td>197</td><td>multianewarray</td></tr>
+<tr><td>0xc6</td><td>198</td><td>ifnull</td></tr>
+<tr class="d"><td>0xc7</td><td>199</td><td>ifnonnull</td></tr>
+<tr class="d"><td>0xc8</td><td>200</td><td>goto_w</td></tr>
+<tr><td>0xc9</td><td>201</td><td>jsr_w</td></tr>
+</table></td></tr>
+</table>
+
+</body>
+</html>
diff --git a/docs/jni-tips.html b/docs/jni-tips.html
new file mode 100644
index 0000000..b1afbe5
--- /dev/null
+++ b/docs/jni-tips.html
@@ -0,0 +1,384 @@
+<html>
+ <head>
+ <title>JNI Tips</title>
+ <link rel=stylesheet href="android.css">
+ </head>
+
+ <body>
+ <h1><a name="JNI_Tips"></a>JNI Tips</h1>
+<p>
+</p><p>
+</p><ul>
+<li> <a href="#What_s_JNI_">What's JNI?</a>
+</li>
+<li> <a href="#JavaVM_and_JNIEnv">JavaVM and JNIEnv</a>
+
+</li>
+<li> <a href="#jclassID_jmethodID_and_jfieldID">jclass, jmethodID, and jfieldID</a>
+</li>
+<li> <a href="#local_vs_global_references">Local vs. Global References</a>
+</li>
+<li> <a href="#UTF_8_and_UTF_16_strings">UTF-8 and UTF-16 Strings</a>
+</li>
+<li> <a href="#Arrays">Primitive Arrays</a>
+</li>
+<li> <a href="#Exceptions">Exceptions</a>
+</li>
+
+<li> <a href="#Extended_checking">Extended Checking</a>
+</li>
+<li> <a href="#Native_Libraries">Native Libraries</a>
+</li>
+
+<li> <a href="#Unsupported">Unsupported Features</a></li>
+</ul>
+<p>
+</p><p>
+</p><h2><a name="What_s_JNI_"> </a> What's JNI? </h2>
+<p>
+
+JNI is the Java Native Interface. It defines a way for code written in the
+Java programming language to interact with native
+code, e.g. functions written in C/C++. It's VM-neutral, has support for loading code from
+dynamic shared libraries, and while cumbersome at times is reasonably efficient.
+</p><p>
+You really should read through the
+<a href="http://java.sun.com/javase/6/docs/technotes/guides/jni/spec/jniTOC.html">JNI spec for J2SE 1.6</a>
+to understand how JNI works. Some aspects of the spec aren't immediately obvious on
+first reading, so you may find the next few sections handy.
+The more detailed <i>JNI Programmer's Guide and Specification</i> can be found
+<a href="http://java.sun.com/docs/books/jni/html/jniTOC.html">here</a>.
+</p><p>
+</p><p>
+</p><h2><a name="JavaVM_and_JNIEnv"> </a> JavaVM and JNIEnv </h2>
+<p>
+JNI defines two key data structures, "JavaVM" and "JNIEnv". Both of these are essentially
+pointers to pointers to function tables. (In the C++ version, it's a class whose sole member
+is a pointer to a function table.) The JavaVM provides the "invocation interface" functions,
+which allow you to create and destroy the VM. In theory you can have multiple VMs per process,
+but Android's VMs only allow one.
+</p><p>
+The JNIEnv provides most of the JNI functions. Your native functions all receive a JNIEnv as
+the first argument.
+</p><p>
+
+On some VMs, the JNIEnv is used for thread-local storage. For this reason, <strong>you cannot share a JNIEnv between threads</strong>.
+If a piece of code has no other way to get a JNIEnv, you should share
+the JavaVM, and use JavaVM->GetEnv to discover the thread's JNIEnv.
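+</p><p>
+As a minimal sketch of that lookup (C++ syntax; <code>gJavaVM</code> and
+<code>getEnvForThisThread</code> are hypothetical names, and the sketch
+assumes the calling thread is already attached to the VM):
+</p><pre>
+#include &lt;jni.h&gt;
+
+// Hypothetical global, saved once when the VM hands out its JavaVM pointer.
+static JavaVM* gJavaVM = NULL;
+
+// Returns the current thread's JNIEnv, or NULL if GetEnv fails
+// (for example, because the thread is not attached).
+static JNIEnv* getEnvForThisThread() {
+    JNIEnv* env = NULL;
+    if (gJavaVM == NULL ||
+        gJavaVM-&gt;GetEnv(reinterpret_cast&lt;void**&gt;(&amp;env),
+                        JNI_VERSION_1_4) != JNI_OK) {
+        return NULL;
+    }
+    return env;
+}
+</pre><p>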
+</p><p>
+The C and C++ definitions of JNIEnv and JavaVM are different. "jni.h" provides different typedefs
+depending on whether it's included into ".c" or ".cpp". For this reason it's a bad idea to
+include JNIEnv arguments in header files included by both languages. (Put another way: if your
+header file requires "#ifdef __cplusplus", you may have to do some extra work if anything in
+that header refers to JNIEnv.)
+</p><p>
+</p><p>
+</p><h2><a name="jclassID_jmethodID_and_jfieldID"> jclass, jmethodID, and jfieldID </a></h2>
+<p>
+If you want to access an object's field from native code, you would do the following:
+</p><p>
+</p><ul>
+<li> Get the class object reference for the class with <code>FindClass</code>
+</li>
+<li> Get the field ID for the field with <code>GetFieldID</code>
+</li>
+<li> Get the contents of the field with something appropriate, e.g.
+<code>GetIntField</code>
+</li>
+</ul>
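+<p>
+The three steps might look like this in C++. This is only a sketch: the class
+<code>com/example/Widget</code>, its <code>int</code> field
+<code>count</code>, and the function <code>readCount</code> are hypothetical.
+</p><pre>
+#include &lt;jni.h&gt;
+
+// Reads the int field "count" from obj; returns false if a lookup fails.
+static bool readCount(JNIEnv* env, jobject obj, jint* result) {
+    jclass clazz = env-&gt;FindClass("com/example/Widget");
+    if (clazz == NULL) {
+        return false;       // an exception is now pending
+    }
+    jfieldID fid = env-&gt;GetFieldID(clazz, "count", "I");
+    if (fid == NULL) {
+        return false;       // an exception is now pending
+    }
+    *result = env-&gt;GetIntField(obj, fid);
+    return true;
+}
+</pre>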
+<p>
+Similarly, to call a method, you'd first get a class object reference and then a method ID. The IDs are often just
+pointers to internal VM data structures. Looking them up may require several string
+comparisons, but once you have them the actual call to get the field or invoke the method
+is very quick.
+</p><p>
+If performance is important, it's useful to look the values up once and cache the results
+in your native code. Because we are limiting ourselves to one VM per process, it's reasonable
+to store this data in a static local structure.
+</p><p>
+The class references, field IDs, and method IDs are guaranteed valid until the class is unloaded. Classes
+are only unloaded if all classes associated with a ClassLoader can be garbage collected,
+which is rare but will not be impossible in our system. The jclass
+is a class reference and <strong>must be protected</strong> with a call
+to <code>NewGlobalRef</code> (see the next section).
+</p><p>
+If you would like to cache the IDs when a class is loaded, and automatically re-cache them
+if the class is ever unloaded and reloaded, the correct way to initialize
+the IDs is to add a piece of code that looks like this to the appropriate class:
+</p><p>
+
+</p><pre> /*
+ * We use a class initializer to allow the native code to cache some
+ * field offsets.
+ */
+
+ /*
+ * A native function that looks up and caches interesting
+ * class/field/method IDs for this class. Returns false on failure.
+ */
+ native private static boolean nativeClassInit();
+
+ /*
+ * Invoke the native initializer when the class is loaded.
+ */
+ static {
+ if (!nativeClassInit())
+ throw new RuntimeException("native init failed");
+ }
+</pre>
+<p>
+Create a nativeClassInit method in your C/C++ code that performs the ID lookups. The code
+will be executed once, when the class is initialized. If the class is ever unloaded and
+then reloaded, it will be executed again. (See the implementation of java.io.FileDescriptor
+for an example in our source tree.)
+</p><p>
+</p><p>
+</p><p>
+</p><h2><a name="local_vs_global_references"> Local vs. Global References </a></h2>
+<p>
+Every object that JNI returns is a "local reference". This means that it's valid for the
+duration of the current native method in the current thread.
+<strong>Even if the object itself continues to live on after the native method returns, the reference is not valid.</strong>
+This applies to all sub-classes of jobject, including jclass and jarray.
+(The Dalvik VM will warn you about this when -Xcheck:jni is enabled.)
+</p><p>
+
+If you want to hold on to a reference for a longer period, you must use a "global" reference.
+The <code>NewGlobalRef</code> function takes the local reference as
+an argument and returns a global one:
+
+<p><pre>jobject localRef = [...];
+jobject globalRef;
+globalRef = env->NewGlobalRef(localRef);
+</pre>
+
+The global reference is guaranteed to be valid until you call
+<code>DeleteGlobalRef</code>.
+</p><p>
+All JNI methods accept both local and global references as arguments.
+</p><p>
+Programmers are required to "not excessively allocate" local references. In practical terms this means
+that if you're creating large numbers of local references, perhaps while running through an array of
+Objects, you should free them manually with
+<code>DeleteLocalRef</code> instead of letting JNI do it for you. The
+VM is only required to reserve slots for
+16 local references, so if you need more than that you should either delete as you go or use
+<code>EnsureLocalCapacity</code> to reserve more.
+</p><p>
+Note: method and field IDs are just 32-bit identifiers, not object
+references, and should not be passed to <code>NewGlobalRef</code>. The raw data
+pointers returned by functions like <code>GetStringUTFChars</code>
+and <code>GetByteArrayElements</code> are also not objects.
+</p><p>
+</p><p>
+</p><p>
+</p><h2><a name="UTF_8_and_UTF_16_strings"> </a> UTF-8 and UTF-16 Strings </h2>
+<p>
+The Java programming language uses UTF-16. For convenience, JNI provides methods that work with "modified UTF-8" encoding
+as well. (Some VMs use the modified UTF-8 internally to store strings; ours do not.) The
+modified encoding uses only the one-, two-, and three-byte forms (supplementary characters are stored as surrogate pairs), and stores ASCII NUL values in a two-byte encoding.
+The nice thing about it is that you can count on having C-style zero-terminated strings,
+suitable for use with standard libc string functions. The down side is that you cannot pass
+arbitrary UTF-8 data into the VM and expect it to work correctly.
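+</p><p>
+As a rough illustration of how the modified encoding differs from standard UTF-8,
+the sketch below uses <code>DataOutputStream.writeUTF</code>, which emits modified UTF-8
+after a two-byte length prefix. The class and method names are hypothetical, and this
+is plain Java rather than JNI code:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ModifiedUtf8Demo {
    /** Encodes s as modified UTF-8, dropping writeUTF's 2-byte length prefix. */
    static byte[] modifiedUtf8(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(s);
            out.flush();
            byte[] all = buf.toByteArray();
            return Arrays.copyOfRange(all, 2, all.length);
        } catch (IOException e) {
            // Not expected for an in-memory stream and a short string.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String withNul = "a\0b";
        // Standard UTF-8 stores NUL as a single zero byte: 61 00 62.
        System.out.println(withNul.getBytes(StandardCharsets.UTF_8).length); // 3
        // Modified UTF-8 stores NUL as C0 80, so no zero byte appears: 61 C0 80 62.
        System.out.println(modifiedUtf8(withNul).length); // 4

        String supplementary = "\uD83D\uDE00"; // one code point, two UTF-16 chars
        // Standard UTF-8 uses one four-byte sequence.
        System.out.println(supplementary.getBytes(StandardCharsets.UTF_8).length); // 4
        // Modified UTF-8 encodes each surrogate separately, three bytes apiece.
        System.out.println(modifiedUtf8(supplementary).length); // 6
    }
}
```

+The embedded NUL never produces a zero byte, which is why the result is safe to treat
+as a C-style string, and also why arbitrary standard UTF-8 input is not interchangeable with it.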
+</p><p>
+It's usually best to operate with UTF-16 strings. With our current VMs, the
+<code>GetStringChars</code> method
+does not require a copy, whereas <code>GetStringUTFChars</code> requires a malloc and a UTF conversion. Note that
+<strong>UTF-16 strings are not zero-terminated</strong>, so you need to hang on to the string length as well as
+the string pointer.
+
+</p><p>
+<strong>Don't forget to Release the strings you Get</strong>. The string functions return <code>jchar*</code> or <code>jbyte*</code>, which
+are pointers to primitive types rather than local references. They are not automatically released
+when the native method returns.
+</p><p>
+</p><p>
+
+
+</p><h2><a name="Arrays"> </a> Primitive Arrays </h2>
+<p>
+JNI provides functions for accessing the contents of array objects.
+While arrays of objects must be accessed one entry at a time, arrays of
+primitives can be read and written directly as if they were declared in C.
+</p><p>
+To make the interface as efficient as possible without constraining
+the VM implementation,
+the <code>Get&lt;PrimitiveType&gt;ArrayElements</code> family of calls
+allows the VM to either return a pointer to the actual elements, or
+allocate some memory and make a copy. Either way, the raw pointer returned
+is guaranteed to be valid until the corresponding <code>Release</code> call
+is issued (which implies that, if the data wasn't copied, the array object
+will be pinned down and can't be relocated as part of compacting the heap).
+</p><p>
+You can determine whether or not the data was copied by passing in a
+non-NULL pointer for the <code>isCopy</code> argument. This is rarely
+useful.
+</p><p>
+The <code>Release</code> call takes a <code>mode</code> argument that can
+have one of three values. The actions performed by the VM depend upon
+whether or not the data was copied:
+<ul>
+ <li><code>0</code>
+ <ul>
+ <li>Copy: data is copied back. The buffer with the copy is freed.
+ <li>No copy: the array object is un-pinned.
+ </ul>
+ <li><code>JNI_COMMIT</code>
+ <ul>
+ <li>Copy: data is copied back. The buffer with the copy is NOT freed.
+ <li>No copy: does nothing.
+ </ul>
+ <li><code>JNI_ABORT</code>
+ <ul>
+ <li>Copy: the buffer with the copy is freed; any changes to it are lost.
+ <li>No copy: the array object is un-pinned. Earlier
+ writes are NOT aborted.
+ </ul>
+</ul>
+</p><p>
+One reason for checking the <code>isCopy</code> flag is to know if
+you need to call <code>Release</code> with <code>JNI_COMMIT</code>
+after making changes to an array -- if you're alternating between making
+changes and executing code that uses the contents of the array, you can
+skip the no-op commit. Another possible reason for checking the flag is for
+efficient handling of <code>JNI_ABORT</code>. For example, you might want
+to get an array, modify it in place, pass pieces to other functions, and
+then discard the changes. If you know that JNI is making a new copy for
+you, there's no need to create another "editable" copy. If JNI is passing
+you the original, then you do need to make your own copy.
+</p><p>
+Some have asserted that you can skip the <code>Release</code> call if
+<code>*isCopy</code> is false. This is not the case. If no copy buffer was
+allocated, then the original memory must be pinned down and can't be moved by
+the garbage collector until you call <code>Release</code>.
+</p><p>
+Also note that the <code>JNI_COMMIT</code> flag does NOT release the array,
+and you will need to call <code>Release</code> again with a different flag
+eventually.
+</p><p>
+</p><p>
+
+
+</p><h2><a name="Exceptions"> Exceptions </a></h2>
+<p>
+You may not call most JNI functions when an exception is pending. Your code is expected to
+see the exception (via <code>ExceptionCheck()</code> or <code>ExceptionOccurred()</code>) and return,
+or clear the exception and handle it.
+</p><p>
+The only JNI functions that you are allowed to call while an exception is
+pending are listed <a href="http://java.sun.com/javase/6/docs/technotes/guides/jni/spec/design.html#wp17626">
+ here</a>.
+</p><p>
+Note that exceptions thrown by interpreted code do not "leap over" native code,
+and exceptions thrown by native code don't longjmp back into the interpreter. The
+JNI <code>Throw</code> and <code>ThrowNew</code> functions just
+set an exception pointer in the
+current thread. Upon returning from native code, the exception will be noted and
+handled appropriately.
+</p><p>
+Native code can "catch" an exception by calling <code>ExceptionCheck</code> or
+<code>ExceptionOccurred</code>, and clear it with
+<code>ExceptionClear</code>. As usual,
+discarding exceptions without handling them can lead to problems.
+</p><p>
+There are no built-in functions for manipulating the Throwable object
+itself, so if you want to (say) get the exception string you will need to
+find the Throwable class, look up the method ID for
+<code>getMessage "()Ljava/lang/String;"</code>, invoke it, and if the result
+is non-NULL use <code>GetStringUTFChars</code> to get something you can
+hand to printf or a LOG macro.
+
+</p><p>
+</p><p>
+</p><h2><a name="Extended_checking"> Extended Checking </a></h2>
+<p>
+JNI does very little error checking. Calling <code>SetIntField</code>
+on an Object field will succeed. The
+goal is to minimize the overhead on the assumption that, if you've written it in native code,
+you probably did it for performance reasons.
+</p><p>
+Some VMs support extended checking with the "<code>-Xcheck:jni</code>" flag. If the flag is set, the VM
+puts a different table of functions into the JavaVM and JNIEnv pointers. These functions do
+an extended series of checks before calling the standard implementation.
+
+</p><p>
+Some things that may be verified:
+</p><p>
+</p><ul>
+<li> Check for null pointers where not allowed.
+</li>
+<li> Verify argument type correctness (jclass is a class object,
+jfieldID points to field data, jstring is a java.lang.String).
+</li>
+<li> Field type correctness, e.g. don't store a HashMap in a String field.
+</li>
+<li> Check to see if an exception is pending on calls where pending exceptions are not legal.
+</li>
+<li> Check for calls to inappropriate functions between Critical get/release calls.
+</li>
+<li> Check that JNIEnv structs aren't being shared between threads.
+
+</li>
+<li> Make sure local references aren't used outside their allowed lifespan.
+</li>
+<li> UTF-8 strings contain valid "modified UTF-8" data.
+</li>
+</ul>
+<p>Accessibility of methods and fields (i.e. public vs. private) is not
+checked.
+<p>
+The Dalvik VM supports the <code>-Xcheck:jni</code> flag. For a
+description of how to enable it for Android apps, see
+<a href="embedded-vm-control.html">Controlling the Embedded VM</a>.
+It's currently enabled by default in the Android emulator.
+
+</p><p>
+</p><p>
+</p><h2><a name="Native_Libraries"> Native Libraries </a></h2>
+<p>
+You can load native code from shared libraries with the standard
+<code>System.loadLibrary()</code> call. The
+preferred way to get at your native code is:
+</p><p>
+</p><ul>
+<li> Call <code>System.loadLibrary()</code> from a static class initializer. (See the earlier example, where one is used to call nativeClassInit().) The argument is the "undecorated" library name, e.g. to load "libfubar.so" you would pass in "fubar".
+
+</li>
+<li> Provide a native function: <code><b>jint JNI_OnLoad(JavaVM* vm, void* reserved)</b></code>
+</li>
+<li> In JNI_OnLoad, register all of your native methods. You should declare the methods "static" so the names don't occupy space in the symbol table on the device.
+</li>
+</ul>
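+<p>
+The Java side of the first step can be sketched as follows. This is a minimal,
+hypothetical example: the class name <code>NativeLoader</code> and the helper
+<code>tryLoad</code> are illustrative, and the helper swallows
+<code>UnsatisfiedLinkError</code> only so that the sketch runs on a machine where
+"libfubar.so" does not exist. Real code would normally let the error propagate.

```java
public class NativeLoader {
    /** Undecorated library name; "fubar" is the placeholder used in the text. */
    static final String LIB_NAME = "fubar";

    /** Set once, when the class is initialized. */
    static final boolean LOADED;

    /* Load the library from a static class initializer. */
    static {
        LOADED = tryLoad(LIB_NAME);
    }

    /** Tries to load a library; returns false instead of throwing if it is missing. */
    static boolean tryLoad(String name) {
        try {
            System.loadLibrary(name);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // mapLibraryName shows the platform-specific file name that the
        // undecorated name expands to (for example libfubar.so on Linux).
        System.out.println(System.mapLibraryName(LIB_NAME));
        System.out.println("loaded: " + LOADED);
    }
}
```

+The native half (<code>JNI_OnLoad</code> and the method registration) lives in the
+shared library itself and is not shown here.
+</p>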
+<p>
+For a simple example, see <code>//device/tests/jnilibtest/JniLibTest.c</code>
+and <code>//device/apps/AndroidTests/src/com/android/unit_tests/JniLibTest.java</code>.
+</p><p>
+You can also call <code>System.load()</code> with the full path name of the
+shared library. This is not recommended for Android apps, since the
+installation directory could change in the future.
+</p><p>
+
+
+</p><h2><a name="Unsupported"> Unsupported Features </a></h2>
+<p>All JNI 1.6 features are supported, with the following exceptions:
+<ul>
+ <li><code>DefineClass</code> is not implemented. Dalvik does not use
+ Java bytecodes or class files, so passing in binary class data
+ doesn't work. Translation facilities may be added in a future
+ version of the VM.</li>
+ <li><code>NewWeakGlobalRef</code> and <code>DeleteWeakGlobalRef</code>
+ are not implemented. The
+ VM supports weak references, but not JNI "weak global" references.
+ These will be supported in a future release.</li>
+ <li><code>GetObjectRefType</code> (new in 1.6) is implemented but not fully
+ functional -- it can't always tell the difference between "local" and
+ "global" references.</li>
+</ul>
+
+</p>
+
+<address>Copyright © 2008 The Android Open Source Project</address>
+
+ </body>
+</html>
diff --git a/docs/verifier.html b/docs/verifier.html
new file mode 100644
index 0000000..ec730f1
--- /dev/null
+++ b/docs/verifier.html
@@ -0,0 +1,150 @@
+<html>
+<head>
+<title>Dalvik Bytecode Verifier Notes</title>
+</head>
+
+<body>
+<h1>Dalvik Bytecode Verifier Notes</h1>
+
+<p>
+The bytecode verifier in the Dalvik VM attempts to provide the same sorts
+of checks and guarantees that other popular virtual machines do. We
+perform generally the same set of checks as are described in <i>The Java
+Virtual Machine Specification, Second Edition</i>, including the updates
+planned for the Third Edition.
+
+<p>
+Verification can be enabled for all classes, disabled for all, or enabled
+only for "remote" (non-bootstrap) classes. It should be performed for any
+class that will be processed with the DEX optimizer, and in fact the
+default VM behavior is to only optimize verified classes.
+
+
+<h2>Why Verify?</h2>
+
+<p>
+The verification process adds additional time to the build and to
+the installation of new applications. It's fairly quick for app-sized
+DEX files, but rather slow for the big "core" and "framework" files.
+Why do it at all, when our system relies on UNIX processes for security?
+<p>
+<ol>
+ <li>Optimizations. The interpreter can ignore a lot of potential
+ error cases because the verifier guarantees that they are impossible.
+ Also, we can optimize the DEX file more aggressively if we start
+ with a stronger set of assumptions about the bytecode.
+    <li>"Exact" GC.  The work performed during verification has significant
+ overlap with the work required to compute register use maps for exact
+ GC. Improper register use, caught by the verifier, could lead to
+ subtle problems with an "exact" GC.
+ <li>Intra-application security. If an app wants to download bits
+ of interpreted code over the network and execute them, it can safely
+ do so using well-established security mechanisms.
+ <li>3rd party app failure analysis. We have no way to control the
+ tools and post-processing utilities that external developers employ,
+ so when we get bug reports with a weird exception or native crash
+ it's very helpful to start with the assumption that the bytecode
+ is valid.
+</ol>
+
+
+<h2>Verifier Differences</h2>
+
+<p>
+There are a few checks that the Dalvik bytecode verifier does not perform,
+because they're not relevant. For example:
+<ul>
+ <li>Type restrictions on constant pool references are not enforced,
+ because Dalvik does not have a pool of typed constants. (Dalvik
+ uses a simple index into type-specific pools.)
+ <li>Verification of the operand stack size is not performed, because
+ Dalvik does not have an operand stack.
+ <li>Limitations on <code>jsr</code> and <code>ret</code> do not apply,
+ because Dalvik doesn't support subroutines.
+</ul>
+
+In some cases they are implemented differently, e.g.:
+<ul>
+ <li>In a conventional VM, backward branches are forbidden when the
+ stack has an uninitialized reference. The restriction was changed to
+ disallow use of the <code>new-instance</code> instruction if a register
+ refers to an uninitialized instance created by that same instruction.
+ This solves the same problem without unduly limiting branches.
+</ul>
+
+There are also some new ones, such as:
+<ul>
+ <li>The <code>move-exception</code> instruction can only appear as
+ the first instruction in an exception handler.
+ <li>The <code>move-result*</code> instructions can only appear
+ immediately after an appropriate <code>invoke-*</code>
+ or <code>filled-new-array</code> instruction.
+</ul>
+
+<p>
+The Dalvik verifier is more restrictive than other VMs in one area:
+type safety on sub-32-bit integer widths. These additional restrictions
+should make it impossible to, say, pass a value outside the range
+[-128, 127] to a function that takes a <code>byte</code> as an argument.
+
+
+<h2>Verification Failures</h2>
+
+<p>
+When the verifier rejects a class, it always throws a VerifyError.
+This is different in some cases from other implementations. For example,
+if a class attempts to perform an illegal access on a field, the expected
+behavior is to receive an IllegalAccessError at runtime the first time
+the field is actually accessed. The Dalvik verifier will reject the
+entire class immediately.
+
+<p>
+It's difficult to throw the error on first use in Dalvik. Possible ways
+to implement this behavior include:
+
+<ol>
+<li>We could replace the invalid field access instruction with a special
+instruction that generates an illegal access error, and allow class
+verification to complete successfully. This type of verification must
+often be deferred to first class load, rather than be performed ahead of time
+during DEX optimization, which means the bytecode instructions will be
+mapped read-only during verification. So this won't work.
+</li>
+
+<li>We can perform the access checks when the field/method/class is
+resolved. In a typical VM implementation we would do the check when the
+entry is resolved in the context of the current classfile, but our DEX
+files combine multiple classfiles together, merging the field/method/class
+resolution results into a single large table. Once one class successfully
+resolves the field, every other class in the same DEX file would be able
+to access the field. This is bad.
+</li>
+
+<li>Perform the access checks on every field/method/class access.
+This adds significant overhead. This is mitigated somewhat by the DEX
+optimizer, which will convert many field/method/class accesses into a
+simpler form after performing the access check. However, not all accesses
+can be optimized (e.g. accesses to classes unknown at dexopt time),
+and we don't currently have an optimized form of certain instructions
+(notably static field operations).
+</li>
+</ol>
+
+<p>
+Other implementations are possible, but they all involve allocating
+some amount of additional memory or spending additional cycles
+on non-DEX-optimized instructions. We don't want to throw an
+IllegalAccessError at verification time, since that would indicate that
+access to the class being verified was illegal.
+
+<p>
+The VerifyError is accompanied by detailed, if somewhat cryptic,
+information in the log file. From this it's possible to determine the
+exact instruction that failed, and the reason for the failure. We can
+also construct the VerifyError with an IllegalAccessError passed in as
+the cause.
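+
+<p>
+Seen from Java code, an error built this way might look like the sketch below.
+This is an illustration only, not the VM's internal implementation. The class name,
+method name, and message strings are hypothetical. Because <code>VerifyError</code>
+has no constructor that takes a cause, the sketch attaches it with
+<code>initCause</code>:

```java
public class VerifyErrorCauseDemo {
    /** Builds a VerifyError whose cause records the underlying illegal access. */
    static VerifyError rejectClass(String className, String fieldName) {
        IllegalAccessError cause =
                new IllegalAccessError("illegal access to field " + fieldName);
        VerifyError error = new VerifyError("verification failed for " + className);
        // Attach the more specific error so callers can inspect it.
        error.initCause(cause);
        return error;
    }

    public static void main(String[] args) {
        try {
            throw rejectClass("com.example.Foo", "secret");
        } catch (VerifyError e) {
            // The class is rejected as a whole, but the cause still names
            // the access problem that triggered the rejection.
            System.out.println(e.getMessage());
            System.out.println(e.getCause());
        }
    }
}
```

+A handler that catches the VerifyError can then call <code>getCause()</code> to
+distinguish an access violation from other verification failures.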
+
+<address>Copyright © 2008 The Android Open Source Project</address>
+
+</body>
+</html>