page.title=Dalvik Executable format
@jd:body

<!--
    Copyright 2013 The Android Open Source Project

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
-->
<p>This document describes the layout and contents of <code>.dex</code>
files, which are used to hold a set of class definitions and their associated
adjunct data.</p>

<h2 id="types">Guide to types</h2>

<table class="guide">
<thead>
<tr>
  <th>Name</th>
  <th>Description</th>
</tr>
</thead>
<tbody>
<tr>
  <td>byte</td>
  <td>8-bit signed int</td>
</tr>
<tr>
  <td>ubyte</td>
  <td>8-bit unsigned int</td>
</tr>
<tr>
  <td>short</td>
  <td>16-bit signed int, little-endian</td>
</tr>
<tr>
  <td>ushort</td>
  <td>16-bit unsigned int, little-endian</td>
</tr>
<tr>
  <td>int</td>
  <td>32-bit signed int, little-endian</td>
</tr>
<tr>
  <td>uint</td>
  <td>32-bit unsigned int, little-endian</td>
</tr>
<tr>
  <td>long</td>
  <td>64-bit signed int, little-endian</td>
</tr>
<tr>
  <td>ulong</td>
  <td>64-bit unsigned int, little-endian</td>
</tr>
<tr>
  <td>sleb128</td>
  <td>signed LEB128, variable-length (see below)</td>
</tr>
<tr>
  <td>uleb128</td>
  <td>unsigned LEB128, variable-length (see below)</td>
</tr>
<tr>
  <td>uleb128p1</td>
  <td>unsigned LEB128 plus <code>1</code>, variable-length (see below)</td>
</tr>
</tbody>
</table>

<h3 id="leb128">LEB128</h3>

<p>LEB128 ("<b>L</b>ittle-<b>E</b>ndian <b>B</b>ase <b>128</b>") is a
variable-length encoding for
arbitrary signed or unsigned integer quantities. The format was
borrowed from the <a href="http://dwarfstd.org/Dwarf3Std.php">DWARF3</a>
specification. In a <code>.dex</code> file, LEB128 is only ever used to
encode 32-bit quantities.</p>

<p>Each LEB128-encoded value consists of one to five
bytes, which together represent a single 32-bit value. Each
byte has its most significant bit set except for the final byte in the
sequence, which has its most significant bit clear. The remaining
seven bits of each byte are payload, with the least significant seven
bits of the quantity in the first byte, the next seven in the second
byte, and so on. In the case of a signed LEB128 (<code>sleb128</code>),
the most significant payload bit of the final byte in the sequence is
sign-extended to produce the final value. In the unsigned case
(<code>uleb128</code>), any bits not explicitly represented are
interpreted as <code>0</code>.</p>

<table class="leb128Bits">
<thead>
<tr><th colspan="16">Bitwise diagram of a two-byte LEB128 value</th></tr>
<tr>
  <th colspan="8">First byte</th>
  <th colspan="8">Second byte</th>
</tr>
</thead>
<tbody>
<tr>
  <td class="start1"><code>1</code></td>
  <td>bit<sub>6</sub></td>
  <td>bit<sub>5</sub></td>
  <td>bit<sub>4</sub></td>
  <td>bit<sub>3</sub></td>
  <td>bit<sub>2</sub></td>
  <td>bit<sub>1</sub></td>
  <td>bit<sub>0</sub></td>
  <td class="start2"><code>0</code></td>
  <td>bit<sub>13</sub></td>
  <td>bit<sub>12</sub></td>
  <td>bit<sub>11</sub></td>
  <td>bit<sub>10</sub></td>
  <td>bit<sub>9</sub></td>
  <td>bit<sub>8</sub></td>
  <td class="end2">bit<sub>7</sub></td>
</tr>
</tbody>
</table>
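
<p>The decoding rules above can be sketched in a few lines of Python. This is
an illustrative sketch rather than a reference implementation: the function
names <code>read_uleb128</code> and <code>read_sleb128</code> are made up
for this example, and both assume the input is a <code>bytes</code> object
holding a well-formed sequence of at most five bytes.</p>

```python
def read_uleb128(data, pos=0):
    """Decode one uleb128 value starting at data[pos].

    Returns (value, next_pos). Names here are illustrative only.
    """
    result = 0
    for i in range(5):  # a 32-bit quantity needs at most five bytes
        byte = data[pos + i]
        result |= (byte & 0x7F) << (7 * i)  # low seven bits are payload
        if byte & 0x80 == 0:  # high bit clear marks the final byte
            return result & 0xFFFFFFFF, pos + i + 1
    raise ValueError("LEB128 sequence longer than five bytes")


def read_sleb128(data, pos=0):
    """Decode one sleb128 value, sign-extending from the last payload bit."""
    result = 0
    for i in range(5):
        byte = data[pos + i]
        result |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:
            if byte & 0x40:  # top payload bit of the final byte is the sign
                result -= 1 << (7 * (i + 1))
            # Wrap into the signed 32-bit range (only matters for 5 bytes).
            result = ((result + 0x80000000) & 0xFFFFFFFF) - 0x80000000
            return result, pos + i + 1
    raise ValueError("LEB128 sequence longer than five bytes")
```

<p>Both functions return the position of the next unread byte so that a
caller can walk through consecutive LEB128 fields.</p>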

<p>The variant <code>uleb128p1</code> is used to represent a signed
value, where the representation is of the value <i>plus one</i> encoded
as a <code>uleb128</code>. This makes the encoding of <code>-1</code>
(alternatively thought of as the unsigned value <code>0xffffffff</code>)
&mdash; but no other negative number &mdash; a single byte, and is
useful in exactly those cases where the represented number must either
be non-negative or <code>-1</code> (or <code>0xffffffff</code>),
and where no other negative values are allowed (or where large unsigned
values are unlikely to be needed).</p>

<p>Here are some examples of the formats:</p>

<table class="leb128">
<thead>
<tr>
  <th>Encoded Sequence</th>
  <th>As <code>sleb128</code></th>
  <th>As <code>uleb128</code></th>
  <th>As <code>uleb128p1</code></th>
</tr>
</thead>
<tbody>
  <tr><td>00</td><td>0</td><td>0</td><td>-1</td></tr>
  <tr><td>01</td><td>1</td><td>1</td><td>0</td></tr>
  <tr><td>7f</td><td>-1</td><td>127</td><td>126</td></tr>
  <tr><td>80 7f</td><td>-128</td><td>16256</td><td>16255</td></tr>
</tbody>
</table>
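
<p>The rows of the table can be reproduced with a small encoder. The sketch
below is illustrative only: the function names are invented for this
example, and the encoders always emit the shortest possible sequence, which
this document does not state as a requirement.</p>

```python
def encode_uleb128(value):
    """Encode an unsigned 32-bit integer as uleb128 bytes."""
    value &= 0xFFFFFFFF
    out = bytearray()
    while True:
        payload = value & 0x7F
        value >>= 7
        if value:
            out.append(payload | 0x80)  # more bytes follow
        else:
            out.append(payload)  # final byte, high bit clear
            return bytes(out)


def encode_uleb128p1(value):
    """Encode value + 1 as uleb128, so that -1 becomes the single byte 00."""
    return encode_uleb128((value + 1) & 0xFFFFFFFF)


def encode_sleb128(value):
    """Encode a signed 32-bit integer as sleb128 bytes."""
    out = bytearray()
    while True:
        payload = value & 0x7F
        value >>= 7  # Python's shift on negative ints keeps the sign
        sign_bit = (payload & 0x40) != 0
        # Stop once every remaining bit just repeats the sign bit.
        done = (value == 0 and not sign_bit) or (value == -1 and sign_bit)
        out.append(payload if done else payload | 0x80)
        if done:
            return bytes(out)
```

<p>For instance, <code>encode_sleb128(-128)</code>,
<code>encode_uleb128(16256)</code>, and <code>encode_uleb128p1(16255)</code>
all produce the bytes <code>80 7f</code>, matching the last row of the
table.</p>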

<h2 id="file-layout">File layout</h2>

<table class="format">
<thead>
<tr>
  <th>Name</th>
  <th>Format</th>
  <th>Description</th>
</tr>
</thead>
<tbody>
<tr>
  <td>header</td>
  <td>header_item</td>
  <td>the header</td>
</tr>
<tr>
  <td>string_ids</td>
  <td>string_id_item[]</td>
  <td>string identifiers list. These are identifiers for all the strings
    used by this file, either for internal naming (e.g., type descriptors)
    or as constant objects referred to by code. This list must be sorted
    by string contents, using UTF-16 code point values (not in a
    locale-sensitive manner), and it must not contain any duplicate entries.
  </td>
</tr>
<tr>
  <td>type_ids</td>
  <td>type_id_item[]</td>
  <td>type identifiers list. These are identifiers for all types (classes,
    arrays, or primitive types) referred to by this file, whether defined
    in the file or not. This list must be sorted by <code>string_id</code>
    index, and it must not contain any duplicate entries.
  </td>
</tr>
<tr>
  <td>proto_ids</td>
  <td>proto_id_item[]</td>
  <td>method prototype identifiers list. These are identifiers for all
    prototypes referred to by this file. This list must be sorted in
    return-type (by <code>type_id</code> index) major order, and then
    by arguments (also by <code>type_id</code> index). The list must not
    contain any duplicate entries.
  </td>
</tr>
<tr>
  <td>field_ids</td>
  <td>field_id_item[]</td>
  <td>field identifiers list. These are identifiers for all fields
    referred to by this file, whether defined in the file or not. This
    list must be sorted, where the defining type (by <code>type_id</code>
    index) is the major order, field name (by <code>string_id</code> index)
    is the intermediate order, and type (by <code>type_id</code> index)
    is the minor order. The list must not contain any duplicate entries.
  </td>
</tr>
<tr>
  <td>method_ids</td>
  <td>method_id_item[]</td>
  <td>method identifiers list. These are identifiers for all methods
    referred to by this file, whether defined in the file or not. This
    list must be sorted, where the defining type (by <code>type_id</code>
    index) is the major order, method name (by <code>string_id</code>
    index) is the intermediate order, and method prototype (by
    <code>proto_id</code> index) is the minor order. The list must not
    contain any duplicate entries.
  </td>
</tr>
<tr>
  <td>class_defs</td>
  <td>class_def_item[]</td>
  <td>class definitions list. The classes must be ordered such that a given
    class's superclass and implemented interfaces appear in the
    list earlier than the referring class. Furthermore, it is invalid for
    a definition for the same-named class to appear more than once in
    the list.
  </td>
</tr>
<tr>
  <td>data</td>
  <td>ubyte[]</td>
  <td>data area, containing all the support data for the tables listed above.
    Different items have different alignment requirements, and
    padding bytes are inserted before each item if necessary to achieve
    proper alignment.
  </td>
</tr>
<tr>
  <td>link_data</td>
  <td>ubyte[]</td>
  <td>data used in statically linked files. The format of the data in
    this section is left unspecified by this document.
    This section is empty in unlinked files, and runtime implementations
    may use it as they see fit.
  </td>
</tr>
</tbody>
</table>
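
<p>The major/intermediate/minor ordering rules for <code>field_ids</code>
and <code>method_ids</code> amount to a lexicographic comparison of index
tuples. The sketch below assumes each entry has already been reduced to a
hypothetical tuple of indices in sort-key order; it does not reflect the
on-disk layout of the items, and the helper name is invented for this
example.</p>

```python
def is_strictly_sorted(keys):
    """Return True if keys are in increasing order with no duplicates."""
    return all(a < b for a, b in zip(keys, keys[1:]))


# Hypothetical field_ids entries as
# (defining type_id index, name string_id index, field type_id index).
field_keys = [(0, 3, 1), (0, 5, 1), (2, 1, 4), (2, 1, 7)]

# Python compares tuples element by element, so the first element acts as
# the major order, the second as the intermediate, and the third as the minor.
print(is_strictly_sorted(field_keys))
```

<p>Using a strict <code>&lt;</code> comparison rejects duplicate entries as
well as out-of-order ones, which covers both requirements at once.</p>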

<h2 id="definitions">Bitfield, string and constant definitions</h2>

<h3 id="dex-file-magic">DEX_FILE_MAGIC</h3>
<h4>embedded in header_item</h4>

<p>The constant array/string <code>DEX_FILE_MAGIC</code> is the list of
bytes that must appear at the beginning of a <code>.dex</code> file
in order for it to be recognized as such. The value intentionally
contains a newline (<code>"\n"</code> or <code>0x0a</code>) and a
null byte (<code>"\0"</code> or <code>0x00</code>) in order to help
in the detection of certain forms of corruption. The value also
encodes a format version number as three decimal digits, which is
expected to increase monotonically over time as the format evolves.</p>

<pre>
ubyte[8] DEX_FILE_MAGIC = { 0x64 0x65 0x78 0x0a 0x30 0x33 0x35 0x00 }
                        = "dex\n035\0"
</pre>
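
<p>A reader might check the magic and pull out the version digits along the
following lines. This is a sketch only: <code>dex_version</code> is a name
invented for this example, and it accepts any three-digit version rather
than only the ones a particular runtime supports.</p>

```python
DEX_MAGIC_PREFIX = b"dex\n"


def dex_version(file_bytes):
    """Return the format version as an int from the first eight bytes.

    Raises ValueError if the bytes do not have the DEX_FILE_MAGIC shape.
    """
    magic = bytes(file_bytes[:8])
    if len(magic) != 8 or magic[:4] != DEX_MAGIC_PREFIX or magic[7] != 0:
        raise ValueError("not a .dex file: bad magic")
    digits = magic[4:7]
    if not digits.isdigit():
        raise ValueError("not a .dex file: version is not three digits")
    return int(digits.decode("ascii"))
```

<p>The embedded newline means that a file damaged by line-ending conversion
(for example, one that now begins with <code>"dex\r\n"</code>) fails this
check immediately.</p>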

<p><b>Note:</b> At least a couple of earlier versions of the format have
been used in widely available public software releases. For example,
version <code>009</code> was used for the M3 releases of the
Android platform (November&ndash;December 2007),
and version <code>013</code> was used for the M5 releases of the Android
platform (February&ndash;March 2008). In several respects, these earlier
versions of the format differ significantly from the version described in this
document.</p>

<h3 id="endian-constant">ENDIAN_CONSTANT and REVERSE_ENDIAN_CONSTANT</h3>
<h4>embedded in header_item</h4>

<p>The constant <code>ENDIAN_CONSTANT</code> is used to indicate the
endianness of the file in which it is found. Although the standard
<code>.dex</code> format is little-endian, implementations may choose
to perform byte-swapping. Should an implementation come across a
header whose <code>endian_tag</code> is <code>REVERSE_ENDIAN_CONSTANT</code>
instead of <code>ENDIAN_CONSTANT</code>, it would know that the file
has been byte-swapped from the expected form.</p>

<pre>
uint ENDIAN_CONSTANT = 0x12345678;
uint REVERSE_ENDIAN_CONSTANT = 0x78563412;
</pre>
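
<p>As a rough illustration, the check can be written as follows. The
function name is hypothetical, and the caller is assumed to have already
located the four <code>endian_tag</code> bytes in the header; the offset of
that field is not covered in this part of the document.</p>

```python
import struct

ENDIAN_CONSTANT = 0x12345678
REVERSE_ENDIAN_CONSTANT = 0x78563412


def is_byte_swapped(endian_tag_bytes):
    """Read the four endian_tag bytes as a little-endian uint.

    Returns False for a standard file, True for a byte-swapped one.
    """
    (tag,) = struct.unpack("<I", endian_tag_bytes)
    if tag == ENDIAN_CONSTANT:
        return False
    if tag == REVERSE_ENDIAN_CONSTANT:
        return True
    raise ValueError("unrecognized endian_tag: 0x%08x" % tag)
```

<p>In a standard little-endian file the tag is stored as the bytes
<code>78 56 34 12</code>, so reading it as a little-endian
<code>uint</code> yields <code>ENDIAN_CONSTANT</code>.</p>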

<h3 id="no-index">NO_INDEX</h3>
<h4>embedded in class_def_item and debug_info_item</h4>

<p>The constant <code>NO_INDEX</code> is used to indicate that
an index value is absent.</p>

<p><b>Note:</b> This value isn't defined to be
<code>0</code>, because that is in fact typically a valid index.</p>

<p><b>Also Note:</b> The chosen value for <code>NO_INDEX</code> is
representable as a single byte in the <code>uleb128p1</code> encoding.</p>

<pre>
uint NO_INDEX = 0xffffffff;    // == -1 if treated as a signed int
</pre>

<h3 id="access-flags">access_flags definitions</h3>
<h4>embedded in class_def_item, encoded_field, encoded_method, and
InnerClass</h4>

<p>Bitfields of these flags are used to indicate the accessibility and
overall properties of classes and class members.</p>

<table class="accessFlags">
<thead>
<tr>
  <th>Name</th>
  <th>Value</th>
  <th>For Classes (and <code>InnerClass</code> annotations)</th>
  <th>For Fields</th>
  <th>For Methods</th>
</tr>
</thead>
<tbody>
<tr>
  <td>ACC_PUBLIC</td>
  <td>0x1</td>
  <td><code>public</code>: visible everywhere</td>
  <td><code>public</code>: visible everywhere</td>
  <td><code>public</code>: visible everywhere</td>
</tr>
<tr>
  <td>ACC_PRIVATE</td>
  <td>0x2</td>
  <td><sup>*</sup>
    <code>private</code>: only visible to defining class
  </td>
  <td><code>private</code>: only visible to defining class</td>
  <td><code>private</code>: only visible to defining class</td>
</tr>
<tr>
  <td>ACC_PROTECTED</td>
  <td>0x4</td>
  <td><sup>*</sup>
    <code>protected</code>: visible to package and subclasses
  </td>
  <td><code>protected</code>: visible to package and subclasses</td>
  <td><code>protected</code>: visible to package and subclasses</td>
</tr>
<tr>
  <td>ACC_STATIC</td>
  <td>0x8</td>
  <td><sup>*</sup>
    <code>static</code>: is not constructed with an outer
    <code>this</code> reference</td>
  <td><code>static</code>: global to defining class</td>
  <td><code>static</code>: does not take a <code>this</code> argument</td>
</tr>
<tr>
  <td>ACC_FINAL</td>
  <td>0x10</td>
  <td><code>final</code>: not subclassable</td>
  <td><code>final</code>: immutable after construction</td>
  <td><code>final</code>: not overridable</td>
</tr>
<tr>
  <td>ACC_SYNCHRONIZED</td>
  <td>0x20</td>
  <td>&nbsp;</td>
  <td>&nbsp;</td>
  <td><code>synchronized</code>: associated lock automatically acquired
    around call to this method. <b>Note:</b> This is only valid to set when
    <code>ACC_NATIVE</code> is also set.</td>
</tr>
<tr>
  <td>ACC_VOLATILE</td>
  <td>0x40</td>
  <td>&nbsp;</td>
  <td><code>volatile</code>: special access rules to help with thread
    safety</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>ACC_BRIDGE</td>
  <td>0x40</td>
  <td>&nbsp;</td>
  <td>&nbsp;</td>
  <td>bridge method, added automatically by compiler as a type-safe
    bridge</td>
</tr>
<tr>
  <td>ACC_TRANSIENT</td>
  <td>0x80</td>
  <td>&nbsp;</td>
  <td><code>transient</code>: not to be saved by default serialization</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>ACC_VARARGS</td>
  <td>0x80</td>
  <td>&nbsp;</td>
  <td>&nbsp;</td>
  <td>last argument should be treated as a "rest" argument by compiler</td>
</tr>
<tr>
  <td>ACC_NATIVE</td>
  <td>0x100</td>
  <td>&nbsp;</td>
  <td>&nbsp;</td>
  <td><code>native</code>: implemented in native code</td>
</tr>
<tr>
  <td>ACC_INTERFACE</td>
  <td>0x200</td>
  <td><code>interface</code>: multiply-implementable abstract class</td>
  <td>&nbsp;</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>ACC_ABSTRACT</td>
  <td>0x400</td>
  <td><code>abstract</code>: not directly instantiable</td>
  <td>&nbsp;</td>
  <td><code>abstract</code>: unimplemented by this class</td>
</tr>
<tr>
  <td>ACC_STRICT</td>
  <td>0x800</td>
  <td>&nbsp;</td>
  <td>&nbsp;</td>
  <td><code>strictfp</code>: strict rules for floating-point arithmetic</td>
</tr>
<tr>
  <td>ACC_SYNTHETIC</td>
  <td>0x1000</td>
  <td>not directly defined in source code</td>
  <td>not directly defined in source code</td>
  <td>not directly defined in source code</td>
</tr>
<tr>
  <td>ACC_ANNOTATION</td>
  <td>0x2000</td>
  <td>declared as an annotation class</td>
  <td>&nbsp;</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>ACC_ENUM</td>
  <td>0x4000</td>
  <td>declared as an enumerated type</td>
  <td>declared as an enumerated value</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td><i>(unused)</i></td>
  <td>0x8000</td>
  <td>&nbsp;</td>
  <td>&nbsp;</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>ACC_CONSTRUCTOR</td>
  <td>0x10000</td>
  <td>&nbsp;</td>
  <td>&nbsp;</td>
  <td>constructor method (class or instance initializer)</td>
</tr>
<tr>
  <td>ACC_DECLARED_<br/>SYNCHRONIZED</td>
  <td>0x20000</td>
  <td>&nbsp;</td>
  <td>&nbsp;</td>
  <td>declared <code>synchronized</code>. <b>Note:</b> This has no effect on
    execution (other than in reflection of this flag, per se).
  </td>
</tr>
</tbody>
</table>

<p><sup>*</sup> Only allowed for <code>InnerClass</code> annotations;
must never be set in a <code>class_def_item</code>.</p>
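
<p>Because some bit values are shared between fields and methods
(<code>0x40</code> and <code>0x80</code>), a flag word can only be named
correctly once the kind of item is known. The following sketch transcribes
the table into a lookup; the function name and the one-letter context codes
are inventions of this example, and the <code>"c"</code> context does not
distinguish <code>class_def_item</code> from <code>InnerClass</code>
annotations.</p>

```python
# (value, name, contexts) where contexts uses c = class, f = field, m = method.
ACCESS_FLAGS = [
    (0x1, "ACC_PUBLIC", "cfm"),
    (0x2, "ACC_PRIVATE", "cfm"),
    (0x4, "ACC_PROTECTED", "cfm"),
    (0x8, "ACC_STATIC", "cfm"),
    (0x10, "ACC_FINAL", "cfm"),
    (0x20, "ACC_SYNCHRONIZED", "m"),
    (0x40, "ACC_VOLATILE", "f"),
    (0x40, "ACC_BRIDGE", "m"),
    (0x80, "ACC_TRANSIENT", "f"),
    (0x80, "ACC_VARARGS", "m"),
    (0x100, "ACC_NATIVE", "m"),
    (0x200, "ACC_INTERFACE", "c"),
    (0x400, "ACC_ABSTRACT", "cm"),
    (0x800, "ACC_STRICT", "m"),
    (0x1000, "ACC_SYNTHETIC", "cfm"),
    (0x2000, "ACC_ANNOTATION", "c"),
    (0x4000, "ACC_ENUM", "cf"),
    (0x10000, "ACC_CONSTRUCTOR", "m"),
    (0x20000, "ACC_DECLARED_SYNCHRONIZED", "m"),
]


def describe_access_flags(flags, context):
    """List the flag names set in flags for context 'c', 'f', or 'm'."""
    if context not in ("c", "f", "m"):
        raise ValueError("context must be 'c', 'f', or 'm'")
    return [name for value, name, contexts in ACCESS_FLAGS
            if flags & value and context in contexts]
```

<p>For example, the value <code>0x40</code> is reported as
<code>ACC_VOLATILE</code> for a field and as <code>ACC_BRIDGE</code> for a
method.</p>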

<h3 id="mutf-8">MUTF-8 (Modified UTF-8) Encoding</h3>

<p>As a concession to easier legacy support, the <code>.dex</code> format
encodes its string data in a de facto standard modified UTF-8 form, hereafter
referred to as MUTF-8. This form is identical to standard UTF-8, except:</p>

<ul>
  <li>Only the one-, two-, and three-byte encodings are used.</li>
  <li>Code points in the range <code>U+10000</code> &hellip;
    <code>U+10ffff</code> are encoded as a surrogate pair, each of
    which is represented as a three-byte encoded value.</li>
  <li>The code point <code>U+0000</code> is encoded in two-byte form.</li>
  <li>A plain null byte (value <code>0</code>) indicates the end of
    a string, as is the standard C language interpretation.</li>
</ul>

<p>The first two items above can be summarized as: MUTF-8
is an encoding format for UTF-16, instead of being a more direct
encoding format for Unicode characters.</p>

<p>The final two items above make it simultaneously possible to include
the code point <code>U+0000</code> in a string <i>and</i> still manipulate
it as a C-style null-terminated string.</p>

<p>However, the special encoding of <code>U+0000</code> means that, unlike
normal UTF-8, the result of calling the standard C function
<code>strcmp()</code> on a pair of MUTF-8 strings does not always
indicate the properly signed result of comparison of <i>unequal</i> strings.
When ordering (not just equality) is a concern, the most straightforward
way to compare MUTF-8 strings is to decode them character by character,
and compare the decoded values. (However, more clever implementations are
also possible.)</p>
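
<p>The decode-then-compare approach can be sketched as below. This is a
minimal illustration, not a validating decoder: the function names are
invented for this example, continuation bytes are not checked, and the input
is assumed to be a null-terminated <code>bytes</code> object.</p>

```python
def decode_mutf8(data, pos=0):
    """Decode a null-terminated MUTF-8 string into UTF-16 code units."""
    units = []
    while data[pos] != 0:  # a plain null byte ends the string
        b0 = data[pos]
        if b0 < 0x80:  # one-byte form
            units.append(b0)
            pos += 1
        elif b0 & 0xE0 == 0xC0:  # two-byte form, including U+0000 as c0 80
            units.append(((b0 & 0x1F) << 6) | (data[pos + 1] & 0x3F))
            pos += 2
        elif b0 & 0xF0 == 0xE0:  # three-byte form, including surrogates
            units.append(((b0 & 0x0F) << 12)
                         | ((data[pos + 1] & 0x3F) << 6)
                         | (data[pos + 2] & 0x3F))
            pos += 3
        else:
            raise ValueError("four-byte encodings are not used in MUTF-8")
    return units


def units_to_str(units):
    """Join UTF-16 code units into a Python str, pairing surrogates."""
    raw = b"".join(u.to_bytes(2, "little") for u in units)
    return raw.decode("utf-16-le", "surrogatepass")


# Lists of code units compare element by element, giving the decoded order.
a = decode_mutf8(b"\xc0\x80\x00")  # the one-character string U+0000
b = decode_mutf8(b"\x01\x00")      # the one-character string U+0001
print(a < b)
```

<p>A bytewise comparison of the same two strings would order them the other
way around, because the first byte <code>0xc0</code> is greater than
<code>0x01</code>.</p>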

<p>Please refer to <a href="http://unicode.org">The Unicode
Standard</a> for further information about character encoding.
MUTF-8 is actually closer to the (relatively less well-known) encoding
<a href="http://www.unicode.org/reports/tr26/">CESU-8</a> than to UTF-8
per se.</p>

<h3 id="encoding">encoded_value encoding</h3>
<h4>embedded in annotation_element and encoded_array_item</h4>

<p>An <code>encoded_value</code> is an encoded piece of (nearly)
arbitrary hierarchically structured data. The encoding is meant to
be both compact and straightforward to parse.</p>

<table class="format">
<thead>
<tr>
  <th>Name</th>
  <th>Format</th>
  <th>Description</th>
</tr>
</thead>
<tbody>
<tr>
  <td>(value_arg &lt;&lt; 5) | value_type</td>
  <td>ubyte</td>
  <td>byte indicating the type of the immediately subsequent
    <code>value</code> along
    with an optional clarifying argument in the high-order three bits.
    See below for the various <code>value</code> definitions.
    In most cases, <code>value_arg</code> encodes the length of
    the immediately-subsequent <code>value</code> in bytes, as
    <code>(size - 1)</code>, e.g., <code>0</code> means that
    the value requires one byte, and <code>7</code> means it requires
    eight bytes; however, there are exceptions as noted below.
  </td>
</tr>
<tr>
  <td>value</td>
  <td>ubyte[]</td>
  <td>bytes representing the value, variable in length and interpreted
    differently for different <code>value_type</code> bytes, though
    always little-endian. See the various value definitions below for
    details.
  </td>
</tr>
</tbody>
</table>

<h3 id="value-formats">Value formats</h3>

<table class="encodedValue">
<thead>
<tr>
  <th>Type Name</th>
  <th><code>value_type</code></th>
  <th><code>value_arg</code> Format</th>
  <th><code>value</code> Format</th>
  <th>Description</th>
</tr>
</thead>
<tbody>
<tr>
  <td>VALUE_BYTE</td>
  <td>0x00</td>
  <td><i>(none; must be <code>0</code>)</i></td>
  <td>ubyte[1]</td>
  <td>signed one-byte integer value</td>
</tr>
<tr>
  <td>VALUE_SHORT</td>
  <td>0x02</td>
  <td>size - 1 (0&hellip;1)</td>
  <td>ubyte[size]</td>
  <td>signed two-byte integer value, sign-extended</td>
</tr>
<tr>
  <td>VALUE_CHAR</td>
  <td>0x03</td>
  <td>size - 1 (0&hellip;1)</td>
  <td>ubyte[size]</td>
  <td>unsigned two-byte integer value, zero-extended</td>
</tr>
<tr>
  <td>VALUE_INT</td>
  <td>0x04</td>
  <td>size - 1 (0&hellip;3)</td>
  <td>ubyte[size]</td>
  <td>signed four-byte integer value, sign-extended</td>
</tr>
<tr>
  <td>VALUE_LONG</td>
  <td>0x06</td>
  <td>size - 1 (0&hellip;7)</td>
  <td>ubyte[size]</td>
  <td>signed eight-byte integer value, sign-extended</td>
</tr>
<tr>
  <td>VALUE_FLOAT</td>
  <td>0x10</td>
  <td>size - 1 (0&hellip;3)</td>
  <td>ubyte[size]</td>
  <td>four-byte bit pattern, zero-extended <i>to the right</i>, and
    interpreted as an IEEE754 32-bit floating point value
  </td>
</tr>
<tr>
  <td>VALUE_DOUBLE</td>
  <td>0x11</td>
  <td>size - 1 (0&hellip;7)</td>
  <td>ubyte[size]</td>
  <td>eight-byte bit pattern, zero-extended <i>to the right</i>, and
    interpreted as an IEEE754 64-bit floating point value
  </td>
</tr>
<tr>
  <td>VALUE_STRING</td>
  <td>0x17</td>
  <td>size - 1 (0&hellip;3)</td>
  <td>ubyte[size]</td>
  <td>unsigned (zero-extended) four-byte integer value,
    interpreted as an index into
    the <code>string_ids</code> section and representing a string value
  </td>
</tr>
<tr>
  <td>VALUE_TYPE</td>
  <td>0x18</td>
  <td>size - 1 (0&hellip;3)</td>
  <td>ubyte[size]</td>
  <td>unsigned (zero-extended) four-byte integer value,
    interpreted as an index into
    the <code>type_ids</code> section and representing a reflective
    type/class value
  </td>
</tr>
<tr>
  <td>VALUE_FIELD</td>
  <td>0x19</td>
  <td>size - 1 (0&hellip;3)</td>
  <td>ubyte[size]</td>
  <td>unsigned (zero-extended) four-byte integer value,
    interpreted as an index into
    the <code>field_ids</code> section and representing a reflective
    field value
  </td>
</tr>
<tr>
  <td>VALUE_METHOD</td>
  <td>0x1a</td>
  <td>size - 1 (0&hellip;3)</td>
  <td>ubyte[size]</td>
  <td>unsigned (zero-extended) four-byte integer value,
    interpreted as an index into
    the <code>method_ids</code> section and representing a reflective
    method value
  </td>
</tr>
<tr>
  <td>VALUE_ENUM</td>
  <td>0x1b</td>
  <td>size - 1 (0&hellip;3)</td>
  <td>ubyte[size]</td>
  <td>unsigned (zero-extended) four-byte integer value,
    interpreted as an index into
    the <code>field_ids</code> section and representing the value of
    an enumerated type constant
  </td>
</tr>
<tr>
  <td>VALUE_ARRAY</td>
  <td>0x1c</td>
  <td><i>(none; must be <code>0</code>)</i></td>
  <td>encoded_array</td>
  <td>an array of values, in the format specified by
    "<code>encoded_array</code> format" below. The size
    of the <code>value</code> is implicit in the encoding.
  </td>
</tr>
<tr>
  <td>VALUE_ANNOTATION</td>
  <td>0x1d</td>
  <td><i>(none; must be <code>0</code>)</i></td>
  <td>encoded_annotation</td>
  <td>a sub-annotation, in the format specified by
    "<code>encoded_annotation</code> format" below. The size
    of the <code>value</code> is implicit in the encoding.
  </td>
</tr>
<tr>
  <td>VALUE_NULL</td>
  <td>0x1e</td>
  <td><i>(none; must be <code>0</code>)</i></td>
  <td><i>(none)</i></td>
  <td><code>null</code> reference value</td>
</tr>
<tr>
  <td>VALUE_BOOLEAN</td>
  <td>0x1f</td>
  <td>boolean (0&hellip;1)</td>
  <td><i>(none)</i></td>
  <td>one-bit value; <code>0</code> for <code>false</code> and
    <code>1</code> for <code>true</code>. The bit is represented in the
    <code>value_arg</code>.
  </td>
</tr>
</tbody>
</table>
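
<p>To make the sign-extension, zero-extension, and extend-to-the-right rules
concrete, here is a sketch of a decoder for the scalar value types. It is an
illustration under stated assumptions: the function name is invented, the
permitted range of <code>value_arg</code> is not validated, and
<code>VALUE_ARRAY</code> and <code>VALUE_ANNOTATION</code> are deliberately
left out.</p>

```python
import struct

VALUE_FLOAT, VALUE_DOUBLE = 0x10, 0x11
VALUE_NULL, VALUE_BOOLEAN = 0x1E, 0x1F
SIGNED_TYPES = {0x00, 0x02, 0x04, 0x06}  # byte, short, int, long
# char plus the string, type, field, method, and enum index types
UNSIGNED_TYPES = {0x03, 0x17, 0x18, 0x19, 0x1A, 0x1B}


def read_encoded_value(data, pos=0):
    """Decode one scalar encoded_value.

    Returns (value_type, value, next_pos).
    """
    header = data[pos]
    value_type = header & 0x1F  # low five bits
    value_arg = header >> 5     # high three bits
    pos += 1
    if value_type == VALUE_NULL:
        return value_type, None, pos
    if value_type == VALUE_BOOLEAN:
        return value_type, bool(value_arg), pos  # the bit lives in value_arg
    size = value_arg + 1
    raw = bytes(data[pos:pos + size])
    if value_type in SIGNED_TYPES:
        value = int.from_bytes(raw, "little", signed=True)
    elif value_type in UNSIGNED_TYPES:
        value = int.from_bytes(raw, "little", signed=False)
    elif value_type == VALUE_FLOAT:
        # Stored bytes are the high-order ones; omitted low bytes are zero.
        (value,) = struct.unpack("<f", raw.rjust(4, b"\x00"))
    elif value_type == VALUE_DOUBLE:
        (value,) = struct.unpack("<d", raw.rjust(8, b"\x00"))
    else:
        raise NotImplementedError("value_type 0x%02x" % value_type)
    return value_type, value, pos + size
```

<p>For example, the float <code>1.0</code> has the bit pattern
<code>0x3f800000</code>, so it can be stored as the header byte
<code>0x30</code> followed by just the two bytes <code>80 3f</code>.</p>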

<h3 id="encoded-array">encoded_array format</h3>

<table class="format">
<thead>
<tr>
  <th>Name</th>
  <th>Format</th>
  <th>Description</th>
</tr>
</thead>
<tbody>
<tr>
  <td>size</td>
  <td>uleb128</td>
  <td>number of elements in the array</td>
</tr>
<tr>
  <td>values</td>
  <td>encoded_value[size]</td>
  <td>a series of <code>size</code> <code>encoded_value</code> byte
    sequences in the format specified by this section, concatenated
    sequentially.
  </td>
</tr>
</tbody>
</table>

<h3 id="encoded-annotation">encoded_annotation format</h3>

<table class="format">
<thead>
<tr>
  <th>Name</th>
  <th>Format</th>
  <th>Description</th>
</tr>
</thead>
<tbody>
<tr>
  <td>type_idx</td>
  <td>uleb128</td>
  <td>type of the annotation. This must be a class (not array or primitive)
    type.
  </td>
</tr>
<tr>
  <td>size</td>
  <td>uleb128</td>
  <td>number of name-value mappings in this annotation</td>
</tr>
<tr>
  <td>elements</td>
  <td>annotation_element[size]</td>
  <td>elements of the annotation, represented directly in-line (not as
    offsets). Elements must be sorted in increasing order by
    <code>string_id</code> index.
  </td>
</tr>
</tbody>
</table>

<h3 id="annotation-element">annotation_element format</h3>

<table class="format">
<thead>
<tr>
  <th>Name</th>
  <th>Format</th>
  <th>Description</th>
</tr>
</thead>
<tbody>
<tr>
  <td>name_idx</td>
  <td>uleb128</td>
  <td>element name, represented as an index into the
    <code>string_ids</code> section. The string must conform to the
    syntax for <i>MemberName</i>, defined below.
  </td>
</tr>
<tr>
  <td>value</td>
  <td>encoded_value</td>
  <td>element value</td>
</tr>
</tbody>
</table>

<h2 id="string-syntax">String syntax</h2>

<p>There are several kinds of item in a <code>.dex</code> file which
ultimately refer to a string. The following BNF-style definitions
indicate the acceptable syntax for these strings.</p>

<h3 id="simplename"><i>SimpleName</i></h3>

<p>A <i>SimpleName</i> is the basis for the syntax of the names of other
things. The <code>.dex</code> format allows a fair amount of latitude
here (much more than most common source languages). In brief, a simple
name consists of any low-ASCII alphabetic character or digit, a few
specific low-ASCII symbols, and most non-ASCII code points that are not
control, space, or special characters. Note that surrogate code points
(in the range <code>U+d800</code> &hellip; <code>U+dfff</code>) are not
considered valid name characters, per se, but Unicode supplemental
characters <i>are</i> valid (which are represented by the final
alternative of the rule for <i>SimpleNameChar</i>), and they should be
represented in a file as pairs of surrogate code points in the MUTF-8
encoding.</p>

<table class="bnf">
  <tr><td colspan="2" class="def"><i>SimpleName</i> &rarr;</td></tr>
  <tr>
    <td/>
    <td><i>SimpleNameChar</i> (<i>SimpleNameChar</i>)*</td>
  </tr>

  <tr><td colspan="2" class="def"><i>SimpleNameChar</i> &rarr;</td></tr>
  <tr>
    <td/>
    <td><code>'A'</code> &hellip; <code>'Z'</code></td>
  </tr>
  <tr>
    <td class="bar">|</td>
    <td><code>'a'</code> &hellip; <code>'z'</code></td>
  </tr>
  <tr>
    <td class="bar">|</td>
    <td><code>'0'</code> &hellip; <code>'9'</code></td>
  </tr>
  <tr>
    <td class="bar">|</td>
    <td><code>'$'</code></td>
  </tr>
  <tr>
    <td class="bar">|</td>
    <td><code>'-'</code></td>
  </tr>
  <tr>
    <td class="bar">|</td>
    <td><code>'_'</code></td>
  </tr>
  <tr>
    <td class="bar">|</td>
    <td><code>U+00a1</code> &hellip; <code>U+1fff</code></td>
  </tr>
  <tr>
    <td class="bar">|</td>
    <td><code>U+2010</code> &hellip; <code>U+2027</code></td>
  </tr>
  <tr>
    <td class="bar">|</td>
    <td><code>U+2030</code> &hellip; <code>U+d7ff</code></td>
  </tr>
  <tr>
    <td class="bar">|</td>
    <td><code>U+e000</code> &hellip; <code>U+ffef</code></td>
  </tr>
  <tr>
    <td class="bar">|</td>
    <td><code>U+10000</code> &hellip; <code>U+10ffff</code></td>
  </tr>
</table>
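
<p>The <i>SimpleName</i> grammar maps directly onto a regular-expression
character class. The sketch below assumes the name has already been decoded
from MUTF-8 into a Python <code>str</code>, so that supplemental characters
appear as single code points rather than surrogate pairs; the names
<code>SIMPLE_NAME</code> and <code>is_simple_name</code> are invented for
this example.</p>

```python
import re

# One range per alternative of SimpleNameChar, in grammar order.
_SIMPLE_NAME_CHARS = (
    "A-Z"
    "a-z"
    "0-9"
    "$"
    "\\-"
    "_"
    "\u00a1-\u1fff"
    "\u2010-\u2027"
    "\u2030-\ud7ff"
    "\ue000-\uffef"
    "\U00010000-\U0010ffff"
)
SIMPLE_NAME = re.compile("[" + _SIMPLE_NAME_CHARS + "]+")


def is_simple_name(name):
    """Return True if name consists of one or more SimpleNameChar."""
    return SIMPLE_NAME.fullmatch(name) is not None
```

<p>Note that the gaps between the ranges exclude, among others, the
non-breaking space <code>U+00a0</code> and the line and paragraph separators
<code>U+2028</code> and <code>U+2029</code>.</p>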
898
Clay Murphye4edda62014-10-16 19:00:15 -0700899<h3 id="membername"><i>MemberName</i></h3>
Clay Murphy945af1a2013-07-01 17:31:13 -0700900<h4>used by field_id_item and method_id_item</h4>
Dan Bornstein25705bc2011-04-12 16:23:13 -0700901
902<p>A <i>MemberName</i> is the name of a member of a class, members being
903fields, methods, and inner classes.</p>
904
905<table class="bnf">
906 <tr><td colspan="2" class="def"><i>MemberName</i> &rarr;</td></tr>
907 <tr>
908 <td/>
909 <td><i>SimpleName</i></td>
910 </tr>
911 <tr>
912 <td class="bar">|</td>
913 <td><code>'&lt;'</code> <i>SimpleName</i> <code>'&gt;'</code></td>
914 </tr>
915</table>
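
<p>The <i>SimpleName</i> and <i>MemberName</i> rules above can be checked
mechanically. The following is a minimal sketch, not a reference
implementation: the function names are illustrative, and it assumes the
string has already been decoded from MUTF-8, so supplemental characters
appear as single code points rather than surrogate pairs.</p>

```python
def is_simple_name_char(ch):
    """Return True if ch matches one alternative of SimpleNameChar."""
    if ch.isascii():
        # 'A'..'Z', 'a'..'z', '0'..'9', '$', '-', '_'
        return ch.isalnum() or ch in "$-_"
    cp = ord(ch)
    return (0x00A1 <= cp <= 0x1FFF
            or 0x2010 <= cp <= 0x2027
            or 0x2030 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFEF
            or 0x10000 <= cp <= 0x10FFFF)


def is_simple_name(name):
    """SimpleName: one or more SimpleNameChar."""
    return len(name) > 0 and all(is_simple_name_char(c) for c in name)


def is_member_name(name):
    """MemberName: SimpleName, or '<' SimpleName '>'."""
    if len(name) >= 2 and name.startswith("<") and name.endswith(">"):
        return is_simple_name(name[1:-1])
    return is_simple_name(name)
```

<p>Under these rules <code>&lt;init&gt;</code> and <code>access$000</code>
are accepted, while <code>a.b</code> and the empty string are rejected.</p>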
916
<h3 id="fullclassname"><i>FullClassName</i></h3>

<p>A <i>FullClassName</i> is a fully-qualified class name, including an
optional package specifier followed by a required name.</p>
921
922<table class="bnf">
923 <tr><td colspan="2" class="def"><i>FullClassName</i> &rarr;</td></tr>
924 <tr>
925 <td/>
926 <td><i>OptionalPackagePrefix</i> <i>SimpleName</i></td>
927 </tr>
928
929 <tr><td colspan="2" class="def"><i>OptionalPackagePrefix</i> &rarr;</td></tr>
930 <tr>
931 <td/>
932 <td>(<i>SimpleName</i> <code>'/'</code>)*</td>
933 </tr>
934</table>
935
<h3 id="typedescriptor"><i>TypeDescriptor</i></h3>
<h4>used by type_id_item</h4>

<p>A <i>TypeDescriptor</i> is the representation of any type, including
primitives, classes, arrays, and <code>void</code>. See
"<i>TypeDescriptor</i> Semantics" below for the meaning of each variant.</p>
942
943<table class="bnf">
944 <tr><td colspan="2" class="def"><i>TypeDescriptor</i> &rarr;</td></tr>
945 <tr>
946 <td/>
947 <td><code>'V'</code></td>
948 </tr>
949 <tr>
950 <td class="bar">|</td>
951 <td><i>FieldTypeDescriptor</i></td>
952 </tr>
953
954 <tr><td colspan="2" class="def"><i>FieldTypeDescriptor</i> &rarr;</td></tr>
955 <tr>
956 <td/>
957 <td><i>NonArrayFieldTypeDescriptor</i></td>
958 </tr>
959 <tr>
960 <td class="bar">|</td>
961 <td>(<code>'['</code> * 1&hellip;255)
962 <i>NonArrayFieldTypeDescriptor</i></td>
963 </tr>
964
965 <tr>
966 <td colspan="2" class="def"><i>NonArrayFieldTypeDescriptor</i>&rarr;</td>
967 </tr>
968 <tr>
969 <td/>
970 <td><code>'Z'</code></td>
971 </tr>
972 <tr>
973 <td class="bar">|</td>
974 <td><code>'B'</code></td>
975 </tr>
976 <tr>
977 <td class="bar">|</td>
978 <td><code>'S'</code></td>
979 </tr>
980 <tr>
981 <td class="bar">|</td>
982 <td><code>'C'</code></td>
983 </tr>
984 <tr>
985 <td class="bar">|</td>
986 <td><code>'I'</code></td>
987 </tr>
988 <tr>
989 <td class="bar">|</td>
990 <td><code>'J'</code></td>
991 </tr>
992 <tr>
993 <td class="bar">|</td>
994 <td><code>'F'</code></td>
995 </tr>
996 <tr>
997 <td class="bar">|</td>
998 <td><code>'D'</code></td>
999 </tr>
1000 <tr>
1001 <td class="bar">|</td>
1002 <td><code>'L'</code> <i>FullClassName</i> <code>';'</code></td>
1003 </tr>
1004</table>
1005
<h3 id="shortydescriptor"><i>ShortyDescriptor</i></h3>
<h4>used by proto_id_item</h4>

<p>A <i>ShortyDescriptor</i> is the short form representation of a method
prototype, including return and parameter types, except that there is
no distinction between various reference (class or array) types. Instead,
all reference types are represented by a single <code>'L'</code> character.</p>
1013
1014<table class="bnf">
1015 <tr><td colspan="2" class="def"><i>ShortyDescriptor</i> &rarr;</td></tr>
1016 <tr>
1017 <td/>
1018 <td><i>ShortyReturnType</i> (<i>ShortyFieldType</i>)*</td>
1019 </tr>
1020
1021 <tr><td colspan="2" class="def"><i>ShortyReturnType</i> &rarr;</td></tr>
1022 <tr>
1023 <td/>
1024 <td><code>'V'</code></td>
1025 </tr>
1026 <tr>
1027 <td class="bar">|</td>
1028 <td><i>ShortyFieldType</i></td>
1029 </tr>
1030
1031 <tr><td colspan="2" class="def"><i>ShortyFieldType</i> &rarr;</td></tr>
1032 <tr>
1033 <td/>
1034 <td><code>'Z'</code></td>
1035 </tr>
1036 <tr>
1037 <td class="bar">|</td>
1038 <td><code>'B'</code></td>
1039 </tr>
1040 <tr>
1041 <td class="bar">|</td>
1042 <td><code>'S'</code></td>
1043 </tr>
1044 <tr>
1045 <td class="bar">|</td>
1046 <td><code>'C'</code></td>
1047 </tr>
1048 <tr>
1049 <td class="bar">|</td>
1050 <td><code>'I'</code></td>
1051 </tr>
1052 <tr>
1053 <td class="bar">|</td>
1054 <td><code>'J'</code></td>
1055 </tr>
1056 <tr>
1057 <td class="bar">|</td>
1058 <td><code>'F'</code></td>
1059 </tr>
1060 <tr>
1061 <td class="bar">|</td>
1062 <td><code>'D'</code></td>
1063 </tr>
1064 <tr>
1065 <td class="bar">|</td>
1066 <td><code>'L'</code></td>
1067 </tr>
1068</table>
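
<p>As an illustration of the rule that every reference type collapses to
<code>'L'</code>, the sketch below derives a shorty string from full type
descriptors. The helper names are hypothetical, and the code assumes its
inputs are already valid <i>TypeDescriptor</i> strings.</p>

```python
def shorty_char(type_descriptor):
    """Map one TypeDescriptor to its single ShortyDescriptor character."""
    first = type_descriptor[0]
    # Class ('L...;') and array ('[...') types are both references.
    return "L" if first in "L[" else first


def make_shorty(return_type, parameter_types):
    """Build a ShortyDescriptor: return type first, then each parameter."""
    return shorty_char(return_type) + "".join(
        shorty_char(p) for p in parameter_types)
```

<p>For example, a method returning <code>Ljava/lang/String;</code> and
taking <code>I</code>, <code>[J</code>, and <code>Ljava/lang/Object;</code>
has the shorty <code>LILL</code>.</p>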
1069
<h3 id="typedescriptor-semantics"><i>TypeDescriptor</i> Semantics</h3>

<p>This is the meaning of each of the variants of <i>TypeDescriptor</i>.</p>
1073
1074<table class="descriptor">
1075<thead>
1076<tr>
1077 <th>Syntax</th>
1078 <th>Meaning</th>
1079</tr>
1080</thead>
1081<tbody>
1082<tr>
1083 <td>V</td>
1084 <td><code>void</code>; only valid for return types</td>
1085</tr>
1086<tr>
1087 <td>Z</td>
1088 <td><code>boolean</code></td>
1089</tr>
1090<tr>
1091 <td>B</td>
1092 <td><code>byte</code></td>
1093</tr>
1094<tr>
1095 <td>S</td>
1096 <td><code>short</code></td>
1097</tr>
1098<tr>
1099 <td>C</td>
1100 <td><code>char</code></td>
1101</tr>
1102<tr>
1103 <td>I</td>
1104 <td><code>int</code></td>
1105</tr>
1106<tr>
1107 <td>J</td>
1108 <td><code>long</code></td>
1109</tr>
1110<tr>
1111 <td>F</td>
1112 <td><code>float</code></td>
1113</tr>
1114<tr>
1115 <td>D</td>
1116 <td><code>double</code></td>
1117</tr>
1118<tr>
1119 <td>L<i>fully/qualified/Name</i>;</td>
1120 <td>the class <code><i>fully.qualified.Name</i></code></td>
1121</tr>
1122<tr>
1123 <td>[<i>descriptor</i></td>
1124 <td>array of <code><i>descriptor</i></code>, usable recursively for
1125 arrays-of-arrays, though it is invalid to have more than 255
1126 dimensions.
1127 </td>
1128</tr>
1129</tbody>
1130</table>
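
<p>The table above can be read as a small translation routine. The
following sketch turns a descriptor into a source-style type name; the
function name is illustrative, and it deliberately does not validate the
characters of the class name against <i>SimpleNameChar</i>.</p>

```python
PRIMITIVES = {
    "V": "void", "Z": "boolean", "B": "byte", "S": "short", "C": "char",
    "I": "int", "J": "long", "F": "float", "D": "double",
}


def describe_type(descriptor):
    """Translate a TypeDescriptor into a readable type name."""
    dims = len(descriptor) - len(descriptor.lstrip("["))
    if dims > 255:
        raise ValueError("more than 255 array dimensions")
    base = descriptor[dims:]
    if base in PRIMITIVES:
        if base == "V" and dims > 0:
            # 'V' is not a FieldTypeDescriptor, so it cannot be an element type.
            raise ValueError("array of void is not a valid type")
        name = PRIMITIVES[base]
    elif len(base) > 2 and base.startswith("L") and base.endswith(";"):
        name = base[1:-1].replace("/", ".")
    else:
        raise ValueError("malformed descriptor: %r" % descriptor)
    return name + "[]" * dims
```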
1131
<h2 id="items">Items and related structures</h2>

<p>This section includes definitions for each of the top-level items that
may appear in a <code>.dex</code> file.</p>
1136
<h3 id="header-item">header_item</h3>
<h4>appears in the header section</h4>
<h4>alignment: 4 bytes</h4>
1140
1141<table class="format">
1142<thead>
1143<tr>
1144 <th>Name</th>
1145 <th>Format</th>
1146 <th>Description</th>
1147</tr>
1148</thead>
1149<tbody>
1150<tr>
1151 <td>magic</td>
1152 <td>ubyte[8] = DEX_FILE_MAGIC</td>
1153 <td>magic value. See discussion above under "<code>DEX_FILE_MAGIC</code>"
1154 for more details.
1155 </td>
1156</tr>
1157<tr>
1158 <td>checksum</td>
1159 <td>uint</td>
1160 <td>adler32 checksum of the rest of the file (everything but
1161 <code>magic</code> and this field); used to detect file corruption
1162 </td>
1163</tr>
1164<tr>
1165 <td>signature</td>
1166 <td>ubyte[20]</td>
1167 <td>SHA-1 signature (hash) of the rest of the file (everything but
1168 <code>magic</code>, <code>checksum</code>, and this field); used
1169 to uniquely identify files
1170 </td>
1171</tr>
1172<tr>
1173 <td>file_size</td>
1174 <td>uint</td>
 <td>size of the entire file (including the header), in bytes</td>
1176</tr>
1177<tr>
1178 <td>header_size</td>
1179 <td>uint = 0x70</td>
1180 <td>size of the header (this entire section), in bytes. This allows for at
1181 least a limited amount of backwards/forwards compatibility without
1182 invalidating the format.
1183 </td>
1184</tr>
1185<tr>
1186 <td>endian_tag</td>
1187 <td>uint = ENDIAN_CONSTANT</td>
1188 <td>endianness tag. See discussion above under "<code>ENDIAN_CONSTANT</code>
1189 and <code>REVERSE_ENDIAN_CONSTANT</code>" for more details.
1190 </td>
1191</tr>
1192<tr>
1193 <td>link_size</td>
1194 <td>uint</td>
1195 <td>size of the link section, or <code>0</code> if this file isn't
1196 statically linked</td>
1197</tr>
1198<tr>
1199 <td>link_off</td>
1200 <td>uint</td>
1201 <td>offset from the start of the file to the link section, or
1202 <code>0</code> if <code>link_size == 0</code>. The offset, if non-zero,
1203 should be to an offset into the <code>link_data</code> section. The
1204 format of the data pointed at is left unspecified by this document;
1205 this header field (and the previous) are left as hooks for use by
1206 runtime implementations.
1207 </td>
1208</tr>
1209<tr>
1210 <td>map_off</td>
1211 <td>uint</td>
1212 <td>offset from the start of the file to the map item, or
1213 <code>0</code> if this file has no map. The offset, if non-zero,
1214 should be to an offset into the <code>data</code> section,
1215 and the data should be in the format specified by "<code>map_list</code>"
1216 below.
1217 </td>
1218</tr>
1219<tr>
1220 <td>string_ids_size</td>
1221 <td>uint</td>
1222 <td>count of strings in the string identifiers list</td>
1223</tr>
1224<tr>
1225 <td>string_ids_off</td>
1226 <td>uint</td>
1227 <td>offset from the start of the file to the string identifiers list, or
1228 <code>0</code> if <code>string_ids_size == 0</code> (admittedly a
1229 strange edge case). The offset, if non-zero,
1230 should be to the start of the <code>string_ids</code> section.
1231 </td>
1232</tr>
1233<tr>
1234 <td>type_ids_size</td>
1235 <td>uint</td>
1236 <td>count of elements in the type identifiers list</td>
1237</tr>
1238<tr>
1239 <td>type_ids_off</td>
1240 <td>uint</td>
1241 <td>offset from the start of the file to the type identifiers list, or
1242 <code>0</code> if <code>type_ids_size == 0</code> (admittedly a
1243 strange edge case). The offset, if non-zero,
1244 should be to the start of the <code>type_ids</code>
1245 section.
1246 </td>
1247</tr>
1248<tr>
1249 <td>proto_ids_size</td>
1250 <td>uint</td>
1251 <td>count of elements in the prototype identifiers list</td>
1252</tr>
1253<tr>
1254 <td>proto_ids_off</td>
1255 <td>uint</td>
1256 <td>offset from the start of the file to the prototype identifiers list, or
1257 <code>0</code> if <code>proto_ids_size == 0</code> (admittedly a
1258 strange edge case). The offset, if non-zero,
1259 should be to the start of the <code>proto_ids</code>
1260 section.
1261 </td>
1262</tr>
1263<tr>
1264 <td>field_ids_size</td>
1265 <td>uint</td>
1266 <td>count of elements in the field identifiers list</td>
1267</tr>
1268<tr>
1269 <td>field_ids_off</td>
1270 <td>uint</td>
1271 <td>offset from the start of the file to the field identifiers list, or
1272 <code>0</code> if <code>field_ids_size == 0</code>. The offset, if
1273 non-zero, should be to the start of the <code>field_ids</code>
 section.</td>
1276</tr>
1277<tr>
1278 <td>method_ids_size</td>
1279 <td>uint</td>
1280 <td>count of elements in the method identifiers list</td>
1281</tr>
1282<tr>
1283 <td>method_ids_off</td>
1284 <td>uint</td>
1285 <td>offset from the start of the file to the method identifiers list, or
1286 <code>0</code> if <code>method_ids_size == 0</code>. The offset, if
1287 non-zero, should be to the start of the <code>method_ids</code>
1288 section.</td>
1289</tr>
1290<tr>
1291 <td>class_defs_size</td>
1292 <td>uint</td>
1293 <td>count of elements in the class definitions list</td>
1294</tr>
1295<tr>
1296 <td>class_defs_off</td>
1297 <td>uint</td>
1298 <td>offset from the start of the file to the class definitions list, or
1299 <code>0</code> if <code>class_defs_size == 0</code> (admittedly a
1300 strange edge case). The offset, if non-zero,
1301 should be to the start of the <code>class_defs</code> section.
1302 </td>
1303</tr>
1304<tr>
1305 <td>data_size</td>
1306 <td>uint</td>
 <td>size of the <code>data</code> section, in bytes. Must be an integral
 multiple of <code>sizeof(uint)</code> (that is, of 4 bytes).</td>
1309</tr>
1310<tr>
1311 <td>data_off</td>
1312 <td>uint</td>
1313 <td>offset from the start of the file to the start of the
1314 <code>data</code> section.
1315 </td>
1316</tr>
1317</tbody>
1318</table>
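
<p>Because every field of <code>header_item</code> has a fixed width, the
whole header can be unpacked in one step, and the <code>checksum</code> and
<code>signature</code> fields can be recomputed from the byte ranges the
table describes. The sketch below assumes a little-endian file held
entirely in memory; <code>HeaderItem</code>, <code>parse_header</code>, and
<code>check_integrity</code> are illustrative names, not part of the
format.</p>

```python
import hashlib
import struct
import zlib
from collections import namedtuple

# magic (8 bytes), checksum (uint), signature (20 bytes), then 20 uints:
# 8 + 4 + 20 + 80 = 0x70 bytes, with no padding in little-endian mode.
HEADER_FORMAT = "<8sI20s20I"

HeaderItem = namedtuple("HeaderItem", [
    "magic", "checksum", "signature", "file_size", "header_size",
    "endian_tag", "link_size", "link_off", "map_off",
    "string_ids_size", "string_ids_off", "type_ids_size", "type_ids_off",
    "proto_ids_size", "proto_ids_off", "field_ids_size", "field_ids_off",
    "method_ids_size", "method_ids_off", "class_defs_size", "class_defs_off",
    "data_size", "data_off",
])


def parse_header(dex):
    """Unpack the header_item found at the start of the file."""
    return HeaderItem._make(struct.unpack_from(HEADER_FORMAT, dex, 0))


def check_integrity(dex):
    """Recompute checksum and signature and compare them to the header."""
    header = parse_header(dex)
    # checksum covers everything after magic (8) and checksum (4).
    checksum_ok = zlib.adler32(dex[12:]) == header.checksum
    # signature covers everything after magic, checksum and signature (20).
    signature_ok = hashlib.sha1(dex[32:]).digest() == header.signature
    return checksum_ok and signature_ok and header.file_size == len(dex)
```

<p>Note that the signature must be computed before the checksum when
writing a file, since the checksum covers the signature bytes.</p>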
1319
<h3 id="map-list">map_list</h3>
<h4>appears in the data section</h4>
<h4>referenced from header_item</h4>
<h4>alignment: 4 bytes</h4>

<p>This is a list of the entire contents of a file, in order. It
contains some redundancy with respect to the <code>header_item</code>
but is intended to be an easy form to use to iterate over an entire
file. A given type must appear at most once in a map, but there is no
restriction on what order types may appear in, other than the
restrictions implied by the rest of the format (e.g., a
<code>header</code> section must appear first, followed by a
<code>string_ids</code> section, etc.). Additionally, the map entries must
be ordered by initial offset and must not overlap.</p>
1334
1335<table class="format">
1336<thead>
1337<tr>
1338 <th>Name</th>
1339 <th>Format</th>
1340 <th>Description</th>
1341</tr>
1342</thead>
1343<tbody>
1344<tr>
1345 <td>size</td>
1346 <td>uint</td>
1347 <td>size of the list, in entries</td>
1348</tr>
1349<tr>
1350 <td>list</td>
1351 <td>map_item[size]</td>
1352 <td>elements of the list</td>
1353</tr>
1354</tbody>
1355</table>
1356
<h3 id="map-item">map_item format</h3>

1359<table class="format">
1360<thead>
1361<tr>
1362 <th>Name</th>
1363 <th>Format</th>
1364 <th>Description</th>
1365</tr>
1366</thead>
1367<tbody>
1368<tr>
1369 <td>type</td>
1370 <td>ushort</td>
1371 <td>type of the items; see table below</td>
1372</tr>
1373<tr>
1374 <td>unused</td>
1375 <td>ushort</td>
1376 <td><i>(unused)</i></td>
1377</tr>
1378<tr>
1379 <td>size</td>
1380 <td>uint</td>
1381 <td>count of the number of items to be found at the indicated offset</td>
1382</tr>
1383<tr>
1384 <td>offset</td>
1385 <td>uint</td>
1386 <td>offset from the start of the file to the items in question</td>
1387</tr>
1388</tbody>
1389</table>
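
<p>A <code>map_list</code> is a <code>uint</code> count followed by that
many 12-byte <code>map_item</code> entries, so iterating over it is
straightforward. The following sketch uses illustrative function names and
assumes a little-endian file. The ordering check only covers the two
constraints that can be verified from the map alone (unique types and
strictly increasing offsets); detecting overlap in general requires parsing
the items themselves.</p>

```python
import struct


def parse_map_list(dex, map_off):
    """Return a list of (type, size, offset) tuples from a map_list."""
    (count,) = struct.unpack_from("<I", dex, map_off)
    entries = []
    for i in range(count):
        # map_item: ushort type, ushort unused, uint size, uint offset
        item_type, _unused, size, offset = struct.unpack_from(
            "<HHII", dex, map_off + 4 + 12 * i)
        entries.append((item_type, size, offset))
    return entries


def is_well_ordered(entries):
    """Check that types are unique and offsets strictly increase."""
    types = [t for t, _size, _off in entries]
    offsets = [off for _t, _size, off in entries]
    unique_types = len(set(types)) == len(types)
    increasing = all(a < b for a, b in zip(offsets, offsets[1:]))
    return unique_types and increasing
```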
1390
1391
<h3 id="type-codes">Type Codes</h3>

1394<table class="typeCodes">
1395<thead>
1396<tr>
1397 <th>Item Type</th>
1398 <th>Constant</th>
1399 <th>Value</th>
1400 <th>Item Size In Bytes</th>
1401</tr>
1402</thead>
1403<tbody>
1404<tr>
1405 <td>header_item</td>
1406 <td>TYPE_HEADER_ITEM</td>
1407 <td>0x0000</td>
1408 <td>0x70</td>
1409</tr>
1410<tr>
1411 <td>string_id_item</td>
1412 <td>TYPE_STRING_ID_ITEM</td>
1413 <td>0x0001</td>
1414 <td>0x04</td>
1415</tr>
1416<tr>
1417 <td>type_id_item</td>
1418 <td>TYPE_TYPE_ID_ITEM</td>
1419 <td>0x0002</td>
1420 <td>0x04</td>
1421</tr>
1422<tr>
1423 <td>proto_id_item</td>
1424 <td>TYPE_PROTO_ID_ITEM</td>
1425 <td>0x0003</td>
1426 <td>0x0c</td>
1427</tr>
1428<tr>
1429 <td>field_id_item</td>
1430 <td>TYPE_FIELD_ID_ITEM</td>
1431 <td>0x0004</td>
1432 <td>0x08</td>
1433</tr>
1434<tr>
1435 <td>method_id_item</td>
1436 <td>TYPE_METHOD_ID_ITEM</td>
1437 <td>0x0005</td>
1438 <td>0x08</td>
1439</tr>
1440<tr>
1441 <td>class_def_item</td>
1442 <td>TYPE_CLASS_DEF_ITEM</td>
1443 <td>0x0006</td>
1444 <td>0x20</td>
1445</tr>
1446<tr>
1447 <td>map_list</td>
1448 <td>TYPE_MAP_LIST</td>
1449 <td>0x1000</td>
1450 <td>4 + (item.size * 12)</td>
1451</tr>
1452<tr>
1453 <td>type_list</td>
1454 <td>TYPE_TYPE_LIST</td>
1455 <td>0x1001</td>
1456 <td>4 + (item.size * 2)</td>
1457</tr>
1458<tr>
1459 <td>annotation_set_ref_list</td>
1460 <td>TYPE_ANNOTATION_SET_REF_LIST</td>
1461 <td>0x1002</td>
1462 <td>4 + (item.size * 4)</td>
1463</tr>
1464<tr>
1465 <td>annotation_set_item</td>
1466 <td>TYPE_ANNOTATION_SET_ITEM</td>
1467 <td>0x1003</td>
1468 <td>4 + (item.size * 4)</td>
1469</tr>
1470<tr>
1471 <td>class_data_item</td>
1472 <td>TYPE_CLASS_DATA_ITEM</td>
1473 <td>0x2000</td>
1474 <td><i>implicit; must parse</i></td>
1475</tr>
1476<tr>
1477 <td>code_item</td>
1478 <td>TYPE_CODE_ITEM</td>
1479 <td>0x2001</td>
1480 <td><i>implicit; must parse</i></td>
1481</tr>
1482<tr>
1483 <td>string_data_item</td>
1484 <td>TYPE_STRING_DATA_ITEM</td>
1485 <td>0x2002</td>
1486 <td><i>implicit; must parse</i></td>
1487</tr>
1488<tr>
1489 <td>debug_info_item</td>
1490 <td>TYPE_DEBUG_INFO_ITEM</td>
1491 <td>0x2003</td>
1492 <td><i>implicit; must parse</i></td>
1493</tr>
1494<tr>
1495 <td>annotation_item</td>
1496 <td>TYPE_ANNOTATION_ITEM</td>
1497 <td>0x2004</td>
1498 <td><i>implicit; must parse</i></td>
1499</tr>
1500<tr>
1501 <td>encoded_array_item</td>
1502 <td>TYPE_ENCODED_ARRAY_ITEM</td>
1503 <td>0x2005</td>
1504 <td><i>implicit; must parse</i></td>
1505</tr>
1506<tr>
1507 <td>annotations_directory_item</td>
1508 <td>TYPE_ANNOTATIONS_DIRECTORY_ITEM</td>
1509 <td>0x2006</td>
1510 <td><i>implicit; must parse</i></td>
1511</tr>
1512</tbody>
1513</table>
1514
1515
<h3 id="string-item">string_id_item</h3>
<h4>appears in the string_ids section</h4>
<h4>alignment: 4 bytes</h4>
1519
1520<table class="format">
1521<thead>
1522<tr>
1523 <th>Name</th>
1524 <th>Format</th>
1525 <th>Description</th>
1526</tr>
1527</thead>
1528<tbody>
1529<tr>
1530 <td>string_data_off</td>
1531 <td>uint</td>
1532 <td>offset from the start of the file to the string data for this
1533 item. The offset should be to a location
1534 in the <code>data</code> section, and the data should be in the
1535 format specified by "<code>string_data_item</code>" below.
1536 There is no alignment requirement for the offset.
1537 </td>
1538</tr>
1539</tbody>
1540</table>
1541
<h3 id="string-data-item">string_data_item</h3>
<h4>appears in the data section</h4>
<h4>alignment: none (byte-aligned)</h4>
1545
1546<table class="format">
1547<thead>
1548<tr>
1549 <th>Name</th>
1550 <th>Format</th>
1551 <th>Description</th>
1552</tr>
1553</thead>
1554<tbody>
1555<tr>
1556 <td>utf16_size</td>
1557 <td>uleb128</td>
1558 <td>size of this string, in UTF-16 code units (which is the "string
1559 length" in many systems). That is, this is the decoded length of
1560 the string. (The encoded length is implied by the position of
1561 the <code>0</code> byte.)</td>
1562</tr>
1563<tr>
1564 <td>data</td>
1565 <td>ubyte[]</td>
1566 <td>a series of MUTF-8 code units (a.k.a. octets, a.k.a. bytes)
1567 followed by a byte of value <code>0</code>. See
1568 "MUTF-8 (Modified UTF-8) Encoding" above for details and
1569 discussion about the data format.
1570 <p><b>Note:</b> It is acceptable to have a string which includes
1571 (the encoded form of) UTF-16 surrogate code units (that is,
1572 <code>U+d800</code> &hellip; <code>U+dfff</code>)
1573 either in isolation or out-of-order with respect to the usual
1574 encoding of Unicode into UTF-16. It is up to higher-level uses of
1575 strings to reject such invalid encodings, if appropriate.</p>
1576 </td>
1577</tr>
1578</tbody>
1579</table>
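
<p>The sketch below reads one <code>string_data_item</code>: it decodes the
<code>uleb128</code> length, converts the MUTF-8 bytes into UTF-16 code
units up to the terminating <code>0</code> byte, and checks that the count
matches <code>utf16_size</code>. The function names are illustrative. As
the note above permits, isolated surrogates are passed through rather than
rejected; this is done here with Python's <code>surrogatepass</code> error
handler, which is a choice of this example and not something the format
prescribes.</p>

```python
import struct


def read_uleb128(data, pos):
    """Decode one uleb128 value; return (value, position after it)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def read_string_data_item(data, pos):
    """Decode the string_data_item that starts at pos."""
    utf16_size, pos = read_uleb128(data, pos)
    units = []
    while data[pos] != 0:
        b0 = data[pos]
        if b0 < 0x80:                    # one-byte form
            units.append(b0)
            pos += 1
        elif (b0 & 0xE0) == 0xC0:        # two-byte form (includes U+0000)
            units.append(((b0 & 0x1F) << 6) | (data[pos + 1] & 0x3F))
            pos += 2
        elif (b0 & 0xF0) == 0xE0:        # three-byte form
            units.append(((b0 & 0x0F) << 12)
                         | ((data[pos + 1] & 0x3F) << 6)
                         | (data[pos + 2] & 0x3F))
            pos += 3
        else:
            raise ValueError("invalid MUTF-8 lead byte 0x%02x" % b0)
    if len(units) != utf16_size:
        raise ValueError("utf16_size does not match decoded length")
    raw = struct.pack("<%dH" % len(units), *units)
    # Surrogate pairs combine into one code point; lone ones are kept.
    return raw.decode("utf-16-le", errors="surrogatepass")
```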
1580
<h3 id="type-id-item">type_id_item</h3>
<h4>appears in the type_ids section</h4>
<h4>alignment: 4 bytes</h4>
1584
1585<table class="format">
1586<thead>
1587<tr>
1588 <th>Name</th>
1589 <th>Format</th>
1590 <th>Description</th>
1591</tr>
1592</thead>
1593<tbody>
1594<tr>
1595 <td>descriptor_idx</td>
1596 <td>uint</td>
1597 <td>index into the <code>string_ids</code> list for the descriptor
1598 string of this type. The string must conform to the syntax for
1599 <i>TypeDescriptor</i>, defined above.
1600 </td>
1601</tr>
1602</tbody>
1603</table>
1604
<h3 id="proto-id-item">proto_id_item</h3>
<h4>appears in the proto_ids section</h4>
<h4>alignment: 4 bytes</h4>
1608
1609<table class="format">
1610<thead>
1611<tr>
1612 <th>Name</th>
1613 <th>Format</th>
1614 <th>Description</th>
1615</tr>
1616</thead>
1617<tbody>
1618<tr>
1619 <td>shorty_idx</td>
1620 <td>uint</td>
1621 <td>index into the <code>string_ids</code> list for the short-form
1622 descriptor string of this prototype. The string must conform to the
1623 syntax for <i>ShortyDescriptor</i>, defined above, and must correspond
1624 to the return type and parameters of this item.
1625 </td>
1626</tr>
1627<tr>
1628 <td>return_type_idx</td>
1629 <td>uint</td>
1630 <td>index into the <code>type_ids</code> list for the return type
1631 of this prototype
1632 </td>
1633</tr>
1634<tr>
1635 <td>parameters_off</td>
1636 <td>uint</td>
1637 <td>offset from the start of the file to the list of parameter types
1638 for this prototype, or <code>0</code> if this prototype has no
1639 parameters. This offset, if non-zero, should be in the
1640 <code>data</code> section, and the data there should be in the
 format specified by "<code>type_list</code>" below. Additionally, there
1642 should be no reference to the type <code>void</code> in the list.
1643 </td>
1644</tr>
1645</tbody>
1646</table>
1647
<h3 id="field-id-item">field_id_item</h3>
<h4>appears in the field_ids section</h4>
<h4>alignment: 4 bytes</h4>
1651
1652<table class="format">
1653<thead>
1654<tr>
1655 <th>Name</th>
1656 <th>Format</th>
1657 <th>Description</th>
1658</tr>
1659</thead>
1660<tbody>
1661<tr>
1662 <td>class_idx</td>
1663 <td>ushort</td>
1664 <td>index into the <code>type_ids</code> list for the definer of this
1665 field. This must be a class type, and not an array or primitive type.
1666 </td>
1667</tr>
1668<tr>
1669 <td>type_idx</td>
1670 <td>ushort</td>
1671 <td>index into the <code>type_ids</code> list for the type of
1672 this field
1673 </td>
1674</tr>
1675<tr>
1676 <td>name_idx</td>
1677 <td>uint</td>
1678 <td>index into the <code>string_ids</code> list for the name of this
1679 field. The string must conform to the syntax for <i>MemberName</i>,
1680 defined above.
1681 </td>
1682</tr>
1683</tbody>
1684</table>
1685
<h3 id="method-id-item">method_id_item</h3>
<h4>appears in the method_ids section</h4>
<h4>alignment: 4 bytes</h4>
1689
1690<table class="format">
1691<thead>
1692<tr>
1693 <th>Name</th>
1694 <th>Format</th>
1695 <th>Description</th>
1696</tr>
1697</thead>
1698<tbody>
1699<tr>
1700 <td>class_idx</td>
1701 <td>ushort</td>
1702 <td>index into the <code>type_ids</code> list for the definer of this
1703 method. This must be a class or array type, and not a primitive type.
1704 </td>
1705</tr>
1706<tr>
1707 <td>proto_idx</td>
1708 <td>ushort</td>
1709 <td>index into the <code>proto_ids</code> list for the prototype of
1710 this method
1711 </td>
1712</tr>
1713<tr>
1714 <td>name_idx</td>
1715 <td>uint</td>
1716 <td>index into the <code>string_ids</code> list for the name of this
1717 method. The string must conform to the syntax for <i>MemberName</i>,
1718 defined above.
1719 </td>
1720</tr>
1721</tbody>
1722</table>
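
<p>Both <code>field_id_item</code> and <code>method_id_item</code> consist
of two <code>ushort</code> values followed by one <code>uint</code>, 8
bytes in total, so the entry at a given index can be located by simple
arithmetic. A minimal sketch, with illustrative function names and a
little-endian file assumed:</p>

```python
import struct

ID_ITEM_SIZE = 8  # ushort + ushort + uint


def read_field_id(dex, field_ids_off, index):
    """Return (class_idx, type_idx, name_idx) for one field_id_item."""
    return struct.unpack_from(
        "<HHI", dex, field_ids_off + ID_ITEM_SIZE * index)


def read_method_id(dex, method_ids_off, index):
    """Return (class_idx, proto_idx, name_idx) for one method_id_item."""
    return struct.unpack_from(
        "<HHI", dex, method_ids_off + ID_ITEM_SIZE * index)
```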
1723
<h3 id="class-def-item">class_def_item</h3>
<h4>appears in the class_defs section</h4>
<h4>alignment: 4 bytes</h4>
1727
1728<table class="format">
1729<thead>
1730<tr>
1731 <th>Name</th>
1732 <th>Format</th>
1733 <th>Description</th>
1734</tr>
1735</thead>
1736<tbody>
1737<tr>
1738 <td>class_idx</td>
1739 <td>uint</td>
1740 <td>index into the <code>type_ids</code> list for this class.
1741 This must be a class type, and not an array or primitive type.
1742 </td>
1743</tr>
1744<tr>
1745 <td>access_flags</td>
1746 <td>uint</td>
1747 <td>access flags for the class (<code>public</code>, <code>final</code>,
1748 etc.). See "<code>access_flags</code> Definitions" for details.
1749 </td>
1750</tr>
1751<tr>
1752 <td>superclass_idx</td>
1753 <td>uint</td>
1754 <td>index into the <code>type_ids</code> list for the superclass, or
1755 the constant value <code>NO_INDEX</code> if this class has no
1756 superclass (i.e., it is a root class such as <code>Object</code>).
1757 If present, this must be a class type, and not an array or primitive type.
1758 </td>
1759</tr>
1760<tr>
1761 <td>interfaces_off</td>
1762 <td>uint</td>
1763 <td>offset from the start of the file to the list of interfaces, or
1764 <code>0</code> if there are none. This offset
1765 should be in the <code>data</code> section, and the data
1766 there should be in the format specified by
1767 "<code>type_list</code>" below. Each of the elements of the list
1768 must be a class type (not an array or primitive type), and there
1769 must not be any duplicates.
1770 </td>
1771</tr>
1772<tr>
1773 <td>source_file_idx</td>
1774 <td>uint</td>
1775 <td>index into the <code>string_ids</code> list for the name of the
1776 file containing the original source for (at least most of) this class,
1777 or the special value <code>NO_INDEX</code> to represent a lack of
1778 this information. The <code>debug_info_item</code> of any given method
1779 may override this source file, but the expectation is that most classes
1780 will only come from one source file.
1781 </td>
1782</tr>
1783<tr>
1784 <td>annotations_off</td>
1785 <td>uint</td>
1786 <td>offset from the start of the file to the annotations structure
1787 for this class, or <code>0</code> if there are no annotations on
1788 this class. This offset, if non-zero, should be in the
1789 <code>data</code> section, and the data there should be in
1790 the format specified by "<code>annotations_directory_item</code>" below,
1791 with all items referring to this class as the definer.
1792 </td>
1793</tr>
1794<tr>
1795 <td>class_data_off</td>
1796 <td>uint</td>
1797 <td>offset from the start of the file to the associated
1798 class data for this item, or <code>0</code> if there is no class
1799 data for this class. (This may be the case, for example, if this class
1800 is a marker interface.) The offset, if non-zero, should be in the
1801 <code>data</code> section, and the data there should be in the
1802 format specified by "<code>class_data_item</code>" below, with all
1803 items referring to this class as the definer.
1804 </td>
1805</tr>
1806<tr>
1807 <td>static_values_off</td>
1808 <td>uint</td>
1809 <td>offset from the start of the file to the list of initial
1810 values for <code>static</code> fields, or <code>0</code> if there
1811 are none (and all <code>static</code> fields are to be initialized with
1812 <code>0</code> or <code>null</code>). This offset should be in the
1813 <code>data</code> section, and the data there should be in the
1814 format specified by "<code>encoded_array_item</code>" below. The size
1815 of the array must be no larger than the number of <code>static</code>
1816 fields declared by this class, and the elements correspond to the
1817 <code>static</code> fields in the same order as declared in the
1818 corresponding <code>field_list</code>. The type of each array
1819 element must match the declared type of its corresponding field.
1820 If there are fewer elements in the array than there are
1821 <code>static</code> fields, then the leftover fields are initialized
1822 with a type-appropriate <code>0</code> or <code>null</code>.
1823 </td>
1824</tr>
1825</tbody>
1826</table>
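
<p>A <code>class_def_item</code> is eight consecutive <code>uint</code>
fields (0x20 bytes). The sketch below unpacks one entry and shows how the
sentinel <code>NO_INDEX</code> distinguishes a missing superclass from a
valid index. It assumes a little-endian file and takes
<code>NO_INDEX</code> to be <code>0xffffffff</code>, as defined earlier in
this document; the type and function names are illustrative.</p>

```python
import struct
from collections import namedtuple

NO_INDEX = 0xFFFFFFFF
CLASS_DEF_SIZE = 0x20  # eight uint fields

ClassDefItem = namedtuple("ClassDefItem", [
    "class_idx", "access_flags", "superclass_idx", "interfaces_off",
    "source_file_idx", "annotations_off", "class_data_off",
    "static_values_off",
])


def read_class_def(dex, class_defs_off, index):
    """Unpack the class_def_item at the given index."""
    return ClassDefItem._make(struct.unpack_from(
        "<8I", dex, class_defs_off + CLASS_DEF_SIZE * index))


def has_superclass(item):
    """A root class such as Object stores NO_INDEX in superclass_idx."""
    return item.superclass_idx != NO_INDEX
```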
1827
<h3 id="class-data-item">class_data_item</h3>
<h4>referenced from class_def_item</h4>
<h4>appears in the data section</h4>
<h4>alignment: none (byte-aligned)</h4>
1832
1833<table class="format">
1834<thead>
1835<tr>
1836 <th>Name</th>
1837 <th>Format</th>
1838 <th>Description</th>
1839</tr>
1840</thead>
1841<tbody>
1842<tr>
1843 <td>static_fields_size</td>
1844 <td>uleb128</td>
1845 <td>the number of static fields defined in this item</td>
1846</tr>
1847<tr>
1848 <td>instance_fields_size</td>
1849 <td>uleb128</td>
1850 <td>the number of instance fields defined in this item</td>
1851</tr>
1852<tr>
1853 <td>direct_methods_size</td>
1854 <td>uleb128</td>
1855 <td>the number of direct methods defined in this item</td>
1856</tr>
1857<tr>
1858 <td>virtual_methods_size</td>
1859 <td>uleb128</td>
1860 <td>the number of virtual methods defined in this item</td>
1861</tr>
1862<tr>
1863 <td>static_fields</td>
1864 <td>encoded_field[static_fields_size]</td>
1865 <td>the defined static fields, represented as a sequence of
1866 encoded elements. The fields must be sorted by
1867 <code>field_idx</code> in increasing order.
1868 </td>
1869</tr>
1870<tr>
1871 <td>instance_fields</td>
1872 <td>encoded_field[instance_fields_size]</td>
1873 <td>the defined instance fields, represented as a sequence of
1874 encoded elements. The fields must be sorted by
1875 <code>field_idx</code> in increasing order.
1876 </td>
1877</tr>
1878<tr>
1879 <td>direct_methods</td>
1880 <td>encoded_method[direct_methods_size]</td>
1881 <td>the defined direct (any of <code>static</code>, <code>private</code>,
1882 or constructor) methods, represented as a sequence of
1883 encoded elements. The methods must be sorted by
1884 <code>method_idx</code> in increasing order.
1885 </td>
1886</tr>
1887<tr>
1888 <td>virtual_methods</td>
1889 <td>encoded_method[virtual_methods_size]</td>
1890 <td>the defined virtual (none of <code>static</code>, <code>private</code>,
1891 or constructor) methods, represented as a sequence of
1892 encoded elements. This list should <i>not</i> include inherited
1893 methods unless overridden by the class that this item represents. The
1894 methods must be sorted by <code>method_idx</code> in increasing order.
1895 </td>
1896</tr>
1897</tbody>
1898</table>
1899
1900<p><b>Note:</b> All elements' <code>field_id</code>s and
1901<code>method_id</code>s must refer to the same defining class.</p>
1902
<h3 id="encoded-field-format">encoded_field format</h3>

1905<table class="format">
1906<thead>
1907<tr>
1908 <th>Name</th>
1909 <th>Format</th>
1910 <th>Description</th>
1911</tr>
1912</thead>
1913<tbody>
1914<tr>
1915 <td>field_idx_diff</td>
1916 <td>uleb128</td>
 <td>index into the <code>field_ids</code> list for the identity of this
 field (includes the name and descriptor), represented as a difference
 from the index of the previous element in the list. The index of the
 first element in a list is represented directly.
1921 </td>
1922</tr>
1923<tr>
1924 <td>access_flags</td>
1925 <td>uleb128</td>
1926 <td>access flags for the field (<code>public</code>, <code>final</code>,
1927 etc.). See "<code>access_flags</code> Definitions" for details.
1928 </td>
1929</tr>
1930</tbody>
1931</table>
1932
<h3 id="encoded-method">encoded_method format</h3>

1935<table class="format">
1936<thead>
1937<tr>
1938 <th>Name</th>
1939 <th>Format</th>
1940 <th>Description</th>
1941</tr>
1942</thead>
1943<tbody>
1944<tr>
1945 <td>method_idx_diff</td>
1946 <td>uleb128</td>
 <td>index into the <code>method_ids</code> list for the identity of this
 method (includes the name and descriptor), represented as a difference
 from the index of the previous element in the list. The index of the
 first element in a list is represented directly.
1951 </td>
1952</tr>
1953<tr>
1954 <td>access_flags</td>
1955 <td>uleb128</td>
1956 <td>access flags for the method (<code>public</code>, <code>final</code>,
1957 etc.). See "<code>access_flags</code> Definitions" for details.
1958 </td>
1959</tr>
1960<tr>
1961 <td>code_off</td>
1962 <td>uleb128</td>
1963 <td>offset from the start of the file to the code structure for this
1964 method, or <code>0</code> if this method is either <code>abstract</code>
1965 or <code>native</code>. The offset should be to a location in the
1966 <code>data</code> section. The format of the data is specified by
1967 "<code>code_item</code>" below.
1968 </td>
1969</tr>
1970</tbody>
1971</table>
1972
<h3 id="type-list">type_list</h3>
<h4>referenced from class_def_item and proto_id_item</h4>
<h4>appears in the data section</h4>
<h4>alignment: 4 bytes</h4>
1977
1978<table class="format">
1979<thead>
1980<tr>
1981 <th>Name</th>
1982 <th>Format</th>
1983 <th>Description</th>
1984</tr>
1985</thead>
1986<tbody>
1987<tr>
1988 <td>size</td>
1989 <td>uint</td>
1990 <td>size of the list, in entries</td>
1991</tr>
1992<tr>
1993 <td>list</td>
1994 <td>type_item[size]</td>
1995 <td>elements of the list</td>
1996</tr>
1997</tbody>
1998</table>
1999
<h3 id="type-item-format">type_item format</h3>

2002<table class="format">
2003<thead>
2004<tr>
2005 <th>Name</th>
2006 <th>Format</th>
2007 <th>Description</th>
2008</tr>
2009</thead>
2010<tbody>
2011<tr>
2012 <td>type_idx</td>
2013 <td>ushort</td>
2014 <td>index into the <code>type_ids</code> list</td>
2015</tr>
2016</tbody>
2017</table>
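<p>As a rough sketch, a <code>type_list</code> can be read with Python's
<code>struct</code> module: a little-endian <code>uint</code> count followed by
that many <code>ushort</code> indices. The function name
<code>read_type_list</code> and the sample bytes are assumptions made for
illustration.</p>

```python
import struct


def read_type_list(buf, offset):
    """Return the type_idx values of the type_list at `offset`."""
    if offset % 4 != 0:
        raise ValueError("type_list must be 4-byte aligned")
    (size,) = struct.unpack_from("<I", buf, offset)
    # Each type_item is a single little-endian ushort.
    return list(struct.unpack_from("<%dH" % size, buf, offset + 4))


# Hypothetical list with three entries.
sample = struct.pack("<IHHH", 3, 7, 2, 9)
print(read_type_list(sample, 0))  # [7, 2, 9]
```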
2018
<h3 id="code-item">code_item</h3>
<h4>referenced from encoded_method</h4>
<h4>appears in the data section</h4>
<h4>alignment: 4 bytes</h4>
2023
2024<table class="format">
2025<thead>
2026<tr>
2027 <th>Name</th>
2028 <th>Format</th>
2029 <th>Description</th>
2030</tr>
2031</thead>
2032<tbody>
2033<tr>
2034 <td>registers_size</td>
2035 <td>ushort</td>
2036 <td>the number of registers used by this code</td>
2037</tr>
2038<tr>
2039 <td>ins_size</td>
2040 <td>ushort</td>
2041 <td>the number of words of incoming arguments to the method that this
2042 code is for</td>
2043</tr>
2044<tr>
2045 <td>outs_size</td>
2046 <td>ushort</td>
2047 <td>the number of words of outgoing argument space required by this
2048 code for method invocation
2049 </td>
2050</tr>
2051<tr>
2052 <td>tries_size</td>
2053 <td>ushort</td>
2054 <td>the number of <code>try_item</code>s for this instance. If non-zero,
2055 then these appear as the <code>tries</code> array just after the
2056 <code>insns</code> in this instance.
2057 </td>
2058</tr>
2059<tr>
2060 <td>debug_info_off</td>
2061 <td>uint</td>
2062 <td>offset from the start of the file to the debug info (line numbers +
2063 local variable info) sequence for this code, or <code>0</code> if
2064 there simply is no information. The offset, if non-zero, should be
2065 to a location in the <code>data</code> section. The format of
2066 the data is specified by "<code>debug_info_item</code>" below.
2067 </td>
2068</tr>
2069<tr>
2070 <td>insns_size</td>
2071 <td>uint</td>
2072 <td>size of the instructions list, in 16-bit code units</td>
2073</tr>
2074<tr>
2075 <td>insns</td>
2076 <td>ushort[insns_size]</td>
2077 <td>actual array of bytecode. The format of code in an <code>insns</code>
2078 array is specified by the companion document
    <a href="dalvik-bytecode.html">Dalvik bytecode</a>. Note
    that though this is defined as an array of <code>ushort</code>, there
2081 are some internal structures that prefer four-byte alignment. Also,
2082 if this happens to be in an endian-swapped file, then the swapping is
2083 <i>only</i> done on individual <code>ushort</code>s and not on the
2084 larger internal structures.
2085 </td>
2086</tr>
2087<tr>
2088 <td>padding</td>
2089 <td>ushort <i>(optional)</i> = 0</td>
2090 <td>two bytes of padding to make <code>tries</code> four-byte aligned.
2091 This element is only present if <code>tries_size</code> is non-zero
2092 and <code>insns_size</code> is odd.
2093 </td>
2094</tr>
2095<tr>
2096 <td>tries</td>
2097 <td>try_item[tries_size] <i>(optional)</i></td>
  <td>array indicating where in the code exceptions are caught and
    how to handle them. Elements of the array must be non-overlapping in
2100 range and in order from low to high address. This element is only
2101 present if <code>tries_size</code> is non-zero.
2102 </td>
2103</tr>
2104<tr>
2105 <td>handlers</td>
2106 <td>encoded_catch_handler_list <i>(optional)</i></td>
2107 <td>bytes representing a list of lists of catch types and associated
2108 handler addresses. Each <code>try_item</code> has a byte-wise offset
2109 into this structure. This element is only present if
2110 <code>tries_size</code> is non-zero.
2111 </td>
2112</tr>
2113</tbody>
2114</table>
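<p>The following sketch shows one way to locate the variable-position parts
of a <code>code_item</code>, including the optional padding rule. It assumes
each <code>try_item</code> occupies 8 bytes (a <code>uint</code> and two
<code>ushort</code>s, per the <code>try_item</code> format); the names
<code>CodeItemHeader</code> and <code>layout_code_item</code> are
illustrative.</p>

```python
import struct
from collections import namedtuple

CodeItemHeader = namedtuple(
    "CodeItemHeader",
    "registers_size ins_size outs_size tries_size debug_info_off insns_size",
)
HEADER = struct.Struct("<HHHHII")  # four ushorts, then two uints
TRY_ITEM_SIZE = 8                  # uint + ushort + ushort


def layout_code_item(buf, offset):
    """Return (header, insns_off, tries_off, handlers_off) for a code_item."""
    header = CodeItemHeader(*HEADER.unpack_from(buf, offset))
    insns_off = offset + HEADER.size
    end_of_insns = insns_off + 2 * header.insns_size
    tries_off = handlers_off = None
    if header.tries_size != 0:
        # Padding is present only when insns_size is odd.
        padding = 2 if header.insns_size % 2 == 1 else 0
        tries_off = end_of_insns + padding
        handlers_off = tries_off + TRY_ITEM_SIZE * header.tries_size
    return header, insns_off, tries_off, handlers_off


# Hypothetical item: 3 code units (odd), one try_item.
sample = HEADER.pack(2, 1, 0, 1, 0, 3) + bytes(6 + 2 + 8)
header, insns_off, tries_off, handlers_off = layout_code_item(sample, 0)
print(insns_off, tries_off, handlers_off)  # 16 24 32
```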
2115
<h3 id="type-item">try_item format</h3>

2118<table class="format">
2119<thead>
2120<tr>
2121 <th>Name</th>
2122 <th>Format</th>
2123 <th>Description</th>
2124</tr>
2125</thead>
2126<tbody>
2127<tr>
2128 <td>start_addr</td>
2129 <td>uint</td>
2130 <td>start address of the block of code covered by this entry. The address
2131 is a count of 16-bit code units to the start of the first covered
2132 instruction.
2133 </td>
2134</tr>
2135<tr>
2136 <td>insn_count</td>
2137 <td>ushort</td>
2138 <td>number of 16-bit code units covered by this entry. The last code
2139 unit covered (inclusive) is <code>start_addr + insn_count - 1</code>.
2140 </td>
2141</tr>
2142<tr>
2143 <td>handler_off</td>
2144 <td>ushort</td>
  <td>offset in bytes from the start of the associated
    <code>encoded_catch_handler_list</code> to the
    <code>encoded_catch_handler</code> for this entry. This must be an
    offset to the start of an <code>encoded_catch_handler</code>.
  </td>
2150</tr>
2151</tbody>
2152</table>
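<p>A minimal sketch of reading the <code>tries</code> array and checking the
ordering rule stated for it in <code>code_item</code> (non-overlapping, low to
high address). The function names and sample values are assumptions for
illustration.</p>

```python
import struct

TRY_ITEM = struct.Struct("<IHH")  # start_addr, insn_count, handler_off


def read_try_items(buf, offset, tries_size):
    """Return a list of (start_addr, insn_count, handler_off) tuples."""
    return [
        TRY_ITEM.unpack_from(buf, offset + i * TRY_ITEM.size)
        for i in range(tries_size)
    ]


def tries_are_ordered(tries):
    """True if the ranges are sorted by address and do not overlap."""
    next_free = 0
    for start_addr, insn_count, _handler_off in tries:
        if start_addr < next_free:
            return False
        # Last covered unit is start_addr + insn_count - 1.
        next_free = start_addr + insn_count
    return True


sample = TRY_ITEM.pack(0, 4, 1) + TRY_ITEM.pack(4, 2, 7)
tries = read_try_items(sample, 0, 2)
print(tries, tries_are_ordered(tries))
```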
2153
<h3 id="encoded-catch-handlerlist">encoded_catch_handler_list format</h3>

2156<table class="format">
2157<thead>
2158<tr>
2159 <th>Name</th>
2160 <th>Format</th>
2161 <th>Description</th>
2162</tr>
2163</thead>
2164<tbody>
2165<tr>
2166 <td>size</td>
2167 <td>uleb128</td>
2168 <td>size of this list, in entries</td>
2169</tr>
2170<tr>
2171 <td>list</td>
  <td>encoded_catch_handler[size]</td>
2173 <td>actual list of handler lists, represented directly (not as offsets),
2174 and concatenated sequentially</td>
2175</tr>
2176</tbody>
2177</table>
2178
<h3 id="encoded-catch-handler">encoded_catch_handler format</h3>

2181<table class="format">
2182<thead>
2183<tr>
2184 <th>Name</th>
2185 <th>Format</th>
2186 <th>Description</th>
2187</tr>
2188</thead>
2189<tbody>
2190<tr>
2191 <td>size</td>
2192 <td>sleb128</td>
2193 <td>number of catch types in this list. If non-positive, then this is
2194 the negative of the number of catch types, and the catches are followed
2195 by a catch-all handler. For example: A <code>size</code> of <code>0</code>
2196 means that there is a catch-all but no explicitly typed catches.
2197 A <code>size</code> of <code>2</code> means that there are two explicitly
2198 typed catches and no catch-all. And a <code>size</code> of <code>-1</code>
2199 means that there is one typed catch along with a catch-all.
2200 </td>
2201</tr>
2202<tr>
2203 <td>handlers</td>
2204 <td>encoded_type_addr_pair[abs(size)]</td>
2205 <td>stream of <code>abs(size)</code> encoded items, one for each caught
2206 type, in the order that the types should be tested.
2207 </td>
2208</tr>
2209<tr>
2210 <td>catch_all_addr</td>
2211 <td>uleb128 <i>(optional)</i></td>
2212 <td>bytecode address of the catch-all handler. This element is only
2213 present if <code>size</code> is non-positive.
2214 </td>
2215</tr>
2216</tbody>
2217</table>
2218
<h3 id="encoded-type-addr-pair">encoded_type_addr_pair format</h3>

2221<table class="format">
2222<thead>
2223<tr>
2224 <th>Name</th>
2225 <th>Format</th>
2226 <th>Description</th>
2227</tr>
2228</thead>
2229<tbody>
2230<tr>
2231 <td>type_idx</td>
2232 <td>uleb128</td>
2233 <td>index into the <code>type_ids</code> list for the type of the
2234 exception to catch
2235 </td>
2236</tr>
2237<tr>
2238 <td>addr</td>
2239 <td>uleb128</td>
2240 <td>bytecode address of the associated exception handler</td>
2241</tr>
2242</tbody>
2243</table>
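<p>The sign convention of <code>size</code> in
<code>encoded_catch_handler</code> is easiest to see in code. The sketch
below decodes one handler, treating a non-positive <code>size</code> as
"<code>abs(size)</code> typed catches plus a catch-all". Helper names and
sample bytes are illustrative assumptions.</p>

```python
def read_uleb128(buf, pos):
    """Decode one unsigned LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def read_sleb128(buf, pos):
    """Decode one signed LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if (byte & 0x80) == 0:
            if byte & 0x40:  # sign bit of the final byte
                result -= 1 << shift
            return result, pos


def read_encoded_catch_handler(buf, pos):
    """Return (typed_handlers, catch_all_addr_or_None, next_pos)."""
    size, pos = read_sleb128(buf, pos)
    handlers = []
    for _ in range(abs(size)):
        type_idx, pos = read_uleb128(buf, pos)
        addr, pos = read_uleb128(buf, pos)
        handlers.append((type_idx, addr))
    catch_all_addr = None
    if size <= 0:
        catch_all_addr, pos = read_uleb128(buf, pos)
    return handlers, catch_all_addr, pos


# size = -1: one typed catch (type 3 at 0x10) followed by a catch-all at 0x20.
print(read_encoded_catch_handler(bytes([0x7F, 0x03, 0x10, 0x20]), 0))
# ([(3, 16)], 32, 4)
```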
2244
<h3 id="debug-info-item">debug_info_item</h3>
<h4>referenced from code_item</h4>
<h4>appears in the data section</h4>
<h4>alignment: none (byte-aligned)</h4>
2249
2250<p>Each <code>debug_info_item</code> defines a DWARF3-inspired byte-coded
2251state machine that, when interpreted, emits the positions
2252table and (potentially) the local variable information for a
2253<code>code_item</code>. The sequence begins with a variable-length
2254header (the length of which depends on the number of method
2255parameters), is followed by the state machine bytecodes, and ends
with a <code>DBG_END_SEQUENCE</code> byte.</p>
2257
<p>The state machine consists of five registers. The
<code>address</code> register represents the instruction offset in the
associated <code>insns</code> array in 16-bit code units. The
<code>address</code> register starts at <code>0</code> at the beginning of each
<code>debug_info</code> sequence and must only monotonically increase.
The <code>line</code> register represents what source line number
should be associated with the next positions table entry emitted by
the state machine. It is initialized in the sequence header, and may
change in positive or negative directions but must never be less than
<code>1</code>. The <code>source_file</code> register represents the
source file that the line number entries refer to. It is initialized to
the value of <code>source_file_idx</code> in <code>class_def_item</code>.
The other two registers, <code>prologue_end</code> and
<code>epilogue_begin</code>, are boolean flags (initialized to
<code>false</code>) that indicate whether the next position emitted
should be considered a method prologue or epilogue. The state machine
must also track the name and type of the last local variable live in
each register for the <code>DBG_RESTART_LOCAL</code> code.</p>
2276
2277<p>The header is as follows:</p>
2278
2279<table class="format">
2280<thead>
2281<tr>
2282 <th>Name</th>
2283 <th>Format</th>
2284 <th>Description</th>
2285</tr>
2286</thead>
2287<tbody>
2288<tr>
2289 <td>line_start</td>
2290 <td>uleb128</td>
2291 <td>the initial value for the state machine's <code>line</code> register.
2292 Does not represent an actual positions entry.
2293 </td>
2294</tr>
2295<tr>
2296 <td>parameters_size</td>
2297 <td>uleb128</td>
2298 <td>the number of parameter names that are encoded. There should be
2299 one per method parameter, excluding an instance method's <code>this</code>,
2300 if any.
2301 </td>
2302</tr>
2303<tr>
2304 <td>parameter_names</td>
2305 <td>uleb128p1[parameters_size]</td>
2306 <td>string index of the method parameter name. An encoded value of
2307 <code>NO_INDEX</code> indicates that no name
2308 is available for the associated parameter. The type descriptor
2309 and signature are implied from the method descriptor and signature.
2310 </td>
2311</tr>
2312</tbody>
2313</table>
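<p>A sketch of reading this header, assuming <code>uleb128p1</code> is decoded
as an unsigned LEB128 value minus one so that an encoded <code>0</code> yields
<code>NO_INDEX</code> (<code>-1</code>). The helper names and sample bytes are
illustrative only.</p>

```python
NO_INDEX = -1


def read_uleb128(buf, pos):
    """Decode one unsigned LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def read_uleb128p1(buf, pos):
    """Decode a uleb128p1 value (stored as value + 1)."""
    value, pos = read_uleb128(buf, pos)
    return value - 1, pos


def read_debug_info_header(buf, pos):
    """Return (line_start, parameter_name_indices, next_pos)."""
    line_start, pos = read_uleb128(buf, pos)
    parameters_size, pos = read_uleb128(buf, pos)
    names = []
    for _ in range(parameters_size):
        idx, pos = read_uleb128p1(buf, pos)
        names.append(None if idx == NO_INDEX else idx)  # None: no name
    return line_start, names, pos


# Hypothetical header: line 10, two parameters (string 5, then unnamed).
print(read_debug_info_header(bytes([0x0A, 0x02, 0x06, 0x00]), 0))
# (10, [5, None], 4)
```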
2314
2315<p>The byte code values are as follows:</p>
2316
2317<table class="debugByteCode">
2318<thead>
2319<tr>
2320 <th>Name</th>
2321 <th>Value</th>
2322 <th>Format</th>
2323 <th>Arguments</th>
2324 <th>Description</th>
2325</tr>
2326</thead>
2327<tbody>
2328<tr>
2329 <td>DBG_END_SEQUENCE</td>
2330 <td>0x00</td>
2331 <td></td>
2332 <td><i>(none)</i></td>
2333 <td>terminates a debug info sequence for a <code>code_item</code></td>
2334</tr>
2335<tr>
2336 <td>DBG_ADVANCE_PC</td>
2337 <td>0x01</td>
2338 <td>uleb128&nbsp;addr_diff</td>
2339 <td><code>addr_diff</code>: amount to add to address register</td>
2340 <td>advances the address register without emitting a positions entry</td>
2341</tr>
2342<tr>
2343 <td>DBG_ADVANCE_LINE</td>
2344 <td>0x02</td>
2345 <td>sleb128&nbsp;line_diff</td>
2346 <td><code>line_diff</code>: amount to change line register by</td>
2347 <td>advances the line register without emitting a positions entry</td>
2348</tr>
2349<tr>
2350 <td>DBG_START_LOCAL</td>
2351 <td>0x03</td>
2352 <td>uleb128&nbsp;register_num<br/>
2353 uleb128p1&nbsp;name_idx<br/>
2354 uleb128p1&nbsp;type_idx
2355 </td>
2356 <td><code>register_num</code>: register that will contain local<br/>
2357 <code>name_idx</code>: string index of the name<br/>
2358 <code>type_idx</code>: type index of the type
2359 </td>
2360 <td>introduces a local variable at the current address. Either
2361 <code>name_idx</code> or <code>type_idx</code> may be
2362 <code>NO_INDEX</code> to indicate that that value is unknown.
2363 </td>
2364</tr>
2365<tr>
2366 <td>DBG_START_LOCAL_EXTENDED</td>
2367 <td>0x04</td>
2368 <td>uleb128&nbsp;register_num<br/>
2369 uleb128p1&nbsp;name_idx<br/>
2370 uleb128p1&nbsp;type_idx<br/>
2371 uleb128p1&nbsp;sig_idx
2372 </td>
2373 <td><code>register_num</code>: register that will contain local<br/>
2374 <code>name_idx</code>: string index of the name<br/>
2375 <code>type_idx</code>: type index of the type<br/>
2376 <code>sig_idx</code>: string index of the type signature
2377 </td>
2378 <td>introduces a local with a type signature at the current address.
2379 Any of <code>name_idx</code>, <code>type_idx</code>, or
2380 <code>sig_idx</code> may be <code>NO_INDEX</code>
2381 to indicate that that value is unknown. (If <code>sig_idx</code> is
2382 <code>-1</code>, though, the same data could be represented more
2383 efficiently using the opcode <code>DBG_START_LOCAL</code>.)
2384 <p><b>Note:</b> See the discussion under
2385 "<code>dalvik.annotation.Signature</code>" below for caveats about
2386 handling signatures.</p>
2387 </td>
2388</tr>
2389<tr>
2390 <td>DBG_END_LOCAL</td>
2391 <td>0x05</td>
2392 <td>uleb128&nbsp;register_num</td>
2393 <td><code>register_num</code>: register that contained local</td>
2394 <td>marks a currently-live local variable as out of scope at the current
2395 address
2396 </td>
2397</tr>
2398<tr>
2399 <td>DBG_RESTART_LOCAL</td>
2400 <td>0x06</td>
2401 <td>uleb128&nbsp;register_num</td>
2402 <td><code>register_num</code>: register to restart</td>
2403 <td>re-introduces a local variable at the current address. The name
2404 and type are the same as the last local that was live in the specified
2405 register.
2406 </td>
2407</tr>
2408<tr>
2409 <td>DBG_SET_PROLOGUE_END</td>
2410 <td>0x07</td>
2411 <td></td>
2412 <td><i>(none)</i></td>
2413 <td>sets the <code>prologue_end</code> state machine register,
2414 indicating that the next position entry that is added should be
2415 considered the end of a method prologue (an appropriate place for
2416 a method breakpoint). The <code>prologue_end</code> register is
2417 cleared by any special (<code>&gt;= 0x0a</code>) opcode.
2418 </td>
2419</tr>
2420<tr>
2421 <td>DBG_SET_EPILOGUE_BEGIN</td>
2422 <td>0x08</td>
2423 <td></td>
2424 <td><i>(none)</i></td>
2425 <td>sets the <code>epilogue_begin</code> state machine register,
2426 indicating that the next position entry that is added should be
2427 considered the beginning of a method epilogue (an appropriate place
2428 to suspend execution before method exit).
2429 The <code>epilogue_begin</code> register is cleared by any special
2430 (<code>&gt;= 0x0a</code>) opcode.
2431 </td>
2432</tr>
2433<tr>
2434 <td>DBG_SET_FILE</td>
2435 <td>0x09</td>
2436 <td>uleb128p1&nbsp;name_idx</td>
2437 <td><code>name_idx</code>: string index of source file name;
2438 <code>NO_INDEX</code> if unknown
2439 </td>
  <td>indicates that all subsequent line number entries make reference to this
    source file name, instead of the default name given by
    <code>source_file_idx</code> in <code>class_def_item</code>
2443 </td>
2444</tr>
2445<tr>
2446 <td><i>Special Opcodes</i></td>
2447 <!-- When updating the range below, make sure to search for other
2448 instances of 0x0a in this section. -->
2449 <td>0x0a&hellip;0xff</td>
2450 <td></td>
2451 <td><i>(none)</i></td>
2452 <td>advances the <code>line</code> and <code>address</code> registers,
2453 emits a position entry, and clears <code>prologue_end</code> and
2454 <code>epilogue_begin</code>. See below for description.
2455 </td>
2456</tr>
2457</tbody>
2458</table>
2459
<h3 id="opcodes">Special opcodes</h3>

2462<p>Opcodes with values between <code>0x0a</code> and <code>0xff</code>
2463(inclusive) move both the <code>line</code> and <code>address</code>
2464registers by a small amount and then emit a new position table entry.
The formulas for the increments are as follows:</p>
2466
2467<pre>
2468DBG_FIRST_SPECIAL = 0x0a // the smallest special opcode
2469DBG_LINE_BASE = -4 // the smallest line number increment
2470DBG_LINE_RANGE = 15 // the number of line increments represented
2471
2472adjusted_opcode = opcode - DBG_FIRST_SPECIAL
2473
2474line += DBG_LINE_BASE + (adjusted_opcode % DBG_LINE_RANGE)
2475address += (adjusted_opcode / DBG_LINE_RANGE)
2476</pre>
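<p>To make the formulas concrete, here is a small sketch that runs the state
machine far enough to build the positions table. It skips over the arguments
of the local-variable and file opcodes rather than interpreting them, and it
does no validation; <code>decode_special</code>, <code>positions</code>, and
the sample byte stream are illustrative names and data, not part of the
format.</p>

```python
DBG_END_SEQUENCE = 0x00
DBG_ADVANCE_PC = 0x01
DBG_ADVANCE_LINE = 0x02
DBG_FIRST_SPECIAL = 0x0A
DBG_LINE_BASE = -4
DBG_LINE_RANGE = 15
# LEB128 argument counts for opcodes 0x03..0x09 (skipped, not interpreted).
ARG_COUNTS = {0x03: 3, 0x04: 4, 0x05: 1, 0x06: 1, 0x07: 0, 0x08: 0, 0x09: 1}


def read_leb128(buf, pos, signed=False):
    """Decode one LEB128 value; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if (byte & 0x80) == 0:
            if signed and (byte & 0x40):
                result -= 1 << shift
            return result, pos


def decode_special(opcode):
    """Return (line_diff, addr_diff) for a special opcode."""
    adjusted = opcode - DBG_FIRST_SPECIAL
    return (DBG_LINE_BASE + adjusted % DBG_LINE_RANGE,
            adjusted // DBG_LINE_RANGE)


def positions(buf, pos, line_start):
    """Return the emitted (address, line) entries of one sequence."""
    address, line, table = 0, line_start, []
    while True:
        opcode = buf[pos]
        pos += 1
        if opcode == DBG_END_SEQUENCE:
            return table
        if opcode == DBG_ADVANCE_PC:
            diff, pos = read_leb128(buf, pos)
            address += diff
        elif opcode == DBG_ADVANCE_LINE:
            diff, pos = read_leb128(buf, pos, signed=True)
            line += diff
        elif opcode >= DBG_FIRST_SPECIAL:
            line_diff, addr_diff = decode_special(opcode)
            line += line_diff
            address += addr_diff
            table.append((address, line))
        else:
            for _ in range(ARG_COUNTS[opcode]):
                _, pos = read_leb128(buf, pos)


stream = bytes([0x0E, 0x1E, 0x01, 0x05, 0x02, 0x7D, 0x0E, 0x00])
print(positions(stream, 0, 10))  # [(0, 10), (1, 11), (6, 8)]
```

<p>For example, opcode <code>0x1e</code> gives an adjusted value of
<code>20</code>, so the line advances by <code>-4 + (20 % 15) = 1</code> and
the address by <code>20 / 15 = 1</code>.</p>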
2477
<h3 id="annotations-directory">annotations_directory_item</h3>
<h4>referenced from class_def_item</h4>
<h4>appears in the data section</h4>
<h4>alignment: 4 bytes</h4>
2482
2483<table class="format">
2484<thead>
2485<tr>
2486 <th>Name</th>
2487 <th>Format</th>
2488 <th>Description</th>
2489</tr>
2490</thead>
2491<tbody>
2492<tr>
2493 <td>class_annotations_off</td>
2494 <td>uint</td>
2495 <td>offset from the start of the file to the annotations made directly
2496 on the class, or <code>0</code> if the class has no direct annotations.
2497 The offset, if non-zero, should be to a location in the
2498 <code>data</code> section. The format of the data is specified
2499 by "<code>annotation_set_item</code>" below.
2500 </td>
2501</tr>
2502<tr>
2503 <td>fields_size</td>
2504 <td>uint</td>
2505 <td>count of fields annotated by this item</td>
2506</tr>
2507<tr>
2508 <td>annotated_methods_size</td>
2509 <td>uint</td>
2510 <td>count of methods annotated by this item</td>
2511</tr>
2512<tr>
2513 <td>annotated_parameters_size</td>
2514 <td>uint</td>
2515 <td>count of method parameter lists annotated by this item</td>
2516</tr>
2517<tr>
2518 <td>field_annotations</td>
2519 <td>field_annotation[fields_size] <i>(optional)</i></td>
2520 <td>list of associated field annotations. The elements of the list must
2521 be sorted in increasing order, by <code>field_idx</code>.
2522 </td>
2523</tr>
2524<tr>
2525 <td>method_annotations</td>
  <td>method_annotation[annotated_methods_size] <i>(optional)</i></td>
2527 <td>list of associated method annotations. The elements of the list must
2528 be sorted in increasing order, by <code>method_idx</code>.
2529 </td>
2530</tr>
2531<tr>
2532 <td>parameter_annotations</td>
  <td>parameter_annotation[annotated_parameters_size] <i>(optional)</i></td>
2534 <td>list of associated method parameter annotations. The elements of the
2535 list must be sorted in increasing order, by <code>method_idx</code>.
2536 </td>
2537</tr>
2538</tbody>
2539</table>
2540
2541<p><b>Note:</b> All elements' <code>field_id</code>s and
2542<code>method_id</code>s must refer to the same defining class.</p>
2543
<h3 id="field-annotation">field_annotation format</h3>

2546<table class="format">
2547<thead>
2548<tr>
2549 <th>Name</th>
2550 <th>Format</th>
2551 <th>Description</th>
2552</tr>
2553</thead>
2554<tbody>
2555<tr>
2556 <td>field_idx</td>
2557 <td>uint</td>
2558 <td>index into the <code>field_ids</code> list for the identity of the
2559 field being annotated
2560 </td>
2561</tr>
2562<tr>
2563 <td>annotations_off</td>
2564 <td>uint</td>
2565 <td>offset from the start of the file to the list of annotations for
2566 the field. The offset should be to a location in the <code>data</code>
2567 section. The format of the data is specified by
2568 "<code>annotation_set_item</code>" below.
2569 </td>
2570</tr>
2571</tbody>
2572</table>
2573
<h3 id="method-annotation">method_annotation format</h3>

2576<table class="format">
2577<thead>
2578<tr>
2579 <th>Name</th>
2580 <th>Format</th>
2581 <th>Description</th>
2582</tr>
2583</thead>
2584<tbody>
2585<tr>
2586 <td>method_idx</td>
2587 <td>uint</td>
2588 <td>index into the <code>method_ids</code> list for the identity of the
2589 method being annotated
2590 </td>
2591</tr>
2592<tr>
2593 <td>annotations_off</td>
2594 <td>uint</td>
2595 <td>offset from the start of the file to the list of annotations for
2596 the method. The offset should be to a location in the
2597 <code>data</code> section. The format of the data is specified by
2598 "<code>annotation_set_item</code>" below.
2599 </td>
2600</tr>
2601</tbody>
2602</table>
2603
<h3 id="parameter-annotation">parameter_annotation format</h3>

2606<table class="format">
2607<thead>
2608<tr>
2609 <th>Name</th>
2610 <th>Format</th>
2611 <th>Description</th>
2612</tr>
2613</thead>
2614<tbody>
2615<tr>
2616 <td>method_idx</td>
2617 <td>uint</td>
2618 <td>index into the <code>method_ids</code> list for the identity of the
2619 method whose parameters are being annotated
2620 </td>
2621</tr>
2622<tr>
2623 <td>annotations_off</td>
2624 <td>uint</td>
2625 <td>offset from the start of the file to the list of annotations for
2626 the method parameters. The offset should be to a location in the
2627 <code>data</code> section. The format of the data is specified by
2628 "<code>annotation_set_ref_list</code>" below.
2629 </td>
2630</tr>
2631</tbody>
2632</table>
2633
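<p>Since <code>field_annotation</code>, <code>method_annotation</code>, and
<code>parameter_annotation</code> are all pairs of <code>uint</code>s, an
<code>annotations_directory_item</code> can be read with one small helper.
This is a sketch under that observation; the function names and sample values
are assumptions for illustration.</p>

```python
import struct

PAIR = struct.Struct("<II")  # (field_idx or method_idx, annotations_off)


def read_pairs(buf, pos, count):
    """Read `count` (index, annotations_off) pairs; return (pairs, next_pos)."""
    pairs = [PAIR.unpack_from(buf, pos + i * PAIR.size) for i in range(count)]
    return pairs, pos + count * PAIR.size


def read_annotations_directory(buf, offset):
    """Return (class_annotations_off, fields, methods, parameters)."""
    class_off, n_fields, n_methods, n_params = struct.unpack_from(
        "<4I", buf, offset)
    pos = offset + 16
    fields, pos = read_pairs(buf, pos, n_fields)
    methods, pos = read_pairs(buf, pos, n_methods)
    params, pos = read_pairs(buf, pos, n_params)
    return class_off, fields, methods, params


# Hypothetical directory: no class annotations, 1 field, 2 methods.
sample = (struct.pack("<4I", 0, 1, 2, 0)
          + PAIR.pack(3, 0x100) + PAIR.pack(5, 0x200) + PAIR.pack(9, 0x300))
print(read_annotations_directory(sample, 0))
# (0, [(3, 256)], [(5, 512), (9, 768)], [])
```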
<h3 id="set-ref-list">annotation_set_ref_list</h3>
<h4>referenced from parameter_annotations_item</h4>
<h4>appears in the data section</h4>
<h4>alignment: 4 bytes</h4>
2638
2639<table class="format">
2640<thead>
2641<tr>
2642 <th>Name</th>
2643 <th>Format</th>
2644 <th>Description</th>
2645</tr>
2646</thead>
2647<tbody>
2648<tr>
2649 <td>size</td>
2650 <td>uint</td>
2651 <td>size of the list, in entries</td>
2652</tr>
2653<tr>
2654 <td>list</td>
2655 <td>annotation_set_ref_item[size]</td>
2656 <td>elements of the list</td>
2657</tr>
2658</tbody>
2659</table>
2660
<h3 id="set-ref-item">annotation_set_ref_item format</h3>

2663<table class="format">
2664<thead>
2665<tr>
2666 <th>Name</th>
2667 <th>Format</th>
2668 <th>Description</th>
2669</tr>
2670</thead>
2671<tbody>
2672<tr>
2673 <td>annotations_off</td>
2674 <td>uint</td>
2675 <td>offset from the start of the file to the referenced annotation set
2676 or <code>0</code> if there are no annotations for this element.
2677 The offset, if non-zero, should be to a location in the <code>data</code>
2678 section. The format of the data is specified by
2679 "<code>annotation_set_item</code>" below.
2680 </td>
2681</tr>
2682</tbody>
2683</table>
2684
<h3 id="annotation-set-item">annotation_set_item</h3>
<h4>referenced from annotations_directory_item, field_annotations_item,
method_annotations_item, and annotation_set_ref_item</h4>
<h4>appears in the data section</h4>
<h4>alignment: 4 bytes</h4>
2690
2691<table class="format">
2692<thead>
2693<tr>
2694 <th>Name</th>
2695 <th>Format</th>
2696 <th>Description</th>
2697</tr>
2698</thead>
2699<tbody>
2700<tr>
2701 <td>size</td>
2702 <td>uint</td>
2703 <td>size of the set, in entries</td>
2704</tr>
2705<tr>
2706 <td>entries</td>
2707 <td>annotation_off_item[size]</td>
2708 <td>elements of the set. The elements must be sorted in increasing order,
2709 by <code>type_idx</code>.
2710 </td>
2711</tr>
2712</tbody>
2713</table>
2714
<h3 id="off-item">annotation_off_item format</h3>

2717<table class="format">
2718<thead>
2719<tr>
2720 <th>Name</th>
2721 <th>Format</th>
2722 <th>Description</th>
2723</tr>
2724</thead>
2725<tbody>
2726<tr>
2727 <td>annotation_off</td>
2728 <td>uint</td>
2729 <td>offset from the start of the file to an annotation.
2730 The offset should be to a location in the <code>data</code> section,
2731 and the format of the data at that location is specified by
2732 "<code>annotation_item</code>" below.
2733 </td>
2734</tr>
2735</tbody>
2736</table>
2737
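<p>The chain from an <code>annotation_set_ref_list</code> through each
<code>annotation_set_item</code> to the individual annotation offsets can be
sketched as below. Both structures are a <code>uint</code> count followed by
that many <code>uint</code> offsets, and a zero <code>annotations_off</code>
means the element has no annotations. The function names and the sample
buffer layout are assumptions.</p>

```python
import struct


def read_uint_list(buf, offset):
    """Read a uint count followed by that many uint entries."""
    (size,) = struct.unpack_from("<I", buf, offset)
    return list(struct.unpack_from("<%dI" % size, buf, offset + 4))


def read_parameter_annotation_offsets(buf, ref_list_off):
    """Return one list of annotation_off values per method parameter."""
    result = []
    for set_off in read_uint_list(buf, ref_list_off):
        # 0 marks a parameter without annotations.
        result.append([] if set_off == 0 else read_uint_list(buf, set_off))
    return result


# Hypothetical file fragment: ref list at offset 8, one set at offset 20.
buf = bytearray(32)
struct.pack_into("<III", buf, 8, 2, 0, 20)         # 2 refs: none, set @ 20
struct.pack_into("<III", buf, 20, 2, 0x40, 0x50)   # set with 2 annotations
print(read_parameter_annotation_offsets(buf, 8))   # [[], [64, 80]]
```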
2738
<h3 id="annotation-item">annotation_item</h3>
<h4>referenced from annotation_set_item</h4>
<h4>appears in the data section</h4>
<h4>alignment: none (byte-aligned)</h4>
2743
2744<table class="format">
2745<thead>
2746<tr>
2747 <th>Name</th>
2748 <th>Format</th>
2749 <th>Description</th>
2750</tr>
2751</thead>
2752<tbody>
2753<tr>
2754 <td>visibility</td>
2755 <td>ubyte</td>
2756 <td>intended visibility of this annotation (see below)</td>
2757</tr>
2758<tr>
2759 <td>annotation</td>
2760 <td>encoded_annotation</td>
2761 <td>encoded annotation contents, in the format described by
    "<code>encoded_annotation</code> format" under
    "<code>encoded_value</code> encoding" above.
  </td>
2765</tr>
2766</tbody>
2767</table>
2768
<h3 id="visibility">Visibility values</h3>

2771<p>These are the options for the <code>visibility</code> field in an
2772<code>annotation_item</code>:</p>
2773
2774<table class="format">
2775<thead>
2776<tr>
2777 <th>Name</th>
2778 <th>Value</th>
2779 <th>Description</th>
2780</tr>
2781</thead>
2782<tbody>
2783<tr>
2784 <td>VISIBILITY_BUILD</td>
2785 <td>0x00</td>
2786 <td>intended only to be visible at build time (e.g., during compilation
2787 of other code)
2788 </td>
2789</tr>
2790<tr>
2791 <td>VISIBILITY_RUNTIME</td>
2792 <td>0x01</td>
  <td>intended to be visible at runtime</td>
2794</tr>
2795<tr>
2796 <td>VISIBILITY_SYSTEM</td>
2797 <td>0x02</td>
  <td>intended to be visible at runtime, but only to the underlying system
2799 (and not to regular user code)
2800 </td>
2801</tr>
2802</tbody>
2803</table>
2804
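<p>A reader might model these values as an enumeration and take the first
byte of an <code>annotation_item</code> as its visibility. The class and
function names below are illustrative assumptions.</p>

```python
from enum import IntEnum


class Visibility(IntEnum):
    BUILD = 0x00
    RUNTIME = 0x01
    SYSTEM = 0x02


def read_visibility(buf, annotation_item_off):
    """Return the visibility of the annotation_item at the given offset.

    Raises ValueError for a byte that is not a defined visibility value.
    """
    return Visibility(buf[annotation_item_off])


print(read_visibility(bytes([0x02]), 0).name)  # SYSTEM
```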
<h3 id="encoded-array-item">encoded_array_item</h3>
<h4>referenced from class_def_item</h4>
<h4>appears in the data section</h4>
<h4>alignment: none (byte-aligned)</h4>
2809
2810<table class="format">
2811<thead>
2812<tr>
2813 <th>Name</th>
2814 <th>Format</th>
2815 <th>Description</th>
2816</tr>
2817</thead>
2818<tbody>
2819<tr>
2820 <td>value</td>
2821 <td>encoded_array</td>
  <td>bytes representing the encoded array value, in the format specified
    by "<code>encoded_array</code> format" under "<code>encoded_value</code>
    encoding" above.
2825 </td>
2826</tr>
2827</tbody>
2828</table>
2829
<h2 id="system-annotation">System annotations</h2>

2832<p>System annotations are used to represent various pieces of reflective
2833information about classes (and methods and fields). This information is
2834generally only accessed indirectly by client (non-system) code.</p>
2835
2836<p>System annotations are represented in <code>.dex</code> files as
annotations with visibility set to <code>VISIBILITY_SYSTEM</code>.</p>
2838
<h3 id="dalvik-annotation-default">dalvik.annotation.AnnotationDefault</h3>
<h4>appears on methods in annotation interfaces</h4>
2841
2842<p>An <code>AnnotationDefault</code> annotation is attached to each
2843annotation interface which wishes to indicate default bindings.</p>
2844
2845<table class="format">
2846<thead>
2847<tr>
2848 <th>Name</th>
2849 <th>Format</th>
2850 <th>Description</th>
2851</tr>
2852</thead>
2853<tbody>
2854<tr>
2855 <td>value</td>
2856 <td>Annotation</td>
2857 <td>the default bindings for this annotation, represented as an annotation
2858 of this type. The annotation need not include all names defined by the
2859 annotation; missing names simply do not have defaults.
2860 </td>
2861</tr>
2862</tbody>
2863</table>
2864
<h3 id="dalvik-enclosingclass">dalvik.annotation.EnclosingClass</h3>
<h4>appears on classes</h4>
2867
2868<p>An <code>EnclosingClass</code> annotation is attached to each class
2869which is either defined as a member of another class, per se, or is
2870anonymous but not defined within a method body (e.g., a synthetic
2871inner class). Every class that has this annotation must also have an
<code>InnerClass</code> annotation. Additionally, a class must not have
both an <code>EnclosingClass</code> and an
2874<code>EnclosingMethod</code> annotation.</p>
2875
2876<table class="format">
2877<thead>
2878<tr>
2879 <th>Name</th>
2880 <th>Format</th>
2881 <th>Description</th>
2882</tr>
2883</thead>
2884<tbody>
2885<tr>
2886 <td>value</td>
2887 <td>Class</td>
2888 <td>the class which most closely lexically scopes this class</td>
2889</tr>
2890</tbody>
2891</table>
2892
<h3 id="dalvik-enclosingmethod">dalvik.annotation.EnclosingMethod</h3>
<h4>appears on classes</h4>
2895
2896<p>An <code>EnclosingMethod</code> annotation is attached to each class
2897which is defined inside a method body. Every class that has this
2898annotation must also have an <code>InnerClass</code> annotation.
Additionally, a class must not have both an <code>EnclosingClass</code>
and an <code>EnclosingMethod</code> annotation.</p>
2901
2902<table class="format">
2903<thead>
2904<tr>
2905 <th>Name</th>
2906 <th>Format</th>
2907 <th>Description</th>
2908</tr>
2909</thead>
2910<tbody>
2911<tr>
2912 <td>value</td>
2913 <td>Method</td>
2914 <td>the method which most closely lexically scopes this class</td>
2915</tr>
2916</tbody>
2917</table>
2918
<h3 id="dalvik-innerclass">dalvik.annotation.InnerClass</h3>
<h4>appears on classes</h4>
2921
2922<p>An <code>InnerClass</code> annotation is attached to each class
2923which is defined in the lexical scope of another class's definition.
2924Any class which has this annotation must also have <i>either</i> an
2925<code>EnclosingClass</code> annotation <i>or</i> an
2926<code>EnclosingMethod</code> annotation.</p>
2927
<table class="format">
<thead>
<tr>
 <th>Name</th>
 <th>Format</th>
 <th>Description</th>
</tr>
</thead>
<tbody>
<tr>
 <td>name</td>
 <td>String</td>
 <td>the originally declared simple name of this class (not including any
 package prefix). If this class is anonymous, then the name is
 <code>null</code>.
 </td>
</tr>
<tr>
 <td>accessFlags</td>
 <td>int</td>
 <td>the originally declared access flags of the class (which may differ
 from the effective flags because of a mismatch between the execution
 models of the source language and target virtual machine)
 </td>
</tr>
</tbody>
</table>

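<p>The constraints stated for <code>EnclosingClass</code>,
<code>EnclosingMethod</code>, and <code>InnerClass</code> can be
summarized as a single consistency check. The sketch below is only an
illustration: <code>InnerClassRules</code> and <code>isConsistent</code>
are hypothetical names, and annotations are modeled as plain strings
rather than as parsed <code>.dex</code> structures.</p>

```java
import java.util.Arrays;
import java.util.Collection;

class InnerClassRules {
    // Returns true if the simple names of the annotations present on one
    // class satisfy the stated constraints:
    //  - EnclosingClass and EnclosingMethod never appear together;
    //  - either of them requires InnerClass;
    //  - InnerClass requires one of them.
    static boolean isConsistent(Collection<String> names) {
        boolean inner = names.contains("InnerClass");
        boolean byClass = names.contains("EnclosingClass");
        boolean byMethod = names.contains("EnclosingMethod");
        if (byClass && byMethod) {
            return false;
        }
        return inner == (byClass || byMethod);
    }

    public static void main(String[] args) {
        System.out.println(
            isConsistent(Arrays.asList("InnerClass", "EnclosingClass")));
        System.out.println(
            isConsistent(Arrays.asList("InnerClass")));
    }
}
```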
<h3 id="dalvik-memberclasses">dalvik.annotation.MemberClasses</h3>
<h4>appears on classes</h4>

<p>A <code>MemberClasses</code> annotation is attached to each class
which declares member classes. (A member class is a direct inner class
that has a name.)</p>

<table class="format">
<thead>
<tr>
 <th>Name</th>
 <th>Format</th>
 <th>Description</th>
</tr>
</thead>
<tbody>
<tr>
 <td>value</td>
 <td>Class[]</td>
 <td>array of the member classes</td>
</tr>
</tbody>
</table>

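<p>To make "direct inner class that has a name" concrete, the following
Java sketch contrasts a member class with nested classes that would not
be listed. It relies on the standard reflection call
<code>Class.getDeclaredClasses()</code>; treating its result as the
content of a <code>MemberClasses</code> annotation is an assumption made
for illustration, and <code>MemberClassesDemo</code> and its nested
classes are hypothetical names.</p>

```java
class MemberClassesDemo {
    static class Outer {
        // Direct, named inner class: a member class of Outer.
        static class Member {
            // A member class of Member, not of Outer.
            static class Deep {}
        }

        // Anonymous class: it has no name, so it is not a member class.
        static final Object ANON = new Object() {};

        static void method() {
            // Local class: named, but scoped by a method rather than
            // declared directly as a member of Outer.
            class Local {}
        }
    }

    // Returns the classes declared as direct members of Outer.
    static Class<?>[] membersOfOuter() {
        return Outer.class.getDeclaredClasses();
    }

    public static void main(String[] args) {
        for (Class<?> c : membersOfOuter()) {
            System.out.println(c.getName());
        }
    }
}
```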
<h3 id="dalvik-signature">dalvik.annotation.Signature</h3>
<h4>appears on classes, fields, and methods</h4>

<p>A <code>Signature</code> annotation is attached to each class,
field, or method which is defined in terms of a more complicated type
than is representable by a <code>type_id_item</code>. The
<code>.dex</code> format does not define the format for signatures; it
is merely meant to represent whatever signatures a source
language requires for successful implementation of that language's
semantics. As such, signatures are not generally parsed (or verified)
by virtual machine implementations. The signatures simply get handed
off to higher-level APIs and tools (such as debuggers). Any code that
uses a signature should therefore not assume that it receives only
valid signatures, and should explicitly guard against the possibility
of coming across a syntactically invalid signature.</p>

<p>Because signature strings tend to have a lot of duplicated content,
a <code>Signature</code> annotation is defined as an <i>array</i> of
strings, where duplicated elements naturally refer to the same
underlying data, and the signature is taken to be the concatenation of
all the strings in the array. There are no rules about how to split
a signature into separate strings; that is entirely up to the
tools that generate <code>.dex</code> files.</p>

<table class="format">
<thead>
<tr>
 <th>Name</th>
 <th>Format</th>
 <th>Description</th>
</tr>
</thead>
<tbody>
<tr>
 <td>value</td>
 <td>String[]</td>
 <td>the signature of this class or member, as an array of strings that
 are to be concatenated together</td>
</tr>
</tbody>
</table>

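<p>A consumer of this annotation first joins the array and only then
interprets the result, without trusting it to be well formed. The
following Java sketch shows that order of operations. The class
<code>SignatureJoin</code>, its methods, and the way the sample signature
is split into three pieces are all hypothetical; the bracket check is a
deliberately weak heuristic chosen for illustration, not a validator for
any particular signature grammar.</p>

```java
class SignatureJoin {
    // Concatenates the annotation's string elements in order.
    // Returns null if the array or any element is missing, so callers
    // can treat malformed input as "no signature".
    static String join(String[] parts) {
        if (parts == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            if (part == null) {
                return null;
            }
            sb.append(part);
        }
        return sb.toString();
    }

    // Illustrative sanity check only: verifies that '<' and '>' nest
    // properly. Passing this check does not make a signature valid.
    static boolean hasBalancedAngles(String signature) {
        int depth = 0;
        for (int i = 0; i < signature.length(); i++) {
            char ch = signature.charAt(i);
            if (ch == '<') {
                depth++;
            } else if (ch == '>' && --depth < 0) {
                return false;
            }
        }
        return depth == 0;
    }

    public static void main(String[] args) {
        // Hypothetical split of one signature into three strings.
        String[] parts = {"Ljava/util/List<", "Ljava/lang/String;", ">;"};
        String joined = join(parts);
        System.out.println(joined);
        System.out.println(hasBalancedAngles(joined));
    }
}
```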
<h3 id="dalvik-throws">dalvik.annotation.Throws</h3>
<h4>appears on methods</h4>

<p>A <code>Throws</code> annotation is attached to each method which is
declared to throw one or more exception types.</p>

<table class="format">
<thead>
<tr>
 <th>Name</th>
 <th>Format</th>
 <th>Description</th>
</tr>
</thead>
<tbody>
<tr>
 <td>value</td>
 <td>Class[]</td>
 <td>the array of exception types thrown</td>
</tr>
</tbody>
</table>
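<p>For a source-level view of what this annotation records, the Java
sketch below declares a method with a <code>throws</code> clause and reads
the declared exception types back through the standard reflection call
<code>Method.getExceptionTypes()</code>. The names <code>ThrowsDemo</code>
and <code>risky</code> are hypothetical. That a runtime answers this call
from a <code>Throws</code> annotation is an assumption made here for
illustration; a standard JVM reads the same information from the class
file instead.</p>

```java
import java.io.IOException;
import java.lang.reflect.Method;

class ThrowsDemo {
    // Declared to throw two exception types; a method with no throws
    // clause would not be expected to carry the annotation at all.
    static void risky() throws IOException, InterruptedException {}

    // Returns the exception types declared by risky().
    static Class<?>[] declaredExceptions() {
        try {
            Method m = ThrowsDemo.class.getDeclaredMethod("risky");
            return m.getExceptionTypes();
        } catch (NoSuchMethodException e) {
            // Not expected: risky() is declared just above.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        for (Class<?> type : declaredExceptions()) {
            System.out.println(type.getName());
        }
    }
}
```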