| 1 | <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> |
| 2 | |
| 3 | <html> |
| 4 | |
| 5 | <head> |
| 6 | <title>Dalvik VM Instruction Formats</title> |
| 7 | <link rel=stylesheet href="instruction-formats.css"> |
| 8 | </head> |
| 9 | |
| 10 | <body> |
| 11 | |
| 12 | <h1>Dalvik VM Instruction Formats</h1> |
| 13 | <p>Copyright © 2007 The Android Open Source Project</p> |
| 14 | |
| 15 | <h2>Introduction and Overview</h2> |
| 16 | |
| 17 | <p>This document lists the instruction formats used by Dalvik bytecode |
| 18 | and is meant to be used in conjunction with the |
| 19 | <a href="dalvik-bytecode.html">bytecode reference document</a>.</p> |
| 20 | |
| 21 | <h3>Bitwise descriptions</h3> |
| 22 | |
| 23 | <p>The first column in the format table lists the bitwise layout of |
| 24 | the format. It consists of one or more space-separated "words" each of |
| 25 | which describes a 16-bit code unit. Each character in a word |
| 26 | represents four bits, read from high bits to low, with vertical bars |
| 27 | ("<code>|</code>") interspersed to aid in reading. Uppercase letters |
| 28 | in sequence from "<code>A</code>" are used to indicate fields within |
| 29 | the format (which then get defined further by the syntax column). The term |
| 30 | "<code>op</code>" is used to indicate the position of the eight-bit |
| 31 | opcode within the format. A slashed zero ("<code>Ø</code>") is |
| 32 | used to indicate that all bits should be zero in the indicated |
| 33 | position.</p> |
| 34 | |
| 35 | <p>For example, the format "<code>B|A|<i>op</i> CCCC</code>" indicates |
| 36 | that the format consists of two 16-bit code units. The first word |
| 37 | consists of the opcode in the low eight bits and a pair of four-bit |
| 38 | values in the high eight bits; and the second word consists of a single |
| 39 | 16-bit value.</p> |
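<p>As an illustration of reading a bitwise description, the following is a
minimal sketch that splits two code units laid out as
"<code>B|A|<i>op</i> CCCC</code>" into their fields. The function name
<code>decode_b_a_op_cccc</code> and the sample values are hypothetical and
chosen only for this example; they do not come from any Dalvik tool.</p>

```python
def decode_b_a_op_cccc(units):
    """Split two 16-bit code units laid out as "B|A|op CCCC" into fields."""
    first, second = units
    op = first & 0xFF         # opcode: low eight bits of the first unit
    a = (first >> 8) & 0xF    # A: low nibble of the high byte
    b = (first >> 12) & 0xF   # B: top four bits of the first unit
    c = second & 0xFFFF       # CCCC: the entire second unit
    return {"op": op, "A": a, "B": b, "C": c}


# Arbitrary sample units, picked so that each field is easy to spot.
fields = decode_b_a_op_cccc([0x2152, 0x0003])
print(fields)
```

<p>Reading the first unit <code>0x2152</code> from high bits to low gives
<code>B=2</code>, <code>A=1</code>, and <code>op=0x52</code>, matching the
left-to-right order of the description.</p>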
| 40 | |
| 41 | <h3>Format IDs</h3> |
| 42 | |
| 43 | <p>The second column in the format table indicates the short identifier |
| 44 | for the format, which is used in other documents and in code to identify |
| 45 | the format.</p> |
| 46 | |
| 47 | <p>Format IDs consist of three characters: two digits followed by a |
| 48 | letter. The first digit indicates the number of 16-bit code units in the |
| 49 | format. The second digit indicates the maximum number of registers that the |
| 50 | format contains (maximum, since some formats can accommodate a variable |
| 51 | number of registers), with the special designation "<code>r</code>" indicating |
| 52 | that a range of registers is encoded. The final letter semi-mnemonically |
| 53 | indicates the type of any extra data encoded by the format. For example, |
| 54 | format "<code>21t</code>" is of length two, contains one register reference, |
| 55 | and additionally contains a branch target.</p> |
| 56 | |
| 57 | <p>Suggested statically linked formats have an additional "<code>s</code>" suffix, |
| 58 | making them four characters total.</p> |
| 59 | |
| 60 | <p>The full list of typecode letters is as follows. Note that some |
| 61 | forms have different sizes, depending on the format:</p> |
| 62 | |
| 63 | <table class="letters"> |
| 64 | <thead> |
| 65 | <tr> |
| 66 | <th>Mnemonic</th> |
| 67 | <th>Bit Sizes</th> |
| 68 | <th>Meaning</th> |
| 69 | </tr> |
| 70 | </thead> |
| 71 | <tbody> |
| 72 | <tr> |
| 73 | <td>b</td> |
| 74 | <td>8</td> |
| 75 | <td>immediate signed <b>b</b>yte</td> |
| 76 | </tr> |
| 77 | <tr> |
| 78 | <td>c</td> |
| 79 | <td>16, 32</td> |
| 80 | <td><b>c</b>onstant pool index</td> |
| 81 | </tr> |
| 82 | <tr> |
| 83 | <td>f</td> |
| 84 | <td>16</td> |
| 85 | <td>inter<b>f</b>ace constants (only used in statically linked formats) |
| 86 | </td> |
| 87 | </tr> |
| 88 | <tr> |
| 89 | <td>h</td> |
| 90 | <td>16</td> |
| 91 | <td>immediate signed <b>h</b>at (high-order bits of a 32- or 64-bit |
| 92 | value; low-order bits are all <code>0</code>) |
| 93 | </td> |
| 94 | </tr> |
| 95 | <tr> |
| 96 | <td>i</td> |
| 97 | <td>32</td> |
| 98 | <td>immediate signed <b>i</b>nt, or 32-bit float</td> |
| 99 | </tr> |
| 100 | <tr> |
| 101 | <td>l</td> |
| 102 | <td>64</td> |
| 103 | <td>immediate signed <b>l</b>ong, or 64-bit double</td> |
| 104 | </tr> |
| 105 | <tr> |
| 106 | <td>m</td> |
| 107 | <td>16</td> |
| 108 | <td><b>m</b>ethod constants (only used in statically linked formats)</td> |
| 109 | </tr> |
| 110 | <tr> |
| 111 | <td>n</td> |
| 112 | <td>4</td> |
| 113 | <td>immediate signed <b>n</b>ibble</td> |
| 114 | </tr> |
| 115 | <tr> |
| 116 | <td>s</td> |
| 117 | <td>16</td> |
| 118 | <td>immediate signed <b>s</b>hort</td> |
| 119 | </tr> |
| 120 | <tr> |
| 121 | <td>t</td> |
| 122 | <td>8, 16, 32</td> |
| 123 | <td>branch <b>t</b>arget</td> |
| 124 | </tr> |
| 125 | <tr> |
| 126 | <td>x</td> |
| 127 | <td>0</td> |
| 128 | <td>no additional data</td> |
| 129 | </tr> |
| 130 | </tbody> |
| 131 | </table> |
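<p>The format ID convention described above can be sketched as a small
parser. This is only an illustration under the rules stated in this section;
the function name <code>parse_format_id</code> and the dictionary keys it
returns are hypothetical, and the letter meanings are copied from the table
above.</p>

```python
# Typecode letters and their meanings, copied from the table above.
TYPE_LETTERS = {
    "b": "immediate signed byte",
    "c": "constant pool index",
    "f": "interface constants",
    "h": "immediate signed hat",
    "i": "immediate signed int, or 32-bit float",
    "l": "immediate signed long, or 64-bit double",
    "m": "method constants",
    "n": "immediate signed nibble",
    "s": "immediate signed short",
    "t": "branch target",
    "x": "no additional data",
}

DIGITS = "0123456789"


def parse_format_id(format_id):
    """Break a format ID such as "21t" or "3rms" into its components."""
    if len(format_id) not in (3, 4):
        raise ValueError("format IDs have three or four characters")
    if len(format_id) == 4 and format_id[3] != "s":
        raise ValueError("a fourth character must be the 's' suffix")
    units_char, regs_char, letter = format_id[0], format_id[1], format_id[2]
    if units_char not in DIGITS:
        raise ValueError("first character must be a digit")
    if regs_char != "r" and regs_char not in DIGITS:
        raise ValueError("second character must be a digit or 'r'")
    if letter not in TYPE_LETTERS:
        raise ValueError("unknown typecode letter: " + letter)
    return {
        "code_units": int(units_char),
        # "r" means a register range, so there is no fixed maximum.
        "max_registers": None if regs_char == "r" else int(regs_char),
        "register_range": regs_char == "r",
        "kind": TYPE_LETTERS[letter],
        "statically_linked": len(format_id) == 4,
    }


print(parse_format_id("21t"))
print(parse_format_id("3rms"))
```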
| 132 | |
| 133 | <h3>Syntax</h3> |
| 134 | |
| 135 | <p>The third column of the format table indicates the human-oriented |
| 136 | syntax for instructions which use the indicated format. Each instruction |
| 137 | starts with the named opcode and is optionally followed by one or |
| 138 | more arguments, themselves separated with commas.</p> |
| 139 | |
| 140 | <p>Wherever an argument refers to a field from the first column, the |
| 141 | letter for that field is indicated in the syntax, repeated once for |
| 142 | each four bits of the field. For example, an eight-bit field labeled |
| 143 | "<code>BB</code>" in the first column would also be labeled |
| 144 | "<code>BB</code>" in the syntax column.</p> |
| 145 | |
| 146 | <p>Arguments which name a register have the form "<code>v<i>X</i></code>". |
| 147 | The prefix "<code>v</code>" was chosen instead of the more common |
| 148 | "<code>r</code>" specifically to avoid confusion with the (non-virtual) architectures |
| 149 | on which a Dalvik virtual machine might be implemented, since those may themselves |
| 150 | use the prefix "<code>r</code>" for their registers. (That is, this |
| 151 | decision makes it possible to talk about both virtual and real registers |
| 152 | together without the need for circumlocution.)</p> |
| 153 | |
| 154 | <p>Arguments which indicate a literal value have the form |
| 155 | "<code>#+<i>X</i></code>". Some formats indicate literals that only |
| 156 | have non-zero bits in their high-order bits; for these, the zeroes |
| 157 | are represented explicitly in the syntax, even though they do not |
| 158 | appear in the bitwise representation.</p> |
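<p>The explicit zeroes in such literals can be illustrated with a short
sketch that expands the 16-bit field of a "hat" literal into the full 32-bit
or 64-bit value. The function name <code>expand_hat</code> and its
<code>wide</code> parameter are hypothetical; two's-complement interpretation
of the result is an assumption based on the literal being described as
signed.</p>

```python
def expand_hat(bbbb, wide=False):
    """Expand a 16-bit "hat" field to the 32-bit or 64-bit value it denotes."""
    width = 64 if wide else 32
    # The encoded bits become the high-order 16 bits; the rest are zero.
    value = (bbbb & 0xFFFF) << (width - 16)
    # Reinterpret as signed (assumed two's complement).
    if value >= 1 << (width - 1):
        value -= 1 << width
    return value


print(hex(expand_hat(0x3F80)))             # 32-bit form: #+BBBB0000
print(hex(expand_hat(0x4000, wide=True)))  # 64-bit form: #+BBBB000000000000
print(expand_hat(0xFFFF))                  # high bit set, so negative
```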
| 159 | |
| 160 | <p>Arguments which indicate a relative instruction address offset have the |
| 161 | form "<code>+<i>X</i></code>".</p> |
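<p>Both literals ("<code>#+<i>X</i></code>") and offsets
("<code>+<i>X</i></code>") are stored in fields of a fixed width, so a
decoder has to widen them. The sketch below shows one way to do that. The
helper name <code>sign_extend</code> is hypothetical, and treating branch
offsets as signed two's-complement values is an assumption, since this
document only states signedness for the literal types.</p>

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    mask = (1 << bits) - 1
    value &= mask
    sign_bit = 1 << (bits - 1)
    # If the sign bit is set, subtract 2**bits to get the negative value.
    return value - (1 << bits) if value & sign_bit else value


print(sign_extend(0x7, 4))      # four-bit nibble, positive
print(sign_extend(0xF, 4))      # four-bit nibble, negative
print(sign_extend(0xFE, 8))     # eight-bit field
print(sign_extend(0x8000, 16))  # sixteen-bit field
```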
| 162 | |
| 163 | <p>Arguments which indicate a literal constant pool index have the form |
| 164 | "<code><i>kind</i>@<i>X</i></code>", where "<code><i>kind</i></code>" |
| 165 | indicates which constant pool is being referred to. Each opcode that |
| 166 | uses such a format explicitly allows only one kind of constant; see |
| 167 | the opcode reference to figure out the correspondence. The four |
| 168 | kinds of constant pool are "<code>string</code>" (string pool index), |
| 169 | "<code>type</code>" (type pool index), "<code>field</code>" (field |
| 170 | pool index), and "<code>meth</code>" (method pool index).</p> |
| 171 | |
| 172 | <p>Similar to the representation of constant pool indices, there are |
| 173 | also suggested (optional) forms that indicate prelinked offsets or |
| 174 | indices. These prelinked values include "<code>vtaboff</code>" |
| 175 | (vtable offset), "<code>fieldoff</code>" (field offset), and |
| 176 | "<code>iface</code>" (interface pool index).</p> |
| 177 | |
| 178 | <p>In the cases where a format value isn't explicitly part of the syntax |
| 179 | but instead picks a variant, each variant is listed with the prefix |
| 180 | "<code>[<i>X</i>=<i>N</i>]</code>" (e.g., "<code>[B=2]</code>") to indicate |
| 181 | the correspondence.</p> |
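<p>Format <code>35c</code> in the table below is an example of a field that
picks a variant: <code>B</code> selects how many registers are listed. The
following sketch decodes the layout
"<code>B|A|<i>op</i> CCCC G|F|E|D</code>" and renders it in the syntax
described above. The function names, the sample code units, and the opcode
name passed to the formatter are hypothetical inputs for illustration
only.</p>

```python
def decode_35c(units):
    """Decode three code units laid out as "B|A|op CCCC G|F|E|D"."""
    first, cccc, third = units
    op = first & 0xFF
    a = (first >> 8) & 0xF
    b = (first >> 12) & 0xF   # B picks the variant (register count)
    if b > 5:
        raise ValueError("B must be in the range 0..5")
    d = third & 0xF
    e = (third >> 4) & 0xF
    f = (third >> 8) & 0xF
    g = (third >> 12) & 0xF
    # Registers appear in the order D, E, F, G, A; only the first B are used.
    registers = [d, e, f, g, a][:b]
    return op, registers, cccc & 0xFFFF


def format_35c(name, kind, units):
    """Render decoded units as "name {vD, vE, ...}, kind@CCCC"."""
    _, registers, index = decode_35c(units)
    reg_list = ", ".join("v%d" % r for r in registers)
    return "%s {%s}, %s@%04x" % (name, reg_list, kind, index)


# Sample units with B=2, D=1, E=2, CCCC=0x0010.
print(format_35c("invoke-virtual", "meth", [0x206E, 0x0010, 0x0021]))
```

<p>With <code>B=5</code> the fifth register comes from the <code>A</code>
field of the first code unit, which is why <code>vA</code> appears last in
the register list.</p>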
| 182 | |
| 183 | <h2>The Formats</h2> |
| 184 | |
| 185 | <table class="format"> |
| 186 | <thead> |
| 187 | <tr> |
| 188 | <th>Format</th> |
| 189 | <th>ID</th> |
| 190 | <th>Syntax</th> |
| 191 | <th>Notable Opcodes Covered</th> |
| 192 | </tr> |
| 193 | </thead> |
| 194 | <tbody> |
| 195 | <tr> |
| 196 | <td>ØØ|<i>op</i></td> |
| 197 | <td>10x</td> |
| 198 | <td><i><code>op</code></i></td> |
| 199 | <td> </td> |
| 200 | </tr> |
| 201 | <tr> |
| 202 | <td rowspan="2">B|A|<i>op</i></td> |
| 203 | <td>12x</td> |
| 204 | <td><i><code>op</code></i> vA, vB</td> |
| 205 | <td> </td> |
| 206 | </tr> |
| 207 | <tr> |
| 208 | <td>11n</td> |
| 209 | <td><i><code>op</code></i> vA, #+B</td> |
| 210 | <td> </td> |
| 211 | </tr> |
| 212 | <tr> |
| 213 | <td rowspan="2">AA|<i>op</i></td> |
| 214 | <td>11x</td> |
| 215 | <td><i><code>op</code></i> vAA</td> |
| 216 | <td> </td> |
| 217 | </tr> |
| 218 | <tr> |
| 219 | <td>10t</td> |
| 220 | <td><i><code>op</code></i> +AA</td> |
| 221 | <td>goto</td> |
| 222 | </tr> |
| 223 | <tr> |
| 224 | <td>ØØ|<i>op</i> AAAA</td> |
| 225 | <td>20t</td> |
| 226 | <td><i><code>op</code></i> +AAAA</td> |
| 227 | <td>goto/16</td> |
| 228 | </tr> |
| 229 | <tr> |
| 230 | <td rowspan="5">AA|<i>op</i> BBBB</td> |
| 231 | <td>22x</td> |
| 232 | <td><i><code>op</code></i> vAA, vBBBB</td> |
| 233 | <td> </td> |
| 234 | </tr> |
| 235 | <tr> |
| 236 | <td>21t</td> |
| 237 | <td><i><code>op</code></i> vAA, +BBBB</td> |
| 238 | <td> </td> |
| 239 | </tr> |
| 240 | <tr> |
| 241 | <td>21s</td> |
| 242 | <td><i><code>op</code></i> vAA, #+BBBB</td> |
| 243 | <td> </td> |
| 244 | </tr> |
| 245 | <tr> |
| 246 | <td>21h</td> |
| 247 | <td><i><code>op</code></i> vAA, #+BBBB0000<br/> |
| 248 | <i><code>op</code></i> vAA, #+BBBB000000000000 |
| 249 | </td> |
| 250 | <td> </td> |
| 251 | </tr> |
| 252 | <tr> |
| 253 | <td>21c</td> |
| 254 | <td><i><code>op</code></i> vAA, type@BBBB<br/> |
| 255 | <i><code>op</code></i> vAA, field@BBBB<br/> |
| 256 | <i><code>op</code></i> vAA, string@BBBB |
| 257 | </td> |
| 258 | <td>check-cast<br/> |
| 259 | const-class<br/> |
| 260 | const-string |
| 261 | </td> |
| 262 | </tr> |
| 263 | <tr> |
| 264 | <td rowspan="2">AA|<i>op</i> CC|BB</td> |
| 265 | <td>23x</td> |
| 266 | <td><i><code>op</code></i> vAA, vBB, vCC</td> |
| 267 | <td> </td> |
| 268 | </tr> |
| 269 | <tr> |
| 270 | <td>22b</td> |
| 271 | <td><i><code>op</code></i> vAA, vBB, #+CC</td> |
| 272 | <td> </td> |
| 273 | </tr> |
| 274 | <tr> |
| 275 | <td rowspan="4">B|A|<i>op</i> CCCC</td> |
| 276 | <td>22t</td> |
| 277 | <td><i><code>op</code></i> vA, vB, +CCCC</td> |
| 278 | <td> </td> |
| 279 | </tr> |
| 280 | <tr> |
| 281 | <td>22s</td> |
| 282 | <td><i><code>op</code></i> vA, vB, #+CCCC</td> |
| 283 | <td> </td> |
| 284 | </tr> |
| 285 | <tr> |
| 286 | <td>22c</td> |
| 287 | <td><i><code>op</code></i> vA, vB, type@CCCC<br/> |
| 288 | <i><code>op</code></i> vA, vB, field@CCCC |
| 289 | </td> |
| 290 | <td>instance-of</td> |
| 291 | </tr> |
| 292 | <tr> |
| 293 | <td>22cs</td> |
| 294 | <td><i><code>op</code></i> vA, vB, fieldoff@CCCC</td> |
| 295 | <td><i>(suggested format for statically linked field access instructions of |
| 296 | format 22c)</i> |
| 297 | </td> |
| 298 | </tr> |
| 299 | <tr> |
| 300 | <td>ØØ|<i>op</i> AAAA<sub>lo</sub> AAAA<sub>hi</sub></td> |
| 301 | <td>30t</td> |
| 302 | <td><i><code>op</code></i> +AAAAAAAA</td> |
| 303 | <td>goto/32</td> |
| 304 | </tr> |
| 305 | <tr> |
| 306 | <td>ØØ|<i>op</i> AAAA BBBB</td> |
| 307 | <td>32x</td> |
| 308 | <td><i><code>op</code></i> vAAAA, vBBBB</td> |
| 309 | <td> </td> |
| 310 | </tr> |
| 311 | <tr> |
| 312 | <td rowspan="3">AA|<i>op</i> BBBB<sub>lo</sub> BBBB<sub>hi</sub></td> |
| 313 | <td>31i</td> |
| 314 | <td><i><code>op</code></i> vAA, #+BBBBBBBB</td> |
| 315 | <td> </td> |
| 316 | </tr> |
| 317 | <tr> |
| 318 | <td>31t</td> |
| 319 | <td><i><code>op</code></i> vAA, +BBBBBBBB</td> |
| 320 | <td> </td> |
| 321 | </tr> |
| 322 | <tr> |
| 323 | <td>31c</td> |
| 324 | <td><i><code>op</code></i> vAA, string@BBBBBBBB</td> |
| 325 | <td>const-string/jumbo</td> |
| 326 | </tr> |
| 327 | <tr> |
| 328 | <td>B|A|<i>op</i> CCCC G|F|E|D</td> |
| 329 | <td>35c</td> |
| 330 | <td><i>[<code>B=5</code>] <code>op</code></i> {vD, vE, vF, vG, vA}, |
| 331 | meth@CCCC<br/> |
| 332 | <i>[<code>B=5</code>] <code>op</code></i> {vD, vE, vF, vG, vA}, |
| 333 | type@CCCC<br/> |
| 334 | <i>[<code>B=4</code>] <code>op</code></i> {vD, vE, vF, vG}, |
| 335 | <i><code>kind</code></i>@CCCC<br/> |
| 336 | <i>[<code>B=3</code>] <code>op</code></i> {vD, vE, vF}, |
| 337 | <i><code>kind</code></i>@CCCC<br/> |
| 338 | <i>[<code>B=2</code>] <code>op</code></i> {vD, vE}, |
| 339 | <i><code>kind</code></i>@CCCC<br/> |
| 340 | <i>[<code>B=1</code>] <code>op</code></i> {vD}, |
| 341 | <i><code>kind</code></i>@CCCC<br/> |
| 342 | <i>[<code>B=0</code>] <code>op</code></i> {}, |
| 343 | <i><code>kind</code></i>@CCCC |
| 344 | </td> |
| 345 | <td> </td> |
| 346 | </tr> |
| 347 | <tr> |
| 348 | <td>B|A|<i>op</i> CCCC G|F|E|D</td> |
| 349 | <td>35ms</td> |
| 350 | |
| 351 | <td><i>[<code>B=5</code>] <code>op</code></i> {vD, vE, vF, vG, vA}, |
| 352 | vtaboff@CCCC<br/> |
| 353 | <i>[<code>B=4</code>] <code>op</code></i> {vD, vE, vF, vG}, |
| 354 | vtaboff@CCCC<br/> |
| 355 | <i>[<code>B=3</code>] <code>op</code></i> {vD, vE, vF}, |
| 356 | vtaboff@CCCC<br/> |
| 357 | <i>[<code>B=2</code>] <code>op</code></i> {vD, vE}, |
| 358 | vtaboff@CCCC<br/> |
| 359 | <i>[<code>B=1</code>] <code>op</code></i> {vD}, |
| 360 | vtaboff@CCCC<br/> |
| 361 | </td> |
| 362 | <td><i>(suggested format for statically linked <code>invoke-virtual</code> |
| 363 | and <code>invoke-super</code> instructions of format 35c)</i> |
| 364 | </td> |
| 365 | </tr> |
| 366 | <tr> |
| 367 | <td>B|A|<i>op</i> DDCC H|G|F|E</td> |
| 368 | <td>35fs</td> |
| 369 | <td><i>[<code>B=5</code>] <code>op</code></i> {vE, vF, vG, vH, vA}, |
| 370 | vtaboff@CC, iface@DD<br/> |
| 371 | <i>[<code>B=4</code>] <code>op</code></i> {vE, vF, vG, vH}, |
| 372 | vtaboff@CC, iface@DD<br/> |
| 373 | <i>[<code>B=3</code>] <code>op</code></i> {vE, vF, vG}, |
| 374 | vtaboff@CC, iface@DD<br/> |
| 375 | <i>[<code>B=2</code>] <code>op</code></i> {vE, vF}, |
| 376 | vtaboff@CC, iface@DD<br/> |
| 377 | <i>[<code>B=1</code>] <code>op</code></i> {vE}, |
| 378 | vtaboff@CC, iface@DD<br/> |
| 379 | </td> |
| 380 | <td><i>(suggested format for statically linked <code>invoke-interface</code> |
| 381 | instructions of format 35c)</i> |
| 382 | </td> |
| 383 | </tr> |
| 384 | <tr> |
| 385 | <td>AA|<i>op</i> BBBB CCCC</td> |
| 386 | <td>3rc</td> |
| 387 | <td><i><code>op</code></i> {vCCCC .. vNNNN}, meth@BBBB<br/> |
| 388 | <i><code>op</code></i> {vCCCC .. vNNNN}, type@BBBB<br/> |
| 389 | <p><i>(where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code> |
| 390 | determines the count <code>0..255</code>, and <code>C</code> |
| 391 | determines the first register)</i></p> |
| 392 | </td> |
| 393 | <td> </td> |
| 394 | </tr> |
| 395 | <tr> |
| 396 | <td>AA|<i>op</i> BBBB CCCC</td> |
| 397 | <td>3rms</td> |
| 398 | <td><i><code>op</code></i> {vCCCC .. vNNNN}, vtaboff@BBBB<br/> |
| 399 | <p><i>(where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code> |
| 400 | determines the count <code>0..255</code>, and <code>C</code> |
| 401 | determines the first register)</i></p> |
| 402 | </td> |
| 403 | <td><i>(suggested format for statically linked <code>invoke-virtual</code> |
| 404 | and <code>invoke-super</code> instructions of format <code>3rc</code>)</i> |
| 405 | </td> |
| 406 | </tr> |
| 407 | <tr> |
| 408 | <td>AA|<i>op</i> CCBB DDDD</td> |
| 409 | <td>3rfs</td> |
| 410 | <td><i><code>op</code></i> {vDDDD .. vNNNN}, vtaboff@BB, |
| 411 | iface@CC<br/> |
| 412 | <p><i>(where <code>NNNN = DDDD+AA-1</code>, that is <code>A</code> |
| 413 | determines the count <code>0..255</code>, and <code>D</code> |
| 414 | determines the first register)</i></p> |
| 415 | </td> |
| 416 | <td><i>(suggested format for statically linked <code>invoke-interface</code> |
| 417 | instructions of format <code>3rc</code>)</i> |
| 418 | </td> |
| 419 | </tr> |
| 420 | <tr> |
| 421 | <td>AA|<i>op</i> BBBB<sub>lo</sub> BBBB BBBB BBBB<sub>hi</sub></td> |
| 422 | <td>51l</td> |
| 423 | <td><i><code>op</code></i> vAA, #+BBBBBBBBBBBBBBBB</td> |
| 424 | <td>const-wide</td> |
| 425 | </tr> |
| 426 | </tbody> |
| 427 | </table> |
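<p>The register range used by format <code>3rc</code> in the table above
follows from <code>NNNN = CCCC+AA-1</code>. A minimal sketch of that
computation is shown here; the function name <code>decode_3rc</code> and the
sample code units are hypothetical.</p>

```python
def decode_3rc(units):
    """Decode three code units laid out as "AA|op BBBB CCCC"."""
    first, bbbb, cccc = units
    op = first & 0xFF
    count = (first >> 8) & 0xFF   # AA: number of registers, 0..255
    # Registers run from vCCCC through vNNNN, where NNNN = CCCC + AA - 1.
    registers = list(range(cccc, cccc + count))
    return op, registers, bbbb & 0xFFFF


op, registers, index = decode_3rc([0x0374, 0x0005, 0x0010])
print(registers)  # three registers starting at v16
print(index)
```

<p>A count of zero yields an empty range, which is consistent with
<code>A</code> being allowed to take the value <code>0</code>.</p>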
| 428 | |
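<p>Formats whose fields span several code units, such as <code>31i</code>
and <code>51l</code>, mark the units with <sub>lo</sub> and <sub>hi</sub>.
The sketch below reassembles such a field with the low-order unit first, as
the table indicates. The function names are hypothetical, and signed
two's-complement interpretation is assumed from the "immediate signed"
descriptions of the <code>i</code> and <code>l</code> typecodes.</p>

```python
def join_units(units):
    """Combine 16-bit code units, lowest-order unit first, into one integer."""
    value = 0
    for position, unit in enumerate(units):
        value |= (unit & 0xFFFF) << (16 * position)
    return value


def to_signed(value, bits):
    """Reinterpret an unsigned value as two's complement of the given width."""
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def decode_wide_literal(units):
    """Decode "AA|op BBBB(lo) ... BBBB(hi)" for either 31i or 51l."""
    first, literal_units = units[0], units[1:]
    op = first & 0xFF
    aa = (first >> 8) & 0xFF
    bits = 16 * len(literal_units)   # 32 for 31i, 64 for 51l
    return op, aa, to_signed(join_units(literal_units), bits)


print(decode_wide_literal([0x0114, 0x5678, 0x1234]))                   # 31i
print(decode_wide_literal([0x0218, 0xCDEF, 0x89AB, 0x4567, 0x0123]))   # 51l
```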
| 429 | </body> |
| 430 | </html> |