<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html>

<head>
<title>Dalvik VM Instruction Formats</title>
<link rel=stylesheet href="instruction-formats.css">
</head>

<body>

<h1>Dalvik VM Instruction Formats</h1>
<p>Copyright &copy; 2007 The Android Open Source Project</p>

<h2>Introduction and Overview</h2>

<p>This document lists the instruction formats used by Dalvik bytecode
and is meant to be used in conjunction with the
<a href="dalvik-bytecode.html">bytecode reference document</a>.</p>

<h3>Bitwise descriptions</h3>

<p>The first column in the format table lists the bitwise layout of
the format. It consists of one or more space-separated "words", each of
which describes a 16-bit code unit. Each character in a word
represents four bits, read from high bits to low, with vertical bars
("<code>|</code>") interspersed to aid in reading. Uppercase letters
in sequence from "<code>A</code>" are used to indicate fields within
the format (which then get defined further by the syntax column). The term
"<code>op</code>" is used to indicate the position of an eight-bit
opcode within the format. A slashed zero
("<code>&Oslash;</code>") is used to indicate that all bits must be
zero in the indicated position.</p>

<p>For the most part, lettering proceeds from earlier code units to
later code units, and low-order to high-order within a code unit.
However, there are a few exceptions to this general rule, made so that
parts with similar meanings have the same name
across different instruction formats. These cases are noted explicitly
in the format descriptions.</p>

<p>For example, the format "<code>B|A|<i>op</i> CCCC</code>" indicates
that the format consists of two 16-bit code units. The first word
consists of the opcode in the low eight bits and a pair of four-bit
values in the high eight bits; and the second word consists of a single
16-bit value.</p>
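<p>As an illustration only, the layout "<code>B|A|<i>op</i> CCCC</code>"
could be unpacked as in the following sketch. The function name
<code>decode_b_a_op_cccc</code> and the sample code units are hypothetical
and are not taken from any Dalvik implementation.</p>

```python
def decode_b_a_op_cccc(unit0, unit1):
    """Split two 16-bit code units laid out as "B|A|op CCCC" into fields.

    Hypothetical helper for illustration; not part of any Dalvik API.
    """
    if unit0 not in range(0x10000) or unit1 not in range(0x10000):
        raise ValueError("code units must be unsigned 16-bit values")
    return {
        "op": unit0 & 0xFF,        # opcode: low eight bits of the first word
        "A": (unit0 >> 8) & 0xF,   # low nibble of the high eight bits
        "B": (unit0 >> 12) & 0xF,  # high nibble of the high eight bits
        "C": unit1,                # the entire second word
    }


# Made-up sample input: first word 0x2152, second word 0x0003.
fields = decode_b_a_op_cccc(0x2152, 0x0003)
```

<p>Reading the first word from high bits to low gives <code>B</code>,
<code>A</code>, then the opcode, which matches the left-to-right order
of the characters in the layout string.</p>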

<h3>Format IDs</h3>

<p>The second column in the format table indicates the short identifier
for the format, which is used in other documents and in code to identify
the format.</p>

<p>Most format IDs consist of three characters, two digits followed by a
letter. The first digit indicates the number of 16-bit code units in the
format. The second digit indicates the maximum number of registers that the
format contains (maximum, since some formats can accommodate a variable
number of registers), with the special designation "<code>r</code>" indicating
that a range of registers is encoded. The final letter semi-mnemonically
indicates the type of any extra data encoded by the format. For example,
format "<code>21t</code>" is of length two, contains one register reference,
and additionally contains a branch target.</p>
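<p>The three-character naming scheme can be sketched as a small parser.
This is a minimal illustration that assumes only the three-character
form; the function name <code>parse_format_id</code> and the result keys
are hypothetical, and longer IDs are deliberately rejected.</p>

```python
import re


def parse_format_id(fmt):
    """Break a three-character format ID such as "21t" into its parts.

    Hypothetical helper for illustration; IDs with more than three
    characters are outside the scope of this sketch.
    """
    match = re.fullmatch(r"([0-9])([0-9r])([a-z])", fmt)
    if match is None:
        raise ValueError("not a three-character format ID: %r" % (fmt,))
    units, registers, kind = match.groups()
    is_range = registers == "r"
    return {
        "code_units": int(units),                               # first character
        "max_registers": None if is_range else int(registers),  # second character
        "register_range": is_range,                             # "r" designation
        "type": kind,                                           # final letter
    }


example = parse_format_id("21t")
```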

<p>Suggested static linking formats have an additional
"<code>s</code>" suffix, making them four characters total. Similarly,
suggested "inline" linking formats have an additional "<code>i</code>"
suffix. (In this context, inline linking is like static linking,
except with more direct ties into a virtual machine's implementation.)
Finally, a couple of oddball suggested formats (e.g.,
"<code>20bc</code>") include two pieces of data, both of which are
represented in the format ID.</p>

<p>The full list of typecode letters is as follows. Note that some
forms have different sizes, depending on the format:</p>

<table class="letters">
<thead>
<tr>
  <th>Mnemonic</th>
  <th>Bit Sizes</th>
  <th>Meaning</th>
</tr>
</thead>
<tbody>
<tr>
  <td>b</td>
  <td>8</td>
  <td>immediate signed <b>b</b>yte</td>
</tr>
<tr>
  <td>c</td>
  <td>16, 32</td>
  <td><b>c</b>onstant pool index</td>
</tr>
<tr>
  <td>f</td>
  <td>16</td>
  <td>inter<b>f</b>ace constants (only used in statically linked formats)
  </td>
</tr>
<tr>
  <td>h</td>
  <td>16</td>
  <td>immediate signed <b>h</b>at (high-order bits of a 32- or 64-bit
    value; low-order bits are all <code>0</code>)
  </td>
</tr>
<tr>
  <td>i</td>
  <td>32</td>
  <td>immediate signed <b>i</b>nt, or 32-bit float</td>
</tr>
<tr>
  <td>l</td>
  <td>64</td>
  <td>immediate signed <b>l</b>ong, or 64-bit double</td>
</tr>
<tr>
  <td>m</td>
  <td>16</td>
  <td><b>m</b>ethod constants (only used in statically linked formats)</td>
</tr>
<tr>
  <td>n</td>
  <td>4</td>
  <td>immediate signed <b>n</b>ibble</td>
</tr>
<tr>
  <td>s</td>
  <td>16</td>
  <td>immediate signed <b>s</b>hort</td>
</tr>
<tr>
  <td>t</td>
  <td>8, 16, 32</td>
  <td>branch <b>t</b>arget</td>
</tr>
<tr>
  <td>x</td>
  <td>0</td>
  <td>no additional data</td>
</tr>
</tbody>
</table>

<h3>Syntax</h3>

<p>The third column of the format table indicates the human-oriented
syntax for instructions which use the indicated format. Each instruction
starts with the named opcode and is optionally followed by one or
more arguments, themselves separated with commas.</p>

<p>Wherever an argument refers to a field from the first column, the
letter for that field is indicated in the syntax, repeated once for
each four bits of the field. For example, an eight-bit field labeled
"<code>BB</code>" in the first column would also be labeled
"<code>BB</code>" in the syntax column.</p>

<p>Arguments which name a register have the form "<code>v<i>X</i></code>".
The prefix "<code>v</code>" was chosen instead of the more common
"<code>r</code>" specifically to avoid conflicting with those (non-virtual)
architectures on which a Dalvik virtual machine might be implemented that
themselves use the prefix "<code>r</code>" for their registers. (That is, this
decision makes it possible to talk about both virtual and real registers
together without the need for circumlocution.)</p>

<p>Arguments which indicate a literal value have the form
"<code>#+<i>X</i></code>". Some formats indicate literals that only
have non-zero bits in their high-order bits; for these, the zeroes
are represented explicitly in the syntax, even though they do not
appear in the bitwise representation.</p>
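<p>For a literal that occupies only the high-order bits, the value the
syntax spells out can be recovered from the 16 encoded bits roughly as
follows. This is a sketch: the function name <code>expand_hat</code> and
its <code>wide</code> parameter are hypothetical, and the result is
interpreted as a signed two's-complement integer.</p>

```python
def expand_hat(bbbb, wide=False):
    """Expand 16 encoded high-order bits into the full signed literal.

    Hypothetical helper for illustration. With wide=False the result is a
    32-bit value (#+BBBB0000); with wide=True it is a 64-bit value
    (#+BBBB000000000000). The low-order bits are all zero in both cases.
    """
    if bbbb not in range(0x10000):
        raise ValueError("expected an unsigned 16-bit field")
    width = 64 if wide else 32
    value = bbbb * 2 ** (width - 16)   # shift the field into the top 16 bits
    if value >= 2 ** (width - 1):      # top bit set: the literal is negative
        value -= 2 ** width
    return value


narrow = expand_hat(0x3F80)            # the 32-bit pattern 0x3F800000
wide = expand_hat(0x4000, wide=True)   # the 64-bit pattern 0x4000000000000000
```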

<p>Arguments which indicate a relative instruction address offset have the
form "<code>+<i>X</i></code>".</p>
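<p>A relative offset is stored as a fixed-width field, so a decoder has to
sign-extend it before applying it. The sketch below assumes that offsets
are signed, are counted in 16-bit code units, and are relative to the
address of the branching instruction; those assumptions, like the function
names, are not stated in this document and should be checked against the
bytecode reference.</p>

```python
def sign_extend(value, bits):
    """Interpret an unsigned field of the given width as two's complement."""
    if value not in range(2 ** bits):
        raise ValueError("value does not fit in %d bits" % bits)
    return value - 2 ** bits if value >= 2 ** (bits - 1) else value


def branch_target(instruction_address, raw_offset, bits):
    """Return the target address of a branch (hypothetical helper).

    Assumes addresses and offsets are measured in 16-bit code units and
    that the offset is relative to the branch instruction itself.
    """
    return instruction_address + sign_extend(raw_offset, bits)


backward = branch_target(10, 0xFFFD, 16)  # 16-bit field holding -3
forward = branch_target(10, 0x05, 8)      # 8-bit field holding +5
```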

<p>Arguments which indicate a literal constant pool index have the form
"<code><i>kind</i>@<i>X</i></code>", where "<code><i>kind</i></code>"
indicates which constant pool is being referred to. Each opcode that
uses such a format explicitly allows only one kind of constant; see
the opcode reference to figure out the correspondence. The four
kinds of constant pool are "<code>string</code>" (string pool index),
"<code>type</code>" (type pool index), "<code>field</code>" (field
pool index), and "<code>meth</code>" (method pool index).</p>

<p>Similar to the representation of constant pool indices, there are
also suggested (optional) forms that indicate prelinked offsets or
indices. There are two types of suggested prelinked value: vtable offsets
(indicated as "<code>vtaboff</code>") and field offsets (indicated as
"<code>fieldoff</code>").</p>

<p>In the cases where a format value isn't explicitly part of the syntax
but instead picks a variant, each variant is listed with the prefix
"<code>[<i>X</i>=<i>N</i>]</code>" (e.g., "<code>[A=2]</code>") to indicate
the correspondence.</p>

<h2>The Formats</h2>

<table class="format">
<thead>
<tr>
  <th>Format</th>
  <th>ID</th>
  <th>Syntax</th>
  <th>Notable Opcodes Covered</th>
</tr>
</thead>
<tbody>
<tr>
  <td><i>N/A</i></td>
  <td>00x</td>
  <td><i><code>N/A</code></i></td>
  <td><i>pseudo-format used for unused opcodes; suggested for use as the
    nominal format for a breakpoint opcode</i></td>
</tr>
<tr>
  <td>&Oslash;&Oslash;|<i>op</i></td>
  <td>10x</td>
  <td><i><code>op</code></i></td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="2">B|A|<i>op</i></td>
  <td>12x</td>
  <td><i><code>op</code></i> vA, vB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>11n</td>
  <td><i><code>op</code></i> vA, #+B</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="2">AA|<i>op</i></td>
  <td>11x</td>
  <td><i><code>op</code></i> vAA</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>10t</td>
  <td><i><code>op</code></i> +AA</td>
  <td>goto</td>
</tr>
<tr>
  <td>&Oslash;&Oslash;|<i>op</i> AAAA</td>
  <td>20t</td>
  <td><i><code>op</code></i> +AAAA</td>
  <td>goto/16</td>
</tr>
<tr>
  <td>AA|<i>op</i> BBBB</td>
  <td>20bc</td>
  <td><i><code>op</code></i> AA, kind@BBBB</td>
  <td><i>suggested format for statically determined verification errors;
    A is the type of error and B is an index into a type-appropriate
    table (e.g. method references for a no-such-method error)</i></td>
</tr>
<tr>
  <td rowspan="5">AA|<i>op</i> BBBB</td>
  <td>22x</td>
  <td><i><code>op</code></i> vAA, vBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21t</td>
  <td><i><code>op</code></i> vAA, +BBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21s</td>
  <td><i><code>op</code></i> vAA, #+BBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21h</td>
  <td><i><code>op</code></i> vAA, #+BBBB0000<br/>
    <i><code>op</code></i> vAA, #+BBBB000000000000
  </td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21c</td>
  <td><i><code>op</code></i> vAA, type@BBBB<br/>
    <i><code>op</code></i> vAA, field@BBBB<br/>
    <i><code>op</code></i> vAA, string@BBBB
  </td>
  <td>check-cast<br/>
    const-class<br/>
    const-string
  </td>
</tr>
<tr>
  <td rowspan="2">AA|<i>op</i> CC|BB</td>
  <td>23x</td>
  <td><i><code>op</code></i> vAA, vBB, vCC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>22b</td>
  <td><i><code>op</code></i> vAA, vBB, #+CC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="4">B|A|<i>op</i> CCCC</td>
  <td>22t</td>
  <td><i><code>op</code></i> vA, vB, +CCCC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>22s</td>
  <td><i><code>op</code></i> vA, vB, #+CCCC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>22c</td>
  <td><i><code>op</code></i> vA, vB, type@CCCC<br/>
    <i><code>op</code></i> vA, vB, field@CCCC
  </td>
  <td>instance-of</td>
</tr>
<tr>
  <td>22cs</td>
  <td><i><code>op</code></i> vA, vB, fieldoff@CCCC</td>
  <td><i>suggested format for statically linked field access instructions of
    format 22c</i>
  </td>
</tr>
<tr>
  <td>&Oslash;&Oslash;|<i>op</i> AAAA<sub>lo</sub> AAAA<sub>hi</sub></td>
  <td>30t</td>
  <td><i><code>op</code></i> +AAAAAAAA</td>
  <td>goto/32</td>
</tr>
<tr>
  <td>&Oslash;&Oslash;|<i>op</i> AAAA BBBB</td>
  <td>32x</td>
  <td><i><code>op</code></i> vAAAA, vBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="3">AA|<i>op</i> BBBB<sub>lo</sub> BBBB<sub>hi</sub></td>
  <td>31i</td>
  <td><i><code>op</code></i> vAA, #+BBBBBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>31t</td>
  <td><i><code>op</code></i> vAA, +BBBBBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>31c</td>
  <td><i><code>op</code></i> vAA, string@BBBBBBBB</td>
  <td>const-string/jumbo</td>
</tr>
<tr>
  <td rowspan="3">A|G|<i>op</i> BBBB F|E|D|C</td>
  <td>35c</td>
  <td><i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG},
    meth@BBBB<br/>
    <i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG},
    type@BBBB<br/>
    <i>[<code>A=4</code>] <code>op</code></i> {vC, vD, vE, vF},
    <i><code>kind</code></i>@BBBB<br/>
    <i>[<code>A=3</code>] <code>op</code></i> {vC, vD, vE},
    <i><code>kind</code></i>@BBBB<br/>
    <i>[<code>A=2</code>] <code>op</code></i> {vC, vD},
    <i><code>kind</code></i>@BBBB<br/>
    <i>[<code>A=1</code>] <code>op</code></i> {vC},
    <i><code>kind</code></i>@BBBB<br/>
    <i>[<code>A=0</code>] <code>op</code></i> {},
    <i><code>kind</code></i>@BBBB<br/>
    <p><i>The unusual choice in lettering here reflects a desire to make
    the count and the reference index have the same label as in format
    3rc.</i></p>
  </td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>35ms</td>
  <td><i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG},
    vtaboff@BBBB<br/>
    <i>[<code>A=4</code>] <code>op</code></i> {vC, vD, vE, vF},
    vtaboff@BBBB<br/>
    <i>[<code>A=3</code>] <code>op</code></i> {vC, vD, vE},
    vtaboff@BBBB<br/>
    <i>[<code>A=2</code>] <code>op</code></i> {vC, vD},
    vtaboff@BBBB<br/>
    <i>[<code>A=1</code>] <code>op</code></i> {vC},
    vtaboff@BBBB<br/>
    <p><i>The unusual choice in lettering here reflects a desire to make
    the count and the reference index have the same label as in format
    3rms.</i></p>
  </td>
  <td><i>suggested format for statically linked <code>invoke-virtual</code>
    and <code>invoke-super</code> instructions of format 35c</i>
  </td>
</tr>
<tr>
  <td>35mi</td>
  <td><i>[<code>A=5</code>] <code>op</code></i> {vC, vD, vE, vF, vG},
    inline@BBBB<br/>
    <i>[<code>A=4</code>] <code>op</code></i> {vC, vD, vE, vF},
    inline@BBBB<br/>
    <i>[<code>A=3</code>] <code>op</code></i> {vC, vD, vE},
    inline@BBBB<br/>
    <i>[<code>A=2</code>] <code>op</code></i> {vC, vD},
    inline@BBBB<br/>
    <i>[<code>A=1</code>] <code>op</code></i> {vC},
    inline@BBBB<br/>
    <p><i>The unusual choice in lettering here reflects a desire to make
    the count and the reference index have the same label as in format
    3rmi.</i></p>
  </td>
  <td><i>suggested format for inline linked <code>invoke-static</code>
    and <code>invoke-virtual</code> instructions of format 35c</i>
  </td>
</tr>
<tr>
  <td rowspan="3">AA|<i>op</i> BBBB CCCC</td>
  <td>3rc</td>
  <td><i><code>op</code></i> {vCCCC .. vNNNN}, meth@BBBB<br/>
    <i><code>op</code></i> {vCCCC .. vNNNN}, type@BBBB<br/>
    <p><i>where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
    determines the count <code>0..255</code>, and <code>C</code>
    determines the first register</i></p>
  </td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>3rms</td>
  <td><i><code>op</code></i> {vCCCC .. vNNNN}, vtaboff@BBBB<br/>
    <p><i>where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
    determines the count <code>0..255</code>, and <code>C</code>
    determines the first register</i></p>
  </td>
  <td><i>suggested format for statically linked <code>invoke-virtual</code>
    and <code>invoke-super</code> instructions of format <code>3rc</code></i>
  </td>
</tr>
<tr>
  <td>3rmi</td>
  <td><i><code>op</code></i> {vCCCC .. vNNNN}, inline@BBBB<br/>
    <p><i>where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
    determines the count <code>0..255</code>, and <code>C</code>
    determines the first register</i></p>
  </td>
  <td><i>suggested format for inline linked <code>invoke-static</code>
    and <code>invoke-virtual</code> instructions of format 3rc</i>
  </td>
</tr>
<tr>
  <td>AA|<i>op</i> BBBB<sub>lo</sub> BBBB BBBB BBBB<sub>hi</sub></td>
  <td>51l</td>
  <td><i><code>op</code></i> vAA, #+BBBBBBBBBBBBBBBB</td>
  <td>const-wide</td>
</tr>
</tbody>
</table>
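
<p>The register lists of formats <code>35c</code> and <code>3rc</code>
can be recovered from their code units along the following lines. This is
a sketch under the layouts given in the table; the function names and the
sample code units are hypothetical.</p>

```python
def decode_35c(units):
    """Decode "A|G|op BBBB F|E|D|C" into opcode, register list, and index.

    Hypothetical helper for illustration. A is the register count, and
    the registers are listed in the order vC, vD, vE, vF, vG.
    """
    unit0, bbbb, unit2 = units
    count = (unit0 >> 12) & 0xF
    if count > 5:
        raise ValueError("format 35c lists at most five registers")
    g = (unit0 >> 8) & 0xF
    # Third code unit, read from low bits to high: C, D, E, F.
    c, d, e, f = [(unit2 >> shift) & 0xF for shift in (0, 4, 8, 12)]
    return {"op": unit0 & 0xFF, "registers": [c, d, e, f, g][:count], "index": bbbb}


def decode_3rc(units):
    """Decode "AA|op BBBB CCCC" into opcode, register range, and index.

    The range runs from vCCCC to vNNNN, where NNNN = CCCC + AA - 1.
    """
    unit0, bbbb, cccc = units
    count = (unit0 >> 8) & 0xFF
    return {"op": unit0 & 0xFF, "registers": list(range(cccc, cccc + count)), "index": bbbb}


# Made-up code units: three registers in each case.
listed = decode_35c((0x306E, 0x0010, 0x0321))
ranged = decode_3rc((0x0374, 0x0010, 0x0005))
```

<p>In both decoders the count comes from <code>A</code> and the reference
index from <code>B</code>, which is the shared labeling that the lettering
notes in the table describe.</p>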
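
<p>Formats whose layout marks code units with <sub>lo</sub> and
<sub>hi</sub> spread one value over several code units, lowest-order
unit first. A minimal sketch of reassembling such a value, using format
<code>51l</code> as the example, might look like this; the helper names
and the sample code units are hypothetical.</p>

```python
def join_code_units(units):
    """Combine 16-bit code units, lowest-order first, into one unsigned value."""
    value = 0
    for position, unit in enumerate(units):
        if unit not in range(0x10000):
            raise ValueError("code units must be unsigned 16-bit values")
        value += unit * 2 ** (16 * position)
    return value


def to_signed(value, bits):
    """Interpret an unsigned value of the given width as two's complement."""
    return value - 2 ** bits if value >= 2 ** (bits - 1) else value


def decode_51l(units):
    """Decode "AA|op BBBB(lo) BBBB BBBB BBBB(hi)" (hypothetical helper)."""
    if len(units) != 5:
        raise ValueError("format 51l occupies five code units")
    unit0 = units[0]
    literal = to_signed(join_code_units(units[1:]), 64)
    return {"op": unit0 & 0xFF, "A": (unit0 >> 8) & 0xFF, "B": literal}


# Made-up code units whose 64-bit literal is -2.
wide_const = decode_51l([0x0118, 0xFFFE, 0xFFFF, 0xFFFF, 0xFFFF])
```

<p>The same <code>join_code_units</code> helper applies to the two-unit
values of formats <code>30t</code>, <code>31i</code>, <code>31t</code>,
and <code>31c</code>, with a width of 32 bits instead of 64.</p>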

</body>
</html>