<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html>

<head>
<title>Dalvik VM Instruction Formats</title>
<link rel=stylesheet href="instruction-formats.css">
</head>

<body>

<h1>Dalvik VM Instruction Formats</h1>
<p>Copyright &copy; 2007 The Android Open Source Project</p>

<h2>Introduction and Overview</h2>

<p>This document lists the instruction formats used by Dalvik bytecode
and is meant to be used in conjunction with the
<a href="dalvik-bytecode.html">bytecode reference document</a>.</p>

<h3>Bitwise descriptions</h3>

<p>The first column in the format table lists the bitwise layout of
the format. It consists of one or more space-separated "words", each of
which describes a 16-bit code unit. Each character in a word
represents four bits, read from high bits to low, with vertical bars
("<code>|</code>") interspersed to aid in reading. Uppercase letters
in sequence from "<code>A</code>" are used to indicate fields within
the format (which then get defined further by the syntax column). The term
"<code>op</code>" is used to indicate the position of the eight-bit
opcode within the format. A slashed zero ("<code>&Oslash;</code>") is
used to indicate that all bits should be zero in the indicated
position.</p>

<p>For example, the format "<code>B|A|<i>op</i> CCCC</code>" indicates
that the format consists of two 16-bit code units. The first word
consists of the opcode in the low eight bits and a pair of four-bit
values in the high eight bits; and the second word consists of a single
16-bit value.</p>
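
<p>As an illustration of the layout just described, the following is a
minimal Python sketch that splits two code units laid out as
"<code>B|A|<i>op</i> CCCC</code>" into their fields. The function name
<code>decode_b_a_op_cccc</code> and the sample values are hypothetical and
are not taken from any Dalvik tool.</p>

```python
def decode_b_a_op_cccc(unit0: int, unit1: int) -> dict:
    """Split two 16-bit code units laid out as "B|A|op CCCC" into fields.

    Hypothetical helper for illustration; not part of any Dalvik tool.
    """
    if not (0 <= unit0 <= 0xFFFF and 0 <= unit1 <= 0xFFFF):
        raise ValueError("code units must be 16-bit values")
    return {
        "op": unit0 & 0xFF,        # opcode: low eight bits of the first word
        "A": (unit0 >> 8) & 0xF,   # low nibble of the high eight bits
        "B": (unit0 >> 12) & 0xF,  # high nibble of the high eight bits
        "C": unit1,                # the whole second word
    }


# Arbitrary sample code units, chosen only to make each field visible.
fields = decode_b_a_op_cccc(0x2132, 0x0010)
print(fields)
```

<p>Reading the first word <code>0x2132</code> from high bits to low gives
<code>B=2</code>, <code>A=1</code>, and <code>op=0x32</code>, which matches
the left-to-right order of the characters in the format string.</p>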

<h3>Format IDs</h3>

<p>The second column in the format table indicates the short identifier
for the format, which is used in other documents and in code to identify
the format.</p>

<p>Format IDs consist of three characters, two digits followed by a
letter. The first digit indicates the number of 16-bit code units in the
format. The second digit indicates the maximum number of registers that the
format contains (maximum, since some formats can accommodate a variable
number of registers), with the special designation "<code>r</code>" indicating
that a range of registers is encoded. The final letter semi-mnemonically
indicates the type of any extra data encoded by the format. For example,
format "<code>21t</code>" is of length two, contains one register reference,
and additionally contains a branch target.</p>

<p>Suggested static linking formats have an additional "<code>s</code>" suffix,
making them four characters total.</p>
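
<p>The naming rules above are regular enough to parse mechanically. The
following Python sketch does so under the assumption that an ID is exactly
one digit, one digit or "<code>r</code>", one typecode letter, and an
optional "<code>s</code>" suffix. The function name
<code>parse_format_id</code> and the dictionary keys are hypothetical.</p>

```python
import re

# One digit (code units), one digit or "r" (registers), one typecode letter,
# and an optional "s" suffix for the suggested static linking formats.
_FORMAT_ID = re.compile(r"([0-9])([0-9r])([a-z])(s?)")


def parse_format_id(format_id: str) -> dict:
    """Break a format ID such as "21t" or "35ms" into its parts.

    Hypothetical helper; the key names are chosen for this example only.
    """
    match = _FORMAT_ID.fullmatch(format_id)
    if match is None:
        raise ValueError(f"not a format ID: {format_id!r}")
    units, regs, typecode, static = match.groups()
    return {
        "code_units": int(units),
        "max_registers": None if regs == "r" else int(regs),
        "register_range": regs == "r",
        "typecode": typecode,
        "static_linking": static == "s",
    }


print(parse_format_id("21t"))
print(parse_format_id("35ms"))
```

<p>Because the suffix is only recognized in fourth position, the three-character
ID "<code>21s</code>" is read as typecode "<code>s</code>" (signed short) and
not as a static linking format.</p>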

<p>The full list of typecode letters is as follows. Note that some
forms have different sizes, depending on the format:</p>

<table class="letters">
<thead>
<tr>
  <th>Mnemonic</th>
  <th>Bit Sizes</th>
  <th>Meaning</th>
</tr>
</thead>
<tbody>
<tr>
  <td>b</td>
  <td>8</td>
  <td>immediate signed <b>b</b>yte</td>
</tr>
<tr>
  <td>c</td>
  <td>16, 32</td>
  <td><b>c</b>onstant pool index</td>
</tr>
<tr>
  <td>f</td>
  <td>16</td>
  <td>inter<b>f</b>ace constants (only used in statically linked formats)
  </td>
</tr>
<tr>
  <td>h</td>
  <td>16</td>
  <td>immediate signed <b>h</b>at (high-order bits of a 32- or 64-bit
    value; low-order bits are all <code>0</code>)
  </td>
</tr>
<tr>
  <td>i</td>
  <td>32</td>
  <td>immediate signed <b>i</b>nt, or 32-bit float</td>
</tr>
<tr>
  <td>l</td>
  <td>64</td>
  <td>immediate signed <b>l</b>ong, or 64-bit double</td>
</tr>
<tr>
  <td>m</td>
  <td>16</td>
  <td><b>m</b>ethod constants (only used in statically linked formats)</td>
</tr>
<tr>
  <td>n</td>
  <td>4</td>
  <td>immediate signed <b>n</b>ibble</td>
</tr>
<tr>
  <td>s</td>
  <td>16</td>
  <td>immediate signed <b>s</b>hort</td>
</tr>
<tr>
  <td>t</td>
  <td>8, 16, 32</td>
  <td>branch <b>t</b>arget</td>
</tr>
<tr>
  <td>x</td>
  <td>0</td>
  <td>no additional data</td>
</tr>
</tbody>
</table>
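
<p>Several of the typecodes above are signed immediates narrower than the
value they produce. The following Python sketch shows one way to recover
the signed value from a raw field and to expand a 16-bit "hat" into a 32-
or 64-bit value. The helper names <code>sign_extend</code> and
<code>expand_hat</code> are hypothetical, and two's-complement encoding of
the signed fields is an assumption of this example.</p>

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number.

    Hypothetical helper; two's-complement encoding is assumed here.
    """
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def expand_hat(bbbb: int, total_bits: int) -> int:
    """Place a 16-bit "hat" in the high-order bits; low-order bits are all 0."""
    if total_bits not in (32, 64):
        raise ValueError("a hat expands to a 32- or 64-bit value")
    return sign_extend((bbbb & 0xFFFF) << (total_bits - 16), total_bits)


print(sign_extend(0xF, 4))             # nibble (typecode n)
print(sign_extend(0x80, 8))            # byte (typecode b)
print(sign_extend(0xFFFE, 16))         # short (typecode s)
print(hex(expand_hat(0x3F80, 32)))     # hat (typecode h), 32-bit result
```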

<h3>Syntax</h3>

<p>The third column of the format table indicates the human-oriented
syntax for instructions which use the indicated format. Each instruction
starts with the named opcode and is optionally followed by one or
more arguments, themselves separated with commas.</p>

<p>Wherever an argument refers to a field from the first column, the
letter for that field is indicated in the syntax, repeated once for
each four bits of the field. For example, an eight-bit field labeled
"<code>BB</code>" in the first column would also be labeled
"<code>BB</code>" in the syntax column.</p>

<p>Arguments which name a register have the form "<code>v<i>X</i></code>".
The prefix "<code>v</code>" was chosen instead of the more common
"<code>r</code>" exactly to avoid conflicting with (non-virtual) architectures
on which a Dalvik virtual machine might be implemented which themselves
use the prefix "<code>r</code>" for their registers. (That is, this
decision makes it possible to talk about both virtual and real registers
together without the need for circumlocution.)</p>

<p>Arguments which indicate a literal value have the form
"<code>#+<i>X</i></code>". Some formats indicate literals that only
have non-zero bits in their high-order bits; for these, the zeroes
are represented explicitly in the syntax, even though they do not
appear in the bitwise representation.</p>

<p>Arguments which indicate a relative instruction address offset have the
form "<code>+<i>X</i></code>".</p>

<p>Arguments which indicate a literal constant pool index have the form
"<code><i>kind</i>@<i>X</i></code>", where "<code><i>kind</i></code>"
indicates which constant pool is being referred to. Each opcode that
uses such a format explicitly allows only one kind of constant; see
the opcode reference to figure out the correspondence. The four
kinds of constant pool are "<code>string</code>" (string pool index),
"<code>type</code>" (type pool index), "<code>field</code>" (field
pool index), and "<code>meth</code>" (method pool index).</p>

<p>Similar to the representation of constant pool indices, there are
also suggested (optional) forms that indicate prelinked offsets or
indices. These prelinked values include "<code>vtaboff</code>"
(vtable offset), "<code>fieldoff</code>" (field offset), and
"<code>iface</code>" (interface pool index).</p>

<p>In the cases where a format value isn't explicitly part of the syntax
but instead picks a variant, each variant is listed with the prefix
"<code>[<i>X</i>=<i>N</i>]</code>" (e.g., "<code>[B=2]</code>") to indicate
the correspondence.</p>
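
<p>The argument forms above can be sketched as a handful of small formatting
helpers. This is an illustration only: the function names are hypothetical,
and rendering literals and offsets in signed decimal and pool indices as
four hex digits is a choice made for this example, since this document only
fixes the prefixes (<code>v</code>, <code>#+</code>, <code>+</code>, and
<code><i>kind</i>@</code>).</p>

```python
POOL_KINDS = ("string", "type", "field", "meth")


def reg(index: int) -> str:
    """Register argument, written with the "v" prefix."""
    return f"v{index}"


def literal(value: int) -> str:
    """Literal argument; signed decimal rendering is an assumption."""
    return f"#{value:+d}"


def offset(value: int) -> str:
    """Relative instruction address offset; signed decimal is an assumption."""
    return f"{value:+d}"


def pool_ref(kind: str, index: int) -> str:
    """Constant pool index in kind@X form; four hex digits is an assumption."""
    if kind not in POOL_KINDS:
        raise ValueError(f"unknown constant pool kind: {kind!r}")
    return f"{kind}@{index:04x}"


def render(opcode_name: str, *args: str) -> str:
    """Opcode name, optionally followed by comma-separated arguments."""
    return opcode_name if not args else f"{opcode_name} {', '.join(args)}"


print(render("const-string", reg(0), pool_ref("string", 0x2A)))
print(render("goto", offset(-3)))
print(render("op", reg(1), literal(7)))
```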

<h2>The Formats</h2>

<table class="format">
<thead>
<tr>
  <th>Format</th>
  <th>ID</th>
  <th>Syntax</th>
  <th>Notable Opcodes Covered</th>
</tr>
</thead>
<tbody>
<tr>
  <td>&Oslash;&Oslash;|<i>op</i></td>
  <td>10x</td>
  <td><i><code>op</code></i></td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="2">B|A|<i>op</i></td>
  <td>12x</td>
  <td><i><code>op</code></i> vA, vB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>11n</td>
  <td><i><code>op</code></i> vA, #+B</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="2">AA|<i>op</i></td>
  <td>11x</td>
  <td><i><code>op</code></i> vAA</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>10t</td>
  <td><i><code>op</code></i> +AA</td>
  <td>goto</td>
</tr>
<tr>
  <td>&Oslash;&Oslash;|<i>op</i> AAAA</td>
  <td>20t</td>
  <td><i><code>op</code></i> +AAAA</td>
  <td>goto/16</td>
</tr>
<tr>
  <td rowspan="5">AA|<i>op</i> BBBB</td>
  <td>22x</td>
  <td><i><code>op</code></i> vAA, vBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21t</td>
  <td><i><code>op</code></i> vAA, +BBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21s</td>
  <td><i><code>op</code></i> vAA, #+BBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21h</td>
  <td><i><code>op</code></i> vAA, #+BBBB0000<br/>
    <i><code>op</code></i> vAA, #+BBBB000000000000
  </td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>21c</td>
  <td><i><code>op</code></i> vAA, type@BBBB<br/>
    <i><code>op</code></i> vAA, field@BBBB<br/>
    <i><code>op</code></i> vAA, string@BBBB
  </td>
  <td>check-cast<br/>
    const-class<br/>
    const-string
  </td>
</tr>
<tr>
  <td rowspan="2">AA|<i>op</i> CC|BB</td>
  <td>23x</td>
  <td><i><code>op</code></i> vAA, vBB, vCC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>22b</td>
  <td><i><code>op</code></i> vAA, vBB, #+CC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="4">B|A|<i>op</i> CCCC</td>
  <td>22t</td>
  <td><i><code>op</code></i> vA, vB, +CCCC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>22s</td>
  <td><i><code>op</code></i> vA, vB, #+CCCC</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>22c</td>
  <td><i><code>op</code></i> vA, vB, type@CCCC<br/>
    <i><code>op</code></i> vA, vB, field@CCCC
  </td>
  <td>instance-of</td>
</tr>
<tr>
  <td>22cs</td>
  <td><i><code>op</code></i> vA, vB, fieldoff@CCCC</td>
  <td><i>(suggested format for statically linked field access instructions of
    format 22c)</i>
  </td>
</tr>
<tr>
  <td>&Oslash;&Oslash;|<i>op</i> AAAA<sub>lo</sub> AAAA<sub>hi</sub></td>
  <td>30t</td>
  <td><i><code>op</code></i> +AAAAAAAA</td>
  <td>goto/32</td>
</tr>
<tr>
  <td>&Oslash;&Oslash;|<i>op</i> AAAA BBBB</td>
  <td>32x</td>
  <td><i><code>op</code></i> vAAAA, vBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td rowspan="3">AA|<i>op</i> BBBB<sub>lo</sub> BBBB<sub>hi</sub></td>
  <td>31i</td>
  <td><i><code>op</code></i> vAA, #+BBBBBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>31t</td>
  <td><i><code>op</code></i> vAA, +BBBBBBBB</td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>31c</td>
  <td><i><code>op</code></i> vAA, string@BBBBBBBB</td>
  <td>const-string/jumbo</td>
</tr>
<tr>
  <td>B|A|<i>op</i> CCCC G|F|E|D</td>
  <td>35c</td>
  <td><i>[<code>B=5</code>] <code>op</code></i> {vD, vE, vF, vG, vA},
    meth@CCCC<br/>
    <i>[<code>B=5</code>] <code>op</code></i> {vD, vE, vF, vG, vA},
    type@CCCC<br/>
    <i>[<code>B=4</code>] <code>op</code></i> {vD, vE, vF, vG},
    <i><code>kind</code></i>@CCCC<br/>
    <i>[<code>B=3</code>] <code>op</code></i> {vD, vE, vF},
    <i><code>kind</code></i>@CCCC<br/>
    <i>[<code>B=2</code>] <code>op</code></i> {vD, vE},
    <i><code>kind</code></i>@CCCC<br/>
    <i>[<code>B=1</code>] <code>op</code></i> {vD},
    <i><code>kind</code></i>@CCCC<br/>
    <i>[<code>B=0</code>] <code>op</code></i> {},
    <i><code>kind</code></i>@CCCC
  </td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>B|A|<i>op</i> CCCC G|F|E|D</td>
  <td>35ms</td>

  <td><i>[<code>B=5</code>] <code>op</code></i> {vD, vE, vF, vG, vA},
    vtaboff@CCCC<br/>
    <i>[<code>B=4</code>] <code>op</code></i> {vD, vE, vF, vG},
    vtaboff@CCCC<br/>
    <i>[<code>B=3</code>] <code>op</code></i> {vD, vE, vF},
    vtaboff@CCCC<br/>
    <i>[<code>B=2</code>] <code>op</code></i> {vD, vE},
    vtaboff@CCCC<br/>
    <i>[<code>B=1</code>] <code>op</code></i> {vD},
    vtaboff@CCCC<br/>
  </td>
  <td><i>(suggested format for statically linked <code>invoke-virtual</code>
    and <code>invoke-super</code> instructions of format 35c)</i>
  </td>
</tr>
<tr>
  <td>B|A|<i>op</i> DDCC H|G|F|E</td>
  <td>35fs</td>
  <td><i>[<code>B=5</code>] <code>op</code></i> {vE, vF, vG, vH, vA},
    vtaboff@CC, iface@DD<br/>
    <i>[<code>B=4</code>] <code>op</code></i> {vE, vF, vG, vH},
    vtaboff@CC, iface@DD<br/>
    <i>[<code>B=3</code>] <code>op</code></i> {vE, vF, vG},
    vtaboff@CC, iface@DD<br/>
    <i>[<code>B=2</code>] <code>op</code></i> {vE, vF},
    vtaboff@CC, iface@DD<br/>
    <i>[<code>B=1</code>] <code>op</code></i> {vE},
    vtaboff@CC, iface@DD<br/>
  </td>
  <td><i>(suggested format for statically linked <code>invoke-interface</code>
    instructions of format 35c)</i>
  </td>
</tr>
<tr>
  <td>AA|<i>op</i> BBBB CCCC</td>
  <td>3rc</td>
  <td><i><code>op</code></i> {vCCCC .. vNNNN}, meth@BBBB<br/>
    <i><code>op</code></i> {vCCCC .. vNNNN}, type@BBBB<br/>
    <p><i>(where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
    determines the count <code>0..255</code>, and <code>C</code>
    determines the first register)</i></p>
  </td>
  <td>&nbsp;</td>
</tr>
<tr>
  <td>AA|<i>op</i> BBBB CCCC</td>
  <td>3rms</td>
  <td><i><code>op</code></i> {vCCCC .. vNNNN}, vtaboff@BBBB<br/>
    <p><i>(where <code>NNNN = CCCC+AA-1</code>, that is <code>A</code>
    determines the count <code>0..255</code>, and <code>C</code>
    determines the first register)</i></p>
  </td>
  <td><i>(suggested format for statically linked <code>invoke-virtual</code>
    and <code>invoke-super</code> instructions of format <code>3rc</code>)</i>
  </td>
</tr>
<tr>
  <td>AA|<i>op</i> CCBB DDDD</td>
  <td>3rfs</td>
  <td><i><code>op</code></i> {vDDDD .. vNNNN}, vtaboff@BB,
    iface@CC<br/>
    <p><i>(where <code>NNNN = DDDD+AA-1</code>, that is <code>A</code>
    determines the count <code>0..255</code>, and <code>D</code>
    determines the first register)</i></p>
  </td>
  <td><i>(suggested format for statically linked <code>invoke-interface</code>
    instructions of format <code>3rc</code>)</i>
  </td>
</tr>
<tr>
  <td>AA|<i>op</i> BBBB<sub>lo</sub> BBBB BBBB BBBB<sub>hi</sub></td>
  <td>51l</td>
  <td><i><code>op</code></i> vAA, #+BBBBBBBBBBBBBBBB</td>
  <td>const-wide</td>
</tr>
</tbody>
</table>
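
<p>Three of the rows above involve more than simple field extraction: format
<code>35c</code> picks a variant from <code>B</code>, format <code>3rc</code>
derives a register range from <code>AA</code> and <code>CCCC</code>, and the
<sub>lo</sub>/<sub>hi</sub> formats spread one value over several code units.
The following Python sketch decodes each case as laid out in this table. The
function names and sample code units are hypothetical, and rejecting
<code>B</code> values above 5 is an assumption of this example.</p>

```python
def decode_35c(units):
    """Decode "B|A|op CCCC G|F|E|D": B is the register count, A the fifth register."""
    u0, u1, u2 = units
    count = (u0 >> 12) & 0xF
    if count > 5:
        # Assumption: only the B=0..5 variants listed in the table are valid.
        raise ValueError(f"unsupported register count: {count}")
    a = (u0 >> 8) & 0xF
    # The syntax column lists registers in the order D, E, F, G, then A.
    candidates = [u2 & 0xF, (u2 >> 4) & 0xF, (u2 >> 8) & 0xF, (u2 >> 12) & 0xF, a]
    return {"op": u0 & 0xFF, "index": u1, "registers": candidates[:count]}


def decode_3rc(units):
    """Decode "AA|op BBBB CCCC": registers vCCCC .. vNNNN, NNNN = CCCC+AA-1."""
    u0, u1, u2 = units
    count = (u0 >> 8) & 0xFF
    return {"op": u0 & 0xFF, "index": u1, "registers": list(range(u2, u2 + count))}


def join_units(units):
    """Combine code units listed low-order first (lo .. hi) into one value."""
    value = 0
    for position, unit in enumerate(units):
        value |= (unit & 0xFFFF) << (16 * position)
    return value


# Hypothetical code units, chosen only to exercise each decoder.
print(decode_35c([0x316E, 0x0005, 0x0210]))
print(decode_3rc([0x0374, 0x0002, 0x0010]))
print(hex(join_units([0x5678, 0x1234])))
```

<p>For formats <code>31i</code> and <code>51l</code> the joined value is still
an unsigned bit pattern; interpreting it as a signed int or long (or as a
float or double) is a separate step.</p>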

</body>
</html>