blob: 612b0bffc13ae5c3ac434bb7263d51e5b27cf6e5 [file] [log] [blame]
page.title=Dalvik bytecode constraints
@jd:body
<!--
Copyright 2015 The Android Open Source Project
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<div id="qv-wrapper">
<div id="qv">
<h2 id="Contents">In this document</h2>
<ol id="auto-toc">
</ol>
</div>
</div>
<h2 id="gen-constraints">General integrity constraints </h2>
<table>
<tr>
<th>
Identifier
</th>
<th>
Description
</th>
</tr>
<tr>
<td>
A1
</td>
<td>
The magic number of the DEX file must be "dex\n035\0".
</td>
</tr>
<tr>
<td>
A1
</td>
<td>
The checksum must be an Adler-32 checksum of the whole file contents
except magic and checksum field.
</td>
</tr>
</table>
<p>The signature must be a SHA-1 hash of the whole file contents except magic,
checksum, and signature.</p>
<p>The file_size must match the actual file size in bytes.</p>
<p>The header_size must have the value 0x70.</p>
<p>The endian_tag must have either the value ENDIAN_CONSTANT or
REVERSE_ENDIAN_CONSTANT.</p>
<p>For each of the link, string_ids, type_ids, proto_ids, field_ids, method_ids, class_defs
and data sections, the offset and size fields must be either both zero or both
non-zero. In the latter case, the offset must be four-byte-aligned.</p>
<p>All offset fields in the header except map_off must be four-byte-aligned.</p>
<p>The map_off field must be either zero or point into the data section. In the
latter case, the data section must exist.</p>
<p>None of the link, string_ids, type_ids, proto_ids, field_ids, method_ids, class_defs
and data sections must overlap each other or the header.</p>
<p>If a map exists, then each map entry must have a valid type. Each type may
appear at most once.</p>
<p>If a map exists, then each map entry must have a nonzero offset and size. The
offset must point into the corresponding section of the file (i.e. a
string_id_item must point into the string_ids section) and the explicit or
implicit size of the item must match the actual contents and size of the
section.</p>
<p>If a map exists, then the offset of map entry n+1 must be greater or equal to
the offset of map entry n plus then size of map entry n. This implies
non-overlapping entries and low-to-high ordering.</p>
<p>The following types of entries must have an offset that is
four-byte-aligned: string_id_item, type_id_item, proto_id_item, field_id_item,
method_id_item, class_def_item, type_list, code_item,
annotations_directory_item.</p>
<p>For each string_id_item, the string_data_off field must contain a valid
reference into the data section. For the referenced string_data_item, the data
field must contain a valid MUTF-8 string, and the utf16_size must match the
decoded length of the string.</p>
<p>For each type_id_item, the desciptor_idx field must contain a valid reference
into the string_ids list. The referenced string must be a valid type descriptor.</p>
<p>For each proto_id_item, the shorty_idx field must contain a valid reference
into the string_ids list. The referenced string must be a valid shorty descriptor.
Also, the return_type_idx field must be a valid index into the type_ids section,
and the parameters_off field must be either zero or a valid offset pointing
into the data section. If nonzero, the parameter list must not contain any void
entries.</p>
<p>For each field_id_item, both the class_idx and type_idx fields must be a valid
indices into the
type_ids list. The entry referenced by class_idx must be a non-array reference type.
In addition, the name_idx field must be a valid reference into the string_ids
section, and the contents of the referenced entry must conform to the MemberName
specification.</p>
<p>For each method_id_item, the class_idx field must be a valid index into the
type_ids section, and the
referenced entry must be a non-array reference type. The proto_id field must
be a valid reference into the proto_ids list. The name_idx field must be a
valid reference into the string_ids
section, and the contents of the referenced entry must conform to the MemberName
specification.</p>
<p>For each class_def_item, ...</p>
<p>For each field_id_item, the class_idx field must be a valid index into the
type_ids list. The referenced entry must be a non-array reference type.</p>
<h2 id="static-constraints">
Static constraints
</h2>
<p>
Static constraints are constraints on individual elements of the bytecode.
They usually can be checked without employing control or data-flow analysis
techniques.
</p>
<table>
<tr>
<th>
Identifier
</th>
<th>
Description
</th>
</tr>
<tr>
<td>
A1
</td>
<td>
The <code>insns</code> array must not be empty.
</td>
</tr>
<tr>
<td>
A2
</td>
<td>
The first opcode in the <code>insns</code> array must have index zero.
</td>
</tr>
<tr>
<td>
A3
</td>
<td>
The <code>insns</code> array must only contain valid Dalvik opcodes.
</td>
</tr>
<tr>
<td>
A4
</td>
<td>
The index of instruction <code>n+1</code> must equal the index of
instruction <code>n</code> plus the length of instruction
<code>n</code>, taking into account possible operands.
</td>
</tr>
<tr>
<td>
A5
</td>
<td>
The last instruction in the <code>insns</code> array must end at index
<code>insns_size-1</code>.
</td>
</tr>
<tr>
<td>
A6
</td>
<td>
All <code>goto</code> and <code>if-&lt;kind&gt;</code> targets must
be opcodes within in the same method.
</td>
</tr>
<tr>
<td>
A7
</td>
<td>
All targets of a <code>packed-switch</code> instruction must be
opcodes within in the same method. The size and the list of targets
must be consistent.
</td>
</tr>
<tr>
<td>
A8
</td>
<td>
All targets of a <code>sparse-switch</code> instruction must be
opcodes within in the same method. The corresponding table must be
consistent and sorted low-to-high.
</td>
</tr>
<tr>
<td>
A9
</td>
<td>
The <code>B</code> operand of the <code>const-string</code> and
<code>const-string/jumbo</code> instructions must be a valid index
into the string constant pool.
</td>
</tr>
<tr>
<td>
A10
</td>
<td>
The <code>C</code> operand of the <code>iget&lt;kind&gt;</code> and
<code>iput&lt;kind&gt;</code> instructions must be a valid index into
the field constant pool. The referenced entry must represent an
instance field.
</td>
</tr>
<tr>
<td>
A11
</td>
<td>
The <code>C</code> operand of the <code>sget&lt;kind&gt;</code> and
<code>sput&lt;kind&gt;</code> instructions must be a valid index into
the field constant pool. The referenced entry must represent a static
field.
</td>
</tr>
<tr>
<td>
A12
</td>
<td>
The <code>C</code> operand of the <code>invoke-virtual</code>,
<code>invoke-super</code>, <code<invoke-direct</code> and
<code>invoke-static</code> instructions must be a valid index into the
method constant pool. In all cases, the referenced
<code>method_id</code> must belong to a class (not an interface).
</td>
</tr>
<tr>
<td>
A13
</td>
<td>
The <code>B</code> operand of the <code>invoke-virtual/range</code>,
<code>invoke-super/range</code>, <code>invoke-direct/range</code>, and
<code>invoke-static/range</code> instructions must be a valid index
into the method constant pool. In all cases, the referenced
<code>method_id</code> must belong to a class (not an interface).
</td>
</tr>
<tr>
<td>
A14
</td>
<td>
A method the name of which starts with a '<' must only be invoked
implicitly by the VM, not by code originating from a Dex file. The
only exception is the instance initializer, which may be invoked by
<code>invoke-direct</code>.
</td>
</tr>
<tr>
<td>
A15
</td>
<td>
The <code>C</code> operand of the <code>invoke-interface</code>
instruction must be a valid index into the method constant pool. The
referenced <code>method_id</code> must belong to an interface (not a
class).
</td>
</tr>
<tr>
<td>
A16
</td>
<td>
The <code>B</code> operand of the <code>invoke-interface/range</code>
instruction must be a valid index into the method constant pool.
The referenced <code>method_id</code> must belong to an interface (not
a class).
</td>
</tr>
<tr>
<td>
A17
</td>
<td>
The <code>B</code> operand of the <code>const-class</code>,
<code>check-cast</code>, <code>new-instance</code>, and
<code>filled-new-array/range</code> instructions must be a valid index
into the type constant pool.
</td>
</tr>
<tr>
<td>
A18
</td>
<td>
The <code>C</code> operand of the <code>instance-of</code>,
<code>new-array</code>, and <code>filled-new-array</code>
instructions must be a valid index into the type constant pool.
</td>
</tr>
<tr>
<td>
A19
</td>
<td>
The dimensions of an array created by a <code>new-array</code>
instruction must be less than <code>256</code>.
</td>
</tr>
<tr>
<td>
A20
</td>
<td>
The <code>new</code> instruction must not refer to array classes,
interfaces, or abstract classes.
</td>
</tr>
<tr>
<td>
A21
</td>
<td>
The type referred to by a <code>new-array</code> instruction must be
a valid, non-reference type.
</td>
</tr>
<tr>
<td>
A22
</td>
<td>
All registers referred to by an instruction in a single-width
(non-pair) fashion must be valid for the current method. That is,
their indices must be non-negative and smaller than
<code>registers_size</code>.
</td>
</tr>
<tr>
<td>
A23
</td>
<td>
All registers referred to by an instruction in a double-width (pair)
fashion must be valid for the current method. That is, their indices
must be non-negative and smaller than <code>registers_size-1</code>.
</td>
</tr>
</table>
<h2 id="struct-constraints">
Structural constraints
</h2>
<p>
Structural constraints are constraints on relationships between several
elements of the bytecode. They usually can't be checked without employing
control or data-flow analysis techniques.
</p>
<table>
<tr>
<th>
Identifier
</th>
<th>
Description
</th>
</tr>
<tr>
<td>
B1
</td>
<td>
The number and types of arguments (registers and immediate values)
must always match the instruction.
</td>
</tr>
<tr>
<td>
B2
</td>
<td>
Register pairs must never be broken up.
</td>
</tr>
<tr>
<td>
B3
</td>
<td>
A register (or pair) has to be assigned first before it can be
read.
</td>
</tr>
<tr>
<td>
B4
</td>
<td>
An <code>invoke-direct</code> instruction must only invoke an instance
initializer or a method in the current class or one of its
superclasses.
</td>
</tr>
<tr>
<td>
B5
</td>
<td>
An instance initializer must only be invoked on an uninitialized
instance.
</td>
</tr>
<tr>
<td>
B6
</td>
<td>
Instance methods may only be invoked on and instance fields may only
be accessed on already initialized instances.
</td>
</tr>
<tr>
<td>
B7
</td>
<td>
A register which holds the result of a <code>new-instance</code>
instruction must not be used if the same
<code>new-instance</code> instruction is again executed before
the instance is initialized.
</td>
</tr>
<tr>
<td>
B8
</td>
<td>
An instance initializer must call another instance initializer (same
class or superclass) before any instance members can be accessed.
Exceptions are non-inherited instance fields, which can be assigned
before calling another initializer, and the <code>Object</code> class
in general.
</td>
</tr>
<tr>
<td>
B9
</td>
<td>
All actual method arguments must be assignment-compatible with their
respective formal arguments.
</td>
</tr>
<tr>
<td>
B10
</td>
<td>
For each instance method invocation, the actual instance must be
assignment-compatible with the class or interface specified in the
instruction.
</td>
</tr>
<tr>
<td>
B11
</td>
<td>
A <code>return&lt;kind&gt;</code> instruction must match its
method's return type.
</td>
</tr>
<tr>
<td>
B12
</td>
<td>
When accessing protected members of a superclass, the actual type of
the instance being accessed must be either the current class or one
of its subclasses.
</td>
</tr>
<tr>
<td>
B13
</td>
<td>
The type of a value stored into a static field must be
assignment-compatible with or convertible to the field's type.
</td>
</tr>
<tr>
<td>
B14
</td>
<td>
The type of a value stored into a field must be assignment-compatible
with or convertible to the field's type.
</td>
</tr>
<tr>
<td>
B15
</td>
<td>
The type of every value stored into an array must be
assignment-compatible with the array's component type.
</td>
</tr>
<tr>
<td>
B16
</td>
<td>
The <code>A</code> operand of a <code>throw</code> instruction must
be assignment-compatible with <code>java.lang.Throwable</code>.
</td>
</tr>
<tr>
<td>
B17
</td>
<td>
The last reachable instruction of a method must either be a backwards
<code>goto</code> or branch, a <code>return</code>, or a
<code>throw</code> instruction. It must not be possible to leave the
<code>insns</code> array at the bottom.
</td>
</tr>
<tr>
<td>
B18
</td>
<td>
The unassigned half of a former register pair may not be read (is
considered invalid) until it has been re-assigned by some other
instruction.
</td>
</tr>
<tr>
<td>
B19
</td>
<td>
A <code>move-result&lt;kind&gt;</code> instruction must be immediately
preceded (in the <code>insns</code> array) by an
<code>invoke-&lt;kind&gt;</code> instruction. The only exception is
the <code>move-result-object</code> instruction, which may also be
preceded by a <code>filled-new-array</code> instruction.
</td>
</tr>
<tr>
<td>
B20
</td>
<td>
A <code>move-result&lt;kind&gt;</code> instruction must be immediately
preceded (in actual control flow) by a matching
<code>return-&lt;kind&gt;</code> instruction (it must not be jumped
to). The only exception is the <code>move-result-object</code>
instruction, which may also be preceded by a
<code>filled-new-array</code> instruction.
</td>
</tr>
<tr>
<td>
B21
</td>
<td>
A <code>move-exception</code> instruction must only appear as the
first instruction in an exception handler.
</td>
</tr>
<tr>
<td>
B22
</td>
<td>
The <code>packed-switch-data</code>, <code>sparse-switch-data</code>,
and <code>fill-array-data</code> pseudo-instructions must not be
reachable by control flow.
</td>
</tr>
</table>