blob: 84d513c8bf2309764daf090d19e30de9914592b4 [file] [log] [blame]
The Android Open Source Projectf6c38712009-03-03 19:28:47 -08001<html>
2<head>
3<title>Dalvik Bytecode Verifier Notes</title>
4</head>
5
6<body>
7<h1>Dalvik Bytecode Verifier Notes</h1>
8
9<p>
10The bytecode verifier in the Dalvik VM attempts to provide the same sorts
11of checks and guarantees that other popular virtual machines do. We
12perform generally the same set of checks as are described in _The Java
13Virtual Machine Specification, Second Edition_, including the updates
14planned for the Third Edition.
15
16<p>
17Verification can be enabled for all classes, disabled for all, or enabled
18only for "remote" (non-bootstrap) classes. It should be performed for any
19class that will be processed with the DEX optimizer, and in fact the
20default VM behavior is to only optimize verified classes.
21
22
23<h2>Why Verify?</h2>
24
25<p>
26The verification process adds additional time to the build and to
27the installation of new applications. It's fairly quick for app-sized
28DEX files, but rather slow for the big "core" and "framework" files.
29Why do it all, when our system relies on UNIX processes for security?
30<p>
31<ol>
32 <li>Optimizations. The interpreter can ignore a lot of potential
33 error cases because the verifier guarantees that they are impossible.
34 Also, we can optimize the DEX file more aggressively if we start
35 with a stronger set of assumptions about the bytecode.
Andy McFadden0df44132009-07-29 16:07:01 -070036 <li>"Precise" GC. The work peformed during verification has significant
37 overlap with the work required to compute register use maps for
38 type-precise GC.
The Android Open Source Projectf6c38712009-03-03 19:28:47 -080039 <li>Intra-application security. If an app wants to download bits
40 of interpreted code over the network and execute them, it can safely
41 do so using well-established security mechanisms.
42 <li>3rd party app failure analysis. We have no way to control the
43 tools and post-processing utilities that external developers employ,
44 so when we get bug reports with a weird exception or native crash
45 it's very helpful to start with the assumption that the bytecode
46 is valid.
47</ol>
48
49
50<h2>Verifier Differences</h2>
51
52<p>
53There are a few checks that the Dalvik bytecode verifier does not perform,
54because they're not relevant. For example:
55<ul>
56 <li>Type restrictions on constant pool references are not enforced,
57 because Dalvik does not have a pool of typed constants. (Dalvik
58 uses a simple index into type-specific pools.)
59 <li>Verification of the operand stack size is not performed, because
60 Dalvik does not have an operand stack.
61 <li>Limitations on <code>jsr</code> and <code>ret</code> do not apply,
62 because Dalvik doesn't support subroutines.
63</ul>
64
65In some cases they are implemented differently, e.g.:
66<ul>
67 <li>In a conventional VM, backward branches and exceptions are
68 forbidden when a local variable holds an uninitialized reference. The
69 restriction was changed to mark registers as invalid when they hold
70 references to the uninitialized result of a previous invocation of the
71 same <code>new-instance</code> instruction.
72 This solves the same problem -- trickery potentially allowing
73 uninitialized objects to slip past the verifier -- without unduly
74 limiting branches.
75</ul>
76
77There are also some new ones, such as:
78<ul>
79 <li>The <code>move-exception</code> instruction can only appear as
80 the first instruction in an exception handler.
81 <li>The <code>move-result*</code> instructions can only appear
82 immediately after an appropriate <code>invoke-*</code>
83 or <code>filled-new-array</code> instruction.
84</ul>
85
86<p>
Andy McFadden0df44132009-07-29 16:07:01 -070087The VM is permitted but not required to enforce "structured locking"
88constraints, which are designed to ensure that, when a method returns, all
89monitors locked by the method have been unlocked an equal number of times.
90This is not currently implemented.
91
92<p>
The Android Open Source Projectf6c38712009-03-03 19:28:47 -080093The Dalvik verifier is more restrictive than other VMs in one area:
94type safety on sub-32-bit integer widths. These additional restrictions
95should make it impossible to, say, pass a value outside the range
96[-128, 127] to a function that takes a <code>byte</code> as an argument.
97
98
99<h2>Verification Failures</h2>
100
101<p>
Andy McFadden0df44132009-07-29 16:07:01 -0700102The verifier may reject a class immediately, or it may defer throwing
103an exception until the code is actually used. For example, if a class
104attempts to perform an illegal access on a field, the VM should throw
105an IllegalAccessError the first time the instruction is encountered.
106On the other hand, if a class contains an invalid bytecode, it should be
107rejected immediately with a VerifyError.
The Android Open Source Projectf6c38712009-03-03 19:28:47 -0800108
109<p>
Andy McFadden0df44132009-07-29 16:07:01 -0700110Immediate VerifyErrors are accompanied by detailed, if somewhat cryptic,
111information in the log file. From this it's possible to determine the
112exact instruction that failed, and the reason for the failure.
113
114<p>
115It's a bit tricky to implement deferred verification errors in Dalvik.
116A few approaches were considered:
The Android Open Source Projectf6c38712009-03-03 19:28:47 -0800117
118<ol>
119<li>We could replace the invalid field access instruction with a special
120instruction that generates an illegal access error, and allow class
121verification to complete successfully. This type of verification must
Andy McFadden0df44132009-07-29 16:07:01 -0700122be deferred to first class load, rather than be performed ahead of time
123during DEX optimization, because some failures will depend on the current
124execution environment (e.g. not all classes are available at dexopt time).
125At that point the bytecode instructions are mapped read-only during
126verification, so rewriting them isn't possible.
The Android Open Source Projectf6c38712009-03-03 19:28:47 -0800127</li>
128
129<li>We can perform the access checks when the field/method/class is
130resolved. In a typical VM implementation we would do the check when the
131entry is resolved in the context of the current classfile, but our DEX
132files combine multiple classfiles together, merging the field/method/class
133resolution results into a single large table. Once one class successfully
134resolves the field, every other class in the same DEX file would be able
Andy McFadden0df44132009-07-29 16:07:01 -0700135to access the field. This is incorrect.
The Android Open Source Projectf6c38712009-03-03 19:28:47 -0800136</li>
137
138<li>Perform the access checks on every field/method/class access.
139This adds significant overhead. This is mitigated somewhat by the DEX
140optimizer, which will convert many field/method/class accesses into a
141simpler form after performing the access check. However, not all accesses
142can be optimized (e.g. accesses to classes unknown at dexopt time),
143and we don't currently have an optimized form of certain instructions
144(notably static field operations).
145</li>
146</ol>
147
148<p>
Andy McFadden0df44132009-07-29 16:07:01 -0700149In early versions of Dalvik (as shipped with Android 1.0/1.5), the verifier
150simply regarded all problems as immediately fatal. This generally worked,
151but in some cases the VM was rejecting classes because of bits of code
152that were never used. The VerifyError itself was sometimes difficult to
153decipher, because it was thrown during verification rather than at the
154point where the problem was first noticed during execution.
The Android Open Source Projectf6c38712009-03-03 19:28:47 -0800155<p>
Andy McFadden0df44132009-07-29 16:07:01 -0700156The current version uses a variation of approach #1. The dexopt
157command works the way it did before, leaving the code untouched and
158flagging fully-correct classes as "pre-verified". When the VM loads a
159class that didn't pass pre-verification, the verifier is invoked. If a
160"deferrable" problem is detected, a modifiable copy of the instructions
161in the problematic method is made. In that copy, the troubled instruction
162is replaced with an "always throw" opcode, and verification continues.
The Android Open Source Projectf6c38712009-03-03 19:28:47 -0800163
164<p>
Andy McFadden0df44132009-07-29 16:07:01 -0700165In the example used earlier, an attempt to read from an inaccessible
166field would result in the "field get" instruction being replaced by
167"always throw IllegalAccessError on field X". Creating copies of method
168bodies requires additional heap space, but since this affects very few
169methods overall the memory impact should be minor.
The Android Open Source Projectf6c38712009-03-03 19:28:47 -0800170
Andy McFadden0df44132009-07-29 16:07:01 -0700171<p>
The Android Open Source Projectf6c38712009-03-03 19:28:47 -0800172<address>Copyright &copy; 2008 The Android Open Source Project</address>
173
174</body>
175</html>