The Android Open Source Project | f6c3871 | 2009-03-03 19:28:47 -0800 | [diff] [blame] | 1 | <html> |
| 2 | <head> |
| 3 | <title>Dalvik Bytecode Verifier Notes</title> |
| 4 | </head> |
| 5 | |
| 6 | <body> |
| 7 | <h1>Dalvik Bytecode Verifier Notes</h1> |
| 8 | |
| 9 | <p> |
| 10 | The bytecode verifier in the Dalvik VM attempts to provide the same sorts |
| 11 | of checks and guarantees that other popular virtual machines do. We |
| 12 | perform generally the same set of checks as are described in _The Java |
| 13 | Virtual Machine Specification, Second Edition_, including the updates |
| 14 | planned for the Third Edition. |
| 15 | |
| 16 | <p> |
| 17 | Verification can be enabled for all classes, disabled for all, or enabled |
| 18 | only for "remote" (non-bootstrap) classes. It should be performed for any |
| 19 | class that will be processed with the DEX optimizer, and in fact the |
| 20 | default VM behavior is to only optimize verified classes. |
| 21 | |
| 22 | |
| 23 | <h2>Why Verify?</h2> |
| 24 | |
| 25 | <p> |
| 26 | The verification process adds additional time to the build and to |
| 27 | the installation of new applications. It's fairly quick for app-sized |
| 28 | DEX files, but rather slow for the big "core" and "framework" files. |
| 29 | Why do it all, when our system relies on UNIX processes for security? |
| 30 | <p> |
| 31 | <ol> |
| 32 | <li>Optimizations. The interpreter can ignore a lot of potential |
| 33 | error cases because the verifier guarantees that they are impossible. |
| 34 | Also, we can optimize the DEX file more aggressively if we start |
| 35 | with a stronger set of assumptions about the bytecode. |
Andy McFadden | 0df4413 | 2009-07-29 16:07:01 -0700 | [diff] [blame^] | 36 | <li>"Precise" GC. The work peformed during verification has significant |
| 37 | overlap with the work required to compute register use maps for |
| 38 | type-precise GC. |
The Android Open Source Project | f6c3871 | 2009-03-03 19:28:47 -0800 | [diff] [blame] | 39 | <li>Intra-application security. If an app wants to download bits |
| 40 | of interpreted code over the network and execute them, it can safely |
| 41 | do so using well-established security mechanisms. |
| 42 | <li>3rd party app failure analysis. We have no way to control the |
| 43 | tools and post-processing utilities that external developers employ, |
| 44 | so when we get bug reports with a weird exception or native crash |
| 45 | it's very helpful to start with the assumption that the bytecode |
| 46 | is valid. |
| 47 | </ol> |
| 48 | |
| 49 | |
| 50 | <h2>Verifier Differences</h2> |
| 51 | |
| 52 | <p> |
| 53 | There are a few checks that the Dalvik bytecode verifier does not perform, |
| 54 | because they're not relevant. For example: |
| 55 | <ul> |
| 56 | <li>Type restrictions on constant pool references are not enforced, |
| 57 | because Dalvik does not have a pool of typed constants. (Dalvik |
| 58 | uses a simple index into type-specific pools.) |
| 59 | <li>Verification of the operand stack size is not performed, because |
| 60 | Dalvik does not have an operand stack. |
| 61 | <li>Limitations on <code>jsr</code> and <code>ret</code> do not apply, |
| 62 | because Dalvik doesn't support subroutines. |
| 63 | </ul> |
| 64 | |
| 65 | In some cases they are implemented differently, e.g.: |
| 66 | <ul> |
| 67 | <li>In a conventional VM, backward branches and exceptions are |
| 68 | forbidden when a local variable holds an uninitialized reference. The |
| 69 | restriction was changed to mark registers as invalid when they hold |
| 70 | references to the uninitialized result of a previous invocation of the |
| 71 | same <code>new-instance</code> instruction. |
| 72 | This solves the same problem -- trickery potentially allowing |
| 73 | uninitialized objects to slip past the verifier -- without unduly |
| 74 | limiting branches. |
| 75 | </ul> |
| 76 | |
| 77 | There are also some new ones, such as: |
| 78 | <ul> |
| 79 | <li>The <code>move-exception</code> instruction can only appear as |
| 80 | the first instruction in an exception handler. |
| 81 | <li>The <code>move-result*</code> instructions can only appear |
| 82 | immediately after an appropriate <code>invoke-*</code> |
| 83 | or <code>filled-new-array</code> instruction. |
| 84 | </ul> |
| 85 | |
| 86 | <p> |
Andy McFadden | 0df4413 | 2009-07-29 16:07:01 -0700 | [diff] [blame^] | 87 | The VM is permitted but not required to enforce "structured locking" |
| 88 | constraints, which are designed to ensure that, when a method returns, all |
| 89 | monitors locked by the method have been unlocked an equal number of times. |
| 90 | This is not currently implemented. |
| 91 | |
| 92 | <p> |
The Android Open Source Project | f6c3871 | 2009-03-03 19:28:47 -0800 | [diff] [blame] | 93 | The Dalvik verifier is more restrictive than other VMs in one area: |
| 94 | type safety on sub-32-bit integer widths. These additional restrictions |
| 95 | should make it impossible to, say, pass a value outside the range |
| 96 | [-128, 127] to a function that takes a <code>byte</code> as an argument. |
| 97 | |
| 98 | |
| 99 | <h2>Verification Failures</h2> |
| 100 | |
| 101 | <p> |
Andy McFadden | 0df4413 | 2009-07-29 16:07:01 -0700 | [diff] [blame^] | 102 | The verifier may reject a class immediately, or it may defer throwing |
| 103 | an exception until the code is actually used. For example, if a class |
| 104 | attempts to perform an illegal access on a field, the VM should throw |
| 105 | an IllegalAccessError the first time the instruction is encountered. |
| 106 | On the other hand, if a class contains an invalid bytecode, it should be |
| 107 | rejected immediately with a VerifyError. |
The Android Open Source Project | f6c3871 | 2009-03-03 19:28:47 -0800 | [diff] [blame] | 108 | |
| 109 | <p> |
Andy McFadden | 0df4413 | 2009-07-29 16:07:01 -0700 | [diff] [blame^] | 110 | Immediate VerifyErrors are accompanied by detailed, if somewhat cryptic, |
| 111 | information in the log file. From this it's possible to determine the |
| 112 | exact instruction that failed, and the reason for the failure. |
| 113 | |
| 114 | <p> |
| 115 | It's a bit tricky to implement deferred verification errors in Dalvik. |
| 116 | A few approaches were considered: |
The Android Open Source Project | f6c3871 | 2009-03-03 19:28:47 -0800 | [diff] [blame] | 117 | |
| 118 | <ol> |
| 119 | <li>We could replace the invalid field access instruction with a special |
| 120 | instruction that generates an illegal access error, and allow class |
| 121 | verification to complete successfully. This type of verification must |
Andy McFadden | 0df4413 | 2009-07-29 16:07:01 -0700 | [diff] [blame^] | 122 | be deferred to first class load, rather than be performed ahead of time |
| 123 | during DEX optimization, because some failures will depend on the current |
| 124 | execution environment (e.g. not all classes are available at dexopt time). |
| 125 | At that point the bytecode instructions are mapped read-only during |
| 126 | verification, so rewriting them isn't possible. |
The Android Open Source Project | f6c3871 | 2009-03-03 19:28:47 -0800 | [diff] [blame] | 127 | </li> |
| 128 | |
| 129 | <li>We can perform the access checks when the field/method/class is |
| 130 | resolved. In a typical VM implementation we would do the check when the |
| 131 | entry is resolved in the context of the current classfile, but our DEX |
| 132 | files combine multiple classfiles together, merging the field/method/class |
| 133 | resolution results into a single large table. Once one class successfully |
| 134 | resolves the field, every other class in the same DEX file would be able |
Andy McFadden | 0df4413 | 2009-07-29 16:07:01 -0700 | [diff] [blame^] | 135 | to access the field. This is incorrect. |
The Android Open Source Project | f6c3871 | 2009-03-03 19:28:47 -0800 | [diff] [blame] | 136 | </li> |
| 137 | |
| 138 | <li>Perform the access checks on every field/method/class access. |
| 139 | This adds significant overhead. This is mitigated somewhat by the DEX |
| 140 | optimizer, which will convert many field/method/class accesses into a |
| 141 | simpler form after performing the access check. However, not all accesses |
| 142 | can be optimized (e.g. accesses to classes unknown at dexopt time), |
| 143 | and we don't currently have an optimized form of certain instructions |
| 144 | (notably static field operations). |
| 145 | </li> |
| 146 | </ol> |
| 147 | |
| 148 | <p> |
Andy McFadden | 0df4413 | 2009-07-29 16:07:01 -0700 | [diff] [blame^] | 149 | In early versions of Dalvik (as shipped with Android 1.0/1.5), the verifier |
| 150 | simply regarded all problems as immediately fatal. This generally worked, |
| 151 | but in some cases the VM was rejecting classes because of bits of code |
| 152 | that were never used. The VerifyError itself was sometimes difficult to |
| 153 | decipher, because it was thrown during verification rather than at the |
| 154 | point where the problem was first noticed during execution. |
The Android Open Source Project | f6c3871 | 2009-03-03 19:28:47 -0800 | [diff] [blame] | 155 | <p> |
Andy McFadden | 0df4413 | 2009-07-29 16:07:01 -0700 | [diff] [blame^] | 156 | The current version uses a variation of approach #1. The dexopt |
| 157 | command works the way it did before, leaving the code untouched and |
| 158 | flagging fully-correct classes as "pre-verified". When the VM loads a |
| 159 | class that didn't pass pre-verification, the verifier is invoked. If a |
| 160 | "deferrable" problem is detected, a modifiable copy of the instructions |
| 161 | in the problematic method is made. In that copy, the troubled instruction |
| 162 | is replaced with an "always throw" opcode, and verification continues. |
The Android Open Source Project | f6c3871 | 2009-03-03 19:28:47 -0800 | [diff] [blame] | 163 | |
| 164 | <p> |
Andy McFadden | 0df4413 | 2009-07-29 16:07:01 -0700 | [diff] [blame^] | 165 | In the example used earlier, an attempt to read from an inaccessible |
| 166 | field would result in the "field get" instruction being replaced by |
| 167 | "always throw IllegalAccessError on field X". Creating copies of method |
| 168 | bodies requires additional heap space, but since this affects very few |
| 169 | methods overall the memory impact should be minor. |
The Android Open Source Project | f6c3871 | 2009-03-03 19:28:47 -0800 | [diff] [blame] | 170 | |
Andy McFadden | 0df4413 | 2009-07-29 16:07:01 -0700 | [diff] [blame^] | 171 | <p> |
The Android Open Source Project | f6c3871 | 2009-03-03 19:28:47 -0800 | [diff] [blame] | 172 | <address>Copyright © 2008 The Android Open Source Project</address> |
| 173 | |
| 174 | </body> |
| 175 | </html> |