Blame - docs/verifier.html - platform/dalvik

blob: 276967f72a023ffe283deb229b2b7233aecb9ac8 [file] [log] [blame]

The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame]	1	<html>
				2	<head>
				3	<title>Dalvik Bytecode Verifier Notes</title>
				4	</head>
				5
				6	<body>
				7	<h1>Dalvik Bytecode Verifier Notes</h1>
				8
				9	<p>
				10	The bytecode verifier in the Dalvik VM attempts to provide the same sorts
				11	of checks and guarantees that other popular virtual machines do. We
				12	perform generally the same set of checks as are described in _The Java
				13	Virtual Machine Specification, Second Edition_, including the updates
				14	planned for the Third Edition.
				15
				16	<p>
				17	Verification can be enabled for all classes, disabled for all, or enabled
				18	only for "remote" (non-bootstrap) classes. It should be performed for any
				19	class that will be processed with the DEX optimizer, and in fact the
				20	default VM behavior is to only optimize verified classes.
				21
				22
				23	<h2>Why Verify?</h2>
				24
				25	<p>
				26	The verification process adds additional time to the build and to
				27	the installation of new applications. It's fairly quick for app-sized
				28	DEX files, but rather slow for the big "core" and "framework" files.
				29	Why do it all, when our system relies on UNIX processes for security?
				30	<p>
				31	<ol>
				32	<li>Optimizations. The interpreter can ignore a lot of potential
				33	error cases because the verifier guarantees that they are impossible.
				34	Also, we can optimize the DEX file more aggressively if we start
				35	with a stronger set of assumptions about the bytecode.
Andy McFadden	0df4413	2009-07-29 16:07:01 -0700	[diff] [blame]	36	<li>"Precise" GC. The work peformed during verification has significant
				37	overlap with the work required to compute register use maps for
				38	type-precise GC.
The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame]	39	<li>Intra-application security. If an app wants to download bits
				40	of interpreted code over the network and execute them, it can safely
				41	do so using well-established security mechanisms.
				42	<li>3rd party app failure analysis. We have no way to control the
				43	tools and post-processing utilities that external developers employ,
				44	so when we get bug reports with a weird exception or native crash
				45	it's very helpful to start with the assumption that the bytecode
				46	is valid.
				47	</ol>
Brian Carlstrom	fbdcfb9	2010-05-28 15:42:12 -0700	[diff] [blame]	48	<p>
				49	It's also a convenient framework to deal with certain situations, notably
				50	replacement of instructions that access volatile 64-bit fields with
				51	more rigorous versions that guarantee atomicity.
The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame]	52
				53
				54	<h2>Verifier Differences</h2>
				55
				56	<p>
				57	There are a few checks that the Dalvik bytecode verifier does not perform,
				58	because they're not relevant. For example:
				59	<ul>
				60	<li>Type restrictions on constant pool references are not enforced,
				61	because Dalvik does not have a pool of typed constants. (Dalvik
				62	uses a simple index into type-specific pools.)
				63	<li>Verification of the operand stack size is not performed, because
				64	Dalvik does not have an operand stack.
				65	<li>Limitations on <code>jsr</code> and <code>ret</code> do not apply,
				66	because Dalvik doesn't support subroutines.
				67	</ul>
				68
				69	In some cases they are implemented differently, e.g.:
				70	<ul>
				71	<li>In a conventional VM, backward branches and exceptions are
				72	forbidden when a local variable holds an uninitialized reference. The
				73	restriction was changed to mark registers as invalid when they hold
				74	references to the uninitialized result of a previous invocation of the
				75	same <code>new-instance</code> instruction.
				76	This solves the same problem -- trickery potentially allowing
				77	uninitialized objects to slip past the verifier -- without unduly
				78	limiting branches.
				79	</ul>
				80
				81	There are also some new ones, such as:
				82	<ul>
				83	<li>The <code>move-exception</code> instruction can only appear as
				84	the first instruction in an exception handler.
				85	<li>The <code>move-result*</code> instructions can only appear
				86	immediately after an appropriate <code>invoke-*</code>
				87	or <code>filled-new-array</code> instruction.
				88	</ul>
				89
				90	<p>
Andy McFadden	0df4413	2009-07-29 16:07:01 -0700	[diff] [blame]	91	The VM is permitted but not required to enforce "structured locking"
				92	constraints, which are designed to ensure that, when a method returns, all
				93	monitors locked by the method have been unlocked an equal number of times.
				94	This is not currently implemented.
				95
				96	<p>
The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame]	97	The Dalvik verifier is more restrictive than other VMs in one area:
				98	type safety on sub-32-bit integer widths. These additional restrictions
				99	should make it impossible to, say, pass a value outside the range
				100	[-128, 127] to a function that takes a <code>byte</code> as an argument.
				101
				102
Andy McFadden	3f64a02	2010-11-12 16:55:21 -0800	[diff] [blame]	103	<h2>Monitor Verification</h2>
				104
				105	<p>
				106	If a method locks an object with a <code>synchronized</code> statement, the
				107	object must be unlocked before the method returns. At the bytecode level,
				108	this means the method must execute a matching <code>monitor-exit</code>
				109	for every <code>monitor-enter</code> instruction, whether the function
				110	completes normally or abnormally. The bytecode verifier optionally
				111	enforces this.
				112
				113	<p>
				114	The verifier uses a fairly simple-minded model. If you enter a monitor
				115	held in register N, you can exit the monitor using register N or any
				116	subsequently-made copies of register N. The verifier does not attempt
				117	to identify previously-made copies, track loads and stores through
Andy McFadden	81d713a	2010-11-29 14:23:46 -0800	[diff] [blame]	118	fields, or recognize identical constant values (for example, the result
				119	values from two <code>const-class</code> instructions on the same class
				120	will be the same reference, but the verifier doesn't recognize this).
				121
				122	<p>
				123	Further, you may only exit the monitor most recently entered. "Hand
				124	over hand" locking techniques, e.g. "lock A; lock B; unlock A; unlock B",
				125	are not allowed.
Andy McFadden	3f64a02	2010-11-12 16:55:21 -0800	[diff] [blame]	126
				127	<p>
				128	This means that there are a number of situations in which the verifier
				129	will throw an exception on code that would execute correctly at run time.
				130	This is not expected to be an issue for compiler-generated bytecode.
				131
				132	<p>
				133	For implementation convenience, the maximum nesting depth of
				134	<code>synchronized</code> statements has been set to 32. This is not
				135	a limitation on the recursion count. The only way to trip this would be
				136	to have a single method with more than 32 nested <code>synchronized</code>
				137	statements, something that is unlikely to occur.
				138
				139
The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame]	140	<h2>Verification Failures</h2>
				141
				142	<p>
Andy McFadden	0df4413	2009-07-29 16:07:01 -0700	[diff] [blame]	143	The verifier may reject a class immediately, or it may defer throwing
				144	an exception until the code is actually used. For example, if a class
				145	attempts to perform an illegal access on a field, the VM should throw
				146	an IllegalAccessError the first time the instruction is encountered.
				147	On the other hand, if a class contains an invalid bytecode, it should be
				148	rejected immediately with a VerifyError.
The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame]	149
				150	<p>
Andy McFadden	0df4413	2009-07-29 16:07:01 -0700	[diff] [blame]	151	Immediate VerifyErrors are accompanied by detailed, if somewhat cryptic,
				152	information in the log file. From this it's possible to determine the
				153	exact instruction that failed, and the reason for the failure.
				154
				155	<p>
				156	It's a bit tricky to implement deferred verification errors in Dalvik.
				157	A few approaches were considered:
The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame]	158
				159	<ol>
				160	<li>We could replace the invalid field access instruction with a special
				161	instruction that generates an illegal access error, and allow class
				162	verification to complete successfully. This type of verification must
Andy McFadden	0df4413	2009-07-29 16:07:01 -0700	[diff] [blame]	163	be deferred to first class load, rather than be performed ahead of time
				164	during DEX optimization, because some failures will depend on the current
				165	execution environment (e.g. not all classes are available at dexopt time).
				166	At that point the bytecode instructions are mapped read-only during
				167	verification, so rewriting them isn't possible.
The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame]	168	</li>
				169
				170	<li>We can perform the access checks when the field/method/class is
				171	resolved. In a typical VM implementation we would do the check when the
				172	entry is resolved in the context of the current classfile, but our DEX
				173	files combine multiple classfiles together, merging the field/method/class
				174	resolution results into a single large table. Once one class successfully
				175	resolves the field, every other class in the same DEX file would be able
Andy McFadden	0df4413	2009-07-29 16:07:01 -0700	[diff] [blame]	176	to access the field. This is incorrect.
The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame]	177	</li>
				178
				179	<li>Perform the access checks on every field/method/class access.
				180	This adds significant overhead. This is mitigated somewhat by the DEX
				181	optimizer, which will convert many field/method/class accesses into a
				182	simpler form after performing the access check. However, not all accesses
				183	can be optimized (e.g. accesses to classes unknown at dexopt time),
				184	and we don't currently have an optimized form of certain instructions
				185	(notably static field operations).
				186	</li>
				187	</ol>
				188
				189	<p>
Andy McFadden	c7659ec	2009-09-18 16:14:41 -0700	[diff] [blame]	190	In early versions of Dalvik (as found in Android 1.6 and earlier), the verifier
Andy McFadden	0df4413	2009-07-29 16:07:01 -0700	[diff] [blame]	191	simply regarded all problems as immediately fatal. This generally worked,
				192	but in some cases the VM was rejecting classes because of bits of code
				193	that were never used. The VerifyError itself was sometimes difficult to
				194	decipher, because it was thrown during verification rather than at the
				195	point where the problem was first noticed during execution.
The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame]	196	<p>
Andy McFadden	0df4413	2009-07-29 16:07:01 -0700	[diff] [blame]	197	The current version uses a variation of approach #1. The dexopt
				198	command works the way it did before, leaving the code untouched and
				199	flagging fully-correct classes as "pre-verified". When the VM loads a
				200	class that didn't pass pre-verification, the verifier is invoked. If a
				201	"deferrable" problem is detected, a modifiable copy of the instructions
				202	in the problematic method is made. In that copy, the troubled instruction
				203	is replaced with an "always throw" opcode, and verification continues.
The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame]	204
				205	<p>
Andy McFadden	0df4413	2009-07-29 16:07:01 -0700	[diff] [blame]	206	In the example used earlier, an attempt to read from an inaccessible
				207	field would result in the "field get" instruction being replaced by
				208	"always throw IllegalAccessError on field X". Creating copies of method
				209	bodies requires additional heap space, but since this affects very few
				210	methods overall the memory impact should be minor.
The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame]	211
Andy McFadden	0df4413	2009-07-29 16:07:01 -0700	[diff] [blame]	212	<p>
The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame]	213	<address>Copyright © 2008 The Android Open Source Project</address>
				214
				215	</body>
				216	</html>