Blame - docs/verifier.html - fp2-dev/platform/dalvik

blob: 656b832738d37ab62ee92ab76a0277074a7ff212 [file] [log] [blame]

The Android Open Source Project	f6c3871	2009-03-03 19:28:47 -0800	[diff] [blame^]	1	<html>
				2	<head>
				3	<title>Dalvik Bytecode Verifier Notes</title>
				4	</head>
				5
				6	<body>
				7	<h1>Dalvik Bytecode Verifier Notes</h1>
				8
				9	<p>
				10	The bytecode verifier in the Dalvik VM attempts to provide the same sorts
				11	of checks and guarantees that other popular virtual machines do. We
				12	perform generally the same set of checks as are described in _The Java
				13	Virtual Machine Specification, Second Edition_, including the updates
				14	planned for the Third Edition.
				15
				16	<p>
				17	Verification can be enabled for all classes, disabled for all, or enabled
				18	only for "remote" (non-bootstrap) classes. It should be performed for any
				19	class that will be processed with the DEX optimizer, and in fact the
				20	default VM behavior is to only optimize verified classes.
				21
				22
				23	<h2>Why Verify?</h2>
				24
				25	<p>
				26	The verification process adds additional time to the build and to
				27	the installation of new applications. It's fairly quick for app-sized
				28	DEX files, but rather slow for the big "core" and "framework" files.
				29	Why do it all, when our system relies on UNIX processes for security?
				30	<p>
				31	<ol>
				32	<li>Optimizations. The interpreter can ignore a lot of potential
				33	error cases because the verifier guarantees that they are impossible.
				34	Also, we can optimize the DEX file more aggressively if we start
				35	with a stronger set of assumptions about the bytecode.
				36	<li>"Exact" GC. The work peformed during verification has significant
				37	overlap with the work required to compute register use maps for exact
				38	GC. Improper register use, caught by the verifier, could lead to
				39	subtle problems with an "exact" GC.
				40	<li>Intra-application security. If an app wants to download bits
				41	of interpreted code over the network and execute them, it can safely
				42	do so using well-established security mechanisms.
				43	<li>3rd party app failure analysis. We have no way to control the
				44	tools and post-processing utilities that external developers employ,
				45	so when we get bug reports with a weird exception or native crash
				46	it's very helpful to start with the assumption that the bytecode
				47	is valid.
				48	</ol>
				49
				50
				51	<h2>Verifier Differences</h2>
				52
				53	<p>
				54	There are a few checks that the Dalvik bytecode verifier does not perform,
				55	because they're not relevant. For example:
				56	<ul>
				57	<li>Type restrictions on constant pool references are not enforced,
				58	because Dalvik does not have a pool of typed constants. (Dalvik
				59	uses a simple index into type-specific pools.)
				60	<li>Verification of the operand stack size is not performed, because
				61	Dalvik does not have an operand stack.
				62	<li>Limitations on <code>jsr</code> and <code>ret</code> do not apply,
				63	because Dalvik doesn't support subroutines.
				64	</ul>
				65
				66	In some cases they are implemented differently, e.g.:
				67	<ul>
				68	<li>In a conventional VM, backward branches and exceptions are
				69	forbidden when a local variable holds an uninitialized reference. The
				70	restriction was changed to mark registers as invalid when they hold
				71	references to the uninitialized result of a previous invocation of the
				72	same <code>new-instance</code> instruction.
				73	This solves the same problem -- trickery potentially allowing
				74	uninitialized objects to slip past the verifier -- without unduly
				75	limiting branches.
				76	</ul>
				77
				78	There are also some new ones, such as:
				79	<ul>
				80	<li>The <code>move-exception</code> instruction can only appear as
				81	the first instruction in an exception handler.
				82	<li>The <code>move-result*</code> instructions can only appear
				83	immediately after an appropriate <code>invoke-*</code>
				84	or <code>filled-new-array</code> instruction.
				85	</ul>
				86
				87	<p>
				88	The Dalvik verifier is more restrictive than other VMs in one area:
				89	type safety on sub-32-bit integer widths. These additional restrictions
				90	should make it impossible to, say, pass a value outside the range
				91	[-128, 127] to a function that takes a <code>byte</code> as an argument.
				92
				93
				94	<h2>Verification Failures</h2>
				95
				96	<p>
				97	When the verifier rejects a class, it always throws a VerifyError.
				98	This is different in some cases from other implementations. For example,
				99	if a class attempts to perform an illegal access on a field, the expected
				100	behavior is to receive an IllegalAccessError at runtime the first time
				101	the field is actually accessed. The Dalvik verifier will reject the
				102	entire class immediately.
				103
				104	<p>
				105	It's difficult to throw the error on first use in Dalvik. Possible ways
				106	to implement this behavior include:
				107
				108	<ol>
				109	<li>We could replace the invalid field access instruction with a special
				110	instruction that generates an illegal access error, and allow class
				111	verification to complete successfully. This type of verification must
				112	often be deferred to first class load, rather than be performed ahead of time
				113	during DEX optimization, which means the bytecode instructions will be
				114	mapped read-only during verification. So this won't work.
				115	</li>
				116
				117	<li>We can perform the access checks when the field/method/class is
				118	resolved. In a typical VM implementation we would do the check when the
				119	entry is resolved in the context of the current classfile, but our DEX
				120	files combine multiple classfiles together, merging the field/method/class
				121	resolution results into a single large table. Once one class successfully
				122	resolves the field, every other class in the same DEX file would be able
				123	to access the field. This is bad.
				124	</li>
				125
				126	<li>Perform the access checks on every field/method/class access.
				127	This adds significant overhead. This is mitigated somewhat by the DEX
				128	optimizer, which will convert many field/method/class accesses into a
				129	simpler form after performing the access check. However, not all accesses
				130	can be optimized (e.g. accesses to classes unknown at dexopt time),
				131	and we don't currently have an optimized form of certain instructions
				132	(notably static field operations).
				133	</li>
				134	</ol>
				135
				136	<p>
				137	Other implementations are possible, but they all involve allocating
				138	some amount of additional memory or spending additional cycles
				139	on non-DEX-optimized instructions. We don't want to throw an
				140	IllegalAccessError at verification time, since that would indicate that
				141	access to the class being verified was illegal.
				142	<p>
				143	One approach that might be worth pursuing: for situations like illegal
				144	accesses, the verifier makes an in-RAM private copy of the method, and
				145	alters the instructions there. The class object is altered to point at
				146	the new copy of the instructions. This requires minimal memory overhead
				147	and provides a better experience for developers.
				148
				149	<p>
				150	The VerifyError is accompanied by detailed, if somewhat cryptic,
				151	information in the log file. From this it's possible to determine the
				152	exact instruction that failed, and the reason for the failure. We can
				153	also constructor the VerifyError with an IllegalAccessError passed in as
				154	the cause.
				155
				156	<address>Copyright © 2008 The Android Open Source Project</address>
				157
				158	</body>
				159	</html>