Fast Art interpreter

Add a Dalvik-style fast interpreter to Art.
Three primary deficiencies in the existing Art interpreter
will be addressed:

1.  Structural inefficiencies (primarily the bloated
    fetch/decode/execute overhead of the C++ interpreter
    implementation).
2.  Stack memory wastage.  Each managed-language invoke
    adds a full copy of the interpreter's compiler-generated
    locals on the shared stack.  We're at the mercy of
    the compiler now in how much memory is wasted here.  An
    assembly based interpreter can manage memory usage more
    effectively.
3.  Shadow frame model, which not only spends twice the memory
    to store the Dalvik virtual registers, but causes vreg stores
    to happen twice.

This CL mostly deals with #1 (but does provide some stack memory
savings).  Subsequent CLs will address the other issues.

Current status:
   Passes all run-tests.
   Phone boots interpret-only.
   2.5x faster than Clang-compiled Art goto interpreter on fetch/decode/execute
       microbenchmark, 5x faster than gcc-compiled goto interpreter.
   1.6x faster than Clang goto on Caffeinemark overall
   2.0x faster than Clang switch on Caffeinemark overall
   68% of Dalvik interpreter performance on Caffeinemark (still much slower,
       primarily because of poor invoke performance and lack of execute-inline)
   Still nearly an order of magnitude slower than Dalvik on invokes
       (but slightly better than Art Clang goto interpreter.
   Importantly, saves ~200 bytes of stack memory per invoke (but still
       wastes ~400 relative to Dalvik).

What's needed:
   Remove the (large quantity of) bring-up hackery in place.
   Integrate into the build mechanism.  I'm still using the old Dalvik manual
       build step to generate assembly code from the stub files.
   Remove the suspend check hack.  For bring-up purposes, I'm using an explicit
       suspend check (like the other Art interpreters).  However, we should be
       doing a Dalvik style suspend check via the table base switch mechanism.
       This should be done during the alternative interpreter activation.
   General cleanup.
   Add CFI info.
   Update the new target bring-up README documentation.
   Add other targets.

In later CLs:
   Consolidate mterp handlers for expensive operations (such as new-instance) with
       the code used by the switch interpreter.  No need to duplicate the code for
       heavyweight operations (but will need some refactoring to align).
   Tuning - some fast paths needs to be moved down to the assembly handlers,
       rather than being dealt with in the out-of-line code.
   JIT profiling.  Currently, the fast interpreter is used only in the fast
       case - no instrumentation, no transactions and no access checks. We
       will want to implement fast + JIT-profiling as the alternate fast
       interpreter.  All other cases can still fall back to the reference
       interpreter.
   Improve invoke performance.  We're nearly an order of magnitude slower than
       Dalvik here.  Some of that is unavoidable, but I suspect we can do
       better.
   Add support for our other targets.

Change-Id: I43e25dc3d786fb87245705ac74a87274ad34fedc
diff --git a/runtime/interpreter/mterp/config_x86 b/runtime/interpreter/mterp/config_x86
new file mode 100644
index 0000000..277817d
--- /dev/null
+++ b/runtime/interpreter/mterp/config_x86
@@ -0,0 +1,298 @@
+# Copyright (C) 2015 The Android Open Source Project
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+#
+# Configuration for X86
+#
+
+handler-style computed-goto
+handler-size 128
+
+# source for alternate entry stub
+asm-alt-stub x86/alt_stub.S
+
+# file header and basic definitions
+import x86/header.S
+
+# arch-specific entry point to interpreter
+import x86/entry.S
+
+# Stub to switch to alternate interpreter
+fallback-stub x86/fallback.S
+
+# opcode list; argument to op-start is default directory
+op-start x86
+    # (override example:) op OP_SUB_FLOAT_2ADDR arm-vfp
+    # (fallback example:) op OP_SUB_FLOAT_2ADDR FALLBACK
+
+    op op_nop FALLBACK
+    op op_move FALLBACK
+    op op_move_from16 FALLBACK
+    op op_move_16 FALLBACK
+    op op_move_wide FALLBACK
+    op op_move_wide_from16 FALLBACK
+    op op_move_wide_16 FALLBACK
+    op op_move_object FALLBACK
+    op op_move_object_from16 FALLBACK
+    op op_move_object_16 FALLBACK
+    op op_move_result FALLBACK
+    op op_move_result_wide FALLBACK
+    op op_move_result_object FALLBACK
+    op op_move_exception FALLBACK
+    op op_return_void FALLBACK
+    op op_return FALLBACK
+    op op_return_wide FALLBACK
+    op op_return_object FALLBACK
+    op op_const_4 FALLBACK
+    op op_const_16 FALLBACK
+    op op_const FALLBACK
+    op op_const_high16 FALLBACK
+    op op_const_wide_16 FALLBACK
+    op op_const_wide_32 FALLBACK
+    op op_const_wide FALLBACK
+    op op_const_wide_high16 FALLBACK
+    op op_const_string FALLBACK
+    op op_const_string_jumbo FALLBACK
+    op op_const_class FALLBACK
+    op op_monitor_enter FALLBACK
+    op op_monitor_exit FALLBACK
+    op op_check_cast FALLBACK
+    op op_instance_of FALLBACK
+    op op_array_length FALLBACK
+    op op_new_instance FALLBACK
+    op op_new_array FALLBACK
+    op op_filled_new_array FALLBACK
+    op op_filled_new_array_range FALLBACK
+    op op_fill_array_data FALLBACK
+    op op_throw FALLBACK
+    op op_goto FALLBACK
+    op op_goto_16 FALLBACK
+    op op_goto_32 FALLBACK
+    op op_packed_switch FALLBACK
+    op op_sparse_switch FALLBACK
+    op op_cmpl_float FALLBACK
+    op op_cmpg_float FALLBACK
+    op op_cmpl_double FALLBACK
+    op op_cmpg_double FALLBACK
+    op op_cmp_long FALLBACK
+    op op_if_eq FALLBACK
+    op op_if_ne FALLBACK
+    op op_if_lt FALLBACK
+    op op_if_ge FALLBACK
+    op op_if_gt FALLBACK
+    op op_if_le FALLBACK
+    op op_if_eqz FALLBACK
+    op op_if_nez FALLBACK
+    op op_if_ltz FALLBACK
+    op op_if_gez FALLBACK
+    op op_if_gtz FALLBACK
+    op op_if_lez FALLBACK
+    op_unused_3e FALLBACK
+    op_unused_3f FALLBACK
+    op_unused_40 FALLBACK
+    op_unused_41 FALLBACK
+    op_unused_42 FALLBACK
+    op_unused_43 FALLBACK
+    op op_aget FALLBACK
+    op op_aget_wide FALLBACK
+    op op_aget_object FALLBACK
+    op op_aget_boolean FALLBACK
+    op op_aget_byte FALLBACK
+    op op_aget_char FALLBACK
+    op op_aget_short FALLBACK
+    op op_aput FALLBACK
+    op op_aput_wide FALLBACK
+    op op_aput_object FALLBACK
+    op op_aput_boolean FALLBACK
+    op op_aput_byte FALLBACK
+    op op_aput_char FALLBACK
+    op op_aput_short FALLBACK
+    op op_iget FALLBACK
+    op op_iget_wide FALLBACK
+    op op_iget_object FALLBACK
+    op op_iget_boolean FALLBACK
+    op op_iget_byte FALLBACK
+    op op_iget_char FALLBACK
+    op op_iget_short FALLBACK
+    op op_iput FALLBACK
+    op op_iput_wide FALLBACK
+    op op_iput_object FALLBACK
+    op op_iput_boolean FALLBACK
+    op op_iput_byte FALLBACK
+    op op_iput_char FALLBACK
+    op op_iput_short FALLBACK
+    op op_sget FALLBACK
+    op op_sget_wide FALLBACK
+    op op_sget_object FALLBACK
+    op op_sget_boolean FALLBACK
+    op op_sget_byte FALLBACK
+    op op_sget_char FALLBACK
+    op op_sget_short FALLBACK
+    op op_sput FALLBACK
+    op op_sput_wide FALLBACK
+    op op_sput_object FALLBACK
+    op op_sput_boolean FALLBACK
+    op op_sput_byte FALLBACK
+    op op_sput_char FALLBACK
+    op op_sput_short FALLBACK
+    op op_invoke_virtual FALLBACK
+    op op_invoke_super FALLBACK
+    op op_invoke_direct FALLBACK
+    op op_invoke_static FALLBACK
+    op op_invoke_interface FALLBACK
+    op op_return_void_no_barrier FALLBACK
+    op op_invoke_virtual_range FALLBACK
+    op op_invoke_super_range FALLBACK
+    op op_invoke_direct_range FALLBACK
+    op op_invoke_static_range FALLBACK
+    op op_invoke_interface_range FALLBACK
+    op_unused_79 FALLBACK
+    op_unused_7a FALLBACK
+    op op_neg_int FALLBACK
+    op op_not_int FALLBACK
+    op op_neg_long FALLBACK
+    op op_not_long FALLBACK
+    op op_neg_float FALLBACK
+    op op_neg_double FALLBACK
+    op op_int_to_long FALLBACK
+    op op_int_to_float FALLBACK
+    op op_int_to_double FALLBACK
+    op op_long_to_int FALLBACK
+    op op_long_to_float FALLBACK
+    op op_long_to_double FALLBACK
+    op op_float_to_int FALLBACK
+    op op_float_to_long FALLBACK
+    op op_float_to_double FALLBACK
+    op op_double_to_int FALLBACK
+    op op_double_to_long FALLBACK
+    op op_double_to_float FALLBACK
+    op op_int_to_byte FALLBACK
+    op op_int_to_char FALLBACK
+    op op_int_to_short FALLBACK
+    op op_add_int FALLBACK
+    op op_sub_int FALLBACK
+    op op_mul_int FALLBACK
+    op op_div_int FALLBACK
+    op op_rem_int FALLBACK
+    op op_and_int FALLBACK
+    op op_or_int FALLBACK
+    op op_xor_int FALLBACK
+    op op_shl_int FALLBACK
+    op op_shr_int FALLBACK
+    op op_ushr_int FALLBACK
+    op op_add_long FALLBACK
+    op op_sub_long FALLBACK
+    op op_mul_long FALLBACK
+    op op_div_long FALLBACK
+    op op_rem_long FALLBACK
+    op op_and_long FALLBACK
+    op op_or_long FALLBACK
+    op op_xor_long FALLBACK
+    op op_shl_long FALLBACK
+    op op_shr_long FALLBACK
+    op op_ushr_long FALLBACK
+    op op_add_float FALLBACK
+    op op_sub_float FALLBACK
+    op op_mul_float FALLBACK
+    op op_div_float FALLBACK
+    op op_rem_float FALLBACK
+    op op_add_double FALLBACK
+    op op_sub_double FALLBACK
+    op op_mul_double FALLBACK
+    op op_div_double FALLBACK
+    op op_rem_double FALLBACK
+    op op_add_int_2addr FALLBACK
+    op op_sub_int_2addr FALLBACK
+    op op_mul_int_2addr FALLBACK
+    op op_div_int_2addr FALLBACK
+    op op_rem_int_2addr FALLBACK
+    op op_and_int_2addr FALLBACK
+    op op_or_int_2addr FALLBACK
+    op op_xor_int_2addr FALLBACK
+    op op_shl_int_2addr FALLBACK
+    op op_shr_int_2addr FALLBACK
+    op op_ushr_int_2addr FALLBACK
+    op op_add_long_2addr FALLBACK
+    op op_sub_long_2addr FALLBACK
+    op op_mul_long_2addr FALLBACK
+    op op_div_long_2addr FALLBACK
+    op op_rem_long_2addr FALLBACK
+    op op_and_long_2addr FALLBACK
+    op op_or_long_2addr FALLBACK
+    op op_xor_long_2addr FALLBACK
+    op op_shl_long_2addr FALLBACK
+    op op_shr_long_2addr FALLBACK
+    op op_ushr_long_2addr FALLBACK
+    op op_add_float_2addr FALLBACK
+    op op_sub_float_2addr FALLBACK
+    op op_mul_float_2addr FALLBACK
+    op op_div_float_2addr FALLBACK
+    op op_rem_float_2addr FALLBACK
+    op op_add_double_2addr FALLBACK
+    op op_sub_double_2addr FALLBACK
+    op op_mul_double_2addr FALLBACK
+    op op_div_double_2addr FALLBACK
+    op op_rem_double_2addr FALLBACK
+    op op_add_int_lit16 FALLBACK
+    op op_rsub_int FALLBACK
+    op op_mul_int_lit16 FALLBACK
+    op op_div_int_lit16 FALLBACK
+    op op_rem_int_lit16 FALLBACK
+    op op_and_int_lit16 FALLBACK
+    op op_or_int_lit16 FALLBACK
+    op op_xor_int_lit16 FALLBACK
+    op op_add_int_lit8 FALLBACK
+    op op_rsub_int_lit8 FALLBACK
+    op op_mul_int_lit8 FALLBACK
+    op op_div_int_lit8 FALLBACK
+    op op_rem_int_lit8 FALLBACK
+    op op_and_int_lit8 FALLBACK
+    op op_or_int_lit8 FALLBACK
+    op op_xor_int_lit8 FALLBACK
+    op op_shl_int_lit8 FALLBACK
+    op op_shr_int_lit8 FALLBACK
+    op op_ushr_int_lit8 FALLBACK
+    op op_iget_quick FALLBACK
+    op op_iget_wide_quick FALLBACK
+    op op_iget_object_quick FALLBACK
+    op op_iput_quick FALLBACK
+    op op_iput_wide_quick FALLBACK
+    op op_iput_object_quick FALLBACK
+    op op_invoke_virtual_quick FALLBACK
+    op op_invoke_virtual_range_quick FALLBACK
+    op op_iput_boolean_quick FALLBACK
+    op op_iput_byte_quick FALLBACK
+    op op_iput_char_quick FALLBACK
+    op op_iput_short_quick FALLBACK
+    op op_iget_boolean_quick FALLBACK
+    op op_iget_byte_quick FALLBACK
+    op op_iget_char_quick FALLBACK
+    op op_iget_short_quick FALLBACK
+    op_unused_f3 FALLBACK
+    op_unused_f4 FALLBACK
+    op_unused_f5 FALLBACK
+    op_unused_f6 FALLBACK
+    op_unused_f7 FALLBACK
+    op_unused_f8 FALLBACK
+    op_unused_f9 FALLBACK
+    op_unused_fa FALLBACK
+    op_unused_fb FALLBACK
+    op_unused_fc FALLBACK
+    op_unused_fd FALLBACK
+    op_unused_fe FALLBACK
+    op_unused_ff FALLBACK
+op-end
+
+# common subroutines for asm
+import x86/footer.S