VIXL Release 1.0

Refer to the README.md and LICENCE files for details.
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..10e2041
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,11 @@
+# ignore python compiled object
+*.pyc
+# ignore vi temporary files
+*.swo
+*.swp
+.sconsign.dblite
+obj/
+cctest*
+bench_*
+libvixl*
+example-*
diff --git a/LICENCE b/LICENCE
new file mode 100644
index 0000000..b7e160a
--- /dev/null
+++ b/LICENCE
@@ -0,0 +1,30 @@
+LICENCE
+=======
+
+The software in this repository is covered by the following licence.
+
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..df3de37
--- /dev/null
+++ b/README.md
@@ -0,0 +1,120 @@
+VIXL: Runtime Code Generation Library Version 1.0
+=================================================
+
+Contents:
+
+ * Requirements
+ * Overview
+ * Known limitations
+ * Usage
+
+
+Requirements
+============
+
+To build VIXL the following software is required:
+
+ 1. Python 2.7
+ 2. Scons 2.0
+ 3. GCC 4.4
+
+A 64-bit host machine implementing the LP64 data model is required. VIXL has
+only been tested using GCC on Ubuntu systems.
+
+To run the linter stage of the tests, the following software is also required:
+
+ 1. Git
+ 2. [Google's `cpplint.py`][cpplint]
+
+Refer to the 'Usage' section for details.
+
+Overview
+========
+
+VIXL is made of three components.
+
+ 1. A programmatic assembler to generate A64 code at runtime. The assembler
+    abstracts some of the constraints of the A64 ISA; for example, most
+    instructions support any immediate.
+ 2. A disassembler which can print any instruction emitted by the assembler.
+ 3. A simulator which can simulate any instruction emitted by the assembler.
+    The simulator allows generated code to be run on another architecture
+    without the need for a full ISA model.
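+
+For a flavour of how these components fit together, here is a minimal sketch
+(modelled on `examples/abs.cc`): it assembles a trivial function, then runs it
+in the simulator.
+
+    #include <stdio.h>
+
+    #include "a64/macro-assembler-a64.h"
+    #include "a64/simulator-a64.h"
+
+    using namespace vixl;
+
+    int main(void) {
+      // Code is generated into a user-provided buffer.
+      byte buffer[4096];
+      MacroAssembler masm(buffer, sizeof(buffer));
+      Decoder decoder;
+      Simulator simulator(&decoder);
+
+      // Generate a function which returns its argument plus one.
+      Label entry;
+      masm.Bind(&entry);
+      masm.Add(x0, x0, 1);
+      masm.Ret();
+      masm.FinalizeCode();
+
+      // Run the generated code in the simulator and read back x0.
+      simulator.set_xreg(0, 41);
+      simulator.RunFrom(entry.target());
+      printf("result: %ld\n", simulator.xreg(0));
+      return 0;
+    }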
+
+
+Known Limitations
+=================
+
+VIXL was developed to target JavaScript engines, so a number of A64 features
+were deemed unnecessary:
+
+ * No Advanced SIMD support.
+ * Limited rounding mode support for floating point.
+ * No support for synchronisation instructions.
+ * Limited support for system instructions.
+ * A few miscellaneous integer and floating point instructions are missing.
+
+The VIXL simulator supports only those instructions that the VIXL assembler can
+generate.
+
+
+Usage
+=====
+
+
+Running all Tests
+-----------------
+
+The helper script `tools/presubmit.py` will build and run every test that is
+provided with VIXL, in both release and debug mode. It is a useful script for
+verifying that all of VIXL's dependencies are in place and that VIXL is working
+as it should.
+
+By default, the `tools/presubmit.py` script runs a linter to check that the
+source code conforms with the code style guide, and to detect several common
+errors that the compiler may not warn about. This is most useful for VIXL
+developers. The linter has the following dependencies:
+
+ 1. Git must be installed, and the VIXL project must be in a valid Git
+    repository, such as one produced using `git clone`.
+ 2. `cpplint.py`, [as provided by Google][cpplint], must be available (and
+    executable) on the `PATH`. Only revision 104 has been tested with VIXL.
+
+It is possible to tell `tools/presubmit.py` to skip the linter stage by passing
+`--nolint`. This removes the dependency on `cpplint.py` and Git. The `--nolint`
+option is implied if the VIXL project is a snapshot (with no `.git` directory).
+
+
+Building and Running Specific Tests
+-----------------------------------
+
+The helper script `tools/test.py` will build and run all the tests for the
+assembler and disassembler in release mode. Add `--mode=debug` to build and run
+in debug mode. The tests can be built separately using SCons: `scons
+target=cctest`.
+
+
+Building and Running the Benchmarks
+-----------------------------------
+
+There are two very basic benchmarks provided with VIXL:
+
+ 1. bench\_dataop, emitting adds
+ 2. bench\_branch, emitting branches
+
+To build one benchmark: `scons target=bench_xxx`, then run it as
+`./bench_xxx_sim <number of iterations>`. The benchmarks do not report a
+figure; they should be timed using the `time` command.
+
+
+Getting Started
+---------------
+
+A short introduction to using VIXL can be found at `doc/getting-started.md`.
+Example source code is provided in the `examples` directory. Build this using
+`scons target=examples` from the root directory.
+
+
+
+[cpplint]: https://google-styleguide.googlecode.com/svn-history/r104/trunk/cpplint/cpplint.py
+           "Google's cpplint.py script."
diff --git a/SConstruct b/SConstruct
new file mode 100644
index 0000000..e8094d8
--- /dev/null
+++ b/SConstruct
@@ -0,0 +1,195 @@
+# Copyright 2013, ARM Limited
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are met:
+#
+#   * Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#   * Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#   * Neither the name of ARM Limited nor the names of its contributors may be
+#     used to endorse or promote products derived from this software without
+#     specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+import os
+import os.path
+import sys
+
+# Global configuration.
+PROJ_SRC_DIR   = 'src'
+PROJ_SRC_FILES = '''
+src/utils.cc
+src/a64/assembler-a64.cc
+src/a64/macro-assembler-a64.cc
+src/a64/instructions-a64.cc
+src/a64/decoder-a64.cc
+src/a64/debugger-a64.cc
+src/a64/disasm-a64.cc
+src/a64/cpu-a64.cc
+src/a64/simulator-a64.cc
+'''.split()
+PROJ_EXAMPLES_DIR = 'examples'
+PROJ_EXAMPLES_SRC_FILES = '''
+examples/debugger.cc
+examples/add3-double.cc
+examples/add4-double.cc
+examples/factorial-rec.cc
+examples/factorial.cc
+examples/sum-array.cc
+examples/abs.cc
+examples/swap4.cc
+examples/swap-int32.cc
+examples/check-bounds.cc
+examples/getting-started.cc
+'''.split()
+# List target specific files.
+# Target names are used as dictionary entries.
+TARGET_SRC_DIR = {
+  'cctest': 'test',
+  'bench_dataop': 'benchmarks',
+  'bench_branch': 'benchmarks',
+  'examples': 'examples'
+}
+TARGET_SRC_FILES = {
+  'cctest': '''
+    test/cctest.cc
+    test/test-utils-a64.cc
+    test/test-assembler-a64.cc
+    test/test-disasm-a64.cc
+    test/examples/test-examples.cc
+    '''.split() + PROJ_EXAMPLES_SRC_FILES,
+  'bench_dataop': '''
+    benchmarks/bench-dataop.cc
+    '''.split(),
+  'bench_branch': '''
+    benchmarks/bench-branch.cc
+    '''.split()
+}
+RELEASE_OBJ_DIR  = 'obj/release'
+DEBUG_OBJ_DIR    = 'obj/debug'
+COVERAGE_OBJ_DIR = 'obj/coverage'
+
+
+# Helper functions.
+def abort(message):
+  print('ABORTING: ' + message)
+  sys.exit(1)
+
+
+def list_target(obj_dir, src_files):
+  return map(lambda x: os.path.join(obj_dir, x), src_files)
+
+
+def create_variant(obj_dir, targets_dir):
+  VariantDir(os.path.join(obj_dir, PROJ_SRC_DIR), PROJ_SRC_DIR)
+  for directory in targets_dir.itervalues():
+    VariantDir(os.path.join(obj_dir, directory), directory)
+
+
+# Build arguments.
+args = Variables()
+args.Add(EnumVariable('mode', 'Build mode', 'release',
+                      allowed_values = ['release', 'debug', 'coverage']))
+args.Add(EnumVariable('target', 'Target to build', 'cctest',
+                      allowed_values = ['cctest',
+                                        'bench_dataop',
+                                        'bench_branch',
+                                        'examples']))
+args.Add(EnumVariable('simulator', 'Build for the simulator', 'on',
+                      allowed_values = ['on', 'off']))
+
+# Configure the environment.
+create_variant(RELEASE_OBJ_DIR, TARGET_SRC_DIR)
+create_variant(DEBUG_OBJ_DIR, TARGET_SRC_DIR)
+create_variant(COVERAGE_OBJ_DIR, TARGET_SRC_DIR)
+env = Environment(variables=args)
+
+# Commandline help.
+Help(args.GenerateHelpText(env))
+
+# Abort if any invalid argument was passed.
+# This check must happen after an environment is created.
+unknown_arg = args.UnknownVariables()
+if unknown_arg:
+  abort('Unknown variable(s): ' + str(unknown_arg.keys()))
+
+# Setup tools.
+# This is necessary for cross-compilation.
+env['CXX'] = os.environ.get('CXX', env.get('CXX'))
+env['AR'] = os.environ.get('AR', env.get('AR'))
+env['RANLIB'] = os.environ.get('RANLIB', env.get('RANLIB'))
+env['CC'] = os.environ.get('CC', env.get('CC'))
+env['LD'] = os.environ.get('LD', env.get('LD'))
+
+env.Append(CPPFLAGS = os.environ.get('CPPFLAGS'))
+
+# Always look in 'src' for include files.
+env.Append(CPPPATH = [PROJ_SRC_DIR])
+env.Append(CPPFLAGS = ['-Wall',
+                       '-Werror',
+                       '-fdiagnostics-show-option',
+                       '-Wextra',
+                       '-pedantic',
+                       # Explicitly enable the write-strings warning. VIXL uses
+                       # const correctly when handling string constants.
+                       '-Wwrite-strings'])
+
+target_program = env['target']
+build_suffix = ''
+
+if env['simulator'] == 'on':
+  env.Append(CPPFLAGS = ['-DUSE_SIMULATOR'])
+  build_suffix += '_sim'
+
+if env['mode'] == 'debug':
+  env.Append(CPPFLAGS = ['-g', '-DDEBUG'])
+  # Append the debug mode suffix to the executable name.
+  build_suffix += '_g'
+  build_dir = DEBUG_OBJ_DIR
+elif env['mode'] == 'coverage':
+  env.Append(CPPFLAGS = ['-g', '-DDEBUG', '-fprofile-arcs', '-ftest-coverage'])
+  env.Append(LINKFLAGS = ['-fprofile-arcs'])
+  # Append the coverage mode suffix to the executable name.
+  build_suffix += '_gcov'
+  build_dir = COVERAGE_OBJ_DIR
+else:
+  # Release mode.
+  env.Append(CPPFLAGS = ['-O3'])
+  build_dir = RELEASE_OBJ_DIR
+  # GCC 4.8 has a bug which produces a warning saying that an anonymous Operand
+  # object might be used uninitialized:
+  #   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57045
+  # The bug does not seem to appear in GCC 4.7, or in debug builds with GCC 4.8.
+  env.Append(CPPFLAGS = ['-Wno-maybe-uninitialized'])
+
+
+if target_program == 'cctest':
+  env.Append(CPPPATH = [PROJ_EXAMPLES_DIR])
+  env.Append(CPPFLAGS = ['-DTEST_EXAMPLES'])
+
+# Build the library.
+proj_library = env.Library('vixl' + build_suffix, list_target(build_dir, PROJ_SRC_FILES))
+
+if target_program == 'examples':
+  # Build the examples.
+  env.Append(CPPPATH = [PROJ_EXAMPLES_DIR])
+  for example in PROJ_EXAMPLES_SRC_FILES:
+    example_name = "example-" + os.path.splitext(os.path.basename(example))[0]
+    env.Program(example_name, list_target(build_dir, [example]) + proj_library)
+else:
+  # Build the target program.
+  program_target_files = list_target(build_dir, TARGET_SRC_FILES[env['target']])
+  env.Program(target_program + build_suffix, program_target_files + proj_library)
diff --git a/benchmarks/bench-branch.cc b/benchmarks/bench-branch.cc
new file mode 100644
index 0000000..247a712
--- /dev/null
+++ b/benchmarks/bench-branch.cc
@@ -0,0 +1,82 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "a64/macro-assembler-a64.h"
+#include "a64/instructions-a64.h"
+#include "globals.h"
+
+using namespace vixl;
+
+static const unsigned kDefaultInstructionCount = 100000;
+
+// This program focuses on emitting branch instructions.
+//
+// This code will emit a given number of branch-immediate instructions, each
+// branching to the next instruction, in a fixed-size buffer, looping over the
+// buffer if necessary. It therefore focuses on Emit and label binding/patching.
+int main(int argc, char* argv[]) {
+  unsigned instructions = 0;
+
+  switch (argc) {
+    case 1: instructions = kDefaultInstructionCount; break;
+    case 2: instructions = atoi(argv[1]); break;
+    default:
+      printf("Usage: %s [#instructions]\n", argv[0]);
+      exit(1);
+  }
+
+  const unsigned buffer_size = 256 * KBytes;
+  // Emitting on the last word of the buffer will trigger an assert.
+  const unsigned buffer_instruction_count = buffer_size / kInstructionSize - 1;
+
+  byte* assm_buffer = new byte[buffer_size];
+  MacroAssembler* masm = new MacroAssembler(assm_buffer, buffer_size);
+
+  #define __ masm->
+  // We emit a branch to the next instruction.
+
+  unsigned rounds = instructions / buffer_instruction_count;
+  for (unsigned i = 0; i < rounds; ++i) {
+    for (unsigned j = 0; j < buffer_instruction_count; ++j) {
+      Label target;
+      __ b(&target);
+      __ bind(&target);
+    }
+    masm->Reset();
+  }
+
+  unsigned remaining = instructions % buffer_instruction_count;
+  for (unsigned i = 0; i < remaining; ++i) {
+    Label target;
+    __ b(&target);
+    __ bind(&target);
+  }
+
+  delete masm;
+  delete[] assm_buffer;
+
+  return 0;
+}
diff --git a/benchmarks/bench-dataop.cc b/benchmarks/bench-dataop.cc
new file mode 100644
index 0000000..086a51c
--- /dev/null
+++ b/benchmarks/bench-dataop.cc
@@ -0,0 +1,77 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "a64/macro-assembler-a64.h"
+#include "a64/instructions-a64.h"
+#include "globals.h"
+
+using namespace vixl;
+
+static const unsigned kDefaultInstructionCount = 100000;
+
+// This program focuses on emitting simple instructions.
+//
+// This code will emit a given number of 'add x0, x1, x2' instructions in a
+// fixed-size buffer, looping over the buffer if necessary. It therefore
+// focuses on Emit and Operand.
+int main(int argc, char* argv[]) {
+  unsigned instructions = 0;
+
+  switch (argc) {
+    case 1: instructions = kDefaultInstructionCount; break;
+    case 2: instructions = atoi(argv[1]); break;
+    default:
+      printf("Usage: %s [#instructions]\n", argv[0]);
+      exit(1);
+  }
+
+  const unsigned buffer_size = 256 * KBytes;
+  // Emitting on the last word of the buffer will trigger an assert.
+  const unsigned buffer_instruction_count = buffer_size / kInstructionSize - 1;
+
+  byte* assm_buffer = new byte[buffer_size];
+  MacroAssembler* masm = new MacroAssembler(assm_buffer, buffer_size);
+
+  #define __ masm->
+
+  unsigned rounds = instructions / buffer_instruction_count;
+  for (unsigned i = 0; i < rounds; ++i) {
+    for (unsigned j = 0; j < buffer_instruction_count; ++j) {
+      __ add(x0, x1, Operand(x2));
+    }
+    masm->Reset();
+  }
+
+  unsigned remaining = instructions % buffer_instruction_count;
+  for (unsigned i = 0; i < remaining; ++i) {
+    __ add(x0, x1, Operand(x2));
+  }
+
+  delete masm;
+  delete[] assm_buffer;
+
+  return 0;
+}
diff --git a/doc/getting-started.md b/doc/getting-started.md
new file mode 100644
index 0000000..7d32a32
--- /dev/null
+++ b/doc/getting-started.md
@@ -0,0 +1,206 @@
+Getting Started with VIXL
+=========================
+
+
+This guide will show you how to use the VIXL framework. We will see how to set
+up the VIXL assembler and generate some code. We will also go into detail on a
+few useful features provided by VIXL and see how to run the generated code in
+the VIXL simulator.
+
+The source code of the example developed in this guide can be found in the
+`examples` directory (`examples/getting-started.cc`).
+
+
+Creating the macro assembler and the simulator.
+-----------------------------------------------
+
+First of all you need to make sure that the header files for the assembler and
+the simulator are included. You should have the following lines at the beginning
+of your source file:
+
+    #include "a64/simulator-a64.h"
+    #include "a64/macro-assembler-a64.h"
+
+VIXL's assembler will generate some code at run-time, and this code needs to
+be stored in a buffer. It must be large enough to contain all of the
+instructions and data that will be generated. In this guide we will use a
+buffer size of 4096 bytes, but you are free to change it to something that
+suits your needs.
+
+    #define BUF_SIZE (4096)
+
+All VIXL components are declared in the `vixl` namespace, so let's add this to
+the beginning of the file for convenience:
+
+    using namespace vixl;
+
+Now we are ready to create and initialize the different components.
+
+First of all, we need to allocate the code buffer and create a macro assembler
+object that uses this buffer.
+
+    byte assm_buf[BUF_SIZE];
+    MacroAssembler masm(assm_buf, BUF_SIZE);
+
+We also need to set up the simulator. The simulator uses a Decoder object to
+read and decode the instructions from the code buffer. We need to create a
+decoder and bind our simulator to this decoder.
+
+    Decoder decoder;
+    Simulator simulator(&decoder);
+
+
+Generating some code.
+---------------------
+
+We are now ready to generate some code. The macro assembler provides a method
+for each instruction that you can generate. Because it is a macro assembler, an
+instruction that you ask it to generate may not map directly to a single
+hardware instruction. Instead, it can produce a short sequence of instructions
+that has the same effect.
+
+For instance, the hardware `add` instruction can only take a 12-bit immediate
+optionally shifted by 12, but the macro assembler can generate one or more
+instructions to handle any 64-bit immediate. For example, `Add(x0, x0, -1)`
+will be turned into `Sub(x0, x0, 1)`.
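+
+As a rough illustration (this fragment is not part of the example source), the
+same macro-assembler call can therefore expand to a different number of
+hardware instructions depending on its immediate:
+
+    masm.Add(x0, x1, 0x123);               // Fits the 12-bit immediate form:
+                                           // a single hardware instruction.
+    masm.Add(x0, x1, 0x123456789abcdef0);  // Too big for any immediate form;
+                                           // the value is materialised first,
+                                           // so several instructions are used.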
+
+Before looking at how to generate some code, let's introduce a simple but handy
+macro:
+
+    #define __ masm->
+
+It allows us to write `__ Mov(x0, 42);` instead of `masm->Mov(x0, 42);` to
+generate code.
+
+Now we are going to write a C++ function to generate our first assembly
+code fragment.
+
+    void GenerateDemoFunction(MacroAssembler *masm) {
+      __ Ldr(x1, 0x1122334455667788);
+      __ And(x0, x0, x1);
+      __ Ret();
+    }
+
+The generated code corresponds to a function with the following C prototype:
+
+    uint64_t demo_function(uint64_t x);
+
+This function doesn't perform any useful operation. It loads the value
+0x1122334455667788 into x1 and performs a bitwise `and` operation with
+the function's argument (stored in x0). The result of this `and` operation
+is returned by the function in x0.
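+
+For comparison, a plain C++ equivalent of what the generated code computes
+would be:
+
+    uint64_t demo_function(uint64_t x) {
+      return x & 0x1122334455667788;
+    }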
+
+Now, in our program's main function, we only need to create a label to
+represent the entry point of the assembly function and call
+`GenerateDemoFunction` to generate the code.
+
+    Label demo_function;
+    masm.Bind(&demo_function);
+    GenerateDemoFunction(&masm);
+    masm.FinalizeCode();
+
+Now we are going to learn a bit more about a couple of interesting VIXL
+features used in this example.
+
+### Label
+
+VIXL's assembler provides a mechanism to represent labels with `Label` objects.
+They are easy to use: simply create the C++ object and bind it to a location in
+the generated instruction stream.
+
+Creating a label is easy, since you only need to define the variable and bind it
+to a location using the macro assembler.
+
+    Label my_label;      // Create the label object.
+    __ Bind(&my_label);  // Bind it to the current location.
+
+The target of a branch using a label will be the address to which it has been
+bound. For example, let's consider the following code fragment:
+
+    Label foo;
+
+    __ B(&foo);     // Branch to foo.
+    __ Mov(x0, 42);
+    __ Bind(&foo);  // Actual address of foo is here.
+    __ Mov(x1, 0xc001);
+
+If we run this code fragment, the `Mov(x0, 42)` will never be executed, since
+the first thing this code does is jump to `foo`, which corresponds to the
+`Mov(x1, 0xc001)` instruction.
+
+When working with labels, be aware that they are only to be used for local
+branches, and should be passed around with care (by pointer, as in the sketch
+after the list below). There are two reasons for this:
+
+  - They can't safely be passed or returned by value because this can trigger
+    multiple constructor and destructor calls. The destructor has assertions
+    to check that we don't try to branch to a label that hasn't been bound.
+
+  - A branch cannot reach a label that is out of its range. The `B` instruction
+    has a range of 2^28 bytes, but other variants (such as conditional or
+    `CBZ`-like branches) have smaller ranges. Keeping labels local doesn't
+    guarantee that we won't hit these limits, but it makes the lifetime of each
+    label much shorter and eases the debugging of this kind of issue.
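+
+Here is a sketch of the recommended pattern, passing `Label` objects by pointer
+just as the assembler's own `B` and `Bind` calls do (the loop itself is made up
+for illustration):
+
+    void GenerateCountdown(MacroAssembler* masm, Label* loop) {
+      __ Mov(x0, 10);
+      __ Bind(loop);      // Bind the caller-provided label here.
+      __ Sub(x0, x0, 1);
+      __ Cbnz(x0, loop);  // Local backward branch while x0 is not zero.
+      __ Ret();
+    }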
+
+
+### Literal Pool
+
+On ARMv8, instructions are 32 bits long, so immediate values encoded in an
+instruction have a limited size. If you want to load a constant bigger than
+this limit, you have two possibilities:
+
+1. Use multiple instructions to build the constant in multiple steps. This
+  solution is already handled by VIXL. For instance, you can write:
+
+  `__ Mov(x0, 0x1122334455667788);`
+
+  The previous instruction would not be legal since the immediate value is too
+  big. However, VIXL's macro assembler will automatically rewrite this line into
+  multiple instructions to efficiently generate the value.
+
+
+2. Store the constant in memory and load it from memory. The value
+  needs to be written near the code that will load it since we use a PC-relative
+  offset to indicate the address of this value. This solution has the advantage
+  of making the value easily modifiable at run-time; since it does not reside
+  in the instruction stream, it doesn't require cache maintenance when updated.
+
+  VIXL also provides a way to do this:
+
+  `__ Ldr(x0, 0x1122334455667788);`
+
+  The assembler will store the immediate value in a "literal pool", a set of
+  constants embedded in the code. VIXL will emit literal pools after natural
+  breaks in the control flow, such as unconditional branches or return
+  instructions.
+
+  Literal pools are emitted regularly, such that they are within range of the
+  instructions that refer to them. However, you can force a literal pool to be
+  emitted using `masm.EmitLiteralPool()`.
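+
+Putting the two approaches side by side, here is a short sketch (the constants
+are arbitrary, and the fragment is assumed to live in a code-generation
+function where `__` expands to `masm->`):
+
+    __ Mov(x0, 0x1122334455667788);  // Synthesised with a short sequence of
+                                     // move instructions.
+    __ Ldr(x1, 0x1122334455667788);  // Placed in the literal pool and loaded
+                                     // with a PC-relative load.
+    __ Ret();                        // A natural break in the control flow:
+                                     // the pool can be emitted after it.
+    __ EmitLiteralPool();            // Alternatively, force emission here.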
+
+
+Running the code in the simulator.
+----------------------------------
+
+Now we are going to see how to use the simulator to run the code that we
+generated previously.
+
+First, use the simulator to assign values to the input registers. Our previous
+code example uses register x0 as an input, so let's set its value.
+
+    simulator.set_xreg(0, 0x8899aabbccddeeff);
+
+Now we can jump to the `demo_function` label to execute the code:
+
+    simulator.RunFrom(demo_function.target());
+
+When the execution has finished and the simulator has returned, you can
+inspect the values of the registers. For instance:
+
+    printf("x0 = %" PRIx64 "\n", simulator.xreg(0));
+
+The example shown in this tutorial is very simple, because the goal was to
+demonstrate the basics of the VIXL framework. There are more complex code
+examples in the VIXL `examples` directory showing more features of both the
+macro assembler and the ARMv8 architecture.
diff --git a/doc/supported-instructions.md b/doc/supported-instructions.md
new file mode 100644
index 0000000..90d63ec
--- /dev/null
+++ b/doc/supported-instructions.md
@@ -0,0 +1,1133 @@
+VIXL Supported Instruction List
+===============================
+
+This is a list of the AArch64 instructions supported by the VIXL assembler,
+disassembler and simulator. The simulator may not support all floating point
+operations to the precision required by AArch64 - please check the simulator
+source code for details.
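+
+Each entry below gives the C++ signature of the corresponding assembler method.
+As a brief, illustrative sketch of how such methods are invoked (assuming a
+`MacroAssembler` object `masm`, and using the register and `Operand` types
+shown throughout this list):
+
+    masm.add(x0, x1, Operand(x2));  // x0 = x1 + x2
+    masm.cmp(x0, Operand(42));      // Set the flags from x0 - 42.
+    masm.csel(x0, x1, x2, eq);      // x0 = (x0 == 42) ? x1 : x2
+    masm.ret();                     // Return, using lr by default.
+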
+
+AArch64 integer instructions
+----------------------------
+
+### adc ###
+
+Add with carry bit.
+
+    void adc(const Register& rd,
+             const Register& rn,
+             const Operand& operand,
+             FlagsUpdate S = LeaveFlags)
+
+
+### add ###
+
+Add.
+
+    void add(const Register& rd,
+             const Register& rn,
+             const Operand& operand,
+             FlagsUpdate S = LeaveFlags)
+
+
+### adr ###
+
+Calculate the address of a PC offset.
+
+    void adr(const Register& rd, int imm21)
+
+
+### adr ###
+
+Calculate the address of a label.
+
+    void adr(const Register& rd, Label* label)
+
+
+### asr ###
+
+Arithmetic shift right.
+
+    inline void asr(const Register& rd, const Register& rn, unsigned shift)
+
+
+### asrv ###
+
+Arithmetic shift right by variable.
+
+    void asrv(const Register& rd, const Register& rn, const Register& rm)
+
+
+### b ###
+
+Branch to PC offset.
+
+    void b(int imm26, Condition cond = al)
+
+
+### b ###
+
+Branch to label.
+
+    void b(Label* label, Condition cond = al)
+
+
+### bfi ###
+
+Bitfield insert.
+
+    inline void bfi(const Register& rd,
+                    const Register& rn,
+                    unsigned lsb,
+                    unsigned width)
+
+
+### bfm ###
+
+Bitfield move.
+
+    void bfm(const Register& rd,
+             const Register& rn,
+             unsigned immr,
+             unsigned imms)
+
+
+### bfxil ###
+
+Bitfield extract and insert low.
+
+    inline void bfxil(const Register& rd,
+                      const Register& rn,
+                      unsigned lsb,
+                      unsigned width)
+
+
+### bic ###
+
+Bit clear (A & ~B).
+
+    void bic(const Register& rd,
+             const Register& rn,
+             const Operand& operand,
+             FlagsUpdate S = LeaveFlags)
+
+
+### bl ###
+
+Branch with link to PC offset.
+
+    void bl(int imm26)
+
+
+### bl ###
+
+Branch with link to label.
+
+    void bl(Label* label)
+
+
+### blr ###
+
+Branch with link to register.
+
+    void blr(const Register& xn)
+
+
+### br ###
+
+Branch to register.
+
+    void br(const Register& xn)
+
+
+### brk ###
+
+Monitor debug-mode breakpoint.
+
+    void brk(int code)
+
+
+### cbnz ###
+
+Compare and branch to PC offset if not zero.
+
+    void cbnz(const Register& rt, int imm19)
+
+
+### cbnz ###
+
+Compare and branch to label if not zero.
+
+    void cbnz(const Register& rt, Label* label)
+
+
+### cbz ###
+
+Compare and branch to PC offset if zero.
+
+    void cbz(const Register& rt, int imm19)
+
+
+### cbz ###
+
+Compare and branch to label if zero.
+
+    void cbz(const Register& rt, Label* label)
+
+
+### ccmn ###
+
+Conditional compare negative.
+
+    void ccmn(const Register& rn,
+              const Operand& operand,
+              StatusFlags nzcv,
+              Condition cond)
+
+
+### ccmp ###
+
+Conditional compare.
+
+    void ccmp(const Register& rn,
+              const Operand& operand,
+              StatusFlags nzcv,
+              Condition cond)
+
+
+### cinc ###
+
+Conditional increment: rd = cond ? rn + 1 : rn.
+
+    void cinc(const Register& rd, const Register& rn, Condition cond)
+
+
+### cinv ###
+
+Conditional invert: rd = cond ? ~rn : rn.
+
+    void cinv(const Register& rd, const Register& rn, Condition cond)
+
+
+### cls ###
+
+Count leading sign bits.
+
+    void cls(const Register& rd, const Register& rn)
+
+
+### clz ###
+
+Count leading zeroes.
+
+    void clz(const Register& rd, const Register& rn)
+
+
+### cmn ###
+
+Compare negative.
+
+    void cmn(const Register& rn, const Operand& operand)
+
+
+### cmp ###
+
+Compare.
+
+    void cmp(const Register& rn, const Operand& operand)
+
+
+### cneg ###
+
+Conditional negate: rd = cond ? -rn : rn.
+
+    void cneg(const Register& rd, const Register& rn, Condition cond)
+
+
+### csel ###
+
+Conditional select: rd = cond ? rn : rm.
+
+    void csel(const Register& rd,
+              const Register& rn,
+              const Register& rm,
+              Condition cond)
+
+
+### cset ###
+
+Conditional set: rd = cond ? 1 : 0.
+
+    void cset(const Register& rd, Condition cond)
+
+
+### csetm ###
+
+Conditional set mask: rd = cond ? -1 : 0.
+
+    void csetm(const Register& rd, Condition cond)
+
+
+### csinc ###
+
+Conditional select increment: rd = cond ? rn : rm + 1.
+
+    void csinc(const Register& rd,
+               const Register& rn,
+               const Register& rm,
+               Condition cond)
+
+
+### csinv ###
+
+Conditional select inversion: rd = cond ? rn : ~rm.
+
+    void csinv(const Register& rd,
+               const Register& rn,
+               const Register& rm,
+               Condition cond)
+
+
+### csneg ###
+
+Conditional select negation: rd = cond ? rn : -rm.
+
+    void csneg(const Register& rd,
+               const Register& rn,
+               const Register& rm,
+               Condition cond)
+
+
+### eon ###
+
+Bitwise enor/xnor (A ^ ~B).
+
+    void eon(const Register& rd, const Register& rn, const Operand& operand)
+
+
+### eor ###
+
+Bitwise eor/xor (A ^ B).
+
+    void eor(const Register& rd, const Register& rn, const Operand& operand)
+
+
+### extr ###
+
+Extract.
+
+    void extr(const Register& rd,
+              const Register& rn,
+              const Register& rm,
+              unsigned lsb)
+
+
+### hint ###
+
+System hint.
+
+    void hint(SystemHint code)
+
+
+### hlt ###
+
+Halting debug-mode breakpoint.
+
+    void hlt(int code)
+
+
+### ldnp ###
+
+Load integer or FP register pair, non-temporal.
+
+    void ldnp(const CPURegister& rt, const CPURegister& rt2,
+              const MemOperand& src)
+
+
+### ldp ###
+
+Load integer or FP register pair.
+
+    void ldp(const CPURegister& rt, const CPURegister& rt2,
+             const MemOperand& src)
+
+
+### ldpsw ###
+
+Load word pair with sign extension.
+
+    void ldpsw(const Register& rt, const Register& rt2, const MemOperand& src)
+
+
+### ldr ###
+
+Load integer or FP register.
+
+    void ldr(const CPURegister& rt, const MemOperand& src)
+
+
+### ldr ###
+
+Load literal to FP register.
+
+    void ldr(const FPRegister& ft, double imm)
+
+
+### ldr ###
+
+Load literal to register.
+
+    void ldr(const Register& rt, uint64_t imm)
+
+
+### ldrb ###
+
+Load byte.
+
+    void ldrb(const Register& rt, const MemOperand& src)
+
+
+### ldrh ###
+
+Load half-word.
+
+    void ldrh(const Register& rt, const MemOperand& src)
+
+
+### ldrsb ###
+
+Load byte with sign extension.
+
+    void ldrsb(const Register& rt, const MemOperand& src)
+
+
+### ldrsh ###
+
+Load half-word with sign extension.
+
+    void ldrsh(const Register& rt, const MemOperand& src)
+
+
+### ldrsw ###
+
+Load word with sign extension.
+
+    void ldrsw(const Register& rt, const MemOperand& src)
+
+
+### lsl ###
+
+Logical shift left.
+
+    inline void lsl(const Register& rd, const Register& rn, unsigned shift)
+
+
+### lslv ###
+
+Logical shift left by variable.
+
+    void lslv(const Register& rd, const Register& rn, const Register& rm)
+
+
+### lsr ###
+
+Logical shift right.
+
+    inline void lsr(const Register& rd, const Register& rn, unsigned shift)
+
+
+### lsrv ###
+
+Logical shift right by variable.
+
+    void lsrv(const Register& rd, const Register& rn, const Register& rm)
+
+
+### madd ###
+
+Multiply and accumulate.
+
+    void madd(const Register& rd,
+              const Register& rn,
+              const Register& rm,
+              const Register& ra)
+
+
+### mneg ###
+
+Negated multiply.
+
+    void mneg(const Register& rd, const Register& rn, const Register& rm)
+
+
+### mov ###
+
+Move register to register.
+
+    void mov(const Register& rd, const Register& rn)
+
+
+### movk ###
+
+Move immediate and keep.
+
+    void movk(const Register& rd, uint64_t imm, int shift = -1)
+
+
+### movn ###
+
+Move inverted immediate.
+
+    void movn(const Register& rd, uint64_t imm, int shift = -1)
+
+
+### movz ###
+
+Move immediate.
+
+    void movz(const Register& rd, uint64_t imm, int shift = -1)
+
+
+### mrs ###
+
+Move to register from system register.
+
+    void mrs(const Register& rt, SystemRegister sysreg)
+
+
+### msr ###
+
+Move from register to system register.
+
+    void msr(SystemRegister sysreg, const Register& rt)
+
+
+### msub ###
+
+Multiply and subtract.
+
+    void msub(const Register& rd,
+              const Register& rn,
+              const Register& rm,
+              const Register& ra)
+
+
+### mul ###
+
+Multiply.
+
+    void mul(const Register& rd, const Register& rn, const Register& rm)
+
+
+### mvn ###
+
+Move inverted operand to register.
+
+    void mvn(const Register& rd, const Operand& operand)
+
+
+### neg ###
+
+Negate.
+
+    void neg(const Register& rd,
+             const Operand& operand,
+             FlagsUpdate S = LeaveFlags)
+
+
+### ngc ###
+
+Negate with carry bit.
+
+    void ngc(const Register& rd,
+             const Operand& operand,
+             FlagsUpdate S = LeaveFlags)
+
+
+### nop ###
+
+No-op.
+
+    void nop()
+
+
+### orn ###
+
+Bitwise nor (A | ~B).
+
+    void orn(const Register& rd, const Register& rn, const Operand& operand)
+
+
+### orr ###
+
+Bitwise or (A | B).
+
+    void orr(const Register& rd, const Register& rn, const Operand& operand)
+
+
+### rbit ###
+
+Bit reverse.
+
+    void rbit(const Register& rd, const Register& rn)
+
+
+### ret ###
+
+Branch to register with return hint.
+
+    void ret(const Register& xn = lr)
+
+
+### rev ###
+
+Reverse bytes.
+
+    void rev(const Register& rd, const Register& rn)
+
+
+### rev16 ###
+
+Reverse bytes in 16-bit half words.
+
+    void rev16(const Register& rd, const Register& rn)
+
+
+### rev32 ###
+
+Reverse bytes in 32-bit words.
+
+    void rev32(const Register& rd, const Register& rn)
+
+
+### ror ###
+
+Rotate right.
+
+    inline void ror(const Register& rd, const Register& rs, unsigned shift)
+
+
+### rorv ###
+
+Rotate right by variable.
+
+    void rorv(const Register& rd, const Register& rn, const Register& rm)
+
+
+### sbc ###
+
+Subtract with carry bit.
+
+    void sbc(const Register& rd,
+             const Register& rn,
+             const Operand& operand,
+             FlagsUpdate S = LeaveFlags)
+
+
+### sbfiz ###
+
+Signed bitfield insert with zero at right.
+
+    inline void sbfiz(const Register& rd,
+                      const Register& rn,
+                      unsigned lsb,
+                      unsigned width)
+
+
+### sbfm ###
+
+Signed bitfield move.
+
+    void sbfm(const Register& rd,
+              const Register& rn,
+              unsigned immr,
+              unsigned imms)
+
+
+### sbfx ###
+
+Signed bitfield extract.
+
+    inline void sbfx(const Register& rd,
+                     const Register& rn,
+                     unsigned lsb,
+                     unsigned width)
+
+
+### scvtf ###
+
+Convert signed integer or fixed point to FP.
+
+    void scvtf(const FPRegister& fd, const Register& rn, unsigned fbits = 0)
+
+
+### sdiv ###
+
+Signed integer divide.
+
+    void sdiv(const Register& rd, const Register& rn, const Register& rm)
+
+
+### smaddl ###
+
+Signed long multiply and accumulate: 32 x 32 + 64 -> 64-bit.
+
+    void smaddl(const Register& rd,
+                const Register& rn,
+                const Register& rm,
+                const Register& ra)
+
+
+### smsubl ###
+
+Signed long multiply and subtract: 64 - (32 x 32) -> 64-bit.
+
+    void smsubl(const Register& rd,
+                const Register& rn,
+                const Register& rm,
+                const Register& ra)
+
+
+### smulh ###
+
+Signed multiply high: 64 x 64 -> 64-bit <127:64>.
+
+    void smulh(const Register& xd, const Register& xn, const Register& xm)
+
+
+### smull ###
+
+Signed long multiply: 32 x 32 -> 64-bit.
+
+    void smull(const Register& rd, const Register& rn, const Register& rm)
+
+
+### stnp ###
+
+Store integer or FP register pair, non-temporal.
+
+    void stnp(const CPURegister& rt, const CPURegister& rt2,
+              const MemOperand& dst)
+
+
+### stp ###
+
+Store integer or FP register pair.
+
+    void stp(const CPURegister& rt, const CPURegister& rt2,
+             const MemOperand& dst)
+
+
+### str ###
+
+Store integer or FP register.
+
+    void str(const CPURegister& rt, const MemOperand& dst)
+
+
+### strb ###
+
+Store byte.
+
+    void strb(const Register& rt, const MemOperand& dst)
+
+
+### strh ###
+
+Store half-word.
+
+    void strh(const Register& rt, const MemOperand& dst)
+
+
+### sub ###
+
+Subtract.
+
+    void sub(const Register& rd,
+             const Register& rn,
+             const Operand& operand,
+             FlagsUpdate S = LeaveFlags)
+
+
+### sxtb ###
+
+Signed extend byte.
+
+    inline void sxtb(const Register& rd, const Register& rn)
+
+
+### sxth ###
+
+Signed extend halfword.
+
+    inline void sxth(const Register& rd, const Register& rn)
+
+
+### sxtw ###
+
+Signed extend word.
+
+    inline void sxtw(const Register& rd, const Register& rn)
+
+
+### tbnz ###
+
+Test bit and branch to PC offset if not zero.
+
+    void tbnz(const Register& rt, unsigned bit_pos, int imm14)
+
+
+### tbnz ###
+
+Test bit and branch to label if not zero.
+
+    void tbnz(const Register& rt, unsigned bit_pos, Label* label)
+
+
+### tbz ###
+
+Test bit and branch to PC offset if zero.
+
+    void tbz(const Register& rt, unsigned bit_pos, int imm14)
+
+
+### tbz ###
+
+Test bit and branch to label if zero.
+
+    void tbz(const Register& rt, unsigned bit_pos, Label* label)
+
+
+### tst ###
+
+Bit test and set flags.
+
+    void tst(const Register& rn, const Operand& operand)
+
+
+### ubfiz ###
+
+Unsigned bitfield insert with zero at right.
+
+    inline void ubfiz(const Register& rd,
+                      const Register& rn,
+                      unsigned lsb,
+                      unsigned width)
+
+
+### ubfm ###
+
+Unsigned bitfield move.
+
+    void ubfm(const Register& rd,
+              const Register& rn,
+              unsigned immr,
+              unsigned imms)
+
+
+### ubfx ###
+
+Unsigned bitfield extract.
+
+    inline void ubfx(const Register& rd,
+                     const Register& rn,
+                     unsigned lsb,
+                     unsigned width)
+
+
+### ucvtf ###
+
+Convert unsigned integer or fixed point to FP.
+
+    void ucvtf(const FPRegister& fd, const Register& rn, unsigned fbits = 0)
+
+
+### udiv ###
+
+Unsigned integer divide.
+
+    void udiv(const Register& rd, const Register& rn, const Register& rm)
+
+
+### umaddl ###
+
+Unsigned long multiply and accumulate: 32 x 32 + 64 -> 64-bit.
+
+    void umaddl(const Register& rd,
+                const Register& rn,
+                const Register& rm,
+                const Register& ra)
+
+
+### umsubl ###
+
+Unsigned long multiply and subtract: 64 - (32 x 32) -> 64-bit.
+
+    void umsubl(const Register& rd,
+                const Register& rn,
+                const Register& rm,
+                const Register& ra)
+
+
+### uxtb ###
+
+Unsigned extend byte.
+
+    inline void uxtb(const Register& rd, const Register& rn)
+
+
+### uxth ###
+
+Unsigned extend halfword.
+
+    inline void uxth(const Register& rd, const Register& rn)
+
+
+### uxtw ###
+
+Unsigned extend word.
+
+    inline void uxtw(const Register& rd, const Register& rn)
+
+
+
+AArch64 floating point instructions
+-----------------------------------
+
+### fabs ###
+
+FP absolute.
+
+    void fabs(const FPRegister& fd, const FPRegister& fn)
+
+
+### fadd ###
+
+FP add.
+
+    void fadd(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm)
+
+
+### fccmp ###
+
+FP conditional compare.
+
+    void fccmp(const FPRegister& fn,
+               const FPRegister& fm,
+               StatusFlags nzcv,
+               Condition cond)
+
+
+### fcmp ###
+
+FP compare immediate.
+
+    void fcmp(const FPRegister& fn, double value)
+
+
+### fcmp ###
+
+FP compare registers.
+
+    void fcmp(const FPRegister& fn, const FPRegister& fm)
+
+
+### fcsel ###
+
+FP conditional select.
+
+    void fcsel(const FPRegister& fd,
+               const FPRegister& fn,
+               const FPRegister& fm,
+               Condition cond)
+
+
+### fcvt ###
+
+FP convert single to double precision.
+
+    void fcvt(const FPRegister& fd, const FPRegister& fn)
+
+
+### fcvtms ###
+
+Convert FP to signed integer (round towards -infinity).
+
+    void fcvtms(const Register& rd, const FPRegister& fn)
+
+
+### fcvtmu ###
+
+Convert FP to unsigned integer (round towards -infinity).
+
+    void fcvtmu(const Register& rd, const FPRegister& fn)
+
+
+### fcvtns ###
+
+Convert FP to signed integer (nearest with ties to even).
+
+    void fcvtns(const Register& rd, const FPRegister& fn)
+
+
+### fcvtnu ###
+
+Convert FP to unsigned integer (nearest with ties to even).
+
+    void fcvtnu(const Register& rd, const FPRegister& fn)
+
+
+### fcvtzs ###
+
+Convert FP to signed integer (round towards zero).
+
+    void fcvtzs(const Register& rd, const FPRegister& fn)
+
+
+### fcvtzu ###
+
+Convert FP to unsigned integer (round towards zero).
+
+    void fcvtzu(const Register& rd, const FPRegister& fn)
+
+
+### fdiv ###
+
+FP divide.
+
+    void fdiv(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm)
+
+
+### fmax ###
+
+FP maximum.
+
+    void fmax(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm)
+
+
+### fmin ###
+
+FP minimum.
+
+    void fmin(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm)
+
+
+### fmov ###
+
+Move FP register to FP register.
+
+    void fmov(FPRegister fd, FPRegister fn)
+
+
+### fmov ###
+
+Move FP register to register.
+
+    void fmov(Register rd, FPRegister fn)
+
+
+### fmov ###
+
+Move immediate to FP register.
+
+    void fmov(FPRegister fd, double imm)
+
+
+### fmov ###
+
+Move register to FP register.
+
+    void fmov(FPRegister fd, Register rn)
+
+
+### fmsub ###
+
+FP multiply and subtract.
+
+    void fmsub(const FPRegister& fd,
+               const FPRegister& fn,
+               const FPRegister& fm,
+               const FPRegister& fa)
+
+
+### fmul ###
+
+FP multiply.
+
+    void fmul(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm)
+
+
+### fneg ###
+
+FP negate.
+
+    void fneg(const FPRegister& fd, const FPRegister& fn)
+
+
+### frintn ###
+
+FP round to integer (nearest with ties to even).
+
+    void frintn(const FPRegister& fd, const FPRegister& fn)
+
+
+### frintz ###
+
+FP round to integer (towards zero).
+
+    void frintz(const FPRegister& fd, const FPRegister& fn)
+
+
+### fsqrt ###
+
+FP square root.
+
+    void fsqrt(const FPRegister& fd, const FPRegister& fn)
+
+
+### fsub ###
+
+FP subtract.
+
+    void fsub(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm)
+
+
+
+Additional or pseudo instructions
+---------------------------------
+
+### bind ###
+
+Bind a label to the current PC.
+
+    void bind(Label* label)
+
+
+### dc32 ###
+
+Emit 32 bits of data into the instruction stream.
+
+    inline void dc32(uint32_t data)
+
+
+### dc64 ###
+
+Emit 64 bits of data into the instruction stream.
+
+    inline void dc64(uint64_t data)
+
+
+### dci ###
+
+Emit raw instructions into the instruction stream.
+
+    inline void dci(Instr raw_inst)
+
+
+### debug ###
+
+Debug control pseudo instruction, only supported by the debugger.
+
+    void debug(const char* message, uint32_t code, Instr params = BREAK)
+
+
+
diff --git a/examples/abs.cc b/examples/abs.cc
new file mode 100644
index 0000000..fa4f582
--- /dev/null
+++ b/examples/abs.cc
@@ -0,0 +1,67 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "examples.h"
+
+#define BUF_SIZE (4096)
+#define __ masm->
+
+void GenerateAbs(MacroAssembler* masm) {
+  // int64_t abs(int64_t x)
+  // Argument location:
+  //   x -> x0
+
+  // This example uses a conditional instruction (cneg) to compute the
+  // absolute value of an integer.
+  __ Cmp(x0, 0);
+  __ Cneg(x0, x0, mi);
+  __ Ret();
+}
+
+
+#ifndef TEST_EXAMPLES
+int main(void) {
+  // Create and initialize the assembler and the simulator.
+  byte assm_buf[BUF_SIZE];
+  MacroAssembler masm(assm_buf, BUF_SIZE);
+  Decoder decoder;
+  Simulator simulator(&decoder);
+
+  // Generate the code for the example function.
+  Label abs;
+  masm.Bind(&abs);
+  GenerateAbs(&masm);
+  masm.FinalizeCode();
+
+  // Run the example function.
+  int64_t input_value = -42;
+  simulator.set_xreg(0, input_value);
+  simulator.RunFrom(abs.target());
+  printf("abs(%ld) = %ld\n", input_value, simulator.xreg(0));
+
+  return 0;
+}
+#endif
diff --git a/examples/add3-double.cc b/examples/add3-double.cc
new file mode 100644
index 0000000..c2d3f6f
--- /dev/null
+++ b/examples/add3-double.cc
@@ -0,0 +1,72 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "examples.h"
+
+#define BUF_SIZE (4096)
+#define __ masm->
+
+void GenerateAdd3Double(MacroAssembler* masm) {
+  // double add3_double(double x, double y, double z)
+  //  Argument locations:
+  //    x -> d0
+  //    y -> d1
+  //    z -> d2
+  __ Fadd(d0, d0, d1);    // d0 <- x + y
+  __ Fadd(d0, d0, d2);    // d0 <- d0 + z
+
+  // The return value is already in d0.
+  __ Ret();
+}
+
+
+#ifndef TEST_EXAMPLES
+int main(void) {
+  // Create and initialize the assembler and the simulator.
+  byte assm_buf[BUF_SIZE];
+  MacroAssembler masm(assm_buf, BUF_SIZE);
+  Decoder decoder;
+  Simulator simulator(&decoder);
+
+  // Generate the code for the example function.
+  Label add3_double;
+  masm.Bind(&add3_double);
+  GenerateAdd3Double(&masm);
+  masm.FinalizeCode();
+
+  // Run the example function.
+  double a = 498.36547;
+  double b = 23.369;
+  double c = 7964.697954;
+  simulator.set_dreg(0, a);
+  simulator.set_dreg(1, b);
+  simulator.set_dreg(2, c);
+  simulator.RunFrom(add3_double.target());
+  printf("%f + %f + %f = %f\n", a, b, c, simulator.dreg(0));
+
+  return 0;
+}
+#endif
diff --git a/examples/add4-double.cc b/examples/add4-double.cc
new file mode 100644
index 0000000..ca1050f
--- /dev/null
+++ b/examples/add4-double.cc
@@ -0,0 +1,82 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "examples.h"
+
+#define BUF_SIZE (4096)
+#define __ masm->
+
+void GenerateAdd4Double(MacroAssembler* masm) {
+  // double Add4Double(uint64_t a, double b, uint64_t c, double d)
+  //  Argument locations:
+  //    a -> x0
+  //    b -> d0
+  //    c -> x1
+  //    d -> d1
+
+  // Turn 'a' and 'c' into double values.
+  __ Ucvtf(d2, x0);
+  __ Ucvtf(d3, x1);
+
+  // Add everything together.
+  __ Fadd(d0, d0, d1);
+  __ Fadd(d2, d2, d3);
+  __ Fadd(d0, d0, d2);
+
+  // The return value is in d0.
+  __ Ret();
+}
+
+
+#ifndef TEST_EXAMPLES
+int main(void) {
+  // Create and initialize the assembler and the simulator.
+  byte assm_buf[BUF_SIZE];
+  MacroAssembler masm(assm_buf, BUF_SIZE);
+  Decoder decoder;
+  Simulator simulator(&decoder);
+
+  // Generate the code for the example function.
+  Label add4_double;
+  masm.Bind(&add4_double);
+  GenerateAdd4Double(&masm);
+  masm.FinalizeCode();
+
+  // Run the example function.
+  uint64_t a = 21;
+  double b = 987.3654;
+  uint64_t c = 4387;
+  double d = 36.698754;
+  simulator.set_xreg(0, a);
+  simulator.set_dreg(0, b);
+  simulator.set_xreg(1, c);
+  simulator.set_dreg(1, d);
+  simulator.RunFrom(add4_double.target());
+  printf("%ld + %f + %ld + %f = %f\n", a, b, c, d, simulator.dreg(0));
+
+  return 0;
+}
+#endif
diff --git a/examples/check-bounds.cc b/examples/check-bounds.cc
new file mode 100644
index 0000000..c4f1d39
--- /dev/null
+++ b/examples/check-bounds.cc
@@ -0,0 +1,95 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "examples.h"
+
+#define BUF_SIZE (4096)
+#define __ masm->
+
+void GenerateCheckBounds(MacroAssembler* masm) {
+  // uint64_t check_bounds(uint64_t value, uint64_t low, uint64_t high)
+  // Argument locations:
+  //   value -> x0
+  //   low   -> x1
+  //   high  -> x2
+
+  // First we compare 'value' with the 'low' bound. If x1 <= x0, the N flag is
+  // cleared. This can be tested with the 'pl' condition.
+  __ Cmp(x0, x1);
+
+  // Now we compare 'value' and 'high' (x0 and x2), but only if the 'pl'
+  // condition holds. If it does not, the flags are set to CFlag, which clears
+  // every flag except the carry (C) flag.
+  __ Ccmp(x0, x2, CFlag, pl);
+
+  // We set x0 to 1 only if the 'ls' condition is satisfied.
+  // 'ls' performs the following test: !(C==1 && Z==0). If the previous
+  // comparison was skipped, the flags were set to CFlag (C==1 and Z==0), so
+  // the 'ls' test fails and x0 is set to 0.
+  // Otherwise, if the previous comparison did take place, x0 is set to 1
+  // only if x0 is less than or equal to x2.
+  __ Cset(x0, ls);
+
+  __ Ret();
+}
+
+
+#ifndef TEST_EXAMPLES
+void run_function(Simulator *simulator, Label *function,
+                  uint64_t value, uint64_t low, uint64_t high) {
+  simulator->set_xreg(0, value);
+  simulator->set_xreg(1, low);
+  simulator->set_xreg(2, high);
+
+  simulator->RunFrom(function->target());
+  printf("%ld %s between %ld and %ld\n", value,
+         simulator->xreg(0) ? "is" : "is not",
+         low, high);
+
+  simulator->ResetState();
+}
+
+int main(void) {
+  // Create and initialize the assembler and the simulator.
+  byte assm_buf[BUF_SIZE];
+  MacroAssembler masm(assm_buf, BUF_SIZE);
+  Decoder decoder;
+  Simulator simulator(&decoder);
+
+  // Generate the code for the example function.
+  Label check_bounds;
+  masm.Bind(&check_bounds);
+  GenerateCheckBounds(&masm);
+  masm.FinalizeCode();
+
+  // Run the example function.
+  run_function(&simulator, &check_bounds, 546, 50, 1000);
+  run_function(&simulator, &check_bounds, 62, 100, 200);
+  run_function(&simulator, &check_bounds, 200, 100, 200);
+
+  return 0;
+}
+#endif
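For reference, the branch-free Cmp/Ccmp/Cset sequence in GenerateCheckBounds above implements the contract described in examples.h; a plain C++ equivalent is sketched below (illustration only).

    // C++ reference for GenerateCheckBounds (illustration only).
    uint64_t check_bounds_reference(uint64_t value, uint64_t low, uint64_t high) {
      return ((low <= value) && (value <= high)) ? 1 : 0;
    }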
diff --git a/examples/debugger.cc b/examples/debugger.cc
new file mode 100644
index 0000000..35feb09
--- /dev/null
+++ b/examples/debugger.cc
@@ -0,0 +1,70 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "examples.h"
+
+// This is an interactive example, not to be used for testing.
+#ifndef TEST_EXAMPLES
+
+#define BUF_SIZE (4096)
+#define __ masm->
+
+// The aim is to let the user "play" with the debugger. Brk will trigger the
+// debugger shell.
+void GenerateBreak(MacroAssembler* masm) {
+  Label hop;
+  __ Brk();
+  __ Nop();
+  __ B(&hop);
+  __ Nop();
+  __ Bind(&hop);
+  __ Mov(x1, 123);
+  __ Mov(x2, 456);
+  __ Add(x0, x1, x2);
+  __ Ret();
+}
+
+
+int main(void) {
+  // Create and initialize the assembler and the debugger.
+  byte assm_buf[BUF_SIZE];
+  MacroAssembler masm(assm_buf, BUF_SIZE);
+  Decoder decoder;
+  Debugger debugger(&decoder);
+
+  // Generate the code for the example function.
+  Label start;
+  masm.Bind(&start);
+  GenerateBreak(&masm);
+  masm.FinalizeCode();
+
+  // Run the example function.
+  debugger.RunFrom(start.target());
+  printf("Debugger example run\n");
+
+  return 0;
+}
+#endif
diff --git a/examples/examples.h b/examples/examples.h
new file mode 100644
index 0000000..1952a12
--- /dev/null
+++ b/examples/examples.h
@@ -0,0 +1,100 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_EXAMPLE_EXAMPLES_H_
+# define VIXL_EXAMPLE_EXAMPLES_H_
+
+#include "a64/simulator-a64.h"
+#include "a64/debugger-a64.h"
+#include "a64/macro-assembler-a64.h"
+
+using namespace vixl;
+
+// Generate a function with the following prototype:
+//   uint64_t factorial(uint64_t n)
+//
+// It provides an iterative implementation of the factorial computation.
+void GenerateFactorial(MacroAssembler* masm);
+
+// Generate a function with the following prototype:
+//   uint64_t factorial_rec(uint64_t n)
+//
+// It provides a recursive implementation of the factorial computation.
+void GenerateFactorialRec(MacroAssembler* masm);
+
+// Generate a function with the following prototype:
+//   double add3_double(double x, double y, double z)
+//
+// This example is intended to show the calling convention with double
+// floating point arguments.
+void GenerateAdd3Double(MacroAssembler* masm);
+
+// Generate a function with the following prototype:
+//   double add4_double(uint64_t a, double b, uint64_t c, double d)
+//
+// The generated function illustrates the calling convention for functions
+// that mix integer and floating-point arguments.
+void GenerateAdd4Double(MacroAssembler* masm);
+
+// Generate a function with the following prototype:
+//   uint32_t sum_array(uint8_t* array, uint32_t size)
+//
+// The generated function computes the sum of all the elements in
+// the given array.
+void GenerateSumArray(MacroAssembler* masm);
+
+// Generate a function with the following prototype:
+//   int64_t abs(int64_t x)
+//
+// The generated function computes the absolute value of an integer.
+void GenerateAbs(MacroAssembler* masm);
+
+// Generate a function with the following prototype:
+//   uint64_t check_bounds(uint64_t value, uint64_t low, uint64_t high)
+//
+// The goal of this example is to illustrate the use of conditional
+// instructions. The generated function checks whether the given value lies
+// within the given bounds. It returns 1 if 'value' is between 'low' and
+// 'high' (i.e. low <= value <= high), and 0 otherwise.
+void GenerateCheckBounds(MacroAssembler* masm);
+
+// Generate a function which uses the stack to swap the contents of the x0,
+// x1, x2 and x3 registers.
+void GenerateSwap4(MacroAssembler* masm);
+
+// Generate a function which swaps the contents of w0 and w1.
+// This example demonstrates some interesting features of VIXL's stack
+// operations.
+void GenerateSwapInt32(MacroAssembler* masm);
+
+// Generate a function with the following prototype:
+//   uint64_t demo_function(uint64_t x)
+//
+// This is the example used in doc/getting-started.txt
+void GenerateDemoFunction(MacroAssembler *masm);
+
+
+#endif /* !VIXL_EXAMPLE_EXAMPLES_H_ */
diff --git a/examples/factorial-rec.cc b/examples/factorial-rec.cc
new file mode 100644
index 0000000..2d5cb4c
--- /dev/null
+++ b/examples/factorial-rec.cc
@@ -0,0 +1,79 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "examples.h"
+
+#define BUF_SIZE (4096)
+#define __ masm->
+
+void GenerateFactorialRec(MacroAssembler* masm) {
+  // uint64_t factorial_rec(uint64_t n)
+  // Argument location:
+  //   n -> x0
+
+  Label entry, input_is_zero;
+
+  __ Bind(&entry);
+  // Check for the stopping condition: the input number is zero.
+  __ Cbz(x0, &input_is_zero);
+
+  __ Mov(x1, x0);
+  __ Sub(x0, x0, 1);
+  __ Push(x1, lr);
+  __ Bl(&entry);    // Recursive call to factorial_rec(n - 1).
+  __ Pop(lr, x1);
+  __ Mul(x0, x0, x1);
+  __ Ret();
+
+  __ Bind(&input_is_zero);
+  __ Mov(x0, 1);
+  __ Ret();
+}
+
+
+#ifndef TEST_EXAMPLES
+int main(void) {
+  // Create and initialize the assembler and the simulator.
+  byte assm_buf[BUF_SIZE];
+  MacroAssembler masm(assm_buf, BUF_SIZE);
+  Decoder decoder;
+  Simulator simulator(&decoder);
+
+  // Generate the code for the example function.
+  Label factorial_rec;
+  masm.Bind(&factorial_rec);
+  GenerateFactorialRec(&masm);
+  masm.FinalizeCode();
+
+  // Run the example function.
+  uint64_t input_val = 16;
+  simulator.set_xreg(0, input_val);
+  simulator.RunFrom(factorial_rec.target());
+  printf("factorial(%ld) = %ld\n", input_val, simulator.xreg(0));
+
+  return 0;
+}
+#endif
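For reference, GenerateFactorialRec above follows the usual recursive definition, sketched below in plain C++ (illustration only); the Push(x1, lr)/Pop(lr, x1) pair preserves the link register and the current multiplicand across the recursive Bl call.

    // C++ reference for GenerateFactorialRec (illustration only).
    uint64_t factorial_rec_reference(uint64_t n) {
      if (n == 0) return 1;                       // Cbz(x0, &input_is_zero).
      return n * factorial_rec_reference(n - 1);  // Bl(&entry), then Mul.
    }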
diff --git a/examples/factorial.cc b/examples/factorial.cc
new file mode 100644
index 0000000..b5e6097
--- /dev/null
+++ b/examples/factorial.cc
@@ -0,0 +1,77 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "examples.h"
+
+#define BUF_SIZE (4096)
+#define __ masm->
+
+void GenerateFactorial(MacroAssembler* masm) {
+  // uint64_t factorial(uint64_t n)
+  // Argument location:
+  //   n -> x0
+
+  Label loop, end;
+
+  __ Mov(x1, x0);
+  __ Mov(x0, 1);     // Use x0 as the accumulator.
+
+  __ Cbz(x1, &end);  // Nothing to do if the input is zero.
+
+  __ Bind(&loop);
+  __ Mul(x0, x0, x1);
+  __ Sub(x1, x1, 1);
+  __ Cbnz(x1, &loop);
+
+  __ Bind(&end);
+  // The return value is in x0.
+  __ Ret();
+}
+
+
+#ifndef TEST_EXAMPLES
+int main(void) {
+  // Create and initialize the assembler and the simulator.
+  byte assm_buf[BUF_SIZE];
+  MacroAssembler masm(assm_buf, BUF_SIZE);
+  Decoder decoder;
+  Simulator simulator(&decoder);
+
+  // Generate the code for the example function.
+  Label factorial;
+  masm.Bind(&factorial);
+  GenerateFactorial(&masm);
+  masm.FinalizeCode();
+
+  // Run the example function.
+  uint64_t input_val = 16;
+  simulator.set_xreg(0, input_val);
+  simulator.RunFrom(factorial.target());
+  printf("factorial(%ld) = %ld\n", input_val, simulator.xreg(0));
+
+  return 0;
+}
+#endif
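For reference, the Mul/Sub/Cbnz loop in GenerateFactorial above corresponds to this plain C++ loop (illustration only).

    // C++ reference for GenerateFactorial (illustration only).
    uint64_t factorial_reference(uint64_t n) {
      uint64_t result = 1;              // x0 is used as the accumulator.
      for (; n != 0; n--) {
        result *= n;                    // Mul, then Sub/Cbnz.
      }
      return result;
    }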
diff --git a/examples/getting-started.cc b/examples/getting-started.cc
new file mode 100644
index 0000000..f7dae9e
--- /dev/null
+++ b/examples/getting-started.cc
@@ -0,0 +1,61 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "a64/simulator-a64.h"
+#include "a64/macro-assembler-a64.h"
+
+#define BUF_SIZE (4096)
+#define __ masm->
+
+using namespace vixl;
+
+void GenerateDemoFunction(MacroAssembler *masm) {
+  // uint64_t demo_function(uint64_t x)
+  __ Ldr(x1, 0x1122334455667788);
+  __ And(x0, x0, x1);
+  __ Ret();
+}
+
+
+#ifndef TEST_EXAMPLES
+int main() {
+  byte assm_buf[BUF_SIZE];
+  MacroAssembler masm(assm_buf, BUF_SIZE);
+  Decoder decoder;
+  Simulator simulator(&decoder);
+
+  Label demo_function;
+  masm.Bind(&demo_function);
+  GenerateDemoFunction(&masm);
+  masm.FinalizeCode();
+
+  simulator.set_xreg(0, 0x8899aabbccddeeff);
+  simulator.RunFrom(demo_function.target());
+  printf("x0 = %" PRIx64 "\n", simulator.xreg(0));
+
+  return 0;
+}
+#endif
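For reference, the generated demo_function loads the mask from the literal pool and ANDs it with the argument, which is equivalent to this plain C++ function (illustration only).

    // C++ reference for GenerateDemoFunction (illustration only).
    uint64_t demo_function_reference(uint64_t x) {
      return x & 0x1122334455667788;
    }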
diff --git a/examples/sum-array.cc b/examples/sum-array.cc
new file mode 100644
index 0000000..5f23e6a
--- /dev/null
+++ b/examples/sum-array.cc
@@ -0,0 +1,90 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "examples.h"
+
+#define ARRAY_SIZE(Array) (sizeof(Array) / sizeof((Array)[0]))
+#define BUF_SIZE (4096)
+#define __ masm->
+
+void GenerateSumArray(MacroAssembler* masm) {
+  // uint32_t sum_array(uint8_t* array, uint32_t size)
+  //  Argument locations:
+  //    array (pointer) -> x0
+  //    size            -> x1
+
+  Label loop, end;
+
+  __ Mov(x2, x0);
+  __ Mov(w0, 0);
+
+  // There's nothing to do if the array is empty.
+  __ Cbz(w1, &end);
+
+  // Go through the array and sum the elements.
+  __ Bind(&loop);
+
+  __ Ldrb(w3, MemOperand(x2, 1, PostIndex));  // w3 = *(x2++)
+  __ Add(w0, w0, w3);
+
+  __ Sub(w1, w1, 1);
+  __ Cbnz(w1, &loop);
+
+  __ Bind(&end);
+  __ Ret();
+}
+
+
+#ifndef TEST_EXAMPLES
+int main(void) {
+  // Create and initialize the assembler and the simulator.
+  byte assm_buf[BUF_SIZE];
+  MacroAssembler masm(assm_buf, BUF_SIZE);
+  Decoder decoder;
+  Simulator simulator(&decoder);
+
+  // Generate the code for the example function.
+  Label sum_array;
+  masm.Bind(&sum_array);
+  GenerateSumArray(&masm);
+  masm.FinalizeCode();
+
+  // Run the example function.
+  uint8_t data[] = { 2, 45, 63, 7, 245, 38 };
+  uintptr_t data_addr = reinterpret_cast<uintptr_t>(data);
+  simulator.set_xreg(0, data_addr);
+  simulator.set_xreg(1, ARRAY_SIZE(data));
+  simulator.RunFrom(sum_array.target());
+
+  unsigned int i;
+  for (i = 0; i < ARRAY_SIZE(data) - 1; ++i) {
+    printf("%d + ", data[i]);
+  }
+  printf("%d = %d\n", data[i], simulator.wreg(0));
+
+  return 0;
+}
+#endif
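For reference, the post-indexed Ldrb loop in GenerateSumArray above corresponds to this plain C++ loop (illustration only).

    // C++ reference for GenerateSumArray (illustration only).
    uint32_t sum_array_reference(uint8_t* array, uint32_t size) {
      uint32_t sum = 0;                 // w0 is used as the accumulator.
      for (uint32_t i = 0; i < size; i++) {
        sum += array[i];                // Ldrb with PostIndex, then Add.
      }
      return sum;
    }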
diff --git a/examples/swap-int32.cc b/examples/swap-int32.cc
new file mode 100644
index 0000000..1b14c4b
--- /dev/null
+++ b/examples/swap-int32.cc
@@ -0,0 +1,95 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "examples.h"
+
+#define BUF_SIZE (4096)
+#define __ masm->
+
+void GenerateSwapInt32(MacroAssembler* masm) {
+  {
+    // In this scope the register x2 will be used by the macro-assembler
+    // as the stack pointer (via peek, poke, push, etc).
+    const Register old_stack_pointer = __ StackPointer();
+    __ Mov(x2, __ StackPointer());
+    __ SetStackPointer(x2);
+
+    // This Claim of 8 bytes is not 16-byte aligned and would have failed
+    // if the current stack pointer were sp.
+    __ Claim(8);
+
+    __ Poke(w0, 0);
+    __ Poke(w1, 4);
+    __ Peek(w1, 0);
+    __ Peek(w0, 4);
+
+    __ Drop(8);
+
+    // Even if we didn't use the system stack pointer, sp might have been
+    // modified because the ABI forbids access to memory below the stack
+    // pointer.
+    __ Mov(old_stack_pointer, __ StackPointer());
+    __ SetStackPointer(old_stack_pointer);
+  }
+
+  // The stack pointer has now been switched back to sp.
+  __ Ret();
+}
+
+
+#ifndef TEST_EXAMPLES
+int main(void) {
+  // Create and initialize the assembler and the simulator.
+  byte assm_buf[BUF_SIZE];
+  MacroAssembler masm(assm_buf, BUF_SIZE);
+  Decoder decoder;
+  Simulator simulator(&decoder);
+
+  // Generate the code for the example function.
+  Label swap_int32;
+  masm.Bind(&swap_int32);
+  GenerateSwapInt32(&masm);
+  masm.FinalizeCode();
+
+  // Run the example function.
+  simulator.set_wreg(0, 0x11111111);
+  simulator.set_wreg(1, 0x22222222);
+
+  printf("Before swap_int32:\n"
+         "x0 = 0x%" PRIx32 "\n"
+         "x1 = 0x%" PRIx32 "\n",
+         simulator.wreg(0), simulator.wreg(1));
+
+  simulator.RunFrom(swap_int32.target());
+
+  printf("After swap_int32:\n"
+         "x0 = 0x%" PRIx32 "\n"
+         "x1 = 0x%" PRIx32 "\n",
+         simulator.wreg(0), simulator.wreg(1));
+
+  return 0;
+}
+#endif
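At the memory level, GenerateSwapInt32 above pokes w0 and w1 into two adjacent 32-bit stack slots and peeks them back crossed over; a plain C++ sketch of that round trip is below (illustration only; the generated code additionally switches the stack pointer to x2 so that the 8-byte Claim is legal).

    // C++ view of GenerateSwapInt32's memory round trip (illustration only).
    void swap_int32_reference(uint32_t* w0, uint32_t* w1) {
      uint32_t slots[2] = { *w0, *w1 };  // Poke(w0, 0); Poke(w1, 4).
      *w1 = slots[0];                    // Peek(w1, 0).
      *w0 = slots[1];                    // Peek(w0, 4).
    }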
diff --git a/examples/swap4.cc b/examples/swap4.cc
new file mode 100644
index 0000000..746cc3e
--- /dev/null
+++ b/examples/swap4.cc
@@ -0,0 +1,89 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "examples.h"
+
+#define BUF_SIZE (4096)
+#define __ masm->
+
+void GenerateSwap4(MacroAssembler* masm) {
+  // VIXL's macro assembler provides some functions to manipulate the stack.
+  // This example shows some of these functions.
+  __ Claim(16);
+  __ Poke(x0, 0);
+  __ Poke(x1, 8);
+  __ Push(x3, x2);
+
+  __ Pop(x1, x0);
+  __ Peek(x3, 0);
+  __ Peek(x2, 8);
+  __ Drop(16);
+
+  __ Ret();
+}
+
+
+#ifndef TEST_EXAMPLES
+int main(void) {
+  // Create and initialize the assembler and the simulator.
+  byte assm_buf[BUF_SIZE];
+  MacroAssembler masm(assm_buf, BUF_SIZE);
+  Decoder decoder;
+  Simulator simulator(&decoder);
+
+  // Generate the code for the example function.
+  Label swap4;
+  masm.Bind(&swap4);
+  GenerateSwap4(&masm);
+  masm.FinalizeCode();
+
+  // Run the example function.
+  simulator.set_xreg(0, 0x1111111111111111);
+  simulator.set_xreg(1, 0x2222222222222222);
+  simulator.set_xreg(2, 0x3333333333333333);
+  simulator.set_xreg(3, 0x4444444444444444);
+
+  printf("Before swap4:\n"
+         "x0 = 0x%" PRIx64 "\n"
+         "x1 = 0x%" PRIx64 "\n"
+         "x2 = 0x%" PRIx64 "\n"
+         "x3 = 0x%" PRIx64 "\n",
+         simulator.xreg(0), simulator.xreg(1),
+         simulator.xreg(2), simulator.xreg(3));
+
+  simulator.RunFrom(swap4.target());
+
+  printf("After swap4:\n"
+         "x0 = 0x%" PRIx64 "\n"
+         "x1 = 0x%" PRIx64 "\n"
+         "x2 = 0x%" PRIx64 "\n"
+         "x3 = 0x%" PRIx64 "\n",
+         simulator.xreg(0), simulator.xreg(1),
+         simulator.xreg(2), simulator.xreg(3));
+
+  return 0;
+}
+#endif
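As a rough sketch of the stack helpers used above (semantics assumed from the example, using the same masm/__ shorthand): Claim reserves space below the current stack pointer, Poke and Peek store to and load from byte offsets within that space, and Drop releases it again.

    // Minimal spill/reload sketch (assumed usage, illustration only).
    __ Claim(16);     // Reserve 16 bytes of stack space.
    __ Poke(x0, 0);   // Store x0 at offset 0 in the reserved area.
    __ Peek(x1, 0);   // Load the same slot back into x1.
    __ Drop(16);      // Release the reserved space.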
diff --git a/src/a64/assembler-a64.cc b/src/a64/assembler-a64.cc
new file mode 100644
index 0000000..0d0c5d5
--- /dev/null
+++ b/src/a64/assembler-a64.cc
@@ -0,0 +1,2166 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+#include <cmath>
+#include "a64/assembler-a64.h"
+
+namespace vixl {
+
+// CPURegList utilities.
+CPURegister CPURegList::PopLowestIndex() {
+  if (IsEmpty()) {
+    return NoCPUReg;
+  }
+  int index = CountTrailingZeros(list_, kRegListSizeInBits);
+  ASSERT((1 << index) & list_);
+  Remove(index);
+  return CPURegister(index, size_, type_);
+}
+
+
+CPURegister CPURegList::PopHighestIndex() {
+  ASSERT(IsValid());
+  if (IsEmpty()) {
+    return NoCPUReg;
+  }
+  int index = CountLeadingZeros(list_, kRegListSizeInBits);
+  index = kRegListSizeInBits - 1 - index;
+  ASSERT((1 << index) & list_);
+  Remove(index);
+  return CPURegister(index, size_, type_);
+}
+
+
+bool CPURegList::IsValid() const {
+  if ((type_ == CPURegister::kRegister) ||
+      (type_ == CPURegister::kFPRegister)) {
+    bool is_valid = true;
+    // Try to create a CPURegister for each element in the list.
+    for (int i = 0; i < kRegListSizeInBits; i++) {
+      if (((list_ >> i) & 1) != 0) {
+        is_valid &= CPURegister(i, size_, type_).IsValid();
+      }
+    }
+    return is_valid;
+  } else if (type_ == CPURegister::kNoRegister) {
+    // We can't use IsEmpty here because that asserts IsValid().
+    return list_ == 0;
+  } else {
+    return false;
+  }
+}
+
+
+void CPURegList::RemoveCalleeSaved() {
+  if (type() == CPURegister::kRegister) {
+    Remove(GetCalleeSaved(RegisterSizeInBits()));
+  } else if (type() == CPURegister::kFPRegister) {
+    Remove(GetCalleeSavedFP(RegisterSizeInBits()));
+  } else {
+    ASSERT(type() == CPURegister::kNoRegister);
+    ASSERT(IsEmpty());
+    // The list must already be empty, so do nothing.
+  }
+}
+
+
+CPURegList CPURegList::GetCalleeSaved(unsigned size) {
+  return CPURegList(CPURegister::kRegister, size, 19, 29);
+}
+
+
+CPURegList CPURegList::GetCalleeSavedFP(unsigned size) {
+  return CPURegList(CPURegister::kFPRegister, size, 8, 15);
+}
+
+
+CPURegList CPURegList::GetCallerSaved(unsigned size) {
+  // Registers x0-x18 and lr (x30) are caller-saved.
+  CPURegList list = CPURegList(CPURegister::kRegister, size, 0, 18);
+  list.Combine(lr);
+  return list;
+}
+
+
+CPURegList CPURegList::GetCallerSavedFP(unsigned size) {
+  // Registers d0-d7 and d16-d31 are caller-saved.
+  CPURegList list = CPURegList(CPURegister::kFPRegister, size, 0, 7);
+  list.Combine(CPURegList(CPURegister::kFPRegister, size, 16, 31));
+  return list;
+}
+
+
+const CPURegList kCalleeSaved = CPURegList::GetCalleeSaved();
+const CPURegList kCalleeSavedFP = CPURegList::GetCalleeSavedFP();
+const CPURegList kCallerSaved = CPURegList::GetCallerSaved();
+const CPURegList kCallerSavedFP = CPURegList::GetCallerSavedFP();
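As a usage sketch (assumed, for illustration): a CPURegList can be built from a register range, combined with further registers, and then drained with PopLowestIndex.

    // Iterate over x19-x21 plus lr (illustration only).
    CPURegList list(CPURegister::kRegister, kXRegSize, 19, 21);
    list.Combine(lr);
    while (!list.IsEmpty()) {
      CPURegister reg = list.PopLowestIndex();
      // ... save, restore or otherwise use 'reg' ...
    }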
+
+
+// Registers.
+#define WREG(n) w##n,
+const Register Register::wregisters[] = {
+REGISTER_CODE_LIST(WREG)
+};
+#undef WREG
+
+#define XREG(n) x##n,
+const Register Register::xregisters[] = {
+REGISTER_CODE_LIST(XREG)
+};
+#undef XREG
+
+#define SREG(n) s##n,
+const FPRegister FPRegister::sregisters[] = {
+REGISTER_CODE_LIST(SREG)
+};
+#undef SREG
+
+#define DREG(n) d##n,
+const FPRegister FPRegister::dregisters[] = {
+REGISTER_CODE_LIST(DREG)
+};
+#undef DREG
+
+
+const Register& Register::WRegFromCode(unsigned code) {
+  // This function returns the zero register when code = 31. The stack pointer
+  // can not be returned.
+  ASSERT(code < kNumberOfRegisters);
+  return wregisters[code];
+}
+
+
+const Register& Register::XRegFromCode(unsigned code) {
+  // This function returns the zero register when code = 31. The stack pointer
+  // can not be returned.
+  ASSERT(code < kNumberOfRegisters);
+  return xregisters[code];
+}
+
+
+const FPRegister& FPRegister::SRegFromCode(unsigned code) {
+  ASSERT(code < kNumberOfFPRegisters);
+  return sregisters[code];
+}
+
+
+const FPRegister& FPRegister::DRegFromCode(unsigned code) {
+  ASSERT(code < kNumberOfFPRegisters);
+  return dregisters[code];
+}
+
+
+const Register& CPURegister::W() const {
+  ASSERT(IsValidRegister());
+  ASSERT(Is64Bits());
+  return Register::WRegFromCode(code_);
+}
+
+
+const Register& CPURegister::X() const {
+  ASSERT(IsValidRegister());
+  ASSERT(Is32Bits());
+  return Register::XRegFromCode(code_);
+}
+
+
+const FPRegister& CPURegister::S() const {
+  ASSERT(IsValidFPRegister());
+  ASSERT(Is64Bits());
+  return FPRegister::SRegFromCode(code_);
+}
+
+
+const FPRegister& CPURegister::D() const {
+  ASSERT(IsValidFPRegister());
+  ASSERT(Is32Bits());
+  return FPRegister::DRegFromCode(code_);
+}
+
+
+// Operand.
+Operand::Operand(int64_t immediate)
+    : immediate_(immediate),
+      reg_(NoReg),
+      shift_(NO_SHIFT),
+      extend_(NO_EXTEND),
+      shift_amount_(0) {}
+
+
+Operand::Operand(Register reg, Shift shift, unsigned shift_amount)
+    : reg_(reg),
+      shift_(shift),
+      extend_(NO_EXTEND),
+      shift_amount_(shift_amount) {
+  ASSERT(reg.Is64Bits() || (shift_amount < kWRegSize));
+  ASSERT(reg.Is32Bits() || (shift_amount < kXRegSize));
+  ASSERT(!reg.IsSP());
+}
+
+
+Operand::Operand(Register reg, Extend extend, unsigned shift_amount)
+    : reg_(reg),
+      shift_(NO_SHIFT),
+      extend_(extend),
+      shift_amount_(shift_amount) {
+  ASSERT(reg.IsValid());
+  ASSERT(shift_amount <= 4);
+  ASSERT(!reg.IsSP());
+}
+
+
+bool Operand::IsImmediate() const {
+  return reg_.Is(NoReg);
+}
+
+
+bool Operand::IsShiftedRegister() const {
+  return reg_.IsValid() && (shift_ != NO_SHIFT);
+}
+
+
+bool Operand::IsExtendedRegister() const {
+  return reg_.IsValid() && (extend_ != NO_EXTEND);
+}
+
+
+Operand Operand::ToExtendedRegister() const {
+  ASSERT(IsShiftedRegister());
+  ASSERT((shift_ == LSL) && (shift_amount_ <= 4));
+  return Operand(reg_, reg_.Is64Bits() ? UXTX : UXTW, shift_amount_);
+}
+
+
+// MemOperand
+MemOperand::MemOperand(Register base, ptrdiff_t offset, AddrMode addrmode)
+  : base_(base), regoffset_(NoReg), offset_(offset), addrmode_(addrmode) {
+  ASSERT(base.Is64Bits() && !base.IsZero());
+}
+
+
+MemOperand::MemOperand(Register base,
+                       Register regoffset,
+                       Extend extend,
+                       unsigned shift_amount)
+  : base_(base), regoffset_(regoffset), offset_(0), addrmode_(Offset),
+    shift_(NO_SHIFT), extend_(extend), shift_amount_(shift_amount) {
+  ASSERT(base.Is64Bits() && !base.IsZero());
+  ASSERT(!regoffset.IsSP());
+  ASSERT((extend == UXTW) || (extend == SXTW) || (extend == SXTX));
+}
+
+
+MemOperand::MemOperand(Register base,
+                       Register regoffset,
+                       Shift shift,
+                       unsigned shift_amount)
+  : base_(base), regoffset_(regoffset), offset_(0), addrmode_(Offset),
+    shift_(shift), extend_(NO_EXTEND), shift_amount_(shift_amount) {
+  ASSERT(base.Is64Bits() && !base.IsZero());
+  ASSERT(!regoffset.IsSP());
+  ASSERT(shift == LSL);
+}
+
+
+MemOperand::MemOperand(Register base, const Operand& offset, AddrMode addrmode)
+  : base_(base), regoffset_(NoReg), addrmode_(addrmode) {
+  ASSERT(base.Is64Bits() && !base.IsZero());
+
+  if (offset.IsImmediate()) {
+    offset_ = offset.immediate();
+  } else if (offset.IsShiftedRegister()) {
+    ASSERT(addrmode == Offset);
+
+    regoffset_ = offset.reg();
+    shift_= offset.shift();
+    shift_amount_ = offset.shift_amount();
+
+    extend_ = NO_EXTEND;
+    offset_ = 0;
+
+    // These assertions match those in the shifted-register constructor.
+    ASSERT(!regoffset_.IsSP());
+    ASSERT(shift_ == LSL);
+  } else {
+    ASSERT(offset.IsExtendedRegister());
+    ASSERT(addrmode == Offset);
+
+    regoffset_ = offset.reg();
+    extend_ = offset.extend();
+    shift_amount_ = offset.shift_amount();
+
+    shift_= NO_SHIFT;
+    offset_ = 0;
+
+    // These assertions match those in the extended-register constructor.
+    ASSERT(!regoffset_.IsSP());
+    ASSERT((extend_ == UXTW) || (extend_ == SXTW) || (extend_ == SXTX));
+  }
+}
+
+
+bool MemOperand::IsImmediateOffset() const {
+  return (addrmode_ == Offset) && regoffset_.Is(NoReg);
+}
+
+
+bool MemOperand::IsRegisterOffset() const {
+  return (addrmode_ == Offset) && !regoffset_.Is(NoReg);
+}
+
+
+bool MemOperand::IsPreIndex() const {
+  return addrmode_ == PreIndex;
+}
+
+
+bool MemOperand::IsPostIndex() const {
+  return addrmode_ == PostIndex;
+}
+
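The constructors above cover the A64 addressing modes used by the loads and stores further down; a sketch of the three immediate forms, using the macro-assembler shorthand from the examples (offsets chosen for illustration only):

    __ Ldrb(w0, MemOperand(x1, 16, Offset));     // Load from [x1 + 16].
    __ Ldrb(w0, MemOperand(x1, 16, PreIndex));   // x1 += 16, then load from [x1].
    __ Ldrb(w0, MemOperand(x1, 16, PostIndex));  // Load from [x1], then x1 += 16.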
+
+// Assembler
+Assembler::Assembler(byte* buffer, unsigned buffer_size)
+    : buffer_size_(buffer_size), literal_pool_monitor_(0) {
+  // Assert that this is an LP64 system.
+  ASSERT(sizeof(int) == sizeof(int32_t));     // NOLINT(runtime/sizeof)
+  ASSERT(sizeof(long) == sizeof(int64_t));    // NOLINT(runtime/int)
+  ASSERT(sizeof(void *) == sizeof(int64_t));  // NOLINT(runtime/sizeof)
+  ASSERT(sizeof(1) == sizeof(int32_t));       // NOLINT(runtime/sizeof)
+  ASSERT(sizeof(1L) == sizeof(int64_t));      // NOLINT(runtime/sizeof)
+
+  buffer_ = reinterpret_cast<Instruction*>(buffer);
+  pc_ = buffer_;
+  Reset();
+}
+
+
+Assembler::~Assembler() {
+  ASSERT(finalized_ || (pc_ == buffer_));
+  ASSERT(literals_.empty());
+}
+
+
+void Assembler::Reset() {
+#ifdef DEBUG
+  ASSERT((pc_ >= buffer_) && (pc_ < buffer_ + buffer_size_));
+  ASSERT(literal_pool_monitor_ == 0);
+  memset(buffer_, 0, pc_ - buffer_);
+  finalized_ = false;
+#endif
+  pc_ = buffer_;
+  literals_.clear();
+  next_literal_pool_check_ = pc_ + kLiteralPoolCheckInterval;
+}
+
+
+void Assembler::FinalizeCode() {
+  EmitLiteralPool();
+#ifdef DEBUG
+  finalized_ = true;
+#endif
+}
+
+
+void Assembler::bind(Label* label) {
+  label->is_bound_ = true;
+  label->target_ = pc_;
+  while (label->IsLinked()) {
+    // Get the address of the following instruction in the chain.
+    Instruction* next_link = label->link_->ImmPCOffsetTarget();
+    // Update the instruction target.
+    label->link_->SetImmPCOffsetTarget(label->target_);
+    // Update the label's link.
+    // If the offset of the branch we just updated was 0 (kEndOfChain) we are
+    // done.
+    label->link_ = (label->link_ != next_link) ? next_link : NULL;
+  }
+}
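In other words, a label may be branched to before it is bound; each such branch is threaded onto the label's link chain, and bind() patches them all once the target is known. A minimal sketch using the macro-assembler shorthand from the examples:

    Label skip;
    __ Cbz(x0, &skip);   // Forward branch: linked into the chain for 'skip'.
    __ Sub(x0, x0, 1);
    __ Bind(&skip);      // Binding patches every branch in the chain.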
+
+
+int Assembler::UpdateAndGetByteOffsetTo(Label* label) {
+  int offset;
+  ASSERT(sizeof(*pc_) == 1);
+  if (label->IsBound()) {
+    offset = label->target() - pc_;
+  } else if (label->IsLinked()) {
+    offset = label->link() - pc_;
+  } else {
+    offset = Label::kEndOfChain;
+  }
+  label->set_link(pc_);
+  return offset;
+}
+
+
+// Code generation.
+void Assembler::br(const Register& xn) {
+  ASSERT(xn.Is64Bits());
+  Emit(BR | Rn(xn));
+}
+
+
+void Assembler::blr(const Register& xn) {
+  ASSERT(xn.Is64Bits());
+  Emit(BLR | Rn(xn));
+}
+
+
+void Assembler::ret(const Register& xn) {
+  ASSERT(xn.Is64Bits());
+  Emit(RET | Rn(xn));
+}
+
+
+void Assembler::b(int imm26, Condition cond) {
+  if (cond == al) {
+    Emit(B | ImmUncondBranch(imm26));
+  } else {
+    // The immediate field is only 19 bits wide here.
+    Emit(B_cond | ImmCondBranch(imm26) | cond);
+  }
+}
+
+
+void Assembler::b(Label* label, Condition cond) {
+  b(UpdateAndGetInstructionOffsetTo(label), cond);
+}
+
+
+void Assembler::bl(int imm26) {
+  Emit(BL | ImmUncondBranch(imm26));
+}
+
+
+void Assembler::bl(Label* label) {
+  bl(UpdateAndGetInstructionOffsetTo(label));
+}
+
+
+void Assembler::cbz(const Register& rt,
+                    int imm19) {
+  Emit(SF(rt) | CBZ | ImmCmpBranch(imm19) | Rt(rt));
+}
+
+
+void Assembler::cbz(const Register& rt,
+                    Label* label) {
+  cbz(rt, UpdateAndGetInstructionOffsetTo(label));
+}
+
+
+void Assembler::cbnz(const Register& rt,
+                     int imm19) {
+  Emit(SF(rt) | CBNZ | ImmCmpBranch(imm19) | Rt(rt));
+}
+
+
+void Assembler::cbnz(const Register& rt,
+                     Label* label) {
+  cbnz(rt, UpdateAndGetInstructionOffsetTo(label));
+}
+
+
+void Assembler::tbz(const Register& rt,
+                    unsigned bit_pos,
+                    int imm14) {
+  ASSERT(rt.Is64Bits());
+  Emit(TBZ | ImmTestBranchBit(bit_pos) | ImmTestBranch(imm14) | Rt(rt));
+}
+
+
+void Assembler::tbz(const Register& rt,
+                    unsigned bit_pos,
+                    Label* label) {
+  tbz(rt, bit_pos, UpdateAndGetInstructionOffsetTo(label));
+}
+
+
+void Assembler::tbnz(const Register& rt,
+                     unsigned bit_pos,
+                     int imm14) {
+  ASSERT(rt.Is64Bits());
+  Emit(TBNZ | ImmTestBranchBit(bit_pos) | ImmTestBranch(imm14) | Rt(rt));
+}
+
+
+void Assembler::tbnz(const Register& rt,
+                     unsigned bit_pos,
+                     Label* label) {
+  tbnz(rt, bit_pos, UpdateAndGetInstructionOffsetTo(label));
+}
+
+
+void Assembler::adr(const Register& rd, int imm21) {
+  ASSERT(rd.Is64Bits());
+  Emit(ADR | ImmPCRelAddress(imm21) | Rd(rd));
+}
+
+
+void Assembler::adr(const Register& rd, Label* label) {
+  adr(rd, UpdateAndGetByteOffsetTo(label));
+}
+
+
+void Assembler::add(const Register& rd,
+                    const Register& rn,
+                    const Operand& operand,
+                    FlagsUpdate S) {
+  AddSub(rd, rn, operand, S, ADD);
+}
+
+
+void Assembler::cmn(const Register& rn,
+                    const Operand& operand) {
+  Register zr = AppropriateZeroRegFor(rn);
+  add(zr, rn, operand, SetFlags);
+}
+
+
+void Assembler::sub(const Register& rd,
+                    const Register& rn,
+                    const Operand& operand,
+                    FlagsUpdate S) {
+  AddSub(rd, rn, operand, S, SUB);
+}
+
+
+void Assembler::cmp(const Register& rn, const Operand& operand) {
+  Register zr = AppropriateZeroRegFor(rn);
+  sub(zr, rn, operand, SetFlags);
+}
+
+
+void Assembler::neg(const Register& rd, const Operand& operand, FlagsUpdate S) {
+  Register zr = AppropriateZeroRegFor(rd);
+  sub(rd, zr, operand, S);
+}
+
+
+void Assembler::adc(const Register& rd,
+                    const Register& rn,
+                    const Operand& operand,
+                    FlagsUpdate S) {
+  AddSubWithCarry(rd, rn, operand, S, ADC);
+}
+
+
+void Assembler::sbc(const Register& rd,
+                    const Register& rn,
+                    const Operand& operand,
+                    FlagsUpdate S) {
+  AddSubWithCarry(rd, rn, operand, S, SBC);
+}
+
+
+void Assembler::ngc(const Register& rd, const Operand& operand, FlagsUpdate S) {
+  Register zr = AppropriateZeroRegFor(rd);
+  sbc(rd, zr, operand, S);
+}
+
+
+// Logical instructions.
+void Assembler::and_(const Register& rd,
+                     const Register& rn,
+                     const Operand& operand,
+                     FlagsUpdate S) {
+  Logical(rd, rn, operand, (S == SetFlags) ? ANDS : AND);
+}
+
+
+void Assembler::tst(const Register& rn,
+                    const Operand& operand) {
+  and_(AppropriateZeroRegFor(rn), rn, operand, SetFlags);
+}
+
+
+void Assembler::bic(const Register& rd,
+                    const Register& rn,
+                    const Operand& operand,
+                    FlagsUpdate S) {
+  Logical(rd, rn, operand, (S == SetFlags) ? BICS : BIC);
+}
+
+
+void Assembler::orr(const Register& rd,
+                    const Register& rn,
+                    const Operand& operand) {
+  Logical(rd, rn, operand, ORR);
+}
+
+
+void Assembler::orn(const Register& rd,
+                    const Register& rn,
+                    const Operand& operand) {
+  Logical(rd, rn, operand, ORN);
+}
+
+
+void Assembler::eor(const Register& rd,
+                    const Register& rn,
+                    const Operand& operand) {
+  Logical(rd, rn, operand, EOR);
+}
+
+
+void Assembler::eon(const Register& rd,
+                    const Register& rn,
+                    const Operand& operand) {
+  Logical(rd, rn, operand, EON);
+}
+
+
+void Assembler::lslv(const Register& rd,
+                     const Register& rn,
+                     const Register& rm) {
+  ASSERT(rd.size() == rn.size());
+  ASSERT(rd.size() == rm.size());
+  Emit(SF(rd) | LSLV | Rm(rm) | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::lsrv(const Register& rd,
+                     const Register& rn,
+                     const Register& rm) {
+  ASSERT(rd.size() == rn.size());
+  ASSERT(rd.size() == rm.size());
+  Emit(SF(rd) | LSRV | Rm(rm) | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::asrv(const Register& rd,
+                     const Register& rn,
+                     const Register& rm) {
+  ASSERT(rd.size() == rn.size());
+  ASSERT(rd.size() == rm.size());
+  Emit(SF(rd) | ASRV | Rm(rm) | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::rorv(const Register& rd,
+                     const Register& rn,
+                     const Register& rm) {
+  ASSERT(rd.size() == rn.size());
+  ASSERT(rd.size() == rm.size());
+  Emit(SF(rd) | RORV | Rm(rm) | Rn(rn) | Rd(rd));
+}
+
+
+// Bitfield operations.
+void Assembler::bfm(const Register& rd,
+                     const Register& rn,
+                     unsigned immr,
+                     unsigned imms) {
+  ASSERT(rd.size() == rn.size());
+  Instr N = SF(rd) >> (kSFOffset - kBitfieldNOffset);
+  Emit(SF(rd) | BFM | N |
+       ImmR(immr, rd.size()) | ImmS(imms, rd.size()) | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::sbfm(const Register& rd,
+                     const Register& rn,
+                     unsigned immr,
+                     unsigned imms) {
+  ASSERT(rd.size() == rn.size());
+  Instr N = SF(rd) >> (kSFOffset - kBitfieldNOffset);
+  Emit(SF(rd) | SBFM | N |
+       ImmR(immr, rd.size()) | ImmS(imms, rd.size()) | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::ubfm(const Register& rd,
+                     const Register& rn,
+                     unsigned immr,
+                     unsigned imms) {
+  ASSERT(rd.size() == rn.size());
+  Instr N = SF(rd) >> (kSFOffset - kBitfieldNOffset);
+  Emit(SF(rd) | UBFM | N |
+       ImmR(immr, rd.size()) | ImmS(imms, rd.size()) | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::extr(const Register& rd,
+                     const Register& rn,
+                     const Register& rm,
+                     unsigned lsb) {
+  ASSERT(rd.size() == rn.size());
+  ASSERT(rd.size() == rm.size());
+  Instr N = SF(rd) >> (kSFOffset - kBitfieldNOffset);
+  Emit(SF(rd) | EXTR | N | Rm(rm) | ImmS(lsb, rd.size()) | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::csel(const Register& rd,
+                     const Register& rn,
+                     const Register& rm,
+                     Condition cond) {
+  ConditionalSelect(rd, rn, rm, cond, CSEL);
+}
+
+
+void Assembler::csinc(const Register& rd,
+                      const Register& rn,
+                      const Register& rm,
+                      Condition cond) {
+  ConditionalSelect(rd, rn, rm, cond, CSINC);
+}
+
+
+void Assembler::csinv(const Register& rd,
+                      const Register& rn,
+                      const Register& rm,
+                      Condition cond) {
+  ConditionalSelect(rd, rn, rm, cond, CSINV);
+}
+
+
+void Assembler::csneg(const Register& rd,
+                      const Register& rn,
+                      const Register& rm,
+                      Condition cond) {
+  ConditionalSelect(rd, rn, rm, cond, CSNEG);
+}
+
+
+void Assembler::cset(const Register &rd, Condition cond) {
+  Register zr = AppropriateZeroRegFor(rd);
+  csinc(rd, zr, zr, InvertCondition(cond));
+}
+
+
+void Assembler::csetm(const Register &rd, Condition cond) {
+  Register zr = AppropriateZeroRegFor(rd);
+  csinv(rd, zr, zr, InvertCondition(cond));
+}
+
+
+void Assembler::cinc(const Register &rd, const Register &rn, Condition cond) {
+  csinc(rd, rn, rn, InvertCondition(cond));
+}
+
+
+void Assembler::cinv(const Register &rd, const Register &rn, Condition cond) {
+  csinv(rd, rn, rn, InvertCondition(cond));
+}
+
+
+void Assembler::cneg(const Register &rd, const Register &rn, Condition cond) {
+  csneg(rd, rn, rn, InvertCondition(cond));
+}
+
+
+void Assembler::ConditionalSelect(const Register& rd,
+                                  const Register& rn,
+                                  const Register& rm,
+                                  Condition cond,
+                                  ConditionalSelectOp op) {
+  ASSERT(rd.size() == rn.size());
+  ASSERT(rd.size() == rm.size());
+  ASSERT(cond != al);
+  Emit(SF(rd) | op | Rm(rm) | Cond(cond) | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::ccmn(const Register& rn,
+                     const Operand& operand,
+                     StatusFlags nzcv,
+                     Condition cond) {
+  ConditionalCompare(rn, operand, nzcv, cond, CCMN);
+}
+
+
+void Assembler::ccmp(const Register& rn,
+                     const Operand& operand,
+                     StatusFlags nzcv,
+                     Condition cond) {
+  ConditionalCompare(rn, operand, nzcv, cond, CCMP);
+}
+
+
+void Assembler::DataProcessing3Source(const Register& rd,
+                     const Register& rn,
+                     const Register& rm,
+                     const Register& ra,
+                     DataProcessing3SourceOp op) {
+  Emit(SF(rd) | op | Rm(rm) | Ra(ra) | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::mul(const Register& rd,
+                    const Register& rn,
+                    const Register& rm) {
+  ASSERT(AreSameSizeAndType(rd, rn, rm));
+  DataProcessing3Source(rd, rn, rm, AppropriateZeroRegFor(rd), MADD);
+}
+
+
+void Assembler::madd(const Register& rd,
+                     const Register& rn,
+                     const Register& rm,
+                     const Register& ra) {
+  DataProcessing3Source(rd, rn, rm, ra, MADD);
+}
+
+
+void Assembler::mneg(const Register& rd,
+                     const Register& rn,
+                     const Register& rm) {
+  ASSERT(AreSameSizeAndType(rd, rn, rm));
+  DataProcessing3Source(rd, rn, rm, AppropriateZeroRegFor(rd), MSUB);
+}
+
+
+void Assembler::msub(const Register& rd,
+                     const Register& rn,
+                     const Register& rm,
+                     const Register& ra) {
+  DataProcessing3Source(rd, rn, rm, ra, MSUB);
+}
+
+
+void Assembler::umaddl(const Register& rd,
+                       const Register& rn,
+                       const Register& rm,
+                       const Register& ra) {
+  ASSERT(rd.Is64Bits() && ra.Is64Bits());
+  ASSERT(rn.Is32Bits() && rm.Is32Bits());
+  DataProcessing3Source(rd, rn, rm, ra, UMADDL_x);
+}
+
+
+void Assembler::smaddl(const Register& rd,
+                       const Register& rn,
+                       const Register& rm,
+                       const Register& ra) {
+  ASSERT(rd.Is64Bits() && ra.Is64Bits());
+  ASSERT(rn.Is32Bits() && rm.Is32Bits());
+  DataProcessing3Source(rd, rn, rm, ra, SMADDL_x);
+}
+
+
+void Assembler::umsubl(const Register& rd,
+                       const Register& rn,
+                       const Register& rm,
+                       const Register& ra) {
+  ASSERT(rd.Is64Bits() && ra.Is64Bits());
+  ASSERT(rn.Is32Bits() && rm.Is32Bits());
+  DataProcessing3Source(rd, rn, rm, ra, UMSUBL_x);
+}
+
+
+void Assembler::smsubl(const Register& rd,
+                       const Register& rn,
+                       const Register& rm,
+                       const Register& ra) {
+  ASSERT(rd.Is64Bits() && ra.Is64Bits());
+  ASSERT(rn.Is32Bits() && rm.Is32Bits());
+  DataProcessing3Source(rd, rn, rm, ra, SMSUBL_x);
+}
+
+
+void Assembler::smull(const Register& rd,
+                      const Register& rn,
+                      const Register& rm) {
+  ASSERT(rd.Is64Bits());
+  ASSERT(rn.Is32Bits() && rm.Is32Bits());
+  DataProcessing3Source(rd, rn, rm, xzr, SMADDL_x);
+}
+
+
+void Assembler::sdiv(const Register& rd,
+                     const Register& rn,
+                     const Register& rm) {
+  ASSERT(rd.size() == rn.size());
+  ASSERT(rd.size() == rm.size());
+  Emit(SF(rd) | SDIV | Rm(rm) | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::smulh(const Register& xd,
+                      const Register& xn,
+                      const Register& xm) {
+  ASSERT(xd.Is64Bits() && xn.Is64Bits() && xm.Is64Bits());
+  DataProcessing3Source(xd, xn, xm, xzr, SMULH_x);
+}
+
+void Assembler::udiv(const Register& rd,
+                     const Register& rn,
+                     const Register& rm) {
+  ASSERT(rd.size() == rn.size());
+  ASSERT(rd.size() == rm.size());
+  Emit(SF(rd) | UDIV | Rm(rm) | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::rbit(const Register& rd,
+                     const Register& rn) {
+  DataProcessing1Source(rd, rn, RBIT);
+}
+
+
+void Assembler::rev16(const Register& rd,
+                      const Register& rn) {
+  DataProcessing1Source(rd, rn, REV16);
+}
+
+
+void Assembler::rev32(const Register& rd,
+                      const Register& rn) {
+  ASSERT(rd.Is64Bits());
+  DataProcessing1Source(rd, rn, REV);
+}
+
+
+void Assembler::rev(const Register& rd,
+                    const Register& rn) {
+  DataProcessing1Source(rd, rn, rd.Is64Bits() ? REV_x : REV_w);
+}
+
+
+void Assembler::clz(const Register& rd,
+                    const Register& rn) {
+  DataProcessing1Source(rd, rn, CLZ);
+}
+
+
+void Assembler::cls(const Register& rd,
+                    const Register& rn) {
+  DataProcessing1Source(rd, rn, CLS);
+}
+
+
+void Assembler::ldp(const CPURegister& rt,
+                    const CPURegister& rt2,
+                    const MemOperand& src) {
+  LoadStorePair(rt, rt2, src, LoadPairOpFor(rt, rt2));
+}
+
+
+void Assembler::stp(const CPURegister& rt,
+                    const CPURegister& rt2,
+                    const MemOperand& dst) {
+  LoadStorePair(rt, rt2, dst, StorePairOpFor(rt, rt2));
+}
+
+
+void Assembler::ldpsw(const Register& rt,
+                      const Register& rt2,
+                      const MemOperand& src) {
+  ASSERT(rt.Is64Bits());
+  LoadStorePair(rt, rt2, src, LDPSW_x);
+}
+
+
+void Assembler::LoadStorePair(const CPURegister& rt,
+                              const CPURegister& rt2,
+                              const MemOperand& addr,
+                              LoadStorePairOp op) {
+  // 'rt' and 'rt2' can only be aliased for stores.
+  ASSERT(((op & LoadStorePairLBit) == 0) || !rt.Is(rt2));
+  ASSERT(AreSameSizeAndType(rt, rt2));
+
+  Instr memop = op | Rt(rt) | Rt2(rt2) | RnSP(addr.base()) |
+                ImmLSPair(addr.offset(), CalcLSPairDataSize(op));
+
+  Instr addrmodeop;
+  if (addr.IsImmediateOffset()) {
+    addrmodeop = LoadStorePairOffsetFixed;
+  } else {
+    ASSERT(addr.offset() != 0);
+    if (addr.IsPreIndex()) {
+      addrmodeop = LoadStorePairPreIndexFixed;
+    } else {
+      ASSERT(addr.IsPostIndex());
+      addrmodeop = LoadStorePairPostIndexFixed;
+    }
+  }
+  Emit(addrmodeop | memop);
+}
+
+
+void Assembler::ldnp(const CPURegister& rt,
+                     const CPURegister& rt2,
+                     const MemOperand& src) {
+  LoadStorePairNonTemporal(rt, rt2, src,
+                           LoadPairNonTemporalOpFor(rt, rt2));
+}
+
+
+void Assembler::stnp(const CPURegister& rt,
+                     const CPURegister& rt2,
+                     const MemOperand& dst) {
+  LoadStorePairNonTemporal(rt, rt2, dst,
+                           StorePairNonTemporalOpFor(rt, rt2));
+}
+
+
+void Assembler::LoadStorePairNonTemporal(const CPURegister& rt,
+                                         const CPURegister& rt2,
+                                         const MemOperand& addr,
+                                         LoadStorePairNonTemporalOp op) {
+  ASSERT(!rt.Is(rt2));
+  ASSERT(AreSameSizeAndType(rt, rt2));
+  ASSERT(addr.IsImmediateOffset());
+
+  LSDataSize size = CalcLSPairDataSize(
+    static_cast<LoadStorePairOp>(op & LoadStorePairMask));
+  Emit(op | Rt(rt) | Rt2(rt2) | RnSP(addr.base()) |
+       ImmLSPair(addr.offset(), size));
+}
+
+
+// Memory instructions.
+void Assembler::ldrb(const Register& rt, const MemOperand& src) {
+  LoadStore(rt, src, LDRB_w);
+}
+
+
+void Assembler::strb(const Register& rt, const MemOperand& dst) {
+  LoadStore(rt, dst, STRB_w);
+}
+
+
+void Assembler::ldrsb(const Register& rt, const MemOperand& src) {
+  LoadStore(rt, src, rt.Is64Bits() ? LDRSB_x : LDRSB_w);
+}
+
+
+void Assembler::ldrh(const Register& rt, const MemOperand& src) {
+  LoadStore(rt, src, LDRH_w);
+}
+
+
+void Assembler::strh(const Register& rt, const MemOperand& dst) {
+  LoadStore(rt, dst, STRH_w);
+}
+
+
+void Assembler::ldrsh(const Register& rt, const MemOperand& src) {
+  LoadStore(rt, src, rt.Is64Bits() ? LDRSH_x : LDRSH_w);
+}
+
+
+void Assembler::ldr(const CPURegister& rt, const MemOperand& src) {
+  LoadStore(rt, src, LoadOpFor(rt));
+}
+
+
+void Assembler::str(const CPURegister& rt, const MemOperand& src) {
+  LoadStore(rt, src, StoreOpFor(rt));
+}
+
+
+void Assembler::ldrsw(const Register& rt, const MemOperand& src) {
+  ASSERT(rt.Is64Bits());
+  LoadStore(rt, src, LDRSW_x);
+}
+
+
+void Assembler::ldr(const Register& rt, uint64_t imm) {
+  LoadLiteral(rt, imm, rt.Is64Bits() ? LDR_x_lit : LDR_w_lit);
+}
+
+
+void Assembler::ldr(const FPRegister& ft, double imm) {
+  uint64_t rawbits = 0;
+  LoadLiteralOp op;
+
+  if (ft.Is64Bits()) {
+    rawbits = double_to_rawbits(imm);
+    op = LDR_d_lit;
+  } else {
+    ASSERT(ft.Is32Bits());
+    float float_imm = static_cast<float>(imm);
+    rawbits = float_to_rawbits(float_imm);
+    op = LDR_s_lit;
+  }
+
+  LoadLiteral(ft, rawbits, op);
+}
+
+
+void Assembler::mov(const Register& rd, const Register& rm) {
+  // Moves involving the stack pointer are encoded as an add immediate with a
+  // second operand of zero. Otherwise, an orr with the zero register as the
+  // first operand is used.
+  if (rd.IsSP() || rm.IsSP()) {
+    add(rd, rm, 0);
+  } else {
+    orr(rd, AppropriateZeroRegFor(rd), rm);
+  }
+}
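+// For example, mov(x0, sp) is therefore encoded as an add immediate
+// (add x0, sp, #0), while mov(x0, x1) is encoded as orr x0, xzr, x1.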
+
+
+void Assembler::mvn(const Register& rd, const Operand& operand) {
+  orn(rd, AppropriateZeroRegFor(rd), operand);
+}
+
+
+void Assembler::mrs(const Register& rt, SystemRegister sysreg) {
+  ASSERT(rt.Is64Bits());
+  Emit(MRS | ImmSystemRegister(sysreg) | Rt(rt));
+}
+
+
+void Assembler::msr(SystemRegister sysreg, const Register& rt) {
+  ASSERT(rt.Is64Bits());
+  Emit(MSR | Rt(rt) | ImmSystemRegister(sysreg));
+}
+
+
+void Assembler::hint(SystemHint code) {
+  Emit(HINT | ImmHint(code) | Rt(xzr));
+}
+
+
+void Assembler::fmov(FPRegister fd, double imm) {
+  if (fd.Is64Bits() && IsImmFP64(imm)) {
+    Emit(FMOV_d_imm | Rd(fd) | ImmFP64(imm));
+  } else if (fd.Is32Bits() && IsImmFP32(imm)) {
+    Emit(FMOV_s_imm | Rd(fd) | ImmFP32(static_cast<float>(imm)));
+  } else if ((imm == 0.0) && (copysign(1.0, imm) == 1.0)) {
+    Register zr = AppropriateZeroRegFor(fd);
+    fmov(fd, zr);
+  } else {
+    ldr(fd, imm);
+  }
+}
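+// For example, fmov(d0, 1.0) can use the FMOV (immediate) encoding,
+// fmov(d0, 0.0) is emitted as "fmov d0, xzr", and a value with no 8-bit FP
+// encoding, such as fmov(d0, 1.1), falls back to a literal pool load.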
+
+
+void Assembler::fmov(Register rd, FPRegister fn) {
+  ASSERT(rd.size() == fn.size());
+  FPIntegerConvertOp op = rd.Is32Bits() ? FMOV_ws : FMOV_xd;
+  Emit(op | Rd(rd) | Rn(fn));
+}
+
+
+void Assembler::fmov(FPRegister fd, Register rn) {
+  ASSERT(fd.size() == rn.size());
+  FPIntegerConvertOp op = fd.Is32Bits() ? FMOV_sw : FMOV_dx;
+  Emit(op | Rd(fd) | Rn(rn));
+}
+
+
+void Assembler::fmov(FPRegister fd, FPRegister fn) {
+  ASSERT(fd.size() == fn.size());
+  Emit(FPType(fd) | FMOV | Rd(fd) | Rn(fn));
+}
+
+
+void Assembler::fadd(const FPRegister& fd,
+                     const FPRegister& fn,
+                     const FPRegister& fm) {
+  FPDataProcessing2Source(fd, fn, fm, FADD);
+}
+
+
+void Assembler::fsub(const FPRegister& fd,
+                     const FPRegister& fn,
+                     const FPRegister& fm) {
+  FPDataProcessing2Source(fd, fn, fm, FSUB);
+}
+
+
+void Assembler::fmul(const FPRegister& fd,
+                     const FPRegister& fn,
+                     const FPRegister& fm) {
+  FPDataProcessing2Source(fd, fn, fm, FMUL);
+}
+
+
+void Assembler::fmsub(const FPRegister& fd,
+                      const FPRegister& fn,
+                      const FPRegister& fm,
+                      const FPRegister& fa) {
+  FPDataProcessing3Source(fd, fn, fm, fa, fd.Is32Bits() ? FMSUB_s : FMSUB_d);
+}
+
+
+void Assembler::fdiv(const FPRegister& fd,
+                     const FPRegister& fn,
+                     const FPRegister& fm) {
+  FPDataProcessing2Source(fd, fn, fm, FDIV);
+}
+
+
+void Assembler::fmax(const FPRegister& fd,
+                     const FPRegister& fn,
+                     const FPRegister& fm) {
+  FPDataProcessing2Source(fd, fn, fm, FMAX);
+}
+
+
+void Assembler::fmin(const FPRegister& fd,
+                     const FPRegister& fn,
+                     const FPRegister& fm) {
+  FPDataProcessing2Source(fd, fn, fm, FMIN);
+}
+
+
+void Assembler::fabs(const FPRegister& fd,
+                     const FPRegister& fn) {
+  ASSERT(fd.SizeInBits() == fn.SizeInBits());
+  FPDataProcessing1Source(fd, fn, FABS);
+}
+
+
+void Assembler::fneg(const FPRegister& fd,
+                     const FPRegister& fn) {
+  ASSERT(fd.SizeInBits() == fn.SizeInBits());
+  FPDataProcessing1Source(fd, fn, FNEG);
+}
+
+
+void Assembler::fsqrt(const FPRegister& fd,
+                      const FPRegister& fn) {
+  ASSERT(fd.SizeInBits() == fn.SizeInBits());
+  FPDataProcessing1Source(fd, fn, FSQRT);
+}
+
+
+void Assembler::frintn(const FPRegister& fd,
+                       const FPRegister& fn) {
+  ASSERT(fd.SizeInBits() == fn.SizeInBits());
+  FPDataProcessing1Source(fd, fn, FRINTN);
+}
+
+
+void Assembler::frintz(const FPRegister& fd,
+                       const FPRegister& fn) {
+  ASSERT(fd.SizeInBits() == fn.SizeInBits());
+  FPDataProcessing1Source(fd, fn, FRINTZ);
+}
+
+
+void Assembler::fcvt(const FPRegister& fd,
+                     const FPRegister& fn) {
+  // Only float to double conversion is supported.
+  ASSERT(fd.Is64Bits() && fn.Is32Bits());
+  FPDataProcessing1Source(fd, fn, FCVT_ds);
+}
+
+
+void Assembler::fcmp(const FPRegister& fn,
+                     const FPRegister& fm) {
+  ASSERT(fn.size() == fm.size());
+  Emit(FPType(fn) | FCMP | Rm(fm) | Rn(fn));
+}
+
+
+void Assembler::fcmp(const FPRegister& fn,
+                     double value) {
+  USE(value);
+  // Although the fcmp instruction can strictly only take an immediate value of
+  // +0.0, we don't need to check for -0.0 because the sign of 0.0 doesn't
+  // affect the result of the comparison.
+  ASSERT(value == 0.0);
+  Emit(FPType(fn) | FCMP_zero | Rn(fn));
+}
+
+
+void Assembler::fccmp(const FPRegister& fn,
+                      const FPRegister& fm,
+                      StatusFlags nzcv,
+                      Condition cond) {
+  ASSERT(fn.size() == fm.size());
+  Emit(FPType(fn) | FCCMP | Rm(fm) | Cond(cond) | Rn(fn) | Nzcv(nzcv));
+}
+
+
+void Assembler::fcsel(const FPRegister& fd,
+                      const FPRegister& fn,
+                      const FPRegister& fm,
+                      Condition cond) {
+  ASSERT(fd.size() == fn.size());
+  ASSERT(fd.size() == fm.size());
+  ASSERT(cond != al);
+  Emit(FPType(fd) | FCSEL | Rm(fm) | Cond(cond) | Rn(fn) | Rd(fd));
+}
+
+
+void Assembler::FPConvertToInt(const Register& rd,
+                               const FPRegister& fn,
+                               FPIntegerConvertOp op) {
+  Emit(SF(rd) | FPType(fn) | op | Rn(fn) | Rd(rd));
+}
+
+
+void Assembler::fcvtmu(const Register& rd, const FPRegister& fn) {
+  FPConvertToInt(rd, fn, FCVTMU);
+}
+
+
+void Assembler::fcvtms(const Register& rd, const FPRegister& fn) {
+  FPConvertToInt(rd, fn, FCVTMS);
+}
+
+
+void Assembler::fcvtnu(const Register& rd,
+                       const FPRegister& fn) {
+  FPConvertToInt(rd, fn, FCVTNU);
+}
+
+
+void Assembler::fcvtns(const Register& rd,
+                       const FPRegister& fn) {
+  FPConvertToInt(rd, fn, FCVTNS);
+}
+
+
+void Assembler::fcvtzu(const Register& rd,
+                       const FPRegister& fn) {
+  FPConvertToInt(rd, fn, FCVTZU);
+}
+
+
+void Assembler::fcvtzs(const Register& rd,
+                       const FPRegister& fn) {
+  FPConvertToInt(rd, fn, FCVTZS);
+}
+
+
+void Assembler::scvtf(const FPRegister& fd,
+                      const Register& rn,
+                      unsigned fbits) {
+  // We support double register destinations only.
+  ASSERT(fd.Is64Bits());
+  if (fbits == 0) {
+    Emit(SF(rn) | FPType(fd) | SCVTF | Rn(rn) | Rd(fd));
+  } else {
+    // For fixed point numbers, we support X register sources only.
+    ASSERT(rn.Is64Bits());
+    Emit(SF(rn) | FPType(fd) | SCVTF_fixed | FPScale(64 - fbits) | Rn(rn) |
+         Rd(fd));
+  }
+}
+
+
+void Assembler::ucvtf(const FPRegister& fd,
+                      const Register& rn,
+                      unsigned fbits) {
+  // We support double register destinations only.
+  ASSERT(fd.Is64Bits());
+  if (fbits == 0) {
+    Emit(SF(rn) | FPType(fd) | UCVTF | Rn(rn) | Rd(fd));
+  } else {
+    // For fixed point numbers, we support X register sources only.
+    ASSERT(rn.Is64Bits());
+    Emit(SF(rn) | FPType(fd) | UCVTF_fixed | FPScale(64 - fbits) | Rn(rn) |
+         Rd(fd));
+  }
+}
+
+
+// Note:
+// In the bit layouts below, a difference in case for the same letter indicates
+// a negated bit. If b is 1, then B is 0.
+Instr Assembler::ImmFP32(float imm) {
+  ASSERT(IsImmFP32(imm));
+  // bits: aBbb.bbbc.defg.h000.0000.0000.0000.0000
+  uint32_t bits = float_to_rawbits(imm);
+  // bit7: a000.0000
+  uint32_t bit7 = ((bits >> 31) & 0x1) << 7;
+  // bit6: 0b00.0000
+  uint32_t bit6 = ((bits >> 29) & 0x1) << 6;
+  // bit5_to_0: 00cd.efgh
+  uint32_t bit5_to_0 = (bits >> 19) & 0x3f;
+
+  return (bit7 | bit6 | bit5_to_0) << ImmFP_offset;
+}
+
+
+Instr Assembler::ImmFP64(double imm) {
+  ASSERT(IsImmFP64(imm));
+  // bits: aBbb.bbbb.bbcd.efgh.0000.0000.0000.0000
+  //       0000.0000.0000.0000.0000.0000.0000.0000
+  uint64_t bits = double_to_rawbits(imm);
+  // bit7: a000.0000
+  uint32_t bit7 = ((bits >> 63) & 0x1) << 7;
+  // bit6: 0b00.0000
+  uint32_t bit6 = ((bits >> 61) & 0x1) << 6;
+  // bit5_to_0: 00cd.efgh
+  uint32_t bit5_to_0 = (bits >> 48) & 0x3f;
+
+  return (bit7 | bit6 | bit5_to_0) << ImmFP_offset;
+}
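+// Worked example for ImmFP64: 1.0 has raw bits 0x3ff0000000000000, so a = 0,
+// b = 1 and cdefgh = 110000, giving the 8-bit immediate 0b01110000 (0x70)
+// before it is shifted into the ImmFP field.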
+
+
+// Code generation helpers.
+void Assembler::MoveWide(const Register& rd,
+                         uint64_t imm,
+                         int shift,
+                         MoveWideImmediateOp mov_op) {
+  if (shift >= 0) {
+    // Explicit shift specified.
+    ASSERT((shift == 0) || (shift == 16) || (shift == 32) || (shift == 48));
+    ASSERT(rd.Is64Bits() || (shift == 0) || (shift == 16));
+    shift /= 16;
+  } else {
+    // Calculate a new immediate and shift combination to encode the immediate
+    // argument.
+    shift = 0;
+    if ((imm & ~0xffffUL) == 0) {
+      // Nothing to do.
+    } else if ((imm & ~(0xffffUL << 16)) == 0) {
+      imm >>= 16;
+      shift = 1;
+    } else if ((imm & ~(0xffffUL << 32)) == 0) {
+      ASSERT(rd.Is64Bits());
+      imm >>= 32;
+      shift = 2;
+    } else if ((imm & ~(0xffffUL << 48)) == 0) {
+      ASSERT(rd.Is64Bits());
+      imm >>= 48;
+      shift = 3;
+    }
+  }
+
+  ASSERT(is_uint16(imm));
+
+  Emit(SF(rd) | MoveWideImmediateFixed | mov_op |
+       Rd(rd) | ImmMoveWide(imm) | ShiftMoveWide(shift));
+}
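+// For example, when no explicit shift is given (shift < 0), an immediate of
+// 0x1234 is emitted with ImmMoveWide(0x1234) and ShiftMoveWide(0), while
+// 0xf0000000 is emitted with ImmMoveWide(0xf000) and ShiftMoveWide(1), i.e.
+// LSL #16.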
+
+
+void Assembler::AddSub(const Register& rd,
+                       const Register& rn,
+                       const Operand& operand,
+                       FlagsUpdate S,
+                       AddSubOp op) {
+  ASSERT(rd.size() == rn.size());
+  if (operand.IsImmediate()) {
+    int64_t immediate = operand.immediate();
+    ASSERT(IsImmAddSub(immediate));
+    Instr dest_reg = (S == SetFlags) ? Rd(rd) : RdSP(rd);
+    Emit(SF(rd) | AddSubImmediateFixed | op | Flags(S) |
+         ImmAddSub(immediate) | dest_reg | RnSP(rn));
+  } else if (operand.IsShiftedRegister()) {
+    ASSERT(operand.reg().size() == rd.size());
+    ASSERT(operand.shift() != ROR);
+
+    // For instructions of the form:
+    //   add/sub   wsp, <Wn>, <Wm> [, LSL #0-3 ]
+    //   add/sub   <Wd>, wsp, <Wm> [, LSL #0-3 ]
+    //   add/sub   wsp, wsp, <Wm> [, LSL #0-3 ]
+    //   adds/subs <Wd>, wsp, <Wm> [, LSL #0-3 ]
+    // or their 64-bit register equivalents, convert the operand from shifted to
+    // extended register mode, and emit an add/sub extended instruction.
+    if (rn.IsSP() || rd.IsSP()) {
+      ASSERT(!(rd.IsSP() && (S == SetFlags)));
+      DataProcExtendedRegister(rd, rn, operand.ToExtendedRegister(), S,
+                               AddSubExtendedFixed | op);
+    } else {
+      DataProcShiftedRegister(rd, rn, operand, S, AddSubShiftedFixed | op);
+    }
+  } else {
+    ASSERT(operand.IsExtendedRegister());
+    DataProcExtendedRegister(rd, rn, operand, S, AddSubExtendedFixed | op);
+  }
+}
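+// For example, add(x0, sp, Operand(x1, LSL, 2)) cannot use the shifted-
+// register form because rn is the stack pointer, so the operand is converted
+// with ToExtendedRegister() and emitted as an add (extended register) with
+// UXTX #2.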
+
+
+void Assembler::AddSubWithCarry(const Register& rd,
+                                const Register& rn,
+                                const Operand& operand,
+                                FlagsUpdate S,
+                                AddSubWithCarryOp op) {
+  ASSERT(rd.size() == rn.size());
+  ASSERT(rd.size() == operand.reg().size());
+  ASSERT(operand.IsShiftedRegister() && (operand.shift_amount() == 0));
+  Emit(SF(rd) | op | Flags(S) | Rm(operand.reg()) | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::hlt(int code) {
+  ASSERT(is_uint16(code));
+  Emit(HLT | ImmException(code));
+}
+
+
+void Assembler::brk(int code) {
+  ASSERT(is_uint16(code));
+  Emit(BRK | ImmException(code));
+}
+
+
+void Assembler::Logical(const Register& rd,
+                        const Register& rn,
+                        const Operand& operand,
+                        LogicalOp op) {
+  ASSERT(rd.size() == rn.size());
+  if (operand.IsImmediate()) {
+    int64_t immediate = operand.immediate();
+    unsigned reg_size = rd.size();
+
+    ASSERT(immediate != 0);
+    ASSERT(immediate != -1);
+    ASSERT(rd.Is64Bits() || is_uint32(immediate));
+
+    // If the operation is NOT, invert the operation and immediate.
+    if ((op & NOT) == NOT) {
+      op = static_cast<LogicalOp>(op & ~NOT);
+      immediate = rd.Is64Bits() ? ~immediate : (~immediate & kWRegMask);
+    }
+
+    unsigned n, imm_s, imm_r;
+    if (IsImmLogical(immediate, reg_size, &n, &imm_s, &imm_r)) {
+      // Immediate can be encoded in the instruction.
+      LogicalImmediate(rd, rn, n, imm_s, imm_r, op);
+    } else {
+      // This case is handled in the macro assembler.
+      UNREACHABLE();
+    }
+  } else {
+    ASSERT(operand.IsShiftedRegister());
+    ASSERT(operand.reg().size() == rd.size());
+    Instr dp_op = static_cast<Instr>(op | LogicalShiftedFixed);
+    DataProcShiftedRegister(rd, rn, operand, LeaveFlags, dp_op);
+  }
+}
+
+
+void Assembler::LogicalImmediate(const Register& rd,
+                                 const Register& rn,
+                                 unsigned n,
+                                 unsigned imm_s,
+                                 unsigned imm_r,
+                                 LogicalOp op) {
+  unsigned reg_size = rd.size();
+  Instr dest_reg = (op == ANDS) ? Rd(rd) : RdSP(rd);
+  Emit(SF(rd) | LogicalImmediateFixed | op | BitN(n, reg_size) |
+       ImmSetBits(imm_s, reg_size) | ImmRotate(imm_r, reg_size) | dest_reg |
+       Rn(rn));
+}
+
+
+void Assembler::ConditionalCompare(const Register& rn,
+                                   const Operand& operand,
+                                   StatusFlags nzcv,
+                                   Condition cond,
+                                   ConditionalCompareOp op) {
+  Instr ccmpop;
+  if (operand.IsImmediate()) {
+    int64_t immediate = operand.immediate();
+    ASSERT(IsImmConditionalCompare(immediate));
+    ccmpop = ConditionalCompareImmediateFixed | op | ImmCondCmp(immediate);
+  } else {
+    ASSERT(operand.IsShiftedRegister() && (operand.shift_amount() == 0));
+    ccmpop = ConditionalCompareRegisterFixed | op | Rm(operand.reg());
+  }
+  Emit(SF(rn) | ccmpop | Cond(cond) | Rn(rn) | Nzcv(nzcv));
+}
+
+
+void Assembler::DataProcessing1Source(const Register& rd,
+                                      const Register& rn,
+                                      DataProcessing1SourceOp op) {
+  ASSERT(rd.size() == rn.size());
+  Emit(SF(rn) | op | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::FPDataProcessing1Source(const FPRegister& fd,
+                                        const FPRegister& fn,
+                                        FPDataProcessing1SourceOp op) {
+  Emit(FPType(fn) | op | Rn(fn) | Rd(fd));
+}
+
+
+void Assembler::FPDataProcessing2Source(const FPRegister& fd,
+                                        const FPRegister& fn,
+                                        const FPRegister& fm,
+                                        FPDataProcessing2SourceOp op) {
+  ASSERT(fd.size() == fn.size());
+  ASSERT(fd.size() == fm.size());
+  Emit(FPType(fd) | op | Rm(fm) | Rn(fn) | Rd(fd));
+}
+
+
+void Assembler::FPDataProcessing3Source(const FPRegister& fd,
+                                        const FPRegister& fn,
+                                        const FPRegister& fm,
+                                        const FPRegister& fa,
+                                        FPDataProcessing3SourceOp op) {
+  ASSERT(AreSameSizeAndType(fd, fn, fm, fa));
+  Emit(FPType(fd) | op | Rm(fm) | Rn(fn) | Rd(fd) | Ra(fa));
+}
+
+
+void Assembler::EmitShift(const Register& rd,
+                          const Register& rn,
+                          Shift shift,
+                          unsigned shift_amount) {
+  switch (shift) {
+    case LSL:
+      lsl(rd, rn, shift_amount);
+      break;
+    case LSR:
+      lsr(rd, rn, shift_amount);
+      break;
+    case ASR:
+      asr(rd, rn, shift_amount);
+      break;
+    case ROR:
+      ror(rd, rn, shift_amount);
+      break;
+    default:
+      UNREACHABLE();
+  }
+}
+
+
+void Assembler::EmitExtendShift(const Register& rd,
+                                const Register& rn,
+                                Extend extend,
+                                unsigned left_shift) {
+  ASSERT(rd.size() >= rn.size());
+  unsigned reg_size = rd.size();
+  // Use rn at the same size as rd.
+  Register rn_ = Register(rn.code(), rd.size());
+  // Bits extracted are high_bit:0.
+  unsigned high_bit = (8 << (extend & 0x3)) - 1;
+  // Number of bits left in the result that are not introduced by the shift.
+  unsigned non_shift_bits = (reg_size - left_shift) & (reg_size - 1);
+
+  if ((non_shift_bits > high_bit) || (non_shift_bits == 0)) {
+    switch (extend) {
+      case UXTB:
+      case UXTH:
+      case UXTW: ubfm(rd, rn_, non_shift_bits, high_bit); break;
+      case SXTB:
+      case SXTH:
+      case SXTW: sbfm(rd, rn_, non_shift_bits, high_bit); break;
+      case UXTX:
+      case SXTX: {
+        ASSERT(rn.size() == kXRegSize);
+        // Nothing to extend. Just shift.
+        lsl(rd, rn_, left_shift);
+        break;
+      }
+      default: UNREACHABLE();
+    }
+  } else {
+    // No need to extend as the extended bits would be shifted away.
+    lsl(rd, rn_, left_shift);
+  }
+}
+
+
+void Assembler::DataProcShiftedRegister(const Register& rd,
+                                        const Register& rn,
+                                        const Operand& operand,
+                                        FlagsUpdate S,
+                                        Instr op) {
+  ASSERT(operand.IsShiftedRegister());
+  ASSERT(rn.Is64Bits() || (rn.Is32Bits() && is_uint5(operand.shift_amount())));
+  Emit(SF(rd) | op | Flags(S) |
+       ShiftDP(operand.shift()) | ImmDPShift(operand.shift_amount()) |
+       Rm(operand.reg()) | Rn(rn) | Rd(rd));
+}
+
+
+void Assembler::DataProcExtendedRegister(const Register& rd,
+                                         const Register& rn,
+                                         const Operand& operand,
+                                         FlagsUpdate S,
+                                         Instr op) {
+  Instr dest_reg = (S == SetFlags) ? Rd(rd) : RdSP(rd);
+  Emit(SF(rd) | op | Flags(S) | Rm(operand.reg()) |
+       ExtendMode(operand.extend()) | ImmExtendShift(operand.shift_amount()) |
+       dest_reg | RnSP(rn));
+}
+
+
+bool Assembler::IsImmAddSub(int64_t immediate) {
+  return is_uint12(immediate) ||
+         (is_uint12(immediate >> 12) && ((immediate & 0xfff) == 0));
+}
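+// For example, 0xfff and 0xfff000 are valid add/sub immediates, but 0x1001 is
+// not: it is neither a 12-bit value nor a 12-bit value shifted left by 12.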
+
+void Assembler::LoadStore(const CPURegister& rt,
+                          const MemOperand& addr,
+                          LoadStoreOp op) {
+  Instr memop = op | Rt(rt) | RnSP(addr.base());
+  ptrdiff_t offset = addr.offset();
+
+  if (addr.IsImmediateOffset()) {
+    LSDataSize size = CalcLSDataSize(op);
+    if (IsImmLSScaled(offset, size)) {
+      // Use the scaled addressing mode.
+      Emit(LoadStoreUnsignedOffsetFixed | memop |
+           ImmLSUnsigned(offset >> size));
+    } else if (IsImmLSUnscaled(offset)) {
+      // Use the unscaled addressing mode.
+      Emit(LoadStoreUnscaledOffsetFixed | memop | ImmLS(offset));
+    } else {
+      // This case is handled in the macro assembler.
+      UNREACHABLE();
+    }
+  } else if (addr.IsRegisterOffset()) {
+    Extend ext = addr.extend();
+    Shift shift = addr.shift();
+    unsigned shift_amount = addr.shift_amount();
+
+    // LSL is encoded in the option field as UXTX.
+    if (shift == LSL) {
+      ext = UXTX;
+    }
+
+    // Shifts are encoded in one bit, indicating a left shift by the memory
+    // access size.
+    ASSERT((shift_amount == 0) ||
+           (shift_amount == static_cast<unsigned>(CalcLSDataSize(op))));
+    Emit(LoadStoreRegisterOffsetFixed | memop | Rm(addr.regoffset()) |
+         ExtendMode(ext) | ImmShiftLS((shift_amount > 0) ? 1 : 0));
+  } else {
+    if (IsImmLSUnscaled(offset)) {
+      if (addr.IsPreIndex()) {
+        Emit(LoadStorePreIndexFixed | memop | ImmLS(offset));
+      } else {
+        ASSERT(addr.IsPostIndex());
+        Emit(LoadStorePostIndexFixed | memop | ImmLS(offset));
+      }
+    } else {
+      // This case is handled in the macro assembler.
+      UNREACHABLE();
+    }
+  }
+}
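+// For example, ldr(x0, MemOperand(x1, 4088)) can use the scaled unsigned-
+// offset mode (4088 is a multiple of 8 that fits in the 12-bit field), while
+// ldr(x0, MemOperand(x1, -8)) must use the unscaled (ldur) form.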
+
+
+bool Assembler::IsImmLSUnscaled(ptrdiff_t offset) {
+  return is_int9(offset);
+}
+
+
+bool Assembler::IsImmLSScaled(ptrdiff_t offset, LSDataSize size) {
+  bool offset_is_size_multiple = (((offset >> size) << size) == offset);
+  return offset_is_size_multiple && is_uint12(offset >> size);
+}
+
+
+void Assembler::LoadLiteral(const CPURegister& rt,
+                            uint64_t imm,
+                            LoadLiteralOp op) {
+  ASSERT(is_int32(imm) || is_uint32(imm) || (rt.Is64Bits()));
+
+  BlockLiteralPoolScope scope(this);
+  RecordLiteral(imm, rt.SizeInBytes());
+  Emit(op | ImmLLiteral(0) | Rt(rt));
+}
+
+
+// Test if a given value can be encoded in the immediate field of a logical
+// instruction.
+// If it can be encoded, the function returns true, and values pointed to by n,
+// imm_s and imm_r are updated with immediates encoded in the format required
+// by the corresponding fields in the logical instruction.
+// If it can not be encoded, the function returns false, and the values pointed
+// to by n, imm_s and imm_r are undefined.
+bool Assembler::IsImmLogical(uint64_t value,
+                             unsigned width,
+                             unsigned* n,
+                             unsigned* imm_s,
+                             unsigned* imm_r) {
+  ASSERT((n != NULL) && (imm_s != NULL) && (imm_r != NULL));
+  ASSERT((width == kWRegSize) || (width == kXRegSize));
+
+  // Logical immediates are encoded using parameters n, imm_s and imm_r using
+  // the following table:
+  //
+  //  N   imms    immr    size        S             R
+  //  1  ssssss  rrrrrr    64    UInt(ssssss)  UInt(rrrrrr)
+  //  0  0sssss  xrrrrr    32    UInt(sssss)   UInt(rrrrr)
+  //  0  10ssss  xxrrrr    16    UInt(ssss)    UInt(rrrr)
+  //  0  110sss  xxxrrr     8    UInt(sss)     UInt(rrr)
+  //  0  1110ss  xxxxrr     4    UInt(ss)      UInt(rr)
+  //  0  11110s  xxxxxr     2    UInt(s)       UInt(r)
+  // (s bits must not be all set)
+  //
+  // A pattern is constructed of size bits, where the least significant S+1
+  // bits are set. The pattern is rotated right by R, and repeated across a
+  // 32 or 64-bit value, depending on destination register width.
+  //
+  // To test if an arbitrary immediate can be encoded using this scheme, an
+  // iterative algorithm is used.
+  //
+  // TODO: This code does not consider using X/W register overlap to support
+  // 64-bit immediates where the top 32-bits are zero, and the bottom 32-bits
+  // are an encodable logical immediate.
+
+  // 1. If the value has all set or all clear bits, it can't be encoded.
+  if ((value == 0) || (value == 0xffffffffffffffffUL) ||
+      ((width == kWRegSize) && (value == 0xffffffff))) {
+    return false;
+  }
+
+  unsigned lead_zero = CountLeadingZeros(value, width);
+  unsigned lead_one = CountLeadingZeros(~value, width);
+  unsigned trail_zero = CountTrailingZeros(value, width);
+  unsigned trail_one = CountTrailingZeros(~value, width);
+  unsigned set_bits = CountSetBits(value, width);
+
+  // The fixed bits in the immediate s field.
+  // If width == 64 (X reg), start at 0xFFFFFF80.
+  // If width == 32 (W reg), start at 0xFFFFFFC0, as the iteration for 64-bit
+  // widths won't be executed.
+  int imm_s_fixed = (width == kXRegSize) ? -128 : -64;
+  int imm_s_mask = 0x3F;
+
+  for (;;) {
+    // 2. If the value is two bits wide, it can be encoded.
+    if (width == 2) {
+      *n = 0;
+      *imm_s = 0x3C;
+      *imm_r = (value & 3) - 1;
+      return true;
+    }
+
+    *n = (width == 64) ? 1 : 0;
+    *imm_s = ((imm_s_fixed | (set_bits - 1)) & imm_s_mask);
+    if ((lead_zero + set_bits) == width) {
+      *imm_r = 0;
+    } else {
+      *imm_r = (lead_zero > 0) ? (width - trail_zero) : lead_one;
+    }
+
+    // 3. If the sum of leading zeros, trailing zeros and set bits is equal to
+    //    the bit width of the value, it can be encoded.
+    if (lead_zero + trail_zero + set_bits == width) {
+      return true;
+    }
+
+    // 4. If the sum of leading ones, trailing ones and unset bits in the
+    //    value is equal to the bit width of the value, it can be encoded.
+    if (lead_one + trail_one + (width - set_bits) == width) {
+      return true;
+    }
+
+    // 5. If the most-significant half of the bitwise value is equal to the
+    //    least-significant half, return to step 2 using the least-significant
+    //    half of the value.
+    uint64_t mask = (1UL << (width >> 1)) - 1;
+    if ((value & mask) == ((value >> (width >> 1)) & mask)) {
+      width >>= 1;
+      set_bits >>= 1;
+      imm_s_fixed >>= 1;
+      continue;
+    }
+
+    // 6. Otherwise, the value can't be encoded.
+    return false;
+  }
+}
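+// Worked example: 0x0000ffff0000ffff is a 32-bit pattern with its sixteen
+// least-significant bits set, repeated across the register, so it is encodable
+// with n = 0, imm_s = 15 and imm_r = 0.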
+
+bool Assembler::IsImmConditionalCompare(int64_t immediate) {
+  return is_uint5(immediate);
+}
+
+
+bool Assembler::IsImmFP32(float imm) {
+  // Valid values will have the form:
+  // aBbb.bbbc.defg.h000.0000.0000.0000.0000
+  uint32_t bits = float_to_rawbits(imm);
+  // bits[19..0] are cleared.
+  if ((bits & 0x7ffff) != 0) {
+    return false;
+  }
+
+  // bits[29..25] are all set or all cleared.
+  uint32_t b_pattern = (bits >> 16) & 0x3e00;
+  if (b_pattern != 0 && b_pattern != 0x3e00) {
+    return false;
+  }
+
+  // bit[30] and bit[29] are opposite.
+  if (((bits ^ (bits << 1)) & 0x40000000) == 0) {
+    return false;
+  }
+
+  return true;
+}
+
+
+bool Assembler::IsImmFP64(double imm) {
+  // Valid values will have the form:
+  // aBbb.bbbb.bbcd.efgh.0000.0000.0000.0000
+  // 0000.0000.0000.0000.0000.0000.0000.0000
+  uint64_t bits = double_to_rawbits(imm);
+  // bits[47..0] are cleared.
+  if ((bits & 0xffffffffffffL) != 0) {
+    return false;
+  }
+
+  // bits[61..54] are all set or all cleared.
+  uint32_t b_pattern = (bits >> 48) & 0x3fc0;
+  if (b_pattern != 0 && b_pattern != 0x3fc0) {
+    return false;
+  }
+
+  // bit[62] and bit[61] are opposite.
+  if (((bits ^ (bits << 1)) & 0x4000000000000000L) == 0) {
+    return false;
+  }
+
+  return true;
+}
+
+
+LoadStoreOp Assembler::LoadOpFor(const CPURegister& rt) {
+  ASSERT(rt.IsValid());
+  if (rt.IsRegister()) {
+    return rt.Is64Bits() ? LDR_x : LDR_w;
+  } else {
+    ASSERT(rt.IsFPRegister());
+    return rt.Is64Bits() ? LDR_d : LDR_s;
+  }
+}
+
+
+LoadStorePairOp Assembler::LoadPairOpFor(const CPURegister& rt,
+    const CPURegister& rt2) {
+  ASSERT(AreSameSizeAndType(rt, rt2));
+  USE(rt2);
+  if (rt.IsRegister()) {
+    return rt.Is64Bits() ? LDP_x : LDP_w;
+  } else {
+    ASSERT(rt.IsFPRegister());
+    return rt.Is64Bits() ? LDP_d : LDP_s;
+  }
+}
+
+
+LoadStoreOp Assembler::StoreOpFor(const CPURegister& rt) {
+  ASSERT(rt.IsValid());
+  if (rt.IsRegister()) {
+    return rt.Is64Bits() ? STR_x : STR_w;
+  } else {
+    ASSERT(rt.IsFPRegister());
+    return rt.Is64Bits() ? STR_d : STR_s;
+  }
+}
+
+
+LoadStorePairOp Assembler::StorePairOpFor(const CPURegister& rt,
+    const CPURegister& rt2) {
+  ASSERT(AreSameSizeAndType(rt, rt2));
+  USE(rt2);
+  if (rt.IsRegister()) {
+    return rt.Is64Bits() ? STP_x : STP_w;
+  } else {
+    ASSERT(rt.IsFPRegister());
+    return rt.Is64Bits() ? STP_d : STP_s;
+  }
+}
+
+
+LoadStorePairNonTemporalOp Assembler::LoadPairNonTemporalOpFor(
+    const CPURegister& rt, const CPURegister& rt2) {
+  ASSERT(AreSameSizeAndType(rt, rt2));
+  USE(rt2);
+  if (rt.IsRegister()) {
+    return rt.Is64Bits() ? LDNP_x : LDNP_w;
+  } else {
+    ASSERT(rt.IsFPRegister());
+    return rt.Is64Bits() ? LDNP_d : LDNP_s;
+  }
+}
+
+
+LoadStorePairNonTemporalOp Assembler::StorePairNonTemporalOpFor(
+    const CPURegister& rt, const CPURegister& rt2) {
+  ASSERT(AreSameSizeAndType(rt, rt2));
+  USE(rt2);
+  if (rt.IsRegister()) {
+    return rt.Is64Bits() ? STNP_x : STNP_w;
+  } else {
+    ASSERT(rt.IsFPRegister());
+    return rt.Is64Bits() ? STNP_d : STNP_s;
+  }
+}
+
+
+void Assembler::RecordLiteral(int64_t imm, unsigned size) {
+  literals_.push_front(new Literal(pc_, imm, size));
+}
+
+
+// Check if a literal pool should be emitted. Currently a literal pool is
+// emitted when:
+//  * the distance to the first literal load handled by this pool is greater
+//    than the recommended distance and the literal pool can be emitted without
+//    generating a jump over it.
+//  * the distance to the first literal load handled by this pool is greater
+//    than twice the recommended distance.
+// TODO: refine this heuristic using real world data.
+void Assembler::CheckLiteralPool(LiteralPoolEmitOption option) {
+  if (IsLiteralPoolBlocked()) {
+    // Literal pool emission is forbidden, no point in doing further checks.
+    return;
+  }
+
+  if (literals_.empty()) {
+    // No literal pool to emit.
+    next_literal_pool_check_ += kLiteralPoolCheckInterval;
+    return;
+  }
+
+  intptr_t distance = pc_ - literals_.back()->pc_;
+  if ((distance < kRecommendedLiteralPoolRange) ||
+      ((option == JumpRequired) &&
+       (distance < (2 * kRecommendedLiteralPoolRange)))) {
+    // We prefer not to have to jump over the literal pool.
+    next_literal_pool_check_ += kLiteralPoolCheckInterval;
+    return;
+  }
+
+  EmitLiteralPool(option);
+}
+
+
+void Assembler::EmitLiteralPool(LiteralPoolEmitOption option) {
+  // Prevent recursive calls while emitting the literal pool.
+  BlockLiteralPoolScope scope(this);
+
+  Label marker;
+  Label start_of_pool;
+  Label end_of_pool;
+
+  if (option == JumpRequired) {
+    b(&end_of_pool);
+  }
+
+  // Leave space for a literal pool marker. This is populated later, once the
+  // size of the pool is known.
+  bind(&marker);
+  nop();
+
+  // Now populate the literal pool.
+  bind(&start_of_pool);
+  std::list<Literal*>::iterator it;
+  for (it = literals_.begin(); it != literals_.end(); it++) {
+    // Update the load-literal instruction to point to this pool entry.
+    Instruction* load_literal = (*it)->pc_;
+    load_literal->SetImmLLiteral(pc_);
+    // Copy the data into the pool.
+    uint64_t value = (*it)->value_;
+    unsigned size = (*it)->size_;
+    ASSERT((size == kXRegSizeInBytes) || (size == kWRegSizeInBytes));
+    ASSERT((pc_ + size) <= (buffer_ + buffer_size_));
+    memcpy(pc_, &value, size);
+    pc_ += size;
+    delete *it;
+  }
+  literals_.clear();
+  bind(&end_of_pool);
+
+  // The pool size should always be a multiple of four bytes because that is the
+  // scaling applied by the LDR(literal) instruction, even for X-register loads.
+  ASSERT((SizeOfCodeGeneratedSince(&start_of_pool) % 4) == 0);
+  uint64_t pool_size = SizeOfCodeGeneratedSince(&start_of_pool) / 4;
+
+  // Literal pool marker indicating the size in words of the literal pool.
+  // We use a literal load to the zero register, the offset indicating the
+  // size in words. This instruction can encode a large enough offset to span
+  // the entire pool at its maximum size.
+  Instr marker_instruction = LDR_x_lit | ImmLLiteral(pool_size) | Rt(xzr);
+  memcpy(marker.target(), &marker_instruction, kInstructionSize);
+
+  next_literal_pool_check_ = pc_ + kLiteralPoolCheckInterval;
+}
+
+
+// Return the size in bytes required by the literal pool entries. This does
+// not include any marker or branch over the literal pool itself.
+size_t Assembler::LiteralPoolSize() {
+  size_t size = 0;
+
+  std::list<Literal*>::iterator it;
+  for (it = literals_.begin(); it != literals_.end(); it++) {
+    size += (*it)->size_;
+  }
+
+  return size;
+}
+
+
+bool AreAliased(const CPURegister& reg1, const CPURegister& reg2,
+                const CPURegister& reg3, const CPURegister& reg4,
+                const CPURegister& reg5, const CPURegister& reg6,
+                const CPURegister& reg7, const CPURegister& reg8) {
+  int number_of_valid_regs = 0;
+  int number_of_valid_fpregs = 0;
+
+  RegList unique_regs = 0;
+  RegList unique_fpregs = 0;
+
+  const CPURegister regs[] = {reg1, reg2, reg3, reg4, reg5, reg6, reg7, reg8};
+
+  for (unsigned i = 0; i < sizeof(regs) / sizeof(regs[0]); i++) {
+    if (regs[i].IsRegister()) {
+      number_of_valid_regs++;
+      unique_regs |= regs[i].Bit();
+    } else if (regs[i].IsFPRegister()) {
+      number_of_valid_fpregs++;
+      unique_fpregs |= regs[i].Bit();
+    } else {
+      ASSERT(!regs[i].IsValid());
+    }
+  }
+
+  int number_of_unique_regs =
+    CountSetBits(unique_regs, sizeof(unique_regs) * 8);
+  int number_of_unique_fpregs =
+    CountSetBits(unique_fpregs, sizeof(unique_fpregs) * 8);
+
+  ASSERT(number_of_valid_regs >= number_of_unique_regs);
+  ASSERT(number_of_valid_fpregs >= number_of_unique_fpregs);
+
+  return (number_of_valid_regs != number_of_unique_regs) ||
+         (number_of_valid_fpregs != number_of_unique_fpregs);
+}
+
+
+bool AreSameSizeAndType(const CPURegister& reg1, const CPURegister& reg2,
+                        const CPURegister& reg3, const CPURegister& reg4,
+                        const CPURegister& reg5, const CPURegister& reg6,
+                        const CPURegister& reg7, const CPURegister& reg8) {
+  ASSERT(reg1.IsValid());
+  bool match = true;
+  match &= !reg2.IsValid() || reg2.IsSameSizeAndType(reg1);
+  match &= !reg3.IsValid() || reg3.IsSameSizeAndType(reg1);
+  match &= !reg4.IsValid() || reg4.IsSameSizeAndType(reg1);
+  match &= !reg5.IsValid() || reg5.IsSameSizeAndType(reg1);
+  match &= !reg6.IsValid() || reg6.IsSameSizeAndType(reg1);
+  match &= !reg7.IsValid() || reg7.IsSameSizeAndType(reg1);
+  match &= !reg8.IsValid() || reg8.IsSameSizeAndType(reg1);
+  return match;
+}
+
+
+}  // namespace vixl
diff --git a/src/a64/assembler-a64.h b/src/a64/assembler-a64.h
new file mode 100644
index 0000000..ebc744a
--- /dev/null
+++ b/src/a64/assembler-a64.h
@@ -0,0 +1,1778 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_A64_ASSEMBLER_A64_H_
+#define VIXL_A64_ASSEMBLER_A64_H_
+
+#include <list>
+
+#include "globals.h"
+#include "utils.h"
+#include "a64/instructions-a64.h"
+
+namespace vixl {
+
+typedef uint64_t RegList;
+static const int kRegListSizeInBits = sizeof(RegList) * 8;
+
+// Registers.
+
+// Some CPURegister methods can return Register and FPRegister types, so we
+// need to declare them in advance.
+class Register;
+class FPRegister;
+
+
+class CPURegister {
+ public:
+  enum RegisterType {
+    // The kInvalid value is used to detect uninitialized static instances,
+    // which are always zero-initialized before any constructors are called.
+    kInvalid = 0,
+    kRegister,
+    kFPRegister,
+    kNoRegister
+  };
+
+  CPURegister() : code_(0), size_(0), type_(kNoRegister) {
+    ASSERT(!IsValid());
+    ASSERT(IsNone());
+  }
+
+  CPURegister(unsigned code, unsigned size, RegisterType type)
+      : code_(code), size_(size), type_(type) {
+    ASSERT(IsValidOrNone());
+  }
+
+  unsigned code() const {
+    ASSERT(IsValid());
+    return code_;
+  }
+
+  RegisterType type() const {
+    ASSERT(IsValidOrNone());
+    return type_;
+  }
+
+  RegList Bit() const {
+    ASSERT(code_ < (sizeof(RegList) * 8));
+    return IsValid() ? (static_cast<RegList>(1) << code_) : 0;
+  }
+
+  unsigned size() const {
+    ASSERT(IsValid());
+    return size_;
+  }
+
+  int SizeInBytes() const {
+    ASSERT(IsValid());
+    ASSERT(size() % 8 == 0);
+    return size_ / 8;
+  }
+
+  int SizeInBits() const {
+    ASSERT(IsValid());
+    return size_;
+  }
+
+  bool Is32Bits() const {
+    ASSERT(IsValid());
+    return size_ == 32;
+  }
+
+  bool Is64Bits() const {
+    ASSERT(IsValid());
+    return size_ == 64;
+  }
+
+  bool IsValid() const {
+    if (IsValidRegister() || IsValidFPRegister()) {
+      ASSERT(!IsNone());
+      return true;
+    } else {
+      ASSERT(IsNone());
+      return false;
+    }
+  }
+
+  bool IsValidRegister() const {
+    return IsRegister() &&
+           ((size_ == kWRegSize) || (size_ == kXRegSize)) &&
+           ((code_ < kNumberOfRegisters) || (code_ == kSPRegInternalCode));
+  }
+
+  bool IsValidFPRegister() const {
+    return IsFPRegister() &&
+           ((size_ == kSRegSize) || (size_ == kDRegSize)) &&
+           (code_ < kNumberOfFPRegisters);
+  }
+
+  bool IsNone() const {
+    // kNoRegister types should always have size 0 and code 0.
+    ASSERT((type_ != kNoRegister) || (code_ == 0));
+    ASSERT((type_ != kNoRegister) || (size_ == 0));
+
+    return type_ == kNoRegister;
+  }
+
+  bool Is(const CPURegister& other) const {
+    ASSERT(IsValidOrNone() && other.IsValidOrNone());
+    return (code_ == other.code_) && (size_ == other.size_) &&
+           (type_ == other.type_);
+  }
+
+  inline bool IsZero() const {
+    ASSERT(IsValid());
+    return IsRegister() && (code_ == kZeroRegCode);
+  }
+
+  inline bool IsSP() const {
+    ASSERT(IsValid());
+    return IsRegister() && (code_ == kSPRegInternalCode);
+  }
+
+  inline bool IsRegister() const {
+    return type_ == kRegister;
+  }
+
+  inline bool IsFPRegister() const {
+    return type_ == kFPRegister;
+  }
+
+  const Register& W() const;
+  const Register& X() const;
+  const FPRegister& S() const;
+  const FPRegister& D() const;
+
+  inline bool IsSameSizeAndType(const CPURegister& other) const {
+    return (size_ == other.size_) && (type_ == other.type_);
+  }
+
+ protected:
+  unsigned code_;
+  unsigned size_;
+  RegisterType type_;
+
+ private:
+  bool IsValidOrNone() const {
+    return IsValid() || IsNone();
+  }
+};
+
+
+class Register : public CPURegister {
+ public:
+  explicit Register() : CPURegister() {}
+  inline explicit Register(const CPURegister& other)
+      : CPURegister(other.code(), other.size(), other.type()) {
+    ASSERT(IsValidRegister());
+  }
+  explicit Register(unsigned code, unsigned size)
+      : CPURegister(code, size, kRegister) {}
+
+  bool IsValid() const {
+    ASSERT(IsRegister() || IsNone());
+    return IsValidRegister();
+  }
+
+  static const Register& WRegFromCode(unsigned code);
+  static const Register& XRegFromCode(unsigned code);
+
+  // V8 compatibility.
+  static const int kNumRegisters = kNumberOfRegisters;
+  static const int kNumAllocatableRegisters = kNumberOfRegisters - 1;
+
+ private:
+  static const Register wregisters[];
+  static const Register xregisters[];
+};
+
+
+class FPRegister : public CPURegister {
+ public:
+  inline FPRegister() : CPURegister() {}
+  inline explicit FPRegister(const CPURegister& other)
+      : CPURegister(other.code(), other.size(), other.type()) {
+    ASSERT(IsValidFPRegister());
+  }
+  inline FPRegister(unsigned code, unsigned size)
+      : CPURegister(code, size, kFPRegister) {}
+
+  bool IsValid() const {
+    ASSERT(IsFPRegister() || IsNone());
+    return IsValidFPRegister();
+  }
+
+  static const FPRegister& SRegFromCode(unsigned code);
+  static const FPRegister& DRegFromCode(unsigned code);
+
+  // V8 compatibility.
+  static const int kNumRegisters = kNumberOfFPRegisters;
+  static const int kNumAllocatableRegisters = kNumberOfFPRegisters - 1;
+
+ private:
+  static const FPRegister sregisters[];
+  static const FPRegister dregisters[];
+};
+
+
+// No*Reg is used to indicate an unused argument, or an error case. Note that
+// these all compare equal (using the Is() method). The Register and FPRegister
+// variants are provided for convenience.
+const Register NoReg;
+const FPRegister NoFPReg;
+const CPURegister NoCPUReg;
+
+
+#define DEFINE_REGISTERS(N)  \
+const Register w##N(N, kWRegSize);  \
+const Register x##N(N, kXRegSize);
+REGISTER_CODE_LIST(DEFINE_REGISTERS)
+#undef DEFINE_REGISTERS
+const Register wsp(kSPRegInternalCode, kWRegSize);
+const Register sp(kSPRegInternalCode, kXRegSize);
+
+
+#define DEFINE_FPREGISTERS(N)  \
+const FPRegister s##N(N, kSRegSize);  \
+const FPRegister d##N(N, kDRegSize);
+REGISTER_CODE_LIST(DEFINE_FPREGISTERS)
+#undef DEFINE_FPREGISTERS
+
+
+// Register aliases.
+const Register ip0 = x16;
+const Register ip1 = x17;
+const Register lr = x30;
+const Register xzr = x31;
+const Register wzr = w31;
+
+
+// AreAliased returns true if any of the named registers overlap. Arguments
+// set to NoReg are ignored. The system stack pointer may be specified.
+bool AreAliased(const CPURegister& reg1,
+                const CPURegister& reg2,
+                const CPURegister& reg3 = NoReg,
+                const CPURegister& reg4 = NoReg,
+                const CPURegister& reg5 = NoReg,
+                const CPURegister& reg6 = NoReg,
+                const CPURegister& reg7 = NoReg,
+                const CPURegister& reg8 = NoReg);
+
+
+// AreSameSizeAndType returns true if all of the specified registers have the
+// same size, and are of the same type. The system stack pointer may be
+// specified. Arguments set to NoReg are ignored, as are any subsequent
+// arguments. At least one argument (reg1) must be valid (not NoCPUReg).
+bool AreSameSizeAndType(const CPURegister& reg1,
+                        const CPURegister& reg2,
+                        const CPURegister& reg3 = NoCPUReg,
+                        const CPURegister& reg4 = NoCPUReg,
+                        const CPURegister& reg5 = NoCPUReg,
+                        const CPURegister& reg6 = NoCPUReg,
+                        const CPURegister& reg7 = NoCPUReg,
+                        const CPURegister& reg8 = NoCPUReg);
+
+
+// Lists of registers.
+class CPURegList {
+ public:
+  inline explicit CPURegList(CPURegister reg1,
+                             CPURegister reg2 = NoCPUReg,
+                             CPURegister reg3 = NoCPUReg,
+                             CPURegister reg4 = NoCPUReg)
+      : list_(reg1.Bit() | reg2.Bit() | reg3.Bit() | reg4.Bit()),
+        size_(reg1.size()), type_(reg1.type()) {
+    ASSERT(AreSameSizeAndType(reg1, reg2, reg3, reg4));
+    ASSERT(IsValid());
+  }
+
+  inline CPURegList(CPURegister::RegisterType type, unsigned size, RegList list)
+      : list_(list), size_(size), type_(type) {
+    ASSERT(IsValid());
+  }
+
+  inline CPURegList(CPURegister::RegisterType type, unsigned size,
+                    unsigned first_reg, unsigned last_reg)
+      : size_(size), type_(type) {
+    ASSERT(((type == CPURegister::kRegister) &&
+            (last_reg < kNumberOfRegisters)) ||
+           ((type == CPURegister::kFPRegister) &&
+            (last_reg < kNumberOfFPRegisters)));
+    ASSERT(last_reg >= first_reg);
+    list_ = (1UL << (last_reg + 1)) - 1;
+    list_ &= ~((1UL << first_reg) - 1);
+    ASSERT(IsValid());
+  }
+
+  inline CPURegister::RegisterType type() const {
+    ASSERT(IsValid());
+    return type_;
+  }
+
+  // Combine another CPURegList into this one. Registers that already exist in
+  // this list are left unchanged. The type and size of the registers in the
+  // 'other' list must match those in this list.
+  void Combine(const CPURegList& other) {
+    ASSERT(IsValid());
+    ASSERT(other.type() == type_);
+    ASSERT(other.RegisterSizeInBits() == size_);
+    list_ |= other.list();
+  }
+
+  // Remove every register in the other CPURegList from this one. Registers that
+  // do not exist in this list are ignored. The type and size of the registers
+  // in the 'other' list must match those in this list.
+  void Remove(const CPURegList& other) {
+    ASSERT(IsValid());
+    ASSERT(other.type() == type_);
+    ASSERT(other.RegisterSizeInBits() == size_);
+    list_ &= ~other.list();
+  }
+
+  // Variants of Combine and Remove which take a single register.
+  inline void Combine(const CPURegister& other) {
+    ASSERT(other.type() == type_);
+    ASSERT(other.size() == size_);
+    Combine(other.code());
+  }
+
+  inline void Remove(const CPURegister& other) {
+    ASSERT(other.type() == type_);
+    ASSERT(other.size() == size_);
+    Remove(other.code());
+  }
+
+  // Variants of Combine and Remove which take a single register by its code;
+  // the type and size of the register are inferred from this list.
+  inline void Combine(int code) {
+    ASSERT(IsValid());
+    ASSERT(CPURegister(code, size_, type_).IsValid());
+    list_ |= (1UL << code);
+  }
+
+  inline void Remove(int code) {
+    ASSERT(IsValid());
+    ASSERT(CPURegister(code, size_, type_).IsValid());
+    list_ &= ~(1UL << code);
+  }
+
+  inline RegList list() const {
+    ASSERT(IsValid());
+    return list_;
+  }
+
+  // Remove all callee-saved registers from the list. This can be useful when
+  // preparing registers for an AAPCS64 function call, for example.
+  void RemoveCalleeSaved();
+
+  CPURegister PopLowestIndex();
+  CPURegister PopHighestIndex();
+
+  // AAPCS64 callee-saved registers.
+  static CPURegList GetCalleeSaved(unsigned size = kXRegSize);
+  static CPURegList GetCalleeSavedFP(unsigned size = kDRegSize);
+
+  // AAPCS64 caller-saved registers. Note that this includes lr.
+  static CPURegList GetCallerSaved(unsigned size = kXRegSize);
+  static CPURegList GetCallerSavedFP(unsigned size = kDRegSize);
+
+  inline bool IsEmpty() const {
+    ASSERT(IsValid());
+    return list_ == 0;
+  }
+
+  inline bool IncludesAliasOf(const CPURegister& other) const {
+    ASSERT(IsValid());
+    return (type_ == other.type()) && (other.Bit() & list_);
+  }
+
+  inline int Count() const {
+    ASSERT(IsValid());
+    return CountSetBits(list_, kRegListSizeInBits);
+  }
+
+  inline unsigned RegisterSizeInBits() const {
+    ASSERT(IsValid());
+    return size_;
+  }
+
+  inline unsigned RegisterSizeInBytes() const {
+    int size_in_bits = RegisterSizeInBits();
+    ASSERT((size_in_bits % 8) == 0);
+    return size_in_bits / 8;
+  }
+
+ private:
+  RegList list_;
+  unsigned size_;
+  CPURegister::RegisterType type_;
+
+  bool IsValid() const;
+};
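+// For example, a list holding the X registers x19-x28 can be written as
+//   CPURegList(CPURegister::kRegister, kXRegSize, 19, 28)
+// or built up from individual registers with Combine().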
+
+
+// AAPCS64 callee-saved registers.
+extern const CPURegList kCalleeSaved;
+extern const CPURegList kCalleeSavedFP;
+
+
+// AAPCS64 caller-saved registers. Note that this includes lr.
+extern const CPURegList kCallerSaved;
+extern const CPURegList kCallerSavedFP;
+
+
+// Operand.
+class Operand {
+ public:
+  // #<immediate>
+  // where <immediate> is int64_t.
+  // This is allowed to be an implicit constructor because Operand is
+  // a wrapper class that doesn't normally perform any type conversion.
+  Operand(int64_t immediate);           // NOLINT(runtime/explicit)
+
+  // rm, {<shift> #<shift_amount>}
+  // where <shift> is one of {LSL, LSR, ASR, ROR}.
+  //       <shift_amount> is uint6_t.
+  // This is allowed to be an implicit constructor because Operand is
+  // a wrapper class that doesn't normally perform any type conversion.
+  Operand(Register reg,
+          Shift shift = LSL,
+          unsigned shift_amount = 0);   // NOLINT(runtime/explicit)
+
+  // rm, {<extend> {#<shift_amount>}}
+  // where <extend> is one of {UXTB, UXTH, UXTW, UXTX, SXTB, SXTH, SXTW, SXTX}.
+  //       <shift_amount> is uint2_t.
+  explicit Operand(Register reg, Extend extend, unsigned shift_amount = 0);
+
+  bool IsImmediate() const;
+  bool IsShiftedRegister() const;
+  bool IsExtendedRegister() const;
+
+  // This returns an LSL shift (<= 4) operand as an equivalent extend operand,
+  // which helps in the encoding of instructions that use the stack pointer.
+  Operand ToExtendedRegister() const;
+
+  int64_t immediate() const {
+    ASSERT(IsImmediate());
+    return immediate_;
+  }
+
+  Register reg() const {
+    ASSERT(IsShiftedRegister() || IsExtendedRegister());
+    return reg_;
+  }
+
+  Shift shift() const {
+    ASSERT(IsShiftedRegister());
+    return shift_;
+  }
+
+  Extend extend() const {
+    ASSERT(IsExtendedRegister());
+    return extend_;
+  }
+
+  unsigned shift_amount() const {
+    ASSERT(IsShiftedRegister() || IsExtendedRegister());
+    return shift_amount_;
+  }
+
+ private:
+  int64_t immediate_;
+  Register reg_;
+  Shift shift_;
+  Extend extend_;
+  unsigned shift_amount_;
+};
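+// Example operand forms accepted by the constructors above:
+//   Operand(0x1234)        // #0x1234
+//   Operand(x1)            // x1 (LSL #0)
+//   Operand(w2, LSR, 3)    // w2, lsr #3
+//   Operand(w3, SXTH)      // w3, sxth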
+
+
+// MemOperand represents the addressing mode of a load or store instruction.
+class MemOperand {
+ public:
+  explicit MemOperand(Register base,
+                      ptrdiff_t offset = 0,
+                      AddrMode addrmode = Offset);
+  explicit MemOperand(Register base,
+                      Register regoffset,
+                      Shift shift = LSL,
+                      unsigned shift_amount = 0);
+  explicit MemOperand(Register base,
+                      Register regoffset,
+                      Extend extend,
+                      unsigned shift_amount = 0);
+  explicit MemOperand(Register base,
+                      const Operand& offset,
+                      AddrMode addrmode = Offset);
+
+  const Register& base() const { return base_; }
+  const Register& regoffset() const { return regoffset_; }
+  ptrdiff_t offset() const { return offset_; }
+  AddrMode addrmode() const { return addrmode_; }
+  Shift shift() const { return shift_; }
+  Extend extend() const { return extend_; }
+  unsigned shift_amount() const { return shift_amount_; }
+  bool IsImmediateOffset() const;
+  bool IsRegisterOffset() const;
+  bool IsPreIndex() const;
+  bool IsPostIndex() const;
+
+ private:
+  Register base_;
+  Register regoffset_;
+  ptrdiff_t offset_;
+  AddrMode addrmode_;
+  Shift shift_;
+  Extend extend_;
+  unsigned shift_amount_;
+};
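+// Example addressing modes accepted by the constructors above (PreIndex and
+// PostIndex are AddrMode values):
+//   MemOperand(x0)                  // [x0]
+//   MemOperand(x0, 8)               // [x0, #8]
+//   MemOperand(x0, 16, PreIndex)    // [x0, #16]!
+//   MemOperand(x0, 16, PostIndex)   // [x0], #16
+//   MemOperand(x0, w1, UXTW, 2)     // [x0, w1, uxtw #2]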
+
+
+class Label {
+ public:
+  Label() : is_bound_(false), link_(NULL), target_(NULL) {}
+  ~Label() {
+    // If the label has been linked to, it needs to be bound to a target.
+    ASSERT(!IsLinked() || IsBound());
+  }
+
+  inline Instruction* link() const { return link_; }
+  inline Instruction* target() const { return target_; }
+
+  inline bool IsBound() const { return is_bound_; }
+  inline bool IsLinked() const { return link_ != NULL; }
+
+  inline void set_link(Instruction* new_link) { link_ = new_link; }
+
+  static const int kEndOfChain = 0;
+
+ private:
+  // Indicates if the label has been bound, i.e. its location is fixed.
+  bool is_bound_;
+  // Branch instructions branching to this label form a chained list, with
+  // their offset indicating where the next instruction is located.
+  // link_ points to the most recent branch instruction generated that branches
+  // to this label.
+  // If link_ is not NULL, the label has been linked to.
+  Instruction* link_;
+  // The label location.
+  Instruction* target_;
+
+  friend class Assembler;
+};
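+// Typical use, assuming an Assembler instance 'assm': a label can be branched
+// to before it is bound, for example:
+//   Label done;
+//   assm.cbz(x0, &done);   // Forward branch; 'done' is linked but not bound.
+//   ...
+//   assm.bind(&done);      // Binding resolves the branch above.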
+
+
+// TODO: Obtain better values for these, based on real-world data.
+const int kLiteralPoolCheckInterval = 4 * KBytes;
+const int kRecommendedLiteralPoolRange = 2 * kLiteralPoolCheckInterval;
+
+
+// Control whether a branch over the literal pool should also be emitted. This
+// is needed if the literal pool has to be emitted in the middle of the JITted
+// code.
+enum LiteralPoolEmitOption {
+  JumpRequired,
+  NoJumpRequired
+};
+
+
+// Literal pool entry.
+class Literal {
+ public:
+  Literal(Instruction* pc, uint64_t imm, unsigned size)
+      : pc_(pc), value_(imm), size_(size) {}
+
+ private:
+  Instruction* pc_;
+  int64_t value_;
+  unsigned size_;
+
+  friend class Assembler;
+};
+
+
+// Assembler.
+class Assembler {
+ public:
+  Assembler(byte* buffer, unsigned buffer_size);
+
+  // The destructor asserts that one of the following is true:
+  //  * The Assembler object has not been used.
+  //  * Nothing has been emitted since the last Reset() call.
+  //  * Nothing has been emitted since the last FinalizeCode() call.
+  ~Assembler();
+
+  // System functions.
+
+  // Start generating code from the beginning of the buffer, discarding any code
+  // and data that has already been emitted into the buffer.
+  //
+  // In order to avoid any accidental transfer of state, Reset ASSERTs that the
+  // literal pool is not blocked.
+  void Reset();
+
+  // Finalize a code buffer of generated instructions. This function must be
+  // called before executing or copying code from the buffer.
+  void FinalizeCode();
+
+  // Label.
+  // Bind a label to the current PC.
+  void bind(Label* label);
+  int UpdateAndGetByteOffsetTo(Label* label);
+  inline int UpdateAndGetInstructionOffsetTo(Label* label) {
+    ASSERT(Label::kEndOfChain == 0);
+    return UpdateAndGetByteOffsetTo(label) >> kInstructionSizeLog2;
+  }
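+
+  // A minimal usage sketch, assuming a caller-allocated buffer; the register
+  // and operand names are the standard VIXL definitions:
+  //
+  //   byte buffer[1024];
+  //   Assembler masm(buffer, sizeof(buffer));
+  //   Label loop;
+  //   masm.bind(&loop);
+  //   masm.sub(x0, x0, Operand(1));
+  //   masm.cbnz(x0, &loop);
+  //   masm.ret();
+  //   masm.FinalizeCode();
+  //
+  // Labels may also be branched to before they are bound; bind() then fixes
+  // up the chain of linked branch instructions.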
+
+
+  // Instruction set functions.
+
+  // Branch / Jump instructions.
+  // Branch to register.
+  void br(const Register& xn);
+
+  // Branch with link to register.
+  void blr(const Register& xn);
+
+  // Branch to register with return hint.
+  void ret(const Register& xn = lr);
+
+  // Branch to label.
+  void b(Label* label, Condition cond = al);
+
+  // Branch to PC offset.
+  void b(int imm26, Condition cond = al);
+
+  // Branch with link to label.
+  void bl(Label* label);
+
+  // Branch with link to PC offset.
+  void bl(int imm26);
+
+  // Compare and branch to label if zero.
+  void cbz(const Register& rt, Label* label);
+
+  // Compare and branch to PC offset if zero.
+  void cbz(const Register& rt, int imm19);
+
+  // Compare and branch to label if not zero.
+  void cbnz(const Register& rt, Label* label);
+
+  // Compare and branch to PC offset if not zero.
+  void cbnz(const Register& rt, int imm19);
+
+  // Test bit and branch to label if zero.
+  void tbz(const Register& rt, unsigned bit_pos, Label* label);
+
+  // Test bit and branch to PC offset if zero.
+  void tbz(const Register& rt, unsigned bit_pos, int imm14);
+
+  // Test bit and branch to label if not zero.
+  void tbnz(const Register& rt, unsigned bit_pos, Label* label);
+
+  // Test bit and branch to PC offset if not zero.
+  void tbnz(const Register& rt, unsigned bit_pos, int imm14);
+
+  // Address calculation instructions.
+  // Calculate a PC-relative address. Unlike branch offsets, the offset used
+  // by adr is unscaled (i.e. the result can be unaligned).
+
+  // Calculate the address of a label.
+  void adr(const Register& rd, Label* label);
+
+  // Calculate the address of a PC offset.
+  void adr(const Register& rd, int imm21);
+
+  // Data Processing instructions.
+  // Add.
+  void add(const Register& rd,
+           const Register& rn,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+
+  // Compare negative.
+  void cmn(const Register& rn, const Operand& operand);
+
+  // Subtract.
+  void sub(const Register& rd,
+           const Register& rn,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+
+  // Compare.
+  void cmp(const Register& rn, const Operand& operand);
+
+  // Negate.
+  void neg(const Register& rd,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+
+  // Add with carry bit.
+  void adc(const Register& rd,
+           const Register& rn,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+
+  // Subtract with carry bit.
+  void sbc(const Register& rd,
+           const Register& rn,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+
+  // Negate with carry bit.
+  void ngc(const Register& rd,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+
+  // Logical instructions.
+  // Bitwise and (A & B).
+  void and_(const Register& rd,
+            const Register& rn,
+            const Operand& operand,
+            FlagsUpdate S = LeaveFlags);
+
+  // Bit test and set flags.
+  void tst(const Register& rn, const Operand& operand);
+
+  // Bit clear (A & ~B).
+  void bic(const Register& rd,
+           const Register& rn,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+
+  // Bitwise or (A | B).
+  void orr(const Register& rd, const Register& rn, const Operand& operand);
+
+  // Bitwise or-not (A | ~B).
+  void orn(const Register& rd, const Register& rn, const Operand& operand);
+
+  // Bitwise eor/xor (A ^ B).
+  void eor(const Register& rd, const Register& rn, const Operand& operand);
+
+  // Bitwise eon/xnor (A ^ ~B).
+  void eon(const Register& rd, const Register& rn, const Operand& operand);
+
+  // Logical shift left by variable.
+  void lslv(const Register& rd, const Register& rn, const Register& rm);
+
+  // Logical shift right by variable.
+  void lsrv(const Register& rd, const Register& rn, const Register& rm);
+
+  // Arithmetic shift right by variable.
+  void asrv(const Register& rd, const Register& rn, const Register& rm);
+
+  // Rotate right by variable.
+  void rorv(const Register& rd, const Register& rn, const Register& rm);
+
+  // Bitfield instructions.
+  // Bitfield move.
+  void bfm(const Register& rd,
+           const Register& rn,
+           unsigned immr,
+           unsigned imms);
+
+  // Signed bitfield move.
+  void sbfm(const Register& rd,
+            const Register& rn,
+            unsigned immr,
+            unsigned imms);
+
+  // Unsigned bitfield move.
+  void ubfm(const Register& rd,
+            const Register& rn,
+            unsigned immr,
+            unsigned imms);
+
+  // Bfm aliases.
+  // Bitfield insert.
+  inline void bfi(const Register& rd,
+                  const Register& rn,
+                  unsigned lsb,
+                  unsigned width) {
+    ASSERT(width >= 1);
+    ASSERT(lsb + width <= rn.size());
+    bfm(rd, rn, (rd.size() - lsb) & (rd.size() - 1), width - 1);
+  }
+
+  // Bitfield extract and insert low.
+  inline void bfxil(const Register& rd,
+                    const Register& rn,
+                    unsigned lsb,
+                    unsigned width) {
+    ASSERT(width >= 1);
+    ASSERT(lsb + width <= rn.size());
+    bfm(rd, rn, lsb, lsb + width - 1);
+  }
+
+  // Sbfm aliases.
+  // Arithmetic shift right.
+  inline void asr(const Register& rd, const Register& rn, unsigned shift) {
+    ASSERT(shift < rd.size());
+    sbfm(rd, rn, shift, rd.size() - 1);
+  }
+
+  // Signed bitfield insert with zero at right.
+  inline void sbfiz(const Register& rd,
+                    const Register& rn,
+                    unsigned lsb,
+                    unsigned width) {
+    ASSERT(width >= 1);
+    ASSERT(lsb + width <= rn.size());
+    sbfm(rd, rn, (rd.size() - lsb) & (rd.size() - 1), width - 1);
+  }
+
+  // Signed bitfield extract.
+  inline void sbfx(const Register& rd,
+                   const Register& rn,
+                   unsigned lsb,
+                   unsigned width) {
+    ASSERT(width >= 1);
+    ASSERT(lsb + width <= rn.size());
+    sbfm(rd, rn, lsb, lsb + width - 1);
+  }
+
+  // Signed extend byte.
+  inline void sxtb(const Register& rd, const Register& rn) {
+    sbfm(rd, rn, 0, 7);
+  }
+
+  // Signed extend halfword.
+  inline void sxth(const Register& rd, const Register& rn) {
+    sbfm(rd, rn, 0, 15);
+  }
+
+  // Signed extend word.
+  inline void sxtw(const Register& rd, const Register& rn) {
+    sbfm(rd, rn, 0, 31);
+  }
+
+  // Ubfm aliases.
+  // Logical shift left.
+  inline void lsl(const Register& rd, const Register& rn, unsigned shift) {
+    unsigned reg_size = rd.size();
+    ASSERT(shift < reg_size);
+    ubfm(rd, rn, (reg_size - shift) % reg_size, reg_size - shift - 1);
+  }
+
+  // Logical shift right.
+  inline void lsr(const Register& rd, const Register& rn, unsigned shift) {
+    ASSERT(shift < rd.size());
+    ubfm(rd, rn, shift, rd.size() - 1);
+  }
+
+  // Unsigned bitfield insert with zero at right.
+  inline void ubfiz(const Register& rd,
+                    const Register& rn,
+                    unsigned lsb,
+                    unsigned width) {
+    ASSERT(width >= 1);
+    ASSERT(lsb + width <= rn.size());
+    ubfm(rd, rn, (rd.size() - lsb) & (rd.size() - 1), width - 1);
+  }
+
+  // Unsigned bitfield extract.
+  inline void ubfx(const Register& rd,
+                   const Register& rn,
+                   unsigned lsb,
+                   unsigned width) {
+    ASSERT(width >= 1);
+    ASSERT(lsb + width <= rn.size());
+    ubfm(rd, rn, lsb, lsb + width - 1);
+  }
+
+  // Unsigned extend byte.
+  inline void uxtb(const Register& rd, const Register& rn) {
+    ubfm(rd, rn, 0, 7);
+  }
+
+  // Unsigned extend halfword.
+  inline void uxth(const Register& rd, const Register& rn) {
+    ubfm(rd, rn, 0, 15);
+  }
+
+  // Unsigned extend word.
+  inline void uxtw(const Register& rd, const Register& rn) {
+    ubfm(rd, rn, 0, 31);
+  }
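+
+  // As a sketch of how these aliases map onto the underlying instruction, the
+  // following two calls emit the same encoding for an X register:
+  //
+  //   lsl(x0, x1, 4);
+  //   ubfm(x0, x1, 60, 59);   // immr = (64 - 4) % 64, imms = 64 - 4 - 1.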
+
+  // Extract.
+  void extr(const Register& rd,
+            const Register& rn,
+            const Register& rm,
+            unsigned lsb);
+
+  // Conditional select: rd = cond ? rn : rm.
+  void csel(const Register& rd,
+            const Register& rn,
+            const Register& rm,
+            Condition cond);
+
+  // Conditional select increment: rd = cond ? rn : rm + 1.
+  void csinc(const Register& rd,
+             const Register& rn,
+             const Register& rm,
+             Condition cond);
+
+  // Conditional select inversion: rd = cond ? rn : ~rm.
+  void csinv(const Register& rd,
+             const Register& rn,
+             const Register& rm,
+             Condition cond);
+
+  // Conditional select negation: rd = cond ? rn : -rm.
+  void csneg(const Register& rd,
+             const Register& rn,
+             const Register& rm,
+             Condition cond);
+
+  // Conditional set: rd = cond ? 1 : 0.
+  void cset(const Register& rd, Condition cond);
+
+  // Conditional set mask: rd = cond ? -1 : 0.
+  void csetm(const Register& rd, Condition cond);
+
+  // Conditional increment: rd = cond ? rn + 1 : rn.
+  void cinc(const Register& rd, const Register& rn, Condition cond);
+
+  // Conditional invert: rd = cond ? ~rn : rn.
+  void cinv(const Register& rd, const Register& rn, Condition cond);
+
+  // Conditional negate: rd = cond ? -rn : rn.
+  void cneg(const Register& rd, const Register& rn, Condition cond);
+
+  // Rotate right.
+  inline void ror(const Register& rd, const Register& rs, unsigned shift) {
+    extr(rd, rs, rs, shift);
+  }
+
+  // Conditional comparison.
+  // Conditional compare negative.
+  void ccmn(const Register& rn,
+            const Operand& operand,
+            StatusFlags nzcv,
+            Condition cond);
+
+  // Conditional compare.
+  void ccmp(const Register& rn,
+            const Operand& operand,
+            StatusFlags nzcv,
+            Condition cond);
+
+  // Multiply.
+  void mul(const Register& rd, const Register& rn, const Register& rm);
+
+  // Negated multiply.
+  void mneg(const Register& rd, const Register& rn, const Register& rm);
+
+  // Signed long multiply: 32 x 32 -> 64-bit.
+  void smull(const Register& rd, const Register& rn, const Register& rm);
+
+  // Signed multiply high: 64 x 64 -> 64-bit <127:64>.
+  void smulh(const Register& xd, const Register& xn, const Register& xm);
+
+  // Multiply and accumulate.
+  void madd(const Register& rd,
+            const Register& rn,
+            const Register& rm,
+            const Register& ra);
+
+  // Multiply and subtract.
+  void msub(const Register& rd,
+            const Register& rn,
+            const Register& rm,
+            const Register& ra);
+
+  // Signed long multiply and accumulate: 32 x 32 + 64 -> 64-bit.
+  void smaddl(const Register& rd,
+              const Register& rn,
+              const Register& rm,
+              const Register& ra);
+
+  // Unsigned long multiply and accumulate: 32 x 32 + 64 -> 64-bit.
+  void umaddl(const Register& rd,
+              const Register& rn,
+              const Register& rm,
+              const Register& ra);
+
+  // Signed long multiply and subtract: 64 - (32 x 32) -> 64-bit.
+  void smsubl(const Register& rd,
+              const Register& rn,
+              const Register& rm,
+              const Register& ra);
+
+  // Unsigned long multiply and subtract: 64 - (32 x 32) -> 64-bit.
+  void umsubl(const Register& rd,
+              const Register& rn,
+              const Register& rm,
+              const Register& ra);
+
+  // Signed integer divide.
+  void sdiv(const Register& rd, const Register& rn, const Register& rm);
+
+  // Unsigned integer divide.
+  void udiv(const Register& rd, const Register& rn, const Register& rm);
+
+  // Bit reverse.
+  void rbit(const Register& rd, const Register& rn);
+
+  // Reverse bytes in 16-bit half words.
+  void rev16(const Register& rd, const Register& rn);
+
+  // Reverse bytes in 32-bit words.
+  void rev32(const Register& rd, const Register& rn);
+
+  // Reverse bytes.
+  void rev(const Register& rd, const Register& rn);
+
+  // Count leading zeroes.
+  void clz(const Register& rd, const Register& rn);
+
+  // Count leading sign bits.
+  void cls(const Register& rd, const Register& rn);
+
+  // Memory instructions.
+  // Load integer or FP register.
+  void ldr(const CPURegister& rt, const MemOperand& src);
+
+  // Store integer or FP register.
+  void str(const CPURegister& rt, const MemOperand& dst);
+
+  // Load word with sign extension.
+  void ldrsw(const Register& rt, const MemOperand& src);
+
+  // Load byte.
+  void ldrb(const Register& rt, const MemOperand& src);
+
+  // Store byte.
+  void strb(const Register& rt, const MemOperand& dst);
+
+  // Load byte with sign extension.
+  void ldrsb(const Register& rt, const MemOperand& src);
+
+  // Load half-word.
+  void ldrh(const Register& rt, const MemOperand& src);
+
+  // Store half-word.
+  void strh(const Register& rt, const MemOperand& dst);
+
+  // Load half-word with sign extension.
+  void ldrsh(const Register& rt, const MemOperand& src);
+
+  // Load integer or FP register pair.
+  void ldp(const CPURegister& rt, const CPURegister& rt2,
+           const MemOperand& src);
+
+  // Store integer or FP register pair.
+  void stp(const CPURegister& rt, const CPURegister& rt2,
+           const MemOperand& dst);
+
+  // Load word pair with sign extension.
+  void ldpsw(const Register& rt, const Register& rt2, const MemOperand& src);
+
+  // Load integer or FP register pair, non-temporal.
+  void ldnp(const CPURegister& rt, const CPURegister& rt2,
+            const MemOperand& src);
+
+  // Store integer or FP register pair, non-temporal.
+  void stnp(const CPURegister& rt, const CPURegister& rt2,
+            const MemOperand& dst);
+
+  // Load literal to register.
+  void ldr(const Register& rt, uint64_t imm);
+
+  // Load literal to FP register.
+  void ldr(const FPRegister& ft, double imm);
+
+  // Move instructions. The default shift of -1 indicates that the move
+  // instruction will calculate an appropriate 16-bit immediate and left shift
+  // such that the result equals the 64-bit immediate argument. If an explicit
+  // left shift is specified (0, 16, 32 or 48), the immediate must be a 16-bit
+  // value.
+  //
+  // For movk, an explicit shift can be used to indicate which half word should
+  // be overwritten, e.g. movk(x0, 0, 0) will overwrite the least-significant
+  // half word with zero, whereas movk(x0, 0, 48) will overwrite the
+  // most-significant half word.
+
+  // Move immediate and keep.
+  void movk(const Register& rd, uint64_t imm, int shift = -1) {
+    MoveWide(rd, imm, shift, MOVK);
+  }
+
+  // Move inverted immediate.
+  void movn(const Register& rd, uint64_t imm, int shift = -1) {
+    MoveWide(rd, imm, shift, MOVN);
+  }
+
+  // Move immediate.
+  void movz(const Register& rd, uint64_t imm, int shift = -1) {
+    MoveWide(rd, imm, shift, MOVZ);
+  }
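+
+  // For example, a 64-bit immediate can be built one half word at a time (an
+  // illustrative sketch; the explicit shift selects the half word to write):
+  //
+  //   movz(x0, 0x1234, 0);    // x0 = 0x0000000000001234
+  //   movk(x0, 0x5678, 16);   // x0 = 0x0000000056781234
+  //   movk(x0, 0x9abc, 32);   // x0 = 0x00009abc56781234
+  //   movk(x0, 0xdef0, 48);   // x0 = 0xdef09abc56781234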
+
+  // Misc instructions.
+  // Monitor debug-mode breakpoint.
+  void brk(int code);
+
+  // Halting debug-mode breakpoint.
+  void hlt(int code);
+
+  // Move register to register.
+  void mov(const Register& rd, const Register& rn);
+
+  // Move inverted operand to register.
+  void mvn(const Register& rd, const Operand& operand);
+
+  // System instructions.
+  // Move to register from system register.
+  void mrs(const Register& rt, SystemRegister sysreg);
+
+  // Move from register to system register.
+  void msr(SystemRegister sysreg, const Register& rt);
+
+  // System hint.
+  void hint(SystemHint code);
+
+  // Alias for system instructions.
+  // No-op.
+  void nop() {
+    hint(NOP);
+  }
+
+  // FP instructions.
+  // Move immediate to FP register.
+  void fmov(FPRegister fd, double imm);
+
+  // Move FP register to register.
+  void fmov(Register rd, FPRegister fn);
+
+  // Move register to FP register.
+  void fmov(FPRegister fd, Register rn);
+
+  // Move FP register to FP register.
+  void fmov(FPRegister fd, FPRegister fn);
+
+  // FP add.
+  void fadd(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm);
+
+  // FP subtract.
+  void fsub(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm);
+
+  // FP multiply.
+  void fmul(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm);
+
+  // FP multiply and subtract.
+  void fmsub(const FPRegister& fd,
+             const FPRegister& fn,
+             const FPRegister& fm,
+             const FPRegister& fa);
+
+  // FP divide.
+  void fdiv(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm);
+
+  // FP maximum.
+  void fmax(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm);
+
+  // FP minimum.
+  void fmin(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm);
+
+  // FP absolute.
+  void fabs(const FPRegister& fd, const FPRegister& fn);
+
+  // FP negate.
+  void fneg(const FPRegister& fd, const FPRegister& fn);
+
+  // FP square root.
+  void fsqrt(const FPRegister& fd, const FPRegister& fn);
+
+  // FP round to integer (nearest with ties to even).
+  void frintn(const FPRegister& fd, const FPRegister& fn);
+
+  // FP round to integer (towards zero).
+  void frintz(const FPRegister& fd, const FPRegister& fn);
+
+  // FP convert single to double precision.
+  void fcvt(const FPRegister& fd, const FPRegister& fn);
+
+  // FP compare registers.
+  void fcmp(const FPRegister& fn, const FPRegister& fm);
+
+  // FP compare immediate.
+  void fcmp(const FPRegister& fn, double value);
+
+  // FP conditional compare.
+  void fccmp(const FPRegister& fn,
+             const FPRegister& fm,
+             StatusFlags nzcv,
+             Condition cond);
+
+  // FP conditional select.
+  void fcsel(const FPRegister& fd,
+             const FPRegister& fn,
+             const FPRegister& fm,
+             Condition cond);
+
+  // Common FP Convert function.
+  void FPConvertToInt(const Register& rd,
+                      const FPRegister& fn,
+                      FPIntegerConvertOp op);
+
+  // Convert FP to unsigned integer (round towards -infinity).
+  void fcvtmu(const Register& rd, const FPRegister& fn);
+
+  // Convert FP to signed integer (round towards -infinity).
+  void fcvtms(const Register& rd, const FPRegister& fn);
+
+  // Convert FP to unsigned integer (nearest with ties to even).
+  void fcvtnu(const Register& rd, const FPRegister& fn);
+
+  // Convert FP to signed integer (nearest with ties to even).
+  void fcvtns(const Register& rd, const FPRegister& fn);
+
+  // Convert FP to unsigned integer (round towards zero).
+  void fcvtzu(const Register& rd, const FPRegister& fn);
+
+  // Convert FP to signed integer (round towards zero).
+  void fcvtzs(const Register& rd, const FPRegister& fn);
+
+  // Convert signed integer or fixed point to FP.
+  void scvtf(const FPRegister& fd, const Register& rn, unsigned fbits = 0);
+
+  // Convert unsigned integer or fixed point to FP.
+  void ucvtf(const FPRegister& fd, const Register& rn, unsigned fbits = 0);
+
+  // Emit generic instructions.
+  // Emit a raw instruction into the instruction stream.
+  inline void dci(Instr raw_inst) { Emit(raw_inst); }
+
+  // Emit 32 bits of data into the instruction stream.
+  inline void dc32(uint32_t data) { EmitData(&data, sizeof(data)); }
+
+  // Emit 64 bits of data into the instruction stream.
+  inline void dc64(uint64_t data) { EmitData(&data, sizeof(data)); }
+
+  // Copy a string into the instruction stream, including the terminating NULL
+  // character. The instruction pointer (pc_) is then aligned correctly for
+  // subsequent instructions.
+  void EmitStringData(const char * string) {
+    ASSERT(string != NULL);
+
+    size_t len = strlen(string) + 1;
+    EmitData(string, len);
+
+    // Pad with NULL characters until pc_ is aligned.
+    const char pad[] = {'\0', '\0', '\0', '\0'};
+    ASSERT(sizeof(pad) == kInstructionSize);
+    Instruction* next_pc = AlignUp(pc_, kInstructionSize);
+    EmitData(&pad, next_pc - pc_);
+  }
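+
+  // A sketch of embedding raw data in the instruction stream; the emitted
+  // bytes are data rather than instructions, so generated code is expected to
+  // branch over them:
+  //
+  //   dc32(0xdeadbeef);            // One raw 32-bit word.
+  //   dc64(0x0123456789abcdef);    // One raw 64-bit word.
+  //   EmitStringData("vixl");      // "vixl\0" plus alignment padding.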
+
+  // Code generation helpers.
+
+  // Register encoding.
+  static Instr Rd(CPURegister rd) {
+    ASSERT(rd.code() != kSPRegInternalCode);
+    return rd.code() << Rd_offset;
+  }
+
+  static Instr Rn(CPURegister rn) {
+    ASSERT(rn.code() != kSPRegInternalCode);
+    return rn.code() << Rn_offset;
+  }
+
+  static Instr Rm(CPURegister rm) {
+    ASSERT(rm.code() != kSPRegInternalCode);
+    return rm.code() << Rm_offset;
+  }
+
+  static Instr Ra(CPURegister ra) {
+    ASSERT(ra.code() != kSPRegInternalCode);
+    return ra.code() << Ra_offset;
+  }
+
+  static Instr Rt(CPURegister rt) {
+    ASSERT(rt.code() != kSPRegInternalCode);
+    return rt.code() << Rt_offset;
+  }
+
+  static Instr Rt2(CPURegister rt2) {
+    ASSERT(rt2.code() != kSPRegInternalCode);
+    return rt2.code() << Rt2_offset;
+  }
+
+  // These encoding functions allow the stack pointer to be encoded, and
+  // disallow the zero register.
+  static Instr RdSP(Register rd) {
+    ASSERT(!rd.IsZero());
+    return (rd.code() & kRegCodeMask) << Rd_offset;
+  }
+
+  static Instr RnSP(Register rn) {
+    ASSERT(!rn.IsZero());
+    return (rn.code() & kRegCodeMask) << Rn_offset;
+  }
+
+  // Flags encoding.
+  static Instr Flags(FlagsUpdate S) {
+    if (S == SetFlags) {
+      return 1 << FlagsUpdate_offset;
+    } else if (S == LeaveFlags) {
+      return 0 << FlagsUpdate_offset;
+    }
+    UNREACHABLE();
+    return 0;
+  }
+
+  static Instr Cond(Condition cond) {
+    return cond << Condition_offset;
+  }
+
+  // PC-relative address encoding.
+  static Instr ImmPCRelAddress(int imm21) {
+    ASSERT(is_int21(imm21));
+    Instr imm = static_cast<Instr>(truncate_to_int21(imm21));
+    Instr immhi = (imm >> ImmPCRelLo_width) << ImmPCRelHi_offset;
+    Instr immlo = imm << ImmPCRelLo_offset;
+    return (immhi & ImmPCRelHi_mask) | (immlo & ImmPCRelLo_mask);
+  }
+
+  // Branch encoding.
+  static Instr ImmUncondBranch(int imm26) {
+    ASSERT(is_int26(imm26));
+    return truncate_to_int26(imm26) << ImmUncondBranch_offset;
+  }
+
+  static Instr ImmCondBranch(int imm19) {
+    ASSERT(is_int19(imm19));
+    return truncate_to_int19(imm19) << ImmCondBranch_offset;
+  }
+
+  static Instr ImmCmpBranch(int imm19) {
+    ASSERT(is_int19(imm19));
+    return truncate_to_int19(imm19) << ImmCmpBranch_offset;
+  }
+
+  static Instr ImmTestBranch(int imm14) {
+    ASSERT(is_int14(imm14));
+    return truncate_to_int14(imm14) << ImmTestBranch_offset;
+  }
+
+  static Instr ImmTestBranchBit(unsigned bit_pos) {
+    ASSERT(is_uint6(bit_pos));
+    // Subtract five from the shift offset, as we need bit 5 from bit_pos.
+    unsigned b5 = bit_pos << (ImmTestBranchBit5_offset - 5);
+    unsigned b40 = bit_pos << ImmTestBranchBit40_offset;
+    b5 &= ImmTestBranchBit5_mask;
+    b40 &= ImmTestBranchBit40_mask;
+    return b5 | b40;
+  }
+
+  // Data Processing encoding.
+  static Instr SF(Register rd) {
+      return rd.Is64Bits() ? SixtyFourBits : ThirtyTwoBits;
+  }
+
+  static Instr ImmAddSub(int64_t imm) {
+    ASSERT(IsImmAddSub(imm));
+    if (is_uint12(imm)) {  // No shift required.
+      return imm << ImmAddSub_offset;
+    } else {
+      return ((imm >> 12) << ImmAddSub_offset) | (1 << ShiftAddSub_offset);
+    }
+  }
+
+  static inline Instr ImmS(unsigned imms, unsigned reg_size) {
+    ASSERT(((reg_size == kXRegSize) && is_uint6(imms)) ||
+           ((reg_size == kWRegSize) && is_uint5(imms)));
+    USE(reg_size);
+    return imms << ImmS_offset;
+  }
+
+  static inline Instr ImmR(unsigned immr, unsigned reg_size) {
+    ASSERT(((reg_size == kXRegSize) && is_uint6(immr)) ||
+           ((reg_size == kWRegSize) && is_uint5(immr)));
+    USE(reg_size);
+    ASSERT(is_uint6(immr));
+    return immr << ImmR_offset;
+  }
+
+  static inline Instr ImmSetBits(unsigned imms, unsigned reg_size) {
+    ASSERT((reg_size == kWRegSize) || (reg_size == kXRegSize));
+    ASSERT(is_uint6(imms));
+    ASSERT((reg_size == kXRegSize) || is_uint6(imms + 3));
+    USE(reg_size);
+    return imms << ImmSetBits_offset;
+  }
+
+  static inline Instr ImmRotate(unsigned immr, unsigned reg_size) {
+    ASSERT((reg_size == kWRegSize) || (reg_size == kXRegSize));
+    ASSERT(((reg_size == kXRegSize) && is_uint6(immr)) ||
+           ((reg_size == kWRegSize) && is_uint5(immr)));
+    USE(reg_size);
+    return immr << ImmRotate_offset;
+  }
+
+  static inline Instr ImmLLiteral(int imm19) {
+    ASSERT(is_int19(imm19));
+    return truncate_to_int19(imm19) << ImmLLiteral_offset;
+  }
+
+  static inline Instr BitN(unsigned bitn, unsigned reg_size) {
+    ASSERT((reg_size == kWRegSize) || (reg_size == kXRegSize));
+    ASSERT((reg_size == kXRegSize) || (bitn == 0));
+    USE(reg_size);
+    return bitn << BitN_offset;
+  }
+
+  static Instr ShiftDP(Shift shift) {
+    ASSERT(shift == LSL || shift == LSR || shift == ASR || shift == ROR);
+    return shift << ShiftDP_offset;
+  }
+
+  static Instr ImmDPShift(unsigned amount) {
+    ASSERT(is_uint6(amount));
+    return amount << ImmDPShift_offset;
+  }
+
+  static Instr ExtendMode(Extend extend) {
+    return extend << ExtendMode_offset;
+  }
+
+  static Instr ImmExtendShift(unsigned left_shift) {
+    ASSERT(left_shift <= 4);
+    return left_shift << ImmExtendShift_offset;
+  }
+
+  static Instr ImmCondCmp(unsigned imm) {
+    ASSERT(is_uint5(imm));
+    return imm << ImmCondCmp_offset;
+  }
+
+  static Instr Nzcv(StatusFlags nzcv) {
+    return ((nzcv >> Flags_offset) & 0xf) << Nzcv_offset;
+  }
+
+  // MemOperand offset encoding.
+  static Instr ImmLSUnsigned(int imm12) {
+    ASSERT(is_uint12(imm12));
+    return imm12 << ImmLSUnsigned_offset;
+  }
+
+  static Instr ImmLS(int imm9) {
+    ASSERT(is_int9(imm9));
+    return truncate_to_int9(imm9) << ImmLS_offset;
+  }
+
+  static Instr ImmLSPair(int imm7, LSDataSize size) {
+    ASSERT(((imm7 >> size) << size) == imm7);
+    int scaled_imm7 = imm7 >> size;
+    ASSERT(is_int7(scaled_imm7));
+    return truncate_to_int7(scaled_imm7) << ImmLSPair_offset;
+  }
+
+  static Instr ImmShiftLS(unsigned shift_amount) {
+    ASSERT(is_uint1(shift_amount));
+    return shift_amount << ImmShiftLS_offset;
+  }
+
+  static Instr ImmException(int imm16) {
+    ASSERT(is_uint16(imm16));
+    return imm16 << ImmException_offset;
+  }
+
+  static Instr ImmSystemRegister(int imm15) {
+    ASSERT(is_uint15(imm15));
+    return imm15 << ImmSystemRegister_offset;
+  }
+
+  static Instr ImmHint(int imm7) {
+    ASSERT(is_uint7(imm7));
+    return imm7 << ImmHint_offset;
+  }
+
+  static LSDataSize CalcLSDataSize(LoadStoreOp op) {
+    ASSERT((SizeLS_offset + SizeLS_width) == (kInstructionSize * 8));
+    return static_cast<LSDataSize>(op >> SizeLS_offset);
+  }
+
+  // Move immediates encoding.
+  static Instr ImmMoveWide(uint64_t imm) {
+    ASSERT(is_uint16(imm));
+    return imm << ImmMoveWide_offset;
+  }
+
+  static Instr ShiftMoveWide(int64_t shift) {
+    ASSERT(is_uint2(shift));
+    return shift << ShiftMoveWide_offset;
+  }
+
+  // FP Immediates.
+  static Instr ImmFP32(float imm);
+  static Instr ImmFP64(double imm);
+
+  // FP register type.
+  static Instr FPType(FPRegister fd) {
+    return fd.Is64Bits() ? FP64 : FP32;
+  }
+
+  static Instr FPScale(unsigned scale) {
+    ASSERT(is_uint6(scale));
+    return scale << FPScale_offset;
+  }
+
+  // Size of the code generated, in bytes.
+  uint64_t SizeOfCodeGenerated() const {
+    ASSERT((pc_ >= buffer_) && (pc_ < (buffer_ + buffer_size_)));
+    return pc_ - buffer_;
+  }
+
+  // Size of the code generated from the position where the label was bound to
+  // the current position.
+  uint64_t SizeOfCodeGeneratedSince(Label* label) const {
+    ASSERT(label->IsBound());
+    ASSERT((pc_ >= label->target()) && (pc_ < (buffer_ + buffer_size_)));
+    return pc_ - label->target();
+  }
+
+
+  inline void BlockLiteralPool() {
+    literal_pool_monitor_++;
+  }
+
+  inline void ReleaseLiteralPool() {
+    if (--literal_pool_monitor_ == 0) {
+      // Has the literal pool been blocked for too long?
+      ASSERT(literals_.empty() ||
+             (pc_ < (literals_.back()->pc_ + kMaxLoadLiteralRange)));
+    }
+  }
+
+  inline bool IsLiteralPoolBlocked() {
+    return literal_pool_monitor_ != 0;
+  }
+
+  void CheckLiteralPool(LiteralPoolEmitOption option = JumpRequired);
+  void EmitLiteralPool(LiteralPoolEmitOption option = NoJumpRequired);
+  size_t LiteralPoolSize();
+
+ protected:
+  inline const Register& AppropriateZeroRegFor(const CPURegister& reg) const {
+    return reg.Is64Bits() ? xzr : wzr;
+  }
+
+
+  void LoadStore(const CPURegister& rt,
+                 const MemOperand& addr,
+                 LoadStoreOp op);
+  static bool IsImmLSUnscaled(ptrdiff_t offset);
+  static bool IsImmLSScaled(ptrdiff_t offset, LSDataSize size);
+
+  void Logical(const Register& rd,
+               const Register& rn,
+               const Operand& operand,
+               LogicalOp op);
+  void LogicalImmediate(const Register& rd,
+                        const Register& rn,
+                        unsigned n,
+                        unsigned imm_s,
+                        unsigned imm_r,
+                        LogicalOp op);
+  static bool IsImmLogical(uint64_t value,
+                           unsigned width,
+                           unsigned* n,
+                           unsigned* imm_s,
+                           unsigned* imm_r);
+
+  void ConditionalCompare(const Register& rn,
+                          const Operand& operand,
+                          StatusFlags nzcv,
+                          Condition cond,
+                          ConditionalCompareOp op);
+  static bool IsImmConditionalCompare(int64_t immediate);
+
+  void AddSubWithCarry(const Register& rd,
+                       const Register& rn,
+                       const Operand& operand,
+                       FlagsUpdate S,
+                       AddSubWithCarryOp op);
+
+  // Functions for emulating operands not directly supported by the instruction
+  // set.
+  void EmitShift(const Register& rd,
+                 const Register& rn,
+                 Shift shift,
+                 unsigned amount);
+  void EmitExtendShift(const Register& rd,
+                       const Register& rn,
+                       Extend extend,
+                       unsigned left_shift);
+
+  void AddSub(const Register& rd,
+              const Register& rn,
+              const Operand& operand,
+              FlagsUpdate S,
+              AddSubOp op);
+  static bool IsImmAddSub(int64_t immediate);
+
+  // Find an appropriate LoadStoreOp or LoadStorePairOp for the specified
+  // registers. Only simple loads and stores are supported; sign- and
+  // zero-extension (such as in LDPSW_x or LDRB_w) are not supported.
+  static LoadStoreOp LoadOpFor(const CPURegister& rt);
+  static LoadStorePairOp LoadPairOpFor(const CPURegister& rt,
+                                       const CPURegister& rt2);
+  static LoadStoreOp StoreOpFor(const CPURegister& rt);
+  static LoadStorePairOp StorePairOpFor(const CPURegister& rt,
+                                        const CPURegister& rt2);
+  static LoadStorePairNonTemporalOp LoadPairNonTemporalOpFor(
+    const CPURegister& rt, const CPURegister& rt2);
+  static LoadStorePairNonTemporalOp StorePairNonTemporalOpFor(
+    const CPURegister& rt, const CPURegister& rt2);
+
+
+ private:
+  // Instruction helpers.
+  void MoveWide(const Register& rd,
+                uint64_t imm,
+                int shift,
+                MoveWideImmediateOp mov_op);
+  void DataProcShiftedRegister(const Register& rd,
+                               const Register& rn,
+                               const Operand& operand,
+                               FlagsUpdate S,
+                               Instr op);
+  void DataProcExtendedRegister(const Register& rd,
+                                const Register& rn,
+                                const Operand& operand,
+                                FlagsUpdate S,
+                                Instr op);
+  void LoadStorePair(const CPURegister& rt,
+                     const CPURegister& rt2,
+                     const MemOperand& addr,
+                     LoadStorePairOp op);
+  void LoadStorePairNonTemporal(const CPURegister& rt,
+                                const CPURegister& rt2,
+                                const MemOperand& addr,
+                                LoadStorePairNonTemporalOp op);
+  void LoadLiteral(const CPURegister& rt, uint64_t imm, LoadLiteralOp op);
+  void ConditionalSelect(const Register& rd,
+                         const Register& rn,
+                         const Register& rm,
+                         Condition cond,
+                         ConditionalSelectOp op);
+  void DataProcessing1Source(const Register& rd,
+                             const Register& rn,
+                             DataProcessing1SourceOp op);
+  void DataProcessing3Source(const Register& rd,
+                             const Register& rn,
+                             const Register& rm,
+                             const Register& ra,
+                             DataProcessing3SourceOp op);
+  void FPDataProcessing1Source(const FPRegister& fd,
+                               const FPRegister& fn,
+                               FPDataProcessing1SourceOp op);
+  void FPDataProcessing2Source(const FPRegister& fd,
+                               const FPRegister& fn,
+                               const FPRegister& fm,
+                               FPDataProcessing2SourceOp op);
+  void FPDataProcessing3Source(const FPRegister& fd,
+                               const FPRegister& fn,
+                               const FPRegister& fm,
+                               const FPRegister& fa,
+                               FPDataProcessing3SourceOp op);
+
+  // Encoding helpers.
+  static bool IsImmFP32(float imm);
+  static bool IsImmFP64(double imm);
+
+  void RecordLiteral(int64_t imm, unsigned size);
+
+  // Emit the instruction at pc_.
+  void Emit(Instr instruction) {
+    ASSERT(sizeof(*pc_) == 1);
+    ASSERT(sizeof(instruction) == kInstructionSize);
+    ASSERT((pc_ + sizeof(instruction)) <= (buffer_ + buffer_size_));
+
+#ifdef DEBUG
+    finalized_ = false;
+#endif
+
+    memcpy(pc_, &instruction, sizeof(instruction));
+    pc_ += sizeof(instruction);
+    CheckBufferSpace();
+  }
+
+  // Emit data inline in the instruction stream.
+  void EmitData(void const * data, unsigned size) {
+    ASSERT(sizeof(*pc_) == 1);
+    ASSERT((pc_ + size) <= (buffer_ + buffer_size_));
+
+#ifdef DEBUG
+    finalized_ = false;
+#endif
+
+    // TODO: Record this 'instruction' as data, so that it can be disassembled
+    // correctly.
+    memcpy(pc_, data, size);
+    pc_ += size;
+    CheckBufferSpace();
+  }
+
+  inline void CheckBufferSpace() {
+    ASSERT(pc_ < (buffer_ + buffer_size_));
+    if (pc_ > next_literal_pool_check_) {
+      CheckLiteralPool();
+    }
+  }
+
+  // The buffer into which code and relocation info are generated.
+  Instruction* buffer_;
+  // Buffer size, in bytes.
+  unsigned buffer_size_;
+  Instruction* pc_;
+  std::list<Literal*> literals_;
+  Instruction* next_literal_pool_check_;
+  unsigned literal_pool_monitor_;
+
+  friend class BlockLiteralPoolScope;
+
+#ifdef DEBUG
+  bool finalized_;
+#endif
+};
+
+class BlockLiteralPoolScope {
+ public:
+  explicit BlockLiteralPoolScope(Assembler* assm) : assm_(assm) {
+    assm_->BlockLiteralPool();
+  }
+
+  ~BlockLiteralPoolScope() {
+    assm_->ReleaseLiteralPool();
+  }
+
+ private:
+  Assembler* assm_;
+};
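+
+// A minimal sketch of blocking literal pool emission across a sequence that
+// must stay contiguous; `masm` is an Assembler and `nearby` is a hypothetical
+// bound Label:
+//
+//   {
+//     BlockLiteralPoolScope scope(&masm);
+//     masm.adr(x0, &nearby);
+//     masm.ldr(x1, MemOperand(x0));
+//   }   // Literal pool emission may resume after the scope is destroyed.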
+}  // namespace vixl
+
+#endif  // VIXL_A64_ASSEMBLER_A64_H_
diff --git a/src/a64/constants-a64.h b/src/a64/constants-a64.h
new file mode 100644
index 0000000..578c84d
--- /dev/null
+++ b/src/a64/constants-a64.h
@@ -0,0 +1,1048 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_A64_CONSTANTS_A64_H_
+#define VIXL_A64_CONSTANTS_A64_H_
+
+namespace vixl {
+
+const unsigned kNumberOfRegisters = 32;
+const unsigned kNumberOfFPRegisters = 32;
+// Callee saved registers are x21-x30(lr).
+const int kNumberOfCalleeSavedRegisters = 10;
+const int kFirstCalleeSavedRegisterIndex = 21;
+// Callee saved FP registers are d8-d15.
+const int kNumberOfCalleeSavedFPRegisters = 8;
+const int kFirstCalleeSavedFPRegisterIndex = 8;
+
+#define REGISTER_CODE_LIST(R)                                                  \
+R(0)  R(1)  R(2)  R(3)  R(4)  R(5)  R(6)  R(7)                                 \
+R(8)  R(9)  R(10) R(11) R(12) R(13) R(14) R(15)                                \
+R(16) R(17) R(18) R(19) R(20) R(21) R(22) R(23)                                \
+R(24) R(25) R(26) R(27) R(28) R(29) R(30) R(31)
+
+#define FIELDS_LIST(V)                                                         \
+/* Register fields */                                                          \
+V(Rd, 4, 0, Bits)                        /* Destination register.     */       \
+V(Rn, 9, 5, Bits)                        /* First source register.    */       \
+V(Rm, 20, 16, Bits)                      /* Second source register.   */       \
+V(Ra, 14, 10, Bits)                      /* Third source register.    */       \
+V(Rt, 4, 0, Bits)                        /* Load dest / store source. */       \
+V(Rt2, 14, 10, Bits)                     /* Load second dest /        */       \
+                                         /* store second source.      */       \
+V(PrefetchMode, 4, 0, Bits)                                                    \
+                                                                               \
+/* Common bits */                                                              \
+V(SixtyFourBits, 31, 31, Bits)                                                 \
+V(FlagsUpdate, 29, 29, Bits)                                                   \
+                                                                               \
+/* PC relative addressing */                                                   \
+V(ImmPCRelHi, 23, 5, SignedBits)                                               \
+V(ImmPCRelLo, 30, 29, Bits)                                                    \
+                                                                               \
+/* Add/subtract/logical shift register */                                      \
+V(ShiftDP, 23, 22, Bits)                                                       \
+V(ImmDPShift, 15, 10, Bits)                                                    \
+                                                                               \
+/* Add/subtract immediate */                                                   \
+V(ImmAddSub, 21, 10, Bits)                                                     \
+V(ShiftAddSub, 23, 22, Bits)                                                   \
+                                                                               \
+/* Add/subtract extend */                                                      \
+V(ImmExtendShift, 12, 10, Bits)                                                \
+V(ExtendMode, 15, 13, Bits)                                                    \
+                                                                               \
+/* Move wide */                                                                \
+V(ImmMoveWide, 20, 5, Bits)                                                    \
+V(ShiftMoveWide, 22, 21, Bits)                                                 \
+                                                                               \
+/* Logical immediate, bitfield and extract */                                  \
+V(BitN, 22, 22, Bits)                                                          \
+V(ImmRotate, 21, 16, Bits)                                                     \
+V(ImmSetBits, 15, 10, Bits)                                                    \
+V(ImmR, 21, 16, Bits)                                                          \
+V(ImmS, 15, 10, Bits)                                                          \
+                                                                               \
+/* Test and branch immediate */                                                \
+V(ImmTestBranch, 18, 5, SignedBits)                                            \
+V(ImmTestBranchBit40, 23, 19, Bits)                                            \
+V(ImmTestBranchBit5, 31, 31, Bits)                                             \
+                                                                               \
+/* Conditionals */                                                             \
+V(Condition, 15, 12, Bits)                                                     \
+V(ConditionBranch, 3, 0, Bits)                                                 \
+V(Nzcv, 3, 0, Bits)                                                            \
+V(ImmCondCmp, 20, 16, Bits)                                                    \
+V(ImmCondBranch, 23, 5, SignedBits)                                            \
+                                                                               \
+/* Floating point */                                                           \
+V(FPType, 23, 22, Bits)                                                        \
+V(ImmFP, 20, 13, Bits)                                                         \
+V(FPScale, 15, 10, Bits)                                                       \
+                                                                               \
+/* Load Store */                                                               \
+V(ImmLS, 20, 12, SignedBits)                                                   \
+V(ImmLSUnsigned, 21, 10, Bits)                                                 \
+V(ImmLSPair, 21, 15, SignedBits)                                               \
+V(SizeLS, 31, 30, Bits)                                                        \
+V(ImmShiftLS, 12, 12, Bits)                                                    \
+                                                                               \
+/* Other immediates */                                                         \
+V(ImmUncondBranch, 25, 0, SignedBits)                                          \
+V(ImmCmpBranch, 23, 5, SignedBits)                                             \
+V(ImmLLiteral, 23, 5, SignedBits)                                              \
+V(ImmException, 20, 5, Bits)                                                   \
+V(ImmHint, 11, 5, Bits)                                                        \
+V(ImmSystemRegister, 19, 5, Bits)                                              \
+                                                                               \
+/* System */                                                                   \
+V(Cn, 15, 12, Bits)                                                            \
+V(Cm, 11, 8, Bits)
+
+// Field offsets, widths and masks.
+#define DECLARE_FIELDS_OFFSETS(Name, HighBit, LowBit, X)                       \
+const int Name##_offset = LowBit;                                              \
+const int Name##_width = HighBit - LowBit + 1;                                 \
+const int Name##_mask = ((1 << Name##_width) - 1) << LowBit;
+FIELDS_LIST(DECLARE_FIELDS_OFFSETS)
+#undef DECLARE_FIELDS_OFFSETS
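+
+// As a sketch of what the list above generates, the Rd entry (high bit 4,
+// low bit 0) expands to:
+//
+//   const int Rd_offset = 0;
+//   const int Rd_width  = 5;
+//   const int Rd_mask   = 0x0000001f;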
+
+// ImmPCRel is a compound field (not present in FIELDS_LIST), formed from
+// ImmPCRelLo and ImmPCRelHi.
+const int ImmPCRel_mask = ImmPCRelLo_mask | ImmPCRelHi_mask;
+
+// Condition codes.
+enum Condition {
+  eq = 0,
+  ne = 1,
+  hs = 2,
+  lo = 3,
+  mi = 4,
+  pl = 5,
+  vs = 6,
+  vc = 7,
+  hi = 8,
+  ls = 9,
+  ge = 10,
+  lt = 11,
+  gt = 12,
+  le = 13,
+  al = 14
+};
+
+inline Condition InvertCondition(Condition cond) {
+  ASSERT(cond != al);
+  return static_cast<Condition>(cond ^ 1);
+}
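+
+// The condition codes above are arranged in complementary pairs (eq/ne,
+// hs/lo, mi/pl, ...), so inverting the lowest bit yields the opposite
+// condition; for example, InvertCondition(eq) == ne and
+// InvertCondition(lt) == ge.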
+
+enum FlagsUpdate {
+  SetFlags   = 1,
+  LeaveFlags = 0
+};
+
+const int N_offset = 31;
+const int Z_offset = 30;
+const int C_offset = 29;
+const int V_offset = 28;
+const int Flags_offset = V_offset;
+
+enum StatusFlags {
+  NoFlag    = 0,
+  VFlag     = 0x10000000u,
+  CFlag     = 0x20000000u,
+  CVFlag    = 0x30000000u,
+  ZFlag     = 0x40000000u,
+  ZVFlag    = 0x50000000u,
+  ZCFlag    = 0x60000000u,
+  ZCVFlag   = 0x70000000u,
+  NFlag     = 0x80000000u,
+  NVFlag    = 0x90000000u,
+  NCFlag    = 0xa0000000u,
+  NCVFlag   = 0xb0000000u,
+  NZFlag    = 0xc0000000u,
+  NZVFlag   = 0xd0000000u,
+  NZCFlag   = 0xe0000000u,
+  NZCVFlag  = 0xf0000000u
+};
+const unsigned int Flags_mask = NZCVFlag;
+
+enum Shift {
+  NO_SHIFT = -1,
+  LSL = 0x0,
+  LSR = 0x1,
+  ASR = 0x2,
+  ROR = 0x3
+};
+
+enum Extend {
+  NO_EXTEND = -1,
+  UXTB      = 0,
+  UXTH      = 1,
+  UXTW      = 2,
+  UXTX      = 3,
+  SXTB      = 4,
+  SXTH      = 5,
+  SXTW      = 6,
+  SXTX      = 7
+};
+
+enum SystemHint {
+  NOP   = 0,
+  YIELD = 1,
+  WFE   = 2,
+  WFI   = 3,
+  SEV   = 4,
+  SEVL  = 5
+};
+
+// System/special register names.
+// This information is not encoded as one field but as the concatenation of
+// multiple fields (lsb of Op0, Op1, Crn, Crm, Op2).
+enum SystemRegister {
+  NZCV = ((0xb << 16) | (0x4 << Cn_offset) | (0x2 << Cm_offset)) >>
+      ImmSystemRegister_offset
+};
+
+// Instruction enumerations.
+//
+// These are the masks that define a class of instructions, and the list of
+// instructions within each class. Each enumeration has a Fixed, FMask and
+// Mask value.
+//
+// Fixed: The fixed bits in this instruction class.
+// FMask: The mask used to extract the fixed bits in the class.
+// Mask:  The mask used to identify the instructions within a class.
+//
+// The enumerations can be used like this:
+//
+// ASSERT(instr->Mask(PCRelAddressingFMask) == PCRelAddressingFixed);
+// switch(instr->Mask(PCRelAddressingMask)) {
+//   case ADR:  Format("adr 'Xd, 'AddrPCRelByte"); break;
+//   case ADRP: Format("adrp 'Xd, 'AddrPCRelPage"); break;
+//   default:   printf("Unknown instruction\n");
+// }
+
+
+// Generic fields.
+enum GenericInstrField {
+  SixtyFourBits        = 0x80000000,
+  ThirtyTwoBits        = 0x00000000,
+  FP32                 = 0x00000000,
+  FP64                 = 0x00400000
+};
+
+// PC relative addressing.
+enum PCRelAddressingOp {
+  PCRelAddressingFixed = 0x10000000,
+  PCRelAddressingFMask = 0x1F000000,
+  PCRelAddressingMask  = 0x9F000000,
+  ADR                  = PCRelAddressingFixed | 0x00000000,
+  ADRP                 = PCRelAddressingFixed | 0x80000000
+};
+
+// Add/sub (immediate, shifted and extended).
+const int kSFOffset = 31;
+enum AddSubOp {
+  AddSubOpMask      = 0x60000000,
+  AddSubSetFlagsBit = 0x20000000,
+  ADD               = 0x00000000,
+  ADDS              = ADD | AddSubSetFlagsBit,
+  SUB               = 0x40000000,
+  SUBS              = SUB | AddSubSetFlagsBit
+};
+
+#define ADD_SUB_OP_LIST(V)  \
+  V(ADD),                   \
+  V(ADDS),                  \
+  V(SUB),                   \
+  V(SUBS)
+
+enum AddSubImmediateOp {
+  AddSubImmediateFixed = 0x11000000,
+  AddSubImmediateFMask = 0x1F000000,
+  AddSubImmediateMask  = 0xFF000000,
+  #define ADD_SUB_IMMEDIATE(A)           \
+  A##_w_imm = AddSubImmediateFixed | A,  \
+  A##_x_imm = AddSubImmediateFixed | A | SixtyFourBits
+  ADD_SUB_OP_LIST(ADD_SUB_IMMEDIATE)
+  #undef ADD_SUB_IMMEDIATE
+};
+
+enum AddSubShiftedOp {
+  AddSubShiftedFixed   = 0x0B000000,
+  AddSubShiftedFMask   = 0x1F200000,
+  AddSubShiftedMask    = 0xFF200000,
+  #define ADD_SUB_SHIFTED(A)             \
+  A##_w_shift = AddSubShiftedFixed | A,  \
+  A##_x_shift = AddSubShiftedFixed | A | SixtyFourBits
+  ADD_SUB_OP_LIST(ADD_SUB_SHIFTED)
+  #undef ADD_SUB_SHIFTED
+};
+
+enum AddSubExtendedOp {
+  AddSubExtendedFixed  = 0x0B200000,
+  AddSubExtendedFMask  = 0x1F200000,
+  AddSubExtendedMask   = 0xFFE00000,
+  #define ADD_SUB_EXTENDED(A)           \
+  A##_w_ext = AddSubExtendedFixed | A,  \
+  A##_x_ext = AddSubExtendedFixed | A | SixtyFourBits
+  ADD_SUB_OP_LIST(ADD_SUB_EXTENDED)
+  #undef ADD_SUB_EXTENDED
+};
+
+// Add/sub with carry.
+enum AddSubWithCarryOp {
+  AddSubWithCarryFixed = 0x1A000000,
+  AddSubWithCarryFMask = 0x1FE00000,
+  AddSubWithCarryMask  = 0xFFE0FC00,
+  ADC_w                = AddSubWithCarryFixed | ADD,
+  ADC_x                = AddSubWithCarryFixed | ADD | SixtyFourBits,
+  ADC                  = ADC_w,
+  ADCS_w               = AddSubWithCarryFixed | ADDS,
+  ADCS_x               = AddSubWithCarryFixed | ADDS | SixtyFourBits,
+  SBC_w                = AddSubWithCarryFixed | SUB,
+  SBC_x                = AddSubWithCarryFixed | SUB | SixtyFourBits,
+  SBC                  = SBC_w,
+  SBCS_w               = AddSubWithCarryFixed | SUBS,
+  SBCS_x               = AddSubWithCarryFixed | SUBS | SixtyFourBits
+};
+
+
+// Logical (immediate and shifted register).
+enum LogicalOp {
+  LogicalOpMask = 0x60200000,
+  NOT   = 0x00200000,
+  AND   = 0x00000000,
+  BIC   = AND | NOT,
+  ORR   = 0x20000000,
+  ORN   = ORR | NOT,
+  EOR   = 0x40000000,
+  EON   = EOR | NOT,
+  ANDS  = 0x60000000,
+  BICS  = ANDS | NOT
+};
+
+// Logical immediate.
+enum LogicalImmediateOp {
+  LogicalImmediateFixed = 0x12000000,
+  LogicalImmediateFMask = 0x1F800000,
+  LogicalImmediateMask  = 0xFF800000,
+  AND_w_imm   = LogicalImmediateFixed | AND,
+  AND_x_imm   = LogicalImmediateFixed | AND | SixtyFourBits,
+  ORR_w_imm   = LogicalImmediateFixed | ORR,
+  ORR_x_imm   = LogicalImmediateFixed | ORR | SixtyFourBits,
+  EOR_w_imm   = LogicalImmediateFixed | EOR,
+  EOR_x_imm   = LogicalImmediateFixed | EOR | SixtyFourBits,
+  ANDS_w_imm  = LogicalImmediateFixed | ANDS,
+  ANDS_x_imm  = LogicalImmediateFixed | ANDS | SixtyFourBits
+};
+
+// Logical shifted register.
+enum LogicalShiftedOp {
+  LogicalShiftedFixed = 0x0A000000,
+  LogicalShiftedFMask = 0x1F000000,
+  LogicalShiftedMask  = 0xFF200000,
+  AND_w               = LogicalShiftedFixed | AND,
+  AND_x               = LogicalShiftedFixed | AND | SixtyFourBits,
+  AND_shift           = AND_w,
+  BIC_w               = LogicalShiftedFixed | BIC,
+  BIC_x               = LogicalShiftedFixed | BIC | SixtyFourBits,
+  BIC_shift           = BIC_w,
+  ORR_w               = LogicalShiftedFixed | ORR,
+  ORR_x               = LogicalShiftedFixed | ORR | SixtyFourBits,
+  ORR_shift           = ORR_w,
+  ORN_w               = LogicalShiftedFixed | ORN,
+  ORN_x               = LogicalShiftedFixed | ORN | SixtyFourBits,
+  ORN_shift           = ORN_w,
+  EOR_w               = LogicalShiftedFixed | EOR,
+  EOR_x               = LogicalShiftedFixed | EOR | SixtyFourBits,
+  EOR_shift           = EOR_w,
+  EON_w               = LogicalShiftedFixed | EON,
+  EON_x               = LogicalShiftedFixed | EON | SixtyFourBits,
+  EON_shift           = EON_w,
+  ANDS_w              = LogicalShiftedFixed | ANDS,
+  ANDS_x              = LogicalShiftedFixed | ANDS | SixtyFourBits,
+  ANDS_shift          = ANDS_w,
+  BICS_w              = LogicalShiftedFixed | BICS,
+  BICS_x              = LogicalShiftedFixed | BICS | SixtyFourBits,
+  BICS_shift          = BICS_w
+};
+
+// Move wide immediate.
+enum MoveWideImmediateOp {
+  MoveWideImmediateFixed = 0x12800000,
+  MoveWideImmediateFMask = 0x1F800000,
+  MoveWideImmediateMask  = 0xFF800000,
+  MOVN                   = 0x00000000,
+  MOVZ                   = 0x40000000,
+  MOVK                   = 0x60000000,
+  MOVN_w                 = MoveWideImmediateFixed | MOVN,
+  MOVN_x                 = MoveWideImmediateFixed | MOVN | SixtyFourBits,
+  MOVZ_w                 = MoveWideImmediateFixed | MOVZ,
+  MOVZ_x                 = MoveWideImmediateFixed | MOVZ | SixtyFourBits,
+  MOVK_w                 = MoveWideImmediateFixed | MOVK,
+  MOVK_x                 = MoveWideImmediateFixed | MOVK | SixtyFourBits
+};
+
+// Bitfield.
+const int kBitfieldNOffset = 22;
+enum BitfieldOp {
+  BitfieldFixed = 0x13000000,
+  BitfieldFMask = 0x1F800000,
+  BitfieldMask  = 0xFF800000,
+  SBFM_w        = BitfieldFixed | 0x00000000,
+  SBFM_x        = BitfieldFixed | 0x80000000,
+  SBFM          = SBFM_w,
+  BFM_w         = BitfieldFixed | 0x20000000,
+  BFM_x         = BitfieldFixed | 0xA0000000,
+  BFM           = BFM_w,
+  UBFM_w        = BitfieldFixed | 0x40000000,
+  UBFM_x        = BitfieldFixed | 0xC0000000,
+  UBFM          = UBFM_w
+  // Bitfield N field.
+};
+
+// Extract.
+enum ExtractOp {
+  ExtractFixed = 0x13800000,
+  ExtractFMask = 0x1F800000,
+  ExtractMask  = 0xFFA00000,
+  EXTR_w       = ExtractFixed | 0x00000000,
+  EXTR_x       = ExtractFixed | 0x80000000,
+  EXTR         = EXTR_w
+};
+
+// Unconditional branch.
+enum UnconditionalBranchOp {
+  UnconditionalBranchFixed = 0x14000000,
+  UnconditionalBranchFMask = 0x7C000000,
+  UnconditionalBranchMask  = 0xFC000000,
+  B                        = UnconditionalBranchFixed | 0x00000000,
+  BL                       = UnconditionalBranchFixed | 0x80000000
+};
+
+// Unconditional branch to register.
+enum UnconditionalBranchToRegisterOp {
+  UnconditionalBranchToRegisterFixed = 0xD6000000,
+  UnconditionalBranchToRegisterFMask = 0xFE000000,
+  UnconditionalBranchToRegisterMask  = 0xFFFFFC1F,
+  BR      = UnconditionalBranchToRegisterFixed | 0x001F0000,
+  BLR     = UnconditionalBranchToRegisterFixed | 0x003F0000,
+  RET     = UnconditionalBranchToRegisterFixed | 0x005F0000
+};
+
+// Compare and branch.
+enum CompareBranchOp {
+  CompareBranchFixed = 0x34000000,
+  CompareBranchFMask = 0x7E000000,
+  CompareBranchMask  = 0xFF000000,
+  CBZ_w              = CompareBranchFixed | 0x00000000,
+  CBZ_x              = CompareBranchFixed | 0x80000000,
+  CBZ                = CBZ_w,
+  CBNZ_w             = CompareBranchFixed | 0x01000000,
+  CBNZ_x             = CompareBranchFixed | 0x81000000,
+  CBNZ               = CBNZ_w
+};
+
+// Test and branch.
+enum TestBranchOp {
+  TestBranchFixed = 0x36000000,
+  TestBranchFMask = 0x7E000000,
+  TestBranchMask  = 0x7F000000,
+  TBZ             = TestBranchFixed | 0x00000000,
+  TBNZ            = TestBranchFixed | 0x01000000
+};
+
+// Conditional branch.
+enum ConditionalBranchOp {
+  ConditionalBranchFixed = 0x54000000,
+  ConditionalBranchFMask = 0xFE000000,
+  ConditionalBranchMask  = 0xFF000010,
+  B_cond                 = ConditionalBranchFixed | 0x00000000
+};
+
+// System.
+// System instruction encoding is complicated because some instructions use op
+// and CR fields to encode parameters. To handle this cleanly, the system
+// instructions are split into more than one enum.
+
+enum SystemOp {
+  SystemFixed = 0xD5000000,
+  SystemFMask = 0xFFC00000
+};
+
+enum SystemSysRegOp {
+  SystemSysRegFixed = 0xD5100000,
+  SystemSysRegFMask = 0xFFD00000,
+  SystemSysRegMask  = 0xFFF00000,
+  MRS               = SystemSysRegFixed | 0x00200000,
+  MSR               = SystemSysRegFixed | 0x00000000
+};
+
+enum SystemHintOp {
+  SystemHintFixed = 0xD503201F,
+  SystemHintFMask = 0xFFFFF01F,
+  SystemHintMask  = 0xFFFFF01F,
+  HINT            = SystemHintFixed | 0x00000000
+};
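[Editor's note, not part of the patch] The Fixed/FMask/Mask triples used throughout this header follow one pattern: FMask selects the bits that identify an instruction class, Fixed gives their required value, and the tighter Mask isolates a specific opcode within that class. A minimal sketch of the intended use, assuming `instr` holds a raw 32-bit A64 encoding:

    bool IsSystemInstruction(uint32_t instr) {
      // Class test: the bits selected by SystemFMask must match SystemFixed.
      return (instr & SystemFMask) == SystemFixed;
    }
    bool IsMrs(uint32_t instr) {
      // Opcode test: the tighter SystemSysRegMask narrows the match to MRS.
      return (instr & SystemSysRegMask) == MRS;
    }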
+
+// Exception.
+enum ExceptionOp {
+  ExceptionFixed = 0xD4000000,
+  ExceptionFMask = 0xFF000000,
+  ExceptionMask  = 0xFFE0001F,
+  HLT            = ExceptionFixed | 0x00400000,
+  BRK            = ExceptionFixed | 0x00200000,
+  SVC            = ExceptionFixed | 0x00000001
+};
+
+// Any load or store.
+enum LoadStoreAnyOp {
+  LoadStoreAnyFMask = 0x0a000000,
+  LoadStoreAnyFixed = 0x08000000
+};
+
+#define LOAD_STORE_PAIR_OP_LIST(V)  \
+  V(STP, w,   0x00000000),          \
+  V(LDP, w,   0x00400000),          \
+  V(LDPSW, x, 0x40400000),          \
+  V(STP, x,   0x80000000),          \
+  V(LDP, x,   0x80400000),          \
+  V(STP, s,   0x04000000),          \
+  V(LDP, s,   0x04400000),          \
+  V(STP, d,   0x44000000),          \
+  V(LDP, d,   0x44400000)
+
+// Load/store pair (post, pre and offset).
+enum LoadStorePairOp {
+  LoadStorePairMask = 0xC4400000,
+  LoadStorePairLBit = 1 << 22,
+  #define LOAD_STORE_PAIR(A, B, C) \
+  A##_##B = C
+  LOAD_STORE_PAIR_OP_LIST(LOAD_STORE_PAIR)
+  #undef LOAD_STORE_PAIR
+};
+
+enum LoadStorePairPostIndexOp {
+  LoadStorePairPostIndexFixed = 0x28800000,
+  LoadStorePairPostIndexFMask = 0x3B800000,
+  LoadStorePairPostIndexMask  = 0xFFC00000,
+  #define LOAD_STORE_PAIR_POST_INDEX(A, B, C)  \
+  A##_##B##_post = LoadStorePairPostIndexFixed | A##_##B
+  LOAD_STORE_PAIR_OP_LIST(LOAD_STORE_PAIR_POST_INDEX)
+  #undef LOAD_STORE_PAIR_POST_INDEX
+};
+
+enum LoadStorePairPreIndexOp {
+  LoadStorePairPreIndexFixed = 0x29800000,
+  LoadStorePairPreIndexFMask = 0x3B800000,
+  LoadStorePairPreIndexMask  = 0xFFC00000,
+  #define LOAD_STORE_PAIR_PRE_INDEX(A, B, C)  \
+  A##_##B##_pre = LoadStorePairPreIndexFixed | A##_##B
+  LOAD_STORE_PAIR_OP_LIST(LOAD_STORE_PAIR_PRE_INDEX)
+  #undef LOAD_STORE_PAIR_PRE_INDEX
+};
+
+enum LoadStorePairOffsetOp {
+  LoadStorePairOffsetFixed = 0x29000000,
+  LoadStorePairOffsetFMask = 0x3B800000,
+  LoadStorePairOffsetMask  = 0xFFC00000,
+  #define LOAD_STORE_PAIR_OFFSET(A, B, C)  \
+  A##_##B##_off = LoadStorePairOffsetFixed | A##_##B
+  LOAD_STORE_PAIR_OP_LIST(LOAD_STORE_PAIR_OFFSET)
+  #undef LOAD_STORE_PAIR_OFFSET
+};
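[Editor's note, not part of the patch] The LOAD_STORE_PAIR_OP_LIST X-macro above is expanded once per addressing mode with a different per-entry definition. For the first list entry, V(STP, w, 0x00000000), the expansions reduce to:

    STP_w      = 0x00000000,                            // LoadStorePairOp
    STP_w_post = LoadStorePairPostIndexFixed | STP_w,   // post-index variant
    STP_w_pre  = LoadStorePairPreIndexFixed  | STP_w,   // pre-index variant
    STP_w_off  = LoadStorePairOffsetFixed    | STP_w,   // offset variant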
+
+enum LoadStorePairNonTemporalOp {
+  LoadStorePairNonTemporalFixed = 0x28000000,
+  LoadStorePairNonTemporalFMask = 0x3B800000,
+  LoadStorePairNonTemporalMask  = 0xFFC00000,
+  STNP_w = LoadStorePairNonTemporalFixed | STP_w,
+  LDNP_w = LoadStorePairNonTemporalFixed | LDP_w,
+  STNP_x = LoadStorePairNonTemporalFixed | STP_x,
+  LDNP_x = LoadStorePairNonTemporalFixed | LDP_x,
+  STNP_s = LoadStorePairNonTemporalFixed | STP_s,
+  LDNP_s = LoadStorePairNonTemporalFixed | LDP_s,
+  STNP_d = LoadStorePairNonTemporalFixed | STP_d,
+  LDNP_d = LoadStorePairNonTemporalFixed | LDP_d
+};
+
+// Load literal.
+enum LoadLiteralOp {
+  LoadLiteralFixed = 0x18000000,
+  LoadLiteralFMask = 0x3B000000,
+  LoadLiteralMask  = 0xFF000000,
+  LDR_w_lit        = LoadLiteralFixed | 0x00000000,
+  LDR_x_lit        = LoadLiteralFixed | 0x40000000,
+  LDRSW_x_lit      = LoadLiteralFixed | 0x80000000,
+  PRFM_lit         = LoadLiteralFixed | 0xC0000000,
+  LDR_s_lit        = LoadLiteralFixed | 0x04000000,
+  LDR_d_lit        = LoadLiteralFixed | 0x44000000
+};
+
+#define LOAD_STORE_OP_LIST(V)     \
+  V(ST, RB, w,  0x00000000),  \
+  V(ST, RH, w,  0x40000000),  \
+  V(ST, R, w,   0x80000000),  \
+  V(ST, R, x,   0xC0000000),  \
+  V(LD, RB, w,  0x00400000),  \
+  V(LD, RH, w,  0x40400000),  \
+  V(LD, R, w,   0x80400000),  \
+  V(LD, R, x,   0xC0400000),  \
+  V(LD, RSB, x, 0x00800000),  \
+  V(LD, RSH, x, 0x40800000),  \
+  V(LD, RSW, x, 0x80800000),  \
+  V(LD, RSB, w, 0x00C00000),  \
+  V(LD, RSH, w, 0x40C00000),  \
+  V(ST, R, s,   0x84000000),  \
+  V(ST, R, d,   0xC4000000),  \
+  V(LD, R, s,   0x84400000),  \
+  V(LD, R, d,   0xC4400000)
+
+
+// Load/store unscaled offset.
+enum LoadStoreUnscaledOffsetOp {
+  LoadStoreUnscaledOffsetFixed = 0x38000000,
+  LoadStoreUnscaledOffsetFMask = 0x3B200C00,
+  LoadStoreUnscaledOffsetMask  = 0xFFE00C00,
+  #define LOAD_STORE_UNSCALED(A, B, C, D)  \
+  A##U##B##_##C = LoadStoreUnscaledOffsetFixed | D
+  LOAD_STORE_OP_LIST(LOAD_STORE_UNSCALED)
+  #undef LOAD_STORE_UNSCALED
+};
+
+// Load/store (post, pre, offset and unsigned).
+enum LoadStoreOp {
+  LoadStoreOpMask   = 0xC4C00000,
+  #define LOAD_STORE(A, B, C, D)  \
+  A##B##_##C = D
+  LOAD_STORE_OP_LIST(LOAD_STORE),
+  #undef LOAD_STORE
+  PRFM = 0xC0800000
+};
+
+// Load/store post index.
+enum LoadStorePostIndex {
+  LoadStorePostIndexFixed = 0x38000400,
+  LoadStorePostIndexFMask = 0x3B200C00,
+  LoadStorePostIndexMask  = 0xFFE00C00,
+  #define LOAD_STORE_POST_INDEX(A, B, C, D)  \
+  A##B##_##C##_post = LoadStorePostIndexFixed | D
+  LOAD_STORE_OP_LIST(LOAD_STORE_POST_INDEX)
+  #undef LOAD_STORE_POST_INDEX
+};
+
+// Load/store pre index.
+enum LoadStorePreIndex {
+  LoadStorePreIndexFixed = 0x38000C00,
+  LoadStorePreIndexFMask = 0x3B200C00,
+  LoadStorePreIndexMask  = 0xFFE00C00,
+  #define LOAD_STORE_PRE_INDEX(A, B, C, D)  \
+  A##B##_##C##_pre = LoadStorePreIndexFixed | D
+  LOAD_STORE_OP_LIST(LOAD_STORE_PRE_INDEX)
+  #undef LOAD_STORE_PRE_INDEX
+};
+
+// Load/store unsigned offset.
+enum LoadStoreUnsignedOffset {
+  LoadStoreUnsignedOffsetFixed = 0x39000000,
+  LoadStoreUnsignedOffsetFMask = 0x3B000000,
+  LoadStoreUnsignedOffsetMask  = 0xFFC00000,
+  PRFM_unsigned                = LoadStoreUnsignedOffsetFixed | PRFM,
+  #define LOAD_STORE_UNSIGNED_OFFSET(A, B, C, D) \
+  A##B##_##C##_unsigned = LoadStoreUnsignedOffsetFixed | D
+  LOAD_STORE_OP_LIST(LOAD_STORE_UNSIGNED_OFFSET)
+  #undef LOAD_STORE_UNSIGNED_OFFSET
+};
+
+// Load/store register offset.
+enum LoadStoreRegisterOffset {
+  LoadStoreRegisterOffsetFixed = 0x38200800,
+  LoadStoreRegisterOffsetFMask = 0x3B200C00,
+  LoadStoreRegisterOffsetMask  = 0xFFE00C00,
+  PRFM_reg                     = LoadStoreRegisterOffsetFixed | PRFM,
+  #define LOAD_STORE_REGISTER_OFFSET(A, B, C, D) \
+  A##B##_##C##_reg = LoadStoreRegisterOffsetFixed | D
+  LOAD_STORE_OP_LIST(LOAD_STORE_REGISTER_OFFSET)
+  #undef LOAD_STORE_REGISTER_OFFSET
+};
+
+// Conditional compare.
+enum ConditionalCompareOp {
+  ConditionalCompareMask = 0x60000000,
+  CCMN                   = 0x20000000,
+  CCMP                   = 0x60000000
+};
+
+// Conditional compare register.
+enum ConditionalCompareRegisterOp {
+  ConditionalCompareRegisterFixed = 0x1A400000,
+  ConditionalCompareRegisterFMask = 0x1FE00800,
+  ConditionalCompareRegisterMask  = 0xFFE00C10,
+  CCMN_w = ConditionalCompareRegisterFixed | CCMN,
+  CCMN_x = ConditionalCompareRegisterFixed | SixtyFourBits | CCMN,
+  CCMP_w = ConditionalCompareRegisterFixed | CCMP,
+  CCMP_x = ConditionalCompareRegisterFixed | SixtyFourBits | CCMP
+};
+
+// Conditional compare immediate.
+enum ConditionalCompareImmediateOp {
+  ConditionalCompareImmediateFixed = 0x1A400800,
+  ConditionalCompareImmediateFMask = 0x1FE00800,
+  ConditionalCompareImmediateMask  = 0xFFE00C10,
+  CCMN_w_imm = ConditionalCompareImmediateFixed | CCMN,
+  CCMN_x_imm = ConditionalCompareImmediateFixed | SixtyFourBits | CCMN,
+  CCMP_w_imm = ConditionalCompareImmediateFixed | CCMP,
+  CCMP_x_imm = ConditionalCompareImmediateFixed | SixtyFourBits | CCMP
+};
+
+// Conditional select.
+enum ConditionalSelectOp {
+  ConditionalSelectFixed = 0x1A800000,
+  ConditionalSelectFMask = 0x1FE00000,
+  ConditionalSelectMask  = 0xFFE00C00,
+  CSEL_w                 = ConditionalSelectFixed | 0x00000000,
+  CSEL_x                 = ConditionalSelectFixed | 0x80000000,
+  CSEL                   = CSEL_w,
+  CSINC_w                = ConditionalSelectFixed | 0x00000400,
+  CSINC_x                = ConditionalSelectFixed | 0x80000400,
+  CSINC                  = CSINC_w,
+  CSINV_w                = ConditionalSelectFixed | 0x40000000,
+  CSINV_x                = ConditionalSelectFixed | 0xC0000000,
+  CSINV                  = CSINV_w,
+  CSNEG_w                = ConditionalSelectFixed | 0x40000400,
+  CSNEG_x                = ConditionalSelectFixed | 0xC0000400,
+  CSNEG                  = CSNEG_w
+};
+
+// Data processing 1 source.
+enum DataProcessing1SourceOp {
+  DataProcessing1SourceFixed = 0x5AC00000,
+  DataProcessing1SourceFMask = 0x5FE00000,
+  DataProcessing1SourceMask  = 0xFFFFFC00,
+  RBIT    = DataProcessing1SourceFixed | 0x00000000,
+  RBIT_w  = RBIT,
+  RBIT_x  = RBIT | SixtyFourBits,
+  REV16   = DataProcessing1SourceFixed | 0x00000400,
+  REV16_w = REV16,
+  REV16_x = REV16 | SixtyFourBits,
+  REV     = DataProcessing1SourceFixed | 0x00000800,
+  REV_w   = REV,
+  REV32_x = REV | SixtyFourBits,
+  REV_x   = DataProcessing1SourceFixed | SixtyFourBits | 0x00000C00,
+  CLZ     = DataProcessing1SourceFixed | 0x00001000,
+  CLZ_w   = CLZ,
+  CLZ_x   = CLZ | SixtyFourBits,
+  CLS     = DataProcessing1SourceFixed | 0x00001400,
+  CLS_w   = CLS,
+  CLS_x   = CLS | SixtyFourBits
+};
+
+// Data processing 2 source.
+enum DataProcessing2SourceOp {
+  DataProcessing2SourceFixed = 0x1AC00000,
+  DataProcessing2SourceFMask = 0x5FE00000,
+  DataProcessing2SourceMask  = 0xFFE0FC00,
+  UDIV_w                     = DataProcessing2SourceFixed | 0x00000800,
+  UDIV_x                     = DataProcessing2SourceFixed | 0x80000800,
+  UDIV                       = UDIV_w,
+  SDIV_w                     = DataProcessing2SourceFixed | 0x00000C00,
+  SDIV_x                     = DataProcessing2SourceFixed | 0x80000C00,
+  SDIV                       = SDIV_w,
+  LSLV_w                     = DataProcessing2SourceFixed | 0x00002000,
+  LSLV_x                     = DataProcessing2SourceFixed | 0x80002000,
+  LSLV                       = LSLV_w,
+  LSRV_w                     = DataProcessing2SourceFixed | 0x00002400,
+  LSRV_x                     = DataProcessing2SourceFixed | 0x80002400,
+  LSRV                       = LSRV_w,
+  ASRV_w                     = DataProcessing2SourceFixed | 0x00002800,
+  ASRV_x                     = DataProcessing2SourceFixed | 0x80002800,
+  ASRV                       = ASRV_w,
+  RORV_w                     = DataProcessing2SourceFixed | 0x00002C00,
+  RORV_x                     = DataProcessing2SourceFixed | 0x80002C00,
+  RORV                       = RORV_w
+};
+
+// Data processing 3 source.
+enum DataProcessing3SourceOp {
+  DataProcessing3SourceFixed = 0x1B000000,
+  DataProcessing3SourceFMask = 0x1F000000,
+  DataProcessing3SourceMask  = 0xFFE08000,
+  MADD_w                     = DataProcessing3SourceFixed | 0x00000000,
+  MADD_x                     = DataProcessing3SourceFixed | 0x80000000,
+  MADD                       = MADD_w,
+  MSUB_w                     = DataProcessing3SourceFixed | 0x00008000,
+  MSUB_x                     = DataProcessing3SourceFixed | 0x80008000,
+  MSUB                       = MSUB_w,
+  SMADDL_x                   = DataProcessing3SourceFixed | 0x80200000,
+  SMSUBL_x                   = DataProcessing3SourceFixed | 0x80208000,
+  SMULH_x                    = DataProcessing3SourceFixed | 0x80400000,
+  UMADDL_x                   = DataProcessing3SourceFixed | 0x80A00000,
+  UMSUBL_x                   = DataProcessing3SourceFixed | 0x80A08000,
+  UMULH_x                    = DataProcessing3SourceFixed | 0x80C00000
+};
+
+// Floating point compare.
+enum FPCompareOp {
+  FPCompareFixed = 0x1E202000,
+  FPCompareFMask = 0x5F203C00,
+  FPCompareMask  = 0xFFE0FC1F,
+  FCMP_s         = FPCompareFixed | 0x00000000,
+  FCMP_d         = FPCompareFixed | FP64 | 0x00000000,
+  FCMP           = FCMP_s,
+  FCMP_s_zero    = FPCompareFixed | 0x00000008,
+  FCMP_d_zero    = FPCompareFixed | FP64 | 0x00000008,
+  FCMP_zero      = FCMP_s_zero,
+  FCMPE_s        = FPCompareFixed | 0x00000010,
+  FCMPE_d        = FPCompareFixed | FP64 | 0x00000010,
+  FCMPE_s_zero   = FPCompareFixed | 0x00000018,
+  FCMPE_d_zero   = FPCompareFixed | FP64 | 0x00000018
+};
+
+// Floating point conditional compare.
+enum FPConditionalCompareOp {
+  FPConditionalCompareFixed = 0x1E200400,
+  FPConditionalCompareFMask = 0x5F200C00,
+  FPConditionalCompareMask  = 0xFFE00C10,
+  FCCMP_s                   = FPConditionalCompareFixed | 0x00000000,
+  FCCMP_d                   = FPConditionalCompareFixed | FP64 | 0x00000000,
+  FCCMP                     = FCCMP_s,
+  FCCMPE_s                  = FPConditionalCompareFixed | 0x00000010,
+  FCCMPE_d                  = FPConditionalCompareFixed | FP64 | 0x00000010,
+  FCCMPE                    = FCCMPE_s
+};
+
+// Floating point conditional select.
+enum FPConditionalSelectOp {
+  FPConditionalSelectFixed = 0x1E200C00,
+  FPConditionalSelectFMask = 0x5F200C00,
+  FPConditionalSelectMask  = 0xFFE00C00,
+  FCSEL_s                  = FPConditionalSelectFixed | 0x00000000,
+  FCSEL_d                  = FPConditionalSelectFixed | FP64 | 0x00000000,
+  FCSEL                    = FCSEL_s
+};
+
+// Floating point immediate.
+enum FPImmediateOp {
+  FPImmediateFixed = 0x1E201000,
+  FPImmediateFMask = 0x5F201C00,
+  FPImmediateMask  = 0xFFE01C00,
+  FMOV_s_imm       = FPImmediateFixed | 0x00000000,
+  FMOV_d_imm       = FPImmediateFixed | FP64 | 0x00000000
+};
+
+// Floating point data processing 1 source.
+enum FPDataProcessing1SourceOp {
+  FPDataProcessing1SourceFixed = 0x1E204000,
+  FPDataProcessing1SourceFMask = 0x5F207C00,
+  FPDataProcessing1SourceMask  = 0xFFFFFC00,
+  FMOV_s   = FPDataProcessing1SourceFixed | 0x00000000,
+  FMOV_d   = FPDataProcessing1SourceFixed | FP64 | 0x00000000,
+  FMOV     = FMOV_s,
+  FABS_s   = FPDataProcessing1SourceFixed | 0x00008000,
+  FABS_d   = FPDataProcessing1SourceFixed | FP64 | 0x00008000,
+  FABS     = FABS_s,
+  FNEG_s   = FPDataProcessing1SourceFixed | 0x00010000,
+  FNEG_d   = FPDataProcessing1SourceFixed | FP64 | 0x00010000,
+  FNEG     = FNEG_s,
+  FSQRT_s  = FPDataProcessing1SourceFixed | 0x00018000,
+  FSQRT_d  = FPDataProcessing1SourceFixed | FP64 | 0x00018000,
+  FSQRT    = FSQRT_s,
+  FCVT_ds  = FPDataProcessing1SourceFixed | 0x00028000,
+  FCVT_sd  = FPDataProcessing1SourceFixed | FP64 | 0x00020000,
+  FRINTN_s = FPDataProcessing1SourceFixed | 0x00040000,
+  FRINTN_d = FPDataProcessing1SourceFixed | FP64 | 0x00040000,
+  FRINTN   = FRINTN_s,
+  FRINTP_s = FPDataProcessing1SourceFixed | 0x00048000,
+  FRINTP_d = FPDataProcessing1SourceFixed | FP64 | 0x00048000,
+  FRINTM_s = FPDataProcessing1SourceFixed | 0x00050000,
+  FRINTM_d = FPDataProcessing1SourceFixed | FP64 | 0x00050000,
+  FRINTZ_s = FPDataProcessing1SourceFixed | 0x00058000,
+  FRINTZ_d = FPDataProcessing1SourceFixed | FP64 | 0x00058000,
+  FRINTZ   = FRINTZ_s,
+  FRINTA_s = FPDataProcessing1SourceFixed | 0x00060000,
+  FRINTA_d = FPDataProcessing1SourceFixed | FP64 | 0x00060000,
+  FRINTX_s = FPDataProcessing1SourceFixed | 0x00070000,
+  FRINTX_d = FPDataProcessing1SourceFixed | FP64 | 0x00070000,
+  FRINTI_s = FPDataProcessing1SourceFixed | 0x00078000,
+  FRINTI_d = FPDataProcessing1SourceFixed | FP64 | 0x00078000
+};
+
+// Floating point data processing 2 source.
+enum FPDataProcessing2SourceOp {
+  FPDataProcessing2SourceFixed = 0x1E200800,
+  FPDataProcessing2SourceFMask = 0x5F200C00,
+  FPDataProcessing2SourceMask  = 0xFFE0FC00,
+  FMUL     = FPDataProcessing2SourceFixed | 0x00000000,
+  FMUL_s   = FMUL,
+  FMUL_d   = FMUL | FP64,
+  FDIV     = FPDataProcessing2SourceFixed | 0x00001000,
+  FDIV_s   = FDIV,
+  FDIV_d   = FDIV | FP64,
+  FADD     = FPDataProcessing2SourceFixed | 0x00002000,
+  FADD_s   = FADD,
+  FADD_d   = FADD | FP64,
+  FSUB     = FPDataProcessing2SourceFixed | 0x00003000,
+  FSUB_s   = FSUB,
+  FSUB_d   = FSUB | FP64,
+  FMAX     = FPDataProcessing2SourceFixed | 0x00004000,
+  FMAX_s   = FMAX,
+  FMAX_d   = FMAX | FP64,
+  FMIN     = FPDataProcessing2SourceFixed | 0x00005000,
+  FMIN_s   = FMIN,
+  FMIN_d   = FMIN | FP64,
+  FMAXNM   = FPDataProcessing2SourceFixed | 0x00006000,
+  FMAXNM_s = FMAXNM,
+  FMAXNM_d = FMAXNM | FP64,
+  FMINNM   = FPDataProcessing2SourceFixed | 0x00007000,
+  FMINNM_s = FMINNM,
+  FMINNM_d = FMINNM | FP64,
+  FNMUL    = FPDataProcessing2SourceFixed | 0x00008000,
+  FNMUL_s  = FNMUL,
+  FNMUL_d  = FNMUL | FP64
+};
+
+// Floating point data processing 3 source.
+enum FPDataProcessing3SourceOp {
+  FPDataProcessing3SourceFixed = 0x1F000000,
+  FPDataProcessing3SourceFMask = 0x5F000000,
+  FPDataProcessing3SourceMask  = 0xFFE08000,
+  FMADD_s                      = FPDataProcessing3SourceFixed | 0x00000000,
+  FMSUB_s                      = FPDataProcessing3SourceFixed | 0x00008000,
+  FNMADD_s                     = FPDataProcessing3SourceFixed | 0x00200000,
+  FNMSUB_s                     = FPDataProcessing3SourceFixed | 0x00208000,
+  FMADD_d                      = FPDataProcessing3SourceFixed | 0x00400000,
+  FMSUB_d                      = FPDataProcessing3SourceFixed | 0x00408000,
+  FNMADD_d                     = FPDataProcessing3SourceFixed | 0x00600000,
+  FNMSUB_d                     = FPDataProcessing3SourceFixed | 0x00608000
+};
+
+// Conversion between floating point and integer.
+enum FPIntegerConvertOp {
+  FPIntegerConvertFixed = 0x1E200000,
+  FPIntegerConvertFMask = 0x5F20FC00,
+  FPIntegerConvertMask  = 0xFFFFFC00,
+  FCVTNS    = FPIntegerConvertFixed | 0x00000000,
+  FCVTNS_ws = FCVTNS,
+  FCVTNS_xs = FCVTNS | SixtyFourBits,
+  FCVTNS_wd = FCVTNS | FP64,
+  FCVTNS_xd = FCVTNS | SixtyFourBits | FP64,
+  FCVTNU    = FPIntegerConvertFixed | 0x00010000,
+  FCVTNU_ws = FCVTNU,
+  FCVTNU_xs = FCVTNU | SixtyFourBits,
+  FCVTNU_wd = FCVTNU | FP64,
+  FCVTNU_xd = FCVTNU | SixtyFourBits | FP64,
+  FCVTPS    = FPIntegerConvertFixed | 0x00080000,
+  FCVTPS_ws = FCVTPS,
+  FCVTPS_xs = FCVTPS | SixtyFourBits,
+  FCVTPS_wd = FCVTPS | FP64,
+  FCVTPS_xd = FCVTPS | SixtyFourBits | FP64,
+  FCVTPU    = FPIntegerConvertFixed | 0x00090000,
+  FCVTPU_ws = FCVTPU,
+  FCVTPU_xs = FCVTPU | SixtyFourBits,
+  FCVTPU_wd = FCVTPU | FP64,
+  FCVTPU_xd = FCVTPU | SixtyFourBits | FP64,
+  FCVTMS    = FPIntegerConvertFixed | 0x00100000,
+  FCVTMS_ws = FCVTMS,
+  FCVTMS_xs = FCVTMS | SixtyFourBits,
+  FCVTMS_wd = FCVTMS | FP64,
+  FCVTMS_xd = FCVTMS | SixtyFourBits | FP64,
+  FCVTMU    = FPIntegerConvertFixed | 0x00110000,
+  FCVTMU_ws = FCVTMU,
+  FCVTMU_xs = FCVTMU | SixtyFourBits,
+  FCVTMU_wd = FCVTMU | FP64,
+  FCVTMU_xd = FCVTMU | SixtyFourBits | FP64,
+  FCVTZS    = FPIntegerConvertFixed | 0x00180000,
+  FCVTZS_ws = FCVTZS,
+  FCVTZS_xs = FCVTZS | SixtyFourBits,
+  FCVTZS_wd = FCVTZS | FP64,
+  FCVTZS_xd = FCVTZS | SixtyFourBits | FP64,
+  FCVTZU    = FPIntegerConvertFixed | 0x00190000,
+  FCVTZU_ws = FCVTZU,
+  FCVTZU_xs = FCVTZU | SixtyFourBits,
+  FCVTZU_wd = FCVTZU | FP64,
+  FCVTZU_xd = FCVTZU | SixtyFourBits | FP64,
+  SCVTF     = FPIntegerConvertFixed | 0x00020000,
+  SCVTF_sw  = SCVTF,
+  SCVTF_sx  = SCVTF | SixtyFourBits,
+  SCVTF_dw  = SCVTF | FP64,
+  SCVTF_dx  = SCVTF | SixtyFourBits | FP64,
+  UCVTF     = FPIntegerConvertFixed | 0x00030000,
+  UCVTF_sw  = UCVTF,
+  UCVTF_sx  = UCVTF | SixtyFourBits,
+  UCVTF_dw  = UCVTF | FP64,
+  UCVTF_dx  = UCVTF | SixtyFourBits | FP64,
+  FCVTAS    = FPIntegerConvertFixed | 0x00040000,
+  FCVTAS_ws = FCVTAS,
+  FCVTAS_xs = FCVTAS | SixtyFourBits,
+  FCVTAS_wd = FCVTAS | FP64,
+  FCVTAS_xd = FCVTAS | SixtyFourBits | FP64,
+  FCVTAU    = FPIntegerConvertFixed | 0x00050000,
+  FCVTAU_ws = FCVTAU,
+  FCVTAU_xs = FCVTAU | SixtyFourBits,
+  FCVTAU_wd = FCVTAU | FP64,
+  FCVTAU_xd = FCVTAU | SixtyFourBits | FP64,
+  FMOV_ws   = FPIntegerConvertFixed | 0x00060000,
+  FMOV_sw   = FPIntegerConvertFixed | 0x00070000,
+  FMOV_xd   = FMOV_ws | SixtyFourBits | FP64,
+  FMOV_dx   = FMOV_sw | SixtyFourBits | FP64
+};
+
+// Conversion between fixed point and floating point.
+enum FPFixedPointConvertOp {
+  FPFixedPointConvertFixed = 0x1E000000,
+  FPFixedPointConvertFMask = 0x5F200000,
+  FPFixedPointConvertMask  = 0xFFFF0000,
+  FCVTZS_fixed    = FPFixedPointConvertFixed | 0x00180000,
+  FCVTZS_ws_fixed = FCVTZS_fixed,
+  FCVTZS_xs_fixed = FCVTZS_fixed | SixtyFourBits,
+  FCVTZS_wd_fixed = FCVTZS_fixed | FP64,
+  FCVTZS_xd_fixed = FCVTZS_fixed | SixtyFourBits | FP64,
+  FCVTZU_fixed    = FPFixedPointConvertFixed | 0x00190000,
+  FCVTZU_ws_fixed = FCVTZU_fixed,
+  FCVTZU_xs_fixed = FCVTZU_fixed | SixtyFourBits,
+  FCVTZU_wd_fixed = FCVTZU_fixed | FP64,
+  FCVTZU_xd_fixed = FCVTZU_fixed | SixtyFourBits | FP64,
+  SCVTF_fixed     = FPFixedPointConvertFixed | 0x00020000,
+  SCVTF_sw_fixed  = SCVTF_fixed,
+  SCVTF_sx_fixed  = SCVTF_fixed | SixtyFourBits,
+  SCVTF_dw_fixed  = SCVTF_fixed | FP64,
+  SCVTF_dx_fixed  = SCVTF_fixed | SixtyFourBits | FP64,
+  UCVTF_fixed     = FPFixedPointConvertFixed | 0x00030000,
+  UCVTF_sw_fixed  = UCVTF_fixed,
+  UCVTF_sx_fixed  = UCVTF_fixed | SixtyFourBits,
+  UCVTF_dw_fixed  = UCVTF_fixed | FP64,
+  UCVTF_dx_fixed  = UCVTF_fixed | SixtyFourBits | FP64
+};
+
+// Unknown instruction. These are defined to make fixed bit assertion easier.
+enum UnknownOp {
+  UnknownFixed = 0x00000000,
+  UnknownFMask = 0x00000000
+};
+}  // namespace vixl
+
+#endif  // VIXL_A64_CONSTANTS_A64_H_
diff --git a/src/a64/cpu-a64.cc b/src/a64/cpu-a64.cc
new file mode 100644
index 0000000..8586563
--- /dev/null
+++ b/src/a64/cpu-a64.cc
@@ -0,0 +1,148 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "utils.h"
+#include "a64/cpu-a64.h"
+
+namespace vixl {
+
+// Initialise to smallest possible cache size.
+unsigned CPU::dcache_line_size_ = 1;
+unsigned CPU::icache_line_size_ = 1;
+
+
+// Currently computes I and D cache line size.
+void CPU::SetUp() {
+  uint32_t cache_type_register = GetCacheType();
+
+  // The cache type register holds information about the caches, including the
+  // I and D cache line sizes.
+  static const int kDCacheLineSizeShift = 16;
+  static const int kICacheLineSizeShift = 0;
+  static const uint32_t kDCacheLineSizeMask = 0xf << kDCacheLineSizeShift;
+  static const uint32_t kICacheLineSizeMask = 0xf << kICacheLineSizeShift;
+
+  // The cache type register holds the size of the I and D caches as a power of
+  // two.
+  uint32_t dcache_line_size_power_of_two =
+      (cache_type_register & kDCacheLineSizeMask) >> kDCacheLineSizeShift;
+  uint32_t icache_line_size_power_of_two =
+      (cache_type_register & kICacheLineSizeMask) >> kICacheLineSizeShift;
+
+  dcache_line_size_ = 1 << dcache_line_size_power_of_two;
+  icache_line_size_ = 1 << icache_line_size_power_of_two;
+}
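[Editor's note] As a worked example of the code above: if the D-cache field of the cache type register reads 4, dcache_line_size_ becomes 1 << 4 = 16; a field of 0 leaves the smallest possible line size of 1.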
+
+
+uint32_t CPU::GetCacheType() {
+#ifdef USE_SIMULATOR
+  // This will lead to a cache with 1 byte long lines, which is fine since the
+  // simulator will not need this information.
+  return 0;
+#else
+  uint32_t cache_type_register;
+  // Copy the content of the cache type register to a core register.
+  __asm__ __volatile__ ("mrs %[ctr], ctr_el0"  // NOLINT
+                        : [ctr] "=r" (cache_type_register));
+  return cache_type_register;
+#endif
+}
+
+
+void CPU::EnsureIAndDCacheCoherency(void *address, size_t length) {
+#ifdef USE_SIMULATOR
+  USE(address);
+  USE(length);
+  // TODO: consider adding cache simulation to ensure every address that has
+  // been executed has also been synchronised.
+#else
+  // The code below assumes user space cache operations are allowed.
+
+  uintptr_t start = reinterpret_cast<uintptr_t>(address);
+  // Sizes will be used to generate a mask big enough to cover a pointer.
+  uintptr_t dsize = static_cast<uintptr_t>(dcache_line_size_);
+  uintptr_t isize = static_cast<uintptr_t>(icache_line_size_);
+  // Cache line sizes are always a power of 2.
+  ASSERT(CountSetBits(dsize, 64) == 1);
+  ASSERT(CountSetBits(isize, 64) == 1);
+  uintptr_t dstart = start & ~(dsize - 1);
+  uintptr_t istart = start & ~(isize - 1);
+  uintptr_t end = start + length;
+
+  __asm__ __volatile__ (  // NOLINT
+    // Clean every line of the D cache containing the target data.
+    "0:                                \n\t"
+    // dc      : Data Cache maintenance
+    //    c    : Clean
+    //     va  : by (Virtual) Address
+    //       u : to the point of Unification
+    // The point of unification for a processor is the point by which the
+    // instruction and data caches are guaranteed to see the same copy of a
+    // memory location. See ARM DDI 0406B page B2-12 for more information.
+    "dc   cvau, %[dline]                \n\t"
+    "add  %[dline], %[dline], %[dsize]  \n\t"
+    "cmp  %[dline], %[end]              \n\t"
+    "b.lt 0b                            \n\t"
+    // Barrier to make sure the effect of the code above is visible to the rest
+    // of the world.
+    // dsb    : Data Synchronisation Barrier
+    //    ish : Inner SHareable domain
+    // The point of unification for an Inner Shareable shareability domain is
+    // the point by which the instruction and data caches of all the processors
+    // in that Inner Shareable shareability domain are guaranteed to see the
+    // same copy of a memory location.  See ARM DDI 0406B page B2-12 for more
+    // information.
+    "dsb  ish                           \n\t"
+    // Invalidate every line of the I cache containing the target data.
+    "1:                                 \n\t"
+    // ic      : instruction cache maintenance
+    //    i    : invalidate
+    //     va  : by address
+    //       u : to the point of unification
+    "ic   ivau, %[iline]                \n\t"
+    "add  %[iline], %[iline], %[isize]  \n\t"
+    "cmp  %[iline], %[end]              \n\t"
+    "b.lt 1b                            \n\t"
+    // Barrier to make sure the effect of the code above is visible to the rest
+    // of the world.
+    "dsb  ish                           \n\t"
+    // Barrier to ensure any prefetching which happened before this code is
+    // discarded.
+    // isb : Instruction Synchronisation Barrier
+    "isb                                \n\t"
+    : [dline] "+r" (dstart),
+      [iline] "+r" (istart)
+    : [dsize] "r"  (dsize),
+      [isize] "r"  (isize),
+      [end]   "r"  (end)
+    // This code does not write to memory, but without the dependency GCC might
+    // move it before the generated code has been written to memory.
+    : "cc", "memory"
+  );  // NOLINT
+#endif
+}
+
+}  // namespace vixl
diff --git a/src/a64/cpu-a64.h b/src/a64/cpu-a64.h
new file mode 100644
index 0000000..dfd8f01
--- /dev/null
+++ b/src/a64/cpu-a64.h
@@ -0,0 +1,56 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_CPU_A64_H
+#define VIXL_CPU_A64_H
+
+#include "globals.h"
+
+namespace vixl {
+
+class CPU {
+ public:
+  // Initialise CPU support.
+  static void SetUp();
+
+  // Ensures the data at a given address and with a given size is the same for
+  // the I and D caches. I and D caches are not automatically coherent on ARM
+  // so this operation is required before any dynamically generated code can
+  // safely run.
+  static void EnsureIAndDCacheCoherency(void *address, size_t length);
+
+ private:
+  // Return the content of the cache type register.
+  static uint32_t GetCacheType();
+
+  // I and D cache line size in bytes.
+  static unsigned icache_line_size_;
+  static unsigned dcache_line_size_;
+};
+
+}  // namespace vixl
+
+#endif  // VIXL_CPU_A64_H
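[Editor's note, not part of the patch] A minimal usage sketch for the CPU helpers declared above; `buffer` and `size` are assumed to describe a region that has just been filled with generated code:

    CPU::SetUp();                                   // query cache line sizes once
    // ... write instructions into `buffer` (`size` bytes long) ...
    CPU::EnsureIAndDCacheCoherency(buffer, size);   // make the code visible to the I cache
    // The generated code can now be executed safely.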
diff --git a/src/a64/debugger-a64.cc b/src/a64/debugger-a64.cc
new file mode 100644
index 0000000..f817203
--- /dev/null
+++ b/src/a64/debugger-a64.cc
@@ -0,0 +1,1511 @@
+// Copyright 2013 ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY ARM LIMITED AND CONTRIBUTORS "AS IS" AND ANY
+// EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL ARM LIMITED BE LIABLE FOR ANY DIRECT, INDIRECT,
+// INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
+// OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+// LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+// NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+// EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "a64/debugger-a64.h"
+
+namespace vixl {
+
+// List of commands supported by the debugger.
+#define DEBUG_COMMAND_LIST(C)  \
+C(HelpCommand)                 \
+C(ContinueCommand)             \
+C(StepCommand)                 \
+C(DisasmCommand)               \
+C(PrintCommand)                \
+C(MemCommand)
+
+// Debugger command lines are broken up into tokens of different types to make
+// processing easier later on.
+class Token {
+ public:
+  virtual ~Token() {}
+
+  // Token type.
+  virtual bool IsRegister() const { return false; }
+  virtual bool IsFPRegister() const { return false; }
+  virtual bool IsIdentifier() const { return false; }
+  virtual bool IsAddress() const { return false; }
+  virtual bool IsInteger() const { return false; }
+  virtual bool IsFormat() const { return false; }
+  virtual bool IsUnknown() const { return false; }
+  // Token properties.
+  virtual bool CanAddressMemory() const { return false; }
+  virtual uint8_t* ToAddress(Debugger* debugger) const;
+  virtual void Print(FILE* out = stdout) const = 0;
+
+  static Token* Tokenize(const char* arg);
+};
+
+// Tokens often hold one value.
+template<typename T> class ValueToken : public Token {
+ public:
+  explicit ValueToken(T value) : value_(value) {}
+  ValueToken() {}
+
+  T value() const { return value_; }
+
+ protected:
+  T value_;
+};
+
+// Integer registers (X or W) and their aliases.
+// Format: wn or xn with 0 <= n < 32 or a name in the aliases list.
+class RegisterToken : public ValueToken<const Register> {
+ public:
+  explicit RegisterToken(const Register reg)
+      : ValueToken<const Register>(reg) {}
+
+  virtual bool IsRegister() const { return true; }
+  virtual bool CanAddressMemory() const { return value().Is64Bits(); }
+  virtual uint8_t* ToAddress(Debugger* debugger) const;
+  virtual void Print(FILE* out = stdout) const;
+  const char* Name() const;
+
+  static Token* Tokenize(const char* arg);
+  static RegisterToken* Cast(Token* tok) {
+    ASSERT(tok->IsRegister());
+    return reinterpret_cast<RegisterToken*>(tok);
+  }
+
+ private:
+  static const int kMaxAliasNumber = 4;
+  static const char* kXAliases[kNumberOfRegisters][kMaxAliasNumber];
+  static const char* kWAliases[kNumberOfRegisters][kMaxAliasNumber];
+};
+
+// Floating point registers (D or S).
+// Format: sn or dn with 0 <= n < 32.
+class FPRegisterToken : public ValueToken<const FPRegister> {
+ public:
+  explicit FPRegisterToken(const FPRegister fpreg)
+      : ValueToken<const FPRegister>(fpreg) {}
+
+  virtual bool IsFPRegister() const { return true; }
+  virtual void Print(FILE* out = stdout) const;
+
+  static Token* Tokenize(const char* arg);
+  static FPRegisterToken* Cast(Token* tok) {
+    ASSERT(tok->IsFPRegister());
+    return reinterpret_cast<FPRegisterToken*>(tok);
+  }
+};
+
+
+// Non-register identifiers.
+// Format: Alphanumeric string starting with a letter.
+class IdentifierToken : public ValueToken<char*> {
+ public:
+  explicit IdentifierToken(const char* name) {
+    int size = strlen(name) + 1;
+    value_ = new char[size];
+    strncpy(value_, name, size);
+  }
+  virtual ~IdentifierToken() { delete[] value_; }
+
+  virtual bool IsIdentifier() const { return true; }
+  virtual bool CanAddressMemory() const { return strcmp(value(), "pc") == 0; }
+  virtual uint8_t* ToAddress(Debugger* debugger) const;
+  virtual void Print(FILE* out = stdout) const;
+
+  static Token* Tokenize(const char* arg);
+  static IdentifierToken* Cast(Token* tok) {
+    ASSERT(tok->IsIdentifier());
+    return reinterpret_cast<IdentifierToken*>(tok);
+  }
+};
+
+// 64-bit address literal.
+// Format: 0x... with up to 16 hexadecimal digits.
+class AddressToken : public ValueToken<uint8_t*> {
+ public:
+  explicit AddressToken(uint8_t* address) : ValueToken<uint8_t*>(address) {}
+
+  virtual bool IsAddress() const { return true; }
+  virtual bool CanAddressMemory() const { return true; }
+  virtual uint8_t* ToAddress(Debugger* debugger) const;
+  virtual void Print(FILE* out = stdout) const;
+
+  static Token* Tokenize(const char* arg);
+  static AddressToken* Cast(Token* tok) {
+    ASSERT(tok->IsAddress());
+    return reinterpret_cast<AddressToken*>(tok);
+  }
+};
+
+
+// 64-bit decimal integer literal.
+// Format: n.
+class IntegerToken : public ValueToken<int64_t> {
+ public:
+  explicit IntegerToken(int64_t value) : ValueToken<int64_t>(value) {}
+
+  virtual bool IsInteger() const { return true; }
+  virtual void Print(FILE* out = stdout) const;
+
+  static Token* Tokenize(const char* arg);
+  static IntegerToken* Cast(Token* tok) {
+    ASSERT(tok->IsInteger());
+    return reinterpret_cast<IntegerToken*>(tok);
+  }
+};
+
+// Literal describing how to print a chunk of data (up to 64 bits).
+// Format: %qt
+// where q (qualifier) is one of
+//  * s: signed integer
+//  * u: unsigned integer
+//  * a: hexadecimal floating point
+// and t (type) is one of
+//  * x: 64-bit integer
+//  * w: 32-bit integer
+//  * h: 16-bit integer
+//  * b: 8-bit integer
+//  * c: character
+//  * d: double
+//  * s: float
+// When no qualifier is given for integers, they are printed in hexadecimal.
+class FormatToken : public Token {
+ public:
+  FormatToken() {}
+
+  virtual bool IsFormat() const { return true; }
+  virtual int SizeOf() const = 0;
+  virtual void PrintData(void* data, FILE* out = stdout) const = 0;
+  virtual void Print(FILE* out = stdout) const = 0;
+
+  static Token* Tokenize(const char* arg);
+  static FormatToken* Cast(Token* tok) {
+    ASSERT(tok->IsFormat());
+    return reinterpret_cast<FormatToken*>(tok);
+  }
+};
+
+
+template<typename T> class Format : public FormatToken {
+ public:
+  explicit Format(const char* fmt) : fmt_(fmt) {}
+
+  virtual int SizeOf() const { return sizeof(T); }
+  virtual void PrintData(void* data, FILE* out = stdout) const {
+    T value;
+    memcpy(&value, data, sizeof(value));
+    fprintf(out, fmt_, value);
+  }
+  virtual void Print(FILE* out = stdout) const;
+
+ private:
+  const char* fmt_;
+};
+
+// Tokens which don't fit any of the above.
+class UnknownToken : public Token {
+ public:
+  explicit UnknownToken(const char* arg) {
+    int size = strlen(arg) + 1;
+    unknown_ = new char[size];
+    strncpy(unknown_, arg, size);
+  }
+  virtual ~UnknownToken() { delete[] unknown_; }
+
+  virtual bool IsUnknown() const { return true; }
+  virtual void Print(FILE* out = stdout) const;
+
+ private:
+  char* unknown_;
+};
+
+
+// All debugger commands must subclass DebugCommand and implement Run, Print
+// and Build. Commands must also define kHelp and kAliases.
+class DebugCommand {
+ public:
+  explicit DebugCommand(Token* name) : name_(IdentifierToken::Cast(name)) {}
+  DebugCommand() : name_(NULL) {}
+  virtual ~DebugCommand() { delete name_; }
+
+  const char* name() { return name_->value(); }
+  // Run the command on the given debugger. The command returns true if
+  // execution should move to the next instruction.
+  virtual bool Run(Debugger* debugger) = 0;
+  virtual void Print(FILE* out = stdout);
+
+  static bool Match(const char* name, const char** aliases);
+  static DebugCommand* Parse(char* line);
+  static void PrintHelp(const char** aliases,
+                        const char* args,
+                        const char* help);
+
+ private:
+  IdentifierToken* name_;
+};
+
+// For all commands below see their respective kHelp and kAliases in
+// debugger-a64.cc
+class HelpCommand : public DebugCommand {
+ public:
+  explicit HelpCommand(Token* name) : DebugCommand(name) {}
+
+  virtual bool Run(Debugger* debugger);
+
+  static DebugCommand* Build(std::vector<Token*> args);
+
+  static const char* kHelp;
+  static const char* kAliases[];
+  static const char* kArguments;
+};
+
+
+class ContinueCommand : public DebugCommand {
+ public:
+  explicit ContinueCommand(Token* name) : DebugCommand(name) {}
+
+  virtual bool Run(Debugger* debugger);
+
+  static DebugCommand* Build(std::vector<Token*> args);
+
+  static const char* kHelp;
+  static const char* kAliases[];
+  static const char* kArguments;
+};
+
+
+class StepCommand : public DebugCommand {
+ public:
+  StepCommand(Token* name, IntegerToken* count)
+      : DebugCommand(name), count_(count) {}
+  virtual ~StepCommand() { delete count_; }
+
+  int64_t count() { return count_->value(); }
+  virtual bool Run(Debugger* debugger);
+  virtual void Print(FILE* out = stdout);
+
+  static DebugCommand* Build(std::vector<Token*> args);
+
+  static const char* kHelp;
+  static const char* kAliases[];
+  static const char* kArguments;
+
+ private:
+  IntegerToken* count_;
+};
+
+class DisasmCommand : public DebugCommand {
+ public:
+  DisasmCommand(Token* name, Token* target, IntegerToken* count)
+      : DebugCommand(name), target_(target), count_(count) {}
+  virtual ~DisasmCommand() {
+    delete target_;
+    delete count_;
+  }
+
+  Token* target() { return target_; }
+  int64_t count() { return count_->value(); }
+  virtual bool Run(Debugger* debugger);
+  virtual void Print(FILE* out = stdout);
+
+  static DebugCommand* Build(std::vector<Token*> args);
+
+  static const char* kHelp;
+  static const char* kAliases[];
+  static const char* kArguments;
+
+ private:
+  Token* target_;
+  IntegerToken* count_;
+};
+
+
+class PrintCommand : public DebugCommand {
+ public:
+  PrintCommand(Token* name, Token* target)
+      : DebugCommand(name), target_(target) {}
+  virtual ~PrintCommand() { delete target_; }
+
+  Token* target() { return target_; }
+  virtual bool Run(Debugger* debugger);
+  virtual void Print(FILE* out = stdout);
+
+  static DebugCommand* Build(std::vector<Token*> args);
+
+  static const char* kHelp;
+  static const char* kAliases[];
+  static const char* kArguments;
+
+ private:
+  Token* target_;
+};
+
+class MemCommand : public DebugCommand {
+ public:
+  MemCommand(Token* name,
+             Token* target,
+             IntegerToken* count,
+             FormatToken* format)
+      : DebugCommand(name), target_(target), count_(count), format_(format) {}
+  virtual ~MemCommand() {
+    delete target_;
+    delete count_;
+    delete format_;
+  }
+
+  Token* target() { return target_; }
+  int64_t count() { return count_->value(); }
+  FormatToken* format() { return format_; }
+  virtual bool Run(Debugger* debugger);
+  virtual void Print(FILE* out = stdout);
+
+  static DebugCommand* Build(std::vector<Token*> args);
+
+  static const char* kHelp;
+  static const char* kAliases[];
+  static const char* kArguments;
+
+ private:
+  Token* target_;
+  IntegerToken* count_;
+  FormatToken* format_;
+};
+
+// Commands whose name does not match any of the known commands.
+class UnknownCommand : public DebugCommand {
+ public:
+  explicit UnknownCommand(std::vector<Token*> args) : args_(args) {}
+  virtual ~UnknownCommand();
+
+  virtual bool Run(Debugger* debugger);
+
+ private:
+  std::vector<Token*> args_;
+};
+
+// Commands whose name matches a known command but whose syntax is invalid.
+class InvalidCommand : public DebugCommand {
+ public:
+  InvalidCommand(std::vector<Token*> args, int index, const char* cause)
+      : args_(args), index_(index), cause_(cause) {}
+  virtual ~InvalidCommand();
+
+  virtual bool Run(Debugger* debugger);
+
+ private:
+  std::vector<Token*> args_;
+  int index_;
+  const char* cause_;
+};
+
+const char* HelpCommand::kAliases[] = { "help", NULL };
+const char* HelpCommand::kArguments = NULL;
+const char* HelpCommand::kHelp = "  print this help";
+
+const char* ContinueCommand::kAliases[] = { "continue", "c", NULL };
+const char* ContinueCommand::kArguments = NULL;
+const char* ContinueCommand::kHelp = "  resume execution";
+
+const char* StepCommand::kAliases[] = { "stepi", "si", NULL };
+const char* StepCommand::kArguments = "[n = 1]";
+const char* StepCommand::kHelp = "  execute n next instruction(s)";
+
+const char* DisasmCommand::kAliases[] = { "dis", "d", NULL };
+const char* DisasmCommand::kArguments = "[addr = pc] [n = 1]";
+const char* DisasmCommand::kHelp =
+  "  disassemble n instruction(s) at address addr.\n"
+  "  addr can be an immediate address, a register or the pc."
+;
+
+const char* PrintCommand::kAliases[] = { "print", "p", NULL };
+const char* PrintCommand::kArguments = "<entity>";
+const char* PrintCommand::kHelp =
+  "  print the given entity\n"
+  "  entity can be 'regs' for W and X registers, 'fpregs' for S and D\n"
+  "  registers, 'flags' for CPU flags and 'pc'."
+;
+
+const char* MemCommand::kAliases[] = { "mem", "m", NULL };
+const char* MemCommand::kArguments = "<addr> [n = 1] [format = %x]";
+const char* MemCommand::kHelp =
+  "  print n memory item(s) at address addr according to the given format.\n"
+  "  addr can be an immediate address, a register or the pc.\n"
+  "  format is made of a qualifer: 's', 'u', 'a' (signed, unsigned, hexa)\n"
+  "  and a type 'x', 'w', 'h', 'b' (64- to 8-bit integer), 'c' (character),\n"
+  "  's' (float) or 'd' (double). E.g 'mem sp %w' will print a 32-bit word\n"
+  "  from the stack as an hexadecimal number."
+;
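[Editor's note] Putting the help strings above together, a typical shell exchange might look like this (sketch only; output depends on the run):

    vixl> stepi 4          (execute the next four instructions)
    vixl> print regs       (dump the W and X registers)
    vixl> mem sp 2 %ux     (print two 64-bit unsigned values from the stack)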
+
+const char* RegisterToken::kXAliases[kNumberOfRegisters][kMaxAliasNumber] = {
+  { "x0", NULL },
+  { "x1", NULL },
+  { "x2", NULL },
+  { "x3", NULL },
+  { "x4", NULL },
+  { "x5", NULL },
+  { "x6", NULL },
+  { "x7", NULL },
+  { "x8", NULL },
+  { "x9", NULL },
+  { "x10", NULL },
+  { "x11", NULL },
+  { "x12", NULL },
+  { "x13", NULL },
+  { "x14", NULL },
+  { "x15", NULL },
+  { "ip0", "x16", NULL },
+  { "ip1", "x17", NULL },
+  { "x18", "pr", NULL },
+  { "x19", NULL },
+  { "x20", NULL },
+  { "x21", NULL },
+  { "x22", NULL },
+  { "x23", NULL },
+  { "x24", NULL },
+  { "x25", NULL },
+  { "x26", NULL },
+  { "x27", NULL },
+  { "x28", NULL },
+  { "fp", "x29", NULL },
+  { "lr", "x30", NULL },
+  { "sp", NULL}
+};
+
+const char* RegisterToken::kWAliases[kNumberOfRegisters][kMaxAliasNumber] = {
+  { "w0", NULL },
+  { "w1", NULL },
+  { "w2", NULL },
+  { "w3", NULL },
+  { "w4", NULL },
+  { "w5", NULL },
+  { "w6", NULL },
+  { "w7", NULL },
+  { "w8", NULL },
+  { "w9", NULL },
+  { "w10", NULL },
+  { "w11", NULL },
+  { "w12", NULL },
+  { "w13", NULL },
+  { "w14", NULL },
+  { "w15", NULL },
+  { "w16", NULL },
+  { "w17", NULL },
+  { "w18", NULL },
+  { "w19", NULL },
+  { "w20", NULL },
+  { "w21", NULL },
+  { "w22", NULL },
+  { "w23", NULL },
+  { "w24", NULL },
+  { "w25", NULL },
+  { "w26", NULL },
+  { "w27", NULL },
+  { "w28", NULL },
+  { "w29", NULL },
+  { "w30", NULL },
+  { "wsp", NULL }
+};
+
+
+Debugger::Debugger(Decoder* decoder, FILE* stream)
+    : Simulator(decoder, stream),
+      log_parameters_(0),
+      debug_parameters_(0),
+      pending_request_(false),
+      steps_(0),
+      last_command_(NULL) {
+  disasm_ = new PrintDisassembler(stdout);
+  printer_ = new Decoder();
+  printer_->AppendVisitor(disasm_);
+}
+
+
+void Debugger::Run() {
+  while (pc_ != kEndOfSimAddress) {
+    if (pending_request()) {
+      LogProcessorState();
+      RunDebuggerShell();
+    }
+
+    ExecuteInstruction();
+  }
+}
+
+
+void Debugger::PrintInstructions(void* address, int64_t count) {
+  if (count == 0) {
+    return;
+  }
+
+  Instruction* from = Instruction::Cast(address);
+  if (count < 0) {
+    count = -count;
+    from -= (count - 1) * kInstructionSize;
+  }
+  Instruction* to = from + count * kInstructionSize;
+
+  for (Instruction* current = from;
+       current < to;
+       current = current->NextInstruction()) {
+    printer_->Decode(current);
+  }
+}
+
+
+void Debugger::PrintMemory(const uint8_t* address,
+                           int64_t count,
+                           const FormatToken* format) {
+  if (count == 0) {
+    return;
+  }
+
+  const uint8_t* from = address;
+  int size = format->SizeOf();
+  if (count < 0) {
+    count = -count;
+    from -= (count - 1) * size;
+  }
+  const uint8_t* to = from + count * size;
+
+  for (const uint8_t* current = from; current < to; current += size) {
+    if (((current - from) % 16) == 0) {
+      printf("\n%p: ", current);
+    }
+
+    uint64_t data = MemoryRead(current, size);
+    format->PrintData(&data);
+    printf(" ");
+  }
+  printf("\n\n");
+}
+
+
+void Debugger::VisitException(Instruction* instr) {
+  switch (instr->Mask(ExceptionMask)) {
+    case BRK:
+      DoBreakpoint(instr);
+      return;
+    case HLT:
+      switch (instr->ImmException()) {
+        case kUnreachableOpcode:
+          DoUnreachable(instr);
+          return;
+        case kTraceOpcode:
+          DoTrace(instr);
+          return;
+        case kLogOpcode:
+          DoLog(instr);
+          return;
+      }
+      // Fall through
+    default: Simulator::VisitException(instr);
+  }
+}
+
+
+void Debugger::LogFlags() {
+  if (log_parameters_ & LOG_FLAGS) PrintFlags();
+}
+
+
+void Debugger::LogRegisters() {
+  if (log_parameters_ & LOG_REGS) PrintRegisters();
+}
+
+
+void Debugger::LogFPRegisters() {
+  if (log_parameters_ & LOG_FP_REGS) PrintFPRegisters();
+}
+
+
+void Debugger::LogProcessorState() {
+  LogFlags();
+  LogRegisters();
+  LogFPRegisters();
+}
+
+
+// Read a command. A command will be at most kMaxDebugShellLine characters long
+// and ends with '\n\0'.
+// TODO: Should this be a utility function?
+char* Debugger::ReadCommandLine(const char* prompt, char* buffer, int length) {
+  int fgets_calls = 0;
+  char* end = NULL;
+
+  printf("%s", prompt);
+  fflush(stdout);
+
+  do {
+    if (fgets(buffer, length, stdin) == NULL) {
+      printf(" ** Error while reading command. **\n");
+      return NULL;
+    }
+
+    fgets_calls++;
+    end = strchr(buffer, '\n');
+  } while (end == NULL);
+
+  if (fgets_calls != 1) {
+    printf(" ** Command too long. **\n");
+    return NULL;
+  }
+
+  // Remove the newline from the end of the command.
+  ASSERT(end[1] == '\0');
+  ASSERT((end - buffer) < (length - 1));
+  end[0] = '\0';
+
+  return buffer;
+}
+
+
+void Debugger::RunDebuggerShell() {
+  if (IsDebuggerRunning()) {
+    if (steps_ > 0) {
+      // Finish stepping first.
+      --steps_;
+      return;
+    }
+
+    printf("Next: ");
+    PrintInstructions(pc());
+    bool done = false;
+    while (!done) {
+      char buffer[kMaxDebugShellLine];
+      char* line = ReadCommandLine("vixl> ", buffer, kMaxDebugShellLine);
+
+      if (line == NULL) continue;  // An error occurred.
+
+      DebugCommand* command = DebugCommand::Parse(line);
+      if (command != NULL) {
+        last_command_ = command;
+      }
+
+      if (last_command_ != NULL) {
+        done = last_command_->Run(this);
+      } else {
+        printf("No previous command to run!\n");
+      }
+    }
+
+    if ((debug_parameters_ & DBG_BREAK) != 0) {
+      // The break request has now been handled, move to next instruction.
+      debug_parameters_ &= ~DBG_BREAK;
+      increment_pc();
+    }
+  }
+}
+
+
+void Debugger::DoBreakpoint(Instruction* instr) {
+  ASSERT(instr->Mask(ExceptionMask) == BRK);
+
+  printf("Hit breakpoint at pc=%p.\n", reinterpret_cast<void*>(instr));
+  set_debug_parameters(debug_parameters() | DBG_BREAK | DBG_ACTIVE);
+  // Make the shell point to the brk instruction.
+  set_pc(instr);
+}
+
+
+void Debugger::DoUnreachable(Instruction* instr) {
+  ASSERT((instr->Mask(ExceptionMask) == HLT) &&
+         (instr->ImmException() == kUnreachableOpcode));
+
+  fprintf(stream_, "Hit UNREACHABLE marker at pc=%p.\n",
+          reinterpret_cast<void*>(instr));
+  abort();
+}
+
+
+void Debugger::DoTrace(Instruction* instr) {
+  ASSERT((instr->Mask(ExceptionMask) == HLT) &&
+         (instr->ImmException() == kTraceOpcode));
+
+  // Read the arguments encoded inline in the instruction stream.
+  uint32_t parameters;
+  uint32_t command;
+
+  ASSERT(sizeof(*instr) == 1);
+  memcpy(&parameters, instr + kTraceParamsOffset, sizeof(parameters));
+  memcpy(&command, instr + kTraceCommandOffset, sizeof(command));
+
+  switch (command) {
+    case TRACE_ENABLE:
+      set_log_parameters(log_parameters() | parameters);
+      break;
+    case TRACE_DISABLE:
+      set_log_parameters(log_parameters() & ~parameters);
+      break;
+    default:
+      UNREACHABLE();
+  }
+
+  set_pc(instr->InstructionAtOffset(kTraceLength));
+}
+
+
+void Debugger::DoLog(Instruction* instr) {
+  ASSERT((instr->Mask(ExceptionMask) == HLT) &&
+         (instr->ImmException() == kLogOpcode));
+
+  // Read the arguments encoded inline in the instruction stream.
+  uint32_t parameters;
+
+  ASSERT(sizeof(*instr) == 1);
+  memcpy(&parameters, instr + kTraceParamsOffset, sizeof(parameters));
+
+  // We don't support a one-shot LOG_DISASM.
+  ASSERT((parameters & LOG_DISASM) == 0);
+  // Print the requested information.
+  if (parameters & LOG_FLAGS) PrintFlags(true);
+  if (parameters & LOG_REGS) PrintRegisters(true);
+  if (parameters & LOG_FP_REGS) PrintFPRegisters(true);
+
+  set_pc(instr->InstructionAtOffset(kLogLength));
+}
+
+
+static bool StringToUInt64(uint64_t* value, const char* line, int base = 10) {
+  char* endptr = NULL;
+  errno = 0;  // Reset errors.
+  uint64_t parsed = strtoul(line, &endptr, base);
+
+  if (errno == ERANGE) {
+    // Overflow.
+    return false;
+  }
+
+  if (endptr == line) {
+    // No digits were parsed.
+    return false;
+  }
+
+  if (*endptr != '\0') {
+    // Non-digit characters present at the end.
+    return false;
+  }
+
+  *value = parsed;
+  return true;
+}
+
+
+static bool StringToInt64(int64_t* value, const char* line, int base = 10) {
+  char* endptr = NULL;
+  errno = 0;  // Reset errors.
+  int64_t parsed = strtol(line, &endptr, base);
+
+  if (errno == ERANGE) {
+    // Overflow or underflow.
+    return false;
+  }
+
+  if (endptr == line) {
+    // No digits were parsed.
+    return false;
+  }
+
+  if (*endptr != '\0') {
+    // Non-digit characters present at the end.
+    return false;
+  }
+
+  *value = parsed;
+  return true;
+}
+
+
+uint8_t* Token::ToAddress(Debugger* debugger) const {
+  USE(debugger);
+  UNREACHABLE();
+  return NULL;
+}
+
+
+Token* Token::Tokenize(const char* arg) {
+  if ((arg == NULL) || (*arg == '\0')) {
+    return NULL;
+  }
+
+  // The order is important. For example Identifier::Tokenize would consider
+  // any register to be a valid identifier.
+
+  Token* token = RegisterToken::Tokenize(arg);
+  if (token != NULL) {
+    return token;
+  }
+
+  token = FPRegisterToken::Tokenize(arg);
+  if (token != NULL) {
+    return token;
+  }
+
+  token = IdentifierToken::Tokenize(arg);
+  if (token != NULL) {
+    return token;
+  }
+
+  token = AddressToken::Tokenize(arg);
+  if (token != NULL) {
+    return token;
+  }
+
+  token = IntegerToken::Tokenize(arg);
+  if (token != NULL) {
+    return token;
+  }
+
+  token = FormatToken::Tokenize(arg);
+  if (token != NULL) {
+    return token;
+  }
+
+  return new UnknownToken(arg);
+}
+
+
+uint8_t* RegisterToken::ToAddress(Debugger* debugger) const {
+  ASSERT(CanAddressMemory());
+  uint64_t reg_value = debugger->xreg(value().code(), Reg31IsStackPointer);
+  uint8_t* address = NULL;
+  memcpy(&address, &reg_value, sizeof(address));
+  return address;
+}
+
+
+void RegisterToken::Print(FILE* out) const {
+  ASSERT(value().IsValid());
+  fprintf(out, "[Register %s]", Name());
+}
+
+
+const char* RegisterToken::Name() const {
+  if (value().Is32Bits()) {
+    return kWAliases[value().code()][0];
+  } else {
+    return kXAliases[value().code()][0];
+  }
+}
+
+
+Token* RegisterToken::Tokenize(const char* arg) {
+  for (unsigned i = 0; i < kNumberOfRegisters; i++) {
+    // Is it a X register or alias?
+    for (const char** current = kXAliases[i]; *current != NULL; current++) {
+      if (strcmp(arg, *current) == 0) {
+        return new RegisterToken(Register::XRegFromCode(i));
+      }
+    }
+
+    // Is it a W register or alias?
+    for (const char** current = kWAliases[i]; *current != NULL; current++) {
+      if (strcmp(arg, *current) == 0) {
+        return new RegisterToken(Register::WRegFromCode(i));
+      }
+    }
+  }
+
+  return NULL;
+}
+
+
+void FPRegisterToken::Print(FILE* out) const {
+  ASSERT(value().IsValid());
+  char prefix = value().Is32Bits() ? 's' : 'd';
+  fprintf(out, "[FPRegister %c%" PRIu32 "]", prefix, value().code());
+}
+
+
+Token* FPRegisterToken::Tokenize(const char* arg) {
+  if (strlen(arg) < 2) {
+    return NULL;
+  }
+
+  switch (*arg) {
+    case 's':
+    case 'd':
+      const char* cursor = arg + 1;
+      uint64_t code = 0;
+      if (!StringToUInt64(&code, cursor)) {
+        return NULL;
+      }
+
+      if (code >= kNumberOfFPRegisters) {
+        return NULL;
+      }
+
+      FPRegister fpreg = NoFPReg;
+      switch (*arg) {
+        case 's': fpreg = FPRegister::SRegFromCode(code); break;
+        case 'd': fpreg = FPRegister::DRegFromCode(code); break;
+        default: UNREACHABLE();
+      }
+
+      return new FPRegisterToken(fpreg);
+  }
+
+  return NULL;
+}
+
+
+uint8_t* IdentifierToken::ToAddress(Debugger* debugger) const {
+  ASSERT(CanAddressMemory());
+  Instruction* pc_value = debugger->pc();
+  uint8_t* address = NULL;
+  memcpy(&address, &pc_value, sizeof(address));
+  return address;
+}
+
+void IdentifierToken::Print(FILE* out) const {
+  fprintf(out, "[Identifier %s]", value());
+}
+
+
+Token* IdentifierToken::Tokenize(const char* arg) {
+  if (!isalpha(arg[0])) {
+    return NULL;
+  }
+
+  const char* cursor = arg + 1;
+  while ((*cursor != '\0') && isalnum(*cursor)) {
+    ++cursor;
+  }
+
+  if (*cursor == '\0') {
+    return new IdentifierToken(arg);
+  }
+
+  return NULL;
+}
+
+
+uint8_t* AddressToken::ToAddress(Debugger* debugger) const {
+  USE(debugger);
+  return value();
+}
+
+
+void AddressToken::Print(FILE* out) const {
+  fprintf(out, "[Address %p]", value());
+}
+
+
+Token* AddressToken::Tokenize(const char* arg) {
+  if ((strlen(arg) < 3) || (arg[0] != '0') || (arg[1] != 'x')) {
+    return NULL;
+  }
+
+  uint64_t ptr = 0;
+  if (!StringToUInt64(&ptr, arg, 16)) {
+    return NULL;
+  }
+
+  uint8_t* address = reinterpret_cast<uint8_t*>(ptr);
+  return new AddressToken(address);
+}
+
+
+void IntegerToken::Print(FILE* out) const {
+  fprintf(out, "[Integer %" PRId64 "]", value());
+}
+
+
+Token* IntegerToken::Tokenize(const char* arg) {
+  int64_t value = 0;
+  if (!StringToInt64(&value, arg)) {
+    return NULL;
+  }
+
+  return new IntegerToken(value);
+}
+
+
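+// A format token is a '%' followed by an optional modifier and a type letter.
+// Without a modifier, 'x', 'w', 'h' and 'b' select 64-, 32-, 16- and 8-bit
+// hexadecimal output, 'c' a character, and 'd' or 's' a double or float
+// printed with "%g". The modifiers 's' and 'u' select signed and unsigned
+// decimal output for the integer sizes, and 'a' selects hexadecimal
+// floating-point output for 'd' and 's'.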
+Token* FormatToken::Tokenize(const char* arg) {
+  if (arg[0] != '%') {
+    return NULL;
+  }
+
+  int length = strlen(arg);
+  if ((length < 2) || (length > 3)) {
+    return NULL;
+  }
+
+  char type = arg[length - 1];
+  if (length == 2) {
+    switch (type) {
+      case 'x': return new Format<uint64_t>("%016" PRIx64);
+      case 'w': return new Format<uint32_t>("%08" PRIx32);
+      case 'h': return new Format<uint16_t>("%04" PRIx16);
+      case 'b': return new Format<uint8_t>("%02" PRIx8);
+      case 'c': return new Format<char>("%c");
+      case 'd': return new Format<double>("%g");
+      case 's': return new Format<float>("%g");
+      default: return NULL;
+    }
+  }
+
+  ASSERT(length == 3);
+  switch (arg[1]) {
+    case 's':
+      switch (type) {
+        case 'x': return new Format<int64_t>("%+20" PRId64);
+        case 'w': return new Format<int32_t>("%+11" PRId32);
+        case 'h': return new Format<int16_t>("%+6" PRId16);
+        case 'b': return new Format<int8_t>("%+4" PRId8);
+        default: return NULL;
+      }
+    case 'u':
+      switch (type) {
+        case 'x': return new Format<uint64_t>("%20" PRIu64);
+        case 'w': return new Format<uint32_t>("%10" PRIu32);
+        case 'h': return new Format<uint16_t>("%5" PRIu16);
+        case 'b': return new Format<uint8_t>("%3" PRIu8);
+        default: return NULL;
+      }
+    case 'a':
+      switch (type) {
+        case 'd': return new Format<double>("%a");
+        case 's': return new Format<float>("%a");
+        default: return NULL;
+      }
+    default: return NULL;
+  }
+}
+
+
+template<typename T>
+void Format<T>::Print(FILE* out) const {
+  fprintf(out, "[Format %s - %lu byte(s)]", fmt_, sizeof(T));
+}
+
+
+void UnknownToken::Print(FILE* out) const {
+  fprintf(out, "[Unknown %s]", unknown_);
+}
+
+
+void DebugCommand::Print(FILE* out) {
+  fprintf(out, "%s", name());
+}
+
+
+bool DebugCommand::Match(const char* name, const char** aliases) {
+  for (const char** current = aliases; *current != NULL; current++) {
+    if (strcmp(name, *current) == 0) {
+       return true;
+    }
+  }
+
+  return false;
+}
+
+
+DebugCommand* DebugCommand::Parse(char* line) {
+  std::vector<Token*> args;
+
+  for (char* chunk = strtok(line, " ");
+       chunk != NULL;
+       chunk = strtok(NULL, " ")) {
+    args.push_back(Token::Tokenize(chunk));
+  }
+
+  if (args.size() == 0) {
+    return NULL;
+  }
+
+  if (!args[0]->IsIdentifier()) {
+    return new InvalidCommand(args, 0, "command name is not an identifier");
+  }
+
+  const char* name = IdentifierToken::Cast(args[0])->value();
+  #define RETURN_IF_MATCH(Command)       \
+  if (Match(name, Command::kAliases)) {  \
+    return Command::Build(args);         \
+  }
+  DEBUG_COMMAND_LIST(RETURN_IF_MATCH);
+  #undef RETURN_IF_MATCH
+
+  return new UnknownCommand(args);
+}
+
+
+void DebugCommand::PrintHelp(const char** aliases,
+                             const char* args,
+                             const char* help) {
+  ASSERT(aliases[0] != NULL);
+  ASSERT(help != NULL);
+
+  printf("\n----\n\n");
+  for (const char** current = aliases; *current != NULL; current++) {
+    if (args != NULL) {
+      printf("%s %s\n", *current, args);
+    } else {
+      printf("%s\n", *current);
+    }
+  }
+  printf("\n%s\n", help);
+}
+
+
+bool HelpCommand::Run(Debugger* debugger) {
+  ASSERT(debugger->IsDebuggerRunning());
+  USE(debugger);
+
+  #define PRINT_HELP(Command)                     \
+    DebugCommand::PrintHelp(Command::kAliases,    \
+                            Command::kArguments,  \
+                            Command::kHelp);
+  DEBUG_COMMAND_LIST(PRINT_HELP);
+  #undef PRINT_HELP
+  printf("\n----\n\n");
+
+  return false;
+}
+
+
+DebugCommand* HelpCommand::Build(std::vector<Token*> args) {
+  if (args.size() != 1) {
+    return new InvalidCommand(args, -1, "too many arguments");
+  }
+
+  return new HelpCommand(args[0]);
+}
+
+
+bool ContinueCommand::Run(Debugger* debugger) {
+  ASSERT(debugger->IsDebuggerRunning());
+
+  debugger->set_debug_parameters(debugger->debug_parameters() & ~DBG_ACTIVE);
+  return true;
+}
+
+
+DebugCommand* ContinueCommand::Build(std::vector<Token*> args) {
+  if (args.size() != 1) {
+    return new InvalidCommand(args, -1, "too many arguments");
+  }
+
+  return new ContinueCommand(args[0]);
+}
+
+
+bool StepCommand::Run(Debugger* debugger) {
+  ASSERT(debugger->IsDebuggerRunning());
+
+  int64_t steps = count();
+  if (steps < 0) {
+    printf(" ** invalid value for steps: %" PRId64 " (<0) **\n", steps);
+  } else if (steps > 1) {
+    debugger->set_steps(steps - 1);
+  }
+
+  return true;
+}
+
+
+void StepCommand::Print(FILE* out) {
+  fprintf(out, "%s %" PRId64 "", name(), count());
+}
+
+
+DebugCommand* StepCommand::Build(std::vector<Token*> args) {
+  IntegerToken* count = NULL;
+  switch (args.size()) {
+    case 1: {  // step [1]
+      count = new IntegerToken(1);
+      break;
+    }
+    case 2: {  // step n
+      Token* first = args[1];
+      if (!first->IsInteger()) {
+        return new InvalidCommand(args, 1, "expects int");
+      }
+      count = IntegerToken::Cast(first);
+      break;
+    }
+    default:
+      return new InvalidCommand(args, -1, "too many arguments");
+  }
+
+  return new StepCommand(args[0], count);
+}
+
+
+bool DisasmCommand::Run(Debugger* debugger) {
+  ASSERT(debugger->IsDebuggerRunning());
+
+  uint8_t* from = target()->ToAddress(debugger);
+  debugger->PrintInstructions(from, count());
+
+  return false;
+}
+
+
+void DisasmCommand::Print(FILE* out) {
+  fprintf(out, "%s ", name());
+  target()->Print(out);
+  fprintf(out, " %" PRId64 "", count());
+}
+
+
+DebugCommand* DisasmCommand::Build(std::vector<Token*> args) {
+  Token* address = NULL;
+  IntegerToken* count = NULL;
+  switch (args.size()) {
+    case 1: {  // disasm [pc] [1]
+      address = new IdentifierToken("pc");
+      count = new IntegerToken(1);
+      break;
+    }
+    case 2: {  // disasm [pc] n or disasm address [1]
+      Token* first = args[1];
+      if (first->IsInteger()) {
+        address = new IdentifierToken("pc");
+        count = IntegerToken::Cast(first);
+      } else if (first->CanAddressMemory()) {
+        address = first;
+        count = new IntegerToken(1);
+      } else {
+        return new InvalidCommand(args, 1, "expects int or addr");
+      }
+      break;
+    }
+    case 3: {  // disasm address count
+      Token* first = args[1];
+      Token* second = args[2];
+      if (!first->CanAddressMemory() || !second->IsInteger()) {
+        return new InvalidCommand(args, -1, "disasm addr int");
+      }
+      address = first;
+      count = IntegerToken::Cast(second);
+      break;
+    }
+    default:
+      return new InvalidCommand(args, -1, "wrong arguments number");
+  }
+
+  return new DisasmCommand(args[0], address, count);
+}
+
+
+void PrintCommand::Print(FILE* out) {
+  fprintf(out, "%s ", name());
+  target()->Print(out);
+}
+
+
+bool PrintCommand::Run(Debugger* debugger) {
+  ASSERT(debugger->IsDebuggerRunning());
+
+  Token* tok = target();
+  if (tok->IsIdentifier()) {
+    char* identifier = IdentifierToken::Cast(tok)->value();
+    if (strcmp(identifier, "regs") == 0) {
+      debugger->PrintRegisters(true);
+    } else if (strcmp(identifier, "fpregs") == 0) {
+      debugger->PrintFPRegisters(true);
+    } else if (strcmp(identifier, "flags") == 0) {
+      debugger->PrintFlags(true);
+    } else if (strcmp(identifier, "pc") == 0) {
+      printf("pc = %16p\n", reinterpret_cast<void*>(debugger->pc()));
+    } else {
+      printf(" ** Unknown identifier to print: %s **\n", identifier);
+    }
+
+    return false;
+  }
+
+  if (tok->IsRegister()) {
+    RegisterToken* reg_tok = RegisterToken::Cast(tok);
+    Register reg = reg_tok->value();
+    if (reg.Is32Bits()) {
+      printf("%s = %" PRId32 "\n",
+             reg_tok->Name(),
+             debugger->wreg(reg.code(), Reg31IsStackPointer));
+    } else {
+      printf("%s = %" PRId64 "\n",
+             reg_tok->Name(),
+             debugger->xreg(reg.code(), Reg31IsStackPointer));
+    }
+
+    return false;
+  }
+
+  if (tok->IsFPRegister()) {
+    FPRegister fpreg = FPRegisterToken::Cast(tok)->value();
+    if (fpreg.Is32Bits()) {
+      printf("s%u = %g\n", fpreg.code(), debugger->sreg(fpreg.code()));
+    } else {
+      printf("d%u = %g\n", fpreg.code(), debugger->dreg(fpreg.code()));
+    }
+
+    return false;
+  }
+
+  UNREACHABLE();
+  return false;
+}
+
+
+DebugCommand* PrintCommand::Build(std::vector<Token*> args) {
+  Token* target = NULL;
+  switch (args.size()) {
+    case 2: {
+      target = args[1];
+      if (!target->IsRegister()
+          && !target->IsFPRegister()
+          && !target->IsIdentifier()) {
+        return new InvalidCommand(args, 1, "expects reg or identifier");
+      }
+      break;
+    }
+    default:
+      return new InvalidCommand(args, -1, "too many arguments");
+  }
+
+  return new PrintCommand(args[0], target);
+}
+
+
+bool MemCommand::Run(Debugger* debugger) {
+  ASSERT(debugger->IsDebuggerRunning());
+
+  uint8_t* address = target()->ToAddress(debugger);
+  debugger->PrintMemory(address, count(), format());
+
+  return false;
+}
+
+
+void MemCommand::Print(FILE* out) {
+  fprintf(out, "%s ", name());
+  target()->Print(out);
+  fprintf(out, " %" PRId64 " ", count());
+  format()->Print(out);
+}
+
+
+DebugCommand* MemCommand::Build(std::vector<Token*> args) {
+  if (args.size() < 2) {
+    return new InvalidCommand(args, -1, "too few arguments");
+  }
+
+  Token* target = args[1];
+  IntegerToken* count = NULL;
+  FormatToken* format = NULL;
+
+  if (!target->CanAddressMemory()) {
+    return new InvalidCommand(args, 1, "expects address");
+  }
+
+  switch (args.size()) {
+    case 2: {  // mem addressable [1] [%x]
+      count = new IntegerToken(1);
+      format = new Format<uint64_t>("%016" PRIx64);
+      break;
+    }
+    case 3: {  // mem addr n [%x] or mem addr [n] %f
+      Token* second = args[2];
+      if (second->IsInteger()) {
+        count = IntegerToken::Cast(second);
+        format = new Format<uint64_t>("%016" PRIx64);
+      } else if (second->IsFormat()) {
+        count = new IntegerToken(1);
+        format = FormatToken::Cast(second);
+      } else {
+        return new InvalidCommand(args, 2, "expects int or format");
+      }
+      break;
+    }
+    case 4: {  // mem addr n %f
+      Token* second = args[2];
+      Token* third = args[3];
+      if (!second->IsInteger() || !third->IsFormat()) {
+        return new InvalidCommand(args, -1, "mem addr >>int<< %F");
+      }
+
+      count = IntegerToken::Cast(second);
+      format = FormatToken::Cast(third);
+      break;
+    }
+    default:
+      return new InvalidCommand(args, -1, "too many arguments");
+  }
+
+  return new MemCommand(args[0], target, count, format);
+}
+
+
+UnknownCommand::~UnknownCommand() {
+  const int size = args_.size();
+  for (int i = 0; i < size; ++i) {
+    delete args_[i];
+  }
+}
+
+
+bool UnknownCommand::Run(Debugger* debugger) {
+  ASSERT(debugger->IsDebuggerRunning());
+  USE(debugger);
+
+  printf(" ** Unknown Command:");
+  const int size = args_.size();
+  for (int i = 0; i < size; ++i) {
+    printf(" ");
+    args_[i]->Print(stdout);
+  }
+  printf(" **\n");
+
+  return false;
+}
+
+
+InvalidCommand::~InvalidCommand() {
+  const int size = args_.size();
+  for (int i = 0; i < size; ++i) {
+    delete args_[i];
+  }
+}
+
+
+bool InvalidCommand::Run(Debugger* debugger) {
+  ASSERT(debugger->IsDebuggerRunning());
+  USE(debugger);
+
+  printf(" ** Invalid Command:");
+  const int size = args_.size();
+  for (int i = 0; i < size; ++i) {
+    printf(" ");
+    if (i == index_) {
+      printf(">>");
+      args_[i]->Print(stdout);
+      printf("<<");
+    } else {
+      args_[i]->Print(stdout);
+    }
+  }
+  printf(" **\n");
+  printf(" ** %s\n", cause_);
+
+  return false;
+}
+
+}  // namespace vixl
diff --git a/src/a64/debugger-a64.h b/src/a64/debugger-a64.h
new file mode 100644
index 0000000..ca4c1dc
--- /dev/null
+++ b/src/a64/debugger-a64.h
@@ -0,0 +1,188 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_A64_DEBUGGER_A64_H_
+#define VIXL_A64_DEBUGGER_A64_H_
+
+#include <ctype.h>
+#include <limits.h>
+#include <errno.h>
+#include <vector>
+
+#include "globals.h"
+#include "utils.h"
+#include "a64/constants-a64.h"
+#include "a64/simulator-a64.h"
+
+namespace vixl {
+
+// Debug instructions.
+//
+// VIXL's macro-assembler and debugger support a few pseudo instructions to
+// make debugging easier. These pseudo instructions do not exist on real
+// hardware.
+//
+// Each debug pseudo instruction is represented by a HLT instruction. The HLT
+// immediate field is used to identify the type of debug pseudo instruction.
+// Each pseudo instruction uses a custom encoding for additional arguments, as
+// described below.
+
+// Unreachable
+//
+// Instruction which should never be executed. This is used as a guard in parts
+// of the code that should not be reachable, such as in data encoded inline in
+// the instructions.
+const Instr kUnreachableOpcode = 0xdeb0;
+
+// Trace
+//  - parameter: TraceParameter stored as a uint32_t
+//  - command: TraceCommand stored as a uint32_t
+//
+// Allow for trace management in the generated code. See the corresponding
+// enums for more information on permitted actions.
+const Instr kTraceOpcode = 0xdeb2;
+const unsigned kTraceParamsOffset = 1 * kInstructionSize;
+const unsigned kTraceCommandOffset = 2 * kInstructionSize;
+const unsigned kTraceLength = 3 * kInstructionSize;
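+
+// Taken together, the constants above give the layout of a Trace pseudo
+// instruction, which occupies kTraceLength bytes in total:
+//
+//   trace + 0                   : hlt #kTraceOpcode
+//   trace + kTraceParamsOffset  : TraceParameters, stored as a uint32_t
+//   trace + kTraceCommandOffset : TraceCommand, stored as a uint32_t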
+
+// Log
+//  - parameter: TraceParameter stored as a uint32_t
+//
+// Output the requested information.
+const Instr kLogOpcode = 0xdeb3;
+const unsigned kLogParamsOffset = 1 * kInstructionSize;
+const unsigned kLogLength = 2 * kInstructionSize;
+
+// Trace commands.
+enum TraceCommand {
+  TRACE_ENABLE   = 1,
+  TRACE_DISABLE  = 2
+};
+
+// Trace parameters.
+enum TraceParameters {
+  LOG_DISASM     = 1 << 0,  // Log disassembly.
+  LOG_REGS       = 1 << 1,  // Log general purpose registers.
+  LOG_FP_REGS    = 1 << 2,  // Log floating-point registers.
+  LOG_FLAGS      = 1 << 3,  // Log the status flags.
+
+  LOG_STATE      = LOG_REGS | LOG_FP_REGS | LOG_FLAGS,
+  LOG_ALL        = LOG_DISASM | LOG_REGS | LOG_FP_REGS | LOG_FLAGS
+};
+
+// Debugger parameters
+enum DebugParameters {
+  DBG_ACTIVE = 1 << 0,  // The debugger is active.
+  DBG_BREAK  = 1 << 1   // The debugger is at a breakpoint.
+};
+
+// Forward declarations.
+class DebugCommand;
+class Token;
+class FormatToken;
+
+class Debugger : public Simulator {
+ public:
+  Debugger(Decoder* decoder, FILE* stream = stdout);
+
+  virtual void Run();
+  void VisitException(Instruction* instr);
+
+  inline int log_parameters() {
+    // The simulator can control disassembly, so make sure that the Debugger's
+    // log parameters agree with it.
+    if (disasm_trace()) {
+      log_parameters_ |= LOG_DISASM;
+    }
+    return log_parameters_;
+  }
+  inline void set_log_parameters(int parameters) {
+    set_disasm_trace((parameters & LOG_DISASM) != 0);
+    log_parameters_ = parameters;
+
+    update_pending_request();
+  }
+
+  inline int debug_parameters() { return debug_parameters_; }
+  inline void set_debug_parameters(int parameters) {
+    debug_parameters_ = parameters;
+
+    update_pending_request();
+  }
+
+  // Number of instructions to execute before the debugger shell is given
+  // back control.
+  inline int steps() { return steps_; }
+  inline void set_steps(int value) {
+    ASSERT(value > 0);
+    steps_ = value;
+  }
+
+  inline bool IsDebuggerRunning() {
+    return (debug_parameters_ & DBG_ACTIVE) != 0;
+  }
+
+  inline bool pending_request() { return pending_request_; }
+  inline void update_pending_request() {
+    const int kLoggingMask = LOG_FLAGS | LOG_REGS | LOG_FP_REGS;
+    const bool logging = (log_parameters_ & kLoggingMask) != 0;
+    const bool debugging = IsDebuggerRunning();
+
+    pending_request_ = logging || debugging;
+  }
+
+  void PrintInstructions(void* address, int64_t count = 1);
+  void PrintMemory(const uint8_t* address,
+                   int64_t count,
+                   const FormatToken* format);
+
+ private:
+  void LogFlags();
+  void LogRegisters();
+  void LogFPRegisters();
+  void LogProcessorState();
+  char* ReadCommandLine(const char* prompt, char* buffer, int length);
+  void RunDebuggerShell();
+  void DoBreakpoint(Instruction* instr);
+  void DoUnreachable(Instruction* instr);
+  void DoTrace(Instruction* instr);
+  void DoLog(Instruction* instr);
+
+  int  log_parameters_;
+  int  debug_parameters_;
+  bool pending_request_;
+  int steps_;
+  DebugCommand* last_command_;
+  PrintDisassembler* disasm_;
+  Decoder* printer_;
+
+  // Maximum length of a command line accepted by the debugger shell.
+  static const int kMaxDebugShellLine = 256;
+};
+
+}  // namespace vixl
+
+#endif  // VIXL_A64_DEBUGGER_A64_H_
diff --git a/src/a64/decoder-a64.cc b/src/a64/decoder-a64.cc
new file mode 100644
index 0000000..c8b7b42
--- /dev/null
+++ b/src/a64/decoder-a64.cc
@@ -0,0 +1,524 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "globals.h"
+#include "utils.h"
+#include "a64/decoder-a64.h"
+
+namespace vixl {
+// Top-level instruction decode function.
+void Decoder::Decode(Instruction *instr) {
+  switch (instr->Bits(27, 24)) {
+    // 1:   Add/sub immediate.
+    // A:   Logical shifted register.
+    //      Add/sub with carry.
+    //      Conditional compare register.
+    //      Conditional compare immediate.
+    //      Conditional select.
+    //      Data processing 1 source.
+    //      Data processing 2 source.
+    // B:   Add/sub shifted register.
+    //      Add/sub extended register.
+    //      Data processing 3 source.
+    case 0x1:
+    case 0xA:
+    case 0xB: DecodeDataProcessing(instr); break;
+
+    // 2:   Logical immediate.
+    //      Move wide immediate.
+    case 0x2: DecodeLogical(instr); break;
+
+    // 3:   Bitfield.
+    //      Extract.
+    case 0x3: DecodeBitfieldExtract(instr); break;
+
+    // 0:   PC relative addressing.
+    // 4:   Unconditional branch immediate.
+    //      Exception generation.
+    //      Compare and branch immediate.
+    // 5:   Compare and branch immediate.
+    //      Conditional branch.
+    //      System.
+    // 6,7: Unconditional branch.
+    //      Test and branch immediate.
+    case 0x0:
+    case 0x4:
+    case 0x5:
+    case 0x6:
+    case 0x7: DecodeBranchSystemException(instr); break;
+
+    // 8,9: Load/store register pair post-index.
+    //      Load register literal.
+    //      Load/store register unscaled immediate.
+    //      Load/store register immediate post-index.
+    //      Load/store register immediate pre-index.
+    //      Load/store register offset.
+    //      Load/store exclusive.
+    // C,D: Load/store register pair offset.
+    //      Load/store register pair pre-index.
+    //      Load/store register unsigned immediate.
+    case 0x8:
+    case 0x9:
+    case 0xC:
+    case 0xD: DecodeLoadStore(instr); break;
+
+    // E:   FP fixed point conversion.
+    //      FP integer conversion.
+    //      FP data processing 1 source.
+    //      FP compare.
+    //      FP immediate.
+    //      FP data processing 2 source.
+    //      FP conditional compare.
+    //      FP conditional select.
+    // F:   FP data processing 3 source.
+    case 0xE:
+    case 0xF: DecodeFP(instr); break;
+  }
+}
+
+void Decoder::AppendVisitor(DecoderVisitor* new_visitor) {
+  visitors_.remove(new_visitor);
+  visitors_.push_back(new_visitor);
+}
+
+
+void Decoder::PrependVisitor(DecoderVisitor* new_visitor) {
+  visitors_.remove(new_visitor);
+  visitors_.push_front(new_visitor);
+}
+
+
+void Decoder::InsertVisitorBefore(DecoderVisitor* new_visitor,
+                                  DecoderVisitor* registered_visitor) {
+  visitors_.remove(new_visitor);
+  std::list<DecoderVisitor*>::iterator it;
+  for (it = visitors_.begin(); it != visitors_.end(); it++) {
+    if (*it == registered_visitor) {
+      visitors_.insert(it, new_visitor);
+      return;
+    }
+  }
+  // We reached the end of the list without finding registered_visitor, so
+  // insert new_visitor at the end (inserting before end() is equivalent to
+  // push_back).
+  visitors_.insert(it, new_visitor);
+}
+
+
+void Decoder::InsertVisitorAfter(DecoderVisitor* new_visitor,
+                                 DecoderVisitor* registered_visitor) {
+  visitors_.remove(new_visitor);
+  std::list<DecoderVisitor*>::iterator it;
+  for (it = visitors_.begin(); it != visitors_.end(); it++) {
+    if (*it == registered_visitor) {
+      it++;
+      visitors_.insert(it, new_visitor);
+      return;
+    }
+  }
+  // We reached the end of the list without finding registered_visitor, so
+  // append new_visitor at the end.
+  visitors_.push_back(new_visitor);
+}
+
+
+void Decoder::RemoveVisitor(DecoderVisitor* visitor) {
+  visitors_.remove(visitor);
+}
+
+
+void Decoder::DecodeBranchSystemException(Instruction *instr) {
+  ASSERT((instr->Bits(27, 24) == 0x0) ||
+         (instr->Bits(27, 24) == 0x4) ||
+         (instr->Bits(27, 24) == 0x5) ||
+         (instr->Bits(27, 24) == 0x6) ||
+         (instr->Bits(27, 24) == 0x7) );
+
+  if (instr->Bit(26) == 0) {
+    VisitPCRelAddressing(instr);
+  } else {
+    switch (instr->Bits(31, 29)) {
+      case 0:
+      case 4: {
+        VisitUnconditionalBranch(instr);
+        break;
+      }
+      case 1:
+      case 5: {
+        if (instr->Bit(25) == 0) {
+          VisitCompareBranch(instr);
+        } else {
+          VisitTestBranch(instr);
+        }
+        break;
+      }
+      case 2: {
+        UNALLOC(instr, SpacedBits(2, 24, 4) == 0x1);
+        UNALLOC(instr, Bit(24) == 0x1);
+        VisitConditionalBranch(instr);
+        break;
+      }
+      case 6: {
+        if (instr->Bit(25) == 0) {
+          if (instr->Bit(24) == 0) {
+            UNALLOC(instr, Bits(4, 2) != 0);
+            UNALLOC(instr, Mask(0x00E0001D) == 0x00200001);
+            UNALLOC(instr, Mask(0x00E0001E) == 0x00200002);
+            UNALLOC(instr, Mask(0x00E0001D) == 0x00400001);
+            UNALLOC(instr, Mask(0x00E0001E) == 0x00400002);
+            UNALLOC(instr, Mask(0x00E0001C) == 0x00600000);
+            UNALLOC(instr, Mask(0x00E0001C) == 0x00800000);
+            UNALLOC(instr, Mask(0x00E0001F) == 0x00A00000);
+            UNALLOC(instr, Mask(0x00C0001C) == 0x00C00000);
+            VisitException(instr);
+          } else {
+            UNALLOC(instr, Mask(0x0038E000) == 0x00000000);
+            UNALLOC(instr, Mask(0x0039E000) == 0x00002000);
+            UNALLOC(instr, Mask(0x003AE000) == 0x00002000);
+            UNALLOC(instr, Mask(0x003CE000) == 0x00042000);
+            UNALLOC(instr, Mask(0x003FFFC0) == 0x000320C0);
+            UNALLOC(instr, Mask(0x003FF100) == 0x00032100);
+            UNALLOC(instr, Mask(0x003FF200) == 0x00032200);
+            UNALLOC(instr, Mask(0x003FF400) == 0x00032400);
+            UNALLOC(instr, Mask(0x003FF800) == 0x00032800);
+            UNALLOC(instr, Mask(0x003FF0E0) == 0x00033000);
+            UNALLOC(instr, Mask(0x003FF0E0) == 0x003FF020);
+            UNALLOC(instr, Mask(0x003FF0E0) == 0x003FF060);
+            UNALLOC(instr, Mask(0x003FF0E0) == 0x003FF0E0);
+            UNALLOC(instr, Mask(0x0038F000) == 0x00005000);
+            UNALLOC(instr, Mask(0x0038E000) == 0x00006000);
+            UNALLOC(instr, SpacedBits(4, 21, 20, 19, 15) == 0x1);
+            UNALLOC(instr, Bits(21, 19) == 0x4);
+            VisitSystem(instr);
+          }
+        } else {
+          UNALLOC(instr, Bits(20, 16) != 0x1F);
+          UNALLOC(instr, Bits(15, 10) != 0);
+          UNALLOC(instr, Bits(4, 0) != 0);
+          UNALLOC(instr, Bits(24, 21) == 0x3);
+          UNALLOC(instr, Bits(24, 22) == 0x3);
+          UNALLOC(instr, Bit(24) == 0x1);
+          VisitUnconditionalBranchToRegister(instr);
+        }
+        break;
+      }
+      default: VisitUnknown(instr);
+    }
+  }
+}
+
+
+void Decoder::DecodeLoadStore(Instruction *instr) {
+  ASSERT((instr->Bits(27, 24) == 0x8) ||
+         (instr->Bits(27, 24) == 0x9) ||
+         (instr->Bits(27, 24) == 0xC) ||
+         (instr->Bits(27, 24) == 0xD) );
+
+  if (instr->Bit(24) == 0) {
+    if (instr->Bit(28) == 0) {
+      if (instr->Bit(29) == 0) {
+        if (instr->Bit(26) == 0) {
+          // TODO: VisitLoadStoreExclusive.
+          UNIMPLEMENTED();
+        } else {
+          // TODO: VisitLoadStoreAdvSIMD.
+          UNIMPLEMENTED();
+        }
+      } else {
+        UNALLOC(instr, Bits(31, 30) == 0x3);
+        UNALLOC(instr, SpacedBits(4, 26, 31, 30, 22) == 0x2);
+        if (instr->Bit(23) == 0) {
+          UNALLOC(instr, SpacedBits(4, 26, 31, 30, 22) == 0x3);
+          VisitLoadStorePairNonTemporal(instr);
+        } else {
+          VisitLoadStorePairPostIndex(instr);
+        }
+      }
+    } else {
+      if (instr->Bit(29) == 0) {
+        UNALLOC(instr, SpacedBits(3, 26, 31, 30) == 0x7);
+        VisitLoadLiteral(instr);
+      } else {
+        UNALLOC(instr, SpacedBits(4, 26, 23, 22, 31) == 0x7);
+        UNALLOC(instr, SpacedBits(3, 26, 23, 30) == 0x7);
+        UNALLOC(instr, SpacedBits(3, 26, 23, 31) == 0x7);
+        if (instr->Bit(21) == 0) {
+          switch (instr->Bits(11, 10)) {
+            case 0: {
+              VisitLoadStoreUnscaledOffset(instr);
+              break;
+            }
+            case 1: {
+              UNALLOC(instr, SpacedBits(5, 26, 23, 22, 31, 30) == 0xB);
+              VisitLoadStorePostIndex(instr);
+              break;
+            }
+            case 3: {
+              UNALLOC(instr, SpacedBits(5, 26, 23, 22, 31, 30) == 0xB);
+              VisitLoadStorePreIndex(instr);
+              break;
+            }
+            default: VisitUnknown(instr);
+          }
+        } else {
+          UNALLOC(instr, Bit(14) == 0);
+          VisitLoadStoreRegisterOffset(instr);
+        }
+      }
+    }
+  } else {
+    if (instr->Bit(28) == 0) {
+      UNALLOC(instr, SpacedBits(4, 26, 31, 30, 22) == 0x2);
+      UNALLOC(instr, Bits(31, 30) == 0x3);
+      if (instr->Bit(23) == 0) {
+        VisitLoadStorePairOffset(instr);
+      } else {
+        VisitLoadStorePairPreIndex(instr);
+      }
+    } else {
+      UNALLOC(instr, SpacedBits(4, 26, 23, 22, 31) == 0x7);
+      UNALLOC(instr, SpacedBits(3, 26, 23, 30) == 0x7);
+      UNALLOC(instr, SpacedBits(3, 26, 23, 31) == 0x7);
+      VisitLoadStoreUnsignedOffset(instr);
+    }
+  }
+}
+
+
+void Decoder::DecodeLogical(Instruction *instr) {
+  ASSERT(instr->Bits(27, 24) == 0x2);
+
+  UNALLOC(instr, SpacedBits(2, 31, 22) == 0x1);
+  if (instr->Bit(23) == 0) {
+    VisitLogicalImmediate(instr);
+  } else {
+    UNALLOC(instr, Bits(30, 29) == 0x1);
+    VisitMoveWideImmediate(instr);
+  }
+}
+
+
+void Decoder::DecodeBitfieldExtract(Instruction *instr) {
+  ASSERT(instr->Bits(27, 24) == 0x3);
+
+  UNALLOC(instr, SpacedBits(2, 31, 22) == 0x2);
+  UNALLOC(instr, SpacedBits(2, 31, 22) == 0x1);
+  UNALLOC(instr, SpacedBits(2, 31, 15) == 0x1);
+  if (instr->Bit(23) == 0) {
+    UNALLOC(instr, SpacedBits(2, 31, 21) == 0x1);
+    UNALLOC(instr, Bits(30, 29) == 0x3);
+    VisitBitfield(instr);
+  } else {
+    UNALLOC(instr, SpacedBits(3, 30, 29, 21) == 0x1);
+    UNALLOC(instr, Bits(30, 29) != 0);
+    VisitExtract(instr);
+  }
+}
+
+
+void Decoder::DecodeDataProcessing(Instruction *instr) {
+  ASSERT((instr->Bits(27, 24) == 0x1) ||
+         (instr->Bits(27, 24) == 0xA) ||
+         (instr->Bits(27, 24) == 0xB) );
+
+  if (instr->Bit(27) == 0) {
+    UNALLOC(instr, Bit(23) == 0x1);
+    VisitAddSubImmediate(instr);
+  } else if (instr->Bit(24) == 0) {
+    if (instr->Bit(28) == 0) {
+      UNALLOC(instr, SpacedBits(2, 31, 15) == 0x1);
+      VisitLogicalShifted(instr);
+    } else {
+      switch (instr->Bits(23, 21)) {
+        case 0: {
+          UNALLOC(instr, Bits(15, 10) != 0);
+          VisitAddSubWithCarry(instr);
+          break;
+        }
+        case 2: {
+          UNALLOC(instr, SpacedBits(2, 10, 4) != 0);
+          UNALLOC(instr, Bit(29) == 0x0);
+          if (instr->Bit(11) == 0) {
+            VisitConditionalCompareRegister(instr);
+          } else {
+            VisitConditionalCompareImmediate(instr);
+          }
+          break;
+        }
+        case 4: {
+          UNALLOC(instr, SpacedBits(2, 11, 29) != 0);
+          VisitConditionalSelect(instr);
+          break;
+        }
+        case 6: {
+          UNALLOC(instr, Bit(29) == 1);
+          UNALLOC(instr, Bits(15, 14) != 0);
+          if (instr->Bit(30) == 0) {
+            UNALLOC(instr, Bits(15, 11) == 0);
+            UNALLOC(instr, Bits(15, 12) == 0x1);
+            UNALLOC(instr, Bits(15, 12) == 0x3);
+            VisitDataProcessing2Source(instr);
+          } else {
+            UNALLOC(instr, Bit(13) == 1);
+            UNALLOC(instr, Bits(20, 16) != 0);
+            UNALLOC(instr, Mask(0xA01FFC00) == 0x00000C00);
+            UNALLOC(instr, Mask(0x201FF800) == 0x00001800);
+            VisitDataProcessing1Source(instr);
+          }
+          break;
+        }
+        default: VisitUnknown(instr);
+      }
+    }
+  } else {
+    if (instr->Bit(28) == 0) {
+      if (instr->Bit(21) == 0) {
+        UNALLOC(instr, Bits(23, 22) == 0x3);
+        UNALLOC(instr, SpacedBits(2, 31, 15) == 0x1);
+        VisitAddSubShifted(instr);
+      } else {
+        UNALLOC(instr, SpacedBits(2, 23, 22) != 0);
+        UNALLOC(instr, SpacedBits(2, 12, 10) == 0x3);
+        UNALLOC(instr, Bits(12, 11) == 0x3);
+        VisitAddSubExtended(instr);
+      }
+    } else {
+      UNALLOC(instr, Mask(0xE0E08000) == 0x00200000);
+      UNALLOC(instr, Mask(0xE0E08000) == 0x00208000);
+      UNALLOC(instr, Mask(0xE0E08000) == 0x00400000);
+      UNALLOC(instr, Mask(0x60E08000) == 0x00408000);
+      UNALLOC(instr, SpacedBits(5, 30, 29, 23, 22, 21) == 0x3);
+      UNALLOC(instr, SpacedBits(5, 30, 29, 23, 22, 21) == 0x4);
+      UNALLOC(instr, Mask(0xE0E08000) == 0x00A00000);
+      UNALLOC(instr, Mask(0xE0E08000) == 0x00A08000);
+      UNALLOC(instr, Mask(0xE0E08000) == 0x00C00000);
+      UNALLOC(instr, Mask(0x60E08000) == 0x00C08000);
+      UNALLOC(instr, SpacedBits(5, 30, 29, 23, 22, 21) == 0x7);
+      UNALLOC(instr, Bits(30, 29) == 0x1);
+      UNALLOC(instr, Bit(30) == 0x1);
+      VisitDataProcessing3Source(instr);
+    }
+  }
+}
+
+
+void Decoder::DecodeFP(Instruction *instr) {
+  ASSERT((instr->Bits(27, 24) == 0xE) ||
+         (instr->Bits(27, 24) == 0xF) );
+  UNALLOC(instr, Bit(29) == 0x1);
+
+  if (instr->Bit(24) == 0) {
+    if (instr->Bit(21) == 0) {
+      UNALLOC(instr, Bit(23) == 1);
+      UNALLOC(instr, SpacedBits(2, 31, 15) == 0);
+      UNALLOC(instr, SpacedBits(3, 18, 17, 19) == 0);
+      UNALLOC(instr, SpacedBits(3, 18, 17, 20) == 0);
+      UNALLOC(instr, SpacedBits(3, 18, 17, 19) == 0x3);
+      UNALLOC(instr, SpacedBits(3, 18, 17, 20) == 0x3);
+      UNALLOC(instr, Bit(18) == 1);
+      VisitFPFixedPointConvert(instr);
+    } else {
+      if (instr->Bits(15, 10) == 0) {
+        UNALLOC(instr, SpacedBits(3, 18, 17, 19) == 0x3);
+        UNALLOC(instr, SpacedBits(3, 18, 17, 20) == 0x3);
+        UNALLOC(instr, SpacedBits(3, 18, 17, 19) == 0x5);
+        UNALLOC(instr, SpacedBits(3, 18, 17, 20) == 0x5);
+        UNALLOC(instr, Mask(0xA0C60000) == 0x80060000);
+        UNALLOC(instr, Mask(0xA0CE0000) == 0x000E0000);
+        UNALLOC(instr, Mask(0xA0D60000) == 0x00160000);
+        UNALLOC(instr, Mask(0xA0C60000) == 0x00460000);
+        UNALLOC(instr, Mask(0xA0CE0000) == 0x804E0000);
+        UNALLOC(instr, Mask(0xA0D60000) == 0x80560000);
+        UNALLOC(instr, SpacedBits(4, 23, 22, 18, 29) == 0x8);
+        UNALLOC(instr, SpacedBits(5, 23, 22, 18, 17, 29) == 0x14);
+        UNALLOC(instr, Mask(0xA0C60000) == 0x00860000);
+        UNALLOC(instr, Mask(0xA0CE0000) == 0x80860000);
+        UNALLOC(instr, Mask(0xA0D60000) == 0x80960000);
+        UNALLOC(instr, Bits(23, 22) == 0x3);
+        VisitFPIntegerConvert(instr);
+      } else if (instr->Bits(14, 10) == 16) {
+        UNALLOC(instr, SpacedBits(3, 31, 19, 20) != 0);
+        UNALLOC(instr, Mask(0xA0DF8000) == 0x00020000);
+        UNALLOC(instr, Mask(0xA0DF8000) == 0x00030000);
+        UNALLOC(instr, Mask(0xA0DF8000) == 0x00068000);
+        UNALLOC(instr, Mask(0xA0DF8000) == 0x00428000);
+        UNALLOC(instr, Mask(0xA0DF8000) == 0x00430000);
+        UNALLOC(instr, Mask(0xA0DF8000) == 0x00468000);
+        UNALLOC(instr, Mask(0xA0D80000) == 0x00800000);
+        UNALLOC(instr, Mask(0xA0DE0000) == 0x00C00000);
+        UNALLOC(instr, Mask(0xA0DF0000) == 0x00C30000);
+        UNALLOC(instr, Mask(0xA0DC0000) == 0x00C40000);
+        VisitFPDataProcessing1Source(instr);
+      } else if (instr->Bits(13, 10) == 8) {
+        UNALLOC(instr, SpacedBits(2, 31, 23) != 0);
+        UNALLOC(instr, Bits(2, 0) != 0);
+        UNALLOC(instr, Bits(15, 14) != 0);
+        VisitFPCompare(instr);
+      } else if (instr->Bits(12, 10) == 4) {
+        UNALLOC(instr, Bits(9, 5) != 0);
+        UNALLOC(instr, SpacedBits(2, 31, 23) != 0);
+        VisitFPImmediate(instr);
+      } else {
+        UNALLOC(instr, SpacedBits(2, 31, 23) != 0);
+        switch (instr->Bits(11, 10)) {
+          case 1: {
+            VisitFPConditionalCompare(instr);
+            break;
+          }
+          case 2: {
+            UNALLOC(instr, SpacedBits(2, 15, 12) == 0x3);
+            UNALLOC(instr, SpacedBits(2, 15, 13) == 0x3);
+            UNALLOC(instr, Bits(15, 14) == 0x3);
+            VisitFPDataProcessing2Source(instr);
+            break;
+          }
+          case 3: {
+            VisitFPConditionalSelect(instr);
+            break;
+          }
+          default: VisitUnknown(instr);
+        }
+      }
+    }
+  } else {
+    UNALLOC(instr, Bit(31) == 0x1);
+    UNALLOC(instr, Bit(23) == 0x1);
+    VisitFPDataProcessing3Source(instr);
+  }
+}
+
+#define DEFINE_VISITOR_CALLERS(A)                                              \
+  void Decoder::Visit##A(Instruction *instr) {                                 \
+    ASSERT(instr->Mask(A##FMask) == A##Fixed);                                 \
+    std::list<DecoderVisitor*>::iterator it;                                   \
+    for (it = visitors_.begin(); it != visitors_.end(); it++) {                \
+      (*it)->Visit##A(instr);                                                  \
+    }                                                                          \
+  }
+VISITOR_LIST(DEFINE_VISITOR_CALLERS)
+#undef DEFINE_VISITOR_CALLERS
+}  // namespace vixl
diff --git a/src/a64/decoder-a64.h b/src/a64/decoder-a64.h
new file mode 100644
index 0000000..08cbab2
--- /dev/null
+++ b/src/a64/decoder-a64.h
@@ -0,0 +1,188 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_A64_DECODER_A64_H_
+#define VIXL_A64_DECODER_A64_H_
+
+#include <list>
+
+#include "globals.h"
+#include "a64/instructions-a64.h"
+
+
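+// UNALLOC(instr, predicate) flags the use of an unallocated instruction
+// encoding: in debug builds an instruction matching the predicate is reported
+// and the ASSERT below fails; in release builds the check compiles away.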
+#ifdef DEBUG
+  #define UNALLOC(I, P)                                                       \
+    if (I->P) {                                                               \
+      printf("Instruction 0x%08" PRIx32 " uses an unallocated encoding.\n",   \
+             I->InstructionBits());                                           \
+    }                                                                         \
+    ASSERT(!(I->P));
+#else
+  #define UNALLOC(I, P) ((void) 0)
+#endif
+
+// List macro containing all visitors needed by the decoder class.
+
+#define VISITOR_LIST(V)             \
+  V(PCRelAddressing)                \
+  V(AddSubImmediate)                \
+  V(LogicalImmediate)               \
+  V(MoveWideImmediate)              \
+  V(Bitfield)                       \
+  V(Extract)                        \
+  V(UnconditionalBranch)            \
+  V(UnconditionalBranchToRegister)  \
+  V(CompareBranch)                  \
+  V(TestBranch)                     \
+  V(ConditionalBranch)              \
+  V(System)                         \
+  V(Exception)                      \
+  V(LoadStorePairPostIndex)         \
+  V(LoadStorePairOffset)            \
+  V(LoadStorePairPreIndex)          \
+  V(LoadStorePairNonTemporal)       \
+  V(LoadLiteral)                    \
+  V(LoadStoreUnscaledOffset)        \
+  V(LoadStorePostIndex)             \
+  V(LoadStorePreIndex)              \
+  V(LoadStoreRegisterOffset)        \
+  V(LoadStoreUnsignedOffset)        \
+  V(LogicalShifted)                 \
+  V(AddSubShifted)                  \
+  V(AddSubExtended)                 \
+  V(AddSubWithCarry)                \
+  V(ConditionalCompareRegister)     \
+  V(ConditionalCompareImmediate)    \
+  V(ConditionalSelect)              \
+  V(DataProcessing1Source)          \
+  V(DataProcessing2Source)          \
+  V(DataProcessing3Source)          \
+  V(FPCompare)                      \
+  V(FPConditionalCompare)           \
+  V(FPConditionalSelect)            \
+  V(FPImmediate)                    \
+  V(FPDataProcessing1Source)        \
+  V(FPDataProcessing2Source)        \
+  V(FPDataProcessing3Source)        \
+  V(FPIntegerConvert)               \
+  V(FPFixedPointConvert)            \
+  V(Unknown)
+
+namespace vixl {
+
+// The Visitor interface. Disassembler and simulator (and other tools)
+// must provide implementations for all of these functions.
+class DecoderVisitor {
+ public:
+  #define DECLARE(A) virtual void Visit##A(Instruction* instr) = 0;
+  VISITOR_LIST(DECLARE)
+  #undef DECLARE
+
+  virtual ~DecoderVisitor() {}
+
+ private:
+  // Visitors are registered in a list.
+  std::list<DecoderVisitor*> visitors_;
+
+  friend class Decoder;
+};
+
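+// As an illustration only (CounterVisitor below is not part of VIXL), a
+// visitor that simply counts decoded instructions could be written as:
+//
+//   class CounterVisitor : public DecoderVisitor {
+//    public:
+//     CounterVisitor() : count_(0) {}
+//     #define DECLARE(A) \
+//       virtual void Visit##A(Instruction*) { count_++; }
+//     VISITOR_LIST(DECLARE)
+//     #undef DECLARE
+//     uint64_t count() const { return count_; }
+//
+//    private:
+//     uint64_t count_;
+//   };
+//
+// and registered with a Decoder using AppendVisitor().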
+
+class Decoder: public DecoderVisitor {
+ public:
+  Decoder() {}
+
+  // Top-level instruction decoder function. Decodes an instruction and calls
+  // the visitor functions registered with the Decoder class.
+  void Decode(Instruction *instr);
+
+  // Register a new visitor class with the decoder.
+  // Decode() will call the corresponding visitor method from all registered
+  // visitor classes when decoding reaches the leaf node of the instruction
+  // decode tree.
+  // Visitors are called in the order they appear in the list.
+  // A visitor can only be registered once.
+  // Registering an already registered visitor will update its position.
+  //
+  //   d.AppendVisitor(V1);
+  //   d.AppendVisitor(V2);
+  //   d.PrependVisitor(V2);            // Move V2 to the start of the list.
+  //   d.InsertVisitorBefore(V3, V2);
+  //   d.AppendVisitor(V4);
+  //   d.AppendVisitor(V4);             // No effect.
+  //
+  //   d.Decode(i);
+  //
+  // will call the visitor methods of V3, V2, V1 and V4, in that order.
+  void AppendVisitor(DecoderVisitor* visitor);
+  void PrependVisitor(DecoderVisitor* visitor);
+  void InsertVisitorBefore(DecoderVisitor* new_visitor,
+                           DecoderVisitor* registered_visitor);
+  void InsertVisitorAfter(DecoderVisitor* new_visitor,
+                          DecoderVisitor* registered_visitor);
+
+  // Remove a previously registered visitor class from the list of visitors
+  // stored by the decoder.
+  void RemoveVisitor(DecoderVisitor *visitor);
+
+  #define DECLARE(A) void Visit##A(Instruction* instr);
+  VISITOR_LIST(DECLARE)
+  #undef DECLARE
+
+ private:
+  // Decode the branch, system command, and exception generation parts of
+  // the instruction tree, and call the corresponding visitors.
+  // On entry, instruction bits 27:24 = {0x0, 0x4, 0x5, 0x6, 0x7}.
+  void DecodeBranchSystemException(Instruction *instr);
+
+  // Decode the load and store parts of the instruction tree, and call
+  // the corresponding visitors.
+  // On entry, instruction bits 27:24 = {0x8, 0x9, 0xC, 0xD}.
+  void DecodeLoadStore(Instruction *instr);
+
+  // Decode the logical immediate and move wide immediate parts of the
+  // instruction tree, and call the corresponding visitors.
+  // On entry, instruction bits 27:24 = 0x2.
+  void DecodeLogical(Instruction *instr);
+
+  // Decode the bitfield and extraction parts of the instruction tree,
+  // and call the corresponding visitors.
+  // On entry, instruction bits 27:24 = 0x3.
+  void DecodeBitfieldExtract(Instruction *instr);
+
+  // Decode the data processing parts of the instruction tree, and call the
+  // corresponding visitors.
+  // On entry, instruction bits 27:24 = {0x1, 0xA, 0xB}.
+  void DecodeDataProcessing(Instruction *instr);
+
+  // Decode the floating point parts of the instruction tree, and call the
+  // corresponding visitors.
+  // On entry, instruction bits 27:24 = {0xE, 0xF}.
+  void DecodeFP(Instruction *instr);
+};
+}  // namespace vixl
+
+#endif  // VIXL_A64_DECODER_A64_H_
diff --git a/src/a64/disasm-a64.cc b/src/a64/disasm-a64.cc
new file mode 100644
index 0000000..1be16f8
--- /dev/null
+++ b/src/a64/disasm-a64.cc
@@ -0,0 +1,1643 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "a64/disasm-a64.h"
+
+namespace vixl {
+
+Disassembler::Disassembler() {
+  buffer_size_ = 256;
+  buffer_ = reinterpret_cast<char*>(malloc(buffer_size_));
+  buffer_pos_ = 0;
+  own_buffer_ = true;
+}
+
+
+Disassembler::Disassembler(char* text_buffer, int buffer_size) {
+  buffer_size_ = buffer_size;
+  buffer_ = text_buffer;
+  buffer_pos_ = 0;
+  own_buffer_ = false;
+}
+
+
+Disassembler::~Disassembler() {
+  if (own_buffer_) {
+    free(buffer_);
+  }
+}
+
+
+char* Disassembler::GetOutput() {
+  return buffer_;
+}
+
+
+void Disassembler::VisitAddSubImmediate(Instruction* instr) {
+  bool rd_is_zr = RdIsZROrSP(instr);
+  bool stack_op = (rd_is_zr || RnIsZROrSP(instr)) &&
+                  (instr->ImmAddSub() == 0);
+  const char *mnemonic = "unknown";
+  const char *form = "'Rds, 'Rns, 'IAddSub";
+  const char *form_cmp = "'Rns, 'IAddSub";
+  const char *form_mov = "'Rds, 'Rns";
+
+  switch (instr->Mask(AddSubImmediateMask)) {
+    case ADD_w_imm:
+    case ADD_x_imm: {
+      mnemonic = "add";
+      if (stack_op) {
+        mnemonic = "mov";
+        form = form_mov;
+      }
+      break;
+    }
+    case ADDS_w_imm:
+    case ADDS_x_imm: {
+      mnemonic = "adds";
+      if (rd_is_zr) {
+        mnemonic = "cmn";
+        form = form_cmp;
+      }
+      break;
+    }
+    case SUB_w_imm:
+    case SUB_x_imm: mnemonic = "sub"; break;
+    case SUBS_w_imm:
+    case SUBS_x_imm: {
+      mnemonic = "subs";
+      if (rd_is_zr) {
+        mnemonic = "cmp";
+        form = form_cmp;
+      }
+      break;
+    }
+    default: form = "(AddSubImmediate)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitAddSubShifted(Instruction* instr) {
+  bool rd_is_zr = RdIsZROrSP(instr);
+  bool rn_is_zr = RnIsZROrSP(instr);
+  const char *mnemonic = "unknown";
+  const char *form = "'Rd, 'Rn, 'Rm'HDP";
+  const char *form_cmp = "'Rn, 'Rm'HDP";
+  const char *form_neg = "'Rd, 'Rm'HDP";
+
+  switch (instr->Mask(AddSubShiftedMask)) {
+    case ADD_w_shift:
+    case ADD_x_shift: mnemonic = "add"; break;
+    case ADDS_w_shift:
+    case ADDS_x_shift: {
+      mnemonic = "adds";
+      if (rd_is_zr) {
+        mnemonic = "cmn";
+        form = form_cmp;
+      }
+      break;
+    }
+    case SUB_w_shift:
+    case SUB_x_shift: {
+      mnemonic = "sub";
+      if (rn_is_zr) {
+        mnemonic = "neg";
+        form = form_neg;
+      }
+      break;
+    }
+    case SUBS_w_shift:
+    case SUBS_x_shift: {
+      mnemonic = "subs";
+      if (rd_is_zr) {
+        mnemonic = "cmp";
+        form = form_cmp;
+      } else if (rn_is_zr) {
+        mnemonic = "negs";
+        form = form_neg;
+      }
+      break;
+    }
+    default: form = "(AddSubShifted)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitAddSubExtended(Instruction* instr) {
+  bool rd_is_zr = RdIsZROrSP(instr);
+  const char *mnemonic = "unknown";
+  Extend mode = static_cast<Extend>(instr->ExtendMode());
+  const char *form = ((mode == UXTX) || (mode == SXTX)) ?
+                     "'Rds, 'Rns, 'Xm'Ext" : "'Rds, 'Rns, 'Wm'Ext";
+  const char *form_cmp = ((mode == UXTX) || (mode == SXTX)) ?
+                         "'Rns, 'Xm'Ext" : "'Rns, 'Wm'Ext";
+
+  switch (instr->Mask(AddSubExtendedMask)) {
+    case ADD_w_ext:
+    case ADD_x_ext: mnemonic = "add"; break;
+    case ADDS_w_ext:
+    case ADDS_x_ext: {
+      mnemonic = "adds";
+      if (rd_is_zr) {
+        mnemonic = "cmn";
+        form = form_cmp;
+      }
+      break;
+    }
+    case SUB_w_ext:
+    case SUB_x_ext: mnemonic = "sub"; break;
+    case SUBS_w_ext:
+    case SUBS_x_ext: {
+      mnemonic = "subs";
+      if (rd_is_zr) {
+        mnemonic = "cmp";
+        form = form_cmp;
+      }
+      break;
+    }
+    default: form = "(AddSubExtended)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitAddSubWithCarry(Instruction* instr) {
+  bool rn_is_zr = RnIsZROrSP(instr);
+  const char *mnemonic = "unknown";
+  const char *form = "'Rd, 'Rn, 'Rm";
+  const char *form_neg = "'Rd, 'Rm";
+
+  switch (instr->Mask(AddSubWithCarryMask)) {
+    case ADC_w:
+    case ADC_x: mnemonic = "adc"; break;
+    case ADCS_w:
+    case ADCS_x: mnemonic = "adcs"; break;
+    case SBC_w:
+    case SBC_x: {
+      mnemonic = "sbc";
+      if (rn_is_zr) {
+        mnemonic = "ngc";
+        form = form_neg;
+      }
+      break;
+    }
+    case SBCS_w:
+    case SBCS_x: {
+      mnemonic = "sbcs";
+      if (rn_is_zr) {
+        mnemonic = "ngcs";
+        form = form_neg;
+      }
+      break;
+    }
+    default: form = "(AddSubWithCarry)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitLogicalImmediate(Instruction* instr) {
+  bool rd_is_zr = RdIsZROrSP(instr);
+  bool rn_is_zr = RnIsZROrSP(instr);
+  const char *mnemonic = "unknown";
+  const char *form = "'Rds, 'Rn, 'ITri";
+
+  switch (instr->Mask(LogicalImmediateMask)) {
+    case AND_w_imm:
+    case AND_x_imm: mnemonic = "and"; break;
+    case ORR_w_imm:
+    case ORR_x_imm: {
+      mnemonic = "orr";
+      unsigned reg_size = (instr->SixtyFourBits() == 1) ? kXRegSize
+                                                        : kWRegSize;
+      if (rn_is_zr && !IsMovzMovnImm(reg_size, instr->ImmLogical())) {
+        mnemonic = "mov";
+        form = "'Rds, 'ITri";
+      }
+      break;
+    }
+    case EOR_w_imm:
+    case EOR_x_imm: mnemonic = "eor"; break;
+    case ANDS_w_imm:
+    case ANDS_x_imm: {
+      mnemonic = "ands";
+      if (rd_is_zr) {
+        mnemonic = "tst";
+        form = "'Rn, 'ITri";
+      }
+      break;
+    }
+    default: form = "(LogicalImmediate)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+bool Disassembler::IsMovzMovnImm(unsigned reg_size, uint64_t value) {
+  ASSERT((reg_size == kXRegSize) ||
+         ((reg_size == kWRegSize) && (value <= 0xffffffff)));
+
+  // Test for movz: 16 bits set at positions 0, 16, 32 or 48.
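+  // For example, 0xabcd (16 bits at position 0) and 0x0000ffff00000000
+  // (16 bits at position 32) are both movz immediates.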
+  if (((value & 0xffffffffffff0000UL) == 0UL) ||
+      ((value & 0xffffffff0000ffffUL) == 0UL) ||
+      ((value & 0xffff0000ffffffffUL) == 0UL) ||
+      ((value & 0x0000ffffffffffffUL) == 0UL)) {
+    return true;
+  }
+
+  // Test for movn: NOT(16 bits set at positions 0, 16, 32 or 48).
+  if ((reg_size == kXRegSize) &&
+      (((value & 0xffffffffffff0000UL) == 0xffffffffffff0000UL) ||
+       ((value & 0xffffffff0000ffffUL) == 0xffffffff0000ffffUL) ||
+       ((value & 0xffff0000ffffffffUL) == 0xffff0000ffffffffUL) ||
+       ((value & 0x0000ffffffffffffUL) == 0x0000ffffffffffffUL))) {
+    return true;
+  }
+  if ((reg_size == kWRegSize) &&
+      (((value & 0xffff0000) == 0xffff0000) ||
+       ((value & 0x0000ffff) == 0x0000ffff))) {
+    return true;
+  }
+  return false;
+}
+
+
+void Disassembler::VisitLogicalShifted(Instruction* instr) {
+  bool rd_is_zr = RdIsZROrSP(instr);
+  bool rn_is_zr = RnIsZROrSP(instr);
+  const char *mnemonic = "unknown";
+  const char *form = "'Rd, 'Rn, 'Rm'HLo";
+
+  switch (instr->Mask(LogicalShiftedMask)) {
+    case AND_w:
+    case AND_x: mnemonic = "and"; break;
+    case BIC_w:
+    case BIC_x: mnemonic = "bic"; break;
+    case EOR_w:
+    case EOR_x: mnemonic = "eor"; break;
+    case EON_w:
+    case EON_x: mnemonic = "eon"; break;
+    case BICS_w:
+    case BICS_x: mnemonic = "bics"; break;
+    case ANDS_w:
+    case ANDS_x: {
+      mnemonic = "ands";
+      if (rd_is_zr) {
+        mnemonic = "tst";
+        form = "'Rn, 'Rm'HLo";
+      }
+      break;
+    }
+    case ORR_w:
+    case ORR_x: {
+      mnemonic = "orr";
+      if (rn_is_zr && (instr->ImmDPShift() == 0) && (instr->ShiftDP() == LSL)) {
+        mnemonic = "mov";
+        form = "'Rd, 'Rm";
+      }
+      break;
+    }
+    case ORN_w:
+    case ORN_x: {
+      mnemonic = "orn";
+      if (rn_is_zr) {
+        mnemonic = "mvn";
+        form = "'Rd, 'Rm'HLo";
+      }
+      break;
+    }
+    default: form = "(LogicalShifted)";
+  }
+
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitConditionalCompareRegister(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Rn, 'Rm, 'INzcv, 'Cond";
+
+  switch (instr->Mask(ConditionalCompareRegisterMask)) {
+    case CCMN_w:
+    case CCMN_x: mnemonic = "ccmn"; break;
+    case CCMP_w:
+    case CCMP_x: mnemonic = "ccmp"; break;
+    default: form = "(ConditionalCompareRegister)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitConditionalCompareImmediate(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Rn, 'IP, 'INzcv, 'Cond";
+
+  switch (instr->Mask(ConditionalCompareImmediateMask)) {
+    case CCMN_w_imm:
+    case CCMN_x_imm: mnemonic = "ccmn"; break;
+    case CCMP_w_imm:
+    case CCMP_x_imm: mnemonic = "ccmp"; break;
+    default: form = "(ConditionalCompareImmediate)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitConditionalSelect(Instruction* instr) {
+  bool rnm_is_zr = (RnIsZROrSP(instr) && RmIsZROrSP(instr));
+  bool rn_is_rm = (instr->Rn() == instr->Rm());
+  const char *mnemonic = "unknown";
+  const char *form = "'Rd, 'Rn, 'Rm, 'Cond";
+  const char *form_test = "'Rd, 'CInv";
+  const char *form_update = "'Rd, 'Rn, 'CInv";
+
+  switch (instr->Mask(ConditionalSelectMask)) {
+    case CSEL_w:
+    case CSEL_x: mnemonic = "csel"; break;
+    case CSINC_w:
+    case CSINC_x: {
+      mnemonic = "csinc";
+      if (rnm_is_zr) {
+        mnemonic = "cset";
+        form = form_test;
+      } else if (rn_is_rm) {
+        mnemonic = "cinc";
+        form = form_update;
+      }
+      break;
+    }
+    case CSINV_w:
+    case CSINV_x: {
+      mnemonic = "csinv";
+      if (rnm_is_zr) {
+        mnemonic = "csetm";
+        form = form_test;
+      } else if (rn_is_rm) {
+        mnemonic = "cinv";
+        form = form_update;
+      }
+      break;
+    }
+    case CSNEG_w:
+    case CSNEG_x: {
+      mnemonic = "csneg";
+      if (rn_is_rm) {
+        mnemonic = "cneg";
+        form = form_update;
+      }
+      break;
+    }
+    default: form = "(ConditionalSelect)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitBitfield(Instruction* instr) {
+  unsigned s = instr->ImmS();
+  unsigned r = instr->ImmR();
+  unsigned rd_size_minus_1 =
+    ((instr->SixtyFourBits() == 1) ? kXRegSize : kWRegSize) - 1;
+  const char *mnemonic = "unknown";
+  const char *form = "(Bitfield)";
+  const char *form_shift_right = "'Rd, 'Rn, 'IBr";
+  const char *form_extend = "'Rd, 'Wn";
+  const char *form_bfiz = "'Rd, 'Rn, 'IBZ-r, 'IBs+1";
+  const char *form_bfx = "'Rd, 'Rn, 'IBr, 'IBs-r+1";
+  const char *form_lsl = "'Rd, 'Rn, 'IBZ-r";
+
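+  // The alias is selected from the immr (r) and imms (s) fields. For example
+  // (illustrative operands), a ubfm with r = 8 and s = 15 is printed as
+  // "ubfx w0, w1, #8, #8" using form_bfx.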
+  switch (instr->Mask(BitfieldMask)) {
+    case SBFM_w:
+    case SBFM_x: {
+      mnemonic = "sbfx";
+      form = form_bfx;
+      if (r == 0) {
+        form = form_extend;
+        if (s == 7) {
+          mnemonic = "sxtb";
+        } else if (s == 15) {
+          mnemonic = "sxth";
+        } else if ((s == 31) && (instr->SixtyFourBits() == 1)) {
+          mnemonic = "sxtw";
+        } else {
+          form = form_bfx;
+        }
+      } else if (s == rd_size_minus_1) {
+        mnemonic = "asr";
+        form = form_shift_right;
+      } else if (s < r) {
+        mnemonic = "sbfiz";
+        form = form_bfiz;
+      }
+      break;
+    }
+    case UBFM_w:
+    case UBFM_x: {
+      mnemonic = "ubfx";
+      form = form_bfx;
+      if (r == 0) {
+        form = form_extend;
+        if (s == 7) {
+          mnemonic = "uxtb";
+        } else if (s == 15) {
+          mnemonic = "uxth";
+        } else {
+          form = form_bfx;
+        }
+      }
+      if (s == rd_size_minus_1) {
+        mnemonic = "lsr";
+        form = form_shift_right;
+      } else if (r == s + 1) {
+        mnemonic = "lsl";
+        form = form_lsl;
+      } else if (s < r) {
+        mnemonic = "ubfiz";
+        form = form_bfiz;
+      }
+      break;
+    }
+    case BFM_w:
+    case BFM_x: {
+      mnemonic = "bfxil";
+      form = form_bfx;
+      if (s < r) {
+        mnemonic = "bfi";
+        form = form_bfiz;
+      }
+    }
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitExtract(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Rd, 'Rn, 'Rm, 'IExtract";
+
+  switch (instr->Mask(ExtractMask)) {
+    case EXTR_w:
+    case EXTR_x: {
+      if (instr->Rn() == instr->Rm()) {
+        mnemonic = "ror";
+        form = "'Rd, 'Rn, 'IExtract";
+      } else {
+        mnemonic = "extr";
+      }
+      break;
+    }
+    default: form = "(Extract)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitPCRelAddressing(Instruction* instr) {
+  switch (instr->Mask(PCRelAddressingMask)) {
+    case ADR: Format(instr, "adr", "'Xd, 'AddrPCRelByte"); break;
+    // ADRP is not implemented.
+    default: Format(instr, "unknown", "(PCRelAddressing)");
+  }
+}
+
+
+void Disassembler::VisitConditionalBranch(Instruction* instr) {
+  switch (instr->Mask(ConditionalBranchMask)) {
+    case B_cond: Format(instr, "b.'CBrn", "'BImmCond"); break;
+    default: Format(instr, "unknown", "(ConditionalBranch)");
+  }
+}
+
+
+void Disassembler::VisitUnconditionalBranchToRegister(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Xn";
+
+  switch (instr->Mask(UnconditionalBranchToRegisterMask)) {
+    case BR: mnemonic = "br"; break;
+    case BLR: mnemonic = "blr"; break;
+    case RET: {
+      mnemonic = "ret";
+      if (instr->Rn() == kLinkRegCode) {
+        form = NULL;
+      }
+      break;
+    }
+    default: form = "(UnconditionalBranchToRegister)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitUnconditionalBranch(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'BImmUncn";
+
+  switch (instr->Mask(UnconditionalBranchMask)) {
+    case B: mnemonic = "b"; break;
+    case BL: mnemonic = "bl"; break;
+    default: form = "(UnconditionalBranch)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitDataProcessing1Source(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Rd, 'Rn";
+
+  switch (instr->Mask(DataProcessing1SourceMask)) {
+    #define FORMAT(A, B)  \
+    case A##_w:           \
+    case A##_x: mnemonic = B; break;
+    FORMAT(RBIT, "rbit");
+    FORMAT(REV16, "rev16");
+    FORMAT(REV, "rev");
+    FORMAT(CLZ, "clz");
+    FORMAT(CLS, "cls");
+    #undef FORMAT
+    case REV32_x: mnemonic = "rev32"; break;
+    default: form = "(DataProcessing1Source)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitDataProcessing2Source(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Rd, 'Rn, 'Rm";
+
+  switch (instr->Mask(DataProcessing2SourceMask)) {
+    #define FORMAT(A, B)  \
+    case A##_w:           \
+    case A##_x: mnemonic = B; break;
+    FORMAT(UDIV, "udiv");
+    FORMAT(SDIV, "sdiv");
+    FORMAT(LSLV, "lsl");
+    FORMAT(LSRV, "lsr");
+    FORMAT(ASRV, "asr");
+    FORMAT(RORV, "ror");
+    #undef FORMAT
+    default: form = "(DataProcessing2Source)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitDataProcessing3Source(Instruction* instr) {
+  bool ra_is_zr = RaIsZROrSP(instr);
+  const char *mnemonic = "unknown";
+  const char *form = "'Xd, 'Wn, 'Wm, 'Xa";
+  const char *form_rrr = "'Rd, 'Rn, 'Rm";
+  const char *form_rrrr = "'Rd, 'Rn, 'Rm, 'Ra";
+  const char *form_xww = "'Xd, 'Wn, 'Wm";
+  const char *form_xxx = "'Xd, 'Xn, 'Xm";
+
+  switch (instr->Mask(DataProcessing3SourceMask)) {
+    case MADD_w:
+    case MADD_x: {
+      mnemonic = "madd";
+      form = form_rrrr;
+      if (ra_is_zr) {
+        mnemonic = "mul";
+        form = form_rrr;
+      }
+      break;
+    }
+    case MSUB_w:
+    case MSUB_x: {
+      mnemonic = "msub";
+      form = form_rrrr;
+      if (ra_is_zr) {
+        mnemonic = "mneg";
+        form = form_rrr;
+      }
+      break;
+    }
+    case SMADDL_x: {
+      mnemonic = "smaddl";
+      if (ra_is_zr) {
+        mnemonic = "smull";
+        form = form_xww;
+      }
+      break;
+    }
+    case SMSUBL_x: {
+      mnemonic = "smsubl";
+      if (ra_is_zr) {
+        mnemonic = "smnegl";
+        form = form_xww;
+      }
+      break;
+    }
+    case UMADDL_x: {
+      mnemonic = "umaddl";
+      if (ra_is_zr) {
+        mnemonic = "umull";
+        form = form_xww;
+      }
+      break;
+    }
+    case UMSUBL_x: {
+      mnemonic = "umsubl";
+      if (ra_is_zr) {
+        mnemonic = "umnegl";
+        form = form_xww;
+      }
+      break;
+    }
+    case SMULH_x: {
+      mnemonic = "smulh";
+      form = form_xxx;
+      break;
+    }
+    case UMULH_x: {
+      mnemonic = "umulh";
+      form = form_xxx;
+      break;
+    }
+    default: form = "(DataProcessing3Source)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitCompareBranch(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Rt, 'BImmCmpa";
+
+  switch (instr->Mask(CompareBranchMask)) {
+    case CBZ_w:
+    case CBZ_x: mnemonic = "cbz"; break;
+    case CBNZ_w:
+    case CBNZ_x: mnemonic = "cbnz"; break;
+    default: form = "(CompareBranch)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitTestBranch(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Xt, 'IS, 'BImmTest";
+
+  switch (instr->Mask(TestBranchMask)) {
+    case TBZ: mnemonic = "tbz"; break;
+    case TBNZ: mnemonic = "tbnz"; break;
+    default: form = "(TestBranch)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitMoveWideImmediate(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Rd, 'IMoveImm";
+
+  // Print the shift separately for movk, to make it clear which halfword will
+  // be overwritten. Movn and movz print the computed immediate, which already
+  // takes the shift into account.
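+  // For example (illustrative operands), movz with imm16 = 0x1234 and hw = 1
+  // is printed as "movz x0, #0x12340000", whereas the equivalent movk is
+  // printed as "movk x0, #0x1234, lsl #16".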
+  switch (instr->Mask(MoveWideImmediateMask)) {
+    case MOVN_w:
+    case MOVN_x: mnemonic = "movn"; break;
+    case MOVZ_w:
+    case MOVZ_x: mnemonic = "movz"; break;
+    case MOVK_w:
+    case MOVK_x: mnemonic = "movk"; form = "'Rd, 'IMoveLSL"; break;
+    default: form = "(MoveWideImmediate)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+#define LOAD_STORE_LIST(V)    \
+  V(STRB_w, "strb", "'Wt")    \
+  V(STRH_w, "strh", "'Wt")    \
+  V(STR_w, "str", "'Wt")      \
+  V(STR_x, "str", "'Xt")      \
+  V(LDRB_w, "ldrb", "'Wt")    \
+  V(LDRH_w, "ldrh", "'Wt")    \
+  V(LDR_w, "ldr", "'Wt")      \
+  V(LDR_x, "ldr", "'Xt")      \
+  V(LDRSB_x, "ldrsb", "'Xt")  \
+  V(LDRSH_x, "ldrsh", "'Xt")  \
+  V(LDRSW_x, "ldrsw", "'Xt")  \
+  V(LDRSB_w, "ldrsb", "'Wt")  \
+  V(LDRSH_w, "ldrsh", "'Wt")  \
+  V(STR_s, "str", "'St")      \
+  V(STR_d, "str", "'Dt")      \
+  V(LDR_s, "ldr", "'St")      \
+  V(LDR_d, "ldr", "'Dt")
+
+void Disassembler::VisitLoadStorePreIndex(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "(LoadStorePreIndex)";
+
+  switch (instr->Mask(LoadStorePreIndexMask)) {
+    #define LS_PREINDEX(A, B, C) \
+    case A##_pre: mnemonic = B; form = C ", ['Xns'ILS]!"; break;
+    LOAD_STORE_LIST(LS_PREINDEX)
+    #undef LS_PREINDEX
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitLoadStorePostIndex(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "(LoadStorePostIndex)";
+
+  switch (instr->Mask(LoadStorePostIndexMask)) {
+    #define LS_POSTINDEX(A, B, C) \
+    case A##_post: mnemonic = B; form = C ", ['Xns]'ILS"; break;
+    LOAD_STORE_LIST(LS_POSTINDEX)
+    #undef LS_POSTINDEX
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitLoadStoreUnsignedOffset(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "(LoadStoreUnsignedOffset)";
+
+  switch (instr->Mask(LoadStoreUnsignedOffsetMask)) {
+    #define LS_UNSIGNEDOFFSET(A, B, C) \
+    case A##_unsigned: mnemonic = B; form = C ", ['Xns'ILU]"; break;
+    LOAD_STORE_LIST(LS_UNSIGNEDOFFSET)
+    #undef LS_UNSIGNEDOFFSET
+    case PRFM_unsigned: mnemonic = "prfm"; form = "'PrefOp, ['Xn'ILU]";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitLoadStoreRegisterOffset(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "(LoadStoreRegisterOffset)";
+
+  switch (instr->Mask(LoadStoreRegisterOffsetMask)) {
+    #define LS_REGISTEROFFSET(A, B, C) \
+    case A##_reg: mnemonic = B; form = C ", ['Xns, 'Offsetreg]"; break;
+    LOAD_STORE_LIST(LS_REGISTEROFFSET)
+    #undef LS_REGISTEROFFSET
+    case PRFM_reg: mnemonic = "prfm"; form = "'PrefOp, ['Xns, 'Offsetreg]";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitLoadStoreUnscaledOffset(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Wt, ['Xns'ILS]";
+  const char *form_x = "'Xt, ['Xns'ILS]";
+  const char *form_s = "'St, ['Xns'ILS]";
+  const char *form_d = "'Dt, ['Xns'ILS]";
+
+  switch (instr->Mask(LoadStoreUnscaledOffsetMask)) {
+    case STURB_w:  mnemonic = "sturb"; break;
+    case STURH_w:  mnemonic = "sturh"; break;
+    case STUR_w:   mnemonic = "stur"; break;
+    case STUR_x:   mnemonic = "stur"; form = form_x; break;
+    case STUR_s:   mnemonic = "stur"; form = form_s; break;
+    case STUR_d:   mnemonic = "stur"; form = form_d; break;
+    case LDURB_w:  mnemonic = "ldurb"; break;
+    case LDURH_w:  mnemonic = "ldurh"; break;
+    case LDUR_w:   mnemonic = "ldur"; break;
+    case LDUR_x:   mnemonic = "ldur"; form = form_x; break;
+    case LDUR_s:   mnemonic = "ldur"; form = form_s; break;
+    case LDUR_d:   mnemonic = "ldur"; form = form_d; break;
+    case LDURSB_x: form = form_x;  // Fall through.
+    case LDURSB_w: mnemonic = "ldursb"; break;
+    case LDURSH_x: form = form_x;  // Fall through.
+    case LDURSH_w: mnemonic = "ldursh"; break;
+    case LDURSW_x: mnemonic = "ldursw"; form = form_x; break;
+    default: form = "(LoadStoreUnscaledOffset)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitLoadLiteral(Instruction* instr) {
+  const char *mnemonic = "ldr";
+  const char *form = "(LoadLiteral)";
+
+  switch (instr->Mask(LoadLiteralMask)) {
+    case LDR_w_lit: form = "'Wt, 'ILLiteral 'LValue"; break;
+    case LDR_x_lit: form = "'Xt, 'ILLiteral 'LValue"; break;
+    case LDR_s_lit: form = "'St, 'ILLiteral 'LValue"; break;
+    case LDR_d_lit: form = "'Dt, 'ILLiteral 'LValue"; break;
+    default: mnemonic = "unknown";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+#define LOAD_STORE_PAIR_LIST(V)         \
+  V(STP_w, "stp", "'Wt, 'Wt2", "4")     \
+  V(LDP_w, "ldp", "'Wt, 'Wt2", "4")     \
+  V(LDPSW_x, "ldpsw", "'Xt, 'Xt2", "4") \
+  V(STP_x, "stp", "'Xt, 'Xt2", "8")     \
+  V(LDP_x, "ldp", "'Xt, 'Xt2", "8")     \
+  V(STP_s, "stp", "'St, 'St2", "4")     \
+  V(LDP_s, "ldp", "'St, 'St2", "4")     \
+  V(STP_d, "stp", "'Dt, 'Dt2", "8")     \
+  V(LDP_d, "ldp", "'Dt, 'Dt2", "8")
+
+void Disassembler::VisitLoadStorePairPostIndex(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "(LoadStorePairPostIndex)";
+
+  switch (instr->Mask(LoadStorePairPostIndexMask)) {
+    #define LSP_POSTINDEX(A, B, C, D) \
+    case A##_post: mnemonic = B; form = C ", ['Xns]'ILP" D; break;
+    LOAD_STORE_PAIR_LIST(LSP_POSTINDEX)
+    #undef LSP_POSTINDEX
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitLoadStorePairPreIndex(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "(LoadStorePairPreIndex)";
+
+  switch (instr->Mask(LoadStorePairPreIndexMask)) {
+    #define LSP_PREINDEX(A, B, C, D) \
+    case A##_pre: mnemonic = B; form = C ", ['Xns'ILP" D "]!"; break;
+    LOAD_STORE_PAIR_LIST(LSP_PREINDEX)
+    #undef LSP_PREINDEX
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitLoadStorePairOffset(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "(LoadStorePairOffset)";
+
+  switch (instr->Mask(LoadStorePairOffsetMask)) {
+    #define LSP_OFFSET(A, B, C, D) \
+    case A##_off: mnemonic = B; form = C ", ['Xns'ILP" D "]"; break;
+    LOAD_STORE_PAIR_LIST(LSP_OFFSET)
+    #undef LSP_OFFSET
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitLoadStorePairNonTemporal(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form;
+
+  switch (instr->Mask(LoadStorePairNonTemporalMask)) {
+    case STNP_w: mnemonic = "stnp"; form = "'Wt, 'Wt2, ['Xns'ILP4]"; break;
+    case LDNP_w: mnemonic = "ldnp"; form = "'Wt, 'Wt2, ['Xns'ILP4]"; break;
+    case STNP_x: mnemonic = "stnp"; form = "'Xt, 'Xt2, ['Xns'ILP8]"; break;
+    case LDNP_x: mnemonic = "ldnp"; form = "'Xt, 'Xt2, ['Xns'ILP8]"; break;
+    case STNP_s: mnemonic = "stnp"; form = "'St, 'St2, ['Xns'ILP4]"; break;
+    case LDNP_s: mnemonic = "ldnp"; form = "'St, 'St2, ['Xns'ILP4]"; break;
+    case STNP_d: mnemonic = "stnp"; form = "'Dt, 'Dt2, ['Xns'ILP8]"; break;
+    case LDNP_d: mnemonic = "ldnp"; form = "'Dt, 'Dt2, ['Xns'ILP8]"; break;
+    default: form = "(LoadStorePairNonTemporal)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitFPCompare(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Fn, 'Fm";
+  const char *form_zero = "'Fn, #0.0";
+
+  switch (instr->Mask(FPCompareMask)) {
+    case FCMP_s_zero:
+    case FCMP_d_zero: form = form_zero;  // Fall through.
+    case FCMP_s:
+    case FCMP_d: mnemonic = "fcmp"; break;
+    default: form = "(FPCompare)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitFPConditionalCompare(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Fn, 'Fm, 'INzcv, 'Cond";
+
+  switch (instr->Mask(FPConditionalCompareMask)) {
+    case FCCMP_s:
+    case FCCMP_d: mnemonic = "fccmp"; break;
+    default: form = "(FPConditionalCompare)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitFPConditionalSelect(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Fd, 'Fn, 'Fm, 'Cond";
+
+  switch (instr->Mask(FPConditionalSelectMask)) {
+    case FCSEL_s:
+    case FCSEL_d: mnemonic = "fcsel"; break;
+    default: form = "(FPConditionalSelect)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitFPDataProcessing1Source(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Fd, 'Fn";
+
+  switch (instr->Mask(FPDataProcessing1SourceMask)) {
+    #define FORMAT(A, B)  \
+    case A##_s:           \
+    case A##_d: mnemonic = B; break;
+    FORMAT(FMOV, "fmov");
+    FORMAT(FABS, "fabs");
+    FORMAT(FNEG, "fneg");
+    FORMAT(FSQRT, "fsqrt");
+    FORMAT(FRINTN, "frintn");
+    FORMAT(FRINTP, "frintp");
+    FORMAT(FRINTM, "frintm");
+    FORMAT(FRINTZ, "frintz");
+    FORMAT(FRINTA, "frinta");
+    FORMAT(FRINTX, "frintx");
+    FORMAT(FRINTI, "frinti");
+    #undef FORMAT
+    case FCVT_ds: mnemonic = "fcvt"; form = "'Dd, 'Sn"; break;
+    case FCVT_sd: mnemonic = "fcvt"; form = "'Sd, 'Dn"; break;
+    default: form = "(FPDataProcessing1Source)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitFPDataProcessing2Source(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Fd, 'Fn, 'Fm";
+
+  switch (instr->Mask(FPDataProcessing2SourceMask)) {
+    #define FORMAT(A, B)  \
+    case A##_s:           \
+    case A##_d: mnemonic = B; break;
+    FORMAT(FMUL, "fmul");
+    FORMAT(FDIV, "fdiv");
+    FORMAT(FADD, "fadd");
+    FORMAT(FSUB, "fsub");
+    FORMAT(FMAX, "fmax");
+    FORMAT(FMIN, "fmin");
+    FORMAT(FMAXNM, "fmaxnm");
+    FORMAT(FMINNM, "fminnm");
+    FORMAT(FNMUL, "fnmul");
+    #undef FORMAT
+    default: form = "(FPDataProcessing2Source)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitFPDataProcessing3Source(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "'Fd, 'Fn, 'Fm, 'Fa";
+
+  switch (instr->Mask(FPDataProcessing3SourceMask)) {
+    #define FORMAT(A, B)  \
+    case A##_s:           \
+    case A##_d: mnemonic = B; break;
+    FORMAT(FMADD, "fmadd");
+    FORMAT(FMSUB, "fmsub");
+    FORMAT(FNMADD, "fnmadd");
+    FORMAT(FNMSUB, "fnmsub");
+    #undef FORMAT
+    default: form = "(FPDataProcessing3Source)";
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitFPImmediate(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "(FPImmediate)";
+
+  switch (instr->Mask(FPImmediateMask)) {
+    case FMOV_s_imm: mnemonic = "fmov"; form = "'Sd, 'IFPSingle"; break;
+    case FMOV_d_imm: mnemonic = "fmov"; form = "'Dd, 'IFPDouble"; break;
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitFPIntegerConvert(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "(FPIntegerConvert)";
+  const char *form_rf = "'Rd, 'Fn";
+  const char *form_fr = "'Fd, 'Rn";
+
+  switch (instr->Mask(FPIntegerConvertMask)) {
+    case FMOV_ws:
+    case FMOV_xd: mnemonic = "fmov"; form = form_rf; break;
+    case FMOV_sw:
+    case FMOV_dx: mnemonic = "fmov"; form = form_fr; break;
+    case FCVTMS_ws:
+    case FCVTMS_xs:
+    case FCVTMS_wd:
+    case FCVTMS_xd: mnemonic = "fcvtms"; form = form_rf; break;
+    case FCVTMU_ws:
+    case FCVTMU_xs:
+    case FCVTMU_wd:
+    case FCVTMU_xd: mnemonic = "fcvtmu"; form = form_rf; break;
+    case FCVTNS_ws:
+    case FCVTNS_xs:
+    case FCVTNS_wd:
+    case FCVTNS_xd: mnemonic = "fcvtns"; form = form_rf; break;
+    case FCVTNU_ws:
+    case FCVTNU_xs:
+    case FCVTNU_wd:
+    case FCVTNU_xd: mnemonic = "fcvtnu"; form = form_rf; break;
+    case FCVTZU_xd:
+    case FCVTZU_ws:
+    case FCVTZU_wd:
+    case FCVTZU_xs: mnemonic = "fcvtzu"; form = form_rf; break;
+    case FCVTZS_xd:
+    case FCVTZS_wd:
+    case FCVTZS_xs:
+    case FCVTZS_ws: mnemonic = "fcvtzs"; form = form_rf; break;
+    case SCVTF_dw:
+    case SCVTF_dx: mnemonic = "scvtf"; form = form_fr; break;
+    case UCVTF_dw:
+    case UCVTF_dx: mnemonic = "ucvtf"; form = form_fr; break;
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitFPFixedPointConvert(Instruction* instr) {
+  const char *mnemonic = "unknown";
+  const char *form = "(FPFixedPointConvert)";
+  const char *form_rf = "'Rd, 'Fn, 'IFPFBits";
+  const char *form_fr = "'Fd, 'Rn, 'IFPFBits";
+
+  switch (instr->Mask(FPFixedPointConvertMask)) {
+    case FCVTZS_ws_fixed:
+    case FCVTZS_xs_fixed:
+    case FCVTZS_wd_fixed:
+    case FCVTZS_xd_fixed: mnemonic = "fcvtzs"; form = form_rf; break;
+    case FCVTZU_ws_fixed:
+    case FCVTZU_xs_fixed:
+    case FCVTZU_wd_fixed:
+    case FCVTZU_xd_fixed: mnemonic = "fcvtzu"; form = form_rf; break;
+    case SCVTF_sw_fixed:
+    case SCVTF_sx_fixed:
+    case SCVTF_dw_fixed:
+    case SCVTF_dx_fixed: mnemonic = "scvtf"; form = form_fr; break;
+    case UCVTF_sw_fixed:
+    case UCVTF_sx_fixed:
+    case UCVTF_dw_fixed:
+    case UCVTF_dx_fixed: mnemonic = "ucvtf"; form = form_fr; break;
+  }
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitSystem(Instruction* instr) {
+  // Some system instructions hijack their Op and Cp fields to represent a
+  // range of immediates instead of indicating a different instruction. This
+  // makes the decoding tricky.
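+  // For example, nop is encoded as the hint instruction with immediate zero,
+  // and mrs/msr select their system register through an immediate field.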
+  const char *mnemonic = "unknown";
+  const char *form = "(System)";
+
+  if (instr->Mask(SystemSysRegFMask) == SystemSysRegFixed) {
+    switch (instr->Mask(SystemSysRegMask)) {
+      case MRS: {
+        mnemonic = "mrs";
+        switch (instr->ImmSystemRegister()) {
+          case NZCV: form = "'Xt, nzcv"; break;
+        }
+        break;
+      }
+      case MSR: {
+        mnemonic = "msr";
+        switch (instr->ImmSystemRegister()) {
+          case NZCV: form = "nzcv, 'Xt"; break;
+        }
+        break;
+      }
+    }
+  } else if (instr->Mask(SystemHintFMask) == SystemHintFixed) {
+    ASSERT(instr->Mask(SystemHintMask) == HINT);
+    switch (instr->ImmHint()) {
+      case NOP: {
+        mnemonic = "nop";
+        form = NULL;
+        break;
+      }
+    }
+  }
+
+  Format(instr, mnemonic, form);
+}
+
+
+void Disassembler::VisitException(Instruction* instr) {
+  switch (instr->Mask(ExceptionMask)) {
+    case HLT: Format(instr, "hlt", "'IDebug"); break;
+    case BRK: Format(instr, "brk", "'IDebug"); break;
+    default: Format(instr, "unknown", "(Exception)");
+  }
+}
+
+
+void Disassembler::VisitUnknown(Instruction* instr) {
+  Format(instr, "unknown", "(Unknown)");
+}
+
+
+void Disassembler::ProcessOutput(Instruction* /*instr*/) {
+  // The base disasm does nothing more than disassembling into a buffer.
+}
+
+
+void Disassembler::Format(Instruction* instr, const char* mnemonic,
+                          const char* format) {
+  ASSERT(mnemonic != NULL);
+  ResetOutput();
+  Substitute(instr, mnemonic);
+  if (format != NULL) {
+    buffer_[buffer_pos_++] = ' ';
+    Substitute(instr, format);
+  }
+  buffer_[buffer_pos_] = 0;
+  ProcessOutput(instr);
+}
+
+
+void Disassembler::Substitute(Instruction* instr, const char* string) {
+  char chr = *string++;
+  while (chr != '\0') {
+    if (chr == '\'') {
+      string += SubstituteField(instr, string);
+    } else {
+      buffer_[buffer_pos_++] = chr;
+    }
+    chr = *string++;
+  }
+}
+
+
+int Disassembler::SubstituteField(Instruction* instr, const char* format) {
+  switch (format[0]) {
+    case 'R':  // Register. X or W, selected by sf bit.
+    case 'F':  // FP Register. S or D, selected by type field.
+    case 'W':
+    case 'X':
+    case 'S':
+    case 'D': return SubstituteRegisterField(instr, format);
+    case 'I': return SubstituteImmediateField(instr, format);
+    case 'L': return SubstituteLiteralField(instr, format);
+    case 'H': return SubstituteShiftField(instr, format);
+    case 'P': return SubstitutePrefetchField(instr, format);
+    case 'C': return SubstituteConditionField(instr, format);
+    case 'E': return SubstituteExtendField(instr, format);
+    case 'A': return SubstitutePCRelAddressField(instr, format);
+    case 'B': return SubstituteBranchTargetField(instr, format);
+    case 'O': return SubstituteLSRegOffsetField(instr, format);
+    default: {
+      UNREACHABLE();
+      return 1;
+    }
+  }
+}
+
+
+int Disassembler::SubstituteRegisterField(Instruction* instr,
+                                          const char* format) {
+  unsigned reg_num = 0;
+  unsigned field_len = 2;
+  switch (format[1]) {
+    case 'd': reg_num = instr->Rd(); break;
+    case 'n': reg_num = instr->Rn(); break;
+    case 'm': reg_num = instr->Rm(); break;
+    case 'a': reg_num = instr->Ra(); break;
+    case 't': {
+      if (format[2] == '2') {
+        reg_num = instr->Rt2();
+        field_len = 3;
+      } else {
+        reg_num = instr->Rt();
+      }
+      break;
+    }
+    default: UNREACHABLE();
+  }
+
+  // Increase field length for registers tagged as stack.
+  if (format[2] == 's') {
+    field_len = 3;
+  }
+
+  char reg_type;
+  if (format[0] == 'R') {
+    // Register type is R: use the sf bit to choose between X and W.
+    reg_type = instr->SixtyFourBits() ? 'x' : 'w';
+  } else if (format[0] == 'F') {
+    // Floating-point register: use type field to choose S or D.
+    reg_type = ((instr->FPType() & 1) == 0) ? 's' : 'd';
+  } else {
+    // Register type is specified. Make it lower case.
+    reg_type = format[0] + 0x20;
+  }
+
+  if ((reg_num != kZeroRegCode) || (reg_type == 's') || (reg_type == 'd')) {
+    // A normal register: w0 - w30, x0 - x30, s0 - s31, d0 - d31.
+    AppendToOutput("%c%d", reg_type, reg_num);
+  } else if (format[2] == 's') {
+    // Disassemble w31/x31 as stack pointer wsp/sp.
+    AppendToOutput("%s", (reg_type == 'w') ? "wsp" : "sp");
+  } else {
+    // Disassemble w31/x31 as zero register wzr/xzr.
+    AppendToOutput("%czr", reg_type);
+  }
+
+  return field_len;
+}
+
+
+int Disassembler::SubstituteImmediateField(Instruction* instr,
+                                           const char* format) {
+  ASSERT(format[0] == 'I');
+
+  switch (format[1]) {
+    case 'M': {  // IMoveImm or IMoveLSL.
+      if (format[5] == 'I') {
+        uint64_t imm = instr->ImmMoveWide() << (16 * instr->ShiftMoveWide());
+        AppendToOutput("#0x%" PRIx64, imm);
+      } else {
+        ASSERT(format[5] == 'L');
+        AppendToOutput("#0x%" PRIx64, instr->ImmMoveWide());
+        if (instr->ShiftMoveWide() > 0) {
+          AppendToOutput(", lsl #%d", 16 * instr->ShiftMoveWide());
+        }
+      }
+      return 8;
+    }
+    case 'L': {
+      switch (format[2]) {
+        case 'L': {  // ILLiteral - Immediate Load Literal.
+          AppendToOutput("#%" PRId64,
+                         instr->ImmLLiteral() << kLiteralEntrySizeLog2);
+          return 9;
+        }
+        case 'S': {  // ILS - Immediate Load/Store.
+          if (instr->ImmLS() != 0) {
+            AppendToOutput(", #%" PRId64, instr->ImmLS());
+          }
+          return 3;
+        }
+        case 'P': {  // ILPx - Immediate Load/Store Pair, x = access size.
+          if (instr->ImmLSPair() != 0) {
+            // format[3] is the scale value. Convert to a number.
+            int scale = format[3] - 0x30;
+            AppendToOutput(", #%" PRId64, instr->ImmLSPair() * scale);
+          }
+          return 4;
+        }
+        case 'U': {  // ILU - Immediate Load/Store Unsigned.
+          if (instr->ImmLSUnsigned() != 0) {
+            AppendToOutput(", #%" PRIu64,
+                           instr->ImmLSUnsigned() << instr->SizeLS());
+          }
+          return 3;
+        }
+      }
+    }
+    case 'C': {  // ICondB - Immediate Conditional Branch.
+      int64_t offset = instr->ImmCondBranch() << 2;
+      char sign = '+';
+      if (offset < 0) {
+        sign = '-';
+        offset = -offset;
+      }
+      AppendToOutput("#%c0x%" PRIx64, sign, offset);
+      return 6;
+    }
+    case 'A': {  // IAddSub.
+      ASSERT(instr->ShiftAddSub() <= 1);
+      int64_t imm = instr->ImmAddSub() << (12 * instr->ShiftAddSub());
+      AppendToOutput("#0x%" PRIx64 " (%" PRId64 ")", imm, imm);
+      return 7;
+    }
+    case 'F': {  // IFPSingle, IFPDouble or IFPFBits.
+      if (format[3] == 'F') {  // IFPFBits.
+        AppendToOutput("#%d", 64 - instr->FPScale());
+        return 8;
+      } else {
+        AppendToOutput("#0x%" PRIx64 " (%.4f)", instr->ImmFP(),
+                       format[3] == 'S' ? instr->ImmFP32() : instr->ImmFP64());
+        return 9;
+      }
+    }
+    case 'T': {  // ITri - Immediate Triangular Encoded.
+      AppendToOutput("#0x%" PRIx64, instr->ImmLogical());
+      return 4;
+    }
+    case 'N': {  // INzcv.
+      int nzcv = (instr->Nzcv() << Flags_offset);
+      AppendToOutput("#%c%c%c%c", ((nzcv & NFlag) == 0) ? 'n' : 'N',
+                                  ((nzcv & ZFlag) == 0) ? 'z' : 'Z',
+                                  ((nzcv & CFlag) == 0) ? 'c' : 'C',
+                                  ((nzcv & VFlag) == 0) ? 'v' : 'V');
+      return 5;
+    }
+    case 'P': {  // IP - Conditional compare.
+      AppendToOutput("#%d", instr->ImmCondCmp());
+      return 2;
+    }
+    case 'B': {  // Bitfields.
+      return SubstituteBitfieldImmediateField(instr, format);
+    }
+    case 'E': {  // IExtract.
+      AppendToOutput("#%d", instr->ImmS());
+      return 8;
+    }
+    case 'S': {  // IS - Test and branch bit.
+      AppendToOutput("#%d", (instr->ImmTestBranchBit5() << 5) |
+                            instr->ImmTestBranchBit40());
+      return 2;
+    }
+    case 'D': {  // IDebug - HLT and BRK instructions.
+      AppendToOutput("#0x%x", instr->ImmException());
+      return 6;
+    }
+    default: {
+      UNIMPLEMENTED();
+      return 0;
+    }
+  }
+}
+
+
+int Disassembler::SubstituteBitfieldImmediateField(Instruction* instr,
+                                                   const char* format) {
+  ASSERT((format[0] == 'I') && (format[1] == 'B'));
+  unsigned r = instr->ImmR();
+  unsigned s = instr->ImmS();
+
+  switch (format[2]) {
+    case 'r': {  // IBr.
+      AppendToOutput("#%d", r);
+      return 3;
+    }
+    case 's': {  // IBs+1 or IBs-r+1.
+      if (format[3] == '+') {
+        AppendToOutput("#%d", s + 1);
+        return 5;
+      } else {
+        ASSERT(format[3] == '-');
+        AppendToOutput("#%d", s - r + 1);
+        return 7;
+      }
+    }
+    case 'Z': {  // IBZ-r.
+      ASSERT((format[3] == '-') && (format[4] == 'r'));
+      unsigned reg_size = (instr->SixtyFourBits() == 1) ? kXRegSize : kWRegSize;
+      AppendToOutput("#%d", reg_size - r);
+      return 5;
+    }
+    default: {
+      UNREACHABLE();
+      return 0;
+    }
+  }
+}
+
+
+int Disassembler::SubstituteLiteralField(Instruction* instr,
+                                         const char* format) {
+  ASSERT(strncmp(format, "LValue", 6) == 0);
+  USE(format);
+
+  switch (instr->Mask(LoadLiteralMask)) {
+    case LDR_s_lit: AppendToOutput("(%.4f)", instr->LiteralFP32()); break;
+    case LDR_d_lit: AppendToOutput("(%.4f)", instr->LiteralFP64()); break;
+    case LDR_w_lit:
+      AppendToOutput("(0x%08" PRIx32 ")", instr->Literal32());
+      break;
+    case LDR_x_lit:
+      AppendToOutput("(0x%016" PRIx64 ")", instr->Literal64());
+      break;
+    default: UNREACHABLE();
+  }
+
+  return 6;
+}
+
+
+int Disassembler::SubstituteShiftField(Instruction* instr, const char* format) {
+  ASSERT(format[0] == 'H');
+  ASSERT(instr->ShiftDP() <= 0x3);
+
+  switch (format[1]) {
+    case 'D': {  // HDP.
+      ASSERT(instr->ShiftDP() != ROR);
+    }  // Fall through.
+    case 'L': {  // HLo.
+      if (instr->ImmDPShift() != 0) {
+        const char* shift_type[] = {"lsl", "lsr", "asr", "ror"};
+        AppendToOutput(", %s #%" PRId64, shift_type[instr->ShiftDP()],
+                       instr->ImmDPShift());
+      }
+      return 3;
+    }
+    default:
+      UNIMPLEMENTED();
+      return 0;
+  }
+}
+
+
+int Disassembler::SubstituteConditionField(Instruction* instr,
+                                           const char* format) {
+  ASSERT(format[0] == 'C');
+  const char* condition_code[] = { "eq", "ne", "hs", "lo",
+                                   "mi", "pl", "vs", "vc",
+                                   "hi", "ls", "ge", "lt",
+                                   "gt", "le", "al", "nv" };
+  int cond;
+  switch (format[1]) {
+    case 'B': cond = instr->ConditionBranch(); break;
+    case 'I': {
+      cond = InvertCondition(static_cast<Condition>(instr->Condition()));
+      break;
+    }
+    default: cond = instr->Condition();
+  }
+  AppendToOutput("%s", condition_code[cond]);
+  return 4;
+}
+
+
+int Disassembler::SubstitutePCRelAddressField(Instruction* instr,
+                                              const char* format) {
+  USE(format);
+  ASSERT(strncmp(format, "AddrPCRel", 9) == 0);
+
+  int offset = instr->ImmPCRel();
+
+  // Only ADR (AddrPCRelByte) is supported.
+  ASSERT(strcmp(format, "AddrPCRelByte") == 0);
+
+  char sign = '+';
+  if (offset < 0) {
+    offset = -offset;
+    sign = '-';
+  }
+  // TODO: Extend this to support printing the target address.
+  AppendToOutput("#%c0x%x", sign, offset);
+  return 13;
+}
+
+
+int Disassembler::SubstituteBranchTargetField(Instruction* instr,
+                                              const char* format) {
+  ASSERT(strncmp(format, "BImm", 4) == 0);
+
+  int64_t offset = 0;
+  switch (format[5]) {
+    // BImmUncn - unconditional branch immediate.
+    case 'n': offset = instr->ImmUncondBranch(); break;
+    // BImmCond - conditional branch immediate.
+    case 'o': offset = instr->ImmCondBranch(); break;
+    // BImmCmpa - compare and branch immediate.
+    case 'm': offset = instr->ImmCmpBranch(); break;
+    // BImmTest - test and branch immediate.
+    case 'e': offset = instr->ImmTestBranch(); break;
+    default: UNIMPLEMENTED();
+  }
+  offset <<= kInstructionSizeLog2;
+  char sign = '+';
+  if (offset < 0) {
+    offset = -offset;
+    sign = '-';
+  }
+  AppendToOutput("#%c0x%" PRIx64, sign, offset);
+  return 8;
+}
+
+
+int Disassembler::SubstituteExtendField(Instruction* instr,
+                                        const char* format) {
+  ASSERT(strncmp(format, "Ext", 3) == 0);
+  ASSERT(instr->ExtendMode() <= 7);
+  USE(format);
+
+  const char* extend_mode[] = { "uxtb", "uxth", "uxtw", "uxtx",
+                                "sxtb", "sxth", "sxtw", "sxtx" };
+
+  // If rd or rn is SP, the extend modes uxtw (for 32-bit operands) and uxtx
+  // (for 64-bit operands) are disassembled as lsl.
+  if (((instr->Rd() == kZeroRegCode) || (instr->Rn() == kZeroRegCode)) &&
+      (((instr->ExtendMode() == UXTW) && (instr->SixtyFourBits() == 0)) ||
+       (instr->ExtendMode() == UXTX))) {
+    if (instr->ImmExtendShift() > 0) {
+      AppendToOutput(", lsl #%d", instr->ImmExtendShift());
+    }
+  } else {
+    AppendToOutput(", %s", extend_mode[instr->ExtendMode()]);
+    if (instr->ImmExtendShift() > 0) {
+      AppendToOutput(" #%d", instr->ImmExtendShift());
+    }
+  }
+  return 3;
+}
+
+
+int Disassembler::SubstituteLSRegOffsetField(Instruction* instr,
+                                             const char* format) {
+  ASSERT(strncmp(format, "Offsetreg", 9) == 0);
+  const char* extend_mode[] = { "undefined", "undefined", "uxtw", "lsl",
+                                "undefined", "undefined", "sxtw", "sxtx" };
+  USE(format);
+
+  unsigned shift = instr->ImmShiftLS();
+  Extend ext = static_cast<Extend>(instr->ExtendMode());
+  char reg_type = ((ext == UXTW) || (ext == SXTW)) ? 'w' : 'x';
+
+  unsigned rm = instr->Rm();
+  if (rm == kZeroRegCode) {
+    AppendToOutput("%czr", reg_type);
+  } else {
+    AppendToOutput("%c%d", reg_type, rm);
+  }
+
+  // Extend mode UXTX is an alias for shift mode LSL here.
+  if (!((ext == UXTX) && (shift == 0))) {
+    AppendToOutput(", %s", extend_mode[ext]);
+    if (shift != 0) {
+      AppendToOutput(" #%d", instr->SizeLS());
+    }
+  }
+  return 9;
+}
+
+
+int Disassembler::SubstitutePrefetchField(Instruction* instr,
+                                          const char* format) {
+  ASSERT(format[0] == 'P');
+  USE(format);
+
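+  // The prefetch operation field is assumed to follow the usual A64 prfop
+  // layout: bit 4 distinguishes store from load prefetches, bits 2:1 select
+  // the target cache level and bit 0 selects the keep/stream policy.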
+  int prefetch_mode = instr->PrefetchMode();
+
+  const char* ls = (prefetch_mode & 0x10) ? "st" : "ld";
+  int level = ((prefetch_mode >> 1) & 0x3) + 1;
+  const char* ks = (prefetch_mode & 1) ? "strm" : "keep";
+
+  AppendToOutput("p%sl%d%s", ls, level, ks);
+  return 6;
+}
+
+
+void Disassembler::ResetOutput() {
+  buffer_pos_ = 0;
+  buffer_[buffer_pos_] = 0;
+}
+
+
+void Disassembler::AppendToOutput(const char* format, ...) {
+  va_list args;
+  va_start(args, format);
+  buffer_pos_ += vsnprintf(&buffer_[buffer_pos_], buffer_size_ - buffer_pos_,
+                           format, args);
+  va_end(args);
+}
+
+
+void PrintDisassembler::ProcessOutput(Instruction* instr) {
+  fprintf(stream_, "0x%016" PRIx64 "  %08" PRIx32 "\t\t%s\n",
+          reinterpret_cast<uint64_t>(instr),
+          instr->InstructionBits(),
+          GetOutput());
+}
+}  // namespace vixl
diff --git a/src/a64/disasm-a64.h b/src/a64/disasm-a64.h
new file mode 100644
index 0000000..857a5ac
--- /dev/null
+++ b/src/a64/disasm-a64.h
@@ -0,0 +1,109 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_A64_DISASM_A64_H
+#define VIXL_A64_DISASM_A64_H
+
+#include "globals.h"
+#include "utils.h"
+#include "instructions-a64.h"
+#include "decoder-a64.h"
+
+namespace vixl {
+
+class Disassembler: public DecoderVisitor {
+ public:
+  Disassembler();
+  Disassembler(char* text_buffer, int buffer_size);
+  virtual ~Disassembler();
+  char* GetOutput();
+
+  // Declare all Visitor functions.
+  #define DECLARE(A)  void Visit##A(Instruction* instr);
+  VISITOR_LIST(DECLARE)
+  #undef DECLARE
+
+ protected:
+  virtual void ProcessOutput(Instruction* instr);
+
+ private:
+  void Format(Instruction* instr, const char* mnemonic, const char* format);
+  void Substitute(Instruction* instr, const char* string);
+  int SubstituteField(Instruction* instr, const char* format);
+  int SubstituteRegisterField(Instruction* instr, const char* format);
+  int SubstituteImmediateField(Instruction* instr, const char* format);
+  int SubstituteLiteralField(Instruction* instr, const char* format);
+  int SubstituteBitfieldImmediateField(Instruction* instr, const char* format);
+  int SubstituteShiftField(Instruction* instr, const char* format);
+  int SubstituteExtendField(Instruction* instr, const char* format);
+  int SubstituteConditionField(Instruction* instr, const char* format);
+  int SubstitutePCRelAddressField(Instruction* instr, const char* format);
+  int SubstituteBranchTargetField(Instruction* instr, const char* format);
+  int SubstituteLSRegOffsetField(Instruction* instr, const char* format);
+  int SubstitutePrefetchField(Instruction* instr, const char* format);
+
+  inline bool RdIsZROrSP(Instruction* instr) const {
+    return (instr->Rd() == kZeroRegCode);
+  }
+
+  inline bool RnIsZROrSP(Instruction* instr) const {
+    return (instr->Rn() == kZeroRegCode);
+  }
+
+  inline bool RmIsZROrSP(Instruction* instr) const {
+    return (instr->Rm() == kZeroRegCode);
+  }
+
+  inline bool RaIsZROrSP(Instruction* instr) const {
+    return (instr->Ra() == kZeroRegCode);
+  }
+
+  bool IsMovzMovnImm(unsigned reg_size, uint64_t value);
+
+  void ResetOutput();
+  void AppendToOutput(const char* string, ...);
+
+  char* buffer_;
+  uint32_t buffer_pos_;
+  uint32_t buffer_size_;
+  bool own_buffer_;
+};
+
+
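+// A minimal usage sketch (assuming the Decoder class declared in
+// decoder-a64.h provides AppendVisitor() and Decode(); it is not shown here):
+//
+//   Decoder decoder;
+//   PrintDisassembler disasm(stdout);
+//   decoder.AppendVisitor(&disasm);
+//   decoder.Decode(first_instruction);  // first_instruction is Instruction*.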
+class PrintDisassembler: public Disassembler {
+ public:
+  explicit PrintDisassembler(FILE* stream) : stream_(stream) { }
+  ~PrintDisassembler() { }
+
+ protected:
+  virtual void ProcessOutput(Instruction* instr);
+
+ private:
+  FILE *stream_;
+};
+}  // namespace vixl
+
+#endif  // VIXL_A64_DISASM_A64_H
diff --git a/src/a64/instructions-a64.cc b/src/a64/instructions-a64.cc
new file mode 100644
index 0000000..b9b4798
--- /dev/null
+++ b/src/a64/instructions-a64.cc
@@ -0,0 +1,242 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "a64/instructions-a64.h"
+#include "a64/assembler-a64.h"
+
+namespace vixl {
+
+#ifdef DEBUG
+uint32_t Instruction::SpacedBits(int num_bits, ...) const {
+  va_list bit_list;
+  va_start(bit_list, num_bits);
+  uint32_t result = 0;
+  for (int i = 0; i < num_bits; i++) {
+    result = (result << 1) | Bit(va_arg(bit_list, int));
+  }
+  va_end(bit_list);
+  return result;
+}
+#endif
+
+
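+// Rotate 'value' (assumed to fit in 'width' bits) right by 'rotate' bits
+// within a 'width'-bit field.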
+static uint64_t RotateRight(uint64_t value,
+                            unsigned int rotate,
+                            unsigned int width) {
+  ASSERT(width <= 64);
+  rotate &= 63;
+  return ((value & ((1UL << rotate) - 1UL)) << (width - rotate)) |
+         (value >> rotate);
+}
+
+
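+// Replicate the least-significant 'width' bits of 'value' across a register
+// of 'reg_size' bits.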
+static uint64_t RepeatBitsAcrossReg(unsigned reg_size,
+                                    uint64_t value,
+                                    unsigned width) {
+  ASSERT((width == 2) || (width == 4) || (width == 8) || (width == 16) ||
+         (width == 32));
+  ASSERT((reg_size == kWRegSize) || (reg_size == kXRegSize));
+  uint64_t result = value & ((1UL << width) - 1UL);
+  for (unsigned i = width; i < reg_size; i *= 2) {
+    result |= (result << i);
+  }
+  return result;
+}
+
+
+uint64_t Instruction::ImmLogical() {
+  unsigned reg_size = SixtyFourBits() ? kXRegSize : kWRegSize;
+  int64_t n = BitN();
+  int64_t imm_s = ImmSetBits();
+  int64_t imm_r = ImmRotate();
+
+  // An integer is constructed from the n, imm_s and imm_r bits according to
+  // the following table:
+  //
+  //  N   imms    immr    size        S             R
+  //  1  ssssss  rrrrrr    64    UInt(ssssss)  UInt(rrrrrr)
+  //  0  0sssss  xrrrrr    32    UInt(sssss)   UInt(rrrrr)
+  //  0  10ssss  xxrrrr    16    UInt(ssss)    UInt(rrrr)
+  //  0  110sss  xxxrrr     8    UInt(sss)     UInt(rrr)
+  //  0  1110ss  xxxxrr     4    UInt(ss)      UInt(rr)
+  //  0  11110s  xxxxxr     2    UInt(s)       UInt(r)
+  // (s bits must not be all set)
+  //
+  // A pattern is constructed of size bits, where the least significant S+1
+  // bits are set. The pattern is rotated right by R, and repeated across a
+  // 32 or 64-bit value, depending on destination register width.
+  //
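+  // For example (worked through the decoding below): N = 0, imms = 0b110011
+  // and immr = 0 give an 8-bit element with its low four bits set (0x0f),
+  // which repeated across a W register yields 0x0f0f0f0f.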
+
+  if (n == 1) {
+    ASSERT(imm_s != 0x3F);
+    uint64_t bits = (1UL << (imm_s + 1)) - 1;
+    return RotateRight(bits, imm_r, 64);
+  } else {
+    ASSERT((imm_s >> 1) != 0x1F);
+    for (int width = 0x20; width >= 0x2; width >>= 1) {
+      if ((imm_s & width) == 0) {
+        int mask = width - 1;
+        ASSERT((imm_s & mask) != mask);
+        uint64_t bits = (1UL << ((imm_s & mask) + 1)) - 1;
+        return RepeatBitsAcrossReg(reg_size,
+                                   RotateRight(bits, imm_r & mask, width),
+                                   width);
+      }
+    }
+  }
+  UNREACHABLE();
+  return 0;
+}
+
+
+float Instruction::ImmFP32() {
+  //  ImmFP: abcdefgh (8 bits)
+  // Single: aBbb.bbbc.defg.h000.0000.0000.0000.0000 (32 bits)
+  // where B is b ^ 1
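+  // For example, ImmFP() == 0x70 (abcdefgh = 01110000) expands to
+  // 0x3f800000, which is 1.0f.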
+  uint32_t bits = ImmFP();
+  uint32_t bit7 = (bits >> 7) & 0x1;
+  uint32_t bit6 = (bits >> 6) & 0x1;
+  uint32_t bit5_to_0 = bits & 0x3f;
+  uint32_t result = (bit7 << 31) | ((32 - bit6) << 25) | (bit5_to_0 << 19);
+
+  return rawbits_to_float(result);
+}
+
+
+double Instruction::ImmFP64() {
+  //  ImmFP: abcdefgh (8 bits)
+  // Double: aBbb.bbbb.bbcd.efgh.0000.0000.0000.0000
+  //         0000.0000.0000.0000.0000.0000.0000.0000 (64 bits)
+  // where B is b ^ 1
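+  // For example, ImmFP() == 0x70 expands to 0x3ff0000000000000, which is 1.0.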
+  uint32_t bits = ImmFP();
+  uint64_t bit7 = (bits >> 7) & 0x1;
+  uint64_t bit6 = (bits >> 6) & 0x1;
+  uint64_t bit5_to_0 = bits & 0x3f;
+  uint64_t result = (bit7 << 63) | ((256 - bit6) << 54) | (bit5_to_0 << 48);
+
+  return rawbits_to_double(result);
+}
+
+
+LSDataSize CalcLSPairDataSize(LoadStorePairOp op) {
+  switch (op) {
+    case STP_x:
+    case LDP_x:
+    case STP_d:
+    case LDP_d: return LSDoubleWord;
+    default: return LSWord;
+  }
+}
+
+
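+// Note that Instruction has no data members, so sizeof(Instruction) is 1 and
+// pointer arithmetic on Instruction* is byte-granular; 'offset' below is a
+// byte offset.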
+Instruction* Instruction::ImmPCOffsetTarget() {
+  ptrdiff_t offset;
+  if (IsPCRelAddressing()) {
+    // PC-relative addressing. Only ADR is supported.
+    offset = ImmPCRel();
+  } else {
+    // All PC-relative branches.
+    ASSERT(BranchType() != UnknownBranchType);
+    // Relative branch offsets are instruction-size-aligned.
+    offset = ImmBranch() << kInstructionSizeLog2;
+  }
+  return this + offset;
+}
+
+
+inline int Instruction::ImmBranch() const {
+  switch (BranchType()) {
+    case CondBranchType: return ImmCondBranch();
+    case UncondBranchType: return ImmUncondBranch();
+    case CompareBranchType: return ImmCmpBranch();
+    case TestBranchType: return ImmTestBranch();
+    default: UNREACHABLE();
+  }
+  return 0;
+}
+
+
+void Instruction::SetImmPCOffsetTarget(Instruction* target) {
+  if (IsPCRelAddressing()) {
+    SetPCRelImmTarget(target);
+  } else {
+    SetBranchImmTarget(target);
+  }
+}
+
+
+void Instruction::SetPCRelImmTarget(Instruction* target) {
+  // ADRP is not supported, so 'this' must point to an ADR instruction.
+  ASSERT(Mask(PCRelAddressingMask) == ADR);
+
+  Instr imm = Assembler::ImmPCRelAddress(target - this);
+
+  SetInstructionBits(Mask(~ImmPCRel_mask) | imm);
+}
+
+
+void Instruction::SetBranchImmTarget(Instruction* target) {
+  ASSERT(((target - this) & 3) == 0);
+  Instr branch_imm = 0;
+  uint32_t imm_mask = 0;
+  int offset = (target - this) >> kInstructionSizeLog2;
+  switch (BranchType()) {
+    case CondBranchType: {
+      branch_imm = Assembler::ImmCondBranch(offset);
+      imm_mask = ImmCondBranch_mask;
+      break;
+    }
+    case UncondBranchType: {
+      branch_imm = Assembler::ImmUncondBranch(offset);
+      imm_mask = ImmUncondBranch_mask;
+      break;
+    }
+    case CompareBranchType: {
+      branch_imm = Assembler::ImmCmpBranch(offset);
+      imm_mask = ImmCmpBranch_mask;
+      break;
+    }
+    case TestBranchType: {
+      branch_imm = Assembler::ImmTestBranch(offset);
+      imm_mask = ImmTestBranch_mask;
+      break;
+    }
+    default: UNREACHABLE();
+  }
+  SetInstructionBits(Mask(~imm_mask) | branch_imm);
+}
+
+
+void Instruction::SetImmLLiteral(Instruction* source) {
+  ASSERT(((source - this) & 3) == 0);
+  int offset = (source - this) >> kLiteralEntrySizeLog2;
+  Instr imm = Assembler::ImmLLiteral(offset);
+  Instr mask = ImmLLiteral_mask;
+
+  SetInstructionBits(Mask(~mask) | imm);
+}
+}  // namespace vixl
+
diff --git a/src/a64/instructions-a64.h b/src/a64/instructions-a64.h
new file mode 100644
index 0000000..76bd0ec
--- /dev/null
+++ b/src/a64/instructions-a64.h
@@ -0,0 +1,330 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_A64_INSTRUCTIONS_A64_H_
+#define VIXL_A64_INSTRUCTIONS_A64_H_
+
+#include "globals.h"
+#include "utils.h"
+#include "a64/constants-a64.h"
+
+namespace vixl {
+// ISA constants. --------------------------------------------------------------
+
+typedef uint32_t Instr;
+const unsigned kInstructionSize = 4;
+const unsigned kInstructionSizeLog2 = 2;
+const unsigned kLiteralEntrySize = 4;
+const unsigned kLiteralEntrySizeLog2 = 2;
+const unsigned kMaxLoadLiteralRange = 1 * MBytes;
+
+const unsigned kWRegSize = 32;
+const unsigned kWRegSizeLog2 = 5;
+const unsigned kWRegSizeInBytes = kWRegSize / 8;
+const unsigned kXRegSize = 64;
+const unsigned kXRegSizeLog2 = 6;
+const unsigned kXRegSizeInBytes = kXRegSize / 8;
+const unsigned kSRegSize = 32;
+const unsigned kSRegSizeLog2 = 5;
+const unsigned kSRegSizeInBytes = kSRegSize / 8;
+const unsigned kDRegSize = 64;
+const unsigned kDRegSizeLog2 = 6;
+const unsigned kDRegSizeInBytes = kDRegSize / 8;
+const int64_t kWRegMask = 0x00000000ffffffffL;
+const int64_t kXRegMask = 0xffffffffffffffffL;
+const int64_t kSRegMask = 0x00000000ffffffffL;
+const int64_t kDRegMask = 0xffffffffffffffffL;
+const int64_t kXSignMask = 0x1L << 63;
+const int64_t kWSignMask = 0x1L << 31;
+const int64_t kByteMask = 0xffL;
+const int64_t kHalfWordMask = 0xffffL;
+const int64_t kWordMask = 0xffffffffL;
+const uint64_t kXMaxUInt = 0xffffffffffffffffUL;
+const uint64_t kWMaxUInt = 0xffffffffUL;
+const int64_t kXMaxInt = 0x7fffffffffffffffL;
+const int64_t kXMinInt = 0x8000000000000000L;
+const int32_t kWMaxInt = 0x7fffffff;
+const int32_t kWMinInt = 0x80000000;
+const unsigned kLinkRegCode = 30;
+const unsigned kZeroRegCode = 31;
+const unsigned kSPRegInternalCode = 63;
+const unsigned kRegCodeMask = 0x1f;
+const float kFP32PositiveInfinity = rawbits_to_float(0x7f800000);
+const float kFP32NegativeInfinity = rawbits_to_float(0xff800000);
+const double kFP64PositiveInfinity = rawbits_to_double(0x7ff0000000000000UL);
+const double kFP64NegativeInfinity = rawbits_to_double(0xfff0000000000000UL);
+
+// This value is a signalling NaN as both a double and as a float (taking the
+// least-significant word).
+static const double kFP64SignallingNaN = rawbits_to_double(0x7ff000007f800001);
+
+// A similar value, but as a quiet NaN.
+static const double kFP64QuietNaN = rawbits_to_double(0x7ff800007fc00001);
+
+enum LSDataSize {
+  LSByte        = 0,
+  LSHalfword    = 1,
+  LSWord        = 2,
+  LSDoubleWord  = 3
+};
+
+LSDataSize CalcLSPairDataSize(LoadStorePairOp op);
+
+enum ImmBranchType {
+  UnknownBranchType = 0,
+  CondBranchType    = 1,
+  UncondBranchType  = 2,
+  CompareBranchType = 3,
+  TestBranchType    = 4
+};
+
+enum AddrMode {
+  Offset,
+  PreIndex,
+  PostIndex
+};
+
+enum FPRounding {
+  FPTieEven,
+  FPPositiveInfinity,
+  FPNegativeInfinity,
+  FPZero,
+  FPTieAway
+};
+
+enum Reg31Mode {
+  Reg31IsStackPointer,
+  Reg31IsZeroRegister
+};
+
+// Instructions. ---------------------------------------------------------------
+
+class Instruction {
+ public:
+  inline Instr InstructionBits() const {
+    return *(reinterpret_cast<const Instr*>(this));
+  }
+
+  inline void SetInstructionBits(Instr new_instr) {
+    *(reinterpret_cast<Instr*>(this)) = new_instr;
+  }
+
+  inline int Bit(int pos) const {
+    return (InstructionBits() >> pos) & 1;
+  }
+
+  inline uint32_t Bits(int msb, int lsb) const {
+    return unsigned_bitextract_32(msb, lsb, InstructionBits());
+  }
+
+  inline int32_t SignedBits(int msb, int lsb) const {
+    int32_t bits = *(reinterpret_cast<const int32_t*>(this));
+    return signed_bitextract_32(msb, lsb, bits);
+  }
+
+#ifdef DEBUG
+  uint32_t SpacedBits(int num_bits, ...) const;
+#endif
+
+  inline Instr Mask(uint32_t mask) const {
+    return InstructionBits() & mask;
+  }
+
+  #define DEFINE_GETTER(Name, HighBit, LowBit, Func)             \
+  inline int64_t Name() const { return Func(HighBit, LowBit); }
+  FIELDS_LIST(DEFINE_GETTER)
+  #undef DEFINE_GETTER
+
+  // ImmPCRel is a compound field (not present in FIELDS_LIST), formed from
+  // ImmPCRelLo and ImmPCRelHi.
+  int ImmPCRel() const {
+    int const offset = ((ImmPCRelHi() << ImmPCRelLo_width) | ImmPCRelLo());
+    int const width = ImmPCRelLo_width + ImmPCRelHi_width;
+    return signed_bitextract_32(width-1, 0, offset);
+  }
+
+  uint64_t ImmLogical();
+  float ImmFP32();
+  double ImmFP64();
+
+  inline LSDataSize SizeLSPair() const {
+    return CalcLSPairDataSize(
+             static_cast<LoadStorePairOp>(Mask(LoadStorePairMask)));
+  }
+
+  // Helpers.
+  inline bool IsCondBranchImm() const {
+    return Mask(ConditionalBranchFMask) == ConditionalBranchFixed;
+  }
+
+  inline bool IsUncondBranchImm() const {
+    return Mask(UnconditionalBranchFMask) == UnconditionalBranchFixed;
+  }
+
+  inline bool IsCompareBranch() const {
+    return Mask(CompareBranchFMask) == CompareBranchFixed;
+  }
+
+  inline bool IsTestBranch() const {
+    return Mask(TestBranchFMask) == TestBranchFixed;
+  }
+
+  inline bool IsPCRelAddressing() const {
+    return Mask(PCRelAddressingFMask) == PCRelAddressingFixed;
+  }
+
+  inline bool IsLogicalImmediate() const {
+    return Mask(LogicalImmediateFMask) == LogicalImmediateFixed;
+  }
+
+  inline bool IsAddSubImmediate() const {
+    return Mask(AddSubImmediateFMask) == AddSubImmediateFixed;
+  }
+
+  inline bool IsAddSubExtended() const {
+    return Mask(AddSubExtendedFMask) == AddSubExtendedFixed;
+  }
+
+  inline bool IsLoadOrStore() const {
+    return Mask(LoadStoreAnyFMask) == LoadStoreAnyFixed;
+  }
+
+  // Indicate whether Rd can be the stack pointer or the zero register. This
+  // does not check that the instruction actually has an Rd field.
+  inline Reg31Mode RdMode() const {
+    // The following instructions use sp or wsp as Rd:
+    //  Add/sub (immediate) when not setting the flags.
+    //  Add/sub (extended) when not setting the flags.
+    //  Logical (immediate) when not setting the flags.
+    // Otherwise, r31 is the zero register.
+    if (IsAddSubImmediate() || IsAddSubExtended()) {
+      if (Mask(AddSubSetFlagsBit)) {
+        return Reg31IsZeroRegister;
+      } else {
+        return Reg31IsStackPointer;
+      }
+    }
+    if (IsLogicalImmediate()) {
+      // Of the logical (immediate) instructions, only ANDS (and its aliases)
+      // can set the flags. The others can all write into sp.
+      // Note that some logical operations are not available to
+      // immediate-operand instructions, so we have to combine two masks here.
+      if (Mask(LogicalImmediateMask & LogicalOpMask) == ANDS) {
+        return Reg31IsZeroRegister;
+      } else {
+        return Reg31IsStackPointer;
+      }
+    }
+    return Reg31IsZeroRegister;
+  }
+
+  // Indicate whether Rn can be the stack pointer or the zero register. This
+  // does not check that the instruction actually has an Rn field.
+  inline Reg31Mode RnMode() const {
+    // The following instructions use sp or wsp as Rn:
+    //  All loads and stores.
+    //  Add/sub (immediate).
+    //  Add/sub (extended).
+    // Otherwise, r31 is the zero register.
+    if (IsLoadOrStore() || IsAddSubImmediate() || IsAddSubExtended()) {
+      return Reg31IsStackPointer;
+    }
+    return Reg31IsZeroRegister;
+  }
+
+  inline ImmBranchType BranchType() const {
+    if (IsCondBranchImm()) {
+      return CondBranchType;
+    } else if (IsUncondBranchImm()) {
+      return UncondBranchType;
+    } else if (IsCompareBranch()) {
+      return CompareBranchType;
+    } else if (IsTestBranch()) {
+      return TestBranchType;
+    } else {
+      return UnknownBranchType;
+    }
+  }
+
+  // Find the target of this instruction. 'this' may be a branch or a
+  // PC-relative addressing instruction.
+  Instruction* ImmPCOffsetTarget();
+
+  // Patch a PC-relative offset to refer to 'target'. 'this' may be a branch or
+  // a PC-relative addressing instruction.
+  void SetImmPCOffsetTarget(Instruction* target);
+  // Patch a literal load instruction to load from 'source'.
+  void SetImmLLiteral(Instruction* source);
+
+  inline uint8_t* LiteralAddress() {
+    int offset = ImmLLiteral() << kLiteralEntrySizeLog2;
+    return reinterpret_cast<uint8_t*>(this) + offset;
+  }
+
+  inline uint32_t Literal32() {
+    uint32_t literal;
+    memcpy(&literal, LiteralAddress(), sizeof(literal));
+
+    return literal;
+  }
+
+  inline uint64_t Literal64() {
+    uint64_t literal;
+    memcpy(&literal, LiteralAddress(), sizeof(literal));
+
+    return literal;
+  }
+
+  inline float LiteralFP32() {
+    return rawbits_to_float(Literal32());
+  }
+
+  inline double LiteralFP64() {
+    return rawbits_to_double(Literal64());
+  }
+
+  inline Instruction* NextInstruction() {
+    return this + kInstructionSize;
+  }
+
+  inline Instruction* InstructionAtOffset(int64_t offset) {
+    ASSERT(IsWordAligned(this + offset));
+    return this + offset;
+  }
+
+  template<typename T> static inline Instruction* Cast(T src) {
+    return reinterpret_cast<Instruction*>(src);
+  }
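+
+  // Illustrative sketch (an assumption, not part of the original class): a
+  // code buffer can be walked one instruction at a time using Cast and
+  // NextInstruction, e.g.
+  //   Instruction* instr = Instruction::Cast(buffer_start);
+  //   while (instr < Instruction::Cast(buffer_end)) {
+  //     if (instr->IsLoadOrStore()) { /* inspect loads and stores */ }
+  //     instr = instr->NextInstruction();
+  //   }
+  // where buffer_start and buffer_end are assumed to be pointers into the
+  // generated code.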
+
+ private:
+  inline int ImmBranch() const;
+
+  void SetPCRelImmTarget(Instruction* target);
+  void SetBranchImmTarget(Instruction* target);
+};
+}  // namespace vixl
+
+#endif  // VIXL_A64_INSTRUCTIONS_A64_H_
diff --git a/src/a64/macro-assembler-a64.cc b/src/a64/macro-assembler-a64.cc
new file mode 100644
index 0000000..ba1ca52
--- /dev/null
+++ b/src/a64/macro-assembler-a64.cc
@@ -0,0 +1,1081 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "a64/macro-assembler-a64.h"
+namespace vixl {
+
+void MacroAssembler::And(const Register& rd,
+                         const Register& rn,
+                         const Operand& operand,
+                         FlagsUpdate S) {
+  ASSERT(allow_macro_instructions_);
+  LogicalMacro(rd, rn, operand, (S == SetFlags) ? ANDS : AND);
+}
+
+
+void MacroAssembler::Tst(const Register& rn,
+                         const Operand& operand) {
+  ASSERT(allow_macro_instructions_);
+  And(AppropriateZeroRegFor(rn), rn, operand, SetFlags);
+}
+
+
+void MacroAssembler::Bic(const Register& rd,
+                         const Register& rn,
+                         const Operand& operand,
+                         FlagsUpdate S) {
+  ASSERT(allow_macro_instructions_);
+  LogicalMacro(rd, rn, operand, (S == SetFlags) ? BICS : BIC);
+}
+
+
+void MacroAssembler::Orr(const Register& rd,
+                         const Register& rn,
+                         const Operand& operand) {
+  ASSERT(allow_macro_instructions_);
+  LogicalMacro(rd, rn, operand, ORR);
+}
+
+
+void MacroAssembler::Orn(const Register& rd,
+                         const Register& rn,
+                         const Operand& operand) {
+  ASSERT(allow_macro_instructions_);
+  LogicalMacro(rd, rn, operand, ORN);
+}
+
+
+void MacroAssembler::Eor(const Register& rd,
+                         const Register& rn,
+                         const Operand& operand) {
+  ASSERT(allow_macro_instructions_);
+  LogicalMacro(rd, rn, operand, EOR);
+}
+
+
+void MacroAssembler::Eon(const Register& rd,
+                         const Register& rn,
+                         const Operand& operand) {
+  ASSERT(allow_macro_instructions_);
+  LogicalMacro(rd, rn, operand, EON);
+}
+
+
+void MacroAssembler::LogicalMacro(const Register& rd,
+                                  const Register& rn,
+                                  const Operand& operand,
+                                  LogicalOp op) {
+  if (operand.IsImmediate()) {
+    int64_t immediate = operand.immediate();
+    unsigned reg_size = rd.size();
+    ASSERT(rd.Is64Bits() || is_uint32(immediate));
+
+    // If the operation is NOT, invert the operation and immediate.
+    if ((op & NOT) == NOT) {
+      op = static_cast<LogicalOp>(op & ~NOT);
+      immediate = ~immediate;
+      if (rd.Is32Bits()) {
+        immediate &= kWRegMask;
+      }
+    }
+
+    // Special cases for all set or all clear immediates.
+    if (immediate == 0) {
+      switch (op) {
+        case AND:
+          Mov(rd, 0);
+          return;
+        case ORR:  // Fall through.
+        case EOR:
+          Mov(rd, rn);
+          return;
+        case ANDS:  // Fall through.
+        case BICS:
+          break;
+        default:
+          UNREACHABLE();
+      }
+    } else if ((rd.Is64Bits() && (immediate == -1L)) ||
+               (rd.Is32Bits() && (immediate == 0xffffffffL))) {
+      switch (op) {
+        case AND:
+          Mov(rd, rn);
+          return;
+        case ORR:
+          Mov(rd, immediate);
+          return;
+        case EOR:
+          Mvn(rd, rn);
+          return;
+        case ANDS:  // Fall through.
+        case BICS:
+          break;
+        default:
+          UNREACHABLE();
+      }
+    }
+
+    unsigned n, imm_s, imm_r;
+    if (IsImmLogical(immediate, reg_size, &n, &imm_s, &imm_r)) {
+      // Immediate can be encoded in the instruction.
+      LogicalImmediate(rd, rn, n, imm_s, imm_r, op);
+    } else {
+      // Immediate can't be encoded: synthesize using move immediate.
+      Register temp = AppropriateTempFor(rn);
+      Mov(temp, immediate);
+      if (rd.Is(sp)) {
+        // If rd is the stack pointer we cannot use it as the destination
+        // register so we use the temp register as an intermediate again.
+        Logical(temp, rn, Operand(temp), op);
+        Mov(sp, temp);
+      } else {
+        Logical(rd, rn, Operand(temp), op);
+      }
+    }
+  } else if (operand.IsExtendedRegister()) {
+    ASSERT(operand.reg().size() <= rd.size());
+    // Add/sub extended supports shift <= 4. We want to support exactly the
+    // same modes here.
+    ASSERT(operand.shift_amount() <= 4);
+    ASSERT(operand.reg().Is64Bits() ||
+           ((operand.extend() != UXTX) && (operand.extend() != SXTX)));
+    Register temp = AppropriateTempFor(rn, operand.reg());
+    EmitExtendShift(temp, operand.reg(), operand.extend(),
+                    operand.shift_amount());
+    Logical(rd, rn, Operand(temp), op);
+  } else {
+    // The operand can be encoded in the instruction.
+    ASSERT(operand.IsShiftedRegister());
+    Logical(rd, rn, operand, op);
+  }
+}
+
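+// Illustrative sketch (an assumption, not library code): the all-clear and
+// all-set special cases above mean that, for example,
+//   masm.And(x0, x1, 0);                   // reduces to  mov x0, #0
+//   masm.Orr(x0, x1, 0xffffffffffffffff);  // reduces to a single move of -1
+//   masm.Eor(x0, x1, 0xffffffffffffffff);  // reduces to  mvn x0, x1
+// where 'masm' is an assumed MacroAssembler instance.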
+
+void MacroAssembler::Mov(const Register& rd, const Operand& operand) {
+  ASSERT(allow_macro_instructions_);
+  if (operand.IsImmediate()) {
+    // Call the macro assembler for generic immediates.
+    Mov(rd, operand.immediate());
+  } else if (operand.IsShiftedRegister() && (operand.shift_amount() != 0)) {
+    // Emit a shift instruction if moving a shifted register. This operation
+    // could also be achieved using an orr instruction (like orn used by Mvn),
+    // but using a shift instruction makes the disassembly clearer.
+    EmitShift(rd, operand.reg(), operand.shift(), operand.shift_amount());
+  } else if (operand.IsExtendedRegister()) {
+    // Emit an extend instruction if moving an extended register. This handles
+    // extend with post-shift operations, too.
+    EmitExtendShift(rd, operand.reg(), operand.extend(),
+                    operand.shift_amount());
+  } else {
+    // Otherwise, emit a register move only if the registers are distinct, or
+    // if they are not X registers. Note that mov(w0, w0) is not a no-op
+    // because it clears the top word of x0.
+    // If sp is an operand, an add #0 is emitted; otherwise, an orr #0 is used.
+    if (!rd.Is(operand.reg()) || !rd.Is64Bits()) {
+      mov(rd, operand.reg());
+    }
+  }
+}
+
+
+void MacroAssembler::Mvn(const Register& rd, const Operand& operand) {
+  ASSERT(allow_macro_instructions_);
+  if (operand.IsImmediate()) {
+    // Call the macro assembler for generic immediates.
+    Mvn(rd, operand.immediate());
+  } else if (operand.IsExtendedRegister()) {
+    // Emit two instructions for the extend case. This differs from Mov, as
+    // the extend and invert can't be achieved in one instruction.
+    Register temp = AppropriateTempFor(rd, operand.reg());
+    EmitExtendShift(temp, operand.reg(), operand.extend(),
+                    operand.shift_amount());
+    mvn(rd, Operand(temp));
+  } else {
+    // Otherwise, register and shifted register cases can be handled by the
+    // assembler directly, using orn.
+    mvn(rd, operand);
+  }
+}
+
+
+void MacroAssembler::Mov(const Register& rd, uint64_t imm) {
+  ASSERT(allow_macro_instructions_);
+  ASSERT(is_uint32(imm) || is_int32(imm) || rd.Is64Bits());
+
+  // Immediates on AArch64 can be produced using an initial value, and zero to
+  // three move-keep operations.
+  //
+  // Initial values can be generated with:
+  //  1. 64-bit move zero (movz).
+  //  2. 32-bit move negative (movn).
+  //  3. 64-bit move negative.
+  //  4. 32-bit orr immediate.
+  //  5. 64-bit orr immediate.
+  // Move-keep may then be used to modify each of the 16-bit half-words.
+  //
+  // The code below supports all five initial value generators, and
+  // applying move-keep operations to move-zero initial values only. A worked
+  // example is sketched after this function.
+
+  unsigned reg_size = rd.size();
+  unsigned n, imm_s, imm_r;
+  if (IsImmMovz(imm, reg_size) && !rd.IsSP()) {
+    // Immediate can be represented in a move zero instruction.
+    movz(rd, imm);
+  } else if (IsImmMovn(imm, reg_size) && !rd.IsSP()) {
+    // Immediate can be represented in a move negative instruction. Movn can't
+    // write to the stack pointer.
+    movn(rd, rd.Is64Bits() ? ~imm : (~imm & kWRegMask));
+  } else if (IsImmLogical(imm, reg_size, &n, &imm_s, &imm_r)) {
+    // Immediate can be represented in a logical orr instruction.
+    ASSERT(!rd.IsZero());
+    LogicalImmediate(rd, AppropriateZeroRegFor(rd), n, imm_s, imm_r, ORR);
+  } else {
+    // Generic immediate case. Imm will be represented by
+    //   [imm3, imm2, imm1, imm0], where each imm is 16 bits.
+    // A move-zero is generated for the first non-zero immX, and a move-keep
+    // for subsequent non-zero immX.
+
+    // Use a temporary register when moving to the stack pointer.
+    Register temp = rd.IsSP() ? AppropriateTempFor(rd) : rd;
+
+    ASSERT((reg_size % 16) == 0);
+    bool first_mov_done = false;
+    for (unsigned i = 0; i < (temp.size() / 16); i++) {
+      uint64_t imm16 = (imm >> (16 * i)) & 0xffffL;
+      if (imm16 != 0) {
+        if (!first_mov_done) {
+          // Move the first non-zero 16-bit chunk into the destination register.
+          movz(temp, imm16, 16 * i);
+          first_mov_done = true;
+        } else {
+          // Construct a wider constant.
+          movk(temp, imm16, 16 * i);
+        }
+      }
+    }
+
+    if (rd.IsSP()) {
+      mov(rd, temp);
+    }
+
+    ASSERT(first_mov_done);
+  }
+}
+
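+// Illustrative sketch (an assumption, not library code): for a constant that
+// has no movz/movn/orr encoding, the generic case above emits one movz for the
+// first non-zero 16-bit half-word and a movk for each subsequent one, e.g.
+//   masm.Mov(x0, 0x1234000056780000);
+// expands to roughly:
+//   movz x0, #0x5678, lsl #16
+//   movk x0, #0x1234, lsl #48
+// where 'masm' is an assumed MacroAssembler instance.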
+
+// The movz instruction can generate immediates containing an arbitrary 16-bit
+// value, with the remaining bits clear, e.g. 0x00001234, 0x0000123400000000.
+bool MacroAssembler::IsImmMovz(uint64_t imm, unsigned reg_size) {
+  if (reg_size == kXRegSize) {
+    if (((imm & 0xffffffffffff0000UL) == 0UL) ||
+        ((imm & 0xffffffff0000ffffUL) == 0UL) ||
+        ((imm & 0xffff0000ffffffffUL) == 0UL) ||
+        ((imm & 0x0000ffffffffffffUL) == 0UL)) {
+      return true;
+    }
+  } else {
+    ASSERT(reg_size == kWRegSize);
+    imm &= kWRegMask;
+    if (((imm & 0xffff0000) == 0) ||
+        ((imm & 0x0000ffff) == 0)) {
+      return true;
+    }
+  }
+  return false;
+}
+
+
+// The movn instruction can generate immediates containing an arbitrary 16-bit
+// value, with the remaining bits set, e.g. 0xffff1234, 0xffff1234ffffffff.
+bool MacroAssembler::IsImmMovn(uint64_t imm, unsigned reg_size) {
+  return IsImmMovz(~imm, reg_size);
+}
+
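+// For example (illustrative, an assumption): IsImmMovz(0x0000123400000000,
+// kXRegSize) is true because only one 16-bit half-word is non-zero, while
+// IsImmMovz(0x0000123400005678, kXRegSize) is false because two are. Since
+// IsImmMovn simply tests the inverted value, IsImmMovn(0xffffffffffff1234,
+// kXRegSize) is true.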
+
+void MacroAssembler::Ccmp(const Register& rn,
+                          const Operand& operand,
+                          StatusFlags nzcv,
+                          Condition cond) {
+  ASSERT(allow_macro_instructions_);
+  ConditionalCompareMacro(rn, operand, nzcv, cond, CCMP);
+}
+
+
+void MacroAssembler::Ccmn(const Register& rn,
+                          const Operand& operand,
+                          StatusFlags nzcv,
+                          Condition cond) {
+  ASSERT(allow_macro_instructions_);
+  ConditionalCompareMacro(rn, operand, nzcv, cond, CCMN);
+}
+
+
+void MacroAssembler::ConditionalCompareMacro(const Register& rn,
+                                             const Operand& operand,
+                                             StatusFlags nzcv,
+                                             Condition cond,
+                                             ConditionalCompareOp op) {
+  if ((operand.IsShiftedRegister() && (operand.shift_amount() == 0)) ||
+      (operand.IsImmediate() && IsImmConditionalCompare(operand.immediate()))) {
+    // The immediate can be encoded in the instruction, or the operand is an
+    // unshifted register: call the assembler.
+    ConditionalCompare(rn, operand, nzcv, cond, op);
+  } else {
+    // The operand isn't directly supported by the instruction: perform the
+    // operation on a temporary register.
+    Register temp(NoReg);
+    if (operand.IsImmediate()) {
+      temp = AppropriateTempFor(rn);
+      Mov(temp, operand.immediate());
+    } else if (operand.IsShiftedRegister()) {
+      ASSERT(operand.shift() != ROR);
+      ASSERT(is_uintn(rn.size() == kXRegSize ? kXRegSizeLog2 : kWRegSizeLog2,
+                      operand.shift_amount()));
+      temp = AppropriateTempFor(rn, operand.reg());
+      EmitShift(temp, operand.reg(), operand.shift(), operand.shift_amount());
+    } else {
+      ASSERT(operand.IsExtendedRegister());
+      ASSERT(operand.reg().size() <= rn.size());
+      // Add/sub extended supports a shift <= 4. We want to support exactly the
+      // same modes.
+      ASSERT(operand.shift_amount() <= 4);
+      ASSERT(operand.reg().Is64Bits() ||
+             ((operand.extend() != UXTX) && (operand.extend() != SXTX)));
+      temp = AppropriateTempFor(rn, operand.reg());
+      EmitExtendShift(temp, operand.reg(), operand.extend(),
+                    operand.shift_amount());
+    }
+    ConditionalCompare(rn, Operand(temp), nzcv, cond, op);
+  }
+}
+
+
+void MacroAssembler::Add(const Register& rd,
+                         const Register& rn,
+                         const Operand& operand,
+                         FlagsUpdate S) {
+  ASSERT(allow_macro_instructions_);
+  if (operand.IsImmediate() && (operand.immediate() < 0)) {
+    AddSubMacro(rd, rn, -operand.immediate(), S, SUB);
+  } else {
+    AddSubMacro(rd, rn, operand, S, ADD);
+  }
+}
+
+
+void MacroAssembler::Sub(const Register& rd,
+                         const Register& rn,
+                         const Operand& operand,
+                         FlagsUpdate S) {
+  ASSERT(allow_macro_instructions_);
+  if (operand.IsImmediate() && (operand.immediate() < 0)) {
+    AddSubMacro(rd, rn, -operand.immediate(), S, ADD);
+  } else {
+    AddSubMacro(rd, rn, operand, S, SUB);
+  }
+}
+
+
+void MacroAssembler::Cmn(const Register& rn, const Operand& operand) {
+  ASSERT(allow_macro_instructions_);
+  Add(AppropriateZeroRegFor(rn), rn, operand, SetFlags);
+}
+
+
+void MacroAssembler::Cmp(const Register& rn, const Operand& operand) {
+  ASSERT(allow_macro_instructions_);
+  Sub(AppropriateZeroRegFor(rn), rn, operand, SetFlags);
+}
+
+
+void MacroAssembler::Neg(const Register& rd,
+                         const Operand& operand,
+                         FlagsUpdate S) {
+  ASSERT(allow_macro_instructions_);
+  if (operand.IsImmediate()) {
+    Mov(rd, -operand.immediate());
+  } else {
+    Sub(rd, AppropriateZeroRegFor(rd), operand, S);
+  }
+}
+
+
+void MacroAssembler::AddSubMacro(const Register& rd,
+                                 const Register& rn,
+                                 const Operand& operand,
+                                 FlagsUpdate S,
+                                 AddSubOp op) {
+  if ((operand.IsImmediate() && !IsImmAddSub(operand.immediate())) ||
+      (rn.IsZero() && !operand.IsShiftedRegister())                ||
+      (operand.IsShiftedRegister() && (operand.shift() == ROR))) {
+    Register temp = AppropriateTempFor(rn);
+    Mov(temp, operand);
+    AddSub(rd, rn, temp, S, op);
+  } else {
+    AddSub(rd, rn, operand, S, op);
+  }
+}
+
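+// Illustrative sketch (an assumption, not library code): an add/sub immediate
+// that cannot be encoded is first materialised in a temporary, e.g.
+//   masm.Add(x0, x1, 0x123456);
+// expands to roughly:
+//   mov ip0, #0x123456    // synthesised with movz/movk
+//   add x0, x1, ip0
+// where 'masm' is an assumed MacroAssembler instance and ip0 is the macro
+// assembler's first scratch register.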
+
+void MacroAssembler::Adc(const Register& rd,
+                         const Register& rn,
+                         const Operand& operand,
+                         FlagsUpdate S) {
+  ASSERT(allow_macro_instructions_);
+  AddSubWithCarryMacro(rd, rn, operand, S, ADC);
+}
+
+
+void MacroAssembler::Sbc(const Register& rd,
+                         const Register& rn,
+                         const Operand& operand,
+                         FlagsUpdate S) {
+  ASSERT(allow_macro_instructions_);
+  AddSubWithCarryMacro(rd, rn, operand, S, SBC);
+}
+
+
+void MacroAssembler::Ngc(const Register& rd,
+                         const Operand& operand,
+                         FlagsUpdate S) {
+  ASSERT(allow_macro_instructions_);
+  Register zr = AppropriateZeroRegFor(rd);
+  Sbc(rd, zr, operand, S);
+}
+
+
+void MacroAssembler::AddSubWithCarryMacro(const Register& rd,
+                                          const Register& rn,
+                                          const Operand& operand,
+                                          FlagsUpdate S,
+                                          AddSubWithCarryOp op) {
+  ASSERT(rd.size() == rn.size());
+
+  if (operand.IsImmediate() ||
+      (operand.IsShiftedRegister() && (operand.shift() == ROR))) {
+    // Add/sub with carry (immediate or ROR shifted register).
+    Register temp = AppropriateTempFor(rn);
+    Mov(temp, operand);
+    AddSubWithCarry(rd, rn, Operand(temp), S, op);
+  } else if (operand.IsShiftedRegister() && (operand.shift_amount() != 0)) {
+    // Add/sub with carry (shifted register).
+    ASSERT(operand.reg().size() == rd.size());
+    ASSERT(operand.shift() != ROR);
+    ASSERT(is_uintn(rd.size() == kXRegSize ? kXRegSizeLog2 : kWRegSizeLog2,
+                    operand.shift_amount()));
+    Register temp = AppropriateTempFor(rn, operand.reg());
+    EmitShift(temp, operand.reg(), operand.shift(), operand.shift_amount());
+    AddSubWithCarry(rd, rn, Operand(temp), S, op);
+  } else if (operand.IsExtendedRegister()) {
+    // Add/sub with carry (extended register).
+    ASSERT(operand.reg().size() <= rd.size());
+    // Add/sub extended supports a shift <= 4. We want to support exactly the
+    // same modes.
+    ASSERT(operand.shift_amount() <= 4);
+    ASSERT(operand.reg().Is64Bits() ||
+           ((operand.extend() != UXTX) && (operand.extend() != SXTX)));
+    Register temp = AppropriateTempFor(rn, operand.reg());
+    EmitExtendShift(temp, operand.reg(), operand.extend(),
+                    operand.shift_amount());
+    AddSubWithCarry(rd, rn, Operand(temp), S, op);
+  } else {
+    // The addressing mode is directly supported by the instruction.
+    AddSubWithCarry(rd, rn, operand, S, op);
+  }
+}
+
+
+#define DEFINE_FUNCTION(FN, REGTYPE, REG, OP)                         \
+void MacroAssembler::FN(const REGTYPE REG, const MemOperand& addr) {  \
+  LoadStoreMacro(REG, addr, OP);                                      \
+}
+LS_MACRO_LIST(DEFINE_FUNCTION)
+#undef DEFINE_FUNCTION
+
+void MacroAssembler::LoadStoreMacro(const CPURegister& rt,
+                                    const MemOperand& addr,
+                                    LoadStoreOp op) {
+  int64_t offset = addr.offset();
+  LSDataSize size = CalcLSDataSize(op);
+
+  // Check if an immediate offset fits in the immediate field of the
+  // appropriate instruction. If not, emit two instructions to perform
+  // the operation.
+  if (addr.IsImmediateOffset() && !IsImmLSScaled(offset, size) &&
+      !IsImmLSUnscaled(offset)) {
+    // Immediate offset that can't be encoded using unsigned or unscaled
+    // addressing modes.
+    Register temp = AppropriateTempFor(addr.base());
+    Mov(temp, addr.offset());
+    LoadStore(rt, MemOperand(addr.base(), temp), op);
+  } else if (addr.IsPostIndex() && !IsImmLSUnscaled(offset)) {
+    // Post-index beyond unscaled addressing range.
+    LoadStore(rt, MemOperand(addr.base()), op);
+    Add(addr.base(), addr.base(), Operand(offset));
+  } else if (addr.IsPreIndex() && !IsImmLSUnscaled(offset)) {
+    // Pre-index beyond unscaled addressing range.
+    Add(addr.base(), addr.base(), Operand(offset));
+    LoadStore(rt, MemOperand(addr.base()), op);
+  } else {
+    // Encodable in one load/store instruction.
+    LoadStore(rt, addr, op);
+  }
+}
+
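+// Illustrative sketch (an assumption, not library code): a load with an offset
+// that fits neither the scaled nor the unscaled immediate form is split, e.g.
+//   masm.Ldr(x0, MemOperand(x1, 0x123456));
+// expands to roughly:
+//   mov ip0, #0x123456    // synthesised with movz/movk
+//   ldr x0, [x1, ip0]
+// where 'masm' is an assumed MacroAssembler instance and ip0 is the macro
+// assembler's first scratch register.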
+
+void MacroAssembler::Push(const CPURegister& src0, const CPURegister& src1,
+                          const CPURegister& src2, const CPURegister& src3) {
+  ASSERT(allow_macro_instructions_);
+  ASSERT(AreSameSizeAndType(src0, src1, src2, src3));
+  ASSERT(src0.IsValid());
+
+  int count = 1 + src1.IsValid() + src2.IsValid() + src3.IsValid();
+  int size = src0.SizeInBytes();
+
+  PrepareForPush(count, size);
+  PushHelper(count, size, src0, src1, src2, src3);
+}
+
+
+void MacroAssembler::Pop(const CPURegister& dst0, const CPURegister& dst1,
+                         const CPURegister& dst2, const CPURegister& dst3) {
+  // It is not valid to pop into the same register more than once in one
+  // instruction, not even into the zero register.
+  ASSERT(allow_macro_instructions_);
+  ASSERT(!AreAliased(dst0, dst1, dst2, dst3));
+  ASSERT(AreSameSizeAndType(dst0, dst1, dst2, dst3));
+  ASSERT(dst0.IsValid());
+
+  int count = 1 + dst1.IsValid() + dst2.IsValid() + dst3.IsValid();
+  int size = dst0.SizeInBytes();
+
+  PrepareForPop(count, size);
+  PopHelper(count, size, dst0, dst1, dst2, dst3);
+}
+
+
+void MacroAssembler::PushCPURegList(CPURegList registers) {
+  int size = registers.RegisterSizeInBytes();
+
+  PrepareForPush(registers.Count(), size);
+  // Push up to four registers at a time because if the current stack pointer
+  // is sp and the register size is 32 bits, registers must be pushed in blocks
+  // of four in order to maintain the 16-byte alignment for sp.
+  ASSERT(allow_macro_instructions_);
+  while (!registers.IsEmpty()) {
+    int count_before = registers.Count();
+    const CPURegister& src0 = registers.PopHighestIndex();
+    const CPURegister& src1 = registers.PopHighestIndex();
+    const CPURegister& src2 = registers.PopHighestIndex();
+    const CPURegister& src3 = registers.PopHighestIndex();
+    int count = count_before - registers.Count();
+    PushHelper(count, size, src0, src1, src2, src3);
+  }
+}
+
+
+void MacroAssembler::PopCPURegList(CPURegList registers) {
+  int size = registers.RegisterSizeInBytes();
+
+  PrepareForPop(registers.Count(), size);
+  // Pop up to four registers at a time because if the current stack pointer
+  // is sp and the register size is 32 bits, registers must be popped in blocks
+  // of four in order to maintain the 16-byte alignment for sp.
+  ASSERT(allow_macro_instructions_);
+  while (!registers.IsEmpty()) {
+    int count_before = registers.Count();
+    const CPURegister& dst0 = registers.PopLowestIndex();
+    const CPURegister& dst1 = registers.PopLowestIndex();
+    const CPURegister& dst2 = registers.PopLowestIndex();
+    const CPURegister& dst3 = registers.PopLowestIndex();
+    int count = count_before - registers.Count();
+    PopHelper(count, size, dst0, dst1, dst2, dst3);
+  }
+}
+
+
+void MacroAssembler::PushMultipleTimes(int count, Register src) {
+  ASSERT(allow_macro_instructions_);
+  int size = src.SizeInBytes();
+
+  PrepareForPush(count, size);
+  // Push up to four registers at a time if possible because if the current
+  // stack pointer is sp and the register size is 32, registers must be pushed
+  // in blocks of four in order to maintain the 16-byte alignment for sp.
+  while (count >= 4) {
+    PushHelper(4, size, src, src, src, src);
+    count -= 4;
+  }
+  if (count >= 2) {
+    PushHelper(2, size, src, src, NoReg, NoReg);
+    count -= 2;
+  }
+  if (count == 1) {
+    PushHelper(1, size, src, NoReg, NoReg, NoReg);
+    count -= 1;
+  }
+  ASSERT(count == 0);
+}
+
+
+void MacroAssembler::PushHelper(int count, int size,
+                                const CPURegister& src0,
+                                const CPURegister& src1,
+                                const CPURegister& src2,
+                                const CPURegister& src3) {
+  // Ensure that we don't unintentionally modify scratch or debug registers.
+  InstructionAccurateScope scope(this);
+
+  ASSERT(AreSameSizeAndType(src0, src1, src2, src3));
+  ASSERT(size == src0.SizeInBytes());
+
+  // When pushing multiple registers, the store order is chosen such that
+  // Push(a, b) is equivalent to Push(a) followed by Push(b).
+  switch (count) {
+    case 1:
+      ASSERT(src1.IsNone() && src2.IsNone() && src3.IsNone());
+      str(src0, MemOperand(StackPointer(), -1 * size, PreIndex));
+      break;
+    case 2:
+      ASSERT(src2.IsNone() && src3.IsNone());
+      stp(src1, src0, MemOperand(StackPointer(), -2 * size, PreIndex));
+      break;
+    case 3:
+      ASSERT(src3.IsNone());
+      stp(src2, src1, MemOperand(StackPointer(), -3 * size, PreIndex));
+      str(src0, MemOperand(StackPointer(), 2 * size));
+      break;
+    case 4:
+      // Skip over 4 * size, then fill in the gap. This allows four W registers
+      // to be pushed using sp, whilst maintaining 16-byte alignment for sp at
+      // all times.
+      stp(src3, src2, MemOperand(StackPointer(), -4 * size, PreIndex));
+      stp(src1, src0, MemOperand(StackPointer(), 2 * size));
+      break;
+    default:
+      UNREACHABLE();
+  }
+}
+
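+// Illustrative layout (an assumption): after PushHelper(4, 8, a, b, c, d) with
+// sp as the stack pointer, the stack holds, from the new sp upwards, d, c, b
+// and then a, so 'a' ends up at the highest address, exactly as if
+// Push(a); Push(b); Push(c); Push(d); had been used.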
+
+void MacroAssembler::PopHelper(int count, int size,
+                               const CPURegister& dst0,
+                               const CPURegister& dst1,
+                               const CPURegister& dst2,
+                               const CPURegister& dst3) {
+  // Ensure that we don't unintentionally modify scratch or debug registers.
+  InstructionAccurateScope scope(this);
+
+  ASSERT(AreSameSizeAndType(dst0, dst1, dst2, dst3));
+  ASSERT(size == dst0.SizeInBytes());
+
+  // When popping multiple registers, the load order is chosen such that
+  // Pop(a, b) is equivalent to Pop(a) followed by Pop(b).
+  switch (count) {
+    case 1:
+      ASSERT(dst1.IsNone() && dst2.IsNone() && dst3.IsNone());
+      ldr(dst0, MemOperand(StackPointer(), 1 * size, PostIndex));
+      break;
+    case 2:
+      ASSERT(dst2.IsNone() && dst3.IsNone());
+      ldp(dst0, dst1, MemOperand(StackPointer(), 2 * size, PostIndex));
+      break;
+    case 3:
+      ASSERT(dst3.IsNone());
+      ldr(dst2, MemOperand(StackPointer(), 2 * size));
+      ldp(dst0, dst1, MemOperand(StackPointer(), 3 * size, PostIndex));
+      break;
+    case 4:
+      // Load the higher addresses first, then load the lower addresses and skip
+      // the whole block in the second instruction. This allows four W registers
+      // to be popped using sp, whilst maintaining 16-byte alignment for sp at
+      // all times.
+      ldp(dst2, dst3, MemOperand(StackPointer(), 2 * size));
+      ldp(dst0, dst1, MemOperand(StackPointer(), 4 * size, PostIndex));
+      break;
+    default:
+      UNREACHABLE();
+  }
+}
+
+
+void MacroAssembler::PrepareForPush(int count, int size) {
+  if (sp.Is(StackPointer())) {
+    // If the current stack pointer is sp, then it must be aligned to 16 bytes
+    // on entry and the total size of the specified registers must also be a
+    // multiple of 16 bytes.
+    ASSERT((count * size) % 16 == 0);
+  } else {
+    // Even if the current stack pointer is not the system stack pointer (sp),
+    // the system stack pointer will still be modified in order to comply with
+    // ABI rules about accessing memory below the system stack pointer.
+    BumpSystemStackPointer(count * size);
+  }
+}
+
+
+void MacroAssembler::PrepareForPop(int count, int size) {
+  USE(count);
+  USE(size);
+  if (sp.Is(StackPointer())) {
+    // If the current stack pointer is sp, then it must be aligned to 16 bytes
+    // on entry and the total size of the specified registers must also be a
+    // multiple of 16 bytes.
+    ASSERT((count * size) % 16 == 0);
+  }
+}
+
+void MacroAssembler::Poke(const Register& src, const Operand& offset) {
+  ASSERT(allow_macro_instructions_);
+  if (offset.IsImmediate()) {
+    ASSERT(offset.immediate() >= 0);
+  }
+
+  Str(src, MemOperand(StackPointer(), offset));
+}
+
+
+void MacroAssembler::Peek(const Register& dst, const Operand& offset) {
+  ASSERT(allow_macro_instructions_);
+  if (offset.IsImmediate()) {
+    ASSERT(offset.immediate() >= 0);
+  }
+
+  Ldr(dst, MemOperand(StackPointer(), offset));
+}
+
+
+void MacroAssembler::Claim(const Operand& size) {
+  ASSERT(allow_macro_instructions_);
+  if (size.IsImmediate()) {
+    ASSERT(size.immediate() >= 0);
+    if (sp.Is(StackPointer())) {
+      ASSERT((size.immediate() % 16) == 0);
+    }
+  }
+
+  if (!sp.Is(StackPointer())) {
+    BumpSystemStackPointer(size);
+  }
+
+  Sub(StackPointer(), StackPointer(), size);
+}
+
+
+void MacroAssembler::Drop(const Operand& size) {
+  ASSERT(allow_macro_instructions_);
+  if (size.IsImmediate()) {
+    ASSERT(size.immediate() >= 0);
+    if (sp.Is(StackPointer())) {
+      ASSERT((size.immediate() % 16) == 0);
+    }
+  }
+
+  Add(StackPointer(), StackPointer(), size);
+}
+
+
+void MacroAssembler::PushCalleeSavedRegisters() {
+  // Ensure that the macro-assembler doesn't use any scratch registers.
+  InstructionAccurateScope scope(this);
+
+  // This method must not be called unless the current stack pointer is sp.
+  ASSERT(sp.Is(StackPointer()));
+
+  MemOperand tos(sp, -2 * kXRegSizeInBytes, PreIndex);
+
+  stp(d14, d15, tos);
+  stp(d12, d13, tos);
+  stp(d10, d11, tos);
+  stp(d8, d9, tos);
+
+  stp(x29, x30, tos);
+  stp(x27, x28, tos);
+  stp(x25, x26, tos);
+  stp(x23, x24, tos);
+  stp(x21, x22, tos);
+  stp(x19, x20, tos);
+}
+
+
+void MacroAssembler::PopCalleeSavedRegisters() {
+  // Ensure that the macro-assembler doesn't use any scratch registers.
+  InstructionAccurateScope scope(this);
+
+  // This method must not be called unless the current stack pointer is sp.
+  ASSERT(sp.Is(StackPointer()));
+
+  MemOperand tos(sp, 2 * kXRegSizeInBytes, PostIndex);
+
+  ldp(x19, x20, tos);
+  ldp(x21, x22, tos);
+  ldp(x23, x24, tos);
+  ldp(x25, x26, tos);
+  ldp(x27, x28, tos);
+  ldp(x29, x30, tos);
+
+  ldp(d8, d9, tos);
+  ldp(d10, d11, tos);
+  ldp(d12, d13, tos);
+  ldp(d14, d15, tos);
+}
+
+void MacroAssembler::BumpSystemStackPointer(const Operand& space) {
+  ASSERT(!sp.Is(StackPointer()));
+  // TODO: Several callers rely on this not using scratch registers, so we use
+  // the assembler directly here. However, this means that large immediate
+  // values of 'space' cannot be handled.
+  InstructionAccurateScope scope(this);
+  sub(sp, StackPointer(), space);
+}
+
+
+// This is the main Printf implementation. All callee-saved registers are
+// preserved, but NZCV and the caller-saved registers may be clobbered.
+void MacroAssembler::PrintfNoPreserve(const char * format,
+                                      const CPURegister& arg0,
+                                      const CPURegister& arg1,
+                                      const CPURegister& arg2,
+                                      const CPURegister& arg3) {
+  // We cannot handle a caller-saved stack pointer. It doesn't make much sense
+  // in most cases anyway, so this restriction shouldn't be too serious.
+  ASSERT(!kCallerSaved.IncludesAliasOf(StackPointer()));
+
+  // We cannot print Tmp0() or Tmp1() as they're used internally by the macro
+  // assembler. We cannot print the stack pointer because it is typically used
+  // to preserve caller-saved registers (using other Printf variants which
+  // depend on this helper).
+  ASSERT(!AreAliased(Tmp0(), Tmp1(), StackPointer(), arg0));
+  ASSERT(!AreAliased(Tmp0(), Tmp1(), StackPointer(), arg1));
+  ASSERT(!AreAliased(Tmp0(), Tmp1(), StackPointer(), arg2));
+  ASSERT(!AreAliased(Tmp0(), Tmp1(), StackPointer(), arg3));
+
+  static const int kMaxArgCount = 4;
+  // Assume that we have the maximum number of arguments until we know
+  // otherwise.
+  int arg_count = kMaxArgCount;
+
+  // The provided arguments.
+  CPURegister args[kMaxArgCount] = {arg0, arg1, arg2, arg3};
+
+  // The PCS registers where the arguments need to end up.
+  CPURegister pcs[kMaxArgCount];
+
+  // Promote FP arguments to doubles, and integer arguments to X registers.
+  // Note that FP and integer arguments cannot be mixed, but we'll check
+  // AreSameSizeAndType once we've processed these promotions.
+  for (int i = 0; i < kMaxArgCount; i++) {
+    if (args[i].IsRegister()) {
+      // Note that we use x1 onwards, because x0 will hold the format string.
+      pcs[i] = Register::XRegFromCode(i + 1);
+      // For simplicity, we handle all integer arguments as X registers. An X
+      // register argument takes the same space as a W register argument in the
+      // PCS anyway. The only limitation is that we must explicitly clear the
+      // top word for W register arguments as the callee will expect it to be
+      // clear.
+      if (!args[i].Is64Bits()) {
+        const Register& as_x = args[i].X();
+        And(as_x, as_x, 0x00000000ffffffff);
+        args[i] = as_x;
+      }
+    } else if (args[i].IsFPRegister()) {
+      pcs[i] = FPRegister::DRegFromCode(i);
+      // C and C++ varargs functions (such as printf) implicitly promote float
+      // arguments to doubles.
+      if (!args[i].Is64Bits()) {
+        FPRegister s(args[i]);
+        const FPRegister& as_d = args[i].D();
+        Fcvt(as_d, s);
+        args[i] = as_d;
+      }
+    } else {
+      // This is the first empty (NoCPUReg) argument, so use it to set the
+      // argument count and bail out.
+      arg_count = i;
+      break;
+    }
+  }
+  ASSERT((arg_count >= 0) && (arg_count <= kMaxArgCount));
+  // Check that every remaining argument is NoCPUReg.
+  for (int i = arg_count; i < kMaxArgCount; i++) {
+    ASSERT(args[i].IsNone());
+  }
+  ASSERT((arg_count == 0) || AreSameSizeAndType(args[0], args[1],
+                                                args[2], args[3],
+                                                pcs[0], pcs[1],
+                                                pcs[2], pcs[3]));
+
+  // Move the arguments into the appropriate PCS registers.
+  //
+  // Arranging an arbitrary list of registers into x1-x4 (or d0-d3) is
+  // surprisingly complicated.
+  //
+  //  * For even numbers of registers, we push the arguments and then pop them
+  //    into their final registers. This maintains 16-byte stack alignment in
+  //    case sp is the stack pointer, since we're only handling X or D registers
+  //    at this point.
+  //
+  //  * For odd numbers of registers, we push and pop all but one register in
+  //    the same way, but the left-over register is moved directly, since we
+  //    can always safely move one register without clobbering any source.
+  if (arg_count >= 4) {
+    Push(args[3], args[2], args[1], args[0]);
+  } else if (arg_count >= 2) {
+    Push(args[1], args[0]);
+  }
+
+  if ((arg_count % 2) != 0) {
+    // Move the left-over register directly.
+    const CPURegister& leftover_arg = args[arg_count - 1];
+    const CPURegister& leftover_pcs = pcs[arg_count - 1];
+    if (leftover_arg.IsRegister()) {
+      Mov(Register(leftover_pcs), Register(leftover_arg));
+    } else {
+      Fmov(FPRegister(leftover_pcs), FPRegister(leftover_arg));
+    }
+  }
+
+  if (arg_count >= 4) {
+    Pop(pcs[0], pcs[1], pcs[2], pcs[3]);
+  } else if (arg_count >= 2) {
+    Pop(pcs[0], pcs[1]);
+  }
+
+  // Load the format string into x0, as per the procedure-call standard.
+  //
+  // To make the code as portable as possible, the format string is encoded
+  // directly in the instruction stream. It might be cleaner to encode it in a
+  // literal pool, but since Printf is usually used for debugging, it is
+  // beneficial for it to be minimally dependent on other features.
+  Label format_address;
+  Adr(x0, &format_address);
+
+  // Emit the format string directly in the instruction stream.
+  { BlockLiteralPoolScope scope(this);
+    Label after_data;
+    B(&after_data);
+    Bind(&format_address);
+    EmitStringData(format);
+    Unreachable();
+    Bind(&after_data);
+  }
+
+  // We don't pass any arguments on the stack, but we still need to align the C
+  // stack pointer to a 16-byte boundary for PCS compliance.
+  if (!sp.Is(StackPointer())) {
+    Bic(sp, StackPointer(), 0xf);
+  }
+
+  // Actually call printf. This part needs special handling for the simulator,
+  // since the system printf function will use a different instruction set and
+  // the procedure-call standard will not be compatible.
+#ifdef USE_SIMULATOR
+  { InstructionAccurateScope scope(this, kPrintfLength / kInstructionSize);
+    hlt(kPrintfOpcode);
+    dc32(pcs[0].type());
+  }
+#else
+  Mov(Tmp0(), reinterpret_cast<uintptr_t>(printf));
+  Blr(Tmp0());
+#endif
+}
+
+
+void MacroAssembler::Printf(const char * format,
+                            const CPURegister& arg0,
+                            const CPURegister& arg1,
+                            const CPURegister& arg2,
+                            const CPURegister& arg3) {
+  // Preserve all caller-saved registers as well as NZCV.
+  // If sp is the stack pointer, PushCPURegList asserts that the size of each
+  // list is a multiple of 16 bytes.
+  PushCPURegList(kCallerSaved);
+  PushCPURegList(kCallerSavedFP);
+  // Use Tmp0() as a scratch register. It is not accepted by Printf so it will
+  // never overlap an argument register.
+  Mrs(Tmp0(), NZCV);
+  Push(Tmp0(), xzr);
+
+  PrintfNoPreserve(format, arg0, arg1, arg2, arg3);
+
+  Pop(xzr, Tmp0());
+  Msr(NZCV, Tmp0());
+  PopCPURegList(kCallerSavedFP);
+  PopCPURegList(kCallerSaved);
+}
+
+void MacroAssembler::Trace(TraceParameters parameters, TraceCommand command) {
+  ASSERT(allow_macro_instructions_);
+
+#ifdef USE_SIMULATOR
+  // The arguments to the trace pseudo instruction need to be contiguous in
+  // memory, so make sure we don't try to emit a literal pool.
+  InstructionAccurateScope scope(this, kTraceLength / kInstructionSize);
+
+  Label start;
+  bind(&start);
+
+  // Refer to instructions-a64.h for a description of the marker and its
+  // arguments.
+  hlt(kTraceOpcode);
+
+  ASSERT(SizeOfCodeGeneratedSince(&start) == kTraceParamsOffset);
+  dc32(parameters);
+
+  ASSERT(SizeOfCodeGeneratedSince(&start) == kTraceCommandOffset);
+  dc32(command);
+#else
+  // Emit nothing on real hardware.
+  USE(parameters);
+  USE(command);
+#endif
+}
+
+
+void MacroAssembler::Log(TraceParameters parameters) {
+  ASSERT(allow_macro_instructions_);
+
+#ifdef USE_SIMULATOR
+  // The arguments to the log pseudo instruction need to be contiguous in
+  // memory, so make sure we don't try to emit a literal pool.
+  InstructionAccurateScope scope(this, kLogLength / kInstructionSize);
+
+  Label start;
+  bind(&start);
+
+  // Refer to instructions-a64.h for a description of the marker and its
+  // arguments.
+  hlt(kLogOpcode);
+
+  ASSERT(SizeOfCodeGeneratedSince(&start) == kLogParamsOffset);
+  dc32(parameters);
+#else
+  // Emit nothing on real hardware.
+  USE(parameters);
+#endif
+}
+
+}  // namespace vixl
diff --git a/src/a64/macro-assembler-a64.h b/src/a64/macro-assembler-a64.h
new file mode 100644
index 0000000..f4b13a0
--- /dev/null
+++ b/src/a64/macro-assembler-a64.h
@@ -0,0 +1,1155 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_A64_MACRO_ASSEMBLER_A64_H_
+#define VIXL_A64_MACRO_ASSEMBLER_A64_H_
+
+#include "globals.h"
+#include "a64/assembler-a64.h"
+#include "a64/debugger-a64.h"
+
+
+#define LS_MACRO_LIST(V)                                      \
+  V(Ldrb, Register&, rt, LDRB_w)                              \
+  V(Strb, Register&, rt, STRB_w)                              \
+  V(Ldrsb, Register&, rt, rt.Is64Bits() ? LDRSB_x : LDRSB_w)  \
+  V(Ldrh, Register&, rt, LDRH_w)                              \
+  V(Strh, Register&, rt, STRH_w)                              \
+  V(Ldrsh, Register&, rt, rt.Is64Bits() ? LDRSH_x : LDRSH_w)  \
+  V(Ldr, CPURegister&, rt, LoadOpFor(rt))                     \
+  V(Str, CPURegister&, rt, StoreOpFor(rt))                    \
+  V(Ldrsw, Register&, rt, LDRSW_x)
+
+namespace vixl {
+
+class MacroAssembler : public Assembler {
+ public:
+  MacroAssembler(byte * buffer, unsigned buffer_size)
+      : Assembler(buffer, buffer_size),
+#ifdef DEBUG
+        allow_macro_instructions_(true),
+#endif
+        sp_(sp), tmp0_(ip0), tmp1_(ip1), fptmp0_(d31) {}
+
+  // Logical macros.
+  void And(const Register& rd,
+           const Register& rn,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+  void Bic(const Register& rd,
+           const Register& rn,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+  void Orr(const Register& rd,
+           const Register& rn,
+           const Operand& operand);
+  void Orn(const Register& rd,
+           const Register& rn,
+           const Operand& operand);
+  void Eor(const Register& rd,
+           const Register& rn,
+           const Operand& operand);
+  void Eon(const Register& rd,
+           const Register& rn,
+           const Operand& operand);
+  void Tst(const Register& rn, const Operand& operand);
+  void LogicalMacro(const Register& rd,
+                    const Register& rn,
+                    const Operand& operand,
+                    LogicalOp op);
+
+  // Add and sub macros.
+  void Add(const Register& rd,
+           const Register& rn,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+  void Sub(const Register& rd,
+           const Register& rn,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+  void Cmn(const Register& rn, const Operand& operand);
+  void Cmp(const Register& rn, const Operand& operand);
+  void Neg(const Register& rd,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+  void AddSubMacro(const Register& rd,
+                   const Register& rn,
+                   const Operand& operand,
+                   FlagsUpdate S,
+                   AddSubOp op);
+
+  // Add/sub with carry macros.
+  void Adc(const Register& rd,
+           const Register& rn,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+  void Sbc(const Register& rd,
+           const Register& rn,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+  void Ngc(const Register& rd,
+           const Operand& operand,
+           FlagsUpdate S = LeaveFlags);
+  void AddSubWithCarryMacro(const Register& rd,
+                            const Register& rn,
+                            const Operand& operand,
+                            FlagsUpdate S,
+                            AddSubWithCarryOp op);
+
+  // Move macros.
+  void Mov(const Register& rd, uint64_t imm);
+  void Mov(const Register& rd, const Operand& operand);
+  void Mvn(const Register& rd, uint64_t imm) {
+    Mov(rd, ~imm);
+  }
+  void Mvn(const Register& rd, const Operand& operand);
+  bool IsImmMovn(uint64_t imm, unsigned reg_size);
+  bool IsImmMovz(uint64_t imm, unsigned reg_size);
+
+  // Conditional compare macros.
+  void Ccmp(const Register& rn,
+            const Operand& operand,
+            StatusFlags nzcv,
+            Condition cond);
+  void Ccmn(const Register& rn,
+            const Operand& operand,
+            StatusFlags nzcv,
+            Condition cond);
+  void ConditionalCompareMacro(const Register& rn,
+                               const Operand& operand,
+                               StatusFlags nzcv,
+                               Condition cond,
+                               ConditionalCompareOp op);
+
+  // Load/store macros.
+#define DECLARE_FUNCTION(FN, REGTYPE, REG, OP) \
+  void FN(const REGTYPE REG, const MemOperand& addr);
+  LS_MACRO_LIST(DECLARE_FUNCTION)
+#undef DECLARE_FUNCTION
+
+  void LoadStoreMacro(const CPURegister& rt,
+                      const MemOperand& addr,
+                      LoadStoreOp op);
+
+  // Push or pop up to 4 registers of the same width to or from the stack,
+  // using the current stack pointer as set by SetStackPointer.
+  //
+  // If an argument register is 'NoReg', all further arguments are also assumed
+  // to be 'NoReg', and are thus not pushed or popped.
+  //
+  // Arguments are ordered such that "Push(a, b);" is functionally equivalent
+  // to "Push(a); Push(b);".
+  //
+  // It is valid to push the same register more than once, and there is no
+  // restriction on the order in which registers are specified.
+  //
+  // It is not valid to pop into the same register more than once in one
+  // operation, not even into the zero register.
+  //
+  // If the current stack pointer (as set by SetStackPointer) is sp, then it
+  // must be aligned to 16 bytes on entry and the total size of the specified
+  // registers must also be a multiple of 16 bytes.
+  //
+  // Even if the current stack pointer is not the system stack pointer (sp),
+  // Push (and derived methods) will still modify the system stack pointer in
+  // order to comply with ABI rules about accessing memory below the system
+  // stack pointer.
+  //
+  // Other than the registers passed into Pop, the stack pointer and (possibly)
+  // the system stack pointer, these methods do not modify any other registers.
+  // Scratch registers such as Tmp0() and Tmp1() are preserved.
+  void Push(const CPURegister& src0, const CPURegister& src1 = NoReg,
+            const CPURegister& src2 = NoReg, const CPURegister& src3 = NoReg);
+  void Pop(const CPURegister& dst0, const CPURegister& dst1 = NoReg,
+           const CPURegister& dst2 = NoReg, const CPURegister& dst3 = NoReg);
+
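+  // Illustrative usage (an assumption, not part of the original interface
+  // documentation):
+  //   masm.Push(x0, x1);   // as if Push(x0); Push(x1); x1 ends up on top.
+  //   masm.Pop(x1, x0);    // pops x1 first, then x0, restoring both.
+  // where 'masm' is an assumed MacroAssembler instance.
+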
+  // Alternative forms of Push and Pop, taking a RegList or CPURegList that
+  // specifies the registers that are to be pushed or popped. Higher-numbered
+  // registers are associated with higher memory addresses (as in the A32 push
+  // and pop instructions).
+  //
+  // (Push|Pop)SizeRegList allow you to specify the register size as a
+  // parameter. Only kXRegSize, kWRegSize, kDRegSize and kSRegSize are
+  // supported.
+  //
+  // Otherwise, (Push|Pop)(CPU|X|W|D|S)RegList is preferred.
+  void PushCPURegList(CPURegList registers);
+  void PopCPURegList(CPURegList registers);
+
+  void PushSizeRegList(RegList registers, unsigned reg_size,
+      CPURegister::RegisterType type = CPURegister::kRegister) {
+    PushCPURegList(CPURegList(type, reg_size, registers));
+  }
+  void PopSizeRegList(RegList registers, unsigned reg_size,
+      CPURegister::RegisterType type = CPURegister::kRegister) {
+    PopCPURegList(CPURegList(type, reg_size, registers));
+  }
+  void PushXRegList(RegList regs) {
+    PushSizeRegList(regs, kXRegSize);
+  }
+  void PopXRegList(RegList regs) {
+    PopSizeRegList(regs, kXRegSize);
+  }
+  void PushWRegList(RegList regs) {
+    PushSizeRegList(regs, kWRegSize);
+  }
+  void PopWRegList(RegList regs) {
+    PopSizeRegList(regs, kWRegSize);
+  }
+  inline void PushDRegList(RegList regs) {
+    PushSizeRegList(regs, kDRegSize, CPURegister::kFPRegister);
+  }
+  inline void PopDRegList(RegList regs) {
+    PopSizeRegList(regs, kDRegSize, CPURegister::kFPRegister);
+  }
+  inline void PushSRegList(RegList regs) {
+    PushSizeRegList(regs, kSRegSize, CPURegister::kFPRegister);
+  }
+  inline void PopSRegList(RegList regs) {
+    PopSizeRegList(regs, kSRegSize, CPURegister::kFPRegister);
+  }
+
+  // Push the specified register 'count' times.
+  void PushMultipleTimes(int count, Register src);
+
+  // Poke 'src' onto the stack. The offset is in bytes.
+  //
+  // If the current stack pointer (as set by SetStackPointer) is sp, then sp
+  // must be aligned to 16 bytes.
+  void Poke(const Register& src, const Operand& offset);
+
+  // Peek at a value on the stack, and put it in 'dst'. The offset is in bytes.
+  //
+  // If the current stack pointer (as set by SetStackPointer) is sp, then sp
+  // must be aligned to 16 bytes.
+  void Peek(const Register& dst, const Operand& offset);
+
+  // Claim or drop stack space without actually accessing memory.
+  //
+  // If the current stack pointer (as set by SetStackPointer) is sp, then it
+  // must be aligned to 16 bytes and the size claimed or dropped must be a
+  // multiple of 16 bytes.
+  void Claim(const Operand& size);
+  void Drop(const Operand& size);
+
+  // Preserve the callee-saved registers (as defined by AAPCS64).
+  //
+  // Higher-numbered registers are pushed before lower-numbered registers, and
+  // thus get higher addresses.
+  // Floating-point registers are pushed before general-purpose registers, and
+  // thus get higher addresses.
+  //
+  // This method must not be called unless StackPointer() is sp, and it is
+  // aligned to 16 bytes.
+  void PushCalleeSavedRegisters();
+
+  // Restore the callee-saved registers (as defined by AAPCS64).
+  //
+  // Higher-numbered registers are popped after lower-numbered registers, and
+  // thus come from higher addresses.
+  // Floating-point registers are popped after general-purpose registers, and
+  // thus come from higher addresses.
+  //
+  // This method must not be called unless StackPointer() is sp, and it is
+  // aligned to 16 bytes.
+  void PopCalleeSavedRegisters();
+
+  // Remaining instructions are simple pass-through calls to the assembler.
+  void Adr(const Register& rd, Label* label) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    adr(rd, label);
+  }
+  void Asr(const Register& rd, const Register& rn, unsigned shift) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    asr(rd, rn, shift);
+  }
+  void Asr(const Register& rd, const Register& rn, const Register& rm) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    asrv(rd, rn, rm);
+  }
+  void B(Label* label, Condition cond = al) {
+    ASSERT(allow_macro_instructions_);
+    b(label, cond);
+  }
+  void Bfi(const Register& rd,
+           const Register& rn,
+           unsigned lsb,
+           unsigned width) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    bfi(rd, rn, lsb, width);
+  }
+  void Bfxil(const Register& rd,
+             const Register& rn,
+             unsigned lsb,
+             unsigned width) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    bfxil(rd, rn, lsb, width);
+  }
+  void Bind(Label* label) {
+    ASSERT(allow_macro_instructions_);
+    bind(label);
+  }
+  void Bl(Label* label) {
+    ASSERT(allow_macro_instructions_);
+    bl(label);
+  }
+  void Blr(const Register& xn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!xn.IsZero());
+    blr(xn);
+  }
+  void Br(const Register& xn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!xn.IsZero());
+    br(xn);
+  }
+  void Brk(int code = 0) {
+    ASSERT(allow_macro_instructions_);
+    brk(code);
+  }
+  void Cbnz(const Register& rt, Label* label) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rt.IsZero());
+    cbnz(rt, label);
+  }
+  void Cbz(const Register& rt, Label* label) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rt.IsZero());
+    cbz(rt, label);
+  }
+  void Cinc(const Register& rd, const Register& rn, Condition cond) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    cinc(rd, rn, cond);
+  }
+  void Cinv(const Register& rd, const Register& rn, Condition cond) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    cinv(rd, rn, cond);
+  }
+  void Cls(const Register& rd, const Register& rn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    cls(rd, rn);
+  }
+  void Clz(const Register& rd, const Register& rn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    clz(rd, rn);
+  }
+  void Cneg(const Register& rd, const Register& rn, Condition cond) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    cneg(rd, rn, cond);
+  }
+  void Csel(const Register& rd,
+            const Register& rn,
+            const Register& rm,
+            Condition cond) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    csel(rd, rn, rm, cond);
+  }
+  void Cset(const Register& rd, Condition cond) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    cset(rd, cond);
+  }
+  void Csetm(const Register& rd, Condition cond) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    csetm(rd, cond);
+  }
+  void Csinc(const Register& rd,
+             const Register& rn,
+             const Register& rm,
+             Condition cond) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    csinc(rd, rn, rm, cond);
+  }
+  void Csinv(const Register& rd,
+             const Register& rn,
+             const Register& rm,
+             Condition cond) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    csinv(rd, rn, rm, cond);
+  }
+  void Csneg(const Register& rd,
+             const Register& rn,
+             const Register& rm,
+             Condition cond) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    csneg(rd, rn, rm, cond);
+  }
+  void Extr(const Register& rd,
+            const Register& rn,
+            const Register& rm,
+            unsigned lsb) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    extr(rd, rn, rm, lsb);
+  }
+  void Fabs(const FPRegister& fd, const FPRegister& fn) {
+    ASSERT(allow_macro_instructions_);
+    fabs(fd, fn);
+  }
+  void Fadd(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm) {
+    ASSERT(allow_macro_instructions_);
+    fadd(fd, fn, fm);
+  }
+  void Fccmp(const FPRegister& fn,
+             const FPRegister& fm,
+             StatusFlags nzcv,
+             Condition cond) {
+    ASSERT(allow_macro_instructions_);
+    fccmp(fn, fm, nzcv, cond);
+  }
+  void Fcmp(const FPRegister& fn, const FPRegister& fm) {
+    ASSERT(allow_macro_instructions_);
+    fcmp(fn, fm);
+  }
+  void Fcmp(const FPRegister& fn, double value) {
+    ASSERT(allow_macro_instructions_);
+    if (value != 0.0) {
+      FPRegister tmp = AppropriateTempFor(fn);
+      Fmov(tmp, value);
+      fcmp(fn, tmp);
+    } else {
+      fcmp(fn, value);
+    }
+  }
+  void Fcsel(const FPRegister& fd,
+             const FPRegister& fn,
+             const FPRegister& fm,
+             Condition cond) {
+    ASSERT(allow_macro_instructions_);
+    fcsel(fd, fn, fm, cond);
+  }
+  void Fcvtms(const Register& rd, const FPRegister& fn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    fcvtms(rd, fn);
+  }
+  void Fcvtmu(const Register& rd, const FPRegister& fn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    fcvtmu(rd, fn);
+  }
+  void Fcvtns(const Register& rd, const FPRegister& fn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    fcvtns(rd, fn);
+  }
+  void Fcvtnu(const Register& rd, const FPRegister& fn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    fcvtnu(rd, fn);
+  }
+  void Fcvtzs(const Register& rd, const FPRegister& fn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    fcvtzs(rd, fn);
+  }
+  void Fcvtzu(const Register& rd, const FPRegister& fn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    fcvtzu(rd, fn);
+  }
+  void Fdiv(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm) {
+    ASSERT(allow_macro_instructions_);
+    fdiv(fd, fn, fm);
+  }
+  void Fmax(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm) {
+    ASSERT(allow_macro_instructions_);
+    fmax(fd, fn, fm);
+  }
+  void Fmin(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm) {
+    ASSERT(allow_macro_instructions_);
+    fmin(fd, fn, fm);
+  }
+  void Fmov(FPRegister fd, FPRegister fn) {
+    ASSERT(allow_macro_instructions_);
+    // Only emit an instruction if fd and fn are different, and they are both D
+    // registers. fmov(s0, s0) is not a no-op because it clears the top word of
+    // d0. Technically, fmov(d0, d0) is not a no-op either because it clears
+    // the top of q0, but FPRegister does not currently support Q registers.
+    if (!fd.Is(fn) || !fd.Is64Bits()) {
+      fmov(fd, fn);
+    }
+  }
+  void Fmov(FPRegister fd, Register rn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rn.IsZero());
+    fmov(fd, rn);
+  }
+  void Fmov(FPRegister fd, double imm) {
+    ASSERT(allow_macro_instructions_);
+    fmov(fd, imm);
+  }
+  void Fmov(Register rd, FPRegister fn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    fmov(rd, fn);
+  }
+  void Fmul(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm) {
+    ASSERT(allow_macro_instructions_);
+    fmul(fd, fn, fm);
+  }
+  void Fmsub(const FPRegister& fd,
+             const FPRegister& fn,
+             const FPRegister& fm,
+             const FPRegister& fa) {
+    ASSERT(allow_macro_instructions_);
+    fmsub(fd, fn, fm, fa);
+  }
+  void Fneg(const FPRegister& fd, const FPRegister& fn) {
+    ASSERT(allow_macro_instructions_);
+    fneg(fd, fn);
+  }
+  void Frintn(const FPRegister& fd, const FPRegister& fn) {
+    ASSERT(allow_macro_instructions_);
+    frintn(fd, fn);
+  }
+  void Frintz(const FPRegister& fd, const FPRegister& fn) {
+    ASSERT(allow_macro_instructions_);
+    frintz(fd, fn);
+  }
+  void Fsqrt(const FPRegister& fd, const FPRegister& fn) {
+    ASSERT(allow_macro_instructions_);
+    fsqrt(fd, fn);
+  }
+  void Fcvt(const FPRegister& fd, const FPRegister& fn) {
+    ASSERT(allow_macro_instructions_);
+    fcvt(fd, fn);
+  }
+  void Fsub(const FPRegister& fd, const FPRegister& fn, const FPRegister& fm) {
+    ASSERT(allow_macro_instructions_);
+    fsub(fd, fn, fm);
+  }
+  void Hint(SystemHint code) {
+    ASSERT(allow_macro_instructions_);
+    hint(code);
+  }
+  void Hlt(int code) {
+    ASSERT(allow_macro_instructions_);
+    hlt(code);
+  }
+  void Ldnp(const CPURegister& rt,
+            const CPURegister& rt2,
+            const MemOperand& src) {
+    ASSERT(allow_macro_instructions_);
+    ldnp(rt, rt2, src);
+  }
+  void Ldp(const CPURegister& rt,
+           const CPURegister& rt2,
+           const MemOperand& src) {
+    ASSERT(allow_macro_instructions_);
+    ldp(rt, rt2, src);
+  }
+  void Ldpsw(const Register& rt, const Register& rt2, const MemOperand& src) {
+    ASSERT(allow_macro_instructions_);
+    ldpsw(rt, rt2, src);
+  }
+  void Ldr(const FPRegister& ft, double imm) {
+    ASSERT(allow_macro_instructions_);
+    ldr(ft, imm);
+  }
+  void Ldr(const Register& rt, uint64_t imm) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rt.IsZero());
+    ldr(rt, imm);
+  }
+  void Lsl(const Register& rd, const Register& rn, unsigned shift) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    lsl(rd, rn, shift);
+  }
+  void Lsl(const Register& rd, const Register& rn, const Register& rm) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    lslv(rd, rn, rm);
+  }
+  void Lsr(const Register& rd, const Register& rn, unsigned shift) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    lsr(rd, rn, shift);
+  }
+  void Lsr(const Register& rd, const Register& rn, const Register& rm) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    lsrv(rd, rn, rm);
+  }
+  void Madd(const Register& rd,
+            const Register& rn,
+            const Register& rm,
+            const Register& ra) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    ASSERT(!ra.IsZero());
+    madd(rd, rn, rm, ra);
+  }
+  void Mneg(const Register& rd, const Register& rn, const Register& rm) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    mneg(rd, rn, rm);
+  }
+  void Mov(const Register& rd, const Register& rn) {
+    ASSERT(allow_macro_instructions_);
+    mov(rd, rn);
+  }
+  void Mrs(const Register& rt, SystemRegister sysreg) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rt.IsZero());
+    mrs(rt, sysreg);
+  }
+  void Msr(SystemRegister sysreg, const Register& rt) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rt.IsZero());
+    msr(sysreg, rt);
+  }
+  void Msub(const Register& rd,
+            const Register& rn,
+            const Register& rm,
+            const Register& ra) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    ASSERT(!ra.IsZero());
+    msub(rd, rn, rm, ra);
+  }
+  void Mul(const Register& rd, const Register& rn, const Register& rm) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    mul(rd, rn, rm);
+  }
+  void Nop() {
+    ASSERT(allow_macro_instructions_);
+    nop();
+  }
+  void Rbit(const Register& rd, const Register& rn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    rbit(rd, rn);
+  }
+  void Ret(const Register& xn = lr) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!xn.IsZero());
+    ret(xn);
+  }
+  void Rev(const Register& rd, const Register& rn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    rev(rd, rn);
+  }
+  void Rev16(const Register& rd, const Register& rn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    rev16(rd, rn);
+  }
+  void Rev32(const Register& rd, const Register& rn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    rev32(rd, rn);
+  }
+  void Ror(const Register& rd, const Register& rs, unsigned shift) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rs.IsZero());
+    ror(rd, rs, shift);
+  }
+  void Ror(const Register& rd, const Register& rn, const Register& rm) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    rorv(rd, rn, rm);
+  }
+  void Sbfiz(const Register& rd,
+             const Register& rn,
+             unsigned lsb,
+             unsigned width) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    sbfiz(rd, rn, lsb, width);
+  }
+  void Sbfx(const Register& rd,
+            const Register& rn,
+            unsigned lsb,
+            unsigned width) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    sbfx(rd, rn, lsb, width);
+  }
+  void Scvtf(const FPRegister& fd, const Register& rn, unsigned fbits = 0) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rn.IsZero());
+    scvtf(fd, rn, fbits);
+  }
+  void Sdiv(const Register& rd, const Register& rn, const Register& rm) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    sdiv(rd, rn, rm);
+  }
+  void Smaddl(const Register& rd,
+              const Register& rn,
+              const Register& rm,
+              const Register& ra) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    ASSERT(!ra.IsZero());
+    smaddl(rd, rn, rm, ra);
+  }
+  void Smsubl(const Register& rd,
+              const Register& rn,
+              const Register& rm,
+              const Register& ra) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    ASSERT(!ra.IsZero());
+    smsubl(rd, rn, rm, ra);
+  }
+  void Smull(const Register& rd, const Register& rn, const Register& rm) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    smull(rd, rn, rm);
+  }
+  void Smulh(const Register& xd, const Register& xn, const Register& xm) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!xd.IsZero());
+    ASSERT(!xn.IsZero());
+    ASSERT(!xm.IsZero());
+    smulh(xd, xn, xm);
+  }
+  void Stnp(const CPURegister& rt,
+            const CPURegister& rt2,
+            const MemOperand& dst) {
+    ASSERT(allow_macro_instructions_);
+    stnp(rt, rt2, dst);
+  }
+  void Stp(const CPURegister& rt,
+           const CPURegister& rt2,
+           const MemOperand& dst) {
+    ASSERT(allow_macro_instructions_);
+    stp(rt, rt2, dst);
+  }
+  void Sxtb(const Register& rd, const Register& rn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    sxtb(rd, rn);
+  }
+  void Sxth(const Register& rd, const Register& rn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    sxth(rd, rn);
+  }
+  void Sxtw(const Register& rd, const Register& rn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    sxtw(rd, rn);
+  }
+  void Tbnz(const Register& rt, unsigned bit_pos, Label* label) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rt.IsZero());
+    tbnz(rt, bit_pos, label);
+  }
+  void Tbz(const Register& rt, unsigned bit_pos, Label* label) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rt.IsZero());
+    tbz(rt, bit_pos, label);
+  }
+  void Ubfiz(const Register& rd,
+             const Register& rn,
+             unsigned lsb,
+             unsigned width) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ubfiz(rd, rn, lsb, width);
+  }
+  void Ubfx(const Register& rd,
+            const Register& rn,
+            unsigned lsb,
+            unsigned width) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ubfx(rd, rn, lsb, width);
+  }
+  void Ucvtf(const FPRegister& fd, const Register& rn, unsigned fbits = 0) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rn.IsZero());
+    ucvtf(fd, rn, fbits);
+  }
+  void Udiv(const Register& rd, const Register& rn, const Register& rm) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    udiv(rd, rn, rm);
+  }
+  void Umaddl(const Register& rd,
+              const Register& rn,
+              const Register& rm,
+              const Register& ra) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    ASSERT(!ra.IsZero());
+    umaddl(rd, rn, rm, ra);
+  }
+  void Umsubl(const Register& rd,
+              const Register& rn,
+              const Register& rm,
+              const Register& ra) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    ASSERT(!rm.IsZero());
+    ASSERT(!ra.IsZero());
+    umsubl(rd, rn, rm, ra);
+  }
+  void Unreachable() {
+    ASSERT(allow_macro_instructions_);
+#ifdef USE_SIMULATOR
+    hlt(kUnreachableOpcode);
+#else
+    // Branch to 0 to generate a segfault.
+    // lr - kInstructionSize is the address of the offending instruction.
+    blr(xzr);
+#endif
+  }
+  void Uxtb(const Register& rd, const Register& rn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    uxtb(rd, rn);
+  }
+  void Uxth(const Register& rd, const Register& rn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    uxth(rd, rn);
+  }
+  void Uxtw(const Register& rd, const Register& rn) {
+    ASSERT(allow_macro_instructions_);
+    ASSERT(!rd.IsZero());
+    ASSERT(!rn.IsZero());
+    uxtw(rd, rn);
+  }
+
+  // Lower the system stack pointer (sp) so that the current stack pointer
+  // (according to StackPointer()) can be lowered by the same amount. This
+  // must be called _before_ accessing the memory.
+  //
+  // This is necessary when pushing or otherwise adding things to the stack, to
+  // satisfy the AAPCS64 constraint that the memory below the system stack
+  // pointer is not accessed.
+  //
+  // This method asserts that StackPointer() is not sp, since the call does
+  // not make sense in that context.
+  //
+  // TODO: This method can only accept values of 'space' that can be encoded in
+  // one instruction. Refer to the implementation for details.
+  void BumpSystemStackPointer(const Operand& space);
+
+#ifdef DEBUG
+  void SetAllowMacroInstructions(bool value) {
+    allow_macro_instructions_ = value;
+  }
+
+  bool AllowMacroInstructions() const {
+    return allow_macro_instructions_;
+  }
+#endif
+
+  // Set the current stack pointer, but don't generate any code.
+  // Note that this does not directly affect LastStackPointer().
+  void SetStackPointer(const Register& stack_pointer) {
+    ASSERT(!AreAliased(stack_pointer, Tmp0(), Tmp1()));
+    sp_ = stack_pointer;
+  }
+
+  // Return the current stack pointer, as set by SetStackPointer.
+  const Register& StackPointer() const {
+    return sp_;
+  }
+
+  // Set the registers used internally by the MacroAssembler as scratch
+  // registers. These registers are used to implement behaviours which are not
+  // directly supported by A64, and where an intermediate result is required.
+  //
+  // Both tmp0 and tmp1 may be set to any X register except for xzr, sp,
+  // and StackPointer(). Also, they must not be the same register (though they
+  // may both be NoReg).
+  //
+  // It is valid to set either or both of these registers to NoReg if you don't
+  // want the MacroAssembler to use any scratch registers. In a debug build, the
+  // Assembler will assert that any registers it uses are valid. Be aware that
+  // this check is not present in release builds. If this is a problem, use the
+  // Assembler directly.
+  void SetScratchRegisters(const Register& tmp0, const Register& tmp1) {
+    ASSERT(!AreAliased(xzr, sp, tmp0, tmp1));
+    ASSERT(!AreAliased(StackPointer(), tmp0, tmp1));
+    tmp0_ = tmp0;
+    tmp1_ = tmp1;
+  }
+
+  const Register& Tmp0() const {
+    return tmp0_;
+  }
+
+  const Register& Tmp1() const {
+    return tmp1_;
+  }
+
+  void SetFPScratchRegister(const FPRegister& fptmp0) {
+    fptmp0_ = fptmp0;
+  }
+
+  const FPRegister& FPTmp0() const {
+    return fptmp0_;
+  }
+
+  const Register AppropriateTempFor(
+      const Register& target,
+      const CPURegister& forbidden = NoCPUReg) const {
+    Register candidate = forbidden.Is(Tmp0()) ? Tmp1() : Tmp0();
+    ASSERT(!candidate.Is(target));
+    return Register(candidate.code(), target.size());
+  }
+
+  const FPRegister AppropriateTempFor(
+      const FPRegister& target,
+      const CPURegister& forbidden = NoCPUReg) const {
+    USE(forbidden);
+    FPRegister candidate = FPTmp0();
+    ASSERT(!candidate.Is(forbidden));
+    ASSERT(!candidate.Is(target));
+    return FPRegister(candidate.code(), target.size());
+  }
+
+  // Like printf, but print at run-time from generated code.
+  //
+  // The caller must ensure that arguments for floating-point placeholders
+  // (such as %e, %f or %g) are FPRegisters, and that arguments for integer
+  // placeholders are Registers.
+  //
+  // A maximum of four arguments may be given to any single Printf call. The
+  // arguments must be of the same type, but they do not need to have the same
+  // size.
+  //
+  // The following registers cannot be printed:
+  //    Tmp0(), Tmp1(), StackPointer(), sp.
+  //
+  // This function automatically preserves caller-saved registers so that
+  // calling code can use Printf at any point without having to worry about
+  // corruption. The preservation mechanism generates a lot of code. If this is
+  // a problem, preserve the important registers manually and then call
+  // PrintfNoPreserve. Callee-saved registers are not used by Printf, and are
+  // implicitly preserved.
+  //
+  // This function assumes (and asserts) that the current stack pointer is
+  // callee-saved, not caller-saved. This is most likely the case anyway, as a
+  // caller-saved stack pointer doesn't make a lot of sense.
+  void Printf(const char * format,
+              const CPURegister& arg0 = NoCPUReg,
+              const CPURegister& arg1 = NoCPUReg,
+              const CPURegister& arg2 = NoCPUReg,
+              const CPURegister& arg3 = NoCPUReg);
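+
+  // For example (a usage sketch; the format string follows the normal printf
+  // conventions):
+  //   __ Mov(x0, 42);
+  //   __ Printf("x0 is %" PRId64 "\n", x0);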
+
+  // Like Printf, but don't preserve any caller-saved registers, not even 'lr'.
+  //
+  // The return code from the system printf call will be returned in x0.
+  void PrintfNoPreserve(const char * format,
+                        const CPURegister& arg0 = NoCPUReg,
+                        const CPURegister& arg1 = NoCPUReg,
+                        const CPURegister& arg2 = NoCPUReg,
+                        const CPURegister& arg3 = NoCPUReg);
+
+  // Trace control when running the debug simulator.
+  //
+  // For example:
+  //
+  // __ Trace(LOG_REGS, TRACE_ENABLE);
+  // Will start logging register values if they were not already being logged.
+  //
+  // __ Trace(LOG_DISASM, TRACE_DISABLE);
+  // Will stop logging disassembly. It has no effect if the disassembly wasn't
+  // already being logged.
+  void Trace(TraceParameters parameters, TraceCommand command);
+
+  // Log the requested data independently of what is being traced.
+  //
+  // For example:
+  //
+  // __ Log(LOG_FLAGS);
+  // Will output the flags.
+  void Log(TraceParameters parameters);
+
+ private:
+  // The actual Push and Pop implementations. These don't generate any code
+  // other than that required for the push or pop. This allows
+  // (Push|Pop)CPURegList to bundle together setup code for a large block of
+  // registers.
+  //
+  // Note that size is per register, and is specified in bytes.
+  void PushHelper(int count, int size,
+                  const CPURegister& src0, const CPURegister& src1,
+                  const CPURegister& src2, const CPURegister& src3);
+  void PopHelper(int count, int size,
+                 const CPURegister& dst0, const CPURegister& dst1,
+                 const CPURegister& dst2, const CPURegister& dst3);
+
+  // Perform necessary maintenance operations before a push or pop.
+  //
+  // Note that size is per register, and is specified in bytes.
+  void PrepareForPush(int count, int size);
+  void PrepareForPop(int count, int size);
+
+#ifdef DEBUG
+  // Tell whether macro instructions can be used. When false, the
+  // MacroAssembler will assert if a method which can emit a variable number
+  // of instructions is called.
+  bool allow_macro_instructions_;
+#endif
+
+  // The register to use as a stack pointer for stack operations.
+  Register sp_;
+
+  // Scratch registers used internally by the MacroAssembler.
+  Register tmp0_;
+  Register tmp1_;
+  FPRegister fptmp0_;
+};
+
+
+// Use this scope when you need a one-to-one mapping between methods and
+// instructions. This scope prevents the MacroAssembler from being called and
+// literal pools from being emitted. It also asserts that the number of
+// instructions emitted matches the count specified when creating the scope.
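+//
+// For example (a usage sketch, assuming 'masm' is a MacroAssembler): the
+// following emits exactly two instructions, with literal pool emission
+// blocked and macro instructions disallowed for the duration of the scope.
+//
+//   {
+//     InstructionAccurateScope scope(&masm, 2);
+//     masm.nop();
+//     masm.nop();
+//   }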
+class InstructionAccurateScope {
+ public:
+  explicit InstructionAccurateScope(MacroAssembler* masm)
+      : masm_(masm), size_(0) {
+    masm_->BlockLiteralPool();
+#ifdef DEBUG
+    old_allow_macro_instructions_ = masm_->AllowMacroInstructions();
+    masm_->SetAllowMacroInstructions(false);
+#endif
+  }
+
+  InstructionAccurateScope(MacroAssembler* masm, int count)
+      : masm_(masm), size_(count * kInstructionSize) {
+    masm_->BlockLiteralPool();
+#ifdef DEBUG
+    masm_->bind(&start_);
+    old_allow_macro_instructions_ = masm_->AllowMacroInstructions();
+    masm_->SetAllowMacroInstructions(false);
+#endif
+  }
+
+  ~InstructionAccurateScope() {
+    masm_->ReleaseLiteralPool();
+#ifdef DEBUG
+    if (start_.IsBound()) {
+      ASSERT(masm_->SizeOfCodeGeneratedSince(&start_) == size_);
+    }
+    masm_->SetAllowMacroInstructions(old_allow_macro_instructions_);
+#endif
+  }
+
+ private:
+  MacroAssembler* masm_;
+  uint64_t size_;
+#ifdef DEBUG
+  Label start_;
+  bool old_allow_macro_instructions_;
+#endif
+};
+
+
+}  // namespace vixl
+
+#endif  // VIXL_A64_MACRO_ASSEMBLER_A64_H_
diff --git a/src/a64/simulator-a64.cc b/src/a64/simulator-a64.cc
new file mode 100644
index 0000000..0a872c5
--- /dev/null
+++ b/src/a64/simulator-a64.cc
@@ -0,0 +1,1658 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include <cmath>
+#include "a64/simulator-a64.h"
+
+namespace vixl {
+
+const Instruction* Simulator::kEndOfSimAddress = NULL;
+
+Simulator::Simulator(Decoder* decoder, FILE* stream) {
+  // Ensure shift operations act as the simulator expects.
+  ASSERT((static_cast<int32_t>(-1) >> 1) == -1);
+  ASSERT((static_cast<uint32_t>(-1) >> 1) == 0x7FFFFFFF);
+
+  // Set up the decoder.
+  decoder_ = decoder;
+  decoder_->AppendVisitor(this);
+
+  ResetState();
+
+  // Allocate and set up the simulator stack.
+  stack_ = reinterpret_cast<byte*>(malloc(stack_size_));
+  stack_limit_ = stack_ + stack_protection_size_;
+  byte* tos = stack_ + stack_size_ - stack_protection_size_;
+  // The stack pointer must be 16-byte aligned.
+  set_sp(reinterpret_cast<int64_t>(tos) & ~0xfUL);
+
+  stream_ = stream;
+  print_disasm_ = new PrintDisassembler(stream_);
+  coloured_trace_ = false;
+  disasm_trace_ = false;
+}
+
+
+void Simulator::ResetState() {
+  // Reset the processor state.
+  psr_ = 0;
+
+  // Reset the PC and fill the registers with recognisable junk values.
+  pc_ = NULL;
+  pc_modified_ = false;
+  for (unsigned i = 0; i < kNumberOfRegisters; i++) {
+    set_xreg(i, 0xbadbeef);
+  }
+  for (unsigned i = 0; i < kNumberOfFPRegisters; i++) {
+    // Set FP registers to a value that is NaN in both 32-bit and 64-bit FP.
+    set_dreg_bits(i, 0x7ff000007f800001UL);
+  }
+  // Returning to address 0 exits the Simulator.
+  set_lr(reinterpret_cast<int64_t>(kEndOfSimAddress));
+}
+
+
+Simulator::~Simulator() {
+  free(stack_);
+  // The decoder may outlive the simulator.
+  decoder_->RemoveVisitor(print_disasm_);
+  delete print_disasm_;
+}
+
+
+void Simulator::Run() {
+  while (pc_ != kEndOfSimAddress) {
+    ExecuteInstruction();
+  }
+}
+
+
+void Simulator::RunFrom(Instruction* first) {
+  pc_ = first;
+  pc_modified_ = false;
+  Run();
+}
+
+
+void Simulator::SetFlags(uint32_t new_flags) {
+  ASSERT((new_flags & ~kConditionFlagsMask) == 0);
+  psr_ &= ~kConditionFlagsMask;
+  // Set the new flags.
+  psr_ |= new_flags;
+}
+
+
+const char* Simulator::xreg_names[] = {
+"x0",  "x1",  "x2",  "x3",  "x4",  "x5",  "x6",  "x7",
+"x8",  "x9",  "x10", "x11", "x12", "x13", "x14", "x15",
+"x16", "x17", "x18", "x19", "x20", "x21", "x22", "x23",
+"x24", "x25", "x26", "x27", "x28", "x29", "lr",  "xzr", "sp"};
+
+
+const char* Simulator::wreg_names[] = {
+"w0",  "w1",  "w2",  "w3",  "w4",  "w5",  "w6",  "w7",
+"w8",  "w9",  "w10", "w11", "w12", "w13", "w14", "w15",
+"w16", "w17", "w18", "w19", "w20", "w21", "w22", "w23",
+"w24", "w25", "w26", "w27", "w28", "w29", "w30", "wzr", "wsp"};
+
+const char* Simulator::sreg_names[] = {
+"s0",  "s1",  "s2",  "s3",  "s4",  "s5",  "s6",  "s7",
+"s8",  "s9",  "s10", "s11", "s12", "s13", "s14", "s15",
+"s16", "s17", "s18", "s19", "s20", "s21", "s22", "s23",
+"s24", "s25", "s26", "s27", "s28", "s29", "s30", "s31"};
+
+const char* Simulator::dreg_names[] = {
+"d0",  "d1",  "d2",  "d3",  "d4",  "d5",  "d6",  "d7",
+"d8",  "d9",  "d10", "d11", "d12", "d13", "d14", "d15",
+"d16", "d17", "d18", "d19", "d20", "d21", "d22", "d23",
+"d24", "d25", "d26", "d27", "d28", "d29", "d30", "d31"};
+
+const char* Simulator::vreg_names[] = {
+"v0",  "v1",  "v2",  "v3",  "v4",  "v5",  "v6",  "v7",
+"v8",  "v9",  "v10", "v11", "v12", "v13", "v14", "v15",
+"v16", "v17", "v18", "v19", "v20", "v21", "v22", "v23",
+"v24", "v25", "v26", "v27", "v28", "v29", "v30", "v31"};
+
+
+
+const char* Simulator::WRegNameForCode(unsigned code, Reg31Mode mode) {
+  ASSERT(code < kNumberOfRegisters);
+  // If the code represents the stack pointer, index the name after zr.
+  if ((code == kZeroRegCode) && (mode == Reg31IsStackPointer)) {
+    code = kZeroRegCode + 1;
+  }
+  return wreg_names[code];
+}
+
+
+const char* Simulator::XRegNameForCode(unsigned code, Reg31Mode mode) {
+  ASSERT(code < kNumberOfRegisters);
+  // If the code represents the stack pointer, index the name after zr.
+  if ((code == kZeroRegCode) && (mode == Reg31IsStackPointer)) {
+    code = kZeroRegCode + 1;
+  }
+  return xreg_names[code];
+}
+
+
+const char* Simulator::SRegNameForCode(unsigned code) {
+  ASSERT(code < kNumberOfFPRegisters);
+  return sreg_names[code];
+}
+
+
+const char* Simulator::DRegNameForCode(unsigned code) {
+  ASSERT(code < kNumberOfFPRegisters);
+  return dreg_names[code];
+}
+
+
+const char* Simulator::VRegNameForCode(unsigned code) {
+  ASSERT(code < kNumberOfFPRegisters);
+  return vreg_names[code];
+}
+
+
+// Helpers ---------------------------------------------------------------------
+int64_t Simulator::AddWithCarry(unsigned reg_size,
+                                bool set_flags,
+                                int64_t src1,
+                                int64_t src2,
+                                int64_t carry_in) {
+  ASSERT((carry_in == 0) || (carry_in == 1));
+  ASSERT((reg_size == kXRegSize) || (reg_size == kWRegSize));
+
+  uint64_t u1, u2;
+  int64_t result;
+  int64_t signed_sum = src1 + src2 + carry_in;
+
+  uint32_t N, Z, C, V;
+
+  if (reg_size == kWRegSize) {
+    u1 = static_cast<uint64_t>(src1) & kWRegMask;
+    u2 = static_cast<uint64_t>(src2) & kWRegMask;
+
+    result = signed_sum & kWRegMask;
+    // Compute the C flag by comparing the sum to the max unsigned integer.
+    C = CFlag * (((kWMaxUInt - u1) < (u2 + carry_in)) ||
+                 ((kWMaxUInt - u1 - carry_in) < u2));
+    // Overflow iff the sign bit is the same for the two inputs and different
+    // for the result.
+    int64_t s_src1 = src1 << (kXRegSize - kWRegSize);
+    int64_t s_src2 = src2 << (kXRegSize - kWRegSize);
+    int64_t s_result = result << (kXRegSize - kWRegSize);
+    V = VFlag * (((s_src1 ^ s_src2) >= 0) && ((s_src1 ^ s_result) < 0));
+
+  } else {
+    u1 = static_cast<uint64_t>(src1);
+    u2 = static_cast<uint64_t>(src2);
+
+    result = signed_sum;
+    // Compute the C flag by comparing the sum to the max unsigned integer.
+    C = CFlag * (((kXMaxUInt - u1) < (u2 + carry_in)) ||
+                 ((kXMaxUInt - u1 - carry_in) < u2));
+    // Overflow iff the sign bit is the same for the two inputs and different
+    // for the result.
+    V = VFlag * (((src1 ^ src2) >= 0) && ((src1 ^ result) < 0));
+  }
+
+  N = CalcNFlag(result, reg_size);
+  Z = CalcZFlag(result);
+
+  if (set_flags) SetFlags(N | Z | C | V);
+  return result;
+}
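+
+// A worked example of the flag computation above (added for clarity): in
+// 32-bit mode, adding 0x7fffffff and 0x00000001 with no carry-in produces
+// 0x80000000, so N=1 (negative result), Z=0, C=0 (no unsigned overflow) and
+// V=1 (two non-negative inputs produced a negative result).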
+
+
+int64_t Simulator::ShiftOperand(unsigned reg_size,
+                                int64_t value,
+                                Shift shift_type,
+                                unsigned amount) {
+  if (amount == 0) {
+    return value;
+  }
+  int64_t mask = reg_size == kXRegSize ? kXRegMask : kWRegMask;
+  switch (shift_type) {
+    case LSL:
+      return (value << amount) & mask;
+    case LSR:
+      return static_cast<uint64_t>(value) >> amount;
+    case ASR: {
+      // Shift used to restore the sign.
+      unsigned s_shift = kXRegSize - reg_size;
+      // Value with its sign restored.
+      int64_t s_value = (value << s_shift) >> s_shift;
+      return (s_value >> amount) & mask;
+    }
+    case ROR: {
+      if (reg_size == kWRegSize) {
+        value &= kWRegMask;
+      }
+      return (static_cast<uint64_t>(value) >> amount) |
+             ((value & ((1L << amount) - 1L)) << (reg_size - amount));
+    }
+    default:
+      UNIMPLEMENTED();
+      return 0;
+  }
+}
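+
+// A worked example of the ROR case above (added for clarity): rotating the
+// 32-bit value 0x00000001 right by one position moves the low bit into the
+// top bit, giving 0x80000000.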
+
+
+int64_t Simulator::ExtendValue(unsigned reg_size,
+                               int64_t value,
+                               Extend extend_type,
+                               unsigned left_shift) {
+  switch (extend_type) {
+    case UXTB:
+      value &= kByteMask;
+      break;
+    case UXTH:
+      value &= kHalfWordMask;
+      break;
+    case UXTW:
+      value &= kWordMask;
+      break;
+    case SXTB:
+      value = (value << 56) >> 56;
+      break;
+    case SXTH:
+      value = (value << 48) >> 48;
+      break;
+    case SXTW:
+      value = (value << 32) >> 32;
+      break;
+    case UXTX:
+    case SXTX:
+      break;
+    default:
+      UNREACHABLE();
+  }
+  int64_t mask = (reg_size == kXRegSize) ? kXRegMask : kWRegMask;
+  return (value << left_shift) & mask;
+}
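+
+// A worked example (added for clarity): ExtendValue(kWRegSize, 0x80, SXTB)
+// sign-extends the byte 0x80 and then masks the result to W-register width,
+// giving 0xffffff80.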
+
+
+void Simulator::FPCompare(double val0, double val1) {
+  unsigned new_flags;
+
+  if ((isnan(val0) != 0) || (isnan(val1) != 0)) {
+    new_flags = CVFlag;
+  } else {
+    if (val0 < val1) {
+      new_flags = NFlag;
+    } else {
+      new_flags = CFlag;
+      if (val0 == val1) {
+        new_flags |= ZFlag;
+      }
+    }
+  }
+  SetFlags(new_flags);
+}
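+
+// Worked examples of the mapping above (added for clarity):
+// FPCompare(1.0, 1.0) sets Z and C (equal), FPCompare(0.0, 1.0) sets only N
+// (less than), and any comparison involving a NaN sets C and V (unordered),
+// as the A64 FCMP instruction does.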
+
+
+void Simulator::PrintFlags(bool print_all) {
+  static bool first_run = true;
+  static uint32_t last_flags;
+
+  // Define some colour codes to use for the register dump.
+  // TODO: Find a more elegant way of defining these.
+  char const * const clr_normal     = (coloured_trace_) ? ("\033[m") : ("");
+  char const * const clr_flag_name  = (coloured_trace_) ? ("\033[1;30m") : ("");
+  char const * const clr_flag_value = (coloured_trace_) ? ("\033[1;37m") : ("");
+
+  if (print_all || first_run || (last_flags != nzcv())) {
+    fprintf(stream_, "# %sFLAGS: %sN:%1d Z:%1d C:%1d V:%1d%s\n",
+            clr_flag_name,
+            clr_flag_value,
+            N(), Z(), C(), V(),
+            clr_normal);
+  }
+  last_flags = nzcv();
+  first_run = false;
+}
+
+
+void Simulator::PrintRegisters(bool print_all_regs) {
+  static bool first_run = true;
+  static int64_t last_regs[kNumberOfRegisters];
+
+  // Define some colour codes to use for the register dump.
+  // TODO: Find a more elegant way of defining these.
+  char const * const clr_normal    = (coloured_trace_) ? ("\033[m") : ("");
+  char const * const clr_reg_name  = (coloured_trace_) ? ("\033[1;34m") : ("");
+  char const * const clr_reg_value = (coloured_trace_) ? ("\033[1;36m") : ("");
+
+  for (unsigned i = 0; i < kNumberOfRegisters; i++) {
+    if (print_all_regs || first_run || (last_regs[i] != registers_[i].x)) {
+      fprintf(stream_,
+              "# %s%4s:%s 0x%016" PRIx64 "%s\n",
+              clr_reg_name,
+              XRegNameForCode(i, Reg31IsStackPointer),
+              clr_reg_value,
+              registers_[i].x,
+              clr_normal);
+    }
+    // Cache the new register value so the next run can detect any changes.
+    last_regs[i] = registers_[i].x;
+  }
+  first_run = false;
+}
+
+
+void Simulator::PrintFPRegisters(bool print_all_regs) {
+  static bool first_run = true;
+  static uint64_t last_regs[kNumberOfFPRegisters];
+
+  // Define some colour codes to use for the register dump.
+  // TODO: Find a more elegant way of defining these.
+  char const * const clr_normal    = (coloured_trace_) ? ("\033[m") : ("");
+  char const * const clr_reg_name  = (coloured_trace_) ? ("\033[1;33m") : ("");
+  char const * const clr_reg_value = (coloured_trace_) ? ("\033[1;35m") : ("");
+
+  // Print as many rows of registers as necessary, keeping each individual
+  // register in the same column each time (to make it easy to visually scan
+  // for changes).
+  for (unsigned i = 0; i < kNumberOfFPRegisters; i++) {
+    if (print_all_regs || first_run ||
+        (last_regs[i] != double_to_rawbits(fpregisters_[i].d))) {
+      fprintf(stream_,
+              "# %s %4s:%s 0x%016" PRIx64 "%s (%s%s:%s %g%s %s:%s %g%s)\n",
+              clr_reg_name,
+              VRegNameForCode(i),
+              clr_reg_value,
+              double_to_rawbits(fpregisters_[i].d),
+              clr_normal,
+              clr_reg_name,
+              DRegNameForCode(i),
+              clr_reg_value,
+              fpregisters_[i].d,
+              clr_reg_name,
+              SRegNameForCode(i),
+              clr_reg_value,
+              fpregisters_[i].s,
+              clr_normal);
+    }
+    // Cache the new register value so the next run can detect any changes.
+    last_regs[i] = double_to_rawbits(fpregisters_[i].d);
+  }
+  first_run = false;
+}
+
+
+void Simulator::PrintProcessorState() {
+  PrintFlags();
+  PrintRegisters();
+  PrintFPRegisters();
+}
+
+
+// Visitors ----------------------------------------------------------------
+
+void Simulator::VisitUnknown(Instruction* instr) {
+  printf("Unknown instruction at 0x%p: 0x%08" PRIx32 "\n",
+         reinterpret_cast<void*>(instr), instr->InstructionBits());
+  UNIMPLEMENTED();
+}
+
+
+void Simulator::VisitPCRelAddressing(Instruction* instr) {
+  switch (instr->Mask(PCRelAddressingMask)) {
+    case ADR:
+      set_reg(kXRegSize,
+              instr->Rd(),
+              reinterpret_cast<int64_t>(instr->ImmPCOffsetTarget()));
+      break;
+    case ADRP:  // Not implemented in the assembler.
+      UNIMPLEMENTED();
+      break;
+    default:
+      UNREACHABLE();
+  }
+}
+
+
+void Simulator::VisitUnconditionalBranch(Instruction* instr) {
+  switch (instr->Mask(UnconditionalBranchMask)) {
+    case BL:
+      set_lr(reinterpret_cast<int64_t>(instr->NextInstruction()));
+      // Fall through.
+    case B:
+      set_pc(instr->ImmPCOffsetTarget());
+      break;
+    default: UNREACHABLE();
+  }
+}
+
+
+void Simulator::VisitConditionalBranch(Instruction* instr) {
+  ASSERT(instr->Mask(ConditionalBranchMask) == B_cond);
+  if (ConditionPassed(static_cast<Condition>(instr->ConditionBranch()))) {
+    set_pc(instr->ImmPCOffsetTarget());
+  }
+}
+
+
+void Simulator::VisitUnconditionalBranchToRegister(Instruction* instr) {
+  Instruction* target = Instruction::Cast(xreg(instr->Rn()));
+
+  switch (instr->Mask(UnconditionalBranchToRegisterMask)) {
+    case BLR:
+      set_lr(reinterpret_cast<int64_t>(instr->NextInstruction()));
+      // Fall through.
+    case BR:
+    case RET: set_pc(target); break;
+    default: UNREACHABLE();
+  }
+}
+
+
+void Simulator::VisitTestBranch(Instruction* instr) {
+  unsigned bit_pos = (instr->ImmTestBranchBit5() << 5) |
+                     instr->ImmTestBranchBit40();
+  bool take_branch = ((xreg(instr->Rt()) & (1UL << bit_pos)) == 0);
+  switch (instr->Mask(TestBranchMask)) {
+    case TBZ: break;
+    case TBNZ: take_branch = !take_branch; break;
+    default: UNIMPLEMENTED();
+  }
+  if (take_branch) {
+    set_pc(instr->ImmPCOffsetTarget());
+  }
+}
+
+
+void Simulator::VisitCompareBranch(Instruction* instr) {
+  unsigned rt = instr->Rt();
+  bool take_branch = false;
+  switch (instr->Mask(CompareBranchMask)) {
+    case CBZ_w: take_branch = (wreg(rt) == 0); break;
+    case CBZ_x: take_branch = (xreg(rt) == 0); break;
+    case CBNZ_w: take_branch = (wreg(rt) != 0); break;
+    case CBNZ_x: take_branch = (xreg(rt) != 0); break;
+    default: UNIMPLEMENTED();
+  }
+  if (take_branch) {
+    set_pc(instr->ImmPCOffsetTarget());
+  }
+}
+
+
+void Simulator::AddSubHelper(Instruction* instr, int64_t op2) {
+  unsigned reg_size = instr->SixtyFourBits() ? kXRegSize : kWRegSize;
+  bool set_flags = instr->FlagsUpdate();
+  int64_t new_val = 0;
+  Instr operation = instr->Mask(AddSubOpMask);
+
+  switch (operation) {
+    case ADD:
+    case ADDS: {
+      new_val = AddWithCarry(reg_size,
+                             set_flags,
+                             reg(reg_size, instr->Rn(), instr->RnMode()),
+                             op2);
+      break;
+    }
+    case SUB:
+    case SUBS: {
+      new_val = AddWithCarry(reg_size,
+                             set_flags,
+                             reg(reg_size, instr->Rn(), instr->RnMode()),
+                             ~op2,
+                             1);
+      break;
+    }
+    default: UNREACHABLE();
+  }
+
+  set_reg(reg_size, instr->Rd(), new_val, instr->RdMode());
+}
+
+
+void Simulator::VisitAddSubShifted(Instruction* instr) {
+  unsigned reg_size = instr->SixtyFourBits() ? kXRegSize : kWRegSize;
+  int64_t op2 = ShiftOperand(reg_size,
+                             reg(reg_size, instr->Rm()),
+                             static_cast<Shift>(instr->ShiftDP()),
+                             instr->ImmDPShift());
+  AddSubHelper(instr, op2);
+}
+
+
+void Simulator::VisitAddSubImmediate(Instruction* instr) {
+  int64_t op2 = instr->ImmAddSub() << ((instr->ShiftAddSub() == 1) ? 12 : 0);
+  AddSubHelper(instr, op2);
+}
+
+
+void Simulator::VisitAddSubExtended(Instruction* instr) {
+  unsigned reg_size = instr->SixtyFourBits() ? kXRegSize : kWRegSize;
+  int64_t op2 = ExtendValue(reg_size,
+                            reg(reg_size, instr->Rm()),
+                            static_cast<Extend>(instr->ExtendMode()),
+                            instr->ImmExtendShift());
+  AddSubHelper(instr, op2);
+}
+
+
+void Simulator::VisitAddSubWithCarry(Instruction* instr) {
+  unsigned reg_size = instr->SixtyFourBits() ? kXRegSize : kWRegSize;
+  int64_t op2 = reg(reg_size, instr->Rm());
+  int64_t new_val;
+
+  if ((instr->Mask(AddSubOpMask) == SUB) ||
+      (instr->Mask(AddSubOpMask) == SUBS)) {
+    op2 = ~op2;
+  }
+
+  new_val = AddWithCarry(reg_size,
+                         instr->FlagsUpdate(),
+                         reg(reg_size, instr->Rn()),
+                         op2,
+                         C());
+
+  set_reg(reg_size, instr->Rd(), new_val);
+}
+
+
+void Simulator::VisitLogicalShifted(Instruction* instr) {
+  unsigned reg_size = instr->SixtyFourBits() ? kXRegSize : kWRegSize;
+  Shift shift_type = static_cast<Shift>(instr->ShiftDP());
+  unsigned shift_amount = instr->ImmDPShift();
+  int64_t op2 = ShiftOperand(reg_size, reg(reg_size, instr->Rm()), shift_type,
+                             shift_amount);
+  if (instr->Mask(NOT) == NOT) {
+    op2 = ~op2;
+  }
+  LogicalHelper(instr, op2);
+}
+
+
+void Simulator::VisitLogicalImmediate(Instruction* instr) {
+  LogicalHelper(instr, instr->ImmLogical());
+}
+
+
+void Simulator::LogicalHelper(Instruction* instr, int64_t op2) {
+  unsigned reg_size = instr->SixtyFourBits() ? kXRegSize : kWRegSize;
+  int64_t op1 = reg(reg_size, instr->Rn());
+  int64_t result = 0;
+  bool update_flags = false;
+
+  // Switch on the logical operation, stripping out the NOT bit, as it has a
+  // different meaning for logical immediate instructions.
+  switch (instr->Mask(LogicalOpMask & ~NOT)) {
+    case ANDS: update_flags = true;  // Fall through.
+    case AND: result = op1 & op2; break;
+    case ORR: result = op1 | op2; break;
+    case EOR: result = op1 ^ op2; break;
+    default:
+      UNIMPLEMENTED();
+  }
+
+  if (update_flags) {
+    SetFlags(CalcNFlag(result, reg_size) | CalcZFlag(result));
+  }
+
+  set_reg(reg_size, instr->Rd(), result, instr->RdMode());
+}
+
+
+void Simulator::VisitConditionalCompareRegister(Instruction* instr) {
+  unsigned reg_size = instr->SixtyFourBits() ? kXRegSize : kWRegSize;
+  ConditionalCompareHelper(instr, reg(reg_size, instr->Rm()));
+}
+
+
+void Simulator::VisitConditionalCompareImmediate(Instruction* instr) {
+  ConditionalCompareHelper(instr, instr->ImmCondCmp());
+}
+
+
+void Simulator::ConditionalCompareHelper(Instruction* instr, int64_t op2) {
+  unsigned reg_size = instr->SixtyFourBits() ? kXRegSize : kWRegSize;
+  int64_t op1 = reg(reg_size, instr->Rn());
+
+  if (ConditionPassed(static_cast<Condition>(instr->Condition()))) {
+    // If the condition passes, set the status flags to the result of comparing
+    // the operands.
+    if (instr->Mask(ConditionalCompareMask) == CCMP) {
+      AddWithCarry(reg_size, true, op1, ~op2, 1);
+    } else {
+      ASSERT(instr->Mask(ConditionalCompareMask) == CCMN);
+      AddWithCarry(reg_size, true, op1, op2, 0);
+    }
+  } else {
+    // If the condition fails, set the status flags to the nzcv immediate.
+    SetFlags(instr->Nzcv() << Flags_offset);
+  }
+}
+
+
+void Simulator::VisitLoadStoreUnsignedOffset(Instruction* instr) {
+  int offset = instr->ImmLSUnsigned() << instr->SizeLS();
+  LoadStoreHelper(instr, offset, Offset);
+}
+
+
+void Simulator::VisitLoadStoreUnscaledOffset(Instruction* instr) {
+  LoadStoreHelper(instr, instr->ImmLS(), Offset);
+}
+
+
+void Simulator::VisitLoadStorePreIndex(Instruction* instr) {
+  LoadStoreHelper(instr, instr->ImmLS(), PreIndex);
+}
+
+
+void Simulator::VisitLoadStorePostIndex(Instruction* instr) {
+  LoadStoreHelper(instr, instr->ImmLS(), PostIndex);
+}
+
+
+void Simulator::VisitLoadStoreRegisterOffset(Instruction* instr) {
+  Extend ext = static_cast<Extend>(instr->ExtendMode());
+  ASSERT((ext == UXTW) || (ext == UXTX) || (ext == SXTW) || (ext == SXTX));
+  unsigned shift_amount = instr->ImmShiftLS() * instr->SizeLS();
+
+  int64_t offset = ExtendValue(kXRegSize, xreg(instr->Rm()), ext,
+                               shift_amount);
+  LoadStoreHelper(instr, offset, Offset);
+}
+
+
+void Simulator::LoadStoreHelper(Instruction* instr,
+                                int64_t offset,
+                                AddrMode addrmode) {
+  unsigned srcdst = instr->Rt();
+  uint8_t* address = AddressModeHelper(instr->Rn(), offset, addrmode);
+  int num_bytes = 1 << instr->SizeLS();
+
+  LoadStoreOp op = static_cast<LoadStoreOp>(instr->Mask(LoadStoreOpMask));
+  switch (op) {
+    case LDRB_w:
+    case LDRH_w:
+    case LDR_w:
+    case LDR_x: set_xreg(srcdst, MemoryRead(address, num_bytes)); break;
+    case STRB_w:
+    case STRH_w:
+    case STR_w:
+    case STR_x: MemoryWrite(address, xreg(srcdst), num_bytes); break;
+    case LDRSB_w: {
+      set_wreg(srcdst, ExtendValue(kWRegSize, MemoryRead8(address), SXTB));
+      break;
+    }
+    case LDRSB_x: {
+      set_xreg(srcdst, ExtendValue(kXRegSize, MemoryRead8(address), SXTB));
+      break;
+    }
+    case LDRSH_w: {
+      set_wreg(srcdst, ExtendValue(kWRegSize, MemoryRead16(address), SXTH));
+      break;
+    }
+    case LDRSH_x: {
+      set_xreg(srcdst, ExtendValue(kXRegSize, MemoryRead16(address), SXTH));
+      break;
+    }
+    case LDRSW_x: {
+      set_xreg(srcdst, ExtendValue(kXRegSize, MemoryRead32(address), SXTW));
+      break;
+    }
+    case LDR_s: set_sreg(srcdst, MemoryReadFP32(address)); break;
+    case LDR_d: set_dreg(srcdst, MemoryReadFP64(address)); break;
+    case STR_s: MemoryWriteFP32(address, sreg(srcdst)); break;
+    case STR_d: MemoryWriteFP64(address, dreg(srcdst)); break;
+    default: UNIMPLEMENTED();
+  }
+}
+
+
+void Simulator::VisitLoadStorePairOffset(Instruction* instr) {
+  LoadStorePairHelper(instr, Offset);
+}
+
+
+void Simulator::VisitLoadStorePairPreIndex(Instruction* instr) {
+  LoadStorePairHelper(instr, PreIndex);
+}
+
+
+void Simulator::VisitLoadStorePairPostIndex(Instruction* instr) {
+  LoadStorePairHelper(instr, PostIndex);
+}
+
+
+void Simulator::VisitLoadStorePairNonTemporal(Instruction* instr) {
+  LoadStorePairHelper(instr, Offset);
+}
+
+
+void Simulator::LoadStorePairHelper(Instruction* instr,
+                                    AddrMode addrmode) {
+  unsigned rt = instr->Rt();
+  unsigned rt2 = instr->Rt2();
+  int offset = instr->ImmLSPair() << instr->SizeLSPair();
+  uint8_t* address = AddressModeHelper(instr->Rn(), offset, addrmode);
+
+  LoadStorePairOp op =
+    static_cast<LoadStorePairOp>(instr->Mask(LoadStorePairMask));
+
+  // 'rt' and 'rt2' can only be aliased for stores.
+  ASSERT(((op & LoadStorePairLBit) == 0) || (rt != rt2));
+
+  switch (op) {
+    case LDP_w: {
+      set_wreg(rt, MemoryRead32(address));
+      set_wreg(rt2, MemoryRead32(address + kWRegSizeInBytes));
+      break;
+    }
+    case LDP_s: {
+      set_sreg(rt, MemoryReadFP32(address));
+      set_sreg(rt2, MemoryReadFP32(address + kSRegSizeInBytes));
+      break;
+    }
+    case LDP_x: {
+      set_xreg(rt, MemoryRead64(address));
+      set_xreg(rt2, MemoryRead64(address + kXRegSizeInBytes));
+      break;
+    }
+    case LDP_d: {
+      set_dreg(rt, MemoryReadFP64(address));
+      set_dreg(rt2, MemoryReadFP64(address + kDRegSizeInBytes));
+      break;
+    }
+    case LDPSW_x: {
+      set_xreg(rt, ExtendValue(kXRegSize, MemoryRead32(address), SXTW));
+      set_xreg(rt2, ExtendValue(kXRegSize,
+               MemoryRead32(address + kWRegSizeInBytes), SXTW));
+      break;
+    }
+    case STP_w: {
+      MemoryWrite32(address, wreg(rt));
+      MemoryWrite32(address + kWRegSizeInBytes, wreg(rt2));
+      break;
+    }
+    case STP_s: {
+      MemoryWriteFP32(address, sreg(rt));
+      MemoryWriteFP32(address + kSRegSizeInBytes, sreg(rt2));
+      break;
+    }
+    case STP_x: {
+      MemoryWrite64(address, xreg(rt));
+      MemoryWrite64(address + kXRegSizeInBytes, xreg(rt2));
+      break;
+    }
+    case STP_d: {
+      MemoryWriteFP64(address, dreg(rt));
+      MemoryWriteFP64(address + kDRegSizeInBytes, dreg(rt2));
+      break;
+    }
+    default: UNREACHABLE();
+  }
+}
+
+
+void Simulator::VisitLoadLiteral(Instruction* instr) {
+  uint8_t* address = instr->LiteralAddress();
+  unsigned rt = instr->Rt();
+
+  switch (instr->Mask(LoadLiteralMask)) {
+    case LDR_w_lit: set_wreg(rt, MemoryRead32(address));  break;
+    case LDR_x_lit: set_xreg(rt, MemoryRead64(address));  break;
+    case LDR_s_lit: set_sreg(rt, MemoryReadFP32(address));  break;
+    case LDR_d_lit: set_dreg(rt, MemoryReadFP64(address));  break;
+    default: UNREACHABLE();
+  }
+}
+
+
+uint8_t* Simulator::AddressModeHelper(unsigned addr_reg,
+                                      int64_t offset,
+                                      AddrMode addrmode) {
+  uint64_t address = xreg(addr_reg, Reg31IsStackPointer);
+  ASSERT((sizeof(uintptr_t) == kXRegSizeInBytes) ||
+         (address < 0x100000000UL));
+  if ((addr_reg == 31) && ((address % 16) != 0)) {
+    // When the base register is SP the stack pointer is required to be
+    // quadword aligned prior to the address calculation and write-backs.
+    // Misalignment will cause a stack alignment fault.
+    ALIGNMENT_EXCEPTION();
+  }
+  if ((addrmode == PreIndex) || (addrmode == PostIndex)) {
+    ASSERT(offset != 0);
+    set_xreg(addr_reg, address + offset, Reg31IsStackPointer);
+  }
+
+  if ((addrmode == Offset) || (addrmode == PreIndex)) {
+    address += offset;
+  }
+
+  return reinterpret_cast<uint8_t*>(address);
+}
+
+
+uint64_t Simulator::MemoryRead(const uint8_t* address, unsigned num_bytes) {
+  ASSERT(address != NULL);
+  ASSERT((num_bytes > 0) && (num_bytes <= sizeof(uint64_t)));
+  uint64_t read = 0;
+  memcpy(&read, address, num_bytes);
+  return read;
+}
+
+
+uint8_t Simulator::MemoryRead8(uint8_t* address) {
+  return MemoryRead(address, sizeof(uint8_t));
+}
+
+
+uint16_t Simulator::MemoryRead16(uint8_t* address) {
+  return MemoryRead(address, sizeof(uint16_t));
+}
+
+
+uint32_t Simulator::MemoryRead32(uint8_t* address) {
+  return MemoryRead(address, sizeof(uint32_t));
+}
+
+
+float Simulator::MemoryReadFP32(uint8_t* address) {
+  return rawbits_to_float(MemoryRead32(address));
+}
+
+
+uint64_t Simulator::MemoryRead64(uint8_t* address) {
+  return MemoryRead(address, sizeof(uint64_t));
+}
+
+
+double Simulator::MemoryReadFP64(uint8_t* address) {
+  return rawbits_to_double(MemoryRead64(address));
+}
+
+
+void Simulator::MemoryWrite(uint8_t* address,
+                            uint64_t value,
+                            unsigned num_bytes) {
+  ASSERT(address != NULL);
+  ASSERT((num_bytes > 0) && (num_bytes <= sizeof(uint64_t)));
+  memcpy(address, &value, num_bytes);
+}
+
+
+void Simulator::MemoryWrite32(uint8_t* address, uint32_t value) {
+  MemoryWrite(address, value, sizeof(uint32_t));
+}
+
+
+void Simulator::MemoryWriteFP32(uint8_t* address, float value) {
+  MemoryWrite32(address, float_to_rawbits(value));
+}
+
+
+void Simulator::MemoryWrite64(uint8_t* address, uint64_t value) {
+  MemoryWrite(address, value, sizeof(uint64_t));
+}
+
+
+void Simulator::MemoryWriteFP64(uint8_t* address, double value) {
+  MemoryWrite64(address, double_to_rawbits(value));
+}
+
+
+void Simulator::VisitMoveWideImmediate(Instruction* instr) {
+  MoveWideImmediateOp mov_op =
+    static_cast<MoveWideImmediateOp>(instr->Mask(MoveWideImmediateMask));
+  int64_t new_xn_val = 0;
+
+  bool is_64_bits = instr->SixtyFourBits() == 1;
+  // Shift is limited for W operations.
+  ASSERT(is_64_bits || (instr->ShiftMoveWide() < 2));
+
+  // Get the shifted immediate.
+  int64_t shift = instr->ShiftMoveWide() * 16;
+  int64_t shifted_imm16 = instr->ImmMoveWide() << shift;
+
+  // Compute the new value.
+  switch (mov_op) {
+    case MOVN_w:
+    case MOVN_x: {
+        new_xn_val = ~shifted_imm16;
+        if (!is_64_bits) new_xn_val &= kWRegMask;
+      break;
+    }
+    case MOVK_w:
+    case MOVK_x: {
+        unsigned reg_code = instr->Rd();
+        int64_t prev_xn_val = is_64_bits ? xreg(reg_code)
+                                         : wreg(reg_code);
+        new_xn_val = (prev_xn_val & ~(0xffffL << shift)) | shifted_imm16;
+      break;
+    }
+    case MOVZ_w:
+    case MOVZ_x: {
+        new_xn_val = shifted_imm16;
+      break;
+    }
+    default:
+      UNREACHABLE();
+  }
+
+  // Update the destination register.
+  set_xreg(instr->Rd(), new_xn_val);
+}
+
+
+void Simulator::VisitConditionalSelect(Instruction* instr) {
+  uint64_t new_val = xreg(instr->Rn());
+
+  if (ConditionFailed(static_cast<Condition>(instr->Condition()))) {
+    new_val = xreg(instr->Rm());
+    switch (instr->Mask(ConditionalSelectMask)) {
+      case CSEL_w:
+      case CSEL_x: break;
+      case CSINC_w:
+      case CSINC_x: new_val++; break;
+      case CSINV_w:
+      case CSINV_x: new_val = ~new_val; break;
+      case CSNEG_w:
+      case CSNEG_x: new_val = -new_val; break;
+      default: UNIMPLEMENTED();
+    }
+  }
+  unsigned reg_size = instr->SixtyFourBits() ? kXRegSize : kWRegSize;
+  set_reg(reg_size, instr->Rd(), new_val);
+}
+
+
+void Simulator::VisitDataProcessing1Source(Instruction* instr) {
+  unsigned dst = instr->Rd();
+  unsigned src = instr->Rn();
+
+  switch (instr->Mask(DataProcessing1SourceMask)) {
+    case RBIT_w: set_wreg(dst, ReverseBits(wreg(src), kWRegSize)); break;
+    case RBIT_x: set_xreg(dst, ReverseBits(xreg(src), kXRegSize)); break;
+    case REV16_w: set_wreg(dst, ReverseBytes(wreg(src), Reverse16)); break;
+    case REV16_x: set_xreg(dst, ReverseBytes(xreg(src), Reverse16)); break;
+    case REV_w: set_wreg(dst, ReverseBytes(wreg(src), Reverse32)); break;
+    case REV32_x: set_xreg(dst, ReverseBytes(xreg(src), Reverse32)); break;
+    case REV_x: set_xreg(dst, ReverseBytes(xreg(src), Reverse64)); break;
+    case CLZ_w: set_wreg(dst, CountLeadingZeros(wreg(src), kWRegSize)); break;
+    case CLZ_x: set_xreg(dst, CountLeadingZeros(xreg(src), kXRegSize)); break;
+    case CLS_w: {
+      set_wreg(dst, CountLeadingSignBits(wreg(src), kWRegSize));
+      break;
+    }
+    case CLS_x: {
+      set_xreg(dst, CountLeadingSignBits(xreg(src), kXRegSize));
+      break;
+    }
+    default: UNIMPLEMENTED();
+  }
+}
+
+
+uint64_t Simulator::ReverseBits(uint64_t value, unsigned num_bits) {
+  ASSERT((num_bits == kWRegSize) || (num_bits == kXRegSize));
+  uint64_t result = 0;
+  for (unsigned i = 0; i < num_bits; i++) {
+    result = (result << 1) | (value & 1);
+    value >>= 1;
+  }
+  return result;
+}
+
+
+uint64_t Simulator::ReverseBytes(uint64_t value, ReverseByteMode mode) {
+  // Split the 64-bit value into an 8-bit array, where b[0] is the least
+  // significant byte, and b[7] is the most significant.
+  uint8_t bytes[8];
+  uint64_t mask = 0xff00000000000000UL;
+  for (int i = 7; i >= 0; i--) {
+    bytes[i] = (value & mask) >> (i * 8);
+    mask >>= 8;
+  }
+
+  // Permutation tables for REV instructions.
+  //  permute_table[Reverse16] is used by REV16_x, REV16_w
+  //  permute_table[Reverse32] is used by REV32_x, REV_w
+  //  permute_table[Reverse64] is used by REV_x
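+  //  For example, with Reverse16 the byte order {b7 b6 b5 b4 b3 b2 b1 b0}
+  //  becomes {b6 b7 b4 b5 b2 b3 b0 b1}: the bytes within each halfword are
+  //  swapped.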
+  ASSERT((Reverse16 == 0) && (Reverse32 == 1) && (Reverse64 == 2));
+  static const uint8_t permute_table[3][8] = { {6, 7, 4, 5, 2, 3, 0, 1},
+                                               {4, 5, 6, 7, 0, 1, 2, 3},
+                                               {0, 1, 2, 3, 4, 5, 6, 7} };
+  uint64_t result = 0;
+  for (int i = 0; i < 8; i++) {
+    result <<= 8;
+    result |= bytes[permute_table[mode][i]];
+  }
+  return result;
+}
+
+
+void Simulator::VisitDataProcessing2Source(Instruction* instr) {
+  Shift shift_op = NO_SHIFT;
+  int64_t result = 0;
+  switch (instr->Mask(DataProcessing2SourceMask)) {
+    case SDIV_w: result = wreg(instr->Rn()) / wreg(instr->Rm()); break;
+    case SDIV_x: result = xreg(instr->Rn()) / xreg(instr->Rm()); break;
+    case UDIV_w: {
+      uint32_t rn = static_cast<uint32_t>(wreg(instr->Rn()));
+      uint32_t rm = static_cast<uint32_t>(wreg(instr->Rm()));
+      result = rn / rm;
+      break;
+    }
+    case UDIV_x: {
+      uint64_t rn = static_cast<uint64_t>(xreg(instr->Rn()));
+      uint64_t rm = static_cast<uint64_t>(xreg(instr->Rm()));
+      result = rn / rm;
+      break;
+    }
+    case LSLV_w:
+    case LSLV_x: shift_op = LSL; break;
+    case LSRV_w:
+    case LSRV_x: shift_op = LSR; break;
+    case ASRV_w:
+    case ASRV_x: shift_op = ASR; break;
+    case RORV_w:
+    case RORV_x: shift_op = ROR; break;
+    default: UNIMPLEMENTED();
+  }
+
+  unsigned reg_size = instr->SixtyFourBits() ? kXRegSize : kWRegSize;
+  if (shift_op != NO_SHIFT) {
+    // Shift distance encoded in the least-significant five/six bits of the
+    // register.
+    int mask = (instr->SixtyFourBits() == 1) ? 0x3f : 0x1f;
+    unsigned shift = wreg(instr->Rm()) & mask;
+    result = ShiftOperand(reg_size, reg(reg_size, instr->Rn()), shift_op,
+                          shift);
+  }
+  set_reg(reg_size, instr->Rd(), result);
+}
+
+
+// The algorithm used is adapted from the one described in section 8.2 of
+//   Hacker's Delight, by Henry S. Warren, Jr.
+// It assumes that a right shift on a signed integer is an arithmetic shift.
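+// The operands are split into 32-bit halves (u = u1:u0, v = v1:v0) and the
+// four partial products are summed column by column, propagating only the
+// carries that can reach the high 64 bits of the 128-bit product.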
+static int64_t MultiplyHighSigned(int64_t u, int64_t v) {
+  uint64_t u0, v0, w0;
+  int64_t u1, v1, w1, w2, t;
+
+  u0 = u & 0xffffffffL;
+  u1 = u >> 32;
+  v0 = v & 0xffffffffL;
+  v1 = v >> 32;
+
+  w0 = u0 * v0;
+  t = u1 * v0 + (w0 >> 32);
+  w1 = t & 0xffffffffL;
+  w2 = t >> 32;
+  w1 = u0 * v1 + w1;
+
+  return u1 * v1 + w2 + (w1 >> 32);
+}
+
+
+void Simulator::VisitDataProcessing3Source(Instruction* instr) {
+  unsigned reg_size = instr->SixtyFourBits() ? kXRegSize : kWRegSize;
+
+  int64_t result = 0;
+  uint64_t rn;
+  uint64_t rm;
+  switch (instr->Mask(DataProcessing3SourceMask)) {
+    case MADD_w:
+    case MADD_x:
+      result = xreg(instr->Ra()) + (xreg(instr->Rn()) * xreg(instr->Rm()));
+      break;
+    case MSUB_w:
+    case MSUB_x:
+      result = xreg(instr->Ra()) - (xreg(instr->Rn()) * xreg(instr->Rm()));
+      break;
+    case SMADDL_x:
+      result = xreg(instr->Ra()) + (wreg(instr->Rn()) * wreg(instr->Rm()));
+      break;
+    case SMSUBL_x:
+      result = xreg(instr->Ra()) - (wreg(instr->Rn()) * wreg(instr->Rm()));
+      break;
+    case UMADDL_x:
+      rn = static_cast<uint32_t>(wreg(instr->Rn()));
+      rm = static_cast<uint32_t>(wreg(instr->Rm()));
+      result = xreg(instr->Ra()) + (rn * rm);
+      break;
+    case UMSUBL_x:
+      rn = static_cast<uint32_t>(wreg(instr->Rn()));
+      rm = static_cast<uint32_t>(wreg(instr->Rm()));
+      result = xreg(instr->Ra()) - (rn * rm);
+      break;
+    case SMULH_x:
+      result = MultiplyHighSigned(xreg(instr->Rn()), xreg(instr->Rm()));
+      break;
+    default: UNIMPLEMENTED();
+  }
+  set_reg(reg_size, instr->Rd(), result);
+}
+
+
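+// Implements BFM, SBFM and UBFM. The source register is rotated right by R so
+// that the bitfield selected by R and S lines up with its destination
+// position; the field is then merged into either the existing destination
+// value (BFM) or zero (SBFM, UBFM), with SBFM additionally sign-extending
+// from the top bit of the field.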
+void Simulator::VisitBitfield(Instruction* instr) {
+  unsigned reg_size = instr->SixtyFourBits() ? kXRegSize : kWRegSize;
+  int64_t reg_mask = instr->SixtyFourBits() ? kXRegMask : kWRegMask;
+  int64_t R = instr->ImmR();
+  int64_t S = instr->ImmS();
+  int64_t diff = S - R;
+  int64_t mask;
+  if (diff >= 0) {
+    mask = diff < reg_size - 1 ? (1L << (diff + 1)) - 1
+                               : reg_mask;
+  } else {
+    mask = ((1L << (S + 1)) - 1);
+    mask = (static_cast<uint64_t>(mask) >> R) | (mask << (reg_size - R));
+    diff += reg_size;
+  }
+
+  // inzero indicates whether the extracted bitfield is inserted into the
+  // existing destination register value or into zero.
+  // If extend is true, the sign of the extracted bitfield is extended.
+  bool inzero = false;
+  bool extend = false;
+  switch (instr->Mask(BitfieldMask)) {
+    case BFM_x:
+    case BFM_w:
+      break;
+    case SBFM_x:
+    case SBFM_w:
+      inzero = true;
+      extend = true;
+      break;
+    case UBFM_x:
+    case UBFM_w:
+      inzero = true;
+      break;
+    default:
+      UNIMPLEMENTED();
+  }
+
+  int64_t dst = inzero ? 0 : reg(reg_size, instr->Rd());
+  int64_t src = reg(reg_size, instr->Rn());
+  // Rotate source bitfield into place.
+  int64_t result = (static_cast<uint64_t>(src) >> R) | (src << (reg_size - R));
+  // Determine the sign extension.
+  int64_t topbits = ((1L << (reg_size - diff - 1)) - 1) << (diff + 1);
+  int64_t signbits = extend && ((src >> S) & 1) ? topbits : 0;
+
+  // Merge sign extension, dest/zero and bitfield.
+  result = signbits | (result & mask) | (dst & ~mask);
+
+  set_reg(reg_size, instr->Rd(), result);
+}
+
+
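+// Implements EXTR: the result is the register-sized field starting at bit
+// 'lsb' of the double-width concatenation Rn:Rm.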
+void Simulator::VisitExtract(Instruction* instr) {
+  unsigned lsb = instr->ImmS();
+  unsigned reg_size = (instr->SixtyFourBits() == 1) ? kXRegSize
+                                                    : kWRegSize;
+  set_reg(reg_size,
+          instr->Rd(),
+          (static_cast<uint64_t>(reg(reg_size, instr->Rm())) >> lsb) |
+          (reg(reg_size, instr->Rn()) << (reg_size - lsb)));
+}
+
+
+void Simulator::VisitFPImmediate(Instruction* instr) {
+  unsigned dest = instr->Rd();
+  switch (instr->Mask(FPImmediateMask)) {
+    case FMOV_s_imm: set_sreg(dest, instr->ImmFP32()); break;
+    case FMOV_d_imm: set_dreg(dest, instr->ImmFP64()); break;
+    default: UNREACHABLE();
+  }
+}
+
+
+void Simulator::VisitFPIntegerConvert(Instruction* instr) {
+  unsigned dst = instr->Rd();
+  unsigned src = instr->Rn();
+
+  switch (instr->Mask(FPIntegerConvertMask)) {
+    case FCVTMS_ws:
+      set_wreg(dst, FPToInt32(sreg(src), FPNegativeInfinity));
+      break;
+    case FCVTMS_xs:
+      set_xreg(dst, FPToInt64(sreg(src), FPNegativeInfinity));
+      break;
+    case FCVTMS_wd:
+      set_wreg(dst, FPToInt32(dreg(src), FPNegativeInfinity));
+      break;
+    case FCVTMS_xd:
+      set_xreg(dst, FPToInt64(dreg(src), FPNegativeInfinity));
+      break;
+    case FCVTMU_ws:
+      set_wreg(dst, FPToUInt32(sreg(src), FPNegativeInfinity));
+      break;
+    case FCVTMU_xs:
+      set_xreg(dst, FPToUInt64(sreg(src), FPNegativeInfinity));
+      break;
+    case FCVTMU_wd:
+      set_wreg(dst, FPToUInt32(dreg(src), FPNegativeInfinity));
+      break;
+    case FCVTMU_xd:
+      set_xreg(dst, FPToUInt64(dreg(src), FPNegativeInfinity));
+      break;
+    case FCVTNS_ws: set_wreg(dst, FPToInt32(sreg(src), FPTieEven)); break;
+    case FCVTNS_xs: set_xreg(dst, FPToInt64(sreg(src), FPTieEven)); break;
+    case FCVTNS_wd: set_wreg(dst, FPToInt32(dreg(src), FPTieEven)); break;
+    case FCVTNS_xd: set_xreg(dst, FPToInt64(dreg(src), FPTieEven)); break;
+    case FCVTNU_ws: set_wreg(dst, FPToUInt32(sreg(src), FPTieEven)); break;
+    case FCVTNU_xs: set_xreg(dst, FPToUInt64(sreg(src), FPTieEven)); break;
+    case FCVTNU_wd: set_wreg(dst, FPToUInt32(dreg(src), FPTieEven)); break;
+    case FCVTNU_xd: set_xreg(dst, FPToUInt64(dreg(src), FPTieEven)); break;
+    case FCVTZS_ws: set_wreg(dst, FPToInt32(sreg(src), FPZero)); break;
+    case FCVTZS_xs: set_xreg(dst, FPToInt64(sreg(src), FPZero)); break;
+    case FCVTZS_wd: set_wreg(dst, FPToInt32(dreg(src), FPZero)); break;
+    case FCVTZS_xd: set_xreg(dst, FPToInt64(dreg(src), FPZero)); break;
+    case FCVTZU_ws: set_wreg(dst, FPToUInt32(sreg(src), FPZero)); break;
+    case FCVTZU_xs: set_xreg(dst, FPToUInt64(sreg(src), FPZero)); break;
+    case FCVTZU_wd: set_wreg(dst, FPToUInt32(dreg(src), FPZero)); break;
+    case FCVTZU_xd: set_xreg(dst, FPToUInt64(dreg(src), FPZero)); break;
+    case FMOV_ws: set_wreg(dst, sreg_bits(src)); break;
+    case FMOV_xd: set_xreg(dst, dreg_bits(src)); break;
+    case FMOV_sw: set_sreg_bits(dst, wreg(src)); break;
+    case FMOV_dx: set_dreg_bits(dst, xreg(src)); break;
+
+    // We only support conversions to double, and only when that double can
+    // exactly represent a given integer. This means all 32-bit integers, and a
+    // subset of 64-bit integers.
+    case SCVTF_dw: {
+      set_dreg(dst, wreg(src));
+      break;
+    }
+    case SCVTF_dx: {
+      double value = static_cast<double>(xreg(src));
+      ASSERT(static_cast<int64_t>(value) == xreg(src));
+      set_dreg(dst, static_cast<int64_t>(value));
+      break;
+    }
+    case UCVTF_dw: {
+      set_dreg(dst, static_cast<uint32_t>(wreg(src)));
+      break;
+    }
+    case UCVTF_dx: {
+      double value = static_cast<double>(static_cast<uint64_t>(xreg(src)));
+      ASSERT(static_cast<uint64_t>(value) == static_cast<uint64_t>(xreg(src)));
+      set_dreg(dst, static_cast<uint64_t>(value));
+      break;
+    }
+    default: UNIMPLEMENTED();
+  }
+}
+
+
+void Simulator::VisitFPFixedPointConvert(Instruction* instr) {
+  unsigned dst = instr->Rd();
+  unsigned src = instr->Rn();
+  int fbits = 64 - instr->FPScale();
+
+  // We only support two cases: unsigned and signed conversion from fixed point
+  // values in X registers to floating point values in D registers. We rely on
+  // casting to convert from integer to floating point, and assert that the
+  // fractional part of the number is zero.
+  switch (instr->Mask(FPFixedPointConvertMask)) {
+    case UCVTF_dx_fixed: {
+      uint64_t value = static_cast<uint64_t>(xreg(src));
+      ASSERT((value & ((1UL << fbits) - 1)) == 0);
+      set_dreg(dst, static_cast<double>(value >> fbits));
+      break;
+    }
+    case SCVTF_dx_fixed: {
+      int64_t value = xreg(src);
+      ASSERT((value & ((1UL << fbits) - 1)) == 0);
+      set_dreg(dst, static_cast<double>(value >> fbits));
+      break;
+    }
+    default: UNIMPLEMENTED();
+  }
+}
+
+
+int32_t Simulator::FPToInt32(double value, FPRounding rmode) {
+  value = FPRoundInt(value, rmode);
+  if (value >= kWMaxInt) {
+    return kWMaxInt;
+  } else if (value < kWMinInt) {
+    return kWMinInt;
+  }
+  return isnan(value) ? 0 : static_cast<int32_t>(value);
+}
+
+
+int64_t Simulator::FPToInt64(double value, FPRounding rmode) {
+  value = FPRoundInt(value, rmode);
+  if (value >= kXMaxInt) {
+    return kXMaxInt;
+  } else if (value < kXMinInt) {
+    return kXMinInt;
+  }
+  return isnan(value) ? 0 : static_cast<int64_t>(value);
+}
+
+
+uint32_t Simulator::FPToUInt32(double value, FPRounding rmode) {
+  value = FPRoundInt(value, rmode);
+  if (value >= kWMaxUInt) {
+    return kWMaxUInt;
+  } else if (value < 0.0) {
+    return 0;
+  }
+  return isnan(value) ? 0 : static_cast<uint32_t>(value);
+}
+
+
+uint64_t Simulator::FPToUInt64(double value, FPRounding rmode) {
+  value = FPRoundInt(value, rmode);
+  if (value >= kXMaxUInt) {
+    return kXMaxUInt;
+  } else if (value < 0.0) {
+    return 0;
+  }
+  return isnan(value) ? 0 : static_cast<uint64_t>(value);
+}
+
+
+void Simulator::VisitFPCompare(Instruction* instr) {
+  unsigned reg_size = instr->FPType() == FP32 ? kSRegSize : kDRegSize;
+  double fn_val = fpreg(reg_size, instr->Rn());
+
+  switch (instr->Mask(FPCompareMask)) {
+    case FCMP_s:
+    case FCMP_d: FPCompare(fn_val, fpreg(reg_size, instr->Rm())); break;
+    case FCMP_s_zero:
+    case FCMP_d_zero: FPCompare(fn_val, 0.0); break;
+    default: UNIMPLEMENTED();
+  }
+}
+
+
+void Simulator::VisitFPConditionalCompare(Instruction* instr) {
+  switch (instr->Mask(FPConditionalCompareMask)) {
+    case FCCMP_s:
+    case FCCMP_d: {
+      if (ConditionPassed(static_cast<Condition>(instr->Condition()))) {
+        // If the condition passes, set the status flags to the result of
+        // comparing the operands.
+        unsigned reg_size = instr->FPType() == FP32 ? kSRegSize : kDRegSize;
+        FPCompare(fpreg(reg_size, instr->Rn()), fpreg(reg_size, instr->Rm()));
+      } else {
+        // If the condition fails, set the status flags to the nzcv immediate.
+        SetFlags(instr->Nzcv() << Flags_offset);
+      }
+      break;
+    }
+    default: UNIMPLEMENTED();
+  }
+}
+
+
+void Simulator::VisitFPConditionalSelect(Instruction* instr) {
+  unsigned reg_size = instr->FPType() == FP32 ? kSRegSize : kDRegSize;
+
+  double selected_val;
+  if (ConditionPassed(static_cast<Condition>(instr->Condition()))) {
+    selected_val = fpreg(reg_size, instr->Rn());
+  } else {
+    selected_val = fpreg(reg_size, instr->Rm());
+  }
+
+  switch (instr->Mask(FPConditionalSelectMask)) {
+    case FCSEL_s:
+    case FCSEL_d: set_fpreg(reg_size, instr->Rd(), selected_val); break;
+    default: UNIMPLEMENTED();
+  }
+}
+
+
+void Simulator::VisitFPDataProcessing1Source(Instruction* instr) {
+  unsigned fd = instr->Rd();
+  unsigned fn = instr->Rn();
+
+  switch (instr->Mask(FPDataProcessing1SourceMask)) {
+    case FMOV_s: set_sreg(fd, sreg(fn)); break;
+    case FMOV_d: set_dreg(fd, dreg(fn)); break;
+    case FABS_s: set_sreg(fd, fabs(sreg(fn))); break;
+    case FABS_d: set_dreg(fd, fabs(dreg(fn))); break;
+    case FNEG_s: set_sreg(fd, -sreg(fn)); break;
+    case FNEG_d: set_dreg(fd, -dreg(fn)); break;
+    case FSQRT_s: set_sreg(fd, sqrt(sreg(fn))); break;
+    case FSQRT_d: set_dreg(fd, sqrt(dreg(fn))); break;
+    case FRINTN_s: set_sreg(fd, FPRoundInt(sreg(fn), FPTieEven)); break;
+    case FRINTN_d: set_dreg(fd, FPRoundInt(dreg(fn), FPTieEven)); break;
+    case FRINTZ_s: set_sreg(fd, FPRoundInt(sreg(fn), FPZero)); break;
+    case FRINTZ_d: set_dreg(fd, FPRoundInt(dreg(fn), FPZero)); break;
+    case FCVT_ds: set_dreg(fd, sreg(fn)); break;
+    default: UNIMPLEMENTED();
+  }
+}
+
+
+double Simulator::FPRoundInt(double value, FPRounding round_mode) {
+  if ((value == 0.0) || (value == kFP64PositiveInfinity) ||
+      (value == kFP64NegativeInfinity) || isnan(value)) {
+    return value;
+  }
+
+  double int_result = floor(value);
+  double error = value - int_result;
+  switch (round_mode) {
+    case FPTieEven: {
+      // If the error is greater than 0.5, or is equal to 0.5 and the integer
+      // result is odd, round up.
+      if ((error > 0.5) ||
+          ((error == 0.5) && (fmod(int_result, 2) != 0))) {
+        int_result++;
+      }
+      break;
+    }
+    case FPZero: {
+      // Round towards zero: use floor(value) for positive values and
+      // ceil(value) for negative values.
+      if (value < 0) {
+         int_result = ceil(value);
+      }
+      break;
+    }
+    case FPNegativeInfinity: {
+      // We always use floor(value).
+      break;
+    }
+    default: UNIMPLEMENTED();
+  }
+  return int_result;
+}
+
+
+void Simulator::VisitFPDataProcessing2Source(Instruction* instr) {
+  unsigned fd = instr->Rd();
+  unsigned fn = instr->Rn();
+  unsigned fm = instr->Rm();
+
+  switch (instr->Mask(FPDataProcessing2SourceMask)) {
+    case FADD_s: set_sreg(fd, sreg(fn) + sreg(fm)); break;
+    case FADD_d: set_dreg(fd, dreg(fn) + dreg(fm)); break;
+    case FSUB_s: set_sreg(fd, sreg(fn) - sreg(fm)); break;
+    case FSUB_d: set_dreg(fd, dreg(fn) - dreg(fm)); break;
+    case FMUL_s: set_sreg(fd, sreg(fn) * sreg(fm)); break;
+    case FMUL_d: set_dreg(fd, dreg(fn) * dreg(fm)); break;
+    case FDIV_s: set_sreg(fd, sreg(fn) / sreg(fm)); break;
+    case FDIV_d: set_dreg(fd, dreg(fn) / dreg(fm)); break;
+    case FMAX_s: set_sreg(fd, FPMax(sreg(fn), sreg(fm))); break;
+    case FMAX_d: set_dreg(fd, FPMax(dreg(fn), dreg(fm))); break;
+    case FMIN_s: set_sreg(fd, FPMin(sreg(fn), sreg(fm))); break;
+    case FMIN_d: set_dreg(fd, FPMin(dreg(fn), dreg(fm))); break;
+    default: UNIMPLEMENTED();
+  }
+}
+
+
+void Simulator::VisitFPDataProcessing3Source(Instruction* instr) {
+  unsigned fd = instr->Rd();
+  unsigned fn = instr->Rn();
+  unsigned fm = instr->Rm();
+  unsigned fa = instr->Ra();
+
+  // Note: The FMSUB implementation here is not precisely the same as the
+  // instruction definition. In a full implementation, rounding of results
+  // would occur once at the end; here, rounding occurs after the first
+  // multiply and then again after the subsequent addition. A full
+  // implementation would be possible, but would require an effort that isn't
+  // immediately justified given the small differences we expect to see in
+  // most cases.
+
+  switch (instr->Mask(FPDataProcessing3SourceMask)) {
+    case FMSUB_s: set_sreg(fd, sreg(fa) + (-sreg(fn))*sreg(fm)); break;
+    case FMSUB_d: set_dreg(fd, dreg(fa) + (-dreg(fn))*dreg(fm)); break;
+    default: UNIMPLEMENTED();
+  }
+}
+
+
+double Simulator::FPMax(double a, double b) {
+  if (isnan(a)) {
+    return a;
+  } else if (isnan(b)) {
+    return b;
+  }
+
+  if ((a == 0.0) && (b == 0.0) &&
+      (copysign(1.0, a) != copysign(1.0, b))) {
+    // a and b are zero, and the sign differs: return +0.0.
+    return 0.0;
+  } else {
+    return (a > b) ? a : b;
+  }
+}
+
+
+double Simulator::FPMin(double a, double b) {
+  if (isnan(a)) {
+    return a;
+  } else if (isnan(b)) {
+    return b;
+  }
+
+  if ((a == 0.0) && (b == 0.0) &&
+      (copysign(1.0, a) != copysign(1.0, b))) {
+    // a and b are zero, and the sign differs: return -0.0.
+    return -0.0;
+  } else {
+    return (a < b) ? a : b;
+  }
+}
+
+
+void Simulator::VisitSystem(Instruction* instr) {
+  // Some system instructions hijack their Op and Cp fields to represent a
+  // range of immediates instead of indicating a different instruction. This
+  // makes the decoding tricky.
+  if (instr->Mask(SystemSysRegFMask) == SystemSysRegFixed) {
+    switch (instr->Mask(SystemSysRegMask)) {
+      case MRS: {
+        switch (instr->ImmSystemRegister()) {
+          case NZCV: set_xreg(instr->Rt(), nzcv()); break;
+          default: UNIMPLEMENTED();
+        }
+        break;
+      }
+      case MSR: {
+        switch (instr->ImmSystemRegister()) {
+          case NZCV:
+            SetFlags(xreg(instr->Rt()) & kConditionFlagsMask);
+            break;
+          default: UNIMPLEMENTED();
+        }
+        break;
+      }
+    }
+  } else if (instr->Mask(SystemHintFMask) == SystemHintFixed) {
+    ASSERT(instr->Mask(SystemHintMask) == HINT);
+    switch (instr->ImmHint()) {
+      case NOP: break;
+      default: UNIMPLEMENTED();
+    }
+  } else {
+    UNIMPLEMENTED();
+  }
+}
+
+
+void Simulator::VisitException(Instruction* instr) {
+  switch (instr->Mask(ExceptionMask)) {
+    case BRK: HostBreakpoint(); break;
+    case HLT:
+      // The Printf pseudo instruction is so useful that we include it in the
+      // default simulator.
+      if (instr->ImmException() == kPrintfOpcode) {
+        DoPrintf(instr);
+      } else {
+        HostBreakpoint();
+      }
+      break;
+    default:
+      UNIMPLEMENTED();
+  }
+}
+
+
+void Simulator::DoPrintf(Instruction* instr) {
+  ASSERT((instr->Mask(ExceptionMask) == HLT) &&
+         (instr->ImmException() == kPrintfOpcode));
+
+  // Read the argument encoded inline in the instruction stream.
+  uint32_t type;
+  ASSERT(sizeof(*instr) == 1);
+  memcpy(&type, instr + kPrintfTypeOffset, sizeof(type));
+
+  const char * format = reinterpret_cast<const char *>(x0());
+  ASSERT(format != NULL);
+
+  // Pass all of the relevant PCS registers on to printf. It doesn't matter
+  // if we pass too many, as the extra ones won't be read.
+  int result = 0;
+  if (type == CPURegister::kRegister) {
+    result = printf(format, x1(), x2(), x3(), x4(), x5(), x6(), x7());
+  } else if (type == CPURegister::kFPRegister) {
+    result = printf(format, d0(), d1(), d2(), d3(), d4(), d5(), d6(), d7());
+  } else {
+    ASSERT(type == CPURegister::kNoRegister);
+    result = printf("%s", format);
+  }
+  set_x0(result);
+
+  // TODO: Clobber all caller-saved registers here, to ensure no assumptions
+  // are made about preserved state.
+
+  // The printf parameters are inlined in the code, so skip them.
+  set_pc(instr->InstructionAtOffset(kPrintfLength));
+
+  // Set LR as if we'd just called a native printf function.
+  set_lr(reinterpret_cast<uint64_t>(pc()));
+}
+
+}  // namespace vixl
diff --git a/src/a64/simulator-a64.h b/src/a64/simulator-a64.h
new file mode 100644
index 0000000..115896c
--- /dev/null
+++ b/src/a64/simulator-a64.h
@@ -0,0 +1,476 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_A64_SIMULATOR_A64_H_
+#define VIXL_A64_SIMULATOR_A64_H_
+
+#include "globals.h"
+#include "utils.h"
+#include "a64/instructions-a64.h"
+#include "a64/assembler-a64.h"
+#include "a64/disasm-a64.h"
+
+namespace vixl {
+
+enum ReverseByteMode {
+  Reverse16 = 0,
+  Reverse32 = 1,
+  Reverse64 = 2
+};
+
+// Printf. See debugger-a64.h for more information on pseudo instructions.
+//  - type: CPURegister::RegisterType stored as a uint32_t.
+//
+// Simulate a call to printf.
+//
+// Floating-point and integer arguments are passed in separate sets of
+// registers in AAPCS64 (even for varargs functions), so it is not possible to
+// determine the type or location of each argument without some information
+// about the values that were passed in. This information could be retrieved
+// from the printf format string, but the format string is not trivial to
+// parse, so we encode the relevant information with the HLT instruction under
+// the type argument. Therefore the interface is:
+//    x0: The format string
+// x1-x7: Optional arguments, if type == CPURegister::kRegister
+// d0-d7: Optional arguments, if type == CPURegister::kFPRegister
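+//
+// In the instruction stream the pseudo instruction occupies two instruction
+// slots: the HLT #kPrintfOpcode marker itself, followed by one 32-bit word
+// holding the 'type' argument.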
+const Instr kPrintfOpcode = 0xdeb1;
+const unsigned kPrintfTypeOffset = 1 * kInstructionSize;
+const unsigned kPrintfLength = 2 * kInstructionSize;
+
+class Simulator : public DecoderVisitor {
+ public:
+  explicit Simulator(Decoder* decoder, FILE* stream = stdout);
+  ~Simulator();
+
+  void ResetState();
+
+  // TODO: We assume a little-endian host and rely on the way the members of
+  // this union overlay. Add tests to ensure this, or fix the accessors so
+  // that they no longer require this assumption.
+  union SimRegister {
+    int64_t x;
+    int32_t w;
+  };
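+  // On a little-endian host, writing the 'x' member and then reading 'w'
+  // yields the low 32 bits of the value, which is what the W register
+  // accessors rely on.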
+
+  union SimFPRegister {
+    double d;
+    float s;
+  };
+
+  // Run the simulator.
+  virtual void Run();
+  void RunFrom(Instruction* first);
+
+  // Simulation helpers.
+  inline Instruction* pc() { return pc_; }
+  inline void set_pc(Instruction* new_pc) {
+    pc_ = new_pc;
+    pc_modified_ = true;
+  }
+
+  inline void increment_pc() {
+    if (!pc_modified_) {
+      pc_ = pc_->NextInstruction();
+    }
+
+    pc_modified_ = false;
+  }
+
+  inline void ExecuteInstruction() {
+    // The program counter should always be aligned.
+    ASSERT(IsWordAligned(pc_));
+    decoder_->Decode(pc_);
+    increment_pc();
+  }
+
+  // Declare all Visitor functions.
+  #define DECLARE(A)  void Visit##A(Instruction* instr);
+  VISITOR_LIST(DECLARE)
+  #undef DECLARE
+
+  // Register accessors.
+  inline int32_t wreg(unsigned code,
+                      Reg31Mode r31mode = Reg31IsZeroRegister) const {
+    ASSERT(code < kNumberOfRegisters);
+    if ((code == 31) && (r31mode == Reg31IsZeroRegister)) {
+      return 0;
+    }
+    return registers_[code].w;
+  }
+
+  inline int64_t xreg(unsigned code,
+                      Reg31Mode r31mode = Reg31IsZeroRegister) const {
+    ASSERT(code < kNumberOfRegisters);
+    if ((code == 31) && (r31mode == Reg31IsZeroRegister)) {
+      return 0;
+    }
+    return registers_[code].x;
+  }
+
+  inline int64_t reg(unsigned size,
+                     unsigned code,
+                     Reg31Mode r31mode = Reg31IsZeroRegister) const {
+    switch (size) {
+      case kWRegSize: return wreg(code, r31mode) & kWRegMask;
+      case kXRegSize: return xreg(code, r31mode);
+      default:
+        UNREACHABLE();
+        return 0;
+    }
+  }
+
+  inline void set_wreg(unsigned code, int32_t value,
+                       Reg31Mode r31mode = Reg31IsZeroRegister) {
+    ASSERT(code < kNumberOfRegisters);
+    if ((code == kZeroRegCode) && (r31mode == Reg31IsZeroRegister)) {
+      return;
+    }
+    registers_[code].x = 0;  // First clear the register top bits.
+    registers_[code].w = value;
+  }
+
+  inline void set_xreg(unsigned code, int64_t value,
+                       Reg31Mode r31mode = Reg31IsZeroRegister) {
+    ASSERT(code < kNumberOfRegisters);
+    if ((code == kZeroRegCode) && (r31mode == Reg31IsZeroRegister)) {
+      return;
+    }
+    registers_[code].x = value;
+  }
+
+  inline void set_reg(unsigned size, unsigned code, int64_t value,
+                      Reg31Mode r31mode = Reg31IsZeroRegister) {
+    switch (size) {
+      case kWRegSize:
+        return set_wreg(code, static_cast<int32_t>(value & 0xffffffff),
+                        r31mode);
+      case kXRegSize:
+        return set_xreg(code, value, r31mode);
+      default:
+        UNREACHABLE();
+        break;
+    }
+  }
+
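+  // Define w<n>(), x<n>(), set_w<n>() and set_x<n>() accessors for each
+  // register code in REGISTER_CODE_LIST.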
+  #define REG_ACCESSORS(N)                                 \
+  inline int32_t w##N() { return wreg(N); }                \
+  inline int64_t x##N() { return xreg(N); }                \
+  inline void set_w##N(int32_t val) { set_wreg(N, val); }  \
+  inline void set_x##N(int64_t val) { set_xreg(N, val); }
+  REGISTER_CODE_LIST(REG_ACCESSORS)
+  #undef REG_ACCESSORS
+
+  // Aliases.
+  #define REG_ALIAS_ACCESSORS(N, wname, xname)                \
+  inline int32_t wname() { return wreg(N); }                  \
+  inline int64_t xname() { return xreg(N); }                  \
+  inline void set_##wname(int32_t val) { set_wreg(N, val); }  \
+  inline void set_##xname(int64_t val) { set_xreg(N, val); }
+  REG_ALIAS_ACCESSORS(30, wlr, lr);
+  #undef REG_ALIAS_ACCESSORS
+
+  // The stack pointer (register 31) is a special case in AArch64.
+  inline int32_t wsp() { return wreg(31, Reg31IsStackPointer); }
+  inline int64_t sp() { return xreg(31, Reg31IsStackPointer); }
+  inline void set_wsp(int32_t val) {
+    set_wreg(31, val, Reg31IsStackPointer);
+  }
+  inline void set_sp(int64_t val) {
+    set_xreg(31, val, Reg31IsStackPointer);
+  }
+
+  // FPRegister accessors.
+  inline float sreg(unsigned code) const {
+    ASSERT(code < kNumberOfFPRegisters);
+    return fpregisters_[code].s;
+  }
+
+  inline uint32_t sreg_bits(unsigned code) const {
+    return float_to_rawbits(sreg(code));
+  }
+
+  inline double dreg(unsigned code) const {
+    ASSERT(code < kNumberOfFPRegisters);
+    return fpregisters_[code].d;
+  }
+
+  inline uint64_t dreg_bits(unsigned code) const {
+    return double_to_rawbits(dreg(code));
+  }
+
+  inline double fpreg(unsigned size, unsigned code) const {
+    switch (size) {
+      case kSRegSize: return sreg(code);
+      case kDRegSize: return dreg(code);
+      default: {
+        UNREACHABLE();
+        return 0.0;
+      }
+    }
+  }
+
+  inline void set_sreg(unsigned code, float val) {
+    ASSERT(code < kNumberOfFPRegisters);
+    // Ensure that the upper word is set to 0.
+    set_dreg_bits(code, 0);
+
+    fpregisters_[code].s = val;
+  }
+
+  inline void set_sreg_bits(unsigned code, uint32_t rawbits) {
+    ASSERT(code < kNumberOfFPRegisters);
+    // Ensure that the upper word is set to 0.
+    set_dreg_bits(code, 0);
+
+    set_sreg(code, rawbits_to_float(rawbits));
+  }
+
+  inline void set_dreg(unsigned code, double val) {
+    ASSERT(code < kNumberOfFPRegisters);
+    fpregisters_[code].d = val;
+  }
+
+  inline void set_dreg_bits(unsigned code, uint64_t rawbits) {
+    ASSERT(code < kNumberOfFPRegisters);
+    set_dreg(code, rawbits_to_double(rawbits));
+  }
+
+  inline void set_fpreg(unsigned size, unsigned code, double value) {
+    switch (size) {
+      case kSRegSize:
+        return set_sreg(code, value);
+      case kDRegSize:
+        return set_dreg(code, value);
+      default:
+        UNREACHABLE();
+        break;
+    }
+  }
+
+  #define FPREG_ACCESSORS(N)                             \
+  inline float s##N() { return sreg(N); }                \
+  inline double d##N() { return dreg(N); }               \
+  inline void set_s##N(float val) { set_sreg(N, val); }  \
+  inline void set_d##N(double val) { set_dreg(N, val); }
+  REGISTER_CODE_LIST(FPREG_ACCESSORS)
+  #undef FPREG_ACCESSORS
+
+  bool N() { return (psr_ & NFlag) != 0; }
+  bool Z() { return (psr_ & ZFlag) != 0; }
+  bool C() { return (psr_ & CFlag) != 0; }
+  bool V() { return (psr_ & VFlag) != 0; }
+  uint32_t nzcv() { return psr_ & (NFlag | ZFlag | CFlag | VFlag); }
+
+  // Debug helpers
+  void PrintFlags(bool print_all = false);
+  void PrintRegisters(bool print_all_regs = false);
+  void PrintFPRegisters(bool print_all_regs = false);
+  void PrintProcessorState();
+
+  static const char* WRegNameForCode(unsigned code,
+                                     Reg31Mode mode = Reg31IsZeroRegister);
+  static const char* XRegNameForCode(unsigned code,
+                                     Reg31Mode mode = Reg31IsZeroRegister);
+  static const char* SRegNameForCode(unsigned code);
+  static const char* DRegNameForCode(unsigned code);
+  static const char* VRegNameForCode(unsigned code);
+
+  inline bool coloured_trace() { return coloured_trace_; }
+  inline void set_coloured_trace(bool value) { coloured_trace_ = value; }
+
+  inline bool disasm_trace() { return disasm_trace_; }
+  inline void set_disasm_trace(bool value) {
+    if (value != disasm_trace_) {
+      if (value) {
+        decoder_->InsertVisitorBefore(print_disasm_, this);
+      } else {
+        decoder_->RemoveVisitor(print_disasm_);
+      }
+      disasm_trace_ = value;
+    }
+  }
+
+ protected:
+  // Simulation helpers ------------------------------------
+  bool ConditionPassed(Condition cond) {
+    switch (cond) {
+      case eq:
+        return Z();
+      case ne:
+        return !Z();
+      case hs:
+        return C();
+      case lo:
+        return !C();
+      case mi:
+        return N();
+      case pl:
+        return !N();
+      case vs:
+        return V();
+      case vc:
+        return !V();
+      case hi:
+        return C() && !Z();
+      case ls:
+        return !(C() && !Z());
+      case ge:
+        return N() == V();
+      case lt:
+        return N() != V();
+      case gt:
+        return !Z() && (N() == V());
+      case le:
+        return !(!Z() && (N() == V()));
+      case al:
+        return true;
+      default:
+        UNREACHABLE();
+        return false;
+    }
+  }
+
+  bool ConditionFailed(Condition cond) {
+    return !ConditionPassed(cond);
+  }
+
+  void AddSubHelper(Instruction* instr, int64_t op2);
+  int64_t AddWithCarry(unsigned reg_size,
+                       bool set_flags,
+                       int64_t src1,
+                       int64_t src2,
+                       int64_t carry_in = 0);
+  void LogicalHelper(Instruction* instr, int64_t op2);
+  void ConditionalCompareHelper(Instruction* instr, int64_t op2);
+  void LoadStoreHelper(Instruction* instr,
+                       int64_t offset,
+                       AddrMode addrmode);
+  void LoadStorePairHelper(Instruction* instr, AddrMode addrmode);
+  uint8_t* AddressModeHelper(unsigned addr_reg,
+                             int64_t offset,
+                             AddrMode addrmode);
+
+  uint64_t MemoryRead(const uint8_t* address, unsigned num_bytes);
+  uint8_t MemoryRead8(uint8_t* address);
+  uint16_t MemoryRead16(uint8_t* address);
+  uint32_t MemoryRead32(uint8_t* address);
+  float MemoryReadFP32(uint8_t* address);
+  uint64_t MemoryRead64(uint8_t* address);
+  double MemoryReadFP64(uint8_t* address);
+
+  void MemoryWrite(uint8_t* address, uint64_t value, unsigned num_bytes);
+  void MemoryWrite32(uint8_t* address, uint32_t value);
+  void MemoryWriteFP32(uint8_t* address, float value);
+  void MemoryWrite64(uint8_t* address, uint64_t value);
+  void MemoryWriteFP64(uint8_t* address, double value);
+
+  int64_t ShiftOperand(unsigned reg_size,
+                       int64_t value,
+                       Shift shift_type,
+                       unsigned amount);
+  int64_t Rotate(unsigned reg_width,
+                 int64_t value,
+                 Shift shift_type,
+                 unsigned amount);
+  int64_t ExtendValue(unsigned reg_width,
+                      int64_t value,
+                      Extend extend_type,
+                      unsigned left_shift = 0);
+
+  uint64_t ReverseBits(uint64_t value, unsigned num_bits);
+  uint64_t ReverseBytes(uint64_t value, ReverseByteMode mode);
+
+  void FPCompare(double val0, double val1);
+  double FPRoundInt(double value, FPRounding round_mode);
+  int32_t FPToInt32(double value, FPRounding rmode);
+  int64_t FPToInt64(double value, FPRounding rmode);
+  uint32_t FPToUInt32(double value, FPRounding rmode);
+  uint64_t FPToUInt64(double value, FPRounding rmode);
+  double FPMax(double a, double b);
+  double FPMin(double a, double b);
+
+  // Pseudo Printf instruction
+  void DoPrintf(Instruction* instr);
+
+  // Processor state ---------------------------------------
+
+  // Output stream.
+  FILE* stream_;
+  PrintDisassembler* print_disasm_;
+
+  // General purpose registers. Register 31 is the stack pointer.
+  SimRegister registers_[kNumberOfRegisters];
+
+  // Floating point registers
+  SimFPRegister fpregisters_[kNumberOfFPRegisters];
+
+  // Program Status Register.
+  // bits[31, 27]: Condition flags N, Z, C, and V.
+  //               (Negative, Zero, Carry, Overflow)
+  uint32_t psr_;
+
+  // Condition flags.
+  void SetFlags(uint32_t new_flags);
+
+  static inline uint32_t CalcNFlag(int64_t result, unsigned reg_size) {
+    return ((result >> (reg_size - 1)) & 1) * NFlag;
+  }
+
+  static inline uint32_t CalcZFlag(int64_t result) {
+    return (result == 0) ? static_cast<uint32_t>(ZFlag) : 0;
+  }
+
+  static const uint32_t kConditionFlagsMask = 0xf0000000;
+
+  // Stack
+  byte* stack_;
+  static const int stack_protection_size_ = 256;
+  // 2 KB stack.
+  static const int stack_size_ = 2 * 1024 + 2 * stack_protection_size_;
+  byte* stack_limit_;
+
+  Decoder* decoder_;
+  // Indicates if the pc has been modified by the instruction and should not be
+  // automatically incremented.
+  bool pc_modified_;
+  Instruction* pc_;
+
+  static const char* xreg_names[];
+  static const char* wreg_names[];
+  static const char* sreg_names[];
+  static const char* dreg_names[];
+  static const char* vreg_names[];
+
+  static const Instruction* kEndOfSimAddress;
+
+ private:
+  bool coloured_trace_;
+  // Indicates whether the disassembly trace is active.
+  bool disasm_trace_;
+};
+}  // namespace vixl
+
+#endif  // VIXL_A64_SIMULATOR_A64_H_
diff --git a/src/globals.h b/src/globals.h
new file mode 100644
index 0000000..859ea69
--- /dev/null
+++ b/src/globals.h
@@ -0,0 +1,66 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_GLOBALS_H
+#define VIXL_GLOBALS_H
+
+// Get the standard printf format macros for C99 stdint types.
+#define __STDC_FORMAT_MACROS
+#include <inttypes.h>
+
+#include <assert.h>
+#include <stdarg.h>
+#include <stdio.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <stddef.h>
+#include "platform.h"
+
+
+typedef uint8_t byte;
+
+const int KBytes = 1024;
+const int MBytes = 1024 * KBytes;
+const int GBytes = 1024 * MBytes;
+
+#define ABORT() printf("in %s, line %i", __FILE__, __LINE__); abort()
+#ifdef DEBUG
+  #define ASSERT(condition) assert(condition)
+  #define CHECK(condition) ASSERT(condition)
+  #define UNIMPLEMENTED() printf("UNIMPLEMENTED\t"); ABORT()
+  #define UNREACHABLE() printf("UNREACHABLE\t"); ABORT()
+#else
+  #define ASSERT(condition) ((void) 0)
+  #define CHECK(condition) assert(condition)
+  #define UNIMPLEMENTED() ((void) 0)
+  #define UNREACHABLE() ((void) 0)
+#endif
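+// Note that CHECK() still expands to assert() even when DEBUG is not defined
+// (so it stays active unless NDEBUG is set), whereas ASSERT(), UNIMPLEMENTED()
+// and UNREACHABLE() expand to no-ops.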
+
+template <typename T> inline void USE(T) {}
+
+#define ALIGNMENT_EXCEPTION() printf("ALIGNMENT EXCEPTION\t"); ABORT()
+
+#endif  // VIXL_GLOBALS_H
diff --git a/src/platform.h b/src/platform.h
new file mode 100644
index 0000000..a2600f3
--- /dev/null
+++ b/src/platform.h
@@ -0,0 +1,43 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef PLATFORM_H
+#define PLATFORM_H
+
+// Define platform specific functionalities.
+
+namespace vixl {
+#ifdef USE_SIMULATOR
+// Currently we assume running the simulator implies running on x86 hardware.
+inline void HostBreakpoint() { asm("int3"); }
+#else
+inline void HostBreakpoint() {
+  // TODO: Implement HostBreakpoint on a64.
+}
+#endif
+}  // namespace vixl
+
+#endif
diff --git a/src/utils.cc b/src/utils.cc
new file mode 100644
index 0000000..6f85e61
--- /dev/null
+++ b/src/utils.cc
@@ -0,0 +1,120 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "utils.h"
+#include <stdio.h>
+
+namespace vixl {
+
+uint32_t float_to_rawbits(float value) {
+  uint32_t bits = 0;
+  memcpy(&bits, &value, 4);
+  return bits;
+}
+
+
+uint64_t double_to_rawbits(double value) {
+  uint64_t bits = 0;
+  memcpy(&bits, &value, 8);
+  return bits;
+}
+
+
+float rawbits_to_float(uint32_t bits) {
+  float value = 0.0;
+  memcpy(&value, &bits, 4);
+  return value;
+}
+
+
+double rawbits_to_double(uint64_t bits) {
+  double value = 0.0;
+  memcpy(&value, &bits, 8);
+  return value;
+}
+
+
+int CountLeadingZeros(uint64_t value, int width) {
+  ASSERT((width == 32) || (width == 64));
+  int count = 0;
+  uint64_t bit_test = 1UL << (width - 1);
+  while ((count < width) && ((bit_test & value) == 0)) {
+    count++;
+    bit_test >>= 1;
+  }
+  return count;
+}
+
+
+int CountLeadingSignBits(int64_t value, int width) {
+  ASSERT((width == 32) || (width == 64));
+  if (value >= 0) {
+    return CountLeadingZeros(value, width) - 1;
+  } else {
+    return CountLeadingZeros(~value, width) - 1;
+  }
+}
+
+
+int CountTrailingZeros(uint64_t value, int width) {
+  ASSERT((width == 32) || (width == 64));
+  int count = 0;
+  while ((count < width) && (((value >> count) & 1) == 0)) {
+    count++;
+  }
+  return count;
+}
+
+
+int CountSetBits(uint64_t value, int width) {
+  // TODO: Other widths could be added here, as the implementation already
+  // supports them.
+  ASSERT((width == 32) || (width == 64));
+
+  // Mask out unused bits to ensure that they are not counted.
+  value &= (0xffffffffffffffffUL >> (64-width));
+
+  // Add up the set bits.
+  // The algorithm works by adding pairs of bit fields together iteratively,
+  // where the size of each bit field doubles each time.
+  // An example for an 8-bit value:
+  // Bits:  h  g  f  e  d  c  b  a
+  //         \ |   \ |   \ |   \ |
+  // value = h+g   f+e   d+c   b+a
+  //            \    |      \    |
+  // value =   h+g+f+e     d+c+b+a
+  //                  \          |
+  // value =       h+g+f+e+d+c+b+a
+  value = ((value >> 1) & 0x5555555555555555) + (value & 0x5555555555555555);
+  value = ((value >> 2) & 0x3333333333333333) + (value & 0x3333333333333333);
+  value = ((value >> 4) & 0x0f0f0f0f0f0f0f0f) + (value & 0x0f0f0f0f0f0f0f0f);
+  value = ((value >> 8) & 0x00ff00ff00ff00ff) + (value & 0x00ff00ff00ff00ff);
+  value = ((value >> 16) & 0x0000ffff0000ffff) + (value & 0x0000ffff0000ffff);
+  value = ((value >> 32) & 0x00000000ffffffff) + (value & 0x00000000ffffffff);
+
+  return value;
+}
+}  // namespace vixl
diff --git a/src/utils.h b/src/utils.h
new file mode 100644
index 0000000..400e6aa
--- /dev/null
+++ b/src/utils.h
@@ -0,0 +1,126 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_UTILS_H
+#define VIXL_UTILS_H
+
+
+#include <string.h>
+#include "globals.h"
+
+namespace vixl {
+
+// Check number width.
+inline bool is_intn(unsigned n, int64_t x) {
+  ASSERT((0 < n) && (n < 64));
+  int64_t limit = 1L << (n - 1);
+  return (-limit <= x) && (x < limit);
+}
+
+inline bool is_uintn(unsigned n, int64_t x) {
+  ASSERT((0 < n) && (n < 64));
+  return !(x >> n);
+}
+
+inline unsigned truncate_to_intn(unsigned n, int64_t x) {
+  ASSERT((0 < n) && (n < 64));
+  return (x & ((1L << n) - 1));
+}
+
+#define INT_1_TO_63_LIST(V)                                                    \
+V(1)  V(2)  V(3)  V(4)  V(5)  V(6)  V(7)  V(8)                                 \
+V(9)  V(10) V(11) V(12) V(13) V(14) V(15) V(16)                                \
+V(17) V(18) V(19) V(20) V(21) V(22) V(23) V(24)                                \
+V(25) V(26) V(27) V(28) V(29) V(30) V(31) V(32)                                \
+V(33) V(34) V(35) V(36) V(37) V(38) V(39) V(40)                                \
+V(41) V(42) V(43) V(44) V(45) V(46) V(47) V(48)                                \
+V(49) V(50) V(51) V(52) V(53) V(54) V(55) V(56)                                \
+V(57) V(58) V(59) V(60) V(61) V(62) V(63)
+
+#define DECLARE_IS_INT_N(N)                                                    \
+inline bool is_int##N(int64_t x) { return is_intn(N, x); }
+#define DECLARE_IS_UINT_N(N)                                                   \
+inline bool is_uint##N(int64_t x) { return is_uintn(N, x); }
+#define DECLARE_TRUNCATE_TO_INT_N(N)                                           \
+inline int truncate_to_int##N(int x) { return truncate_to_intn(N, x); }
+INT_1_TO_63_LIST(DECLARE_IS_INT_N)
+INT_1_TO_63_LIST(DECLARE_IS_UINT_N)
+INT_1_TO_63_LIST(DECLARE_TRUNCATE_TO_INT_N)
+#undef DECLARE_IS_INT_N
+#undef DECLARE_IS_UINT_N
+#undef DECLARE_TRUNCATE_TO_INT_N
+
+// Bit field extraction.
+inline uint32_t unsigned_bitextract_32(int msb, int lsb, uint32_t x) {
+  return (x >> lsb) & ((1 << (1 + msb - lsb)) - 1);
+}
+
+inline uint64_t unsigned_bitextract_64(int msb, int lsb, uint64_t x) {
+  // Build the mask with a 64-bit constant so that fields wider than 31 bits
+  // are extracted correctly.
+  return (x >> lsb) & ((static_cast<uint64_t>(1) << (1 + msb - lsb)) - 1);
+}
+
+inline int32_t signed_bitextract_32(int msb, int lsb, int32_t x) {
+  return (x << (31 - msb)) >> (lsb + 31 - msb);
+}
+
+inline int64_t signed_bitextract_64(int msb, int lsb, int64_t x) {
+  return (x << (63 - msb)) >> (lsb + 63 - msb);
+}
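+
+// For example, unsigned_bitextract_32(7, 4, 0xab) is 0xa, while
+// signed_bitextract_32(7, 4, 0xab) is 0xfffffffa because the extracted field
+// is sign-extended from its top bit (bit 7).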
+
+// Conversions between floating-point values and their raw bit representations.
+uint32_t float_to_rawbits(float value);
+uint64_t double_to_rawbits(double value);
+float rawbits_to_float(uint32_t bits);
+double rawbits_to_double(uint64_t bits);
+
+// Bits counting.
+int CountLeadingZeros(uint64_t value, int width);
+int CountLeadingSignBits(int64_t value, int width);
+int CountTrailingZeros(uint64_t value, int width);
+int CountSetBits(uint64_t value, int width);
+
+// Pointer alignment
+// TODO: rename/refactor to make it specific to instructions.
+template<typename T>
+bool IsWordAligned(T pointer) {
+  ASSERT(sizeof(pointer) == sizeof(intptr_t));   // NOLINT(runtime/sizeof)
+  return (reinterpret_cast<intptr_t>(pointer) & 3) == 0;
+}
+
+// Increment a pointer until it has the specified alignment.
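+// For example, AlignUp(pointer, 16) rounds 'pointer' up to the next multiple
+// of 16 bytes, leaving it unchanged if it is already aligned.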
+template<class T>
+T AlignUp(T pointer, size_t alignment) {
+  ASSERT(sizeof(pointer) == sizeof(uintptr_t));
+  uintptr_t pointer_raw = reinterpret_cast<uintptr_t>(pointer);
+  size_t align_step = (alignment - pointer_raw) % alignment;
+  ASSERT((pointer_raw + align_step) % alignment == 0);
+  return reinterpret_cast<T>(pointer_raw + align_step);
+}
+
+
+}  // namespace vixl
+
+#endif  // VIXL_UTILS_H
diff --git a/test/cctest.cc b/test/cctest.cc
new file mode 100644
index 0000000..a5fcae2
--- /dev/null
+++ b/test/cctest.cc
@@ -0,0 +1,164 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include "cctest.h"
+
+// Initialize the list as empty.
+vixl::Cctest* vixl::Cctest::first_ = NULL;
+vixl::Cctest* vixl::Cctest::last_ = NULL;
+
+// No debugger to start with.
+bool vixl::Cctest::debug_ = false;
+
+// No tracing to start with.
+bool vixl::Cctest::trace_sim_ = false;
+bool vixl::Cctest::trace_reg_ = false;
+
+// No colour highlight by default.
+bool vixl::Cctest::coloured_trace_ = false;
+
+// Instantiate a Cctest and append it to the linked list.
+vixl::Cctest::Cctest(const char* name, CctestFunction* callback)
+  : name_(name), callback_(callback), next_(NULL) {
+  // Append this cctest to the linked list.
+  if (first_ == NULL) {
+    ASSERT(last_ == NULL);
+    first_ = this;
+  } else {
+    last_->next_ = this;
+  }
+  last_ = this;
+}
+
+
+// Look for 'search' in the arguments.
+bool IsInArgs(const char* search, int argc, char* argv[]) {
+  for (int i = 1; i < argc; i++) {
+    if (strcmp(search, argv[i]) == 0) {
+      return true;
+    }
+  }
+  return false;
+}
+
+
+// Special keywords used as arguments must be registered here.
+bool IsSpecialArgument(const char* arg) {
+  return (strcmp(arg, "--help") == 0) ||
+         (strcmp(arg, "--list") == 0) ||
+         (strcmp(arg, "--run_all") == 0) ||
+         (strcmp(arg, "--debugger") == 0) ||
+         (strcmp(arg, "--trace_sim") == 0) ||
+         (strcmp(arg, "--trace_reg") == 0) ||
+         (strcmp(arg, "--coloured_trace") == 0);
+}
+
+
+void PrintHelpMessage() {
+  printf("Usage:  ./cctest [options] [test names]\n"
+         "Run all tests specified on the command line.\n"
+         "--help            print this help message.\n"
+         "--list            list all available tests.\n"
+         "--run_all         run all available tests.\n"
+         "--debugger        run in the debugger.\n"
+         "--trace_sim       generate a trace of simulated instructions.\n"
+         "--trace_reg       generate a trace of simulated registers. "
+           "Implies --debugger.\n"
+         "--coloured_trace  generate coloured trace.\n");
+}
+
+int main(int argc, char* argv[]) {
+  // Parse the arguments, with the following priority:
+  // --help
+  // --list
+  // --run_all
+  // --debugger
+  // --trace_sim
+  // --trace_reg
+  // --coloured_trace
+  // test names
+
+  if (IsInArgs("--coloured_trace", argc, argv)) {
+    vixl::Cctest::set_coloured_trace(true);
+  }
+
+  if (IsInArgs("--debugger", argc, argv)) {
+    vixl::Cctest::set_debug(true);
+  }
+
+  if (IsInArgs("--trace_reg", argc, argv)) {
+    vixl::Cctest::set_trace_reg(true);
+  }
+
+  if (IsInArgs("--trace_sim", argc, argv)) {
+    vixl::Cctest::set_trace_sim(true);
+  }
+
+  if (IsInArgs("--help", argc, argv)) {
+    PrintHelpMessage();
+
+  } else if (IsInArgs("--list", argc, argv)) {
+    // List all registered cctests.
+    for (vixl::Cctest* c = vixl::Cctest::first(); c != NULL; c = c->next()) {
+      printf("%s\n", c->name());
+    }
+
+  } else if (IsInArgs("--run_all", argc, argv)) {
+    // Run all registered cctests.
+    for (vixl::Cctest* c = vixl::Cctest::first(); c != NULL; c = c->next()) {
+      printf("Running %s\n", c->name());
+      c->callback()();
+    }
+
+  } else {
+    if (argc <= 1)
+      PrintHelpMessage();
+    // Other arguments must be tests to run.
+    for (int i = 1; i < argc; i++) {
+      if (!IsSpecialArgument(argv[i])) {
+        vixl::Cctest* c;
+        for (c = vixl::Cctest::first(); c != NULL; c = c->next()) {
+          if (strcmp(c->name(), argv[i]) == 0) {
+            c->callback()();
+            break;
+          }
+        }
+        // Fail if we have not found a matching test to run.
+        if (c == NULL) {
+          printf("Test '%s' does not exist. Aborting...\n", argv[i]);
+          abort();
+        }
+      }
+    }
+  }
+
+  return EXIT_SUCCESS;
+}
+
diff --git a/test/cctest.h b/test/cctest.h
new file mode 100644
index 0000000..41f47f6
--- /dev/null
+++ b/test/cctest.h
@@ -0,0 +1,82 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef TEST_CCTEST_H_
+#define TEST_CCTEST_H_
+
+#include "utils.h"
+
+namespace vixl {
+
+// Each actual test is represented by a Cctest instance.
+// Cctests are appended to a static linked list upon creation.
+class Cctest {
+  typedef void (CctestFunction)();
+
+ public:
+  Cctest(const char* name, CctestFunction* callback);
+
+  const char* name() { return name_; }
+  CctestFunction* callback() { return callback_; }
+  static Cctest* first() { return first_; }
+  static Cctest* last() { return last_; }
+  Cctest* next() { return next_; }
+  static bool debug() { return debug_; }
+  static void set_debug(bool value) { debug_ = value; }
+  static bool trace_sim() { return trace_sim_; }
+  static void set_trace_sim(bool value) { trace_sim_ = value; }
+  static bool trace_reg() { return trace_reg_; }
+  static void set_trace_reg(bool value) { trace_reg_ = value; }
+  static bool coloured_trace() { return coloured_trace_; }
+  static void set_coloured_trace(bool value) { coloured_trace_ = value; }
+
+  // The debugger is needed to trace register values.
+  static bool run_debugger() { return debug_ || trace_reg_; }
+
+ private:
+  const char* name_;
+  CctestFunction* callback_;
+
+  static Cctest* first_;
+  static Cctest* last_;
+  Cctest* next_;
+  static bool debug_;
+  static bool trace_sim_;
+  static bool trace_reg_;
+  static bool coloured_trace_;
+};
+
+// Define helper macros for cctest files.
+
+// Macro to register a cctest. It instantiates a Cctest and registers its
+// callback function.
+#define TEST_(Name)                                                            \
+void Test##Name();                                                             \
+Cctest cctest_##Name(#Name, &Test##Name);                                      \
+void Test##Name()
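+
+// For example (an illustrative sketch, not taken from the test suite), a test
+// is defined and registered like this:
+//
+//   TEST_(my_feature) {
+//     // Test body; it runs when 'my_feature' is selected on the cctest
+//     // command line (or with --run_all).
+//   }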
+}  // namespace vixl
+
+#endif  // TEST_CCTEST_H_
diff --git a/test/examples/test-examples.cc b/test/examples/test-examples.cc
new file mode 100644
index 0000000..263ff3a
--- /dev/null
+++ b/test/examples/test-examples.cc
@@ -0,0 +1,408 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "a64/macro-assembler-a64.h"
+#include "a64/debugger-a64.h"
+#include "a64/simulator-a64.h"
+#include "examples.h"
+#include "../test-utils-a64.h"
+
+#include "../cctest.h"
+
+#define ARRAY_SIZE(Array) (sizeof(Array) / sizeof((Array)[0]))
+#define BUF_SIZE (4096)
+#define __ masm->
+
+using namespace vixl;
+
+
+uint64_t FactorialC(uint64_t n) {
+  uint64_t result = 1;
+
+  while (n != 0) {
+    result *= n;
+    n--;
+  }
+
+  return result;
+}
+
+double Add3DoubleC(double x, double y, double z) {
+  return x + y + z;
+}
+
+double Add4DoubleC(uint64_t a, double b, uint64_t c, double d) {
+  return static_cast<double>(a) + b + static_cast<double>(c) + d;
+}
+
+uint32_t SumArrayC(uint8_t* array, uint32_t size) {
+  uint32_t result = 0;
+
+  for (uint32_t i = 0; i < size; ++i) {
+    result += array[i];
+  }
+
+  return result;
+}
+
+
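+// Generate a small wrapper which calls the code whose entry point has been
+// placed in x15, then dumps the register state so the results can be compared
+// with the C reference implementations above.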
+void GenerateTestWrapper(MacroAssembler* masm, RegisterDump *regs) {
+  __ Push(xzr, lr);
+  __ Blr(x15);
+  regs->Dump(masm);
+  __ Pop(lr, xzr);
+  __ Ret();
+}
+
+
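+// Call the function under test through the wrapper bound to the 'test' label,
+// and check that the callee-saved integer and floating point registers still
+// hold their previous values once it returns.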
+#define TEST_FUNCTION(Func)                                             \
+  do {                                                                  \
+    int64_t saved_xregs[13];                                            \
+    saved_xregs[0] = simulator.xreg(19);                                \
+    saved_xregs[1] = simulator.xreg(20);                                \
+    saved_xregs[2] = simulator.xreg(21);                                \
+    saved_xregs[3] = simulator.xreg(22);                                \
+    saved_xregs[4] = simulator.xreg(23);                                \
+    saved_xregs[5] = simulator.xreg(24);                                \
+    saved_xregs[6] = simulator.xreg(25);                                \
+    saved_xregs[7] = simulator.xreg(26);                                \
+    saved_xregs[8] = simulator.xreg(27);                                \
+    saved_xregs[9] = simulator.xreg(28);                                \
+    saved_xregs[10] = simulator.xreg(29);                               \
+    saved_xregs[11] = simulator.xreg(30);                               \
+    saved_xregs[12] = simulator.xreg(31);                               \
+                                                                        \
+    uint64_t saved_dregs[8];                                            \
+    saved_dregs[0] = simulator.dreg_bits(8);                            \
+    saved_dregs[1] = simulator.dreg_bits(9);                            \
+    saved_dregs[2] = simulator.dreg_bits(10);                           \
+    saved_dregs[3] = simulator.dreg_bits(11);                           \
+    saved_dregs[4] = simulator.dreg_bits(12);                           \
+    saved_dregs[5] = simulator.dreg_bits(13);                           \
+    saved_dregs[6] = simulator.dreg_bits(14);                           \
+    saved_dregs[7] = simulator.dreg_bits(15);                           \
+                                                                        \
+    simulator.set_xreg(15, reinterpret_cast<uint64_t>((Func).target()));\
+    simulator.RunFrom(test.target());                                   \
+                                                                        \
+    assert(saved_xregs[0] == simulator.xreg(19));                       \
+    assert(saved_xregs[1] == simulator.xreg(20));                       \
+    assert(saved_xregs[2] == simulator.xreg(21));                       \
+    assert(saved_xregs[3] == simulator.xreg(22));                       \
+    assert(saved_xregs[4] == simulator.xreg(23));                       \
+    assert(saved_xregs[5] == simulator.xreg(24));                       \
+    assert(saved_xregs[6] == simulator.xreg(25));                       \
+    assert(saved_xregs[7] == simulator.xreg(26));                       \
+    assert(saved_xregs[8] == simulator.xreg(27));                       \
+    assert(saved_xregs[9] == simulator.xreg(28));                       \
+    assert(saved_xregs[10] == simulator.xreg(29));                      \
+    assert(saved_xregs[11] == simulator.xreg(30));                      \
+    assert(saved_xregs[12] == simulator.xreg(31));                      \
+                                                                        \
+    assert(saved_dregs[0] == simulator.dreg_bits(8));                   \
+    assert(saved_dregs[1] == simulator.dreg_bits(9));                   \
+    assert(saved_dregs[2] == simulator.dreg_bits(10));                  \
+    assert(saved_dregs[3] == simulator.dreg_bits(11));                  \
+    assert(saved_dregs[4] == simulator.dreg_bits(12));                  \
+    assert(saved_dregs[5] == simulator.dreg_bits(13));                  \
+    assert(saved_dregs[6] == simulator.dreg_bits(14));                  \
+    assert(saved_dregs[7] == simulator.dreg_bits(15));                  \
+                                                                        \
+  } while (0)
+
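+// Set up a MacroAssembler and a Debugger-backed simulator, then emit the test
+// wrapper at the 'test' label so TEST_FUNCTION can run generated code through
+// it.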
+#define START()                                             \
+  byte assm_buf[BUF_SIZE];                                  \
+  MacroAssembler masm(assm_buf, BUF_SIZE);                  \
+  Decoder decoder;                                          \
+  Debugger simulator(&decoder);                             \
+  simulator.set_coloured_trace(Cctest::coloured_trace());   \
+  PrintDisassembler* pdis = NULL;                           \
+  if (Cctest::trace_sim()) {                                \
+    pdis = new PrintDisassembler(stdout);                   \
+    decoder.PrependVisitor(pdis);                           \
+  }                                                         \
+  RegisterDump regs;                                        \
+                                                            \
+  Label test;                                               \
+  masm.Bind(&test);                                         \
+  GenerateTestWrapper(&masm, &regs);                        \
+  masm.FinalizeCode()
+
+
+#define TEST(name) TEST_(EXAMPLE_##name)
+
+
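+// Each *_DOTEST macro below follows the same pattern: reset the simulator,
+// load the arguments into the argument registers, run the generated code via
+// TEST_FUNCTION, and compare the result register with an expected value
+// computed in C++.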
+#define FACTORIAL_DOTEST(N)                                             \
+  do {                                                                  \
+    simulator.ResetState();                                             \
+    simulator.set_xreg(0, N);                                           \
+    TEST_FUNCTION(factorial);                                           \
+    assert(static_cast<uint64_t>(regs.xreg(0)) == FactorialC(N));       \
+  } while (0)
+
+TEST(factorial) {
+  START();
+
+  Label factorial;
+  masm.Bind(&factorial);
+  GenerateFactorial(&masm);
+  masm.FinalizeCode();
+
+  FACTORIAL_DOTEST(0);
+  FACTORIAL_DOTEST(1);
+  FACTORIAL_DOTEST(5);
+  FACTORIAL_DOTEST(10);
+  FACTORIAL_DOTEST(20);
+  FACTORIAL_DOTEST(25);
+}
+
+
+#define FACTORIAL_REC_DOTEST(N)                                         \
+  do {                                                                  \
+    simulator.ResetState();                                             \
+    simulator.set_xreg(0, N);                                           \
+    TEST_FUNCTION(factorial_rec);                                       \
+    assert(static_cast<uint64_t>(regs.xreg(0)) == FactorialC(N));       \
+  } while (0)
+
+TEST(factorial_rec) {
+  START();
+
+  Label factorial_rec;
+  masm.Bind(&factorial_rec);
+  GenerateFactorialRec(&masm);
+  masm.FinalizeCode();
+
+  FACTORIAL_REC_DOTEST(0);
+  FACTORIAL_REC_DOTEST(1);
+  FACTORIAL_REC_DOTEST(5);
+  FACTORIAL_REC_DOTEST(10);
+  FACTORIAL_REC_DOTEST(20);
+  FACTORIAL_REC_DOTEST(25);
+}
+
+
+#define ADD3_DOUBLE_DOTEST(A, B, C)                                     \
+  do {                                                                  \
+    simulator.ResetState();                                             \
+    simulator.set_dreg(0, A);                                           \
+    simulator.set_dreg(1, B);                                           \
+    simulator.set_dreg(2, C);                                           \
+    TEST_FUNCTION(add3_double);                                         \
+    assert(regs.dreg(0) == Add3DoubleC(A, B, C));                       \
+  } while (0)
+
+TEST(add3_double) {
+  START();
+
+  Label add3_double;
+  masm.Bind(&add3_double);
+  GenerateAdd3Double(&masm);
+  masm.FinalizeCode();
+
+  ADD3_DOUBLE_DOTEST(0.0, 0.0, 0.0);
+  ADD3_DOUBLE_DOTEST(457.698, 14.36, 2.00025);
+  ADD3_DOUBLE_DOTEST(-45.55, -98.9, -0.354);
+  ADD3_DOUBLE_DOTEST(.55, .9, .12);
+}
+
+
+#define ADD4_DOUBLE_DOTEST(A, B, C, D)                                  \
+  do {                                                                  \
+    simulator.ResetState();                                             \
+    simulator.set_xreg(0, A);                                           \
+    simulator.set_dreg(0, B);                                           \
+    simulator.set_xreg(1, C);                                           \
+    simulator.set_dreg(1, D);                                           \
+    TEST_FUNCTION(add4_double);                                         \
+    assert(regs.dreg(0) == Add4DoubleC(A, B, C, D));                    \
+  } while (0)
+
+TEST(add4_double) {
+  START();
+
+  Label add4_double;
+  masm.Bind(&add4_double);
+  GenerateAdd4Double(&masm);
+  masm.FinalizeCode();
+
+  ADD4_DOUBLE_DOTEST(0, 0, 0, 0);
+  ADD4_DOUBLE_DOTEST(4, 3.287, 6, 13.48);
+  ADD4_DOUBLE_DOTEST(56, 665.368, 0, -4932.4697);
+  ADD4_DOUBLE_DOTEST(56, 0, 546, 0);
+  ADD4_DOUBLE_DOTEST(0, 0.658, 0, 0.00000011540026);
+}
+
+
+#define SUM_ARRAY_DOTEST(Array)                                         \
+  do {                                                                  \
+    simulator.ResetState();                                             \
+    uintptr_t addr = reinterpret_cast<uintptr_t>(Array);                \
+    simulator.set_xreg(0, addr);                                        \
+    simulator.set_xreg(1, ARRAY_SIZE(Array));                           \
+    TEST_FUNCTION(sum_array);                                           \
+    assert(regs.xreg(0) == SumArrayC(Array, ARRAY_SIZE(Array)));        \
+  } while (0)
+
+TEST(sum_array) {
+  START();
+
+  Label sum_array;
+  masm.Bind(&sum_array);
+  GenerateSumArray(&masm);
+  masm.FinalizeCode();
+
+  uint8_t data1[] = { 4, 9, 13, 3, 2, 6, 5 };
+  SUM_ARRAY_DOTEST(data1);
+
+  uint8_t data2[] = { 42 };
+  SUM_ARRAY_DOTEST(data2);
+
+  uint8_t data3[1000];
+  for (unsigned int i = 0; i < ARRAY_SIZE(data3); ++i)
+    data3[i] = 255;
+  SUM_ARRAY_DOTEST(data3);
+}
+
+
+#define ABS_DOTEST(X)                                                   \
+  do {                                                                  \
+    simulator.ResetState();                                             \
+    simulator.set_xreg(0, X);                                           \
+    TEST_FUNCTION(func_abs);                                            \
+    assert(regs.xreg(0) == abs(X));                                     \
+  } while (0)
+
+TEST(abs) {
+  START();
+
+  Label func_abs;
+  masm.Bind(&func_abs);
+  GenerateAbs(&masm);
+  masm.FinalizeCode();
+
+  ABS_DOTEST(-42);
+  ABS_DOTEST(0);
+  ABS_DOTEST(545);
+  ABS_DOTEST(-428751489);
+}
+
+
+TEST(swap4) {
+  START();
+
+  Label swap4;
+  masm.Bind(&swap4);
+  GenerateSwap4(&masm);
+  masm.FinalizeCode();
+
+  int64_t a = 15;
+  int64_t b = 26;
+  int64_t c = 46;
+  int64_t d = 79;
+
+  simulator.set_xreg(0, a);
+  simulator.set_xreg(1, b);
+  simulator.set_xreg(2, c);
+  simulator.set_xreg(3, d);
+  TEST_FUNCTION(swap4);
+  assert(regs.xreg(0) == d);
+  assert(regs.xreg(1) == c);
+  assert(regs.xreg(2) == b);
+  assert(regs.xreg(3) == a);
+}
+
+
+TEST(swap_int32) {
+  START();
+
+  Label swap_int32;
+  masm.Bind(&swap_int32);
+  GenerateSwapInt32(&masm);
+  masm.FinalizeCode();
+
+  int32_t x = 168;
+  int32_t y = 246;
+  simulator.set_wreg(0, x);
+  simulator.set_wreg(1, y);
+  TEST_FUNCTION(swap_int32);
+  assert(regs.wreg(0) == y);
+  assert(regs.wreg(1) == x);
+}
+
+
+#define CHECKBOUNDS_DOTEST(Value, Low, High)                            \
+  do {                                                                  \
+    simulator.ResetState();                                             \
+    simulator.set_xreg(0, Value);                                       \
+    simulator.set_xreg(1, Low);                                         \
+    simulator.set_xreg(2, High);                                        \
+    TEST_FUNCTION(check_bounds);                                        \
+    assert(regs.xreg(0) == ((Low <= Value) && (Value <= High)));        \
+  } while (0)
+
+TEST(check_bounds) {
+  START();
+
+  Label check_bounds;
+  masm.Bind(&check_bounds);
+  GenerateCheckBounds(&masm);
+  masm.FinalizeCode();
+
+  CHECKBOUNDS_DOTEST(0, 100, 200);
+  CHECKBOUNDS_DOTEST(58, 100, 200);
+  CHECKBOUNDS_DOTEST(99, 100, 200);
+  CHECKBOUNDS_DOTEST(100, 100, 200);
+  CHECKBOUNDS_DOTEST(101, 100, 200);
+  CHECKBOUNDS_DOTEST(150, 100, 200);
+  CHECKBOUNDS_DOTEST(199, 100, 200);
+  CHECKBOUNDS_DOTEST(200, 100, 200);
+  CHECKBOUNDS_DOTEST(201, 100, 200);
+}
+
+
+#define GETTING_STARTED_DOTEST(Value)                           \
+  do {                                                          \
+    simulator.ResetState();                                     \
+    simulator.set_xreg(0, Value);                               \
+    TEST_FUNCTION(demo_function);                               \
+    assert(regs.xreg(0) == (Value & 0x1122334455667788));       \
+  } while (0)
+
+TEST(getting_started) {
+  START();
+
+  Label demo_function;
+  masm.Bind(&demo_function);
+  GenerateDemoFunction(&masm);
+  masm.FinalizeCode();
+
+  GETTING_STARTED_DOTEST(0x8899aabbccddeeff);
+  GETTING_STARTED_DOTEST(0x1122334455667788);
+  GETTING_STARTED_DOTEST(0x0000000000000000);
+  GETTING_STARTED_DOTEST(0xffffffffffffffff);
+  GETTING_STARTED_DOTEST(0x5a5a5a5a5a5a5a5a);
+}
diff --git a/test/test-assembler-a64.cc b/test/test-assembler-a64.cc
new file mode 100644
index 0000000..5fb4c22
--- /dev/null
+++ b/test/test-assembler-a64.cc
@@ -0,0 +1,7350 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include <stdio.h>
+#include <string.h>
+#include <cmath>
+
+#include "cctest.h"
+#include "test-utils-a64.h"
+#include "a64/macro-assembler-a64.h"
+#include "a64/simulator-a64.h"
+#include "a64/debugger-a64.h"
+#include "a64/disasm-a64.h"
+#include "a64/cpu-a64.h"
+
+namespace vixl {
+
+// Test infrastructure.
+//
+// Tests are functions which accept no parameters and have no return values.
+// The testing code should not perform an explicit return once completed. For
+// example, to test the mov immediate instruction, a very simple test would be:
+//
+//   TEST(mov_x0_one) {
+//     SETUP();
+//
+//     START();
+//     __ mov(x0, Operand(1));
+//     END();
+//
+//     RUN();
+//
+//     ASSERT_EQUAL_64(1, x0);
+//
+//     TEARDOWN();
+//   }
+//
+// Within a START ... END block all registers but sp can be modified. sp has to
+// be explicitly saved/restored. The END() macro replaces the function return
+// so it may appear multiple times in a test if the test has multiple exit
+// points.
+//
+// Once the test has been run, all integer and floating point registers as
+// well as flags are accessible through a RegisterDump instance; see
+// test-utils-a64.cc for more information on RegisterDump.
+//
+// We provide some helper asserts to handle common cases:
+//
+//   ASSERT_EQUAL_32(int32_t, int32_t)
+//   ASSERT_EQUAL_FP32(float, float)
+//   ASSERT_EQUAL_32(int32_t, W register)
+//   ASSERT_EQUAL_FP32(float, S register)
+//   ASSERT_EQUAL_64(int64_t, int64_t)
+//   ASSERT_EQUAL_FP64(double, double)
+//   ASSERT_EQUAL_64(int64_t, X register)
+//   ASSERT_EQUAL_64(X register, X register)
+//   ASSERT_EQUAL_FP64(double, D register)
+//
+// e.g. ASSERT_EQUAL_FP64(0.5, d30);
+//
+// If more advanced computation is required before the assert, access the
+// RegisterDump named core directly:
+//
+//   ASSERT_EQUAL_64(0x1234, core->reg_x0() & 0xffff);
+
+
+#define __ masm.
+#define TEST(name)  TEST_(ASM_##name)
+
+#define BUF_SIZE (4096)
+
+#define SETUP() SETUP_SIZE(BUF_SIZE)
+
+#ifdef USE_SIMULATOR
+
+// Run tests with the simulator.
+#define SETUP_SIZE(buf_size)                                                   \
+  byte* buf = new byte[buf_size];                                              \
+  MacroAssembler masm(buf, buf_size);                                          \
+  Decoder decoder;                                                             \
+  Simulator* simulator = NULL;                                                 \
+  if (Cctest::run_debugger()) {                                                \
+    simulator = new Debugger(&decoder);                                        \
+  } else {                                                                     \
+    simulator = new Simulator(&decoder);                                       \
+    simulator->set_disasm_trace(Cctest::trace_sim());                          \
+  }                                                                            \
+  simulator->set_coloured_trace(Cctest::coloured_trace());                     \
+  RegisterDump core
+
+#define START()                                                                \
+  masm.Reset();                                                                \
+  simulator->ResetState();                                                     \
+  __ PushCalleeSavedRegisters();                                               \
+  if (Cctest::run_debugger()) {                                                \
+    if (Cctest::trace_reg()) {                                                 \
+      __ Trace(LOG_STATE, TRACE_ENABLE);                                       \
+    }                                                                          \
+    if (Cctest::trace_sim()) {                                                 \
+      __ Trace(LOG_DISASM, TRACE_ENABLE);                                      \
+    }                                                                          \
+  }
+
+#define END()                                                                  \
+  if (Cctest::run_debugger()) {                                                \
+    __ Trace(LOG_ALL, TRACE_DISABLE);                                          \
+  }                                                                            \
+  core.Dump(&masm);                                                            \
+  __ PopCalleeSavedRegisters();                                                \
+  __ Ret();                                                                    \
+  masm.FinalizeCode()
+
+#define RUN()                                                                  \
+  simulator->RunFrom(reinterpret_cast<Instruction*>(buf))
+
+#define TEARDOWN()                                                             \
+  delete simulator;                                                            \
+  delete[] buf;
+
+#else  // ifdef USE_SIMULATOR.
+// Run the test on real hardware or models.
+#define SETUP_SIZE(buf_size)                                                   \
+  byte* buf = new byte[buf_size];                                              \
+  MacroAssembler masm(buf, buf_size);                                          \
+  RegisterDump core;                                                           \
+  CPU::SetUp()
+
+#define START()                                                                \
+  masm.Reset();                                                                \
+  __ PushCalleeSavedRegisters()
+
+#define END()                                                                  \
+  core.Dump(&masm);                                                            \
+  __ PopCalleeSavedRegisters();                                                \
+  __ Ret();                                                                    \
+  masm.FinalizeCode()
+
+#define RUN()                                                                  \
+  CPU::EnsureIAndDCacheCoherency(&buf, sizeof(buf));                           \
+  {                                                                            \
+    void (*test_function)(void);                                               \
+    memcpy(&test_function, &buf, sizeof(buf));                                 \
+    test_function();                                                           \
+  }
+
+#define TEARDOWN()                                                             \
+  delete[] buf;
+
+#endif  // ifdef USE_SIMULATOR.
+
+#define ASSERT_EQUAL_NZCV(expected)                                            \
+  assert(EqualNzcv(expected, core.flags_nzcv()))
+
+#define ASSERT_EQUAL_REGISTERS(expected)                                       \
+  assert(EqualRegisters(&expected, &core))
+
+#define ASSERT_EQUAL_32(expected, result)                                      \
+  assert(Equal32(static_cast<uint32_t>(expected), &core, result))
+
+#define ASSERT_EQUAL_FP32(expected, result)                                    \
+  assert(EqualFP32(expected, &core, result))
+
+#define ASSERT_EQUAL_64(expected, result)                                      \
+  assert(Equal64(expected, &core, result))
+
+#define ASSERT_EQUAL_FP64(expected, result)                                    \
+  assert(EqualFP64(expected, &core, result))
+
+#define ASSERT_LITERAL_POOL_SIZE(expected)                                     \
+  assert((expected) == (__ LiteralPoolSize()))
+
+
+TEST(stack_ops) {
+  SETUP();
+
+  START();
+  // Save sp.
+  __ Mov(x29, sp);
+
+  // Set the sp to a known value.
+  __ Mov(sp, 0x1004);
+  __ Mov(x0, sp);
+
+  // Add immediate to the sp, and move the result to a normal register.
+  __ Add(sp, sp, Operand(0x50));
+  __ Mov(x1, sp);
+
+  // Add extended to the sp, and move the result to a normal register.
+  __ Mov(x17, 0xfff);
+  __ Add(sp, sp, Operand(x17, SXTB));
+  __ Mov(x2, sp);
+
+  // Set sp using a logical instruction, and move to a normal register.
+  __ Orr(sp, xzr, Operand(0x1fff));
+  __ Mov(x3, sp);
+
+  // Write wsp using a logical instruction.
+  __ Orr(wsp, wzr, Operand(0xfffffff8L));
+  __ Mov(x4, sp);
+
+  // Write sp, and read back wsp.
+  __ Orr(sp, xzr, Operand(0xfffffff8L));
+  __ Mov(w5, wsp);
+
+  // Restore sp.
+  __ Mov(sp, x29);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x1004, x0);
+  ASSERT_EQUAL_64(0x1054, x1);
+  ASSERT_EQUAL_64(0x1053, x2);
+  ASSERT_EQUAL_64(0x1fff, x3);
+  ASSERT_EQUAL_64(0xfffffff8, x4);
+  ASSERT_EQUAL_64(0xfffffff8, x5);
+
+  TEARDOWN();
+}
+
+
+TEST(mvn) {
+  SETUP();
+
+  START();
+  __ Mvn(w0, 0xfff);
+  __ Mvn(x1, 0xfff);
+  __ Mvn(w2, Operand(w0, LSL, 1));
+  __ Mvn(x3, Operand(x1, LSL, 2));
+  __ Mvn(w4, Operand(w0, LSR, 3));
+  __ Mvn(x5, Operand(x1, LSR, 4));
+  __ Mvn(w6, Operand(w0, ASR, 11));
+  __ Mvn(x7, Operand(x1, ASR, 12));
+  __ Mvn(w8, Operand(w0, ROR, 13));
+  __ Mvn(x9, Operand(x1, ROR, 14));
+  __ Mvn(w10, Operand(w2, UXTB));
+  __ Mvn(x11, Operand(x2, SXTB, 1));
+  __ Mvn(w12, Operand(w2, UXTH, 2));
+  __ Mvn(x13, Operand(x2, SXTH, 3));
+  __ Mvn(x14, Operand(w2, UXTW, 4));
+  __ Mvn(x15, Operand(w2, SXTW, 4));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xfffff000, x0);
+  ASSERT_EQUAL_64(0xfffffffffffff000UL, x1);
+  ASSERT_EQUAL_64(0x00001fff, x2);
+  ASSERT_EQUAL_64(0x0000000000003fffUL, x3);
+  ASSERT_EQUAL_64(0xe00001ff, x4);
+  ASSERT_EQUAL_64(0xf0000000000000ffUL, x5);
+  ASSERT_EQUAL_64(0x00000001, x6);
+  ASSERT_EQUAL_64(0x0, x7);
+  ASSERT_EQUAL_64(0x7ff80000, x8);
+  ASSERT_EQUAL_64(0x3ffc000000000000UL, x9);
+  ASSERT_EQUAL_64(0xffffff00, x10);
+  ASSERT_EQUAL_64(0x0000000000000001UL, x11);
+  ASSERT_EQUAL_64(0xffff8003, x12);
+  ASSERT_EQUAL_64(0xffffffffffff0007UL, x13);
+  ASSERT_EQUAL_64(0xfffffffffffe000fUL, x14);
+  ASSERT_EQUAL_64(0xfffffffffffe000fUL, x15);
+
+  TEARDOWN();
+}
+
+
+TEST(mov) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0xffffffffffffffffL);
+  __ Mov(x1, 0xffffffffffffffffL);
+  __ Mov(x2, 0xffffffffffffffffL);
+  __ Mov(x3, 0xffffffffffffffffL);
+
+  __ Mov(x0, 0x0123456789abcdefL);
+
+  __ movz(x1, 0xabcdL << 16);
+  __ movk(x2, 0xabcdL << 32);
+  __ movn(x3, 0xabcdL << 48);
+
+  __ Mov(x4, 0x0123456789abcdefL);
+  __ Mov(x5, x4);
+
+  __ Mov(w6, -1);
+
+  // Test that moves back to the same register have the desired effect. This
+  // is a no-op for X registers, and a truncation for W registers.
+  __ Mov(x7, 0x0123456789abcdefL);
+  __ Mov(x7, x7);
+  __ Mov(x8, 0x0123456789abcdefL);
+  __ Mov(w8, w8);
+  __ Mov(x9, 0x0123456789abcdefL);
+  __ Mov(x9, Operand(x9));
+  __ Mov(x10, 0x0123456789abcdefL);
+  __ Mov(w10, Operand(w10));
+
+  __ Mov(w11, 0xfff);
+  __ Mov(x12, 0xfff);
+  __ Mov(w13, Operand(w11, LSL, 1));
+  __ Mov(x14, Operand(x12, LSL, 2));
+  __ Mov(w15, Operand(w11, LSR, 3));
+  __ Mov(x18, Operand(x12, LSR, 4));
+  __ Mov(w19, Operand(w11, ASR, 11));
+  __ Mov(x20, Operand(x12, ASR, 12));
+  __ Mov(w21, Operand(w11, ROR, 13));
+  __ Mov(x22, Operand(x12, ROR, 14));
+  __ Mov(w23, Operand(w13, UXTB));
+  __ Mov(x24, Operand(x13, SXTB, 1));
+  __ Mov(w25, Operand(w13, UXTH, 2));
+  __ Mov(x26, Operand(x13, SXTH, 3));
+  __ Mov(x27, Operand(w13, UXTW, 4));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x0123456789abcdefL, x0);
+  ASSERT_EQUAL_64(0x00000000abcd0000L, x1);
+  ASSERT_EQUAL_64(0xffffabcdffffffffL, x2);
+  ASSERT_EQUAL_64(0x5432ffffffffffffL, x3);
+  ASSERT_EQUAL_64(x4, x5);
+  ASSERT_EQUAL_32(-1, w6);
+  ASSERT_EQUAL_64(0x0123456789abcdefL, x7);
+  ASSERT_EQUAL_32(0x89abcdefL, w8);
+  ASSERT_EQUAL_64(0x0123456789abcdefL, x9);
+  ASSERT_EQUAL_32(0x89abcdefL, w10);
+  ASSERT_EQUAL_64(0x00000fff, x11);
+  ASSERT_EQUAL_64(0x0000000000000fffUL, x12);
+  ASSERT_EQUAL_64(0x00001ffe, x13);
+  ASSERT_EQUAL_64(0x0000000000003ffcUL, x14);
+  ASSERT_EQUAL_64(0x000001ff, x15);
+  ASSERT_EQUAL_64(0x00000000000000ffUL, x18);
+  ASSERT_EQUAL_64(0x00000001, x19);
+  ASSERT_EQUAL_64(0x0, x20);
+  ASSERT_EQUAL_64(0x7ff80000, x21);
+  ASSERT_EQUAL_64(0x3ffc000000000000UL, x22);
+  ASSERT_EQUAL_64(0x000000fe, x23);
+  ASSERT_EQUAL_64(0xfffffffffffffffcUL, x24);
+  ASSERT_EQUAL_64(0x00007ff8, x25);
+  ASSERT_EQUAL_64(0x000000000000fff0UL, x26);
+  ASSERT_EQUAL_64(0x000000000001ffe0UL, x27);
+
+  TEARDOWN();
+}
+
+
+TEST(orr) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0xf0f0);
+  __ Mov(x1, 0xf00000ff);
+
+  __ Orr(x2, x0, Operand(x1));
+  __ Orr(w3, w0, Operand(w1, LSL, 28));
+  __ Orr(x4, x0, Operand(x1, LSL, 32));
+  __ Orr(x5, x0, Operand(x1, LSR, 4));
+  __ Orr(w6, w0, Operand(w1, ASR, 4));
+  __ Orr(x7, x0, Operand(x1, ASR, 4));
+  __ Orr(w8, w0, Operand(w1, ROR, 12));
+  __ Orr(x9, x0, Operand(x1, ROR, 12));
+  __ Orr(w10, w0, Operand(0xf));
+  __ Orr(x11, x0, Operand(0xf0000000f0000000L));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xf000f0ff, x2);
+  ASSERT_EQUAL_64(0xf000f0f0, x3);
+  ASSERT_EQUAL_64(0xf00000ff0000f0f0L, x4);
+  ASSERT_EQUAL_64(0x0f00f0ff, x5);
+  ASSERT_EQUAL_64(0xff00f0ff, x6);
+  ASSERT_EQUAL_64(0x0f00f0ff, x7);
+  ASSERT_EQUAL_64(0x0ffff0f0, x8);
+  ASSERT_EQUAL_64(0x0ff00000000ff0f0L, x9);
+  ASSERT_EQUAL_64(0xf0ff, x10);
+  ASSERT_EQUAL_64(0xf0000000f000f0f0L, x11);
+
+  TEARDOWN();
+}
+
+
+TEST(orr_extend) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 1);
+  __ Mov(x1, 0x8000000080008080UL);
+  __ Orr(w6, w0, Operand(w1, UXTB));
+  __ Orr(x7, x0, Operand(x1, UXTH, 1));
+  __ Orr(w8, w0, Operand(w1, UXTW, 2));
+  __ Orr(x9, x0, Operand(x1, UXTX, 3));
+  __ Orr(w10, w0, Operand(w1, SXTB));
+  __ Orr(x11, x0, Operand(x1, SXTH, 1));
+  __ Orr(x12, x0, Operand(x1, SXTW, 2));
+  __ Orr(x13, x0, Operand(x1, SXTX, 3));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x00000081, x6);
+  ASSERT_EQUAL_64(0x00010101, x7);
+  ASSERT_EQUAL_64(0x00020201, x8);
+  ASSERT_EQUAL_64(0x0000000400040401UL, x9);
+  ASSERT_EQUAL_64(0x00000000ffffff81UL, x10);
+  ASSERT_EQUAL_64(0xffffffffffff0101UL, x11);
+  ASSERT_EQUAL_64(0xfffffffe00020201UL, x12);
+  ASSERT_EQUAL_64(0x0000000400040401UL, x13);
+
+  TEARDOWN();
+}
+
+
+TEST(bitwise_wide_imm) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0);
+  __ Mov(x1, 0xf0f0f0f0f0f0f0f0UL);
+
+  __ Orr(x10, x0, Operand(0x1234567890abcdefUL));
+  __ Orr(w11, w1, Operand(0x90abcdef));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0, x0);
+  ASSERT_EQUAL_64(0xf0f0f0f0f0f0f0f0UL, x1);
+  ASSERT_EQUAL_64(0x1234567890abcdefUL, x10);
+  ASSERT_EQUAL_64(0xf0fbfdffUL, x11);
+
+  TEARDOWN();
+}
+
+
+TEST(orn) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0xf0f0);
+  __ Mov(x1, 0xf00000ff);
+
+  __ Orn(x2, x0, Operand(x1));
+  __ Orn(w3, w0, Operand(w1, LSL, 4));
+  __ Orn(x4, x0, Operand(x1, LSL, 4));
+  __ Orn(x5, x0, Operand(x1, LSR, 1));
+  __ Orn(w6, w0, Operand(w1, ASR, 1));
+  __ Orn(x7, x0, Operand(x1, ASR, 1));
+  __ Orn(w8, w0, Operand(w1, ROR, 16));
+  __ Orn(x9, x0, Operand(x1, ROR, 16));
+  __ Orn(w10, w0, Operand(0xffff));
+  __ Orn(x11, x0, Operand(0xffff0000ffffL));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xffffffff0ffffff0L, x2);
+  ASSERT_EQUAL_64(0xfffff0ff, x3);
+  ASSERT_EQUAL_64(0xfffffff0fffff0ffL, x4);
+  ASSERT_EQUAL_64(0xffffffff87fffff0L, x5);
+  ASSERT_EQUAL_64(0x07fffff0, x6);
+  ASSERT_EQUAL_64(0xffffffff87fffff0L, x7);
+  ASSERT_EQUAL_64(0xff00ffff, x8);
+  ASSERT_EQUAL_64(0xff00ffffffffffffL, x9);
+  ASSERT_EQUAL_64(0xfffff0f0, x10);
+  ASSERT_EQUAL_64(0xffff0000fffff0f0L, x11);
+
+  TEARDOWN();
+}
+
+
+TEST(orn_extend) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 1);
+  __ Mov(x1, 0x8000000080008081UL);
+  __ Orn(w6, w0, Operand(w1, UXTB));
+  __ Orn(x7, x0, Operand(x1, UXTH, 1));
+  __ Orn(w8, w0, Operand(w1, UXTW, 2));
+  __ Orn(x9, x0, Operand(x1, UXTX, 3));
+  __ Orn(w10, w0, Operand(w1, SXTB));
+  __ Orn(x11, x0, Operand(x1, SXTH, 1));
+  __ Orn(x12, x0, Operand(x1, SXTW, 2));
+  __ Orn(x13, x0, Operand(x1, SXTX, 3));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xffffff7f, x6);
+  ASSERT_EQUAL_64(0xfffffffffffefefdUL, x7);
+  ASSERT_EQUAL_64(0xfffdfdfb, x8);
+  ASSERT_EQUAL_64(0xfffffffbfffbfbf7UL, x9);
+  ASSERT_EQUAL_64(0x0000007f, x10);
+  ASSERT_EQUAL_64(0x0000fefd, x11);
+  ASSERT_EQUAL_64(0x00000001fffdfdfbUL, x12);
+  ASSERT_EQUAL_64(0xfffffffbfffbfbf7UL, x13);
+
+  TEARDOWN();
+}
+
+
+TEST(and_) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0xfff0);
+  __ Mov(x1, 0xf00000ff);
+
+  __ And(x2, x0, Operand(x1));
+  __ And(w3, w0, Operand(w1, LSL, 4));
+  __ And(x4, x0, Operand(x1, LSL, 4));
+  __ And(x5, x0, Operand(x1, LSR, 1));
+  __ And(w6, w0, Operand(w1, ASR, 20));
+  __ And(x7, x0, Operand(x1, ASR, 20));
+  __ And(w8, w0, Operand(w1, ROR, 28));
+  __ And(x9, x0, Operand(x1, ROR, 28));
+  __ And(w10, w0, Operand(0xff00));
+  __ And(x11, x0, Operand(0xff));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x000000f0, x2);
+  ASSERT_EQUAL_64(0x00000ff0, x3);
+  ASSERT_EQUAL_64(0x00000ff0, x4);
+  ASSERT_EQUAL_64(0x00000070, x5);
+  ASSERT_EQUAL_64(0x0000ff00, x6);
+  ASSERT_EQUAL_64(0x00000f00, x7);
+  ASSERT_EQUAL_64(0x00000ff0, x8);
+  ASSERT_EQUAL_64(0x00000000, x9);
+  ASSERT_EQUAL_64(0x0000ff00, x10);
+  ASSERT_EQUAL_64(0x000000f0, x11);
+
+  TEARDOWN();
+}
+
+
+TEST(and_extend) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0xffffffffffffffffUL);
+  __ Mov(x1, 0x8000000080008081UL);
+  __ And(w6, w0, Operand(w1, UXTB));
+  __ And(x7, x0, Operand(x1, UXTH, 1));
+  __ And(w8, w0, Operand(w1, UXTW, 2));
+  __ And(x9, x0, Operand(x1, UXTX, 3));
+  __ And(w10, w0, Operand(w1, SXTB));
+  __ And(x11, x0, Operand(x1, SXTH, 1));
+  __ And(x12, x0, Operand(x1, SXTW, 2));
+  __ And(x13, x0, Operand(x1, SXTX, 3));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x00000081, x6);
+  ASSERT_EQUAL_64(0x00010102, x7);
+  ASSERT_EQUAL_64(0x00020204, x8);
+  ASSERT_EQUAL_64(0x0000000400040408UL, x9);
+  ASSERT_EQUAL_64(0xffffff81, x10);
+  ASSERT_EQUAL_64(0xffffffffffff0102UL, x11);
+  ASSERT_EQUAL_64(0xfffffffe00020204UL, x12);
+  ASSERT_EQUAL_64(0x0000000400040408UL, x13);
+
+  TEARDOWN();
+}
+
+
+TEST(ands) {
+  SETUP();
+
+  START();
+  __ Mov(x1, 0xf00000ff);
+  __ And(w0, w1, Operand(w1), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(NFlag);
+  ASSERT_EQUAL_64(0xf00000ff, x0);
+
+  START();
+  __ Mov(x0, 0xfff0);
+  __ Mov(x1, 0xf00000ff);
+  __ And(w0, w0, Operand(w1, LSR, 4), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(ZFlag);
+  ASSERT_EQUAL_64(0x00000000, x0);
+
+  START();
+  __ Mov(x0, 0x8000000000000000L);
+  __ Mov(x1, 0x00000001);
+  __ And(x0, x0, Operand(x1, ROR, 1), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(NFlag);
+  ASSERT_EQUAL_64(0x8000000000000000L, x0);
+
+  START();
+  __ Mov(x0, 0xfff0);
+  __ And(w0, w0, Operand(0xf), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(ZFlag);
+  ASSERT_EQUAL_64(0x00000000, x0);
+
+  START();
+  __ Mov(x0, 0xff000000);
+  __ And(w0, w0, Operand(0x80000000), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(NFlag);
+  ASSERT_EQUAL_64(0x80000000, x0);
+
+  TEARDOWN();
+}
+
+
+TEST(bic) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0xfff0);
+  __ Mov(x1, 0xf00000ff);
+
+  __ Bic(x2, x0, Operand(x1));
+  __ Bic(w3, w0, Operand(w1, LSL, 4));
+  __ Bic(x4, x0, Operand(x1, LSL, 4));
+  __ Bic(x5, x0, Operand(x1, LSR, 1));
+  __ Bic(w6, w0, Operand(w1, ASR, 20));
+  __ Bic(x7, x0, Operand(x1, ASR, 20));
+  __ Bic(w8, w0, Operand(w1, ROR, 28));
+  __ Bic(x9, x0, Operand(x1, ROR, 24));
+  __ Bic(x10, x0, Operand(0x1f));
+  __ Bic(x11, x0, Operand(0x100));
+
+  // Test bic into sp when the constant cannot be encoded in the immediate
+  // field.
+  // Use x20 to preserve sp. We check the result via x21 because the test
+  // infrastructure requires that sp be restored to its original value.
+  __ Mov(x20, sp);
+  __ Mov(x0, 0xffffff);
+  __ Bic(sp, x0, Operand(0xabcdef));
+  __ Mov(x21, sp);
+  __ Mov(sp, x20);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x0000ff00, x2);
+  ASSERT_EQUAL_64(0x0000f000, x3);
+  ASSERT_EQUAL_64(0x0000f000, x4);
+  ASSERT_EQUAL_64(0x0000ff80, x5);
+  ASSERT_EQUAL_64(0x000000f0, x6);
+  ASSERT_EQUAL_64(0x0000f0f0, x7);
+  ASSERT_EQUAL_64(0x0000f000, x8);
+  ASSERT_EQUAL_64(0x0000ff00, x9);
+  ASSERT_EQUAL_64(0x0000ffe0, x10);
+  ASSERT_EQUAL_64(0x0000fef0, x11);
+
+  ASSERT_EQUAL_64(0x543210, x21);
+
+  TEARDOWN();
+}
+
+
+TEST(bic_extend) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0xffffffffffffffffUL);
+  __ Mov(x1, 0x8000000080008081UL);
+  __ Bic(w6, w0, Operand(w1, UXTB));
+  __ Bic(x7, x0, Operand(x1, UXTH, 1));
+  __ Bic(w8, w0, Operand(w1, UXTW, 2));
+  __ Bic(x9, x0, Operand(x1, UXTX, 3));
+  __ Bic(w10, w0, Operand(w1, SXTB));
+  __ Bic(x11, x0, Operand(x1, SXTH, 1));
+  __ Bic(x12, x0, Operand(x1, SXTW, 2));
+  __ Bic(x13, x0, Operand(x1, SXTX, 3));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xffffff7e, x6);
+  ASSERT_EQUAL_64(0xfffffffffffefefdUL, x7);
+  ASSERT_EQUAL_64(0xfffdfdfb, x8);
+  ASSERT_EQUAL_64(0xfffffffbfffbfbf7UL, x9);
+  ASSERT_EQUAL_64(0x0000007e, x10);
+  ASSERT_EQUAL_64(0x0000fefd, x11);
+  ASSERT_EQUAL_64(0x00000001fffdfdfbUL, x12);
+  ASSERT_EQUAL_64(0xfffffffbfffbfbf7UL, x13);
+
+  TEARDOWN();
+}
+
+
+TEST(bics) {
+  SETUP();
+
+  START();
+  __ Mov(x1, 0xffff);
+  __ Bic(w0, w1, Operand(w1), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(ZFlag);
+  ASSERT_EQUAL_64(0x00000000, x0);
+
+  START();
+  __ Mov(x0, 0xffffffff);
+  __ Bic(w0, w0, Operand(w0, LSR, 1), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(NFlag);
+  ASSERT_EQUAL_64(0x80000000, x0);
+
+  START();
+  __ Mov(x0, 0x8000000000000000L);
+  __ Mov(x1, 0x00000001);
+  __ Bic(x0, x0, Operand(x1, ROR, 1), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(ZFlag);
+  ASSERT_EQUAL_64(0x00000000, x0);
+
+  START();
+  __ Mov(x0, 0xffffffffffffffffL);
+  __ Bic(x0, x0, Operand(0x7fffffffffffffffL), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(NFlag);
+  ASSERT_EQUAL_64(0x8000000000000000L, x0);
+
+  START();
+  __ Mov(w0, 0xffff0000);
+  __ Bic(w0, w0, Operand(0xfffffff0), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(ZFlag);
+  ASSERT_EQUAL_64(0x00000000, x0);
+
+  TEARDOWN();
+}
+
+
+TEST(eor) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0xfff0);
+  __ Mov(x1, 0xf00000ff);
+
+  __ Eor(x2, x0, Operand(x1));
+  __ Eor(w3, w0, Operand(w1, LSL, 4));
+  __ Eor(x4, x0, Operand(x1, LSL, 4));
+  __ Eor(x5, x0, Operand(x1, LSR, 1));
+  __ Eor(w6, w0, Operand(w1, ASR, 20));
+  __ Eor(x7, x0, Operand(x1, ASR, 20));
+  __ Eor(w8, w0, Operand(w1, ROR, 28));
+  __ Eor(x9, x0, Operand(x1, ROR, 28));
+  __ Eor(w10, w0, Operand(0xff00ff00));
+  __ Eor(x11, x0, Operand(0xff00ff00ff00ff00L));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xf000ff0f, x2);
+  ASSERT_EQUAL_64(0x0000f000, x3);
+  ASSERT_EQUAL_64(0x0000000f0000f000L, x4);
+  ASSERT_EQUAL_64(0x7800ff8f, x5);
+  ASSERT_EQUAL_64(0xffff00f0, x6);
+  ASSERT_EQUAL_64(0x0000f0f0, x7);
+  ASSERT_EQUAL_64(0x0000f00f, x8);
+  ASSERT_EQUAL_64(0x00000ff00000ffffL, x9);
+  ASSERT_EQUAL_64(0xff0000f0, x10);
+  ASSERT_EQUAL_64(0xff00ff00ff0000f0L, x11);
+
+  TEARDOWN();
+}
+
+TEST(eor_extend) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0x1111111111111111UL);
+  __ Mov(x1, 0x8000000080008081UL);
+  __ Eor(w6, w0, Operand(w1, UXTB));
+  __ Eor(x7, x0, Operand(x1, UXTH, 1));
+  __ Eor(w8, w0, Operand(w1, UXTW, 2));
+  __ Eor(x9, x0, Operand(x1, UXTX, 3));
+  __ Eor(w10, w0, Operand(w1, SXTB));
+  __ Eor(x11, x0, Operand(x1, SXTH, 1));
+  __ Eor(x12, x0, Operand(x1, SXTW, 2));
+  __ Eor(x13, x0, Operand(x1, SXTX, 3));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x11111190, x6);
+  ASSERT_EQUAL_64(0x1111111111101013UL, x7);
+  ASSERT_EQUAL_64(0x11131315, x8);
+  ASSERT_EQUAL_64(0x1111111511151519UL, x9);
+  ASSERT_EQUAL_64(0xeeeeee90, x10);
+  ASSERT_EQUAL_64(0xeeeeeeeeeeee1013UL, x11);
+  ASSERT_EQUAL_64(0xeeeeeeef11131315UL, x12);
+  ASSERT_EQUAL_64(0x1111111511151519UL, x13);
+
+  TEARDOWN();
+}
+
+
+TEST(eon) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0xfff0);
+  __ Mov(x1, 0xf00000ff);
+
+  __ Eon(x2, x0, Operand(x1));
+  __ Eon(w3, w0, Operand(w1, LSL, 4));
+  __ Eon(x4, x0, Operand(x1, LSL, 4));
+  __ Eon(x5, x0, Operand(x1, LSR, 1));
+  __ Eon(w6, w0, Operand(w1, ASR, 20));
+  __ Eon(x7, x0, Operand(x1, ASR, 20));
+  __ Eon(w8, w0, Operand(w1, ROR, 28));
+  __ Eon(x9, x0, Operand(x1, ROR, 28));
+  __ Eon(w10, w0, Operand(0x03c003c0));
+  __ Eon(x11, x0, Operand(0x0000100000001000L));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xffffffff0fff00f0L, x2);
+  ASSERT_EQUAL_64(0xffff0fff, x3);
+  ASSERT_EQUAL_64(0xfffffff0ffff0fffL, x4);
+  ASSERT_EQUAL_64(0xffffffff87ff0070L, x5);
+  ASSERT_EQUAL_64(0x0000ff0f, x6);
+  ASSERT_EQUAL_64(0xffffffffffff0f0fL, x7);
+  ASSERT_EQUAL_64(0xffff0ff0, x8);
+  ASSERT_EQUAL_64(0xfffff00fffff0000L, x9);
+  ASSERT_EQUAL_64(0xfc3f03cf, x10);
+  ASSERT_EQUAL_64(0xffffefffffff100fL, x11);
+
+  TEARDOWN();
+}
+
+
+TEST(eon_extend) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0x1111111111111111UL);
+  __ Mov(x1, 0x8000000080008081UL);
+  __ Eon(w6, w0, Operand(w1, UXTB));
+  __ Eon(x7, x0, Operand(x1, UXTH, 1));
+  __ Eon(w8, w0, Operand(w1, UXTW, 2));
+  __ Eon(x9, x0, Operand(x1, UXTX, 3));
+  __ Eon(w10, w0, Operand(w1, SXTB));
+  __ Eon(x11, x0, Operand(x1, SXTH, 1));
+  __ Eon(x12, x0, Operand(x1, SXTW, 2));
+  __ Eon(x13, x0, Operand(x1, SXTX, 3));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xeeeeee6f, x6);
+  ASSERT_EQUAL_64(0xeeeeeeeeeeefefecUL, x7);
+  ASSERT_EQUAL_64(0xeeececea, x8);
+  ASSERT_EQUAL_64(0xeeeeeeeaeeeaeae6UL, x9);
+  ASSERT_EQUAL_64(0x1111116f, x10);
+  ASSERT_EQUAL_64(0x111111111111efecUL, x11);
+  ASSERT_EQUAL_64(0x11111110eeececeaUL, x12);
+  ASSERT_EQUAL_64(0xeeeeeeeaeeeaeae6UL, x13);
+
+  TEARDOWN();
+}
+
+
+TEST(mul) {
+  SETUP();
+
+  START();
+  __ Mov(x16, 0);
+  __ Mov(x17, 1);
+  __ Mov(x18, 0xffffffff);
+  __ Mov(x19, 0xffffffffffffffffUL);
+
+  __ Mul(w0, w16, w16);
+  __ Mul(w1, w16, w17);
+  __ Mul(w2, w17, w18);
+  __ Mul(w3, w18, w19);
+  __ Mul(x4, x16, x16);
+  __ Mul(x5, x17, x18);
+  __ Mul(x6, x18, x19);
+  __ Mul(x7, x19, x19);
+  __ Smull(x8, w17, w18);
+  __ Smull(x9, w18, w18);
+  __ Smull(x10, w19, w19);
+  __ Mneg(w11, w16, w16);
+  __ Mneg(w12, w16, w17);
+  __ Mneg(w13, w17, w18);
+  __ Mneg(w14, w18, w19);
+  __ Mneg(x20, x16, x16);
+  __ Mneg(x21, x17, x18);
+  __ Mneg(x22, x18, x19);
+  __ Mneg(x23, x19, x19);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0, x0);
+  ASSERT_EQUAL_64(0, x1);
+  ASSERT_EQUAL_64(0xffffffff, x2);
+  ASSERT_EQUAL_64(1, x3);
+  ASSERT_EQUAL_64(0, x4);
+  ASSERT_EQUAL_64(0xffffffff, x5);
+  ASSERT_EQUAL_64(0xffffffff00000001UL, x6);
+  ASSERT_EQUAL_64(1, x7);
+  ASSERT_EQUAL_64(0xffffffffffffffffUL, x8);
+  ASSERT_EQUAL_64(1, x9);
+  ASSERT_EQUAL_64(1, x10);
+  ASSERT_EQUAL_64(0, x11);
+  ASSERT_EQUAL_64(0, x12);
+  ASSERT_EQUAL_64(1, x13);
+  ASSERT_EQUAL_64(0xffffffff, x14);
+  ASSERT_EQUAL_64(0, x20);
+  ASSERT_EQUAL_64(0xffffffff00000001UL, x21);
+  ASSERT_EQUAL_64(0xffffffff, x22);
+  ASSERT_EQUAL_64(0xffffffffffffffffUL, x23);
+
+  TEARDOWN();
+}
+
+
+TEST(madd) {
+  SETUP();
+
+  START();
+  __ Mov(x16, 0);
+  __ Mov(x17, 1);
+  __ Mov(x18, 0xffffffff);
+  __ Mov(x19, 0xffffffffffffffffUL);
+
+  __ Madd(w0, w16, w16, w16);
+  __ Madd(w1, w16, w16, w17);
+  __ Madd(w2, w16, w16, w18);
+  __ Madd(w3, w16, w16, w19);
+  __ Madd(w4, w16, w17, w17);
+  __ Madd(w5, w17, w17, w18);
+  __ Madd(w6, w17, w17, w19);
+  __ Madd(w7, w17, w18, w16);
+  __ Madd(w8, w17, w18, w18);
+  __ Madd(w9, w18, w18, w17);
+  __ Madd(w10, w18, w19, w18);
+  __ Madd(w11, w19, w19, w19);
+
+  __ Madd(x12, x16, x16, x16);
+  __ Madd(x13, x16, x16, x17);
+  __ Madd(x14, x16, x16, x18);
+  __ Madd(x15, x16, x16, x19);
+  __ Madd(x20, x16, x17, x17);
+  __ Madd(x21, x17, x17, x18);
+  __ Madd(x22, x17, x17, x19);
+  __ Madd(x23, x17, x18, x16);
+  __ Madd(x24, x17, x18, x18);
+  __ Madd(x25, x18, x18, x17);
+  __ Madd(x26, x18, x19, x18);
+  __ Madd(x27, x19, x19, x19);
+
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0, x0);
+  ASSERT_EQUAL_64(1, x1);
+  ASSERT_EQUAL_64(0xffffffff, x2);
+  ASSERT_EQUAL_64(0xffffffff, x3);
+  ASSERT_EQUAL_64(1, x4);
+  ASSERT_EQUAL_64(0, x5);
+  ASSERT_EQUAL_64(0, x6);
+  ASSERT_EQUAL_64(0xffffffff, x7);
+  ASSERT_EQUAL_64(0xfffffffe, x8);
+  ASSERT_EQUAL_64(2, x9);
+  ASSERT_EQUAL_64(0, x10);
+  ASSERT_EQUAL_64(0, x11);
+
+  ASSERT_EQUAL_64(0, x12);
+  ASSERT_EQUAL_64(1, x13);
+  ASSERT_EQUAL_64(0xffffffff, x14);
+  ASSERT_EQUAL_64(0xffffffffffffffffUL, x15);
+  ASSERT_EQUAL_64(1, x20);
+  ASSERT_EQUAL_64(0x100000000UL, x21);
+  ASSERT_EQUAL_64(0, x22);
+  ASSERT_EQUAL_64(0xffffffff, x23);
+  ASSERT_EQUAL_64(0x1fffffffe, x24);
+  ASSERT_EQUAL_64(0xfffffffe00000002UL, x25);
+  ASSERT_EQUAL_64(0, x26);
+  ASSERT_EQUAL_64(0, x27);
+
+  TEARDOWN();
+}
+
+
+TEST(msub) {
+  SETUP();
+
+  START();
+  __ Mov(x16, 0);
+  __ Mov(x17, 1);
+  __ Mov(x18, 0xffffffff);
+  __ Mov(x19, 0xffffffffffffffffUL);
+
+  __ Msub(w0, w16, w16, w16);
+  __ Msub(w1, w16, w16, w17);
+  __ Msub(w2, w16, w16, w18);
+  __ Msub(w3, w16, w16, w19);
+  __ Msub(w4, w16, w17, w17);
+  __ Msub(w5, w17, w17, w18);
+  __ Msub(w6, w17, w17, w19);
+  __ Msub(w7, w17, w18, w16);
+  __ Msub(w8, w17, w18, w18);
+  __ Msub(w9, w18, w18, w17);
+  __ Msub(w10, w18, w19, w18);
+  __ Msub(w11, w19, w19, w19);
+
+  __ Msub(x12, x16, x16, x16);
+  __ Msub(x13, x16, x16, x17);
+  __ Msub(x14, x16, x16, x18);
+  __ Msub(x15, x16, x16, x19);
+  __ Msub(x20, x16, x17, x17);
+  __ Msub(x21, x17, x17, x18);
+  __ Msub(x22, x17, x17, x19);
+  __ Msub(x23, x17, x18, x16);
+  __ Msub(x24, x17, x18, x18);
+  __ Msub(x25, x18, x18, x17);
+  __ Msub(x26, x18, x19, x18);
+  __ Msub(x27, x19, x19, x19);
+
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0, x0);
+  ASSERT_EQUAL_64(1, x1);
+  ASSERT_EQUAL_64(0xffffffff, x2);
+  ASSERT_EQUAL_64(0xffffffff, x3);
+  ASSERT_EQUAL_64(1, x4);
+  ASSERT_EQUAL_64(0xfffffffe, x5);
+  ASSERT_EQUAL_64(0xfffffffe, x6);
+  ASSERT_EQUAL_64(1, x7);
+  ASSERT_EQUAL_64(0, x8);
+  ASSERT_EQUAL_64(0, x9);
+  ASSERT_EQUAL_64(0xfffffffe, x10);
+  ASSERT_EQUAL_64(0xfffffffe, x11);
+
+  ASSERT_EQUAL_64(0, x12);
+  ASSERT_EQUAL_64(1, x13);
+  ASSERT_EQUAL_64(0xffffffff, x14);
+  ASSERT_EQUAL_64(0xffffffffffffffffUL, x15);
+  ASSERT_EQUAL_64(1, x20);
+  ASSERT_EQUAL_64(0xfffffffeUL, x21);
+  ASSERT_EQUAL_64(0xfffffffffffffffeUL, x22);
+  ASSERT_EQUAL_64(0xffffffff00000001UL, x23);
+  ASSERT_EQUAL_64(0, x24);
+  ASSERT_EQUAL_64(0x200000000UL, x25);
+  ASSERT_EQUAL_64(0x1fffffffeUL, x26);
+  ASSERT_EQUAL_64(0xfffffffffffffffeUL, x27);
+
+  TEARDOWN();
+}
+
+
+TEST(smulh) {
+  SETUP();
+
+  START();
+  __ Mov(x20, 0);
+  __ Mov(x21, 1);
+  __ Mov(x22, 0x0000000100000000L);
+  __ Mov(x23, 0x12345678);
+  __ Mov(x24, 0x0123456789abcdefL);
+  __ Mov(x25, 0x0000000200000000L);
+  __ Mov(x26, 0x8000000000000000UL);
+  __ Mov(x27, 0xffffffffffffffffUL);
+  __ Mov(x28, 0x5555555555555555UL);
+  __ Mov(x29, 0xaaaaaaaaaaaaaaaaUL);
+
+  __ Smulh(x0, x20, x24);
+  __ Smulh(x1, x21, x24);
+  __ Smulh(x2, x22, x23);
+  __ Smulh(x3, x22, x24);
+  __ Smulh(x4, x24, x25);
+  __ Smulh(x5, x23, x27);
+  __ Smulh(x6, x26, x26);
+  __ Smulh(x7, x26, x27);
+  __ Smulh(x8, x27, x27);
+  __ Smulh(x9, x28, x28);
+  __ Smulh(x10, x28, x29);
+  __ Smulh(x11, x29, x29);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0, x0);
+  ASSERT_EQUAL_64(0, x1);
+  ASSERT_EQUAL_64(0, x2);
+  ASSERT_EQUAL_64(0x01234567, x3);
+  ASSERT_EQUAL_64(0x02468acf, x4);
+  ASSERT_EQUAL_64(0xffffffffffffffffUL, x5);
+  ASSERT_EQUAL_64(0x4000000000000000UL, x6);
+  ASSERT_EQUAL_64(0, x7);
+  ASSERT_EQUAL_64(0, x8);
+  ASSERT_EQUAL_64(0x1c71c71c71c71c71UL, x9);
+  ASSERT_EQUAL_64(0xe38e38e38e38e38eUL, x10);
+  ASSERT_EQUAL_64(0x1c71c71c71c71c72UL, x11);
+
+  TEARDOWN();
+}
+
+
+TEST(smaddl_umaddl) {
+  SETUP();
+
+  START();
+  __ Mov(x17, 1);
+  __ Mov(x18, 0xffffffff);
+  __ Mov(x19, 0xffffffffffffffffUL);
+  __ Mov(x20, 4);
+  __ Mov(x21, 0x200000000UL);
+
+  __ Smaddl(x9, w17, w18, x20);
+  __ Smaddl(x10, w18, w18, x20);
+  __ Smaddl(x11, w19, w19, x20);
+  __ Smaddl(x12, w19, w19, x21);
+  __ Umaddl(x13, w17, w18, x20);
+  __ Umaddl(x14, w18, w18, x20);
+  __ Umaddl(x15, w19, w19, x20);
+  __ Umaddl(x22, w19, w19, x21);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(3, x9);
+  ASSERT_EQUAL_64(5, x10);
+  ASSERT_EQUAL_64(5, x11);
+  ASSERT_EQUAL_64(0x200000001UL, x12);
+  ASSERT_EQUAL_64(0x100000003UL, x13);
+  ASSERT_EQUAL_64(0xfffffffe00000005UL, x14);
+  ASSERT_EQUAL_64(0xfffffffe00000005UL, x15);
+  ASSERT_EQUAL_64(0x1, x22);
+
+  TEARDOWN();
+}
+
+
+TEST(smsubl_umsubl) {
+  SETUP();
+
+  START();
+  __ Mov(x17, 1);
+  __ Mov(x18, 0xffffffff);
+  __ Mov(x19, 0xffffffffffffffffUL);
+  __ Mov(x20, 4);
+  __ Mov(x21, 0x200000000UL);
+
+  __ Smsubl(x9, w17, w18, x20);
+  __ Smsubl(x10, w18, w18, x20);
+  __ Smsubl(x11, w19, w19, x20);
+  __ Smsubl(x12, w19, w19, x21);
+  __ Umsubl(x13, w17, w18, x20);
+  __ Umsubl(x14, w18, w18, x20);
+  __ Umsubl(x15, w19, w19, x20);
+  __ Umsubl(x22, w19, w19, x21);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(5, x9);
+  ASSERT_EQUAL_64(3, x10);
+  ASSERT_EQUAL_64(3, x11);
+  ASSERT_EQUAL_64(0x1ffffffffUL, x12);
+  ASSERT_EQUAL_64(0xffffffff00000005UL, x13);
+  ASSERT_EQUAL_64(0x200000003UL, x14);
+  ASSERT_EQUAL_64(0x200000003UL, x15);
+  ASSERT_EQUAL_64(0x3ffffffffUL, x22);
+
+  TEARDOWN();
+}
+
+
+TEST(div) {
+  SETUP();
+
+  START();
+  __ Mov(x16, 1);
+  __ Mov(x17, 0xffffffff);
+  __ Mov(x18, 0xffffffffffffffffUL);
+  __ Mov(x19, 0x80000000);
+  __ Mov(x20, 0x8000000000000000UL);
+  __ Mov(x21, 2);
+
+  __ Udiv(w0, w16, w16);
+  __ Udiv(w1, w17, w16);
+  __ Sdiv(w2, w16, w16);
+  __ Sdiv(w3, w16, w17);
+  __ Sdiv(w4, w17, w18);
+
+  __ Udiv(x5, x16, x16);
+  __ Udiv(x6, x17, x18);
+  __ Sdiv(x7, x16, x16);
+  __ Sdiv(x8, x16, x17);
+  __ Sdiv(x9, x17, x18);
+
+  __ Udiv(w10, w19, w21);
+  __ Sdiv(w11, w19, w21);
+  __ Udiv(x12, x19, x21);
+  __ Sdiv(x13, x19, x21);
+  __ Udiv(x14, x20, x21);
+  __ Sdiv(x15, x20, x21);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(1, x0);
+  ASSERT_EQUAL_64(0xffffffff, x1);
+  ASSERT_EQUAL_64(1, x2);
+  ASSERT_EQUAL_64(0xffffffff, x3);
+  ASSERT_EQUAL_64(1, x4);
+  ASSERT_EQUAL_64(1, x5);
+  ASSERT_EQUAL_64(0, x6);
+  ASSERT_EQUAL_64(1, x7);
+  ASSERT_EQUAL_64(0, x8);
+  ASSERT_EQUAL_64(0xffffffff00000001UL, x9);
+  ASSERT_EQUAL_64(0x40000000, x10);
+  ASSERT_EQUAL_64(0xC0000000, x11);
+  ASSERT_EQUAL_64(0x40000000, x12);
+  ASSERT_EQUAL_64(0x40000000, x13);
+  ASSERT_EQUAL_64(0x4000000000000000UL, x14);
+  ASSERT_EQUAL_64(0xC000000000000000UL, x15);
+
+  TEARDOWN();
+}
+
+
+TEST(rbit_rev) {
+  SETUP();
+
+  START();
+  __ Mov(x24, 0xfedcba9876543210UL);
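+  // Rbit reverses the bit order; Rev16 reverses the bytes in each halfword,
+  // Rev32 the bytes in each word and Rev all the bytes in the register.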
+  __ Rbit(w0, w24);
+  __ Rbit(x1, x24);
+  __ Rev16(w2, w24);
+  __ Rev16(x3, x24);
+  __ Rev(w4, w24);
+  __ Rev32(x5, x24);
+  __ Rev(x6, x24);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x084c2a6e, x0);
+  ASSERT_EQUAL_64(0x084c2a6e195d3b7fUL, x1);
+  ASSERT_EQUAL_64(0x54761032, x2);
+  ASSERT_EQUAL_64(0xdcfe98ba54761032UL, x3);
+  ASSERT_EQUAL_64(0x10325476, x4);
+  ASSERT_EQUAL_64(0x98badcfe10325476UL, x5);
+  ASSERT_EQUAL_64(0x1032547698badcfeUL, x6);
+
+  TEARDOWN();
+}
+
+
+TEST(clz_cls) {
+  SETUP();
+
+  START();
+  __ Mov(x24, 0x0008000000800000UL);
+  __ Mov(x25, 0xff800000fff80000UL);
+  __ Mov(x26, 0);
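+  // Clz counts leading zero bits; Cls counts the leading bits that match the
+  // sign bit, excluding the sign bit itself.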
+  __ Clz(w0, w24);
+  __ Clz(x1, x24);
+  __ Clz(w2, w25);
+  __ Clz(x3, x25);
+  __ Clz(w4, w26);
+  __ Clz(x5, x26);
+  __ Cls(w6, w24);
+  __ Cls(x7, x24);
+  __ Cls(w8, w25);
+  __ Cls(x9, x25);
+  __ Cls(w10, w26);
+  __ Cls(x11, x26);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(8, x0);
+  ASSERT_EQUAL_64(12, x1);
+  ASSERT_EQUAL_64(0, x2);
+  ASSERT_EQUAL_64(0, x3);
+  ASSERT_EQUAL_64(32, x4);
+  ASSERT_EQUAL_64(64, x5);
+  ASSERT_EQUAL_64(7, x6);
+  ASSERT_EQUAL_64(11, x7);
+  ASSERT_EQUAL_64(12, x8);
+  ASSERT_EQUAL_64(8, x9);
+  ASSERT_EQUAL_64(31, x10);
+  ASSERT_EQUAL_64(63, x11);
+
+  TEARDOWN();
+}
+
+
+TEST(label) {
+  SETUP();
+
+  Label label_1, label_2, label_3, label_4;
+
+  START();
+  __ Mov(x0, 0x1);
+  __ Mov(x1, 0x0);
+  __ Mov(x22, lr);    // Save lr.
+
+  __ B(&label_1);
+  __ B(&label_1);
+  __ B(&label_1);     // Multiple branches to the same label.
+  __ Mov(x0, 0x0);
+  __ Bind(&label_2);
+  __ B(&label_3);     // Forward branch.
+  __ Mov(x0, 0x0);
+  __ Bind(&label_1);
+  __ B(&label_2);     // Backward branch.
+  __ Mov(x0, 0x0);
+  __ Bind(&label_3);
+  __ Bl(&label_4);
+  END();
+
+  __ Bind(&label_4);
+  __ Mov(x1, 0x1);
+  __ Mov(lr, x22);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x1, x0);
+  ASSERT_EQUAL_64(0x1, x1);
+
+  TEARDOWN();
+}
+
+
+TEST(adr) {
+  SETUP();
+
+  Label label_1, label_2, label_3, label_4;
+
+  START();
+  __ Mov(x0, 0x0);        // x0 will be set to non-zero on failure.
+  __ Adr(x1, &label_3);   // x1 will be cleared to zero on success.
+
+  __ Adr(x2, &label_1);   // Multiple forward references to the same label.
+  __ Adr(x3, &label_1);
+  __ Adr(x4, &label_1);
+
+  __ Bind(&label_2);
+  __ Eor(x5, x2, Operand(x3));  // Ensure that x2, x3 and x4 are identical.
+  __ Eor(x6, x2, Operand(x4));
+  __ Orr(x0, x0, Operand(x5));
+  __ Orr(x0, x0, Operand(x6));
+  __ Br(x2);  // label_1, label_3
+
+  __ Bind(&label_3);
+  __ Adr(x2, &label_3);   // Self-reference (offset 0).
+  __ Eor(x1, x1, Operand(x2));
+  __ Adr(x2, &label_4);   // Simple forward reference.
+  __ Br(x2);  // label_4
+
+  __ Bind(&label_1);
+  __ Adr(x2, &label_3);   // Multiple reverse references to the same label.
+  __ Adr(x3, &label_3);
+  __ Adr(x4, &label_3);
+  __ Adr(x5, &label_2);   // Simple reverse reference.
+  __ Br(x5);  // label_2
+
+  __ Bind(&label_4);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x0, x0);
+  ASSERT_EQUAL_64(0x0, x1);
+
+  TEARDOWN();
+}
+
+
+TEST(branch_cond) {
+  SETUP();
+
+  Label wrong;
+
+  START();
+  __ Mov(x0, 0x1);
+  __ Mov(x1, 0x1);
+  __ Mov(x2, 0x8000000000000000L);
+
+  // For each 'cmp' instruction below, the branches to 'wrong' use conditions
+  // that the comparison must not produce, and the final branch to an 'ok'
+  // label uses a condition that it must produce.
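+  // With x1 = 1, 'cmp x1, #0' sets nzCv, 'cmp x1, #1' sets nZCv and
+  // 'cmp x1, #2' sets Nzcv. With x2 = INT64_MIN, 'cmp x2, #1' overflows and
+  // sets nzCV.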
+
+  __ Cmp(x1, Operand(0));
+  __ B(&wrong, eq);
+  __ B(&wrong, lo);
+  __ B(&wrong, mi);
+  __ B(&wrong, vs);
+  __ B(&wrong, ls);
+  __ B(&wrong, lt);
+  __ B(&wrong, le);
+  Label ok_1;
+  __ B(&ok_1, ne);
+  __ Mov(x0, 0x0);
+  __ Bind(&ok_1);
+
+  __ Cmp(x1, Operand(1));
+  __ B(&wrong, ne);
+  __ B(&wrong, lo);
+  __ B(&wrong, mi);
+  __ B(&wrong, vs);
+  __ B(&wrong, hi);
+  __ B(&wrong, lt);
+  __ B(&wrong, gt);
+  Label ok_2;
+  __ B(&ok_2, pl);
+  __ Mov(x0, 0x0);
+  __ Bind(&ok_2);
+
+  __ Cmp(x1, Operand(2));
+  __ B(&wrong, eq);
+  __ B(&wrong, hs);
+  __ B(&wrong, pl);
+  __ B(&wrong, vs);
+  __ B(&wrong, hi);
+  __ B(&wrong, ge);
+  __ B(&wrong, gt);
+  Label ok_3;
+  __ B(&ok_3, vc);
+  __ Mov(x0, 0x0);
+  __ Bind(&ok_3);
+
+  __ Cmp(x2, Operand(1));
+  __ B(&wrong, eq);
+  __ B(&wrong, lo);
+  __ B(&wrong, mi);
+  __ B(&wrong, vc);
+  __ B(&wrong, ls);
+  __ B(&wrong, ge);
+  __ B(&wrong, gt);
+  Label ok_4;
+  __ B(&ok_4, le);
+  __ Mov(x0, 0x0);
+  __ Bind(&ok_4);
+  END();
+
+  __ Bind(&wrong);
+  __ Mov(x0, 0x0);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x1, x0);
+
+  TEARDOWN();
+}
+
+
+TEST(branch_to_reg) {
+  SETUP();
+
+  // Test br.
+  Label fn1, after_fn1;
+
+  START();
+  __ Mov(x29, lr);
+
+  __ Mov(x1, 0);
+  __ B(&after_fn1);
+
+  __ Bind(&fn1);
+  __ Mov(x0, lr);
+  __ Mov(x1, 42);
+  __ Br(x0);
+
+  __ Bind(&after_fn1);
+  __ Bl(&fn1);
+
+  // Test blr.
+  Label fn2, after_fn2;
+
+  __ Mov(x2, 0);
+  __ B(&after_fn2);
+
+  __ Bind(&fn2);
+  __ Mov(x0, lr);
+  __ Mov(x2, 84);
+  __ Blr(x0);
+
+  __ Bind(&after_fn2);
+  __ Bl(&fn2);
+  __ Mov(x3, lr);
+
+  __ Mov(lr, x29);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(core.xreg(3) + kInstructionSize, x0);
+  ASSERT_EQUAL_64(42, x1);
+  ASSERT_EQUAL_64(84, x2);
+
+  TEARDOWN();
+}
+
+
+TEST(compare_branch) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0);
+  __ Mov(x1, 0);
+  __ Mov(x2, 0);
+  __ Mov(x3, 0);
+  __ Mov(x4, 0);
+  __ Mov(x5, 0);
+  __ Mov(x16, 0);
+  __ Mov(x17, 42);
+
+  Label zt, zt_end;
+  __ Cbz(w16, &zt);
+  __ B(&zt_end);
+  __ Bind(&zt);
+  __ Mov(x0, 1);
+  __ Bind(&zt_end);
+
+  Label zf, zf_end;
+  __ Cbz(x17, &zf);
+  __ B(&zf_end);
+  __ Bind(&zf);
+  __ Mov(x1, 1);
+  __ Bind(&zf_end);
+
+  Label nzt, nzt_end;
+  __ Cbnz(w17, &nzt);
+  __ B(&nzt_end);
+  __ Bind(&nzt);
+  __ Mov(x2, 1);
+  __ Bind(&nzt_end);
+
+  Label nzf, nzf_end;
+  __ Cbnz(x16, &nzf);
+  __ B(&nzf_end);
+  __ Bind(&nzf);
+  __ Mov(x3, 1);
+  __ Bind(&nzf_end);
+
+  __ Mov(x18, 0xffffffff00000000UL);
+
+  Label a, a_end;
+  __ Cbz(w18, &a);
+  __ B(&a_end);
+  __ Bind(&a);
+  __ Mov(x4, 1);
+  __ Bind(&a_end);
+
+  Label b, b_end;
+  __ Cbnz(w18, &b);
+  __ B(&b_end);
+  __ Bind(&b);
+  __ Mov(x5, 1);
+  __ Bind(&b_end);
+
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(1, x0);
+  ASSERT_EQUAL_64(0, x1);
+  ASSERT_EQUAL_64(1, x2);
+  ASSERT_EQUAL_64(0, x3);
+  ASSERT_EQUAL_64(1, x4);
+  ASSERT_EQUAL_64(0, x5);
+
+  TEARDOWN();
+}
+
+
+TEST(test_branch) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0);
+  __ Mov(x1, 0);
+  __ Mov(x2, 0);
+  __ Mov(x3, 0);
+  __ Mov(x16, 0xaaaaaaaaaaaaaaaaUL);
+
+  Label bz, bz_end;
+  __ Tbz(x16, 0, &bz);
+  __ B(&bz_end);
+  __ Bind(&bz);
+  __ Mov(x0, 1);
+  __ Bind(&bz_end);
+
+  Label bo, bo_end;
+  __ Tbz(x16, 63, &bo);
+  __ B(&bo_end);
+  __ Bind(&bo);
+  __ Mov(x1, 1);
+  __ Bind(&bo_end);
+
+  Label nbz, nbz_end;
+  __ Tbnz(x16, 61, &nbz);
+  __ B(&nbz_end);
+  __ Bind(&nbz);
+  __ Mov(x2, 1);
+  __ Bind(&nbz_end);
+
+  Label nbo, nbo_end;
+  __ Tbnz(x16, 2, &nbo);
+  __ B(&nbo_end);
+  __ Bind(&nbo);
+  __ Mov(x3, 1);
+  __ Bind(&nbo_end);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(1, x0);
+  ASSERT_EQUAL_64(0, x1);
+  ASSERT_EQUAL_64(1, x2);
+  ASSERT_EQUAL_64(0, x3);
+
+  TEARDOWN();
+}
+
+
+TEST(ldr_str_offset) {
+  SETUP();
+
+  uint64_t src[2] = {0xfedcba9876543210UL, 0x0123456789abcdefUL};
+  uint64_t dst[5] = {0, 0, 0, 0, 0};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+
+  START();
+  __ Mov(x17, src_base);
+  __ Mov(x18, dst_base);
+  __ Ldr(w0, MemOperand(x17));
+  __ Str(w0, MemOperand(x18));
+  __ Ldr(w1, MemOperand(x17, 4));
+  __ Str(w1, MemOperand(x18, 12));
+  __ Ldr(x2, MemOperand(x17, 8));
+  __ Str(x2, MemOperand(x18, 16));
+  __ Ldrb(w3, MemOperand(x17, 1));
+  __ Strb(w3, MemOperand(x18, 25));
+  __ Ldrh(w4, MemOperand(x17, 2));
+  __ Strh(w4, MemOperand(x18, 33));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x76543210, x0);
+  ASSERT_EQUAL_64(0x76543210, dst[0]);
+  ASSERT_EQUAL_64(0xfedcba98, x1);
+  ASSERT_EQUAL_64(0xfedcba9800000000UL, dst[1]);
+  ASSERT_EQUAL_64(0x0123456789abcdefUL, x2);
+  ASSERT_EQUAL_64(0x0123456789abcdefUL, dst[2]);
+  ASSERT_EQUAL_64(0x32, x3);
+  ASSERT_EQUAL_64(0x3200, dst[3]);
+  ASSERT_EQUAL_64(0x7654, x4);
+  ASSERT_EQUAL_64(0x765400, dst[4]);
+  ASSERT_EQUAL_64(src_base, x17);
+  ASSERT_EQUAL_64(dst_base, x18);
+
+  TEARDOWN();
+}
+
+
+TEST(ldr_str_wide) {
+  SETUP();
+
+  uint32_t src[8192];
+  uint32_t dst[8192];
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+  memset(src, 0xaa, 8192 * sizeof(src[0]));
+  memset(dst, 0xaa, 8192 * sizeof(dst[0]));
+  src[0] = 0;
+  src[6144] = 6144;
+  src[8191] = 8191;
+
+  START();
+  __ Mov(x22, src_base);
+  __ Mov(x23, dst_base);
+  __ Mov(x24, src_base);
+  __ Mov(x25, dst_base);
+  __ Mov(x26, src_base);
+  __ Mov(x27, dst_base);
+
+  __ Ldr(w0, MemOperand(x22, 8191 * sizeof(src[0])));
+  __ Str(w0, MemOperand(x23, 8191 * sizeof(dst[0])));
+  __ Ldr(w1, MemOperand(x24, 4096 * sizeof(src[0]), PostIndex));
+  __ Str(w1, MemOperand(x25, 4096 * sizeof(dst[0]), PostIndex));
+  __ Ldr(w2, MemOperand(x26, 6144 * sizeof(src[0]), PreIndex));
+  __ Str(w2, MemOperand(x27, 6144 * sizeof(dst[0]), PreIndex));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_32(8191, w0);
+  ASSERT_EQUAL_32(8191, dst[8191]);
+  ASSERT_EQUAL_64(src_base, x22);
+  ASSERT_EQUAL_64(dst_base, x23);
+  ASSERT_EQUAL_32(0, w1);
+  ASSERT_EQUAL_32(0, dst[0]);
+  ASSERT_EQUAL_64(src_base + 4096 * sizeof(src[0]), x24);
+  ASSERT_EQUAL_64(dst_base + 4096 * sizeof(dst[0]), x25);
+  ASSERT_EQUAL_32(6144, w2);
+  ASSERT_EQUAL_32(6144, dst[6144]);
+  ASSERT_EQUAL_64(src_base + 6144 * sizeof(src[0]), x26);
+  ASSERT_EQUAL_64(dst_base + 6144 * sizeof(dst[0]), x27);
+
+  TEARDOWN();
+}
+
+
+TEST(ldr_str_preindex) {
+  SETUP();
+
+  uint64_t src[2] = {0xfedcba9876543210UL, 0x0123456789abcdefUL};
+  uint64_t dst[6] = {0, 0, 0, 0, 0, 0};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+
+  START();
+  __ Mov(x17, src_base);
+  __ Mov(x18, dst_base);
+  __ Mov(x19, src_base);
+  __ Mov(x20, dst_base);
+  __ Mov(x21, src_base + 16);
+  __ Mov(x22, dst_base + 40);
+  __ Mov(x23, src_base);
+  __ Mov(x24, dst_base);
+  __ Mov(x25, src_base);
+  __ Mov(x26, dst_base);
+  __ Ldr(w0, MemOperand(x17, 4, PreIndex));
+  __ Str(w0, MemOperand(x18, 12, PreIndex));
+  __ Ldr(x1, MemOperand(x19, 8, PreIndex));
+  __ Str(x1, MemOperand(x20, 16, PreIndex));
+  __ Ldr(w2, MemOperand(x21, -4, PreIndex));
+  __ Str(w2, MemOperand(x22, -4, PreIndex));
+  __ Ldrb(w3, MemOperand(x23, 1, PreIndex));
+  __ Strb(w3, MemOperand(x24, 25, PreIndex));
+  __ Ldrh(w4, MemOperand(x25, 3, PreIndex));
+  __ Strh(w4, MemOperand(x26, 41, PreIndex));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xfedcba98, x0);
+  ASSERT_EQUAL_64(0xfedcba9800000000UL, dst[1]);
+  ASSERT_EQUAL_64(0x0123456789abcdefUL, x1);
+  ASSERT_EQUAL_64(0x0123456789abcdefUL, dst[2]);
+  ASSERT_EQUAL_64(0x01234567, x2);
+  ASSERT_EQUAL_64(0x0123456700000000UL, dst[4]);
+  ASSERT_EQUAL_64(0x32, x3);
+  ASSERT_EQUAL_64(0x3200, dst[3]);
+  ASSERT_EQUAL_64(0x9876, x4);
+  ASSERT_EQUAL_64(0x987600, dst[5]);
+  ASSERT_EQUAL_64(src_base + 4, x17);
+  ASSERT_EQUAL_64(dst_base + 12, x18);
+  ASSERT_EQUAL_64(src_base + 8, x19);
+  ASSERT_EQUAL_64(dst_base + 16, x20);
+  ASSERT_EQUAL_64(src_base + 12, x21);
+  ASSERT_EQUAL_64(dst_base + 36, x22);
+  ASSERT_EQUAL_64(src_base + 1, x23);
+  ASSERT_EQUAL_64(dst_base + 25, x24);
+  ASSERT_EQUAL_64(src_base + 3, x25);
+  ASSERT_EQUAL_64(dst_base + 41, x26);
+
+  TEARDOWN();
+}
+
+
+TEST(ldr_str_postindex) {
+  SETUP();
+
+  uint64_t src[2] = {0xfedcba9876543210UL, 0x0123456789abcdefUL};
+  uint64_t dst[6] = {0, 0, 0, 0, 0, 0};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+
+  START();
+  __ Mov(x17, src_base + 4);
+  __ Mov(x18, dst_base + 12);
+  __ Mov(x19, src_base + 8);
+  __ Mov(x20, dst_base + 16);
+  __ Mov(x21, src_base + 8);
+  __ Mov(x22, dst_base + 32);
+  __ Mov(x23, src_base + 1);
+  __ Mov(x24, dst_base + 25);
+  __ Mov(x25, src_base + 3);
+  __ Mov(x26, dst_base + 41);
+  __ Ldr(w0, MemOperand(x17, 4, PostIndex));
+  __ Str(w0, MemOperand(x18, 12, PostIndex));
+  __ Ldr(x1, MemOperand(x19, 8, PostIndex));
+  __ Str(x1, MemOperand(x20, 16, PostIndex));
+  __ Ldr(x2, MemOperand(x21, -8, PostIndex));
+  __ Str(x2, MemOperand(x22, -32, PostIndex));
+  __ Ldrb(w3, MemOperand(x23, 1, PostIndex));
+  __ Strb(w3, MemOperand(x24, 5, PostIndex));
+  __ Ldrh(w4, MemOperand(x25, -3, PostIndex));
+  __ Strh(w4, MemOperand(x26, -41, PostIndex));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xfedcba98, x0);
+  ASSERT_EQUAL_64(0xfedcba9800000000UL, dst[1]);
+  ASSERT_EQUAL_64(0x0123456789abcdefUL, x1);
+  ASSERT_EQUAL_64(0x0123456789abcdefUL, dst[2]);
+  ASSERT_EQUAL_64(0x0123456789abcdefUL, x2);
+  ASSERT_EQUAL_64(0x0123456789abcdefUL, dst[4]);
+  ASSERT_EQUAL_64(0x32, x3);
+  ASSERT_EQUAL_64(0x3200, dst[3]);
+  ASSERT_EQUAL_64(0x9876, x4);
+  ASSERT_EQUAL_64(0x987600, dst[5]);
+  ASSERT_EQUAL_64(src_base + 8, x17);
+  ASSERT_EQUAL_64(dst_base + 24, x18);
+  ASSERT_EQUAL_64(src_base + 16, x19);
+  ASSERT_EQUAL_64(dst_base + 32, x20);
+  ASSERT_EQUAL_64(src_base, x21);
+  ASSERT_EQUAL_64(dst_base, x22);
+  ASSERT_EQUAL_64(src_base + 2, x23);
+  ASSERT_EQUAL_64(dst_base + 30, x24);
+  ASSERT_EQUAL_64(src_base, x25);
+  ASSERT_EQUAL_64(dst_base, x26);
+
+  TEARDOWN();
+}
+
+
+TEST(ldr_str_largeindex) {
+  SETUP();
+
+  // This value won't fit in the immediate offset field of ldr/str instructions.
+  int largeoffset = 0xabcdef;
+
+  int64_t data[3] = { 0x1122334455667788, 0, 0 };
+  uintptr_t base_addr = reinterpret_cast<uintptr_t>(data);
+  uintptr_t drifted_addr = base_addr - largeoffset;
+
+  // This test checks that we can use large immediate offsets when using the
+  // PreIndex or PostIndex addressing modes of the MacroAssembler Ldr/Str
+  // instructions.
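+  // The raw pre- and post-indexed forms only encode a 9-bit signed immediate,
+  // so the MacroAssembler is expected to synthesise these accesses.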
+
+  START();
+  __ Mov(x17, drifted_addr);
+  __ Ldr(x0, MemOperand(x17, largeoffset, PreIndex));
+
+  __ Mov(x18, base_addr);
+  __ Ldr(x1, MemOperand(x18, largeoffset, PostIndex));
+
+  __ Mov(x19, drifted_addr);
+  __ Str(x0, MemOperand(x19, largeoffset + 8, PreIndex));
+
+  __ Mov(x20, base_addr + 16);
+  __ Str(x0, MemOperand(x20, largeoffset, PostIndex));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x1122334455667788, data[0]);
+  ASSERT_EQUAL_64(0x1122334455667788, data[1]);
+  ASSERT_EQUAL_64(0x1122334455667788, data[2]);
+  ASSERT_EQUAL_64(0x1122334455667788, x0);
+  ASSERT_EQUAL_64(0x1122334455667788, x1);
+
+  ASSERT_EQUAL_64(base_addr, x17);
+  ASSERT_EQUAL_64(base_addr + largeoffset, x18);
+  ASSERT_EQUAL_64(base_addr + 8, x19);
+  ASSERT_EQUAL_64(base_addr + 16 + largeoffset, x20);
+
+  TEARDOWN();
+}
+
+
+TEST(load_signed) {
+  SETUP();
+
+  uint32_t src[2] = {0x80008080, 0x7fff7f7f};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+
+  START();
+  __ Mov(x24, src_base);
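+  // Ldrsb, Ldrsh and Ldrsw sign-extend the loaded value to the width of the
+  // destination register.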
+  __ Ldrsb(w0, MemOperand(x24));
+  __ Ldrsb(w1, MemOperand(x24, 4));
+  __ Ldrsh(w2, MemOperand(x24));
+  __ Ldrsh(w3, MemOperand(x24, 4));
+  __ Ldrsb(x4, MemOperand(x24));
+  __ Ldrsb(x5, MemOperand(x24, 4));
+  __ Ldrsh(x6, MemOperand(x24));
+  __ Ldrsh(x7, MemOperand(x24, 4));
+  __ Ldrsw(x8, MemOperand(x24));
+  __ Ldrsw(x9, MemOperand(x24, 4));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xffffff80, x0);
+  ASSERT_EQUAL_64(0x0000007f, x1);
+  ASSERT_EQUAL_64(0xffff8080, x2);
+  ASSERT_EQUAL_64(0x00007f7f, x3);
+  ASSERT_EQUAL_64(0xffffffffffffff80UL, x4);
+  ASSERT_EQUAL_64(0x000000000000007fUL, x5);
+  ASSERT_EQUAL_64(0xffffffffffff8080UL, x6);
+  ASSERT_EQUAL_64(0x0000000000007f7fUL, x7);
+  ASSERT_EQUAL_64(0xffffffff80008080UL, x8);
+  ASSERT_EQUAL_64(0x000000007fff7f7fUL, x9);
+
+  TEARDOWN();
+}
+
+
+TEST(load_store_regoffset) {
+  SETUP();
+
+  uint32_t src[3] = {1, 2, 3};
+  uint32_t dst[4] = {0, 0, 0, 0};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+
+  START();
+  __ Mov(x16, src_base);
+  __ Mov(x17, dst_base);
+  __ Mov(x18, src_base + 3 * sizeof(src[0]));
+  __ Mov(x19, dst_base + 3 * sizeof(dst[0]));
+  __ Mov(x20, dst_base + 4 * sizeof(dst[0]));
+  __ Mov(x24, 0);
+  __ Mov(x25, 4);
+  __ Mov(x26, -4);
+  __ Mov(x27, 0xfffffffc);  // 32-bit -4.
+  __ Mov(x28, 0xfffffffe);  // 32-bit -2.
+  __ Mov(x29, 0xffffffff);  // 32-bit -1.
+
+  __ Ldr(w0, MemOperand(x16, x24));
+  __ Ldr(x1, MemOperand(x16, x25));
+  __ Ldr(w2, MemOperand(x18, x26));
+  __ Ldr(w3, MemOperand(x18, x27, SXTW));
+  __ Ldr(w4, MemOperand(x18, x28, SXTW, 2));
+  __ Str(w0, MemOperand(x17, x24));
+  __ Str(x1, MemOperand(x17, x25));
+  __ Str(w2, MemOperand(x20, x29, SXTW, 2));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(1, x0);
+  ASSERT_EQUAL_64(0x0000000300000002UL, x1);
+  ASSERT_EQUAL_64(3, x2);
+  ASSERT_EQUAL_64(3, x3);
+  ASSERT_EQUAL_64(2, x4);
+  ASSERT_EQUAL_32(1, dst[0]);
+  ASSERT_EQUAL_32(2, dst[1]);
+  ASSERT_EQUAL_32(3, dst[2]);
+  ASSERT_EQUAL_32(3, dst[3]);
+
+  TEARDOWN();
+}
+
+
+TEST(load_store_float) {
+  SETUP();
+
+  float src[3] = {1.0, 2.0, 3.0};
+  float dst[3] = {0.0, 0.0, 0.0};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+
+  START();
+  __ Mov(x17, src_base);
+  __ Mov(x18, dst_base);
+  __ Mov(x19, src_base);
+  __ Mov(x20, dst_base);
+  __ Mov(x21, src_base);
+  __ Mov(x22, dst_base);
+  __ Ldr(s0, MemOperand(x17, sizeof(src[0])));
+  __ Str(s0, MemOperand(x18, sizeof(dst[0]), PostIndex));
+  __ Ldr(s1, MemOperand(x19, sizeof(src[0]), PostIndex));
+  __ Str(s1, MemOperand(x20, 2 * sizeof(dst[0]), PreIndex));
+  __ Ldr(s2, MemOperand(x21, 2 * sizeof(src[0]), PreIndex));
+  __ Str(s2, MemOperand(x22, sizeof(dst[0])));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(2.0, s0);
+  ASSERT_EQUAL_FP32(2.0, dst[0]);
+  ASSERT_EQUAL_FP32(1.0, s1);
+  ASSERT_EQUAL_FP32(1.0, dst[2]);
+  ASSERT_EQUAL_FP32(3.0, s2);
+  ASSERT_EQUAL_FP32(3.0, dst[1]);
+  ASSERT_EQUAL_64(src_base, x17);
+  ASSERT_EQUAL_64(dst_base + sizeof(dst[0]), x18);
+  ASSERT_EQUAL_64(src_base + sizeof(src[0]), x19);
+  ASSERT_EQUAL_64(dst_base + 2 * sizeof(dst[0]), x20);
+  ASSERT_EQUAL_64(src_base + 2 * sizeof(src[0]), x21);
+  ASSERT_EQUAL_64(dst_base, x22);
+
+  TEARDOWN();
+}
+
+
+TEST(load_store_double) {
+  SETUP();
+
+  double src[3] = {1.0, 2.0, 3.0};
+  double dst[3] = {0.0, 0.0, 0.0};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+
+  START();
+  __ Mov(x17, src_base);
+  __ Mov(x18, dst_base);
+  __ Mov(x19, src_base);
+  __ Mov(x20, dst_base);
+  __ Mov(x21, src_base);
+  __ Mov(x22, dst_base);
+  __ Ldr(d0, MemOperand(x17, sizeof(src[0])));
+  __ Str(d0, MemOperand(x18, sizeof(dst[0]), PostIndex));
+  __ Ldr(d1, MemOperand(x19, sizeof(src[0]), PostIndex));
+  __ Str(d1, MemOperand(x20, 2 * sizeof(dst[0]), PreIndex));
+  __ Ldr(d2, MemOperand(x21, 2 * sizeof(src[0]), PreIndex));
+  __ Str(d2, MemOperand(x22, sizeof(dst[0])));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP64(2.0, d0);
+  ASSERT_EQUAL_FP64(2.0, dst[0]);
+  ASSERT_EQUAL_FP64(1.0, d1);
+  ASSERT_EQUAL_FP64(1.0, dst[2]);
+  ASSERT_EQUAL_FP64(3.0, d2);
+  ASSERT_EQUAL_FP64(3.0, dst[1]);
+  ASSERT_EQUAL_64(src_base, x17);
+  ASSERT_EQUAL_64(dst_base + sizeof(dst[0]), x18);
+  ASSERT_EQUAL_64(src_base + sizeof(src[0]), x19);
+  ASSERT_EQUAL_64(dst_base + 2 * sizeof(dst[0]), x20);
+  ASSERT_EQUAL_64(src_base + 2 * sizeof(src[0]), x21);
+  ASSERT_EQUAL_64(dst_base, x22);
+
+  TEARDOWN();
+}
+
+
+TEST(ldp_stp_float) {
+  SETUP();
+
+  float src[2] = {1.0, 2.0};
+  float dst[3] = {0.0, 0.0, 0.0};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+
+  START();
+  __ Mov(x16, src_base);
+  __ Mov(x17, dst_base);
+  __ Ldp(s31, s0, MemOperand(x16, 2 * sizeof(src[0]), PostIndex));
+  __ Stp(s0, s31, MemOperand(x17, sizeof(dst[1]), PreIndex));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(1.0, s31);
+  ASSERT_EQUAL_FP32(2.0, s0);
+  ASSERT_EQUAL_FP32(0.0, dst[0]);
+  ASSERT_EQUAL_FP32(2.0, dst[1]);
+  ASSERT_EQUAL_FP32(1.0, dst[2]);
+  ASSERT_EQUAL_64(src_base + 2 * sizeof(src[0]), x16);
+  ASSERT_EQUAL_64(dst_base + sizeof(dst[1]), x17);
+
+  TEARDOWN();
+}
+
+
+TEST(ldp_stp_double) {
+  SETUP();
+
+  double src[2] = {1.0, 2.0};
+  double dst[3] = {0.0, 0.0, 0.0};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+
+  START();
+  __ Mov(x16, src_base);
+  __ Mov(x17, dst_base);
+  __ Ldp(d31, d0, MemOperand(x16, 2 * sizeof(src[0]), PostIndex));
+  __ Stp(d0, d31, MemOperand(x17, sizeof(dst[1]), PreIndex));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP64(1.0, d31);
+  ASSERT_EQUAL_FP64(2.0, d0);
+  ASSERT_EQUAL_FP64(0.0, dst[0]);
+  ASSERT_EQUAL_FP64(2.0, dst[1]);
+  ASSERT_EQUAL_FP64(1.0, dst[2]);
+  ASSERT_EQUAL_64(src_base + 2 * sizeof(src[0]), x16);
+  ASSERT_EQUAL_64(dst_base + sizeof(dst[1]), x17);
+
+  TEARDOWN();
+}
+
+
+TEST(ldp_stp_offset) {
+  SETUP();
+
+  uint64_t src[3] = {0x0011223344556677UL, 0x8899aabbccddeeffUL,
+                     0xffeeddccbbaa9988UL};
+  uint64_t dst[7] = {0, 0, 0, 0, 0, 0, 0};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+
+  START();
+  __ Mov(x16, src_base);
+  __ Mov(x17, dst_base);
+  __ Mov(x18, src_base + 24);
+  __ Mov(x19, dst_base + 56);
+  __ Ldp(w0, w1, MemOperand(x16));
+  __ Ldp(w2, w3, MemOperand(x16, 4));
+  __ Ldp(x4, x5, MemOperand(x16, 8));
+  __ Ldp(w6, w7, MemOperand(x18, -12));
+  __ Ldp(x8, x9, MemOperand(x18, -16));
+  __ Stp(w0, w1, MemOperand(x17));
+  __ Stp(w2, w3, MemOperand(x17, 8));
+  __ Stp(x4, x5, MemOperand(x17, 16));
+  __ Stp(w6, w7, MemOperand(x19, -24));
+  __ Stp(x8, x9, MemOperand(x19, -16));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x44556677, x0);
+  ASSERT_EQUAL_64(0x00112233, x1);
+  ASSERT_EQUAL_64(0x0011223344556677UL, dst[0]);
+  ASSERT_EQUAL_64(0x00112233, x2);
+  ASSERT_EQUAL_64(0xccddeeff, x3);
+  ASSERT_EQUAL_64(0xccddeeff00112233UL, dst[1]);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, x4);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, dst[2]);
+  ASSERT_EQUAL_64(0xffeeddccbbaa9988UL, x5);
+  ASSERT_EQUAL_64(0xffeeddccbbaa9988UL, dst[3]);
+  ASSERT_EQUAL_64(0x8899aabb, x6);
+  ASSERT_EQUAL_64(0xbbaa9988, x7);
+  ASSERT_EQUAL_64(0xbbaa99888899aabbUL, dst[4]);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, x8);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, dst[5]);
+  ASSERT_EQUAL_64(0xffeeddccbbaa9988UL, x9);
+  ASSERT_EQUAL_64(0xffeeddccbbaa9988UL, dst[6]);
+  ASSERT_EQUAL_64(src_base, x16);
+  ASSERT_EQUAL_64(dst_base, x17);
+  ASSERT_EQUAL_64(src_base + 24, x18);
+  ASSERT_EQUAL_64(dst_base + 56, x19);
+
+  TEARDOWN();
+}
+
+
+TEST(ldnp_stnp_offset) {
+  SETUP();
+
+  uint64_t src[3] = {0x0011223344556677UL, 0x8899aabbccddeeffUL,
+                     0xffeeddccbbaa9988UL};
+  uint64_t dst[7] = {0, 0, 0, 0, 0, 0, 0};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+
+  START();
+  __ Mov(x16, src_base);
+  __ Mov(x17, dst_base);
+  __ Mov(x18, src_base + 24);
+  __ Mov(x19, dst_base + 56);
+  __ Ldnp(w0, w1, MemOperand(x16));
+  __ Ldnp(w2, w3, MemOperand(x16, 4));
+  __ Ldnp(x4, x5, MemOperand(x16, 8));
+  __ Ldnp(w6, w7, MemOperand(x18, -12));
+  __ Ldnp(x8, x9, MemOperand(x18, -16));
+  __ Stnp(w0, w1, MemOperand(x17));
+  __ Stnp(w2, w3, MemOperand(x17, 8));
+  __ Stnp(x4, x5, MemOperand(x17, 16));
+  __ Stnp(w6, w7, MemOperand(x19, -24));
+  __ Stnp(x8, x9, MemOperand(x19, -16));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x44556677, x0);
+  ASSERT_EQUAL_64(0x00112233, x1);
+  ASSERT_EQUAL_64(0x0011223344556677UL, dst[0]);
+  ASSERT_EQUAL_64(0x00112233, x2);
+  ASSERT_EQUAL_64(0xccddeeff, x3);
+  ASSERT_EQUAL_64(0xccddeeff00112233UL, dst[1]);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, x4);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, dst[2]);
+  ASSERT_EQUAL_64(0xffeeddccbbaa9988UL, x5);
+  ASSERT_EQUAL_64(0xffeeddccbbaa9988UL, dst[3]);
+  ASSERT_EQUAL_64(0x8899aabb, x6);
+  ASSERT_EQUAL_64(0xbbaa9988, x7);
+  ASSERT_EQUAL_64(0xbbaa99888899aabbUL, dst[4]);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, x8);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, dst[5]);
+  ASSERT_EQUAL_64(0xffeeddccbbaa9988UL, x9);
+  ASSERT_EQUAL_64(0xffeeddccbbaa9988UL, dst[6]);
+  ASSERT_EQUAL_64(src_base, x16);
+  ASSERT_EQUAL_64(dst_base, x17);
+  ASSERT_EQUAL_64(src_base + 24, x18);
+  ASSERT_EQUAL_64(dst_base + 56, x19);
+
+  TEARDOWN();
+}
+
+
+TEST(ldp_stp_preindex) {
+  SETUP();
+
+  uint64_t src[3] = {0x0011223344556677UL, 0x8899aabbccddeeffUL,
+                     0xffeeddccbbaa9988UL};
+  uint64_t dst[5] = {0, 0, 0, 0, 0};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+
+  START();
+  __ Mov(x16, src_base);
+  __ Mov(x17, dst_base);
+  __ Mov(x18, dst_base + 16);
+  __ Ldp(w0, w1, MemOperand(x16, 4, PreIndex));
+  __ Mov(x19, x16);
+  __ Ldp(w2, w3, MemOperand(x16, -4, PreIndex));
+  __ Stp(w2, w3, MemOperand(x17, 4, PreIndex));
+  __ Mov(x20, x17);
+  __ Stp(w0, w1, MemOperand(x17, -4, PreIndex));
+  __ Ldp(x4, x5, MemOperand(x16, 8, PreIndex));
+  __ Mov(x21, x16);
+  __ Ldp(x6, x7, MemOperand(x16, -8, PreIndex));
+  __ Stp(x7, x6, MemOperand(x18, 8, PreIndex));
+  __ Mov(x22, x18);
+  __ Stp(x5, x4, MemOperand(x18, -8, PreIndex));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x00112233, x0);
+  ASSERT_EQUAL_64(0xccddeeff, x1);
+  ASSERT_EQUAL_64(0x44556677, x2);
+  ASSERT_EQUAL_64(0x00112233, x3);
+  ASSERT_EQUAL_64(0xccddeeff00112233UL, dst[0]);
+  ASSERT_EQUAL_64(0x0000000000112233UL, dst[1]);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, x4);
+  ASSERT_EQUAL_64(0xffeeddccbbaa9988UL, x5);
+  ASSERT_EQUAL_64(0x0011223344556677UL, x6);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, x7);
+  ASSERT_EQUAL_64(0xffeeddccbbaa9988UL, dst[2]);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, dst[3]);
+  ASSERT_EQUAL_64(0x0011223344556677UL, dst[4]);
+  ASSERT_EQUAL_64(src_base, x16);
+  ASSERT_EQUAL_64(dst_base, x17);
+  ASSERT_EQUAL_64(dst_base + 16, x18);
+  ASSERT_EQUAL_64(src_base + 4, x19);
+  ASSERT_EQUAL_64(dst_base + 4, x20);
+  ASSERT_EQUAL_64(src_base + 8, x21);
+  ASSERT_EQUAL_64(dst_base + 24, x22);
+
+  TEARDOWN();
+}
+
+
+TEST(ldp_stp_postindex) {
+  SETUP();
+
+  uint64_t src[4] = {0x0011223344556677UL, 0x8899aabbccddeeffUL,
+                     0xffeeddccbbaa9988UL, 0x7766554433221100UL};
+  uint64_t dst[5] = {0, 0, 0, 0, 0};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+
+  START();
+  __ Mov(x16, src_base);
+  __ Mov(x17, dst_base);
+  __ Mov(x18, dst_base + 16);
+  __ Ldp(w0, w1, MemOperand(x16, 4, PostIndex));
+  __ Mov(x19, x16);
+  __ Ldp(w2, w3, MemOperand(x16, -4, PostIndex));
+  __ Stp(w2, w3, MemOperand(x17, 4, PostIndex));
+  __ Mov(x20, x17);
+  __ Stp(w0, w1, MemOperand(x17, -4, PostIndex));
+  __ Ldp(x4, x5, MemOperand(x16, 8, PostIndex));
+  __ Mov(x21, x16);
+  __ Ldp(x6, x7, MemOperand(x16, -8, PostIndex));
+  __ Stp(x7, x6, MemOperand(x18, 8, PostIndex));
+  __ Mov(x22, x18);
+  __ Stp(x5, x4, MemOperand(x18, -8, PostIndex));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x44556677, x0);
+  ASSERT_EQUAL_64(0x00112233, x1);
+  ASSERT_EQUAL_64(0x00112233, x2);
+  ASSERT_EQUAL_64(0xccddeeff, x3);
+  ASSERT_EQUAL_64(0x4455667700112233UL, dst[0]);
+  ASSERT_EQUAL_64(0x0000000000112233UL, dst[1]);
+  ASSERT_EQUAL_64(0x0011223344556677UL, x4);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, x5);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, x6);
+  ASSERT_EQUAL_64(0xffeeddccbbaa9988UL, x7);
+  ASSERT_EQUAL_64(0xffeeddccbbaa9988UL, dst[2]);
+  ASSERT_EQUAL_64(0x8899aabbccddeeffUL, dst[3]);
+  ASSERT_EQUAL_64(0x0011223344556677UL, dst[4]);
+  ASSERT_EQUAL_64(src_base, x16);
+  ASSERT_EQUAL_64(dst_base, x17);
+  ASSERT_EQUAL_64(dst_base + 16, x18);
+  ASSERT_EQUAL_64(src_base + 4, x19);
+  ASSERT_EQUAL_64(dst_base + 4, x20);
+  ASSERT_EQUAL_64(src_base + 8, x21);
+  ASSERT_EQUAL_64(dst_base + 24, x22);
+
+  TEARDOWN();
+}
+
+
+TEST(ldp_sign_extend) {
+  SETUP();
+
+  uint32_t src[2] = {0x80000000, 0x7fffffff};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+
+  START();
+  __ Mov(x24, src_base);
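+  // Ldpsw loads a pair of words and sign-extends each to 64 bits.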
+  __ Ldpsw(x0, x1, MemOperand(x24));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xffffffff80000000UL, x0);
+  ASSERT_EQUAL_64(0x000000007fffffffUL, x1);
+
+  TEARDOWN();
+}
+
+
+TEST(ldur_stur) {
+  SETUP();
+
+  int64_t src[2] = {0x0123456789abcdefUL, 0x0123456789abcdefUL};
+  int64_t dst[5] = {0, 0, 0, 0, 0};
+  uintptr_t src_base = reinterpret_cast<uintptr_t>(src);
+  uintptr_t dst_base = reinterpret_cast<uintptr_t>(dst);
+
+  START();
+  __ Mov(x17, src_base);
+  __ Mov(x18, dst_base);
+  __ Mov(x19, src_base + 16);
+  __ Mov(x20, dst_base + 32);
+  __ Mov(x21, dst_base + 40);
+  __ Ldr(w0, MemOperand(x17, 1));
+  __ Str(w0, MemOperand(x18, 2));
+  __ Ldr(x1, MemOperand(x17, 3));
+  __ Str(x1, MemOperand(x18, 9));
+  __ Ldr(w2, MemOperand(x19, -9));
+  __ Str(w2, MemOperand(x20, -5));
+  __ Ldrb(w3, MemOperand(x19, -1));
+  __ Strb(w3, MemOperand(x21, -1));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x6789abcd, x0);
+  ASSERT_EQUAL_64(0x6789abcd0000L, dst[0]);
+  ASSERT_EQUAL_64(0xabcdef0123456789L, x1);
+  ASSERT_EQUAL_64(0xcdef012345678900L, dst[1]);
+  ASSERT_EQUAL_64(0x000000ab, dst[2]);
+  ASSERT_EQUAL_64(0xabcdef01, x2);
+  ASSERT_EQUAL_64(0x00abcdef01000000L, dst[3]);
+  ASSERT_EQUAL_64(0x00000001, x3);
+  ASSERT_EQUAL_64(0x0100000000000000L, dst[4]);
+  ASSERT_EQUAL_64(src_base, x17);
+  ASSERT_EQUAL_64(dst_base, x18);
+  ASSERT_EQUAL_64(src_base + 16, x19);
+  ASSERT_EQUAL_64(dst_base + 32, x20);
+
+  TEARDOWN();
+}
+
+
+TEST(ldr_literal) {
+  SETUP();
+
+  START();
+  __ Ldr(x2, 0x1234567890abcdefUL);
+  __ Ldr(w3, 0xfedcba09);
+  __ Ldr(d13, 1.234);
+  __ Ldr(s25, 2.5);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x1234567890abcdefUL, x2);
+  ASSERT_EQUAL_64(0xfedcba09, x3);
+  ASSERT_EQUAL_FP64(1.234, d13);
+  ASSERT_EQUAL_FP32(2.5, s25);
+
+  TEARDOWN();
+}
+
+
+static void LdrLiteralRangeHelper(ptrdiff_t range_,
+                                  LiteralPoolEmitOption option,
+                                  bool expect_dump) {
+  ASSERT(range_ > 0);
+  SETUP_SIZE(range_ + 1024);
+
+  Label label_1, label_2;
+
+  size_t range = static_cast<size_t>(range_);
+  size_t code_size = 0;
+  size_t pool_guard_size;
+
+  if (option == NoJumpRequired) {
+    // Space for an explicit branch.
+    pool_guard_size = sizeof(Instr);
+  } else {
+    pool_guard_size = 0;
+  }
+
+  START();
+  // Force a pool dump so the pool starts off empty.
+  __ EmitLiteralPool(JumpRequired);
+  ASSERT_LITERAL_POOL_SIZE(0);
+
+  __ Ldr(x0, 0x1234567890abcdefUL);
+  __ Ldr(w1, 0xfedcba09);
+  __ Ldr(d0, 1.234);
+  __ Ldr(s1, 2.5);
+  ASSERT_LITERAL_POOL_SIZE(24);
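+  // The pool holds 24 bytes: an 8-byte and a 4-byte integer literal plus an
+  // 8-byte double and a 4-byte float.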
+
+  code_size += 4 * sizeof(Instr);
+
+  // Check that the requested range (allowing space for a branch over the pool)
+  // can be handled by this test.
+  ASSERT((code_size + pool_guard_size) <= range);
+
+  // Emit NOPs up to 'range', leaving space for the pool guard.
+  while ((code_size + pool_guard_size) < range) {
+    __ Nop();
+    code_size += sizeof(Instr);
+  }
+
+  // Emit the guard sequence before the literal pool.
+  if (option == NoJumpRequired) {
+    __ B(&label_1);
+    code_size += sizeof(Instr);
+  }
+
+  ASSERT(code_size == range);
+  ASSERT_LITERAL_POOL_SIZE(24);
+
+  // Possibly generate a literal pool.
+  __ CheckLiteralPool(option);
+  __ Bind(&label_1);
+  if (expect_dump) {
+    ASSERT_LITERAL_POOL_SIZE(0);
+  } else {
+    ASSERT_LITERAL_POOL_SIZE(24);
+  }
+
+  // Force a pool flush to check that a second pool functions correctly.
+  __ EmitLiteralPool(JumpRequired);
+  ASSERT_LITERAL_POOL_SIZE(0);
+
+  // These loads should be after the pool (and will require a new one).
+  __ Ldr(x4, 0x34567890abcdef12UL);
+  __ Ldr(w5, 0xdcba09fe);
+  __ Ldr(d4, 123.4);
+  __ Ldr(s5, 250.0);
+  ASSERT_LITERAL_POOL_SIZE(24);
+  END();
+
+  RUN();
+
+  // Check that the literals were loaded correctly.
+  ASSERT_EQUAL_64(0x1234567890abcdefUL, x0);
+  ASSERT_EQUAL_64(0xfedcba09, x1);
+  ASSERT_EQUAL_FP64(1.234, d0);
+  ASSERT_EQUAL_FP32(2.5, s1);
+  ASSERT_EQUAL_64(0x34567890abcdef12UL, x4);
+  ASSERT_EQUAL_64(0xdcba09fe, x5);
+  ASSERT_EQUAL_FP64(123.4, d4);
+  ASSERT_EQUAL_FP32(250.0, s5);
+
+  TEARDOWN();
+}
+
+
+TEST(ldr_literal_range_1) {
+  LdrLiteralRangeHelper(kRecommendedLiteralPoolRange,
+                        NoJumpRequired,
+                        true);
+}
+
+
+TEST(ldr_literal_range_2) {
+  LdrLiteralRangeHelper(kRecommendedLiteralPoolRange-sizeof(Instr),
+                        NoJumpRequired,
+                        false);
+}
+
+
+TEST(ldr_literal_range_3) {
+  LdrLiteralRangeHelper(2 * kRecommendedLiteralPoolRange,
+                        JumpRequired,
+                        true);
+}
+
+
+TEST(ldr_literal_range_4) {
+  LdrLiteralRangeHelper(2 * kRecommendedLiteralPoolRange-sizeof(Instr),
+                        JumpRequired,
+                        false);
+}
+
+
+TEST(ldr_literal_range_5) {
+  LdrLiteralRangeHelper(kLiteralPoolCheckInterval,
+                        JumpRequired,
+                        false);
+}
+
+
+TEST(ldr_literal_range_6) {
+  LdrLiteralRangeHelper(kLiteralPoolCheckInterval-sizeof(Instr),
+                        JumpRequired,
+                        false);
+}
+
+
+TEST(add_sub_imm) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0x0);
+  __ Mov(x1, 0x1111);
+  __ Mov(x2, 0xffffffffffffffffL);
+  __ Mov(x3, 0x8000000000000000L);
+
+  __ Add(x10, x0, Operand(0x123));
+  __ Add(x11, x1, Operand(0x122000));
+  __ Add(x12, x0, Operand(0xabc << 12));
+  __ Add(x13, x2, Operand(1));
+
+  __ Add(w14, w0, Operand(0x123));
+  __ Add(w15, w1, Operand(0x122000));
+  __ Add(w16, w0, Operand(0xabc << 12));
+  __ Add(w17, w2, Operand(1));
+
+  __ Sub(x20, x0, Operand(0x1));
+  __ Sub(x21, x1, Operand(0x111));
+  __ Sub(x22, x1, Operand(0x1 << 12));
+  __ Sub(x23, x3, Operand(1));
+
+  __ Sub(w24, w0, Operand(0x1));
+  __ Sub(w25, w1, Operand(0x111));
+  __ Sub(w26, w1, Operand(0x1 << 12));
+  __ Sub(w27, w3, Operand(1));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x123, x10);
+  ASSERT_EQUAL_64(0x123111, x11);
+  ASSERT_EQUAL_64(0xabc000, x12);
+  ASSERT_EQUAL_64(0x0, x13);
+
+  ASSERT_EQUAL_32(0x123, w14);
+  ASSERT_EQUAL_32(0x123111, w15);
+  ASSERT_EQUAL_32(0xabc000, w16);
+  ASSERT_EQUAL_32(0x0, w17);
+
+  ASSERT_EQUAL_64(0xffffffffffffffffL, x20);
+  ASSERT_EQUAL_64(0x1000, x21);
+  ASSERT_EQUAL_64(0x111, x22);
+  ASSERT_EQUAL_64(0x7fffffffffffffffL, x23);
+
+  ASSERT_EQUAL_32(0xffffffff, w24);
+  ASSERT_EQUAL_32(0x1000, w25);
+  ASSERT_EQUAL_32(0x111, w26);
+  ASSERT_EQUAL_32(0xffffffff, w27);
+
+  TEARDOWN();
+}
+
+
+TEST(add_sub_wide_imm) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0x0);
+  __ Mov(x1, 0x1);
+
+  __ Add(x10, x0, Operand(0x1234567890abcdefUL));
+  __ Add(x11, x1, Operand(0xffffffff));
+
+  __ Add(w12, w0, Operand(0x12345678));
+  __ Add(w13, w1, Operand(0xffffffff));
+
+  __ Sub(x20, x0, Operand(0x1234567890abcdefUL));
+
+  __ Sub(w21, w0, Operand(0x12345678));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x1234567890abcdefUL, x10);
+  ASSERT_EQUAL_64(0x100000000UL, x11);
+
+  ASSERT_EQUAL_32(0x12345678, w12);
+  ASSERT_EQUAL_64(0x0, x13);
+
+  ASSERT_EQUAL_64(-0x1234567890abcdefUL, x20);
+
+  ASSERT_EQUAL_32(-0x12345678, w21);
+
+  TEARDOWN();
+}
+
+
+TEST(add_sub_shifted) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0);
+  __ Mov(x1, 0x0123456789abcdefL);
+  __ Mov(x2, 0xfedcba9876543210L);
+  __ Mov(x3, 0xffffffffffffffffL);
+
+  __ Add(x10, x1, Operand(x2));
+  __ Add(x11, x0, Operand(x1, LSL, 8));
+  __ Add(x12, x0, Operand(x1, LSR, 8));
+  __ Add(x13, x0, Operand(x1, ASR, 8));
+  __ Add(x14, x0, Operand(x2, ASR, 8));
+  __ Add(w15, w0, Operand(w1, ASR, 8));
+  __ Add(w18, w3, Operand(w1, ROR, 8));
+  __ Add(x19, x3, Operand(x1, ROR, 8));
+
+  __ Sub(x20, x3, Operand(x2));
+  __ Sub(x21, x3, Operand(x1, LSL, 8));
+  __ Sub(x22, x3, Operand(x1, LSR, 8));
+  __ Sub(x23, x3, Operand(x1, ASR, 8));
+  __ Sub(x24, x3, Operand(x2, ASR, 8));
+  __ Sub(w25, w3, Operand(w1, ASR, 8));
+  __ Sub(w26, w3, Operand(w1, ROR, 8));
+  __ Sub(x27, x3, Operand(x1, ROR, 8));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xffffffffffffffffL, x10);
+  ASSERT_EQUAL_64(0x23456789abcdef00L, x11);
+  ASSERT_EQUAL_64(0x000123456789abcdL, x12);
+  ASSERT_EQUAL_64(0x000123456789abcdL, x13);
+  ASSERT_EQUAL_64(0xfffedcba98765432L, x14);
+  ASSERT_EQUAL_64(0xff89abcd, x15);
+  ASSERT_EQUAL_64(0xef89abcc, x18);
+  ASSERT_EQUAL_64(0xef0123456789abccL, x19);
+
+  ASSERT_EQUAL_64(0x0123456789abcdefL, x20);
+  ASSERT_EQUAL_64(0xdcba9876543210ffL, x21);
+  ASSERT_EQUAL_64(0xfffedcba98765432L, x22);
+  ASSERT_EQUAL_64(0xfffedcba98765432L, x23);
+  ASSERT_EQUAL_64(0x000123456789abcdL, x24);
+  ASSERT_EQUAL_64(0x00765432, x25);
+  ASSERT_EQUAL_64(0x10765432, x26);
+  ASSERT_EQUAL_64(0x10fedcba98765432L, x27);
+
+  TEARDOWN();
+}
+
+
+TEST(add_sub_extended) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0);
+  __ Mov(x1, 0x0123456789abcdefL);
+  __ Mov(x2, 0xfedcba9876543210L);
+  __ Mov(w3, 0x80);
+
+  __ Add(x10, x0, Operand(x1, UXTB, 0));
+  __ Add(x11, x0, Operand(x1, UXTB, 1));
+  __ Add(x12, x0, Operand(x1, UXTH, 2));
+  __ Add(x13, x0, Operand(x1, UXTW, 4));
+
+  __ Add(x14, x0, Operand(x1, SXTB, 0));
+  __ Add(x15, x0, Operand(x1, SXTB, 1));
+  __ Add(x16, x0, Operand(x1, SXTH, 2));
+  __ Add(x17, x0, Operand(x1, SXTW, 3));
+  __ Add(x18, x0, Operand(x2, SXTB, 0));
+  __ Add(x19, x0, Operand(x2, SXTB, 1));
+  __ Add(x20, x0, Operand(x2, SXTH, 2));
+  __ Add(x21, x0, Operand(x2, SXTW, 3));
+
+  __ Add(x22, x1, Operand(x2, SXTB, 1));
+  __ Sub(x23, x1, Operand(x2, SXTB, 1));
+
+  __ Add(w24, w1, Operand(w2, UXTB, 2));
+  __ Add(w25, w0, Operand(w1, SXTB, 0));
+  __ Add(w26, w0, Operand(w1, SXTB, 1));
+  __ Add(w27, w2, Operand(w1, SXTW, 3));
+
+  __ Add(w28, w0, Operand(w1, SXTW, 3));
+  __ Add(x29, x0, Operand(w1, SXTW, 3));
+
+  __ Sub(x30, x0, Operand(w3, SXTB, 1));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xefL, x10);
+  ASSERT_EQUAL_64(0x1deL, x11);
+  ASSERT_EQUAL_64(0x337bcL, x12);
+  ASSERT_EQUAL_64(0x89abcdef0L, x13);
+
+  ASSERT_EQUAL_64(0xffffffffffffffefL, x14);
+  ASSERT_EQUAL_64(0xffffffffffffffdeL, x15);
+  ASSERT_EQUAL_64(0xffffffffffff37bcL, x16);
+  ASSERT_EQUAL_64(0xfffffffc4d5e6f78L, x17);
+  ASSERT_EQUAL_64(0x10L, x18);
+  ASSERT_EQUAL_64(0x20L, x19);
+  ASSERT_EQUAL_64(0xc840L, x20);
+  ASSERT_EQUAL_64(0x3b2a19080L, x21);
+
+  ASSERT_EQUAL_64(0x0123456789abce0fL, x22);
+  ASSERT_EQUAL_64(0x0123456789abcdcfL, x23);
+
+  ASSERT_EQUAL_32(0x89abce2f, w24);
+  ASSERT_EQUAL_32(0xffffffef, w25);
+  ASSERT_EQUAL_32(0xffffffde, w26);
+  ASSERT_EQUAL_32(0xc3b2a188, w27);
+
+  ASSERT_EQUAL_32(0x4d5e6f78, w28);
+  ASSERT_EQUAL_64(0xfffffffc4d5e6f78L, x29);
+
+  ASSERT_EQUAL_64(256, x30);
+
+  TEARDOWN();
+}
+
+
+TEST(add_sub_negative) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0);
+  __ Mov(x1, 4687);
+  __ Mov(x2, 0x1122334455667788);
+  __ Mov(w3, 0x11223344);
+  __ Mov(w4, 400000);
+
+  __ Add(x10, x0, -42);
+  __ Add(x11, x1, -687);
+  __ Add(x12, x2, -0x88);
+
+  __ Sub(x13, x0, -600);
+  __ Sub(x14, x1, -313);
+  __ Sub(x15, x2, -0x555);
+
+  __ Add(w19, w3, -0x344);
+  __ Add(w20, w4, -2000);
+
+  __ Sub(w21, w3, -0xbc);
+  __ Sub(w22, w4, -2000);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(-42, x10);
+  ASSERT_EQUAL_64(4000, x11);
+  ASSERT_EQUAL_64(0x1122334455667700, x12);
+
+  ASSERT_EQUAL_64(600, x13);
+  ASSERT_EQUAL_64(5000, x14);
+  ASSERT_EQUAL_64(0x1122334455667cdd, x15);
+
+  ASSERT_EQUAL_32(0x11223000, w19);
+  ASSERT_EQUAL_32(398000, w20);
+
+  ASSERT_EQUAL_32(0x11223400, w21);
+  ASSERT_EQUAL_32(402000, w22);
+
+  TEARDOWN();
+}
+
+
+TEST(neg) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0xf123456789abcdefL);
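+  // Neg(rd, operand) is equivalent to subtracting the operand from zero.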
+
+  // Immediate.
+  __ Neg(x1, 0x123);
+  __ Neg(w2, 0x123);
+
+  // Shifted.
+  __ Neg(x3, Operand(x0, LSL, 1));
+  __ Neg(w4, Operand(w0, LSL, 2));
+  __ Neg(x5, Operand(x0, LSR, 3));
+  __ Neg(w6, Operand(w0, LSR, 4));
+  __ Neg(x7, Operand(x0, ASR, 5));
+  __ Neg(w8, Operand(w0, ASR, 6));
+
+  // Extended.
+  __ Neg(w9, Operand(w0, UXTB));
+  __ Neg(x10, Operand(x0, SXTB, 1));
+  __ Neg(w11, Operand(w0, UXTH, 2));
+  __ Neg(x12, Operand(x0, SXTH, 3));
+  __ Neg(w13, Operand(w0, UXTW, 4));
+  __ Neg(x14, Operand(x0, SXTW, 4));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xfffffffffffffeddUL, x1);
+  ASSERT_EQUAL_64(0xfffffedd, x2);
+  ASSERT_EQUAL_64(0x1db97530eca86422UL, x3);
+  ASSERT_EQUAL_64(0xd950c844, x4);
+  ASSERT_EQUAL_64(0xe1db97530eca8643UL, x5);
+  ASSERT_EQUAL_64(0xf7654322, x6);
+  ASSERT_EQUAL_64(0x0076e5d4c3b2a191UL, x7);
+  ASSERT_EQUAL_64(0x01d950c9, x8);
+  ASSERT_EQUAL_64(0xffffff11, x9);
+  ASSERT_EQUAL_64(0x0000000000000022UL, x10);
+  ASSERT_EQUAL_64(0xfffcc844, x11);
+  ASSERT_EQUAL_64(0x0000000000019088UL, x12);
+  ASSERT_EQUAL_64(0x65432110, x13);
+  ASSERT_EQUAL_64(0x0000000765432110UL, x14);
+
+  TEARDOWN();
+}
+
+
+TEST(adc_sbc_shift) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0);
+  __ Mov(x1, 1);
+  __ Mov(x2, 0x0123456789abcdefL);
+  __ Mov(x3, 0xfedcba9876543210L);
+  __ Mov(x4, 0xffffffffffffffffL);
+
+  // Clear the C flag.
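+  // (Adding an immediate of zero can never carry, so C is cleared; comparing
+  // a register with itself never borrows, so C is set.)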
+  __ Add(x0, x0, Operand(0), SetFlags);
+
+  __ Adc(x5, x2, Operand(x3));
+  __ Adc(x6, x0, Operand(x1, LSL, 60));
+  __ Sbc(x7, x4, Operand(x3, LSR, 4));
+  __ Adc(x8, x2, Operand(x3, ASR, 4));
+  __ Adc(x9, x2, Operand(x3, ROR, 8));
+
+  __ Adc(w10, w2, Operand(w3));
+  __ Adc(w11, w0, Operand(w1, LSL, 30));
+  __ Sbc(w12, w4, Operand(w3, LSR, 4));
+  __ Adc(w13, w2, Operand(w3, ASR, 4));
+  __ Adc(w14, w2, Operand(w3, ROR, 8));
+
+  // Set the C flag.
+  __ Cmp(w0, Operand(w0));
+
+  __ Adc(x18, x2, Operand(x3));
+  __ Adc(x19, x0, Operand(x1, LSL, 60));
+  __ Sbc(x20, x4, Operand(x3, LSR, 4));
+  __ Adc(x21, x2, Operand(x3, ASR, 4));
+  __ Adc(x22, x2, Operand(x3, ROR, 8));
+
+  __ Adc(w23, w2, Operand(w3));
+  __ Adc(w24, w0, Operand(w1, LSL, 30));
+  __ Sbc(w25, w4, Operand(w3, LSR, 4));
+  __ Adc(w26, w2, Operand(w3, ASR, 4));
+  __ Adc(w27, w2, Operand(w3, ROR, 8));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xffffffffffffffffL, x5);
+  ASSERT_EQUAL_64(1L << 60, x6);
+  ASSERT_EQUAL_64(0xf0123456789abcddL, x7);
+  ASSERT_EQUAL_64(0x0111111111111110L, x8);
+  ASSERT_EQUAL_64(0x1222222222222221L, x9);
+
+  ASSERT_EQUAL_32(0xffffffff, w10);
+  ASSERT_EQUAL_32(1 << 30, w11);
+  ASSERT_EQUAL_32(0xf89abcdd, w12);
+  ASSERT_EQUAL_32(0x91111110, w13);
+  ASSERT_EQUAL_32(0x9a222221, w14);
+
+  ASSERT_EQUAL_64(0xffffffffffffffffL + 1, x18);
+  ASSERT_EQUAL_64((1L << 60) + 1, x19);
+  ASSERT_EQUAL_64(0xf0123456789abcddL + 1, x20);
+  ASSERT_EQUAL_64(0x0111111111111110L + 1, x21);
+  ASSERT_EQUAL_64(0x1222222222222221L + 1, x22);
+
+  ASSERT_EQUAL_32(0xffffffff + 1, w23);
+  ASSERT_EQUAL_32((1 << 30) + 1, w24);
+  ASSERT_EQUAL_32(0xf89abcdd + 1, w25);
+  ASSERT_EQUAL_32(0x91111110 + 1, w26);
+  ASSERT_EQUAL_32(0x9a222221 + 1, w27);
+
+  // Check that adc correctly sets the condition flags.
+  START();
+  __ Mov(x0, 1);
+  __ Mov(x1, 0xffffffffffffffffL);
+  // Clear the C flag.
+  __ Add(x0, x0, Operand(0), SetFlags);
+  __ Adc(x10, x0, Operand(x1), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(ZCFlag);
+
+  START();
+  __ Mov(x0, 1);
+  __ Mov(x1, 0x8000000000000000L);
+  // Clear the C flag.
+  __ Add(x0, x0, Operand(0), SetFlags);
+  __ Adc(x10, x0, Operand(x1, ASR, 63), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(ZCFlag);
+
+  START();
+  __ Mov(x0, 0x10);
+  __ Mov(x1, 0x07ffffffffffffffL);
+  // Clear the C flag.
+  __ Add(x0, x0, Operand(0), SetFlags);
+  __ Adc(x10, x0, Operand(x1, LSL, 4), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(NVFlag);
+
+  TEARDOWN();
+}
+
+
+TEST(adc_sbc_extend) {
+  SETUP();
+
+  START();
+  // Clear the C flag.
+  __ Add(x0, x0, Operand(0), SetFlags);
+
+  __ Mov(x0, 0);
+  __ Mov(x1, 1);
+  __ Mov(x2, 0x0123456789abcdefL);
+
+  __ Adc(x10, x1, Operand(w2, UXTB, 1));
+  __ Adc(x11, x1, Operand(x2, SXTH, 2));
+  __ Sbc(x12, x1, Operand(w2, UXTW, 4));
+  __ Adc(x13, x1, Operand(x2, UXTX, 4));
+
+  __ Adc(w14, w1, Operand(w2, UXTB, 1));
+  __ Adc(w15, w1, Operand(w2, SXTH, 2));
+  __ Adc(w9, w1, Operand(w2, UXTW, 4));
+
+  // Set the C flag.
+  __ Cmp(w0, Operand(w0));
+
+  __ Adc(x20, x1, Operand(w2, UXTB, 1));
+  __ Adc(x21, x1, Operand(x2, SXTH, 2));
+  __ Sbc(x22, x1, Operand(w2, UXTW, 4));
+  __ Adc(x23, x1, Operand(x2, UXTX, 4));
+
+  __ Adc(w24, w1, Operand(w2, UXTB, 1));
+  __ Adc(w25, w1, Operand(w2, SXTH, 2));
+  __ Adc(w26, w1, Operand(w2, UXTW, 4));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x1df, x10);
+  ASSERT_EQUAL_64(0xffffffffffff37bdL, x11);
+  ASSERT_EQUAL_64(0xfffffff765432110L, x12);
+  ASSERT_EQUAL_64(0x123456789abcdef1L, x13);
+
+  ASSERT_EQUAL_32(0x1df, w14);
+  ASSERT_EQUAL_32(0xffff37bd, w15);
+  ASSERT_EQUAL_32(0x9abcdef1, w9);
+
+  ASSERT_EQUAL_64(0x1df + 1, x20);
+  ASSERT_EQUAL_64(0xffffffffffff37bdL + 1, x21);
+  ASSERT_EQUAL_64(0xfffffff765432110L + 1, x22);
+  ASSERT_EQUAL_64(0x123456789abcdef1L + 1, x23);
+
+  ASSERT_EQUAL_32(0x1df + 1, w24);
+  ASSERT_EQUAL_32(0xffff37bd + 1, w25);
+  ASSERT_EQUAL_32(0x9abcdef1 + 1, w26);
+
+  // Check that adc correctly sets the condition flags.
+  START();
+  __ Mov(x0, 0xff);
+  __ Mov(x1, 0xffffffffffffffffL);
+  // Clear the C flag.
+  __ Add(x0, x0, Operand(0), SetFlags);
+  __ Adc(x10, x0, Operand(x1, SXTX, 1), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(CFlag);
+
+  START();
+  __ Mov(x0, 0x7fffffffffffffffL);
+  __ Mov(x1, 1);
+  // Clear the C flag.
+  __ Add(x0, x0, Operand(0), SetFlags);
+  __ Adc(x10, x0, Operand(x1, UXTB, 2), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(NVFlag);
+
+  START();
+  __ Mov(x0, 0x7fffffffffffffffL);
+  // Clear the C flag.
+  __ Add(x0, x0, Operand(0), SetFlags);
+  __ Adc(x10, x0, Operand(1), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(NVFlag);
+
+  TEARDOWN();
+}
+
+
+TEST(adc_sbc_wide_imm) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0);
+
+  // Clear the C flag.
+  __ Add(x0, x0, Operand(0), SetFlags);
+
+  __ Adc(x7, x0, Operand(0x1234567890abcdefUL));
+  __ Adc(w8, w0, Operand(0xffffffff));
+
+  // Set the C flag.
+  __ Cmp(w0, Operand(w0));
+
+  __ Adc(x27, x0, Operand(0x1234567890abcdefUL));
+  __ Adc(w28, w0, Operand(0xffffffff));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x1234567890abcdefUL, x7);
+  ASSERT_EQUAL_64(0xffffffff, x8);
+  ASSERT_EQUAL_64(0x1234567890abcdefUL + 1, x27);
+  ASSERT_EQUAL_64(0, x28);
+
+  TEARDOWN();
+}
+
+
+TEST(flags) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0);
+  __ Mov(x1, 0x1111111111111111L);
+  __ Neg(x10, Operand(x0));
+  __ Neg(x11, Operand(x1));
+  __ Neg(w12, Operand(w1));
+  // Clear the C flag.
+  __ Add(x0, x0, Operand(0), SetFlags);
+  __ Ngc(x13, Operand(x0));
+  // Set the C flag.
+  __ Cmp(x0, Operand(x0));
+  __ Ngc(w14, Operand(w0));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0, x10);
+  ASSERT_EQUAL_64(-0x1111111111111111L, x11);
+  ASSERT_EQUAL_32(-0x11111111, w12);
+  ASSERT_EQUAL_64(-1L, x13);
+  ASSERT_EQUAL_32(0, w14);
+
+  START();
+  __ Mov(x0, 0);
+  __ Cmp(x0, Operand(x0));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(ZCFlag);
+
+  START();
+  __ Mov(w0, 0);
+  __ Cmp(w0, Operand(w0));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(ZCFlag);
+
+  START();
+  __ Mov(x0, 0);
+  __ Mov(x1, 0x1111111111111111L);
+  __ Cmp(x0, Operand(x1));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(NFlag);
+
+  START();
+  __ Mov(w0, 0);
+  __ Mov(w1, 0x11111111);
+  __ Cmp(w0, Operand(w1));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(NFlag);
+
+  START();
+  __ Mov(x1, 0x1111111111111111L);
+  __ Cmp(x1, Operand(0));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(CFlag);
+
+  START();
+  __ Mov(w1, 0x11111111);
+  __ Cmp(w1, Operand(0));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(CFlag);
+
+  START();
+  __ Mov(x0, 1);
+  __ Mov(x1, 0x7fffffffffffffffL);
+  __ Cmn(x1, Operand(x0));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(NVFlag);
+
+  START();
+  __ Mov(w0, 1);
+  __ Mov(w1, 0x7fffffff);
+  __ Cmn(w1, Operand(w0));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(NVFlag);
+
+  START();
+  __ Mov(x0, 1);
+  __ Mov(x1, 0xffffffffffffffffL);
+  __ Cmn(x1, Operand(x0));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(ZCFlag);
+
+  START();
+  __ Mov(w0, 1);
+  __ Mov(w1, 0xffffffff);
+  __ Cmn(w1, Operand(w0));
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(ZCFlag);
+
+  START();
+  __ Mov(w0, 0);
+  __ Mov(w1, 1);
+  // Clear the C flag.
+  __ Add(w0, w0, Operand(0), SetFlags);
+  __ Ngc(w0, Operand(w1), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(NFlag);
+
+  START();
+  __ Mov(w0, 0);
+  __ Mov(w1, 0);
+  // Set the C flag.
+  __ Cmp(w0, Operand(w0));
+  __ Ngc(w0, Operand(w1), SetFlags);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_NZCV(ZCFlag);
+
+  TEARDOWN();
+}
+
+
+TEST(cmp_shift) {
+  SETUP();
+
+  START();
+  __ Mov(x18, 0xf0000000);
+  __ Mov(x19, 0xf000000010000000UL);
+  __ Mov(x20, 0xf0000000f0000000UL);
+  __ Mov(x21, 0x7800000078000000UL);
+  __ Mov(x22, 0x3c0000003c000000UL);
+  __ Mov(x23, 0x8000000780000000UL);
+  __ Mov(x24, 0x0000000f00000000UL);
+  __ Mov(x25, 0x00000003c0000000UL);
+  __ Mov(x26, 0x8000000780000000UL);
+  __ Mov(x27, 0xc0000003);
+
+  __ Cmp(w20, Operand(w21, LSL, 1));
+  __ Mrs(x0, NZCV);
+
+  __ Cmp(x20, Operand(x22, LSL, 2));
+  __ Mrs(x1, NZCV);
+
+  __ Cmp(w19, Operand(w23, LSR, 3));
+  __ Mrs(x2, NZCV);
+
+  __ Cmp(x18, Operand(x24, LSR, 4));
+  __ Mrs(x3, NZCV);
+
+  __ Cmp(w20, Operand(w25, ASR, 2));
+  __ Mrs(x4, NZCV);
+
+  __ Cmp(x20, Operand(x26, ASR, 3));
+  __ Mrs(x5, NZCV);
+
+  __ Cmp(w27, Operand(w22, ROR, 28));
+  __ Mrs(x6, NZCV);
+
+  __ Cmp(x20, Operand(x21, ROR, 31));
+  __ Mrs(x7, NZCV);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_32(ZCFlag, w0);
+  ASSERT_EQUAL_32(ZCFlag, w1);
+  ASSERT_EQUAL_32(ZCFlag, w2);
+  ASSERT_EQUAL_32(ZCFlag, w3);
+  ASSERT_EQUAL_32(ZCFlag, w4);
+  ASSERT_EQUAL_32(ZCFlag, w5);
+  ASSERT_EQUAL_32(ZCFlag, w6);
+  ASSERT_EQUAL_32(ZCFlag, w7);
+
+  TEARDOWN();
+}
+
+
+TEST(cmp_extend) {
+  SETUP();
+
+  START();
+  __ Mov(w20, 0x2);
+  __ Mov(w21, 0x1);
+  __ Mov(x22, 0xffffffffffffffffUL);
+  __ Mov(x23, 0xff);
+  __ Mov(x24, 0xfffffffffffffffeUL);
+  __ Mov(x25, 0xffff);
+  __ Mov(x26, 0xffffffff);
+
+  __ Cmp(w20, Operand(w21, LSL, 1));
+  __ Mrs(x0, NZCV);
+
+  __ Cmp(x22, Operand(x23, SXTB, 0));
+  __ Mrs(x1, NZCV);
+
+  __ Cmp(x24, Operand(x23, SXTB, 1));
+  __ Mrs(x2, NZCV);
+
+  __ Cmp(x24, Operand(x23, UXTB, 1));
+  __ Mrs(x3, NZCV);
+
+  __ Cmp(w22, Operand(w25, UXTH));
+  __ Mrs(x4, NZCV);
+
+  __ Cmp(x22, Operand(x25, SXTH));
+  __ Mrs(x5, NZCV);
+
+  __ Cmp(x22, Operand(x26, UXTW));
+  __ Mrs(x6, NZCV);
+
+  __ Cmp(x24, Operand(x26, SXTW, 1));
+  __ Mrs(x7, NZCV);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_32(ZCFlag, w0);
+  ASSERT_EQUAL_32(ZCFlag, w1);
+  ASSERT_EQUAL_32(ZCFlag, w2);
+  ASSERT_EQUAL_32(NCFlag, w3);
+  ASSERT_EQUAL_32(NCFlag, w4);
+  ASSERT_EQUAL_32(ZCFlag, w5);
+  ASSERT_EQUAL_32(NCFlag, w6);
+  ASSERT_EQUAL_32(ZCFlag, w7);
+
+  TEARDOWN();
+}
+
+
+TEST(ccmp) {
+  SETUP();
+
+  START();
+  __ Mov(w16, 0);
+  __ Mov(w17, 1);
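+  // Ccmp and Ccmn perform the comparison and set NZCV from its result when
+  // the condition holds; otherwise NZCV is set to the supplied flags.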
+  __ Cmp(w16, Operand(w16));
+  __ Ccmp(w16, Operand(w17), NCFlag, eq);
+  __ Mrs(x0, NZCV);
+
+  __ Cmp(w16, Operand(w16));
+  __ Ccmp(w16, Operand(w17), NCFlag, ne);
+  __ Mrs(x1, NZCV);
+
+  __ Cmp(x16, Operand(x16));
+  __ Ccmn(x16, Operand(2), NZCVFlag, eq);
+  __ Mrs(x2, NZCV);
+
+  __ Cmp(x16, Operand(x16));
+  __ Ccmn(x16, Operand(2), NZCVFlag, ne);
+  __ Mrs(x3, NZCV);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_32(NFlag, w0);
+  ASSERT_EQUAL_32(NCFlag, w1);
+  ASSERT_EQUAL_32(NoFlag, w2);
+  ASSERT_EQUAL_32(NZCVFlag, w3);
+
+  TEARDOWN();
+}
+
+
+TEST(ccmp_wide_imm) {
+  SETUP();
+
+  START();
+  __ Mov(w20, 0);
+
+  __ Cmp(w20, Operand(w20));
+  __ Ccmp(w20, Operand(0x12345678), NZCVFlag, eq);
+  __ Mrs(x0, NZCV);
+
+  __ Cmp(w20, Operand(w20));
+  __ Ccmp(x20, Operand(0xffffffffffffffffUL), NZCVFlag, eq);
+  __ Mrs(x1, NZCV);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_32(NFlag, w0);
+  ASSERT_EQUAL_32(NoFlag, w1);
+
+  TEARDOWN();
+}
+
+
+TEST(ccmp_shift_extend) {
+  SETUP();
+
+  START();
+  __ Mov(w20, 0x2);
+  __ Mov(w21, 0x1);
+  __ Mov(x22, 0xffffffffffffffffUL);
+  __ Mov(x23, 0xff);
+  __ Mov(x24, 0xfffffffffffffffeUL);
+
+  __ Cmp(w20, Operand(w20));
+  __ Ccmp(w20, Operand(w21, LSL, 1), NZCVFlag, eq);
+  __ Mrs(x0, NZCV);
+
+  __ Cmp(w20, Operand(w20));
+  __ Ccmp(x22, Operand(x23, SXTB, 0), NZCVFlag, eq);
+  __ Mrs(x1, NZCV);
+
+  __ Cmp(w20, Operand(w20));
+  __ Ccmp(x24, Operand(x23, SXTB, 1), NZCVFlag, eq);
+  __ Mrs(x2, NZCV);
+
+  __ Cmp(w20, Operand(w20));
+  __ Ccmp(x24, Operand(x23, UXTB, 1), NZCVFlag, eq);
+  __ Mrs(x3, NZCV);
+
+  __ Cmp(w20, Operand(w20));
+  __ Ccmp(x24, Operand(x23, UXTB, 1), NZCVFlag, ne);
+  __ Mrs(x4, NZCV);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_32(ZCFlag, w0);
+  ASSERT_EQUAL_32(ZCFlag, w1);
+  ASSERT_EQUAL_32(ZCFlag, w2);
+  ASSERT_EQUAL_32(NCFlag, w3);
+  ASSERT_EQUAL_32(NZCVFlag, w4);
+
+  TEARDOWN();
+}
+
+
+TEST(csel) {
+  SETUP();
+
+  START();
+  __ Mov(x16, 0);
+  __ Mov(x24, 0x0000000f0000000fUL);
+  __ Mov(x25, 0x0000001f0000001fUL);
+
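+  // Csel selects the first source if the condition holds and the second
+  // otherwise; Csinc, Csinv and Csneg increment, invert or negate the second
+  // source instead of using it directly.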
+  __ Cmp(w16, Operand(0));
+  __ Csel(w0, w24, w25, eq);
+  __ Csel(w1, w24, w25, ne);
+  __ Csinc(w2, w24, w25, mi);
+  __ Csinc(w3, w24, w25, pl);
+
+  __ Cmp(x16, Operand(1));
+  __ Csinv(x4, x24, x25, gt);
+  __ Csinv(x5, x24, x25, le);
+  __ Csneg(x6, x24, x25, hs);
+  __ Csneg(x7, x24, x25, lo);
+
+  __ Cset(w8, ne);
+  __ Csetm(w9, ne);
+  __ Cinc(x10, x25, ne);
+  __ Cinv(x11, x24, ne);
+  __ Cneg(x12, x24, ne);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x0000000f, x0);
+  ASSERT_EQUAL_64(0x0000001f, x1);
+  ASSERT_EQUAL_64(0x00000020, x2);
+  ASSERT_EQUAL_64(0x0000000f, x3);
+  ASSERT_EQUAL_64(0xffffffe0ffffffe0UL, x4);
+  ASSERT_EQUAL_64(0x0000000f0000000fUL, x5);
+  ASSERT_EQUAL_64(0xffffffe0ffffffe1UL, x6);
+  ASSERT_EQUAL_64(0x0000000f0000000fUL, x7);
+  ASSERT_EQUAL_64(0x00000001, x8);
+  ASSERT_EQUAL_64(0xffffffff, x9);
+  ASSERT_EQUAL_64(0x0000001f00000020UL, x10);
+  ASSERT_EQUAL_64(0xfffffff0fffffff0UL, x11);
+  ASSERT_EQUAL_64(0xfffffff0fffffff1UL, x12);
+
+  TEARDOWN();
+}
+
+
+TEST(lslv) {
+  SETUP();
+
+  uint64_t value = 0x0123456789abcdefUL;
+  int shift[] = {1, 3, 5, 9, 17, 33};
+
+  START();
+  __ Mov(x0, value);
+  __ Mov(w1, shift[0]);
+  __ Mov(w2, shift[1]);
+  __ Mov(w3, shift[2]);
+  __ Mov(w4, shift[3]);
+  __ Mov(w5, shift[4]);
+  __ Mov(w6, shift[5]);
+
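+  // Variable shifts use only the low bits of the shift register, so the
+  // amount is effectively taken modulo the register width; shifting by xzr
+  // leaves the value unchanged.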
+  __ lslv(x0, x0, xzr);
+
+  __ Lsl(x16, x0, x1);
+  __ Lsl(x17, x0, x2);
+  __ Lsl(x18, x0, x3);
+  __ Lsl(x19, x0, x4);
+  __ Lsl(x20, x0, x5);
+  __ Lsl(x21, x0, x6);
+
+  __ Lsl(w22, w0, w1);
+  __ Lsl(w23, w0, w2);
+  __ Lsl(w24, w0, w3);
+  __ Lsl(w25, w0, w4);
+  __ Lsl(w26, w0, w5);
+  __ Lsl(w27, w0, w6);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(value, x0);
+  ASSERT_EQUAL_64(value << (shift[0] & 63), x16);
+  ASSERT_EQUAL_64(value << (shift[1] & 63), x17);
+  ASSERT_EQUAL_64(value << (shift[2] & 63), x18);
+  ASSERT_EQUAL_64(value << (shift[3] & 63), x19);
+  ASSERT_EQUAL_64(value << (shift[4] & 63), x20);
+  ASSERT_EQUAL_64(value << (shift[5] & 63), x21);
+  ASSERT_EQUAL_32(value << (shift[0] & 31), w22);
+  ASSERT_EQUAL_32(value << (shift[1] & 31), w23);
+  ASSERT_EQUAL_32(value << (shift[2] & 31), w24);
+  ASSERT_EQUAL_32(value << (shift[3] & 31), w25);
+  ASSERT_EQUAL_32(value << (shift[4] & 31), w26);
+  ASSERT_EQUAL_32(value << (shift[5] & 31), w27);
+
+  TEARDOWN();
+}
+
+
+TEST(lsrv) {
+  SETUP();
+
+  uint64_t value = 0x0123456789abcdefUL;
+  int shift[] = {1, 3, 5, 9, 17, 33};
+
+  START();
+  __ Mov(x0, value);
+  __ Mov(w1, shift[0]);
+  __ Mov(w2, shift[1]);
+  __ Mov(w3, shift[2]);
+  __ Mov(w4, shift[3]);
+  __ Mov(w5, shift[4]);
+  __ Mov(w6, shift[5]);
+
+  __ lsrv(x0, x0, xzr);
+
+  __ Lsr(x16, x0, x1);
+  __ Lsr(x17, x0, x2);
+  __ Lsr(x18, x0, x3);
+  __ Lsr(x19, x0, x4);
+  __ Lsr(x20, x0, x5);
+  __ Lsr(x21, x0, x6);
+
+  __ Lsr(w22, w0, w1);
+  __ Lsr(w23, w0, w2);
+  __ Lsr(w24, w0, w3);
+  __ Lsr(w25, w0, w4);
+  __ Lsr(w26, w0, w5);
+  __ Lsr(w27, w0, w6);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(value, x0);
+  ASSERT_EQUAL_64(value >> (shift[0] & 63), x16);
+  ASSERT_EQUAL_64(value >> (shift[1] & 63), x17);
+  ASSERT_EQUAL_64(value >> (shift[2] & 63), x18);
+  ASSERT_EQUAL_64(value >> (shift[3] & 63), x19);
+  ASSERT_EQUAL_64(value >> (shift[4] & 63), x20);
+  ASSERT_EQUAL_64(value >> (shift[5] & 63), x21);
+
+  value &= 0xffffffffUL;
+  ASSERT_EQUAL_32(value >> (shift[0] & 31), w22);
+  ASSERT_EQUAL_32(value >> (shift[1] & 31), w23);
+  ASSERT_EQUAL_32(value >> (shift[2] & 31), w24);
+  ASSERT_EQUAL_32(value >> (shift[3] & 31), w25);
+  ASSERT_EQUAL_32(value >> (shift[4] & 31), w26);
+  ASSERT_EQUAL_32(value >> (shift[5] & 31), w27);
+
+  TEARDOWN();
+}
+
+
+TEST(asrv) {
+  SETUP();
+
+  int64_t value = 0xfedcba98fedcba98UL;
+  int shift[] = {1, 3, 5, 9, 17, 33};
+
+  START();
+  __ Mov(x0, value);
+  __ Mov(w1, shift[0]);
+  __ Mov(w2, shift[1]);
+  __ Mov(w3, shift[2]);
+  __ Mov(w4, shift[3]);
+  __ Mov(w5, shift[4]);
+  __ Mov(w6, shift[5]);
+
+  __ asrv(x0, x0, xzr);
+
+  __ Asr(x16, x0, x1);
+  __ Asr(x17, x0, x2);
+  __ Asr(x18, x0, x3);
+  __ Asr(x19, x0, x4);
+  __ Asr(x20, x0, x5);
+  __ Asr(x21, x0, x6);
+
+  __ Asr(w22, w0, w1);
+  __ Asr(w23, w0, w2);
+  __ Asr(w24, w0, w3);
+  __ Asr(w25, w0, w4);
+  __ Asr(w26, w0, w5);
+  __ Asr(w27, w0, w6);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(value, x0);
+  ASSERT_EQUAL_64(value >> (shift[0] & 63), x16);
+  ASSERT_EQUAL_64(value >> (shift[1] & 63), x17);
+  ASSERT_EQUAL_64(value >> (shift[2] & 63), x18);
+  ASSERT_EQUAL_64(value >> (shift[3] & 63), x19);
+  ASSERT_EQUAL_64(value >> (shift[4] & 63), x20);
+  ASSERT_EQUAL_64(value >> (shift[5] & 63), x21);
+
+  int32_t value32 = static_cast<int32_t>(value & 0xffffffffUL);
+  ASSERT_EQUAL_32(value32 >> (shift[0] & 31), w22);
+  ASSERT_EQUAL_32(value32 >> (shift[1] & 31), w23);
+  ASSERT_EQUAL_32(value32 >> (shift[2] & 31), w24);
+  ASSERT_EQUAL_32(value32 >> (shift[3] & 31), w25);
+  ASSERT_EQUAL_32(value32 >> (shift[4] & 31), w26);
+  ASSERT_EQUAL_32(value32 >> (shift[5] & 31), w27);
+
+  TEARDOWN();
+}
+
+
+TEST(rorv) {
+  SETUP();
+
+  uint64_t value = 0x0123456789abcdefUL;
+  int shift[] = {4, 8, 12, 16, 24, 36};
+
+  START();
+  __ Mov(x0, value);
+  __ Mov(w1, shift[0]);
+  __ Mov(w2, shift[1]);
+  __ Mov(w3, shift[2]);
+  __ Mov(w4, shift[3]);
+  __ Mov(w5, shift[4]);
+  __ Mov(w6, shift[5]);
+
+  __ rorv(x0, x0, xzr);
+
+  __ Ror(x16, x0, x1);
+  __ Ror(x17, x0, x2);
+  __ Ror(x18, x0, x3);
+  __ Ror(x19, x0, x4);
+  __ Ror(x20, x0, x5);
+  __ Ror(x21, x0, x6);
+
+  __ Ror(w22, w0, w1);
+  __ Ror(w23, w0, w2);
+  __ Ror(w24, w0, w3);
+  __ Ror(w25, w0, w4);
+  __ Ror(w26, w0, w5);
+  __ Ror(w27, w0, w6);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(value, x0);
+  ASSERT_EQUAL_64(0xf0123456789abcdeUL, x16);
+  ASSERT_EQUAL_64(0xef0123456789abcdUL, x17);
+  ASSERT_EQUAL_64(0xdef0123456789abcUL, x18);
+  ASSERT_EQUAL_64(0xcdef0123456789abUL, x19);
+  ASSERT_EQUAL_64(0xabcdef0123456789UL, x20);
+  ASSERT_EQUAL_64(0x789abcdef0123456UL, x21);
+  ASSERT_EQUAL_32(0xf89abcde, w22);
+  ASSERT_EQUAL_32(0xef89abcd, w23);
+  ASSERT_EQUAL_32(0xdef89abc, w24);
+  ASSERT_EQUAL_32(0xcdef89ab, w25);
+  ASSERT_EQUAL_32(0xabcdef89, w26);
+  ASSERT_EQUAL_32(0xf89abcde, w27);
+
+  TEARDOWN();
+}
+
+
+TEST(bfm) {
+  SETUP();
+
+  START();
+  __ Mov(x1, 0x0123456789abcdefL);
+
+  __ Mov(x10, 0x8888888888888888L);
+  __ Mov(x11, 0x8888888888888888L);
+  __ Mov(x12, 0x8888888888888888L);
+  __ Mov(x13, 0x8888888888888888L);
+  __ Mov(w20, 0x88888888);
+  __ Mov(w21, 0x88888888);
+
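+  // The raw bfm form takes the immr and imms fields directly; the Bfi and
+  // Bfxil aliases below take a bit position and a width instead.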
+  __ bfm(x10, x1, 16, 31);
+  __ bfm(x11, x1, 32, 15);
+
+  __ bfm(w20, w1, 16, 23);
+  __ bfm(w21, w1, 24, 15);
+
+  // Aliases.
+  __ Bfi(x12, x1, 16, 8);
+  __ Bfxil(x13, x1, 16, 8);
+  END();
+
+  RUN();
+
+
+  ASSERT_EQUAL_64(0x88888888888889abL, x10);
+  ASSERT_EQUAL_64(0x8888cdef88888888L, x11);
+
+  ASSERT_EQUAL_32(0x888888ab, w20);
+  ASSERT_EQUAL_32(0x88cdef88, w21);
+
+  ASSERT_EQUAL_64(0x8888888888ef8888L, x12);
+  ASSERT_EQUAL_64(0x88888888888888abL, x13);
+
+  TEARDOWN();
+}
+
+
+TEST(sbfm) {
+  SETUP();
+
+  START();
+  __ Mov(x1, 0x0123456789abcdefL);
+  __ Mov(x2, 0xfedcba9876543210L);
+
+  __ sbfm(x10, x1, 16, 31);
+  __ sbfm(x11, x1, 32, 15);
+  __ sbfm(x12, x1, 32, 47);
+  __ sbfm(x13, x1, 48, 35);
+
+  __ sbfm(w14, w1, 16, 23);
+  __ sbfm(w15, w1, 24, 15);
+  __ sbfm(w16, w2, 16, 23);
+  __ sbfm(w17, w2, 24, 15);
+
+  // Aliases.
+  __ Asr(x18, x1, 32);
+  __ Asr(x19, x2, 32);
+  __ Sbfiz(x20, x1, 8, 16);
+  __ Sbfiz(x21, x2, 8, 16);
+  __ Sbfx(x22, x1, 8, 16);
+  __ Sbfx(x23, x2, 8, 16);
+  __ Sxtb(x24, x1);
+  __ Sxtb(x25, x2);
+  __ Sxth(x26, x1);
+  __ Sxth(x27, x2);
+  __ Sxtw(x28, x1);
+  __ Sxtw(x29, x2);
+  END();
+
+  RUN();
+
+
+  ASSERT_EQUAL_64(0xffffffffffff89abL, x10);
+  ASSERT_EQUAL_64(0xffffcdef00000000L, x11);
+  ASSERT_EQUAL_64(0x4567L, x12);
+  ASSERT_EQUAL_64(0x789abcdef0000L, x13);
+
+  ASSERT_EQUAL_32(0xffffffab, w14);
+  ASSERT_EQUAL_32(0xffcdef00, w15);
+  ASSERT_EQUAL_32(0x54, w16);
+  ASSERT_EQUAL_32(0x00321000, w17);
+
+  ASSERT_EQUAL_64(0x01234567L, x18);
+  ASSERT_EQUAL_64(0xfffffffffedcba98L, x19);
+  ASSERT_EQUAL_64(0xffffffffffcdef00L, x20);
+  ASSERT_EQUAL_64(0x321000L, x21);
+  ASSERT_EQUAL_64(0xffffffffffffabcdL, x22);
+  ASSERT_EQUAL_64(0x5432L, x23);
+  ASSERT_EQUAL_64(0xffffffffffffffefL, x24);
+  ASSERT_EQUAL_64(0x10, x25);
+  ASSERT_EQUAL_64(0xffffffffffffcdefL, x26);
+  ASSERT_EQUAL_64(0x3210, x27);
+  ASSERT_EQUAL_64(0xffffffff89abcdefL, x28);
+  ASSERT_EQUAL_64(0x76543210, x29);
+
+  TEARDOWN();
+}
+
+
+TEST(ubfm) {
+  SETUP();
+
+  START();
+  __ Mov(x1, 0x0123456789abcdefL);
+  __ Mov(x2, 0xfedcba9876543210L);
+
+  __ Mov(x10, 0x8888888888888888L);
+  __ Mov(x11, 0x8888888888888888L);
+
+  __ ubfm(x10, x1, 16, 31);
+  __ ubfm(x11, x1, 32, 15);
+  __ ubfm(x12, x1, 32, 47);
+  __ ubfm(x13, x1, 48, 35);
+
+  __ ubfm(w25, w1, 16, 23);
+  __ ubfm(w26, w1, 24, 15);
+  __ ubfm(w27, w2, 16, 23);
+  __ ubfm(w28, w2, 24, 15);
+
+  // Aliases.
+  __ Lsl(x15, x1, 63);
+  __ Lsl(x16, x1, 0);
+  __ Lsr(x17, x1, 32);
+  __ Ubfiz(x18, x1, 8, 16);
+  __ Ubfx(x19, x1, 8, 16);
+  __ Uxtb(x20, x1);
+  __ Uxth(x21, x1);
+  __ Uxtw(x22, x1);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x00000000000089abL, x10);
+  ASSERT_EQUAL_64(0x0000cdef00000000L, x11);
+  ASSERT_EQUAL_64(0x4567L, x12);
+  ASSERT_EQUAL_64(0x789abcdef0000L, x13);
+
+  ASSERT_EQUAL_32(0x000000ab, w25);
+  ASSERT_EQUAL_32(0x00cdef00, w26);
+  ASSERT_EQUAL_32(0x54, w27);
+  ASSERT_EQUAL_32(0x00321000, w28);
+
+  ASSERT_EQUAL_64(0x8000000000000000L, x15);
+  ASSERT_EQUAL_64(0x0123456789abcdefL, x16);
+  ASSERT_EQUAL_64(0x01234567L, x17);
+  ASSERT_EQUAL_64(0xcdef00L, x18);
+  ASSERT_EQUAL_64(0xabcdL, x19);
+  ASSERT_EQUAL_64(0xefL, x20);
+  ASSERT_EQUAL_64(0xcdefL, x21);
+  ASSERT_EQUAL_64(0x89abcdefL, x22);
+
+  TEARDOWN();
+}
+
+
+TEST(extr) {
+  SETUP();
+
+  START();
+  __ Mov(x1, 0x0123456789abcdefL);
+  __ Mov(x2, 0xfedcba9876543210L);
+
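+  // Extr extracts a register-width field from the <rn:rm> pair, starting at
+  // bit <lsb> of rm. Ror with an immediate rotate is the alias where both
+  // source registers are the same.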
+  __ Extr(w10, w1, w2, 0);
+  __ Extr(w11, w1, w2, 1);
+  __ Extr(x12, x2, x1, 2);
+
+  __ Ror(w13, w1, 0);
+  __ Ror(w14, w2, 17);
+  __ Ror(w15, w1, 31);
+  __ Ror(x18, x2, 1);
+  __ Ror(x19, x1, 63);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x76543210, x10);
+  ASSERT_EQUAL_64(0xbb2a1908, x11);
+  ASSERT_EQUAL_64(0x0048d159e26af37bUL, x12);
+  ASSERT_EQUAL_64(0x89abcdef, x13);
+  ASSERT_EQUAL_64(0x19083b2a, x14);
+  ASSERT_EQUAL_64(0x13579bdf, x15);
+  ASSERT_EQUAL_64(0x7f6e5d4c3b2a1908UL, x18);
+  ASSERT_EQUAL_64(0x02468acf13579bdeUL, x19);
+
+  TEARDOWN();
+}
+
+
+TEST(fmov_imm) {
+  SETUP();
+
+  START();
+  __ Fmov(s11, 1.0);
+  __ Fmov(d22, -13.0);
+  __ Fmov(s1, 255.0);
+  __ Fmov(d2, 12.34567);
+  __ Fmov(s3, 0.0);
+  __ Fmov(d4, 0.0);
+  __ Fmov(s5, kFP32PositiveInfinity);
+  __ Fmov(d6, kFP64NegativeInfinity);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(1.0, s11);
+  ASSERT_EQUAL_FP64(-13.0, d22);
+  ASSERT_EQUAL_FP32(255.0, s1);
+  ASSERT_EQUAL_FP64(12.34567, d2);
+  ASSERT_EQUAL_FP32(0.0, s3);
+  ASSERT_EQUAL_FP64(0.0, d4);
+  ASSERT_EQUAL_FP32(kFP32PositiveInfinity, s5);
+  ASSERT_EQUAL_FP64(kFP64NegativeInfinity, d6);
+
+  TEARDOWN();
+}
+
+
+TEST(fmov_reg) {
+  SETUP();
+
+  START();
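+  // Fmov between a general-purpose register and an FP register moves the raw
+  // bits without any conversion.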
+  __ Fmov(s20, 1.0);
+  __ Fmov(w10, s20);
+  __ Fmov(s30, w10);
+  __ Fmov(s5, s20);
+  __ Fmov(d1, -13.0);
+  __ Fmov(x1, d1);
+  __ Fmov(d2, x1);
+  __ Fmov(d4, d1);
+  __ Fmov(d6, rawbits_to_double(0x0123456789abcdefL));
+  __ Fmov(s6, s6);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_32(float_to_rawbits(1.0), w10);
+  ASSERT_EQUAL_FP32(1.0, s30);
+  ASSERT_EQUAL_FP32(1.0, s5);
+  ASSERT_EQUAL_64(double_to_rawbits(-13.0), x1);
+  ASSERT_EQUAL_FP64(-13.0, d2);
+  ASSERT_EQUAL_FP64(-13.0, d4);
+  ASSERT_EQUAL_FP32(rawbits_to_float(0x89abcdef), s6);
+
+  TEARDOWN();
+}
+
+
+TEST(fadd) {
+  SETUP();
+
+  START();
+  __ Fmov(s13, -0.0);
+  __ Fmov(s14, kFP32PositiveInfinity);
+  __ Fmov(s15, kFP32NegativeInfinity);
+  __ Fmov(s16, 3.25);
+  __ Fmov(s17, 1.0);
+  __ Fmov(s18, 0);
+
+  __ Fmov(d26, -0.0);
+  __ Fmov(d27, kFP64PositiveInfinity);
+  __ Fmov(d28, kFP64NegativeInfinity);
+  __ Fmov(d29, 0);
+  __ Fmov(d30, -2.0);
+  __ Fmov(d31, 2.25);
+
+  __ Fadd(s0, s16, s17);
+  __ Fadd(s1, s17, s18);
+  __ Fadd(s2, s13, s17);
+  __ Fadd(s3, s14, s17);
+  __ Fadd(s4, s15, s17);
+
+  __ Fadd(d5, d30, d31);
+  __ Fadd(d6, d29, d31);
+  __ Fadd(d7, d26, d31);
+  __ Fadd(d8, d27, d31);
+  __ Fadd(d9, d28, d31);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(4.25, s0);
+  ASSERT_EQUAL_FP32(1.0, s1);
+  ASSERT_EQUAL_FP32(1.0, s2);
+  ASSERT_EQUAL_FP32(kFP32PositiveInfinity, s3);
+  ASSERT_EQUAL_FP32(kFP32NegativeInfinity, s4);
+  ASSERT_EQUAL_FP64(0.25, d5);
+  ASSERT_EQUAL_FP64(2.25, d6);
+  ASSERT_EQUAL_FP64(2.25, d7);
+  ASSERT_EQUAL_FP64(kFP64PositiveInfinity, d8);
+  ASSERT_EQUAL_FP64(kFP64NegativeInfinity, d9);
+
+  TEARDOWN();
+}
+
+
+TEST(fsub) {
+  SETUP();
+
+  START();
+  __ Fmov(s13, -0.0);
+  __ Fmov(s14, kFP32PositiveInfinity);
+  __ Fmov(s15, kFP32NegativeInfinity);
+  __ Fmov(s16, 3.25);
+  __ Fmov(s17, 1.0);
+  __ Fmov(s18, 0);
+
+  __ Fmov(d26, -0.0);
+  __ Fmov(d27, kFP64PositiveInfinity);
+  __ Fmov(d28, kFP64NegativeInfinity);
+  __ Fmov(d29, 0);
+  __ Fmov(d30, -2.0);
+  __ Fmov(d31, 2.25);
+
+  __ Fsub(s0, s16, s17);
+  __ Fsub(s1, s17, s18);
+  __ Fsub(s2, s13, s17);
+  __ Fsub(s3, s17, s14);
+  __ Fsub(s4, s17, s15);
+
+  __ Fsub(d5, d30, d31);
+  __ Fsub(d6, d29, d31);
+  __ Fsub(d7, d26, d31);
+  __ Fsub(d8, d31, d27);
+  __ Fsub(d9, d31, d28);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(2.25, s0);
+  ASSERT_EQUAL_FP32(1.0, s1);
+  ASSERT_EQUAL_FP32(-1.0, s2);
+  ASSERT_EQUAL_FP32(kFP32NegativeInfinity, s3);
+  ASSERT_EQUAL_FP32(kFP32PositiveInfinity, s4);
+  ASSERT_EQUAL_FP64(-4.25, d5);
+  ASSERT_EQUAL_FP64(-2.25, d6);
+  ASSERT_EQUAL_FP64(-2.25, d7);
+  ASSERT_EQUAL_FP64(kFP64NegativeInfinity, d8);
+  ASSERT_EQUAL_FP64(kFP64PositiveInfinity, d9);
+
+  TEARDOWN();
+}
+
+
+TEST(fmul) {
+  SETUP();
+
+  START();
+  __ Fmov(s13, -0.0);
+  __ Fmov(s14, kFP32PositiveInfinity);
+  __ Fmov(s15, kFP32NegativeInfinity);
+  __ Fmov(s16, 3.25);
+  __ Fmov(s17, 2.0);
+  __ Fmov(s18, 0);
+  __ Fmov(s19, -2.0);
+
+  __ Fmov(d26, -0.0);
+  __ Fmov(d27, kFP64PositiveInfinity);
+  __ Fmov(d28, kFP64NegativeInfinity);
+  __ Fmov(d29, 0);
+  __ Fmov(d30, -2.0);
+  __ Fmov(d31, 2.25);
+
+  __ Fmul(s0, s16, s17);
+  __ Fmul(s1, s17, s18);
+  __ Fmul(s2, s13, s13);
+  __ Fmul(s3, s14, s19);
+  __ Fmul(s4, s15, s19);
+
+  __ Fmul(d5, d30, d31);
+  __ Fmul(d6, d29, d31);
+  __ Fmul(d7, d26, d26);
+  __ Fmul(d8, d27, d30);
+  __ Fmul(d9, d28, d30);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(6.5, s0);
+  ASSERT_EQUAL_FP32(0.0, s1);
+  ASSERT_EQUAL_FP32(0.0, s2);
+  ASSERT_EQUAL_FP32(kFP32NegativeInfinity, s3);
+  ASSERT_EQUAL_FP32(kFP32PositiveInfinity, s4);
+  ASSERT_EQUAL_FP64(-4.5, d5);
+  ASSERT_EQUAL_FP64(0.0, d6);
+  ASSERT_EQUAL_FP64(0.0, d7);
+  ASSERT_EQUAL_FP64(kFP64NegativeInfinity, d8);
+  ASSERT_EQUAL_FP64(kFP64PositiveInfinity, d9);
+
+  TEARDOWN();
+}
+
+
+TEST(fmsub) {
+  SETUP();
+
+  START();
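+  // Fmsub computes fd = fa - (fn * fm), where fa is the final operand.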
+  __ Fmov(s16, 3.25);
+  __ Fmov(s17, 2.0);
+  __ Fmov(s18, 0);
+  __ Fmov(s19, -0.5);
+  __ Fmov(s20, kFP32PositiveInfinity);
+  __ Fmov(s21, kFP32NegativeInfinity);
+  __ Fmov(s22, -0);
+
+  __ Fmov(d29, 0);
+  __ Fmov(d30, -2.0);
+  __ Fmov(d31, 2.25);
+  __ Fmov(d28, 4);
+  __ Fmov(d24, kFP64PositiveInfinity);
+  __ Fmov(d25, kFP64NegativeInfinity);
+  __ Fmov(d26, -0);
+
+  // Normal combinations
+  __ Fmsub(s0, s16, s17, s18);
+  __ Fmsub(s1, s17, s18, s16);
+  __ Fmsub(s2, s17, s16, s19);
+  // Pos/Neg Infinity
+  __ Fmsub(s3, s16, s21, s19);
+  __ Fmsub(s4, s17, s16, s20);
+  __ Fmsub(s5, s20, s16, s19);
+  __ Fmsub(s6, s21, s16, s19);
+  // -0
+  __ Fmsub(s7, s22, s16, s19);
+  __ Fmsub(s8, s19, s16, s22);
+
+  // Normal combinations
+  __ Fmsub(d9, d30, d31, d29);
+  __ Fmsub(d10, d29, d31, d30);
+  __ Fmsub(d11, d30, d31, d28);
+  // Pos/Neg Infinity
+  __ Fmsub(d12, d30, d24, d28);
+  __ Fmsub(d13, d24, d31, d25);
+  __ Fmsub(d14, d24, d31, d28);
+  __ Fmsub(d15, d25, d31, d28);
+  // -0
+  __ Fmsub(d16, d26, d31, d28);
+  __ Fmsub(d17, d30, d26, d28);
+  END();
+
+  RUN();
+
+  // Normal combinations
+  ASSERT_EQUAL_FP32(-6.5, s0);
+  ASSERT_EQUAL_FP32(3.25, s1);
+  ASSERT_EQUAL_FP32(-7, s2);
+  // Pos/Neg Infinity
+  ASSERT_EQUAL_FP32(kFP32PositiveInfinity, s3);
+  ASSERT_EQUAL_FP32(kFP32PositiveInfinity, s4);
+  ASSERT_EQUAL_FP32(kFP32NegativeInfinity, s5);
+  ASSERT_EQUAL_FP32(kFP32PositiveInfinity, s6);
+  // -0
+  ASSERT_EQUAL_FP32(-0.5, s7);
+  ASSERT_EQUAL_FP32(1.625, s8);
+
+  // Normal combinations
+  ASSERT_EQUAL_FP64(4.5, d9);
+  ASSERT_EQUAL_FP64(-2.0, d10);
+  ASSERT_EQUAL_FP64(8.5, d11);
+  // Pos/Neg Infinity
+  ASSERT_EQUAL_FP64(kFP64PositiveInfinity, d12);
+  ASSERT_EQUAL_FP64(kFP64NegativeInfinity, d13);
+  ASSERT_EQUAL_FP64(kFP64NegativeInfinity, d14);
+  ASSERT_EQUAL_FP64(kFP64PositiveInfinity, d15);
+  // -0
+  ASSERT_EQUAL_FP64(4.0, d16);
+  ASSERT_EQUAL_FP64(4.0, d17);
+
+  TEARDOWN();
+}
+
+
+TEST(fdiv) {
+  SETUP();
+
+  START();
+  __ Fmov(s13, -0.0);
+  __ Fmov(s14, kFP32PositiveInfinity);
+  __ Fmov(s15, kFP32NegativeInfinity);
+  __ Fmov(s16, 3.25);
+  __ Fmov(s17, 2.0);
+  __ Fmov(s18, 2.0);
+  __ Fmov(s19, -2.0);
+
+  __ Fmov(d26, -0.0);
+  __ Fmov(d27, kFP64PositiveInfinity);
+  __ Fmov(d28, kFP64NegativeInfinity);
+  __ Fmov(d29, 0);
+  __ Fmov(d30, -2.0);
+  __ Fmov(d31, 2.25);
+
+  __ Fdiv(s0, s16, s17);
+  __ Fdiv(s1, s17, s18);
+  __ Fdiv(s2, s13, s17);
+  __ Fdiv(s3, s17, s14);
+  __ Fdiv(s4, s17, s15);
+  __ Fdiv(d5, d31, d30);
+  __ Fdiv(d6, d29, d31);
+  __ Fdiv(d7, d26, d31);
+  __ Fdiv(d8, d31, d27);
+  __ Fdiv(d9, d31, d28);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(1.625, s0);
+  ASSERT_EQUAL_FP32(1.0, s1);
+  ASSERT_EQUAL_FP32(-0.0, s2);
+  ASSERT_EQUAL_FP32(0.0, s3);
+  ASSERT_EQUAL_FP32(-0.0, s4);
+  ASSERT_EQUAL_FP64(-1.125, d5);
+  ASSERT_EQUAL_FP64(0.0, d6);
+  ASSERT_EQUAL_FP64(-0.0, d7);
+  ASSERT_EQUAL_FP64(0.0, d8);
+  ASSERT_EQUAL_FP64(-0.0, d9);
+
+  TEARDOWN();
+}
+
+
+TEST(fmin_s) {
+  SETUP();
+
+  START();
+  __ Fmov(s25, 0.0);
+  __ Fneg(s26, s25);
+  __ Fmov(s27, kFP32PositiveInfinity);
+  __ Fmov(s28, 1.0);
+  __ Fmin(s0, s25, s26);
+  __ Fmin(s1, s27, s28);
+  __ Fmin(s2, s28, s26);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(-0.0, s0);
+  ASSERT_EQUAL_FP32(1.0, s1);
+  ASSERT_EQUAL_FP32(-0.0, s2);
+
+  TEARDOWN();
+}
+
+
+TEST(fmin_d) {
+  SETUP();
+
+  START();
+  __ Fmov(d25, 0.0);
+  __ Fneg(d26, d25);
+  __ Fmov(d27, kFP32PositiveInfinity);
+  __ Fneg(d28, d27);
+  __ Fmov(d29, 1.0);
+
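+  // d25 - d29 hold 0.0, -0.0, +infinity, -infinity and 1.0 respectively.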
+  for (unsigned j = 0; j < 5; j++) {
+    for (unsigned i = 0; i < 5; i++) {
+      // Test all combinations, writing results into d0 - d24.
+      __ Fmin(FPRegister::DRegFromCode(i + 5*j),
+              FPRegister::DRegFromCode(i + 25),
+              FPRegister::DRegFromCode(j + 25));
+    }
+  }
+  END();
+
+  RUN();
+
+  // Second register is 0.0.
+  ASSERT_EQUAL_FP64(0.0, d0);
+  ASSERT_EQUAL_FP64(-0.0, d1);
+  ASSERT_EQUAL_FP64(0.0, d2);
+  ASSERT_EQUAL_FP64(kFP32NegativeInfinity, d3);
+  ASSERT_EQUAL_FP64(0.0, d4);
+
+  // Second register is -0.0.
+  ASSERT_EQUAL_FP64(-0.0, d5);
+  ASSERT_EQUAL_FP64(-0.0, d6);
+  ASSERT_EQUAL_FP64(-0.0, d7);
+  ASSERT_EQUAL_FP64(kFP32NegativeInfinity, d8);
+  ASSERT_EQUAL_FP64(-0.0, d9);
+
+  // Second register is +Inf.
+  ASSERT_EQUAL_FP64(0.0, d10);
+  ASSERT_EQUAL_FP64(-0.0, d11);
+  ASSERT_EQUAL_FP64(kFP32PositiveInfinity, d12);
+  ASSERT_EQUAL_FP64(kFP32NegativeInfinity, d13);
+  ASSERT_EQUAL_FP64(1.0, d14);
+
+  // Second register is -Inf.
+  ASSERT_EQUAL_FP64(kFP32NegativeInfinity, d15);
+  ASSERT_EQUAL_FP64(kFP32NegativeInfinity, d16);
+  ASSERT_EQUAL_FP64(kFP32NegativeInfinity, d17);
+  ASSERT_EQUAL_FP64(kFP32NegativeInfinity, d18);
+  ASSERT_EQUAL_FP64(kFP32NegativeInfinity, d19);
+
+  // Second register is 1.0.
+  ASSERT_EQUAL_FP64(0.0, d20);
+  ASSERT_EQUAL_FP64(-0.0, d21);
+  ASSERT_EQUAL_FP64(1.0, d22);
+  ASSERT_EQUAL_FP64(kFP32NegativeInfinity, d23);
+  ASSERT_EQUAL_FP64(1.0, d24);
+
+  TEARDOWN();
+}
+
+
+TEST(fmax_s) {
+  SETUP();
+
+  START();
+  __ Fmov(s25, 0.0);
+  __ Fneg(s26, s25);
+  __ Fmov(s27, kFP32PositiveInfinity);
+  __ Fmov(s28, 1.0);
+  __ Fmax(s0, s25, s26);
+  __ Fmax(s1, s27, s28);
+  __ Fmax(s2, s28, s26);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(0.0, s0);
+  ASSERT_EQUAL_FP32(kFP32PositiveInfinity, s1);
+  ASSERT_EQUAL_FP32(1.0, s2);
+
+  TEARDOWN();
+}
+
+
+TEST(fmax_d) {
+  SETUP();
+
+  START();
+  __ Fmov(d25, 0.0);
+  __ Fneg(d26, d25);
+  __ Fmov(d27, kFP32PositiveInfinity);
+  __ Fneg(d28, d27);
+  __ Fmov(d29, 1.0);
+
+  for (unsigned j = 0; j < 5; j++) {
+    for (unsigned i = 0; i < 5; i++) {
+      // Test all combinations, writing results into d0 - d24.
+      __ Fmax(FPRegister::DRegFromCode(i + 5*j),
+              FPRegister::DRegFromCode(i + 25),
+              FPRegister::DRegFromCode(j + 25));
+    }
+  }
+  END();
+
+  RUN();
+
+  // Second register is 0.0.
+  ASSERT_EQUAL_FP64(0.0, d0);
+  ASSERT_EQUAL_FP64(0.0, d1);
+  ASSERT_EQUAL_FP64(kFP32PositiveInfinity, d2);
+  ASSERT_EQUAL_FP64(0.0, d3);
+  ASSERT_EQUAL_FP64(1.0, d4);
+
+  // Second register is -0.0.
+  ASSERT_EQUAL_FP64(0.0, d5);
+  ASSERT_EQUAL_FP64(-0.0, d6);
+  ASSERT_EQUAL_FP64(kFP32PositiveInfinity, d7);
+  ASSERT_EQUAL_FP64(-0.0, d8);
+  ASSERT_EQUAL_FP64(1.0, d9);
+
+  // Second register is +Inf.
+  ASSERT_EQUAL_FP64(kFP32PositiveInfinity, d10);
+  ASSERT_EQUAL_FP64(kFP32PositiveInfinity, d11);
+  ASSERT_EQUAL_FP64(kFP32PositiveInfinity, d12);
+  ASSERT_EQUAL_FP64(kFP32PositiveInfinity, d13);
+  ASSERT_EQUAL_FP64(kFP32PositiveInfinity, d14);
+
+  // Second register is -Inf.
+  ASSERT_EQUAL_FP64(0.0, d15);
+  ASSERT_EQUAL_FP64(-0.0, d16);
+  ASSERT_EQUAL_FP64(kFP32PositiveInfinity, d17);
+  ASSERT_EQUAL_FP64(kFP32NegativeInfinity, d18);
+  ASSERT_EQUAL_FP64(1.0, d19);
+
+  // Second register is 1.0.
+  ASSERT_EQUAL_FP64(1.0, d20);
+  ASSERT_EQUAL_FP64(1.0, d21);
+  ASSERT_EQUAL_FP64(kFP32PositiveInfinity, d22);
+  ASSERT_EQUAL_FP64(1.0, d23);
+  ASSERT_EQUAL_FP64(1.0, d24);
+
+  TEARDOWN();
+}
+
+
+TEST(fccmp) {
+  SETUP();
+
+  START();
+  __ Fmov(s16, 0.0);
+  __ Fmov(s17, 0.5);
+  __ Fmov(d18, -0.5);
+  __ Fmov(d19, -1.0);
+  __ Mov(x20, 0);
+
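+  // When the condition passes, Fccmp sets the flags from the floating-point
+  // comparison; otherwise it sets NZCV to the specified flag value.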
+  __ Cmp(x20, Operand(0));
+  __ Fccmp(s16, s16, NoFlag, eq);
+  __ Mrs(x0, NZCV);
+
+  __ Cmp(x20, Operand(0));
+  __ Fccmp(s16, s16, VFlag, ne);
+  __ Mrs(x1, NZCV);
+
+  __ Cmp(x20, Operand(0));
+  __ Fccmp(s16, s17, CFlag, ge);
+  __ Mrs(x2, NZCV);
+
+  __ Cmp(x20, Operand(0));
+  __ Fccmp(s16, s17, CVFlag, lt);
+  __ Mrs(x3, NZCV);
+
+  __ Cmp(x20, Operand(0));
+  __ Fccmp(d18, d18, ZFlag, le);
+  __ Mrs(x4, NZCV);
+
+  __ Cmp(x20, Operand(0));
+  __ Fccmp(d18, d18, ZVFlag, gt);
+  __ Mrs(x5, NZCV);
+
+  __ Cmp(x20, Operand(0));
+  __ Fccmp(d18, d19, ZCVFlag, ls);
+  __ Mrs(x6, NZCV);
+
+  __ Cmp(x20, Operand(0));
+  __ Fccmp(d18, d19, NFlag, hi);
+  __ Mrs(x7, NZCV);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_32(ZCFlag, w0);
+  ASSERT_EQUAL_32(VFlag, w1);
+  ASSERT_EQUAL_32(NFlag, w2);
+  ASSERT_EQUAL_32(CVFlag, w3);
+  ASSERT_EQUAL_32(ZCFlag, w4);
+  ASSERT_EQUAL_32(ZVFlag, w5);
+  ASSERT_EQUAL_32(CFlag, w6);
+  ASSERT_EQUAL_32(NFlag, w7);
+
+  TEARDOWN();
+}
+
+
+TEST(fcmp) {
+  SETUP();
+
+  START();
+  __ Fmov(s8, 0.0);
+  __ Fmov(s9, 0.5);
+  __ Mov(w18, 0x7f800001);  // Single precision NaN.
+  __ Fmov(s18, w18);
+
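+  // Comparisons involving NaN are unordered and set the C and V flags.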
+  __ Fcmp(s8, s8);
+  __ Mrs(x0, NZCV);
+  __ Fcmp(s8, s9);
+  __ Mrs(x1, NZCV);
+  __ Fcmp(s9, s8);
+  __ Mrs(x2, NZCV);
+  __ Fcmp(s8, s18);
+  __ Mrs(x3, NZCV);
+  __ Fcmp(s18, s18);
+  __ Mrs(x4, NZCV);
+  __ Fcmp(s8, 0.0);
+  __ Mrs(x5, NZCV);
+  __ Fcmp(s8, 255.0);
+  __ Mrs(x6, NZCV);
+
+  __ Fmov(d19, 0.0);
+  __ Fmov(d20, 0.5);
+  __ Mov(x21, 0x7ff0000000000001UL);  // Double precision NaN.
+  __ Fmov(d21, x21);
+
+  __ Fcmp(d19, d19);
+  __ Mrs(x10, NZCV);
+  __ Fcmp(d19, d20);
+  __ Mrs(x11, NZCV);
+  __ Fcmp(d20, d19);
+  __ Mrs(x12, NZCV);
+  __ Fcmp(d19, d21);
+  __ Mrs(x13, NZCV);
+  __ Fcmp(d21, d21);
+  __ Mrs(x14, NZCV);
+  __ Fcmp(d19, 0.0);
+  __ Mrs(x15, NZCV);
+  __ Fcmp(d19, 12.3456);
+  __ Mrs(x16, NZCV);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_32(ZCFlag, w0);
+  ASSERT_EQUAL_32(NFlag, w1);
+  ASSERT_EQUAL_32(CFlag, w2);
+  ASSERT_EQUAL_32(CVFlag, w3);
+  ASSERT_EQUAL_32(CVFlag, w4);
+  ASSERT_EQUAL_32(ZCFlag, w5);
+  ASSERT_EQUAL_32(NFlag, w6);
+  ASSERT_EQUAL_32(ZCFlag, w10);
+  ASSERT_EQUAL_32(NFlag, w11);
+  ASSERT_EQUAL_32(CFlag, w12);
+  ASSERT_EQUAL_32(CVFlag, w13);
+  ASSERT_EQUAL_32(CVFlag, w14);
+  ASSERT_EQUAL_32(ZCFlag, w15);
+  ASSERT_EQUAL_32(NFlag, w16);
+
+  TEARDOWN();
+}
+
+
+TEST(fcsel) {
+  SETUP();
+
+  START();
+  __ Mov(x16, 0);
+  __ Fmov(s16, 1.0);
+  __ Fmov(s17, 2.0);
+  __ Fmov(d18, 3.0);
+  __ Fmov(d19, 4.0);
+
+  __ Cmp(x16, Operand(0));
+  __ Fcsel(s0, s16, s17, eq);
+  __ Fcsel(s1, s16, s17, ne);
+  __ Fcsel(d2, d18, d19, eq);
+  __ Fcsel(d3, d18, d19, ne);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(1.0, s0);
+  ASSERT_EQUAL_FP32(2.0, s1);
+  ASSERT_EQUAL_FP64(3.0, d2);
+  ASSERT_EQUAL_FP64(4.0, d3);
+
+  TEARDOWN();
+}
+
+
+TEST(fneg) {
+  SETUP();
+
+  START();
+  __ Fmov(s16, 1.0);
+  __ Fmov(s17, 0.0);
+  __ Fmov(s18, kFP32PositiveInfinity);
+  __ Fmov(d19, 1.0);
+  __ Fmov(d20, 0.0);
+  __ Fmov(d21, kFP64PositiveInfinity);
+
+  __ Fneg(s0, s16);
+  __ Fneg(s1, s0);
+  __ Fneg(s2, s17);
+  __ Fneg(s3, s2);
+  __ Fneg(s4, s18);
+  __ Fneg(s5, s4);
+  __ Fneg(d6, d19);
+  __ Fneg(d7, d6);
+  __ Fneg(d8, d20);
+  __ Fneg(d9, d8);
+  __ Fneg(d10, d21);
+  __ Fneg(d11, d10);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(-1.0, s0);
+  ASSERT_EQUAL_FP32(1.0, s1);
+  ASSERT_EQUAL_FP32(-0.0, s2);
+  ASSERT_EQUAL_FP32(0.0, s3);
+  ASSERT_EQUAL_FP32(kFP32NegativeInfinity, s4);
+  ASSERT_EQUAL_FP32(kFP32PositiveInfinity, s5);
+  ASSERT_EQUAL_FP64(-1.0, d6);
+  ASSERT_EQUAL_FP64(1.0, d7);
+  ASSERT_EQUAL_FP64(-0.0, d8);
+  ASSERT_EQUAL_FP64(0.0, d9);
+  ASSERT_EQUAL_FP64(kFP64NegativeInfinity, d10);
+  ASSERT_EQUAL_FP64(kFP64PositiveInfinity, d11);
+
+  TEARDOWN();
+}
+
+
+TEST(fabs) {
+  SETUP();
+
+  START();
+  __ Fmov(s16, -1.0);
+  __ Fmov(s17, -0.0);
+  __ Fmov(s18, kFP32NegativeInfinity);
+  __ Fmov(d19, -1.0);
+  __ Fmov(d20, -0.0);
+  __ Fmov(d21, kFP64NegativeInfinity);
+
+  __ Fabs(s0, s16);
+  __ Fabs(s1, s0);
+  __ Fabs(s2, s17);
+  __ Fabs(s3, s18);
+  __ Fabs(d4, d19);
+  __ Fabs(d5, d4);
+  __ Fabs(d6, d20);
+  __ Fabs(d7, d21);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(1.0, s0);
+  ASSERT_EQUAL_FP32(1.0, s1);
+  ASSERT_EQUAL_FP32(0.0, s2);
+  ASSERT_EQUAL_FP32(kFP32PositiveInfinity, s3);
+  ASSERT_EQUAL_FP64(1.0, d4);
+  ASSERT_EQUAL_FP64(1.0, d5);
+  ASSERT_EQUAL_FP64(0.0, d6);
+  ASSERT_EQUAL_FP64(kFP64PositiveInfinity, d7);
+
+  TEARDOWN();
+}
+
+
+TEST(fsqrt) {
+  SETUP();
+
+  START();
+  __ Fmov(s16, 0.0);
+  __ Fmov(s17, 1.0);
+  __ Fmov(s18, 0.25);
+  __ Fmov(s19, 65536.0);
+  __ Fmov(s20, -0.0);
+  __ Fmov(s21, kFP32PositiveInfinity);
+  __ Fmov(d22, 0.0);
+  __ Fmov(d23, 1.0);
+  __ Fmov(d24, 0.25);
+  __ Fmov(d25, 4294967296.0);
+  __ Fmov(d26, -0.0);
+  __ Fmov(d27, kFP64PositiveInfinity);
+
+  __ Fsqrt(s0, s16);
+  __ Fsqrt(s1, s17);
+  __ Fsqrt(s2, s18);
+  __ Fsqrt(s3, s19);
+  __ Fsqrt(s4, s20);
+  __ Fsqrt(s5, s21);
+  __ Fsqrt(d6, d22);
+  __ Fsqrt(d7, d23);
+  __ Fsqrt(d8, d24);
+  __ Fsqrt(d9, d25);
+  __ Fsqrt(d10, d26);
+  __ Fsqrt(d11, d27);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(0.0, s0);
+  ASSERT_EQUAL_FP32(1.0, s1);
+  ASSERT_EQUAL_FP32(0.5, s2);
+  ASSERT_EQUAL_FP32(256.0, s3);
+  ASSERT_EQUAL_FP32(-0.0, s4);
+  ASSERT_EQUAL_FP32(kFP32PositiveInfinity, s5);
+  ASSERT_EQUAL_FP64(0.0, d6);
+  ASSERT_EQUAL_FP64(1.0, d7);
+  ASSERT_EQUAL_FP64(0.5, d8);
+  ASSERT_EQUAL_FP64(65536.0, d9);
+  ASSERT_EQUAL_FP64(-0.0, d10);
+  ASSERT_EQUAL_FP64(kFP32PositiveInfinity, d11);
+
+  TEARDOWN();
+}
+
+
+TEST(frintn) {
+  SETUP();
+
+  START();
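+  // Frintn rounds to the nearest integral value, with ties going to even.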
+  __ Fmov(s16, 1.0);
+  __ Fmov(s17, 1.1);
+  __ Fmov(s18, 1.5);
+  __ Fmov(s19, 1.9);
+  __ Fmov(s20, 2.5);
+  __ Fmov(s21, -1.5);
+  __ Fmov(s22, -2.5);
+  __ Fmov(s23, kFP32PositiveInfinity);
+  __ Fmov(s24, kFP32NegativeInfinity);
+  __ Fmov(s25, 0.0);
+  __ Fmov(s26, -0.0);
+
+  __ Frintn(s0, s16);
+  __ Frintn(s1, s17);
+  __ Frintn(s2, s18);
+  __ Frintn(s3, s19);
+  __ Frintn(s4, s20);
+  __ Frintn(s5, s21);
+  __ Frintn(s6, s22);
+  __ Frintn(s7, s23);
+  __ Frintn(s8, s24);
+  __ Frintn(s9, s25);
+  __ Frintn(s10, s26);
+
+  __ Fmov(d16, 1.0);
+  __ Fmov(d17, 1.1);
+  __ Fmov(d18, 1.5);
+  __ Fmov(d19, 1.9);
+  __ Fmov(d20, 2.5);
+  __ Fmov(d21, -1.5);
+  __ Fmov(d22, -2.5);
+  __ Fmov(d23, kFP32PositiveInfinity);
+  __ Fmov(d24, kFP32NegativeInfinity);
+  __ Fmov(d25, 0.0);
+  __ Fmov(d26, -0.0);
+
+  __ Frintn(d11, d16);
+  __ Frintn(d12, d17);
+  __ Frintn(d13, d18);
+  __ Frintn(d14, d19);
+  __ Frintn(d15, d20);
+  __ Frintn(d16, d21);
+  __ Frintn(d17, d22);
+  __ Frintn(d18, d23);
+  __ Frintn(d19, d24);
+  __ Frintn(d20, d25);
+  __ Frintn(d21, d26);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(1.0, s0);
+  ASSERT_EQUAL_FP32(1.0, s1);
+  ASSERT_EQUAL_FP32(2.0, s2);
+  ASSERT_EQUAL_FP32(2.0, s3);
+  ASSERT_EQUAL_FP32(2.0, s4);
+  ASSERT_EQUAL_FP32(-2.0, s5);
+  ASSERT_EQUAL_FP32(-2.0, s6);
+  ASSERT_EQUAL_FP32(kFP32PositiveInfinity, s7);
+  ASSERT_EQUAL_FP32(kFP32NegativeInfinity, s8);
+  ASSERT_EQUAL_FP32(0.0, s9);
+  ASSERT_EQUAL_FP32(-0.0, s10);
+  ASSERT_EQUAL_FP64(1.0, d11);
+  ASSERT_EQUAL_FP64(1.0, d12);
+  ASSERT_EQUAL_FP64(2.0, d13);
+  ASSERT_EQUAL_FP64(2.0, d14);
+  ASSERT_EQUAL_FP64(2.0, d15);
+  ASSERT_EQUAL_FP64(-2.0, d16);
+  ASSERT_EQUAL_FP64(-2.0, d17);
+  ASSERT_EQUAL_FP64(kFP64PositiveInfinity, d18);
+  ASSERT_EQUAL_FP64(kFP64NegativeInfinity, d19);
+  ASSERT_EQUAL_FP64(0.0, d20);
+  ASSERT_EQUAL_FP64(-0.0, d21);
+
+  TEARDOWN();
+}
+
+
+TEST(frintz) {
+  SETUP();
+
+  START();
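+  // Frintz rounds towards zero.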
+  __ Fmov(s16, 1.0);
+  __ Fmov(s17, 1.1);
+  __ Fmov(s18, 1.5);
+  __ Fmov(s19, 1.9);
+  __ Fmov(s20, 2.5);
+  __ Fmov(s21, -1.5);
+  __ Fmov(s22, -2.5);
+  __ Fmov(s23, kFP32PositiveInfinity);
+  __ Fmov(s24, kFP32NegativeInfinity);
+  __ Fmov(s25, 0.0);
+  __ Fmov(s26, -0.0);
+
+  __ Frintz(s0, s16);
+  __ Frintz(s1, s17);
+  __ Frintz(s2, s18);
+  __ Frintz(s3, s19);
+  __ Frintz(s4, s20);
+  __ Frintz(s5, s21);
+  __ Frintz(s6, s22);
+  __ Frintz(s7, s23);
+  __ Frintz(s8, s24);
+  __ Frintz(s9, s25);
+  __ Frintz(s10, s26);
+
+  __ Fmov(d16, 1.0);
+  __ Fmov(d17, 1.1);
+  __ Fmov(d18, 1.5);
+  __ Fmov(d19, 1.9);
+  __ Fmov(d20, 2.5);
+  __ Fmov(d21, -1.5);
+  __ Fmov(d22, -2.5);
+  __ Fmov(d23, kFP32PositiveInfinity);
+  __ Fmov(d24, kFP32NegativeInfinity);
+  __ Fmov(d25, 0.0);
+  __ Fmov(d26, -0.0);
+
+  __ Frintz(d11, d16);
+  __ Frintz(d12, d17);
+  __ Frintz(d13, d18);
+  __ Frintz(d14, d19);
+  __ Frintz(d15, d20);
+  __ Frintz(d16, d21);
+  __ Frintz(d17, d22);
+  __ Frintz(d18, d23);
+  __ Frintz(d19, d24);
+  __ Frintz(d20, d25);
+  __ Frintz(d21, d26);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP32(1.0, s0);
+  ASSERT_EQUAL_FP32(1.0, s1);
+  ASSERT_EQUAL_FP32(1.0, s2);
+  ASSERT_EQUAL_FP32(1.0, s3);
+  ASSERT_EQUAL_FP32(2.0, s4);
+  ASSERT_EQUAL_FP32(-1.0, s5);
+  ASSERT_EQUAL_FP32(-2.0, s6);
+  ASSERT_EQUAL_FP32(kFP32PositiveInfinity, s7);
+  ASSERT_EQUAL_FP32(kFP32NegativeInfinity, s8);
+  ASSERT_EQUAL_FP32(0.0, s9);
+  ASSERT_EQUAL_FP32(-0.0, s10);
+  ASSERT_EQUAL_FP64(1.0, d11);
+  ASSERT_EQUAL_FP64(1.0, d12);
+  ASSERT_EQUAL_FP64(1.0, d13);
+  ASSERT_EQUAL_FP64(1.0, d14);
+  ASSERT_EQUAL_FP64(2.0, d15);
+  ASSERT_EQUAL_FP64(-1.0, d16);
+  ASSERT_EQUAL_FP64(-2.0, d17);
+  ASSERT_EQUAL_FP64(kFP64PositiveInfinity, d18);
+  ASSERT_EQUAL_FP64(kFP64NegativeInfinity, d19);
+  ASSERT_EQUAL_FP64(0.0, d20);
+  ASSERT_EQUAL_FP64(-0.0, d21);
+
+  TEARDOWN();
+}
+
+
+TEST(fcvt) {
+  SETUP();
+
+  START();
+  __ Fmov(s16, 1.0);
+  __ Fmov(s17, 1.1);
+  __ Fmov(s18, 1.5);
+  __ Fmov(s19, 1.9);
+  __ Fmov(s20, 2.5);
+  __ Fmov(s21, -1.5);
+  __ Fmov(s22, -2.5);
+  __ Fmov(s23, kFP32PositiveInfinity);
+  __ Fmov(s24, kFP32NegativeInfinity);
+  __ Fmov(s25, 0.0);
+  __ Fmov(s26, -0.0);
+
+  __ Fcvt(d0, s16);
+  __ Fcvt(d1, s17);
+  __ Fcvt(d2, s18);
+  __ Fcvt(d3, s19);
+  __ Fcvt(d4, s20);
+  __ Fcvt(d5, s21);
+  __ Fcvt(d6, s22);
+  __ Fcvt(d7, s23);
+  __ Fcvt(d8, s24);
+  __ Fcvt(d9, s25);
+  __ Fcvt(d10, s26);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP64(1.0f, d0);
+  ASSERT_EQUAL_FP64(1.1f, d1);
+  ASSERT_EQUAL_FP64(1.5f, d2);
+  ASSERT_EQUAL_FP64(1.9f, d3);
+  ASSERT_EQUAL_FP64(2.5f, d4);
+  ASSERT_EQUAL_FP64(-1.5f, d5);
+  ASSERT_EQUAL_FP64(-2.5f, d6);
+  ASSERT_EQUAL_FP64(kFP64PositiveInfinity, d7);
+  ASSERT_EQUAL_FP64(kFP64NegativeInfinity, d8);
+  ASSERT_EQUAL_FP64(0.0f, d9);
+  ASSERT_EQUAL_FP64(-0.0f, d10);
+
+  TEARDOWN();
+}
+
+
+TEST(fcvtms) {
+  SETUP();
+
+  START();
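+  // Fcvtms converts to a signed integer, rounding towards minus infinity and
+  // saturating on overflow.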
+  __ Fmov(s0, 1.0);
+  __ Fmov(s1, 1.1);
+  __ Fmov(s2, 1.5);
+  __ Fmov(s3, -1.5);
+  __ Fmov(s4, kFP32PositiveInfinity);
+  __ Fmov(s5, kFP32NegativeInfinity);
+  __ Fmov(s6, 0x7fffff80);  // Largest float < INT32_MAX.
+  __ Fneg(s7, s6);          // Smallest float > INT32_MIN.
+  __ Fmov(d8, 1.0);
+  __ Fmov(d9, 1.1);
+  __ Fmov(d10, 1.5);
+  __ Fmov(d11, -1.5);
+  __ Fmov(d12, kFP64PositiveInfinity);
+  __ Fmov(d13, kFP64NegativeInfinity);
+  __ Fmov(d14, kWMaxInt - 1);
+  __ Fmov(d15, kWMinInt + 1);
+  __ Fmov(s17, 1.1);
+  __ Fmov(s18, 1.5);
+  __ Fmov(s19, -1.5);
+  __ Fmov(s20, kFP32PositiveInfinity);
+  __ Fmov(s21, kFP32NegativeInfinity);
+  __ Fmov(s22, 0x7fffff8000000000UL);   // Largest float < INT64_MAX.
+  __ Fneg(s23, s22);                    // Smallest float > INT64_MIN.
+  __ Fmov(d24, 1.1);
+  __ Fmov(d25, 1.5);
+  __ Fmov(d26, -1.5);
+  __ Fmov(d27, kFP64PositiveInfinity);
+  __ Fmov(d28, kFP64NegativeInfinity);
+  __ Fmov(d29, 0x7ffffffffffffc00UL);   // Largest double < INT64_MAX.
+  __ Fneg(d30, d29);                    // Smallest double > INT64_MIN.
+
+  __ Fcvtms(w0, s0);
+  __ Fcvtms(w1, s1);
+  __ Fcvtms(w2, s2);
+  __ Fcvtms(w3, s3);
+  __ Fcvtms(w4, s4);
+  __ Fcvtms(w5, s5);
+  __ Fcvtms(w6, s6);
+  __ Fcvtms(w7, s7);
+  __ Fcvtms(w8, d8);
+  __ Fcvtms(w9, d9);
+  __ Fcvtms(w10, d10);
+  __ Fcvtms(w11, d11);
+  __ Fcvtms(w12, d12);
+  __ Fcvtms(w13, d13);
+  __ Fcvtms(w14, d14);
+  __ Fcvtms(w15, d15);
+  __ Fcvtms(x17, s17);
+  __ Fcvtms(x18, s18);
+  __ Fcvtms(x19, s19);
+  __ Fcvtms(x20, s20);
+  __ Fcvtms(x21, s21);
+  __ Fcvtms(x22, s22);
+  __ Fcvtms(x23, s23);
+  __ Fcvtms(x24, d24);
+  __ Fcvtms(x25, d25);
+  __ Fcvtms(x26, d26);
+  __ Fcvtms(x27, d27);
+  __ Fcvtms(x28, d28);
+  __ Fcvtms(x29, d29);
+  __ Fcvtms(x30, d30);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(1, x0);
+  ASSERT_EQUAL_64(1, x1);
+  ASSERT_EQUAL_64(1, x2);
+  ASSERT_EQUAL_64(0xfffffffe, x3);
+  ASSERT_EQUAL_64(0x7fffffff, x4);
+  ASSERT_EQUAL_64(0x80000000, x5);
+  ASSERT_EQUAL_64(0x7fffff80, x6);
+  ASSERT_EQUAL_64(0x80000080, x7);
+  ASSERT_EQUAL_64(1, x8);
+  ASSERT_EQUAL_64(1, x9);
+  ASSERT_EQUAL_64(1, x10);
+  ASSERT_EQUAL_64(0xfffffffe, x11);
+  ASSERT_EQUAL_64(0x7fffffff, x12);
+  ASSERT_EQUAL_64(0x80000000, x13);
+  ASSERT_EQUAL_64(0x7ffffffe, x14);
+  ASSERT_EQUAL_64(0x80000001, x15);
+  ASSERT_EQUAL_64(1, x17);
+  ASSERT_EQUAL_64(1, x18);
+  ASSERT_EQUAL_64(0xfffffffffffffffeUL, x19);
+  ASSERT_EQUAL_64(0x7fffffffffffffffUL, x20);
+  ASSERT_EQUAL_64(0x8000000000000000UL, x21);
+  ASSERT_EQUAL_64(0x7fffff8000000000UL, x22);
+  ASSERT_EQUAL_64(0x8000008000000000UL, x23);
+  ASSERT_EQUAL_64(1, x24);
+  ASSERT_EQUAL_64(1, x25);
+  ASSERT_EQUAL_64(0xfffffffffffffffeUL, x26);
+  ASSERT_EQUAL_64(0x7fffffffffffffffUL, x27);
+  ASSERT_EQUAL_64(0x8000000000000000UL, x28);
+  ASSERT_EQUAL_64(0x7ffffffffffffc00UL, x29);
+  ASSERT_EQUAL_64(0x8000000000000400UL, x30);
+
+  TEARDOWN();
+}
+
+
+TEST(fcvtmu) {
+  SETUP();
+
+  START();
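+  // Fcvtmu converts to an unsigned integer, rounding towards minus infinity;
+  // negative inputs saturate to zero.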
+  __ Fmov(s0, 1.0);
+  __ Fmov(s1, 1.1);
+  __ Fmov(s2, 1.5);
+  __ Fmov(s3, -1.5);
+  __ Fmov(s4, kFP32PositiveInfinity);
+  __ Fmov(s5, kFP32NegativeInfinity);
+  __ Fmov(s6, 0x7fffff80);  // Largest float < INT32_MAX.
+  __ Fneg(s7, s6);          // Smallest float > INT32_MIN.
+  __ Fmov(d8, 1.0);
+  __ Fmov(d9, 1.1);
+  __ Fmov(d10, 1.5);
+  __ Fmov(d11, -1.5);
+  __ Fmov(d12, kFP64PositiveInfinity);
+  __ Fmov(d13, kFP64NegativeInfinity);
+  __ Fmov(d14, kWMaxInt - 1);
+  __ Fmov(d15, kWMinInt + 1);
+  __ Fmov(s17, 1.1);
+  __ Fmov(s18, 1.5);
+  __ Fmov(s19, -1.5);
+  __ Fmov(s20, kFP32PositiveInfinity);
+  __ Fmov(s21, kFP32NegativeInfinity);
+  __ Fmov(s22, 0x7fffff8000000000UL);   // Largest float < INT64_MAX.
+  __ Fneg(s23, s22);                    // Smallest float > INT64_MIN.
+  __ Fmov(d24, 1.1);
+  __ Fmov(d25, 1.5);
+  __ Fmov(d26, -1.5);
+  __ Fmov(d27, kFP64PositiveInfinity);
+  __ Fmov(d28, kFP64NegativeInfinity);
+  __ Fmov(d29, 0x7ffffffffffffc00UL);   // Largest double < INT64_MAX.
+  __ Fneg(d30, d29);                    // Smallest double > INT64_MIN.
+
+  __ Fcvtmu(w0, s0);
+  __ Fcvtmu(w1, s1);
+  __ Fcvtmu(w2, s2);
+  __ Fcvtmu(w3, s3);
+  __ Fcvtmu(w4, s4);
+  __ Fcvtmu(w5, s5);
+  __ Fcvtmu(w6, s6);
+  __ Fcvtmu(w7, s7);
+  __ Fcvtmu(w8, d8);
+  __ Fcvtmu(w9, d9);
+  __ Fcvtmu(w10, d10);
+  __ Fcvtmu(w11, d11);
+  __ Fcvtmu(w12, d12);
+  __ Fcvtmu(w13, d13);
+  __ Fcvtmu(w14, d14);
+  __ Fcvtmu(x17, s17);
+  __ Fcvtmu(x18, s18);
+  __ Fcvtmu(x19, s19);
+  __ Fcvtmu(x20, s20);
+  __ Fcvtmu(x21, s21);
+  __ Fcvtmu(x22, s22);
+  __ Fcvtmu(x23, s23);
+  __ Fcvtmu(x24, d24);
+  __ Fcvtmu(x25, d25);
+  __ Fcvtmu(x26, d26);
+  __ Fcvtmu(x27, d27);
+  __ Fcvtmu(x28, d28);
+  __ Fcvtmu(x29, d29);
+  __ Fcvtmu(x30, d30);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(1, x0);
+  ASSERT_EQUAL_64(1, x1);
+  ASSERT_EQUAL_64(1, x2);
+  ASSERT_EQUAL_64(0, x3);
+  ASSERT_EQUAL_64(0xffffffff, x4);
+  ASSERT_EQUAL_64(0, x5);
+  ASSERT_EQUAL_64(0x7fffff80, x6);
+  ASSERT_EQUAL_64(0, x7);
+  ASSERT_EQUAL_64(1, x8);
+  ASSERT_EQUAL_64(1, x9);
+  ASSERT_EQUAL_64(1, x10);
+  ASSERT_EQUAL_64(0, x11);
+  ASSERT_EQUAL_64(0xffffffff, x12);
+  ASSERT_EQUAL_64(0, x13);
+  ASSERT_EQUAL_64(0x7ffffffe, x14);
+  ASSERT_EQUAL_64(1, x17);
+  ASSERT_EQUAL_64(1, x18);
+  ASSERT_EQUAL_64(0x0UL, x19);
+  ASSERT_EQUAL_64(0xffffffffffffffffUL, x20);
+  ASSERT_EQUAL_64(0x0UL, x21);
+  ASSERT_EQUAL_64(0x7fffff8000000000UL, x22);
+  ASSERT_EQUAL_64(0x0UL, x23);
+  ASSERT_EQUAL_64(1, x24);
+  ASSERT_EQUAL_64(1, x25);
+  ASSERT_EQUAL_64(0x0UL, x26);
+  ASSERT_EQUAL_64(0xffffffffffffffffUL, x27);
+  ASSERT_EQUAL_64(0x0UL, x28);
+  ASSERT_EQUAL_64(0x7ffffffffffffc00UL, x29);
+  ASSERT_EQUAL_64(0x0UL, x30);
+
+  TEARDOWN();
+}
+
+
+TEST(fcvtns) {
+  SETUP();
+
+  START();
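+  // Fcvtns converts to a signed integer, rounding to the nearest value with
+  // ties to even, and saturates on overflow.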
+  __ Fmov(s0, 1.0);
+  __ Fmov(s1, 1.1);
+  __ Fmov(s2, 1.5);
+  __ Fmov(s3, -1.5);
+  __ Fmov(s4, kFP32PositiveInfinity);
+  __ Fmov(s5, kFP32NegativeInfinity);
+  __ Fmov(s6, 0x7fffff80);  // Largest float < INT32_MAX.
+  __ Fneg(s7, s6);          // Smallest float > INT32_MIN.
+  __ Fmov(d8, 1.0);
+  __ Fmov(d9, 1.1);
+  __ Fmov(d10, 1.5);
+  __ Fmov(d11, -1.5);
+  __ Fmov(d12, kFP64PositiveInfinity);
+  __ Fmov(d13, kFP64NegativeInfinity);
+  __ Fmov(d14, kWMaxInt - 1);
+  __ Fmov(d15, kWMinInt + 1);
+  __ Fmov(s17, 1.1);
+  __ Fmov(s18, 1.5);
+  __ Fmov(s19, -1.5);
+  __ Fmov(s20, kFP32PositiveInfinity);
+  __ Fmov(s21, kFP32NegativeInfinity);
+  __ Fmov(s22, 0x7fffff8000000000UL);   // Largest float < INT64_MAX.
+  __ Fneg(s23, s22);                    // Smallest float > INT64_MIN.
+  __ Fmov(d24, 1.1);
+  __ Fmov(d25, 1.5);
+  __ Fmov(d26, -1.5);
+  __ Fmov(d27, kFP64PositiveInfinity);
+  __ Fmov(d28, kFP64NegativeInfinity);
+  __ Fmov(d29, 0x7ffffffffffffc00UL);   // Largest double < INT64_MAX.
+  __ Fneg(d30, d29);                    // Smallest double > INT64_MIN.
+
+  __ Fcvtns(w0, s0);
+  __ Fcvtns(w1, s1);
+  __ Fcvtns(w2, s2);
+  __ Fcvtns(w3, s3);
+  __ Fcvtns(w4, s4);
+  __ Fcvtns(w5, s5);
+  __ Fcvtns(w6, s6);
+  __ Fcvtns(w7, s7);
+  __ Fcvtns(w8, d8);
+  __ Fcvtns(w9, d9);
+  __ Fcvtns(w10, d10);
+  __ Fcvtns(w11, d11);
+  __ Fcvtns(w12, d12);
+  __ Fcvtns(w13, d13);
+  __ Fcvtns(w14, d14);
+  __ Fcvtns(w15, d15);
+  __ Fcvtns(x17, s17);
+  __ Fcvtns(x18, s18);
+  __ Fcvtns(x19, s19);
+  __ Fcvtns(x20, s20);
+  __ Fcvtns(x21, s21);
+  __ Fcvtns(x22, s22);
+  __ Fcvtns(x23, s23);
+  __ Fcvtns(x24, d24);
+  __ Fcvtns(x25, d25);
+  __ Fcvtns(x26, d26);
+  __ Fcvtns(x27, d27);
+  __ Fcvtns(x28, d28);
+  __ Fcvtns(x29, d29);
+  __ Fcvtns(x30, d30);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(1, x0);
+  ASSERT_EQUAL_64(1, x1);
+  ASSERT_EQUAL_64(2, x2);
+  ASSERT_EQUAL_64(0xfffffffe, x3);
+  ASSERT_EQUAL_64(0x7fffffff, x4);
+  ASSERT_EQUAL_64(0x80000000, x5);
+  ASSERT_EQUAL_64(0x7fffff80, x6);
+  ASSERT_EQUAL_64(0x80000080, x7);
+  ASSERT_EQUAL_64(1, x8);
+  ASSERT_EQUAL_64(1, x9);
+  ASSERT_EQUAL_64(2, x10);
+  ASSERT_EQUAL_64(0xfffffffe, x11);
+  ASSERT_EQUAL_64(0x7fffffff, x12);
+  ASSERT_EQUAL_64(0x80000000, x13);
+  ASSERT_EQUAL_64(0x7ffffffe, x14);
+  ASSERT_EQUAL_64(0x80000001, x15);
+  ASSERT_EQUAL_64(1, x17);
+  ASSERT_EQUAL_64(2, x18);
+  ASSERT_EQUAL_64(0xfffffffffffffffeUL, x19);
+  ASSERT_EQUAL_64(0x7fffffffffffffffUL, x20);
+  ASSERT_EQUAL_64(0x8000000000000000UL, x21);
+  ASSERT_EQUAL_64(0x7fffff8000000000UL, x22);
+  ASSERT_EQUAL_64(0x8000008000000000UL, x23);
+  ASSERT_EQUAL_64(1, x24);
+  ASSERT_EQUAL_64(2, x25);
+  ASSERT_EQUAL_64(0xfffffffffffffffeUL, x26);
+  ASSERT_EQUAL_64(0x7fffffffffffffffUL, x27);
+  ASSERT_EQUAL_64(0x8000000000000000UL, x28);
+  ASSERT_EQUAL_64(0x7ffffffffffffc00UL, x29);
+  ASSERT_EQUAL_64(0x8000000000000400UL, x30);
+
+  TEARDOWN();
+}
+
+
+TEST(fcvtnu) {
+  SETUP();
+
+  START();
+  __ Fmov(s0, 1.0);
+  __ Fmov(s1, 1.1);
+  __ Fmov(s2, 1.5);
+  __ Fmov(s3, -1.5);
+  __ Fmov(s4, kFP32PositiveInfinity);
+  __ Fmov(s5, kFP32NegativeInfinity);
+  __ Fmov(s6, 0xffffff00);  // Largest float < UINT32_MAX.
+  __ Fmov(d8, 1.0);
+  __ Fmov(d9, 1.1);
+  __ Fmov(d10, 1.5);
+  __ Fmov(d11, -1.5);
+  __ Fmov(d12, kFP64PositiveInfinity);
+  __ Fmov(d13, kFP64NegativeInfinity);
+  __ Fmov(d14, 0xfffffffe);
+  __ Fmov(s16, 1.0);
+  __ Fmov(s17, 1.1);
+  __ Fmov(s18, 1.5);
+  __ Fmov(s19, -1.5);
+  __ Fmov(s20, kFP32PositiveInfinity);
+  __ Fmov(s21, kFP32NegativeInfinity);
+  __ Fmov(s22, 0xffffff0000000000UL);  // Largest float < UINT64_MAX.
+  __ Fmov(d24, 1.1);
+  __ Fmov(d25, 1.5);
+  __ Fmov(d26, -1.5);
+  __ Fmov(d27, kFP64PositiveInfinity);
+  __ Fmov(d28, kFP64NegativeInfinity);
+  __ Fmov(d29, 0xfffffffffffff800UL);  // Largest double < UINT64_MAX.
+  __ Fmov(s30, 0x100000000UL);
+
+  __ Fcvtnu(w0, s0);
+  __ Fcvtnu(w1, s1);
+  __ Fcvtnu(w2, s2);
+  __ Fcvtnu(w3, s3);
+  __ Fcvtnu(w4, s4);
+  __ Fcvtnu(w5, s5);
+  __ Fcvtnu(w6, s6);
+  __ Fcvtnu(w8, d8);
+  __ Fcvtnu(w9, d9);
+  __ Fcvtnu(w10, d10);
+  __ Fcvtnu(w11, d11);
+  __ Fcvtnu(w12, d12);
+  __ Fcvtnu(w13, d13);
+  __ Fcvtnu(w14, d14);
+  __ Fcvtnu(w15, d15);
+  __ Fcvtnu(x16, s16);
+  __ Fcvtnu(x17, s17);
+  __ Fcvtnu(x18, s18);
+  __ Fcvtnu(x19, s19);
+  __ Fcvtnu(x20, s20);
+  __ Fcvtnu(x21, s21);
+  __ Fcvtnu(x22, s22);
+  __ Fcvtnu(x24, d24);
+  __ Fcvtnu(x25, d25);
+  __ Fcvtnu(x26, d26);
+  __ Fcvtnu(x27, d27);
+  __ Fcvtnu(x28, d28);
+  __ Fcvtnu(x29, d29);
+  __ Fcvtnu(w30, s30);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(1, x0);
+  ASSERT_EQUAL_64(1, x1);
+  ASSERT_EQUAL_64(2, x2);
+  ASSERT_EQUAL_64(0, x3);
+  ASSERT_EQUAL_64(0xffffffff, x4);
+  ASSERT_EQUAL_64(0, x5);
+  ASSERT_EQUAL_64(0xffffff00, x6);
+  ASSERT_EQUAL_64(1, x8);
+  ASSERT_EQUAL_64(1, x9);
+  ASSERT_EQUAL_64(2, x10);
+  ASSERT_EQUAL_64(0, x11);
+  ASSERT_EQUAL_64(0xffffffff, x12);
+  ASSERT_EQUAL_64(0, x13);
+  ASSERT_EQUAL_64(0xfffffffe, x14);
+  ASSERT_EQUAL_64(1, x16);
+  ASSERT_EQUAL_64(1, x17);
+  ASSERT_EQUAL_64(2, x18);
+  ASSERT_EQUAL_64(0, x19);
+  ASSERT_EQUAL_64(0xffffffffffffffffUL, x20);
+  ASSERT_EQUAL_64(0, x21);
+  ASSERT_EQUAL_64(0xffffff0000000000UL, x22);
+  ASSERT_EQUAL_64(1, x24);
+  ASSERT_EQUAL_64(2, x25);
+  ASSERT_EQUAL_64(0, x26);
+  ASSERT_EQUAL_64(0xffffffffffffffffUL, x27);
+  ASSERT_EQUAL_64(0, x28);
+  ASSERT_EQUAL_64(0xfffffffffffff800UL, x29);
+  ASSERT_EQUAL_64(0xffffffff, x30);
+
+  TEARDOWN();
+}
+
+
+TEST(fcvtzs) {
+  SETUP();
+
+  START();
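+  // Fcvtzs converts to a signed integer, rounding towards zero and saturating
+  // on overflow.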
+  __ Fmov(s0, 1.0);
+  __ Fmov(s1, 1.1);
+  __ Fmov(s2, 1.5);
+  __ Fmov(s3, -1.5);
+  __ Fmov(s4, kFP32PositiveInfinity);
+  __ Fmov(s5, kFP32NegativeInfinity);
+  __ Fmov(s6, 0x7fffff80);  // Largest float < INT32_MAX.
+  __ Fneg(s7, s6);          // Smallest float > INT32_MIN.
+  __ Fmov(d8, 1.0);
+  __ Fmov(d9, 1.1);
+  __ Fmov(d10, 1.5);
+  __ Fmov(d11, -1.5);
+  __ Fmov(d12, kFP64PositiveInfinity);
+  __ Fmov(d13, kFP64NegativeInfinity);
+  __ Fmov(d14, kWMaxInt - 1);
+  __ Fmov(d15, kWMinInt + 1);
+  __ Fmov(s17, 1.1);
+  __ Fmov(s18, 1.5);
+  __ Fmov(s19, -1.5);
+  __ Fmov(s20, kFP32PositiveInfinity);
+  __ Fmov(s21, kFP32NegativeInfinity);
+  __ Fmov(s22, 0x7fffff8000000000UL);   // Largest float < INT64_MAX.
+  __ Fneg(s23, s22);                    // Smallest float > INT64_MIN.
+  __ Fmov(d24, 1.1);
+  __ Fmov(d25, 1.5);
+  __ Fmov(d26, -1.5);
+  __ Fmov(d27, kFP64PositiveInfinity);
+  __ Fmov(d28, kFP64NegativeInfinity);
+  __ Fmov(d29, 0x7ffffffffffffc00UL);   // Largest double < INT64_MAX.
+  __ Fneg(d30, d29);                    // Smallest double > INT64_MIN.
+
+  __ Fcvtzs(w0, s0);
+  __ Fcvtzs(w1, s1);
+  __ Fcvtzs(w2, s2);
+  __ Fcvtzs(w3, s3);
+  __ Fcvtzs(w4, s4);
+  __ Fcvtzs(w5, s5);
+  __ Fcvtzs(w6, s6);
+  __ Fcvtzs(w7, s7);
+  __ Fcvtzs(w8, d8);
+  __ Fcvtzs(w9, d9);
+  __ Fcvtzs(w10, d10);
+  __ Fcvtzs(w11, d11);
+  __ Fcvtzs(w12, d12);
+  __ Fcvtzs(w13, d13);
+  __ Fcvtzs(w14, d14);
+  __ Fcvtzs(w15, d15);
+  __ Fcvtzs(x17, s17);
+  __ Fcvtzs(x18, s18);
+  __ Fcvtzs(x19, s19);
+  __ Fcvtzs(x20, s20);
+  __ Fcvtzs(x21, s21);
+  __ Fcvtzs(x22, s22);
+  __ Fcvtzs(x23, s23);
+  __ Fcvtzs(x24, d24);
+  __ Fcvtzs(x25, d25);
+  __ Fcvtzs(x26, d26);
+  __ Fcvtzs(x27, d27);
+  __ Fcvtzs(x28, d28);
+  __ Fcvtzs(x29, d29);
+  __ Fcvtzs(x30, d30);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(1, x0);
+  ASSERT_EQUAL_64(1, x1);
+  ASSERT_EQUAL_64(1, x2);
+  ASSERT_EQUAL_64(0xffffffff, x3);
+  ASSERT_EQUAL_64(0x7fffffff, x4);
+  ASSERT_EQUAL_64(0x80000000, x5);
+  ASSERT_EQUAL_64(0x7fffff80, x6);
+  ASSERT_EQUAL_64(0x80000080, x7);
+  ASSERT_EQUAL_64(1, x8);
+  ASSERT_EQUAL_64(1, x9);
+  ASSERT_EQUAL_64(1, x10);
+  ASSERT_EQUAL_64(0xffffffff, x11);
+  ASSERT_EQUAL_64(0x7fffffff, x12);
+  ASSERT_EQUAL_64(0x80000000, x13);
+  ASSERT_EQUAL_64(0x7ffffffe, x14);
+  ASSERT_EQUAL_64(0x80000001, x15);
+  ASSERT_EQUAL_64(1, x17);
+  ASSERT_EQUAL_64(1, x18);
+  ASSERT_EQUAL_64(0xffffffffffffffffUL, x19);
+  ASSERT_EQUAL_64(0x7fffffffffffffffUL, x20);
+  ASSERT_EQUAL_64(0x8000000000000000UL, x21);
+  ASSERT_EQUAL_64(0x7fffff8000000000UL, x22);
+  ASSERT_EQUAL_64(0x8000008000000000UL, x23);
+  ASSERT_EQUAL_64(1, x24);
+  ASSERT_EQUAL_64(1, x25);
+  ASSERT_EQUAL_64(0xffffffffffffffffUL, x26);
+  ASSERT_EQUAL_64(0x7fffffffffffffffUL, x27);
+  ASSERT_EQUAL_64(0x8000000000000000UL, x28);
+  ASSERT_EQUAL_64(0x7ffffffffffffc00UL, x29);
+  ASSERT_EQUAL_64(0x8000000000000400UL, x30);
+
+  TEARDOWN();
+}
+
+TEST(fcvtzu) {
+  SETUP();
+
+  START();
+  __ Fmov(s0, 1.0);
+  __ Fmov(s1, 1.1);
+  __ Fmov(s2, 1.5);
+  __ Fmov(s3, -1.5);
+  __ Fmov(s4, kFP32PositiveInfinity);
+  __ Fmov(s5, kFP32NegativeInfinity);
+  __ Fmov(s6, 0x7fffff80);  // Largest float < INT32_MAX.
+  __ Fneg(s7, s6);          // Smallest float > INT32_MIN.
+  __ Fmov(d8, 1.0);
+  __ Fmov(d9, 1.1);
+  __ Fmov(d10, 1.5);
+  __ Fmov(d11, -1.5);
+  __ Fmov(d12, kFP64PositiveInfinity);
+  __ Fmov(d13, kFP64NegativeInfinity);
+  __ Fmov(d14, kWMaxInt - 1);
+  __ Fmov(d15, kWMinInt + 1);
+  __ Fmov(s17, 1.1);
+  __ Fmov(s18, 1.5);
+  __ Fmov(s19, -1.5);
+  __ Fmov(s20, kFP32PositiveInfinity);
+  __ Fmov(s21, kFP32NegativeInfinity);
+  __ Fmov(s22, 0x7fffff8000000000UL);   // Largest float < INT64_MAX.
+  __ Fneg(s23, s22);                    // Smallest float > INT64_MIN.
+  __ Fmov(d24, 1.1);
+  __ Fmov(d25, 1.5);
+  __ Fmov(d26, -1.5);
+  __ Fmov(d27, kFP64PositiveInfinity);
+  __ Fmov(d28, kFP64NegativeInfinity);
+  __ Fmov(d29, 0x7ffffffffffffc00UL);   // Largest double < INT64_MAX.
+  __ Fneg(d30, d29);                    // Smallest double > INT64_MIN.
+
+  __ Fcvtzu(w0, s0);
+  __ Fcvtzu(w1, s1);
+  __ Fcvtzu(w2, s2);
+  __ Fcvtzu(w3, s3);
+  __ Fcvtzu(w4, s4);
+  __ Fcvtzu(w5, s5);
+  __ Fcvtzu(w6, s6);
+  __ Fcvtzu(w7, s7);
+  __ Fcvtzu(w8, d8);
+  __ Fcvtzu(w9, d9);
+  __ Fcvtzu(w10, d10);
+  __ Fcvtzu(w11, d11);
+  __ Fcvtzu(w12, d12);
+  __ Fcvtzu(w13, d13);
+  __ Fcvtzu(w14, d14);
+  __ Fcvtzu(x17, s17);
+  __ Fcvtzu(x18, s18);
+  __ Fcvtzu(x19, s19);
+  __ Fcvtzu(x20, s20);
+  __ Fcvtzu(x21, s21);
+  __ Fcvtzu(x22, s22);
+  __ Fcvtzu(x23, s23);
+  __ Fcvtzu(x24, d24);
+  __ Fcvtzu(x25, d25);
+  __ Fcvtzu(x26, d26);
+  __ Fcvtzu(x27, d27);
+  __ Fcvtzu(x28, d28);
+  __ Fcvtzu(x29, d29);
+  __ Fcvtzu(x30, d30);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(1, x0);
+  ASSERT_EQUAL_64(1, x1);
+  ASSERT_EQUAL_64(1, x2);
+  ASSERT_EQUAL_64(0, x3);
+  ASSERT_EQUAL_64(0xffffffff, x4);
+  ASSERT_EQUAL_64(0, x5);
+  ASSERT_EQUAL_64(0x7fffff80, x6);
+  ASSERT_EQUAL_64(0, x7);
+  ASSERT_EQUAL_64(1, x8);
+  ASSERT_EQUAL_64(1, x9);
+  ASSERT_EQUAL_64(1, x10);
+  ASSERT_EQUAL_64(0, x11);
+  ASSERT_EQUAL_64(0xffffffff, x12);
+  ASSERT_EQUAL_64(0, x13);
+  ASSERT_EQUAL_64(0x7ffffffe, x14);
+  ASSERT_EQUAL_64(1, x17);
+  ASSERT_EQUAL_64(1, x18);
+  ASSERT_EQUAL_64(0x0UL, x19);
+  ASSERT_EQUAL_64(0xffffffffffffffffUL, x20);
+  ASSERT_EQUAL_64(0x0UL, x21);
+  ASSERT_EQUAL_64(0x7fffff8000000000UL, x22);
+  ASSERT_EQUAL_64(0x0UL, x23);
+  ASSERT_EQUAL_64(1, x24);
+  ASSERT_EQUAL_64(1, x25);
+  ASSERT_EQUAL_64(0x0UL, x26);
+  ASSERT_EQUAL_64(0xffffffffffffffffUL, x27);
+  ASSERT_EQUAL_64(0x0UL, x28);
+  ASSERT_EQUAL_64(0x7ffffffffffffc00UL, x29);
+  ASSERT_EQUAL_64(0x0UL, x30);
+
+  TEARDOWN();
+}
+
+
+TEST(scvtf_ucvtf) {
+  SETUP();
+
+  START();
+  __ Mov(w0, 42424242);
+  __ Mov(x1, 0x7ffffffffffffc00UL);  // Largest double < INT64_MAX.
+  __ Mov(w2, 0xffffffff);            // 32-bit -1.
+  __ Mov(x3, 0xffffffffffffffffUL);  // 64-bit -1.
+  __ Mov(x4, 0xfffffffffffff800UL);  // Largest double < UINT64_MAX.
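+  // Scvtf interprets the source as signed and Ucvtf as unsigned, so a
+  // register holding all ones converts to -1.0 or to a large positive value
+  // respectively.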
+  __ Scvtf(d0, w0);
+  __ Scvtf(d1, x1);
+  __ Scvtf(d2, w2);
+  __ Scvtf(d3, x2);
+  __ Scvtf(d4, x3);
+  __ Scvtf(d5, x4);
+  __ Ucvtf(d6, w0);
+  __ Ucvtf(d7, x1);
+  __ Ucvtf(d8, w2);
+  __ Ucvtf(d9, x2);
+  __ Ucvtf(d10, x4);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP64(42424242.0, d0);
+  ASSERT_EQUAL_FP64(9223372036854774784.0, d1);
+  ASSERT_EQUAL_FP64(-1.0, d2);
+  ASSERT_EQUAL_FP64(4294967295.0, d3);
+  ASSERT_EQUAL_FP64(-1.0, d4);
+  ASSERT_EQUAL_FP64(-2048.0, d5);
+  ASSERT_EQUAL_FP64(42424242.0, d6);
+  ASSERT_EQUAL_FP64(9223372036854774784.0, d7);
+  ASSERT_EQUAL_FP64(4294967295.0, d8);
+  ASSERT_EQUAL_FP64(4294967295.0, d9);
+  ASSERT_EQUAL_FP64(18446744073709549568.0, d10);
+
+  TEARDOWN();
+}
+
+
+TEST(scvtf_ucvtf_fixed) {
+  SETUP();
+
+  START();
+  __ Mov(x0, 0);
+  __ Mov(x1, 0x0000000000010000UL);
+  __ Mov(x2, 0x7fffffffffff0000UL);
+  __ Mov(x3, 0x8000000000000000UL);
+  __ Mov(x4, 0xffffffffffff0000UL);
+  __ Mov(x5, 0x0000000100000000UL);
+  __ Mov(x6, 0x7fffffff00000000UL);
+  __ Mov(x7, 0xffffffff00000000UL);
+  __ Mov(x8, 0x1000000000000000UL);
+  __ Mov(x9, 0x7000000000000000UL);
+  __ Mov(x10, 0xf000000000000000UL);
+
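+  // The last argument is the number of fractional bits: the integer source is
+  // scaled by 2^-fbits as part of the conversion.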
+  __ Scvtf(d0, x0, 16);
+  __ Scvtf(d1, x1, 16);
+  __ Scvtf(d2, x2, 16);
+  __ Scvtf(d3, x3, 16);
+  __ Scvtf(d4, x4, 16);
+  __ Scvtf(d5, x0, 32);
+  __ Scvtf(d6, x5, 32);
+  __ Scvtf(d7, x6, 32);
+  __ Scvtf(d8, x3, 32);
+  __ Scvtf(d9, x7, 32);
+  __ Scvtf(d10, x0, 60);
+  __ Scvtf(d11, x8, 60);
+  __ Scvtf(d12, x9, 60);
+  __ Scvtf(d13, x3, 60);
+  __ Scvtf(d14, x10, 60);
+  __ Ucvtf(d15, x0, 16);
+  __ Ucvtf(d16, x1, 16);
+  __ Ucvtf(d17, x2, 16);
+  __ Ucvtf(d18, x3, 16);
+  __ Ucvtf(d19, x4, 16);
+  __ Ucvtf(d20, x0, 32);
+  __ Ucvtf(d21, x5, 32);
+  __ Ucvtf(d22, x6, 32);
+  __ Ucvtf(d23, x3, 32);
+  __ Ucvtf(d24, x7, 32);
+  __ Ucvtf(d25, x0, 60);
+  __ Ucvtf(d26, x8, 60);
+  __ Ucvtf(d27, x9, 60);
+  __ Ucvtf(d28, x3, 60);
+  __ Ucvtf(d29, x10, 60);
+
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_FP64(0.0, d0);
+  ASSERT_EQUAL_FP64(1.0, d1);
+  ASSERT_EQUAL_FP64(140737488355327.0, d2);
+  ASSERT_EQUAL_FP64(-140737488355328.0, d3);
+  ASSERT_EQUAL_FP64(-1.0, d4);
+  ASSERT_EQUAL_FP64(0.0, d5);
+  ASSERT_EQUAL_FP64(1.0, d6);
+  ASSERT_EQUAL_FP64(2147483647.0, d7);
+  ASSERT_EQUAL_FP64(-2147483648.0, d8);
+  ASSERT_EQUAL_FP64(-1.0, d9);
+  ASSERT_EQUAL_FP64(0.0, d10);
+  ASSERT_EQUAL_FP64(1.0, d11);
+  ASSERT_EQUAL_FP64(7.0, d12);
+  ASSERT_EQUAL_FP64(-8.0, d13);
+  ASSERT_EQUAL_FP64(-1.0, d14);
+
+  ASSERT_EQUAL_FP64(0.0, d15);
+  ASSERT_EQUAL_FP64(1.0, d16);
+  ASSERT_EQUAL_FP64(140737488355327.0, d17);
+  ASSERT_EQUAL_FP64(140737488355328.0, d18);
+  ASSERT_EQUAL_FP64(281474976710655.0, d19);
+  ASSERT_EQUAL_FP64(0.0, d20);
+  ASSERT_EQUAL_FP64(1.0, d21);
+  ASSERT_EQUAL_FP64(2147483647.0, d22);
+  ASSERT_EQUAL_FP64(2147483648.0, d23);
+  ASSERT_EQUAL_FP64(4294967295.0, d24);
+  ASSERT_EQUAL_FP64(0.0, d25);
+  ASSERT_EQUAL_FP64(1.0, d26);
+  ASSERT_EQUAL_FP64(7.0, d27);
+  ASSERT_EQUAL_FP64(8.0, d28);
+  ASSERT_EQUAL_FP64(15.0, d29);
+
+  TEARDOWN();
+}
+
+
+TEST(system_mrs) {
+  SETUP();
+
+  START();
+  __ Mov(w0, 0);
+  __ Mov(w1, 1);
+  __ Mov(w2, 0x80000000);
+
+  // Set the Z and C flags.
+  __ Cmp(w0, w0);
+  __ Mrs(x3, NZCV);
+
+  // Set the N flag.
+  __ Cmp(w0, w1);
+  __ Mrs(x4, NZCV);
+
+  // Set the Z, C and V flags.
+  __ Add(w0, w2, w2, SetFlags);
+  __ Mrs(x5, NZCV);
+  END();
+
+  RUN();
+
+  // TODO: The assertions below should be ASSERT_EQUAL_64(flag, X register),
+  // but the flag (an enum) would be sign-extended to 64 bits because the
+  // assertion's argument type is int64_t.
+  ASSERT_EQUAL_32(ZCFlag, w3);
+  ASSERT_EQUAL_32(NFlag, w4);
+  ASSERT_EQUAL_32(ZCVFlag, w5);
+
+  TEARDOWN();
+}
+
+
+TEST(system_msr) {
+  SETUP();
+
+  START();
+  __ Mov(w0, 0);
+  __ Mov(w1, 0x7fffffff);
+
+  __ Mov(x7, 0);
+
+  __ Mov(x10, NVFlag);
+  __ Cmp(w0, w0);     // Set Z and C.
+  __ Msr(NZCV, x10);  // Set N and V.
+  // The Msr should have overwritten every flag set by the Cmp.
+  __ Cinc(x7, x7, mi);  // N
+  __ Cinc(x7, x7, ne);  // !Z
+  __ Cinc(x7, x7, lo);  // !C
+  __ Cinc(x7, x7, vs);  // V
+
+  __ Mov(x10, ZCFlag);
+  __ Cmn(w1, w1);     // Set N and V.
+  __ Msr(NZCV, x10);  // Set Z and C.
+  // The Msr should have overwritten every flag set by the Cmn.
+  __ Cinc(x7, x7, pl);  // !N
+  __ Cinc(x7, x7, eq);  // Z
+  __ Cinc(x7, x7, hs);  // C
+  __ Cinc(x7, x7, vc);  // !V
+
+  END();
+
+  RUN();
+
+  // We should have incremented x7 (from 0) exactly 8 times.
+  ASSERT_EQUAL_64(8, x7);
+
+  TEARDOWN();
+}
+
+
+TEST(system_nop) {
+  SETUP();
+  RegisterDump before;
+
+  START();
+  before.Dump(&masm);
+  __ Nop();
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_REGISTERS(before);
+  ASSERT_EQUAL_NZCV(before.flags_nzcv());
+
+  TEARDOWN();
+}
+
+
+TEST(zero_dest) {
+  SETUP();
+  RegisterDump before;
+
+  START();
+  // Preserve the stack pointer, in case we clobber it.
+  __ Mov(x30, sp);
+  // Initialize the other registers used in this test.
+  uint64_t literal_base = 0x0100001000100101UL;
+  __ Mov(x0, 0);
+  __ Mov(x1, literal_base);
+  for (unsigned i = 2; i < x30.code(); i++) {
+    __ Add(Register::XRegFromCode(i), Register::XRegFromCode(i-1), x1);
+  }
+  before.Dump(&masm);
+
+  // All of these instructions should be NOPs in these forms, but have
+  // alternate forms which can write into the stack pointer.
+  __ add(xzr, x0, x1);
+  __ add(xzr, x1, xzr);
+  __ add(xzr, xzr, x1);
+
+  __ and_(xzr, x0, x2);
+  __ and_(xzr, x2, xzr);
+  __ and_(xzr, xzr, x2);
+
+  __ bic(xzr, x0, x3);
+  __ bic(xzr, x3, xzr);
+  __ bic(xzr, xzr, x3);
+
+  __ eon(xzr, x0, x4);
+  __ eon(xzr, x4, xzr);
+  __ eon(xzr, xzr, x4);
+
+  __ eor(xzr, x0, x5);
+  __ eor(xzr, x5, xzr);
+  __ eor(xzr, xzr, x5);
+
+  __ orr(xzr, x0, x6);
+  __ orr(xzr, x6, xzr);
+  __ orr(xzr, xzr, x6);
+
+  __ sub(xzr, x0, x7);
+  __ sub(xzr, x7, xzr);
+  __ sub(xzr, xzr, x7);
+
+  // Swap the saved stack pointer with the real one. If sp was written
+  // during the test, it will show up in x30. This is done because the test
+  // framework assumes that sp will be valid at the end of the test.
+  __ Mov(x29, x30);
+  __ Mov(x30, sp);
+  __ Mov(sp, x29);
+  // We used x29 as a scratch register, so reset it to make sure it doesn't
+  // trigger a test failure.
+  __ Add(x29, x28, x1);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_REGISTERS(before);
+  ASSERT_EQUAL_NZCV(before.flags_nzcv());
+
+  TEARDOWN();
+}
+
+
+TEST(zero_dest_setflags) {
+  SETUP();
+  RegisterDump before;
+
+  START();
+  // Preserve the stack pointer, in case we clobber it.
+  __ Mov(x30, sp);
+  // Initialize the other registers used in this test.
+  uint64_t literal_base = 0x0100001000100101UL;
+  __ Mov(x0, 0);
+  __ Mov(x1, literal_base);
+  for (int i = 2; i < 30; i++) {
+    __ Add(Register::XRegFromCode(i), Register::XRegFromCode(i-1), x1);
+  }
+  before.Dump(&masm);
+
+  // All of these instructions should only write to the flags in these forms,
+  // but have alternate forms which can write into the stack pointer.
+  __ add(xzr, x0, Operand(x1, UXTX), SetFlags);
+  __ add(xzr, x1, Operand(xzr, UXTX), SetFlags);
+  __ add(xzr, x1, 1234, SetFlags);
+  __ add(xzr, x0, x1, SetFlags);
+  __ add(xzr, x1, xzr, SetFlags);
+  __ add(xzr, xzr, x1, SetFlags);
+
+  __ and_(xzr, x2, ~0xf, SetFlags);
+  __ and_(xzr, xzr, ~0xf, SetFlags);
+  __ and_(xzr, x0, x2, SetFlags);
+  __ and_(xzr, x2, xzr, SetFlags);
+  __ and_(xzr, xzr, x2, SetFlags);
+
+  __ bic(xzr, x3, ~0xf, SetFlags);
+  __ bic(xzr, xzr, ~0xf, SetFlags);
+  __ bic(xzr, x0, x3, SetFlags);
+  __ bic(xzr, x3, xzr, SetFlags);
+  __ bic(xzr, xzr, x3, SetFlags);
+
+  __ sub(xzr, x0, Operand(x3, UXTX), SetFlags);
+  __ sub(xzr, x3, Operand(xzr, UXTX), SetFlags);
+  __ sub(xzr, x3, 1234, SetFlags);
+  __ sub(xzr, x0, x3, SetFlags);
+  __ sub(xzr, x3, xzr, SetFlags);
+  __ sub(xzr, xzr, x3, SetFlags);
+
+  // Swap the saved stack pointer with the real one. If sp was written
+  // during the test, it will show up in x30. This is done because the test
+  // framework assumes that sp will be valid at the end of the test.
+  __ Mov(x29, x30);
+  __ Mov(x30, sp);
+  __ Mov(sp, x29);
+  // We used x29 as a scratch register, so reset it to make sure it doesn't
+  // trigger a test failure.
+  __ Add(x29, x28, x1);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_REGISTERS(before);
+
+  TEARDOWN();
+}
+
+
+TEST(register_bit) {
+  // No code generation takes place in this test, so there is no need for the
+  // usual SETUP() and TEARDOWN().
+
+  // Simple tests.
+  assert(x0.Bit() == (1UL << 0));
+  assert(x1.Bit() == (1UL << 1));
+  assert(x10.Bit() == (1UL << 10));
+
+  // AAPCS64 definitions.
+  assert(lr.Bit() == (1UL << kLinkRegCode));
+
+  // Fixed (hardware) definitions.
+  assert(xzr.Bit() == (1UL << kZeroRegCode));
+
+  // Internal ABI definitions.
+  assert(sp.Bit() == (1UL << kSPRegInternalCode));
+  assert(sp.Bit() != xzr.Bit());
+
+  // xn.Bit() == wn.Bit() at all times, for the same n.
+  assert(x0.Bit() == w0.Bit());
+  assert(x1.Bit() == w1.Bit());
+  assert(x10.Bit() == w10.Bit());
+  assert(xzr.Bit() == wzr.Bit());
+  assert(sp.Bit() == wsp.Bit());
+}
+
+
+TEST(stack_pointer_override) {
+  // This test generates some stack maintenance code, but the test only checks
+  // the reported state.
+  SETUP();
+  START();
+
+  // The default stack pointer in VIXL is sp.
+  assert(sp.Is(__ StackPointer()));
+  __ SetStackPointer(x0);
+  assert(x0.Is(__ StackPointer()));
+  __ SetStackPointer(x28);
+  assert(x28.Is(__ StackPointer()));
+  __ SetStackPointer(sp);
+  assert(sp.Is(__ StackPointer()));
+
+  END();
+  RUN();
+  TEARDOWN();
+}
+
+
+TEST(peek_poke_simple) {
+  SETUP();
+  START();
+
+  static const RegList x0_to_x3 = x0.Bit() | x1.Bit() | x2.Bit() | x3.Bit();
+  static const RegList x10_to_x13 = x10.Bit() | x11.Bit() |
+                                    x12.Bit() | x13.Bit();
+
+  // The literal base is chosen to have two useful properties:
+  //  * When multiplied by small values (such as a register index), this value
+  //    is clearly readable in the result.
+  //  * The value is not formed from repeating fixed-size smaller values, so it
+  //    can be used to detect endianness-related errors.
+  uint64_t literal_base = 0x0100001000100101UL;
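+  // As an illustration of these properties (values computed by hand, not used
+  // by the test): literal_base * 3 is 0x0300003000300303, so the multiplier is
+  // visible in every non-zero byte of the result, and because the high and low
+  // words of the base differ (0x01000010 vs 0x00100101), a word swap caused by
+  // an endianness bug is easy to spot.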
+
+  // Initialize the registers.
+  __ Mov(x0, literal_base);
+  __ Add(x1, x0, x0);
+  __ Add(x2, x1, x0);
+  __ Add(x3, x2, x0);
+
+  __ Claim(32);
+
+  // Simple exchange.
+  //  After this test:
+  //    x0-x3 should be unchanged.
+  //    w10-w13 should contain the lower words of x0-x3.
+  __ Poke(x0, 0);
+  __ Poke(x1, 8);
+  __ Poke(x2, 16);
+  __ Poke(x3, 24);
+  Clobber(&masm, x0_to_x3);
+  __ Peek(x0, 0);
+  __ Peek(x1, 8);
+  __ Peek(x2, 16);
+  __ Peek(x3, 24);
+
+  __ Poke(w0, 0);
+  __ Poke(w1, 4);
+  __ Poke(w2, 8);
+  __ Poke(w3, 12);
+  Clobber(&masm, x10_to_x13);
+  __ Peek(w10, 0);
+  __ Peek(w11, 4);
+  __ Peek(w12, 8);
+  __ Peek(w13, 12);
+
+  __ Drop(32);
+
+  END();
+  RUN();
+
+  ASSERT_EQUAL_64(literal_base * 1, x0);
+  ASSERT_EQUAL_64(literal_base * 2, x1);
+  ASSERT_EQUAL_64(literal_base * 3, x2);
+  ASSERT_EQUAL_64(literal_base * 4, x3);
+
+  ASSERT_EQUAL_64((literal_base * 1) & 0xffffffff, x10);
+  ASSERT_EQUAL_64((literal_base * 2) & 0xffffffff, x11);
+  ASSERT_EQUAL_64((literal_base * 3) & 0xffffffff, x12);
+  ASSERT_EQUAL_64((literal_base * 4) & 0xffffffff, x13);
+
+  TEARDOWN();
+}
+
+
+TEST(peek_poke_unaligned) {
+  SETUP();
+  START();
+
+  // The literal base is chosen to have two useful properties:
+  //  * When multiplied by small values (such as a register index), this value
+  //    is clearly readable in the result.
+  //  * The value is not formed from repeating fixed-size smaller values, so it
+  //    can be used to detect endianness-related errors.
+  uint64_t literal_base = 0x0100001000100101UL;
+
+  // Initialize the registers.
+  __ Mov(x0, literal_base);
+  __ Add(x1, x0, x0);
+  __ Add(x2, x1, x0);
+  __ Add(x3, x2, x0);
+  __ Add(x4, x3, x0);
+  __ Add(x5, x4, x0);
+  __ Add(x6, x5, x0);
+
+  __ Claim(32);
+
+  // Unaligned exchanges.
+  //  After this test:
+  //    x0-x6 should be unchanged.
+  //    w10-w12 should contain the lower words of x0-x2.
+  __ Poke(x0, 1);
+  Clobber(&masm, x0.Bit());
+  __ Peek(x0, 1);
+  __ Poke(x1, 2);
+  Clobber(&masm, x1.Bit());
+  __ Peek(x1, 2);
+  __ Poke(x2, 3);
+  Clobber(&masm, x2.Bit());
+  __ Peek(x2, 3);
+  __ Poke(x3, 4);
+  Clobber(&masm, x3.Bit());
+  __ Peek(x3, 4);
+  __ Poke(x4, 5);
+  Clobber(&masm, x4.Bit());
+  __ Peek(x4, 5);
+  __ Poke(x5, 6);
+  Clobber(&masm, x5.Bit());
+  __ Peek(x5, 6);
+  __ Poke(x6, 7);
+  Clobber(&masm, x6.Bit());
+  __ Peek(x6, 7);
+
+  __ Poke(w0, 1);
+  Clobber(&masm, w10.Bit());
+  __ Peek(w10, 1);
+  __ Poke(w1, 2);
+  Clobber(&masm, w11.Bit());
+  __ Peek(w11, 2);
+  __ Poke(w2, 3);
+  Clobber(&masm, w12.Bit());
+  __ Peek(w12, 3);
+
+  __ Drop(32);
+
+  END();
+  RUN();
+
+  ASSERT_EQUAL_64(literal_base * 1, x0);
+  ASSERT_EQUAL_64(literal_base * 2, x1);
+  ASSERT_EQUAL_64(literal_base * 3, x2);
+  ASSERT_EQUAL_64(literal_base * 4, x3);
+  ASSERT_EQUAL_64(literal_base * 5, x4);
+  ASSERT_EQUAL_64(literal_base * 6, x5);
+  ASSERT_EQUAL_64(literal_base * 7, x6);
+
+  ASSERT_EQUAL_64((literal_base * 1) & 0xffffffff, x10);
+  ASSERT_EQUAL_64((literal_base * 2) & 0xffffffff, x11);
+  ASSERT_EQUAL_64((literal_base * 3) & 0xffffffff, x12);
+
+  TEARDOWN();
+}
+
+
+TEST(peek_poke_endianness) {
+  SETUP();
+  START();
+
+  // The literal base is chosen to have two useful properties:
+  //  * When multiplied by small values (such as a register index), this value
+  //    is clearly readable in the result.
+  //  * The value is not formed from repeating fixed-size smaller values, so it
+  //    can be used to detect endianness-related errors.
+  uint64_t literal_base = 0x0100001000100101UL;
+
+  // Initialize the registers.
+  __ Mov(x0, literal_base);
+  __ Add(x1, x0, x0);
+
+  __ Claim(32);
+
+  // Endianness tests.
+  //  After this section:
+  //    x4 should match x0[31:0]:x0[63:32]
+  //    w5 should match w1[15:0]:w1[31:16]
+  __ Poke(x0, 0);
+  __ Poke(x0, 8);
+  __ Peek(x4, 4);
+
+  __ Poke(w1, 0);
+  __ Poke(w1, 4);
+  __ Peek(w5, 2);
+
+  __ Drop(32);
+
+  END();
+  RUN();
+
+  uint64_t x0_expected = literal_base * 1;
+  uint64_t x1_expected = literal_base * 2;
+  uint64_t x4_expected = (x0_expected << 32) | (x0_expected >> 32);
+  uint64_t x5_expected = ((x1_expected << 16) & 0xffff0000) |
+                         ((x1_expected >> 16) & 0x0000ffff);
+
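+  // For illustration: with x0_expected == 0x0100001000100101, x4_expected is
+  // 0x0010010101000010, i.e. the two 32-bit words of x0 swapped.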
+  ASSERT_EQUAL_64(x0_expected, x0);
+  ASSERT_EQUAL_64(x1_expected, x1);
+  ASSERT_EQUAL_64(x4_expected, x4);
+  ASSERT_EQUAL_64(x5_expected, x5);
+
+  TEARDOWN();
+}
+
+
+TEST(peek_poke_mixed) {
+  SETUP();
+  START();
+
+  // The literal base is chosen to have two useful properties:
+  //  * When multiplied by small values (such as a register index), this value
+  //    is clearly readable in the result.
+  //  * The value is not formed from repeating fixed-size smaller values, so it
+  //    can be used to detect endianness-related errors.
+  uint64_t literal_base = 0x0100001000100101UL;
+
+  // Initialize the registers.
+  __ Mov(x0, literal_base);
+  __ Add(x1, x0, x0);
+  __ Add(x2, x1, x0);
+  __ Add(x3, x2, x0);
+
+  __ Claim(32);
+
+  // Mix with other stack operations.
+  //  After this section:
+  //    x0-x3 should be unchanged.
+  //    x6 should match x1[31:0]:x0[63:32]
+  //    w7 should match x1[15:0]:x0[63:48]
+  __ Poke(x1, 8);
+  __ Poke(x0, 0);
+  {
+    ASSERT(__ StackPointer().Is(sp));
+    __ Mov(x4, __ StackPointer());
+    __ SetStackPointer(x4);
+
+    __ Poke(wzr, 0);    // Clobber the space we're about to drop.
+    __ Drop(4);
+    __ Peek(x6, 0);
+    __ Claim(8);
+    __ Peek(w7, 10);
+    __ Poke(x3, 28);
+    __ Poke(xzr, 0);    // Clobber the space we're about to drop.
+    __ Drop(8);
+    __ Poke(x2, 12);
+    __ Push(w0);
+
+    __ Mov(sp, __ StackPointer());
+    __ SetStackPointer(sp);
+  }
+
+  __ Pop(x0, x1, x2, x3);
+
+  END();
+  RUN();
+
+  uint64_t x0_expected = literal_base * 1;
+  uint64_t x1_expected = literal_base * 2;
+  uint64_t x2_expected = literal_base * 3;
+  uint64_t x3_expected = literal_base * 4;
+  uint64_t x6_expected = (x1_expected << 32) | (x0_expected >> 32);
+  uint64_t x7_expected = ((x1_expected << 16) & 0xffff0000) |
+                         ((x0_expected >> 48) & 0x0000ffff);
+
+  ASSERT_EQUAL_64(x0_expected, x0);
+  ASSERT_EQUAL_64(x1_expected, x1);
+  ASSERT_EQUAL_64(x2_expected, x2);
+  ASSERT_EQUAL_64(x3_expected, x3);
+  ASSERT_EQUAL_64(x6_expected, x6);
+  ASSERT_EQUAL_64(x7_expected, x7);
+
+  TEARDOWN();
+}
+
+
+// This enum is used only as an argument to the push-pop test helpers.
+enum PushPopMethod {
+  // Push or Pop using the Push and Pop methods, with blocks of up to four
+  // registers. (Smaller blocks will be used if necessary.)
+  PushPopByFour,
+
+  // Use Push<Size>RegList and Pop<Size>RegList to transfer the registers.
+  PushPopRegList
+};
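+// For example (illustrative only): to transfer four X registers, PushPopByFour
+// would emit __ Push(x3, x2, x1, x0) and later __ Pop(x0, x1, x2, x3), whereas
+// PushPopRegList would emit a single __ PushSizeRegList(list, kXRegSize) and
+// __ PopSizeRegList(list, kXRegSize) for the same register list.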
+
+
+// Passing this value as the reg_count argument to the PushPopXReg* tests
+// requests the maximum number of available registers.
+static int const kPushPopXRegMaxRegCount = -1;
+
+// Test a simple push-pop pattern:
+//  * Claim <claim> bytes to set the stack alignment.
+//  * Push <reg_count> registers with size <reg_size>.
+//  * Clobber the register contents.
+//  * Pop <reg_count> registers to restore the original contents.
+//  * Drop <claim> bytes to restore the original stack pointer.
+//
+// Different push and pop methods can be specified independently to test for
+// proper word-endian behaviour.
+static void PushPopXRegSimpleHelper(int reg_count,
+                                    int claim,
+                                    int reg_size,
+                                    PushPopMethod push_method,
+                                    PushPopMethod pop_method) {
+  SETUP();
+
+  START();
+
+  // Arbitrarily pick a register to use as a stack pointer.
+  const Register& stack_pointer = x20;
+  const RegList allowed = ~stack_pointer.Bit();
+  if (reg_count == kPushPopXRegMaxRegCount) {
+    reg_count = CountSetBits(allowed, kNumberOfRegisters);
+  }
+  // Work out which registers to use, based on reg_size.
+  Register r[kNumberOfRegisters];
+  Register x[kNumberOfRegisters];
+  RegList list = PopulateRegisterArray(NULL, x, r, reg_size, reg_count,
+                                       allowed);
+
+  // The literal base is chosen to have two useful properties:
+  //  * When multiplied by small values (such as a register index), this value
+  //    is clearly readable in the result.
+  //  * The value is not formed from repeating fixed-size smaller values, so it
+  //    can be used to detect endianness-related errors.
+  uint64_t literal_base = 0x0100001000100101UL;
+
+  {
+    ASSERT(__ StackPointer().Is(sp));
+    __ Mov(stack_pointer, __ StackPointer());
+    __ SetStackPointer(stack_pointer);
+
+    int i;
+
+    // Initialize the registers.
+    for (i = 0; i < reg_count; i++) {
+      // Always write into the X register, to ensure that the upper word is
+      // properly ignored by Push when testing W registers.
+      __ Mov(x[i], literal_base * i);
+    }
+
+    // Claim memory first, as requested.
+    __ Claim(claim);
+
+    switch (push_method) {
+      case PushPopByFour:
+        // Push high-numbered registers first (to the highest addresses).
+        for (i = reg_count; i >= 4; i -= 4) {
+          __ Push(r[i-1], r[i-2], r[i-3], r[i-4]);
+        }
+        // Finish off the leftovers.
+        switch (i) {
+          case 3:  __ Push(r[2], r[1], r[0]); break;
+          case 2:  __ Push(r[1], r[0]);       break;
+          case 1:  __ Push(r[0]);             break;
+          default: ASSERT(i == 0);            break;
+        }
+        break;
+      case PushPopRegList:
+        __ PushSizeRegList(list, reg_size);
+        break;
+    }
+
+    // Clobber all the registers, to ensure that they get repopulated by Pop.
+    Clobber(&masm, list);
+
+    switch (pop_method) {
+      case PushPopByFour:
+        // Pop low-numbered registers first (from the lowest addresses).
+        for (i = 0; i <= (reg_count-4); i += 4) {
+          __ Pop(r[i], r[i+1], r[i+2], r[i+3]);
+        }
+        // Finish off the leftovers.
+        switch (reg_count - i) {
+          case 3:  __ Pop(r[i], r[i+1], r[i+2]); break;
+          case 2:  __ Pop(r[i], r[i+1]);         break;
+          case 1:  __ Pop(r[i]);                 break;
+          default: ASSERT(i == reg_count);       break;
+        }
+        break;
+      case PushPopRegList:
+        __ PopSizeRegList(list, reg_size);
+        break;
+    }
+
+    // Drop memory to restore stack_pointer.
+    __ Drop(claim);
+
+    __ Mov(sp, __ StackPointer());
+    __ SetStackPointer(sp);
+  }
+
+  END();
+
+  RUN();
+
+  // Check that the register contents were preserved.
+  // Always use ASSERT_EQUAL_64, even when testing W registers, so we can test
+  // that the upper word was properly cleared by Pop.
+  literal_base &= (0xffffffffffffffffUL >> (64-reg_size));
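+  // For example (illustrative, assuming kWRegSize == 32): when testing W
+  // registers, this mask keeps only the low word of the base, 0x00100101.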
+  for (int i = 0; i < reg_count; i++) {
+    if (x[i].Is(xzr)) {
+      ASSERT_EQUAL_64(0, x[i]);
+    } else {
+      ASSERT_EQUAL_64(literal_base * i, x[i]);
+    }
+  }
+
+  TEARDOWN();
+}
+
+
+TEST(push_pop_xreg_simple_32) {
+  for (int claim = 0; claim <= 8; claim++) {
+    for (int count = 0; count <= 8; count++) {
+      PushPopXRegSimpleHelper(count, claim, kWRegSize,
+                              PushPopByFour, PushPopByFour);
+      PushPopXRegSimpleHelper(count, claim, kWRegSize,
+                              PushPopByFour, PushPopRegList);
+      PushPopXRegSimpleHelper(count, claim, kWRegSize,
+                              PushPopRegList, PushPopByFour);
+      PushPopXRegSimpleHelper(count, claim, kWRegSize,
+                              PushPopRegList, PushPopRegList);
+    }
+    // Test with the maximum number of registers.
+    PushPopXRegSimpleHelper(kPushPopXRegMaxRegCount,
+                            claim, kWRegSize, PushPopByFour, PushPopByFour);
+    PushPopXRegSimpleHelper(kPushPopXRegMaxRegCount,
+                            claim, kWRegSize, PushPopByFour, PushPopRegList);
+    PushPopXRegSimpleHelper(kPushPopXRegMaxRegCount,
+                            claim, kWRegSize, PushPopRegList, PushPopByFour);
+    PushPopXRegSimpleHelper(kPushPopXRegMaxRegCount,
+                            claim, kWRegSize, PushPopRegList, PushPopRegList);
+  }
+}
+
+
+TEST(push_pop_xreg_simple_64) {
+  for (int claim = 0; claim <= 8; claim++) {
+    for (int count = 0; count <= 8; count++) {
+      PushPopXRegSimpleHelper(count, claim, kXRegSize,
+                              PushPopByFour, PushPopByFour);
+      PushPopXRegSimpleHelper(count, claim, kXRegSize,
+                              PushPopByFour, PushPopRegList);
+      PushPopXRegSimpleHelper(count, claim, kXRegSize,
+                              PushPopRegList, PushPopByFour);
+      PushPopXRegSimpleHelper(count, claim, kXRegSize,
+                              PushPopRegList, PushPopRegList);
+    }
+    // Test with the maximum number of registers.
+    PushPopXRegSimpleHelper(kPushPopXRegMaxRegCount,
+                            claim, kXRegSize, PushPopByFour, PushPopByFour);
+    PushPopXRegSimpleHelper(kPushPopXRegMaxRegCount,
+                            claim, kXRegSize, PushPopByFour, PushPopRegList);
+    PushPopXRegSimpleHelper(kPushPopXRegMaxRegCount,
+                            claim, kXRegSize, PushPopRegList, PushPopByFour);
+    PushPopXRegSimpleHelper(kPushPopXRegMaxRegCount,
+                            claim, kXRegSize, PushPopRegList, PushPopRegList);
+  }
+}
+
+
+// Passing this value as the reg_count argument to the PushPopFPXReg* tests
+// requests the maximum number of available FP registers.
+static int const kPushPopFPXRegMaxRegCount = -1;
+
+// Test a simple push-pop pattern:
+//  * Claim <claim> bytes to set the stack alignment.
+//  * Push <reg_count> FP registers with size <reg_size>.
+//  * Clobber the register contents.
+//  * Pop <reg_count> FP registers to restore the original contents.
+//  * Drop <claim> bytes to restore the original stack pointer.
+//
+// Different push and pop methods can be specified independently to test for
+// proper word-endian behaviour.
+static void PushPopFPXRegSimpleHelper(int reg_count,
+                                      int claim,
+                                      int reg_size,
+                                      PushPopMethod push_method,
+                                      PushPopMethod pop_method) {
+  SETUP();
+
+  START();
+
+  // We can use any floating-point register. None of them are reserved for
+  // debug code, for example.
+  static RegList const allowed = ~0;
+  if (reg_count == kPushPopFPXRegMaxRegCount) {
+    reg_count = CountSetBits(allowed, kNumberOfFPRegisters);
+  }
+  // Work out which registers to use, based on reg_size.
+  FPRegister v[kNumberOfRegisters];
+  FPRegister d[kNumberOfRegisters];
+  RegList list = PopulateFPRegisterArray(NULL, d, v, reg_size, reg_count,
+                                         allowed);
+
+  // Arbitrarily pick a register to use as a stack pointer.
+  const Register& stack_pointer = x10;
+
+  // The literal base is chosen to have three useful properties:
+  //  * When multiplied (using an integer) by small values (such as a register
+  //    index), this value is clearly readable in the result.
+  //  * The value is not formed from repeating fixed-size smaller values, so it
+  //    can be used to detect endianness-related errors.
+  //  * It is never a floating-point NaN, and will therefore always compare
+  //    equal to itself.
+  uint64_t literal_base = 0x0100001000100101UL;
+
+  {
+    ASSERT(__ StackPointer().Is(sp));
+    __ Mov(stack_pointer, __ StackPointer());
+    __ SetStackPointer(stack_pointer);
+
+    int i;
+
+    // Initialize the registers, using X registers to load the literal.
+    __ Mov(x0, 0);
+    __ Mov(x1, literal_base);
+    for (i = 0; i < reg_count; i++) {
+      // Always write into the D register, to ensure that the upper word is
+      // properly ignored by Push when testing S registers.
+      __ Fmov(d[i], x0);
+      // Calculate the next literal.
+      __ Add(x0, x0, x1);
+    }
+
+    // Claim memory first, as requested.
+    __ Claim(claim);
+
+    switch (push_method) {
+      case PushPopByFour:
+        // Push high-numbered registers first (to the highest addresses).
+        for (i = reg_count; i >= 4; i -= 4) {
+          __ Push(v[i-1], v[i-2], v[i-3], v[i-4]);
+        }
+        // Finish off the leftovers.
+        switch (i) {
+          case 3:  __ Push(v[2], v[1], v[0]); break;
+          case 2:  __ Push(v[1], v[0]);       break;
+          case 1:  __ Push(v[0]);             break;
+          default: ASSERT(i == 0);            break;
+        }
+        break;
+      case PushPopRegList:
+        __ PushSizeRegList(list, reg_size, CPURegister::kFPRegister);
+        break;
+    }
+
+    // Clobber all the registers, to ensure that they get repopulated by Pop.
+    ClobberFP(&masm, list);
+
+    switch (pop_method) {
+      case PushPopByFour:
+        // Pop low-numbered registers first (from the lowest addresses).
+        for (i = 0; i <= (reg_count-4); i += 4) {
+          __ Pop(v[i], v[i+1], v[i+2], v[i+3]);
+        }
+        // Finish off the leftovers.
+        switch (reg_count - i) {
+          case 3:  __ Pop(v[i], v[i+1], v[i+2]); break;
+          case 2:  __ Pop(v[i], v[i+1]);         break;
+          case 1:  __ Pop(v[i]);                 break;
+          default: ASSERT(i == reg_count);       break;
+        }
+        break;
+      case PushPopRegList:
+        __ PopSizeRegList(list, reg_size, CPURegister::kFPRegister);
+        break;
+    }
+
+    // Drop memory to restore the stack pointer.
+    __ Drop(claim);
+
+    __ Mov(sp, __ StackPointer());
+    __ SetStackPointer(sp);
+  }
+
+  END();
+
+  RUN();
+
+  // Check that the register contents were preserved.
+  // Always use ASSERT_EQUAL_FP64, even when testing S registers, so we can
+  // test that the upper word was properly cleared by Pop.
+  literal_base &= (0xffffffffffffffffUL >> (64-reg_size));
+  for (int i = 0; i < reg_count; i++) {
+    uint64_t literal = literal_base * i;
+    double expected;
+    memcpy(&expected, &literal, sizeof(expected));
+    ASSERT_EQUAL_FP64(expected, d[i]);
+  }
+
+  TEARDOWN();
+}
+
+
+TEST(push_pop_fp_xreg_simple_32) {
+  for (int claim = 0; claim <= 8; claim++) {
+    for (int count = 0; count <= 8; count++) {
+      PushPopFPXRegSimpleHelper(count, claim, kSRegSize,
+                                PushPopByFour, PushPopByFour);
+      PushPopFPXRegSimpleHelper(count, claim, kSRegSize,
+                                PushPopByFour, PushPopRegList);
+      PushPopFPXRegSimpleHelper(count, claim, kSRegSize,
+                                PushPopRegList, PushPopByFour);
+      PushPopFPXRegSimpleHelper(count, claim, kSRegSize,
+                                PushPopRegList, PushPopRegList);
+    }
+    // Test with the maximum number of registers.
+    PushPopFPXRegSimpleHelper(kPushPopFPXRegMaxRegCount, claim, kSRegSize,
+                              PushPopByFour, PushPopByFour);
+    PushPopFPXRegSimpleHelper(kPushPopFPXRegMaxRegCount, claim, kSRegSize,
+                              PushPopByFour, PushPopRegList);
+    PushPopFPXRegSimpleHelper(kPushPopFPXRegMaxRegCount, claim, kSRegSize,
+                              PushPopRegList, PushPopByFour);
+    PushPopFPXRegSimpleHelper(kPushPopFPXRegMaxRegCount, claim, kSRegSize,
+                              PushPopRegList, PushPopRegList);
+  }
+}
+
+
+TEST(push_pop_fp_xreg_simple_64) {
+  for (int claim = 0; claim <= 8; claim++) {
+    for (int count = 0; count <= 8; count++) {
+      PushPopFPXRegSimpleHelper(count, claim, kDRegSize,
+                                PushPopByFour, PushPopByFour);
+      PushPopFPXRegSimpleHelper(count, claim, kDRegSize,
+                                PushPopByFour, PushPopRegList);
+      PushPopFPXRegSimpleHelper(count, claim, kDRegSize,
+                                PushPopRegList, PushPopByFour);
+      PushPopFPXRegSimpleHelper(count, claim, kDRegSize,
+                                PushPopRegList, PushPopRegList);
+    }
+    // Test with the maximum number of registers.
+    PushPopFPXRegSimpleHelper(kPushPopFPXRegMaxRegCount, claim, kDRegSize,
+                              PushPopByFour, PushPopByFour);
+    PushPopFPXRegSimpleHelper(kPushPopFPXRegMaxRegCount, claim, kDRegSize,
+                              PushPopByFour, PushPopRegList);
+    PushPopFPXRegSimpleHelper(kPushPopFPXRegMaxRegCount, claim, kDRegSize,
+                              PushPopRegList, PushPopByFour);
+    PushPopFPXRegSimpleHelper(kPushPopFPXRegMaxRegCount, claim, kDRegSize,
+                              PushPopRegList, PushPopRegList);
+  }
+}
+
+
+// Push and pop data using an overlapping combination of Push/Pop and
+// RegList-based methods.
+static void PushPopXRegMixedMethodsHelper(int claim, int reg_size) {
+  SETUP();
+
+  // Arbitrarily pick a register to use as a stack pointer.
+  const Register& stack_pointer = x5;
+  const RegList allowed = ~stack_pointer.Bit();
+  // Work out which registers to use, based on reg_size.
+  Register r[10];
+  Register x[10];
+  PopulateRegisterArray(NULL, x, r, reg_size, 10, allowed);
+
+  // Calculate some handy register lists.
+  RegList r0_to_r3 = 0;
+  for (int i = 0; i <= 3; i++) {
+    r0_to_r3 |= x[i].Bit();
+  }
+  RegList r4_to_r5 = 0;
+  for (int i = 4; i <= 5; i++) {
+    r4_to_r5 |= x[i].Bit();
+  }
+  RegList r6_to_r9 = 0;
+  for (int i = 6; i <= 9; i++) {
+    r6_to_r9 |= x[i].Bit();
+  }
+
+  // The literal base is chosen to have two useful properties:
+  //  * When multiplied by small values (such as a register index), this value
+  //    is clearly readable in the result.
+  //  * The value is not formed from repeating fixed-size smaller values, so it
+  //    can be used to detect endianness-related errors.
+  uint64_t literal_base = 0x0100001000100101UL;
+
+  START();
+  {
+    ASSERT(__ StackPointer().Is(sp));
+    __ Mov(stack_pointer, __ StackPointer());
+    __ SetStackPointer(stack_pointer);
+
+    // Claim memory first, as requested.
+    __ Claim(claim);
+
+    __ Mov(x[3], literal_base * 3);
+    __ Mov(x[2], literal_base * 2);
+    __ Mov(x[1], literal_base * 1);
+    __ Mov(x[0], literal_base * 0);
+
+    __ PushSizeRegList(r0_to_r3, reg_size);
+    __ Push(r[3], r[2]);
+
+    Clobber(&masm, r0_to_r3);
+    __ PopSizeRegList(r0_to_r3, reg_size);
+
+    __ Push(r[2], r[1], r[3], r[0]);
+
+    Clobber(&masm, r4_to_r5);
+    __ Pop(r[4], r[5]);
+    Clobber(&masm, r6_to_r9);
+    __ Pop(r[6], r[7], r[8], r[9]);
+
+    // Drop memory to restore stack_pointer.
+    __ Drop(claim);
+
+    __ Mov(sp, __ StackPointer());
+    __ SetStackPointer(sp);
+  }
+
+  END();
+
+  RUN();
+
+  // Always use ASSERT_EQUAL_64, even when testing W registers, so we can test
+  // that the upper word was properly cleared by Pop.
+  literal_base &= (0xffffffffffffffffUL >> (64-reg_size));
+
+  ASSERT_EQUAL_64(literal_base * 3, x[9]);
+  ASSERT_EQUAL_64(literal_base * 2, x[8]);
+  ASSERT_EQUAL_64(literal_base * 0, x[7]);
+  ASSERT_EQUAL_64(literal_base * 3, x[6]);
+  ASSERT_EQUAL_64(literal_base * 1, x[5]);
+  ASSERT_EQUAL_64(literal_base * 2, x[4]);
+
+  TEARDOWN();
+}
+
+
+TEST(push_pop_xreg_mixed_methods_64) {
+  for (int claim = 0; claim <= 8; claim++) {
+    PushPopXRegMixedMethodsHelper(claim, kXRegSize);
+  }
+}
+
+
+TEST(push_pop_xreg_mixed_methods_32) {
+  for (int claim = 0; claim <= 8; claim++) {
+    PushPopXRegMixedMethodsHelper(claim, kWRegSize);
+  }
+}
+
+
+// Push and pop data using overlapping X- and W-sized quantities.
+static void PushPopXRegWXOverlapHelper(int reg_count, int claim) {
+  SETUP();
+
+  // Arbitrarily pick a register to use as a stack pointer.
+  const Register& stack_pointer = x10;
+  const RegList allowed = ~stack_pointer.Bit();
+  if (reg_count == kPushPopXRegMaxRegCount) {
+    reg_count = CountSetBits(allowed, kNumberOfRegisters);
+  }
+  // Work out which registers to use.
+  Register w[kNumberOfRegisters];
+  Register x[kNumberOfRegisters];
+  RegList list = PopulateRegisterArray(w, x, NULL, 0, reg_count, allowed);
+
+  // The number of W-sized slots we expect to pop. When we pop, we alternate
+  // between W and X registers, so we need reg_count*1.5 W-sized slots.
+  int const requested_w_slots = reg_count + reg_count / 2;
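+  // Illustration (values not used by the test): for reg_count == 7, the pop
+  // loop below performs four W-sized pops and three X-sized pops, consuming
+  // 4 + 3 * 2 == 10 W-sized slots, which matches 7 + 7 / 2 == 10.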
+
+  // Track what _should_ be on the stack, using W-sized slots.
+  static int const kMaxWSlots = kNumberOfRegisters + kNumberOfRegisters / 2;
+  uint32_t stack[kMaxWSlots];
+  for (int i = 0; i < kMaxWSlots; i++) {
+    stack[i] = 0xdeadbeef;
+  }
+
+  // The literal base is chosen to have two useful properties:
+  //  * When multiplied by small values (such as a register index), this value
+  //    is clearly readable in the result.
+  //  * The value is not formed from repeating fixed-size smaller values, so it
+  //    can be used to detect endianness-related errors.
+  static uint64_t const literal_base = 0x0100001000100101UL;
+  static uint64_t const literal_base_hi = literal_base >> 32;
+  static uint64_t const literal_base_lo = literal_base & 0xffffffff;
+  static uint64_t const literal_base_w = literal_base & 0xffffffff;
+
+  START();
+  {
+    ASSERT(__ StackPointer().Is(sp));
+    __ Mov(stack_pointer, __ StackPointer());
+    __ SetStackPointer(stack_pointer);
+
+    // Initialize the registers.
+    for (int i = 0; i < reg_count; i++) {
+      // Always write into the X register, to ensure that the upper word is
+      // properly ignored by Push when testing W registers.
+      __ Mov(x[i], literal_base * i);
+    }
+
+    // Claim memory first, as requested.
+    __ Claim(claim);
+
+    // The push-pop pattern is as follows:
+    // Push:           Pop:
+    //  x[0](hi)   ->   w[0]
+    //  x[0](lo)   ->   x[1](hi)
+    //  w[1]       ->   x[1](lo)
+    //  w[1]       ->   w[2]
+    //  x[2](hi)   ->   x[2](hi)
+    //  x[2](lo)   ->   x[2](lo)
+    //  x[2](hi)   ->   w[3]
+    //  x[2](lo)   ->   x[4](hi)
+    //  x[2](hi)   ->   x[4](lo)
+    //  x[2](lo)   ->   w[5]
+    //  w[3]       ->   x[5](hi)
+    //  w[3]       ->   x[6](lo)
+    //  w[3]       ->   w[7]
+    //  w[3]       ->   x[8](hi)
+    //  x[4](hi)   ->   x[8](lo)
+    //  x[4](lo)   ->   w[9]
+    // ... pattern continues ...
+    //
+    // That is, registers are pushed starting with the lower numbers,
+    // alternating between x and w registers, and pushing i%4+1 copies of each,
+    // where i is the register number.
+    // Registers are popped starting with the higher numbers, one at a time,
+    // alternating between x and w registers.
+    //
+    // This pattern provides a wide variety of alignment effects and overlaps.
+
+    // ---- Push ----
+
+    int active_w_slots = 0;
+    for (int i = 0; active_w_slots < requested_w_slots; i++) {
+      ASSERT(i < reg_count);
+      // In order to test various arguments to PushMultipleTimes, and to try to
+      // exercise different alignment and overlap effects, we push each
+      // register a different number of times.
+      int times = i % 4 + 1;
+      if (i & 1) {
+        // Push odd-numbered registers as W registers.
+        __ PushMultipleTimes(times, w[i]);
+        // Fill in the expected stack slots.
+        for (int j = 0; j < times; j++) {
+          if (w[i].Is(wzr)) {
+            // The zero register always writes zeroes.
+            stack[active_w_slots++] = 0;
+          } else {
+            stack[active_w_slots++] = literal_base_w * i;
+          }
+        }
+      } else {
+        // Push even-numbered registers as X registers.
+        __ PushMultipleTimes(times, x[i]);
+        // Fill in the expected stack slots.
+        for (int j = 0; j < times; j++) {
+          if (x[i].Is(xzr)) {
+            // The zero register always writes zeroes.
+            stack[active_w_slots++] = 0;
+            stack[active_w_slots++] = 0;
+          } else {
+            stack[active_w_slots++] = literal_base_hi * i;
+            stack[active_w_slots++] = literal_base_lo * i;
+          }
+        }
+      }
+    }
+    // Because we were pushing several registers at a time, we probably pushed
+    // more than we needed to.
+    if (active_w_slots > requested_w_slots) {
+      __ Drop((active_w_slots - requested_w_slots) * kWRegSizeInBytes);
+      // Bump the number of active W-sized slots back to where it should be,
+      // and fill the empty space with a dummy value.
+      do {
+        stack[active_w_slots--] = 0xdeadbeef;
+      } while (active_w_slots > requested_w_slots);
+    }
+
+    // ---- Pop ----
+
+    Clobber(&masm, list);
+
+    // If popping an even number of registers, the first one will be X-sized.
+    // Otherwise, the first one will be W-sized.
+    bool next_is_64 = !(reg_count & 1);
+    for (int i = reg_count-1; i >= 0; i--) {
+      if (next_is_64) {
+        __ Pop(x[i]);
+        active_w_slots -= 2;
+      } else {
+        __ Pop(w[i]);
+        active_w_slots -= 1;
+      }
+      next_is_64 = !next_is_64;
+    }
+    ASSERT(active_w_slots == 0);
+
+    // Drop memory to restore stack_pointer.
+    __ Drop(claim);
+
+    __ Mov(sp, __ StackPointer());
+    __ SetStackPointer(sp);
+  }
+
+  END();
+
+  RUN();
+
+  int slot = 0;
+  for (int i = 0; i < reg_count; i++) {
+    // Even-numbered registers were written as W registers.
+    // Odd-numbered registers were written as X registers.
+    bool expect_64 = (i & 1);
+    uint64_t expected;
+
+    if (expect_64) {
+      uint64_t hi = stack[slot++];
+      uint64_t lo = stack[slot++];
+      expected = (hi << 32) | lo;
+    } else {
+      expected = stack[slot++];
+    }
+
+    // Always use ASSERT_EQUAL_64, even when testing W registers, so we can
+    // test that the upper word was properly cleared by Pop.
+    if (x[i].Is(xzr)) {
+      ASSERT_EQUAL_64(0, x[i]);
+    } else {
+      ASSERT_EQUAL_64(expected, x[i]);
+    }
+  }
+  ASSERT(slot == requested_w_slots);
+
+  TEARDOWN();
+}
+
+
+TEST(push_pop_xreg_wx_overlap) {
+  for (int claim = 0; claim <= 8; claim++) {
+    for (int count = 1; count <= 8; count++) {
+      PushPopXRegWXOverlapHelper(count, claim);
+    }
+    // Test with the maximum number of registers.
+    PushPopXRegWXOverlapHelper(kPushPopXRegMaxRegCount, claim);
+  }
+}
+
+
+TEST(push_pop_sp) {
+  SETUP();
+
+  START();
+
+  ASSERT(sp.Is(__ StackPointer()));
+
+  __ Mov(x3, 0x3333333333333333UL);
+  __ Mov(x2, 0x2222222222222222UL);
+  __ Mov(x1, 0x1111111111111111UL);
+  __ Mov(x0, 0x0000000000000000UL);
+  __ Claim(2 * kXRegSizeInBytes);
+  __ PushXRegList(x0.Bit() | x1.Bit() | x2.Bit() | x3.Bit());
+  __ Push(x3, x2);
+  __ PopXRegList(x0.Bit() | x1.Bit() | x2.Bit() | x3.Bit());
+  __ Push(x2, x1, x3, x0);
+  __ Pop(x4, x5);
+  __ Pop(x6, x7, x8, x9);
+
+  __ Claim(2 * kXRegSizeInBytes);
+  __ PushWRegList(w0.Bit() | w1.Bit() | w2.Bit() | w3.Bit());
+  __ Push(w3, w1, w2, w0);
+  __ PopWRegList(w10.Bit() | w11.Bit() | w12.Bit() | w13.Bit());
+  __ Pop(w14, w15, w16, w17);
+
+  __ Claim(2 * kXRegSizeInBytes);
+  __ Push(w2, w2, w1, w1);
+  __ Push(x3, x3);
+  __ Pop(w18, w19, w20, w21);
+  __ Pop(x22, x23);
+
+  __ Claim(2 * kXRegSizeInBytes);
+  __ PushXRegList(x1.Bit() | x22.Bit());
+  __ PopXRegList(x24.Bit() | x26.Bit());
+
+  __ Claim(2 * kXRegSizeInBytes);
+  __ PushWRegList(w1.Bit() | w2.Bit() | w4.Bit() | w22.Bit());
+  __ PopWRegList(w25.Bit() | w27.Bit() | w28.Bit() | w29.Bit());
+
+  __ Claim(2 * kXRegSizeInBytes);
+  __ PushXRegList(0);
+  __ PopXRegList(0);
+  __ PushXRegList(0xffffffff);
+  __ PopXRegList(0xffffffff);
+  __ Drop(12 * kXRegSizeInBytes);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0x1111111111111111UL, x3);
+  ASSERT_EQUAL_64(0x0000000000000000UL, x2);
+  ASSERT_EQUAL_64(0x3333333333333333UL, x1);
+  ASSERT_EQUAL_64(0x2222222222222222UL, x0);
+  ASSERT_EQUAL_64(0x3333333333333333UL, x9);
+  ASSERT_EQUAL_64(0x2222222222222222UL, x8);
+  ASSERT_EQUAL_64(0x0000000000000000UL, x7);
+  ASSERT_EQUAL_64(0x3333333333333333UL, x6);
+  ASSERT_EQUAL_64(0x1111111111111111UL, x5);
+  ASSERT_EQUAL_64(0x2222222222222222UL, x4);
+
+  ASSERT_EQUAL_32(0x11111111U, w13);
+  ASSERT_EQUAL_32(0x33333333U, w12);
+  ASSERT_EQUAL_32(0x00000000U, w11);
+  ASSERT_EQUAL_32(0x22222222U, w10);
+  ASSERT_EQUAL_32(0x11111111U, w17);
+  ASSERT_EQUAL_32(0x00000000U, w16);
+  ASSERT_EQUAL_32(0x33333333U, w15);
+  ASSERT_EQUAL_32(0x22222222U, w14);
+
+  ASSERT_EQUAL_32(0x11111111U, w18);
+  ASSERT_EQUAL_32(0x11111111U, w19);
+  ASSERT_EQUAL_32(0x11111111U, w20);
+  ASSERT_EQUAL_32(0x11111111U, w21);
+  ASSERT_EQUAL_64(0x3333333333333333UL, x22);
+  ASSERT_EQUAL_64(0x0000000000000000UL, x23);
+
+  ASSERT_EQUAL_64(0x3333333333333333UL, x24);
+  ASSERT_EQUAL_64(0x3333333333333333UL, x26);
+
+  ASSERT_EQUAL_32(0x33333333U, w25);
+  ASSERT_EQUAL_32(0x00000000U, w27);
+  ASSERT_EQUAL_32(0x22222222U, w28);
+  ASSERT_EQUAL_32(0x33333333U, w29);
+  TEARDOWN();
+}
+
+
+TEST(noreg) {
+  // This test doesn't generate any code, but it verifies some invariants
+  // related to NoReg.
+  CHECK(NoReg.Is(NoFPReg));
+  CHECK(NoFPReg.Is(NoReg));
+  CHECK(NoReg.Is(NoCPUReg));
+  CHECK(NoCPUReg.Is(NoReg));
+  CHECK(NoFPReg.Is(NoCPUReg));
+  CHECK(NoCPUReg.Is(NoFPReg));
+
+  CHECK(NoReg.IsNone());
+  CHECK(NoFPReg.IsNone());
+  CHECK(NoCPUReg.IsNone());
+}
+
+
+TEST(isvalid) {
+  // This test doesn't generate any code, but it verifies some invariants
+  // related to IsValid().
+  CHECK(!NoReg.IsValid());
+  CHECK(!NoFPReg.IsValid());
+  CHECK(!NoCPUReg.IsValid());
+
+  CHECK(x0.IsValid());
+  CHECK(w0.IsValid());
+  CHECK(x30.IsValid());
+  CHECK(w30.IsValid());
+  CHECK(xzr.IsValid());
+  CHECK(wzr.IsValid());
+
+  CHECK(sp.IsValid());
+  CHECK(wsp.IsValid());
+
+  CHECK(d0.IsValid());
+  CHECK(s0.IsValid());
+  CHECK(d31.IsValid());
+  CHECK(s31.IsValid());
+
+  CHECK(x0.IsValidRegister());
+  CHECK(w0.IsValidRegister());
+  CHECK(xzr.IsValidRegister());
+  CHECK(wzr.IsValidRegister());
+  CHECK(sp.IsValidRegister());
+  CHECK(wsp.IsValidRegister());
+  CHECK(!x0.IsValidFPRegister());
+  CHECK(!w0.IsValidFPRegister());
+  CHECK(!xzr.IsValidFPRegister());
+  CHECK(!wzr.IsValidFPRegister());
+  CHECK(!sp.IsValidFPRegister());
+  CHECK(!wsp.IsValidFPRegister());
+
+  CHECK(d0.IsValidFPRegister());
+  CHECK(s0.IsValidFPRegister());
+  CHECK(!d0.IsValidRegister());
+  CHECK(!s0.IsValidRegister());
+
+  // Test the same as before, but using CPURegister types. This shouldn't make
+  // any difference.
+  CHECK(static_cast<CPURegister>(x0).IsValid());
+  CHECK(static_cast<CPURegister>(w0).IsValid());
+  CHECK(static_cast<CPURegister>(x30).IsValid());
+  CHECK(static_cast<CPURegister>(w30).IsValid());
+  CHECK(static_cast<CPURegister>(xzr).IsValid());
+  CHECK(static_cast<CPURegister>(wzr).IsValid());
+
+  CHECK(static_cast<CPURegister>(sp).IsValid());
+  CHECK(static_cast<CPURegister>(wsp).IsValid());
+
+  CHECK(static_cast<CPURegister>(d0).IsValid());
+  CHECK(static_cast<CPURegister>(s0).IsValid());
+  CHECK(static_cast<CPURegister>(d31).IsValid());
+  CHECK(static_cast<CPURegister>(s31).IsValid());
+
+  CHECK(static_cast<CPURegister>(x0).IsValidRegister());
+  CHECK(static_cast<CPURegister>(w0).IsValidRegister());
+  CHECK(static_cast<CPURegister>(xzr).IsValidRegister());
+  CHECK(static_cast<CPURegister>(wzr).IsValidRegister());
+  CHECK(static_cast<CPURegister>(sp).IsValidRegister());
+  CHECK(static_cast<CPURegister>(wsp).IsValidRegister());
+  CHECK(!static_cast<CPURegister>(x0).IsValidFPRegister());
+  CHECK(!static_cast<CPURegister>(w0).IsValidFPRegister());
+  CHECK(!static_cast<CPURegister>(xzr).IsValidFPRegister());
+  CHECK(!static_cast<CPURegister>(wzr).IsValidFPRegister());
+  CHECK(!static_cast<CPURegister>(sp).IsValidFPRegister());
+  CHECK(!static_cast<CPURegister>(wsp).IsValidFPRegister());
+
+  CHECK(static_cast<CPURegister>(d0).IsValidFPRegister());
+  CHECK(static_cast<CPURegister>(s0).IsValidFPRegister());
+  CHECK(!static_cast<CPURegister>(d0).IsValidRegister());
+  CHECK(!static_cast<CPURegister>(s0).IsValidRegister());
+}
+
+
+TEST(printf) {
+#ifdef USE_SIMULATOR
+  // These tests only run when the debugger is requested.
+  if (Cctest::run_debugger()) {
+#endif
+  SETUP();
+  START();
+
+  char const * test_plain_string = "Printf with no arguments.\n";
+  char const * test_substring = "'This is a substring.'";
+  RegisterDump before;
+
+  // Initialize x29 to the value of the stack pointer. We will use x29 as a
+  // temporary stack pointer later, and initializing it in this way allows the
+  // RegisterDump check to pass.
+  __ Mov(x29, __ StackPointer());
+
+  // Test simple integer arguments.
+  __ Mov(x0, 1234);
+  __ Mov(x1, 0x1234);
+
+  // Test simple floating-point arguments.
+  __ Fmov(d0, 1.234);
+
+  // Test pointer (string) arguments.
+  __ Mov(x2, reinterpret_cast<uintptr_t>(test_substring));
+
+  // Test the maximum number of arguments, and sign extension.
+  __ Mov(w3, 0xffffffff);
+  __ Mov(w4, 0xffffffff);
+  __ Mov(x5, 0xffffffffffffffff);
+  __ Mov(x6, 0xffffffffffffffff);
+  __ Fmov(s1, 1.234);
+  __ Fmov(s2, 2.345);
+  __ Fmov(d3, 3.456);
+  __ Fmov(d4, 4.567);
+
+  // Test printing callee-saved registers.
+  __ Mov(x28, 0x123456789abcdef);
+  __ Fmov(d10, 42.0);
+
+  // Test with three arguments.
+  __ Mov(x10, 3);
+  __ Mov(x11, 40);
+  __ Mov(x12, 500);
+
+  // Check that we don't clobber any registers, except those that we explicitly
+  // write results into.
+  before.Dump(&masm);
+
+  __ Printf(test_plain_string);   // NOLINT(runtime/printf)
+  __ Printf("x0: %" PRId64", x1: 0x%08" PRIx64 "\n", x0, x1);
+  __ Printf("d0: %f\n", d0);
+  __ Printf("Test %%s: %s\n", x2);
+  __ Printf("w3(uint32): %" PRIu32 "\nw4(int32): %" PRId32 "\n"
+            "x5(uint64): %" PRIu64 "\nx6(int64): %" PRId64 "\n",
+            w3, w4, x5, x6);
+  __ Printf("%%f: %f\n%%g: %g\n%%e: %e\n%%E: %E\n", s1, s2, d3, d4);
+  __ Printf("0x%08" PRIx32 ", 0x%016" PRIx64 "\n", x28, x28);
+  __ Printf("%g\n", d10);
+
+  // Test with a different stack pointer.
+  const Register old_stack_pointer = __ StackPointer();
+  __ mov(x29, old_stack_pointer);
+  __ SetStackPointer(x29);
+  __ Printf("old_stack_pointer: 0x%016" PRIx64 "\n", old_stack_pointer);
+  __ mov(old_stack_pointer, __ StackPointer());
+  __ SetStackPointer(old_stack_pointer);
+
+  __ Printf("3=%u, 4=%u, 5=%u\n", x10, x11, x12);
+
+  END();
+  RUN();
+
+  // We cannot easily test the output of the Printf sequences, and because
+  // Printf preserves all registers by default, we can't look at the number of
+  // bytes that were printed. However, the printf_no_preserve test should check
+  // that, and here we just test that we didn't clobber any registers.
+  ASSERT_EQUAL_REGISTERS(before);
+
+  TEARDOWN();
+#ifdef USE_SIMULATOR
+  }
+#endif
+}
+
+
+TEST(printf_no_preserve) {
+#ifdef USE_SIMULATOR
+  // These tests only run when the debugger is requested.
+  if (Cctest::run_debugger()) {
+#endif
+  SETUP();
+  START();
+
+  char const * test_plain_string = "Printf with no arguments.\n";
+  char const * test_substring = "'This is a substring.'";
+
+  __ PrintfNoPreserve(test_plain_string);
+  __ Mov(x19, x0);
+
+  // Test simple integer arguments.
+  __ Mov(x0, 1234);
+  __ Mov(x1, 0x1234);
+  __ PrintfNoPreserve("x0: %" PRId64", x1: 0x%08" PRIx64 "\n", x0, x1);
+  __ Mov(x20, x0);
+
+  // Test simple floating-point arguments.
+  __ Fmov(d0, 1.234);
+  __ PrintfNoPreserve("d0: %f\n", d0);
+  __ Mov(x21, x0);
+
+  // Test pointer (string) arguments.
+  __ Mov(x2, reinterpret_cast<uintptr_t>(test_substring));
+  __ PrintfNoPreserve("Test %%s: %s\n", x2);
+  __ Mov(x22, x0);
+
+  // Test the maximum number of arguments, and sign extension.
+  __ Mov(w3, 0xffffffff);
+  __ Mov(w4, 0xffffffff);
+  __ Mov(x5, 0xffffffffffffffff);
+  __ Mov(x6, 0xffffffffffffffff);
+  __ PrintfNoPreserve("w3(uint32): %" PRIu32 "\nw4(int32): %" PRId32 "\n"
+                      "x5(uint64): %" PRIu64 "\nx6(int64): %" PRId64 "\n",
+                      w3, w4, x5, x6);
+  __ Mov(x23, x0);
+
+  __ Fmov(s1, 1.234);
+  __ Fmov(s2, 2.345);
+  __ Fmov(d3, 3.456);
+  __ Fmov(d4, 4.567);
+  __ PrintfNoPreserve("%%f: %f\n%%g: %g\n%%e: %e\n%%E: %E\n", s1, s2, d3, d4);
+  __ Mov(x24, x0);
+
+  // Test printing callee-saved registers.
+  __ Mov(x28, 0x123456789abcdef);
+  __ PrintfNoPreserve("0x%08" PRIx32 ", 0x%016" PRIx64 "\n", x28, x28);
+  __ Mov(x25, x0);
+
+  __ Fmov(d10, 42.0);
+  __ PrintfNoPreserve("%g\n", d10);
+  __ Mov(x26, x0);
+
+  // Test with a different stack pointer.
+  const Register old_stack_pointer = __ StackPointer();
+  __ Mov(x29, old_stack_pointer);
+  __ SetStackPointer(x29);
+
+  __ PrintfNoPreserve("old_stack_pointer: 0x%016" PRIx64 "\n",
+                      old_stack_pointer);
+  __ Mov(x27, x0);
+
+  __ Mov(old_stack_pointer, __ StackPointer());
+  __ SetStackPointer(old_stack_pointer);
+
+  // Test with three arguments.
+  __ Mov(x3, 3);
+  __ Mov(x4, 40);
+  __ Mov(x5, 500);
+  __ PrintfNoPreserve("3=%u, 4=%u, 5=%u\n", x3, x4, x5);
+  __ Mov(x28, x0);
+
+  END();
+  RUN();
+
+  // We cannot easily test the exact output of the Printf sequences, but we can
+  // use the return code to check that the string length was correct.
+
+  // Printf with no arguments.
+  ASSERT_EQUAL_64(strlen(test_plain_string), x19);
+  // x0: 1234, x1: 0x00001234
+  ASSERT_EQUAL_64(25, x20);
+  // d0: 1.234000
+  ASSERT_EQUAL_64(13, x21);
+  // Test %s: 'This is a substring.'
+  ASSERT_EQUAL_64(32, x22);
+  // w3(uint32): 4294967295
+  // w4(int32): -1
+  // x5(uint64): 18446744073709551615
+  // x6(int64): -1
+  ASSERT_EQUAL_64(23 + 14 + 33 + 14, x23);
+  // %f: 1.234000
+  // %g: 2.345
+  // %e: 3.456000e+00
+  // %E: 4.567000E+00
+  ASSERT_EQUAL_64(13 + 10 + 17 + 17, x24);
+  // 0x89abcdef, 0x0123456789abcdef
+  ASSERT_EQUAL_64(31, x25);
+  // 42
+  ASSERT_EQUAL_64(3, x26);
+  // old_stack_pointer: 0x00007fb037ae2370
+  // Note: This is an example value, but the field width is fixed here so the
+  // string length is still predictable.
+  ASSERT_EQUAL_64(38, x27);
+  // 3=3, 4=40, 5=500
+  ASSERT_EQUAL_64(17, x28);
+
+  TEARDOWN();
+#ifdef USE_SIMULATOR
+  }
+#endif
+}
+
+
+#ifndef USE_SIMULATOR
+TEST(trace) {
+  // The Trace helper should not generate any code unless the simulator (or
+  // debugger) is being used.
+  SETUP();
+  START();
+
+  Label start;
+  __ Bind(&start);
+  __ Trace(LOG_ALL, TRACE_ENABLE);
+  __ Trace(LOG_ALL, TRACE_DISABLE);
+  CHECK(__ SizeOfCodeGeneratedSince(&start) == 0);
+
+  END();
+  TEARDOWN();
+}
+#endif
+
+
+#ifndef USE_SIMULATOR
+TEST(log) {
+  // The Log helper should not generate any code unless the simulator (or
+  // debugger) is being used.
+  SETUP();
+  START();
+
+  Label start;
+  __ Bind(&start);
+  __ Log(LOG_ALL);
+  CHECK(__ SizeOfCodeGeneratedSince(&start) == 0);
+
+  END();
+  TEARDOWN();
+}
+#endif
+
+
+TEST(instruction_accurate_scope) {
+  SETUP();
+  START();
+
+  // By default macro instructions are allowed.
+  ASSERT(masm.AllowMacroInstructions());
+  {
+    InstructionAccurateScope scope1(&masm);
+    ASSERT(!masm.AllowMacroInstructions());
+    {
+      InstructionAccurateScope scope2(&masm);
+      ASSERT(!masm.AllowMacroInstructions());
+    }
+    ASSERT(!masm.AllowMacroInstructions());
+  }
+  ASSERT(masm.AllowMacroInstructions());
+
+  {
+    InstructionAccurateScope scope(&masm, 2);
+    __ add(x0, x0, x0);
+    __ sub(x0, x0, x0);
+  }
+
+  END();
+  RUN();
+  TEARDOWN();
+}
+
+
+TEST(blr_lr) {
+  // A simple test to check that the simulator correctly handles "blr lr".
+  SETUP();
+
+  START();
+  Label target;
+  Label end;
+
+  __ Mov(x0, 0x0);
+  __ Adr(lr, &target);
+
+  __ Blr(lr);
+  __ Mov(x0, 0xdeadbeef);
+  __ B(&end);
+
+  __ Bind(&target);
+  __ Mov(x0, 0xc001c0de);
+
+  __ Bind(&end);
+  END();
+
+  RUN();
+
+  ASSERT_EQUAL_64(0xc001c0de, x0);
+
+  TEARDOWN();
+}
+
+}  // namespace vixl
diff --git a/test/test-disasm-a64.cc b/test/test-disasm-a64.cc
new file mode 100644
index 0000000..7478025
--- /dev/null
+++ b/test/test-disasm-a64.cc
@@ -0,0 +1,1546 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include <stdio.h>
+#include <cstring>
+#include "cctest.h"
+
+#include "a64/macro-assembler-a64.h"
+#include "a64/disasm-a64.h"
+
+#define TEST(name)  TEST_(DISASM_##name)
+
+#define EXP_SIZE   (256)
+#define INSTR_SIZE (1024)
+#define SETUP_CLASS(ASMCLASS)                                                  \
+  byte* buf = static_cast<byte*>(malloc(INSTR_SIZE));                          \
+  uint32_t encoding = 0;                                                       \
+  ASMCLASS* masm = new ASMCLASS(buf, INSTR_SIZE);                              \
+  Decoder* decoder = new Decoder();                                            \
+  Disassembler* disasm = new Disassembler();                                   \
+  decoder->AppendVisitor(disasm)
+
+#define SETUP() SETUP_CLASS(Assembler)
+
+#define COMPARE(ASM, EXP)                                                      \
+  masm->Reset();                                                               \
+  masm->ASM;                                                                   \
+  masm->FinalizeCode();                                                        \
+  decoder->Decode(reinterpret_cast<Instruction*>(buf));                        \
+  encoding = *reinterpret_cast<uint32_t*>(buf);                                \
+  if (strcmp(disasm->GetOutput(), EXP) != 0) {                                 \
+    printf("Encoding: %08" PRIx32 "\nExpected: %s\nFound:    %s\n",            \
+           encoding, EXP, disasm->GetOutput());                                \
+    abort();                                                                   \
+  }
+
+#define CLEANUP()                                                              \
+  delete disasm;                                                               \
+  delete decoder;                                                              \
+  delete masm
+
+namespace vixl {
+
+TEST(bootstrap) {
+  SETUP();
+
+  // Instructions generated by a C compiler, disassembled by objdump, and
+  // reformatted to suit our disassembly style.
+  COMPARE(dci(0xa9ba7bfd), "stp x29, x30, [sp, #-96]!");
+  COMPARE(dci(0x910003fd), "mov x29, sp");
+  COMPARE(dci(0x9100e3a0), "add x0, x29, #0x38 (56)");
+  COMPARE(dci(0xb900001f), "str wzr, [x0]");
+  COMPARE(dci(0x528000e1), "movz w1, #0x7");
+  COMPARE(dci(0xb9001c01), "str w1, [x0, #28]");
+  COMPARE(dci(0x390043a0), "strb w0, [x29, #16]");
+  COMPARE(dci(0x790027a0), "strh w0, [x29, #18]");
+  COMPARE(dci(0xb9400400), "ldr w0, [x0, #4]");
+  COMPARE(dci(0x0b000021), "add w1, w1, w0");
+  COMPARE(dci(0x531b6800), "lsl w0, w0, #5");
+  COMPARE(dci(0x521e0400), "eor w0, w0, #0xc");
+  COMPARE(dci(0x72af0f00), "movk w0, #0x7878, lsl #16");
+  COMPARE(dci(0xd360fc00), "lsr x0, x0, #32");
+  COMPARE(dci(0x13037c01), "asr w1, w0, #3");
+  COMPARE(dci(0x4b000021), "sub w1, w1, w0");
+  COMPARE(dci(0x2a0103e0), "mov w0, w1");
+  COMPARE(dci(0x93407c00), "sxtw x0, w0");
+  COMPARE(dci(0x2a000020), "orr w0, w1, w0");
+  COMPARE(dci(0xa8c67bfd), "ldp x29, x30, [sp], #96");
+
+  CLEANUP();
+}
+
+TEST(mov_mvn) {
+  SETUP_CLASS(MacroAssembler);
+
+  COMPARE(Mov(w0, Operand(0x1234)), "movz w0, #0x1234");
+  COMPARE(Mov(x1, Operand(0x1234)), "movz x1, #0x1234");
+  COMPARE(Mov(w2, Operand(w3)), "mov w2, w3");
+  COMPARE(Mov(x4, Operand(x5)), "mov x4, x5");
+  COMPARE(Mov(w6, Operand(w7, LSL, 5)), "lsl w6, w7, #5");
+  COMPARE(Mov(x8, Operand(x9, ASR, 42)), "asr x8, x9, #42");
+  COMPARE(Mov(w10, Operand(w11, UXTB)), "uxtb w10, w11");
+  COMPARE(Mov(x12, Operand(x13, UXTB, 1)), "ubfiz x12, x13, #1, #8");
+  COMPARE(Mov(w14, Operand(w15, SXTH, 2)), "sbfiz w14, w15, #2, #16");
+  COMPARE(Mov(x16, Operand(x17, SXTW, 3)), "sbfiz x16, x17, #3, #32");
+
+  COMPARE(Mvn(w0, Operand(0x1)), "movn w0, #0x1");
+  COMPARE(Mvn(x1, Operand(0xfff)), "movn x1, #0xfff");
+  COMPARE(Mvn(w2, Operand(w3)), "mvn w2, w3");
+  COMPARE(Mvn(x4, Operand(x5)), "mvn x4, x5");
+  COMPARE(Mvn(w6, Operand(w7, LSL, 12)), "mvn w6, w7, lsl #12");
+  COMPARE(Mvn(x8, Operand(x9, ASR, 63)), "mvn x8, x9, asr #63");
+
+  CLEANUP();
+}
+
+TEST(move_immediate) {
+  SETUP();
+
+  COMPARE(movz(w0, 0x1234), "movz w0, #0x1234");
+  COMPARE(movz(x1, 0xabcd0000), "movz x1, #0xabcd0000");
+  COMPARE(movz(x2, 0x555500000000), "movz x2, #0x555500000000");
+  COMPARE(movz(x3, 0xaaaa000000000000), "movz x3, #0xaaaa000000000000");
+  COMPARE(movz(x4, 0xabcd, 16), "movz x4, #0xabcd0000");
+  COMPARE(movz(x5, 0x5555, 32), "movz x5, #0x555500000000");
+  COMPARE(movz(x6, 0xaaaa, 48), "movz x6, #0xaaaa000000000000");
+
+  COMPARE(movk(w7, 0x1234), "movk w7, #0x1234");
+  COMPARE(movk(x8, 0xabcd0000), "movk x8, #0xabcd, lsl #16");
+  COMPARE(movk(x9, 0x555500000000), "movk x9, #0x5555, lsl #32");
+  COMPARE(movk(x10, 0xaaaa000000000000), "movk x10, #0xaaaa, lsl #48");
+  COMPARE(movk(w11, 0xabcd, 16), "movk w11, #0xabcd, lsl #16");
+  COMPARE(movk(x12, 0x5555, 32), "movk x12, #0x5555, lsl #32");
+  COMPARE(movk(x13, 0xaaaa, 48), "movk x13, #0xaaaa, lsl #48");
+
+  COMPARE(movn(w14, 0x1234), "movn w14, #0x1234");
+  COMPARE(movn(x15, 0xabcd0000), "movn x15, #0xabcd0000");
+  COMPARE(movn(x16, 0x555500000000), "movn x16, #0x555500000000");
+  COMPARE(movn(x17, 0xaaaa000000000000), "movn x17, #0xaaaa000000000000");
+  COMPARE(movn(w18, 0xabcd, 16), "movn w18, #0xabcd0000");
+  COMPARE(movn(x19, 0x5555, 32), "movn x19, #0x555500000000");
+  COMPARE(movn(x20, 0xaaaa, 48), "movn x20, #0xaaaa000000000000");
+
+  COMPARE(movk(w21, 0), "movk w21, #0x0");
+  COMPARE(movk(x22, 0, 0), "movk x22, #0x0");
+  COMPARE(movk(w23, 0, 16), "movk w23, #0x0, lsl #16");
+  COMPARE(movk(x24, 0, 32), "movk x24, #0x0, lsl #32");
+  COMPARE(movk(x25, 0, 48), "movk x25, #0x0, lsl #48");
+
+  CLEANUP();
+}
+
+TEST(add_immediate) {
+  SETUP();
+
+  COMPARE(add(w0, w1, Operand(0xff)), "add w0, w1, #0xff (255)");
+  COMPARE(add(x2, x3, Operand(0x3ff)), "add x2, x3, #0x3ff (1023)");
+  COMPARE(add(w4, w5, Operand(0xfff)), "add w4, w5, #0xfff (4095)");
+  COMPARE(add(x6, x7, Operand(0x1000)), "add x6, x7, #0x1000 (4096)");
+  COMPARE(add(w8, w9, Operand(0xff000)), "add w8, w9, #0xff000 (1044480)");
+  COMPARE(add(x10, x11, Operand(0x3ff000)),
+          "add x10, x11, #0x3ff000 (4190208)");
+  COMPARE(add(w12, w13, Operand(0xfff000)),
+          "add w12, w13, #0xfff000 (16773120)");
+  COMPARE(add(w14, w15, Operand(0xff), SetFlags), "adds w14, w15, #0xff (255)");
+  COMPARE(add(x16, x17, Operand(0xaa000), SetFlags),
+          "adds x16, x17, #0xaa000 (696320)");
+  COMPARE(cmn(w18, Operand(0xff)), "cmn w18, #0xff (255)");
+  COMPARE(cmn(x19, Operand(0xff000)), "cmn x19, #0xff000 (1044480)");
+  COMPARE(add(w0, wsp, Operand(0)), "mov w0, wsp");
+  COMPARE(add(sp, x0, Operand(0)), "mov sp, x0");
+
+  COMPARE(add(w1, wsp, Operand(8)), "add w1, wsp, #0x8 (8)");
+  COMPARE(add(x2, sp, Operand(16)), "add x2, sp, #0x10 (16)");
+  COMPARE(add(wsp, wsp, Operand(42)), "add wsp, wsp, #0x2a (42)");
+  COMPARE(cmn(sp, Operand(24)), "cmn sp, #0x18 (24)");
+  COMPARE(add(wzr, wsp, Operand(9), SetFlags), "cmn wsp, #0x9 (9)");
+
+  CLEANUP();
+}
+
+TEST(sub_immediate) {
+  SETUP();
+
+  COMPARE(sub(w0, w1, Operand(0xff)), "sub w0, w1, #0xff (255)");
+  COMPARE(sub(x2, x3, Operand(0x3ff)), "sub x2, x3, #0x3ff (1023)");
+  COMPARE(sub(w4, w5, Operand(0xfff)), "sub w4, w5, #0xfff (4095)");
+  COMPARE(sub(x6, x7, Operand(0x1000)), "sub x6, x7, #0x1000 (4096)");
+  COMPARE(sub(w8, w9, Operand(0xff000)), "sub w8, w9, #0xff000 (1044480)");
+  COMPARE(sub(x10, x11, Operand(0x3ff000)),
+          "sub x10, x11, #0x3ff000 (4190208)");
+  COMPARE(sub(w12, w13, Operand(0xfff000)),
+          "sub w12, w13, #0xfff000 (16773120)");
+  COMPARE(sub(w14, w15, Operand(0xff), SetFlags), "subs w14, w15, #0xff (255)");
+  COMPARE(sub(x16, x17, Operand(0xaa000), SetFlags),
+          "subs x16, x17, #0xaa000 (696320)");
+  COMPARE(cmp(w18, Operand(0xff)), "cmp w18, #0xff (255)");
+  COMPARE(cmp(x19, Operand(0xff000)), "cmp x19, #0xff000 (1044480)");
+
+  COMPARE(sub(w1, wsp, Operand(8)), "sub w1, wsp, #0x8 (8)");
+  COMPARE(sub(x2, sp, Operand(16)), "sub x2, sp, #0x10 (16)");
+  COMPARE(sub(wsp, wsp, Operand(42)), "sub wsp, wsp, #0x2a (42)");
+  COMPARE(cmp(sp, Operand(24)), "cmp sp, #0x18 (24)");
+  COMPARE(sub(wzr, wsp, Operand(9), SetFlags), "cmp wsp, #0x9 (9)");
+
+  CLEANUP();
+}
+
+
+TEST(add_shifted) {
+  SETUP();
+
+  COMPARE(add(w0, w1, Operand(w2)), "add w0, w1, w2");
+  COMPARE(add(x3, x4, Operand(x5)), "add x3, x4, x5");
+  COMPARE(add(w6, w7, Operand(w8, LSL, 1)), "add w6, w7, w8, lsl #1");
+  COMPARE(add(x9, x10, Operand(x11, LSL, 2)), "add x9, x10, x11, lsl #2");
+  COMPARE(add(w12, w13, Operand(w14, LSR, 3)), "add w12, w13, w14, lsr #3");
+  COMPARE(add(x15, x16, Operand(x17, LSR, 4)), "add x15, x16, x17, lsr #4");
+  COMPARE(add(w18, w19, Operand(w20, ASR, 5)), "add w18, w19, w20, asr #5");
+  COMPARE(add(x21, x22, Operand(x23, ASR, 6)), "add x21, x22, x23, asr #6");
+  COMPARE(cmn(w24, Operand(w25)), "cmn w24, w25");
+  COMPARE(cmn(x26, Operand(x27, LSL, 63)), "cmn x26, x27, lsl #63");
+
+  COMPARE(add(x0, sp, Operand(x1)), "add x0, sp, x1");
+  COMPARE(add(w2, wsp, Operand(w3)), "add w2, wsp, w3");
+  COMPARE(add(x4, sp, Operand(x5, LSL, 1)), "add x4, sp, x5, lsl #1");
+  COMPARE(add(x4, xzr, Operand(x5, LSL, 1)), "add x4, xzr, x5, lsl #1");
+  COMPARE(add(w6, wsp, Operand(w7, LSL, 3)), "add w6, wsp, w7, lsl #3");
+  COMPARE(add(xzr, sp, Operand(x8, LSL, 4), SetFlags), "cmn sp, x8, lsl #4");
+  COMPARE(add(xzr, xzr, Operand(x8, LSL, 5), SetFlags), "cmn xzr, x8, lsl #5");
+
+  CLEANUP();
+}
+
+
+TEST(sub_shifted) {
+  SETUP();
+
+  COMPARE(sub(w0, w1, Operand(w2)), "sub w0, w1, w2");
+  COMPARE(sub(x3, x4, Operand(x5)), "sub x3, x4, x5");
+  COMPARE(sub(w6, w7, Operand(w8, LSL, 1)), "sub w6, w7, w8, lsl #1");
+  COMPARE(sub(x9, x10, Operand(x11, LSL, 2)), "sub x9, x10, x11, lsl #2");
+  COMPARE(sub(w12, w13, Operand(w14, LSR, 3)), "sub w12, w13, w14, lsr #3");
+  COMPARE(sub(x15, x16, Operand(x17, LSR, 4)), "sub x15, x16, x17, lsr #4");
+  COMPARE(sub(w18, w19, Operand(w20, ASR, 5)), "sub w18, w19, w20, asr #5");
+  COMPARE(sub(x21, x22, Operand(x23, ASR, 6)), "sub x21, x22, x23, asr #6");
+  COMPARE(cmp(w24, Operand(w25)), "cmp w24, w25");
+  COMPARE(cmp(x26, Operand(x27, LSL, 63)), "cmp x26, x27, lsl #63");
+  COMPARE(neg(w28, Operand(w29)), "neg w28, w29");
+  COMPARE(neg(x30, Operand(x0, LSR, 62)), "neg x30, x0, lsr #62");
+  COMPARE(neg(w1, Operand(w2), SetFlags), "negs w1, w2");
+  COMPARE(neg(x3, Operand(x4, ASR, 61), SetFlags), "negs x3, x4, asr #61");
+
+  COMPARE(sub(x0, sp, Operand(x1)), "sub x0, sp, x1");
+  COMPARE(sub(w2, wsp, Operand(w3)), "sub w2, wsp, w3");
+  COMPARE(sub(x4, sp, Operand(x5, LSL, 1)), "sub x4, sp, x5, lsl #1");
+  COMPARE(sub(x4, xzr, Operand(x5, LSL, 1)), "neg x4, x5, lsl #1");
+  COMPARE(sub(w6, wsp, Operand(w7, LSL, 3)), "sub w6, wsp, w7, lsl #3");
+  COMPARE(sub(xzr, sp, Operand(x8, LSL, 4), SetFlags), "cmp sp, x8, lsl #4");
+  COMPARE(sub(xzr, xzr, Operand(x8, LSL, 5), SetFlags), "cmp xzr, x8, lsl #5");
+
+  CLEANUP();
+}
+
+
+TEST(add_extended) {
+  SETUP();
+
+  COMPARE(add(w0, w1, Operand(w2, UXTB)), "add w0, w1, w2, uxtb");
+  COMPARE(add(x3, x4, Operand(w5, UXTB, 1), SetFlags),
+          "adds x3, x4, w5, uxtb #1");
+  COMPARE(add(w6, w7, Operand(w8, UXTH, 2)), "add w6, w7, w8, uxth #2");
+  COMPARE(add(x9, x10, Operand(x11, UXTW, 3), SetFlags),
+          "adds x9, x10, w11, uxtw #3");
+  COMPARE(add(x12, x13, Operand(x14, UXTX, 4)), "add x12, x13, x14, uxtx #4");
+  COMPARE(add(w15, w16, Operand(w17, SXTB, 4), SetFlags),
+          "adds w15, w16, w17, sxtb #4");
+  COMPARE(add(x18, x19, Operand(x20, SXTB, 3)), "add x18, x19, w20, sxtb #3");
+  COMPARE(add(w21, w22, Operand(w23, SXTH, 2), SetFlags),
+          "adds w21, w22, w23, sxth #2");
+  COMPARE(add(x24, x25, Operand(x26, SXTW, 1)), "add x24, x25, w26, sxtw #1");
+  COMPARE(add(x27, x28, Operand(x29, SXTX), SetFlags),
+          "adds x27, x28, x29, sxtx");
+  COMPARE(cmn(w0, Operand(w1, UXTB, 2)), "cmn w0, w1, uxtb #2");
+  COMPARE(cmn(x2, Operand(x3, SXTH, 4)), "cmn x2, w3, sxth #4");
+
+  COMPARE(add(w0, wsp, Operand(w1, UXTB)), "add w0, wsp, w1, uxtb");
+  COMPARE(add(x2, sp, Operand(x3, UXTH, 1)), "add x2, sp, w3, uxth #1");
+  COMPARE(add(wsp, wsp, Operand(w4, UXTW, 2)), "add wsp, wsp, w4, lsl #2");
+  COMPARE(cmn(sp, Operand(xzr, UXTX, 3)), "cmn sp, xzr, lsl #3");
+  COMPARE(cmn(sp, Operand(xzr, LSL, 4)), "cmn sp, xzr, lsl #4");
+
+  CLEANUP();
+}
+
+
+TEST(sub_extended) {
+  SETUP();
+
+  COMPARE(sub(w0, w1, Operand(w2, UXTB)), "sub w0, w1, w2, uxtb");
+  COMPARE(sub(x3, x4, Operand(w5, UXTB, 1), SetFlags),
+          "subs x3, x4, w5, uxtb #1");
+  COMPARE(sub(w6, w7, Operand(w8, UXTH, 2)), "sub w6, w7, w8, uxth #2");
+  COMPARE(sub(x9, x10, Operand(x11, UXTW, 3), SetFlags),
+          "subs x9, x10, w11, uxtw #3");
+  COMPARE(sub(x12, x13, Operand(x14, UXTX, 4)), "sub x12, x13, x14, uxtx #4");
+  COMPARE(sub(w15, w16, Operand(w17, SXTB, 4), SetFlags),
+          "subs w15, w16, w17, sxtb #4");
+  COMPARE(sub(x18, x19, Operand(x20, SXTB, 3)), "sub x18, x19, w20, sxtb #3");
+  COMPARE(sub(w21, w22, Operand(w23, SXTH, 2), SetFlags),
+          "subs w21, w22, w23, sxth #2");
+  COMPARE(sub(x24, x25, Operand(x26, SXTW, 1)), "sub x24, x25, w26, sxtw #1");
+  COMPARE(sub(x27, x28, Operand(x29, SXTX), SetFlags),
+          "subs x27, x28, x29, sxtx");
+  COMPARE(cmp(w0, Operand(w1, SXTB, 1)), "cmp w0, w1, sxtb #1");
+  COMPARE(cmp(x2, Operand(x3, UXTH, 3)), "cmp x2, w3, uxth #3");
+
+  COMPARE(sub(w0, wsp, Operand(w1, UXTB)), "sub w0, wsp, w1, uxtb");
+  COMPARE(sub(x2, sp, Operand(x3, UXTH, 1)), "sub x2, sp, w3, uxth #1");
+  COMPARE(sub(wsp, wsp, Operand(w4, UXTW, 2)), "sub wsp, wsp, w4, lsl #2");
+  COMPARE(cmp(sp, Operand(xzr, UXTX, 3)), "cmp sp, xzr, lsl #3");
+  COMPARE(cmp(sp, Operand(xzr, LSL, 4)), "cmp sp, xzr, lsl #4");
+
+  CLEANUP();
+}
+
+
+TEST(adc_subc_ngc) {
+  SETUP();
+
+  COMPARE(adc(w0, w1, Operand(w2)), "adc w0, w1, w2");
+  COMPARE(adc(x3, x4, Operand(x5)), "adc x3, x4, x5");
+  COMPARE(adc(w6, w7, Operand(w8), SetFlags), "adcs w6, w7, w8");
+  COMPARE(adc(x9, x10, Operand(x11), SetFlags), "adcs x9, x10, x11");
+  COMPARE(sbc(w12, w13, Operand(w14)), "sbc w12, w13, w14");
+  COMPARE(sbc(x15, x16, Operand(x17)), "sbc x15, x16, x17");
+  COMPARE(sbc(w18, w19, Operand(w20), SetFlags), "sbcs w18, w19, w20");
+  COMPARE(sbc(x21, x22, Operand(x23), SetFlags), "sbcs x21, x22, x23");
+  COMPARE(ngc(w24, Operand(w25)), "ngc w24, w25");
+  COMPARE(ngc(x26, Operand(x27)), "ngc x26, x27");
+  COMPARE(ngc(w28, Operand(w29), SetFlags), "ngcs w28, w29");
+  COMPARE(ngc(x30, Operand(x0), SetFlags), "ngcs x30, x0");
+
+  CLEANUP();
+}
+
+
+TEST(mul_and_div) {
+  SETUP();
+
+  COMPARE(mul(w0, w1, w2), "mul w0, w1, w2");
+  COMPARE(mul(x3, x4, x5), "mul x3, x4, x5");
+  COMPARE(mul(w30, w0, w1), "mul w30, w0, w1");
+  COMPARE(mul(x30, x0, x1), "mul x30, x0, x1");
+  COMPARE(mneg(w0, w1, w2), "mneg w0, w1, w2");
+  COMPARE(mneg(x3, x4, x5), "mneg x3, x4, x5");
+  COMPARE(mneg(w30, w0, w1), "mneg w30, w0, w1");
+  COMPARE(mneg(x30, x0, x1), "mneg x30, x0, x1");
+  COMPARE(smull(x0, w0, w1), "smull x0, w0, w1");
+  COMPARE(smull(x30, w30, w0), "smull x30, w30, w0");
+  COMPARE(smulh(x0, x1, x2), "smulh x0, x1, x2");
+
+  COMPARE(sdiv(w0, w1, w2), "sdiv w0, w1, w2");
+  COMPARE(sdiv(x3, x4, x5), "sdiv x3, x4, x5");
+  COMPARE(udiv(w6, w7, w8), "udiv w6, w7, w8");
+  COMPARE(udiv(x9, x10, x11), "udiv x9, x10, x11");
+
+  CLEANUP();
+}
+
+
+TEST(madd) {
+  SETUP();
+
+  COMPARE(madd(w0, w1, w2, w3), "madd w0, w1, w2, w3");
+  COMPARE(madd(w30, w21, w22, w16), "madd w30, w21, w22, w16");
+  COMPARE(madd(x0, x1, x2, x3), "madd x0, x1, x2, x3");
+  COMPARE(madd(x30, x21, x22, x16), "madd x30, x21, x22, x16");
+
+  COMPARE(smaddl(x0, w1, w2, x3), "smaddl x0, w1, w2, x3");
+  COMPARE(smaddl(x30, w21, w22, x16), "smaddl x30, w21, w22, x16");
+  COMPARE(umaddl(x0, w1, w2, x3), "umaddl x0, w1, w2, x3");
+  COMPARE(umaddl(x30, w21, w22, x16), "umaddl x30, w21, w22, x16");
+
+  CLEANUP();
+}
+
+
+TEST(msub) {
+  SETUP();
+
+  COMPARE(msub(w0, w1, w2, w3), "msub w0, w1, w2, w3");
+  COMPARE(msub(w30, w21, w22, w16), "msub w30, w21, w22, w16");
+  COMPARE(msub(x0, x1, x2, x3), "msub x0, x1, x2, x3");
+  COMPARE(msub(x30, x21, x22, x16), "msub x30, x21, x22, x16");
+
+  COMPARE(smsubl(x0, w1, w2, x3), "smsubl x0, w1, w2, x3");
+  COMPARE(smsubl(x30, w21, w22, x16), "smsubl x30, w21, w22, x16");
+  COMPARE(umsubl(x0, w1, w2, x3), "umsubl x0, w1, w2, x3");
+  COMPARE(umsubl(x30, w21, w22, x16), "umsubl x30, w21, w22, x16");
+
+  CLEANUP();
+}
+
+
+TEST(dp_1_source) {
+  SETUP();
+
+  COMPARE(rbit(w0, w1), "rbit w0, w1");
+  COMPARE(rbit(x2, x3), "rbit x2, x3");
+  COMPARE(rev16(w4, w5), "rev16 w4, w5");
+  COMPARE(rev16(x6, x7), "rev16 x6, x7");
+  COMPARE(rev32(x8, x9), "rev32 x8, x9");
+  COMPARE(rev(w10, w11), "rev w10, w11");
+  COMPARE(rev(x12, x13), "rev x12, x13");
+  COMPARE(clz(w14, w15), "clz w14, w15");
+  COMPARE(clz(x16, x17), "clz x16, x17");
+  COMPARE(cls(w18, w19), "cls w18, w19");
+  COMPARE(cls(x20, x21), "cls x20, x21");
+
+  CLEANUP();
+}
+
+
+TEST(bitfield) {
+  SETUP();
+
+  COMPARE(sxtb(w0, w1), "sxtb w0, w1");
+  COMPARE(sxtb(x2, x3), "sxtb x2, w3");
+  COMPARE(sxth(w4, w5), "sxth w4, w5");
+  COMPARE(sxth(x6, x7), "sxth x6, w7");
+  COMPARE(sxtw(x8, x9), "sxtw x8, w9");
+  COMPARE(uxtb(w10, w11), "uxtb w10, w11");
+  COMPARE(uxtb(x12, x13), "uxtb x12, w13");
+  COMPARE(uxth(w14, w15), "uxth w14, w15");
+  COMPARE(uxth(x16, x17), "uxth x16, w17");
+  COMPARE(uxtw(x18, x19), "ubfx x18, x19, #0, #32");
+
+  COMPARE(asr(w20, w21, 10), "asr w20, w21, #10");
+  COMPARE(asr(x22, x23, 20), "asr x22, x23, #20");
+  COMPARE(lsr(w24, w25, 10), "lsr w24, w25, #10");
+  COMPARE(lsr(x26, x27, 20), "lsr x26, x27, #20");
+  COMPARE(lsl(w28, w29, 10), "lsl w28, w29, #10");
+  COMPARE(lsl(x30, x0, 20), "lsl x30, x0, #20");
+
+  COMPARE(sbfiz(w1, w2, 1, 20), "sbfiz w1, w2, #1, #20");
+  COMPARE(sbfiz(x3, x4, 2, 19), "sbfiz x3, x4, #2, #19");
+  COMPARE(sbfx(w5, w6, 3, 18), "sbfx w5, w6, #3, #18");
+  COMPARE(sbfx(x7, x8, 4, 17), "sbfx x7, x8, #4, #17");
+  COMPARE(bfi(w9, w10, 5, 16), "bfi w9, w10, #5, #16");
+  COMPARE(bfi(x11, x12, 6, 15), "bfi x11, x12, #6, #15");
+  COMPARE(bfxil(w13, w14, 7, 14), "bfxil w13, w14, #7, #14");
+  COMPARE(bfxil(x15, x16, 8, 13), "bfxil x15, x16, #8, #13");
+  COMPARE(ubfiz(w17, w18, 9, 12), "ubfiz w17, w18, #9, #12");
+  COMPARE(ubfiz(x19, x20, 10, 11), "ubfiz x19, x20, #10, #11");
+  COMPARE(ubfx(w21, w22, 11, 10), "ubfx w21, w22, #11, #10");
+  COMPARE(ubfx(x23, x24, 12, 9), "ubfx x23, x24, #12, #9");
+
+  CLEANUP();
+}
+
+
+TEST(extract) {
+  SETUP();
+
+  COMPARE(extr(w0, w1, w2, 0), "extr w0, w1, w2, #0");
+  COMPARE(extr(x3, x4, x5, 1), "extr x3, x4, x5, #1");
+  COMPARE(extr(w6, w7, w8, 31), "extr w6, w7, w8, #31");
+  COMPARE(extr(x9, x10, x11, 63), "extr x9, x10, x11, #63");
+  COMPARE(extr(w12, w13, w13, 10), "ror w12, w13, #10");
+  COMPARE(extr(x14, x15, x15, 42), "ror x14, x15, #42");
+
+  CLEANUP();
+}
+
+
+TEST(logical_immediate) {
+  SETUP();
+  #define RESULT_SIZE (256)
+
+  char result[RESULT_SIZE];
+
+  // Test immediate encoding - 64-bit destination.
+  // 64-bit patterns.
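+  // Rotating the value one bit per iteration exercises every rotation of the
+  // bit pattern that the logical immediate encoding can represent.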
+  uint64_t value = 0x7fffffff;
+  for (int i = 0; i < 64; i++) {
+    snprintf(result, RESULT_SIZE, "and x0, x0, #0x%" PRIx64, value);
+    COMPARE(and_(x0, x0, Operand(value)), result);
+    value = ((value & 1) << 63) | (value >> 1);  // Rotate right 1 bit.
+  }
+
+  // 32-bit patterns.
+  value = 0x00003fff00003fffL;
+  for (int i = 0; i < 32; i++) {
+    snprintf(result, RESULT_SIZE, "and x0, x0, #0x%" PRIx64, value);
+    COMPARE(and_(x0, x0, Operand(value)), result);
+    value = ((value & 1) << 63) | (value >> 1);  // Rotate right 1 bit.
+  }
+
+  // 16-bit patterns.
+  value = 0x001f001f001f001fL;
+  for (int i = 0; i < 16; i++) {
+    snprintf(result, RESULT_SIZE, "and x0, x0, #0x%" PRIx64, value);
+    COMPARE(and_(x0, x0, Operand(value)), result);
+    value = ((value & 1) << 63) | (value >> 1);  // Rotate right 1 bit.
+  }
+
+  // 8-bit patterns.
+  value = 0x0e0e0e0e0e0e0e0eL;
+  for (int i = 0; i < 8; i++) {
+    snprintf(result, RESULT_SIZE, "and x0, x0, #0x%" PRIx64, value);
+    COMPARE(and_(x0, x0, Operand(value)), result);
+    value = ((value & 1) << 63) | (value >> 1);  // Rotate right 1 bit.
+  }
+
+  // 4-bit patterns.
+  value = 0x6666666666666666L;
+  for (int i = 0; i < 4; i++) {
+    snprintf(result, RESULT_SIZE, "and x0, x0, #0x%" PRIx64, value);
+    COMPARE(and_(x0, x0, Operand(value)), result);
+    value = ((value & 1) << 63) | (value >> 1);  // Rotate right 1 bit.
+  }
+
+  // 2-bit patterns.
+  COMPARE(and_(x0, x0, Operand(0x5555555555555555L)),
+          "and x0, x0, #0x5555555555555555");
+  COMPARE(and_(x0, x0, Operand(0xaaaaaaaaaaaaaaaaL)),
+          "and x0, x0, #0xaaaaaaaaaaaaaaaa");
+
+  // Test immediate encoding - 32-bit destination.
+  COMPARE(and_(w0, w0, Operand(0xff8007ff)),
+          "and w0, w0, #0xff8007ff");  // 32-bit pattern.
+  COMPARE(and_(w0, w0, Operand(0xf87ff87f)),
+          "and w0, w0, #0xf87ff87f");  // 16-bit pattern.
+  COMPARE(and_(w0, w0, Operand(0x87878787)),
+          "and w0, w0, #0x87878787");  // 8-bit pattern.
+  COMPARE(and_(w0, w0, Operand(0x66666666)),
+          "and w0, w0, #0x66666666");  // 4-bit pattern.
+  COMPARE(and_(w0, w0, Operand(0x55555555)),
+          "and w0, w0, #0x55555555");  // 2-bit pattern.
+
+  // Test other instructions.
+  COMPARE(tst(w1, Operand(0x11111111)),
+          "tst w1, #0x11111111");
+  COMPARE(tst(x2, Operand(0x8888888888888888L)),
+          "tst x2, #0x8888888888888888");
+  COMPARE(orr(w7, w8, Operand(0xaaaaaaaa)),
+          "orr w7, w8, #0xaaaaaaaa");
+  COMPARE(orr(x9, x10, Operand(0x5555555555555555L)),
+          "orr x9, x10, #0x5555555555555555");
+  COMPARE(eor(w15, w16, Operand(0x00000001)),
+          "eor w15, w16, #0x1");
+  COMPARE(eor(x17, x18, Operand(0x0000000000000003L)),
+          "eor x17, x18, #0x3");
+  COMPARE(and_(w23, w24, Operand(0x0000000f), SetFlags),
+          "ands w23, w24, #0xf");
+  COMPARE(and_(x25, x26, Operand(0x800000000000000fL), SetFlags),
+          "ands x25, x26, #0x800000000000000f");
+
+  // Test inverse.
+  COMPARE(bic(w3, w4, Operand(0x20202020)),
+          "and w3, w4, #0xdfdfdfdf");
+  COMPARE(bic(x5, x6, Operand(0x4040404040404040L)),
+          "and x5, x6, #0xbfbfbfbfbfbfbfbf");
+  COMPARE(orn(w11, w12, Operand(0x40004000)),
+          "orr w11, w12, #0xbfffbfff");
+  COMPARE(orn(x13, x14, Operand(0x8181818181818181L)),
+          "orr x13, x14, #0x7e7e7e7e7e7e7e7e");
+  COMPARE(eon(w19, w20, Operand(0x80000001)),
+          "eor w19, w20, #0x7ffffffe");
+  COMPARE(eon(x21, x22, Operand(0xc000000000000003L)),
+          "eor x21, x22, #0x3ffffffffffffffc");
+  COMPARE(bic(w27, w28, Operand(0xfffffff7), SetFlags),
+          "ands w27, w28, #0x8");
+  COMPARE(bic(x29, x0, Operand(0xfffffffeffffffffL), SetFlags),
+          "ands x29, x0, #0x100000000");
+
+  // Test stack pointer.
+  COMPARE(and_(wsp, wzr, Operand(7)), "and wsp, wzr, #0x7");
+  COMPARE(and_(xzr, xzr, Operand(7), SetFlags), "tst xzr, #0x7");
+  COMPARE(orr(sp, xzr, Operand(15)), "orr sp, xzr, #0xf");
+  COMPARE(eor(wsp, w0, Operand(31)), "eor wsp, w0, #0x1f");
+
+  // Test move aliases.
+  COMPARE(orr(w0, wzr, Operand(0x00000780)), "orr w0, wzr, #0x780");
+  COMPARE(orr(w1, wzr, Operand(0x00007800)), "orr w1, wzr, #0x7800");
+  COMPARE(orr(w2, wzr, Operand(0x00078000)), "mov w2, #0x78000");
+  COMPARE(orr(w3, wzr, Operand(0x00780000)), "orr w3, wzr, #0x780000");
+  COMPARE(orr(w4, wzr, Operand(0x07800000)), "orr w4, wzr, #0x7800000");
+  COMPARE(orr(x5, xzr, Operand(0xffffffffffffc001UL)),
+          "orr x5, xzr, #0xffffffffffffc001");
+  COMPARE(orr(x6, xzr, Operand(0xfffffffffffc001fUL)),
+          "mov x6, #0xfffffffffffc001f");
+  COMPARE(orr(x7, xzr, Operand(0xffffffffffc001ffUL)),
+          "mov x7, #0xffffffffffc001ff");
+  COMPARE(orr(x8, xzr, Operand(0xfffffffffc001fffUL)),
+          "mov x8, #0xfffffffffc001fff");
+  COMPARE(orr(x9, xzr, Operand(0xffffffffc001ffffUL)),
+          "orr x9, xzr, #0xffffffffc001ffff");
+
+  CLEANUP();
+}
+
+
+TEST(logical_shifted) {
+  SETUP();
+
+  COMPARE(and_(w0, w1, Operand(w2)), "and w0, w1, w2");
+  COMPARE(and_(x3, x4, Operand(x5, LSL, 1)), "and x3, x4, x5, lsl #1");
+  COMPARE(and_(w6, w7, Operand(w8, LSR, 2)), "and w6, w7, w8, lsr #2");
+  COMPARE(and_(x9, x10, Operand(x11, ASR, 3)), "and x9, x10, x11, asr #3");
+  COMPARE(and_(w12, w13, Operand(w14, ROR, 4)), "and w12, w13, w14, ror #4");
+
+  COMPARE(bic(w15, w16, Operand(w17)), "bic w15, w16, w17");
+  COMPARE(bic(x18, x19, Operand(x20, LSL, 5)), "bic x18, x19, x20, lsl #5");
+  COMPARE(bic(w21, w22, Operand(w23, LSR, 6)), "bic w21, w22, w23, lsr #6");
+  COMPARE(bic(x24, x25, Operand(x26, ASR, 7)), "bic x24, x25, x26, asr #7");
+  COMPARE(bic(w27, w28, Operand(w29, ROR, 8)), "bic w27, w28, w29, ror #8");
+
+  COMPARE(orr(w0, w1, Operand(w2)), "orr w0, w1, w2");
+  COMPARE(orr(x3, x4, Operand(x5, LSL, 9)), "orr x3, x4, x5, lsl #9");
+  COMPARE(orr(w6, w7, Operand(w8, LSR, 10)), "orr w6, w7, w8, lsr #10");
+  COMPARE(orr(x9, x10, Operand(x11, ASR, 11)), "orr x9, x10, x11, asr #11");
+  COMPARE(orr(w12, w13, Operand(w14, ROR, 12)), "orr w12, w13, w14, ror #12");
+
+  COMPARE(orn(w15, w16, Operand(w17)), "orn w15, w16, w17");
+  COMPARE(orn(x18, x19, Operand(x20, LSL, 13)), "orn x18, x19, x20, lsl #13");
+  COMPARE(orn(w21, w22, Operand(w23, LSR, 14)), "orn w21, w22, w23, lsr #14");
+  COMPARE(orn(x24, x25, Operand(x26, ASR, 15)), "orn x24, x25, x26, asr #15");
+  COMPARE(orn(w27, w28, Operand(w29, ROR, 16)), "orn w27, w28, w29, ror #16");
+
+  COMPARE(eor(w0, w1, Operand(w2)), "eor w0, w1, w2");
+  COMPARE(eor(x3, x4, Operand(x5, LSL, 17)), "eor x3, x4, x5, lsl #17");
+  COMPARE(eor(w6, w7, Operand(w8, LSR, 18)), "eor w6, w7, w8, lsr #18");
+  COMPARE(eor(x9, x10, Operand(x11, ASR, 19)), "eor x9, x10, x11, asr #19");
+  COMPARE(eor(w12, w13, Operand(w14, ROR, 20)), "eor w12, w13, w14, ror #20");
+
+  COMPARE(eon(w15, w16, Operand(w17)), "eon w15, w16, w17");
+  COMPARE(eon(x18, x19, Operand(x20, LSL, 21)), "eon x18, x19, x20, lsl #21");
+  COMPARE(eon(w21, w22, Operand(w23, LSR, 22)), "eon w21, w22, w23, lsr #22");
+  COMPARE(eon(x24, x25, Operand(x26, ASR, 23)), "eon x24, x25, x26, asr #23");
+  COMPARE(eon(w27, w28, Operand(w29, ROR, 24)), "eon w27, w28, w29, ror #24");
+
+  COMPARE(and_(w0, w1, Operand(w2), SetFlags), "ands w0, w1, w2");
+  COMPARE(and_(x3, x4, Operand(x5, LSL, 1), SetFlags),
+          "ands x3, x4, x5, lsl #1");
+  COMPARE(and_(w6, w7, Operand(w8, LSR, 2), SetFlags),
+          "ands w6, w7, w8, lsr #2");
+  COMPARE(and_(x9, x10, Operand(x11, ASR, 3), SetFlags),
+          "ands x9, x10, x11, asr #3");
+  COMPARE(and_(w12, w13, Operand(w14, ROR, 4), SetFlags),
+          "ands w12, w13, w14, ror #4");
+
+  COMPARE(bic(w15, w16, Operand(w17), SetFlags), "bics w15, w16, w17");
+  COMPARE(bic(x18, x19, Operand(x20, LSL, 5), SetFlags),
+          "bics x18, x19, x20, lsl #5");
+  COMPARE(bic(w21, w22, Operand(w23, LSR, 6), SetFlags),
+          "bics w21, w22, w23, lsr #6");
+  COMPARE(bic(x24, x25, Operand(x26, ASR, 7), SetFlags),
+          "bics x24, x25, x26, asr #7");
+  COMPARE(bic(w27, w28, Operand(w29, ROR, 8), SetFlags),
+          "bics w27, w28, w29, ror #8");
+
+  COMPARE(tst(w0, Operand(w1)), "tst w0, w1");
+  COMPARE(tst(w2, Operand(w3, ROR, 10)), "tst w2, w3, ror #10");
+  COMPARE(tst(x0, Operand(x1)), "tst x0, x1");
+  COMPARE(tst(x2, Operand(x3, ROR, 42)), "tst x2, x3, ror #42");
+
+  COMPARE(orn(w0, wzr, Operand(w1)), "mvn w0, w1");
+  COMPARE(orn(w2, wzr, Operand(w3, ASR, 5)), "mvn w2, w3, asr #5");
+  COMPARE(orn(x0, xzr, Operand(x1)), "mvn x0, x1");
+  COMPARE(orn(x2, xzr, Operand(x3, ASR, 42)), "mvn x2, x3, asr #42");
+
+  COMPARE(orr(w0, wzr, Operand(w1)), "mov w0, w1");
+  COMPARE(orr(x0, xzr, Operand(x1)), "mov x0, x1");
+  COMPARE(orr(w16, wzr, Operand(w17, LSL, 1)), "orr w16, wzr, w17, lsl #1");
+  COMPARE(orr(x16, xzr, Operand(x17, ASR, 2)), "orr x16, xzr, x17, asr #2");
+
+  CLEANUP();
+}
+
+
+TEST(dp_2_source) {
+  SETUP();
+
+  COMPARE(lslv(w0, w1, w2), "lsl w0, w1, w2");
+  COMPARE(lslv(x3, x4, x5), "lsl x3, x4, x5");
+  COMPARE(lsrv(w6, w7, w8), "lsr w6, w7, w8");
+  COMPARE(lsrv(x9, x10, x11), "lsr x9, x10, x11");
+  COMPARE(asrv(w12, w13, w14), "asr w12, w13, w14");
+  COMPARE(asrv(x15, x16, x17), "asr x15, x16, x17");
+  COMPARE(rorv(w18, w19, w20), "ror w18, w19, w20");
+  COMPARE(rorv(x21, x22, x23), "ror x21, x22, x23");
+
+  CLEANUP();
+}
+
+TEST(adr) {
+  SETUP();
+
+  COMPARE(adr(x0, 0), "adr x0, #+0x0");
+  COMPARE(adr(x1, 1), "adr x1, #+0x1");
+  COMPARE(adr(x2, -1), "adr x2, #-0x1");
+  COMPARE(adr(x3, 4), "adr x3, #+0x4");
+  COMPARE(adr(x4, -4), "adr x4, #-0x4");
+  COMPARE(adr(x5, 0x000fffff), "adr x5, #+0xfffff");
+  COMPARE(adr(x6, -0x00100000), "adr x6, #-0x100000");
+  COMPARE(adr(xzr, 0), "adr xzr, #+0x0");
+
+  CLEANUP();
+}
+
+TEST(branch) {
+  SETUP();
+
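+  // Branch immediates are encoded in units of instructions (4 bytes), so
+  // byte offsets are converted with a right shift by kInstructionSizeLog2.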
+  #define INST_OFF(x) ((x) >> kInstructionSizeLog2)
+  COMPARE(b(INST_OFF(0x4)), "b #+0x4");
+  COMPARE(b(INST_OFF(-0x4)), "b #-0x4");
+  COMPARE(b(INST_OFF(0x7fffffc)), "b #+0x7fffffc");
+  COMPARE(b(INST_OFF(-0x8000000)), "b #-0x8000000");
+  COMPARE(b(INST_OFF(0xffffc), eq), "b.eq #+0xffffc");
+  COMPARE(b(INST_OFF(-0x100000), mi), "b.mi #-0x100000");
+  COMPARE(bl(INST_OFF(0x4)), "bl #+0x4");
+  COMPARE(bl(INST_OFF(-0x4)), "bl #-0x4");
+  COMPARE(bl(INST_OFF(0xffffc)), "bl #+0xffffc");
+  COMPARE(bl(INST_OFF(-0x100000)), "bl #-0x100000");
+  COMPARE(cbz(w0, INST_OFF(0xffffc)), "cbz w0, #+0xffffc");
+  COMPARE(cbz(x1, INST_OFF(-0x100000)), "cbz x1, #-0x100000");
+  COMPARE(cbnz(w2, INST_OFF(0xffffc)), "cbnz w2, #+0xffffc");
+  COMPARE(cbnz(x3, INST_OFF(-0x100000)), "cbnz x3, #-0x100000");
+  COMPARE(tbz(x4, 0, INST_OFF(0x7ffc)), "tbz x4, #0, #+0x7ffc");
+  COMPARE(tbz(x5, 63, INST_OFF(-0x8000)), "tbz x5, #63, #-0x8000");
+  COMPARE(tbnz(x6, 0, INST_OFF(0x7ffc)), "tbnz x6, #0, #+0x7ffc");
+  COMPARE(tbnz(x7, 63, INST_OFF(-0x8000)), "tbnz x7, #63, #-0x8000");
+
+  COMPARE(br(x0), "br x0");
+  COMPARE(blr(x1), "blr x1");
+  COMPARE(ret(x2), "ret x2");
+  COMPARE(ret(lr), "ret");
+
+  CLEANUP();
+}
+
+TEST(load_store) {
+  SETUP();
+
+  COMPARE(ldr(w0, MemOperand(x1)), "ldr w0, [x1]");
+  COMPARE(ldr(w2, MemOperand(x3, 4)), "ldr w2, [x3, #4]");
+  COMPARE(ldr(w4, MemOperand(x5, 16380)), "ldr w4, [x5, #16380]");
+  COMPARE(ldr(x6, MemOperand(x7)), "ldr x6, [x7]");
+  COMPARE(ldr(x8, MemOperand(x9, 8)), "ldr x8, [x9, #8]");
+  COMPARE(ldr(x10, MemOperand(x11, 32760)), "ldr x10, [x11, #32760]");
+  COMPARE(str(w12, MemOperand(x13)), "str w12, [x13]");
+  COMPARE(str(w14, MemOperand(x15, 4)), "str w14, [x15, #4]");
+  COMPARE(str(w16, MemOperand(x17, 16380)), "str w16, [x17, #16380]");
+  COMPARE(str(x18, MemOperand(x19)), "str x18, [x19]");
+  COMPARE(str(x20, MemOperand(x21, 8)), "str x20, [x21, #8]");
+  COMPARE(str(x22, MemOperand(x23, 32760)), "str x22, [x23, #32760]");
+
+  COMPARE(ldr(w0, MemOperand(x1, 4, PreIndex)), "ldr w0, [x1, #4]!");
+  COMPARE(ldr(w2, MemOperand(x3, 255, PreIndex)), "ldr w2, [x3, #255]!");
+  COMPARE(ldr(w4, MemOperand(x5, -256, PreIndex)), "ldr w4, [x5, #-256]!");
+  COMPARE(ldr(x6, MemOperand(x7, 8, PreIndex)), "ldr x6, [x7, #8]!");
+  COMPARE(ldr(x8, MemOperand(x9, 255, PreIndex)), "ldr x8, [x9, #255]!");
+  COMPARE(ldr(x10, MemOperand(x11, -256, PreIndex)), "ldr x10, [x11, #-256]!");
+  COMPARE(str(w12, MemOperand(x13, 4, PreIndex)), "str w12, [x13, #4]!");
+  COMPARE(str(w14, MemOperand(x15, 255, PreIndex)), "str w14, [x15, #255]!");
+  COMPARE(str(w16, MemOperand(x17, -256, PreIndex)), "str w16, [x17, #-256]!");
+  COMPARE(str(x18, MemOperand(x19, 8, PreIndex)), "str x18, [x19, #8]!");
+  COMPARE(str(x20, MemOperand(x21, 255, PreIndex)), "str x20, [x21, #255]!");
+  COMPARE(str(x22, MemOperand(x23, -256, PreIndex)), "str x22, [x23, #-256]!");
+
+  COMPARE(ldr(w0, MemOperand(x1, 4, PostIndex)), "ldr w0, [x1], #4");
+  COMPARE(ldr(w2, MemOperand(x3, 255, PostIndex)), "ldr w2, [x3], #255");
+  COMPARE(ldr(w4, MemOperand(x5, -256, PostIndex)), "ldr w4, [x5], #-256");
+  COMPARE(ldr(x6, MemOperand(x7, 8, PostIndex)), "ldr x6, [x7], #8");
+  COMPARE(ldr(x8, MemOperand(x9, 255, PostIndex)), "ldr x8, [x9], #255");
+  COMPARE(ldr(x10, MemOperand(x11, -256, PostIndex)), "ldr x10, [x11], #-256");
+  COMPARE(str(w12, MemOperand(x13, 4, PostIndex)), "str w12, [x13], #4");
+  COMPARE(str(w14, MemOperand(x15, 255, PostIndex)), "str w14, [x15], #255");
+  COMPARE(str(w16, MemOperand(x17, -256, PostIndex)), "str w16, [x17], #-256");
+  COMPARE(str(x18, MemOperand(x19, 8, PostIndex)), "str x18, [x19], #8");
+  COMPARE(str(x20, MemOperand(x21, 255, PostIndex)), "str x20, [x21], #255");
+  COMPARE(str(x22, MemOperand(x23, -256, PostIndex)), "str x22, [x23], #-256");
+
+  COMPARE(ldr(w24, MemOperand(sp)), "ldr w24, [sp]");
+  COMPARE(ldr(x25, MemOperand(sp, 8)), "ldr x25, [sp, #8]");
+  COMPARE(str(w26, MemOperand(sp, 4, PreIndex)), "str w26, [sp, #4]!");
+  COMPARE(str(x27, MemOperand(sp, -8, PostIndex)), "str x27, [sp], #-8");
+
+  COMPARE(ldrsw(x0, MemOperand(x1)), "ldrsw x0, [x1]");
+  COMPARE(ldrsw(x2, MemOperand(x3, 8)), "ldrsw x2, [x3, #8]");
+  COMPARE(ldrsw(x4, MemOperand(x5, 42, PreIndex)), "ldrsw x4, [x5, #42]!");
+  COMPARE(ldrsw(x6, MemOperand(x7, -11, PostIndex)), "ldrsw x6, [x7], #-11");
+
+  CLEANUP();
+}
+
+
+TEST(load_store_regoffset) {
+  SETUP();
+
+  COMPARE(ldr(w0, MemOperand(x1, w2, UXTW)), "ldr w0, [x1, w2, uxtw]");
+  COMPARE(ldr(w3, MemOperand(x4, w5, UXTW, 2)), "ldr w3, [x4, w5, uxtw #2]");
+  COMPARE(ldr(w6, MemOperand(x7, x8)), "ldr w6, [x7, x8]");
+  COMPARE(ldr(w9, MemOperand(x10, x11, LSL, 2)), "ldr w9, [x10, x11, lsl #2]");
+  COMPARE(ldr(w12, MemOperand(x13, w14, SXTW)), "ldr w12, [x13, w14, sxtw]");
+  COMPARE(ldr(w15, MemOperand(x16, w17, SXTW, 2)),
+          "ldr w15, [x16, w17, sxtw #2]");
+  COMPARE(ldr(w18, MemOperand(x19, x20, SXTX)), "ldr w18, [x19, x20, sxtx]");
+  COMPARE(ldr(w21, MemOperand(x22, x23, SXTX, 2)),
+          "ldr w21, [x22, x23, sxtx #2]");
+  COMPARE(ldr(x0, MemOperand(x1, w2, UXTW)), "ldr x0, [x1, w2, uxtw]");
+  COMPARE(ldr(x3, MemOperand(x4, w5, UXTW, 3)), "ldr x3, [x4, w5, uxtw #3]");
+  COMPARE(ldr(x6, MemOperand(x7, x8)), "ldr x6, [x7, x8]");
+  COMPARE(ldr(x9, MemOperand(x10, x11, LSL, 3)), "ldr x9, [x10, x11, lsl #3]");
+  COMPARE(ldr(x12, MemOperand(x13, w14, SXTW)), "ldr x12, [x13, w14, sxtw]");
+  COMPARE(ldr(x15, MemOperand(x16, w17, SXTW, 3)),
+          "ldr x15, [x16, w17, sxtw #3]");
+  COMPARE(ldr(x18, MemOperand(x19, x20, SXTX)), "ldr x18, [x19, x20, sxtx]");
+  COMPARE(ldr(x21, MemOperand(x22, x23, SXTX, 3)),
+          "ldr x21, [x22, x23, sxtx #3]");
+
+  COMPARE(str(w0, MemOperand(x1, w2, UXTW)), "str w0, [x1, w2, uxtw]");
+  COMPARE(str(w3, MemOperand(x4, w5, UXTW, 2)), "str w3, [x4, w5, uxtw #2]");
+  COMPARE(str(w6, MemOperand(x7, x8)), "str w6, [x7, x8]");
+  COMPARE(str(w9, MemOperand(x10, x11, LSL, 2)), "str w9, [x10, x11, lsl #2]");
+  COMPARE(str(w12, MemOperand(x13, w14, SXTW)), "str w12, [x13, w14, sxtw]");
+  COMPARE(str(w15, MemOperand(x16, w17, SXTW, 2)),
+          "str w15, [x16, w17, sxtw #2]");
+  COMPARE(str(w18, MemOperand(x19, x20, SXTX)), "str w18, [x19, x20, sxtx]");
+  COMPARE(str(w21, MemOperand(x22, x23, SXTX, 2)),
+          "str w21, [x22, x23, sxtx #2]");
+  COMPARE(str(x0, MemOperand(x1, w2, UXTW)), "str x0, [x1, w2, uxtw]");
+  COMPARE(str(x3, MemOperand(x4, w5, UXTW, 3)), "str x3, [x4, w5, uxtw #3]");
+  COMPARE(str(x6, MemOperand(x7, x8)), "str x6, [x7, x8]");
+  COMPARE(str(x9, MemOperand(x10, x11, LSL, 3)), "str x9, [x10, x11, lsl #3]");
+  COMPARE(str(x12, MemOperand(x13, w14, SXTW)), "str x12, [x13, w14, sxtw]");
+  COMPARE(str(x15, MemOperand(x16, w17, SXTW, 3)),
+          "str x15, [x16, w17, sxtw #3]");
+  COMPARE(str(x18, MemOperand(x19, x20, SXTX)), "str x18, [x19, x20, sxtx]");
+  COMPARE(str(x21, MemOperand(x22, x23, SXTX, 3)),
+          "str x21, [x22, x23, sxtx #3]");
+
+  COMPARE(ldrb(w0, MemOperand(x1, w2, UXTW)), "ldrb w0, [x1, w2, uxtw]");
+  COMPARE(ldrb(w6, MemOperand(x7, x8)), "ldrb w6, [x7, x8]");
+  COMPARE(ldrb(w12, MemOperand(x13, w14, SXTW)), "ldrb w12, [x13, w14, sxtw]");
+  COMPARE(ldrb(w18, MemOperand(x19, x20, SXTX)), "ldrb w18, [x19, x20, sxtx]");
+  COMPARE(strb(w0, MemOperand(x1, w2, UXTW)), "strb w0, [x1, w2, uxtw]");
+  COMPARE(strb(w6, MemOperand(x7, x8)), "strb w6, [x7, x8]");
+  COMPARE(strb(w12, MemOperand(x13, w14, SXTW)), "strb w12, [x13, w14, sxtw]");
+  COMPARE(strb(w18, MemOperand(x19, x20, SXTX)), "strb w18, [x19, x20, sxtx]");
+
+  COMPARE(ldrh(w0, MemOperand(x1, w2, UXTW)), "ldrh w0, [x1, w2, uxtw]");
+  COMPARE(ldrh(w3, MemOperand(x4, w5, UXTW, 1)), "ldrh w3, [x4, w5, uxtw #1]");
+  COMPARE(ldrh(w6, MemOperand(x7, x8)), "ldrh w6, [x7, x8]");
+  COMPARE(ldrh(w9, MemOperand(x10, x11, LSL, 1)),
+          "ldrh w9, [x10, x11, lsl #1]");
+  COMPARE(ldrh(w12, MemOperand(x13, w14, SXTW)), "ldrh w12, [x13, w14, sxtw]");
+  COMPARE(ldrh(w15, MemOperand(x16, w17, SXTW, 1)),
+          "ldrh w15, [x16, w17, sxtw #1]");
+  COMPARE(ldrh(w18, MemOperand(x19, x20, SXTX)), "ldrh w18, [x19, x20, sxtx]");
+  COMPARE(ldrh(w21, MemOperand(x22, x23, SXTX, 1)),
+          "ldrh w21, [x22, x23, sxtx #1]");
+  COMPARE(strh(w0, MemOperand(x1, w2, UXTW)), "strh w0, [x1, w2, uxtw]");
+  COMPARE(strh(w3, MemOperand(x4, w5, UXTW, 1)), "strh w3, [x4, w5, uxtw #1]");
+  COMPARE(strh(w6, MemOperand(x7, x8)), "strh w6, [x7, x8]");
+  COMPARE(strh(w9, MemOperand(x10, x11, LSL, 1)),
+          "strh w9, [x10, x11, lsl #1]");
+  COMPARE(strh(w12, MemOperand(x13, w14, SXTW)), "strh w12, [x13, w14, sxtw]");
+  COMPARE(strh(w15, MemOperand(x16, w17, SXTW, 1)),
+          "strh w15, [x16, w17, sxtw #1]");
+  COMPARE(strh(w18, MemOperand(x19, x20, SXTX)), "strh w18, [x19, x20, sxtx]");
+  COMPARE(strh(w21, MemOperand(x22, x23, SXTX, 1)),
+          "strh w21, [x22, x23, sxtx #1]");
+
+  COMPARE(ldr(x0, MemOperand(sp, wzr, SXTW)), "ldr x0, [sp, wzr, sxtw]");
+  COMPARE(str(x1, MemOperand(sp, xzr)), "str x1, [sp, xzr]");
+
+  CLEANUP();
+}
+
+
+TEST(load_store_byte) {
+  SETUP();
+
+  COMPARE(ldrb(w0, MemOperand(x1)), "ldrb w0, [x1]");
+  COMPARE(ldrb(x2, MemOperand(x3)), "ldrb w2, [x3]");
+  COMPARE(ldrb(w4, MemOperand(x5, 4095)), "ldrb w4, [x5, #4095]");
+  COMPARE(ldrb(w6, MemOperand(x7, 255, PreIndex)), "ldrb w6, [x7, #255]!");
+  COMPARE(ldrb(w8, MemOperand(x9, -256, PreIndex)), "ldrb w8, [x9, #-256]!");
+  COMPARE(ldrb(w10, MemOperand(x11, 255, PostIndex)), "ldrb w10, [x11], #255");
+  COMPARE(ldrb(w12, MemOperand(x13, -256, PostIndex)),
+          "ldrb w12, [x13], #-256");
+  COMPARE(strb(w14, MemOperand(x15)), "strb w14, [x15]");
+  COMPARE(strb(x16, MemOperand(x17)), "strb w16, [x17]");
+  COMPARE(strb(w18, MemOperand(x19, 4095)), "strb w18, [x19, #4095]");
+  COMPARE(strb(w20, MemOperand(x21, 255, PreIndex)), "strb w20, [x21, #255]!");
+  COMPARE(strb(w22, MemOperand(x23, -256, PreIndex)),
+          "strb w22, [x23, #-256]!");
+  COMPARE(strb(w24, MemOperand(x25, 255, PostIndex)), "strb w24, [x25], #255");
+  COMPARE(strb(w26, MemOperand(x27, -256, PostIndex)),
+          "strb w26, [x27], #-256");
+  COMPARE(ldrb(w28, MemOperand(sp, 3, PostIndex)), "ldrb w28, [sp], #3");
+  COMPARE(strb(x29, MemOperand(sp, -42, PreIndex)), "strb w29, [sp, #-42]!");
+  COMPARE(ldrsb(w0, MemOperand(x1)), "ldrsb w0, [x1]");
+  COMPARE(ldrsb(x2, MemOperand(x3, 8)), "ldrsb x2, [x3, #8]");
+  COMPARE(ldrsb(w4, MemOperand(x5, 42, PreIndex)), "ldrsb w4, [x5, #42]!");
+  COMPARE(ldrsb(x6, MemOperand(x7, -11, PostIndex)), "ldrsb x6, [x7], #-11");
+
+  CLEANUP();
+}
+
+
+TEST(load_store_half) {
+  SETUP();
+
+  COMPARE(ldrh(w0, MemOperand(x1)), "ldrh w0, [x1]");
+  COMPARE(ldrh(x2, MemOperand(x3)), "ldrh w2, [x3]");
+  COMPARE(ldrh(w4, MemOperand(x5, 8190)), "ldrh w4, [x5, #8190]");
+  COMPARE(ldrh(w6, MemOperand(x7, 255, PreIndex)), "ldrh w6, [x7, #255]!");
+  COMPARE(ldrh(w8, MemOperand(x9, -256, PreIndex)), "ldrh w8, [x9, #-256]!");
+  COMPARE(ldrh(w10, MemOperand(x11, 255, PostIndex)), "ldrh w10, [x11], #255");
+  COMPARE(ldrh(w12, MemOperand(x13, -256, PostIndex)),
+          "ldrh w12, [x13], #-256");
+  COMPARE(strh(w14, MemOperand(x15)), "strh w14, [x15]");
+  COMPARE(strh(x16, MemOperand(x17)), "strh w16, [x17]");
+  COMPARE(strh(w18, MemOperand(x19, 8190)), "strh w18, [x19, #8190]");
+  COMPARE(strh(w20, MemOperand(x21, 255, PreIndex)), "strh w20, [x21, #255]!");
+  COMPARE(strh(w22, MemOperand(x23, -256, PreIndex)),
+          "strh w22, [x23, #-256]!");
+  COMPARE(strh(w24, MemOperand(x25, 255, PostIndex)), "strh w24, [x25], #255");
+  COMPARE(strh(w26, MemOperand(x27, -256, PostIndex)),
+          "strh w26, [x27], #-256");
+  COMPARE(ldrh(w28, MemOperand(sp, 3, PostIndex)), "ldrh w28, [sp], #3");
+  COMPARE(strh(x29, MemOperand(sp, -42, PreIndex)), "strh w29, [sp, #-42]!");
+  COMPARE(ldrh(w30, MemOperand(x0, 255)), "ldurh w30, [x0, #255]");
+  COMPARE(ldrh(x1, MemOperand(x2, -256)), "ldurh w1, [x2, #-256]");
+  COMPARE(strh(w3, MemOperand(x4, 255)), "sturh w3, [x4, #255]");
+  COMPARE(strh(x5, MemOperand(x6, -256)), "sturh w5, [x6, #-256]");
+  COMPARE(ldrsh(w0, MemOperand(x1)), "ldrsh w0, [x1]");
+  COMPARE(ldrsh(w2, MemOperand(x3, 8)), "ldrsh w2, [x3, #8]");
+  COMPARE(ldrsh(w4, MemOperand(x5, 42, PreIndex)), "ldrsh w4, [x5, #42]!");
+  COMPARE(ldrsh(x6, MemOperand(x7, -11, PostIndex)), "ldrsh x6, [x7], #-11");
+
+  CLEANUP();
+}
+
+
+TEST(load_store_fp) {
+  SETUP();
+
+  COMPARE(ldr(s0, MemOperand(x1)), "ldr s0, [x1]");
+  COMPARE(ldr(s2, MemOperand(x3, 4)), "ldr s2, [x3, #4]");
+  COMPARE(ldr(s4, MemOperand(x5, 16380)), "ldr s4, [x5, #16380]");
+  COMPARE(ldr(d6, MemOperand(x7)), "ldr d6, [x7]");
+  COMPARE(ldr(d8, MemOperand(x9, 8)), "ldr d8, [x9, #8]");
+  COMPARE(ldr(d10, MemOperand(x11, 32760)), "ldr d10, [x11, #32760]");
+  COMPARE(str(s12, MemOperand(x13)), "str s12, [x13]");
+  COMPARE(str(s14, MemOperand(x15, 4)), "str s14, [x15, #4]");
+  COMPARE(str(s16, MemOperand(x17, 16380)), "str s16, [x17, #16380]");
+  COMPARE(str(d18, MemOperand(x19)), "str d18, [x19]");
+  COMPARE(str(d20, MemOperand(x21, 8)), "str d20, [x21, #8]");
+  COMPARE(str(d22, MemOperand(x23, 32760)), "str d22, [x23, #32760]");
+
+  COMPARE(ldr(s0, MemOperand(x1, 4, PreIndex)), "ldr s0, [x1, #4]!");
+  COMPARE(ldr(s2, MemOperand(x3, 255, PreIndex)), "ldr s2, [x3, #255]!");
+  COMPARE(ldr(s4, MemOperand(x5, -256, PreIndex)), "ldr s4, [x5, #-256]!");
+  COMPARE(ldr(d6, MemOperand(x7, 8, PreIndex)), "ldr d6, [x7, #8]!");
+  COMPARE(ldr(d8, MemOperand(x9, 255, PreIndex)), "ldr d8, [x9, #255]!");
+  COMPARE(ldr(d10, MemOperand(x11, -256, PreIndex)), "ldr d10, [x11, #-256]!");
+  COMPARE(str(s12, MemOperand(x13, 4, PreIndex)), "str s12, [x13, #4]!");
+  COMPARE(str(s14, MemOperand(x15, 255, PreIndex)), "str s14, [x15, #255]!");
+  COMPARE(str(s16, MemOperand(x17, -256, PreIndex)), "str s16, [x17, #-256]!");
+  COMPARE(str(d18, MemOperand(x19, 8, PreIndex)), "str d18, [x19, #8]!");
+  COMPARE(str(d20, MemOperand(x21, 255, PreIndex)), "str d20, [x21, #255]!");
+  COMPARE(str(d22, MemOperand(x23, -256, PreIndex)), "str d22, [x23, #-256]!");
+
+  COMPARE(ldr(s0, MemOperand(x1, 4, PostIndex)), "ldr s0, [x1], #4");
+  COMPARE(ldr(s2, MemOperand(x3, 255, PostIndex)), "ldr s2, [x3], #255");
+  COMPARE(ldr(s4, MemOperand(x5, -256, PostIndex)), "ldr s4, [x5], #-256");
+  COMPARE(ldr(d6, MemOperand(x7, 8, PostIndex)), "ldr d6, [x7], #8");
+  COMPARE(ldr(d8, MemOperand(x9, 255, PostIndex)), "ldr d8, [x9], #255");
+  COMPARE(ldr(d10, MemOperand(x11, -256, PostIndex)), "ldr d10, [x11], #-256");
+  COMPARE(str(s12, MemOperand(x13, 4, PostIndex)), "str s12, [x13], #4");
+  COMPARE(str(s14, MemOperand(x15, 255, PostIndex)), "str s14, [x15], #255");
+  COMPARE(str(s16, MemOperand(x17, -256, PostIndex)), "str s16, [x17], #-256");
+  COMPARE(str(d18, MemOperand(x19, 8, PostIndex)), "str d18, [x19], #8");
+  COMPARE(str(d20, MemOperand(x21, 255, PostIndex)), "str d20, [x21], #255");
+  COMPARE(str(d22, MemOperand(x23, -256, PostIndex)), "str d22, [x23], #-256");
+
+  COMPARE(ldr(s24, MemOperand(sp)), "ldr s24, [sp]");
+  COMPARE(ldr(d25, MemOperand(sp, 8)), "ldr d25, [sp, #8]");
+  COMPARE(str(s26, MemOperand(sp, 4, PreIndex)), "str s26, [sp, #4]!");
+  COMPARE(str(d27, MemOperand(sp, -8, PostIndex)), "str d27, [sp], #-8");
+
+  CLEANUP();
+}
+
+
+TEST(load_store_unscaled) {
+  SETUP();
+
+  COMPARE(ldr(w0, MemOperand(x1, 1)), "ldur w0, [x1, #1]");
+  COMPARE(ldr(w2, MemOperand(x3, -1)), "ldur w2, [x3, #-1]");
+  COMPARE(ldr(w4, MemOperand(x5, 255)), "ldur w4, [x5, #255]");
+  COMPARE(ldr(w6, MemOperand(x7, -256)), "ldur w6, [x7, #-256]");
+  COMPARE(ldr(x8, MemOperand(x9, 1)), "ldur x8, [x9, #1]");
+  COMPARE(ldr(x10, MemOperand(x11, -1)), "ldur x10, [x11, #-1]");
+  COMPARE(ldr(x12, MemOperand(x13, 255)), "ldur x12, [x13, #255]");
+  COMPARE(ldr(x14, MemOperand(x15, -256)), "ldur x14, [x15, #-256]");
+  COMPARE(str(w16, MemOperand(x17, 1)), "stur w16, [x17, #1]");
+  COMPARE(str(w18, MemOperand(x19, -1)), "stur w18, [x19, #-1]");
+  COMPARE(str(w20, MemOperand(x21, 255)), "stur w20, [x21, #255]");
+  COMPARE(str(w22, MemOperand(x23, -256)), "stur w22, [x23, #-256]");
+  COMPARE(str(x24, MemOperand(x25, 1)), "stur x24, [x25, #1]");
+  COMPARE(str(x26, MemOperand(x27, -1)), "stur x26, [x27, #-1]");
+  COMPARE(str(x28, MemOperand(x29, 255)), "stur x28, [x29, #255]");
+  COMPARE(str(x30, MemOperand(x0, -256)), "stur x30, [x0, #-256]");
+  COMPARE(ldr(w0, MemOperand(sp, 1)), "ldur w0, [sp, #1]");
+  COMPARE(str(x1, MemOperand(sp, -1)), "stur x1, [sp, #-1]");
+  COMPARE(ldrb(w2, MemOperand(x3, -2)), "ldurb w2, [x3, #-2]");
+  COMPARE(ldrsb(w4, MemOperand(x5, -3)), "ldursb w4, [x5, #-3]");
+  COMPARE(ldrsb(x6, MemOperand(x7, -4)), "ldursb x6, [x7, #-4]");
+  COMPARE(ldrh(w8, MemOperand(x9, -5)), "ldurh w8, [x9, #-5]");
+  COMPARE(ldrsh(w10, MemOperand(x11, -6)), "ldursh w10, [x11, #-6]");
+  COMPARE(ldrsh(x12, MemOperand(x13, -7)), "ldursh x12, [x13, #-7]");
+  COMPARE(ldrsw(x14, MemOperand(x15, -8)), "ldursw x14, [x15, #-8]");
+
+  CLEANUP();
+}
+
+TEST(load_store_pair) {
+  SETUP();
+
+  COMPARE(ldp(w0, w1, MemOperand(x2)), "ldp w0, w1, [x2]");
+  COMPARE(ldp(x3, x4, MemOperand(x5)), "ldp x3, x4, [x5]");
+  COMPARE(ldp(w6, w7, MemOperand(x8, 4)), "ldp w6, w7, [x8, #4]");
+  COMPARE(ldp(x9, x10, MemOperand(x11, 8)), "ldp x9, x10, [x11, #8]");
+  COMPARE(ldp(w12, w13, MemOperand(x14, 252)), "ldp w12, w13, [x14, #252]");
+  COMPARE(ldp(x15, x16, MemOperand(x17, 504)), "ldp x15, x16, [x17, #504]");
+  COMPARE(ldp(w18, w19, MemOperand(x20, -256)), "ldp w18, w19, [x20, #-256]");
+  COMPARE(ldp(x21, x22, MemOperand(x23, -512)), "ldp x21, x22, [x23, #-512]");
+  COMPARE(ldp(w24, w25, MemOperand(x26, 252, PreIndex)),
+          "ldp w24, w25, [x26, #252]!");
+  COMPARE(ldp(x27, x28, MemOperand(x29, 504, PreIndex)),
+          "ldp x27, x28, [x29, #504]!");
+  COMPARE(ldp(w30, w0, MemOperand(x1, -256, PreIndex)),
+          "ldp w30, w0, [x1, #-256]!");
+  COMPARE(ldp(x2, x3, MemOperand(x4, -512, PreIndex)),
+          "ldp x2, x3, [x4, #-512]!");
+  COMPARE(ldp(w5, w6, MemOperand(x7, 252, PostIndex)),
+          "ldp w5, w6, [x7], #252");
+  COMPARE(ldp(x8, x9, MemOperand(x10, 504, PostIndex)),
+          "ldp x8, x9, [x10], #504");
+  COMPARE(ldp(w11, w12, MemOperand(x13, -256, PostIndex)),
+          "ldp w11, w12, [x13], #-256");
+  COMPARE(ldp(x14, x15, MemOperand(x16, -512, PostIndex)),
+          "ldp x14, x15, [x16], #-512");
+
+  COMPARE(ldp(s17, s18, MemOperand(x19)), "ldp s17, s18, [x19]");
+  COMPARE(ldp(s20, s21, MemOperand(x22, 252)), "ldp s20, s21, [x22, #252]");
+  COMPARE(ldp(s23, s24, MemOperand(x25, -256)), "ldp s23, s24, [x25, #-256]");
+  COMPARE(ldp(s26, s27, MemOperand(x28, 252, PreIndex)),
+          "ldp s26, s27, [x28, #252]!");
+  COMPARE(ldp(s29, s30, MemOperand(x29, -256, PreIndex)),
+          "ldp s29, s30, [x29, #-256]!");
+  COMPARE(ldp(s31, s0, MemOperand(x1, 252, PostIndex)),
+          "ldp s31, s0, [x1], #252");
+  COMPARE(ldp(s2, s3, MemOperand(x4, -256, PostIndex)),
+          "ldp s2, s3, [x4], #-256");
+  COMPARE(ldp(d17, d18, MemOperand(x19)), "ldp d17, d18, [x19]");
+  COMPARE(ldp(d20, d21, MemOperand(x22, 504)), "ldp d20, d21, [x22, #504]");
+  COMPARE(ldp(d23, d24, MemOperand(x25, -512)), "ldp d23, d24, [x25, #-512]");
+  COMPARE(ldp(d26, d27, MemOperand(x28, 504, PreIndex)),
+          "ldp d26, d27, [x28, #504]!");
+  COMPARE(ldp(d29, d30, MemOperand(x29, -512, PreIndex)),
+          "ldp d29, d30, [x29, #-512]!");
+  COMPARE(ldp(d31, d0, MemOperand(x1, 504, PostIndex)),
+          "ldp d31, d0, [x1], #504");
+  COMPARE(ldp(d2, d3, MemOperand(x4, -512, PostIndex)),
+          "ldp d2, d3, [x4], #-512");
+
+  COMPARE(stp(w0, w1, MemOperand(x2)), "stp w0, w1, [x2]");
+  COMPARE(stp(x3, x4, MemOperand(x5)), "stp x3, x4, [x5]");
+  COMPARE(stp(w6, w7, MemOperand(x8, 4)), "stp w6, w7, [x8, #4]");
+  COMPARE(stp(x9, x10, MemOperand(x11, 8)), "stp x9, x10, [x11, #8]");
+  COMPARE(stp(w12, w13, MemOperand(x14, 252)), "stp w12, w13, [x14, #252]");
+  COMPARE(stp(x15, x16, MemOperand(x17, 504)), "stp x15, x16, [x17, #504]");
+  COMPARE(stp(w18, w19, MemOperand(x20, -256)), "stp w18, w19, [x20, #-256]");
+  COMPARE(stp(x21, x22, MemOperand(x23, -512)), "stp x21, x22, [x23, #-512]");
+  COMPARE(stp(w24, w25, MemOperand(x26, 252, PreIndex)),
+          "stp w24, w25, [x26, #252]!");
+  COMPARE(stp(x27, x28, MemOperand(x29, 504, PreIndex)),
+          "stp x27, x28, [x29, #504]!");
+  COMPARE(stp(w30, w0, MemOperand(x1, -256, PreIndex)),
+          "stp w30, w0, [x1, #-256]!");
+  COMPARE(stp(x2, x3, MemOperand(x4, -512, PreIndex)),
+          "stp x2, x3, [x4, #-512]!");
+  COMPARE(stp(w5, w6, MemOperand(x7, 252, PostIndex)),
+          "stp w5, w6, [x7], #252");
+  COMPARE(stp(x8, x9, MemOperand(x10, 504, PostIndex)),
+          "stp x8, x9, [x10], #504");
+  COMPARE(stp(w11, w12, MemOperand(x13, -256, PostIndex)),
+          "stp w11, w12, [x13], #-256");
+  COMPARE(stp(x14, x15, MemOperand(x16, -512, PostIndex)),
+          "stp x14, x15, [x16], #-512");
+
+  COMPARE(stp(s17, s18, MemOperand(x19)), "stp s17, s18, [x19]");
+  COMPARE(stp(s20, s21, MemOperand(x22, 252)), "stp s20, s21, [x22, #252]");
+  COMPARE(stp(s23, s24, MemOperand(x25, -256)), "stp s23, s24, [x25, #-256]");
+  COMPARE(stp(s26, s27, MemOperand(x28, 252, PreIndex)),
+          "stp s26, s27, [x28, #252]!");
+  COMPARE(stp(s29, s30, MemOperand(x29, -256, PreIndex)),
+          "stp s29, s30, [x29, #-256]!");
+  COMPARE(stp(s31, s0, MemOperand(x1, 252, PostIndex)),
+          "stp s31, s0, [x1], #252");
+  COMPARE(stp(s2, s3, MemOperand(x4, -256, PostIndex)),
+          "stp s2, s3, [x4], #-256");
+  COMPARE(stp(d17, d18, MemOperand(x19)), "stp d17, d18, [x19]");
+  COMPARE(stp(d20, d21, MemOperand(x22, 504)), "stp d20, d21, [x22, #504]");
+  COMPARE(stp(d23, d24, MemOperand(x25, -512)), "stp d23, d24, [x25, #-512]");
+  COMPARE(stp(d26, d27, MemOperand(x28, 504, PreIndex)),
+          "stp d26, d27, [x28, #504]!");
+  COMPARE(stp(d29, d30, MemOperand(x29, -512, PreIndex)),
+          "stp d29, d30, [x29, #-512]!");
+  COMPARE(stp(d31, d0, MemOperand(x1, 504, PostIndex)),
+          "stp d31, d0, [x1], #504");
+  COMPARE(stp(d2, d3, MemOperand(x4, -512, PostIndex)),
+          "stp d2, d3, [x4], #-512");
+
+  COMPARE(ldp(w16, w17, MemOperand(sp, 4, PostIndex)),
+          "ldp w16, w17, [sp], #4");
+  COMPARE(stp(x18, x19, MemOperand(sp, -8, PreIndex)),
+          "stp x18, x19, [sp, #-8]!");
+  COMPARE(ldp(s30, s31, MemOperand(sp, 12, PostIndex)),
+          "ldp s30, s31, [sp], #12");
+  COMPARE(stp(d30, d31, MemOperand(sp, -16)),
+          "stp d30, d31, [sp, #-16]");
+
+  COMPARE(ldpsw(x0, x1, MemOperand(x2)), "ldpsw x0, x1, [x2]");
+  COMPARE(ldpsw(x3, x4, MemOperand(x5, 16)), "ldpsw x3, x4, [x5, #16]");
+  COMPARE(ldpsw(x6, x7, MemOperand(x8, -32, PreIndex)),
+          "ldpsw x6, x7, [x8, #-32]!");
+  COMPARE(ldpsw(x9, x10, MemOperand(x11, 128, PostIndex)),
+          "ldpsw x9, x10, [x11], #128");
+
+  CLEANUP();
+}
+
+TEST(load_store_pair_nontemp) {
+  SETUP();
+
+  COMPARE(ldnp(w0, w1, MemOperand(x2)), "ldnp w0, w1, [x2]");
+  COMPARE(stnp(w3, w4, MemOperand(x5, 252)), "stnp w3, w4, [x5, #252]");
+  COMPARE(ldnp(w6, w7, MemOperand(x8, -256)), "ldnp w6, w7, [x8, #-256]");
+  COMPARE(stnp(x9, x10, MemOperand(x11)), "stnp x9, x10, [x11]");
+  COMPARE(ldnp(x12, x13, MemOperand(x14, 504)), "ldnp x12, x13, [x14, #504]");
+  COMPARE(stnp(x15, x16, MemOperand(x17, -512)), "stnp x15, x16, [x17, #-512]");
+  COMPARE(ldnp(s18, s19, MemOperand(x20)), "ldnp s18, s19, [x20]");
+  COMPARE(stnp(s21, s22, MemOperand(x23, 252)), "stnp s21, s22, [x23, #252]");
+  COMPARE(ldnp(s24, s25, MemOperand(x26, -256)), "ldnp s24, s25, [x26, #-256]");
+  COMPARE(stnp(d27, d28, MemOperand(x29)), "stnp d27, d28, [x29]");
+  COMPARE(ldnp(d30, d31, MemOperand(x0, 504)), "ldnp d30, d31, [x0, #504]");
+  COMPARE(stnp(d1, d2, MemOperand(x3, -512)), "stnp d1, d2, [x3, #-512]");
+
+  CLEANUP();
+}
+
+TEST(load_literal) {
+  SETUP();
+
+  COMPARE(ldr(x10, 0x1234567890abcdefUL),  "ldr x10, #8 (0x1234567890abcdef)");
+  COMPARE(ldr(w20, 0xfedcba09),  "ldr w20, #8 (0xfedcba09)");
+  COMPARE(ldr(d11, 1.234),  "ldr d11, #8 (1.2340)");
+  COMPARE(ldr(s22, 2.5),  "ldr s22, #8 (2.5000)");
+
+  CLEANUP();
+}
+
+TEST(cond_select) {
+  SETUP();
+
+  COMPARE(csel(w0, w1, w2, eq), "csel w0, w1, w2, eq");
+  COMPARE(csel(x3, x4, x5, ne), "csel x3, x4, x5, ne");
+  COMPARE(csinc(w6, w7, w8, hs), "csinc w6, w7, w8, hs");
+  COMPARE(csinc(x9, x10, x11, lo), "csinc x9, x10, x11, lo");
+  COMPARE(csinv(w12, w13, w14, mi), "csinv w12, w13, w14, mi");
+  COMPARE(csinv(x15, x16, x17, pl), "csinv x15, x16, x17, pl");
+  COMPARE(csneg(w18, w19, w20, vs), "csneg w18, w19, w20, vs");
+  COMPARE(csneg(x21, x22, x23, vc), "csneg x21, x22, x23, vc");
+  COMPARE(cset(w24, hi), "cset w24, hi");
+  COMPARE(cset(x25, ls), "cset x25, ls");
+  COMPARE(csetm(w26, ge), "csetm w26, ge");
+  COMPARE(csetm(x27, lt), "csetm x27, lt");
+  COMPARE(cinc(w28, w29, gt), "cinc w28, w29, gt");
+  COMPARE(cinc(x30, x0, le), "cinc x30, x0, le");
+  COMPARE(cinv(w1, w2, eq), "cinv w1, w2, eq");
+  COMPARE(cinv(x3, x4, ne), "cinv x3, x4, ne");
+  COMPARE(cneg(w5, w6, hs), "cneg w5, w6, hs");
+  COMPARE(cneg(x7, x8, lo), "cneg x7, x8, lo");
+
+  CLEANUP();
+}
+
+TEST(cond_cmp) {
+  SETUP();
+
+  COMPARE(ccmn(w0, Operand(w1), NZCVFlag, eq), "ccmn w0, w1, #NZCV, eq");
+  COMPARE(ccmn(x2, Operand(x3), NZCFlag, ne), "ccmn x2, x3, #NZCv, ne");
+  COMPARE(ccmp(w4, Operand(w5), NZVFlag, hs), "ccmp w4, w5, #NZcV, hs");
+  COMPARE(ccmp(x6, Operand(x7), NZFlag, lo), "ccmp x6, x7, #NZcv, lo");
+  COMPARE(ccmn(w8, Operand(31), NFlag, mi), "ccmn w8, #31, #Nzcv, mi");
+  COMPARE(ccmn(x9, Operand(30), NCFlag, pl), "ccmn x9, #30, #NzCv, pl");
+  COMPARE(ccmp(w10, Operand(29), NVFlag, vs), "ccmp w10, #29, #NzcV, vs");
+  COMPARE(ccmp(x11, Operand(28), NFlag, vc), "ccmp x11, #28, #Nzcv, vc");
+
+  CLEANUP();
+}
+
+TEST(fmov_imm) {
+  SETUP();
+
+  COMPARE(fmov(s0, 1.0), "fmov s0, #0x70 (1.0000)");
+  COMPARE(fmov(s31, -13.0), "fmov s31, #0xaa (-13.0000)");
+  COMPARE(fmov(d1, 1.0), "fmov d1, #0x70 (1.0000)");
+  COMPARE(fmov(d29, -13.0), "fmov d29, #0xaa (-13.0000)");
+
+  CLEANUP();
+}
+
+TEST(fmov_reg) {
+  SETUP();
+
+  COMPARE(fmov(w3, s13), "fmov w3, s13");
+  COMPARE(fmov(x6, d26), "fmov x6, d26");
+  COMPARE(fmov(s11, w30), "fmov s11, w30");
+  COMPARE(fmov(d31, x2), "fmov d31, x2");
+  COMPARE(fmov(s12, s13), "fmov s12, s13");
+  COMPARE(fmov(d22, d23), "fmov d22, d23");
+
+  CLEANUP();
+}
+
+
+TEST(fp_dp1) {
+  SETUP();
+
+  COMPARE(fabs(s0, s1), "fabs s0, s1");
+  COMPARE(fabs(s31, s30), "fabs s31, s30");
+  COMPARE(fabs(d2, d3), "fabs d2, d3");
+  COMPARE(fabs(d31, d30), "fabs d31, d30");
+  COMPARE(fneg(s4, s5), "fneg s4, s5");
+  COMPARE(fneg(s31, s30), "fneg s31, s30");
+  COMPARE(fneg(d6, d7), "fneg d6, d7");
+  COMPARE(fneg(d31, d30), "fneg d31, d30");
+  COMPARE(fsqrt(s8, s9), "fsqrt s8, s9");
+  COMPARE(fsqrt(s31, s30), "fsqrt s31, s30");
+  COMPARE(fsqrt(d10, d11), "fsqrt d10, d11");
+  COMPARE(fsqrt(d31, d30), "fsqrt d31, d30");
+  COMPARE(frintn(s10, s11), "frintn s10, s11");
+  COMPARE(frintn(s31, s30), "frintn s31, s30");
+  COMPARE(frintn(d12, d13), "frintn d12, d13");
+  COMPARE(frintn(d31, d30), "frintn d31, d30");
+  COMPARE(frintz(s10, s11), "frintz s10, s11");
+  COMPARE(frintz(s31, s30), "frintz s31, s30");
+  COMPARE(frintz(d12, d13), "frintz d12, d13");
+  COMPARE(frintz(d31, d30), "frintz d31, d30");
+  COMPARE(fcvt(d14, s15), "fcvt d14, s15");
+  COMPARE(fcvt(d31, s31), "fcvt d31, s31");
+
+  CLEANUP();
+}
+
+
+TEST(fp_dp2) {
+  SETUP();
+
+  COMPARE(fadd(s0, s1, s2), "fadd s0, s1, s2");
+  COMPARE(fadd(d3, d4, d5), "fadd d3, d4, d5");
+  COMPARE(fsub(s31, s30, s29), "fsub s31, s30, s29");
+  COMPARE(fsub(d31, d30, d29), "fsub d31, d30, d29");
+  COMPARE(fmul(s7, s8, s9), "fmul s7, s8, s9");
+  COMPARE(fmul(d10, d11, d12), "fmul d10, d11, d12");
+  COMPARE(fdiv(s13, s14, s15), "fdiv s13, s14, s15");
+  COMPARE(fdiv(d16, d17, d18), "fdiv d16, d17, d18");
+  COMPARE(fmax(s19, s20, s21), "fmax s19, s20, s21");
+  COMPARE(fmax(d22, d23, d24), "fmax d22, d23, d24");
+  COMPARE(fmin(s25, s26, s27), "fmin s25, s26, s27");
+  COMPARE(fmin(d28, d29, d30), "fmin d28, d29, d30");
+
+  CLEANUP();
+}
+
+
+TEST(fp_dp3) {
+  SETUP();
+
+  COMPARE(fmsub(s7, s8, s9, s10), "fmsub s7, s8, s9, s10");
+  COMPARE(fmsub(d10, d11, d12, d10), "fmsub d10, d11, d12, d10");
+
+  CLEANUP();
+}
+
+
+TEST(fp_compare) {
+  SETUP();
+
+  COMPARE(fcmp(s0, s1), "fcmp s0, s1");
+  COMPARE(fcmp(s31, s30), "fcmp s31, s30");
+  COMPARE(fcmp(d0, d1), "fcmp d0, d1");
+  COMPARE(fcmp(d31, d30), "fcmp d31, d30");
+  COMPARE(fcmp(s12, 0), "fcmp s12, #0.0");
+  COMPARE(fcmp(d12, 0), "fcmp d12, #0.0");
+
+  CLEANUP();
+}
+
+
+TEST(fp_cond_compare) {
+  SETUP();
+
+  COMPARE(fccmp(s0, s1, NoFlag, eq), "fccmp s0, s1, #nzcv, eq");
+  COMPARE(fccmp(s2, s3, ZVFlag, ne), "fccmp s2, s3, #nZcV, ne");
+  COMPARE(fccmp(s30, s16, NCFlag, pl), "fccmp s30, s16, #NzCv, pl");
+  COMPARE(fccmp(s31, s31, NZCVFlag, le), "fccmp s31, s31, #NZCV, le");
+  COMPARE(fccmp(d4, d5, VFlag, gt), "fccmp d4, d5, #nzcV, gt");
+  COMPARE(fccmp(d6, d7, NFlag, vs), "fccmp d6, d7, #Nzcv, vs");
+  COMPARE(fccmp(d30, d0, NZFlag, vc), "fccmp d30, d0, #NZcv, vc");
+  COMPARE(fccmp(d31, d31, ZFlag, hs), "fccmp d31, d31, #nZcv, hs");
+
+  CLEANUP();
+}
+
+
+TEST(fp_select) {
+  SETUP();
+
+  COMPARE(fcsel(s0, s1, s2, eq), "fcsel s0, s1, s2, eq");
+  COMPARE(fcsel(s31, s31, s30, ne), "fcsel s31, s31, s30, ne");
+  COMPARE(fcsel(d0, d1, d2, mi), "fcsel d0, d1, d2, mi");
+  COMPARE(fcsel(d31, d30, d31, pl), "fcsel d31, d30, d31, pl");
+
+  CLEANUP();
+}
+
+
+TEST(fcvt_scvtf_ucvtf) {
+  SETUP();
+
+  COMPARE(fcvtns(w0, s1), "fcvtns w0, s1");
+  COMPARE(fcvtns(x2, s3), "fcvtns x2, s3");
+  COMPARE(fcvtns(w4, d5), "fcvtns w4, d5");
+  COMPARE(fcvtns(x6, d7), "fcvtns x6, d7");
+  COMPARE(fcvtnu(w8, s9), "fcvtnu w8, s9");
+  COMPARE(fcvtnu(x10, s11), "fcvtnu x10, s11");
+  COMPARE(fcvtnu(w12, d13), "fcvtnu w12, d13");
+  COMPARE(fcvtnu(x14, d15), "fcvtnu x14, d15");
+  COMPARE(fcvtzu(x16, d17), "fcvtzu x16, d17");
+  COMPARE(fcvtzu(w18, d19), "fcvtzu w18, d19");
+  COMPARE(fcvtzs(x20, d21), "fcvtzs x20, d21");
+  COMPARE(fcvtzs(w22, d23), "fcvtzs w22, d23");
+  COMPARE(fcvtzu(x16, s17), "fcvtzu x16, s17");
+  COMPARE(fcvtzu(w18, s19), "fcvtzu w18, s19");
+  COMPARE(fcvtzs(x20, s21), "fcvtzs x20, s21");
+  COMPARE(fcvtzs(w22, s23), "fcvtzs w22, s23");
+  COMPARE(scvtf(d24, w25), "scvtf d24, w25");
+  COMPARE(scvtf(d26, x27), "scvtf d26, x27");
+  COMPARE(ucvtf(d28, w29), "ucvtf d28, w29");
+  COMPARE(ucvtf(d0, x1), "ucvtf d0, x1");
+  COMPARE(scvtf(d1, x2, 1), "scvtf d1, x2, #1");
+  COMPARE(scvtf(d3, x4, 15), "scvtf d3, x4, #15");
+  COMPARE(scvtf(d5, x6, 32), "scvtf d5, x6, #32");
+  COMPARE(ucvtf(d7, x8, 2), "ucvtf d7, x8, #2");
+  COMPARE(ucvtf(d9, x10, 16), "ucvtf d9, x10, #16");
+  COMPARE(ucvtf(d11, x12, 33), "ucvtf d11, x12, #33");
+  COMPARE(fcvtms(w0, s1), "fcvtms w0, s1");
+  COMPARE(fcvtms(x2, s3), "fcvtms x2, s3");
+  COMPARE(fcvtms(w4, d5), "fcvtms w4, d5");
+  COMPARE(fcvtms(x6, d7), "fcvtms x6, d7");
+  COMPARE(fcvtmu(w8, s9), "fcvtmu w8, s9");
+  COMPARE(fcvtmu(x10, s11), "fcvtmu x10, s11");
+  COMPARE(fcvtmu(w12, d13), "fcvtmu w12, d13");
+  COMPARE(fcvtmu(x14, d15), "fcvtmu x14, d15");
+
+  CLEANUP();
+}
+
+
+TEST(system_mrs) {
+  SETUP();
+
+  COMPARE(mrs(x0, NZCV), "mrs x0, nzcv");
+  COMPARE(mrs(x30, NZCV), "mrs x30, nzcv");
+
+  CLEANUP();
+}
+
+
+TEST(system_msr) {
+  SETUP();
+
+  COMPARE(msr(NZCV, x0), "msr nzcv, x0");
+  COMPARE(msr(NZCV, x30), "msr nzcv, x30");
+
+  CLEANUP();
+}
+
+
+TEST(system_nop) {
+  SETUP();
+
+  COMPARE(nop(), "nop");
+
+  CLEANUP();
+}
+
+
+TEST(unreachable) {
+  SETUP_CLASS(MacroAssembler);
+
+#ifdef USE_SIMULATOR
+  ASSERT(kUnreachableOpcode == 0xdeb0);
+  COMPARE(Unreachable(), "hlt #0xdeb0");
+#else
+  COMPARE(Unreachable(), "blr xzr");
+#endif
+
+  CLEANUP();
+}
+
+
+#ifdef USE_SIMULATOR
+TEST(trace) {
+  SETUP_CLASS(MacroAssembler);
+
+  ASSERT(kTraceOpcode == 0xdeb2);
+
+  // All Trace calls should produce the same instruction.
+  COMPARE(Trace(LOG_ALL, TRACE_ENABLE), "hlt #0xdeb2");
+  COMPARE(Trace(LOG_REGS, TRACE_DISABLE), "hlt #0xdeb2");
+
+  CLEANUP();
+}
+#endif
+
+
+#ifdef USE_SIMULATOR
+TEST(log) {
+  SETUP_CLASS(MacroAssembler);
+
+  ASSERT(kLogOpcode == 0xdeb3);
+
+  // All Log calls should produce the same instruction.
+  COMPARE(Log(LOG_ALL), "hlt #0xdeb3");
+  COMPARE(Log(LOG_FLAGS), "hlt #0xdeb3");
+
+  CLEANUP();
+}
+#endif
+
+
+TEST(hlt) {
+  SETUP();
+
+  COMPARE(hlt(0), "hlt #0x0");
+  COMPARE(hlt(1), "hlt #0x1");
+  COMPARE(hlt(65535), "hlt #0xffff");
+
+  CLEANUP();
+}
+
+
+TEST(brk) {
+  SETUP();
+
+  COMPARE(brk(0), "brk #0x0");
+  COMPARE(brk(1), "brk #0x1");
+  COMPARE(brk(65535), "brk #0xffff");
+
+  CLEANUP();
+}
+
+
+TEST(add_sub_negative) {
+  SETUP_CLASS(MacroAssembler);
+
+  COMPARE(Add(x10, x0, -42), "sub x10, x0, #0x2a (42)");
+  COMPARE(Add(x11, x1, -687), "sub x11, x1, #0x2af (687)");
+  COMPARE(Add(x12, x2, -0x88), "sub x12, x2, #0x88 (136)");
+
+  COMPARE(Sub(x13, x0, -600), "add x13, x0, #0x258 (600)");
+  COMPARE(Sub(x14, x1, -313), "add x14, x1, #0x139 (313)");
+  COMPARE(Sub(x15, x2, -0x555), "add x15, x2, #0x555 (1365)");
+
+  COMPARE(Add(w19, w3, -0x344), "sub w19, w3, #0x344 (836)");
+  COMPARE(Add(w20, w4, -2000), "sub w20, w4, #0x7d0 (2000)");
+
+  COMPARE(Sub(w21, w3, -0xbc), "add w21, w3, #0xbc (188)");
+  COMPARE(Sub(w22, w4, -2000), "add w22, w4, #0x7d0 (2000)");
+
+  CLEANUP();
+}
+
+
+TEST(logical_immediate_move) {
+  SETUP_CLASS(MacroAssembler);
+
+  COMPARE(And(w0, w1, 0), "movz w0, #0x0");
+  COMPARE(And(x0, x1, 0), "movz x0, #0x0");
+  COMPARE(Orr(w2, w3, 0), "mov w2, w3");
+  COMPARE(Orr(x2, x3, 0), "mov x2, x3");
+  COMPARE(Eor(w4, w5, 0), "mov w4, w5");
+  COMPARE(Eor(x4, x5, 0), "mov x4, x5");
+  COMPARE(Bic(w6, w7, 0), "mov w6, w7");
+  COMPARE(Bic(x6, x7, 0), "mov x6, x7");
+  COMPARE(Orn(w8, w9, 0), "movn w8, #0x0");
+  COMPARE(Orn(x8, x9, 0), "movn x8, #0x0");
+  COMPARE(Eon(w10, w11, 0), "mvn w10, w11");
+  COMPARE(Eon(x10, x11, 0), "mvn x10, x11");
+
+  COMPARE(And(w12, w13, 0xffffffff), "mov w12, w13");
+  COMPARE(And(x12, x13, 0xffffffff), "and x12, x13, #0xffffffff");
+  COMPARE(And(x12, x13, 0xffffffffffffffff), "mov x12, x13");
+  COMPARE(Orr(w14, w15, 0xffffffff), "movn w14, #0x0");
+  COMPARE(Orr(x14, x15, 0xffffffff), "orr x14, x15, #0xffffffff");
+  COMPARE(Orr(x14, x15, 0xffffffffffffffff), "movn x14, #0x0");
+  COMPARE(Eor(w16, w17, 0xffffffff), "mvn w16, w17");
+  COMPARE(Eor(x16, x17, 0xffffffff), "eor x16, x17, #0xffffffff");
+  COMPARE(Eor(x16, x17, 0xffffffffffffffff), "mvn x16, x17");
+  COMPARE(Bic(w18, w19, 0xffffffff), "movz w18, #0x0");
+  COMPARE(Bic(x18, x19, 0xffffffff), "and x18, x19, #0xffffffff00000000");
+  COMPARE(Bic(x18, x19, 0xffffffffffffffff), "movz x18, #0x0");
+  COMPARE(Orn(w20, w21, 0xffffffff), "mov w20, w21");
+  COMPARE(Orn(x20, x21, 0xffffffff), "orr x20, x21, #0xffffffff00000000");
+  COMPARE(Orn(x20, x21, 0xffffffffffffffff), "mov x20, x21");
+  COMPARE(Eon(w22, w23, 0xffffffff), "mov w22, w23");
+  COMPARE(Eon(x22, x23, 0xffffffff), "eor x22, x23, #0xffffffff00000000");
+  COMPARE(Eon(x22, x23, 0xffffffffffffffff), "mov x22, x23");
+
+  CLEANUP();
+}
+}  // namespace vixl
diff --git a/test/test-utils-a64.cc b/test/test-utils-a64.cc
new file mode 100644
index 0000000..c52e926
--- /dev/null
+++ b/test/test-utils-a64.cc
@@ -0,0 +1,424 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "test-utils-a64.h"
+
+#include <math.h>   // Needed for isnan().
+
+#include "cctest.h"
+#include "a64/macro-assembler-a64.h"
+#include "a64/simulator-a64.h"
+#include "a64/disasm-a64.h"
+#include "a64/cpu-a64.h"
+
+#define __ masm->
+
+namespace vixl {
+
+bool Equal32(uint32_t expected, const RegisterDump*, uint32_t result) {
+  if (result != expected) {
+    printf("Expected 0x%08" PRIx32 "\t Found 0x%08" PRIx32 "\n",
+           expected, result);
+  }
+
+  return expected == result;
+}
+
+
+bool Equal64(uint64_t expected, const RegisterDump*, uint64_t result) {
+  if (result != expected) {
+    printf("Expected 0x%016" PRIx64 "\t Found 0x%016" PRIx64 "\n",
+           expected, result);
+  }
+
+  return expected == result;
+}
+
+
+bool EqualFP32(float expected, const RegisterDump*, float result) {
+  if (result != expected) {
+    printf("Expected %.20f\t Found %.20f\n", expected, result);
+  }
+
+  return expected == result;
+}
+
+
+bool EqualFP64(double expected, const RegisterDump*, double result) {
+  if (result != expected) {
+    printf("Expected %.20f\t Found %.20f\n", expected, result);
+  }
+
+  return expected == result;
+}
+
+
+bool Equal32(uint32_t expected, const RegisterDump* core, const Register& reg) {
+  ASSERT(reg.Is32Bits());
+  // Retrieve the corresponding X register so we can check that the upper part
+  // was properly cleared.
+  int64_t result_x = core->xreg(reg.code());
+  if ((result_x & 0xffffffff00000000L) != 0) {
+    printf("Expected 0x%08" PRIx32 "\t Found 0x%016" PRIx64 "\n",
+           expected, result_x);
+    return false;
+  }
+  uint32_t result_w = core->wreg(reg.code());
+  return Equal32(expected, core, result_w);
+}
+
+
+bool Equal64(uint64_t expected,
+             const RegisterDump* core,
+             const Register& reg) {
+  ASSERT(reg.Is64Bits());
+  uint64_t result = core->xreg(reg.code());
+  return Equal64(expected, core, result);
+}
+
+
+bool EqualFP32(float expected,
+               const RegisterDump* core,
+               const FPRegister& fpreg) {
+  ASSERT(fpreg.Is32Bits());
+  // Retrieve the corresponding D register so we can check that the upper part
+  // was properly cleared.
+  uint64_t result_64 = core->dreg_bits(fpreg.code());
+  if ((result_64 & 0xffffffff00000000L) != 0) {
+    printf("Expected 0x%08" PRIx32 " (%f)\t Found 0x%016" PRIx64 "\n",
+           float_to_rawbits(expected), expected, result_64);
+    return false;
+  }
+  if (expected == 0.0) {
+    return Equal32(float_to_rawbits(expected), core,
+                   core->sreg_bits(fpreg.code()));
+  } else if (isnan(expected)) {
+    return isnan(core->sreg(fpreg.code()));
+  } else {
+    float result = core->sreg(fpreg.code());
+    return EqualFP32(expected, core, result);
+  }
+}
+
+
+bool EqualFP64(double expected,
+               const RegisterDump* core,
+               const FPRegister& fpreg) {
+  ASSERT(fpreg.Is64Bits());
+  if (expected == 0.0) {
+    return Equal64(double_to_rawbits(expected), core,
+                   core->dreg_bits(fpreg.code()));
+  } else if (isnan(expected)) {
+    return isnan(core->dreg(fpreg.code()));
+  } else {
+    double result = core->dreg(fpreg.code());
+    return EqualFP64(expected, core, result);
+  }
+}
+
+
+bool Equal64(const Register& reg0,
+             const RegisterDump* core,
+             const Register& reg1) {
+  ASSERT(reg0.Is64Bits() && reg1.Is64Bits());
+  int64_t expected = core->xreg(reg0.code());
+  int64_t result = core->xreg(reg1.code());
+  return Equal64(expected, core, result);
+}
+
+
+static char FlagN(uint32_t flags) {
+  return (flags & NFlag) ? 'N' : 'n';
+}
+
+
+static char FlagZ(uint32_t flags) {
+  return (flags & ZFlag) ? 'Z' : 'z';
+}
+
+
+static char FlagC(uint32_t flags) {
+  return (flags & CFlag) ? 'C' : 'c';
+}
+
+
+static char FlagV(uint32_t flags) {
+  return (flags & VFlag) ? 'V' : 'v';
+}
+
+
+bool EqualNzcv(uint32_t expected, uint32_t result) {
+  ASSERT((expected & ~NZCVFlag) == 0);
+  ASSERT((result & ~NZCVFlag) == 0);
+  if (result != expected) {
+    printf("Expected: %c%c%c%c\t Found: %c%c%c%c\n",
+        FlagN(expected), FlagZ(expected), FlagC(expected), FlagV(expected),
+        FlagN(result), FlagZ(result), FlagC(result), FlagV(result));
+  }
+
+  return result == expected;
+}
+
+
+bool EqualRegisters(const RegisterDump* a, const RegisterDump* b) {
+  for (unsigned i = 0; i < kNumberOfRegisters; i++) {
+    if (a->xreg(i) != b->xreg(i)) {
+      printf("x%d\t Expected 0x%016" PRIx64 "\t Found 0x%016" PRIx64 "\n",
+             i, a->xreg(i), b->xreg(i));
+      return false;
+    }
+  }
+
+  for (unsigned i = 0; i < kNumberOfFPRegisters; i++) {
+    uint64_t a_bits = a->dreg_bits(i);
+    uint64_t b_bits = b->dreg_bits(i);
+    if (a_bits != b_bits) {
+      printf("d%d\t Expected 0x%016" PRIx64 "\t Found 0x%016" PRIx64 "\n",
+             i, a_bits, b_bits);
+      return false;
+    }
+  }
+
+  return true;
+}
+
+
+RegList PopulateRegisterArray(Register* w, Register* x, Register* r,
+                              int reg_size, int reg_count, RegList allowed) {
+  RegList list = 0;
+  int i = 0;
+  for (unsigned n = 0; (n < kNumberOfRegisters) && (i < reg_count); n++) {
+    if (((1UL << n) & allowed) != 0) {
+      // Only assign allowed registers.
+      if (r) {
+        r[i] = Register(n, reg_size);
+      }
+      if (x) {
+        x[i] = Register(n, kXRegSize);
+      }
+      if (w) {
+        w[i] = Register(n, kWRegSize);
+      }
+      list |= (1UL << n);
+      i++;
+    }
+  }
+  // Check that we got enough registers.
+  ASSERT(CountSetBits(list, kNumberOfRegisters) == reg_count);
+
+  return list;
+}
+
+
+RegList PopulateFPRegisterArray(FPRegister* s, FPRegister* d, FPRegister* v,
+                                int reg_size, int reg_count, RegList allowed) {
+  RegList list = 0;
+  int i = 0;
+  for (unsigned n = 0; (n < kNumberOfFPRegisters) && (i < reg_count); n++) {
+    if (((1UL << n) & allowed) != 0) {
+      // Only assign allowed registers.
+      if (v) {
+        v[i] = FPRegister(n, reg_size);
+      }
+      if (d) {
+        d[i] = FPRegister(n, kDRegSize);
+      }
+      if (s) {
+        s[i] = FPRegister(n, kSRegSize);
+      }
+      list |= (1UL << n);
+      i++;
+    }
+  }
+  // Check that we got enough registers.
+  ASSERT(CountSetBits(list, kNumberOfFPRegisters) == reg_count);
+
+  return list;
+}
+
+
+void Clobber(MacroAssembler* masm, RegList reg_list, uint64_t const value) {
+  Register first = NoReg;
+  for (unsigned i = 0; i < kNumberOfRegisters; i++) {
+    if (reg_list & (1UL << i)) {
+      Register xn(i, kXRegSize);
+      // We should never write into sp here.
+      ASSERT(!xn.Is(sp));
+      if (!xn.IsZero()) {
+        if (!first.IsValid()) {
+          // This is the first register we've hit, so construct the literal.
+          __ Mov(xn, value);
+          first = xn;
+        } else {
+          // We've already loaded the literal, so re-use the value already
+          // loaded into the first register we hit.
+          __ Mov(xn, first);
+        }
+      }
+    }
+  }
+}
+
+
+void ClobberFP(MacroAssembler* masm, RegList reg_list, double const value) {
+  FPRegister first = NoFPReg;
+  for (unsigned i = 0; i < kNumberOfFPRegisters; i++) {
+    if (reg_list & (1UL << i)) {
+      FPRegister dn(i, kDRegSize);
+      if (!first.IsValid()) {
+        // This is the first register we've hit, so construct the literal.
+        __ Fmov(dn, value);
+        first = dn;
+      } else {
+        // We've already loaded the literal, so re-use the value already loaded
+        // into the first register we hit.
+        __ Fmov(dn, first);
+      }
+    }
+  }
+}
+
+
+void Clobber(MacroAssembler* masm, CPURegList reg_list) {
+  if (reg_list.type() == CPURegister::kRegister) {
+    // This will always clobber X registers.
+    Clobber(masm, reg_list.list());
+  } else if (reg_list.type() == CPURegister::kFPRegister) {
+    // This will always clobber D registers.
+    ClobberFP(masm, reg_list.list());
+  } else {
+    UNREACHABLE();
+  }
+}
+
+
+void RegisterDump::Dump(MacroAssembler* masm) {
+  ASSERT(__ StackPointer().Is(sp));
+
+  // Ensure that we don't unintentionally clobber any registers.
+  Register old_tmp0 = __ Tmp0();
+  Register old_tmp1 = __ Tmp1();
+  FPRegister old_fptmp0 = __ FPTmp0();
+  __ SetScratchRegisters(NoReg, NoReg);
+  __ SetFPScratchRegister(NoFPReg);
+
+  // Preserve some temporary registers.
+  Register dump_base = x0;
+  Register dump = x1;
+  Register tmp = x2;
+  Register dump_base_w = dump_base.W();
+  Register dump_w = dump.W();
+  Register tmp_w = tmp.W();
+
+  // Offsets into the dump_ structure.
+  const int x_offset = offsetof(dump_t, x_);
+  const int w_offset = offsetof(dump_t, w_);
+  const int d_offset = offsetof(dump_t, d_);
+  const int s_offset = offsetof(dump_t, s_);
+  const int sp_offset = offsetof(dump_t, sp_);
+  const int wsp_offset = offsetof(dump_t, wsp_);
+  const int flags_offset = offsetof(dump_t, flags_);
+
+  __ Push(xzr, dump_base, dump, tmp);
+
+  // Load the address where we will dump the state.
+  __ Mov(dump_base, reinterpret_cast<uint64_t>(&dump_));
+
+  // Dump the stack pointer (sp and wsp).
+  // The stack pointer cannot be stored directly; it needs to be moved into
+  // another register first. Also, we pushed four X registers, so we need to
+  // compensate here.
+  __ Add(tmp, sp, 4 * kXRegSizeInBytes);
+  __ Str(tmp, MemOperand(dump_base, sp_offset));
+  __ Add(tmp_w, wsp, 4 * kXRegSizeInBytes);
+  __ Str(tmp_w, MemOperand(dump_base, wsp_offset));
+
+  // Dump X registers.
+  __ Add(dump, dump_base, x_offset);
+  for (unsigned i = 0; i < kNumberOfRegisters; i += 2) {
+    __ Stp(Register::XRegFromCode(i), Register::XRegFromCode(i + 1),
+           MemOperand(dump, i * kXRegSizeInBytes));
+  }
+
+  // Dump W registers.
+  __ Add(dump, dump_base, w_offset);
+  for (unsigned i = 0; i < kNumberOfRegisters; i += 2) {
+    __ Stp(Register::WRegFromCode(i), Register::WRegFromCode(i + 1),
+           MemOperand(dump, i * kWRegSizeInBytes));
+  }
+
+  // Dump D registers.
+  __ Add(dump, dump_base, d_offset);
+  for (unsigned i = 0; i < kNumberOfFPRegisters; i += 2) {
+    __ Stp(FPRegister::DRegFromCode(i), FPRegister::DRegFromCode(i + 1),
+           MemOperand(dump, i * kDRegSizeInBytes));
+  }
+
+  // Dump S registers.
+  __ Add(dump, dump_base, s_offset);
+  for (unsigned i = 0; i < kNumberOfFPRegisters; i += 2) {
+    __ Stp(FPRegister::SRegFromCode(i), FPRegister::SRegFromCode(i + 1),
+           MemOperand(dump, i * kSRegSizeInBytes));
+  }
+
+  // Dump the flags.
+  __ Mrs(tmp, NZCV);
+  __ Str(tmp, MemOperand(dump_base, flags_offset));
+
+  // To dump the values that were in tmp and dump, we need a new scratch
+  // register.  We can use any of the already dumped registers since we can
+  // easily restore them.
+  Register dump2_base = x10;
+  Register dump2 = x11;
+  ASSERT(!AreAliased(dump_base, dump, tmp, dump2_base, dump2));
+
+  // Don't lose the dump_ address.
+  __ Mov(dump2_base, dump_base);
+
+  __ Pop(tmp, dump, dump_base, xzr);
+
+  __ Add(dump2, dump2_base, w_offset);
+  __ Str(dump_base_w, MemOperand(dump2, dump_base.code() * kWRegSizeInBytes));
+  __ Str(dump_w, MemOperand(dump2, dump.code() * kWRegSizeInBytes));
+  __ Str(tmp_w, MemOperand(dump2, tmp.code() * kWRegSizeInBytes));
+
+  __ Add(dump2, dump2_base, x_offset);
+  __ Str(dump_base, MemOperand(dump2, dump_base.code() * kXRegSizeInBytes));
+  __ Str(dump, MemOperand(dump2, dump.code() * kXRegSizeInBytes));
+  __ Str(tmp, MemOperand(dump2, tmp.code() * kXRegSizeInBytes));
+
+  // Finally, restore dump2_base and dump2.
+  __ Ldr(dump2_base, MemOperand(dump2, dump2_base.code() * kXRegSizeInBytes));
+  __ Ldr(dump2, MemOperand(dump2, dump2.code() * kXRegSizeInBytes));
+
+  // Restore the MacroAssembler's scratch registers.
+  __ SetScratchRegisters(old_tmp0, old_tmp1);
+  __ SetFPScratchRegister(old_fptmp0);
+
+  completed_ = true;
+}
+
+}  // namespace vixl
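A typical use of these helpers, sketched roughly: generate the code under test, append a RegisterDump::Dump sequence, run the generated code in the simulator, then assert on the snapshot with the Equal* functions. The fixture plumbing below (buffer size, simulator set-up, the function name) approximates what the cctest macros do and is not a copy of them.

    #include <assert.h>
    #include <stdint.h>
    #include "test-utils-a64.h"
    #include "a64/macro-assembler-a64.h"
    #include "a64/decoder-a64.h"
    #include "a64/simulator-a64.h"

    static void ExampleRegisterDumpTest() {
      uint8_t buffer[4096];
      vixl::MacroAssembler masm(buffer, sizeof(buffer));
      vixl::Decoder decoder;
      vixl::Simulator simulator(&decoder);

      masm.Mov(vixl::x0, 0x1234);
      masm.Mov(vixl::w1, 0xffffffff);
      masm.Adds(vixl::w2, vixl::w1, 1);  // Result is 0: sets Z and C.
      vixl::RegisterDump core;
      core.Dump(&masm);                  // Snapshot registers, sp and flags.
      masm.Ret();
      masm.FinalizeCode();

      simulator.RunFrom(reinterpret_cast<vixl::Instruction*>(buffer));

      assert(vixl::Equal64(0x1234, &core, vixl::x0));
      assert(vixl::Equal32(0, &core, vixl::w2));
      assert(vixl::EqualNzcv(vixl::ZFlag | vixl::CFlag, core.flags_nzcv()));
    }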
diff --git a/test/test-utils-a64.h b/test/test-utils-a64.h
new file mode 100644
index 0000000..f7cf242
--- /dev/null
+++ b/test/test-utils-a64.h
@@ -0,0 +1,231 @@
+// Copyright 2013, ARM Limited
+// All rights reserved.
+//
+// Redistribution and use in source and binary forms, with or without
+// modification, are permitted provided that the following conditions are met:
+//
+//   * Redistributions of source code must retain the above copyright notice,
+//     this list of conditions and the following disclaimer.
+//   * Redistributions in binary form must reproduce the above copyright notice,
+//     this list of conditions and the following disclaimer in the documentation
+//     and/or other materials provided with the distribution.
+//   * Neither the name of ARM Limited nor the names of its contributors may be
+//     used to endorse or promote products derived from this software without
+//     specific prior written permission.
+//
+// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+// ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+// WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+// DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#ifndef VIXL_A64_TEST_UTILS_A64_H_
+#define VIXL_A64_TEST_UTILS_A64_H_
+
+#include "cctest.h"
+#include "a64/macro-assembler-a64.h"
+#include "a64/simulator-a64.h"
+#include "a64/disasm-a64.h"
+#include "a64/cpu-a64.h"
+
+namespace vixl {
+
+// RegisterDump: Object allowing integer, floating point and flags registers
+// to be saved to itself for future reference.
+class RegisterDump {
+ public:
+  RegisterDump() : completed_(false) {
+    ASSERT(sizeof(dump_.d_[0]) == kDRegSizeInBytes);
+    ASSERT(sizeof(dump_.s_[0]) == kSRegSizeInBytes);
+    ASSERT(sizeof(dump_.d_[0]) == kXRegSizeInBytes);
+    ASSERT(sizeof(dump_.s_[0]) == kWRegSizeInBytes);
+    ASSERT(sizeof(dump_.x_[0]) == kXRegSizeInBytes);
+    ASSERT(sizeof(dump_.w_[0]) == kWRegSizeInBytes);
+  }
+
+  // The Dump method generates code to store a snapshot of the register values.
+  // It needs to be able to use the stack temporarily, and requires that the
+  // current stack pointer is sp, and is properly aligned.
+  //
+  // The dumping code is generated through the given MacroAssembler. No
+  // registers are corrupted in the process, but the stack is used briefly. The
+  // flags will be corrupted during this call.
+  void Dump(MacroAssembler* assm);
+
+  // Register accessors.
+  inline int32_t wreg(unsigned code) const {
+    if (code == kSPRegInternalCode) {
+      return wspreg();
+    }
+    ASSERT(RegAliasesMatch(code));
+    return dump_.w_[code];
+  }
+
+  inline int64_t xreg(unsigned code) const {
+    if (code == kSPRegInternalCode) {
+      return spreg();
+    }
+    ASSERT(RegAliasesMatch(code));
+    return dump_.x_[code];
+  }
+
+  // FPRegister accessors.
+  inline uint32_t sreg_bits(unsigned code) const {
+    ASSERT(FPRegAliasesMatch(code));
+    return dump_.s_[code];
+  }
+
+  inline float sreg(unsigned code) const {
+    return rawbits_to_float(sreg_bits(code));
+  }
+
+  inline uint64_t dreg_bits(unsigned code) const {
+    ASSERT(FPRegAliasesMatch(code));
+    return dump_.d_[code];
+  }
+
+  inline double dreg(unsigned code) const {
+    return rawbits_to_double(dreg_bits(code));
+  }
+
+  // Stack pointer accessors.
+  inline int64_t spreg() const {
+    ASSERT(SPRegAliasesMatch());
+    return dump_.sp_;
+  }
+
+  inline int64_t wspreg() const {
+    ASSERT(SPRegAliasesMatch());
+    return dump_.wsp_;
+  }
+
+  // Flags accessors.
+  inline uint64_t flags_nzcv() const {
+    ASSERT(IsComplete());
+    ASSERT((dump_.flags_ & ~Flags_mask) == 0);
+    return dump_.flags_ & Flags_mask;
+  }
+
+  inline bool IsComplete() const {
+    return completed_;
+  }
+
+ private:
+  // Indicate whether the dump operation has been completed.
+  bool completed_;
+
+  // Check that the lower 32 bits of x<code> exactly match the 32 bits of
+  // w<code>. A failure of this test most likely represents a failure in the
+  // ::Dump method, or a failure in the simulator.
+  bool RegAliasesMatch(unsigned code) const {
+    ASSERT(IsComplete());
+    ASSERT(code < kNumberOfRegisters);
+    return ((dump_.x_[code] & kWRegMask) == dump_.w_[code]);
+  }
+
+  // As RegAliasesMatch, but for the stack pointer.
+  bool SPRegAliasesMatch() const {
+    ASSERT(IsComplete());
+    return ((dump_.sp_ & kWRegMask) == dump_.wsp_);
+  }
+
+  // As RegAliasesMatch, but for floating-point registers.
+  bool FPRegAliasesMatch(unsigned code) const {
+    ASSERT(IsComplete());
+    ASSERT(code < kNumberOfFPRegisters);
+    return (dump_.d_[code] & kSRegMask) == dump_.s_[code];
+  }
+
+  // Store all the dumped elements in a simple struct so the implementation can
+  // use offsetof to quickly find the correct field.
+  struct dump_t {
+    // Core registers.
+    uint64_t x_[kNumberOfRegisters];
+    uint32_t w_[kNumberOfRegisters];
+
+    // Floating-point registers, as raw bits.
+    uint64_t d_[kNumberOfFPRegisters];
+    uint32_t s_[kNumberOfFPRegisters];
+
+    // The stack pointer.
+    uint64_t sp_;
+    uint64_t wsp_;
+
+    // NZCV flags, stored in bits 28 to 31.
+    // bit[31] : Negative
+    // bit[30] : Zero
+    // bit[29] : Carry
+    // bit[28] : oVerflow
+    uint64_t flags_;
+  } dump_;
+};
+
+// Some of these methods don't use the RegisterDump argument, but they have to
+// accept it so that they can overload those that take register arguments.
+bool Equal32(uint32_t expected, const RegisterDump*, uint32_t result);
+bool Equal64(uint64_t expected, const RegisterDump*, uint64_t result);
+
+bool EqualFP32(float expected, const RegisterDump*, float result);
+bool EqualFP64(double expected, const RegisterDump*, double result);
+
+bool Equal32(uint32_t expected, const RegisterDump* core, const Register& reg);
+bool Equal64(uint64_t expected, const RegisterDump* core, const Register& reg);
+
+bool EqualFP32(float expected, const RegisterDump* core,
+               const FPRegister& fpreg);
+bool EqualFP64(double expected, const RegisterDump* core,
+               const FPRegister& fpreg);
+
+bool Equal64(const Register& reg0, const RegisterDump* core,
+             const Register& reg1);
+
+bool EqualNzcv(uint32_t expected, uint32_t result);
+
+bool EqualRegisters(const RegisterDump* a, const RegisterDump* b);
+
+// Populate the w, x and r arrays with registers from the 'allowed' mask. The
+// r array will be populated with <reg_size>-sized registers.
+//
+// This allows for tests which use large, parameterized blocks of registers
+// (such as the push and pop tests), but where certain registers must be
+// avoided as they are used for other purposes.
+//
+// Any of w, x, or r can be NULL if they are not required.
+//
+// The return value is a RegList indicating which registers were allocated.
+RegList PopulateRegisterArray(Register* w, Register* x, Register* r,
+                              int reg_size, int reg_count, RegList allowed);
+
+// As PopulateRegisterArray, but for floating-point registers.
+RegList PopulateFPRegisterArray(FPRegister* s, FPRegister* d, FPRegister* v,
+                                int reg_size, int reg_count, RegList allowed);
+
+// Overwrite the contents of the specified registers. This enables tests to
+// check that register contents are written in cases where it's likely that the
+// correct outcome could already be stored in the register.
+//
+// This always overwrites X-sized registers. If tests are operating on W
+// registers, a subsequent write into an aliased W register should clear the
+// top word anyway, so clobbering the full X registers should make tests more
+// rigorous.
+void Clobber(MacroAssembler* masm, RegList reg_list,
+             uint64_t const value = 0xfedcba9876543210UL);
+
+// As Clobber, but for FP registers.
+void ClobberFP(MacroAssembler* masm, RegList reg_list,
+               double const value = kFP64SignallingNaN);
+
+// As Clobber, but for a CPURegList with either FP or integer registers. When
+// using this method, the clobber value is always the default for the basic
+// Clobber or ClobberFP functions.
+void Clobber(MacroAssembler* masm, CPURegList reg_list);
+
+
+}  // namespace vixl
+
+#endif  // VIXL_A64_TEST_UTILS_A64_H_
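As a rough illustration of how PopulateRegisterArray and Clobber are meant to be combined in a parameterized test: select a block of registers while excluding those the harness needs, then poison them before exercising the code under test. The exclusion mask, register count and helper name below are placeholders, not the values used by the real push and pop tests.

    #include "test-utils-a64.h"
    #include "a64/macro-assembler-a64.h"

    static vixl::RegList SelectAndClobber(vixl::MacroAssembler* masm) {
      static const int kRegCount = 4;
      vixl::Register w[kRegCount];
      vixl::Register x[kRegCount];
      vixl::Register r[kRegCount];

      // Avoid x0 and the default scratch registers (x16/x17 in this sketch;
      // adjust to match the fixture's configuration).
      vixl::RegList allowed = ~((1UL << 0) | (1UL << 16) | (1UL << 17));
      vixl::RegList used = vixl::PopulateRegisterArray(w, x, r, vixl::kXRegSize,
                                                       kRegCount, allowed);

      // Overwrite the selected registers with the default poison value so that
      // stale contents cannot make a broken test sequence look correct.
      vixl::Clobber(masm, used);
      return used;
    }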
diff --git a/tools/cross_build_gcc.sh b/tools/cross_build_gcc.sh
new file mode 100755
index 0000000..3a365d9
--- /dev/null
+++ b/tools/cross_build_gcc.sh
@@ -0,0 +1,67 @@
+#!/bin/sh
+
+# Copyright 2013, ARM Limited
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are met:
+#
+#   * Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#   * Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#   * Neither the name of ARM Limited nor the names of its contributors may be
+#     used to endorse or promote products derived from this software without
+#     specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+if [ "$#" -lt 1 ]; then
+  echo "Usage: tools/cross_build_gcc.sh <GCC prefix> [scons arguments ...]"
+  exit 1
+fi
+
+export CXX=$1g++
+export AR=$1ar
+export RANLIB=$1ranlib
+export CC=$1gcc
+export LD=$1ld
+
+OK=1
+if [ ! -x "$CXX" ]; then
+  echo "Error: $CXX does not exist or is not executable."
+  OK=0
+fi
+if [ ! -x "$AR" ]; then
+  echo "Error: $AR does not exist or is not executable."
+  OK=0
+fi
+if [ ! -x "$RANLIB" ]; then
+  echo "Error: $RANLIB does not exist or is not executable."
+  OK=0
+fi
+if [ ! -x "$CC" ]; then
+  echo "Error: $CC does not exist or is not executable."
+  OK=0
+fi
+if [ ! -x "$LD" ]; then
+  echo "Error: $LD does not exist or is not executable."
+  OK=0
+fi
+if [ $OK -ne 1 ]; then
+  exit 1
+fi
+
+
+shift
+scons "$@"
diff --git a/tools/git.py b/tools/git.py
new file mode 100644
index 0000000..b661d11
--- /dev/null
+++ b/tools/git.py
@@ -0,0 +1,75 @@
+# Copyright 2013 ARM Limited
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are met:
+#
+#   * Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#   * Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#   * Neither the name of ARM Limited nor the names of its contributors may be
+#     used to endorse or promote products derived from this software without
+#     specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY ARM LIMITED AND CONTRIBUTORS "AS IS" AND ANY
+# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+# DISCLAIMED. IN NO EVENT SHALL ARM LIMITED BE LIABLE FOR ANY DIRECT, INDIRECT,
+# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
+# OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+# LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
+# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+# EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+import re
+import util
+import os.path
+
+def is_git_repository_root():
+  return os.path.isdir('.git')
+
+def get_current_branch():
+  status, branches = util.getstatusoutput('git branch')
+  if status != 0: util.abort('Failed to run git branch.')
+  match = re.search("^\* (.*)$", branches, re.MULTILINE)
+  if not match: util.abort('Failed to find the current branch.')
+
+  branch = match.group(1)
+
+  # If we are not on a named branch, return the hash of the HEAD commit.
+  # This can occur (for example) if a specific revision is checked out by
+  # commit hash, or during a rebase.
+  if branch == '(no branch)':
+    status, commit = util.getstatusoutput('git log -1 --pretty=format:"%H"')
+    if status != 0: util.abort('Failed to run git log.')
+    match = re.search('^[0-9a-fA-F]{40}$', commit, re.MULTILINE)
+    if not match: util.abort('Failed to find the current revision.')
+    branch = match.group(0)
+
+  return branch
+
+
+def get_tracked_files():
+  command = 'git ls-tree '
+  branch = get_current_branch()
+  options = ' -r --full-tree --name-only'
+
+  status, tracked = util.getstatusoutput(command + branch + options)
+  if status != 0: util.abort('Failed to list tracked files.')
+
+  return tracked
+
+
+# Get untracked files in src/, test/, and tools/.
+def get_untracked_files():
+  status, output = util.getstatusoutput('git status -s')
+  if status != 0: util.abort('Failed to get git status.')
+
+  untracked_regexp = re.compile('\?\?.*(src/|test/|tools/).*(.cc$|.h$)')
+  files_in_watched_folder = lambda n: untracked_regexp.search(n) != None
+  untracked_files = filter(files_in_watched_folder, output.split('\n'))
+
+  return untracked_files
diff --git a/tools/make_instruction_doc.pl b/tools/make_instruction_doc.pl
new file mode 100755
index 0000000..a244962
--- /dev/null
+++ b/tools/make_instruction_doc.pl
@@ -0,0 +1,112 @@
+#!/usr/bin/perl
+
+# Copyright 2013, ARM Limited
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are met:
+#
+#   * Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#   * Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#   * Neither the name of ARM Limited nor the names of its contributors may be
+#     used to endorse or promote products derived from this software without
+#     specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+# Assembler header file.
+my $hfile = "src/a64/assembler-a64.h";
+
+# Extra pseudo instructions provided by the assembler in addition to AArch64.
+my @extras = qw/bind debug dci dc32 dc64/;
+
+my %inst = ();  # Global hash of instructions.
+
+$/ = '';
+open(IN, "<$hfile") or die("Can't open header file $hfile.\n");
+while(<IN>)
+{
+  # Find a function formatted like an instruction.
+  if(my($t) = /^  ((?:void|inline void) [a-z0-9]{1,6})\(/mgp)
+  {
+    my $before = ${^PREMATCH};
+    my $after = ${^POSTMATCH};
+
+    # Extract the instruction.
+    my($i) = $t =~ /(?:void|inline void) ([a-z0-9]{1,6})/;
+
+    # Extract the comment from before the function.
+    my($d) = $before =~ /.*  \/\/ ([A-Z].+?\.)$/;
+
+    # Extract and tidy up the function prototype.
+    my($p) = $after =~ /(.*?\))/ms;
+    $p =~ s/\n/\n  /g;
+    $p = "$t(".$p;
+
+    # Establish the type of the instruction.
+    my $type = 'integer';
+    ($i =~ /^f/) and $type = 'float';
+    ($i ~~ @extras) and $type = 'pseudo';
+
+    # Push the results into a hash keyed by prototype string.
+    $inst{$p}->{'type'} = $type;
+    $inst{$p}->{'mnemonic'} = $i;
+    $inst{$p}->{'description'} = $d;
+  }
+}
+close(IN);
+
+print <<HEADER;
+VIXL Supported Instruction List
+===============================
+
+This is a list of the AArch64 instructions supported by the VIXL assembler,
+disassembler and simulator. The simulator may not support all floating point
+operations to the precision required by AArch64 - please check the simulator
+source code for details.
+
+HEADER
+
+print describe_insts('AArch64 integer instructions', 'integer');
+print describe_insts('AArch64 floating point instructions', 'float');
+print describe_insts('Additional or pseudo instructions', 'pseudo');
+
+# Sort instructions by mnemonic and then description.
+sub inst_sort
+{
+  $inst{$a}->{'mnemonic'} cmp $inst{$b}->{'mnemonic'} ||
+  $inst{$a}->{'description'} cmp $inst{$b}->{'description'};
+}
+
+# Return a Markdown formatted list of instructions of a particular type.
+sub describe_insts
+{
+  my($title, $type) = @_;
+  my $result = '';
+  $result .= "$title\n";
+  $result .= '-' x length($title);
+  $result .= "\n\n";
+
+  foreach my $i (sort inst_sort keys(%inst))
+  {
+    next if($inst{$i}->{'type'} ne $type);
+    $result .= sprintf("### %s ###\n\n%s\n\n", $inst{$i}->{'mnemonic'}, $inst{$i}->{'description'});
+    $result .= "    $i\n\n\n";
+  }
+  $result .= "\n";
+  return $result
+}
+
+
diff --git a/tools/presubmit.py b/tools/presubmit.py
new file mode 100755
index 0000000..6632e4b
--- /dev/null
+++ b/tools/presubmit.py
@@ -0,0 +1,202 @@
+#!/usr/bin/env python2.7
+
+# Copyright 2013, ARM Limited
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are met:
+#
+#   * Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#   * Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#   * Neither the name of ARM Limited nor the names of its contributors may be
+#     used to endorse or promote products derived from this software without
+#     specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+import os
+import sys
+import argparse
+import re
+
+import util
+import git
+
+# Google's cpplint.py from depot_tools is the linter used here.
+CPP_LINTER_RULES = '''
+build/class
+build/deprecated
+build/endif_comment
+build/forward_decl
+build/include_order
+build/printf_format
+build/storage_class
+legal/copyright
+readability/boost
+readability/braces
+readability/casting
+readability/constructors
+readability/fn_size
+readability/function
+readability/multiline_comment
+readability/multiline_string
+readability/streams
+readability/utf8
+runtime/arrays
+runtime/casting
+runtime/deprecated_fn
+runtime/explicit
+runtime/int
+runtime/memset
+runtime/mutex
+runtime/nonconf
+runtime/printf
+runtime/printf_format
+runtime/references
+runtime/rtti
+runtime/sizeof
+runtime/string
+runtime/virtual
+runtime/vlog
+whitespace/blank_line
+whitespace/braces
+whitespace/comma
+whitespace/comments
+whitespace/end_of_line
+whitespace/ending_newline
+whitespace/indent
+whitespace/labels
+whitespace/line_length
+whitespace/newline
+whitespace/operators
+whitespace/parens
+whitespace/tab
+whitespace/todo
+'''.split()
+
+
+def BuildOptions():
+  result = argparse.ArgumentParser(description='Run the linter and unit tests.')
+  result.add_argument('--verbose', '-v', action='store_true',
+                      help='Print all tests output at the end.')
+  result.add_argument('--notest', action='store_true',
+                      help='Do not run tests. Run the linter only.')
+  result.add_argument('--nolint', action='store_true',
+                      help='Do not run the linter. Run the tests only.')
+  result.add_argument('--noclean', action='store_true',
+                      help='Do not clean before build.')
+  result.add_argument('--jobs', '-j', metavar='N', type=int, default=1,
+                      help='Allow N jobs at once.')
+  return result.parse_args()
+
+
+def CleanBuildSystem():
+  status, output = util.getstatusoutput('scons mode=release --clean')
+  if status != 0: util.abort('Failed to clean in release mode.')
+  status, output = util.getstatusoutput('scons mode=debug --clean')
+  if status != 0: util.abort('Failed to clean in debug mode.')
+
+
+class Test:
+  def __init__(self, name, command, get_summary):
+    self.name = name
+    self.command = command
+    self.get_summary = get_summary
+    self.output = 'NOT RUN'
+    self.status = 'NOT RUN'
+    self.summary = 'NOT RUN'
+
+  def Run(self):
+    retcode, self.output = util.getstatusoutput(self.command)
+    self.status = 'PASS' if retcode == 0 else 'FAILED'
+    self.summary = self.get_summary(self.output)
+
+  def PrintOutcome(self):
+    print(('%s :'.ljust(20 - len(self.name)) + '%s\n' + ' ' * 18 + '%s')
+          %(self.name, self.status, self.summary))
+
+  def PrintOutput(self):
+    print('\n\n### OUTPUT of ' + self.name)
+    print(self.output)
+
+
+class Tester:
+  def __init__(self):
+    self.tests = []
+
+  def AddTest(self, test):
+    self.tests.append(test)
+
+  def RunAll(self, verbose):
+    for test in self.tests:
+      test.Run()
+      test.PrintOutcome()
+
+    if verbose:
+      for test in self.tests:
+        test.PrintOutput()
+
+
+if __name__ == '__main__':
+  original_dir = os.path.abspath('.')
+  # $ROOT/tools/presubmit.py
+  root_dir = os.path.dirname(os.path.dirname(os.path.abspath(sys.argv[0])))
+  os.chdir(root_dir)
+  args = BuildOptions()
+
+  if not args.nolint and not git.is_git_repository_root():
+    print('WARNING: This is not a Git repository. The linter will not run.')
+    args.nolint = True
+
+  tester = Tester()
+  if not args.nolint:
+    lint_args = '--filter=-,+' + ',+'.join(CPP_LINTER_RULES) + ' '
+    tracked_files = git.get_tracked_files().split()
+    tracked_files = filter(util.is_cpp_filename, tracked_files)
+    tracked_files = ' '.join(tracked_files)
+    lint = Test('cpp lint',
+                'cpplint.py ' + lint_args + tracked_files,
+                util.last_line)
+    tester.AddTest(lint)
+  if not args.notest:
+    release = Test('cctest release',
+                   './tools/test.py --mode=release --jobs=%d' % args.jobs,
+                   util.last_line)
+    debug = Test('cctest debug',
+                 './tools/test.py --mode=debug --jobs=%d' % args.jobs,
+                 util.last_line)
+    tester.AddTest(release)
+    tester.AddTest(debug)
+
+  if not args.noclean:
+    CleanBuildSystem()
+
+  tester.RunAll(args.verbose)
+
+  # If the linter failed, print its output. We don't do the same for the debug
+  # or release tests because they're easy to run by themselves. In verbose mode,
+  # the output is printed automatically in tester.RunAll().
+  if not args.nolint and lint.status == 'FAILED' and not args.verbose:
+    lint.PrintOutput()
+
+  if git.is_git_repository_root():
+    untracked_files = git.get_untracked_files()
+    if untracked_files:
+      print('\nWARNING: The following files are untracked:')
+      for f in untracked_files:
+        print(f.lstrip('?'))
+
+  # Restore original directory.
+  os.chdir(original_dir)
diff --git a/tools/test.py b/tools/test.py
new file mode 100755
index 0000000..cf20385
--- /dev/null
+++ b/tools/test.py
@@ -0,0 +1,247 @@
+#!/usr/bin/env python2.7
+
+# Copyright 2013, ARM Limited
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are met:
+#
+#   * Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#   * Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#   * Neither the name of ARM Limited nor the names of its contributors may be
+#     used to endorse or promote products derived from this software without
+#     specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+import os
+import sys
+import argparse
+import re
+import subprocess
+import threading
+import time
+import util
+
+
+def BuildOptions():
+  result = argparse.ArgumentParser(description = 'Unit test tool')
+  result.add_argument('name_filters', metavar='name filters', nargs='*',
+                      help='Tests matching any of the regexp filters will be run.')
+  result.add_argument('--mode', action='store', choices=['release', 'debug', 'coverage'],
+                      default='release', help='Build mode')
+  result.add_argument('--simulator', action='store', choices=['on', 'off'],
+                      default='on', help='Use the builtin a64 simulator')
+  result.add_argument('--timeout', action='store', type=int, default=5,
+                      help='Timeout (in seconds) for each cctest (5sec default).')
+  result.add_argument('--nobuild', action='store_true',
+                      help='Do not (re)build the tests')
+  result.add_argument('--jobs', '-j', metavar='N', type=int, default=1,
+                      help='Allow N jobs at once.')
+  return result.parse_args()
+
+
+def BuildRequiredObjects(arguments):
+  status, output = util.getstatusoutput('scons ' +
+                                        'mode=' + arguments.mode + ' ' +
+                                        'simulator=' + arguments.simulator + ' ' +
+                                        'target=cctest ' +
+                                        '--jobs=' + str(arguments.jobs))
+
+  if status != 0:
+    print(output)
+    util.abort('Failed building cctest')
+
+
+# Display the run progress:
+# [time| progress|+ passed|- failed]
+def UpdateProgress(start_time, passed, failed, card):
+  minutes, seconds = divmod(time.time() - start_time, 60)
+  progress = float(passed + failed) / card * 100
+  passed_colour = '\x1b[32m' if passed != 0 else ''
+  failed_colour = '\x1b[31m' if failed != 0 else ''
+  indicator = '\r[%02d:%02d| %3d%%|' + passed_colour + '+ %d\x1b[0m|' + failed_colour + '- %d\x1b[0m]'
+  sys.stdout.write(indicator % (minutes, seconds, progress, passed, failed))
+
+
+def PrintError(s):
+  # Print the error on a new line.
+  sys.stdout.write('\n')
+  print(s)
+  sys.stdout.flush()
+
+
+# List all tests matching any of the provided filters.
+def ListTests(cctest, filters):
+  status, output = util.getstatusoutput(cctest +  ' --list')
+  if status != 0: util.abort('Failed to list all tests')
+
+  available_tests = output.split()
+  if filters:
+    filters = map(re.compile, filters)
+    def is_selected(test_name):
+      for e in filters:
+        if e.search(test_name):
+          return True
+      return False
+
+    return filter(is_selected, available_tests)
+  else:
+    return available_tests
+
+
+# A class representing a cctest.
+class CCtest:
+  cctest = None
+
+  def __init__(self, name, options = None):
+    self.name = name
+    self.options = options
+    self.process = None
+    self.stdout = None
+    self.stderr = None
+
+  def Command(self):
+    command = '%s %s' % (CCtest.cctest, self.name)
+    if self.options is not None:
+      command = '%s %s' % (command, ' '.join(self.options))
+
+    return command
+
+  # Run the test.
+  # Use a thread to be able to control the test.
+  def Run(self, arguments):
+    command = [CCtest.cctest, self.name]
+    if self.options is not None:
+      command += self.options
+
+    def execute():
+      self.process = subprocess.Popen(command,
+                                      stdout=subprocess.PIPE,
+                                      stderr=subprocess.PIPE)
+      self.stdout, self.stderr = self.process.communicate()
+
+    thread = threading.Thread(target=execute)
+    retcode = -1
+    # Append spaces to hide the previous test name if longer.
+    sys.stdout.write('  ' + self.name + ' ' * 20)
+    sys.stdout.flush()
+    # Start the test with a timeout.
+    thread.start()
+    thread.join(arguments.timeout)
+    if thread.is_alive():
+      # Too slow! Terminate.
+      PrintError('### TIMEOUT %s\nCOMMAND:\n%s' % (self.name, self.Command()))
+      # If the timeout was too small, the thread may not have run and
+      # self.process may still be None, so check before terminating.
+      if self.process:
+        self.process.terminate()
+      # Allow 1 second to terminate. Else, exterminate!
+      thread.join(1)
+      if thread.is_alive() and self.process:
+        # Python threads cannot be killed directly; kill the child process so
+        # that communicate() returns and the thread can exit.
+        self.process.kill()
+        thread.join()
+      # retcode is already set for failure.
+    else:
+      # Check the return status of the test.
+      retcode = self.process.poll()
+      if retcode != 0:
+        PrintError('### FAILED %s\nSTDERR:\n%s\nSTDOUT:\n%s\nCOMMAND:\n%s'
+                   % (self.name, self.stderr.decode(), self.stdout.decode(),
+                      self.Command()))
+
+    return retcode
+
+
+# Run all tests in the list 'tests'.
+def RunTests(cctest, tests, arguments):
+  CCtest.cctest = cctest
+  card = len(tests)
+  passed = 0
+  failed = 0
+
+  if card == 0:
+    print('No test to run')
+    return 0
+
+  # When the simulator is on, the tests are run twice: with and without the
+  # debugger.
+  if arguments.simulator == 'on':
+    card *= 2
+
+  print('Running %d tests... (timeout = %ds)' % (card, arguments.timeout))
+  start_time = time.time()
+
+  # Initialize the progress indicator.
+  UpdateProgress(start_time, 0, 0, card)
+  for e in tests:
+    variants = [CCtest(e)]
+    if arguments.simulator == 'on': variants.append(CCtest(e, ['--debugger']))
+    for v in variants:
+      retcode = v.Run(arguments)
+      # Update the counters and progress indicator.
+      if retcode == 0:
+        passed += 1
+      else:
+        failed += 1
+    UpdateProgress(start_time, passed, failed, card)
+
+  return failed
+
+
+if __name__ == '__main__':
+  original_dir = os.path.abspath('.')
+  # $ROOT/tools/test.py
+  root_dir = os.path.dirname(os.path.dirname(os.path.abspath(sys.argv[0])))
+  os.chdir(root_dir)
+
+  # Parse the arguments and build the executable.
+  args = BuildOptions()
+  if not args.nobuild:
+    BuildRequiredObjects(args)
+
+  # The test binary.
+  cctest = os.path.join(root_dir, 'cctest')
+  if args.simulator == 'on':
+    cctest += '_sim'
+  if args.mode == 'debug':
+    cctest += '_g'
+  elif args.mode == 'coverage':
+    cctest += '_gcov'
+
+  # List available tests.
+  tests = ListTests(cctest, args.name_filters)
+
+  # Delete coverage data files.
+  if args.mode == 'coverage':
+    status, output = util.getstatusoutput('find obj/coverage -name "*.gcda" -exec rm {} \;')
+
+  # Run the tests.
+  status = RunTests(cctest, tests, args)
+  sys.stdout.write('\n')
+
+  # Print coverage information.
+  if args.mode == 'coverage':
+    cmd = 'tggcov -R summary_all,untested_functions_per_file obj/coverage/src/aarch64'
+    p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE,
+                         stderr=subprocess.PIPE)
+    stdout, stderr = p.communicate()
+    print(stdout)
+
+  # Restore original directory.
+  os.chdir(original_dir)
+
+  sys.exit(status)
+
diff --git a/tools/util.py b/tools/util.py
new file mode 100644
index 0000000..2e84d9a
--- /dev/null
+++ b/tools/util.py
@@ -0,0 +1,56 @@
+# Copyright 2013, ARM Limited
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions are met:
+#
+#   * Redistributions of source code must retain the above copyright notice,
+#     this list of conditions and the following disclaimer.
+#   * Redistributions in binary form must reproduce the above copyright notice,
+#     this list of conditions and the following disclaimer in the documentation
+#     and/or other materials provided with the distribution.
+#   * Neither the name of ARM Limited nor the names of its contributors may be
+#     used to endorse or promote products derived from this software without
+#     specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS CONTRIBUTORS "AS IS" AND
+# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+import sys
+import subprocess
+import shlex
+import re
+
+
+def abort(message):
+  print('ABORTING: ' + message)
+  sys.exit(1)
+
+
+# Emulate python3 subprocess.getstatusoutput.
+def getstatusoutput(command):
+  try:
+    args = shlex.split(command)
+    output = subprocess.check_output(args, stderr=subprocess.STDOUT)
+    return 0, output.rstrip('\n')
+  except subprocess.CalledProcessError as e:
+    return e.returncode, e.output.rstrip('\n')
+
+
+def last_line(text):
+  lines = text.split('\n')
+  last = lines[-1].split('\r')
+  return last[-1]
+
+
+CPP_EXT_REGEXP = re.compile('\.cc$|\.h$')
+def is_cpp_filename(filename):
+  return CPP_EXT_REGEXP.search(filename) != None