Subzero - Fast code generator for PNaCl bitcode
===============================================

Building
--------

Subzero is set up to be built within the Native Client tree. Follow the
`Developing PNaCl
<https://sites.google.com/a/chromium.org/dev/nativeclient/pnacl/developing-pnacl>`_
instructions, in particular the section on building PNaCl sources. This will
prepare the necessary external headers and libraries that Subzero needs.
Checking out the Native Client project also gets the pre-built clang and LLVM
tools in ``native_client/../third_party/llvm-build/Release+Asserts/bin``, which
are used for building Subzero.

The Subzero source is in ``native_client/toolchain_build/src/subzero``. From
within that directory, run ``git checkout master && git pull`` to get the latest
version of the Subzero source code.

The Makefile is designed to be used as part of the higher level LLVM build
system. To build manually, use ``Makefile.standalone``. Several build
configurations are available from the command line::

    make -f Makefile.standalone
    make -f Makefile.standalone DEBUG=1
    make -f Makefile.standalone NOASSERT=1
    make -f Makefile.standalone DEBUG=1 NOASSERT=1
    make -f Makefile.standalone MINIMAL=1

``DEBUG=1`` builds without optimizations and is useful when running the
translator inside a debugger. ``NOASSERT=1`` disables assertions and is the
preferred configuration for performance testing the translator. ``MINIMAL=1``
attempts to minimize the size of the translator by compiling out everything
unnecessary.

The result of the ``make`` command is the target ``pnacl-sz`` in the current
directory.

``pnacl-sz``
------------

The ``pnacl-sz`` program parses a pexe or an LLVM bitcode file and translates it
into ICE (Subzero's intermediate representation). It then invokes the ICE
translate method to lower it to target-specific machine code, optionally dumping
the intermediate representation at various stages of the translation.

The program can be run as follows::

    ../pnacl-sz ./path/to/<file>.pexe
    ../pnacl-sz ./tests_lit/pnacl-sz_tests/<file>.ll

At this time, ``pnacl-sz`` accepts a number of arguments, including the
following:

    ``-help`` -- Show available arguments and possible values. (Note: this
    unfortunately also pulls in some LLVM-specific options that are reported but
    that Subzero doesn't use.)

    ``-notranslate`` -- Suppress the ICE translation phase, which is useful if
    ICE is missing some support.

    ``-target=<TARGET>`` -- Set the target architecture. The default is x8632.
    Future targets include x8664, arm32, and arm64.

    ``-filetype=obj|asm|iasm`` -- Select the output file type. ``obj`` is a
    native ELF file, ``asm`` is a textual assembly file, and ``iasm`` is a
    low-level textual assembly file demonstrating the integrated assembler.

    ``-O<LEVEL>`` -- Set the optimization level. Valid levels are ``2``, ``1``,
    ``0``, ``-1``, and ``m1``. Levels ``-1`` and ``m1`` are synonyms, and
    represent the minimum optimization and worst code quality, but fastest code
    generation.

    ``-verbose=<list>`` -- Set verbosity flags. This argument allows a
    comma-separated list of values. The default is ``none``, and the value
    ``inst,pred`` will roughly match the .ll bitcode file. Of particular use
    are ``all`` and ``none``.

    ``-o <FILE>`` -- Set the assembly output file name. Default is stdout.

    ``-log <FILE>`` -- Set the file name for diagnostic output (whose level is
    controlled by ``-verbose``). Default is stdout.

    ``-timing`` -- Dump some pass timing information after translating the input
    file.
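
As a rough illustration of how the options above combine, the following Python
helper assembles a ``pnacl-sz`` argument list. The helper name
``pnacl_sz_argv`` and its parameters are hypothetical and are not part of
Subzero; only the flags themselves are taken from the list above. Options left
unset are omitted so that the defaults of ``pnacl-sz`` apply.

.. code-block:: python

    def pnacl_sz_argv(input_file, exe="./pnacl-sz", target=None, filetype=None,
                      opt_level=None, verbose=None, output=None, log=None,
                      timing=False):
        """Return an argv list for pnacl-sz (hypothetical helper)."""
        if filetype is not None and filetype not in ("obj", "asm", "iasm"):
            raise ValueError("unknown file type: %s" % filetype)
        if opt_level is not None and opt_level not in ("2", "1", "0", "-1", "m1"):
            raise ValueError("unknown optimization level: %s" % opt_level)
        argv = [exe]
        if target is not None:
            argv.append("-target=" + target)
        if filetype is not None:
            argv.append("-filetype=" + filetype)
        if opt_level is not None:
            argv.append("-O" + opt_level)
        if verbose:
            # -verbose takes a comma-separated list of values.
            argv.append("-verbose=" + ",".join(verbose))
        if output is not None:
            argv += ["-o", output]
        if log is not None:
            argv += ["-log", log]
        if timing:
            argv.append("-timing")
        argv.append(input_file)
        return argv


    # Example: fastest code generation, textual assembly written to foo.s.
    print(" ".join(pnacl_sz_argv("foo.pexe", filetype="asm", opt_level="m1",
                                 verbose=["inst", "pred"], output="foo.s")))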

Running the test suite
----------------------

Subzero uses the LLVM ``lit`` testing tool for part of its test suite, which
lives in ``tests_lit``. To execute the test suite, first build Subzero, and then
run::

    make -f Makefile.standalone check-lit

There is also a suite of cross tests in the ``crosstest`` directory. A cross
test takes a test bitcode file implementing some unit tests, and translates it
twice, once with Subzero and once with LLVM's known-good ``llc`` translator.
The Subzero-translated symbols are specially mangled to avoid multiple
definition errors from the linker. Both translated versions are linked together
with a driver program that calls each version of each unit test with a variety
of interesting inputs and compares the results for equality. The cross tests
are currently invoked by running the ``runtests.sh`` script.
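
The idea behind the cross test driver can be sketched in a few lines of Python.
This is only a conceptual model and not the actual driver in ``crosstest``: the
two ``add32_*`` functions are hypothetical stand-ins for the Subzero-translated
and ``llc``-translated versions of one unit test, and ``INTERESTING`` is an
assumed set of boundary values.

.. code-block:: python

    import itertools

    MASK32 = 0xFFFFFFFF
    # Assumed "interesting" inputs: boundary values for 32-bit integers.
    INTERESTING = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]


    def add32_llc(a, b):
        """Stand-in for the known-good translation: 32-bit wrapping add."""
        return (a + b) & MASK32


    def add32_subzero_buggy(a, b):
        """Stand-in for a faulty translation that forgets to wrap."""
        return a + b


    def cross_check(subzero_fn, llc_fn, arity):
        """Call both versions on every input combination; collect mismatches."""
        mismatches = []
        for args in itertools.product(INTERESTING, repeat=arity):
            got = subzero_fn(*args)
            expected = llc_fn(*args)
            if got != expected:
                mismatches.append((args, got, expected))
        return mismatches


    for args, got, expected in cross_check(add32_subzero_buggy, add32_llc, 2):
        print("mismatch for %r: got %#x, expected %#x" % (args, got, expected))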

A convenient way to run both the lit tests and the cross tests is::

    make -f Makefile.standalone check

Assembling ``pnacl-sz`` output as needed
----------------------------------------

``pnacl-sz`` can now produce a native ELF binary using ``-filetype=obj``.

``pnacl-sz`` can also produce textual assembly code in a structure suitable for
input to ``llvm-mc``, using ``-filetype=asm`` or ``-filetype=iasm``. An object
file can then be produced using the command::

    llvm-mc -arch=x86 -filetype=obj -o=MyObj.o

Building a translated binary
----------------------------

There is a helper script, ``pydir/szbuild.py``, that translates a finalized pexe
into a fully linked executable. Run it with ``-help`` for extensive
documentation.

By default, ``szbuild.py`` builds an executable using only Subzero translation,
but it can also be used to produce hybrid Subzero/``llc`` binaries (``llc`` is
the name of the LLVM translator) for bisection-based debugging. In bisection
debugging mode, the pexe is translated using both Subzero and ``llc``, and the
resulting object files are combined into a single executable using symbol
weakening and other linker tricks to control which Subzero symbols and which
``llc`` symbols take precedence. This is controlled by the ``-include`` and
``-exclude`` arguments. These can be used to rapidly find a single function
that Subzero translates incorrectly, leading to incorrect output.
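
The bisection strategy itself can be sketched as a binary search. The Python
below is a conceptual model under stated assumptions, not part of
``szbuild.py``: ``runs_correctly`` is a hypothetical callback that builds a
hybrid binary in which only the given functions are translated by Subzero (for
example by driving ``-include`` and ``-exclude``) and reports whether its output
is correct, and exactly one function is assumed to be mistranslated.

.. code-block:: python

    def find_bad_function(functions, runs_correctly):
        """Narrow a list of function names down to the single bad one.

        runs_correctly(subset) must return True when a hybrid binary that
        uses Subzero for `subset` and llc for everything else behaves
        correctly.
        """
        candidates = list(functions)
        if not candidates:
            raise ValueError("no functions to bisect")
        while len(candidates) > 1:
            middle = len(candidates) // 2
            first_half = candidates[:middle]
            if runs_correctly(first_half):
                # The culprit is not in the first half; keep the rest.
                candidates = candidates[middle:]
            else:
                candidates = first_half
        return candidates[0]


    # Simulated example: pretend Subzero mistranslates "f5".
    names = ["f%d" % i for i in range(8)]
    print(find_bad_function(names, lambda subset: "f5" not in subset))

Each probe halves the candidate set, so the number of hybrid builds grows only
logarithmically with the number of functions in the pexe.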

There is another helper script, ``pydir/szbuild_spec2k.py``, that runs
``szbuild.py`` on one or more components of the Spec2K suite. This assumes that
Spec2K is set up in the usual place in the Native Client tree, and the finalized
pexe files have been built. (Note: for working with Spec2K and other pexes,
it's helpful to finalize the pexe using ``--no-strip-syms``, to preserve the
original function and global variable names.)

Status
------

Subzero currently translates only for the x86-32 architecture. Native Client
sandboxing is not yet implemented. Two optimization levels, ``-Om1`` and
``-O2``, are implemented.

The ``-Om1`` configuration is designed to be the simplest and fastest possible,
with a minimal set of passes and transformations.

* Simple Phi lowering before target lowering, by generating temporaries and
  adding assignments to the end of predecessor blocks.

* Simple register allocation limited to pre-colored and infinite-weight
  Variables.
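
The simple phi lowering described above can be modeled on a toy IR. The sketch
below is not Subzero's implementation: the tuple-based instruction format, the
``_phi`` temporary naming, and the function name ``lower_phis`` are all
assumptions made for illustration. Each phi gets one temporary, each predecessor
assigns its incoming value to that temporary just before its terminator, and the
phi itself becomes a plain assignment from the temporary.

.. code-block:: python

    def lower_phis(blocks):
        """Lower phi instructions in a toy CFG.

        blocks maps a block name to a list of instruction tuples whose last
        entry is the terminator. A phi is ("phi", dest, {pred_name: source}).
        """
        lowered = {name: [] for name in blocks}
        pending = {name: [] for name in blocks}  # assignments per predecessor
        for name, insts in blocks.items():
            for inst in insts:
                if inst[0] == "phi":
                    _, dest, incoming = inst
                    temp = dest + "_phi"  # hypothetical temporary name
                    for pred, source in incoming.items():
                        pending[pred].append(("assign", temp, source))
                    lowered[name].append(("assign", dest, temp))
                else:
                    lowered[name].append(inst)
        for name, assigns in pending.items():
            body = lowered[name]
            # Insert at the end of the block, keeping the terminator last.
            lowered[name] = body[:-1] + assigns + body[-1:]
        return lowered


    cfg = {
        "entry": [("br", "then", "else")],
        "then": [("assign", "a", "1"), ("br", "join")],
        "else": [("assign", "b", "2"), ("br", "join")],
        "join": [("phi", "x", {"then": "a", "else": "b"}), ("ret", "x")],
    }
    for block_name, body in lower_phis(cfg).items():
        print(block_name, body)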

The ``-O2`` configuration is designed to use all optimizations available and
produce the best code.

* Address mode inference to leverage the complex x86 addressing modes.

* Compare/branch fusing based on liveness/last-use analysis.

* Global, linear-scan register allocation.

* Advanced phi lowering after target lowering and global register allocation,
  via edge splitting, topological sorting of the parallel moves, and final local
  register allocation.

* Stack slot coalescing to reduce frame size.

* Branch optimization to reduce the number of branches to the following block.
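
To make the "topological sorting of the parallel moves" step more concrete, the
following Python sketch orders a set of parallel moves so that no source is
overwritten before it is read. It is a generic textbook-style model, not
Subzero's code: the ``(dest, src)`` tuple format and the function names are
hypothetical, and breaking cycles with a scratch location named ``tmp`` is an
assumption, since the text above does not say how cycles are handled.

.. code-block:: python

    def sequence_parallel_moves(moves, temp="tmp"):
        """Turn parallel (dest, src) moves into an equivalent sequence.

        Assumes each dest appears at most once and `temp` is otherwise unused.
        """
        pending = [(d, s) for d, s in moves if d != s]
        ordered = []
        while pending:
            for i, (dest, src) in enumerate(pending):
                # A move is safe once no other pending move reads its dest.
                if all(dest != s for j, (_, s) in enumerate(pending) if j != i):
                    ordered.append((dest, src))
                    del pending[i]
                    break
            else:
                # Only cycles remain: save one source and redirect its readers.
                _, src = pending[0]
                ordered.append((temp, src))
                pending = [(d, temp if s == src else s) for d, s in pending]
        return ordered


    def run_moves(sequence, values):
        """Execute moves one by one on a copy of `values`."""
        state = dict(values)
        for dest, src in sequence:
            state[dest] = state[src]
        return state


    # a and b swap while c also receives the old value of a.
    parallel = [("a", "b"), ("b", "a"), ("c", "a")]
    print(sequence_parallel_moves(parallel))

Acyclic moves are emitted in dependency order without any extra copy; the
scratch location is only used when every remaining destination is still needed
as a source.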