Subzero - Fast code generator for PNaCl bitcode
===============================================

Design
------

See the accompanying DESIGN.rst file for a more detailed technical overview of
Subzero.

Building
--------

Subzero is set up to be built within the Native Client tree. Follow the
`Developing PNaCl
<https://sites.google.com/a/chromium.org/dev/nativeclient/pnacl/developing-pnacl>`_
instructions, in particular the section on building PNaCl sources. This will
prepare the necessary external headers and libraries that Subzero needs.
Checking out the Native Client project also gets the pre-built clang and LLVM
tools in ``native_client/../third_party/llvm-build/Release+Asserts/bin``, which
are used for building Subzero.

The Subzero source is in ``native_client/toolchain_build/src/subzero``. From
within that directory, run ``git checkout master && git pull`` to get the
latest version of the Subzero source code.

The Makefile is designed to be used as part of the higher-level LLVM build
system. To build manually, use ``Makefile.standalone``. Several build
configurations are available from the command line::

  make -f Makefile.standalone
  make -f Makefile.standalone DEBUG=1
  make -f Makefile.standalone NOASSERT=1
  make -f Makefile.standalone DEBUG=1 NOASSERT=1
  make -f Makefile.standalone MINIMAL=1
  make -f Makefile.standalone ASAN=1
  make -f Makefile.standalone TSAN=1

``DEBUG=1`` builds without optimizations and is good when running the translator
inside a debugger. ``NOASSERT=1`` disables assertions and is the preferred
configuration for performance testing the translator. ``MINIMAL=1`` attempts to
minimize the size of the translator by compiling out everything unnecessary.
``ASAN=1`` enables AddressSanitizer, and ``TSAN=1`` enables ThreadSanitizer.

The result of the ``make`` command is the target ``pnacl-sz`` in the current
directory.

``pnacl-sz``
------------

The ``pnacl-sz`` program parses a pexe or an LLVM bitcode file and translates it
into ICE (Subzero's intermediate representation). It then invokes the ICE
translate method to lower it to target-specific machine code, optionally dumping
the intermediate representation at various stages of the translation.

The program can be run as follows::

  ../pnacl-sz ./path/to/<file>.pexe
  ../pnacl-sz ./tests_lit/pnacl-sz_tests/<file>.ll

At this time, ``pnacl-sz`` accepts a number of arguments, including the
following:

  ``-help`` -- Show available arguments and possible values. (Note: this
  unfortunately also pulls in some LLVM-specific options that are reported but
  that Subzero doesn't use.)

  ``-notranslate`` -- Suppress the ICE translation phase, which is useful if
  ICE is missing some support.

  ``-target=<TARGET>`` -- Set the target architecture. The default is x8632.
  Future targets include x8664, arm32, and arm64.

  ``-filetype=obj|asm|iasm`` -- Select the output file type. ``obj`` is a
  native ELF file, ``asm`` is a textual assembly file, and ``iasm`` is a
  low-level textual assembly file demonstrating the integrated assembler.

  ``-O<LEVEL>`` -- Set the optimization level. Valid levels are ``2``, ``1``,
  ``0``, ``-1``, and ``m1``. Levels ``-1`` and ``m1`` are synonyms, and
  represent the minimum optimization and worst code quality, but fastest code
  generation.

  ``-verbose=<list>`` -- Set verbosity flags. This argument allows a
  comma-separated list of values. The default is ``none``, and the value
  ``inst,pred`` will roughly match the .ll bitcode file. Of particular use
  are ``all``, ``most``, and ``none``.

  ``-o <FILE>`` -- Set the assembly output file name. Default is stdout.

  ``-log <FILE>`` -- Set the file name for diagnostic output (whose level is
  controlled by ``-verbose``). Default is stdout.

  ``-timing`` -- Dump some pass timing information after translating the input
  file.

Running the test suite
----------------------

Subzero uses the LLVM ``lit`` testing tool for part of its test suite, which
lives in ``tests_lit``. To execute the test suite, first build Subzero, and then
run::

  make -f Makefile.standalone check-lit

There is also a suite of cross tests in the ``crosstest`` directory. A cross
test takes a test bitcode file implementing some unit tests, and translates it
twice, once with Subzero and once with LLVM's known-good ``llc`` translator.
The Subzero-translated symbols are specially mangled to avoid multiple
definition errors from the linker. Both translated versions are linked together
with a driver program that calls each version of each unit test with a variety
of interesting inputs and compares the results for equality. The cross tests
are currently invoked by running::

  make -f Makefile.standalone check-xtest
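
The comparison made by the cross-test driver can be pictured with a small
Python sketch. This is only an analogy and not the actual ``crosstest`` driver,
which links the two translated object files into one native program. The
functions ``test_add_llc`` and ``test_add_sz`` and the input list are
hypothetical stand-ins for the ``llc`` and Subzero translations of one unit
test:

.. code-block:: python

  import itertools

  def test_add_llc(a, b):
      """Hypothetical stand-in for the known-good llc translation."""
      return (a + b) & 0xFFFFFFFF

  def test_add_sz(a, b):
      """Hypothetical stand-in for the Subzero translation of the same test."""
      return (a + b) & 0xFFFFFFFF

  # "Interesting" inputs: boundary values for 32-bit integers (an assumption).
  INPUTS = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]

  def run_cross_test(reference, candidate, inputs):
      """Call both versions on every input pair; return (passes, failures)."""
      passes, failures = 0, []
      for a, b in itertools.product(inputs, repeat=2):
          expected, actual = reference(a, b), candidate(a, b)
          if expected == actual:
              passes += 1
          else:
              failures.append((a, b, expected, actual))
      return passes, failures

Calling ``run_cross_test(test_add_llc, test_add_sz, INPUTS)`` compares the two
versions on all 25 input pairs; any mismatch is reported together with the
inputs that triggered it.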

Subzero also has a suite of unit tests, invoked by running::

  make -f Makefile.standalone check-unit

A convenient way to run the lit, cross, and unit tests is::

  make -f Makefile.standalone check

Assembling ``pnacl-sz`` output as needed
----------------------------------------

``pnacl-sz`` can now produce a native ELF binary using ``-filetype=obj``.

``pnacl-sz`` can also produce textual assembly code in a structure suitable for
input to ``llvm-mc``, using ``-filetype=asm`` or ``-filetype=iasm``. An object
file can then be produced using the command below, which reads the assembly
from standard input because no input file is named::

  llvm-mc -triple=i686 -filetype=obj -o=MyObj.o

Building a translated binary
----------------------------

There is a helper script, ``pydir/szbuild.py``, that translates a finalized pexe
into a fully linked executable. Run it with ``-help`` for extensive
documentation.

By default, ``szbuild.py`` builds an executable using only Subzero translation,
but it can also be used to produce hybrid Subzero/``llc`` binaries (``llc`` is
the name of the LLVM translator) for bisection-based debugging. In bisection
debugging mode, the pexe is translated using both Subzero and ``llc``, and the
resulting object files are combined into a single executable using symbol
weakening and other linker tricks to control which Subzero symbols and which
``llc`` symbols take precedence. This is controlled by the ``-include`` and
``-exclude`` arguments. These can be used to rapidly find a single function
that Subzero translates incorrectly, leading to incorrect output.
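
The search that bisection debugging enables can be sketched as a binary search
over the function names. The code below is an illustration under stated
assumptions and is not part of ``szbuild.py``: ``is_broken`` is a hypothetical
callback that would build a hybrid binary with only the given functions
translated by Subzero (everything else by ``llc``) and report whether its output
is wrong, and the sketch assumes a non-empty list in which exactly one function
is mistranslated:

.. code-block:: python

  def find_bad_function(functions, is_broken):
      """Binary-search for the single function whose Subzero translation fails.

      functions: non-empty list of function names in the pexe.
      is_broken: hypothetical callback; is_broken(subset) returns True when a
          hybrid binary using Subzero only for `subset` misbehaves.
      """
      candidates = list(functions)
      while len(candidates) > 1:
          half = candidates[:len(candidates) // 2]
          # If the first half alone reproduces the failure, the culprit is in
          # it; otherwise it must be in the second half.
          if is_broken(half):
              candidates = half
          else:
              candidates = candidates[len(half):]
      return candidates[0]

Under these assumptions, each call to ``is_broken`` would correspond to one
hybrid build selected with ``-include`` and ``-exclude``, so the number of
builds grows only logarithmically with the number of functions.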

There is another helper script, ``pydir/szbuild_spec2k.py``, that runs
``szbuild.py`` on one or more components of the Spec2K suite. This assumes that
Spec2K is set up in the usual place in the Native Client tree, and the finalized
pexe files have been built. (Note: for working with Spec2K and other pexes,
it's helpful to finalize the pexe using ``--no-strip-syms``, to preserve the
original function and global variable names.)

Status
------

Subzero currently fully supports the x86-32 architecture, for both native and
Native Client sandboxing modes. The x86-64 architecture is also supported in
native mode only, and only for the x32 flavor because pointers and 32-bit
integers are indistinguishable in PNaCl bitcode. Sandboxing support for x86-64
is in progress. ARM and MIPS support is in progress. Two optimization levels,
``-Om1`` and ``-O2``, are implemented.

The ``-Om1`` configuration is designed to be the simplest and fastest possible,
with a minimal set of passes and transformations.

* Simple Phi lowering before target lowering, by generating temporaries and
  adding assignments to the end of predecessor blocks.

* Simple register allocation limited to pre-colored or infinite-weight
  Variables.
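
The simple Phi lowering described above can be illustrated on a toy IR. The
following Python sketch is not Subzero's implementation; the dictionary-based
block layout and the names ``lower_phis`` and ``_phi_tmp`` are invented for the
illustration. Each Phi gets a fresh temporary, every predecessor assigns its
incoming value to that temporary at the end of its block, and the Phi itself
becomes a plain assignment from the temporary:

.. code-block:: python

  def lower_phis(blocks):
      """Replace Phi instructions with assignments through temporaries.

      blocks maps a label to {"phis": [(dest, {pred_label: value})],
                              "insts": [(dest, expr), ...]}.
      Branches are left implicit in this toy IR, so appending to "insts"
      places the assignment at the end of the block.
      """
      counter = 0
      for label, block in blocks.items():
          new_head = []
          for dest, incoming in block["phis"]:
              tmp = "%s_phi_tmp%d" % (dest, counter)
              counter += 1
              # Each predecessor writes its incoming value into the temporary.
              for pred, value in incoming.items():
                  blocks[pred]["insts"].append((tmp, value))
              # The Phi becomes an ordinary copy at the top of the block.
              new_head.append((dest, tmp))
          block["insts"] = new_head + block["insts"]
          block["phis"] = []
      return blocks

Routing every value through a temporary keeps the result correct even when one
Phi's incoming value is the destination of another Phi in the same block, at
the cost of extra copies.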

The ``-O2`` configuration is designed to use all optimizations available and
produce the best code.

* Address mode inference to leverage the complex x86 addressing modes.

* Compare/branch fusing based on liveness/last-use analysis.

* Global, linear-scan register allocation.

* Advanced phi lowering after target lowering and global register allocation,
  via edge splitting, topological sorting of the parallel moves, and final local
  register allocation.

* Stack slot coalescing to reduce frame size.

* Branch optimization to reduce the number of branches to the following block.
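
The topological sorting of parallel moves mentioned in the advanced phi
lowering item can be illustrated in isolation. On a split edge, the Phi
assignments form a set of parallel moves, and a move may be emitted only once no
remaining move still reads its destination. The Python sketch below is a generic
illustration, not Subzero's code. It assumes that a scratch location (the
hypothetical name ``tmp``) is available for breaking cycles; how Subzero
actually handles cycles and the final local register allocation are not shown:

.. code-block:: python

  def sequentialize(moves, scratch="tmp"):
      """Order parallel moves {dest: src} so sequential execution is equivalent.

      Assumes `scratch` is neither a source nor a destination in `moves`.
      """
      pending = {d: s for d, s in moves.items() if d != s}  # drop no-op moves
      ordered = []
      while pending:
          # A move is ready when no pending move still reads its destination.
          ready = [d for d in pending
                   if all(s != d for s in pending.values())]
          if ready:
              for d in ready:
                  ordered.append((d, pending.pop(d)))
          else:
              # Only cycles remain: save one destination's old value in the
              # scratch location and redirect its readers there.
              d = next(iter(pending))
              ordered.append((scratch, d))
              for k in pending:
                  if pending[k] == d:
                      pending[k] = scratch
      return ordered

For a swap such as ``{"a": "b", "b": "a"}``, the sketch first copies ``a`` to
the scratch location, then emits ``a = b`` and finally ``b = tmp``.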