Tutorial - Writing LLVMC Configuration files
=============================================

LLVMC is a generic compiler driver, designed to be customizable and
extensible. It plays the same role for LLVM as the ``gcc`` program
does for GCC - LLVMC's job is essentially to transform a set of input
files into a set of targets depending on configuration rules and user
options. What makes LLVMC different is that these transformation rules
are completely customizable - in fact, LLVMC knows nothing about the
specifics of transformation (even the command-line options are mostly
not hard-coded) and regards the transformation structure as an
abstract graph. This makes it possible to adapt LLVMC for other
purposes - for example, as a build tool for game resources. This
tutorial describes the basic usage and configuration of LLVMC.

Because LLVMC employs TableGen [1]_ as its configuration language, you
need to be familiar with it to customize LLVMC.

Compiling with LLVMC
--------------------

In general, LLVMC tries to be command-line compatible with ``gcc`` as
much as possible, so most of the familiar options work::

    $ llvmc2 -O3 -Wall hello.cpp
    $ ./a.out
    hello

One nice feature of LLVMC is that you don't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``) - the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file extensions).
If you want to override the language inferred from the extension, for
example to compile a file ending with ".cpp" as C, use the ``-x``
option, just as you would with ``gcc``::

    $ llvmc2 -x c hello.cpp
    $ # hello.cpp is really a C file
    $ ./a.out
    hello

On the other hand, when using LLVMC as a linker to combine several C++
object files, you should provide the ``--linker`` option, since it's
impossible for LLVMC to choose the right linker in that case::

    $ llvmc2 -c hello.cpp
    $ llvmc2 hello.o
    [A lot of link-time errors skipped]
    $ llvmc2 --linker=c++ hello.o
    $ ./a.out
    hello

For further help on command-line LLVMC usage, refer to the
``llvmc2 --help`` output.

Customizing LLVMC: the compilation graph
----------------------------------------

At the time of writing, LLVMC does not support on-the-fly reloading of
configuration, so to customize LLVMC you'll have to edit and recompile
the source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
relevant files are ``Common.td``, ``Tools.td`` and ``Example.td``.

Internally, LLVMC stores information about possible transformations in
the form of a graph. Nodes in this graph represent tools, and edges
between two nodes represent a transformation path. A special "root"
node represents entry points for the transformations. LLVMC also
assigns a weight to each edge (more on that below) to choose between
several alternative edges.

The definition of the compilation graph (see file ``Example.td``) is
just a list of edges::

    def CompilationGraph : CompilationGraph<[
        Edge<root, llvm_gcc_c>,
        Edge<root, llvm_gcc_assembler>,
        ...

        Edge<llvm_gcc_c, llc>,
        Edge<llvm_gcc_cpp, llc>,
        ...

        OptionalEdge<llvm_gcc_c, opt, [(switch_on "opt")]>,
        OptionalEdge<llvm_gcc_cpp, opt, [(switch_on "opt")]>,
        ...

        OptionalEdge<llvm_gcc_assembler, llvm_gcc_cpp_linker,
            [(if_input_languages_contain "c++"),
             (or (parameter_equals "linker", "g++"),
                 (parameter_equals "linker", "c++"))]>,
        ...

        ]>;

As you can see, edges can be either default or optional. Optional
edges carry a list of patterns (or edge properties) which are used to
calculate the edge's weight. Default edges are assigned a weight of 1,
and optional edges get a weight of 0 + 2*N, where N is the number of
successful edge property matches. When passing an input file through
the graph, LLVMC picks the edge with the maximum weight. To avoid
ambiguity, there should be only one default edge between two nodes
(with the exception of the root node, which gets special treatment -
there you are allowed to specify one default edge *per language*).
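The weighting rule can be modelled in a few lines of Python. This is
only an illustrative sketch of the rule as described above, not LLVMC's
actual implementation; the ``Edge`` class, the ``pick_edge`` helper and
the dictionary used to model command-line options are all hypothetical
names introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical model of the parsed command line: option name -> value.
Options = Dict[str, object]
Property = Callable[[Options], bool]


@dataclass
class Edge:
    """One outgoing edge of a node (illustrative, not LLVMC's own type)."""
    target: str
    properties: List[Property] = field(default_factory=list)
    optional: bool = False

    def weight(self, options: Options) -> int:
        if not self.optional:
            return 1  # default edges always weigh 1
        matches = sum(1 for prop in self.properties if prop(options))
        return 0 + 2 * matches  # optional edges weigh 0 + 2*N


def pick_edge(edges: List[Edge], options: Options) -> Edge:
    """Return the outgoing edge with the maximum weight."""
    return max(edges, key=lambda edge: edge.weight(options))


def switch_on(name: str) -> Property:
    """Edge property: true if the switch was given on the command line."""
    return lambda options: bool(options.get(name))


# Outgoing edges of llvm_gcc_c, mirroring the graph listing above.
edges = [
    Edge("llc"),
    Edge("opt", [switch_on("opt")], optional=True),
]
```

With no options set, the optional edge scores 0 and the default edge
to ``llc`` wins with weight 1; once ``opt`` is switched on, the optional
edge scores 2 and is chosen instead.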

* Possible edge properties are:

  - ``switch_on`` - Returns true if a given command-line option is
    provided by the user. Example: ``(switch_on "opt")``. Note that
    you have to define all possible command-line options separately in
    the tool descriptions. See the next section for a discussion of
    the different kinds of command-line options.

  - ``parameter_equals`` - Returns true if a command-line parameter equals
    a given value. Example: ``(parameter_equals "W", "all")``.

  - ``element_in_list`` - Returns true if a command-line parameter list
    includes a given value. Example: ``(element_in_list "l", "pthread")``.

  - ``if_input_languages_contain`` - Returns true if a given input
    language belongs to the current input language set.

  - ``and`` - Edge property combinator. Returns true if all of its
    arguments return true. Example: ``(and (prop1), (prop2), ... (propN))``.
    Nesting is not allowed.

  - ``or`` - Edge property combinator. Returns true if any one of its
    arguments returns true. Example: ``(or (prop1), (prop2), ... (propN))``.
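To make the semantics of these properties concrete, here is a small
Python sketch of an evaluator. It is an assumption-laden model rather
than LLVMC code: properties are written as nested tuples that mimic the
TableGen syntax, and the ``CommandLine`` class and ``evaluate`` function
are hypothetical names. The sketch also does not enforce the rule that
``and`` may not be nested.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class CommandLine:
    """Hypothetical snapshot of what the user passed to the driver."""
    switches: Set[str] = field(default_factory=set)
    parameters: Dict[str, str] = field(default_factory=dict)
    lists: Dict[str, List[str]] = field(default_factory=dict)
    input_languages: Set[str] = field(default_factory=set)


def evaluate(prop: tuple, cl: CommandLine) -> bool:
    """Evaluate one edge property, given as (name, arg1, arg2, ...)."""
    name, *args = prop
    if name == "switch_on":
        return args[0] in cl.switches
    if name == "parameter_equals":
        return cl.parameters.get(args[0]) == args[1]
    if name == "element_in_list":
        return args[1] in cl.lists.get(args[0], [])
    if name == "if_input_languages_contain":
        return args[0] in cl.input_languages
    if name == "and":
        return all(evaluate(sub, cl) for sub in args)
    if name == "or":
        return any(evaluate(sub, cl) for sub in args)
    raise ValueError("unknown edge property: " + name)


# The linker condition from the compilation graph listing.
linker_is_cpp = (
    "or",
    ("parameter_equals", "linker", "g++"),
    ("parameter_equals", "linker", "c++"),
)
```

For a command line modelled as
``CommandLine(parameters={"linker": "c++"})``, evaluating
``linker_is_cpp`` yields true, which corresponds to one successful
match when the edge weight is computed.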

To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
``gsview`` installed for this to work properly.


Writing a tool description
--------------------------

As was said earlier, nodes in the compilation graph represent tools. A
tool definition looks like this (taken from the ``Tools.td`` file)::

    def llvm_gcc_cpp : Tool<[
        (in_language "c++"),
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
        (sink)
        ]>;

This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
``llvm-g++``. As you can see, a tool definition is just a list of
properties; most of them should be self-evident. The ``sink`` property
means that this tool should be passed all command-line options that
aren't handled by the other tools.

The complete list of the currently implemented tool properties follows:

* Possible tool properties:

  - ``in_language`` - input language name.

  - ``out_language`` - output language name.

  - ``output_suffix`` - output file suffix.

  - ``cmd_line`` - the actual command used to run the tool. You can use
    ``$INFILE`` and ``$OUTFILE`` variables, as well as output
    redirection with ``>``.

  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.

  - ``sink`` - all command-line options that are not handled by other
    tools are passed to this tool.
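As a rough illustration of how a ``cmd_line`` template turns into an
actual invocation, the Python sketch below substitutes ``$INFILE`` and
``$OUTFILE`` and splits the result into an argument vector. The
``expand_cmd_line`` helper is a hypothetical name, the file names are
made up, and the sketch deliberately ignores output redirection with
``>``; LLVMC's own expansion logic may differ.

```python
import shlex
from string import Template
from typing import List


def expand_cmd_line(cmd_line: str, infile: str, outfile: str) -> List[str]:
    """Fill in $INFILE and $OUTFILE, then split into argv-style tokens."""
    expanded = Template(cmd_line).substitute(INFILE=infile, OUTFILE=outfile)
    return shlex.split(expanded)


# The cmd_line property of llvm_gcc_cpp from the definition above.
argv = expand_cmd_line(
    "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm",
    infile="hello.cpp",
    outfile="hello.bc",
)
```

Here the output name ``hello.bc`` is chosen to match the tool's
``output_suffix`` property.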

The next tool definition is slightly more complex::

    def llvm_gcc_linker : Tool<[
        (in_language "object-code"),
        (out_language "executable"),
        (output_suffix "out"),
        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
        (join),
        (prefix_list_option "L", (forward), (help "add a directory to link path")),
        (prefix_list_option "l", (forward), (help "search a library when linking")),
        (prefix_list_option "Wl", (unpack_values), (help "pass options to linker"))
        ]>;

This tool has a "join" property, which means that it behaves like a
linker (and, because of that, it should be the last tool in the
toolchain). This tool also defines several command-line options: ``-l``,
``-L`` and ``-Wl``, which have their usual meaning. An option has two
attributes: a name and a (possibly empty) list of properties. All
currently implemented option types and properties are described below:

* Possible option types:

  - ``switch_option`` - a simple boolean switch, for example ``-time``.

  - ``parameter_option`` - option that takes an argument, for example
    ``-std=c99``.

  - ``parameter_list_option`` - same as the above, but more than one
    occurrence of the option is allowed.

  - ``prefix_option`` - same as ``parameter_option``, but the option name
    and parameter value are not separated.

  - ``prefix_list_option`` - same as the above, but more than one
    occurrence of the option is allowed; example: ``-lm -lpthread``.


* Possible option properties:

  - ``append_cmd`` - append a string to the tool invocation command.

  - ``forward`` - forward this option unchanged.

  - ``stop_compilation`` - stop compilation after this phase.

  - ``unpack_values`` - used for splitting and forwarding
    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
    converted to ``-foo=bar -baz`` and appended to the tool invocation
    command.

  - ``help`` - help string associated with this option.

  - ``required`` - this option is obligatory.
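The effect of ``unpack_values`` is easy to show in Python. The
function below is a sketch of the transformation described in the list
above, not LLVMC's code; its name and signature are hypothetical and
simply reuse the property name for readability.

```python
from typing import List


def unpack_values(option_name: str, argument: str) -> List[str]:
    """Split '-<name>,v1,v2,...' into the list ['v1', 'v2', ...]."""
    prefix = "-" + option_name + ","
    if not argument.startswith(prefix):
        raise ValueError("expected an argument starting with " + prefix)
    return argument[len(prefix):].split(",")


# The example from the text: -Wa,-foo=bar,-baz
forwarded = unpack_values("Wa", "-Wa,-foo=bar,-baz")
command_tail = " ".join(forwarded)
```

The resulting values would then be appended to the tool invocation
command, as the property description says.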


Language map
------------

One last thing that you need to modify when adding support for a new
language to LLVMC is the language map, which defines mappings from
file extensions to language names. It is used to choose the proper
toolchain based on the input. The language map definition is located
in the file ``Tools.td`` and looks like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,
         ...
        ]>;
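Conceptually, the language map is a lookup table from suffix to
language name. The Python sketch below models that lookup using only
the two entries shown above (the elided entries are left out). The
``language_for`` helper and its ``forced`` parameter, which stands in
for the ``-x`` option, are hypothetical; the sketch also assumes that
suffixes are compared case-sensitively, which the separate ``"C"`` and
``"c"`` entries suggest.

```python
import os
from typing import Dict, List, Optional

# Only the entries visible in the listing above.
LANGUAGE_MAP: Dict[str, List[str]] = {
    "c++": ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"],
    "c": ["c"],
}

# Invert the map so a suffix can be looked up directly.
SUFFIX_TO_LANGUAGE: Dict[str, str] = {
    suffix: language
    for language, suffixes in LANGUAGE_MAP.items()
    for suffix in suffixes
}


def language_for(path: str, forced: Optional[str] = None) -> Optional[str]:
    """Return the language for a file, or None if the suffix is unknown.

    `forced` models an explicit override such as `-x c`.
    """
    if forced is not None:
        return forced
    suffix = os.path.splitext(path)[1].lstrip(".")
    return SUFFIX_TO_LANGUAGE.get(suffix)
```

Under these assumptions ``hello.cpp`` and ``hello.C`` both map to
``c++``, ``hello.c`` maps to ``c``, and passing ``forced="c"`` for
``hello.cpp`` mirrors the ``-x c`` example from the first section.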


References
==========

.. [1] TableGen Fundamentals
       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html