Customizing LLVMC: Reference Manual
===================================

LLVMC is a generic compiler driver, designed to be customizable and
extensible. It plays the same role for LLVM as the ``gcc`` program
does for GCC - LLVMC's job is essentially to transform a set of input
files into a set of targets depending on configuration rules and user
options. What makes LLVMC different is that these transformation rules
are completely customizable - in fact, LLVMC knows nothing about the
specifics of transformation (even the command-line options are mostly
not hard-coded) and regards the transformation structure as an
abstract graph. This makes it possible to adapt LLVMC for other
purposes - for example, as a build tool for game resources.

Because LLVMC employs TableGen [1]_ as its configuration language, you
need to be familiar with it to customize LLVMC.

Compiling with LLVMC
--------------------

LLVMC tries hard to be as compatible with ``gcc`` as possible,
although there are some small differences. Most of the time, however,
you shouldn't be able to notice them::

    $ # This works as expected:
    $ llvmc2 -O3 -Wall hello.cpp
    $ ./a.out
    hello

One nice feature of LLVMC is that one doesn't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``) - the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file
extensions). If you want to force files ending with ".c" to compile as
C++, use the ``-x`` option, just as you would with ``gcc``::

    $ # hello.c is really a C++ file
    $ llvmc2 -x c++ hello.c
    $ ./a.out
    hello

On the other hand, when using LLVMC as a linker to combine several C++
object files, you should provide the ``--linker`` option, since it's
impossible for LLVMC to choose the right linker in that case::

    $ llvmc2 -c hello.cpp
    $ llvmc2 hello.o
    [A lot of link-time errors skipped]
    $ llvmc2 --linker=c++ hello.o
    $ ./a.out
    hello


Customizing LLVMC: the compilation graph
----------------------------------------

At the time of writing, LLVMC does not support on-the-fly reloading of
configuration, so to customize LLVMC you'll have to recompile the
source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
default configuration files are ``Common.td`` (contains common
definitions, don't forget to ``include`` it in your configuration
files), ``Tools.td`` (tool descriptions) and ``Graph.td`` (compilation
graph definition).

To compile LLVMC with your own configuration file (say, ``MyGraph.td``),
run ``make`` like this::

    $ cd $LLVM_DIR/tools/llvmc2
    $ make GRAPH=MyGraph.td TOOLNAME=my_llvmc

This will build an executable named ``my_llvmc``. There are also
several sample configuration files in the ``llvmc2/examples``
subdirectory that should help to get you started.

Internally, LLVMC stores information about possible source
transformations in the form of a graph. Nodes in this graph represent
tools, and edges between two nodes represent a transformation path. A
special "root" node is used to mark entry points for the
transformations. LLVMC also assigns a weight to each edge (more on
this later) to choose between several alternative edges.

The definition of the compilation graph (see file ``Graph.td``) is
just a list of edges::

    def CompilationGraph : CompilationGraph<[
        Edge<root, llvm_gcc_c>,
        Edge<root, llvm_gcc_assembler>,
        ...

        Edge<llvm_gcc_c, llc>,
        Edge<llvm_gcc_cpp, llc>,
        ...

        OptionalEdge<llvm_gcc_c, opt, [(switch_on "opt")]>,
        OptionalEdge<llvm_gcc_cpp, opt, [(switch_on "opt")]>,
        ...

        OptionalEdge<llvm_gcc_assembler, llvm_gcc_cpp_linker,
            (case (input_languages_contain "c++"), (inc_weight),
                  (or (parameter_equals "linker", "g++"),
                      (parameter_equals "linker", "c++")), (inc_weight))>,
        ...

        ]>;

As you can see, the edges can be either default or optional, where
optional edges are distinguished by a ``case`` expression used to
calculate the edge's weight.

The default edges are assigned a weight of 1, and optional edges get a
weight of 0 + 2*N where N is the number of tests that evaluated to
true in the ``case`` expression. It is also possible to provide an
integer parameter to ``inc_weight`` and ``dec_weight`` - in this case,
the weight is increased (or decreased) by the provided value instead
of the default 2.

When passing an input file through the graph, LLVMC picks the edge
with the maximum weight. To avoid ambiguity, there should be only one
default edge between two nodes (with the exception of the root node,
which gets a special treatment - there you are allowed to specify one
default edge *per language*).

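The weighting rule can be sketched in a few lines of C++. This is only
an illustration of the arithmetic described above, not LLVMC's actual
implementation: the ``EdgeSketch`` type and both function names are
hypothetical, the sketch assumes every test in one ``case`` uses the
same increment, and it breaks ties by taking the earliest edge, which
the text does not specify.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of one outgoing edge; not LLVMC's real data structure.
struct EdgeSketch {
    bool is_default;   // default edges always weigh 1
    int tests_passed;  // number of case tests that evaluated to true
    int increment;     // value given to inc_weight (2 when omitted)
};

// Weight as described in the text: 1 for a default edge, otherwise
// 0 + increment * (number of tests that passed).
int edge_weight(const EdgeSketch& e) {
    return e.is_default ? 1 : e.increment * e.tests_passed;
}

// Index of the edge with the maximum weight, or -1 if there are no edges.
// Ties go to the earliest edge; this tie-breaking rule is an assumption.
int pick_edge(const std::vector<EdgeSketch>& edges) {
    int best = -1;
    int best_weight = 0;
    for (std::size_t i = 0; i < edges.size(); ++i) {
        int w = edge_weight(edges[i]);
        if (best == -1 || w > best_weight) {
            best = static_cast<int>(i);
            best_weight = w;
        }
    }
    return best;
}
```

With one default edge and one optional edge whose single test passed,
the optional edge (weight 2) beats the default edge (weight 1); if the
test fails, the optional edge weighs 0 and the default edge is taken.
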
To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
``gsview`` installed for this to work properly.


The 'case' expression
---------------------

The ``case`` construct can be used to calculate weights of the optional
edges and to choose between several alternative command line strings
in the ``cmd_line`` tool property. It is designed after the
similarly-named construct in functional languages and takes the form
``(case (test_1), statement_1, (test_2), statement_2, ... (test_N),
statement_N)``. The statements are evaluated only if the corresponding
tests evaluate to true.

Examples::

    // Increases edge weight by 5 if "-A" is provided on the
    // command-line, and by 5 more if "-B" is also provided.
    (case
        (switch_on "A"), (inc_weight 5),
        (switch_on "B"), (inc_weight 5))

    // Evaluates to "cmdline1" if option "-A" is provided on the
    // command line, otherwise to "cmdline2"
    (case
        (switch_on "A"), ("cmdline1"),
        (default), ("cmdline2"))


* Possible tests are:

  - ``switch_on`` - Returns true if a given command-line option is
    provided by the user. Example: ``(switch_on "opt")``. Note that
    you have to define all possible command-line options separately in
    the tool descriptions. See the next section for the discussion of
    different kinds of command-line options.

  - ``parameter_equals`` - Returns true if a command-line parameter equals
    a given value. Example: ``(parameter_equals "W", "all")``.

  - ``element_in_list`` - Returns true if a command-line parameter list
    includes a given value. Example: ``(element_in_list "l", "pthread")``.

  - ``input_languages_contain`` - Returns true if a given language
    belongs to the current input language set. Example:
    ``(input_languages_contain "c++")``.

  - ``default`` - Always evaluates to true. Should always be the last
    test in the ``case`` expression.

  - ``and`` - A standard logical combinator that returns true iff all
    of its arguments return true. Used like this: ``(and (test1),
    (test2), ... (testN))``. Nesting of ``and`` and ``or`` is allowed,
    but not encouraged.

  - ``or`` - Another logical combinator that returns true if at least
    one of its arguments returns true. Example: ``(or (test1),
    (test2), ... (testN))``.


Writing a tool description
--------------------------

As mentioned earlier, nodes in the compilation graph represent tools,
which are described separately. A tool definition looks like this
(taken from the ``Tools.td`` file)::

    def llvm_gcc_cpp : Tool<[
        (in_language "c++"),
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
        (sink)
        ]>;

This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
``llvm-g++``. As you can see, a tool definition is just a list of
properties; most of them should be self-explanatory. The ``sink``
property means that this tool should be passed all command-line
options that lack explicit descriptions.

The complete list of the currently implemented tool properties follows:

* Possible tool properties:

  - ``in_language`` - input language name.

  - ``out_language`` - output language name.

  - ``output_suffix`` - output file suffix.

  - ``cmd_line`` - the actual command used to run the tool. You can
    use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
    with ``>``, hook invocations (``$CALL``), environment variables
    (via ``$ENV``) and the ``case`` construct (more on this below).

  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.

  - ``sink`` - all command-line options that are not handled by other
    tools are passed to this tool.

The next tool definition is slightly more complex::

    def llvm_gcc_linker : Tool<[
        (in_language "object-code"),
        (out_language "executable"),
        (output_suffix "out"),
        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
        (join),
        (prefix_list_option "L", (forward),
            (help "add a directory to link path")),
        (prefix_list_option "l", (forward),
            (help "search a library when linking")),
        (prefix_list_option "Wl", (unpack_values),
            (help "pass options to linker"))
        ]>;

This tool has a "join" property, which means that it behaves like a
linker. This tool also defines several command-line options: ``-l``,
``-L`` and ``-Wl`` which have their usual meaning. An option has two
attributes: a name and a (possibly empty) list of properties. All
currently implemented option types and properties are described below:

* Possible option types:

  - ``switch_option`` - a simple boolean switch, for example ``-time``.

  - ``parameter_option`` - option that takes an argument, for example
    ``-std=c99``.

  - ``parameter_list_option`` - same as the above, but more than one
    occurrence of the option is allowed.

  - ``prefix_option`` - same as ``parameter_option``, but the option name
    and parameter value are not separated.

  - ``prefix_list_option`` - same as the above, but more than one
    occurrence of the option is allowed; example: ``-lm -lpthread``.


* Possible option properties:

  - ``append_cmd`` - append a string to the tool invocation command.

  - ``forward`` - forward this option unchanged.

  - ``output_suffix`` - modify the output suffix of this
    tool. Example: ``(switch_option "E", (output_suffix "i"))``.

  - ``stop_compilation`` - stop compilation after this phase.

  - ``unpack_values`` - used for splitting and forwarding
    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
    converted to ``-foo=bar -baz`` and appended to the tool invocation
    command.

  - ``help`` - help string associated with this option. Used for
    ``--help`` output.

  - ``required`` - this option is obligatory.


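To make the ``unpack_values`` conversion concrete, here is a rough C++
sketch of the string transformation it describes. The function name
``unpack_values_sketch`` is hypothetical and this is not how LLVMC
implements the property; it only reproduces the documented behaviour
on a single option string.

```cpp
#include <string>

// Drops the option name before the first comma, then joins the
// remaining comma-separated values with single spaces.
// "-Wa,-foo=bar,-baz" becomes "-foo=bar -baz".
std::string unpack_values_sketch(const std::string& option) {
    std::string result;
    std::string::size_type start = option.find(',');
    while (start != std::string::npos) {
        std::string::size_type end = option.find(',', start + 1);
        std::string piece = (end == std::string::npos)
            ? option.substr(start + 1)
            : option.substr(start + 1, end - start - 1);
        if (!piece.empty()) {
            if (!result.empty())
                result += ' ';
            result += piece;
        }
        start = end;
    }
    return result;
}
```

An option with no comma-separated values (for example a bare ``-Wa``)
yields an empty string in this sketch, so nothing would be appended to
the command line.
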
Hooks and environment variables
-------------------------------

Normally, LLVMC executes programs from the system ``PATH``. Sometimes,
this is not sufficient: for example, we may want to specify tool names
in the configuration file. This can be achieved via the mechanism of
hooks - to compile LLVMC with your hooks, just drop a .cpp file into
the ``tools/llvmc2`` directory. Hooks should live in the ``hooks``
namespace and have the signature
``std::string hooks::MyHookName(void)``. They can be used from the
``cmd_line`` tool property::

    (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")

It is also possible to use environment variables in the same manner::

    (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")

To change the command line string based on user-provided options, use
the ``case`` expression (which we have already seen)::

    (cmd_line
        (case
            (switch_on "E"),
                "llvm-g++ -E -x c $INFILE -o $OUTFILE",
            (default),
                "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))


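For reference, a hook file matching the ``$CALL(MyHook)`` and
``$CALL(AnotherHook)`` invocations in this section might look like the
following sketch. Only the ``hooks`` namespace and the
``std::string ...(void)`` signature come from the description in this
section; the ``MY_TOOLS_DIR`` variable, the fallback path and the
returned file name are made up for illustration.

```cpp
#include <cstdlib>
#include <string>

namespace hooks {

// Hypothetical hook: returns the directory that holds the tools.
// MY_TOOLS_DIR and the fallback path are illustrative assumptions.
std::string MyHook(void) {
    const char* dir = std::getenv("MY_TOOLS_DIR");
    return (dir && *dir) ? std::string(dir) : std::string("/usr/local/bin");
}

// Hypothetical hook: supplies the output file name.
std::string AnotherHook(void) {
    return "a.out";
}

} // namespace hooks
```

Each hook returns a plain string, which is substituted where the
corresponding ``$CALL(...)`` appears in the command line.
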
Language map
------------

One last thing that you will need to modify when adding support for a
new language to LLVMC is the language map, which defines mappings from
file extensions to language names. It is used to choose the proper
toolchain(s) for a given input file set. The language map definition is
located in the file ``Tools.td`` and looks like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,
         ...
        ]>;


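Conceptually, the language map is a lookup from file suffix to language
name. The C++ sketch below illustrates that lookup using only the two
entries shown above; the function names are hypothetical, the
case-sensitive matching is an assumption, and the code LLVMC generates
from the TableGen definition may differ.

```cpp
#include <map>
#include <string>

// Builds a suffix -> language table from the two entries shown above.
std::map<std::string, std::string> make_language_map_sketch() {
    std::map<std::string, std::string> m;
    const char* cxx[] = {"cc", "cp", "cxx", "cpp", "CPP", "c++", "C"};
    for (const char* suffix : cxx)
        m[suffix] = "c++";
    m["c"] = "c";
    return m;
}

// Returns the language for a file name, or "" if the suffix is unknown
// or missing. Matching is case-sensitive, so "C" and "c" differ.
std::string language_for_file_sketch(const std::string& file) {
    static const std::map<std::string, std::string> table =
        make_language_map_sketch();
    std::string::size_type dot = file.rfind('.');
    if (dot == std::string::npos)
        return "";
    std::map<std::string, std::string>::const_iterator it =
        table.find(file.substr(dot + 1));
    return it == table.end() ? "" : it->second;
}
```

Here ``hello.cpp`` and ``hello.C`` both map to ``c++``, while
``hello.c`` maps to ``c``.
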
References
==========

.. [1] TableGen Fundamentals
       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html