===================================
Customizing LLVMC: Reference Manual
===================================

LLVMC is a generic compiler driver, designed to be customizable and
extensible. It plays the same role for LLVM as the ``gcc`` program
does for GCC - LLVMC's job is essentially to transform a set of input
files into a set of targets depending on configuration rules and user
options. What makes LLVMC different is that these transformation rules
are completely customizable - in fact, LLVMC knows nothing about the
specifics of transformation (even the command-line options are mostly
not hard-coded) and regards the transformation structure as an
abstract graph. This makes it possible to adapt LLVMC for other
purposes - for example, as a build tool for game resources.

Because LLVMC employs TableGen [1]_ as its configuration language, you
need to be familiar with it to customize LLVMC.


.. contents::


Compiling with LLVMC
====================

LLVMC tries hard to be as compatible with ``gcc`` as possible,
although there are some small differences. Most of the time, however,
you shouldn't be able to notice them::

    $ # This works as expected:
    $ llvmc2 -O3 -Wall hello.cpp
    $ ./a.out
    hello

One nice feature of LLVMC is that one doesn't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``) - the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file
extensions). If you want to override the language deduced from the
extension (for example, to compile a file ending with ".cpp" as C),
use the ``-x`` option, just like you would do it with ``gcc``::

    $ llvmc2 -x c hello.cpp
    $ # hello.cpp is really a C file
    $ ./a.out
    hello

On the other hand, when using LLVMC as a linker to combine several C++
object files you should provide the ``--linker`` option since it's
impossible for LLVMC to choose the right linker in that case::

    $ llvmc2 -c hello.cpp
    $ llvmc2 hello.o
    [A lot of link-time errors skipped]
    $ llvmc2 --linker=c++ hello.o
    $ ./a.out
    hello

Predefined options
==================

LLVMC has some built-in options that can't be overridden in the
configuration files:

* ``-o FILE`` - Output file name.

* ``-x LANGUAGE`` - Specify the language of the following input files
  until the next ``-x`` option.

* ``-v`` - Enable verbose mode, i.e. print out all executed commands.

* ``--view-graph`` - Show a graphical representation of the compilation
  graph. Requires that you have ``dot`` and ``gv`` commands
  installed. Hidden option, useful for debugging.

* ``--write-graph`` - Write a ``compilation-graph.dot`` file in the
  current directory with the compilation graph description in the
  Graphviz format. Hidden option, useful for debugging.

* ``--save-temps`` - Write temporary files to the current directory
  and do not delete them on exit. Hidden option, useful for debugging.

* ``--help``, ``--help-hidden``, ``--version`` - These options have
  their standard meaning.


Customizing LLVMC: the compilation graph
========================================

At the time of writing, LLVMC does not support on-the-fly reloading of
configuration, so to customize LLVMC you'll have to recompile the
source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
default configuration files are ``Common.td`` (contains common
definitions, don't forget to ``include`` it in your configuration
files), ``Tools.td`` (tool descriptions) and ``Graph.td`` (compilation
graph definition).

To compile LLVMC with your own configuration file (say, ``MyGraph.td``),
run ``make`` like this::

    $ cd $LLVM_DIR/tools/llvmc2
    $ make GRAPH=MyGraph.td TOOLNAME=my_llvmc

This will build an executable named ``my_llvmc``. There are also
several sample configuration files in the ``llvmc2/examples``
subdirectory that should help to get you started.

Internally, LLVMC stores information about possible source
transformations in the form of a graph. Nodes in this graph represent
tools, and edges between two nodes represent a transformation path. A
special "root" node is used to mark entry points for the
transformations. LLVMC also assigns a weight to each edge (more on
this later) to choose between several alternative edges.

The definition of the compilation graph (see file ``Graph.td``) is
just a list of edges::

    def CompilationGraph : CompilationGraph<[
        Edge<root, llvm_gcc_c>,
        Edge<root, llvm_gcc_assembler>,
        ...

        Edge<llvm_gcc_c, llc>,
        Edge<llvm_gcc_cpp, llc>,
        ...

        OptionalEdge<llvm_gcc_c, opt, [(switch_on "opt")]>,
        OptionalEdge<llvm_gcc_cpp, opt, [(switch_on "opt")]>,
        ...

        OptionalEdge<llvm_gcc_assembler, llvm_gcc_cpp_linker,
            (case (input_languages_contain "c++"), (inc_weight),
                  (or (parameter_equals "linker", "g++"),
                      (parameter_equals "linker", "c++")), (inc_weight))>,
        ...

        ]>;

As you can see, the edges can be either default or optional, where
optional edges are differentiated by sporting a ``case`` expression
used to calculate the edge's weight.

The default edges are assigned a weight of 1, and optional edges get a
weight of 0 + 2*N where N is the number of tests that evaluated to
true in the ``case`` expression. It is also possible to provide an
integer parameter to ``inc_weight`` and ``dec_weight`` - in this case,
the weight is increased (or decreased) by the provided value instead
of the default 2.

When passing an input file through the graph, LLVMC picks the edge
with the maximum weight. To avoid ambiguity, there should be only one
default edge between two nodes (with the exception of the root node,
which gets a special treatment - there you are allowed to specify one
default edge *per language*).
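As a worked illustration of the weight rules above, consider the
following sketch. The tool names ``frontend``, ``backend_a`` and
``backend_b`` and the switches ``fast`` and ``tuned`` are hypothetical;
they are assumed to be defined elsewhere in the configuration and are
not part of the default ``Tools.td``::

    // Hypothetical fragment: all tool and option names are made up.
    def CompilationGraph : CompilationGraph<[
        Edge<root, frontend>,

        // Default edge: its weight is always 1.
        Edge<frontend, backend_a>,

        // Optional edge: its weight starts at 0.
        //   no switches   -> 0          (backend_a is picked, 1 > 0)
        //   -fast         -> 0 + 2 = 2  (backend_b is picked, 2 > 1)
        //   -fast -tuned  -> 2 + 5 = 7  (backend_b is picked, 7 > 1)
        OptionalEdge<frontend, backend_b,
            (case (switch_on "fast"), (inc_weight),
                  (switch_on "tuned"), (inc_weight 5))>
        ]>;

With no switches given, the optional edge stays at weight 0 and loses
to the default edge, so an optional edge only takes effect once enough
of its tests succeed to exceed the default weight of 1.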

To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
``gsview`` installed for this to work properly.


Writing a tool description
==========================

As was said earlier, nodes in the compilation graph represent tools,
which are described separately. A tool definition looks like this
(taken from the ``Tools.td`` file)::

    def llvm_gcc_cpp : Tool<[
        (in_language "c++"),
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
        (sink)
        ]>;

This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
``llvm-g++``. As you can see, a tool definition is just a list of
properties; most of them should be self-explanatory. The ``sink``
property means that this tool should be passed all command-line
options that lack explicit descriptions.

The complete list of the currently implemented tool properties follows:

* Possible tool properties:

  - ``in_language`` - input language name.

  - ``out_language`` - output language name.

  - ``output_suffix`` - output file suffix.

  - ``cmd_line`` - the actual command used to run the tool. You can
    use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
    with ``>``, hook invocations (``$CALL``), environment variables
    (via ``$ENV``) and the ``case`` construct (more on this below).

  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.

  - ``sink`` - all command-line options that are not handled by other
    tools are passed to this tool.

The next tool definition is slightly more complex::

    def llvm_gcc_linker : Tool<[
        (in_language "object-code"),
        (out_language "executable"),
        (output_suffix "out"),
        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
        (join),
        (prefix_list_option "L", (forward),
            (help "add a directory to link path")),
        (prefix_list_option "l", (forward),
            (help "search a library when linking")),
        (prefix_list_option "Wl", (unpack_values),
            (help "pass options to linker"))
        ]>;

This tool has a "join" property, which means that it behaves like a
linker. This tool also defines several command-line options: ``-l``,
``-L`` and ``-Wl``, which have their usual meaning. An option has two
attributes: a name and a (possibly empty) list of properties. All
currently implemented option types and properties are described below:

* Possible option types:

  - ``switch_option`` - a simple boolean switch, for example ``-time``.

  - ``parameter_option`` - option that takes an argument, for example
    ``-std=c99``.

  - ``parameter_list_option`` - same as the above, but more than one
    occurrence of the option is allowed.

  - ``prefix_option`` - same as ``parameter_option``, but the option name
    and parameter value are not separated.

  - ``prefix_list_option`` - same as the above, but more than one
    occurrence of the option is allowed; example: ``-lm -lpthread``.

  - ``alias_option`` - a special option type for creating
    aliases. Unlike other option types, aliases are not allowed to
    have any properties besides the aliased option name. Usage
    example: ``(alias_option "preprocess", "E")``


* Possible option properties:

  - ``append_cmd`` - append a string to the tool invocation command.

  - ``forward`` - forward this option unchanged.

  - ``output_suffix`` - modify the output suffix of this
    tool. Example: ``(switch_option "E", (output_suffix "i"))``.

  - ``stop_compilation`` - stop compilation after this phase.

  - ``unpack_values`` - used for splitting and forwarding
    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
    converted to ``-foo=bar -baz`` and appended to the tool invocation
    command.

  - ``help`` - help string associated with this option. Used for
    ``--help`` output.

  - ``required`` - this option is obligatory.
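To show how the option types and properties above fit together, here
is a sketch of a tool description. The tool name ``my_cc``, the
command ``my-cc`` and the particular option names are hypothetical;
only the option types and properties themselves are taken from the
lists above, and the exact argument form of ``append_cmd`` is an
assumption::

    // Hypothetical tool: "my-cc" is a made-up compiler command.
    def my_cc : Tool<[
        (in_language "c"),
        (out_language "object-code"),
        (output_suffix "o"),
        (cmd_line "my-cc -c $INFILE -o $OUTFILE"),
        // -fast: a boolean switch that appends a flag to the command.
        (switch_option "fast", (append_cmd "-O3"),
            (help "enable aggressive optimization")),
        // -std=c99: takes an argument, forwarded to my-cc unchanged.
        (parameter_option "std", (forward),
            (help "language standard")),
        // -Ifoo -Ibar: name and value are not separated, may repeat.
        (prefix_list_option "I", (forward),
            (help "add a directory to include path")),
        // -Wa,-foo=bar,-baz: my-cc receives "-foo=bar -baz".
        (prefix_list_option "Wa", (unpack_values),
            (help "pass options to assembler"))
        ]>;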


Option list - specifying all options in a single place
======================================================

It can be handy to have all information about options gathered in a
single place to provide an overview. This can be achieved by using a
so-called ``OptionList``::

    def Options : OptionList<[
        (switch_option "E", (help "Help string")),
        (alias_option "quiet", "q")
        ...
        ]>;

``OptionList`` is also a good place to specify option aliases.

Tool-specific option properties like ``append_cmd`` have (obviously)
no meaning in the context of ``OptionList``, so the only properties
allowed there are ``help`` and ``required``.

Option lists are used at the file scope. See file
``examples/Clang.td`` for an example of ``OptionList`` usage.

Using hooks and environment variables in the ``cmd_line`` property
==================================================================

Normally, LLVMC executes programs from the system ``PATH``. Sometimes,
this is not sufficient: for example, we may want to specify tool names
in the configuration file. This can be achieved via the mechanism of
hooks - to compile LLVMC with your hooks, just drop a ``.cpp`` file into
the ``tools/llvmc2`` directory. Hooks should live in the ``hooks``
namespace and have the signature
``std::string hooks::MyHookName (void)``. They can be used from the
``cmd_line`` tool property::

    (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")

It is also possible to use environment variables in the same manner::

    (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")

To change the command line string based on user-provided options, use
the ``case`` expression (documented below)::

    (cmd_line
      (case
        (switch_on "E"),
           "llvm-g++ -E -x c $INFILE -o $OUTFILE",
        (default),
           "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))

Conditional evaluation: the ``case`` expression
===============================================

The ``case`` construct can be used to calculate weights of the optional
edges and to choose between several alternative command line strings
in the ``cmd_line`` tool property. It is designed after the
similarly-named construct in functional languages and takes the form
``(case (test_1), statement_1, (test_2), statement_2, ... (test_N),
statement_N)``. The statements are evaluated only if the corresponding
tests evaluate to true.

Examples::

    // Increases edge weight by 5 if "-A" is provided on the
    // command-line, and by 5 more if "-B" is also provided.
    (case
        (switch_on "A"), (inc_weight 5),
        (switch_on "B"), (inc_weight 5))

    // Evaluates to "cmdline1" if option "-A" is provided on the
    // command line; otherwise to "cmdline2" if "-B" is provided,
    // and to "cmdline3" if neither is.
    (case
        (switch_on "A"), "cmdline1",
        (switch_on "B"), "cmdline2",
        (default), "cmdline3")

Note the slight difference in ``case`` expression handling in contexts
of edge weights and command line specification. For edge weights, every
test is evaluated and the statements of all tests that hold are
applied. For command lines, only the first test that holds is used: in
the second example the value of the ``"B"`` switch is never checked
when switch ``"A"`` is enabled, and the whole expression always
evaluates to ``"cmdline1"`` in that case.

Case expressions can also be nested, i.e. the following is legal::

    (case (switch_on "E"), (case (switch_on "o"), ..., (default), ...),
          (default), ...)

You should, however, try to avoid doing that because it hurts
readability. It is usually better to split tool descriptions and/or
use TableGen inheritance instead.

* Possible tests are:

  - ``switch_on`` - Returns true if a given command-line option is
    provided by the user. Example: ``(switch_on "opt")``. Note that
    you have to define all possible command-line options separately in
    the tool descriptions. See the section on writing a tool
    description for the discussion of different kinds of command-line
    options.

  - ``parameter_equals`` - Returns true if a command-line parameter equals
    a given value. Example: ``(parameter_equals "W", "all")``.

  - ``element_in_list`` - Returns true if a command-line parameter list
    includes a given value. Example: ``(element_in_list "l", "pthread")``.

  - ``input_languages_contain`` - Returns true if a given language
    belongs to the current input language set. Example:
    ``(input_languages_contain "c++")``.

  - ``in_language`` - Evaluates to true if the language of the input
    file equals the argument. Valid only when using a ``case``
    expression in a ``cmd_line`` tool property. Example:
    ``(in_language "c++")``.

  - ``not_empty`` - Returns true if a given option (which should be
    either a parameter or a parameter list) is set by the
    user. Example: ``(not_empty "o")``.

  - ``default`` - Always evaluates to true. Should always be the last
    test in the ``case`` expression.

  - ``and`` - A standard logical combinator that returns true iff all
    of its arguments return true. Used like this: ``(and (test1),
    (test2), ... (testN))``. Nesting of ``and`` and ``or`` is allowed,
    but not encouraged.

  - ``or`` - Another logical combinator that returns true iff at least
    one of its arguments returns true. Example: ``(or (test1),
    (test2), ... (testN))``.
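The following sketch combines several of the tests above in a single
``cmd_line`` property. The command ``my-cc`` is hypothetical, and the
options ``E`` and ``std`` are assumed to be declared in the same tool
description as a switch and a parameter option respectively::

    // Hypothetical fragment; in a cmd_line context only the first
    // test that holds is used, so the order of the tests matters.
    (cmd_line
      (case
        // -E together with an explicit -o FILE
        (and (switch_on "E"), (not_empty "o")),
            "my-cc -E $INFILE -o $OUTFILE",
        // -E alone
        (switch_on "E"),
            "my-cc -E $INFILE",
        // -std=c89 or -std=c99
        (or (parameter_equals "std", "c89"),
            (parameter_equals "std", "c99")),
            "my-cc -pedantic -c $INFILE -o $OUTFILE",
        (default),
            "my-cc -c $INFILE -o $OUTFILE"))

The more specific ``and`` test is listed before the plain
``(switch_on "E")`` test; in the opposite order it would never be
reached.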


Language map
============

One last thing that you will need to modify when adding support for a
new language to LLVMC is the language map, which defines mappings from
file extensions to language names. It is used to choose the proper
toolchain(s) for a given input file set. The language map definition is
located in the file ``Tools.td`` and looks like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,
         ...
        ]>;
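Putting the pieces together, adding a new language would involve a
language map entry, a tool description and at least one edge from the
root node. The following is only a sketch: the language name ``foo``,
the suffix ``foo``, the tool ``foo_compiler`` and the command
``foo-cc`` are hypothetical, and the edge into ``llc`` assumes that the
new tool emits output that ``llc`` accepts::

    // In Tools.td: map the made-up ".foo" extension to language "foo".
    def LanguageMap : LanguageMap<
        [LangToSuffixes<"foo", ["foo"]>,
         ...
        ]>;

    // In Tools.td: describe the (hypothetical) compiler for "foo".
    def foo_compiler : Tool<[
        (in_language "foo"),
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "foo-cc $INFILE -o $OUTFILE")
        ]>;

    // In Graph.td: make the new tool reachable from the root node.
    def CompilationGraph : CompilationGraph<[
        Edge<root, foo_compiler>,
        Edge<foo_compiler, llc>,
        ...
        ]>;

As described earlier, LLVMC has to be rebuilt after these changes for
them to take effect.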


References
==========

.. [1] TableGen Fundamentals
       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html