===================================
Customizing LLVMC: Reference Manual
===================================

LLVMC is a generic compiler driver, designed to be customizable and
extensible. It plays the same role for LLVM as the ``gcc`` program
does for GCC - LLVMC's job is essentially to transform a set of input
files into a set of targets depending on configuration rules and user
options. What makes LLVMC different is that these transformation rules
are completely customizable - in fact, LLVMC knows nothing about the
specifics of transformation (even the command-line options are mostly
not hard-coded) and regards the transformation structure as an
abstract graph. This makes it possible to adapt LLVMC for other
purposes - for example, as a build tool for game resources.

Because LLVMC employs TableGen [1]_ as its configuration language, you
need to be familiar with it to customize LLVMC.


.. contents::


Compiling with LLVMC
====================

LLVMC tries hard to be as compatible with ``gcc`` as possible,
although there are some small differences. Most of the time, however,
you shouldn't be able to notice them::

    $ # This works as expected:
    $ llvmc2 -O3 -Wall hello.cpp
    $ ./a.out
    hello

One nice feature of LLVMC is that one doesn't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``) - the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file
extensions). If you want to force files ending with ".c" to compile as
C++, use the ``-x`` option, just as you would with ``gcc``::

    $ # hello.c is really a C++ file
    $ llvmc2 -x c++ hello.c
    $ ./a.out
    hello

On the other hand, when using LLVMC as a linker to combine several C++
object files you should provide the ``--linker`` option since it's
impossible for LLVMC to choose the right linker in that case::

    $ llvmc2 -c hello.cpp
    $ llvmc2 hello.o
    [A lot of link-time errors skipped]
    $ llvmc2 --linker=c++ hello.o
    $ ./a.out
    hello

Predefined options
==================

LLVMC has some built-in options that can't be overridden in the
configuration files:

* ``-o FILE`` - Output file name.

* ``-x LANGUAGE`` - Specify the language of the following input files
  until the next ``-x`` option.

* ``-v`` - Enable verbose mode, i.e. print out all executed commands.

* ``--view-graph`` - Show a graphical representation of the compilation
  graph. Requires that you have ``dot`` and ``gv`` commands
  installed. Hidden option, useful for debugging.

* ``--write-graph`` - Write a ``compilation-graph.dot`` file in the
  current directory with the compilation graph description in the
  Graphviz format. Hidden option, useful for debugging.

* ``--save-temps`` - Write temporary files to the current directory
  and do not delete them on exit. Hidden option, useful for debugging.

* ``--help``, ``--help-hidden``, ``--version`` - These options have
  their standard meaning.


Customizing LLVMC: the compilation graph
========================================

At the time of writing LLVMC does not support on-the-fly reloading of
configuration, so to customize LLVMC you'll have to recompile the
source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
default configuration files are ``Common.td`` (contains common
definitions, don't forget to ``include`` it in your configuration
files), ``Tools.td`` (tool descriptions) and ``Graph.td`` (compilation
graph definition).

To compile LLVMC with your own configuration file (say, ``MyGraph.td``),
run ``make`` like this::

    $ cd $LLVM_DIR/tools/llvmc2
    $ make GRAPH=MyGraph.td TOOLNAME=my_llvmc

This will build an executable named ``my_llvmc``. There are also
several sample configuration files in the ``llvmc2/examples``
subdirectory that should help to get you started.

Internally, LLVMC stores information about possible source
transformations in the form of a graph. Nodes in this graph represent
tools, and edges between two nodes represent a transformation path. A
special "root" node is used to mark entry points for the
transformations. LLVMC also assigns a weight to each edge (more on
this later) to choose between several alternative edges.

The definition of the compilation graph (see file ``Graph.td``) is
just a list of edges::

    def CompilationGraph : CompilationGraph<[
        Edge<root, llvm_gcc_c>,
        Edge<root, llvm_gcc_assembler>,
        ...

        Edge<llvm_gcc_c, llc>,
        Edge<llvm_gcc_cpp, llc>,
        ...

        OptionalEdge<llvm_gcc_c, opt, [(switch_on "opt")]>,
        OptionalEdge<llvm_gcc_cpp, opt, [(switch_on "opt")]>,
        ...

        OptionalEdge<llvm_gcc_assembler, llvm_gcc_cpp_linker,
            (case (input_languages_contain "c++"), (inc_weight),
                  (or (parameter_equals "linker", "g++"),
                      (parameter_equals "linker", "c++")), (inc_weight))>,
        ...

        ]>;

As you can see, the edges can be either default or optional, where
optional edges are distinguished by a ``case`` expression that is
used to calculate the edge's weight.

The default edges are assigned a weight of 1, and optional edges get a
weight of 0 + 2*N where N is the number of tests that evaluated to
true in the ``case`` expression. It is also possible to provide an
integer parameter to ``inc_weight`` and ``dec_weight`` - in this case,
the weight is increased (or decreased) by the provided value instead
of the default 2.
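
As a rough sketch of how these weights work out, consider the two
edges below. The value 5 is chosen only for illustration, and the edge
syntax simply follows the ``Graph.td`` excerpt above rather than any
shipped configuration::

    // Hypothetical optional edge: its weight stays at 0 unless "-opt"
    // is given, in which case the one true test raises it to 0 + 5 = 5.
    OptionalEdge<llvm_gcc_c, opt,
        (case (switch_on "opt"), (inc_weight 5))>,

    // A default edge leaving the same node keeps its fixed weight of 1.
    Edge<llvm_gcc_c, llc>,

Under these assumptions, passing ``-opt`` gives the optional edge a
weight of 5 against 1 for the default edge; without ``-opt`` the
comparison is 0 against 1. With a plain ``(inc_weight)``, each true
test would add 2 instead.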

When passing an input file through the graph, LLVMC picks the edge
with the maximum weight. To avoid ambiguity, there should be only one
default edge between two nodes (with the exception of the root node,
which gets a special treatment - there you are allowed to specify one
default edge *per language*).

To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
``gsview`` installed for this to work properly.


Writing a tool description
==========================

As was said earlier, nodes in the compilation graph represent tools,
which are described separately. A tool definition looks like this
(taken from the ``Tools.td`` file)::

    def llvm_gcc_cpp : Tool<[
        (in_language "c++"),
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
        (sink)
        ]>;

This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
``llvm-g++``. As you can see, a tool definition is just a list of
properties; most of them should be self-explanatory. The ``sink``
property means that this tool should be passed all command-line
options that lack explicit descriptions.

The complete list of the currently implemented tool properties follows:

* Possible tool properties:

  - ``in_language`` - input language name. Can be either a string or a
    list, in case the tool supports multiple input languages.

  - ``out_language`` - output language name.

  - ``output_suffix`` - output file suffix.

  - ``cmd_line`` - the actual command used to run the tool. You can
    use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
    with ``>``, hook invocations (``$CALL``), environment variables
    (via ``$ENV``) and the ``case`` construct (more on this below).

  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.

  - ``sink`` - all command-line options that are not handled by other
    tools are passed to this tool.
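
For instance, a minimal tool description that uses only the properties
listed above might look like the following sketch. The tool name
``my_assembler``, the ``as`` command and the language names are
assumptions made for illustration; they are not taken from
``Tools.td``::

    // Hypothetical tool: turns assembly files into object files.
    def my_assembler : Tool<[
        (in_language "assembler"),
        (out_language "object-code"),
        (output_suffix "o"),
        (cmd_line "as $INFILE -o $OUTFILE")
        ]>;

Because neither ``join`` nor ``sink`` is given, such a tool would not
act as a linker-style join node and would not receive options that it
does not describe itself.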

The next tool definition is slightly more complex::

    def llvm_gcc_linker : Tool<[
        (in_language "object-code"),
        (out_language "executable"),
        (output_suffix "out"),
        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
        (join),
        (prefix_list_option "L", (forward),
                            (help "add a directory to link path")),
        (prefix_list_option "l", (forward),
                            (help "search a library when linking")),
        (prefix_list_option "Wl", (unpack_values),
                            (help "pass options to linker"))
        ]>;

This tool has a "join" property, which means that it behaves like a
linker. This tool also defines several command-line options: ``-l``,
``-L`` and ``-Wl`` which have their usual meaning. An option has two
attributes: a name and a (possibly empty) list of properties. All
currently implemented option types and properties are described below:

* Possible option types:

  - ``switch_option`` - a simple boolean switch, for example ``-time``.

  - ``parameter_option`` - option that takes an argument, for example
    ``-std=c99``.

  - ``parameter_list_option`` - same as the above, but more than one
    occurrence of the option is allowed.

  - ``prefix_option`` - same as the ``parameter_option``, but the
    option name and parameter value are not separated.

  - ``prefix_list_option`` - same as the above, but more than one
    occurrence of the option is allowed; example: ``-lm -lpthread``.

  - ``alias_option`` - a special option type for creating
    aliases. Unlike other option types, aliases are not allowed to
    have any properties besides the aliased option name. Usage
    example: ``(alias_option "preprocess", "E")``


* Possible option properties:

  - ``append_cmd`` - append a string to the tool invocation command.

  - ``forward`` - forward this option unchanged.

  - ``forward_as`` - Change the name of this option, but forward the
    argument unchanged. Example: ``(forward_as "--disable-optimize")``.

  - ``output_suffix`` - modify the output suffix of this
    tool. Example: ``(switch_option "E", (output_suffix "i"))``.

  - ``stop_compilation`` - stop compilation after this phase.

  - ``unpack_values`` - used for splitting and forwarding
    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
    converted to ``-foo=bar -baz`` and appended to the tool invocation
    command.

  - ``help`` - help string associated with this option. Used for
    ``--help`` output.

  - ``required`` - this option is obligatory.
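
The option types and properties above can be combined as in the
following sketch of a tool's option declarations. The option names and
help strings are illustrative assumptions, written in the same style
as the ``llvm_gcc_linker`` example, and are not definitions from
``Tools.td``::

    // Hypothetical fragment of a tool's property list.
    (switch_option "c", (stop_compilation),
                   (help "stop after this phase")),
    (parameter_option "std", (forward),
                      (help "language standard to use")),
    (prefix_list_option "Wa", (unpack_values),
                        (help "pass options to the assembler"))

Here the switch only affects control flow, the parameter is handed to
the tool as-is, and the ``-Wa,...`` list is split at the commas before
being appended to the command line.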


Option list - specifying all options in a single place
======================================================

It can be handy to have all information about options gathered in a
single place to provide an overview. This can be achieved by using a
so-called ``OptionList``::

    def Options : OptionList<[
    (switch_option "E", (help "Help string")),
    (alias_option "quiet", "q")
    ...
    ]>;

``OptionList`` is also a good place to specify option aliases.

Tool-specific option properties like ``append_cmd`` have (obviously)
no meaning in the context of ``OptionList``, so the only properties
allowed there are ``help`` and ``required``.

Option lists are used at the file scope. See file
``examples/Clang.td`` for an example of ``OptionList`` usage.

Using hooks and environment variables in the ``cmd_line`` property
==================================================================

Normally, LLVMC executes programs from the system ``PATH``. Sometimes,
this is not sufficient: for example, we may want to specify tool names
in the configuration file. This can be achieved via the mechanism of
hooks - to compile LLVMC with your hooks, just drop a .cpp file into
the ``tools/llvmc2`` directory. Hooks should live in the ``hooks``
namespace and have the signature ``std::string hooks::MyHookName
(void)``. They can be used from the ``cmd_line`` tool property::

    (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")

It is also possible to use environment variables in the same manner::

    (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")

To change the command line string based on user-provided options, use
the ``case`` expression (documented below)::

    (cmd_line
      (case
        (switch_on "E"),
           "llvm-g++ -E -x c $INFILE -o $OUTFILE",
        (default),
           "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))

Conditional evaluation: the ``case`` expression
===============================================

The ``case`` construct can be used to calculate weights of the optional
edges and to choose between several alternative command line strings
in the ``cmd_line`` tool property. It is designed after the
similarly-named construct in functional languages and takes the form
``(case (test_1), statement_1, (test_2), statement_2, ... (test_N),
statement_N)``. The statements are evaluated only if the corresponding
tests evaluate to true.

Examples::

    // Increases edge weight by 5 if "-A" is provided on the
    // command-line, and by 5 more if "-B" is also provided.
    (case
        (switch_on "A"), (inc_weight 5),
        (switch_on "B"), (inc_weight 5))

    // Evaluates to "cmdline1" if option "-A" is provided on the
    // command line; otherwise to "cmdline2" if "-B" is provided;
    // otherwise to "cmdline3".
    (case
        (switch_on "A"), "cmdline1",
        (switch_on "B"), "cmdline2",
        (default), "cmdline3")

Note the slight difference in ``case`` expression handling in contexts
of edge weights and command line specification. In the first example
every test is evaluated, so both increments can apply. In the second
example the value of the ``"B"`` switch is never checked when switch
``"A"`` is enabled, and the whole expression always evaluates to
``"cmdline1"`` in that case.

Case expressions can also be nested, i.e. the following is legal::

    (case (switch_on "E"), (case (switch_on "o"), ..., (default), ...),
          (default), ...)

You should, however, try to avoid doing that because it hurts
readability. It is usually better to split tool descriptions and/or
use TableGen inheritance instead.

* Possible tests are:

  - ``switch_on`` - Returns true if a given command-line option is
    provided by the user. Example: ``(switch_on "opt")``. Note that
    you have to define all possible command-line options separately in
    the tool descriptions. See the section "Writing a tool
    description" for a discussion of the different kinds of
    command-line options.

  - ``parameter_equals`` - Returns true if a command-line parameter equals
    a given value. Example: ``(parameter_equals "W", "all")``.

  - ``element_in_list`` - Returns true if a command-line parameter list
    includes a given value. Example: ``(element_in_list "l", "pthread")``.

  - ``input_languages_contain`` - Returns true if a given language
    belongs to the current input language set. Example:
    ``(input_languages_contain "c++")``.

  - ``in_language`` - Evaluates to true if the language of the input
    file equals the argument. Valid only when using a ``case``
    expression in a ``cmd_line`` tool property. Example:
    ``(in_language "c++")``.

  - ``not_empty`` - Returns true if a given option (which should be
    either a parameter or a parameter list) is set by the
    user. Example: ``(not_empty "o")``.

  - ``default`` - Always evaluates to true. Should always be the last
    test in the ``case`` expression.

  - ``and`` - A standard logical combinator that returns true iff all
    of its arguments return true. Used like this: ``(and (test1),
    (test2), ... (testN))``. Nesting of ``and`` and ``or`` is allowed,
    but not encouraged.

  - ``or`` - Another logical combinator that returns true iff at
    least one of its arguments returns true. Example: ``(or (test1),
    (test2), ... (testN))``.
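
The tests can be combined as in the following sketch. The option names
and command strings are placeholders chosen for illustration, and each
option mentioned would still have to be declared in a tool description
or an ``OptionList``::

    (case
        // Both conditions must hold.
        (and (switch_on "E"), (not_empty "o")), "cmdline1",
        // Either value of the hypothetical "std" parameter matches.
        (or (parameter_equals "std", "c99"),
            (parameter_equals "std", "gnu99")), "cmdline2",
        // Fallback; placed last, as required for (default).
        (default), "cmdline3")

Used inside a ``cmd_line`` property, the first matching branch would
determine the command string, as described for the second example in
the previous section.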


Language map
============

One last thing that you will need to modify when adding support for a
new language to LLVMC is the language map, which defines mappings from
file extensions to language names. It is used to choose the proper
toolchain(s) for a given input file set. The language map definition is
located in the file ``Tools.td`` and looks like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,
         ...
        ]>;
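
Adding a language then amounts to adding one more entry to this list.
In the sketch below the language name ``my-lang`` and the suffix
``mylang`` are hypothetical and stand in for whatever your new
toolchain handles::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,
         ...
         // Hypothetical new entry: files ending in ".mylang" are
         // treated as input language "my-lang".
         LangToSuffixes<"my-lang", ["mylang"]>
        ]>;

A tool whose ``in_language`` is ``"my-lang"``, together with an edge
from the root node to that tool, would presumably also be needed
before such files can be compiled.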


References
==========

.. [1] TableGen Fundamentals
       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html