===================================
Customizing LLVMC: Reference Manual
===================================

LLVMC is a generic compiler driver, designed to be customizable and
extensible. It plays the same role for LLVM as the ``gcc`` program
does for GCC - LLVMC's job is essentially to transform a set of input
files into a set of targets depending on configuration rules and user
options. What makes LLVMC different is that these transformation rules
are completely customizable - in fact, LLVMC knows nothing about the
specifics of transformation (even the command-line options are mostly
not hard-coded) and regards the transformation structure as an
abstract graph. This makes it possible to adapt LLVMC for other
purposes - for example, as a build tool for game resources.

Because LLVMC employs TableGen [1]_ as its configuration language, you
need to be familiar with it to customize LLVMC.


.. contents::


Compiling with LLVMC
====================

LLVMC tries hard to be as compatible with ``gcc`` as possible,
although there are some small differences. Most of the time, however,
you shouldn't be able to notice them::

    $ # This works as expected:
    $ llvmc2 -O3 -Wall hello.cpp
    $ ./a.out
    hello

One nice feature of LLVMC is that you don't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``) - the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file
extensions). To override the language deduced from the extension (for
example, to compile a file ending in ``.cpp`` as C), use the ``-x``
option, just as you would with ``gcc``::

    $ llvmc2 -x c hello.cpp
    $ # hello.cpp is really a C file
    $ ./a.out
    hello

On the other hand, when using LLVMC as a linker to combine several C++
object files you should provide the ``--linker`` option since it's
impossible for LLVMC to choose the right linker in that case::

    $ llvmc2 -c hello.cpp
    $ llvmc2 hello.o
    [A lot of link-time errors skipped]
    $ llvmc2 --linker=c++ hello.o
    $ ./a.out
    hello

Predefined options
==================

LLVMC has some built-in options that can't be overridden in the
configuration files:

* ``-o FILE`` - Output file name.

* ``-x LANGUAGE`` - Specify the language of the following input files
  until the next ``-x`` option.

* ``-v`` - Enable verbose mode, i.e. print out all executed commands.

* ``--view-graph`` - Show a graphical representation of the compilation
  graph. Requires that you have ``dot`` and ``gv`` commands
  installed. Hidden option, useful for debugging.

* ``--write-graph`` - Write a ``compilation-graph.dot`` file in the
  current directory with the compilation graph description in the
  Graphviz format. Hidden option, useful for debugging.

* ``--save-temps`` - Write temporary files to the current directory
  and do not delete them on exit. Hidden option, useful for debugging.

* ``--help``, ``--help-hidden``, ``--version`` - These options have
  their standard meaning.


Customizing LLVMC: the compilation graph
========================================

At the time of writing LLVMC does not support on-the-fly reloading of
configuration, so to customize LLVMC you'll have to recompile the
source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
default configuration files are ``Common.td`` (contains common
definitions, don't forget to ``include`` it in your configuration
files), ``Tools.td`` (tool descriptions) and ``Graph.td`` (compilation
graph definition).

To compile LLVMC with your own configuration file (say, ``MyGraph.td``),
run ``make`` like this::

    $ cd $LLVM_DIR/tools/llvmc2
    $ make GRAPH=MyGraph.td TOOLNAME=my_llvmc

This will build an executable named ``my_llvmc``. There are also
several sample configuration files in the ``llvmc2/examples``
subdirectory that should help to get you started.

Internally, LLVMC stores information about possible source
transformations in the form of a graph. Nodes in this graph represent
tools, and edges between two nodes represent a transformation path. A
special "root" node is used to mark entry points for the
transformations. LLVMC also assigns a weight to each edge (more on
this later) to choose between several alternative edges.

The definition of the compilation graph (see file ``Graph.td``) is
just a list of edges::

    def CompilationGraph : CompilationGraph<[
        Edge<root, llvm_gcc_c>,
        Edge<root, llvm_gcc_assembler>,
        ...

        Edge<llvm_gcc_c, llc>,
        Edge<llvm_gcc_cpp, llc>,
        ...

        OptionalEdge<llvm_gcc_c, opt, [(switch_on "opt")]>,
        OptionalEdge<llvm_gcc_cpp, opt, [(switch_on "opt")]>,
        ...

        OptionalEdge<llvm_gcc_assembler, llvm_gcc_cpp_linker,
            (case (input_languages_contain "c++"), (inc_weight),
                  (or (parameter_equals "linker", "g++"),
                      (parameter_equals "linker", "c++")), (inc_weight))>,
        ...

        ]>;

As you can see, the edges can be either default or optional, where
optional edges are distinguished by a ``case`` expression used to
calculate the edge's weight.

The default edges are assigned a weight of 1, and optional edges get a
weight of 0 + 2*N where N is the number of tests that evaluated to
true in the ``case`` expression. It is also possible to provide an
integer parameter to ``inc_weight`` and ``dec_weight`` - in this case,
the weight is increased (or decreased) by the provided value instead
of the default 2.

When passing an input file through the graph, LLVMC picks the edge
with the maximum weight. To avoid ambiguity, there should be only one
default edge between two nodes (with the exception of the root node,
which gets a special treatment - there you are allowed to specify one
default edge *per language*).

To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
``gsview`` installed for this to work properly.

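The weight rules described in this section can be illustrated with a
small fragment. This is only a sketch: the first two edges are copied
from the ``Graph.td`` excerpt shown earlier, the last edge is a
hypothetical alternative, and the weights in the comments are worked
out by hand from the rules as stated here rather than taken from
LLVMC's output::

    // Default edge: always has a weight of 1.
    Edge<llvm_gcc_c, llc>,

    // Optional edge with a single test, so N is either 0 or 1:
    //   without -opt:  0 + 2*0 = 0  (less than 1, so the edge to llc wins)
    //   with -opt:     0 + 2*1 = 2  (greater than 1, so the edge to opt wins)
    OptionalEdge<llvm_gcc_c, opt, [(switch_on "opt")]>,

    // Hypothetical alternative to the edge above (use one or the other):
    // an explicit increment replaces the default 2, so the weight is
    // 0 + 5 = 5 when -opt is given.
    OptionalEdge<llvm_gcc_c, opt, (case (switch_on "opt"), (inc_weight 5))>,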

Writing a tool description
==========================

As was said earlier, nodes in the compilation graph represent tools,
which are described separately. A tool definition looks like this
(taken from the ``Tools.td`` file)::

    def llvm_gcc_cpp : Tool<[
        (in_language "c++"),
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
        (sink)
        ]>;

This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
``llvm-g++``. As you can see, a tool definition is just a list of
properties; most of them should be self-explanatory. The ``sink``
property means that this tool should be passed all command-line
options that lack explicit descriptions.

The complete list of the currently implemented tool properties follows:

* Possible tool properties:

  - ``in_language`` - input language name. Can be either a string or a
    list, in case the tool supports multiple input languages.

  - ``out_language`` - output language name.

  - ``output_suffix`` - output file suffix.

  - ``cmd_line`` - the actual command used to run the tool. You can
    use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
    with ``>``, hook invocations (``$CALL``), environment variables
    (via ``$ENV``) and the ``case`` construct (more on this below).

  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.

  - ``sink`` - all command-line options that are not handled by other
    tools are passed to this tool.

The next tool definition is slightly more complex::

    def llvm_gcc_linker : Tool<[
        (in_language "object-code"),
        (out_language "executable"),
        (output_suffix "out"),
        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
        (join),
        (prefix_list_option "L", (forward),
            (help "add a directory to link path")),
        (prefix_list_option "l", (forward),
            (help "search a library when linking")),
        (prefix_list_option "Wl", (unpack_values),
            (help "pass options to linker"))
        ]>;

This tool has a "join" property, which means that it behaves like a
linker. This tool also defines several command-line options: ``-l``,
``-L`` and ``-Wl`` which have their usual meaning. An option has two
attributes: a name and a (possibly empty) list of properties. All
currently implemented option types and properties are described below:

* Possible option types:

  - ``switch_option`` - a simple boolean switch, for example ``-time``.

  - ``parameter_option`` - option that takes an argument, for example
    ``-std=c99``.

  - ``parameter_list_option`` - same as the above, but more than one
    occurrence of the option is allowed.

  - ``prefix_option`` - same as the ``parameter_option``, but the option
    name and parameter value are not separated.

  - ``prefix_list_option`` - same as the above, but more than one
    occurrence of the option is allowed; example: ``-lm -lpthread``.

  - ``alias_option`` - a special option type for creating
    aliases. Unlike other option types, aliases are not allowed to
    have any properties besides the aliased option name. Usage
    example: ``(alias_option "preprocess", "E")``


* Possible option properties:

  - ``append_cmd`` - append a string to the tool invocation command.

  - ``forward`` - forward this option unchanged.

  - ``output_suffix`` - modify the output suffix of this
    tool. Example: ``(switch_option "E", (output_suffix "i"))``.

  - ``stop_compilation`` - stop compilation after this phase.

  - ``unpack_values`` - used for splitting and forwarding
    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
    converted to ``-foo=bar -baz`` and appended to the tool invocation
    command.

  - ``help`` - help string associated with this option. Used for
    ``--help`` output.

  - ``required`` - this option is obligatory.

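As an illustration, the option types and properties listed above can
be combined in a single tool description. The following is only a
sketch: the tool name ``my_c_compiler``, the particular set of options
and the help strings are hypothetical, and the property syntax simply
follows the ``llvm_gcc_linker`` example above::

    def my_c_compiler : Tool<[
        (in_language "c"),
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "llvm-gcc -c $INFILE -o $OUTFILE -emit-llvm"),
        // -E: stop after this tool and use the "i" suffix for its output.
        (switch_option "E", (stop_compilation), (output_suffix "i"),
            (help "stop after the preprocessing stage")),
        // -std=c99: takes one argument; forwarded unchanged.
        (parameter_option "std", (forward),
            (help "choose the language standard")),
        // -Ifoo -Ibar: name and value are not separated; may be repeated.
        (prefix_list_option "I", (forward),
            (help "add a directory to the include path")),
        // -Wa,-foo=bar,-baz: split on commas and appended to the command.
        (prefix_list_option "Wa", (unpack_values),
            (help "pass options to the assembler"))
        ]>;

In this sketch the ``E`` switch only changes the output suffix and
stops the compilation; a complete description would presumably also
select a different command line when ``-E`` is given, using the
``case`` construct in ``cmd_line``.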

Option list - specifying all options in a single place
======================================================

It can be handy to have all information about options gathered in a
single place to provide an overview. This can be achieved by using a
so-called ``OptionList``::

    def Options : OptionList<[
        (switch_option "E", (help "Help string")),
        (alias_option "quiet", "q")
        ...
        ]>;

``OptionList`` is also a good place to specify option aliases.

Tool-specific option properties like ``append_cmd`` have (obviously)
no meaning in the context of ``OptionList``, so the only properties
allowed there are ``help`` and ``required``.

Option lists are used at the file scope. See file
``examples/Clang.td`` for an example of ``OptionList`` usage.

Using hooks and environment variables in the ``cmd_line`` property
==================================================================

Normally, LLVMC executes programs from the system ``PATH``. Sometimes,
this is not sufficient: for example, we may want to specify tool names
in the configuration file. This can be achieved via the mechanism of
hooks - to compile LLVMC with your hooks, just drop a ``.cpp`` file into
the ``tools/llvmc2`` directory. Hooks should live in the ``hooks``
namespace and have the signature ``std::string hooks::MyHookName
(void)``. They can be used from the ``cmd_line`` tool property::

    (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")

It is also possible to use environment variables in the same manner::

    (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")

To change the command line string based on user-provided options, use
the ``case`` expression (documented below)::

    (cmd_line
      (case
        (switch_on "E"),
           "llvm-g++ -E -x c $INFILE -o $OUTFILE",
        (default),
           "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))

Conditional evaluation: the ``case`` expression
===============================================

The ``case`` construct can be used to calculate weights of the optional
edges and to choose between several alternative command line strings
in the ``cmd_line`` tool property. It is modeled on the
similarly-named construct in functional languages and takes the form
``(case (test_1), statement_1, (test_2), statement_2, ... (test_N),
statement_N)``. The statements are evaluated only if the corresponding
tests evaluate to true.

Examples::

    // Increases edge weight by 5 if "-A" is provided on the
    // command-line, and by 5 more if "-B" is also provided.
    (case
        (switch_on "A"), (inc_weight 5),
        (switch_on "B"), (inc_weight 5))

    // Evaluates to "cmdline1" if option "-A" is provided on the
    // command line, to "cmdline2" if "-B" is provided without "-A",
    // and to "cmdline3" otherwise.
    (case
        (switch_on "A"), "cmdline1",
        (switch_on "B"), "cmdline2",
        (default), "cmdline3")

Note the slight difference in ``case`` expression handling in contexts
of edge weights and command line specification - in the second example
the value of the ``"B"`` switch is never checked when switch ``"A"`` is
enabled, and the whole expression always evaluates to ``"cmdline1"`` in
that case.
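
To make the difference concrete, the two examples above can be
evaluated by hand for each combination of the ``-A`` and ``-B``
switches. The results below are derived from the rules described in
this document (assuming that the weight of an optional edge starts at
0), not from running LLVMC::

    // Switches given   Edge weight (first example)   String (second example)
    // (none)           0                             "cmdline3"
    // -A               5                             "cmdline1"
    // -B               5                             "cmdline2"
    // -A -B            5 + 5 = 10                    "cmdline1"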

Case expressions can also be nested, i.e. the following is legal::

    (case (switch_on "E"), (case (switch_on "o"), ..., (default), ...),
          (default), ...)

You should, however, try to avoid doing that because it hurts
readability. It is usually better to split tool descriptions and/or
use TableGen inheritance instead.

* Possible tests are:

  - ``switch_on`` - Returns true if a given command-line option is
    provided by the user. Example: ``(switch_on "opt")``. Note that
    you have to define all possible command-line options separately in
    the tool descriptions. See the section on writing a tool
    description for the discussion of different kinds of command-line
    options.

  - ``parameter_equals`` - Returns true if a command-line parameter equals
    a given value. Example: ``(parameter_equals "W", "all")``.

  - ``element_in_list`` - Returns true if a command-line parameter list
    includes a given value. Example: ``(element_in_list "l", "pthread")``.

  - ``input_languages_contain`` - Returns true if a given language
    belongs to the current input language set. Example:
    ``(input_languages_contain "c++")``.

  - ``in_language`` - Evaluates to true if the language of the input
    file equals the argument. Valid only when using a ``case``
    expression in a ``cmd_line`` tool property. Example:
    ``(in_language "c++")``.

  - ``not_empty`` - Returns true if a given option (which should be
    either a parameter or a parameter list) is set by the
    user. Example: ``(not_empty "o")``.

  - ``default`` - Always evaluates to true. Should always be the last
    test in the ``case`` expression.

  - ``and`` - A standard logical combinator that returns true iff all
    of its arguments return true. Used like this: ``(and (test1),
    (test2), ... (testN))``. Nesting of ``and`` and ``or`` is allowed,
    but not encouraged.

  - ``or`` - Another logical combinator that returns true iff at least
    one of its arguments returns true. Example: ``(or (test1),
    (test2), ... (testN))``.

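The tests listed above can be combined in a single ``case``
expression. The following ``cmd_line`` property is only a sketch: the
command strings are placeholders, and the option names (``E``, ``o``,
``linker``) are assumed to be defined elsewhere in the tool
descriptions::

    (cmd_line
      (case
        // -E was given together with a non-empty -o parameter.
        (and (switch_on "E"), (not_empty "o")),
           "cmdline1",
        // The input file is C++, or a C++ linker was requested explicitly.
        (or (in_language "c++"),
            (parameter_equals "linker", "g++"),
            (parameter_equals "linker", "c++")),
           "cmdline2",
        // No other test matched; (default) comes last.
        (default),
           "cmdline3"))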

Language map
============

One last thing that you will need to modify when adding support for a
new language to LLVMC is the language map, which defines mappings from
file extensions to language names. It is used to choose the proper
toolchain(s) for a given input file set. The language map definition is
located in the file ``Tools.td`` and looks like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,
         ...
        ]>;

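Adding a language therefore amounts to adding one more
``LangToSuffixes`` entry. In the sketch below, the ``"my-lang"`` name
and the ``"myl"`` suffix are hypothetical; a tool whose ``in_language``
is ``"my-lang"`` and a corresponding edge from ``root`` in the
compilation graph would presumably also be needed for such files to be
processed::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,
         // Hypothetical entry: treat files ending in ".myl" as "my-lang".
         LangToSuffixes<"my-lang", ["myl"]>,
         ...
        ]>;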

References
==========

.. [1] TableGen Fundamentals
       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html