Customizing LLVMC: Reference Manual
===================================

LLVMC is a generic compiler driver, designed to be customizable and
extensible. It plays the same role for LLVM as the ``gcc`` program
does for GCC - LLVMC's job is essentially to transform a set of input
files into a set of targets depending on configuration rules and user
options. What makes LLVMC different is that these transformation rules
are completely customizable - in fact, LLVMC knows nothing about the
specifics of transformation (even the command-line options are mostly
not hard-coded) and regards the transformation structure as an
abstract graph. This makes it possible to adapt LLVMC for other
purposes - for example, as a build tool for game resources.

Because LLVMC employs TableGen [1]_ as its configuration language, you
need to be familiar with it to customize LLVMC.

Compiling with LLVMC
--------------------

LLVMC tries hard to be as compatible with ``gcc`` as possible,
although there are some small differences. Most of the time, however,
you shouldn't be able to notice them::

    $ # This works as expected:
    $ llvmc2 -O3 -Wall hello.cpp
    $ ./a.out
    hello

One nice feature of LLVMC is that one doesn't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``) - the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file
extensions). If you want to override this choice (for example, to
compile a file ending with ".cpp" as C), use the ``-x`` option, just
as you would with ``gcc``::

    $ llvmc2 -x c hello.cpp
    $ # hello.cpp is really a C file
    $ ./a.out
    hello

On the other hand, when using LLVMC as a linker to combine several C++
object files you should provide the ``--linker`` option since it's
impossible for LLVMC to choose the right linker in that case::

    $ llvmc2 -c hello.cpp
    $ llvmc2 hello.o
    [A lot of link-time errors skipped]
    $ llvmc2 --linker=c++ hello.o
    $ ./a.out
    hello


Customizing LLVMC: the compilation graph
----------------------------------------

At the time of writing, LLVMC does not support on-the-fly reloading of
configuration, so to customize LLVMC you'll have to recompile the
source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
default configuration files are ``Common.td`` (contains common
definitions; don't forget to ``include`` it in your configuration
files), ``Tools.td`` (tool descriptions) and ``Graph.td`` (compilation
graph definition).

To compile LLVMC with your own configuration file (say, ``MyGraph.td``),
run ``make`` like this::

    $ cd $LLVM_DIR/tools/llvmc2
    $ make GRAPH=MyGraph.td TOOLNAME=my_llvmc

This will build an executable named ``my_llvmc``. There are also
several sample configuration files in the ``llvmc2/examples``
subdirectory that should help to get you started.

Internally, LLVMC stores information about possible source
transformations in the form of a graph. Nodes in this graph represent
tools, and edges between two nodes represent a transformation path. A
special "root" node is used to mark entry points for the
transformations. LLVMC also assigns a weight to each edge (more on
this later) to choose between several alternative edges.

The definition of the compilation graph (see file ``Graph.td``) is
just a list of edges::

    def CompilationGraph : CompilationGraph<[
        Edge<root, llvm_gcc_c>,
        Edge<root, llvm_gcc_assembler>,
        ...

        Edge<llvm_gcc_c, llc>,
        Edge<llvm_gcc_cpp, llc>,
        ...

        OptionalEdge<llvm_gcc_c, opt, [(switch_on "opt")]>,
        OptionalEdge<llvm_gcc_cpp, opt, [(switch_on "opt")]>,
        ...

        OptionalEdge<llvm_gcc_assembler, llvm_gcc_cpp_linker,
            (case (input_languages_contain "c++"), (inc_weight),
                  (or (parameter_equals "linker", "g++"),
                      (parameter_equals "linker", "c++")), (inc_weight))>,
        ...

        ]>;

As you can see, the edges can be either default or optional, where
optional edges are distinguished by carrying a ``case`` expression
used to calculate the edge's weight.

The default edges are assigned a weight of 1, and optional edges get a
weight of 0 + 2*N where N is the number of tests that evaluated to
true in the ``case`` expression. It is also possible to provide an
integer parameter to ``inc_weight`` and ``dec_weight`` - in this case,
the weight is increased (or decreased) by the provided value instead
of the default 2.

When passing an input file through the graph, LLVMC picks the edge
with the maximum weight. To avoid ambiguity, there should be only one
default edge between two nodes (with the exception of the root node,
which gets special treatment - there you are allowed to specify one
default edge *per language*).

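The weighting rules above can be illustrated with a small C++ model.
This is only a sketch of the arithmetic, not LLVMC's actual
implementation: the ``CaseClause`` and ``EdgeModel`` types and the
``edge_weight`` and ``pick_edge`` functions are names invented for this
example, and breaking ties in favour of the earliest edge is an
assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of one (test, action) pair in an edge's case
// expression: whether the test evaluated to true, and the signed
// weight change of its action (+2 for a bare inc_weight, -2 for a
// bare dec_weight, or the explicit integer parameter).
struct CaseClause {
    bool test_is_true;
    int weight_delta;
};

// Hypothetical model of an outgoing edge. A default edge has a fixed
// weight of 1 and no clauses; an optional edge starts from 0.
struct EdgeModel {
    std::string target;
    bool is_default;
    std::vector<CaseClause> clauses;
};

// Weight as described in the text: 1 for default edges, otherwise
// the sum of the deltas of all clauses whose test is true.
int edge_weight(const EdgeModel& edge) {
    if (edge.is_default) return 1;
    int weight = 0;
    for (const CaseClause& clause : edge.clauses)
        if (clause.test_is_true) weight += clause.weight_delta;
    return weight;
}

// Index of the edge with the maximum weight. Ties go to the earliest
// edge, which is an assumption of this sketch.
std::size_t pick_edge(const std::vector<EdgeModel>& edges) {
    assert(!edges.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < edges.size(); ++i)
        if (edge_weight(edges[i]) > edge_weight(edges[best])) best = i;
    return best;
}
```

With one default edge and one optional edge whose single test is true,
the optional edge has weight 2 and wins over the default edge's
weight of 1.
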
To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
``gsview`` installed for this to work properly.


The 'case' construct
--------------------

The 'case' construct can be used to calculate weights for optional
edges and to choose between several alternative command line strings
in the ``cmd_line`` tool property. It is designed after the
similarly-named construct in functional languages and takes the
form ``(case (test_1), statement_1, (test_2), statement_2,
... (test_N), statement_N)``.

* Possible tests are:

  - ``switch_on`` - Returns true if a given command-line option is
    provided by the user. Example: ``(switch_on "opt")``. Note that
    you have to define all possible command-line options separately in
    the tool descriptions. See the next section for the discussion of
    different kinds of command-line options.

  - ``parameter_equals`` - Returns true if a command-line parameter equals
    a given value. Example: ``(parameter_equals "W", "all")``.

  - ``element_in_list`` - Returns true if a command-line parameter list
    includes a given value. Example: ``(element_in_list "l", "pthread")``.

  - ``input_languages_contain`` - Returns true if a given language
    belongs to the current input language set. Example:
    ``(input_languages_contain "c++")``.

  - ``default`` - Always evaluates to true. Should be used as the
    last test in a ``case`` expression.

  - ``and`` - A standard logical combinator that returns true iff all
    of its arguments return true. Used like this: ``(and (test1),
    (test2), ... (testN))``. Nesting of ``and`` and ``or`` is allowed,
    but not encouraged.

  - ``or`` - Another logical combinator that returns true iff at least
    one of its arguments returns true. Example: ``(or (test1),
    (test2), ... (testN))``.


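To make the behaviour of the tests and combinators more concrete, here
is a rough C++ model of a ``case`` expression that chooses between
alternative strings. It is a sketch under stated assumptions rather
than LLVMC's code: the ``Options`` structure and all function names
are invented for this example, and the first-match selection rule is
an assumption based on the analogy with functional languages.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical snapshot of what the user passed on the command line.
struct Options {
    std::set<std::string> switches;                 // e.g. "E", "opt"
    std::map<std::string, std::string> parameters;  // e.g. linker -> c++
};

using Test = std::function<bool(const Options&)>;

// Models switch_on: true if the named switch was given.
Test switch_on(std::string name) {
    return [name](const Options& o) { return o.switches.count(name) > 0; };
}

// Models parameter_equals: true if the parameter has the given value.
Test parameter_equals(std::string name, std::string value) {
    return [name, value](const Options& o) {
        auto it = o.parameters.find(name);
        return it != o.parameters.end() && it->second == value;
    };
}

// Models default: always true.
Test default_test() {
    return [](const Options&) { return true; };
}

// Models and: true iff every argument is true.
Test and_test(std::vector<Test> tests) {
    return [tests](const Options& o) {
        for (const Test& t : tests)
            if (!t(o)) return false;
        return true;
    };
}

// Models or: true iff at least one argument is true.
Test or_test(std::vector<Test> tests) {
    return [tests](const Options& o) {
        for (const Test& t : tests)
            if (t(o)) return true;
        return false;
    };
}

using Clause = std::pair<Test, std::string>;

// Returns the statement of the first clause whose test is true, or an
// empty string if none matches (first-match is an assumption here).
std::string select_first(const std::vector<Clause>& clauses,
                         const Options& options) {
    assert(!clauses.empty());
    for (const Clause& clause : clauses)
        if (clause.first(options)) return clause.second;
    return "";
}
```

Placing ``default_test()`` last mirrors the role of ``default`` in a
``case`` expression: it catches every input that no earlier test
matched.
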
Writing a tool description
--------------------------

As was said earlier, nodes in the compilation graph represent tools,
which are described separately. A tool definition looks like this
(taken from the ``Tools.td`` file)::

    def llvm_gcc_cpp : Tool<[
        (in_language "c++"),
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
        (sink)
    ]>;

This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
``llvm-g++``. As you can see, a tool definition is just a list of
properties; most of them should be self-explanatory. The ``sink``
property means that this tool should be passed all command-line
options that lack explicit descriptions.

The complete list of the currently implemented tool properties follows:

* Possible tool properties:

  - ``in_language`` - input language name.

  - ``out_language`` - output language name.

  - ``output_suffix`` - output file suffix.

  - ``cmd_line`` - the actual command used to run the tool. You can
    use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
    with ``>``, hook invocations (``$CALL``), environment variables
    (via ``$ENV``) and the ``case`` construct (more on this below).

  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.

  - ``sink`` - all command-line options that are not handled by other
    tools are passed to this tool.

The next tool definition is slightly more complex::

    def llvm_gcc_linker : Tool<[
        (in_language "object-code"),
        (out_language "executable"),
        (output_suffix "out"),
        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
        (join),
        (prefix_list_option "L", (forward),
                            (help "add a directory to link path")),
        (prefix_list_option "l", (forward),
                            (help "search a library when linking")),
        (prefix_list_option "Wl", (unpack_values),
                            (help "pass options to linker"))
    ]>;

This tool has a "join" property, which means that it behaves like a
linker. This tool also defines several command-line options: ``-l``,
``-L`` and ``-Wl`` which have their usual meaning. An option has two
attributes: a name and a (possibly empty) list of properties. All
currently implemented option types and properties are described below:

* Possible option types:

  - ``switch_option`` - a simple boolean switch, for example ``-time``.

  - ``parameter_option`` - option that takes an argument, for example
    ``-std=c99``.

  - ``parameter_list_option`` - same as the above, but more than one
    occurrence of the option is allowed.

  - ``prefix_option`` - same as the ``parameter_option``, but the option
    name and parameter value are not separated.

  - ``prefix_list_option`` - same as the above, but more than one
    occurrence of the option is allowed; example: ``-lm -lpthread``.


* Possible option properties:

  - ``append_cmd`` - append a string to the tool invocation command.

  - ``forward`` - forward this option unchanged.

  - ``output_suffix`` - modify the output suffix of this
    tool. Example: ``(switch_option "E", (output_suffix "i"))``.

  - ``stop_compilation`` - stop compilation after this phase.

  - ``unpack_values`` - used for splitting and forwarding
    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
    converted to ``-foo=bar -baz`` and appended to the tool invocation
    command.

  - ``help`` - help string associated with this option. Used for
    ``--help`` output.

  - ``required`` - this option is obligatory.


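The effect of the ``unpack_values`` property on an option such as
``-Wa,-foo=bar,-baz`` can be sketched in C++ as follows. The function
below is a hypothetical illustration of the splitting step only (its
name and signature are not taken from the LLVMC sources), and it
assumes that everything before the first comma is the option name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split a comma-separated option such as "-Wa,-foo=bar,-baz" and drop
// the leading option name, leaving {"-foo=bar", "-baz"}. An option
// without any comma yields an empty list.
std::vector<std::string> unpack_values(const std::string& option) {
    assert(!option.empty());
    std::vector<std::string> values;
    std::size_t start = option.find(',');
    while (start != std::string::npos) {
        std::size_t end = option.find(',', start + 1);
        // Length of the piece between the two commas, or "to the end
        // of the string" when this is the last piece.
        std::size_t len = (end == std::string::npos)
                              ? std::string::npos
                              : end - start - 1;
        values.push_back(option.substr(start + 1, len));
        start = end;
    }
    return values;
}
```

The resulting values would then be appended, separated by spaces, to
the tool invocation command.
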
Hooks and environment variables
-------------------------------

Normally, LLVMC executes programs from the system ``PATH``. Sometimes,
this is not sufficient: for example, we may want to specify tool names
in the configuration file. This can be achieved via the mechanism of
hooks - to compile LLVMC with your hooks, just drop a .cpp file into
the ``tools/llvmc2`` directory. Hooks should live in the ``hooks``
namespace and have the signature ``std::string hooks::MyHookName
(void)``. They can be used from the ``cmd_line`` tool property::

    (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")

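A hook file following the signature described above might look like
the sketch below. Only the ``hooks`` namespace and the
``std::string ...(void)`` signature come from this manual; the hook
body, the ``MY_TOOLS_DIR`` environment variable and the fallback
directory are assumptions invented for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

namespace hooks {

// Hypothetical hook: returns the directory that holds the tools. The
// value is read from the MY_TOOLS_DIR environment variable (an
// invented name), falling back to "/usr/local/bin" when the variable
// is unset or empty.
std::string MyHook() {
    const char* dir = std::getenv("MY_TOOLS_DIR");
    std::string result = (dir && *dir) ? std::string(dir)
                                       : std::string("/usr/local/bin");
    // The result is spliced into a command line, so it must not be empty.
    assert(!result.empty());
    return result;
}

}  // namespace hooks
```

With such a hook in place, ``$CALL(MyHook)/path/to/file`` in a
``cmd_line`` string would expand to a path under the returned
directory.
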
Environment variables can be used in ``cmd_line`` in the same manner::

    (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")

To change the command line string based on user-provided options, use
the ``case`` expression (which we have already seen before)::

    (cmd_line
      (case
        (switch_on "E"),
           "llvm-g++ -E -x c $INFILE -o $OUTFILE",
        (default),
           "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))


Language map
------------

One last thing that you will need to modify when adding support for a
new language to LLVMC is the language map, which defines mappings from
file extensions to language names. It is used to choose the proper
toolchain(s) for a given input file set. The language map definition is
located in the file ``Tools.td`` and looks like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,
         ...
        ]>;


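The lookup that the language map enables can be modelled in C++ as
shown below. This is a sketch only: the table repeats the two
``LangToSuffixes`` entries listed above, while the function names and
the rule of taking the text after the last dot as the suffix are
assumptions of this example, not a description of LLVMC's code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Suffix -> language table mirroring the two LanguageMap entries
// shown in this section.
const std::map<std::string, std::string>& suffix_to_language() {
    static const std::map<std::string, std::string> table = {
        {"cc", "c++"},  {"cp", "c++"},  {"cxx", "c++"}, {"cpp", "c++"},
        {"CPP", "c++"}, {"c++", "c++"}, {"C", "c++"},   {"c", "c"},
    };
    return table;
}

// Returns the language name for a file, or an empty string when the
// file has no suffix or the suffix is not in the table.
std::string language_for(const std::string& file_name) {
    assert(!file_name.empty());
    std::size_t dot = file_name.rfind('.');
    if (dot == std::string::npos) return "";
    const auto& table = suffix_to_language();
    auto it = table.find(file_name.substr(dot + 1));
    return it == table.end() ? "" : it->second;
}
```

Note that the lookup is case-sensitive, which is what lets ``hello.C``
map to C++ while ``hello.c`` maps to C.
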
References
==========

.. [1] TableGen Fundamentals
       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html