Customizing LLVMC: Reference Manual
===================================

LLVMC is a generic compiler driver, designed to be customizable and
extensible. It plays the same role for LLVM as the ``gcc`` program
does for GCC - LLVMC's job is essentially to transform a set of input
files into a set of targets depending on configuration rules and user
options. What makes LLVMC different is that these transformation rules
are completely customizable - in fact, LLVMC knows nothing about the
specifics of transformation (even the command-line options are mostly
not hard-coded) and regards the transformation structure as an
abstract graph. This makes it possible to adapt LLVMC for other
purposes - for example, as a build tool for game resources.

Because LLVMC employs TableGen [1]_ as its configuration language, you
need to be familiar with it to customize LLVMC.

Compiling with LLVMC
--------------------

LLVMC tries hard to be as compatible with ``gcc`` as possible,
although there are some small differences. Most of the time, however,
you shouldn't be able to notice them::

    $ # This works as expected:
    $ llvmc2 -O3 -Wall hello.cpp
    $ ./a.out
    hello

One nice feature of LLVMC is that you don't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``) - the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file
extensions). If you want to override the language implied by the
extension, for example to compile a file ending with ``.cpp`` as C,
use the ``-x`` option, just as you would with ``gcc``::

    $ llvmc2 -x c hello.cpp
    $ # hello.cpp is really a C file
    $ ./a.out
    hello

On the other hand, when using LLVMC as a linker to combine several C++
object files, you should provide the ``--linker`` option, since it's
impossible for LLVMC to choose the right linker in that case::

    $ llvmc2 -c hello.cpp
    $ llvmc2 hello.o
    [A lot of link-time errors skipped]
    $ llvmc2 --linker=c++ hello.o
    $ ./a.out
    hello


Customizing LLVMC: the compilation graph
----------------------------------------

At the time of writing, LLVMC does not support on-the-fly reloading of
configuration, so to customize LLVMC you'll have to recompile the
source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
default configuration files are ``Common.td`` (common definitions;
don't forget to ``include`` it in your configuration files),
``Tools.td`` (tool descriptions) and ``Graph.td`` (compilation
graph definition).

To compile LLVMC with your own configuration file (say, ``MyGraph.td``),
run ``make`` like this::

    $ cd $LLVM_DIR/tools/llvmc2
    $ make GRAPH=MyGraph.td TOOLNAME=my_llvmc

This will build an executable named ``my_llvmc``. There are also
several sample configuration files in the ``llvmc2/examples``
subdirectory that should help to get you started.

Internally, LLVMC stores information about possible source
transformations in the form of a graph. Nodes in this graph represent
tools, and edges between two nodes represent a transformation path. A
special "root" node is used to mark entry points for the
transformations. LLVMC also assigns a weight to each edge (more on
this later) to choose between several alternative edges.

The definition of the compilation graph (see file ``Graph.td``) is
just a list of edges::

    def CompilationGraph : CompilationGraph<[
        Edge<root, llvm_gcc_c>,
        Edge<root, llvm_gcc_assembler>,
        ...

        Edge<llvm_gcc_c, llc>,
        Edge<llvm_gcc_cpp, llc>,
        ...

        OptionalEdge<llvm_gcc_c, opt, [(switch_on "opt")]>,
        OptionalEdge<llvm_gcc_cpp, opt, [(switch_on "opt")]>,
        ...

        OptionalEdge<llvm_gcc_assembler, llvm_gcc_cpp_linker,
            (case (input_languages_contain "c++"), (inc_weight),
                  (or (parameter_equals "linker", "g++"),
                      (parameter_equals "linker", "c++")), (inc_weight))>,
        ...

        ]>;

As you can see, the edges can be either default or optional, where
optional edges are differentiated by sporting a ``case`` expression
used to calculate the edge's weight.

The default edges are assigned a weight of 1, and optional edges get a
weight of 0 + 2*N where N is the number of tests that evaluated to
true in the ``case`` expression. It is also possible to provide an
integer parameter to ``inc_weight`` and ``dec_weight`` - in this case,
the weight is increased (or decreased) by the provided value instead
of the default 2.
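
The weight rule can be modelled in a few lines of C++. This is only an
illustrative sketch, not LLVMC's actual implementation: the ``WeightRule``
struct and the ``OptionalEdgeWeight`` function are hypothetical names, and each
rule is assumed to carry the already-evaluated result of its test.

```cpp
#include <vector>

// Hypothetical model of one (test, action) pair in a ``case`` expression.
struct WeightRule {
    bool test_passed;  // result of evaluating the test for this invocation
    int delta;         // +2 for a plain inc_weight, -2 for a plain dec_weight,
                       // or plus/minus the explicit integer argument
};

// An optional edge starts at weight 0 and accumulates the delta of
// every rule whose test evaluated to true.
int OptionalEdgeWeight(const std::vector<WeightRule>& rules) {
    int weight = 0;
    for (const WeightRule& rule : rules) {
        if (rule.test_passed) weight += rule.delta;
    }
    return weight;
}
```

With two plain ``inc_weight`` rules that both fire, the model yields 0 + 2*2 = 4,
which outweighs a default edge of weight 1.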

When passing an input file through the graph, LLVMC picks the edge
with the maximum weight. To avoid ambiguity, there should be only one
default edge between two nodes (with the exception of the root node,
which gets a special treatment - there you are allowed to specify one
default edge *per language*).
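
The selection step amounts to a maximum search over the outgoing edges of a
node. The sketch below is a simplified model rather than LLVMC's own code:
``OutEdge`` and ``PickEdge`` are hypothetical names, and returning the first
edge on a tie is an assumption (the manual only says ambiguity should be
avoided).

```cpp
#include <string>
#include <vector>

// Hypothetical representation of an outgoing edge of a graph node.
struct OutEdge {
    std::string target;  // name of the tool node the edge points to
    int weight;          // 1 for default edges, computed for optional ones
};

// Returns the edge with the maximum weight, or nullptr if the node has
// no outgoing edges. On a tie the first such edge wins (an assumption).
const OutEdge* PickEdge(const std::vector<OutEdge>& edges) {
    const OutEdge* best = nullptr;
    for (const OutEdge& edge : edges) {
        if (best == nullptr || edge.weight > best->weight) best = &edge;
    }
    return best;
}
```

For example, a default edge to ``llc`` (weight 1) loses to an optional edge to
``opt`` whose single test fired (weight 2).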

To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
``gsview`` installed for this to work properly.


The 'case' construct
--------------------

The 'case' construct can be used to calculate weights for optional
edges and to choose between several alternative command line strings
in the ``cmd_line`` tool property. It is designed after the
similarly-named construct in functional languages and takes the
form ``(case (test_1), statement_1, (test_2), statement_2,
... (test_N), statement_N)``.

* Possible tests are:

  - ``switch_on`` - Returns true if a given command-line option is
    provided by the user. Example: ``(switch_on "opt")``. Note that
    you have to define all possible command-line options separately in
    the tool descriptions. See the next section for the discussion of
    different kinds of command-line options.

  - ``parameter_equals`` - Returns true if a command-line parameter equals
    a given value. Example: ``(parameter_equals "W", "all")``.

  - ``element_in_list`` - Returns true if a command-line parameter list
    includes a given value. Example: ``(element_in_list "l", "pthread")``.

  - ``input_languages_contain`` - Returns true if a given language
    belongs to the current input language set. Example:
    ``(input_languages_contain "c++")``.

  - ``default`` - Always evaluates to true. Should be used as the last
    test in a ``case`` expression.

  - ``and`` - A standard logical combinator that returns true iff all
    of its arguments return true. Used like this: ``(and (test1),
    (test2), ... (testN))``. Nesting of ``and`` and ``or`` is allowed,
    but not encouraged.

  - ``or`` - Another logical combinator that returns true iff at least
    one of its arguments returns true. Example: ``(or (test1),
    (test2), ... (testN))``.


Writing a tool description
--------------------------

As was said earlier, nodes in the compilation graph represent tools,
which are described separately. A tool definition looks like this
(taken from the ``Tools.td`` file)::

    def llvm_gcc_cpp : Tool<[
        (in_language "c++"),
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
        (sink)
        ]>;

This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
``llvm-g++``. As you can see, a tool definition is just a list of
properties; most of them should be self-explanatory. The ``sink``
property means that this tool should be passed all command-line
options that lack explicit descriptions.

The complete list of the currently implemented tool properties follows:

* Possible tool properties:

  - ``in_language`` - input language name.

  - ``out_language`` - output language name.

  - ``output_suffix`` - output file suffix.

  - ``cmd_line`` - the actual command used to run the tool. You can
    use ``$INFILE`` and ``$OUTFILE`` variables, output redirection
    with ``>``, hook invocations (``$CALL``), environment variables
    (via ``$ENV``) and the ``case`` construct (more on this below).

  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.

  - ``sink`` - all command-line options that are not handled by other
    tools are passed to this tool.

The next tool definition is slightly more complex::

    def llvm_gcc_linker : Tool<[
        (in_language "object-code"),
        (out_language "executable"),
        (output_suffix "out"),
        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
        (join),
        (prefix_list_option "L", (forward),
                            (help "add a directory to link path")),
        (prefix_list_option "l", (forward),
                            (help "search a library when linking")),
        (prefix_list_option "Wl", (unpack_values),
                            (help "pass options to linker"))
        ]>;

This tool has a "join" property, which means that it behaves like a
linker. This tool also defines several command-line options: ``-l``,
``-L`` and ``-Wl`` which have their usual meaning. An option has two
attributes: a name and a (possibly empty) list of properties. All
currently implemented option types and properties are described below:

* Possible option types:

  - ``switch_option`` - a simple boolean switch, for example ``-time``.

  - ``parameter_option`` - option that takes an argument, for example
    ``-std=c99``.

  - ``parameter_list_option`` - same as the above, but more than one
    occurrence of the option is allowed.

  - ``prefix_option`` - same as the ``parameter_option``, but the option
    name and parameter value are not separated.

  - ``prefix_list_option`` - same as the above, but more than one
    occurrence of the option is allowed; example: ``-lm -lpthread``.


* Possible option properties:

  - ``append_cmd`` - append a string to the tool invocation command.

  - ``forward`` - forward this option unchanged.

  - ``output_suffix`` - modify the output suffix of this
    tool. Example: ``(switch_option "E", (output_suffix "i"))``.

  - ``stop_compilation`` - stop compilation after this phase.

  - ``unpack_values`` - used for splitting and forwarding
    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
    converted to ``-foo=bar -baz`` and appended to the tool invocation
    command.

  - ``help`` - help string associated with this option. Used for
    ``--help`` output.

  - ``required`` - this option is obligatory.
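
To make the ``unpack_values`` transformation concrete, here is a minimal C++
sketch of the splitting step. It is not LLVMC's implementation; the function
name ``UnpackValues`` is hypothetical, and skipping empty fields (for example
after a trailing comma) is an assumption.

```cpp
#include <string>
#include <vector>

// Drops the option name (everything up to the first comma) and splits
// the remainder on commas. Empty fields are skipped (an assumption).
std::vector<std::string> UnpackValues(const std::string& option) {
    std::vector<std::string> values;
    std::string::size_type start = option.find(',');
    if (start == std::string::npos) return values;  // nothing to unpack
    ++start;
    while (start <= option.size()) {
        std::string::size_type end = option.find(',', start);
        if (end == std::string::npos) end = option.size();
        if (end > start) values.push_back(option.substr(start, end - start));
        start = end + 1;
    }
    return values;
}
```

Applied to ``-Wa,-foo=bar,-baz``, the function returns the two values
``-foo=bar`` and ``-baz``, which would then be appended to the tool invocation
command.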


Hooks and environment variables
-------------------------------

Normally, LLVMC executes programs from the system ``PATH``. Sometimes,
this is not sufficient: for example, we may want to specify tool names
in the configuration file. This can be achieved via the mechanism of
hooks - to compile LLVMC with your hooks, just drop a .cpp file into
``tools/llvmc2`` directory. Hooks should live in the ``hooks``
namespace and have the signature ``std::string hooks::MyHookName
(void)``. They can be used from the ``cmd_line`` tool property::

    (cmd_line "$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)")

It is also possible to use environment variables in the same manner::

    (cmd_line "$ENV(VAR1)/path/to/file -o $ENV(VAR2)")

To change the command line string based on user-provided options use
the ``case`` expression (which we have already seen before)::

    (cmd_line
      (case
        (switch_on "E"),
           "llvm-g++ -E -x c $INFILE -o $OUTFILE",
        (default),
           "llvm-g++ -c -x c $INFILE -o $OUTFILE -emit-llvm"))
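
For the ``$CALL`` hooks described earlier in this section, a source file
dropped into ``tools/llvmc2`` might look like the sketch below. Only the
``hooks`` namespace and the ``std::string (void)`` signature come from this
manual; the hook bodies and the returned strings are placeholders chosen for
illustration.

```cpp
#include <string>

namespace hooks {

// Hypothetical hook used as "$CALL(MyHook)/path/to/file".
// The returned prefix is a placeholder, not a real LLVMC default.
std::string MyHook() {
    return "/usr/local";
}

// Hypothetical hook used as "-o $CALL(AnotherHook)".
// The returned file name is likewise a placeholder.
std::string AnotherHook() {
    return "out.bc";
}

}  // namespace hooks
```

With these definitions, the ``cmd_line`` string
``"$CALL(MyHook)/path/to/file -o $CALL(AnotherHook)"`` would presumably expand
to ``/usr/local/path/to/file -o out.bc``.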


Language map
------------

One last thing that you will need to modify when adding support for a
new language to LLVMC is the language map, which defines mappings from
file extensions to language names. It is used to choose the proper
toolchain(s) for a given input file set. Language map definition is
located in the file ``Tools.td`` and looks like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,
         ...
        ]>;
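
Conceptually, the language map is a lookup from file suffix to language name.
The C++ sketch below mirrors only the two entries shown above; the function
name ``LanguageForFile`` is hypothetical, and returning an empty string for an
unknown suffix is an assumption rather than documented LLVMC behaviour.

```cpp
#include <map>
#include <string>

// Looks up the language name for a file based on its suffix.
// Returns an empty string when the suffix is missing or unknown.
std::string LanguageForFile(const std::string& file_name) {
    static const std::map<std::string, std::string> suffix_to_lang = {
        {"cc", "c++"}, {"cp", "c++"}, {"cxx", "c++"}, {"cpp", "c++"},
        {"CPP", "c++"}, {"c++", "c++"}, {"C", "c++"},
        {"c", "c"},
    };
    const std::string::size_type dot = file_name.rfind('.');
    if (dot == std::string::npos) return std::string();
    const auto it = suffix_to_lang.find(file_name.substr(dot + 1));
    if (it == suffix_to_lang.end()) return std::string();
    return it->second;
}
```

Note that the lookup is case-sensitive, which is what lets ``hello.C`` map to
C++ while ``hello.c`` maps to C.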


References
==========

.. [1] TableGen Fundamentals
       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html