Tutorial - Writing LLVMC Configuration files
=============================================

LLVMC is a generic compiler driver, designed to be customizable and
extensible. It plays the same role for LLVM as the ``gcc`` program
does for GCC - LLVMC's job is essentially to transform a set of input
files into a set of targets depending on configuration rules and user
options. What makes LLVMC different is that these transformation rules
are completely customizable - in fact, LLVMC knows nothing about the
specifics of transformation (even the command-line options are mostly
not hard-coded) and regards the transformation structure as an
abstract graph. This makes it possible to adapt LLVMC for other
purposes - for example, as a build tool for game resources. This
tutorial describes the basic usage and configuration of LLVMC.

Because LLVMC employs TableGen [1]_ as its configuration language, you
need to be familiar with it to customize LLVMC.

Compiling with LLVMC
--------------------

In general, LLVMC tries to be command-line compatible with ``gcc`` as
much as possible, so most of the familiar options work::

    $ llvmc2 -O3 -Wall hello.cpp
    $ ./a.out
    hello

One nice feature of LLVMC is that you don't have to distinguish
between different compilers for different languages (think ``g++`` and
``gcc``) - the right toolchain is chosen automatically based on input
language names (which are, in turn, determined from file extensions). If
you want to force files ending with ".c" to compile as C++, use the
``-x`` option, just as you would with ``gcc``::

    $ # hello.c is really a C++ file
    $ llvmc2 -x c++ hello.c
    $ ./a.out
    hello

On the other hand, when using LLVMC as a linker to combine several C++
object files you should provide the ``--linker`` option since it's
impossible for LLVMC to choose the right linker in that case::

    $ llvmc2 -c hello.cpp
    $ llvmc2 hello.o
    [A lot of link-time errors skipped]
    $ llvmc2 --linker=c++ hello.o
    $ ./a.out
    hello

For further help on command-line LLVMC usage, refer to the ``llvmc2
--help`` output.

Customizing LLVMC: the compilation graph
----------------------------------------

At the time of writing, LLVMC does not support on-the-fly reloading of
configuration, so to customize LLVMC you'll have to edit and recompile
the source code (which lives under ``$LLVM_DIR/tools/llvmc2``). The
relevant files are ``Common.td``, ``Tools.td`` and ``Example.td``.

Internally, LLVMC stores information about possible transformations in
the form of a graph. Nodes in this graph represent tools, and edges
between two nodes represent a transformation path. A special "root"
node represents entry points for the transformations. LLVMC also
assigns a weight to each edge (more on that below) to choose between
several alternative edges.

The definition of the compilation graph (see file ``Example.td``) is
just a list of edges::

    def CompilationGraph : CompilationGraph<[
        Edge<root, llvm_gcc_c>,
        Edge<root, llvm_gcc_assembler>,
        ...

        Edge<llvm_gcc_c, llc>,
        Edge<llvm_gcc_cpp, llc>,
        ...

        OptionalEdge<llvm_gcc_c, opt, [(switch_on "opt")]>,
        OptionalEdge<llvm_gcc_cpp, opt, [(switch_on "opt")]>,
        ...

        OptionalEdge<llvm_gcc_assembler, llvm_gcc_cpp_linker,
                     [(if_input_languages_contain "c++"),
                      (or (parameter_equals "linker", "g++"),
                          (parameter_equals "linker", "c++"))]>,
        ...

        ]>;

As you can see, edges can be either default or optional. An optional
edge carries a list of patterns (or edge properties), which are used
to calculate the edge's weight. Default edges are assigned a weight of
1, and optional edges get a weight of 2*N, where N is the number of
successful edge property matches. When passing an input file through
the graph, LLVMC picks the edge with the maximum weight. To avoid
ambiguity, there should be only one default edge between two nodes
(with the exception of the root node, which gets special treatment -
there you are allowed to specify one default edge *per language*).
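
The weight rule can be modelled in a few lines of Python. This is only
an illustrative sketch, not LLVMC's implementation: the names
``edge_weight`` and ``pick_edge`` and the tuple layout are assumptions
made for this example, and since the text does not say how ties are
broken, the sketch simply keeps the first edge with the maximum weight.

```python
def edge_weight(is_default, matches):
    """Weight rule from the text: 1 for a default edge, 2*N for an optional one."""
    if is_default:
        return 1
    # N is the number of edge properties that matched.
    return 2 * sum(1 for m in matches if m)


def pick_edge(edges):
    """edges: list of (target, is_default, matches). Return the heaviest target."""
    return max(edges, key=lambda e: edge_weight(e[1], e[2]))[0]


# Hypothetical outgoing edges of llvm_gcc_c: a default edge to llc and an
# optional edge to opt guarded by a single (switch_on "opt") property.
without_opt = [("llc", True, []), ("opt", False, [False])]
with_opt = [("llc", True, []), ("opt", False, [True])]

print(pick_edge(without_opt))  # llc
print(pick_edge(with_opt))     # opt
```

When the ``opt`` switch is not given, the optional edge weighs 0 and
loses to the default edge; once its single property matches, it weighs
2 and wins.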

* Possible edge properties are:

  - ``switch_on`` - Returns true if a given command-line option is
    provided by the user. Example: ``(switch_on "opt")``. Note that
    you have to define all possible command-line options separately in
    the tool descriptions. See the next section for the discussion of
    different kinds of command-line options.

  - ``parameter_equals`` - Returns true if a command-line parameter equals
    a given value. Example: ``(parameter_equals "W", "all")``.

  - ``element_in_list`` - Returns true if a command-line parameter list
    includes a given value. Example: ``(element_in_list "l", "pthread")``.

  - ``if_input_languages_contain`` - Returns true if a given input
    language belongs to the current input language set. Example:
    ``(if_input_languages_contain "c++")``.

  - ``and`` - Edge property combinator. Returns true if all of its
    arguments return true. Used like this: ``(and (prop1), (prop2),
    ... (propN))``. Nesting is allowed, but not encouraged.

  - ``or`` - Edge property combinator that returns true if any one of its
    arguments returns true. Example: ``(or (prop1), (prop2), ... (propN))``.

  - ``weight`` - Makes it possible to explicitly specify the quantity
    added to the edge weight if this edge property matches. Used like
    this: ``(weight N, (prop))``. The inner property can include
    ``and`` and ``or`` combinators. When N is equal to 2, this is
    equivalent to ``(prop)``.

    Example: ``(weight 8, (and (switch_on "a"), (switch_on "b")))``.
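
To make the combinators concrete, the following Python sketch evaluates
edge properties written as nested tuples. It is a toy model under
stated assumptions, not LLVMC code: the tuple encoding and the
functions ``holds`` and ``contribution`` are invented for this example,
and only ``switch_on``, ``parameter_equals``, ``and``, ``or`` and
``weight`` are modelled.

```python
def holds(prop, switches, params):
    """Evaluate a boolean edge property written as a nested tuple."""
    kind, *args = prop
    if kind == "switch_on":
        return args[0] in switches
    if kind == "parameter_equals":
        return params.get(args[0]) == args[1]
    if kind == "and":
        return all(holds(p, switches, params) for p in args)
    if kind == "or":
        return any(holds(p, switches, params) for p in args)
    raise ValueError("unknown property: " + kind)


def contribution(prop, switches, params):
    """Amount one top-level property adds to the edge weight."""
    if prop[0] == "weight":
        # (weight N, (prop)) adds N instead of the usual 2.
        _, amount, inner = prop
        return amount if holds(inner, switches, params) else 0
    return 2 if holds(prop, switches, params) else 0


heavy = ("weight", 8, ("and", ("switch_on", "a"), ("switch_on", "b")))
linker = ("or", ("parameter_equals", "linker", "g++"),
                ("parameter_equals", "linker", "c++"))

print(contribution(heavy, {"a", "b"}, {}))          # 8
print(contribution(heavy, {"a"}, {}))               # 0
print(contribution(linker, set(), {"linker": "c++"}))  # 2
```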

To get a visual representation of the compilation graph (useful for
debugging), run ``llvmc2 --view-graph``. You will need ``dot`` and
``gsview`` installed for this to work properly.

Writing a tool description
--------------------------

As was said earlier, nodes in the compilation graph represent tools. A
tool definition looks like this (taken from the ``Tools.td`` file)::

    def llvm_gcc_cpp : Tool<[
        (in_language "c++"),
        (out_language "llvm-assembler"),
        (output_suffix "bc"),
        (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
        (sink)
    ]>;

This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
``llvm-g++``. As you can see, a tool definition is just a list of
properties; most of them should be self-evident. The ``sink`` property
means that this tool should be passed all command-line options that
aren't handled by the other tools.

The complete list of the currently implemented tool properties follows:

* Possible tool properties:

  - ``in_language`` - input language name.

  - ``out_language`` - output language name.

  - ``output_suffix`` - output file suffix.

  - ``cmd_line`` - the actual command used to run the tool. You can use
    ``$INFILE`` and ``$OUTFILE`` variables, as well as output
    redirection with ``>``.

  - ``join`` - this tool is a "join node" in the graph, i.e. it gets a
    list of input files and joins them together. Used for linkers.

  - ``sink`` - all command-line options that are not handled by other
    tools are passed to this tool.
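
As a rough illustration of what the ``cmd_line`` variables stand for,
the Python sketch below expands ``$INFILE`` and ``$OUTFILE`` in a
command template. It is not how LLVMC builds its commands: the helper
``build_command`` and the file names are hypothetical, and output
redirection with ``>`` and file names containing spaces are not
handled.

```python
import shlex
from string import Template


def build_command(cmd_line, infile, outfile):
    """Expand $INFILE and $OUTFILE, then split the result into arguments."""
    expanded = Template(cmd_line).substitute(INFILE=infile, OUTFILE=outfile)
    return shlex.split(expanded)


# The cmd_line of llvm_gcc_cpp, with made-up input and output names.
cmd = build_command("llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm",
                    "hello.cpp", "hello.bc")
print(cmd)
# ['llvm-g++', '-c', 'hello.cpp', '-o', 'hello.bc', '-emit-llvm']
```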

The next tool definition is slightly more complex::

    def llvm_gcc_linker : Tool<[
        (in_language "object-code"),
        (out_language "executable"),
        (output_suffix "out"),
        (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
        (join),
        (prefix_list_option "L", (forward), (help "add a directory to link path")),
        (prefix_list_option "l", (forward), (help "search a library when linking")),
        (prefix_list_option "Wl", (unpack_values), (help "pass options to linker"))
    ]>;

This tool has a "join" property, which means that it behaves like a
linker (and should therefore be the last tool in the toolchain). It
also defines several command-line options: ``-l``, ``-L`` and ``-Wl``,
which have their usual meaning. An option has two attributes: a name
and a (possibly empty) list of properties. All currently implemented
option types and properties are described below:

* Possible option types:

  - ``switch_option`` - a simple boolean switch, for example ``-time``.

  - ``parameter_option`` - option that takes an argument, for example
    ``-std=c99``.

  - ``parameter_list_option`` - same as the above, but more than one
    occurrence of the option is allowed.

  - ``prefix_option`` - same as ``parameter_option``, but the option name
    and parameter value are not separated.

  - ``prefix_list_option`` - same as the above, but more than one
    occurrence of the option is allowed; example: ``-lm -lpthread``.

* Possible option properties:

  - ``append_cmd`` - append a string to the tool invocation command.

  - ``forward`` - forward this option unchanged.

  - ``stop_compilation`` - stop compilation after this phase.

  - ``unpack_values`` - used for splitting and forwarding
    comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
    converted to ``-foo=bar -baz`` and appended to the tool invocation
    command.

  - ``help`` - help string associated with this option.

  - ``required`` - this option is obligatory.
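
The conversion performed by ``unpack_values`` can be pictured with a
short Python function. This is a sketch of the transformation described
above, not the code LLVMC generates; the function name and its error
handling are assumptions made for the example.

```python
def unpack_values(option_name, argument):
    """Split e.g. '-Wa,-foo=bar,-baz' into ['-foo=bar', '-baz']."""
    prefix = "-" + option_name + ","
    if not argument.startswith(prefix):
        raise ValueError("not a -%s option: %r" % (option_name, argument))
    # Drop the option name, then split the rest on commas.
    return [value for value in argument[len(prefix):].split(",") if value]


print(" ".join(unpack_values("Wa", "-Wa,-foo=bar,-baz")))  # -foo=bar -baz
```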

Language map
------------

One last thing that you need to modify when adding support for a new
language to LLVMC is the language map, which defines mappings from
file extensions to language names. It is used to choose the proper
toolchain based on the input. The language map definition is located in
the file ``Tools.td`` and looks like this::

    def LanguageMap : LanguageMap<
        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
         LangToSuffixes<"c", ["c"]>,
         ...
        ]>;
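
The lookup that such a map enables can be sketched in Python as
follows. Only the two entries shown above are included (the elided ones
are left out), and the dictionary layout and the ``language_of`` helper
are assumptions for illustration rather than LLVMC internals.

```python
import os

# Two entries copied from the LanguageMap above.
LANG_TO_SUFFIXES = {
    "c++": ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"],
    "c": ["c"],
}

# Invert the map: file suffix -> language name.
SUFFIX_TO_LANG = {
    suffix: lang
    for lang, suffixes in LANG_TO_SUFFIXES.items()
    for suffix in suffixes
}


def language_of(filename):
    """Return the language name for a file, or None if the suffix is unknown."""
    suffix = os.path.splitext(filename)[1].lstrip(".")
    return SUFFIX_TO_LANG.get(suffix)


print(language_of("hello.cpp"))  # c++
print(language_of("hello.c"))    # c
print(language_of("hello.C"))    # c++
```

Note that the lookup is case-sensitive: ``.c`` maps to C while ``.C``
maps to C++, just as in the map itself.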

References
==========

.. [1] TableGen Fundamentals
       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html