<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>The LLVM Compiler Driver (llvmc)</title>
  <link rel="stylesheet" href="llvm.css" type="text/css">
  <meta name="author" content="Reid Spencer">
  <meta name="description"
  content="A description of the use and design of the LLVM Compiler Driver.">
</head>
<body>
<div class="doc_title">The LLVM Compiler Driver (llvmc)</div>
<p class="doc_warning">NOTE: This document is a work in progress!</p>
<ol>
  <li><a href="#abstract">Abstract</a></li>
  <li><a href="#introduction">Introduction</a>
  <ol>
    <li><a href="#purpose">Purpose</a></li>
    <li><a href="#operation">Operation</a></li>
    <li><a href="#phases">Phases</a></li>
    <li><a href="#actions">Actions</a></li>
  </ol>
  </li>
  <li><a href="#configuration">Configuration</a>
  <ol>
    <li><a href="#overview">Overview</a></li>
    <li><a href="#filetypes">Configuration Files</a></li>
    <li><a href="#syntax">Syntax</a></li>
    <li><a href="#substitutions">Substitutions</a></li>
    <li><a href="#sample">Sample Config File</a></li>
  </ol>
  </li>
  <li><a href="#glossary">Glossary</a></li>
</ol>
<div class="doc_author">
<p>Written by <a href="mailto:rspencer@x10sys.com">Reid Spencer</a>
</p>
</div>

<!-- *********************************************************************** -->
<div class="doc_section"> <a name="abstract">Abstract</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
  <p>This document describes the requirements, design, and configuration of the
  LLVM compiler driver, <tt>llvmc</tt>. The compiler driver knows about LLVM's
  tool set and can be configured to know about a variety of compilers for
  source languages. It uses this knowledge to execute the tools necessary
  to accomplish general compilation, optimization, and linking tasks. The main
  purpose of <tt>llvmc</tt> is to provide a simple and consistent interface to
  all compilation tasks. This reduces the burden on the end user who can just
  learn to use <tt>llvmc</tt> instead of the entire LLVM tool set and all the
  source language compilers compatible with LLVM.</p>
</div>
<!-- *********************************************************************** -->
<div class="doc_section"> <a name="introduction">Introduction</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
  <p>The <tt>llvmc</tt> <a href="#def_tool">tool</a> is a configurable compiler
  <a href="#def_driver">driver</a>. As such, it isn't a compiler, optimizer,
  or linker itself, but it drives (invokes) other software that performs those
  tasks. If you are familiar with the GNU Compiler Collection's <tt>gcc</tt>
  tool, <tt>llvmc</tt> is very similar.</p>
  <p>The following introductory sections will help you understand why this tool
  is necessary and what it does.</p>
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="purpose">Purpose</a></div>
<div class="doc_text">
  <p><tt>llvmc</tt> was invented to make compilation of user programs with
  LLVM-based tools easier. To accomplish this, <tt>llvmc</tt> strives to:</p>
  <ul>
    <li>Be the single point of access to most of the LLVM tool set.</li>
    <li>Hide the complexities of the LLVM tools through a single interface.</li>
    <li>Provide a consistent interface for compiling all languages.</li>
  </ul>
  <p>Additionally, <tt>llvmc</tt> makes it easier to write a compiler for use
  with LLVM, because it:</p>
  <ul>
    <li>Makes integration of existing non-LLVM tools simple.</li>
    <li>Extends the capabilities of minimal compiler tools by optimizing their
    output.</li>
    <li>Reduces the number of interfaces a compiler writer must know about
    before a working compiler can be completed (essentially only the VMCore
    interfaces need to be understood).</li>
    <li>Supports source language translator invocation via both dynamically
    loadable shared objects and invocation of an executable.</li>
  </ul>
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="operation">Operation</a></div>
<div class="doc_text">
  <p>At a high level, <tt>llvmc</tt> operation is very simple. The basic action
  taken by <tt>llvmc</tt> is to simply invoke some tool or set of tools to fill
  the user's request for compilation. Every execution of <tt>llvmc</tt> takes the
  following sequence of steps:</p>
  <dl>
    <dt><b>Collect Command Line Options</b></dt>
    <dd>The command line options provide the marching orders to <tt>llvmc</tt>
    on what actions it should perform. This is the request the user is making
    of <tt>llvmc</tt> and it is interpreted first. See the <tt>llvmc</tt>
    <a href="CommandGuide/html/llvmc.html">manual page</a> for details on the
    options.</dd>
    <dt><b>Read Configuration Files</b></dt>
    <dd>Based on the options and the suffixes of the filenames presented, a set
    of configuration files is read to configure the actions <tt>llvmc</tt> will
    take. Configuration files are provided by either LLVM or the
    compiler tools that <tt>llvmc</tt> invokes. These files determine what
    actions <tt>llvmc</tt> will take in response to the user's request. See
    the section on <a href="#configuration">configuration</a> for more details.
    </dd>
    <dt><b>Determine Phases To Execute</b></dt>
    <dd>Based on the command line options and configuration files,
    <tt>llvmc</tt> determines the compilation <a href="#phases">phases</a> that
    must be executed to satisfy the user's request. This is the primary work of
    <tt>llvmc</tt>.</dd>
    <dt><b>Determine Actions To Execute</b></dt>
    <dd>Each <a href="#phases">phase</a> to be executed can result in the
    invocation of one or more <a href="#actions">actions</a>. An action is
    either a whole program or a function in a dynamically linked shared library.
    In this step, <tt>llvmc</tt> determines the sequence of actions that must be
    executed. Actions will always be executed in a deterministic order.</dd>
    <dt><b>Execute Actions</b></dt>
    <dd>The <a href="#actions">actions</a> necessary to support the user's
    original request are executed sequentially and deterministically. All
    actions result in either the invocation of a whole program to perform the
    action or the loading of a dynamically linkable shared library and invocation
    of a standard interface function within that library.</dd>
    <dt><b>Termination</b></dt>
    <dd>If any action fails (returns a non-zero result code), <tt>llvmc</tt>
    also fails and returns the result code from the failing action. If
    everything succeeds, <tt>llvmc</tt> will return a zero result code.</dd>
  </dl>
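  <p>Because the result code of a failing action is passed through, a calling
  script or makefile can test <tt>llvmc</tt>'s exit status like that of any
  other tool. A minimal shell sketch, in which the file names are
  hypothetical:</p>
  <pre><tt>
  llvmc -O2 x.c -o x || echo "llvmc failed with status $?"</tt></pre>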
  <p><tt>llvmc</tt>'s operation must be simple, regular and predictable.
  Developers need to be able to rely on it to take a consistent approach to
  compilation. For example, the invocation:</p>
  <pre><tt>
  llvmc -O2 x.c y.c z.c -o xyz</tt></pre>
  <p>must produce <i>exactly</i> the same results as:</p>
  <pre><tt>
  llvmc -O2 x.c -o x.o
  llvmc -O2 y.c -o y.o
  llvmc -O2 z.c -o z.o
  llvmc -O2 x.o y.o z.o -o xyz</tt></pre>
  <p>To accomplish this, <tt>llvmc</tt> uses a very simple goal-oriented
  procedure to do its work. The overall goal is to produce a functioning
  executable. To that end, <tt>llvmc</tt> always attempts to execute a
  series of compilation <a href="#def_phase">phases</a> in the same sequence.
  However, the user's options to <tt>llvmc</tt> can cause the sequence of phases
  to start in the middle or finish early.</p>
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="phases"></a>Phases </div>
<div class="doc_text">
  <p><tt>llvmc</tt> breaks every compilation task into the following five
  distinct phases:</p>
  <dl><dt><b>Preprocessing</b></dt><dd>Not all languages support preprocessing,
    but for those that do, this phase can be invoked. This phase is for
    languages that combine, filter, or otherwise alter the source language
    input before the translator parses it. Although C and C++
    are the most common users of this phase, other languages may provide their
    own preprocessor (whether it's the C pre-processor or not).</dd>
  </dl>
  <dl><dt><b>Translation</b></dt><dd>The translation phase converts the source
    language input into something that LLVM can interpret and use for
    downstream phases. The translation is essentially from "non-LLVM form" to
    "LLVM form".</dd>
  </dl>
  <dl><dt><b>Optimization</b></dt><dd>Once an LLVM Module has been obtained from
    the translation phase, the program enters the optimization phase. This phase
    attempts to optimize all of the input provided on the command line according
    to the options provided.</dd>
  </dl>
  <dl><dt><b>Linking</b></dt><dd>The inputs are combined to form a complete
    program.</dd>
  </dl>
  <p>The following table shows the inputs, outputs, and command line options
  applicable to each phase.</p>
  <table>
    <tr>
      <th style="width: 10%">Phase</th>
      <th style="width: 25%">Inputs</th>
      <th style="width: 25%">Outputs</th>
      <th style="width: 40%">Options</th>
    </tr>
    <tr><td><b>Preprocessing</b></td>
      <td class="td_left"><ul><li>Source Language File</li></ul></td>
      <td class="td_left"><ul><li>Source Language File</li></ul></td>
      <td class="td_left"><dl>
          <dt><tt>-E</tt></dt>
          <dd>Stops the compilation after preprocessing</dd>
      </dl></td>
    </tr>
    <tr>
      <td><b>Translation</b></td>
      <td class="td_left"><ul>
          <li>Source Language File</li>
      </ul></td>
      <td class="td_left"><ul>
          <li>LLVM Assembly</li>
          <li>LLVM Bitcode</li>
          <li>LLVM C++ IR</li>
      </ul></td>
      <td class="td_left"><dl>
          <dt><tt>-c</tt></dt>
          <dd>Stops the compilation after translation so that optimization and
          linking are not done.</dd>
          <dt><tt>-S</tt></dt>
          <dd>Stops the compilation before object code is written so that only
          assembly code remains.</dd>
      </dl></td>
    </tr>
    <tr>
      <td><b>Optimization</b></td>
      <td class="td_left"><ul>
          <li>LLVM Assembly</li>
          <li>LLVM Bitcode</li>
      </ul></td>
      <td class="td_left"><ul>
          <li>LLVM Bitcode</li>
      </ul></td>
      <td class="td_left"><dl>
          <dt><tt>-Ox</tt></dt>
          <dd>This group of options controls the amount of optimization
          performed.</dd>
      </dl></td>
    </tr>
    <tr>
      <td><b>Linking</b></td>
      <td class="td_left"><ul>
          <li>LLVM Bitcode</li>
          <li>Native Object Code</li>
          <li>LLVM Library</li>
          <li>Native Library</li>
      </ul></td>
      <td class="td_left"><ul>
          <li>LLVM Bitcode Executable</li>
          <li>Native Executable</li>
      </ul></td>
      <td class="td_left"><dl>
          <dt><tt>-L</tt></dt><dd>Specifies a path for library search.</dd>
          <dt><tt>-l</tt></dt><dd>Specifies a library to link in.</dd>
      </dl></td>
    </tr>
  </table>
</div>
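<div class="doc_text">
  <p>As a rough illustration of the options in the table, each of the
  following invocations stops at, or starts from, a different phase. The file
  names, the output names and the library <tt>m</tt> are hypothetical, the
  attached form of the <tt>-L</tt> and <tt>-l</tt> arguments is assumed to
  follow the <tt>gcc</tt> convention, and what each phase actually produces
  depends on the configuration in use:</p>
  <pre><tt>
  llvmc -E x.c -o x.i                    # stop after preprocessing
  llvmc -c x.c -o x.o                    # stop after translation
  llvmc -S x.c -o x.s                    # stop before object code is written
  llvmc -O2 x.c -o x                     # run every phase, optimizing at level 2
  llvmc x.o -L/usr/local/lib -lm -o x    # link only, searching /usr/local/lib</tt></pre>
</div>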

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="actions"></a>Actions</div>
<div class="doc_text">
  <p>An action, with regard to <tt>llvmc</tt>, is a basic operation that it takes
  in order to fulfill the user's request. Each phase of compilation will invoke
  zero or more actions in order to accomplish that phase.</p>
  <p>Actions come in two forms:</p>
  <ul>
    <li>Invokable Executables</li>
    <li>Functions in a shared library</li>
  </ul>
</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="configuration">Configuration</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
  <p>This section of the document describes the configuration files used by
  <tt>llvmc</tt>. Configuration information is relatively static for a
  given release of LLVM and a compiler tool. However, the details may
  change from release to release of either. Users are encouraged to simply use
  the various options of the <tt>llvmc</tt> command and ignore the configuration
  of the tool. These configuration files are for compiler writers and LLVM
  developers. Those wishing to simply use <tt>llvmc</tt> don't need to understand
  this section but it may be instructive on how the tool works.</p>
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="overview"></a>Overview</div>
<div class="doc_text">
<p><tt>llvmc</tt> is highly configurable both on the command line and in
configuration files. The options it understands are generic, consistent and
simple by design. Furthermore, the <tt>llvmc</tt> options apply to the
compilation of any LLVM enabled programming language. To be enabled as a
supported source language compiler, a compiler writer must provide a
configuration file that tells <tt>llvmc</tt> how to invoke the compiler
and what its capabilities are. The purpose of the configuration files then
is to allow compiler writers to specify to <tt>llvmc</tt> how the compiler
should be invoked. Users may alter the compiler's <tt>llvmc</tt>
configuration, but doing so is not advised.</p>

<p>Because <tt>llvmc</tt> just invokes other programs, it must deal with the
available command line options for those programs regardless of whether they
were written for LLVM or not. Furthermore, not all compiler tools will
have the same capabilities. Some compiler tools will simply generate LLVM assembly
code; others will be able to generate fully optimized bitcode. In general,
<tt>llvmc</tt> doesn't make any assumptions about the capabilities or command
line options of a sub-tool. It simply uses the details found in the
configuration files and leaves it to the compiler writer to specify the
configuration correctly.</p>

<p>This approach means that new compiler tools can be up and working very
quickly. As a first cut, a tool can simply compile its source to raw
(unoptimized) bitcode or LLVM assembly and <tt>llvmc</tt> can be configured
to pick up the slack (translate LLVM assembly to bitcode, optimize the
bitcode, generate native assembly, link, etc.). In fact, a compiler tool
need not use any LLVM libraries, and it could be written in any language
(not just C++). The configuration data will allow the full range of
optimization, assembly, and linking capabilities that LLVM provides to be added
to these kinds of tools. Enabling the rapid development of front-ends is one
of the primary goals of <tt>llvmc</tt>.</p>
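<p>A minimal sketch of such a first-cut configuration might look like the
following. The language name, the program name <tt>mylang</tt> and its
<tt>-o</tt> option are hypothetical; only the item names and the
<tt>%in%</tt> and <tt>%out%</tt> substitutions are taken from this
document:</p>
<pre><tt>
# Hypothetical front-end that only emits unoptimized LLVM assembly.
lang.name=MyLang
translator.command=mylang %in% -o %out%
translator.output=assembly</tt></pre>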

<p>As a compiler tool matures, it may utilize the LLVM libraries and tools
to more efficiently produce optimized bitcode directly in a single compilation
and optimization program. In these cases, multiple tools would not be needed
and the configuration data for the compiler would change.</p>

<p>Configuring <tt>llvmc</tt> to the needs and capabilities of a source language
compiler is relatively straightforward. A compiler writer must provide a
definition of what to do for each of the five compilation phases for each of
the optimization levels. The specification consists simply of prototypical
command lines into which <tt>llvmc</tt> can substitute command line
arguments and file names. Note that any given phase can be completely blank if
the source language's compiler combines multiple phases into a single program.
For example, quite often pre-processing, translation, and optimization are
combined into a single program. The specification for such a compiler would have
blank entries for pre-processing and translation but a full command line for
optimization.</p>
</div>
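<div class="doc_text">
<p>The combined case could be sketched roughly as follows. The language name,
the program name <tt>mycc</tt> and its <tt>-o</tt> option are hypothetical.
The pre-processing and translation commands are simply left out, on the
assumption that a missing command item keeps its blank default:</p>
<pre><tt>
# Hypothetical compiler that preprocesses, translates and optimizes
# in a single program, so only the optimizer command is given.
lang.name=MyLang
optimizer.command=mycc %in% -o %out%
optimizer.preprocesses=true
optimizer.translates=true
optimizer.output=bitcode</tt></pre>
</div>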

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="filetypes">Configuration Files</a></div>
<div class="doc_subsubsection"><a name="filecontents">File Contents</a></div>
<div class="doc_text">
  <p>Each configuration file provides the details for a single source language
  that is to be compiled. This configuration information tells <tt>llvmc</tt>
  how to invoke the language's pre-processor, translator, optimizer, assembler
  and linker. Note that a given source language needn't provide all these tools
  as many of them already exist in LLVM.</p>
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsubsection"><a name="dirsearch">Directory Search</a></div>
<div class="doc_text">
  <p><tt>llvmc</tt> always looks for files of a specific name. It uses the
  first file with the name it's looking for by searching directories in the
  following order:</p>
  <ol>
    <li>Any directory specified by the <tt>-config-dir</tt> option will be
    checked first.</li>
    <li>If the environment variable <tt>LLVM_CONFIG_DIR</tt> is set, and it
    contains the name of a valid directory, that directory will be searched
    next.</li>
    <li>If the user's home directory (typically <tt>/home/user</tt>) contains
    a sub-directory named <tt>.llvm</tt> and that directory contains a
    sub-directory named <tt>etc</tt> then that directory will be tried
    next.</li>
    <li>If the LLVM installation directory (typically
    <tt>/usr/local/llvm</tt>) contains a sub-directory named <tt>etc</tt> then
    that directory will be tried next.</li>
    <li>A standard "system" directory will be searched last. This is typically
    <tt>/etc/llvm</tt> on UNIX&trade; and <tt>C:\WINNT</tt> on Microsoft
    Windows&trade;.</li>
    <li>If the configuration file sought still can't be found, <tt>llvmc</tt>
    will print an error message and exit.</li>
  </ol>
  <p>The first file found in this search will be used. Other files with the
  same name will be ignored even if they exist in one of the subsequent search
  locations.</p>
</div>
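<div class="doc_text">
  <p>To make the search order concrete, a lookup of the configuration file
  named <tt>c</tt> on a UNIX system would probe roughly the following paths,
  stopping at the first one that exists. <tt>DIR</tt> is a placeholder for the
  argument of <tt>-config-dir</tt>, and the remaining paths are only the
  typical locations named in the list; actual paths depend on the
  installation:</p>
  <pre><tt>
  DIR/c                     # directory given with -config-dir
  $LLVM_CONFIG_DIR/c        # directory named by the environment variable
  /home/user/.llvm/etc/c    # per-user configuration
  /usr/local/llvm/etc/c     # LLVM installation directory
  /etc/llvm/c               # standard system directory</tt></pre>
</div>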

<div class="doc_subsubsection"><a name="filenames">File Names</a></div>
<div class="doc_text">
  <p>In the directories searched, each configuration file is given a specific
  name to foster faster lookup (so <tt>llvmc</tt> doesn't have to do directory
  searches). The name of a given language specific configuration file is simply
  the same as the suffix used to identify files containing source in that
  language. For example, a configuration file for C++ source might be named
  <tt>cpp</tt>, <tt>C</tt>, or <tt>cxx</tt>. For languages that support multiple
  file suffixes, multiple (probably identical) files (or symbolic links) will
  need to be provided.</p>
</div>
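<div class="doc_text">
  <p>For C++, one possible arrangement is a single real file plus symbolic
  links for the other suffixes. The listing below is only a sketch of how a
  configuration directory such as <tt>/usr/local/llvm/etc</tt> might then
  look; which suffixes to cover is up to the compiler writer:</p>
  <pre><tt>
  cpp           # the actual configuration file
  cxx -&gt; cpp    # symbolic link
  C -&gt; cpp      # symbolic link</tt></pre>
</div>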

<div class="doc_subsubsection"><a name="whatgetsread">What Gets Read</a></div>
<div class="doc_text">
  <p>Which configuration files are read depends on the command line options and
  the suffixes of the file names provided on <tt>llvmc</tt>'s command line. Note
  that the <tt>-x LANGUAGE</tt> option alters the language that <tt>llvmc</tt>
  uses for the subsequent files on the command line. Only the configuration
  files actually needed to complete <tt>llvmc</tt>'s task are read. Other
  language specific files will be ignored.</p>
</div>
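<div class="doc_text">
  <p>As an illustration with hypothetical file names, and assuming that
  suffixes map directly to configuration file names as described under
  <a href="#filenames">File Names</a>, the following invocation would read
  only the configuration files named <tt>c</tt> and <tt>cpp</tt>, even if
  files for other languages are installed:</p>
  <pre><tt>
  llvmc x.c y.cpp -o xy</tt></pre>
</div>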

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="syntax"></a>Syntax</div>
<div class="doc_text">
  <p>The syntax of the configuration files is very simple and somewhat
  compatible with Java's property files. Here are the syntax rules:</p>
  <ul>
    <li>The file encoding is ASCII.</li>
    <li>The file is line oriented. There should be one configuration definition
    per line. Lines are terminated by the newline (0x0A) and/or carriage return
    (0x0D) characters.</li>
    <li>A backslash (<tt>\</tt>) before a newline causes the newline to be
    ignored. This is useful for line continuation of long definitions. A
    backslash anywhere else is recognized as a backslash.</li>
    <li>A configuration item consists of a name, an <tt>=</tt> and a value.</li>
    <li>A name consists of a sequence of identifiers separated by periods.</li>
    <li>An identifier consists of specific keywords made up of only lower case
    and upper case letters (e.g. <tt>lang</tt> and <tt>name</tt> in
    <tt>lang.name</tt>).</li>
    <li>Values come in four flavors: booleans, integers, commands and
    strings.</li>
    <li>Valid "false" boolean values are <tt>false False FALSE no No NO
    off Off</tt> and <tt>OFF</tt>.</li>
    <li>Valid "true" boolean values are <tt>true True TRUE yes Yes YES
    on On</tt> and <tt>ON</tt>.</li>
    <li>Integers are simply sequences of digits.</li>
    <li>Commands start with a program name and are followed by a sequence of
    words that are passed to that program as command line arguments. Program
    arguments that begin and end with the <tt>%</tt> sign will have their value
    substituted. Program names beginning with <tt>/</tt> are considered to be
    absolute. Otherwise the <tt>PATH</tt> will be applied to find the program to
    execute.</li>
    <li>Strings are composed of multiple sequences of characters from the
    character class <tt>[-A-Za-z0-9_:%+/\\|,]</tt> separated by white
    space.</li>
    <li>White space on a line is folded. Multiple blanks or tabs will be
    reduced to a single blank.</li>
    <li>White space before the configuration item's name is ignored.</li>
    <li>White space on either side of the <tt>=</tt> is ignored.</li>
    <li>White space in a string value is used to separate the individual
    components of the string value but otherwise ignored.</li>
    <li>Comments are introduced by the <tt>#</tt> character. Everything after a
    <tt>#</tt> and before the end of line is ignored.</li>
  </ul>
</div>
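<div class="doc_text">
  <p>The fragment below is a small sketch that exercises several of these
  rules at once: comments, white space around <tt>=</tt>, line continuation
  and the alternative boolean spellings. The language name is hypothetical and
  the values are chosen only for illustration:</p>
  <pre><tt>
  # Comments run from the '#' to the end of the line.
  lang.name = MyLang                 # white space around '=' is ignored
  lang.opt1 = -simplifycfg \
              -instcombine -mem2reg  # the backslash continues the definition
  preprocessor.required = No         # same as false, FALSE, off, ...
  translator.preprocesses = ON       # same as true, Yes, on, ...</tt></pre>
</div>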

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="items">Configuration Items</a></div>
<div class="doc_text">
  <p>The table below provides definitions of the allowed configuration items
  that may appear in a configuration file. Every item has a default value and
  does not need to appear in the configuration file. Missing items will have the
  default value. Each identifier may appear as all lower case, first letter
  capitalized or all upper case.</p>
  <table>
    <tbody>
      <tr>
        <th>Name</th>
        <th>Value Type</th>
        <th>Description</th>
        <th>Default</th>
      </tr>
      <tr><td colspan="4"><h4>LLVMC ITEMS</h4></td></tr>
      <tr>
        <td><b>version</b></td>
        <td>string</td>
        <td class="td_left">Provides the version string for the contents of this
          configuration file. What is accepted as a legal configuration file
          will change over time and this item tells <tt>llvmc</tt> which version
          should be expected.</td>
        <td><i>b</i></td>
      </tr>
      <tr><td colspan="4"><h4>LANG ITEMS</h4></td></tr>
      <tr>
        <td><b>lang.name</b></td>
        <td>string</td>
        <td class="td_left">Provides the common name for a language definition.
          For example "C++", "Pascal", "FORTRAN", etc.</td>
        <td><i>blank</i></td>
      </tr>
      <tr>
        <td><b>lang.opt1</b></td>
        <td>string</td>
        <td class="td_left">Specifies the parameters to give the optimizer when
          <tt>-O1</tt> is specified on the <tt>llvmc</tt> command line.</td>
        <td><tt>-simplifycfg -instcombine -mem2reg</tt></td>
      </tr>
      <tr>
        <td><b>lang.opt2</b></td>
        <td>string</td>
        <td class="td_left">Specifies the parameters to give the optimizer when
          <tt>-O2</tt> is specified on the <tt>llvmc</tt> command line.</td>
        <td><i>TBD</i></td>
      </tr>
      <tr>
        <td><b>lang.opt3</b></td>
        <td>string</td>
        <td class="td_left">Specifies the parameters to give the optimizer when
          <tt>-O3</tt> is specified on the <tt>llvmc</tt> command line.</td>
        <td><i>TBD</i></td>
      </tr>
      <tr>
        <td><b>lang.opt4</b></td>
        <td>string</td>
        <td class="td_left">Specifies the parameters to give the optimizer when
          <tt>-O4</tt> is specified on the <tt>llvmc</tt> command line.</td>
        <td><i>TBD</i></td>
      </tr>
      <tr>
        <td><b>lang.opt5</b></td>
        <td>string</td>
        <td class="td_left">Specifies the parameters to give the optimizer when
          <tt>-O5</tt> is specified on the <tt>llvmc</tt> command line.</td>
        <td><i>TBD</i></td>
      </tr>
      <tr><td colspan="4"><h4>PREPROCESSOR ITEMS</h4></td></tr>
      <tr>
        <td><b>preprocessor.command</b></td>
        <td>command</td>
        <td class="td_left">This provides the command prototype that will be used
          to run the preprocessor. This is generally only used with the
          <tt>-E</tt> option.</td>
        <td>&lt;blank&gt;</td>
      </tr>
      <tr>
        <td><b>preprocessor.required</b></td>
        <td>boolean</td>
        <td class="td_left">This item specifies whether the pre-processing phase
          is required by the language. If the value is true, then the
          <tt>preprocessor.command</tt> value must not be blank. With this option,
          <tt>llvmc</tt> will always run the preprocessor as it assumes that the
          translation and optimization phases don't know how to pre-process their
          input.</td>
        <td>false</td>
      </tr>
      <tr><td colspan="4"><h4>TRANSLATOR ITEMS</h4></td></tr>
      <tr>
        <td><b>translator.command</b></td>
        <td>command</td>
        <td class="td_left">This provides the command prototype that will be used
          to run the translator. Valid substitutions are <tt>%in%</tt> for the
          input file and <tt>%out%</tt> for the output file.</td>
        <td>&lt;blank&gt;</td>
      </tr>
      <tr>
        <td><b>translator.output</b></td>
        <td><tt>bitcode</tt> or <tt>assembly</tt></td>
        <td class="td_left">This item specifies the kind of output the language's
          translator generates.</td>
        <td><tt>bitcode</tt></td>
      </tr>
      <tr>
        <td><b>translator.preprocesses</b></td>
        <td>boolean</td>
        <td class="td_left">Indicates that the translator also preprocesses. If
          this is true, then <tt>llvmc</tt> will skip the pre-processing phase
          whenever the final phase is not pre-processing.</td>
        <td><tt>false</tt></td>
      </tr>
    <tr><td colspan="4"><h4>OPTIMIZER ITEMS</h4></td></tr>
    <tr>
      <td><b>optimizer.command</b></td>
      <td>command</td>
      <td class="td_left">This provides the command prototype that will be used
        to run the optimizer. Valid substitutions are <tt>%in%</tt> for the
        input file and <tt>%out%</tt> for the output file.</td>
      <td>&lt;blank&gt;</td>
    </tr>
    <tr>
      <td><b>optimizer.output</b></td>
      <td><tt>bitcode</tt> or <tt>assembly</tt></td>
      <td class="td_left">This item specifies the kind of output the language's
        optimizer generates. Valid values are <tt>assembly</tt> and
        <tt>bitcode</tt>.</td>
      <td><tt>bitcode</tt></td>
    </tr>
    <tr>
      <td><b>optimizer.preprocesses</b></td>
      <td>boolean</td>
      <td class="td_left">Indicates that the optimizer also preprocesses. If
        this is true, then <tt>llvmc</tt> will skip the pre-processing phase
        whenever the final phase is optimization or later.</td>
      <td><tt>false</tt></td>
    </tr>
    <tr>
      <td><b>optimizer.translates</b></td>
      <td>boolean</td>
      <td class="td_left">Indicates that the optimizer also translates. If
        this is true, then <tt>llvmc</tt> will skip the translation phase
        whenever the final phase is optimization or later.</td>
      <td><tt>false</tt></td>
    </tr>
    <tr><td colspan="4"><h4>ASSEMBLER ITEMS</h4></td></tr>
    <tr>
      <td><b>assembler.command</b></td>
      <td>command</td>
      <td class="td_left">This provides the command prototype that will be used
        to run the assembler. Valid substitutions are <tt>%in%</tt> for the
        input file and <tt>%out%</tt> for the output file.</td>
      <td>&lt;blank&gt;</td>
    </tr>
    </tbody>
  </table>
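  <p>The way the <tt>required</tt>, <tt>preprocesses</tt> and
  <tt>translates</tt> items decide which phases run can be sketched in a few
  lines of Python. This is only an illustration of the rules in the table, not
  <tt>llvmc</tt>'s actual code: <tt>PHASES</tt> and <tt>plan_phases</tt> are
  hypothetical names, and the assumption that <tt>preprocessor.required</tt>
  takes precedence over the skip flags is inferred from the statement that
  <tt>llvmc</tt> "will always run the preprocessor".</p>

```python
# Hypothetical sketch; names and structure are assumptions, not llvmc code.
PHASES = ["preprocessing", "translation", "optimization", "assembly", "linking"]


def plan_phases(config, final_phase):
    """Return the phases to run, ending with final_phase."""
    wanted = PHASES[:PHASES.index(final_phase) + 1]
    # True when the final phase is optimization or any later phase.
    reaches_optimizer = final_phase in PHASES[PHASES.index("optimization"):]

    skip = set()
    # Assumption: preprocessor.required overrides every skip flag.
    if not config.get("preprocessor.required", False):
        if config.get("translator.preprocesses", False) and final_phase != "preprocessing":
            skip.add("preprocessing")
        if config.get("optimizer.preprocesses", False) and reaches_optimizer:
            skip.add("preprocessing")
    if config.get("optimizer.translates", False) and reaches_optimizer:
        skip.add("translation")

    return [phase for phase in wanted if phase not in skip]
```

  <p>For example, with <tt>translator.preprocesses</tt> set to true and
  assembly as the final phase, this sketch yields translation, optimization
  and assembly, while a request that stops at pre-processing still runs the
  preprocessor.</p>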
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="substitutions">Substitutions</a></div>
<div class="doc_text">
  <p>On any configuration item that ends in <tt>command</tt>, you must
  specify substitution tokens. Substitution tokens begin and end with a percent
  sign (<tt>%</tt>) and are replaced by the corresponding text. Any substitution
  token may be given on any <tt>command</tt> line, but some are more useful than
  others. In particular, each command <em>should</em> have both an
  <tt>%in%</tt> and an <tt>%out%</tt> substitution. The table below defines
  each of the allowed substitution tokens.</p>
  <table>
    <tbody>
    <tr>
      <th>Substitution Token</th>
      <th>Replacement Description</th>
    </tr>
    <tr>
      <td><tt>%args%</tt></td>
      <td class="td_left">Replaced with all the tool-specific arguments given
        to <tt>llvmc</tt> via the <tt>-T</tt> set of options. This just allows
        you to place these arguments in the correct place on the command line.
        If the <tt>%args%</tt> token does not appear on your command line,
        then you are explicitly disallowing the <tt>-T</tt> option for your
        tool.
      </td>
    </tr>
    <tr>
      <td><tt>%force%</tt></td>
      <td class="td_left">Replaced with the <tt>-f</tt> option if it was
        specified on the <tt>llvmc</tt> command line. This is intended to tell
        the compiler tool to force the overwrite of output files.
      </td>
    </tr>
    <tr>
      <td><tt>%in%</tt></td>
      <td class="td_left">Replaced with the full path of the input file. You
        needn't worry about the cascading of file names. <tt>llvmc</tt> will
        create temporary files and ensure that the output of one phase is the
        input to the next phase.</td>
    </tr>
    <tr>
      <td><tt>%opt%</tt></td>
      <td class="td_left">Replaced with the optimization options for the
        tool. If the tool understands the <tt>-O</tt> options, then the option
        is passed through. Otherwise, the <tt>lang.optN</tt> series of
        configuration items specifies which arguments are given.</td>
    </tr>
    <tr>
      <td><tt>%out%</tt></td>
      <td class="td_left">Replaced with the full path of the output file.
        Note that this is not necessarily the output file specified with the
        <tt>-o</tt> option on <tt>llvmc</tt>'s command line. It might be a
        temporary file that will be passed to a subsequent phase's input.
      </td>
    </tr>
    <tr>
      <td><tt>%stats%</tt></td>
      <td class="td_left">If your command accepts the <tt>-stats</tt> option,
        use this substitution token. If the user requested <tt>-stats</tt>
        from the <tt>llvmc</tt> command line then this token will be replaced
        with <tt>-stats</tt>, otherwise it will be ignored.
      </td>
    </tr>
    <tr>
      <td><tt>%target%</tt></td>
      <td class="td_left">Replaced with the name of the target "machine" for
        which code should be generated. The value used here is taken from the
        <tt>llvmc</tt> option <tt>-march</tt>.
      </td>
    </tr>
    <tr>
      <td><tt>%time%</tt></td>
      <td class="td_left">If your command accepts the <tt>-time-passes</tt>
        option, use this substitution token. If the user requested
        <tt>-time-passes</tt> from the <tt>llvmc</tt> command line then this
        token will be replaced with <tt>-time-passes</tt>, otherwise it will
        be ignored.
      </td>
    </tr>
    </tbody>
  </table>
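  <p>To make the replacement rule concrete, here is a minimal Python sketch of
  how a command prototype might be expanded. It is not <tt>llvmc</tt>'s
  implementation: <tt>expand_command</tt>, the dictionary of values and the
  file paths are all hypothetical, and collapsing whitespace is a
  simplification that would mangle paths containing spaces.</p>

```python
import re


def expand_command(prototype, values):
    """Replace each %name% token in prototype with values[name].

    A token with no value (for example %stats% when -stats was not
    requested) expands to nothing, matching the "ignored" behaviour
    described in the table.
    """
    def lookup(match):
        return values.get(match.group(1), "")

    expanded = re.sub(r"%(\w+)%", lookup, prototype)
    # Collapse the gaps left behind by tokens that expanded to nothing.
    return " ".join(expanded.split())


# Hypothetical values for one optimizer invocation.
command = expand_command(
    "opt %in% -o %out% %opt% %time% %stats% %force% %args%",
    {
        "in": "/tmp/a.bc",
        "out": "/tmp/b.bc",
        "opt": "-simplifycfg -instcombine",
        "force": "-f",
    },
)
print(command)
# prints: opt /tmp/a.bc -o /tmp/b.bc -simplifycfg -instcombine -f
```

  <p>Because the replacement is done in a single pass, text that is
  substituted in is not scanned again for further tokens.</p>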
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="sample">Sample Config File</a></div>
<div class="doc_text">
  <p>Since an example is always instructive, here's how the Stacker language
  configuration file looks.</p>
  <pre><tt>
# Stacker Configuration File For llvmc

##########################################################
# Language definitions
##########################################################
  lang.name=Stacker
  lang.opt1=-simplifycfg -instcombine -mem2reg
  lang.opt2=-simplifycfg -instcombine -mem2reg -load-vn \
    -gcse -dse -scalarrepl -sccp
  lang.opt3=-simplifycfg -instcombine -mem2reg -load-vn \
    -gcse -dse -scalarrepl -sccp -branch-combine -adce \
    -globaldce -inline -licm
  lang.opt4=-simplifycfg -instcombine -mem2reg -load-vn \
    -gcse -dse -scalarrepl -sccp -ipconstprop \
    -branch-combine -adce -globaldce -inline -licm
  lang.opt5=-simplifycfg -instcombine -mem2reg -load-vn \
    -gcse -dse -scalarrepl -sccp -ipconstprop \
    -branch-combine -adce -globaldce -inline -licm \
    -block-placement

##########################################################
# Pre-processor definitions
##########################################################

  # Stacker doesn't have a preprocessor but the following
  # allows the -E option to be supported
  preprocessor.command=cp %in% %out%
  preprocessor.required=false

##########################################################
# Translator definitions
##########################################################

  # To compile stacker source, we just run the stacker
  # compiler with a default stack size of 2048 entries.
  translator.command=stkrc -s 2048 %in% -o %out% %time% \
    %stats% %force% %args%

  # stkrc doesn't preprocess but we set this to true so
  # that we don't run the cp command by default.
  translator.preprocesses=true

  # The translator is required to run.
  translator.required=true

  # stkrc doesn't handle the -On options; it produces bitcode
  translator.output=bitcode

##########################################################
# Optimizer definitions
##########################################################

  # For optimization, we use the LLVM "opt" program
  optimizer.command=opt %in% -o %out% %opt% %time% %stats% \
    %force% %args%

  optimizer.required = true

  # opt doesn't translate
  optimizer.translates = no

  # opt doesn't preprocess
  optimizer.preprocesses=no

  # opt produces bitcode
  optimizer.output = bitcode

##########################################################
# Assembler definitions
##########################################################
  assembler.command=llc %in% -o %out% %target% %time% %stats%
</tt></pre>
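  <p>A file in this shape is straightforward to read mechanically. The Python
  sketch below assumes only the conventions visible in the sample: one
  <tt>name=value</tt> item per line, optional spaces around the equals sign,
  comment lines starting with <tt>#</tt>, and a trailing backslash that
  continues a value on the next line. <tt>parse_config</tt> is a hypothetical
  helper; the real <tt>llvmc</tt> parser may differ, for example in how it
  handles quoting.</p>

```python
def parse_config(text):
    """Parse name=value items from a sample-style configuration file."""
    items = {}
    pending = ""  # text carried over from lines ending in a backslash
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments unless a continuation is open.
        if not pending and (not line or line.startswith("#")):
            continue
        if line.endswith("\\"):
            pending += line[:-1].strip() + " "
            continue
        # Split on the first '=' only, so values may contain '='.
        name, _, value = (pending + line).partition("=")
        items[name.strip()] = value.strip()
        pending = ""
    return items


sample = "\n".join([
    "# Stacker fragment",
    "  lang.name=Stacker",
    "  lang.opt1=-simplifycfg -instcombine \\",
    "      -mem2reg",
    "  optimizer.translates = no",
])
config = parse_config(sample)
```

  <p>Here <tt>config["lang.opt1"]</tt> holds the three options joined into a
  single value, with the continuation backslash removed.</p>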
</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="glossary">Glossary</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
  <p>This document uses precise terms in reference to the various artifacts and
  concepts related to compilation. The terms used throughout this document are
  defined below.</p>
  <dl>
    <dt><a name="def_assembly"><b>assembly</b></a></dt>
    <dd>A compilation <a href="#def_phase">phase</a> in which LLVM bitcode or
    LLVM assembly code is assembled to a native code format (either
    target-specific assembly language or the platform's native object file
    format).</dd>

    <dt><a name="def_compiler"><b>compiler</b></a></dt>
    <dd>Refers to any program that can be invoked by <tt>llvmc</tt> to accomplish
    the work of one or more compilation <a href="#def_phase">phases</a>.</dd>

    <dt><a name="def_driver"><b>driver</b></a></dt>
    <dd>Refers to <tt>llvmc</tt> itself.</dd>

    <dt><a name="def_linking"><b>linking</b></a></dt>
    <dd>A compilation <a href="#def_phase">phase</a> in which LLVM bitcode files
    and (optionally) native system libraries are combined to form a complete
    executable program.</dd>

    <dt><a name="def_optimization"><b>optimization</b></a></dt>
    <dd>A compilation <a href="#def_phase">phase</a> in which LLVM bitcode is
    optimized.</dd>

    <dt><a name="def_phase"><b>phase</b></a></dt>
    <dd>Refers to any one of the five compilation phases that
    <tt>llvmc</tt> supports. The five phases are:
    <a href="#def_preprocessing">preprocessing</a>,
    <a href="#def_translation">translation</a>,
    <a href="#def_optimization">optimization</a>,
    <a href="#def_assembly">assembly</a>, and
    <a href="#def_linking">linking</a>.</dd>

    <dt><a name="def_sourcelanguage"><b>source language</b></a></dt>
    <dd>Any common programming language (e.g. C, C++, Java, Stacker, ML,
    FORTRAN). These languages are distinguished from any of the lower level
    languages (such as LLVM or native assembly) by the fact that a
    <a href="#def_translation">translation</a> <a href="#def_phase">phase</a>
    is required before LLVM can be applied.</dd>

    <dt><a name="def_tool"><b>tool</b></a></dt>
    <dd>Refers to any program in the LLVM tool set.</dd>

    <dt><a name="def_translation"><b>translation</b></a></dt>
    <dd>A compilation <a href="#def_phase">phase</a> in which
    <a href="#def_sourcelanguage">source language</a> code is translated into
    either LLVM assembly language or LLVM bitcode.</dd>
  </dl>
</div>
<!-- *********************************************************************** -->
<hr>
<address> <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a><a
 href="http://validator.w3.org/check/referer"><img
 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a><a
 href="mailto:rspencer@x10sys.com">Reid Spencer</a><br>
<a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
Last modified: $Date$
</address>
<!-- vim: sw=2
-->
</body>
</html>